Ugarit archive mode manifest maker (by alaric)
When I last wrote about Ugarit progress, I had developed archive mode to the point where one could import a list of files with metadata from a "manifest file", and then search for files based on the metadata from the manifest and stream out chosen files. I gave an example of using this to play MP3s matching a search pattern:
[alaric@ahusai ugarit]$ for i in `ugarit search test.conf music '(= ($ artist) "UNKLE")' keys`; do ugarit archive-stream test.conf music $i | mpg123 -; done
Well, that was all based on hand-written manifest files, which are no fun to produce (our music collection is large). So I've been working on a "manifest maker" that takes a list of files and directories and makes a manifest file from them, recursing down through directories to list all the files. For each file, it automatically extracts metadata into the manifest, which can then be hand-edited if required before being used for an import.
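To make the recursion concrete, here is a minimal sketch in Python of the directory-walking part. It is an illustration only, not Ugarit's actual code: the function names `list_files`, `quote` and `manifest_entry` are made up for this example, and the entry layout simply imitates the basic properties (filename, mtime, ctime, size) that appear in the manifest maker's output.

```python
import os


def list_files(paths):
    """Yield every file under the given files and directories, recursing into directories."""
    for path in paths:
        if os.path.isdir(path):
            for root, dirs, files in os.walk(path):
                dirs.sort()  # visit subdirectories in a stable order
                for name in sorted(files):
                    yield os.path.join(root, name)
        else:
            yield path


def quote(text):
    """Render a string as a double-quoted literal (hypothetical helper)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def manifest_entry(path):
    """Build one manifest entry holding only the filesystem metadata."""
    st = os.stat(path)
    lines = [
        f"(object {quote(path)}",
        f"  (filename = {quote(os.path.basename(path))})",
        f"  (mtime = {float(st.st_mtime)})",
        f"  (ctime = {float(st.st_ctime)})",
        f"  (size = {st.st_size}))",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    import sys
    for file_path in list_files(sys.argv[1:]):
        print(manifest_entry(file_path))
```

Run with a list of files and directories as arguments, this prints one entry per file found; a real manifest maker would add type-specific metadata to each entry before the closing parenthesis.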
The idea is that the manifest maker will have support for a number of file types it knows how to extract additional metadata from, and the first one I've implemented is ID3 tag extraction from MP3s. I've implemented the ID3v2.2 and ID3v2.3 specs, as those were the two versions present in the subset of my MP3 collection I'm testing against!
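For readers unfamiliar with the format, here is a rough Python sketch of what reading ID3v2.2 and ID3v2.3 frames involves. It is not the manifest maker's code, and the function names are invented for illustration. It relies on the published tag layout: a 10-byte header ("ID3", two version bytes, a flags byte, and a 28-bit "synchsafe" size), followed by frames with 3-character IDs and 3-byte sizes in v2.2, or 4-character IDs, 4-byte sizes and 2 flag bytes in v2.3. As a simplification, it ignores unsynchronisation, extended headers, and compressed frames.

```python
def parse_id3v2_header(data):
    """Return (major_version, flags, tag_size), or None if there is no ID3v2 tag."""
    if len(data) < 10 or data[:3] != b"ID3":
        return None
    major, flags = data[3], data[5]
    size_bytes = data[6:10]
    if any(b & 0x80 for b in size_bytes):
        return None  # synchsafe bytes never have the top bit set
    size = 0
    for b in size_bytes:
        size = (size << 7) | b  # 7 useful bits per byte
    return major, flags, size


def iter_frames(data):
    """Yield (frame_id, body) pairs from an ID3v2.2 or ID3v2.3 tag."""
    header = parse_id3v2_header(data)
    if header is None:
        return
    major, _flags, size = header
    if major == 2:
        id_len, size_len, flags_len = 3, 3, 0
    elif major == 3:
        id_len, size_len, flags_len = 4, 4, 2
    else:
        return  # other versions are out of scope for this sketch
    head_len = id_len + size_len + flags_len
    pos, end = 10, min(10 + size, len(data))
    while pos + head_len <= end:
        frame_id = data[pos:pos + id_len]
        if frame_id[0] == 0:
            break  # reached the zero padding after the last frame
        frame_size = int.from_bytes(data[pos + id_len:pos + id_len + size_len], "big")
        body_start = pos + head_len
        body_end = body_start + frame_size
        if body_end > end:
            break  # truncated or corrupt frame
        yield frame_id.decode("latin-1"), data[body_start:body_end]
        pos = body_end


def decode_text(body):
    """Decode a text frame body: the first byte selects ISO-8859-1 (0) or UTF-16 (1)."""
    if not body:
        return ""
    encoding = "utf-16" if body[0] == 1 else "latin-1"
    return body[1:].decode(encoding, errors="replace").rstrip("\x00")
```

Given the first chunk of an MP3 file as `data`, something like `[(fid, decode_text(body)) for fid, body in iter_frames(data) if fid.startswith("T")]` would list the text frames, such as the title and artist.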
For example, here's the output it produced for one of my MP3s:
(object "./test-data/THE HOLLIES - He Ain't Heavy, He's My Brother.mp3"
  (filename = "THE HOLLIES - He Ain't Heavy, He's My Brother.mp3")
  (mime-type = "audio/mpeg")
  ;; Unknown ID3 tag "COMM"="engiTunNORM\x00 00000402 00000000 00001B59 00000000 00004E65 00000000 000040EC 00000000 00015FD5 00000000"
  (keyword = "Pop")
  (name = "He Ain't Heavy, He's My Brother")
  (creator = "THE HOLLIES")
  (creation-date = "2002")
  #;(featuring = "")
  (collection-name = "Legends CD2")
  #;(collection-volume = "")
  #;(collection-volumes = "")
  (volume-index = 16)
  (volume-size = 18)
  (mtime = 1428948696.0)
  (ctime = 1428948696.0)
  (size = 4063360))
It prints out unknown ID3 tags as comments, in case a human can glean some useful information from them to put into the metadata. It also suggests the names of metadata tags that it hasn't found but that I might be able to provide by hand: in this case, a tag for other people featured in the music, and two for indicating that this album is part of a set. As it happens, it is, as the "CD2" in the name suggests, but that wasn't indicated in the ID3, so I'll have to hand-edit it. Likewise, the date of 2002 from the MP3 is clearly for the production of the album, not that classic track... ID3 metadata is often a bit shabby! Also included are the file's mtime, ctime, and size in bytes.
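The known/unknown/suggested split can be sketched in a few lines of Python. This is an illustration under assumptions, not the real implementation: the frame-to-property table is inferred from the sample output and the standard ID3v2.3 frame names (TIT2 for title, TPE1 for lead performer, TALB for album, TCON for genre, TYER for year, TRCK for track number), the names `FRAME_TO_PROPERTY`, `SUGGESTED` and `manifest_lines` are hypothetical, and the real tool orders its output differently.

```python
# Assumed mapping from ID3v2.3 text frames to manifest properties.
FRAME_TO_PROPERTY = {
    "TIT2": "name",
    "TPE1": "creator",
    "TALB": "collection-name",
    "TCON": "keyword",
    "TYER": "creation-date",
}

# Properties offered as commented-out placeholders when nothing supplies them.
SUGGESTED = ["featuring", "collection-volume", "collection-volumes"]


def quote(text):
    """Render a string as a single-line double-quoted literal (hypothetical helper)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def manifest_lines(frames):
    """Turn (frame_id, text) pairs into manifest property lines."""
    lines = []
    seen = set()
    for frame_id, text in frames:
        prop = FRAME_TO_PROPERTY.get(frame_id)
        if frame_id == "TRCK":
            # Track numbers look like "16" or "16/18".
            index, _, total = text.partition("/")
            if index.isdigit():
                lines.append(f"(volume-index = {int(index)})")
            if total.isdigit():
                lines.append(f"(volume-size = {int(total)})")
        elif prop is not None:
            lines.append(f"({prop} = {quote(text)})")
            seen.add(prop)
        else:
            # Keep unrecognised tags visible to a human as a line comment.
            lines.append(f";; Unknown ID3 tag {quote(frame_id)}={quote(text)}")
    for prop in SUGGESTED:
        if prop not in seen:
            lines.append(f'#;({prop} = "")')  # datum comment: ignored until edited
    return lines
```

The `#;` prefix comments out a single s-expression, so a human editing the manifest only has to delete two characters and fill in the value to activate a suggested property.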
I hope to add Ogg Vorbis metadata next; I'd like to add EXIF support to parse information out of the JPEGs in our vast family photo library, but it looks much harder, and I'm not sure how useful it will actually be!