Further progress on Ugarit archival mode (by alaric)
Further to my last post on the matter, I've been working on the basic user interface for accessing archive metadata.
As before, let's do an import to an archive tag in a vault. I've made a manifest file describing three MP3s, using only metadata that could be extracted from their ID3 tags. I plan to write a tool that generates manifests automatically by examining the files in exactly that manner, but for now I had to hand-write one:
[alaric@ahusai ugarit]$ cat test.manifest
(object "/home/alaric/archive/sorted-music/UNKLE/Psyence Fiction/13 Be There.mp3"
  (title = "Be There")
  (track = 13)
  (artist = "UNKLE")
  (album = "Psyence Fiction"))
(object "/home/alaric/archive/sorted-music/UNKLE/Psyence Fiction/11 Rabbit in Your Headlights.mp3"
  (title = "Rabbit in Your Headlights")
  (track = 11)
  (artist = "UNKLE")
  (album = "Psyence Fiction"))
(object "/home/alaric/archive/sorted-music/Led Zeppelin/Remasters/1-09 Celebration Day.mp3"
  (title = "Celebration Day")
  (track = 9)
  (volume = 1)
  (artist = "Led Zeppelin")
  (album = "Remasters"))
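To give a feel for what the planned manifest generator might emit, here is a minimal Python sketch that renders one manifest entry from a dictionary of properties. It only mirrors the layout of the hand-written file above. The function names `quote` and `manifest_entry` are made up for illustration, the backslash escaping of strings is an assumption about the manifest syntax, and reading the ID3 tags themselves is left out entirely.

```python
def quote(value):
    """Render a property value: integers bare, everything else as a quoted string."""
    if isinstance(value, int) and not isinstance(value, bool):
        return str(value)
    # Assumption: backslashes and double quotes are escaped with a backslash.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def manifest_entry(path, props):
    """Build one (object ...) form in the layout of the hand-written manifest."""
    lines = [f"(object {quote(path)}"]
    lines += [f"  ({name} = {quote(value)})" for name, value in props.items()]
    return "\n".join(lines) + ")"


# Hypothetical input: in a real tool this dictionary would come from an ID3 reader.
entry = manifest_entry(
    "/home/alaric/archive/sorted-music/UNKLE/Psyence Fiction/13 Be There.mp3",
    {"title": "Be There", "track": 13, "artist": "UNKLE", "album": "Psyence Fiction"},
)
print(entry)
```

Keeping the formatting separate from the tag extraction means the same function could serve other file types whose metadata comes from somewhere other than ID3.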
As before, I import it, loading the files into the content-addressable storage of the vault, automatically deduplicating, and possibly storing the data on a cluster of remote servers (although in this case, I'm just using a local vault). This was done with Ugarit revision [80b324f3af]:
[alaric@ahusai ugarit]$ ugarit import test.conf music test.manifest
Loading manifest file test.manifest...
Importing from test.manifest to tag music...
Importing /home/alaric/archive/sorted-music/Led Zeppelin/Remasters/1-09 Celebration Day.mp3...
...imported with key 4d64e4650333741cb56c3e6a785b6de4d23324cb1055e529
Importing /home/alaric/archive/sorted-music/UNKLE/Psyence Fiction/11 Rabbit in Your Headlights.mp3...
...imported with key 370bee7debb458357a2b879014d4abbeb409215ed269c1c6
Importing /home/alaric/archive/sorted-music/UNKLE/Psyence Fiction/13 Be There.mp3...
...imported with key 39df8bafd530a66614ad60ab323033b1385cdd842528dbd2
Committing import...
Imported successfully to tag music with import key ac26354ccfb0530109932c1aaddd414b59d4394d44ec43cd
Written 16MiB to the vault in 24 blocks, and reused 0B in 1 blocks (before compression)
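The "written" and "reused" counts in that last line come from content addressing: a block's key is derived from its contents, so storing the same bytes twice costs nothing extra. The toy Python class below illustrates the general idea only. It is not Ugarit's implementation: `ToyVault` is a made-up name, SHA-256 is a stand-in hash chosen for the example, and block splitting, compression and encryption are not modelled.

```python
import hashlib


class ToyVault:
    """An in-memory content-addressed block store, for illustration only."""

    def __init__(self):
        self.blocks = {}   # key (hex digest) -> block contents
        self.written = 0   # blocks stored for the first time
        self.reused = 0    # blocks that were already present

    def put(self, data: bytes) -> str:
        # The key depends only on the contents, so identical data
        # always maps to the same key.
        key = hashlib.sha256(data).hexdigest()
        if key in self.blocks:
            self.reused += 1
        else:
            self.blocks[key] = data
            self.written += 1
        return key


vault = ToyVault()
first = vault.put(b"same audio bytes")
second = vault.put(b"same audio bytes")      # duplicate: nothing new is stored
other = vault.put(b"different audio bytes")
print(vault.written, "written,", vault.reused, "reused")
```

Because the key is a function of the data alone, the store never needs to be told that two files are "the same"; it finds out when their keys collide.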
Now that it's in, we can query the metadata. First, let's see which properties are available - a combination of the ones we wrote in the manifest and automatically generated ones, such as the MIME type and the original import path:
[alaric@ahusai ugarit]$ ugarit search-props test.conf music
album
artist
filename
import-path
mime-type
title
track
volume
Let's see what values there are for the "artist" property:
[alaric@ahusai ugarit]$ ugarit search-values test.conf music artist
UNKLE
Led Zeppelin
(they're sorted by popularity, and we have two UNKLE tracks, so that comes first)
Let's see what UNKLE albums we have, by filtering for objects with an artist property of "UNKLE" and asking what values of the "album" property are available:
[alaric@ahusai ugarit]$ ugarit search-values test.conf music '(= ($ artist) "UNKLE")' album
Psyence Fiction
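The behaviour of these two queries (list a property's values most popular first, optionally restricted by an equality filter) can be modelled in a few lines of Python. This is only a sketch of the semantics as described in this post, run over the three manifest entries. The names `search_values` and `prop_equals` are invented for the example, and how Ugarit actually evaluates filters or breaks ties between equally popular values is not something shown here.

```python
from collections import Counter

# The three objects from the manifest, reduced to their hand-written properties.
OBJECTS = [
    {"title": "Be There", "track": 13,
     "artist": "UNKLE", "album": "Psyence Fiction"},
    {"title": "Rabbit in Your Headlights", "track": 11,
     "artist": "UNKLE", "album": "Psyence Fiction"},
    {"title": "Celebration Day", "track": 9, "volume": 1,
     "artist": "Led Zeppelin", "album": "Remasters"},
]


def prop_equals(name, expected):
    """Model of a filter like (= ($ artist) "UNKLE")."""
    return lambda obj: obj.get(name) == expected


def search_values(objects, prop, where=None):
    """Distinct values of `prop` among matching objects, most common first."""
    matching = [o for o in objects if where is None or where(o)]
    counts = Counter(o[prop] for o in matching if prop in o)
    return [value for value, _ in counts.most_common()]


print(search_values(OBJECTS, "artist"))
print(search_values(OBJECTS, "album", where=prop_equals("artist", "UNKLE")))
```

Objects that lack the requested property are skipped, which is why asking for "volume" here would yield a single value even though there are three objects.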
Let's see what we know about music by UNKLE:
[alaric@ahusai ugarit]$ ugarit search test.conf music '(= ($ artist) "UNKLE")'
object 39df8bafd530a66614ad60ab323033b1385cdd842528dbd2
  (album = "Psyence Fiction")
  (artist = "UNKLE")
  (filename = "13 Be There.mp3")
  (import-path = "/home/alaric/archive/sorted-music/UNKLE/Psyence Fiction/13 Be There.mp3")
  (mime-type = "audio/mpeg")
  (title = "Be There")
  (track = 13)
object 370bee7debb458357a2b879014d4abbeb409215ed269c1c6
  (album = "Psyence Fiction")
  (artist = "UNKLE")
  (filename = "11 Rabbit in Your Headlights.mp3")
  (import-path = "/home/alaric/archive/sorted-music/UNKLE/Psyence Fiction/11 Rabbit in Your Headlights.mp3")
  (mime-type = "audio/mpeg")
  (title = "Rabbit in Your Headlights")
  (track = 11)
Ok, let's listen to all our music by UNKLE (the extra "keys" parameter to the search command says to just output the object keys, one per line, and the "archive-stream" command streams the contents of an archived file to standard output):
[alaric@ahusai ugarit]$ for i in `ugarit search test.conf music '(= ($ artist) "UNKLE")' keys`; do ugarit archive-stream test.conf music $i | mpg123 -; done
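For anyone who would rather drive this from a script than a shell loop, the same pipeline can be sketched in Python with `subprocess`. The `ugarit` and `mpg123` invocations are exactly the ones from the loop above; the function names are made up for the example, and the sketch assumes that the `keys` output contains nothing but whitespace-separated object keys.

```python
import subprocess


def search_keys_command(conf, tag, filter_expr):
    """argv for: ugarit search CONF TAG FILTER keys"""
    return ["ugarit", "search", conf, tag, filter_expr, "keys"]


def stream_command(conf, tag, key):
    """argv for: ugarit archive-stream CONF TAG KEY"""
    return ["ugarit", "archive-stream", conf, tag, key]


def play_all(conf, tag, filter_expr, player=("mpg123", "-")):
    """Stream every matching object into the player, one after another."""
    result = subprocess.run(
        search_keys_command(conf, tag, filter_expr),
        check=True, capture_output=True, text=True,
    )
    # Assumption: the output is just the keys, separated by whitespace.
    for key in result.stdout.split():
        stream = subprocess.Popen(
            stream_command(conf, tag, key), stdout=subprocess.PIPE
        )
        try:
            # The player reads the archived file from its standard input.
            subprocess.run(list(player), stdin=stream.stdout, check=True)
        finally:
            stream.stdout.close()
            stream.wait()
```

Calling `play_all("test.conf", "music", '(= ($ artist) "UNKLE")')` would then do the same job as the loop. Passing argument lists rather than a command string means the filter expression needs no extra shell quoting.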
...music by UNKLE plays...
We're slowly moving towards having a usable and useful archival filesystem, backed by a modular content-addressable storage system! Isn't that neat? Of course, it's not amazingly useful as it stands - at first sight, it's like a very crude version of the browser found in any modern music collection management app; but this is the seed of something much more interesting. For a start, it can categorise files using any user-defined schema. The backend storage can be encrypted, and accessed remotely over a network (and, in future, replicated over a cluster, or mirrored between your laptop and a home fileserver, and automatically synchronised when they're connected). The same storage can be used to store backup snapshots as well as archives, and if a file exists in any combination of archives and snapshots, then only one copy of it will be stored (or need uploading, even). That matters because most files in an archive will have started off in a backed-up directory tree, or will be extracted into one.
There are many interesting use cases for Ugarit, but my personal one is to have a fault-tolerant vault of all the data that matters to me, neatly organised so I can find things quickly, and so I can access things from different locations (even when offline). Rather than having files scattered over different disks on different machines, and having to move things around to make space, and remember where they are, I can add more disks to the vault when I need more capacity, and have Ugarit manage everything for me. With the amount of data I manage, that'll be a great weight off my mind!