March 17th, 2004


Rising (Update #3)

Got up yesterday at 9-ish, after 12 hours of sleep. Lilian had barely slept at all, but was actually pretty functional. She had a massage while I checked LJ (and made yesterday's update) and then we headed townwards. The idea was to go swimming, so she phoned Stanford Uni (where she recently gave a paper at a conference) to find out if she was allowed to use their pool. They told us that we'd have to buy day passes, but that was fine, so we headed into town. Pausing briefly at a cool second-hand bookstore to buy reading material for the day and at a fantastic bakery for breakfast, we then headed in the direction of the university via the free bus service.

The campus was very pretty, marred only by the fact that we'd been lied to on the phone and weren't allowed to use the facilities at all. So we headed on to the nearby shopping complex, wandered around the shops for a bit (Lilian's tiredness kicked in again, so I had to dissuade her from buying lime-green capri-pants in a moment of stark madness) and then headed home again. That seemed to take up most of the day...

I slobbed about a bit in the evening, reading more of Cryptonomicon, and then we headed out to dinner with spikeiowa, whumpdotcom, cynthia1960 and the sadly LJ-deficient Donya. Much was discussed, including metadata, further to which a post will shortly be made, which you will all ignore.

Came home and collapsed into unconsciousness, woke up at 8-ish, read Buddha for Beginners and then came out to check on LJ.

Plans for today are to head to the museum of computing for a guided tour, and then possibly pop to Frys, where I will buy a USB charging cable for my Palm (hopefully).

Oh, and to waste half an hour writing an entry none of you will read.



You may have noticed that you're drowning in data at the moment. Your hard drive is full of it, your inbox is full of it, you almost certainly get swamped by a huge wave of it from the internet every time you open your browser, newsreader or RSS-reader. Sure, you're interested in some of it, but how do you find the bits you want?

The answer is metadata.

Metadata is the data that's associated with your data. It varies from the very simple (the filename of your picture) to the extremely complicated (the relationships held in customer relationship management systems) via things like ID3 tags for MP3 files.

If you use any kind of halfway decent MP3 player then you can search for songs by artist, album and track. If you use Windows Media Player you can also rate songs and then search by rating. There are programs that allow you to search by mood as well, or by beat frequency. If you think of your music collection as a big mishmash of data, the metadata allows you to slice through it any way you like, turning it this way and that until you find the songs you want.
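To make the "slicing" idea a bit more concrete, here's a minimal Python sketch. The Track record, its fields, the sample songs and the find_tracks function are all made up for illustration; I'm not claiming any real player stores or searches its library this way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    # Hypothetical record: each field is one piece of metadata about a song.
    title: str
    artist: str
    album: str
    rating: int  # 1 to 5 stars


# Made-up sample library, purely for illustration.
LIBRARY = [
    Track("Song A", "Artist One", "First Album", 5),
    Track("Song B", "Artist One", "First Album", 2),
    Track("Song C", "Artist Two", "Other Album", 4),
]


def find_tracks(library, **criteria):
    """Return the tracks whose metadata matches every criterion given.

    A criterion is either a plain value (exact match) or a callable
    that takes the field's value and returns True or False.
    """
    results = []
    for track in library:
        for field_name, wanted in criteria.items():
            value = getattr(track, field_name)
            matched = wanted(value) if callable(wanted) else value == wanted
            if not matched:
                break  # one failed criterion rules the track out
        else:
            # The inner loop never hit "break", so every criterion matched.
            results.append(track)
    return results
```

So find_tracks(LIBRARY, artist="Artist One") slices by artist, and find_tracks(LIBRARY, rating=lambda r: r >= 4) slices by rating. The point is that the songs themselves never get looked at, only the data about them.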

The same cannot be said for your photo collection. If you want to find the photo of Uncle Bob at Jane and Guy's wedding last year, unless your photo collection is carefully sorted into folders so that you can go to pictures\2003\Weddings\JaneAndGuy\UncleBob3.jpg, you're never going to be able to find it without trawling through dozens of files called MyPic0001.jpg.

And if you want to look at all photos taken at weddings, or all photos taken of Uncle Bob, you're shit out of luck. Unless, of course, you have metadata! Which is where Microsoft's Next Gen file system comes in. It will allow you to tag any file with huge swathes of information - where it was taken, when it was taken, who took it, who's in it, what they're doing, why they're doing it, etc. This will allow you to instantly find every photo of you taken between 2001 and 2003 where you were drunk and happy. Now isn't that a great use of technology?
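For the sake of argument, here's roughly what that "drunk and happy" query could look like if the tags existed. This is a minimal Python sketch under my own assumptions: the Photo record, the tag names and the search function are hypothetical, and none of it is meant to describe how Microsoft's file system actually works.

```python
from dataclasses import dataclass, field


@dataclass
class Photo:
    # Hypothetical record: the filename plus the metadata someone typed in.
    filename: str
    year: int
    people: set = field(default_factory=set)
    tags: set = field(default_factory=set)


# Made-up sample collection, purely for illustration.
PHOTOS = [
    Photo("MyPic0001.jpg", 2002, {"me", "Uncle Bob"}, {"wedding", "drunk", "happy"}),
    Photo("MyPic0002.jpg", 2003, {"Uncle Bob"}, {"wedding"}),
    Photo("MyPic0003.jpg", 2004, {"me"}, {"drunk", "happy"}),
    Photo("MyPic0004.jpg", 2001, {"me"}, {"happy"}),
]


def search(photos, person=None, tags=(), first_year=None, last_year=None):
    """Return filenames of photos matching every filter that was supplied."""
    wanted_tags = set(tags)
    hits = []
    for photo in photos:
        if person is not None and person not in photo.people:
            continue
        if not wanted_tags <= photo.tags:  # every wanted tag must be present
            continue
        if first_year is not None and photo.year < first_year:
            continue
        if last_year is not None and photo.year > last_year:
            continue
        hits.append(photo.filename)
    return hits
```

With that in place, search(PHOTOS, person="me", tags={"drunk", "happy"}, first_year=2001, last_year=2003) picks out the one matching file, and search(PHOTOS, tags={"wedding"}) gives you all the wedding photos. Note that the query is the easy part: it only works because somebody filled in people and tags for every single photo.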

Which is where the conversation started with whumpdotcom last night. We basically agreed that metadata isn't going to work. Because when you've taken 30 pictures, the last thing you want to do is sit down for half an hour and apply a dozen keywords to each one. Let's put it another way - you just aren't going to do it. If you're lucky you'll call each one something halfway meaningful like UncleBob.jpg and just assume you'll be able to find it later. And you're pretty smart, aren't you? Imagine what the _average_ person will do. He'll never find his files, because they'll still be called MyPic0001.jpg.

The answer, of course, is a Wizard that pops up whenever you upload a new photo and asks you a dozen questions about your photo. "Hi! It looks like you're trying to save a photo. Would you like to categorise it?" "First question - is it porn?", because nobody likes a disorganised porn collection. But how many times will people be willing to go through that process before they click the "never speak to me again, you evil paperclip from hell!" button?

Of course, it's just a photo collection. You'll either be organised, or you won't, and it doesn't really matter that much, does it? But what happens when the metadata is being applied to business documentation? Are we going to be able to search for "All specifications for the Bumstead Project that impact on regulatory requirements"? Or are we still going to be pressing F3 and searching files for the foreseeable future?

On a larger scale - what about the Semantic Web? This takes the idea of metadata and applies it to the whole of the Intarweb. Which would be great, wouldn't it? You'd be able to ask meaningful questions and get back reasonable results that contained the answers all neatly categorised and sorted.

If only someone would go through the internet and categorise every web page with details of what it does, what it applies to, who wrote it, when it was written, why it was written and what it might be useful for. And then keep that information up to date.

Any volunteers?


Going forth (Update #4)

Woke up at 8-ish this morning. Lilian finally made it to sleep, so I left her to it and read LJ/updated. Ate banana on toast. How rock'n'roll am I?

Lilian woke up at 12:20, which was lucky, as I was going to wake her at 12:30 so that we could head off at 12:45 when spikeiowa arrived. Which she duly did, a mere 5 minutes early.

When our loins were girded and our tea was drunk we headed in the direction of the Museum of Computer History, where we were fully entertained for a couple of hours by a guide who clearly knew his business, liked computers and used to work for IBM. They had everything from abacuses to a chunk of ENIAC to a Cray 1 (which Lilian got a photo of me with). It was fantastic, and I highly recommend that anyone visiting SF give it a look.

Afterwards we headed to Target, where I bought sandals, then Frys, where I got a USB cable for my Palm and Lilian bought a new camera, then to a Mexican restaurant for dinner (nice, but I was only able to eat about half, as I was stuffed). Lilian and Spike gave me a good briefing on what to expect at Corflu in Vegas this weekend. It sounds like the convention bit will be fun, although we both intend to spend a fair amount of time lounging by the pool and investigating ridiculous ongoings on The Strip. Las Vegas is possibly the most fake place on the planet, something I intend to explore fully, to get all the fake out of my system if I possibly can.

And now home, feeling rather exhausted but happy. Not sure about updates/email for the next few days - it depends on the internet access in Vegas.

catamorphism - we touch down at 6:30 on Tuesday. We'll be over ASAP after that. Looking forward to meeting you then.