Since the theme of eTech is 'remix', I've felt free to do so. So my time has been a collage of cherry-picked sessions, a couple of walks through sunny San Diego, and interspersed work on material for a venture still in stealth. My cherry-picking has been a union of the ontologies (semantic web) theme from last year, now morphed, and what has become known in the last year as the 'Long Tail' of citizens' media and network economics. My nose says this is yeasty territory (contra Joe Kraus' remarks re VCs and the Long Tail a few minutes past).
Clay Shirky's talk should have finally pounded a stake into the heart of the "one true ontology" myth. To be sure, we had substantial proof of this in the information retrieval experiments of the 70s, but it needed to be done for this generation, in this open network, at this scale. The net is the antithesis of the controlled, bounded, mediated information realms in which controlled vocabularies or other received classification schemes get some traction. OK, there isn't going to be one canonical librarian, called Yahoo or anything else. So what are we actually getting under the gloss of 'tagging', if it's really an uncontrollable soup, and what might we make of it?
The most provocative part of Clay's pitch - for me at least - was the statistics on tag use at the item, user, and community level in the del.icio.us interface. If you've been paying attention, it'll be no surprise at all that each graph is a Zipf's Law curve, that is, it shows the Long Tail phenomenon. No surprise to the sociologists and linguists, but tag use in a live community is behaving just the same as a natural language. Call it 'folksonomy' if you will, but that's what is going on.
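For anyone who wants to check the Zipf claim against their own tag data, here is a minimal sketch. It ranks tags by how often they are used and fits a straight line to log(count) against log(rank); a slope near -1 is the classic Zipf shape. The tag list is made up for illustration (it is not Clay's data or anything pulled from del.icio.us), and the function names are my own.

```python
import math
from collections import Counter


def rank_frequency(tags):
    """Return (rank, tag, count) triples, most frequently used tag first."""
    counts = Counter(tags)
    # Sort by descending count; break ties alphabetically so output is stable.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, tag, count) for rank, (tag, count) in enumerate(ordered, start=1)]


def loglog_slope(table):
    """Least-squares slope of log(count) against log(rank).

    Needs at least two distinct tags, otherwise there is no line to fit.
    """
    xs = [math.log(rank) for rank, _, _ in table]
    ys = [math.log(count) for _, _, count in table]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical tag log, built so that count = 60 / rank (an idealized Zipf curve).
sample_tags = (
    ["web"] * 60 + ["design"] * 30 + ["css"] * 20
    + ["to_read"] * 15 + ["python"] * 12 + ["maps"] * 10
)

table = rank_frequency(sample_tags)
slope = loglog_slope(table)

for rank, tag, count in table:
    print(rank, tag, count)
print("log-log slope:", round(slope, 3))
```

Real tag logs will be noisier than this toy list, and a straight-line fit on a log-log plot is only a rough check, not a proper test of a power law.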
That way lies Babel. Just an infinite regress as we recapitulate what has already been built many times. Right? Well, no. As much as I dis the fallacy of tags as a basis for the one-semantics-to-rule-them-all, there is enormous potential in watching the evolution of social language as a live process, as a matter of behavior, using tagging systems as the agar plate. Even better (OK, I have an ax to grind), we can experiment with the economic implications of that evolution.
Because the Long Tail curve is so fascinatingly simple, it's easy to overlook that it summarizes the outcome of a plethora of small Zipf's law curves, each a snapshot of the activity of a community of interest of some sort (which Chris Anderson is calling 'minitails' at this very moment). As open networks and media allow the freedom to individuate one's information consumption and creation, more and more of these virtual communities congeal, and create their own languages. Now we can watch.
Anyone who has worked in a group knows they evolve their own languages. A fully spun up product team can speak in private code words in an open restaurant - even in Silicon Valley - with a good chance of being incomprehensible to the competitors at the next table. So with virtual communities, from geocachers to shooters as well as the current del.icio.us crowd. Tags may let us watch that messy process. And if you are trying to figure out what is actually going on there, to serve that community with useful goods and services - you'd be better served by understanding that community lingo than by approaching with one-size-fits-all semantics (or products, or services, for that matter).
One specific intriguing bit emerging from del.icio.us is the to_read tag. Its mere existence is a rebuke to the taxonomic view of tagging - it proposes an act. Anyone else out there remember Winograd and Flores' take on speech act theory? Nice academic theory, transmuted into commercial disaster in the form of a 'groupware' product called The Coordinator. Turns out people don't like to be told how to speak (fancy that). Do I detect the beginnings of a process by which the speech acts may emerge from the community itself? How about a service to facilitate that?