
January 09, 2005



I’m considering another “requirement”: that it be usable today without having to modify blogging tools. The biggest hurdle I can see is any scheme that needs changes to a back-end API (e.g. the Blogger API or MetaWeblog API) before it can be used. Why not do something that can be entered by hand into the blog body today?

I’ve been digging a bit more into “Semantic XHTML” (which Kevin Marks brought to our attention). The (oversimplified) idea appears to be this: hack the “id” attribute on markup elements and the “rel” attribute on the [a] element, then start defining sets of conventions (e.g. XOXO) for the values of those attributes and for what structure is implied by what kinds of nested markup.
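To make that concrete, here is a rough Python sketch of how a reader of such markup might pick up the “id” and “rel” values. The fragment, the id “translation-note”, the rel value “source” and the example.com URL are all made up for illustration; none of them is an agreed convention.

```python
import xml.etree.ElementTree as ET

# Hypothetical post body: the id and rel values are invented for this sketch.
POST = """
<div id="translation-note">
  <p>Translated from
    <a rel="source" href="http://example.com/original">the original post</a>.
  </p>
</div>
"""

def collect_conventions(xhtml):
    """Return (ids, rel_links) found in a well-formed XHTML fragment."""
    root = ET.fromstring(xhtml)
    # root.iter() walks the root element and all of its descendants.
    ids = [el.get("id") for el in root.iter() if el.get("id")]
    # Keep only [a] elements that actually carry a rel attribute.
    rel_links = [(a.get("rel"), a.get("href"))
                 for a in root.iter("a") if a.get("rel")]
    return ids, rel_links

ids, rel_links = collect_conventions(POST)
print(ids)
print(rel_links)
```

A real post would need to be well-formed XHTML for this parser to accept it, which is one more reason to care about validation.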

So, have a look at a test post I did (do a “View Source” and search for “rosettabot”).
It all comes through on the Atom feed.

The format here is for illustration only – NOT a proposal - just to show the KIND of thing that can be done, and how flexible it can be.

FWIW this enters easily in Blogger and validates as XHTML 1.0 Strict (er, except for a badly formed URL that Blogger generates, not my fault). My reading of the DTD says I should be able to use lists in the body of the document, so I don't see why I need the [object] element, but the validator complains if it's not there. However, it does serve nicely as a container, and allows the use of the [param] element for adding hidden name/value pairs.

What’s interesting is that the hacks which are the basis of “Semantic” XHTML actually allow some pretty flexible formats: using XPath specifications for the definition of the parameters (as opposed to doing DTD-like "profiles") could allow people flexibility in how they format the markup.
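As a sketch of the XPath idea, assuming a made-up fragment: the “profile” here is nothing more than a path expression that says where the name/value pairs live. Using “rosettabot” as an id is my assumption for the example, and the param names and values are invented. Python's ElementTree only supports a subset of XPath, but it is enough to show the shape of the thing.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the id, param names and values are assumptions.
FRAGMENT = """
<div>
  <object id="rosettabot">
    <param name="source-lang" value="fr" />
    <param name="source-url" value="http://example.com/original" />
    <ul><li>Visible note for human readers</li></ul>
  </object>
</div>
"""

# The whole "profile": a path locating the hidden name/value pairs.
PARAM_PATH = ".//object[@id='rosettabot']/param"

def read_params(xhtml, path=PARAM_PATH):
    """Return the name/value pairs selected by the path expression."""
    root = ET.fromstring(xhtml)
    return {p.get("name"): p.get("value") for p in root.findall(path)}

params = read_params(FRAGMENT)
print(params)
```

Someone who prefers a different container could, in principle, publish a different path (say, one that matches any element carrying that id) and leave the reading code alone.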

Tomorrow maybe I'll work on translating some bike racing news.

These guys might be interested in hosting the profile: gmpg. Can’t get much lighter weight.

As a step toward making it easier to add structured translation metadata, some kind of JavaScript aid, similar to the existing pseudo-rich-text formatting aids in comment edit boxes, could be added to the template.
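The aid itself would be JavaScript living in the template. The sketch below is Python instead, and only shows the string-building step such an aid might perform: turn a handful of name/value pairs into a block that can be pasted into the post body. The function name, the id and the param names are all hypothetical.

```python
from xml.sax.saxutils import quoteattr

def metadata_block(block_id, params):
    """Build an [object] container holding hidden name/value pairs."""
    # quoteattr() escapes &, < and > and wraps the value in quotes.
    lines = [f"<object id={quoteattr(block_id)}>"]
    for name, value in params.items():
        lines.append(
            f"  <param name={quoteattr(name)} value={quoteattr(value)} />"
        )
    lines.append("</object>")
    return "\n".join(lines)

# Hypothetical metadata for a translated post.
block = metadata_block(
    "rosettabot",
    {"source-lang": "fr", "source-url": "http://example.com/a?x=1&y=2"},
)
print(block)
```

Escaping matters here because an unescaped ampersand in a URL is exactly the kind of thing that makes a post fail validation.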


Tim Oren

See my trial translation post. Has a regular A type link to the original in the title (want to see if anyone complains of that breaking their RSS reader). Also has the same information as a twiddle to your OBJECT format embedded at the beginning of the post. Typepad seems to gobble up the latter without complaint, though inserts some spurious extra spaces.
