November 22, 2002
I have a radical proposal for a ubiquitous content syndication format, applicable to almost any purpose, but extremely well suited to weblogs. It's extremely simple to implement, either by software or by hand, already works in millions of clients that are very forgiving of malformed or omitted data, and is human readable in both its source and output forms. Even better, it doesn't require any additional work to create the syndication format when creating your website.
My new syndication format is called XHTML. I propose that existing syndication and aggregation clients should be able to read an HTML file, detect if it has the appropriate XHTML doctype, and then render the contents of each XHTML node in the appropriate place in the client's display. All that would be needed is standardization of names and classes for page elements like DIVs and headers. A post/entry title would always be an H3, with a class set to "title", for example. Permanent links would always be P tags with their classes set to "permalink". Simple.
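To make the idea concrete, here is a minimal sketch in Python of what such a client might do: check for an XHTML doctype, then pull out titles and permalinks by class name. Only the `title` and `permalink` classes come from the proposal above; the wrapping `div` with class `entry`, the sample page, and the function names are assumptions made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical page. The "entry" wrapper class is an assumption; the post
# itself only names the "title" and "permalink" classes.
SAMPLE_PAGE = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example weblog</title></head>
  <body>
    <div class="entry">
      <h3 class="title">First post</h3>
      <p>Hello, world.</p>
      <p class="permalink"><a href="/archive/1.html#post1">link</a></p>
    </div>
    <div class="entry">
      <h3 class="title">Second post</h3>
      <p>More to say.</p>
      <p class="permalink"><a href="/archive/2.html#post2">link</a></p>
    </div>
  </body>
</html>"""


def has_xhtml_doctype(source):
    """Return True if the text begins with a public XHTML doctype."""
    pattern = r'\s*<!DOCTYPE\s+html\s+PUBLIC\s+"-//W3C//DTD XHTML'
    return re.match(pattern, source) is not None


def has_class(element, name):
    """Return True if `name` is one of the element's space-separated classes."""
    return name in element.get("class", "").split()


def extract_entries(source):
    """Return a list of {"title", "permalink"} dicts, or [] if not XHTML."""
    if not has_xhtml_doctype(source):
        return []
    root = ET.fromstring(source)
    entries = []
    for div in root.iter(XHTML_NS + "div"):
        if not has_class(div, "entry"):
            continue
        title = next((h for h in div.iter(XHTML_NS + "h3")
                      if has_class(h, "title")), None)
        permalink = next((p for p in div.iter(XHTML_NS + "p")
                          if has_class(p, "permalink")), None)
        link = permalink.find(XHTML_NS + "a") if permalink is not None else None
        entries.append({
            "title": "".join(title.itertext()).strip() if title is not None else "",
            "permalink": link.get("href", "") if link is not None else "",
        })
    return entries
```

Calling `extract_entries(SAMPLE_PAGE)` yields one dictionary per entry. A strict XML parser like this one rejects pages that are not well formed, so a real client would probably need a more forgiving parser for pages that only claim to be XHTML.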
Content authors shouldn't have to make two versions of all their content just because people are lazy in the way they make their client software. Valid XHTML is a hierarchical outline of content, presented in a machine-parsable manner. Augment this XHTML with proper use of link tags for navigation, and the page cruft that surrounds the content on a typical HTML page could be dropped without being missed at all. Even better, an XQuery-based search engine could give you a Google-style engine that returned relevant entries from a site instead of entire pages, rewarding people who make the effort by making them participants in a new, better-targeted web that's fully backwards compatible.
Existing pages could probably be rewritten by proxies if the authors are unable or unwilling to reflag their content. Simply iterate through the nodes in the body of the document, find the highest-level node that repeats and contains other content, and you've got the pattern that delimits individual entries. Or look for named anchors (#) that suggest they're permalinks. Then transform those through XSLT into elements with a predictable set of names.
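The "highest-level node that repeats" heuristic can be sketched in a few lines of Python. This is only one possible reading of it, under stated assumptions: two nodes "repeat" when they share a tag and class, nodes carrying an `id` are treated as unique and skipped, and the sample markup and function names are invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter, deque

# Hypothetical legacy page body; the "blogbody" class name is made up.
SAMPLE_BODY = """<body>
  <div id="banner"><h1>Example weblog</h1></div>
  <div id="content">
    <div class="blogbody"><h3>First post</h3><p>Hello.</p></div>
    <div class="blogbody"><h3>Second post</h3><p>More.</p></div>
    <div class="blogbody"><h3>Third post</h3><p>Still more.</p></div>
  </div>
</body>"""


def signature(element):
    """Assumption: nodes 'repeat' when tag and class attribute both match."""
    return (element.tag, element.get("class", ""))


def find_repeating_entries(body):
    """Breadth-first search for the shallowest group of sibling nodes that
    share a signature and contain child elements of their own."""
    queue = deque([body])
    while queue:
        parent = queue.popleft()
        # Candidates hold other elements and have no id (ids are unique,
        # so an id-bearing node is treated as page structure, not an entry).
        candidates = [child for child in parent
                      if len(child) > 0 and child.get("id") is None]
        counts = Counter(signature(c) for c in candidates)
        if counts:
            best, count = counts.most_common(1)[0]
            if count >= 2:
                return [c for c in candidates if signature(c) == best]
        # Nothing repeats at this level, so look one level deeper.
        queue.extend(parent)
    return []
```

With the sample above, `find_repeating_entries(ET.fromstring(SAMPLE_BODY))` returns the three `blogbody` divs. The breadth-first order is what makes the match the highest-level one; pages that give each entry its own `id`, or that repeat navigation blocks above the entries, would need a smarter rule.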
So, the proposal? A documented, standard set of XHTML element and class names for page elements, so that HTML can serve as a syndication, aggregation, and distribution format in addition to being a page rendering format. The side benefit would be that any tool that produced compliant code would probably also be able to share style sheets with other compliant tools, as page elements with the same name would pick up the appropriate styles. A lot easier than forcing tools to output multiple versions of content each time a page is changed. And adaptable to situations like a newspaper, so that articles using the naming convention could also appear in an aggregator.
Rich XML-based descriptions of content are great, and will always have their place. But for something as simple as syndication and distribution, HTML already has an overwhelming advantage over any nascent formats. Who wants to propose a set of basic tags?
I don't pretend to understand the various formats and protocols that make this, and other, sites work. I have a...
How much power should we allow the CLASS attribute to have in providing semantic structure? <h3 class="BlogPostTitle">First p0st!</h3> Consider the...
Here's a proposal on a standard XHTML format for blogs to do away with RSS. It makes sense. Why bother...
Anil Dash proposes to use XHTML itself for content syndication. He calls for a standardised way of posting entries to a weblog - "All that would be needed is standardization of names and classes for page elements like DIVs and headers. A post/entry tit...
Interesting conversation happening at dashes.com re. Anil's suggestion that XHTML would make a perfect syndication format. There are posters there...
XHTML vs. the World, as seen by Tantek Celik. Base of the whole article is against RSS, and how using...
It looks like people are starting to wake up to the notion that XML is, well, extensible. You don't need separate syndication and archiving formats. You don't need separate syndication and display formats. The most extreme example I have seen to d...
I've been reading a lot lately about using XHTML instead of RSS for syndication of a site in a new...