syndication formats?

I have a radical proposal for a ubiqitous content syndication format, applicable for almost any purpose, but extremely well suited for weblogs. It's extremely simple to implement, either by software or by hand, works already in millions of clients that are very forgiving of misformed or omitted data, and is human readable both in source and output formats. Even better, it doesn't require any additional work to create the syndication format when creating your website.

My new syndication format is called XHTML. I propose that existing syndication and aggregation clients should be able to read an HTML file, detect if it has the appropriate XHTML doctype, and then render the contents of each XHTML node in the appropriate place in the client's display. All that would be needed is standardization of names and classes for page elements like DIVs and headers. A post/entry title would always be an H3, with a class set to "title", for example. Permanent links would always be P tags with their classes set to "permalink". Simple.

Content authors shouldn't have to make two versions of all their content just because people are lazy in the way they make their client software. Valid XHTML is a hierarchial outline of content, presented in a machine-parsable manner. Augment this XHTML with proper use of link tags for navigation, and the loss of the page cruft that surrounds the content on a typical HTML page wouldn't be missed at all. Even better, an XQuery-based search engine could give you a Google engine that returned relevant entries from a site, instead of an entire page, therefore rewarding people who go through the effort by making them participants in a new, better targeted web that's fully backwards compatible. Existing pages could probably be rewritten by proxies, if the authors are unable or unwilling to reflag their content. Simply iterate through the nodes in the body of the document, find the highest-level node that repeats and contains other content, and you've got the pattern that delimits individual entries. Or look for # named anchors that suggest that they're permalinks. Transform those through XSLT into elements with a predictable set of names.

So, the proposal? A documented standard set of XHTML element names targeted at standardizing class names for page elements, in order to allow HTML to serve as a syndication, aggregation, and distribution format, in addition to being a page rendering format. The side benefit would be that any tool that produced compliant code would probably also be able to share style sheets with other compliant tools, as page elements with the same name would inherit the appropriate styles. A lot easier than forcing tools to output multiple versions of content each time a page is changed. And adaptable to situations like a newspaper, so that articles using the naming convention could also appear in an aggregator.

Rich XML-based descriptions of content are great, and will always have their place. But for something as simple as syndication and distribution, HTML already has an overwhelming advantage over any nascent formats. Who wants to propose a set of basic tags?

Explore This Site

Recent Entries

  • Monoculture Is Bad For Business

    It's been demonstrated over and over again, but businesses refuse to learn the lesson: Homogeneity is its own punishment in the world of business. From...

  • The Difference Between Lemons and Limes

    A few weeks ago, I asked the people who follow my Twitter account to describe the difference between lemons and limes. My immediate prompt was...

  • Phones are For Hardcore Gamers

    Please (re-)visit Dan Cook's seminal Nintendo's Genre Innovation Strategy essay from 2005. It's chock-full of his signature revelatory insights, in this case inspired by the...

  • How To Get Windows

    If you'd like to open up the package for your licensed copy of Microsoft Windows Vista, you only need to follow these three helpfully-illustrated steps....

  • Fonts for Contemporary Use

    In a blog post that I wrote for work today, I had occasion to use an interrobang as part of a title. Hooray! A chance...

1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009
  Jan Jan Jan Jan Jan Jan Jan Jan Jan Jan
  Feb Feb Feb Feb Feb Feb Feb Feb Feb  
  Mar Mar Mar Mar Mar Mar Mar Mar Mar  
  Apr Apr Apr Apr Apr Apr Apr Apr Apr  
  May May May May May May May May May  
  Jun Jun Jun Jun Jun Jun Jun Jun Jun  
Jul Jul Jul Jul Jul Jul Jul Jul Jul Jul  
Aug Aug Aug Aug Aug Aug Aug Aug Aug Aug  
Sep Sep Sep Sep Sep Sep Sep Sep Sep Sep  
Oct Oct Oct Oct Oct Oct Oct Oct Oct Oct  
Nov Nov Nov Nov Nov Nov Nov Nov Nov Nov  
Dec Dec Dec Dec Dec Dec Dec Dec Dec Dec