Monday, August 24, 2009

Dereferenceable Identifiers

A document URL is a dereferenceable document identifier. We use URLs all over the Web to identify HTML pages and other web resources. When you can't give out a brochure, you can share a URL. Instead of sending a large email attachment, you might just send a URL. Rather than creating long appendixes, you can simply link to other resources. It is far more useful to pass around URLs than to transfer entire documents.

This model has worked well for documents and is now being adopted for other types of resources. With the popularity of XML, using URLs to identify data resources is now commonplace. Rather than passing around a complete record, agents pass around an identifier that can be used to look up the record later. By using a URL as the identifier, these agents don't need to be tied to any single dataset and are much more reusable.
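As a rough illustration of the idea above, here is a minimal Python sketch of an agent that carries only a URL and dereferences it when the record is actually needed. Everything in it is hypothetical: the `FAKE_WEB` dictionary stands in for the Web so the example runs without network access, and the `Message`, `dereference` and `handle` names and the example.org URL are made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical in-memory stand-in for the Web: URL -> record.
FAKE_WEB: Dict[str, dict] = {
    "http://example.org/records/42": {"name": "Ada", "role": "engineer"},
}


def dereference(url: str) -> dict:
    """Look up the record a URL identifies (simulated, no network access)."""
    return FAKE_WEB[url]


@dataclass
class Message:
    # The agent passes around only the identifier, not the record itself.
    record_url: str


def handle(message: Message, fetch: Callable[[str], dict] = dereference) -> str:
    """Resolve the identifier at the point where the record is needed."""
    record = fetch(message.record_url)
    return f"{record['name']} ({record['role']})"


summary = handle(Message("http://example.org/records/42"))
print(summary)
```

Because `handle` only knows about URLs and a fetch function, the same agent could work against any dataset that publishes its records at URLs, which is the reuse argument made above.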

The HTML5 standardization process has given rise to a debate on the usefulness of URLs as model identifiers. Most people agree that a URL is a good way to identify documents, web resources and data resources. However, the debate continues on the usefulness of a URL as an identifier within a model vocabulary. One side claims that a model vocabulary should be centralized and therefore does not require the flexibility of a URL. The other side claims that the model vocabulary should be extensible and requires the universal identifying scheme that URLs provide.

To understand the potential usefulness of a URL as a model identifier, consider the difference in behaviour between a missing DTD and a missing Java class. A DTD is identified using a URL and a Java class is not. When an XML validator encounters a DTD it does not recognize, it dereferences the identifier and uses the resulting model to process the XML document. When a JVM encounters a Java class it cannot find, it throws an exception, often terminating the entire process. Now consider how much easier programming would be if a programming environment used URLs for classes and model versions. Dependency management would become as simple as managing import statements. As the Web becomes the preferred programming environment of the future, we must consider these basic programming concerns.
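The contrast can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular XML validator or JVM is implemented: `ClosedRegistry`, `DereferencingRegistry`, `UnknownModelError` and the `MODEL_WEB` dictionary (a stand-in for the Web, so nothing touches the network) are all hypothetical names.

```python
from typing import Callable, Dict


class UnknownModelError(Exception):
    """Raised when a name-based lookup has no way to find a model."""


class ClosedRegistry:
    """Name-based lookup, like a class path: unknown names are fatal."""

    def __init__(self, models: Dict[str, str]) -> None:
        self._models = dict(models)

    def get(self, name: str) -> str:
        try:
            return self._models[name]
        except KeyError:
            raise UnknownModelError(name) from None


class DereferencingRegistry:
    """URL-based lookup: an unknown model is fetched from its identifier."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch
        self._cache: Dict[str, str] = {}

    def get(self, url: str) -> str:
        if url not in self._cache:
            # The identifier itself says where the model can be found.
            self._cache[url] = self._fetch(url)
        return self._cache[url]


# Hypothetical stand-in for the Web: URL -> model definition.
MODEL_WEB = {"http://example.org/dtd/note.dtd": "<!ELEMENT note (#PCDATA)>"}

closed = ClosedRegistry({})
try:
    closed.get("com.example.Note")
    closed_result = "found"
except UnknownModelError:
    closed_result = "failed"

open_registry = DereferencingRegistry(MODEL_WEB.__getitem__)
open_result = open_registry.get("http://example.org/dtd/note.dtd")
```

The closed registry can only fail when it meets a name it was not shipped with, while the dereferencing registry recovers by fetching the model its identifier points to.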

Although I enjoy working in abstractions, I certainly understand how things always get more complicated when you go meta: using URLs to describe other URLs. However, this complexity is essential to maintaining the flexibility and extensibility of the Web.

See Also: HTML5/RDFa Arguments

