Thursday, January 15, 2009

Validating RDF

RDFS/OWL is criticized for its weak ability to validate documents, in contrast to XML, which has many mature validation tools.

A common source of confusion in RDF is the rdfs:range and rdfs:domain properties. They do not constrain a statement; they license an inference: a property value can always be assumed to have the type given by rdfs:range, and the subject the type given by rdfs:domain. This is very different from XML, which only has rules to validate tags and cannot conclude anything. Many of the predicates in RDF are used for similar inferencing, but they lack any way to validate or check whether a statement really is true. Validation is a critical feature for data interchange, which RDF is otherwise well suited for.
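To make the difference concrete, here is a minimal sketch of rdfs:range inference over plain tuples. The triples, the ex: names and the infer_range_types helper are all made up for illustration; a real reasoner does far more. The point is that a value of the "wrong" type is not rejected, it simply gains another type.

```python
# Triples are plain (subject, predicate, object) tuples; all names are hypothetical.
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"


def infer_range_types(triples):
    """Return the rdf:type triples entailed by rdfs:range that are not already stated."""
    # Collect the declared range classes for each property.
    ranges = {}
    for s, p, o in triples:
        if p == RDFS_RANGE:
            ranges.setdefault(s, set()).add(o)
    # Every value of a property is inferred to be an instance of its range classes.
    inferred = set()
    for s, p, o in triples:
        for cls in ranges.get(p, ()):
            inferred.add((o, RDF_TYPE, cls))
    return inferred - set(triples)


graph = [
    ("ex:author", RDFS_RANGE, "ex:Person"),
    ("ex:book1", "ex:author", "ex:acme"),   # probably a data entry mistake
    ("ex:acme", RDF_TYPE, "ex:Company"),
]
inferred = infer_range_types(graph)
print(inferred)
```

Nothing here flags ex:acme as an error. The graph now says it is both a company and a person, which is exactly the behaviour that makes range and domain unsuitable as validation rules.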

To address this limitation, an RDF graph can be sorted and serialized into RDF/XML. With a little organization of statements, such as grouping by subject, and controlled serialization, common XML validation tools can be applied to a more formal RDF/XML document. Our validation was done with relatively small graphs and we restricted the use of BNodes to specific statements to ensure similarly structured data would produce similar XML documents.
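The sorting and grouping step might look something like the sketch below. It is not the code we used; it assumes no blank nodes, a hypothetical http://example.org/vocab# vocabulary, and a made-up to_rdfxml helper built on Python's standard ElementTree module. It only shows that sorting statements and grouping them by subject makes the output independent of the order in which the graph happens to be stored.

```python
import xml.etree.ElementTree as ET
from itertools import groupby

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/vocab#"  # hypothetical vocabulary

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("ex", EX_NS)


def split_iri(iri):
    """Split a property IRI into (namespace, local name) at the last '#' or '/'."""
    cut = max(iri.rfind("#"), iri.rfind("/")) + 1
    return iri[:cut], iri[cut:]


def to_rdfxml(triples):
    """Serialize (subject, predicate, object, is_iri) tuples in a fixed order."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    ordered = sorted(set(triples))  # fixed order: subject, then predicate, then object
    for subject, group in groupby(ordered, key=lambda t: t[0]):
        # One rdf:Description per subject, never nested.
        node = ET.SubElement(root, f"{{{RDF_NS}}}Description",
                             {f"{{{RDF_NS}}}about": subject})
        for _, predicate, obj, is_iri in group:
            ns, local = split_iri(predicate)
            prop = ET.SubElement(node, f"{{{ns}}}{local}")
            if is_iri:
                prop.set(f"{{{RDF_NS}}}resource", obj)
            else:
                prop.text = obj
    return ET.tostring(root, encoding="unicode")


triples = [
    ("http://example.org/book/2", EX_NS + "title", "Second", False),
    ("http://example.org/book/1", EX_NS + "author", "http://example.org/person/7", True),
    ("http://example.org/book/1", EX_NS + "title", "First", False),
]
xml_text = to_rdfxml(triples)
print(xml_text)
```

Feeding the same statements in a different order gives the same document, so a schema written against one export keeps working for the next.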

Although TriX, a more formal XML serialization of RDF, could also have been used, we considered the format it produces harder for validation tools to work with.

With a controlled RDF/XML structure we were able to apply RELAX NG (RNG) to validate structure before accepting foreign data, and to automate the export into more controlled formats using XSLT. (We used a rule engine for state validation.) Although RDF is a great way to interchange data against a changing model, XML is still better over the last mile for restricting the vocabulary of the data accepted.
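As a rough illustration of what "restricting the vocabulary" means, the sketch below rejects any property element outside a closed list. This is not RELAX NG and not our schema; it is a standard-library stand-in, and the ALLOWED_PROPERTIES set, the example.org vocabulary and the vocabulary_errors helper are hypothetical. In practice the same rule would live in the schema and be checked by a RELAX NG validator.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/vocab#"  # hypothetical vocabulary
# Hypothetical closed list of accepted properties.
ALLOWED_PROPERTIES = {f"{{{EX_NS}}}title", f"{{{EX_NS}}}author"}


def vocabulary_errors(xml_text):
    """List structural problems in a flat, rdf:Description-only RDF/XML document."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{RDF_NS}}}RDF":
        return [f"unexpected root element {root.tag}"]
    errors = []
    for node in root:
        if node.tag != f"{{{RDF_NS}}}Description":
            errors.append(f"unexpected node element {node.tag}")
            continue
        for prop in node:
            if prop.tag not in ALLOWED_PROPERTIES:
                errors.append(f"property not in vocabulary: {prop.tag}")
    return errors


sample = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:ex="{EX_NS}">
  <rdf:Description rdf:about="http://example.org/book/1">
    <ex:title>First</ex:title>
    <ex:price>10</ex:price>
  </rdf:Description>
</rdf:RDF>"""
problems = vocabulary_errors(sample)
print(problems)
```

The ex:price statement is perfectly legal RDF, yet it is refused here because it is not part of the agreed vocabulary. That is the kind of check the XML toolchain gives you almost for free once the serialization is predictable.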


2 comments:

  1. Hi James, we are trying to figure out the same issue at the Canadian Writing and Research Collaboratory. Would it be possible for you to share the RNG you came up with?

    1. Thanks for asking, but the RELAX NG schema we used is not available. However, if you start with the basic RDF/XML schema (about 200 lines) you can customize it from there. You'll want to start by replacing the local elements with properties from your own vocabulary.

      http://www.w3.org/TR/rdf-syntax-grammar/#section-RELAXNG-Schema
