Mapping Microdata to RDF

From W3C Wiki

This page describes how microdata content can be consumed by a consumer whose back-end systems are based on an RDF (or RDF-like) model, as part of the work of the HTML Data TF.

Transformation description moved to ReSpec document

Property URI generation

Microdata allows properties to be specified as simple names, which then have a URI generation rule applied to them. As different vocabularies have different requirements for property URIs, the idea is to provide a way to inform the processor of how to generate URIs, and have the processor fall back to a specific URI generation strategy if no other information is available.

There are different strategies for generating property URIs from names:

hashSlash
Infer the vocabulary from the @itemtype and append the name to the resulting vocabulary URI. This takes advantage of the typical RDF practice of using a flat namespace for classes and properties: the class name is removed from the @itemtype URI, and the property name is appended in its place. For example, if the type were http://schema.org/Thing, the property 'name' would be http://schema.org/name. Items without an @itemtype inherit a type from their context. For an item with no type at all (explicit or inherited), the name is appended to the document base URI after a '#'. For example, if the document had a base of http://example.com/doc, the property 'name' would yield http://example.com/doc#name.
fragID
Append the name to the @itemtype URI as a fragment identifier. For example, given the URI http://microformats.org/profile/hcard as the type, the property 'fn' would result in the following URI: http://microformats.org/profile/hcard#fn. Note that this is only possible if the type does not already include a '#'; a type that does would result in an error and/or no generated property URI.
contextual
Append the name to a combination of @itemtype and the property path, and ensure that property URIs generated from names are distinct from explicit property URIs. For example, given the type http://microformats.org/profile/hcard, the property 'fn' would result in http://www.w3.org/1999/xhtml/microdata#http://microformats.org/profile/hcard#:%23fn. However, if there is an intervening item without a type, the processor would construct a different URI that records the path. Assuming an intervening property 'foo', the resulting URI would be http://www.w3.org/1999/xhtml/microdata#http://microformats.org/profile/hcard#:%23foo%20fn.
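
The hashSlash and fragID strategies described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names hash_slash and frag_id are hypothetical, the caller is assumed to pass the inherited type when an item has no @itemtype of its own, and returning None stands in for the "error and/or no generated property URI" case. The contextual strategy is left out because its escaping rules are not fully specified here.

```python
from urllib.parse import urldefrag


def hash_slash(name, item_type=None, base=None):
    """hashSlash: drop the class name from the type URI, append the property name."""
    if item_type is None:
        # No explicit or inherited type: use the document base plus '#'.
        return f"{urldefrag(base).url}#{name}"
    # Keep everything up to and including the last '#' or '/'.
    cut = max(item_type.rfind("#"), item_type.rfind("/")) + 1
    return item_type[:cut] + name


def frag_id(name, item_type):
    """fragID: append the name to the type URI as a fragment identifier."""
    if "#" in item_type:
        # Assumption: signal "no generated property URI" with None.
        return None
    return f"{item_type}#{name}"


print(hash_slash("name", "http://schema.org/Thing"))
print(hash_slash("name", base="http://example.com/doc"))
print(frag_id("fn", "http://microformats.org/profile/hcard"))
```

The three calls reproduce the examples given in the strategy descriptions: http://schema.org/name, http://example.com/doc#name and http://microformats.org/profile/hcard#fn.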

These strategies can be the value of a _propertyURIGeneration_ parameter added to the initial evaluation context.

Vocabulary-specific URI generation

A registry may associate different vocabularies with property URI generation schemes, for example:

<http://schema.org/> a :Vocabulary; :propertyURIscheme :hashSlash .
<http://microformats.org/profile/hcard> a :Vocabulary; :propertyURIscheme :contextual .

A vocabulary-aware processor could then change URI generation schemes when encountering @itemtype URIs contained in the registry, and fall back to a default setting otherwise.
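
A registry lookup of this kind might look like the following sketch. The dictionary layout, the scheme_for function, the longest-prefix matching rule and the choice of fragID as the default are all assumptions made for illustration; the page does not fix any of them.

```python
# Hypothetical in-memory registry: vocabulary URI -> property URI scheme.
REGISTRY = {
    "http://schema.org/": "hashSlash",
    "http://microformats.org/profile/hcard": "contextual",
}

DEFAULT_SCHEME = "fragID"  # assumed fallback; a processor could pick another


def scheme_for(item_type, registry=REGISTRY, default=DEFAULT_SCHEME):
    """Return the scheme registered for the vocabulary that item_type belongs to."""
    matches = [vocab for vocab in registry if item_type.startswith(vocab)]
    if not matches:
        return default
    # Prefer the most specific (longest) matching vocabulary URI.
    return registry[max(matches, key=len)]


print(scheme_for("http://schema.org/NewsArticle"))
print(scheme_for("http://example.com/vocab/Widget"))
```

Matching on a URI prefix lets a single registry entry cover every type in a vocabulary, while unknown types fall through to the default.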

Multiple types for an item

TBD.

Examples

Additional examples can be added here.

An example of a http://schema.org/Organization that is the provider, publisher and copyrightHolder of a http://schema.org/NewsArticle. When converting this sample to RDF, note that the "itemid" of the Organization item happens to be the same URL that the same item supplies for a property expecting a URL (the "url" property of http://schema.org/Thing in this case). The "url" property of http://schema.org/Thing is meant to take a URL as its value, not a http://schema.org/Organization.

<body itemscope="itemscope" itemtype="http://schema.org/NewsArticle"
  itemid="http://www.businesswire.com/news/home/20110106006854/en">
...
<span itemprop="provider publisher copyrightHolder" itemscope="itemscope"
          itemtype="http://schema.org/Organization" itemid="http://businesswire.com">
  <meta itemprop="name" content="Business Wire"/>
  <a itemprop="url" href="http://www.businesswire.com">
     <img itemprop="image"
              src="http://www.businesswire.com/images/Powered-by-Business-Wire.gif"
              title="Business Wire is the leading source for full-text breaking news and press releases, 
              multimedia and regulatory filings for companies and groups throughout the world"
              alt="Powered by Business Wire"/>
  </a>
</span>
...
</body>


The resulting RDF from this example is:

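@prefix schema: <http://schema.org/> .

<http://www.businesswire.com/news/home/20110106006854/en> a schema:NewsArticle;
 schema:provider <http://businesswire.com>;
 schema:publisher <http://businesswire.com>;
 schema:copyrightHolder <http://businesswire.com> .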


<http://businesswire.com> a schema:Organization;
   schema:image <http://www.businesswire.com/images/Powered-by-Business-Wire.gif>;
   schema:name "Business Wire";
   schema:url <http://www.businesswire.com> .
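
The @itemprop attribute in the sample carries three space-separated names, each of which names a separate property with the same value. A minimal sketch of that expansion follows, assuming the names resolve against http://schema.org/ (as the hashSlash strategy would give for this type) and that the object is the nested item's itemid; the helper expand_itemprop is hypothetical.

```python
SCHEMA = "http://schema.org/"


def expand_itemprop(subject, itemprop, obj, vocab=SCHEMA):
    """Return one (subject, predicate, object) triple per name in itemprop."""
    return [(subject, vocab + name, obj) for name in itemprop.split()]


triples = expand_itemprop(
    "http://www.businesswire.com/news/home/20110106006854/en",
    "provider publisher copyrightHolder",
    "http://businesswire.com",
)
for s, p, o in triples:
    print(f"<{s}> <{p}> <{o}> .")
```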