<?xml version="1.0" encoding="UTF-8"?><!--*- nxml -*-->
|
|
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
|
|
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
|
|
<html xmlns="http://www.w3.org/1999/xhtml">
|
|
<head profile="http://www.w3.org/2003/g/data-view">
|
|
<title>Semantic Web Data Integration with hCalendar and GRDDL</title>
|
|
<meta name="copyright"
|
|
content="Copyright © 2005 W3C (MIT, ERCIM, Keio)
|
|
$Revision: 1.33 $" />
|
|
<meta name="Date" content="$Date: 2007/07/10 15:27:52 $" />
|
|
<link rel="transformation"
|
|
href="http://www.w3.org/2002/12/cal/glean-hcal.xsl" />
|
|
<link rel="stylesheet" href="http://www.w3.org/Talks/Tools/Slidy/slidy.css"
|
|
type="text/css"
|
|
media="screen, projection, print" />
|
|
<link rel="stylesheet" href="http://www.w3.org/Talks/Tools/Slidy/w3c-blue.css"
|
|
type="text/css" media="screen, projection, print" />
|
|
<script src="http://www.w3.org/Talks/Tools/Slidy/slidy.js"
|
|
type="text/javascript">
|
|
</script>
|
|
|
|
<style type="text/css">
|
|
.source { text-align: right; font-size: smaller }
|
|
.footnote { font-size: smaller }
|
|
div.figure { text-align: center }
|
|
pre b { color: blue }
|
|
blockquote { border-left: double; padding-left: 1em; font-style: italic; text-align: justify }
|
|
</style>
|
|
</head>
|
|
<body>
|
|
|
|
<div class="background">
|
|
<img alt="" id="head-icon"
|
|
src="http://www.w3.org/Talks/Tools/Slidy/icon-blue.png" />
|
|
<object id="head-logo"
|
|
data="http://www.w3.org/Talks/Tools/Slidy/w3c-logo-blue.svg"
|
|
type="image/svg+xml" title="W3C logo">
|
|
<a href="http://www.w3.org/"><img
|
|
alt="W3C logo" id="head-logo-fallback"
|
|
src="http://www.w3.org/Talks/Tools/Slidy/w3c-logo-blue.gif" /></a>
|
|
</object>
|
|
</div>
|
|
|
|
<div class="background slanty">
|
|
<img src="http://www.w3.org/Talks/Tools/Slidy/w3c-logo-slanted.jpg"
|
|
alt="slanted W3C logo" />
|
|
</div>
|
|
|
|
<div class="slide cover">
|
|
<img align="right"
|
|
src="http://www.w3.org/Talks/Tools/Slidy/keys.jpg"
|
|
alt="Cover page images (keys)" class="cover" />
|
|
|
|
<h1>Semantic Web Data Integration with hCalendar and GRDDL</h1>
|
|
|
|
<address class="vcard">
|
|
<a class="url fn n" href="http://www.w3.org/People/Connolly/">
|
|
<span class="given-name">Dan</span>
|
|
<span class="family-name">Connolly</span></a><br />
|
|
</address>
|
|
|
|
<div class="vevent">
|
|
<a class="url summary" href="http://2005.xmlconference.org/">XML
|
|
Conference &amp; Exposition 2005 | From Syntax to Semantics (XML
|
|
2005)</a><br />
|
|
<span class="location">Atlanta, GA, USA</span><br />
|
|
<abbr class="dtstart" title="2005-11-16">16 November 2005</abbr>
|
|
</div>
|
|
</div>
|
|
|
|
|
|
<div class="slide"><h1>Toward Open Data</h1>
|
|
<ul>
|
|
<li><blockquote><p>I want my data back.</p>
|
|
<div class="source"><cite>Jon Bosak circa 1997</cite></div>
|
|
</blockquote></li>
|
|
<li><blockquote>
|
|
<p>I've long believed that customers of any application own the
|
|
data they enter into it.</p>
|
|
<div class="source"><cite><a
|
|
href="http://www.veen.com/jeff/archives/000810.html">Jeffrey Veen 2
|
|
November 2005</a></cite></div>
|
|
</blockquote>
|
|
</li>
|
|
</ul>
|
|
|
|
<p><em>Is this what Web 2.0 is all about? If so, maybe it's not such a
|
|
bad thing.</em></p>
|
|
|
|
</div>
|
|
|
|
<div class="slide"><h1>Outline</h1>
|
|
|
|
<ol>
|
|
<li>A history of (X)HTML and the Web</li>
|
|
<li>Toward data integration, Web style</li>
|
|
<li>Semantic Web * iCalendar = RDF Calendar </li>
|
|
<li>GRDDL: Semantic Web data in XHTML</li>
|
|
<li>hCalendar and microformats</li>
|
|
<li>hCalendar * GRDDL = RDF Calendar</li>
|
|
<li>Microformats + SPARQL</li>
|
|
<li>Microformats + OWL, rules</li>
|
|
</ol>
|
|
|
|
<p><strong>Objective:</strong> making it cost-effective to
|
|
record and share knowledge <em>formally</em>, i.e. so that computers can
|
|
manipulate it.</p>
|
|
|
|
</div>
|
|
|
|
<div class="slide"><h1>Getting into the Web</h1>
|
|
<p>What was the tipping point— the <q>killer app</q>—<em>for you</em>?</p>
|
|
|
|
<ul class="incremental">
|
|
<li>for Tim Berners-Lee's colleagues at CERN: a cross-platform phone
|
|
directory</li>
|
|
<li>for me at Convex in 1991: a nifty hypertext system design:
|
|
<blockquote>
|
|
<pre>
|
|
Newsgroups: alt.hypertext
|
|
Subject: WorldWideWeb: Summary
|
|
Date: 6 Aug 91 16:00:12 GMT
|
|
|
|
WorldWideWeb - Executive Summary
|
|
|
|
|
|
The WWW project merges the techniques of information retrieval and hypertext to
|
|
make an easy but powerful global information system. ...
|
|
</pre>
|
|
</blockquote>
|
|
</li>
|
|
<li>in the early 1990s: NCSA What's New</li>
|
|
<li>these days: maps, auctions, wikipedia, webmail, ...</li>
|
|
</ul>
|
|
</div>
|
|
|
|
<div class="slide"><h1>Getting into the Web: downhill steps</h1>
|
|
|
|
<p>To grow, start with <em>some actual value</em> plus lots of
|
|
potential value and a downhill path for contributors:</p>
|
|
|
|
<ul class="incremental">
|
|
<li>for a little investment
|
|
<p>Note: licensing and legal agreements are <em>not little</em>.</p>
|
|
</li>
|
|
<li>a big return</li>
|
|
<li>due to network effects</li>
|
|
<li>Scaling
|
|
<ul class="incremental">
|
|
<li>yes, scalability to 10^9 nodes and up is important</li>
|
|
<li>but so is scalability down to families and scout troops</li>
|
|
</ul>
|
|
</li>
|
|
</ul>
|
|
|
|
<p><q>Architect for participation</q> -- <cite>Tim O'Reilly @ W3C's 10th anniversary</cite></p>
|
|
</div>
|
|
|
|
<div class="slide"><h1>Web Basics: URIs, HTTP, ...</h1>
|
|
<p>So what makes all that work? Much of it is a story for another day.</p>
|
|
<p>See <cite><a href="http://www.w3.org/TR/2004/REC-webarch-20041215/">Architecture of the World Wide Web, Volume One</a></cite><br />
|
|
W3C Recommendation 15 December 2004
|
|
</p>
|
|
</div>
|
|
|
|
<div class="slide"><h1>Web Basics: HTML</h1>
|
|
|
|
<ul>
|
|
<li>In the "garage" at CERN:<br />
|
|
NextStep RTF object meets SGML</li>
|
|
<li>IETF HTML Working Group:<br />
|
|
<em>Is <tt>&lt;p&gt;</tt> a container or a separator?</em></li>
|
|
<li>HTML Validation Service at Halsoft</li>
|
|
<li>W3C HTML Working Group:<br />
|
|
Netscape trades <tt>&lt;blink&gt;</tt> for Microsoft's <tt>&lt;marquee&gt;</tt><br />
|
|
HTML 3.2 is born</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
<div class="slide"><h1>Aside: CSS deployment</h1>
|
|
<p>... and the importance of test suites</p>
|
|
<ul class="incremental">
|
|
<li>Dec 1996: <a href="http://www.w3.org/TR/REC-CSS1-961217">Original CSS
|
|
Rec</a></li>
|
|
<li>Goodbye <tt>&lt;FONT&gt;</tt>, right?</li>
<li>At least <tt>&lt;FONT&gt;</tt> didn't crash my browser</li>
|
|
<li>
|
|
<blockquote>
|
|
<p>The experience of the CSS working group, with the CSS1 Test
|
|
Suite, and the many compliant and interoperable implementations
|
|
that followed, demonstrated fairly clearly that a simple test
|
|
suite is far far better than none at all.</p>
|
|
<div class="source"><cite><a
|
|
href="http://www.w3.org/Style/CSS/Test/testsuitedocumentation.html#something"
|
|
>CSS Test Suite Documentation</a></cite></div>
|
|
</blockquote>
|
|
</li>
|
|
</ul>
|
|
</div>
|
|
|
|
<div class="slide"><h1>HTML / SGML * XML = XHTML</h1>
|
|
<p>along with CSS, of course</p>
|
|
<ul>
|
|
<li>HTML 4.0: a vocabulary we can understand and agree on</li>
|
|
<li>XML 1.0: a syntax we can understand and agree on</li>
|
|
<li>CSS: a style sheet mechanism with graceful degradation</li>
|
|
<li>After ~5 years of single-pixel-gifs and other black magic...</li>
|
|
<li class="incremental">XHTML + CSS arrives<br />
|
|
<cite><a href="http://www.zeldman.com/dwws/">Designing With
|
|
Web Standards</a></cite> by Zeldman, May 2003
|
|
</li>
|
|
</ul>
|
|
</div>
|
|
|
|
|
|
<div class="slide"><h1>The Personal Information Disaster</h1>
|
|
|
|
<ul>
|
|
<li>For data, we are pre-web:
|
|
<dl>
|
|
<dt>Airline sends me email from their database</dt>
|
|
<dd>I copy and paste each item into my PDA</dd>
|
|
<dt>Soccer coach distributes a schedule</dt>
|
|
<dd>each player with an online calendar re-keys the data</dd>
|
|
</dl>
|
|
</li>
|
|
<li>
|
|
<blockquote>
|
|
<p>The bane of my existence is doing things I know the computer could do for me.</p>
|
|
<div class="source"><cite><a
|
|
href="http://www.nature.com/nature/webmatters/xml/xml.html">The XML Revolution</a></cite>, Nature Web Matters Oct 1998</div>
|
|
</blockquote>
|
|
</li>
|
|
<li>Let's find ways to make it cost-effective to record and share
|
|
knowledge <em>formally</em>, i.e. so that computers can manipulate
|
|
it.</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
<div class="slide"><h1>RDF Calendar for travel itineraries</h1>
|
|
<img align="right" alt="cal screen shot"
|
|
src="http://www.w3.org/2000/10/swap/pim/calIntShot.png" />
|
|
<ul>
|
|
<li>travel admin mails text itinerary
|
|
<ul>
|
|
<li>dumped from database, probably</li>
|
|
</ul>
|
|
</li>
|
|
<li>nasty perl script scrapes RDF statements in travel terms</li>
|
|
<li>logical rules convert to iCalendar vocabulary
|
|
<ul><li>mixing in airport timezone info</li></ul>
|
|
</li>
|
|
<li>toIcal.py converts RDF calendar to iCalendar .ics syntax</li>
|
|
</ul>
|
|
<p><em>But why not just spit out .ics from the nasty perl script?</em></p>
|
|
</div>
|
|
|
|
<div class="slide"><h1>Travel from place to place</h1>
|
|
|
|
<img align="right" alt="map of itinerary"
|
|
src="../../../../2003/04dc-mia/itin-mia.png" />
|
|
<ul>
|
|
<li>for details, see <a href="../../../../2000/10/swap/pim/travel">Integration Example: travel tools</a> in <cite><a href="../../../../2000/10/swap/doc/">Semantic Web Tutorial Using N3</a></cite> given at WWW2003 in Budapest</li>
|
|
</ul>
|
|
|
|
<p class="footnote">iCalendar does have location/geo fields, but only
|
|
one per event. Flights have a departure and an arrival.</p>
|
|
|
|
</div>
|
|
|
|
<div class="slide"><h1>TODO list views of email archives</h1>
|
|
|
|
<img align="right" src="http://lists.w3.org/Archives/Public/www-archive/2004Feb/att-0080/n3bugstodo-grab.png" alt="iCal screenshot" />
|
|
<ul>
|
|
<li>lists.w3.org gives message headers in RDF</li>
|
|
<li>convention: <tt>[closed]</tt> marks closed threads in public-cwm-bugs</li>
|
|
<li>some rules relate these conventions to RDF/iCalendar</li>
|
|
<li>toIcal.py converts to .ics format</li>
|
|
<li>developers subscribe to TODO list</li>
|
|
</ul>
|
|
|
|
<p class="footnote">details: <a
|
|
href="http://lists.w3.org/Archives/Public/www-rdf-calendar/2004Jan/0011.html">bug
|
|
status in .ics</a> of Jan 2004</p>
|
|
</div>
|
|
|
|
<div class="slide"><h1>Semantic Web Basics</h1>
|
|
<ul>
|
|
<li>What is the Semantic Web?
|
|
<ul>
|
|
<li>Data integration across application, organizational boundaries</li>
|
|
</ul>
|
|
</li>
|
|
<li>How does it work?
|
|
<ol>
|
|
<li>Apply power of URIs to concepts of relational data
|
|
<ul>
|
|
<li>Don't say "colour"; say &lt;http://example.com/2002/std6#col&gt;</li>
|
|
</ul>
|
|
</li>
|
|
<li>Model real things, not just documents or database tables</li>
|
|
</ol>
|
|
</li>
|
|
</ul>
|
|
</div>
|
|
|
|
<div class="slide"><h1>The element of the Semantic Web</h1>
|
|
|
|
<p><img alt="arrow tail, body and head are subject, property and value." src="../../../../DesignIssues/diagrams/spv-arrow.png" /></p>
|
|
|
|
<ul>
|
|
<li>The Resource Description Framework (RDF)
|
|
<ul>
|
|
<li>abstract syntax, formal semantics</li>
|
|
<li>standard encoding in XML</li>
|
|
<li>emerging programmer-friendly short-hand encoding: <a href="http://www.w3.org/2000/10/swap/Primer.html">N3</a>/<a href="http://www.dajobe.org/2004/01/turtle/">turtle</a>
|
|
</li>
|
|
</ul>
|
|
</li>
|
|
|
|
</ul>
|
|
<pre>
@prefix foaf: &lt;http://xmlns.com/foaf/0.1/&gt;.
&lt;people.rdf#dan&gt; foaf:name "Dan Connolly".
&lt;people.rdf#dan&gt; foaf:interest &lt;http://www.w3.org/XML/&gt;.
</pre>
|
|
<p>Note the relationship to HTML links, especially with the
|
|
re-discovery of the <tt>rel</tt> attribute.</p>
|
|
|
|
</div>
|
|
|
|
<div class="slide"><h1>Semantic web includes tables, trees...</h1>
|
|
|
|
<p><img alt="Arrows can make a table, an arrow from each row to each value"
|
|
src="../../../../DesignIssues/diagrams/arrow-table.png" /></p>
|
|
|
|
<p><img alt="Arrows can make a tree, an arrow from each parent to each child"
|
|
src="../../../../DesignIssues/diagrams/tree.png" /></p>
|
|
|
|
</div>
|
|
|
|
<div class="slide"><h1>... and tangly messes</h1>
|
|
|
|
<p><img alt="Arrows can make a tangle that mixes trees and tables"
|
|
src="../../../../DesignIssues/diagrams/tree-and-table2.png" /></p>
|
|
|
|
</div>
|
|
|
|
<div class="slide"><h1>RDFS and OWL</h1>
|
|
|
|
<p>RDF Schema (RDFS) and the Web Ontology Language (OWL) correspond to
|
|
UML notions such as subclass, domain, range, cardinality, ...</p>
|
|
|
|
<img align="right"
|
|
src="http://www.w3.org/2000/10/swap/pim/travelFig.png"
|
|
alt="travel concepts schema" />
|
|
|
|
<ul>
|
|
<li style="color: red;">cyc terms in red</li>
|
|
<li style="color: purple;">DAML airport ontology terms in purple</li>
|
|
<li style="color: green;">custom travelTerms in green.</li>
|
|
<li style="color: blue;">RDF standard terms in blue</li>
|
|
<li style="color: orange;">XML Schema terms in orange</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
<div class="slide"><h1>RDF Calendar in a nutshell</h1>
|
|
|
|
<ul>
|
|
|
|
<li>In iCalendar terms (<a href="../rfc2445">RFC2445</a>), an event is
|
|
a <a href="../rfc2445#sec4.6">component</a>
|
|
with various <a
|
|
href="../rfc2445#sec4.5">properties</a>:
|
|
|
|
<pre>BEGIN:VEVENT
|
|
<span xml:lang="en" lang="en">UID:20020630T230445Z-3895-69-1-7@jammer</span>
|
|
<span xml:lang="en" lang="en">DTSTART;VALUE=DATE:20020703</span>
|
|
<span xml:lang="en" lang="en">DTEND;VALUE=DATE:20020706</span>
|
|
<span xml:lang="en" lang="en">SUMMARY:Scooby Conference</span>
|
|
<span xml:lang="en" lang="en">LOCATION:San Francisco</span>
|
|
<span xml:lang="en" lang="en">END:VEVENT</span></pre>
|
|
</li>
|
|
<li>and RDF/XML has analogous class and property constructs:
|
|
<pre> &lt;Vevent&gt;
   &lt;uid&gt;20020630T230445Z-3895-69-1-7@jammer&lt;/uid&gt;
   &lt;dtstart&gt;2002-07-03&lt;/dtstart&gt;
   &lt;dtend&gt;2002-07-06&lt;/dtend&gt;
   &lt;summary&gt;Scooby Conference&lt;/summary&gt;
   &lt;location&gt;San Francisco&lt;/location&gt;
 &lt;/Vevent&gt;</pre>
|
|
</li>
|
|
</ul>
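<p>The correspondence between the two listings above can be sketched in a few lines of Python. This is only an illustration, not the conversion tooling the slides refer to: the function name, the blank-node label <tt>_:event1</tt> and the tuple layout are assumptions, and it handles just the simple properties shown here.</p>

```python
def vevent_to_triples(ics_text, subject="_:event1"):
    """Turn the content lines of one VEVENT into (subject, property, value) triples.

    Assumption: one event, no line folding, no nested components.
    """
    triples = []
    inside = False
    for raw in ics_text.splitlines():
        line = raw.strip()
        if line == "BEGIN:VEVENT":
            inside = True
            triples.append((subject, "rdf:type", "Vevent"))
        elif line == "END:VEVENT":
            inside = False
        elif inside and ":" in line:
            name_and_params, value = line.split(":", 1)
            # The property name is whatever precedes the first parameter.
            name = name_and_params.split(";", 1)[0].lower()
            # iCalendar dates are YYYYMMDD; the RDF form uses YYYY-MM-DD.
            if "VALUE=DATE" in name_and_params.upper() and len(value) == 8:
                value = f"{value[:4]}-{value[4:6]}-{value[6:]}"
            triples.append((subject, name, value))
    return triples


SAMPLE = """BEGIN:VEVENT
UID:20020630T230445Z-3895-69-1-7@jammer
DTSTART;VALUE=DATE:20020703
DTEND;VALUE=DATE:20020706
SUMMARY:Scooby Conference
LOCATION:San Francisco
END:VEVENT"""

for triple in vevent_to_triples(SAMPLE):
    print(triple)
```

<p>Each iCalendar property becomes one statement about the event, which is why the RDF/XML listing has one child element per content line.</p>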
|
|
|
|
<p>For details, see <cite><a
|
|
href="http://www.w3.org/TR/rdfcal/">RDF Calendar - an application of
|
|
the Resource Description Framework to iCalendar Data</a></cite>,
|
|
Connolly and Miller September 2005</p>
|
|
</div>
|
|
|
|
<div class="slide"><h1>Round-trip testing</h1>
|
|
|
|
<p>Comparing .ics files is tricky, so...</p>
|
|
|
|
<ul>
|
|
<li>to test .ics -> RDF:
|
|
<ol>
|
|
<li>convert <tt>$testcase.ics</tt> to temporary <tt>$testcase-actual.rdf</tt></li>
|
|
<li>Use RDF graph comparison to check actual results vs. expected results</li>
|
|
</ol>
|
|
</li>
|
|
<li>to test RDF -> .ics<br />
|
|
w.r.t. a known-good .ics -> RDF tool:
|
|
<ol>
|
|
<li>convert <tt>$testcase.rdf</tt> to <tt>$testcase-temp.ics</tt></li>
|
|
<li>Use the known-good tool to get <tt>$testcase-actual.rdf</tt></li>
|
|
<li>Use RDF graph comparison to check <tt>$testcase-actual.rdf</tt> against expected results</li>
|
|
</ol>
|
|
</li>
|
|
</ul>
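<p>Why graph comparison is easier than comparing .ics text: a graph is a set of statements, so ordering and formatting differences disappear. The sketch below assumes triples are plain Python tuples with no blank nodes; a real RDF comparison also has to match blank nodes up to renaming, which this does not attempt. The name <tt>graph_diff</tt> is made up for the illustration.</p>

```python
def graph_diff(actual, expected):
    """Compare two collections of ground triples, ignoring order and duplicates."""
    actual, expected = set(actual), set(expected)
    return {
        "missing": expected - actual,      # expected statements the tool did not produce
        "unexpected": actual - expected,   # statements the tool produced but nobody expected
    }


expected = [
    ("_:e", "summary", "Scooby Conference"),
    ("_:e", "location", "San Francisco"),
]
# Same statements in a different order: the graphs are equal.
actual = list(reversed(expected))

report = graph_diff(actual, expected)
print("test passes" if not any(report.values()) else report)
```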
|
|
</div>
|
|
|
|
<div class="slide"><h1>On RDF/XML Syntax</h1>
|
|
<ul>
|
|
<li>Too constrained for some
|
|
<ul>
|
|
<li>Tries to look like Book/author/title metadata but has subtle
|
|
constraints with striping, rdf:about, rdf:resource, etc.</li>
|
|
</ul>
|
|
</li>
|
|
<li>Not constrained enough for others
|
|
<ul>
|
|
<li>doesn't work with DTDs nor W3C XML Schemas; works awkwardly
|
|
with XPath and XSLT</li>
|
|
</ul>
|
|
</li>
|
|
</ul>
|
|
<p>At least the issues in the 1998 spec have all been resolved,
|
|
complete with test cases. There are plenty of interoperable
|
|
parsers. And it works great with Relax-NG and nxml-mode :)</p>
|
|
</div>
|
|
|
|
<div class="slide"><h1>Observation: lots of structured data in XHTML
|
|
dialects</h1>
|
|
|
|
<ul>
|
|
<li>bibliographies</li>
|
|
<li>schedules</li>
|
|
<li>issues lists</li>
|
|
<li>news items</li>
|
|
<li>blog rolls</li>
|
|
</ul>
|
|
</div>
|
|
|
|
<div class="slide"><h1>Data in Documents</h1>
|
|
|
|
<blockquote>
|
|
<p>I believe that one of the best ways to transition into RDF,
|
|
if not a long-term deployment strategy for RDF, is to manage the
|
|
information in human-consumable form (XHTML) annotated with just
|
|
enough info to extract the RDF statements that the human info
|
|
is intended to convey. In other words: using a relational
|
|
database or some sort of native RDF data store, and spitting
|
|
out HTML dynamically, is a lot of infrastructure to operate
|
|
and probably not worth it for lots of interesting cases.</p>
|
|
|
|
<p>We all know that we have to produce a human-readable version
|
|
of the thing... why not use that as the primary source?</p>
|
|
<div class="source">
|
|
<cite><a href="http://lists.w3.org/Archives/Public/www-rdf-interest/2000Mar/0103.html">XSLT for screen-scraping RDF out of real-world data</a><br />Dan Connolly to www-rdf-interest March 2000</cite>
|
|
</div>
|
|
</blockquote>
|
|
</div>
|
|
|
|
<div class="slide"><h1>Case Study: News Syndication at W3C</h1>
|
|
<ul>
|
|
<li>W3C Communications Team knows HTML well; picked up XHTML quickly</li>
|
|
<li>RSS emerged as a syndication mechanism</li>
|
|
<li>W3C's news content management system is just CVS; no database</li>
|
|
<li>But by adding just a <tt>class</tt> attribute here and a
|
|
<tt>rel</tt> attribute there, we get enough structure to transform it
|
|
to RSS using XSLT</li>
|
|
</ul>
|
|
|
|
<p><cite><a href="../../../../2000/08/w3c-synd/">Site Summaries
|
|
in XHTML</a></cite> is a cost-effective way to formalize our news
|
|
metadata.</p>
|
|
</div>
|
|
|
|
<div class="slide"><h1>GRDDL Semantics: explicit, grounded in the Web</h1>
|
|
<ul>
|
|
<li>Screen scraping: at your own risk</li>
|
|
<li>GRDDL: author explicitly agrees to data extraction algorithm</li>
|
|
|
|
<li><tt>rel="transformation"</tt> is grounded in URI space: <br />
|
|
<tt>http://www.w3.org/2003/g/data-view#transformation</tt>
|
|
</li>
|
|
|
|
<li>output of transformation is expressed in RDF, using URIs for
|
|
property names</li>
|
|
|
|
</ul>
|
|
<p>A person (or a machine) can "follow your nose" from the document
|
|
to the transformation algorithm, to the data, to the definitions of
|
|
the terms used in the data.</p>
|
|
</div>
|
|
|
|
|
|
<div class="slide"><h1>GRDDL Syntax: author's choice</h1>
|
|
|
|
<ul>
|
|
<li>works with DTD-happy XHTML<br />
|
|
via a link with <tt>rel="transformation"</tt>,
|
|
qualified by the GRDDL profile</li>
|
|
<li>works with XML in general<br />
|
|
via <tt>grddl:transformation</tt> attribute on the root element</li>
|
|
<li>the link refers to a transformation, typically XSLT, that maps
|
|
to RDF/XML</li>
|
|
<li>the link may be indirect via a profile or namespace document</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
|
|
<div class="slide"><h1>GRDDL drawback: Turing completeness</h1>
|
|
|
|
<ul>
|
|
<li>XSLT is a big hammer
|
|
<ul>
|
|
<li>much less so than in 2000</li>
|
|
</ul>
|
|
</li>
|
|
<li>Trust issues: consumer runs XSLT from <em>who knows where</em>
|
|
<ul>
|
|
<li>well-known transformations should cover 80%
|
|
<ul>
|
|
<li>short-cut local implementation</li>
|
|
<li>known-good cached copies</li>
|
|
</ul>
|
|
</li>
|
|
</ul>
|
|
</li>
|
|
</ul>
|
|
</div>
|
|
|
|
|
|
<div class="slide"><h1>GRDDL Details</h1>
|
|
|
|
<ul>
|
|
<li>In addition to structured XHTML, add the GRDDL profile and
|
|
a typed link:
|
|
|
|
<pre>&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
&lt;head <b>profile="http://www.w3.org/2003/g/data-view"</b>&gt;
<b>&lt;link rel="transformation"
href="http://www.w3.org/2000/08/w3c-synd/home2rss.xsl" /&gt;</b>
...
&lt;/head&gt;
&lt;body&gt;
...
&lt;div id="x200501110b" class="item"&gt;
&lt;h3&gt;&lt;img alt="" width="17" height="11" src="/Icons/right" /&gt;Policy for
Authorized W3C Translations Announced&lt;/h3&gt;

&lt;p&gt;&lt;span class="date"&gt;2005-11-10:&lt;/span&gt; W3C is pleased to announce...</pre>
|
|
</li>
|
|
|
|
<li>The <tt>home2rss</tt> transformation turns that into RDF/XML:
|
|
<pre>...
&lt;item rdf:about="http://www.w3.org/News/2005#item158"&gt;
&lt;title&gt;Policy for Authorized W3C Translations Announced&lt;/title&gt;
&lt;description&gt;2005-11-10: W3C is pleased to announce ...</pre>
|
|
</li>
|
|
</ul>
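<p>A consumer's first step can be sketched as follows: check for the GRDDL profile on <tt>head</tt>, then collect the <tt>rel="transformation"</tt> links. This is a minimal sketch using only Python's standard library; the function name is an assumption, it expects well-formed XHTML, and it neither resolves relative URIs nor follows indirect links via profile or namespace documents.</p>

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"


def grddl_transformations(xhtml_text):
    """Return the href of every transformation link the author has authorized."""
    root = ET.fromstring(xhtml_text)
    head = root.find(XHTML + "head")
    if head is None:
        return []
    # The rel value only carries its GRDDL meaning when the profile is declared.
    if GRDDL_PROFILE not in head.get("profile", "").split():
        return []
    return [
        link.get("href")
        for link in head.iter(XHTML + "link")
        if "transformation" in link.get("rel", "").split() and link.get("href")
    ]


DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
 <head profile="http://www.w3.org/2003/g/data-view">
  <title>News</title>
  <link rel="transformation"
        href="http://www.w3.org/2000/08/w3c-synd/home2rss.xsl" />
 </head>
 <body><p>...</p></body>
</html>"""

print(grddl_transformations(DOC))
```

<p>Without the profile the same link yields nothing, which reflects the point that the author must opt in explicitly.</p>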
|
|
</div>
|
|
|
|
<div class="slide"><h1>GRDDL: multiple dialects allowed</h1>
|
|
|
|
<div class="figure">
|
|
<img src="../../../../2004/01/rdxh/figMultiTxform.png"
|
|
alt="one document, multiple transformations"/>
|
|
</div>
|
|
</div>
|
|
|
|
<div class="slide"><h1>GRDDL: one profile for lots of documents</h1>
|
|
|
|
<div class="figure">
|
|
<img src="../../../../2004/01/rdxh/figGleanProfile.png"
|
|
alt="transformation via profile"/>
|
|
</div>
|
|
|
|
<p class="footnote">For details, see <cite><a
|
|
href="http://www.w3.org/TR/grddl/">Gleaning Resource Descriptions from
|
|
Dialects of Languages (GRDDL)</a></cite>, Hazaël-Massieux and
|
|
Connolly May 2005</p>
|
|
</div>
|
|
|
|
|
|
<div class="slide"><h1>GRDDL for any XML, not just XHTML</h1>
|
|
|
|
<pre>
&lt;java version="1.5.0_04" class="java.beans.XMLDecoder"
<b>xmlns:grddl="http://www.w3.org/2003/g/data-view#"
grddl:transformation="<a href="http://www.w3.org/2001/tag/2005/09/grokVioletUML.xsl">grokVioletUML.xsl</a>"&gt;</b>
&lt;object class="com.horstmann.violet.ClassDiagramGraph"&gt;
&lt;void method="addNode"&gt;
...
</pre>
|
|
|
|
<div class="figure">
|
|
<img src="http://www.w3.org/2001/tag/2005/09/20-class-uml.png"
|
|
alt="UML diagram with OWL formalization" />
|
|
</div>
|
|
|
|
</div>
|
|
|
|
<div class="slide">
|
|
<h1>Origins of hCalendar and Microformats</h1>
|
|
|
|
<p><em>Borrowing from <a
|
|
href="http://tantek.com/presentations/2005/09/microformats-evolution/">Microformats:
|
|
Evolving the Web</a> by Tantek Çelik Sep 2005</em></p>
|
|
|
|
<ul class="incremental">
|
|
<li>Very common "structures" on weblogs</li>
|
|
<li>Create explicit structures
|
|
<ul><li>in order to easily publish, index, aggregate</li></ul>
|
|
</li>
|
|
<li>Minimize impact on authors (and developers)</li>
|
|
<li>Avoid duplicating all content
|
|
<ul><li>Avoid requiring file uploads</li></ul>
|
|
</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
<div class="slide">
|
|
<h1>Microformats Principles</h1>
|
|
<ul class="incremental">
|
|
<li>solve a specific problem</li>
|
|
<li>simple as possible
|
|
<ul class="incremental">
|
|
<li>evolutionary improvements</li></ul>
|
|
</li>
|
|
<li>humans first, machines second
|
|
|
|
<ul class="incremental">
|
|
<li>presentable <em>and</em> parsable</li>
|
|
<li>adapt to current behaviors</li>
|
|
</ul>
|
|
</li>
|
|
<li>reuse from widely adopted standards
|
|
<ul class="incremental">
|
|
<li>semantic (X)HTML, schemas from interoperable RFCs</li>
|
|
</ul>
|
|
</li>
|
|
<li>modularity / embeddability</li>
|
|
|
|
<li>decentralized development, content, services
|
|
<ul><li>explicitly encourage "spirit of the Web"</li></ul>
|
|
</li>
|
|
</ul>
|
|
</div>
|
|
|
|
<div class="slide">
|
|
<h1>XHTML * iCalendar = hCalendar</h1>
|
|
<ul>
|
|
<li>Map iCalendar constructs 1:1 into XHTML
|
|
<ul>
|
|
<li><a href="http://microformats.org/wiki/hcalendar">hCalendar</a> specification, issues, examples... wiki</li>
|
|
</ul>
|
|
</li>
|
|
|
|
<li>example
|
|
<pre><code>
&lt;ol <span class="added">class="vcalendar"</span>&gt;
&lt;li <span class="added">class="vevent"</span>&gt;
&lt;a href="http://tantek.com/presentations/..."
<span class="added">class="url"</span>&gt;

&lt;span <span class="added">class="summary"</span>&gt;Microformats: Evolving the Web&lt;/span&gt;
&lt;abbr <span class="added">class="dtstart" title="20050930T1530+1000"</span>&gt;
September 30th, 2005
&lt;/abbr&gt;
&lt;/a&gt;
&lt;/li&gt;

&lt;/ol&gt;
</code></pre>
|
|
</li></ul>
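<p>To see how little machinery the class names demand of a consumer, here is a rough Python sketch that reads hCalendar properties out of well-formed XHTML. It is not the hCalendar parsing algorithm: the function names are invented, only a handful of properties are handled, nested events are ignored, and the <tt>example.org</tt> link stands in for the elided URL in the slide.</p>

```python
import xml.etree.ElementTree as ET

PROPERTIES = ("summary", "location", "dtstart", "dtend", "url")


def text_of(element):
    """Element text with runs of whitespace collapsed."""
    return " ".join("".join(element.itertext()).split())


def extract_vevents(xhtml_text):
    """Return one dict of property values per element marked class="vevent"."""
    root = ET.fromstring(xhtml_text)
    events = []
    for element in root.iter():
        if "vevent" not in element.get("class", "").split():
            continue
        event = {}
        for child in element.iter():
            classes = child.get("class", "").split()
            for prop in PROPERTIES:
                if prop not in classes or prop in event:
                    continue
                if prop == "url" and child.get("href"):
                    event[prop] = child.get("href")      # links carry the value in href
                elif child.tag.endswith("abbr") and child.get("title"):
                    event[prop] = child.get("title")     # abbr carries the machine form in title
                else:
                    event[prop] = text_of(child)
        events.append(event)
    return events


SAMPLE = """<ol class="vcalendar">
 <li class="vevent">
  <a href="http://example.org/talk" class="url">
   <span class="summary">Microformats: Evolving the Web</span>
   <abbr class="dtstart" title="20050930T1530+1000">September 30th, 2005</abbr>
  </a>
 </li>
</ol>"""

print(extract_vevents(SAMPLE))
```

<p>The human-readable date stays in the element content while the machine-readable one rides along in <tt>title</tt>, so the page serves both audiences from a single source.</p>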
|
|
</div>
|
|
|
|
<div class="slide">
|
|
<h1 style="text-transform:none">more hCalendar</h1>
|
|
<ul>
|
|
<li><a href="http://theryanking.com/microformats/hcalendar-creator.html">hCalendar creator</a></li>
|
|
<li>This presentation</li>
|
|
<li><a href="http://concerts.shrub.ca/">Sunnyvale House Concerts site</a></li>
|
|
<li><a href="http://evdb.com/">EVDB</a> and <a href="http://upcoming.org/">Upcoming.org</a></li>
|
|
|
|
<li><a href="http://favelets.com/#microformats">Favelets</a> for hCalendars</li>
|
|
<li><a href="http://tantek.com/presentations/2005/09/microformats-evolution/we05-program.html">Web Essentials 05 Program</a></li>
|
|
</ul>
|
|
</div>
|
|
|
|
|
|
<div class="slide"><h1>hCalendar + GRDDL = RDF Calendar</h1>
|
|
|
|
<ul>
|
|
<li><a href="e1.html">hCalendar</a> + <a href="../glean-hcal.xsl">glean-hcal.xsl</a> GRDDL transformation:<br />
|
|
<iframe width="80%" height="80" src="e1.html">
|
|
<a href="e1.html">event 1</a>
|
|
</iframe>
|
|
</li>
|
|
<li>RDF:
|
|
<pre>
@prefix : &lt;http://www.w3.org/2002/12/cal/icaltzd#&gt; .
@prefix XML: &lt;http://www.w3.org/2001/XMLSchema#&gt; .

[ a :Vevent;
  :attendee [ :cn "Hoopy Frood";
              :calAddress &lt;mailto:frood@example&gt; ];
  :dtend "2005-10-07"^^XML:date;
  :dtstart "2005-10-05"^^XML:date;
  :geo (37.0625 -95.677068 );
  :location "Argent Hotel, San Francisco, CA";
  :summary "Web 2.0 Conference";
  :url &lt;http://www.web2con.com/&gt; ].
</pre>
|
|
</li>
|
|
</ul>
|
|
</div>
|
|
|
|
<div class="slide"><h1>Upcoming events on blogs</h1>
|
|
|
|
<p>From <a href="http://tantek.com/log/">Tantek's Thoughts</a>:</p>
|
|
<blockquote>
|
|
<ul>
|
|
|
|
<li class="vevent">
|
|
<a class="url" href="http://web2con.com/">
|
|
<abbr class="dtstart" title="20051005">
|
|
10/5</abbr>-<abbr class="dtend" title="20051008">7</abbr> <span class="summary">
|
|
|
|
Web 2.0 </span>
|
|
- at
|
|
<span class="location">
|
|
The Argent, San Francisco </span>
|
|
</a>
|
|
</li>
|
|
|
|
<li class="vevent">
|
|
<abbr class="dtstart" title="20051017">
|
|
10/17</abbr>-<abbr class="dtend" title="20051020">19</abbr>
|
|
|
|
<span class="summary">W3C CSS Working Group f2f</span>
|
|
at <span class="location">San Francisco</span>
|
|
</li>
|
|
</ul>
|
|
</blockquote>
|
|
|
|
<p>Mix with <a href="../glean-hcal.xsl">glean-hcal.xsl</a> and we get:</p>
|
|
<pre style="font-size: smaller">
&lt;r:RDF xmlns:r="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns:c="http://www.w3.org/2002/12/cal/icaltzd#"&gt;
...
&lt;c:Vevent&gt;
&lt;c:summary xml:lang="en-us"&gt;Web 2.0 &lt;/c:summary&gt;
&lt;c:dtstart r:datatype="http://www.w3.org/2001/XMLSchema#date"&gt;2005-10-05&lt;/c:dtstart&gt;
&lt;c:dtend r:datatype="http://www.w3.org/2001/XMLSchema#date"&gt;2005-10-08&lt;/c:dtend&gt;
&lt;c:url r:resource="http://web2con.com/"/&gt;
&lt;c:location xml:lang="en-us"&gt;The Argent, San Francisco &lt;/c:location&gt;
&lt;/c:Vevent&gt;
</pre>
|
|
|
|
</div>
|
|
|
|
<div class="slide"><h1>SQL * URIs = SPARQL</h1>
|
|
|
|
<p>Use GRDDL to aggregate data from friends etc, then...</p>
|
|
<img align="right"
|
|
src="http://www.w3.org/DesignIssues/diagrams/spv-table.png"
|
|
alt="table subject/property/value" />
|
|
|
|
<pre>
PREFIX foaf: &lt;http://xmlns.com/foaf/0.1/&gt;
PREFIX c: &lt;http://www.w3.org/2002/12/cal/icaltzd#&gt;
SELECT ?name ?summary ?when
FROM &lt;myFriendsBlogsData&gt;
WHERE { ?somebody foaf:name ?name; foaf:mbox ?mbox.
        ?event c:summary ?summary;
               c:dtstart ?when;
               c:attendee [ c:calAddress ?mbox ]
}
</pre>
|
|
|
|
<table border="1">
|
|
<tr><th>?name</th><th>?summary</th><th>?when</th></tr>
|
|
<tr><td>Tantek Çelik</td><td>Web 2.0</td><td>2005-10-05</td></tr>
|
|
<tr><td>Norm Walsh</td><td>XML 2005</td><td>2005-11-13</td></tr>
|
|
<tr><td>Dan Connolly</td><td>W3C tech plenary</td><td>2006-02-27</td></tr>
|
|
</table>
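<p>What the query asks for is a join on the mailbox: people on one side, event attendees on the other. The Python sketch below spells that join out over a list of triples. It is an illustration of the pattern matching only, not a SPARQL engine; the function names and the sample data (reusing the Hoopy Frood attendee from the earlier example) are assumptions.</p>

```python
def objects(triples, subject, prop):
    """All values of prop for subject."""
    return [o for s, p, o in triples if s == subject and p == prop]


def who_goes_where(triples):
    """Rows of (name, summary, when) for events attended by a known person."""
    rows = []
    for person, prop, mbox in triples:
        if prop != "foaf:mbox":
            continue
        for name in objects(triples, person, "foaf:name"):
            for event, prop2, attendee in triples:
                if prop2 != "c:attendee":
                    continue
                # The join condition: the attendee's address is the person's mailbox.
                if mbox not in objects(triples, attendee, "c:calAddress"):
                    continue
                for summary in objects(triples, event, "c:summary"):
                    for when in objects(triples, event, "c:dtstart"):
                        rows.append((name, summary, when))
    return sorted(rows)


DATA = [
    ("_:p1", "foaf:name", "Hoopy Frood"),
    ("_:p1", "foaf:mbox", "mailto:frood@example"),
    ("_:e1", "c:summary", "Web 2.0 Conference"),
    ("_:e1", "c:dtstart", "2005-10-05"),
    ("_:e1", "c:attendee", "_:a1"),
    ("_:a1", "c:calAddress", "mailto:frood@example"),
    ("_:e2", "c:summary", "Event nobody we know attends"),
    ("_:e2", "c:dtstart", "2005-12-01"),
]

for row in who_goes_where(DATA):
    print(row)
```

<p>The person data and the event data can come from different documents; because both use the same <tt>mailto:</tt> URI, the join needs no prior coordination between the publishers.</p>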
|
|
|
|
<p>See <cite><a href="http://www.w3.org/TR/rdf-sparql-query/">SPARQL
|
|
Query Language for RDF</a></cite> W3C Working Draft 21 July 2005, plus
|
|
<a href="http://norman.walsh.name/2005/itinerary/">Norm's travel this
|
|
year</a></p>
|
|
</div>
|
|
|
|
|
|
<div class="slide"><h1>Fill in blanks with rules</h1>
|
|
|
|
<p>Mix with some logical rules:</p>
|
|
<ul>
|
|
<li>Infer the place implicit in the iCalendar geo and location properties.
|
|
<pre>
|
|
{ ?E c:geo (?LAT ?LONG) }
|
|
=> { ?E k:eventOccursAt [ geo:lat ?LAT; geo:long ?LONG] }.
|
|
|
|
{ ?E c:location ?PLACENAME; k:eventOccursAt ?WHERE
|
|
} => {
|
|
?WHERE k:nameOfAgent ?PLACENAME;
|
|
s:label ?PLACENAME;
|
|
k:inRegion [ a k:City ]. # Assume each event is in some city
|
|
}.
|
|
</pre>
|
|
</li>
|
|
<li>Cities containing points far from each other are distinct
|
|
<pre>
|
|
{
|
|
?C1 a k:City; is k:inRegion of [geo:lat ?X1; geo:long ?Y1 ].
|
|
?C2 a k:City; is k:inRegion of [geo:lat ?X2; geo:long ?Y2 ].
|
|
|
|
# DSQ = (X1-X2)^2 + (Y1-Y2)^2
|
|
(((?X2 ?X1).m:difference 2).m:exponentiation
|
|
((?Y2 ?Y1).m:difference 2).m:exponentiation) m:sum ?DSQ.
|
|
?DSQ m:greaterThan 0.2.
|
|
} => { ?C1 owl:differentFrom ?C2 }.
|
|
</pre>
|
|
</li>
|
|
</ul>
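<p>The second rule is ordinary arithmetic once the notation is unpacked: compute the squared distance between two points and compare it with 0.2. A Python sketch of the same test follows. The function name, the dictionary layout and the city keys are assumptions, and the coordinates are rough illustrative values rather than data from the talk.</p>

```python
def distinct_cities(points_by_city, threshold=0.2):
    """Pairs of cities that contain points far enough apart to be different places.

    points_by_city maps a city label to a list of (lat, long) points known to be in it.
    The threshold is compared with the squared distance in degrees, as in the rule.
    """
    different = set()
    cities = sorted(points_by_city)
    for i, c1 in enumerate(cities):
        for c2 in cities[i + 1:]:
            if any(
                (x2 - x1) ** 2 + (y2 - y1) ** 2 > threshold   # DSQ > 0.2
                for x1, y1 in points_by_city[c1]
                for x2, y2 in points_by_city[c2]
            ):
                different.add((c1, c2))
    return different


POINTS = {
    "atlanta": [(33.75, -84.39)],
    "sf": [(37.77, -122.42)],
    "sf_again": [(37.78, -122.41)],   # a nearby point: not provably a different city
}

print(sorted(distinct_cities(POINTS)))
```

<p>The rule in the slide derives <tt>owl:differentFrom</tt> in both directions; the sketch reports each unordered pair once. Nearby points prove nothing either way, so the third entry is left undecided rather than merged.</p>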
|
|
<p class="footnote">Full details:
|
|
<a href="calbg.n3">calendar background rules</a>
|
|
</p>
|
|
</div>
|
|
|
|
<div class="slide"><h1>A model behind hCalendar</h1>
|
|
|
|
<div class="figure">
|
|
<object width="90%" height="400" data="e1c.svg">
|
|
<img src="e1c.png" alt="person, place, dates" />
|
|
</object>
|
|
</div>
|
|
|
|
<p>Can we do this event too?</p>
|
|
|
|
<iframe width="80%" height="80" src="e2.html">
|
|
<a href="e2.html">event 2</a>
|
|
</iframe>
|
|
</div>
|
|
|
|
<div class="slide"><h1>RDF merges trivially</h1>
|
|
|
|
<img style="float:right" width="50%" class="incremental"
|
|
src="http://www.w3.org/2000/Talks/0906-xmlweb-tbl/arcs-3.gif"
|
|
alt="stack" />
|
|
|
|
<img src="http://www.w3.org/2000/Talks/0906-xmlweb-tbl/arcs-1.gif"
|
|
alt="stack" />
|
|
|
|
</div>
|
|
|
|
<div class="slide"><h1>Partial Understanding</h1>
|
|
|
|
<p>RDF statements* are independent. RDF semantics are <dfn><a href=
|
|
"http://en.wikipedia.org/wiki/Monotonicity_of_entailment"
|
|
>monotonic</a></dfn>.</p>
|
|
|
|
<table>
|
|
<tr><th></th><th>RDF</th><th>XML</th></tr>
|
|
<tr>
|
|
<th>Premise</th>
|
|
<td>
|
|
<pre>
&lt;Book rdf:ID="book1"&gt;
&lt;dc:title&gt;The Grapes of Wrath&lt;/dc:title&gt;
&lt;dc:creator&gt;Steinbeck&lt;/dc:creator&gt;
&lt;/Book&gt;
</pre>
|
|
</td>
|
|
<td>
|
|
<pre>
&lt;xsd:simpleType name="myInteger"&gt;
&lt;xsd:restriction base="xsd:integer"&gt;
&lt;xsd:minInclusive value="10000"/&gt;
&lt;xsd:maxInclusive value="99999"/&gt;
&lt;/xsd:restriction&gt;
&lt;/xsd:simpleType&gt;
</pre>
|
|
</td>
|
|
|
|
</tr>
|
|
<tr><th>Conclusion</th>
|
|
<td>
|
|
<pre>
&lt;Book rdf:ID="book1"&gt;
&lt;dc:title&gt;The Grapes of Wrath&lt;/dc:title&gt;
&lt;/Book&gt;
</pre>
|
|
</td>
|
|
<td>
|
|
<pre>
&lt;!-- no, this does not follow --&gt;
&lt;xsd:simpleType name="myInteger"&gt;
&lt;xsd:restriction base="xsd:integer"&gt;
&lt;xsd:maxInclusive value="99999"/&gt;
&lt;/xsd:restriction&gt;
&lt;/xsd:simpleType&gt;
</pre>
|
|
</td>
|
|
</tr>
|
|
</table>
|
|
|
|
<p class="footnote">*RDF/XML does have a
|
|
<tt>rdf:parseType="Collection"</tt> syntax, which
|
|
expands to a lisp style binary tree in the abstract syntax. This
|
|
erasure property works not on XML elements, but on RDF statements.
|
|
</p>
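<p>Treating a graph as a set of statements makes both properties concrete: merging is set union, and any subset of what you were told still follows. The Python sketch below assumes ground triples only; full RDF entailment also covers blank nodes and vocabulary semantics, which a subset test does not capture. The helper names are invented for the illustration.</p>

```python
def merge(*graphs):
    """Merging RDF graphs (without blank nodes) is just set union."""
    return set().union(*graphs)


def follows_from(conclusion, premise):
    """Erasing statements never makes the rest false: any subset follows."""
    return conclusion <= premise


book = {
    ("#book1", "rdf:type", "Book"),
    ("#book1", "dc:title", "The Grapes of Wrath"),
    ("#book1", "dc:creator", "Steinbeck"),
}
title_only = {
    ("#book1", "rdf:type", "Book"),
    ("#book1", "dc:title", "The Grapes of Wrath"),
}

print(follows_from(title_only, book))        # dropping the creator statement is safe
print(len(merge(book, title_only)))          # merging adds nothing new here
```

<p>The XML Schema column has no such property: removing the <tt>minInclusive</tt> facet changes which values the type admits, so the shorter definition is a different claim rather than a weaker one.</p>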
|
|
</div>
|
|
|
|
<div class="slide"><h1>Merging hCalendar data</h1>
|
|
|
|
<div class="figure">
|
|
<object width="90%" height="500" data="mash.svg">
|
|
<img src="mash.png" alt="two events, one person, several days and places" />
|
|
</object>
|
|
</div>
|
|
</div>
|
|
|
|
<div class="slide"><h1>OWL identity reasoning</h1>
|
|
|
|
<table>
|
|
<tr>
|
|
<th>Premise</th>
|
|
<td>
|
|
<pre>
# one-to-many
foaf:mbox a owl:InverseFunctionalProperty.

:dan foaf:mbox &lt;mailto:connolly@w3.org&gt;.
:dan foaf:name "Dan Connolly".

:daniel foaf:mbox &lt;mailto:connolly@w3.org&gt;.
:daniel foaf:name "Daniel W. Connolly".
</pre>
|
|
</td>
|
|
<td>
|
|
<object data="owl_smush1.svg">
|
|
<img style="float:right" alt="" src="owl_smush1.png" />
|
|
</object>
|
|
</td>
|
|
</tr>
|
|
<tr><th>Conclusion:</th>
|
|
<td>
|
|
<pre>
|
|
:daniel owl:sameAs :dan.
|
|
:daniel foaf:name "Dan Connolly".
|
|
:daniel foaf:name "Daniel W. Connolly".
|
|
</pre>
|
|
</td>
|
|
<td>
|
|
<object data="owl_smush2.svg">
|
|
<img style="float:right" alt="" src="owl_smush2.png" />
|
|
</object>
|
|
</td>
|
|
</tr>
|
|
</table>
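<p>The inference in the table can be approximated procedurally: whenever two subjects share a value for an inverse functional property, treat them as one node. The Python sketch below does that with a small union-find. It is an assumption-laden stand-in for an OWL reasoner: it rewrites subjects only, picks the alphabetically first name as the representative, and the function name <tt>smush</tt> is informal.</p>

```python
def smush(triples, inverse_functional):
    """Merge subjects that share a value for any inverse functional property."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]   # path halving
            node = parent[node]
        return node

    owner = {}   # (property, value) -> first subject seen with it
    for s, p, o in triples:
        if p not in inverse_functional:
            continue
        if (p, o) in owner:
            a, b = find(owner[(p, o)]), find(s)
            if a != b:
                parent[max(a, b)] = min(a, b)     # same value, so same individual
        else:
            owner[(p, o)] = s
    return {(find(s), p, o) for s, p, o in triples}


DATA = [
    (":dan", "foaf:mbox", "mailto:connolly@w3.org"),
    (":dan", "foaf:name", "Dan Connolly"),
    (":daniel", "foaf:mbox", "mailto:connolly@w3.org"),
    (":daniel", "foaf:name", "Daniel W. Connolly"),
]

for triple in sorted(smush(DATA, {"foaf:mbox"})):
    print(triple)
```

<p>After merging, one node carries both names, matching the conclusion row of the table up to the choice of which name stands for the merged individual.</p>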
|
|
|
|
</div>
|
|
|
|
<div class="slide"><h1>Consistency checking</h1>
|
|
|
|
|
|
<ul>
|
|
<li>Modelling assumption: on a given day, a person can only be in one city
|
|
<pre>
|
|
PersonDay a owl:Class; s:subClassOf k:Entity;
|
|
s:subClassOf [ owl:onProperty _city; owl:maxCardinality 1 ].
|
|
</pre>
|
|
</li>
|
|
|
|
<li>Then run <a
|
|
href="http://www.mindswap.org/2003/pellet/index.shtml">pellet</a>, an
|
|
OWL consistency checker:
|
|
<pre>
|
|
java -jar pellet.jar -inputFile ,mash.rdf -unsat
|
|
Input file: ,mash.rdf
|
|
OWL Species: Full
|
|
DL Expressivity: ALCHIF(D)
|
|
Consistent: No
|
|
Reason: There is an anonymous individual X, identified by
|
|
this path (file:...cal/mash/calbg#<b>oct6</b>
|
|
http://www.cyc.com/2004/06/04/cyc#temporallyIntersects X),
|
|
which has <b>more than 1</b> values for property file:...cal/mash/calbg#<b>_city</b>
|
|
violating the cardinality restriction
|
|
Time: 8784 ms (Loading: 6485 Preprocessing: 0
|
|
Species Validation: 2202 Consistency: 96 )
|
|
</pre>
|
|
</li>
|
|
<li>In other words: <em>You can't be two places at once!</em></li>
|
|
</ul>
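<p>Stripped of the description-logic vocabulary, the check groups stays by person and day and flags any group with more than one city. The Python sketch below shows that grouping. Unlike an OWL reasoner it assumes that differently named cities are different places; in the slides that fact has to be derived first (the distance rule supplies <tt>owl:differentFrom</tt>), because OWL would otherwise conclude the two cities are the same. The function name and sample data are hypothetical.</p>

```python
from collections import defaultdict


def double_bookings(stays):
    """Map (person, day) to the set of cities, for every pair with more than one city.

    stays is an iterable of (person, day, city) tuples.
    """
    cities = defaultdict(set)
    for person, day, city in stays:
        cities[(person, day)].add(city)
    # maxCardinality 1: more than one distinct city on a day is a conflict.
    return {key: found for key, found in cities.items() if len(found) > 1}


STAYS = [
    ("frood", "2005-10-05", "San Francisco"),
    ("frood", "2005-10-06", "San Francisco"),
    ("frood", "2005-10-06", "Atlanta"),        # overlaps with the stay above
    ("frood", "2005-10-07", "Atlanta"),
]

for (person, day), found in double_bookings(STAYS).items():
    print(person, day, sorted(found))
```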
|
|
|
|
</div>
|
|
|
|
<div class="slide"><h1>Review</h1>
|
|
|
|
<ul>
|
|
<li>hCalendar is not too much to ask of authors</li>
|
|
<li>GRDDL makes it authorized Semantic Web data</li>
|
|
<li>Semantic Web data:
|
|
<ul class="incremental">
|
|
<li>compares straightforwardly</li>
|
|
<li>merges easily</li>
|
|
<li>can be queried with SPARQL</li>
|
|
<li>can be checked for consistency with OWL reasoners</li>
|
|
<li>mixes with emerging rule languages</li>
|
|
</ul>
|
|
</li>
|
|
</ul>
|
|
</div>
|
|
|
|
<div class="slide"><h1>Acknowledgements, Colophon</h1>
|
|
|
|
<p>These slides: <tt><a
|
|
href="http://www.w3.org/2002/12/cal/mash/slides">http://www.w3.org/2002/12/cal/mash/slides</a></tt></p>
|
|
|
|
<ul>
|
|
<li>Dog-food: view source for...
|
|
<ul>
|
|
<li>hCalendar</li>
|
|
<li>hCard</li>
|
|
<li>GRDDL</li>
|
|
</ul>
|
|
</li>
|
|
<li>written using nxml-mode</li>
|
|
<li>CSS/javascript presentation magic is <a href="http://www.w3.org/Talks/Tools/Slidy/help.html">Slidy</a></li>
|
|
|
|
<li>References
|
|
|
|
<ul>
|
|
<li><cite><a href="../../../../2000/10/swap/doc/">Semantic Web Tutorial Using N3</a></cite> given at WWW2003 in Budapest</li>
|
|
<li><cite><a href="http://www.dajobe.org/2004/01/turtle/">Turtle - Terse RDF Triple Language</a></cite> Dave Beckett, in progress 2004-2005</li>
|
|
<li><a href="http://esw.w3.org/topic/RdfCalendarPresentation">RDF calendar presentation</a>, Semantic Web Interest Group Meeting March 2004</li>
|
|
<li>See <cite><a href="http://www.w3.org/TR/2004/REC-webarch-20041215/">Architecture of the World Wide Web, Volume One</a></cite><br />
|
|
W3C Recommendation 15 December 2004</li>
|
|
<li><cite><a
|
|
href="http://www.w3.org/TR/grddl/">Gleaning Resource Descriptions from
|
|
Dialects of Languages (GRDDL)</a></cite>, Hazaël-Massieux and
|
|
Connolly May 2005</li>
|
|
<li><cite><a href="http://www.w3.org/TR/rdf-sparql-query/">SPARQL Query Language for RDF</a></cite>
|
|
W3C Working Draft 21 July 2005</li>
|
|
<li><cite><a
|
|
href="http://www.w3.org/TR/rdfcal/">RDF Calendar - an application of
|
|
the Resource Description Framework to iCalendar Data</a></cite>,
|
|
Connolly and Miller September 2005</li>
|
|
</ul>
|
|
</li>
|
|
</ul>
|
|
</div>
|
|
</body>
|
|
</html>
|