								<?xml version="1.0" encoding="UTF-8"?>

								<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"

								      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

								<html xmlns="http://www.w3.org/1999/xhtml">

								<head>

								  <title>Read-Write linked data - Design Issues</title>

								  <link rel="Stylesheet" href="di.css" type="text/css" />

								  <meta http-equiv="content-type" content="text/html; charset=UTF-8" />

								</head>


								<body>

								<address>

								  Tim Berners-Lee<br />

								  Date: 2009-08-16, last change: $Date: 2011/10/08 21:12:23 $<br />

								  Status: personal view only. Editing status: Not finished at all @@

								</address>


								<p><a href="./">Up to Design Issues</a></p>

								<hr />


								<h1>Read-Write Linked Data</h1>


								<p class="abstract">There is an architecture in which a few existing Web

								protocols are gathered together with some glue to make a world wide system in

								which applications (desktop or Web applications) can work on top of a layer of

								commodity read-write storage. The result is that storage becomes a commodity,

								independent of the application running on it.</p>


								<h2>Introduction</h2>


								<p>The <a href="LinkedData.html">Linked Data article</a> gave simple rules

								for putting data on the web so that it is linked. This article follows on

								from that to discuss allowing applications to write as well as read data.</p>


								<p>It is part of a series: future notes discuss socially-aware decentralized

								access control of reading and of writing to linked data, and of notification

								of changes. The overall goal is one in which storage with the necessary

								functionality is a ubiquitous commodity, and application growth becomes

								dramatic as the provision of storage is decoupled from the design and

								deployment of applications. The storage is aware of different people and

								groups which may want access; it is aware of metadata such as licensing and

								appropriate uses of the data, so as to help agents behave responsibly; and it

								can alert those who are interested when data changes. Without looking ahead

								too much, though, here let us look at protocol options for writing to the web

								of data.</p>


								<h3>Motivating Writing</h3>


								<p>I hope I do not have to motivate here the fact that the Web in general

								should be read-write. That has been done in many places, from 'Weaving the

								Web', to the Read-Write Web blog. (I actually realize that in 20 years of

								writing these articles, I haven't written a separate page on that topic!)

								Let me summarize here by saying the WWW was originally developed with the

								goal of being a collaborative space in which people could collectively design,

								discuss, share and manage things. Being able to impart one's knowledge, or

								put down a new design or correct or annotate existing work, is I think a key

								functionality of the Web. Even better, it can be a place where we are creative

								jointly ("intercreative™").</p>


								<p>This applies to data as much as to documents. To take just one example,

								shared calendar systems are shared data systems which, while

								they are silos within the domain of calendaring, have a classic burning

								need for multi-person collaboration and for the ability to create and

								modify as well as read. In fact, collaboratively figuring out people's

								intersecting calendars is a classic challenge task. The goal is to make an

								infrastructure which will make it easy to write powerful collaborative

								applications. Also, I like the maxim that wherever you have access to

								information which you have the authority to correct or extend, there should

								be an easy way for you to do so at that place. This clearly applies as much to

								data as to documents.</p>


								<h2>Outstanding issues</h2>


								<p>This article does not deal with database-like storage APIs, and

								specifically not with atomic transactions or fine-grained access control.</p>


								<h2>Architectures</h2>


								<p>The linked data world has a simple model. A set of documents in the Web

								each have a URI and a graph of linked RDF triples. Modification to this

								space consists of modification of the triples in one or more documents. We will

								consider here the question of small incremental changes, and not consider the

								question of large atomic changes which must be performed as an atomic

								transaction. @@</p>


								<h3>File write-back</h3>


								<p>The model is that all data is stored in a document (virtual or actual file)

								named with a URI. One way of changing the data

								is to overwrite the whole file with an HTTP PUT operation.

								Whereas typical Apache servers are not configured out of the box

								to accept PUT, they do accept it when they are configured for WebDAV

								(the Web Distributed Authoring and Versioning specs).

								For historical reasons*, they advertise that they support

								PUT with an HTTP header line

								</p>

								<pre>

								MS-Author-Via: DAV

								</pre>
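
								<p>As a rough illustration of the whole-file write-back described above, the
								following Python sketch builds, but does not send, a PUT request that would
								overwrite a document. The URI <code>http://example.org/data/card</code>, the
								function name and the choice of <code>text/n3</code> as content type are
								assumptions made for the example, not part of this note.</p>

```python
import urllib.request


def build_put_request(document_uri, body, content_type="text/n3"):
    """Build (without sending) an HTTP PUT that replaces the whole document.

    The function name and the default content type are illustrative choices.
    """
    return urllib.request.Request(
        document_uri,
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": content_type},
    )


# Hypothetical document URI and new content for the whole file.
new_content = "<#me> <http://xmlns.com/foaf/0.1/age> 66 .\n"
request = build_put_request("http://example.org/data/card", new_content)

# Sending it would be: urllib.request.urlopen(request)
print(request.get_method(), request.full_url)
```

								<p>The whole document travels on every save, which is simple but can be
								wasteful when only one triple has changed.</p>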


								<h3>SPARQL Update</h3>


								<p>An alternative protocol for doing a change is to send just a small change

								as a patch back to the server. The patch file is a small file which describes

								the change necessary to the graph. The patch may be generated directly by a

								user interface action, or an inference result, or alternatively the change

								may have been made in a copy local to the client, and the patch file of the

								differences generated automatically.</p>


								<p>For compatibility with the above, an HTTP server advertises that it supports

								SPARQL/Update with an HTTP header line</p>

								<pre>

								MS-Author-Via: SPARQL

								</pre>
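
								<p>A client might use the advertised header to decide how to write back. The
								sketch below is one possible reading, under two assumptions that this note
								does not state: that the header value may list several tokens separated by
								commas, and that a client prefers a SPARQL patch over a whole-file PUT when
								both are offered. The function names are made up for the example.</p>

```python
def write_protocols(headers):
    """Return the set of protocol tokens found in MS-Author-Via, upper-cased.

    Header names are compared case-insensitively, as HTTP requires.
    A comma-separated value is an assumption of this sketch.
    """
    for name, value in headers.items():
        if name.lower() == "ms-author-via":
            return {tok.strip().upper() for tok in value.split(",") if tok.strip()}
    return set()


def choose_write_method(headers):
    """Pick an HTTP method for write-back, or None if nothing is advertised."""
    protocols = write_protocols(headers)
    if "SPARQL" in protocols:
        return "POST"  # send a small SPARQL Update patch to the document URI
    if "DAV" in protocols:
        return "PUT"   # overwrite the whole document
    return None


print(choose_write_method({"MS-Author-Via": "SPARQL"}))
print(choose_write_method({"MS-Author-Via": "DAV"}))
```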

								<p>

								The change in the resource is described in a SPARQL Update message,

								which is posted to the URI of the data file itself.

								</p><p>

								The SPARQL update message only uses the <b>default graph, which

								is the graph of the document in question</b>. The SPARQL GRAPH directive

								is not used.

								</p><p>

								The query is sent using SPARQL in the body of the HTTP POST.

								A content-type header must be sent. The content type is <b>

								application/sparql-query</b>.

								(The following single line curl command is an example).

								</p>

								<pre>

								curl -d 'INSERT
								{ &lt;http://dig.csail.xvm.mit.edu/2007/wiki/people/JoeLambda#JL&gt;
								             &lt;http://xmlns.com/foaf/0.1/age&gt; 66 }' \
								 -H Content-type:application/sparql-query \
								  http://dig.csail.mit.mit.edu/2007/wiki/people/JoeLambda

								</pre><p>Note that a WHERE clause may well be needed:

								when modifications to the document involve blank nodes,

								it may be necessary to give enough context to unambiguously identify the

								blank nodes.</p>

								<p>

								(A server may also support SPARQL queries as well as SPARQL updates.

								If this is so,

								note that the content-type header is the same, and the request body must be parsed

								to know whether it matches the query or the update grammar.)

								</p>

								<h4>Note: 409 Conflict</h4>

								<p>A SPARQL update message often contains both a DELETE and then an INSERT.

								This may be used to update a field from one value to another.

								When more than one application or user is using the same data,

								there may arise times when the DELETE fails

								because another user has already deleted the same data.

								In this case it is <strong>very important</strong>

								that the delete does not fail silently.

								The HTTP server MUST return error status 409.

								("409 Conflict" indicates that the request could not be processed because of conflict in the request, such as an edit conflict).

								The client can then, for example, inform the user by backing out the

								change the user was trying to make, or it can retry a reservation later.

								The atomicity of the DELETE, INSERT function can be used to provide

								various mutual exclusion systems, such as reserving a resource

								or generating unique sequential numbers, and so on.

								</p>
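
								<p>The conflict rule can be sketched with a toy in-memory graph. This is only
								an illustration under stated assumptions: the graph is a Python set of
								triples, <code>apply_update</code> is a hypothetical name, and the seat
								reservation data is invented. It is not how any particular server
								implements SPARQL Update.</p>

```python
def apply_update(graph, deletes, inserts):
    """Apply DELETE then INSERT to a set of triples as one atomic step.

    Returns an HTTP-style status code: 409 if any triple to be deleted is
    absent (the graph is then left untouched), otherwise 200.
    """
    deletes = set(deletes)
    if not deletes.issubset(graph):
        return 409  # someone else already changed the data: do not fail silently
    graph -= deletes          # in-place removal
    graph |= set(inserts)     # in-place addition
    return 200


# Invented example: two clients race to reserve the same seat.
graph = {("seat1", "status", "free")}
free = [("seat1", "status", "free")]

alice = apply_update(graph, free, [("seat1", "status", "alice")])
bob = apply_update(graph, free, [("seat1", "status", "bob")])

print(alice, bob, sorted(graph))
```

								<p>Only the first request finds the triple it wants to delete, so the second
								one gets 409 and the seat stays reserved for the first client. This is the
								mutual exclusion use mentioned above.</p>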

								<h3>A Data Wiki</h3>

								<p>The protocols above can be used to implement a data wiki.

								This is a piece of URI space in which

								any data can be edited by anyone, just as a text wiki can be.

								To make a data wiki, one also needs this extra rule:

								</p>

								<p>

								Whenever a client requests a page which does not yet exist,

								instead of returning a "404 Not Found" error, the server

								returns 200 OK, and a valid data document (in RDF/XML or N3)

								which contains zero triples.

								(This does not mean a zero-length document in RDF/XML, but it can be in N3.)

								</p>
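
								<p>The extra rule might look like the following sketch on the server side.
								The store layout (a dictionary keyed by path and content type), the function
								name and the paths are assumptions made for the example. The empty RDF/XML
								document is a bare <code>rdf:RDF</code> element, while the empty N3 document
								is a zero-length string.</p>

```python
# Valid documents containing zero triples, per serialization.
EMPTY_DOCUMENTS = {
    "text/n3": "",
    "application/rdf+xml":
        '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>',
}


def get_document(store, path, content_type="text/n3"):
    """Return (status, body) for a GET in a data wiki.

    `store` is a hypothetical dict mapping (path, content_type) to a
    serialized document. A missing page yields 200 and an empty graph
    rather than a Not Found error.
    """
    body = store.get((path, content_type))
    if body is None:
        return 200, EMPTY_DOCUMENTS[content_type]
    return 200, body


store = {("/wiki/people/Joe", "text/n3"): "<#JL> <http://xmlns.com/foaf/0.1/age> 66 .\n"}

print(get_document(store, "/wiki/people/Joe"))
print(get_document(store, "/wiki/people/Nobody"))
print(get_document(store, "/wiki/people/Nobody", "application/rdf+xml"))
```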

								<h2>Conclusion</h2>


								<p>

								The world of linked data can be extended to a world of read-write linked data

								easily. The existing protocols and formats HTTP, WebDAV, RDF and SPARQL

								can be connected together as defined above, with a little glue from

								the reuse of existing HTTP headers. This creates a space in which

								new applications can easily be written to operate using shared linked data.

								</p>

								<p>

								Of course for many real-world applications,

								one does not want a data wiki in which anyone can

								write.  We therefore need to extend the system to include access control.

								This is discussed in the <a href="CloudStorage.html">article on Socially-aware Cloud Storage</a>.

								</p>

								<hr />


								<h2>Followup</h2>

								<p>The <a href="http://esw.w3.org/EditingData">Editing Data wiki page</a>

								is a place to list client and server implementations, and pointers to more information.</p>

								<ul>

								  <li>The video mix by dataportability.org, first minutes</li>

								  <li>The video by the Diaspora team leading up to summer 2010</li>

								  <li>2010-06-01, Chimezie Ogbuji, editor, <a href="http://www.w3.org/TR/sparql11-http-rdf-update/">SPARQL 1.1 Uniform HTTP Protocol for Managing RDF Graphs</a> is a working draft which currently (2010) may hopefully specify this functionality.</li>

								</ul>


								<h2>Footnotes</h2>

								<p>

								* Microsoft introduced a completely proprietary protocol

								for write-back called the "Microsoft FrontPage Extensions".

								Later, this MS-Author-Via header was <a href="http://msdn.microsoft.com/en-us/library/cc250217%28v=PROT.10%29.aspx">

								introduced by Microsoft</a> to allow Microsoft clients to

								turn <em>off</em> the FrontPage extensions and use WebDAV.

								As a result, most WebDAV servers in 200X provided that header.

								It was therefore natural to use the same header to

								advertise the availability of SPARQL.

								Perhaps the MS stands for "modification service".

								</p>


								<h2>References</h2>

								<p>

								See <a href="CloudStorage#references">references in next article.</a>

								</p>


								<p><a href="Overview.html">Up to Design Issues</a></p>


								<p><a href="../People/Berners-Lee">Tim BL</a></p>

								</body>

								</html>