You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
249 lines
12 KiB
249 lines
12 KiB
<?xml version="1.0" encoding="us-ascii"?>
|
|
<html xmlns="http://www.w3.org/1999/xhtml">
|
|
<head>
|
|
<title>Webizing existing systems - Design Issues </title>
|
|
<link rel="Stylesheet" href="di.css" type="text/css" />
|
|
<meta http-equiv="content-type" content="text/html; charset=us-ascii" />
|
|
<link href="di.css" rel="stylesheet" type="text/css" />
|
|
</head>
|
|
|
|
<body bgcolor="#DDFFDD" text="#000000" xml:lang="en" lang="en">
|
|
<address>
|
|
Tim Berners-Lee<br />
|
|
Date: 1998, last change: $Date: 2010/03/09 14:07:04 $<br />
|
|
Status: personal view only. Editing status: first draft.
|
|
</address>
|
|
|
|
<p><a href="./">Up to Design Issues</a> </p>
|
|
<hr />
|
|
|
|
<h1><a name="Webizing" id="Webizing">Webizing existing systems</a> </h1>
|
|
|
|
<p><em>This discusses the introduction of URIs as names in a system to scale
|
|
it to the web.</em> </p>
|
|
|
|
<p>The web is extended in two ways - by adding new bits of technology to the
|
|
existing stuff, and by "webizing" existing applications and systems. Webizing
|
|
is really important, not only as a way of bootstrapping the web using large
|
|
amount of legacy information, but because the existing systems have been
|
|
researched and designed over the years and it is really important we do not
|
|
lose the knowledge accrued during that process. </p>
|
|
|
|
<p>The essential process in webizing is to take a system which is designed as
|
|
a closed world, and then ask what happens when it is considered as part of an
|
|
open world. Practically, this effect on a computer language is to replace the
|
|
names/tokens/identifiers for URIs. Thus, where before reference could only be
|
|
made to something in the same document/program/module one can with equal ease
|
|
make reference to something in a different one somewhere in that abstract
|
|
space which is the Web. </p>
|
|
|
|
<p>In a clean case, this will be done so that the URI for an object is rather
|
|
naturally related to its representation in the original language. For
|
|
example, the element with ID "foo" in bar.xml is bar.xml#foo. However, to do
|
|
the same for an attribute defined in a DTD or schema is more difficult,
|
|
because of the complex nature of the spaces and subspaces for element and
|
|
attribute names in XML. It is great when the webized language is very similar
|
|
to the original language, and ideal when it actually compiles. Dan Connolly's
|
|
2000/8 <a href="#Connolly,">webization of KIF</a> uses URIs for identifiers,
|
|
but to be accurate because URIs are case sensitive and KIF tokens not, lower
|
|
case letter had to be marked with escaped with backslashes in the translation
|
|
which made the result less readable. Changing the underlying language in
|
|
small ways can make the translation much less cumbersome!. </p>
|
|
|
|
<p>Here is a slightly flippant view on the webize() function, each row of
|
|
which probably needs an essay of explanation, but provided here without
|
|
any.</p>
|
|
|
|
<table border="4">
|
|
<caption></caption>
|
|
<tbody>
|
|
<tr>
|
|
<td>x</td>
|
|
<td>webize(x)</td>
|
|
</tr>
|
|
<tr>
|
|
<td>Hypertext</td>
|
|
<td>WWW</td>
|
|
</tr>
|
|
<tr>
|
|
<td>Data</td>
|
|
<td><a href="LinkedData.html">Linked data</a></td>
|
|
</tr>
|
|
<tr>
|
|
<td>Top-down structured design</td>
|
|
<td>Bottom-up ontology design</td>
|
|
</tr>
|
|
<tr>
|
|
<td>Data Hiding</td>
|
|
<td>Data Re-use</td>
|
|
</tr>
|
|
<tr>
|
|
<td>Goto Considered Harmful</td>
|
|
<td>Goto drives the economy</td>
|
|
</tr>
|
|
<tr>
|
|
<td>unix file system</td>
|
|
<td><a href="CloiudStorage.html">ACL'd r/w linked data</a></td>
|
|
</tr>
|
|
<tr>
|
|
<td>Large-scale structure: Hierachy</td>
|
|
<td><a href="Fractal.html">Large-scale structure Scale free</a></td>
|
|
</tr>
|
|
<tr>
|
|
<td>"Tired"</td>
|
|
<td>"Wired"</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
|
|
<h3><a name="Example" id="Example">Example - webizing a database</a> </h3>
|
|
|
|
<p>Imagine that a database is to be made available on the web in RDF. Suppose
|
|
the database itself will have a URI of http://weather.org/current An SQL
|
|
database is essentially a closed world, in that the various thing in it were
|
|
not designed to be linked to from outside. An SQL statement </p>
|
|
<pre>SELECT temp, zip FROM weather WHERE temp > 30</pre>
|
|
|
|
<p>makes reference to terms which have meaning within the database. There is
|
|
no reference in that statement to the database - that is simply part of the
|
|
context. </p>
|
|
|
|
<p>Now suppose we determine what the URI will be for the pieces of the
|
|
database, perhaps current/weather for a table, and current/weather.temp for a
|
|
column in a table. We could then expend the syntax (excuse my SQL - I am
|
|
making this up) </p>
|
|
<pre><span style="color: #FF0000">USING c FOR http://weather.org/current</span><br style="color: #FF0000" />
|
|
|
|
<span style="color: #FF0000">USING u FOR http://places.org/usa</span><br />
|
|
|
|
SELECT <span style="color: #FF0000">c:</span>readings.temp, <span style="color: #FF0000">u:</span>location.lat, <span style="color: #FF0000">u:</span>location.long
|
|
FROM JOIN <span style="color: #FF0000">c:</span>readings, <span style="color: #FF0000">u:</span>location
|
|
WHERE <span style="color: #FF0000">c:</span>readings.zip = <span style="color: #FF0000">u:</span>location.zip
|
|
AND <span style="color: #FF0000">c:</span>readings.temp > 30;</pre>
|
|
|
|
<p>This is an (incorrect I expect @@@) SQL which links out of the local
|
|
database to combine it with information from a remote one. This syntax I am
|
|
sure won't work in practice, but should illustrate the principle. Namespaces
|
|
c and u are introduced for two reasons: for brevity, as repeating them in the
|
|
code would have been too cumbersome; and for syntactic reasons as URIs tend
|
|
to contain characters which would be ambiguous with other syntax is allowed
|
|
in SQL column names. </p>
|
|
|
|
<p>Of course, whether actually SQL on a set of scattered databases is
|
|
valuable may be questionable - it may not optimize as well as some other
|
|
query languages. However, suddenly the things defined by the database are
|
|
available to the outside world. For example, the concept of temperature
|
|
reading as used by weather.org in its database of current conditions </p>
|
|
|
|
<p><code>http://weather.org/current/readings.temp</code> </p>
|
|
|
|
<p>is now a concept, an RDF property in fact, which is available for all the
|
|
world to refer to. These references need not all be in SQL. Because the
|
|
schema for the database will declare it to be an RDF property or something
|
|
equivalent, many different systems can use the information and refer to the
|
|
concept. </p>
|
|
|
|
<h4 id="Notes">Notes specifically on this example </h4>
|
|
|
|
<p>I note, before we leave this example, that there are two concepts
|
|
important to a table. One is the type of thing described by a row. A row in
|
|
the reading table, for example, defined a weather reading, something which
|
|
had a location and temperature and humidity and place. The other concept is
|
|
the set of objects which are actually in the table. In the classic SQL
|
|
example of the employees table, there is a rdf:class employee, subclass of
|
|
person, and also the fact that someone works for the company iff they are in
|
|
the table. </p>
|
|
|
|
<p>A second note on exporting databases. When you really put something on the
|
|
web, there is often, for flexibility and security, a layer between what you
|
|
expose and the internal storage. Just as web pages are not files though often
|
|
closely related to files, and have the same form - a string of bytes and a
|
|
MIME type. Exposed remote operations are not local procedures though closely
|
|
related to them, and have the same form -- a service URI and a method name
|
|
and parameters. Similarly one would probably export a derived view of a
|
|
database in many cases - one which would have the form of a database. This
|
|
allows different engineering decisions to be made on the external
|
|
manifestation (persistent and what the customer wants) and the internal form
|
|
(efficient and convenient for you). </p>
|
|
|
|
<h2 id="webize">Webizing nested languages </h2>
|
|
|
|
<p>Sometimes this is easy and sometimes it is hard. It is hard, for example,
|
|
when the language uses nested scoping to great effect. In this case there is
|
|
a very large amount of context which is completely different between the
|
|
beginning and end of such a link. The <em>go to</em> instruction is
|
|
considered harmful [<a href="#GTCH">ref</a>] by Dijkstra because it "<em>as
|
|
it stands is just too primitive; it is too much an invitation to make a mess
|
|
of one's program</em>." This of course is true of the hypertext link too, in
|
|
a way. Both allow an open webbed world which typically, if used with no
|
|
restraint, remove rules which give sanity and analysability to a language and
|
|
allow optimization of the code compiled. So, just as some languages prevent
|
|
one from jumping into or out of an inner loop of a program, so it may make no
|
|
sense to allow a link to be made into something within a nested structure,
|
|
because the referenced thing just does not have any meaning when taken out of
|
|
context. </p>
|
|
|
|
<p>When dealing with language which have nested context, it may be necessary
|
|
either to define how something inside represented independently of context,
|
|
or to make it impossible. </p>
|
|
|
|
<p>Be careful, though, before jumping to this conclusion. In many cases, it
|
|
is important to webize nested objects completely. For example, in a 3d scene
|
|
language, an object may be within a scene within an object within a scene and
|
|
still have identity which is important to be able to refer to. In a hypertext
|
|
document, there is a nested context which for example affects the style, and
|
|
the reference is made to the destination anchor not as a isolated piece of
|
|
hypertext, but in the context of the whole document. </p>
|
|
|
|
<p>The principle that on the Web, anything must be able to say anything about
|
|
anything means that these innermost nested objects must have URIs. </p>
|
|
|
|
<p>It may also be the case that an attempt to webize a language reveals bad
|
|
points in the design which really need to be ironed out anyway for the cause
|
|
of good software engineering. If a name in some module has in fact quite
|
|
different meanings when used in different contexts, then it isn't suitable
|
|
for webizing as it is, and maybe two separate derived URIs should be made in
|
|
the mapping. Maybe the language should actually be cleaned up so that the
|
|
concepts are distinct. </p>
|
|
|
|
<p>A very simple case is in a documentation control system, when humans use
|
|
the same document name ("the pipe size draft") to refer to a particular
|
|
document and also to the set of documents from </p>
|
|
|
|
<p>An exercise for the reader is to contemplate and determine whether it is
|
|
webized, and if not, what it would take, and what would be the cleanest way
|
|
of going it. Try looking at XML schemas (what is the URI of an element
|
|
type?). </p>
|
|
|
|
<p>When stuck, recourse to common sense. Ask what the construct actually
|
|
represents in a global context, if anything. This might mean clarifying the
|
|
language itself. </p>
|
|
|
|
<h2><a name="Conclusion" id="Conclusion">Conclusion</a> </h2>
|
|
|
|
<p>Webizing a language involves turning from a system which assumes a closed
|
|
world to one which will operate as part of the open web. Some cases are
|
|
easier than others. Webizing one application gets one a good idea of what
|
|
sorts of design decisions force a closed world assumption and make webizing
|
|
difficult, and what by contrast makes a weblike application which immediately
|
|
benefits from the rest of everything out there. </p>
|
|
<hr />
|
|
|
|
<h2 id="References">References </h2>
|
|
|
|
<p><a name="GTCH" id="GTCH">GTCH</a>: Edsger W. Dijkstra, "<a
|
|
href="http://www.acm.org/classics/oct95/">Go To Statement Considered
|
|
Harmful</a>", <em>Communications of the ACM</em>, Vol. 11, No. 3, March 1968,
|
|
pp. 147-148. </p>
|
|
|
|
<p><a name="Connolly,">Connolly, Dan,</a> "<a
|
|
href="/2000/07/hs78/KIF.html">Knowledge Interchange Format (KIF) as an RDF
|
|
Schema</a>", 2000/8 </p>
|
|
|
|
<p><a href="Overview.html">Up to Design Issues</a> </p>
|
|
|
|
<p><a href="../People/Berners-Lee">Tim BL</a> </p>
|
|
|
|
<p>2000/8/31 </p>
|
|
</body>
|
|
</html>
|