

								<?xml version="1.0"?>

								<html xmlns="http://www.w3.org/1999/xhtml">

								  <head>

								    <meta name="generator" content=

								    "HTML Tidy for Mac OS X (vers 31 October 2006 - Apple Inc. build 13), see www.w3.org" />

								    <title>

								      Web design issues; What the Semantic Web can represent

								    </title>

								    <meta http-equiv="Content-Type" content="text/html" />

								    <link href="di.css" rel="stylesheet" type="text/css" />

								  </head>

								  <body bgcolor="#DDFFDD" text="#000000">

								    <address>

								      Tim Berners-Lee

								      <p>

								        <small>Date: September 1998. Last modified: $Date:

								        1998/09/17 20:10:41 $</small>

								      </p>

								      <p>

								        Status: . Editing status: Comments please. A parenthetical

								        discussion to the <a href="Architecture.html">Web

								        Architecture at 50,000 feet</a> and the <a href=

								        "Semantic.html">Semantic Web roadmap</a>.

								      </p>

								    </address>

								    <p>

								      <a href="Overview.html">Up to Design Issues</a>

								    </p>

								    <hr />

								    <p>

								      Parenthetically, so as not to disturb the flow of what a

								      semantic web <i>is</i>,...what it is not, and how other data

								      models map into directed labelled graphs.

								    </p>

								    <h1>

								      What the Semantic Web can represent

								    </h1>

								    <p>

								      There are many other data models which RDF's Directed

								      Labelled Graph (DLG) model compares closely with, and maps

								      onto. This page is written with the intention of enumerating

								      the similarities and differences between the models, to indicate

								      how the mapping might be done and what extra information

								      must be added in the process. Where the other models are

								      related to previous unmet promises of computer science, now

								      passed into folklore as unsolvable problems, they suggest a

								      fear that the goal of a Semantic Web is inappropriate.

								    </p>

								    <p>

								      One consistent difference between the Semantic Web and many

								      data models for programming languages is the "closed world

								      assumption": many such models treat anything not stated as

								      false, whereas the Semantic Web does not.

								    </p>
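								    <p>
								      A minimal sketch of that difference, in Python. The facts,
								      the function names and the use of <code>None</code> for
								      "unknown" are all assumptions made for illustration; none of
								      them comes from RDF itself.
								    </p>

```python
from typing import Optional

# Hypothetical facts, each a (subject, property, value) triple.
FACTS = {
    ("car1", "wheels", "4"),
    ("car1", "color", "red"),
}


def closed_world_holds(triple) -> bool:
    """Closed world: anything not recorded is taken to be false."""
    return triple in FACTS


def open_world_holds(triple) -> Optional[bool]:
    """Open world: a missing statement is simply unknown (None)."""
    return True if triple in FACTS else None
```

								    <p>
								      Under the closed reading a question about an unrecorded
								      statement gets the answer "no"; under the open reading it
								      gets no answer at all, because someone elsewhere on the Web
								      may yet assert it.
								    </p>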

								    <h3>

								      <a name="Semantic" id="Semantic">A Semantic Web is not

								      Artificial Intelligence</a>

								    </h3>

								    <p>

								      The concept of machine-understandable documents does not

								      imply some magical artificial intelligence which allows

								      machines to comprehend human mumblings. It only indicates a

								      machine's ability to solve a well-defined problem by

								      performing well-defined operations on existing well-defined

								      data. Instead of asking machines to understand people's

								      language, it involves asking people to make the extra effort.

								    </p>

								    <p>

								      Even though it is simple to define, RDF at the level with the

								      power of a semantic web will be a complete language, capable of

								      expressing paradox and tautology, and in which it will be

								      possible to phrase questions whose answers would to a machine

								      require a search of the entire web and an unimaginable amount

								      of time to resolve. This should not deter us from making the

								      language complete. Each mechanical RDF application will use a

								      schema to restrict its use of RDF to a deliberately limited

								      language. However, when links are made between the RDF webs,

								      the result will be an expression of a huge amount of

								      information. It is clear that because the Semantic Web must

								      be able to include all kinds of data to represent the world,

								      the language itself must be completely expressive.

								    </p>

								    <h3>

								      <a name="semantic2" id="semantic2">A semantic Web will not

								      require every application to use expressions of arbitrary

								      complexity</a>

								    </h3>

								    <p>

								      Even though the language itself allows expressions of

								      arbitrary complexity and computability, applications which

								      generate RDF will in practice be limited to generating simple

								      expressions such as access control lists, privacy

								      preferences, and search criteria. This does not mean that

								      where a "not" is needed, it should not be drawn from a

								      standard vocabulary so that any RDF engine will be able to

								      recognise it as a "not".

								    </p>

								    <p>

								      (more)

								    </p>

								    <h3>

								      <a name="semantic1" id="semantic1">A semantic Web will not

								      require proof generation to be useful: proof validation will

								      be enough.</a>

								    </h3>

								    <p>

								      The first uses, such as access control on web sites, involve

								      validation of a previously prepared proof, not a requirement

								      to answer an arbitrary question or to find the path to construct

								      a valid proof. It is well known that to search for and

								      generate a proof for an arbitrary question is typically an

								      intractable process for many real world problems, and RDF

								      does not require this (unsolvable) problem to be solved to be

								      useful.

								    </p>
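								    <p>
								      To make the asymmetry concrete, here is a small sketch of a
								      proof checker. The proof format (a list of steps, each
								      justified either as a premise or by modus ponens from two
								      earlier steps) and the name <code>validate_proof</code> are
								      assumptions of this sketch, not a proposed RDF proof
								      language. The point is only that checking visits each step
								      once, however hard the proof was to find.
								    </p>

```python
# Hypothetical proof format: each step is (conclusion, justification).
# A justification is "premise" or a tuple ("mp", i, j), meaning the
# conclusion follows by modus ponens from earlier steps i and j,
# where step j has the form ("implies", conclusion_of_step_i, conclusion).


def validate_proof(premises, steps):
    """Return True if every step is a premise or a valid modus ponens."""
    for index, (conclusion, why) in enumerate(steps):
        if why == "premise":
            if conclusion not in premises:
                return False
        else:
            kind, i, j = why
            # Only steps that come earlier in the proof may be cited.
            if kind != "mp" or i not in range(index) or j not in range(index):
                return False
            antecedent = steps[i][0]
            rule = steps[j][0]
            if rule != ("implies", antecedent, conclusion):
                return False
    return True


# An access-control style example with made-up statements.
premises = {
    "alice is staff",
    ("implies", "alice is staff", "alice may read"),
}
good_proof = [
    ("alice is staff", "premise"),
    (("implies", "alice is staff", "alice may read"), "premise"),
    ("alice may read", ("mp", 0, 1)),
]
bad_proof = good_proof[:2] + [("alice may write", ("mp", 0, 1))]
```

								    <p>
								      A web server in this picture would be handed
								      <code>good_proof</code> by the client and would only have to
								      run the check; it never has to search for the proof itself.
								    </p>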

								    <h3>

								      <a name="semantic" id="semantic">A semantic web is not an

								      exact rerun of a previous failed experiment</a>

								    </h3>

								    <p>

								      Other concerns at this point are raised about the

								      relationship to Knowledge representation systems: has this

								      not been tried before with projects such as <a href=

								      "Semantic.html#kif">KIF</a> and <a href=

								      "Semantic.html#cyc">cyc</a>? The answer is yes, it has, more

								      or less, and such systems have been developed a long way.

								      They should feed the semantic Web with design experience and

								      the Semantic Web may provide a source of data for reasoning

								      engines developed in similar projects.

								    </p>

								    <p>

								      Many KR systems had a problem merging or interrelating two

								      separate knowledge bases, as the model was that any concept

								      had one and only one place in a tree of knowledge. They

								      therefore did not scale, or pass the test of independent

								      invention [see evolvability]. The RDF world, by contrast, is

								      designed with this in mind, and with the retrospective

								      documentation of relationships between originally independent

								      concepts.

								    </p>

								    <h3>

								      <a name="Knowledge" id="Knowledge">Knowledge Representation

								      goes Global</a>

								    </h3>

								    <p>

								      Knowledge representation is a field which currently seems

								      to have the reputation of being initially interesting, but

								      which did not seem to shake the world to the extent that some

								      of its proponents hoped. It made sense but was of limited use

								      on a small scale, but never made it to the large scale. This

								      is exactly the state which the hypertext field was in before

								      the Web. Each field had made certain centralist assumptions

								      -- if not in the philosophy, then in the implementations,

								      which prevented them from spreading globally. But each field

								      was based on fundamentally sound ideas about the

								      representation of knowledge. The Semantic Web is what we will

								      get if we perform the same globalization process to Knowledge

								      Representation that the Web initially did to Hypertext. We

								      remove the centralized concepts of absolute truth, total

								      knowledge, and total provability, and see what we can do with

								      limited knowledge.

								    </p>

								    <h2>

								      <a name="ER" id="ER">The Semantic Web and Entity-Relationship

								      models</a>

								    </h2>

								    <p>

								      Is the RDF model an entity-relationship model? Yes and no. It

								      is great as a basis for ER-modelling, but because RDF is used

								      for other things as well, RDF is more general. RDF is a model

								      of entities (nodes) and relationships. If you are used to the

								      "ER" modelling system for data, then the RDF model is

								      basically an opening of the ER model to work on the Web. A

								      typical ER model involves entity types, and for each entity

								      type there is a set of relationships (slots in the typical

								      ER diagram). The RDF model is the same, except that

								      relationships are first class objects: they are identified by

								      a URI, and so anyone can make one. Furthermore, the set of

								      slots of an object is not defined when the class of an object

								      is defined. The Web works through anyone being (technically)

								      allowed to say anything about anything. This means that a

								      relationship between two objects may be stored apart from any

								      other information about the two objects. This is different

								      from object-oriented systems often used to implement ER

								      models, which generally assume that information about an

								      object is stored in an object: the definition of the class of

								      an object defines the storage implied for its properties.

								    </p>

								    <p>

								      For example, one person may define a vehicle as having a

								      number of wheels and a weight and a length, but not foresee a

								      color. This will not stop another person making the assertion

								      that a given car is red, using the color vocabulary from

								      elsewhere.

								    </p>
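								    <p>
								      The vehicle example might be sketched as follows. All of the
								      URIs are invented for illustration, and the triples are
								      plain Python tuples rather than any real RDF syntax.
								    </p>

```python
# Hypothetical URIs, used only for illustration.
VEHICLE = "http://example.org/vehicle#"
COLOR = "http://example.com/color#"
CAR = "http://example.net/cars/42"

# One author's vehicle vocabulary foresees wheels, weight and length.
first_source = {
    (CAR, VEHICLE + "wheels", "4"),
    (CAR, VEHICLE + "weight", "1200 kg"),
}

# Another author, elsewhere, asserts a color for the same car.
second_source = {
    (CAR, COLOR + "color", "red"),
}


def properties_of(node, graph):
    """Collect every property/value pair asserted about a node."""
    return {prop: value for subj, prop, value in graph if subj == node}


# Merging two graphs is just the union of their statements.
merged = first_source | second_source
```

								    <p>
								      Because each property is named by a URI, the second author
								      needs no permission from, or change to, the first author's
								      vocabulary: the color statement simply sits alongside the
								      others once the two sets are merged.
								    </p>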

								    <p>

								      Apart from this simple but significant change, many concepts

								      involved in ER modelling carry across directly onto the

								      Semantic Web model.

								    </p>

								    <h2>

								      <a name="Semantic1" id="Semantic1">The Semantic Web and

								      Relational Databases</a>

								    </h2>

								    <p>

								      The semantic web data model is very directly connected with

								      the model of relational databases. A relational database

								      consists of tables, which consist of rows, or records. Each

								      record consists of a set of fields. The record is nothing but

								      the content of its fields, just as an RDF node is nothing but

								      the connections: the property values. The mapping is very

								      direct:

								    </p>

								    <ul>

								      <li>a record is an RDF node;

								      </li>

								      <li>the field (column) name is an RDF propertyType; and

								      </li>

								      <li>the record field (table cell) is a value.

								      </li>

								    </ul>
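								    <p>
								      The three-part mapping above can be sketched in a few lines
								      of Python. The function name, the table data and the way
								      node and property identifiers are built from the table name
								      are assumptions of this sketch; a real mapping would mint
								      proper URIs.
								    </p>

```python
def rows_to_triples(table_name, key_column, rows):
    """Map each record to a node and each non-key cell to a triple."""
    triples = []
    for row in rows:
        # The node identifier scheme is an assumption of this sketch.
        node = f"{table_name}/{row[key_column]}"
        for column, cell in row.items():
            if column != key_column:
                # Column name becomes the property, cell becomes the value.
                triples.append((node, f"{table_name}#{column}", cell))
    return triples


# A made-up table with two records.
employees = [
    {"id": 1, "name": "Ann", "shoe_size": 7},
    {"id": 2, "name": "Bob", "shoe_size": 9},
]
triples = rows_to_triples("employee", "id", employees)
```

								    <p>
								      Each of the two records yields one node with two property
								      values, so the table of two rows becomes four statements.
								    </p>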

								    <p>

								      Indeed, one of the main driving forces for the Semantic Web

								      has always been the expression, on the Web, of the vast

								      amount of relational database information in a way that can

								      be processed by machines.

								    </p>

								    <p>

								      RDF's serialization format -- its syntax in XML -- is a very

								      suitable format for expressing relational database

								      information.

								    </p>

								    <p>

								      Relational database systems manage RDF data, but in a

								      specialized way. In a table, there are many records with the

								      same set of properties. An individual cell (which corresponds

								      to an RDF property) is not often thought of on its own. SQL

								      queries can join tables and extract data from tables, and the

								      result is generally a table. So, in practice,

								      RDB software is typically optimized for doing operations

								      with a small number of tables some of which may have a large

								      number of elements.

								    </p>

								    <p>

								      RDB systems have datatypes at the atomic (unstructured)

								      level, as RDF and XML will/do. Combination rules tend in RDBs

								      to be loosely enforced, in that a query can join tables by

								      any columns which match by datatype -- without any check on

								      the semantics. You could for example create a list of houses

								      that have the same number of rooms as an employee's shoe

								      size, for every employee, even though the sense of that would

								      be questionable.

								    </p>
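								    <p>
								      As a sketch of how little stands in the way of such a query,
								      here is a join written in Python over two made-up tables.
								      The data and the helper <code>join_on</code> are
								      hypothetical; the only thing the join checks is that the two
								      values are equal.
								    </p>

```python
# Two made-up tables whose columns both happen to hold integers.
houses = [
    {"address": "1 Elm St", "rooms": 7},
    {"address": "2 Oak Ave", "rooms": 5},
]
employees = [
    {"name": "Ann", "shoe_size": 7},
    {"name": "Bob", "shoe_size": 9},
]


def join_on(left, left_col, right, right_col):
    """Pair up rows whose chosen columns hold equal values."""
    return [
        (a, b)
        for a in left
        for b in right
        if a[left_col] == b[right_col]
    ]


# Nothing stops us joining rooms against shoe sizes.
matches = join_on(houses, "rooms", employees, "shoe_size")
```

								    <p>
								      The result pairs a house with an employee, and nothing in
								      the data model says whether that pairing means anything.
								    </p>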

								    <p>

								      The Semantic Web is not designed just as a new data model -

								      it is specifically appropriate to the linking of data of many

								      different models. One of the great things it will allow is to

								      add information relating different databases on the Web, to

								      allow sophisticated operations to be performed across them.

								    </p>

								    <h2>

								      <a name="Inference" id="Inference">RDF is not an Inference

								      system</a>

								    </h2>

								    <p>

								      I am not proposing any FOPC (first-order predicate calculus)

								      or HOL (higher-order logic) inference engine. I just

								      note that HOL allows integration of multiple systems which

								      use different inference engines spanning the range from

								      SQL to AI. For example, a simple HOL would allow any SHOE

								      rules, data and results to be expressed, and a proof found by a

								      SHOE engine to be verified by anyone.

								    </p>

								    <h3>

								      <a name="Surely" id="Surely">Surely all first-order or

								      higher-order predicate calculus based systems (such as KIF)

								      have failed historically to have wide impact?</a>

								    </h3>

								    <p>

								      The same was true of hypertext systems between 1970 and 1990,

								      i.e. before the Web. Indeed, the same objection was raised to

								      the Web, and the same reasons apply for pressing on with the

								      dream.

								    </p>

								    <p>

								      The problem with all such systems was that they were

								      conceptually or physically centralized. They required

								      global link consistency.

								    </p>

								    <p>

								      Guess what? KIF is very centralized in its approach to

								      organizing knowledge (the cyc ontology for example suggests

								      that everyone agree on the same terms for common English

								      words, which RDF does not) and it does not promote its

								      concepts to being first class web objects, i.e. it doesn't use

								      URIs to identify them. To webize KIF or KR in general is, in

								      many ways, the same as to webize hypertext.

								      Replace identifiers with URIs. Remove any requirement for

								      global consistency. Put a significant effort into getting

								      critical mass. Sit back.

								    </p>

								    <h3>

								      Surely, many things expressible in FOPC are not efficiently

								      computable?

								    </h3>

								    <p>

								      Dead right. The goal of the semantic web is to express real

								      life. Many things in real life, real questions which we will

								      face are not efficiently computable. There are two solutions

								      to this: The classical (pre-web) solution is to constrain the

								      language of expression so that all queries terminate in finite

								      time. The weblike solution is to allow the expression of

								      facts and rules in an overall language which is sufficiently

								      flexible and powerful to express real life. Create subsets of

								      the web in which specific constraints give you specific

								      computational properties. An analogy is with the

								      human-information systems which existed before the web. Most

								      forced one to keep one's data in a hierarchy (sometimes of

								      fixed depth) or a matrix (often with a specific number of

								      dimensions). This gave consistency properties within the

								      information system. I bet DARPA has many of these systems and

								      still does. The only way they could be integrated was to

								      express them in terms of a much more powerful language -

								      global hypertext. Hypertext did not have any of these

								      reassuring properties. People were frightened about getting

								      lost in it. You could follow links forever. As it turns out,

								      it is true of course that there is a problem that you can

								      follow links forever in the Web. And on the Semantic Web an

								      inference engine will not necessarily terminate. However, on

								      the Web there are many subsystems such as many websites where

								      life is very ordered and predictable, and searches give

								      definitive results and there are no dangling links. But there

								      is a HUGE advantage from exposing all this information in a

								      way that allows it to be unified with all the other systems,

								      ordered and unordered.

								    </p>

								    <h3>

								      We should not expect a base inference level to include

								      non-decidable computations

								    </h3>

								    <p>

								      I have no expectation of any inference capability in the SW

								      core design. The semantic web does not have HOL inference as

								      a standard. I would expect any SW compliant device to be able

								      to <em>validate</em> a HOL proof, but not <em>generate</em>

								      one.

								    </p>

								    <p>

								      If you take a non-HOL-complete language and extend it to HOL,

								      unless you have first defined where you are going (by

								      defining the HOL language and expressing SHOE in it first)

								      you will very likely end up with a rather baroque HOL

								      language.

								    </p>

								    <h3>

								      The FOPC inference model is extremely intolerant of

								      inconsistency [i.e. P(x) &amp; NOT (P(x)) -&gt; Q]; the

								      semantic web has to tolerate many kinds of inconsistency.

								    </h3>

								    <p>

								      Toleration of inconsistency can only be done by fuzzy systems.

								      We need a semantic web which will provide guarantees, and

								      about which one can reason with logic. (A fuzzy system might

								      be good for finding a proof -- but then it should be able to

								      go back and justify each deduction logically to produce a

								      proof in the unifying HOL language which anyone can check.)

								      Any real SW system will work not by believing anything it

								      reads on the web but by checking the source of any

								      information. (I wish people would learn to do this on the Web

								      as it is!). So in fact, a rule will allow a system to infer

								      things only from statements of a particular form signed by

								      particular keys. Within such a system, an inconsistency is a

								      serious problem, not something to be worked around. If my bank

								      says my bank balance is $100 and my computer says it is $200,

								      then we need to figure out the problem. Same with launching

								      missiles, IMHO. The semantic web model is that a URI

								      dereferences to a document which parses to a directed labeled

								      graph of statements. The statements can have URIs as

								      parameters, so they can make statements about documents and

								      about other statements. So you can express trust and reason

								      about it, and limit your information to trusted consistent

								      data.

								    </p>
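								    <p>
								      A rough sketch of limiting oneself to trusted, consistent
								      data. Everything here is an assumption for illustration: the
								      signer field stands in for a verified signature (no
								      cryptography is performed), the key names are invented, and
								      each property is treated as having a single value, which is
								      what makes two balances a conflict.
								    </p>

```python
# Each statement is (subject, property, value, signer). The signer is a
# stand-in for a verified signature key; no cryptography is done here.
statements = [
    ("account1", "balance", 100, "bank-key"),
    ("account1", "balance", 200, "unknown-key"),
]


def trusted(statements, trusted_keys):
    """Keep only statements signed by a key we have chosen to trust."""
    return [s for s in statements if s[3] in trusted_keys]


def conflicts(statements):
    """Return (subject, property) pairs asserted with differing values."""
    seen = {}
    for subject, prop, value, _signer in statements:
        seen.setdefault((subject, prop), set()).add(value)
    return {key for key, values in seen.items() if len(values) != 1}
```

								    <p>
								      Taken together the two statements conflict. Restricted to
								      the bank's key they do not. If two <em>trusted</em> sources
								      disagreed, <code>conflicts</code> would still report it, and
								      that is the case which has to be investigated rather than
								      tolerated.
								    </p>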

								    <h3>

								      Again, extension to higher order logic makes sense to me,

								      requirement of FOPC inference model seems dangerous.

								    </h3>

								    <p>

								      Most KR systems confuse information with inference tips. When

								      a system stores a rule <em>a daughter of one's daughter is

								      one's granddaughter</em> it is typically not just stored as

								      that statement, but in a table of rules to be used by the

								      algorithm at a particular time (for example whenever a parent

								      of a daughter is found). The classification between data and

								      various types of rule is a sort of meta level information

								      which is generally not itself expressed in the language. Two

								      systems must be able to interchange the logical meaning of

								      the rule, even when the type of rule may be unknown to each

								      other's inference engines. (Of course, the rule expressed in

								      general logic may be recognizable as a rule by another system

								      and absorbed as such.) The example above is logically

								    </p>

								    <p>

								      &forall;a,b,c (d(a,b) &amp; d(b,c) =&gt;

								      gd(a,c))

								    </p>

								    <p>

								      while for example a SHOE-based system and an Algernon-based

								      system may have quite different systems for applying rules at

								      different times.

								    </p>
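								    <p>
								      One way a particular engine might apply the granddaughter
								      rule is sketched below. This is only one possible strategy
								      (deriving everything eagerly in a single pass) and is not
								      how SHOE or Algernon work; the tuple encoding and the names
								      are assumptions. The logical content of the rule is the same
								      whichever strategy is used.
								    </p>

```python
def apply_granddaughter_rule(facts):
    """Add gd(a, c) whenever d(a, b) and d(b, c) are both present."""
    derived = set(facts)
    for rel1, a, b in facts:
        for rel2, b2, c in facts:
            # The shared middle term b links the two daughter facts.
            if rel1 == "d" and rel2 == "d" and b == b2:
                derived.add(("gd", a, c))
    return derived


# Made-up facts in the same argument order as d(a,b) and d(b,c).
facts = {("d", "ann", "beth"), ("d", "beth", "cara")}
result = apply_granddaughter_rule(facts)
```

								    <p>
								      A different engine could instead leave the facts alone and
								      apply the rule only when asked a question about
								      <code>gd</code>; both would agree on what the rule means.
								    </p>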

								    <h2>

								      <a name="CG" id="CG">Conceptual Graphs and the Semantic

								      Web</a>

								    </h2>

								    <p>

								      I have written <a href="CG.html">a separate set of

								      notes</a> about the relationship between Conceptual Graphs and

								      the Semantic Web.

								    </p>

								    <hr />

								    <p>

								      A few unsorted references - see also other pages in this set.

								    </p>

								    <ul>

								      <li>

								        <a href=

								        "http://www.cs.umd.edu/projects/plus/SHOE/index.html">SHOE:

								        simple hypertext ontology extensions</a>

								      </li>

								    </ul>


								    <p>

								      References on KR on the Web from Tim Finin:

								    </p>

								    <p>

								      Here are some relevant papers from the <a href=

								      "http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-23/">

								      IJCAI-99 Workshop on Intelligent Information Integration</a>.

								      The first is a nice overview...

								    </p>

								    <ul>

								      <li>

								        <a href=

								        "http://www.cs.vu.nl/~frankh/postscript/IJCAI99-III.html">Practical

								        Knowledge Representation for the Web</a>, Frank van

								        Harmelen and Dieter Fensel,

								      </li>

								      <li>

								        <a href=

								        "http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-23/crainfield-ijcai99-iii.pdf">

								        UML as an Ontology Modelling Language</a>, Stephen

								        Cranefield, Martin Purvis,

								      </li>

								      <li>

								        <a href=

								        "http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-23/fensel-ijcai99-iii.ps">

								        On2broker: Semantic-Based Access to Information Sources at

								        the WWW</a>, Dieter Fensel, Jurgen Angele, Stefan Decker,

								        Michael Erdmann, Hans-Peter Schnurr, Steffen Staab, Rudi

								        Studer, Andreas Witt,

								      </li>

								    </ul>

								    <p>

								      and here are some others of possible interest...

								    </p>

								    <p>

								      Embedding Knowledge in Web Documents, Philippe Martin and

								      Peter Eklund, Eighth International World Wide Web Conference,

								      Toronto, May 11-14, 1999.

								    </p>

								    <p>

								      Ontobroker: Or How to Enable Intelligent Access to the WWW,

								      Dieter Fensel, Stefan Decker, Michael Erdmann, and Rudi

								      Studer, Eleventh Workshop on Knowledge Acquisition, Modeling

								      and Management, Voyager Inn, Banff, Alberta, Canada, Saturday

								      18th to Thursday 23rd April, 1998

								    </p>

								    <p>

								      and if we want a good overview of cyc as a backgrounder

								    </p>

								    <p>

								      CYC: A Large-Scale Investment in Knowledge Infrastructure,

								      Douglas B. Lenat, CACM, 1995. I have a local copy at

								      http://www.cs.umbc.edu/471/papers/cyc95.pdf

								    </p>

								  </body>

								</html>