<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
system "http://www.w3.org/TR/WD-html40-970917/sgml/HTML4-loose.dtd"
>
|
|
<HTML>
|
|
<HEAD>
|
|
<!-- $Id: NOTE-webarch-extlang-19980210.html,v 1.7 1998/02/10 21:38:32 connolly Exp $ -->
|
|
<TITLE>Web Architecture: Extensible languages</TITLE>
|
|
</HEAD>
|
|
<BODY BGCOLOR="white" TEXT="black">
|
|
<DIV class="header">
|
|
<H3>
|
|
<A href="../../"><IMG border="none" align="left" alt="W3C" src="../../Icons/WWW/w3c_home"></A>
|
|
</H3>
|
|
<H1 align="center">
|
|
Web Architecture: Extensible Languages
|
|
</H1>
|
|
<H3 align="center">
|
|
W3C Note 10 Feb 1998
|
|
</H3>
|
|
<DL>
|
|
<DT>
|
|
This Version:
|
|
<DD>
|
|
<A HREF="http://www.w3.org/TR/1998/NOTE-webarch-extlang-19980210">http://www.w3.org/TR/1998/NOTE-webarch-extlang-19980210</A>
|
|
<DT>
|
|
Latest Version:
|
|
<DD>
|
|
<A HREF="http://www.w3.org/TR/NOTE-webarch-extlang">http://www.w3.org/TR/NOTE-webarch-extlang</A>
|
|
<DT>
|
|
Authors:
|
|
<DD>
|
|
<A href="http://www.w3.org/People/Berners-Lee/">Tim Berners-Lee</A>
|
|
<TT><A href="mailto:timbl@w3.org">&lt;timbl@w3.org&gt;</A></TT> W3C <BR>
|
|
<A href="http://www.w3.org/People/Connolly/">Dan Connolly</A>
|
|
<TT><A href="mailto:connolly@w3.org">&lt;connolly@w3.org&gt;</A></TT> W3C
|
|
</DL>
|
|
<H2>
|
|
Status of This Document
|
|
</H2>
|
|
<P>
|
|
<I>This document is a NOTE made available by the W3 Consortium for discussion
|
|
only. This indicates no endorsement of its content, nor that the Consortium
|
|
has, is, or will be allocating any resources to the issues addressed by the
|
|
NOTE.</I>
|
|
<P>
|
|
This work is related to the Architecture domain of the W3C, and particularly
|
|
to the <A HREF="../../XML/">XML</A> activity, but is also related to the
|
|
<A HREF="../../MarkUp/">HTML</A>, <A HREF="../../Protocols/HTTP/">HTTP</A>
|
|
and <A HREF="../../Metadata/">Metadata</A> activities.
|
|
<P>
|
|
Comments should be sent to the authors and
|
|
<A HREF="mailto:www-talk@w3.org">www-talk@w3.org</A>.
|
|
<P>
|
|
This document is meant to be a fairly explanatory synthesis of the requirements
|
|
for namespace extension in languages on the web, and in particular for the
|
|
general language planned to be the common basis of many future applications,
|
|
XML. It was originally written as part of the
|
|
"<A HREF="../../DesignIssues/">Design Issues</A>" series of notes. Whilst
|
|
technically the personal opinion of the authors, it is their best attempt as
|
|
technical coordinators at outlining common architectural principles
|
|
for W3C development.
|
|
<P>
|
|
At the time of writing [1998/02], various drafts in the XML and RDF community
|
|
address these requirements in various ways. The document may evolve
|
|
if further clarity is seen to be needed, or further requirements added. Some
|
|
open issues are noted.
|
|
<H2>
|
|
Abstract
|
|
</H2>
|
|
<P>
|
|
Experience with the task of coordinating developments by independent groups
|
|
has allowed us to define properties of languages which will allow the unfettered
|
|
growth of the Web technology in a chaotic but still well defined way. These
|
|
take the form of constraints on the language features for making reference
|
|
to multiple different vocabularies, and on languages for "schema" documents
|
|
which define those vocabularies.
|
|
<HR>
|
|
<H2>
|
|
Contents
|
|
</H2>
|
|
<OL>
|
|
<LI>
|
|
<A HREF="#Introduction">Introduction</A>
|
|
<LI>
|
|
<A HREF="#Requirements">Requirements</A>
|
|
<OL>
|
|
<LI>
|
|
<A HREF="#Glossary">Glossary</A>
|
|
<LI>
|
|
<A HREF="#Mixing">Mixing vocabularies</A>
|
|
<LI>
|
|
<A HREF="#Scenario">Scenario</A>
|
|
<LI>
|
|
<A HREF="#Local">Local scope</A>
|
|
<LI>
|
|
<A HREF="#Ambiguity">Lack of ambiguity</A>
|
|
<LI>
|
|
<A HREF="#Evolving">Evolving new schema languages</A>
|
|
<LI>
|
|
<A HREF="#Correctness">Correctness of documents with multiple vocabularies</A>
|
|
<LI>
|
|
<A HREF="#Granularity">Granularity</A>
|
|
<LI>
|
|
<A HREF="#Incorporation">Incorporation into the language</A>
|
|
</OL>
|
|
<LI>
|
|
<A HREF="#Related">References</A>
|
|
</OL>
|
|
</DIV>
|
|
<!-- end header division -->
|
|
<P>
|
|
<HR>
|
|
<H2>
|
|
<A NAME="Introduction">Introduction</A>
|
|
</H2>
|
|
<P>
|
|
When the World Wide Web Consortium was first put together, high on the list
|
|
of goals of the Consortium was making the web "evolvable". At that
|
|
time, it was a philosophical goal and it wasn't clear what it would mean
|
|
technically. Since then, W3C has had plenty of experience in the deployment
|
|
of new technology, particularly in an environment of thousands of
|
|
independent groups developing in closely related or identical fields.
|
|
<P>
|
|
The HTTP and HTML specifications have both grown rapidly in this environment.
|
|
The existence of an open and freely usable standard allows anyone in the
|
|
world to experiment with extensions. Deployment of experimental features
|
|
was enabled by one simple rule, inherited with care from the Internet email
|
|
community:
|
|
<H4>
|
|
Rule used to date:
|
|
</H4>
|
|
<P>
|
|
<TABLE BORDER CELLPADDING="4">
|
|
<TR>
|
|
<TD><P ALIGN=Left>
|
|
<I>Old rule:</I> If you find a language element you don't understand, ignore
|
|
it.</TD>
|
|
</TR>
|
|
</TABLE>
|
|
<P>
|
|
(The exact definition of "Ignore" varies - HTTP headers are actually ignored
|
|
and HTML elements are replaced with their contents (i.e. unknown tags are ignored)
|
|
- but the principle has been the same.)
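<P>
The two senses of "ignore" can be sketched in Python using only the standard
library. This is an illustration under stated assumptions, not a description
of any particular browser or server: the set <CODE>KNOWN_TAGS</CODE> and both
function names are hypothetical, and attributes on known tags are dropped for
brevity.

```python
from html.parser import HTMLParser

KNOWN_TAGS = {"p", "b", "i"}  # hypothetical set of tags this reader understands


class IgnoreUnknown(HTMLParser):
    """HTML-style ignoring: drop unknown tags but keep their contents."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in KNOWN_TAGS:
            self.out.append(f"<{tag}>")  # attributes omitted in this sketch

    def handle_endtag(self, tag):
        if tag in KNOWN_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)  # text survives whether or not its tag is known


def strip_unknown_tags(markup):
    parser = IgnoreUnknown()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


def drop_unknown_headers(headers, known):
    """HTTP-style ignoring: an unknown header is discarded, value included."""
    return {k: v for k, v in headers.items() if k.lower() in known}
```

Note that in neither case is the effect of ignoring specified by the party
that introduced the extension; the receiver simply guesses.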
|
|
<P>
|
|
This rule has covered web development from 1989 to the present. The result
|
|
has been a very high speed of growth. However, a state of ambiguity
|
|
and lack of interoperability always exists from the introduction of an
|
|
experimental feature until the later agreement on a common standard. This
|
|
weakened the reliability and credibility of the Web. Furthermore, there has
|
|
always been a threat that lack of consensus on new features would lead to
|
|
a permanent fragmentation of the evolutionary paths.
|
|
<P>
|
|
The problem was that neither the specification of new elements nor the effect
|
|
of ignoring them was ever clearly defined. Contrast this to the situation
|
|
in most distributed object systems. In these cases, objects and support
|
|
classes generally have well defined interfaces. Whilst ensuring interoperability,
|
|
the rigidity of this system, in which new interfaces had to be explicitly
|
|
agreed between parties, has been one of the factors inhibiting such systems
|
|
from spreading in web-like or virus-like manner. As discussed in
|
|
<A HREF="#Ascent">[Ascent]</A>:
|
|
<BLOCKQUOTE>
|
|
And yet the ability to combine resources that were developed independently
|
|
is an essential survival property of technology in a distributed information
|
|
system.
|
|
</BLOCKQUOTE>
|
|
<P>
|
|
Can we have the best of both worlds, and have clearly defined interfaces,
|
|
but also allow systems from different communities to communicate when having
|
|
only a partial understanding of each other's specifications? This need
|
|
has surfaced from many areas from HTTP extensions (see the
|
|
<A HREF="../../Protocols/Activity.html#PEPspec">PEP requirements</A>) to
|
|
Metadata (see design notes on
|
|
<A HREF="../../DesignIssues/Metadata.html">Metadata architecture</A>).
|
|
<H2>
|
|
<A NAME="Requirements">Requirements</A>
|
|
</H2>
|
|
<P>
|
|
The need is for two systems to be able to communicate when they have a common
|
|
vocabulary but not complete understanding of all the features they each use.
|
|
As these requirements are derived from experience across many different
|
|
systems, we will have to choose which words to use in this document.
|
|
<H4>
|
|
<A NAME="Glossary">Glossary</A>
|
|
</H4>
|
|
<P>
|
|
For the purposes of this document, words are used as follows:
|
|
<P>
|
|
<DL>
|
|
<DT>
|
|
<B>element</B>
|
|
<DD>
|
|
A range of text within a document, identified by a local identifier.
|
|
<DT>
|
|
<B>vocabulary</B>
|
|
<DD>
|
|
A set of local identifiers in a document (which identify parts of the document)
whose meaning (at some level) is defined by a generic resource. The namespace
resource conceptually represents the vocabulary in general, which may be
represented by one or more schemata.
|
|
<DT>
|
|
<B>schema</B>
|
|
<DD>
|
|
A specific document which defines a vocabulary (at some level).
|
|
|
|
</DL>
|
|
<P>
|
|
Although this is a general document, it is hoped that these terms are not
|
|
used inconsistently with their use in XML (element) and RDF (schema).
|
|
<P>
|
|
There is some rough correspondence in the soup of terms as follows.<BR>
|
|
<CENTER>
|
|
<TABLE BORDER CELLPADDING="2" ALIGN="Center">
|
|
<TR>
|
|
<TH>This document</TH>
|
|
<TH>SGML</TH>
|
|
<TH>HTTP</TH>
|
|
<TH>Programming languages</TH>
|
|
<TH>RDF</TH>
|
|
</TR>
|
|
<TR>
|
|
<TD>Element</TD>
|
|
<TD>Element</TD>
|
|
<TD>Header</TD>
|
|
<TD>Function/Procedure/Method call</TD>
|
|
<TD>Element</TD>
|
|
</TR>
|
|
<TR>
|
|
<TD>Binding</TD>
|
|
<TD>-</TD>
|
|
<TD>(PEP header)</TD>
|
|
<TD>"Import", external declaration</TD>
|
|
<TD></TD>
|
|
</TR>
|
|
<TR>
|
|
<TD>-</TD>
|
|
<TD>Entity declaration</TD>
|
|
<TD>-</TD>
|
|
<TD>#Include</TD>
|
|
<TD></TD>
|
|
</TR>
|
|
<TR>
|
|
<TD>Declaration</TD>
|
|
<TD>Element declaration</TD>
|
|
<TD>(http spec)</TD>
|
|
<TD>Function declaration</TD>
|
|
<TD></TD>
|
|
</TR>
|
|
<TR>
|
|
<TD>Schema</TD>
|
|
<TD>DTD</TD>
|
|
<TD>(none!)</TD>
|
|
<TD>Module interface definition</TD>
|
|
<TD>Schema</TD>
|
|
</TR>
|
|
<TR>
|
|
<TD></TD>
|
|
<TD>Content model</TD>
|
|
<TD></TD>
|
|
<TD>Parameter type</TD>
|
|
<TD></TD>
|
|
</TR>
|
|
<TR>
|
|
<TD></TD>
|
|
<TD>Attributes</TD>
|
|
<TD></TD>
|
|
<TD>Parameter type</TD>
|
|
<TD></TD>
|
|
</TR>
|
|
</TABLE>
|
|
</CENTER>
|
|
<P>
|
|
<H3>
|
|
<A NAME="Mixing">Mixing vocabularies</A>
|
|
</H3>
|
|
<P>
|
|
When a message is sent across the Internet as part of a Web communications
|
|
protocol, it is tempting as above to compare the message with a remote procedure
|
|
call, and to adopt the characteristics of a procedure/method call from
|
|
distributed OO systems. A procedure call identifies the target object,
|
|
one of a finite number of methods from the exported interface, and a set
|
|
of typed parameters.
|
|
<P>
|
|
However, this analogy is not powerful enough. A message should be
|
|
considered an expression and, if one takes an analogy with programming languages,
|
|
the analogy should be with an expression or program rather than with a function
|
|
call. [Or, if considered a function call, strictly, the parameters have to
|
|
be extended to allow other nested function calls]. In this case, there
|
|
may be many functions identified, in many interfaces. In other words,
|
|
don't think of an HTTP message or an HTML document as an RPC call, but rather
|
|
as the transmission of an expression in some language.
|
|
<P>
|
|
In the case of an XML document, this corresponds to a document which contains
|
|
elements whose declarations occur in many different specifications (SGML:
|
|
many different DTDs). This is the requirement brought out under "Metadata
|
|
architecture", of <A HREF="../../DesignIssues/Metadata.html#Mixing">mixing
|
|
vocabularies</A>:
|
|
<P>
|
|
<TABLE BORDER CELLPADDING="4">
|
|
<TR>
|
|
<TD>It must be possible at one point in a document for more than one vocabulary
|
|
to be in scope.</TD>
|
|
</TR>
|
|
</TABLE>
|
|
|
|
<H3>
|
|
<A NAME="Scenario">Scenario</A>
|
|
</H3>
|
|
<P>
|
|
Imagine that I send you an invoice for an aeroplane part I am shipping to
|
|
you. The invoice is mostly in common business language, and the vocabulary
|
|
such as item, cost, quantity, authorizing signature, total cost and due date
|
|
are well known to both of us. However, the item is specified in an
|
|
expression which details exactly which engine lower inspection hatch door
|
|
mount bracket lock nut is involved. Neither you nor I actually have
|
|
to understand this vocabulary and references to part numbers and the like.
|
|
Only the person or machine loading the part onto the truck, and the person
|
|
or machine installing the part in the aircraft need to know it. It
|
|
is true that we need to agree about the cost and the significance of the
|
|
signing authority, as that is part of the protocol between us.
|
|
<P>
|
|
This sort of thing happens all the time in real life. Documents mix vocabularies
|
|
defined in different places. We are always making decisions about which of
|
|
the myriad of things we don't understand are important to us. We are
|
|
constantly handling information with partial understanding. Imagine
|
|
if an old version of a word processor could read a file written by a new
|
|
version with partial understanding, rather than panicking that it had met
|
|
a being from the future. It also happens all the time on the web, as
|
|
people bury private elements such as index tags and editing information inside
|
|
HTML files.
|
|
<P>
|
|
The requirement is for the new vocabularies to be well defined, like the
|
|
basic vocabulary.
|
|
<P>
|
|
By analogy with a programming language, a Web document or protocol message
|
|
should be able to include expressions combining calls to functions
|
|
from many modules. This is so fundamental to programming languages that it
|
|
has gone without saying, but it has not been possible in SGML.
|
|
<H4>
|
|
Same scope
|
|
</H4>
|
|
<P>
|
|
What does "within the same scope" mean? It means that just nesting one sort
|
|
of document inside another is not good enough. It means that I must
|
|
be able to write an expression or compound element which combines elements
|
|
from two vocabularies. (In fact, strictly, wherever there is an expression
|
|
tree which combines identifiers from more than one vocabulary, one can in
|
|
theory break it down to a set of nested subtrees each of which only uses
|
|
one vocabulary and could be considered a "subdocument", but in practice this
is too cumbersome.) For example, if HTML is extended in this way to include
Math, one can still use HTML bold tags within a Math
expression.
|
|
<P>
|
|
<H3>
|
|
<A NAME="Local">Local scope</A>
|
|
</H3>
|
|
<P>
|
|
There is a practical requirement that it must be possible to introduce a
|
|
new vocabulary in part of a document in a way that requires changes only
|
|
locally within the document. This means that for example it must be
|
|
possible to introduce a new vocabulary within a local block. Here is an example
|
|
in an arbitrary syntax, where "NS:using" is the <B><I>binding</I></B> of
|
|
local identifiers starting with "<CODE>f</CODE>" to a schema
|
|
<CODE>http://blah/currency</CODE>
|
|
<PRE> &lt;a:details&gt;
   &lt;NS:using href="http://blah/currency" as="f"&gt;
      &lt;a:price&gt;
         &lt;f:chf&gt;4.00&lt;/f:chf&gt;
      &lt;/a:price&gt;
   &lt;/NS:using&gt;
 &lt;/a:details&gt;
</PRE>
|
|
<P>
|
|
The binding between the local identifier and the schema is textually local.
|
|
There is no need for a binding in the document's head. In general this makes
|
|
document management much easier. It makes checking a document easier, as
|
|
you can in some cases verify an embedded piece without having to check the
|
|
whole document.
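<P>
As a rough sketch of how a processor might track such textually local
bindings, the following Python keeps a stack of scopes and resolves each
prefixed element against the innermost one. It assumes the arbitrary
"NS:using" syntax of the example above; the function name
<CODE>resolve</CODE> and the URI <CODE>http://example.org/invoice</CODE>
bound to the prefix "a" are hypothetical, since the example does not say
where "a" is bound. The expat parser is used without namespace processing,
so colons are treated as ordinary name characters.

```python
import xml.parsers.expat

DOC = """<a:details>
  <NS:using href="http://blah/currency" as="f">
    <a:price>
      <f:chf>4.00</f:chf>
    </a:price>
  </NS:using>
</a:details>"""


def resolve(doc, initial=None):
    """Return (schema URI, local name) for each element, in document order."""
    scopes = [dict(initial or {})]  # stack of prefix -> schema URI mappings
    resolved = []

    def start(name, attrs):
        if name == "NS:using":
            # Entering a binding element: push a scope extended by one prefix.
            scope = dict(scopes[-1])
            scope[attrs["as"]] = attrs["href"]
            scopes.append(scope)
            return
        prefix, colon, local = name.partition(":")
        if not colon:
            resolved.append((None, name))  # unprefixed: no schema known
        else:
            resolved.append((scopes[-1].get(prefix), local))

    def end(name):
        if name == "NS:using":
            scopes.pop()  # the binding is discarded when its block closes

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(doc, True)
    return resolved
```

Because each binding is pushed and popped as its block opens and closes, the
processor never needs to look beyond the enclosing elements to interpret a
prefix.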
|
|
<H4>
|
|
Why?
|
|
</H4>
|
|
<P>
|
|
A specific need for local scoping comes from the fact that many documents
|
|
are generated (for example by CGI scripts) by calling programs to output
|
|
parts in context, and the program which generates the parts has no access
|
|
to the rest of the document.
|
|
<P>
|
|
In theory it would always be possible to take such a document with nested
|
|
bindings of namespaces, and find all those bindings, and generate new local
|
|
prefixes for each so that they are unique, and then move all the bindings
|
|
to the top of the document. Therefore, a document using local scope can be
|
|
converted into one which only uses global scoping. However, this requires
|
|
buffering of all the document, and so cannot be done in pipelined systems,
|
|
and pipelined systems are often a necessity in the Web in order to achieve
|
|
acceptable response times.
|
|
<P>
|
|
Another case involves very long documents using many namespaces. Typically
|
|
web applications have to be able to cope with documents of arbitrary length.
|
|
Imagine a document which, every paragraph, refers to a new name space. (A
|
|
proof by example would be a document documenting many namespaces, but imagine
also a list of suppliers, each of which has its own catalog schema.) As
|
|
processing of the document continues, if the bindings of namespaces are local,
|
|
then each is made and discarded. The working set needed for processing the
|
|
document is finite. In the case in which the bindings are global in scope,
|
|
then the working set size increases linearly with the length of the document,
|
|
and the product of resource utilization and processing time then rises as
|
|
the square of the document size.
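<P>
The growth argument can be checked with a toy model. The sketch below
assumes, purely for illustration, that every paragraph binds exactly one new
namespace and that the cost of processing a paragraph is proportional to the
number of bindings live at that moment; the function name
<CODE>working_set_profile</CODE> is hypothetical.

```python
def working_set_profile(paragraphs, local_scope):
    """Return (peak live bindings, sum of live bindings over all steps)."""
    live = 0
    peak = 0
    total = 0
    for _ in range(paragraphs):
        live += 1              # this paragraph binds one new namespace
        peak = max(peak, live)
        total += live          # step cost taken as proportional to the live set
        if local_scope:
            live -= 1          # a local binding is discarded when its block closes
    return peak, total
```

Under these assumptions, a document of n paragraphs with local bindings has
a peak working set of 1 and a total of n, whereas with global bindings the
peak is n and the total is n(n+1)/2, which grows as the square of the
document size.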
|
|
<P>
|
|
A third example of a need for local scoping is that for many uses of XML
|
|
(take SMIL for example) concatenation of two documents to make one document
|
|
should be a simple process. Indeed, a worthy design goal would be to require
|
|
that the concatenation of any two XML documents be an XML document. If local
|
|
scoping is not available, the concatenation function requires the rewriting
|
|
of one document from beginning to end changing local identifiers where they
|
|
clash.
|
|
<P>
|
|
In general, one can call on all the design experience of the computer science
|
|
community which, over the years, has seen the need for block structured languages
|
|
with local scoping. There have been many factors influencing this, but one
|
|
unmentioned to date has been the maintainability of programs/documents. When
|
|
the binding of a name and its use can be close together, for human-maintained
|
|
documents, mistakes are much less likely.
|
|
<P>
|
|
<TABLE BORDER CELLPADDING="2">
|
|
<TR>
|
|
<TD>It must be possible to introduce a new vocabulary in part of a document
|
|
in a way that requires changes only locally within the document.</TD>
|
|
</TR>
|
|
</TABLE>
|
|
<H3>
|
|
<A NAME="Ambiguity">Lack of ambiguity</A>
|
|
</H3>
|
|
<P>
|
|
Some programming languages allow one to introduce identifiers from new name
|
|
spaces in such a way that it is not possible to know which namespace a local
|
|
identifier belongs to without accessing both module interface specifications
and checking which one has the highest priority or has most recently,
in the document, redefined a given local identifier.
|
|
<P>
|
|
This may have some uses in a programming language such as
|
|
Java<A HREF="#Java">[Java]</A>, but it has a serious flaw in that when one
|
|
module changes (without the knowledge of the designers of the other module),
|
|
it can unwittingly redefine a local identifier used by the second module,
|
|
completely changing the meaning of a previously written document. Clearly,
|
|
in the Web world in which modules evolve but documents must have clearly
|
|
defined meanings, this is unacceptable. Contrast with Modula-3, where
|
|
all names are either lexically scoped or fully qualified
|
|
<A HREF="#SPwM3">[SPwM3]</A>.
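<P>
The hazard can be illustrated with a small Python sketch. It is not a model
of any particular programming language's rules; the module URIs, the
identifier names and the two lookup functions are all hypothetical. The
first function resolves an unqualified identifier by module priority, the
second uses an explicit prefix.

```python
def lookup_by_priority(name, modules):
    """Unqualified lookup: the first module in priority order defining the name wins."""
    for uri, names in modules:
        if name in names:
            return uri
    return None


def lookup_by_prefix(qualified, bindings):
    """Prefixed lookup: the prefix alone selects the namespace."""
    prefix, _, _local = qualified.partition(":")
    return bindings[prefix]


# Version 1 of the (hypothetical) math module does not define "price".
V1 = [("http://example.org/math", {"sum"}),
      ("http://example.org/invoice", {"price", "total"})]

# Version 2 adds "price", unknown to the authors of the invoice module.
V2 = [("http://example.org/math", {"sum", "price"}),
      ("http://example.org/invoice", {"price", "total"})]

BINDINGS = {"m": "http://example.org/math",
            "inv": "http://example.org/invoice"}
```

With <CODE>V1</CODE>, an unqualified "price" resolves to the invoice module;
with <CODE>V2</CODE>, the same unchanged document resolves it to the math
module. The prefixed form "inv:price" resolves to the invoice module in both
cases, and does so without inspecting either module.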
|
|
<P>
|
|
<TABLE BORDER CELLPADDING="2">
|
|
<TR>
|
|
<TD>The syntax must unambiguously associate an identifier in a document with
|
|
the related schema without requiring inspection of that or another schema.</TD>
|
|
</TR>
|
|
</TABLE>
|
|
<P>
|
|
This is the reason for the use of a prefix in the XML namespace proposal
|
|
to tie the use of an identifier directly to the specification of the name
|
|
space. Notice that in the example above, the fact that the binding element
|
|
actually creates a new level of nesting removes all ambiguity.
|
|
|
|
<H3>
|
|
<A NAME="Evolving">Evolving new schema languages</A>
|
|
</H3>
|
|
<P>
|
|
In SGML the "DTD" defines, for an SGML element, what possible other elements
|
|
may be nested inside it. For example, on an invoice, it may specify
|
|
that the signing authority must be either Tom or Joe. It may specify that
|
|
an item can be any part number or any accessory number or any book number.
|
|
Checking the SGML validity of a document is a process which can be done
|
|
automatically from the DTD. This is a check at a certain low level
|
|
in that it does not verify semantic correctness, only structural correctness.
|
|
But the structural constraints alone are useful in many ways. For example,
|
|
a user interface for constructing a document can be generated automatically
|
|
from the structural constraints.
|
|
<P>
|
|
We plan to introduce more powerful languages for describing not only the
|
|
structure of a document, but the semantics to an extent that not only can
|
|
checking be automated to a higher level, but also so can the processing of
|
|
a document and reasoning about its contents be automated. Therefore it is
|
|
essential that when a document is written to refer to a namespace, the name
|
|
space definition should be a generic resource whose instances may include
|
|
schemas in various languages at various levels of sophistication. This
|
|
is an essential growth point for the web.
|
|
<P>
|
|
<TABLE BORDER CELLPADDING="2">
|
|
<TR>
|
|
<TD>The resource defining a namespace may be generic and allow definitions
|
|
of the namespace in varying present or future languages.</TD>
|
|
</TR>
|
|
</TABLE>
|
|
<H3>
|
|
<A NAME="Correctness">Correctness of documents with multiple vocabularies</A>
|
|
</H3>
|
|
<P>
|
|
How does one check the validity/correctness of a document with multiple
|
|
namespaces? Clearly one must be able to find definitions of the namespaces
|
|
at the appropriate level, and combine them. Looking at the example above
|
|
of the invoice, we notice a difference.
|
|
<P>
|
|
In the case of the "content model" for an item, the designer
|
|
of the invoice intended that in fact the schema should be extensible so that
|
|
any new object could be included as an item. For example, one could
|
|
use a part number system from any new supplier, just by incorporating the
|
|
namespace. However, when it came to the "content model" for an authorizing
|
|
person, only Tom or Joe should be able to sign. No namespace extension should
|
|
be allowed to redefine the permissible content model.
|
|
<P>
|
|
<TABLE BORDER CELLPADDING="2">
|
|
<TR>
|
|
<TD>There must be a way of indicating when a given content model may be extended
|
|
by new schemas.</TD>
|
|
</TR>
|
|
</TABLE>
|
|
<P>
|
|
<TABLE BORDER CELLPADDING="2">
|
|
<TR>
|
|
<TD>There must be a way, in a new schema, of specifying that a given
|
|
new content model is designed as an extension to the existing content model
|
|
of an existing schema.</TD>
|
|
</TR>
|
|
</TABLE>
|
|
<P>
|
|
These are constraints on the schema language. (They are
|
|
<A HREF="http://www.w3.org/TR/1998/NOTE-XML-data-0105/Overview.html#OpenClosed">addressed</A>
|
|
by the XML-DATA discussion NOTE.)
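<P>
One way to picture the first of these constraints is a schema in which each
content model carries an "open" or "closed" flag. The Python below is a
minimal sketch under that assumption; it is not the XML-DATA syntax, and the
dictionary layout, the names in <CODE>INVOICE_SCHEMA</CODE> and the
namespace URIs are hypothetical.

```python
HOME = "http://example.org/invoice"  # hypothetical namespace of the invoice schema

INVOICE_SCHEMA = {
    # Items may come from any supplier's vocabulary: the model is open.
    "item": {"allowed": {"partNumber", "accessoryNumber", "bookNumber"},
             "open": True},
    # Only Tom or Joe may sign: the model is closed to extension.
    "signer": {"allowed": {"Tom", "Joe"}, "open": False},
}


def content_allowed(schema, element, child, child_namespace):
    """Decide whether `child` may appear inside `element`."""
    model = schema[element]
    if child_namespace == HOME:
        # Content from the schema's own vocabulary must be listed explicitly.
        return child in model["allowed"]
    # Content from another vocabulary is accepted only by an open model.
    return model["open"]
```

Here a part identifier from a new supplier's namespace is accepted inside
"item", while no foreign namespace can add a third signer.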
|
|
<H3>
|
|
<A NAME="Granularity">Granularity</A>
|
|
</H3>
|
|
<P>
|
|
With what granularity should one be able to define new vocabularies in XML?
|
|
The analogy with programming languages suggests that we can understand
|
|
how to add new elements (functions) but that adding new attributes to existing
|
|
elements (parameters to existing functions) is difficult to define when one
|
|
gets above the structural level.
|
|
<P>
|
|
Although schema languages do not yet exist to define semantic relations and
|
|
typing, clearly there will be need for extension of concepts to type. Perhaps
|
|
the need for content model extension will in fact represent the same need.
|
|
<H3>
|
|
<A NAME="Incorporation">Incorporation into the language</A>
|
|
</H3>
|
|
<P>
|
|
The namespace functionality is a very fundamental part of the language.
|
|
A language processor which does not understand it can check what in XML is
|
|
called "well-formedness", ie basic syntactic correctness, of a document,
|
|
but can do no more.
|
|
<P>
|
|
A fundamental processing need outlined above is "partial understanding".
|
|
I envisage three ways in which partial understanding can be accomplished,
|
|
when a document in an "original" schema's vocabulary includes some of a "new"
|
|
schema's vocabulary:
|
|
<P>
|
|
<OL>
|
|
<LI>
|
|
It may be possible to mathematically deduce what information can be ignored
|
|
from properties of the original schema;
|
|
<LI>
|
|
At a simple level this could be built into the language itself so that it
|
|
can be expressed in the document itself; (analogy with PEP extensions
|
|
to HTTP).
|
|
<LI>
|
|
The "new" schema may allow one to deduce what can be ignored. It may even
|
|
give mappings which allow expressions in the new schema's vocabulary to be
|
|
replaced with simpler expressions in better known vocabularies.
|
|
</OL>
|
|
<P>
|
|
Notice that the first two ways do not require one to be able to access or
|
|
understand the "new" schema in order to decide whether to ignore it. This
|
|
is a powerful and important feature. Taking the invoice example
above again, it is essential to be able to process the invoice at some level without
even looking up on the Web any definition of the part numbers. It is sufficient
for the invoice itself to declare that the item specifications don't matter
as far as the validity of the invoice as an invoice is concerned.
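<P>
A processor working in this way might look roughly like the following Python
sketch, which corresponds to the first of the three ways listed above. The
prefix "inv", the set <CODE>EXTENSIBLE</CODE> standing in for what the
original schema says, the sample invoice and its part identifier are all
hypothetical. Foreign subtrees are skipped where the original schema permits
extension and rejected elsewhere, and at no point is the foreign schema
consulted.

```python
import xml.parsers.expat

UNDERSTOOD_PREFIX = "inv"    # the one vocabulary this processor implements
EXTENSIBLE = {"inv:item"}    # original schema: foreign content here may be ignored

INVOICE = """<inv:invoice>
  <inv:item><parts:locknut><parts:number>LN-42</parts:number></parts:locknut></inv:item>
  <inv:cost>120.00</inv:cost>
  <inv:signer>Tom</inv:signer>
</inv:invoice>"""


def process(doc):
    """Collect text of understood elements; skip permitted foreign subtrees."""
    values, stack = {}, []
    skip = 0  # nesting depth inside an ignored foreign subtree, 0 if none

    def start(name, attrs):
        nonlocal skip
        if skip:
            skip += 1
            return
        if not name.startswith(UNDERSTOOD_PREFIX + ":"):
            if stack and stack[-1] in EXTENSIBLE:
                skip = 1  # ignorable without looking up the foreign schema
                return
            raise ValueError(f"{name} appears where extension is not allowed")
        stack.append(name)

    def end(name):
        nonlocal skip
        if skip:
            skip -= 1
            return
        stack.pop()

    def chars(data):
        if not skip and stack and data.strip():
            values[stack[-1]] = values.get(stack[-1], "") + data.strip()

    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver each text run in one piece
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(doc, True)
    return values
```

Applied to <CODE>INVOICE</CODE>, the sketch yields the cost and the signer
and silently passes over the part description; a foreign element inside
"inv:signer" raises an error instead.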
|
|
<P>
|
|
<TABLE BORDER CELLPADDING="2">
|
|
<TR>
|
|
<TD>It should be possible to create an original document schema such that
|
|
one can determine, without access to the extension schema, which uses
|
|
of extensions to that document can be ignored.</TD>
|
|
</TR>
|
|
</TABLE>
|
|
<P>
|
|
The difference between the first two ways above is whether some
|
|
functionality is regarded as basic to the language or part of a very commonly
|
|
understood namespace of elements for document construction. This design decision
|
|
is not currently clear.
|
|
<H3>
|
|
Revision and evolution of namespaces
|
|
</H3>
|
|
<P>
|
|
This document does not define the requirements of schema languages, nor of
|
|
languages with which to assert the equivalence of assertions made using different
|
|
vocabularies. However it is worth noting that the architecture expects
|
|
machine-readable documents to describe the relationship between different
|
|
schemas, including between a schema and later evolved versions of the schema.
|
|
The namespace functionality itself is not required to address that issue
|
|
directly.
|
|
<H2>
|
|
<A NAME="Related">References</A>
|
|
</H2>
|
|
<DL>
|
|
<DT>
|
|
<A NAME="Ascent">[Ascent]</A>
|
|
<DD>
|
|
<A HREF="http://www.cs.caltech.edu/~adam/papers/xml/ascent-of-xml.html"><CITE>The
|
|
Evolution of Web Documents: The Ascent of XML</CITE></A>
|
|
<DD>
|
|
Dan Connolly, Rohit Khare, and Adam Rifkin, W3J special Issue on XML, Vol
|
|
2, Number 4, Fall 1997, Pages 119-128
|
|
<DT>
|
|
<A NAME="Java">[Java]</A>
|
|
<DD>
|
|
<A HREF="http://java.sun.com/docs/books/jls/html/index.html">The Java Language
|
|
Specification</A>, James Gosling, Bill Joy, Guy Steele, Edition 1.0, (Converted
|
|
from the printed book, August 1996, first printing) esp. Section 6.5
|
|
<A HREF="http://java.sun.com/docs/books/jls/html/6.doc.html#20569"><CITE>Determining
|
|
the Meaning of a Name</CITE></A>
|
|
<DT>
|
|
<A NAME="SPwM3">[SPwM3]</A>
|
|
<DD>
|
|
Systems Programming with Modula-3, November 1989. esp. Section 2.5
|
|
<A href="http://www.research.digital.com/SRC/m3defn/html/units.html"> Modules
|
|
and interfaces</A>
|
|
<DT>
|
|
<A NAME="SMIL">[SMIL]</A>
|
|
<DD>
|
|
<A HREF="WD-smil-0202">Synchronized Multimedia Integration Language</A> W3C
|
|
Working Draft 2-February-98 Philipp Hoschka
|
|
<DD>
|
|
</DL>
|
|
<H3>
|
|
Bibliography
|
|
</H3>
|
|
<DL>
|
|
<DT>
|
|
<A HREF="http://opera.inrialpes.fr/OPERA/BibOpera.html#[Akpotsui97]"><CITE>Type
|
|
Modelling for Document Transformation in Structured Editing
|
|
Systems</CITE></A>
|
|
<DD>
|
|
E. Akpotsui, V. Quint and C. Roisin. Mathematical and Computer Modelling,
|
|
Volume 25, Number 4, Pages 1-19, 1997.
|
|
<DT>
|
|
<CITE>Theory of Semiotics</CITE>
|
|
<DD>
|
|
<A HREF="http://www.dsc.unibo.it/istituto/people/eco/eco.htm">Umberto Eco</A>
|
|
Indiana Univ Press February 1979 ISBN: 0253202175
|
|
<DT>
|
|
<A HREF="http://www.hf.ntnu.no/anv/WWWpages/Hyper/Hypermedia.html"><CITE>The
|
|
electronic hypermedia encyclopædia: transcending the constraints of
|
|
the "authoritative work"?</CITE></A>
|
|
<DD>
|
|
Patrick J. COPPOCK<BR>
|
|
The University of Trondheim <BR>
|
|
College of Arts and Science <BR>
|
|
Dept. of Applied Linguistics <BR>
|
|
N-7055 Dragvoll Norway <BR>
|
|
e-mail: patcop@alfa.avh.unit.no <BR>
|
|
e-mail: coppack@bo.nettuno.it
|
|
<DT>
|
|
<CITE>Authoritative Sources in a Hyperlinked Environment</CITE>
|
|
<DD>
|
|
IBM Research Report RJ 10076 (91892) May 29, 1997<BR>
|
|
Jon M. Kleinberg &lt;kleinber@cs.cornell.edu&gt;
|
|
<DT>
|
|
<A HREF="http://www.microsoft.com/oledev/olecom/title.htm">The Component
|
|
Object Model Specification</A>
|
|
<DD>
|
|
Draft Version 0.9, October 24, 1995 Microsoft Corporation and Digital Equipment
|
|
Corporation. (esp
|
|
<A HREF="http://www.microsoft.com/oledev/olecom/Ch01.htm#Objects">Objects
|
|
and Interfaces</A>)
|
|
<DT>
|
|
<A HREF="../WD-doctypes">HTML Dialects: Internet Media and SGML Document
|
|
Types</A>
|
|
<DD>
|
|
W3C Working Draft 06-Mar-96 Daniel W. Connolly
|
|
<DT>
|
|
<A HREF="file://ftp.cs.utexas.edu/pub/qsim/papers/Crawford-PhD-91.ps.Z">Access-Limited
|
|
Logic: A Language for Knowledge Representation.</A>
|
|
<DD>
|
|
James Crawford. 1990. Doctoral dissertation, Department of Computer Sciences,
|
|
University of Texas at Austin, Austin, Texas. UT Artificial Intelligence
|
|
TR AI90-141, October 1990.
|
|
(<A HREF="http://www.cs.utexas.edu/users/qr/algernon.html">Algernon and
|
|
Access-Limited Logic</A>)
|
|
<DD>
|
|
</DL>
|
|
<P>
|
|
</BODY></HTML>
|