<!doctype html public '-//W3C//DTD HTML 4.0 Transitional//EN' 'http://www.w3.org/TR/REC-html40-971218/loose.dtd'>
<HTML><HEAD><meta name='GENERATOR' content='XML/XH/Lark'><TITLE>Comparison of SGML and XML</TITLE></HEAD><BODY BGCOLOR='#ffffff'>
<H3 align='right'><A HREF='http://www.w3.org/'><IMG align='left' alt='W3C' src='http://www.w3.org/Icons/WWW/w3c_home'></A>NOTE-sgml-xml-971215</H3><br><H1 align='center'>Comparison of SGML and XML</H1>
<h3 align='center'>World Wide Web Consortium Note
|
|
15-December-1997 </h3>
|
|
<BR>
|
|
<DL><DT>This version:</DT>
|
|
<dd><A HREF='http://www.w3.org/TR/NOTE-sgml-xml-971215'>http://www.w3.org/TR/NOTE-sgml-xml-971215</A></DL>
|
|
<dl><dt>Author:</dt>
|
|
<DD>
|
|
James Clark
|
|
<A HREF='mailto:jjc@jclark.com'>&lt;jjc@jclark.com&gt;</A>
|
|
</DD>
|
|
</dl>
|
|
<hr><h2>Status of this document</h2><p>
|
|
This document is a NOTE made available by the W3 Consortium for discussion only. This indicates no
|
|
endorsement of its content, nor that the Consortium has, is, or will be allocating any resources to the issues
|
|
addressed by the NOTE.<BR>
|
|
Errors or omissions in this document should be reported to the
|
|
<A HREF='mailto:jjc@jclark.com'>author</A>.<BR>
|
|
</p><hr>
|
|
|
|
<H2>Abstract</H2><P>
|
|
This document provides a detailed comparison of SGML (ISO 8879) and XML.<BR>
|
|
</P><HR>
<H1>Comparison of SGML and XML</H1><H1>Version 1.0</H1><h2>Table of Contents</h2>1. <A HREF='#null'>Differences Between XML and SGML</A><BR>
|
|
2. <A HREF='#null2'>Transforming SGML to XML</A><BR>
|
|
3. <A HREF='#null3'>SGML Declaration for XML</A><BR>
|
|
|
|
<HR>
<H2><A NAME='null'>1. Differences Between XML and SGML</a></h2>
|
|
<P>XML allows only documents that use the SGML declaration in this note. This
|
|
declares all the following SGML features as <code>NO</code>:</P>
|
|
<UL>
|
|
<LI><code>DATATAG</code></LI>
|
|
<LI><code>OMITTAG</code></LI>
|
|
<LI><code>RANK</code></LI>
|
|
<LI><code>LINK</code> (<code>SIMPLE</code>, <code>IMPLICIT</code> and
|
|
<code>EXPLICIT</code>)</LI>
|
|
<LI><code>CONCUR</code></LI>
|
|
<LI><code>SUBDOC</code></LI>
|
|
<LI><code>FORMAL</code></LI>
|
|
</UL>
|
|
|
|
<P>Note that it differs from the reference concrete syntax in a number of
|
|
ways:</P>
|
|
<UL>
|
|
<LI>It also declares no short reference delimiters; it follows that <code>
|
|
SHORTREF</code> and <code>USEMAP</code> declarations cannot occur in XML</LI>
|
|
<LI>The <code>PIC</code> (processing instruction close) delimiter is
|
|
<CODE><FONT SIZE='+1'>?&gt;</FONT></CODE></LI>
|
|
<LI>Quantities and capacities are effectively unlimited</LI>
|
|
<LI>Names are case sensitive (<code>NAMECASE GENERAL</code> is
|
|
<code>NO</code>)</LI>
|
|
<LI>Underscore and colon are allowed in names</LI>
|
|
<LI>Names can use Unicode characters and are not restricted to ASCII
|
|
</LI>
|
|
</UL>
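<P>The naming rules above can be observed with any conforming XML parser. The following is a minimal sketch, assuming Python's standard <code>xml.etree.ElementTree</code> module; the helper <code>parses</code> and the sample documents are hypothetical and serve only as illustration.</P>

```python
import xml.etree.ElementTree as ET


def parses(text: str) -> bool:
    """Return True if the XML parser accepts `text` as well-formed."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Underscore is allowed in names, and upper and lower case may be mixed.
mixed_case_ok = parses("<Doc><item_1/></Doc>")

# Names are case sensitive, so </doc> does not close <Doc>.
case_mismatch_ok = parses("<Doc></doc>")

# Names are not restricted to ASCII.
non_ascii_ok = parses("<données><élément/></données>")
```

<P>The colon is left out of the sample names because this particular parser is namespace-aware and treats a colon as a prefix separator; that behaviour belongs to the parser, not to the SGML declaration.</P>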
|
|
|
|
<P>The following constructs which are permitted in SGML when <code>SHORTTAG
|
|
</code> is <code>YES</code> are not allowed in XML:</P>
|
|
<UL>
|
|
<LI>Unclosed start-tags</LI>
|
|
<LI>Unclosed end-tags</LI>
|
|
<LI>Empty start-tags</LI>
|
|
<LI>Empty end-tags</LI>
|
|
<LI>Attribute values in attribute specifications entered directly rather
|
|
than as literals</LI>
|
|
<LI>Attribute specifications that omit the attribute name</LI>
|
|
</UL>
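<P>As an illustration of the list above, the sketch below feeds a few <code>SHORTTAG</code>-style constructs to an XML parser and records whether each is accepted. It assumes Python's standard <code>xml.etree.ElementTree</code> module; the helper name and the sample documents are hypothetical.</P>

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if an XML parser accepts `text`, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, one per construct.
samples = {
    "quoted attribute value": '<doc><p align="left">x</p></doc>',
    "unquoted attribute value": "<doc><p align=left>x</p></doc>",
    "omitted attribute name": '<doc><p "left">x</p></doc>',
    "empty end-tag": "<doc><p>x</></doc>",
    "unclosed start-tag": "<doc<p>x</p></doc>",
}

# Map each label to True (accepted) or False (rejected).
results = {label: is_well_formed(text) for label, text in samples.items()}
```

<P>Only the document with a quoted attribute value is expected to be accepted; an SGML parser with <code>SHORTTAG YES</code> could accept the others.</P>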
|
|
|
|
<P><code>NET</code> delimiters can be used only to close an empty element. In
SGML without the Web SGML Adaptations Annex, the <code>NET</code> delimiter is declared as
<CODE><FONT SIZE='+1'>/&gt;</FONT></CODE>. With this approach, XML does not allow null end-tags and allows
net-enabling start-tags only for elements with no end-tag. In SGML with the
Web SGML Adaptations Annex, there is a separate <code>NESTC</code> (net-enabling start-tag close) delimiter.
This allows the XML <CODE><FONT SIZE='+1'>&lt;e/&gt;</FONT></CODE> syntax to be handled as a combination
of a net-enabling start-tag <CODE><FONT SIZE='+1'>&lt;e/</FONT></CODE> and a null end-tag
<CODE><FONT SIZE='+1'>&gt;</FONT></CODE>. With this approach, XML allows a net-enabling start-tag
only when it is immediately followed by a null end-tag.</P>
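<P>Under either reading, the practical effect is the same for an XML processor. The sketch below, which assumes Python's standard <code>xml.etree.ElementTree</code> module and hypothetical sample documents, shows that the empty-element form is accepted and yields an element with no content, while an SGML-style null end-tag that closes non-empty content is rejected.</P>

```python
import xml.etree.ElementTree as ET

# The XML empty-element syntax and an explicit start-tag/end-tag pair.
short_form = ET.fromstring("<doc><e/></doc>")
long_form = ET.fromstring("<doc><e></e></doc>")

# Both forms produce an element named "e" with no text and no children.
short_e, long_e = short_form[0], long_form[0]
equivalent = (
    short_e.tag == long_e.tag == "e"
    and short_e.text is None
    and long_e.text is None
    and len(short_e) == 0
    and len(long_e) == 0
)

# SGML with SHORTTAG YES would read <e/text/ as element e containing "text";
# XML only permits the net-enabling start-tag when the element is empty.
try:
    ET.fromstring("<doc><e/text/</doc>")
    null_end_tag_accepted = True
except ET.ParseError:
    null_end_tag_accepted = False
```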
|
|
<P>XML imposes the following restrictions not in SGML:</P>
|
|
<UL>
|
|
<LI>Entity references
|
|
<UL>
|
|
<LI>Entity references must be closed with a <code>REFC</code> delimiter
|
|
</LI>
|
|
<LI>References to external data entities in content are not allowed
|
|
</LI>
|
|
<LI>General entity references in content are required to be synchronous
|
|
</LI>
|
|
<LI>External entity references in attribute values are not allowed</LI>
|
|
<LI>Parameter entity references are allowed in the internal subset only
|
|
within a declaration separator (that is, at a point where a markup declaration
|
|
could occur)</LI>
|
|
</UL>
|
|
|
|
</LI>
|
|
<LI>Character references
|
|
<UL>
|
|
<LI>Character references must be closed with a <code>REFC</code> delimiter
|
|
</LI>
|
|
<LI>Named character references are not allowed</LI>
|
|
<LI>Numeric character references to non-SGML characters are not allowed</LI>
|
|
</UL>
|
|
|
|
</LI>
|
|
<LI>Entity declarations
|
|
<UL>
|
|
<LI>A <code>#DEFAULT</code> entity cannot be declared</LI>
|
|
<LI>External <code>SDATA</code> entities are not allowed</LI>
|
|
<LI>External <code>CDATA</code> entities are not allowed</LI>
|
|
<LI>Internal <code>SDATA</code> entities are not allowed</LI>
|
|
<LI>Internal <code>CDATA</code> entities are not allowed</LI>
|
|
<LI><code>PI</code> entities are not allowed</LI>
|
|
<LI>Bracketed text entities are not allowed</LI>
|
|
<LI>External identifiers must include a system identifier</LI>
|
|
<LI>Attributes cannot be specified for an entity</LI>
|
|
<LI>The replacement text of general text entities and external parameter
|
|
entities is required to be well-formed</LI>
|
|
<LI>An ampersand in a parameter literal must be followed by
|
|
a syntactically valid entity reference or numeric character
|
|
reference</LI>
|
|
</UL>
|
|
|
|
</LI>
|
|
<LI>Attribute definition list declarations
|
|
<UL>
|
|
<LI>Associated element type in attribute definition list declarations
|
|
cannot be a name group</LI>
|
|
<LI>Attributes cannot be declared for a notation</LI>
|
|
<LI><code>CURRENT</code> attributes are not allowed</LI>
|
|
<LI>Content reference attributes are not allowed</LI>
|
|
<LI><code>NUTOKEN(S)</code> declared values are not allowed</LI>
|
|
<LI><code>NUMBER(S)</code> declared values are not allowed</LI>
|
|
<LI><code>NAME(S)</code> declared values are not allowed</LI>
|
|
<LI>A name token group must use the or connector</LI>
|
|
<LI>Attribute values specified as defaults in attribute definition list
|
|
declarations must be literals (SGML allows them not to be even when
|
|
<code>SHORTTAG</code> is <code>NO</code>)</LI>
|
|
</UL>
|
|
|
|
</LI>
|
|
<LI>Element type declarations
|
|
<UL>
|
|
<LI>Associated element type in element type declaration cannot be a name
|
|
group</LI>
|
|
<LI>In an element declaration, a generic identifier cannot be specified
|
|
as a rank stem and rank suffix (SGML allows this even when the <code>RANK</code>
|
|
feature is <code>NO</code>)</LI>
|
|
<LI>Minimization parameters in element declarations are not
|
|
allowed</LI>
|
|
<LI><code>RCDATA</code> declared content is not allowed</LI>
<LI><code>CDATA</code> declared content is not allowed</LI>
|
|
<LI>Content models cannot use the and connector</LI>
|
|
<LI>Content models for mixed content have a restricted form</LI>
|
|
<LI>Inclusions are not allowed</LI>
|
|
<LI>Exclusions are not allowed</LI>
|
|
</UL>
|
|
|
|
</LI>
|
|
<LI>Comments
|
|
<UL>
|
|
<LI>A parameter separator cannot contain comments; this means that markup
|
|
declarations (other than comment declarations) cannot contain comments
|
|
</LI>
|
|
<LI>Empty comment declarations (<CODE><FONT SIZE='+1'>&lt;!&gt;</FONT></CODE> in the reference
|
|
concrete syntax) are not allowed</LI>
|
|
<LI>A comment declaration cannot contain more than one comment</LI>
|
|
<LI>In a comment declaration, an S separator is not allowed before the
|
|
final <code>MDC</code></LI>
|
|
</UL>
|
|
|
|
</LI>
|
|
<LI>Processing instructions
|
|
<UL>
|
|
<LI>Processing instructions must start with a name (the PI target)
|
|
</LI>
|
|
<LI>A processing instruction whose PI target is <CODE><FONT SIZE='+1'>xml</FONT></CODE> can
only occur at the beginning of an external entity and must be an XML declaration
if it occurs in the document entity, and otherwise a text declaration</LI>
|
|
<LI>A PI target must not match <CODE><FONT SIZE='+1'>[Xx][Mm][Ll]</FONT></CODE> unless it is
|
|
<CODE><FONT SIZE='+1'>xml</FONT></CODE></LI>
|
|
</UL>
|
|
|
|
</LI>
|
|
<LI>Marked sections
|
|
<UL>
|
|
<LI>In marked section declarations, <code>TEMP</code> status keyword is not
|
|
allowed</LI>
|
|
<LI><code>RCDATA</code> marked sections are not allowed</LI>
|
|
<LI><code>INCLUDE</code>/<code>IGNORE</code> marked sections are not allowed
|
|
in the document instance</LI>
|
|
<LI>In a marked section declaration, a status keyword specification that
|
|
contains no status keywords is not allowed</LI>
|
|
<LI>In a marked section declaration, a status keyword specification cannot
|
|
contain more than one status keyword</LI>
|
|
<LI>Marked sections are not allowed in the internal subset</LI>
|
|
<LI>Parameter separators are not allowed in status keyword specifications
|
|
in the document instance; in particular, parameter entity references are not
|
|
allowed</LI>
|
|
</UL>
|
|
|
|
</LI>
|
|
<LI>Other
|
|
<UL>
|
|
<LI>Names beginning with <CODE><FONT SIZE='+1'>[Xx][Mm][Ll]</FONT></CODE> are reserved</LI>
|
|
<LI>The SGML declaration must be implied and cannot be explicitly present
|
|
in the document entity</LI>
|
|
<LI>When <CODE><FONT SIZE='+1'>&lt;</FONT></CODE> and <CODE><FONT SIZE='+1'>&amp;</FONT></CODE> occur as data, they
must be entered as <CODE><FONT SIZE='+1'>&amp;lt;</FONT></CODE> and <CODE><FONT SIZE='+1'>&amp;amp;</FONT></CODE></LI>
|
|
<LI>A parameter separator required by the formal syntax must always
|
|
be present and cannot be omitted when it is adjacent to a delimiter</LI>
|
|
</UL>
|
|
|
|
</LI>
|
|
</UL>
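<P>Two of the restrictions above, the mandatory <code>REFC</code> delimiter on references and the escaping of <code>&lt;</code> and <code>&amp;</code> in data, can be sketched as follows. The example assumes Python's standard <code>xml.etree.ElementTree</code> and <code>xml.sax.saxutils</code> modules; the helper name and the sample strings are hypothetical.</P>

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def try_parse(text: str):
    """Return the text content of the root element, or None if parsing fails."""
    try:
        return ET.fromstring(text).text
    except ET.ParseError:
        return None


# Numeric character references closed with REFC (";") are accepted,
# in both decimal and hexadecimal form.
closed_refs = try_parse("<p>&#65;&#x42;</p>")

# SGML lets REFC be omitted in some contexts; XML does not.
unclosed_ref = try_parse("<p>&#65</p>")

# A bare ampersand in data is not well-formed XML.
bare_ampersand = try_parse("<p>AT&T</p>")

# Escaping the data first produces a document that round-trips.
raw = "AT&T <rocks>"
escaped = escape(raw)
round_trip = try_parse("<p>" + escaped + "</p>")
```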
|
|
|
|
<P>XML predefines the semantics of the attributes <CODE><FONT SIZE='+1'>xml:space</FONT></CODE> and
|
|
<CODE><FONT SIZE='+1'>xml:lang</FONT></CODE>. It also reserves all attribute, element type and
|
|
notation names beginning with <CODE><FONT SIZE='+1'>[Xx][Mm][Ll]</FONT></CODE>.</P>
|
|
<P>XML requires that an SGML parser use an entity manager that behaves as
|
|
follows:</P>
|
|
<UL>
|
|
<LI>Lines are terminated by newline (Unicode code #X000A) rather than
|
|
being delimited by RS and RE as with a typical SGML entity manager</LI>
|
|
<LI>System identifiers are treated as URLs</LI>
|
|
<LI>The entity manager must support entities encoded in UTF-16 and UTF-8,
and must be able to detect automatically which encoding an entity uses, based on
the presence of the byte order mark</LI>
<LI>The entity manager should be able to recognize the encoding declaration
in the XML declaration and encoding PI and use it to determine the encoding
of the entity</LI>
|
|
</UL>
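<P>A minimal sketch of the byte order mark check described above, written in Python with the standard <code>codecs</code> module. The function names are hypothetical, the encoding declaration is deliberately not examined, and the normalization of carriage returns to <code>#X000A</code> is an assumption about how such an entity manager would present lines to the parser.</P>

```python
import codecs


def detect_encoding(data: bytes) -> str:
    """Pick a codec for an entity from its byte order mark, if any."""
    if data.startswith(codecs.BOM_UTF16_BE) or data.startswith(codecs.BOM_UTF16_LE):
        # Python's "utf-16" codec reads the BOM to choose the byte order
        # and removes it from the decoded text.
        return "utf-16"
    if data.startswith(codecs.BOM_UTF8):
        # "utf-8-sig" strips an optional UTF-8 signature.
        return "utf-8-sig"
    # No byte order mark: assume UTF-8 in this sketch.
    return "utf-8"


def read_entity(data: bytes) -> str:
    """Decode an entity and present its lines terminated by #X000A only."""
    text = data.decode(detect_encoding(data))
    return text.replace("\r\n", "\n").replace("\r", "\n")


utf16_entity = "<a>\u00e9</a>".encode("utf-16")  # encoder prepends a BOM
decoded = read_entity(utf16_entity)
normalized = read_entity(b"line1\r\nline2\rline3")
```

<P>A fuller entity manager would also read the encoding declaration when no byte order mark is present.</P>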
|
|
|
|
<P>XML imposes requirements on the information that a parser must make
|
|
available to an application.</P>
|
|
<P>XML depends on the following changes to SGML made by the Web SGML Adaptations Annex:
|
|
</P>
|
|
<UL>
|
|
<LI><code>HCRO</code> delimiter (for hex numeric character references);
|
|
for XML this is <CODE><FONT SIZE='+1'>&amp;#x</FONT></CODE></LI>
|
|
<LI>EMPTYNRM feature that allows elements declared <code>EMPTY</code> to have
|
|
end-tags</LI>
|
|
<LI><code>NESTC</code> delimiter</LI>
|
|
<LI>Duplicate enumerated attribute tokens are allowed</LI>
|
|
<LI>Relaxation of rules on use of parameter entity references inside
|
|
groups</LI>
|
|
<LI>Multiple <code>ATTLIST</code> declarations for a single element type
|
|
</LI>
|
|
<LI><code>ATTLIST</code> declarations which don't declare any attributes
|
|
</LI>
|
|
<LI>KEEPRSRE feature that turns off SGML's rules for ignoring RSs and REs</LI>
|
|
<LI>Fully-tagged SGML documents; a document that is fully-tagged
|
|
but not type-valid is a conforming SGML document; this makes all XML
|
|
documents, including those that are well-formed but not valid,
|
|
conforming SGML documents</LI>
|
|
<LI>Predefined data character entities in the SGML declaration (for lt,
|
|
amp and so on)</LI>
|
|
<LI>Unlimited capacities and quantities</LI>
|
|
</UL>
|
|
|
|
<P>The Web SGML Adaptations Annex also enables some XML restrictions
|
|
to be enforced in SGML:</P>
|
|
<UL>
|
|
<LI><code>SHORTTAG</code> is unbundled, so the SGML declaration can allow
|
|
attribute defaulting and <code>NET</code> without allowing other
|
|
<code>SHORTTAG</code> constructs</LI>
|
|
<LI>The SGML declaration can assert that a document is integrally stored,
|
|
which disallows improperly nested entity references in content</LI>
|
|
</UL>
|
<H2><A NAME='null2'>2. Transforming SGML to XML</a></h2>
|
|
<P>For most restrictions in XML that go beyond SGML,
|
|
it is possible to transform
|
|
an SGML document automatically into a document that meets the restrictions,
|
|
and is equivalent in the sense that it has the same ESIS. There are a number
|
|
of restrictions for which this is not the case:</P>
|
|
<DL>
|
|
<DT>External <code>SDATA</code> entities, external <code>CDATA</code> entities
|
|
</DT>
|
|
<DD>
|
|
These could be transformed into <code>NDATA</code> entities.
|
|
</DD>
|
|
|
|
<DT>Subdocument entities</DT>
|
|
<DD>
|
|
These could be converted into <code>NDATA</code> entities with a notation that
|
|
indicates that they are SGML or XML.
|
|
</DD>
|
|
|
|
<DT>References to external data entities in content</DT>
|
|
<DD>
|
|
These could be transformed into an empty element with an attribute whose
|
|
declared value is <code>ENTITY</code>.
|
|
</DD>
|
|
|
|
<DT>Data attributes</DT>
|
|
<DD>
|
|
Since an external data entity can only be used in an <code>ENTITY</code> or
|
|
<code>ENTITIES</code> attribute on an element, these could be transformed into
|
|
other attributes on the element.
|
|
</DD>
|
|
|
|
<DT>Internal <code>SDATA</code> entities</DT>
|
|
<DD>
|
|
References could be transformed into numeric character references to the
|
|
appropriate Unicode character; if used in an entity or entities attribute,
|
|
the entity will have to be made external.
|
|
</DD>
|
|
|
|
<DT>Internal <code>CDATA</code> entities</DT>
|
|
<DD>
|
|
If used in an <code>ENTITY</code> or <code>ENTITIES</code> attribute, the entity
|
|
will have to be made external (references to <code>CDATA</code> entities are
|
|
not part of ESIS).
|
|
</DD>
|
|
|
|
<DT><code>PI</code> entities</DT>
|
|
<DD>
|
|
If they contain <CODE><FONT SIZE='+1'>?&gt;</FONT></CODE>, they cannot be converted into an XML PI.
|
|
It could be an application convention that entity references are replaced
|
|
in PIs. Also if they do not start with a name, they cannot be converted into
|
|
a well-formed XML PI.
|
|
</DD>
|
|
|
|
<DT>Names</DT>
|
|
<DD>
|
|
An SGML document can have a concrete syntax which allows characters in
|
|
names that XML does not allow in names.
|
|
</DD>
|
|
|
|
</DL>
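<P>The transformation suggested for internal <code>SDATA</code> entities, replacing references with numeric character references, could be sketched as below. This is an illustration only: the mapping table, the function name and the sample text are hypothetical, a real conversion would derive the mapping from the entity sets used by the document, and the sketch assumes every reference is already closed with <code>REFC</code>.</P>

```python
import re

# Hypothetical mapping from SDATA entity names to Unicode code points.
SDATA_TO_CODE_POINT = {
    "eacute": 0x00E9,
    "mdash": 0x2014,
    "nbsp": 0x00A0,
}

# Matches a general entity reference closed with REFC (";").
_ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.\-]*);")


def sdata_refs_to_char_refs(text: str) -> str:
    """Replace known SDATA entity references with decimal character references."""

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        code_point = SDATA_TO_CODE_POINT.get(name)
        if code_point is None:
            # Unknown names (including amp, lt and so on) are left untouched.
            return match.group(0)
        return "&#{};".format(code_point)

    return _ENTITY_REF.sub(replace, text)


converted = sdata_refs_to_char_refs("caf&eacute; &amp; bar &mdash; &unknown;")
```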
<H2><A NAME='null3'>3. SGML Declaration for XML</a></h2>
|
|
<P>The following SGML declaration takes advantage of
|
|
the Extended Naming Rules Technical Corrigendum to ISO 8879,
|
|
but does not make use of the Web SGML Adaptations Annex:</P>
|
|
<table cellpadding='5' border='1' bgcolor='#80ffff' width='100%'><tr><td><code><font size='+1'><!SGML -- SGML Declaration for XML --<BR> "ISO 8879:1986 (ENR)"<BR><BR> CHARSET<BR> BASESET<BR> "ISO Registration Number 176//CHARSET<BR> ISO/IEC 10646-1:1993 UCS-4 with implementation <BR> level 3//ESC 2/5 2/15 4/6"<BR> DESCSET<BR> 0 9 UNUSED<BR> 9 2 9<BR> 11 2 UNUSED<BR> 13 1 13<BR> 14 18 UNUSED<BR> 32 95 32<BR> 127 1 UNUSED<BR> 128 32 UNUSED<BR> 160 55136 160<BR> 55296 2048 UNUSED -- surrogates --<BR> 57344 8190 57344<BR> 65534 2 UNUSED -- FFFE and FFFF --<BR> 65536 1048576 65536<BR> CAPACITY SGMLREF<BR> -- Capacities are not restricted in XML --<BR> TOTALCAP 99999999<BR> ENTCAP 99999999<BR> ENTCHCAP 99999999<BR> ELEMCAP 99999999<BR> GRPCAP 99999999<BR> EXGRPCAP 99999999<BR> EXNMCAP 99999999<BR> ATTCAP 99999999<BR> ATTCHCAP 99999999<BR> AVGRPCAP 99999999<BR> NOTCAP 99999999<BR> NOTCHCAP 99999999<BR> IDCAP 99999999<BR> IDREFCAP 99999999<BR> MAPCAP 99999999<BR> LKSETCAP 99999999<BR> LKNMCAP 99999999<BR><BR><BR> SCOPE DOCUMENT<BR><BR> SYNTAX<BR> SHUNCHAR NONE<BR> BASESET "ISO Registration Number 176//CHARSET<BR> ISO/IEC 10646-1:1993 UCS-4 with implementation <BR> level 3//ESC 2/5 2/15 4/6"<BR> DESCSET<BR> 0 1114112 0<BR> FUNCTION<BR> RE 13<BR> RS 10<BR> SPACE 32<BR> TAB SEPCHAR 9<BR><BR> NAMING<BR> LCNMSTRT ""<BR> UCNMSTRT ""<BR> NAMESTRT<BR> 58 95 192-214 216-246 248-305 308-318 321-328<BR> 330-382 384-451 461-496 500-501 506-535 592-680<BR> 699-705 902 904-906 908 910-929 931-974 976-982<BR> 986 988 990 992 994-1011 1025-1036 1038-1103<BR> 1105-1116 1118-1153 1168-1220 1223-1224<BR> 1227-1228 1232-1259 1262-1269 1272-1273<BR> 1329-1366 1369 1377-1414 1488-1514 1520-1522<BR> 1569-1594 1601-1610 1649-1719 1722-1726<BR> 1728-1742 1744-1747 1749 1765-1766 2309-2361<BR> 2365 2392-2401 2437-2444 2447-2448 2451-2472<BR> 2474-2480 2482 2486-2489 2524-2525 2527-2529<BR> 2544-2545 2565-2570 2575-2576 2579-2600<BR> 2602-2608 2610-2611 2613-2614 2616-2617<BR> 2649-2652 2654 
2674-2676 2693-2699 2701<BR> 2703-2705 2707-2728 2730-2736 2738-2739<BR> 2741-2745 2749 2784 2821-2828 2831-2832<BR> 2835-2856 2858-2864 2866-2867 2870-2873 2877<BR> 2908-2909 2911-2913 2949-2954 2958-2960<BR> 2962-2965 2969-2970 2972 2974-2975 2979-2980<BR> 2984-2986 2990-2997 2999-3001 3077-3084<BR> 3086-3088 3090-3112 3114-3123 3125-3129<BR> 3168-3169 3205-3212 3214-3216 3218-3240<BR> 3242-3251 3253-3257 3294 3296-3297 3333-3340<BR> 3342-3344 3346-3368 3370-3385 3424-3425<BR> 3585-3630 3632 3634-3635 3648-3653 3713-3714<BR> 3716 3719-3720 3722 3725 3732-3735 3737-3743<BR> 3745-3747 3749 3751 3754-3755 3757-3758 3760<BR> 3762-3763 3773 3776-3780 3904-3911 3913-3945<BR> 4256-4293 4304-4342 4352 4354-4355 4357-4359<BR> 4361 4363-4364 4366-4370 4412 4414 4416 4428<BR> 4430 4432 4436-4437 4441 4447-4449 4451 4453<BR> 4455 4457 4461-4462 4466-4467 4469 4510 4520<BR> 4523 4526-4527 4535-4536 4538 4540-4546 4587<BR> 4592 4601 7680-7835 7840-7929 7936-7957<BR> 7960-7965 7968-8005 8008-8013 8016-8023 8025<BR> 8027 8029 8031-8061 8064-8116 8118-8124 8126<BR> 8130-8132 8134-8140 8144-8147 8150-8155<BR> 8160-8172 8178-8180 8182-8188 8486 8490-8491<BR> 8494 8576-8578 12295 12321-12329 12353-12436<BR> 12449-12538 12549-12588 19968-40869 44032-55203<BR><BR> LCNMCHAR ""<BR> UCNMCHAR ""<BR> NAMECHAR<BR> 45-46 183 720-721 768-837 864-865 903 1155-1158<BR> 1425-1441 1443-1465 1467-1469 1471 1473-1474<BR> 1476 1600 1611-1618 1632-1641 1648 1750-1764<BR> 1767-1768 1770-1773 1776-1785 2305-2307 2364<BR> 2366-2381 2385-2388 2402-2403 2406-2415<BR> 2433-2435 2492 2494-2500 2503-2504 2507-2509<BR> 2519 2530-2531 2534-2543 2562 2620 2622-2626<BR> 2631-2632 2635-2637 2662-2673 2689-2691 2748<BR> 2750-2757 2759-2761 2763-2765 2790-2799<BR> 2817-2819 2876 2878-2883 2887-2888 2891-2893<BR> 2902-2903 2918-2927 2946-2947 3006-3010<BR> 3014-3016 3018-3021 3031 3047-3055 3073-3075<BR> 3134-3140 3142-3144 3146-3149 3157-3158<BR> 3174-3183 3202-3203 3262-3268 3270-3272<BR> 3274-3277 3285-3286 
3302-3311 3330-3331<BR> 3390-3395 3398-3400 3402-3405 3415 3430-3439<BR> 3633 3636-3642 3654-3662 3664-3673 3761<BR> 3764-3769 3771-3772 3782 3784-3789 3792-3801<BR> 3864-3865 3872-3881 3893 3895 3897 3902-3903<BR> 3953-3972 3974-3979 3984-3989 3991 3993-4013<BR> 4017-4023 4025 8400-8412 8417 12293 12330-12335<BR> 12337-12341 12441-12442 12445-12446 12540-12542<BR><BR> NAMECASE<BR> GENERAL NO<BR> ENTITY NO<BR><BR><BR> DELIM<BR> GENERAL SGMLREF<BR> NET "/>"<BR> PIC "?>"<BR> SHORTREF NONE<BR> NAMES<BR> SGMLREF<BR><BR><BR> QUANTITY SGMLREF<BR> -- Quantities are not restricted in XML --<BR> ATTCNT 99999999<BR> ATTSPLEN 99999999<BR> -- BSEQLEN not used --<BR> -- DTAGLEN not used --<BR> -- DTEMPLEN not used --<BR> ENTLVL 99999999<BR> GRPCNT 99999999<BR> GRPGTCNT 99999999<BR> GRPLVL 99999999<BR> LITLEN 99999999<BR> NAMELEN 99999999<BR> -- no need to change NORMSEP --<BR> PILEN 99999999<BR> TAGLEN 99999999<BR> TAGLVL 99999999<BR><BR><BR> FEATURES<BR> MINIMIZE<BR> DATATAG NO<BR> OMITTAG NO<BR> RANK NO<BR> SHORTTAG YES -- SHORTTAG is needed for NET --<BR> LINK<BR> SIMPLE NO<BR> IMPLICIT NO<BR> EXPLICIT NO<BR> OTHER<BR> CONCUR NO<BR> SUBDOC NO<BR> FORMAL NO<BR> APPINFO NONE<BR>></font></code></td></tr></table>
|
|
<P>The following SGML declaration takes advantage of
|
|
the Web SGML Adaptations Annex to ISO 8879:</P>
|
|
<table cellpadding='5' border='1' bgcolor='#80ffff' width='100%'><tr><td><code><font size='+1'><!SGML -- SGML Declaration for XML --<BR> "ISO 8879:1986 (WWW)"<BR><BR> CHARSET<BR> BASESET<BR> "ISO Registration Number 176//CHARSET<BR> ISO/IEC 10646-1:1993 UCS-4 with implementation <BR> level 3//ESC 2/5 2/15 4/6"<BR> DESCSET<BR> 0 9 UNUSED<BR> 9 2 9<BR> 11 2 UNUSED<BR> 13 1 13<BR> 14 18 UNUSED<BR> 32 95 32<BR> 127 1 UNUSED<BR> 128 32 UNUSED<BR> 160 55136 160<BR> 55296 2048 UNUSED -- surrogates --<BR> 57344 8190 57344<BR> 65534 2 UNUSED -- FFFE and FFFF --<BR> 65536 1048576 65536<BR> CAPACITY NONE<BR><BR> SCOPE DOCUMENT<BR><BR> SYNTAX<BR> SHUNCHAR NONE<BR> BASESET "ISO Registration Number 176//CHARSET<BR> ISO/IEC 10646-1:1993 UCS-4 with implementation <BR> level 3//ESC 2/5 2/15 4/6"<BR> DESCSET<BR> 0 1114112 0<BR> FUNCTION<BR> RE 13<BR> RS 10<BR> SPACE 32<BR> TAB SEPCHAR 9<BR><BR> NAMING<BR> LCNMSTRT ""<BR> UCNMSTRT ""<BR> NAMESTRT<BR> 58 95 192-214 216-246 248-305 308-318 321-328<BR> 330-382 384-451 461-496 500-501 506-535 592-680<BR> 699-705 902 904-906 908 910-929 931-974 976-982<BR> 986 988 990 992 994-1011 1025-1036 1038-1103<BR> 1105-1116 1118-1153 1168-1220 1223-1224<BR> 1227-1228 1232-1259 1262-1269 1272-1273<BR> 1329-1366 1369 1377-1414 1488-1514 1520-1522<BR> 1569-1594 1601-1610 1649-1719 1722-1726<BR> 1728-1742 1744-1747 1749 1765-1766 2309-2361<BR> 2365 2392-2401 2437-2444 2447-2448 2451-2472<BR> 2474-2480 2482 2486-2489 2524-2525 2527-2529<BR> 2544-2545 2565-2570 2575-2576 2579-2600<BR> 2602-2608 2610-2611 2613-2614 2616-2617<BR> 2649-2652 2654 2674-2676 2693-2699 2701<BR> 2703-2705 2707-2728 2730-2736 2738-2739<BR> 2741-2745 2749 2784 2821-2828 2831-2832<BR> 2835-2856 2858-2864 2866-2867 2870-2873 2877<BR> 2908-2909 2911-2913 2949-2954 2958-2960<BR> 2962-2965 2969-2970 2972 2974-2975 2979-2980<BR> 2984-2986 2990-2997 2999-3001 3077-3084<BR> 3086-3088 3090-3112 3114-3123 3125-3129<BR> 3168-3169 3205-3212 3214-3216 3218-3240<BR> 3242-3251 3253-3257 3294 
3296-3297 3333-3340<BR> 3342-3344 3346-3368 3370-3385 3424-3425<BR> 3585-3630 3632 3634-3635 3648-3653 3713-3714<BR> 3716 3719-3720 3722 3725 3732-3735 3737-3743<BR> 3745-3747 3749 3751 3754-3755 3757-3758 3760<BR> 3762-3763 3773 3776-3780 3904-3911 3913-3945<BR> 4256-4293 4304-4342 4352 4354-4355 4357-4359<BR> 4361 4363-4364 4366-4370 4412 4414 4416 4428<BR> 4430 4432 4436-4437 4441 4447-4449 4451 4453<BR> 4455 4457 4461-4462 4466-4467 4469 4510 4520<BR> 4523 4526-4527 4535-4536 4538 4540-4546 4587<BR> 4592 4601 7680-7835 7840-7929 7936-7957<BR> 7960-7965 7968-8005 8008-8013 8016-8023 8025<BR> 8027 8029 8031-8061 8064-8116 8118-8124 8126<BR> 8130-8132 8134-8140 8144-8147 8150-8155<BR> 8160-8172 8178-8180 8182-8188 8486 8490-8491<BR> 8494 8576-8578 12295 12321-12329 12353-12436<BR> 12449-12538 12549-12588 19968-40869 44032-55203<BR><BR> LCNMCHAR ""<BR> UCNMCHAR ""<BR> NAMECHAR<BR> 45-46 183 720-721 768-837 864-865 903 1155-1158<BR> 1425-1441 1443-1465 1467-1469 1471 1473-1474<BR> 1476 1600 1611-1618 1632-1641 1648 1750-1764<BR> 1767-1768 1770-1773 1776-1785 2305-2307 2364<BR> 2366-2381 2385-2388 2402-2403 2406-2415<BR> 2433-2435 2492 2494-2500 2503-2504 2507-2509<BR> 2519 2530-2531 2534-2543 2562 2620 2622-2626<BR> 2631-2632 2635-2637 2662-2673 2689-2691 2748<BR> 2750-2757 2759-2761 2763-2765 2790-2799<BR> 2817-2819 2876 2878-2883 2887-2888 2891-2893<BR> 2902-2903 2918-2927 2946-2947 3006-3010<BR> 3014-3016 3018-3021 3031 3047-3055 3073-3075<BR> 3134-3140 3142-3144 3146-3149 3157-3158<BR> 3174-3183 3202-3203 3262-3268 3270-3272<BR> 3274-3277 3285-3286 3302-3311 3330-3331<BR> 3390-3395 3398-3400 3402-3405 3415 3430-3439<BR> 3633 3636-3642 3654-3662 3664-3673 3761<BR> 3764-3769 3771-3772 3782 3784-3789 3792-3801<BR> 3864-3865 3872-3881 3893 3895 3897 3902-3903<BR> 3953-3972 3974-3979 3984-3989 3991 3993-4013<BR> 4017-4023 4025 8400-8412 8417 12293 12330-12335<BR> 12337-12341 12441-12442 12445-12446 12540-12542<BR><BR> NAMECASE<BR> GENERAL NO<BR> ENTITY NO<BR><BR> 
DELIM<BR> GENERAL SGMLREF<BR> HCRO "&#38;#x" -- 38 is the number for ampersand --<BR> NESTC "/"<BR> NET ">"<BR> PIC "?>"<BR> SHORTREF NONE<BR><BR> NAMES<BR> SGMLREF<BR><BR> QUANTITY NONE<BR><BR> ENTITIES<BR> "amp" 38<BR> "lt" 60<BR> "gt" 62<BR> "quot" 34<BR> "apos" 39<BR><BR> FEATURES<BR> MINIMIZE<BR> DATATAG NO<BR> OMITTAG NO<BR> RANK NO<BR> SHORTTAG<BR> STARTTAG<BR> EMPTY NO<BR> UNCLOSED NO <BR> NETENABL IMMEDNET<BR> ENDTAG<BR> EMPTY NO <BR> UNCLOSED NO<BR> ATTRIB<BR> DEFAULT YES<BR> OMITNAME NO<BR> VALUE NO<BR> EMPTYNRM YES<BR> IMPLYDEF<BR> ATTLIST YES<BR> DOCTYPE YES<BR> ELEMENT YES<BR> ENTITY YES<BR> NOTATION YES<BR> LINK<BR> SIMPLE NO<BR> IMPLICIT NO<BR> EXPLICIT NO<BR> OTHER<BR> CONCUR NO<BR> SUBDOC NO<BR> FORMAL NO<BR> URN NO<BR> KEEPRSRE YES<BR> VALIDITY TAG<BR> ENTITIES<BR> REF ANY<BR> INTEGRAL YES<BR> APPINFO NONE<BR> SEEALSO "ISO 8879//NOTATION Application Requirements for XML//EN"<BR>></font></code></td></tr></table>
<HR>
|