<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xml:lang="en" lang="en" xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></meta><title> Authoring Techniques for XHTML &amp; HTML Internationalization: Characters and Encodings 1.0</title><style type="text/css" >
</style><link rel="stylesheet" type="text/css" href="techniques.css" /><link rel="stylesheet" type="text/css" href="http://www.w3.org/StyleSheets/TR/W3C-WD" /></head><body><div style="text-align:center;"><p><a href="#contents">[ contents ]</a></p></div><div class="head" ><p><a href="http://www.w3.org/"><img src="http://www.w3.org/Icons/w3c_home" alt="W3C" height="48" width="72" /></a></p> <h1><a name="title" id="title" /> Authoring Techniques for XHTML &amp; HTML Internationalization: Characters and Encodings 1.0</h1> <h2><a name="w3c-doctype" id="w3c-doctype" />W3C Working Draft 9 May 2004</h2><dl><dt>This version:</dt><dd>
<a href="http://www.w3.org/TR/2004/WD-i18n-html-tech-char-20040509/">http://www.w3.org/TR/2004/WD-i18n-html-tech-char-20040509/</a>
</dd><dt>Latest version:</dt><dd>
<a href="http://www.w3.org/TR/i18n-html-tech-char/">http://www.w3.org/TR/i18n-html-tech-char/</a>
</dd><dt>Previous version:</dt><dd><a href="http://www.w3.org/TR/2003/WD-i18n-html-tech-20031009/">http://www.w3.org/TR/2003/WD-i18n-html-tech-20031009/</a></dd><dt>Editor:</dt><dd>Richard Ishida, W3C</dd></dl><p class="copyright"><a href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">Copyright</a> © 2004 <a href="http://www.w3.org/"><acronym title="World Wide Web Consortium">W3C</acronym></a><sup>®</sup> (<a href="http://www.csail.mit.edu/"><acronym title="Massachusetts Institute of Technology">MIT</acronym></a>, <a href="http://www.ercim.org/"><acronym title="European Research Consortium for Informatics and Mathematics">ERCIM</acronym></a>, <a href="http://www.keio.ac.jp/">Keio</a>), All Rights Reserved. W3C <a href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">liability</a>, <a href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">trademark</a>, <a href="http://www.w3.org/Consortium/Legal/copyright-documents">document use</a> and <a href="http://www.w3.org/Consortium/Legal/copyright-software">software licensing</a> rules apply.</p></div><hr /><div > <h2><a name="abstract" id="abstract" />Abstract</h2><p>It is important to consider character encoding matters when producing internationalized content, and
further to understand how to choose and declare encodings, how and when to use character escapes, etc.</p><p>This document is one of a series of documents providing HTML authors with techniques for developing
internationalized HTML using XHTML 1.0 or HTML 4.01, supported by CSS1, CSS2 and some aspects of CSS3. It focuses
specifically on advice about character sets, encodings, and other character-specific matters. It is produced by the
Guidelines, Education &amp; Outreach Task Force (GEO) of the
<a href="http://www.w3.org/International/">W3C Internationalization Working Group (I18N WG)</a>. The GEO
Task Force encourages feedback about the content of this document as well as participation in the development of the
techniques by people who have experience creating Web content that conforms to internationalization needs.</p></div><div > <h2><a name="status" id="status" />Status of this Document</h2><p><em>This section describes the status of this document at the time of its publication. Other documents may
supersede this document. A list of current W3C publications and the latest revision of this technical report can be
found in the
<a href="http://www.w3.org/TR/">W3C technical reports index</a> at http://www.w3.org/TR/.</em></p><p>This is the First Public Working Draft of a document produced by the
<a href="http://www.w3.org/International/geo/">GEO (Guidelines, Education &amp; Outreach) Task Force</a> of
the
<a href="http://www.w3.org/International/">W3C Internationalization Working Group (I18N WG)</a>. The
Internationalization Working Group is part of the
<a href="http://www.w3.org/International/Activity">W3C Internationalization Activity</a>. This is a draft
document that does not fully represent the consensus of the group at this time. The Working Group expects to advance
this Working Draft to Working Group Note.</p><p>The document provides practical techniques related to character sets, encodings, and other character-specific
matters that HTML content authors can use to ensure that their HTML is easily adaptable for an international audience.
These are techniques that are best addressed from the start of content development if unnecessary costs and resource
issues are to be avoided later on.</p><p>This document was last published as part of a larger document entitled
<a href="http://www.w3.org/TR/2003/WD-i18n-html-tech-20031009/">Authoring Techniques for XHTML &amp; HTML
Internationalization 1.0</a>. The material in that document will now be published as a number of smaller independent
documents to allow for easier ongoing improvements and updates. The total number of such documents is not fixed, but
will grow as material and resources become available. The title of all related documents will begin with "Authoring
Techniques for XHTML &amp; HTML Internationalization:..." and they can be found in the
<a href="http://www.w3.org/TR/">W3C technical reports index</a>.</p><p>The Task Force encourages feedback about the content of this document as well as participation in the
development of the guidelines by people who have experience creating Web content that conforms to internationalization
needs. Send comments about this document to
<a href="mailto:www-i18n-comments@w3.org">www-i18n-comments@w3.org</a>. The
<a href="http://lists.w3.org/Archives/Public/www-i18n-comments/">archives</a> for this list are publicly
available.</p><p>The Internationalization Working Group will not allow early implementation to constrain its ability to make
changes to this specification prior to final release. Publication as a Working Draft does not imply endorsement by the
W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It
is inappropriate to cite this document as other than work in progress.</p><p>This document has been produced under the
<a href="http://www.w3.org/TR/2002/NOTE-patent-practice-20020124">24 January 2002 CPP</a> as amended by the
<a href="http://www.w3.org/2004/02/05-pp-transition">W3C Patent Policy Transition Procedure</a>. The
Working Group maintains a
<a href="http://www.w3.org/International/2002/Disclosures">public list of patent disclosures</a> relevant
to this document; that page also includes instructions for disclosing a patent. An individual who has actual knowledge
of a patent which the individual believes contains Essential Claim(s) with respect to this specification should
disclose the information in accordance with
<a href="http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-Disclosure">section 6 of the W3C Patent
Policy</a>. At the time of publication, the Working Group believed there were no patent disclosures relevant to this
specification.</p></div><div class="toc" > <h2><a name="contents" id="contents" />Table of Contents</h2><p class="toc">1 <a href="#ri20030912.142608197">Introduction</a><br /> 1.1 <a href="#ri20031001.170046667">Who should use this document</a><br /> 1.2 <a href="#ri20030912.142616699">How to use this document</a><br /> 1.3 <a href="#ri20030912.143319987">Standards addressed</a><br /> 1.4 <a href="#ri20030912.144634229">User agents addressed</a><br /> 1.5 <a href="#IDA4MFO">Editorial notes</a><br />2 <a href="#IDAPNFO">Choosing a page encoding </a><br />3 <a href="#ri20040310.054442951">Specifying a page encoding</a><br /> 3.1 <a href="#IDARVFO">Using the HTTP header</a><br /> 3.2 <a href="#IDAK1FO">Declaring the encoding in-document</a><br /> 3.3 <a href="#IDAIIGO">Declaring the encoding in more than one place</a><br /> 3.4 <a href="#IDA5JGO">Choosing names for your encodings</a><br />4 <a href="#IDAPNGO">Representing characters using escapes</a><br /></p> <h3><a name="appendices" id="appendices" />Appendices</h3><p class="toc">A <a href="#IDAPXGO">Acknowledgements</a><br />B <a href="#IDAXXGO">References</a><br /></p></div><hr /><div class="body" ><div class="div1">
<h2><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="ri20030912.142608197" id="ri20030912.142608197" />1 Introduction</h2><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="ri20031001.170046667" id="ri20031001.170046667" />1.1 Who should use this document</h3><p >All HTML content authors working with XHTML 1.0, HTML 4.01, XHTML 1.1, CSS1, CSS2 and CSS3.</p><p >The term author is used in the sense described by the HTML 4.01 spec, i.e. as a person or program that writes
or generates HTML documents.</p><p >This document provides guidance for the development of HTML so that it will support international usage.
This is the responsibility of all content authors, not just the localization group, and is relevant from the very start
of development. Ignoring the advice in this document, or relegating it to a later phase in the development, will only
add unnecessary costs and resource issues at a later date.</p><p >It is assumed that readers of this document are proficient in developing HTML and XHTML pages - this
document limits itself to providing advice related specifically to internationalization.</p></div><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="ri20030912.142616699" id="ri20030912.142616699" />1.2 How to use this document</h3><p >If you are new to this topic you may wish to read this document from end to end. It is, however, expected
that this document will normally be used for reference purposes - the reader dipping in to a particular section to find
out how to perform a specific task with internationalization in mind. </p><p >This document is one of several documents relating to the design of XHTML and HTML documents. An
<a href="http://www.w3.org/International/geo/html-tech/outline/html-authoring-outline.html">overview
document</a> is available that summarises all the recommendations of this and its companion documents together,
organized according to tasks that a developer of XHTML/HTML content may want to perform. When this material is used as
a reference, it is recommended that the overview document is used as a starting point.</p><p >Cross references and further resources are summarized at the end of each section.</p><p >Editorial notes have been left in this version of the document. These are marked
.</p><p >For information about the applicability of recommendations to user agents see below.</p></div><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="ri20030912.143319987" id="ri20030912.143319987" />1.3 Standards addressed</h3><p >This document provides techniques for developing pages using HTML 4.01, XHTML 1.0 and XHTML 1.1 with CSS1,
CSS2 and some parts of CSS3.</p><p >Note that XHTML source can be served as XML (using MIME types <code class="keyword">application/xhtml+xml</code>,
<code class="keyword">application/xml</code> or <code class="keyword">text/xml</code>) or HTML (using the MIME type <code class="keyword">text/html</code>).</p><p >It is very common for XHTML to be served as HTML, following the
<a href="http://www.w3.org/TR/xhtml1/#guidelines">compatibility guidelines in Appendix C </a>of the XHTML
1.0 specification. This allows authors with the right editing tools to produce valid XML code, which therefore lends
itself to processing with such things as scripting or XSLT, but is also well supported for display by most mainstream
browsers. (XHTML served as <code>application/xhtml+xml</code> is not well supported for browser display at the moment.)
In this document we wish to reflect practical reality for content authors, so we cover XHTML served as
<code class="keyword">text/html</code> in the techniques.</p><p >Indeed we encourage the use of XHTML, and all the examples (unless trying to make a specific point about
HTML 4.01) are written in XHTML.</p><p > For XHTML served as XML, this document limits its advice to documents served as
<code class="keyword">application/xhtml+xml</code>. Note that user agent support for XHTML served as XML is still patchy.</p></div><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="ri20030912.144634229" id="ri20030912.144634229" />1.4 User agents addressed</h3><p >In order to improve the value of this information to the user we try to ground techniques with information
about their applicability to particular user agents.</p><p >User agents, in this current version, means a number of mainstream browsers. (The scope may grow as
resources and test results become available for other user agents.)</p><p >In an attempt to make the task of tracking browser applicability manageable, we have chosen a 'base version'
for each of the user agents we are tracking for applicability. This base version represents a fairly recent,
standards-compliant version of the browser. Where a browser operates in both standards- and quirks-mode, standards-mode
is assumed (i.e. you should use a DOCTYPE statement).</p><p >The base versions considered for this version of the document include:</p><ul ><li><p >Internet Explorer 6 (Windows)</p></li><li><p >Mozilla 1.4</p></li><li><p >Opera 7</p></li><li><p >Netscape Navigator 7</p></li><li><p >Safari</p></li><li><p >Internet Explorer 5 (Mac)</p></li></ul><p >If the technique is applicable to a base version of a user agent the name of that user agent will appear
immediately below the summary of the technique. If the technique is not applicable, the name will appear crossed out.
If the name does not appear at all, this signifies that further investigation is needed. If the technique is applicable
to a later version than the chosen base version, this will be indicated by adding the version number to the name.</p><p >Detailed information may also be provided from time to time about behavior of a user agent in an earlier
version than the base version, or about some particular aspect of the behavior of a base version or later user agent.
This is provided in a special boxed section within the body of the text.</p></div><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="IDA4MFO" id="IDA4MFO" />1.5 Editorial notes</h3><p ></p><p ></p><p ></p></div></div><div class="div1">
<h2><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="IDAPNFO" id="IDAPNFO" />2 Choosing a page encoding </h2><div class="rule"><a id="ri20030112.213746362" name="ri20030112.213746362" href="#ri20030112.213746362">
Choose UTF-8 or another Unicode encoding for all content.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >When selecting a page encoding, consider both current and future localization requirements, and the benefits
of using the same encoding across all pages and all languages. These considerations make the use of Unicode an
attractive choice for the following reasons:</p>
<ul ><li><p >Unicode supports many languages, enabling the use of a single encoding across all pages and forms,
regardless of language.</p></li><li><p >Unicode allows many more languages to be mixed on a single page than almost any other choice. If the set
of languages to be represented on a single page cannot be represented directly by any single native encoding (such as
ISO-8859-1, Shift-JIS, etc.), then Unicode is almost certainly the best choice.</p></li><li><p >For dynamically-generated pages, a single encoding for all pages eliminates the need for server-side
logic to determine the character encoding for each page served.</p></li><li><p >For interactive applications using forms, a single encoding eliminates the need for server-side logic to
determine the character encoding of incoming form data.</p></li><li><p >Unicode enables a form in one language (e.g. English) to accept input in a different language (e.g.
Chinese).</p></li><li><p >Unicode (UTF-8) forms will be easier to migrate to XForms.</p></li></ul>
<p >UTF-8 and UTF-16 are both Unicode encodings. Since support for Unicode is currently limited to UTF-8 in many
user agents, UTF-8 is usually the appropriate Unicode encoding. However, as user agent support for UTF-16 expands,
UTF-16 will become an increasingly viable alternative.</p>
<p >Although there are other multi-script encodings (such as ISO-2022 and GB18030), Unicode generally provides
the best combination of user agent and script support.</p></div><div class="rule"><a id="ri20030112.21374337" name="ri20030112.21374337" href="#ri20030112.21374337">
If you don't use a Unicode encoding, select an encoding that best supports the languages / characters to be
included in the page text.
</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >There are some situations where selecting a Unicode encoding is not practical. If content is encoded in a
native encoding (legacy content or content originating from an external source) and the system lacks functionality for
converting content between encodings, Unicode may greatly complicate implementation. If such a site is only required to
serve single-script pages (containing languages that can be represented by a single native encoding), then the cost of
using a Unicode encoding may outweigh the benefits. In this case, a native encoding (such as ISO-8859-1, Shift-JIS,
etc.) may be a better choice.</p>
<p >Be sure to select an encoding that covers most
of the characters required for the content, and (if it is a form) all
of the characters that must be accepted as input.</p></div><div class="rule"><a id="ri20030314.181040685" name="ri20030314.181040685" href="#ri20030314.181040685">
Check that user agents (all agents that must render the page) adequately support the page encoding that you
have selected. If not, you might need to use a more widely supported encoding to achieve an adequate degree of user
agent support.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >Not all user agents support all page encodings, so it is important to understand which user agents must be
able to render the page, and be sure that they have adequate support for the page encoding you have selected.</p>
<p >In general, user agents are most likely to support the commonly-used native character encodings for the
major languages used on the web. Support for less commonly used encodings depends on the user agent. Older user agents,
or user agents that operate under severe memory limitations, may not support UTF-8.</p>
<p >It is important to note that support for a given encoding does not necessarily imply support for all writing
systems that encoding supports. For example, a user agent might support UTF-8, but not correctly display bidirectional
Arabic text encoded in UTF-8. To display a page correctly, a user agent must support both the page encoding and the
writing system.</p></div><div class="rule"><a id="ri20030112.213752611" name="ri20030112.213752611" href="#ri20030112.213752611">
Use character sets and encodings that will be accessible and common to your
users.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >.</p></div>
</div><div class="resources"><div class="small-head">Resources:</div><h4><a id="FIIDAPNFO" name="FIIDAPNFO">Further information</a></h4><ul><li>How do I specify the encoding?<br /><a href="#ri20040310.054442951" ><b>3 Specifying a page encoding</b></a><br /></li></ul><h4><a id="IGIDAPNFO" name="IGIDAPNFO">Implementation guidelines</a></h4><ul><li><a title="The Unicode Standard, Version 3" href="#unicode">[Unicode]</a> <a href="http://www.unicode.org/versions/Unicode4.0.0/">The Unicode Standard 4.0</a><br />The Unicode Standard
is very readable and contains a large amount of useful information besides code point
listings.</li></ul><h4><a id="RLIDAPNFO" name="RLIDAPNFO">Reference links</a></h4><ul><li> <a href="http://www.alanwood.net/unicode/index.html">Alan Wood’s Unicode Resources</a><br />Various resources
about Unicode and multilingual support in HTML, fonts, web browsers and other applications.</li></ul><h4><a id="SIDAPNFO" name="SIDAPNFO">Sources</a></h4><ul><li><a title="Character Model for the World Wide Web 1.0" href="#charmod">[CharMod]</a> <a href="http://www.w3.org/TR/charmod/#sec-Escaping">3.7 Character Escaping</a><br />Character Model for the
World Wide Web 1.0</li><li><a title="HTML 4.01 Specification" href="#html401">[HTML 4.01]</a> <a href="http://www.w3.org/TR/html401/charset.html#h-5.2.1">5.2.1 Choosing an encoding</a><br />HTML 4.01
spec</li></ul></div><div class="div1">
<h2><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="ri20040310.054442951" id="ri20040310.054442951" />3 Specifying a page encoding</h2><p >For overviews of the mechanics of specifying a page encoding and additional examples, see the tutorial
<a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html">Character sets &amp;
encodings</a>.</p><div class="rule"><a id="ri20040215.100236230" name="ri20040215.100236230" href="#ri20040215.100236230">
Always declare the encoding of your documents.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >Whether you declare the encoding by passing information alongside the document in the HTTP header, or inside
the document itself, you should always ensure that the encoding is declared. If you don't do this, the chances are high
that your document will be incorrectly rendered.</p>
<p >Note also that you should include a character encoding declaration even if your document uses a basic Latin
encoding such as ISO-8859-1. For example, Japanese user agents will default to a Japanese encoding that does not
include the accented letters, so users may not see your text correctly unless you specify the
encoding.</p></div><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="IDARVFO" id="IDARVFO" />3.1 Using the HTTP header</h3><div class="rule"><a id="ri20030509.093901773" name="ri20030509.093901773" href="#ri20030509.093901773">
Where appropriate, declare the page's character encoding by setting the <code class="keyword">charset</code> parameter in the
HTTP <code class="keyword">Content-Type</code> header.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >According to the HTML specification, in a case of conflict the HTTP charset declaration has the highest
priority of all means of declaring the character set.</p>
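<p >As an illustrative sketch (the status line and the encoding value are assumptions chosen for this example, not output from a particular server), a response that declares UTF-8 in the HTTP header might begin as follows:</p>
<div class="example"><div class="small-head">Example:</div><pre><code>HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8</code></pre></div>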
<p >Advantages to this approach:</p>
<ul ><li><p >User agents can easily find the character encoding information when it is sent in the HTTP header.</p></li><li><p >The HTTP header information has the highest priority in case of conflict, so this approach should be
used by intermediate servers that transcode the data (i.e. convert to a different encoding). This is sometimes done for
small devices that only recognize a small number of encodings. Because the HTTP header information has precedence over
any in-document declaration, it doesn't matter that transcoders typically do not change the internal encoding
declarations, just the document encoding.</p></li></ul>
<p >There may be some disadvantages when dealing with static files or templates:</p>
<ul ><li><p >It may be difficult for content authors to change the encoding information on the server - especially
when dealing with an ISP. They will need knowledge of and access to the server settings.</p></li><li><p >Server settings may get out of synchronization with the document for one reason or another. This may
happen, for example, if you rely on the server default, and that default is changed. This is a very bad situation,
since the higher precedence of the HTTP information versus the in-document declaration may cause the document to become
unreadable.</p></li></ul>
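<p >To illustrate the second point, suppose (hypothetically) that a document saved as UTF-8 is served with a header that names a different encoding. In the sketch below, the first line is the HTTP header and the last line is the declaration inside the document. A user agent that follows the header would display the two UTF-8 bytes of "é" as the two characters "Ã©":</p>
<div class="example"><div class="small-head">Example:</div><pre><code>Content-Type: text/html; charset=ISO-8859-1

&lt;meta http-equiv="Content-Type" content="text/html; charset=utf-8" /&gt;</code></pre></div>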
<p >In addition, there are potential problems for both static and dynamic documents if they are to be saved by
the user or used from a location such as a CD or hard disk. In these cases encoding information from an HTTP header is
not available.</p>
<p >Similarly, if the character encoding is only declared in the HTTP header, this information may become
separated from files that are processed by such things as XSLT or scripts, or from files that are sent for
translation.</p>
<p >For these reasons you should always ensure that encoding information is <em>also</em> declared inside
the document.</p>
<p >Care should also be taken to ensure that the server-side settings are maintained if the file is moved or
the server technology is changed.</p></div><div class="rule"><a id="ri20040215.104619262" name="ri20040215.104619262" href="#ri20040215.104619262">
If declaring the character encoding in the HTTP header, ensure that the server-side settings will be
maintained, especially if the file is moved or the server technology is changed.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >Discrepancies may arise due to the document being moved, because a server administrator or other content
author changes settings that cascade to your document, or because the server or server version has changed, etc. Since
encoding declarations in the HTTP header have highest priority in determining the encoding of the document, it is a
very bad situation if the server-side settings are inadvertently changed.</p>
<p >If content authors need to set server-side settings, it is important to also ensure that they have the
required knowledge, access and privileges to do so. This is especially important when dealing with a third-party
ISP.</p>
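<p >As a sketch only, assuming an Apache HTTP Server on which the author is permitted to use per-directory overrides, a directive such as the following in a <code>.htaccess</code> file asks the server to add a charset parameter to text/plain and text/html responses that do not already carry one. Other servers use different mechanisms, and a hosting provider may disable this one.</p>
<div class="example"><div class="small-head">Example:</div><p ><code>AddDefaultCharset UTF-8</code></p></div>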
</div><div class="rule"><a id="ri20040215.101337371" name="ri20040215.101337371" href="#ri20040215.101337371">
If declaring the character encoding in the HTTP header, always declare the encoding inside the document
too.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >This does not rule out also declaring it in the HTTP information provided by the server, but provides for
use of the document when the HTTP information is not available.</p>
<p >This is important for both static and dynamic documents if there is a chance that your documents will be
saved to or read from disk, CD, etc.</p>
<p >Also, if the character encoding is only declared in the HTTP header, this information may become separated
from files that are sent for translation or processed by such things as XSLT or scripts.</p>
<p >It is also valuable for developers, testers, or translation production managers who may want to perform a
visual check of a document.</p>
</div></div><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="IDAK1FO" id="IDAK1FO" />3.2 Declaring the encoding in-document</h3><div class="rule"><a id="ri20030112.213757177" name="ri20030112.213757177" href="#ri20030112.213757177">
For HTML documents and XHTML documents served as text/html, always use a <code class="keyword">meta</code> element to explicitly
declare the document's character encoding.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >The following is an example of a meta statement. For more information about usage, see the tutorial
<a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html">Character sets &amp;
encodings</a>.</p>
<div class="example"><div class="small-head">Example:</div><p ><code>&lt;meta http-equiv="Content-Type" content="text/html; charset=utf-8"/&gt;</code></p></div>
<p >This approach is not appropriate for documents served as XML, but when serving a document as HTML, there
are no disadvantages and a couple of definite advantages, even if the encoding has been declared in the HTTP
header:</p>
<ul ><li><p >An in-document encoding allows the document to be read correctly when not on a server. This applies
not only to static documents read from disk or CD, but also dynamic documents that are saved by the reader.</p></li><li><p >An in-document declaration of this kind helps developers, testers, or translation production managers
who want to perform a visual check of a document. This applies particularly to static documents or templates used to
generate dynamic documents.</p></li></ul>
</div><div class="rule"><a id="ri20030112.223147682" name="ri20030112.223147682" href="#ri20030112.223147682">
Use <code class="keyword">meta</code> charset declarations as early as possible in the <code class="keyword">head</code>
element.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >This maximizes the likelihood that non-ASCII characters will be correctly recognized by the user
agent.</p>
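<p >A minimal sketch of such a <code class="keyword">head</code> element follows. The title text is a hypothetical example, chosen because it contains a non-ASCII character. The declaration comes before the title, so the user agent already knows the encoding when it reaches the accented letter:</p>
<div class="example"><div class="small-head">Example:</div><pre><code>&lt;head&gt;
&lt;meta http-equiv="Content-Type" content="text/html; charset=utf-8" /&gt;
&lt;title&gt;Codificación de caracteres&lt;/title&gt;
&lt;/head&gt;</code></pre></div>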
<p >The HTML spec says "The <code class="keyword">meta</code> declaration must only be used when the character encoding is
organized such that ASCII-valued bytes stand for ASCII characters (at least until the <code class="keyword">meta</code> element is parsed)."
</p></div><div class="rule"><a id="ri20031001.14582550" name="ri20031001.14582550" href="#ri20031001.14582550">
For XHTML served as <code class="keyword">application/xhtml+xml</code>, always use an XML declaration with an encoding
attribute.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >The following is an example of an XML declaration. For more information about usage, see the tutorial
<a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html">Character sets &amp;
encodings</a>.</p>
<div class="example"><div class="small-head">Example:</div><p ><code>&lt;?xml version="1.0" encoding="UTF-8"?&gt;</code></p></div>
|
|
<p >If you are serving XHTML as <code class="keyword">application/xhtml+xml</code>, the encoding attribute is mandatory unless you
|
|
are using UTF-8 or UTF-16 or declaring the encoding in the HTTP header.</p>
|
|
<p >Even if the document is encoded in UTF-8 or UTF-16, declaring the encoding in the document is useful
|
|
for the following reasons:</p>
|
|
<ul ><li><p >It is useful to have the encoding declared in the document when editing or processing the file as
|
|
XML.</p></li><li><p >An in-document declaration helps developers, testers, or translation production managers who want to
|
|
perform a visual check of a document. This is a good reason for including the encoding declaration even if the file is
|
|
in UTF-8 or UTF-16, despite the fact that it is not strictly necessary for these encodings.</p></li><li><p >An in-document encoding declaration allows the document to be read correctly when it is not read from a server.</p></li><li><p >There is likely to be no other in-document alternative to express the character encoding. (The charset
|
|
<code class="keyword">meta</code> declaration is not recognized by XML processors.)</p></li></ul></div><div class="rule"><a id="ri20030509.100837166" name="ri20030509.100837166" href="#ri20030509.100837166">
|
|
For XHTML served as text/html, where practical use an XML declaration with an encoding
|
|
attribute.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
|
|
<p >The following is an example of an XML declaration with an encoding attribute. For more information about usage, see the tutorial
|
|
<a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html">Character sets &
|
|
encodings</a>.</p>
|
|
<div class="example"><div class="small-head">Example:</div><p ><code><?xml version="1.0" encoding="UTF-8"?></code></p></div>
|
|
<p >Key reasons for using XHTML are to take advantage of the benefits that XML brings for editing and
|
|
processing, but when these documents are served as text/html to user agents, they are treated as HTML, not XML.</p>
|
|
<p >Advantages to including an XML declaration include the following:</p>
|
|
<ul ><li><p >If your document is not encoded in UTF-8 or UTF-16 and the encoding is not declared in an HTTP header,
|
|
it is necessary to have this in-document encoding declaration when editing or processing the file as XML, e.g. using
|
|
XSLT transformations or scripting, since the XML processors do not see HTTP information, and do not recognize the meta
|
|
charset statement described earlier.</p></li><li><p >In some cases, you may want to serve the same static document as either HTML or XML, depending on the
|
|
capabilities of the requesting user agent. This can be achieved by server-side logic. In these cases you will want to
|
|
have an XML declaration in the document when it is served as XML. (We are assuming that the appropriate declaration can
|
|
be added to the file via scripting for dynamically created documents.)</p></li></ul>
|
|
<p >On the other hand:</p>
|
|
<ul ><li><p >Because the XML declaration may cause undesirable effects in some user agents (see
|
|
<a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html#serving">Serving HTML &
|
|
XHTML</a>), you may prefer to omit it.</p></li><li><p >The XML declaration is not actually needed for HTML documents (which is what we are discussing here).
|
|
HTML processors do not use this information, and the encoding information should be included in the meta charset
|
|
statement described above.</p></li></ul>
|
|
<p >In summary we could say the following:</p>
|
|
<ul ><li><p >If the XML declaration will not cause your document any harm, it is best to include it. If you do use
|
|
an XML declaration, you should always declare the encoding in it.</p></li><li><p >If you are worried about the undesirable effects sometimes associated with use of the XML declaration
|
|
in HTML files, the best solution is to omit the declaration but serve the file as UTF-8 or UTF-16.</p></li><li><p >If you use UTF-8 or UTF-16 the file is still perfectly valid XML, but no XML declaration is
|
|
required.</p></li></ul></div><div class="rule"><a id="ri20040215.115249590" name="ri20040215.115249590" href="#ri20040215.115249590">
|
|
If you serve an XHTML file without an encoding declaration in the HTTP header or the XML declaration, you
|
|
must use either UTF-8 or UTF-16 as the document encoding.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
|
|
<p >This is required by the XHTML specification.</p>
|
|
</div></div><div class="div2">
|
|
<h3><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="IDAIIGO" id="IDAIIGO" />3.3 Declaring the encoding in more than one place</h3><div class="rule"><a id="ri20040215.121036394" name="ri20040215.121036394" href="#ri20040215.121036394">
|
|
If you declare the document's character encoding in more than one place, take steps to ensure that it is
|
|
always correct.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
|
|
<p >If all declarations are correct, then there will be no conflicts.</p>
|
|
<p >If you serve encoding information in the HTTP header, it is particularly important to ensure that it is
|
|
always served correctly since this declaration has the highest priority. It is also the method most open to risks of
|
|
inadvertent change.</p>
|
|
<p >Also ensure that any editing or scripting tools you use consistently apply the correct encoding
|
|
information - especially if your tools add the declarations automatically.</p>
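<p >A minimal sketch of consistent declarations, assuming an XHTML document encoded in UTF-8 (the choice of encoding is illustrative): the HTTP header, the XML declaration and the <code class="keyword">meta</code> element all name the same encoding.</p>
<div class="example"><div class="small-head">Example:</div><p ><code>Content-Type: text/html; charset=UTF-8</code></p><p ><code><?xml version="1.0" encoding="UTF-8"?></code></p><p ><code><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /></code></p></div>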
|
|
</div>
|
|
</div><div class="div2">
|
|
<h3><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="IDA5JGO" id="IDA5JGO" />3.4 Choosing names for your encodings</h3><div class="rule"><a id="ri20030112.213749756" name="ri20030112.213749756" href="#ri20030112.213749756">
|
|
Use the preferred names from IANA's charset registry.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
|
|
<p >The IANA charset registry shows a name plus a list of aliases for each registered charset value. One of
|
|
these is identified as the preferred MIME name. Wherever you declare the character encoding, use the preferred MIME
|
|
name in the charset value.</p>
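<p >For instance, the registry entry for ISO Latin 1 lists aliases such as <code>latin1</code> but identifies <code>ISO-8859-1</code> as the preferred MIME name. A declaration for a document in that encoding might therefore look like the following sketch (the <code class="keyword">meta</code> element is just one illustrative place to declare it):</p>
<div class="example"><div class="small-head">Example:</div><p ><code><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" /></code></p></div>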
|
|
<p >This maximizes the likelihood of interoperability.</p></div><div class="rule"><a id="ri20040215.112209454" name="ri20040215.112209454" href="#ri20040215.112209454">
|
|
Do not invent your own encoding names using the <code class="keyword">x-</code> syntax.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
|
|
<p >This is not usually a good idea since it limits interoperability.</p>
|
|
</div>
|
|
</div></div><div class="resources"><div class="small-head">Resources:</div><h4><a id="BIIDAAUFO" name="BIIDAAUFO">Background information</a></h4><ul><li><a title="Character Sets & Encodings in
 XHTML, HTML and CSS" href="#charEncTutorial">[CharEncTutorial]</a> <a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html#serving">Serving HTML &
|
|
XHTML</a><br />Describes possible problems when serving HTML files with an XML
|
|
declaration.</li><li><a title="Character Sets & Encodings in
 XHTML, HTML and CSS" href="#charEncTutorial">[CharEncTutorial]</a> <a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html#declaring">Declaring the document
|
|
encoding</a><br />Provides a description of how the charset information is passed with the HTTP header, and
|
|
more background.</li><li><a title="Character Sets & Encodings in
 XHTML, HTML and CSS" href="#charEncTutorial">[CharEncTutorial]</a> <a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html#declaring">Declaring the document
|
|
encoding</a><br />Shows how to set the character encoding in a meta
|
|
statement.</li><li><a title="Character Sets & Encodings in
 XHTML, HTML and CSS" href="#charEncTutorial">[CharEncTutorial]</a> <a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html#declaring">Declaring the document
|
|
encoding</a><br />Shows how to set the character encoding in an XML
|
|
declaration.</li></ul><h4><a id="RLIDAAUFO" name="RLIDAAUFO">Reference links</a></h4><ul><li><a title="Official Names for Character Sets" href="#iana">[IANA]</a> <a href="http://www.iana.org/assignments/character-sets">IANA charset
|
|
registry</a><br /></li><li> <a href="http://www.w3.org/International/O-HTTP-charset.html">The HTTP
|
|
charset parameter</a><br />Explains how to set the HTTP charset parameter of the Content-Type header on
|
|
various servers and with various dynamic technologies.</li></ul><h4><a id="SIDAAUFO" name="SIDAAUFO">Sources</a></h4><ul><li><a title="Hypertext
 Transfer Protocol -- HTTP/1.1" href="#rfc2616">[RFC2616]</a> <a href="http://www.ietf.org/rfc/rfc2616.txt">RFC2616: Hypertext Transfer Protocol --
|
|
HTTP/1.1</a><br /></li><li><a title="XHTML™ 1.0
 The Extensible HyperText Markup Language (Second Edition)" href="#xhtml1">[XHTML 1.0]</a> <a href="http://www.w3.org/TR/xhtml1/#strict">3.1.1. Strictly Conforming Documents (towards the bottom of the
|
|
section)</a><br />General requirements for specification of encoding in XHTML
|
|
documents.</li><li> <a href="http://www.w3.org/International/questions/qa-setting-encoding-in-applications.html">FAQ: Setting encoding in web
|
|
authoring applications</a><br />How do I set character encoding in my web authoring
|
|
applications?</li><li><a title="XHTML™ 1.0
 The Extensible HyperText Markup Language (Second Edition)" href="#xhtml1">[XHTML 1.0]</a> <a href="http://www.w3.org/TR/xhtml1/#C_9">C.9 Character encoding</a><br />How to specify
|
|
character encoding for XHTML served as text/html using compatibility markup.</li><li><a title="HTML 4.01 Specification" href="#html401">[HTML 4.01]</a> <a href="http://www.w3.org/TR/html401/charset.html#h-5.2.2">5.2.2 Specifying the character
|
|
encoding</a><br />HTML 4.01 spec</li><li><a title="XHTML™ 1.0
 The Extensible HyperText Markup Language (Second Edition)" href="#xhtml1">[XHTML 1.0]</a> <a href="http://www.w3.org/TR/xhtml1/#strict">3.1.1. Strictly Conforming Documents</a><br />XHTML 1.0
|
|
requirements for use of the XML declaration.</li></ul></div><div class="div1">
|
|
<h2><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="IDAPNGO" id="IDAPNGO" />4 Representing characters using escapes</h2><p >For an explanation of the different types of escape available in XHTML, HTML and CSS, see
|
|
<a href="/International/tutorials/tutorial-char-enc.html#entities">What are entities and NCRs?</a>.</p><div class="rule"><a id="ri20030112.223401895" name="ri20030112.223401895" href="#ri20030112.223401895">
|
|
Only use escapes for characters in exceptional circumstances. Create pages using an encoding that supports all
|
|
the characters you need.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
|
|
<p >Using escapes can make it difficult to read and maintain source code, and can also significantly increase
|
|
file size. Many English-speaking developers expect that other languages make only occasional use of
non-ASCII characters, but this is wrong: text in scripts such as Cyrillic, Arabic or Chinese consists almost entirely of non-ASCII characters.</p>
|
|
<p >There are three characters which should always appear in content as escapes, so that they do not interact
|
|
with the syntax of the markup:</p>
|
|
<ul ><li><p >&lt; (<)</p></li><li><p >&gt; (>)</p></li><li><p >&amp; (&)</p></li></ul>
|
|
<p >You may also want to represent the double-quote (") as &quot; - particularly in attribute text when you
|
|
need to use the same type of quotes as you used to surround the attribute value.</p>
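<p >The following sketch, using made-up content, shows these escapes in element content and in an attribute value delimited by double quotes:</p>
<div class="example"><div class="small-head">Example:</div><p ><code><p title="The &quot;less than&quot; operator">if a &lt; b &amp;&amp; b &gt; 0</p></code></p></div>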
|
|
<p >Escapes can be useful to represent characters not supported by the encoding you chose for the document, for
example, to represent Chinese characters in an ISO Latin 1 document. You should ask yourself first, however, why you
|
|
have not changed the encoding of the document to something that covers all the characters you need (such as, of course,
|
|
UTF-8).</p>
|
|
<p >If your editing tool does not allow you to easily enter needed characters you may also resort to using
|
|
escapes. Note that this is not a long-term solution, nor one that works well if you have to enter a lot of such
|
|
characters - it takes longer and makes maintenance more difficult. Ideally you would choose an editing tool that
|
|
allowed you to enter these characters as characters.</p>
|
|
<p >A potentially very useful role for escapes is for characters that are invisible or ambiguous in
|
|
presentation.</p>
|
|
<p >One example would be Unicode character <span class="uname">200F: RIGHT-TO-LEFT MARK</span>. This character can be used
|
|
to clarify directionality in bidirectional text (e.g. when using the Arabic or Hebrew scripts). However, it has no graphic form,
so it is difficult to see where these characters are in the text, and if they are lost or forgotten they could
|
|
create unexpected results during later editing. Using &rlm; (or its NCR equivalent &#x200F;) instead makes it
|
|
very easy to spot these characters.</p>
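<p >As an illustrative sketch (the sentence is made up), the exclamation mark below belongs to the Hebrew word. The visible <code>&rlm;</code> after it keeps the punctuation with the right-to-left text and shows anyone editing the source exactly where the mark sits:</p>
<div class="example"><div class="small-head">Example:</div><p ><code><p>The sign said "שלום!&rlm;" in Hebrew.</p></code></p></div>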
|
|
<p >An example of an ambiguous character is <span class="uname">00A0: NO-BREAK SPACE</span>. This type of space prevents
|
|
line breaking, but it looks just like any other space when used as a character. Using &nbsp; (or &#xA0;) makes
|
|
it quite clear where such spaces appear in the text.</p></div><div class="rule"><a id="ri20030112.223703527" name="ri20030112.223703527" href="#ri20030112.223703527">
|
|
Ensure that numbers in numeric character references always reference a Unicode
|
|
codepoint.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
|
|
<p >It is a common error for people working on a page encoded in Windows code page 1252, for example, to try to
|
|
represent the euro sign using &#x80;. This is because the euro appears at position 80 (hexadecimal) on the Windows 1252 code
page. Using &#x80; would actually produce a control character, since the escape is expanded as the character
at code point U+0080 in the Unicode repertoire. What was really needed was &#x20AC;.</p></div><div class="rule"><a id="ri20040312.072207969" name="ri20040312.072207969" href="#ri20040312.072207969">
|
|
When using escapes, use the hexadecimal form.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
|
|
<p >Typically when the Unicode Standard refers to or lists characters it does so using a hexadecimal value. For
|
|
instance, the code point for the letter á may be referred to as U+00E1. Given the prevalence of this convention, it is
|
|
often useful, though not required, to use hexadecimal numeric values in escapes rather than decimal values. You do not
|
|
need to use leading zeros in escapes.</p></div><div class="rule"><a id="ri20040312.105840211" name="ri20040312.105840211" href="#ri20040312.105840211">
|
|
Use numeric character references rather than entities if your document is to be processed by unknown XML tools
|
|
or converted to XML.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
|
|
<p >Any XML application recognizes numeric character references such as &#xE1; as representing Unicode
|
|
characters. On the other hand, an entity such as &aacute; has to be declared in the DTD or Schema to be recognized
|
|
in the XML. Character entities are defined as part of the HTML / XHTML standard, but are often not incorporated in
|
|
other flavours of XML.</p>
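<p >A sketch of the difference, assuming a generic XML document with no DTD (the <code>name</code> element is hypothetical): the first form is understood by any XML parser, whereas the second is rejected unless the entity <code>aacute</code> has been declared.</p>
<div class="example"><div class="small-head">Example:</div><p ><code><name>&#xE1;rbol</name></code></p><p ><code><name>&aacute;rbol</name></code></p></div>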
|
|
<p >If there is a likelihood that you will want to repurpose or process this information (including sometimes
|
|
running it through localization tools), you should think carefully about which approach is most
|
|
appropriate.</p></div><div class="rule"><a id="ri20040312.070547700" name="ri20040312.070547700" href="#ri20040312.070547700">
|
|
If you use escapes to represent characters in a <code class="keyword">style</code> attribute, consider using CSS escapes rather
|
|
than NCRs or entities.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
|
|
<p >This is likely to be a very rare occurrence, firstly, because it is usually better to use style information
|
|
in a separate stylesheet or stylesheet element; and, secondly, because there are not many situations where you are
|
|
likely to need non-ASCII characters in styling that appears in an attribute.</p>
|
|
<p >The issue arises because a <code class="keyword">style</code> attribute in XHTML or HTML can represent characters using NCRs,
|
|
entities or CSS escapes. On the other hand, the <code class="keyword">style</code> <em>element</em> in HTML can contain neither NCRs
|
|
nor entities, and the same applies to an external style sheet.</p>
|
|
<p >Because there is a tendency to want to move styles declared in attributes to the style element or an
|
|
external style sheet (for example, this might be done automatically using an application or script), it is safest to
|
|
use only CSS escapes.</p>
|
|
<p >For example, it is better to use</p>
|
|
<div class="example"><div class="small-head">Example:</div><p ><code><span style="font-family: L\FC beck">...</span></code></p></div>
|
|
<p >than</p>
|
|
<div class="example"><div class="small-head">Example:</div><p ><code><span style="font-family: L&#xFC;beck">...</span></code></p></div></div><div class="rule"><a id="ri20030112.223804174" name="ri20030112.223804174" href="#ri20030112.223804174">
|
|
If, for a specific application, it becomes necessary to refer to characters outside [ISO10646], characters
|
|
should be assigned to a private zone to avoid conflicts with present or future versions of the standard. Use of private
|
|
use characters is highly discouraged, however, for reasons of portability.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
|
|
<p >tbd</p></div><div class="rule"><a id="ri20030112.223911671" name="ri20030112.223911671" href="#ri20030112.223911671">
|
|
</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
|
|
<p >Discuss</p></div></div><div class="resources"><div class="small-head">Resources:</div><h4><a id="BIIDAPNGO" name="BIIDAPNGO">Background information</a></h4><ul><li><a title="Character Sets & Encodings in
 XHTML, HTML and CSS" href="#charEncTutorial">[CharEncTutorial]</a> <a href="/International/tutorials/tutorial-char-enc.html">What are entities and
|
|
NCRs?</a><br />Background reading about the use of escapes, including some examples not found
|
|
here.</li></ul><h4><a id="SIDAPNGO" name="SIDAPNGO">Sources</a></h4><ul><li><a title="Cascading Style Sheets, level 2 revision 1" href="#css21">[CSS2.1]</a> <a href="http://www.w3.org/TR/2004/CR-CSS21-20040225/syndata.html#q24">4.4.1 Referring to characters not represented in a
|
|
character encoding</a><br />Advises use of CSS escapes in style attributes.</li><li><a title="Character Model for the World Wide Web 1.0" href="#charmod">[CharMod]</a> <a href="http://www.w3.org/TR/charmod/#sec-Escaping">3.7 Character Escaping</a><br />Character Model for the
|
|
World Wide Web 1.0</li><li><a title="HTML 4.01 Specification" href="#html401">[HTML 4.01]</a> <a href="http://www.w3.org/TR/html401/charset.html#h-5.3">5.3 Character
references</a><br />HTML 4.01 spec</li></ul></div></div><div class="back" ><div class="div1">
|
|
<h2><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="IDAPXGO" id="IDAPXGO" />A Acknowledgements</h2><p >The following GEO Task Force members have contributed their time and valuable comments to shaping these
|
|
guidelines:</p><p >Phil Arko, Steve Billings, Deborah Cawkwell, Wendy Chisholm, Andrew Cunningham, Martin Dürst, Lloyd Honomichl,
|
|
Russ Rolfe, Peter Sigrist, Tex Texin, Najib Tounsi</p></div><div class="div1">
|
|
<h2><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="IDAXXGO" id="IDAXXGO" />B References</h2><dl><dt class="label" ><a name="charEncTutorial" id="charEncTutorial" />CharEncTutorial</dt><dd >Richard Ishida,
|
|
<a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html"><cite>Character Sets & Encodings in
|
|
XHTML, HTML and CSS</cite></a>, Draft. (See
|
|
<a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html">http://www.w3.org/International/tutorials/tutorial-char-enc.html</a>.)</dd><dt class="label" ><a name="charmod" id="charmod" />CharMod</dt><dd >M. J. Dürst, F. Yergeau, R. Ishida, M. Wolf, T. Texin,
|
|
<a href="http://www.w3.org/TR/charmod/"><cite>Character Model for the World Wide Web 1.0</cite></a>, Working Draft in
|
|
Last Call. (See <a href="http://www.w3.org/TR/charmod/">http://www.w3.org/TR/charmod/</a>.)</dd><dt class="label" ><a name="css21" id="css21" />CSS2.1</dt><dd >Håkon Wium Lie, Bert Bos, Tantek Çelik, Ian Hickson, Eds.,
|
|
<a href="http://www.w3.org/TR/2004/CR-CSS21-20040225/"><cite>Cascading Style Sheets, level 2 revision 1</cite></a>,
|
|
W3C Candidate Recommendation. (See <a href="http://www.w3.org/TR/2004/CR-CSS21-20040225/">http://www.w3.org/TR/2004/CR-CSS21-20040225</a>.) </dd><dt class="label" ><a name="html401" id="html401" />HTML 4.01</dt><dd >Dave Raggett, Arnaud Le Hors, Ian Jacobs, Eds.,
|
|
<a href="http://www.w3.org/TR/html401/"><cite>HTML 4.01 Specification</cite></a>, W3C Recommendation. (See
|
|
<a href="http://www.w3.org/TR/html401/">http://www.w3.org/TR/html401</a>.) </dd><dt class="label" ><a name="iana" id="iana" />IANA</dt><dd >Internet Assigned Numbers Authority,
|
|
<a href="http://www.iana.org/assignments/character-sets"><cite>Official Names for Character Sets</cite></a>. (See
|
|
<a href="http://www.iana.org/assignments/character-sets">http://www.iana.org/assignments/character-sets</a>.)
|
|
</dd><dt class="label" ><a name="rfc2616" id="rfc2616" />RFC2616</dt><dd >R. Fielding et al., <a href="http://www.ietf.org/rfc/rfc2616.txt"><cite>Hypertext
Transfer Protocol -- HTTP/1.1</cite></a>, June 1999. (See <a href="http://www.ietf.org/rfc/rfc2616.txt">http://www.ietf.org/rfc/rfc2616.txt</a>.)</dd><dt class="label" ><a name="unicode" id="unicode" />Unicode</dt><dd >The Unicode Consortium, <cite>The Unicode Standard, Version 3</cite>, ISBN
|
|
0-201-61633-5, as updated from time to time by the publication of new versions. (See
|
|
<a href="http://www.unicode.org/unicode/standard/versions/">http://www.unicode.org/unicode/standard/versions</a> for the
|
|
latest version and additional information on versions of the standard and of the Unicode Character Database).</dd><dt class="label" ><a name="xhtml1" id="xhtml1" />XHTML 1.0</dt><dd >W3C HTML Working Group, <a href="http://www.w3.org/TR/xhtml1/"><cite>XHTML™ 1.0
|
|
The Extensible HyperText Markup Language (Second Edition)</cite></a>, W3C Recommendation. (See
|
|
<a href="http://www.w3.org/TR/xhtml1/">http://www.w3.org/TR/xhtml1/</a>.) </dd></dl></div></div></body></html>
|