

								<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

								<html lang="en-US-x-Hixie" ><head><title>8.2 Parsing HTML documents &#8212; HTML5 </title><style type="text/css">

								   pre { margin-left: 2em; white-space: pre-wrap; }

								   h2 { margin: 3em 0 1em 0; }

								   h3 { margin: 2.5em 0 1em 0; }

								   h4 { margin: 2.5em 0 0.75em 0; }

								   h5, h6 { margin: 2.5em 0 1em; }

								   h1 + h2, h1 + h2 + h2 { margin: 0.75em 0 0.75em; }

								   h2 + h3, h3 + h4, h4 + h5, h5 + h6 { margin-top: 0.5em; }

								   p { margin: 1em 0; }

								   hr:not(.top) { display: block; background: none; border: none; padding: 0; margin: 2em 0; height: auto; }

								   dl, dd { margin-top: 0; margin-bottom: 0; }

								   dt { margin-top: 0.75em; margin-bottom: 0.25em; clear: left; }

								   dt + dt { margin-top: 0; }

								   dd dt { margin-top: 0.25em; margin-bottom: 0; }

								   dd p { margin-top: 0; }

								   dd dl + p { margin-top: 1em; }

								   dd table + p { margin-top: 1em; }

								   p + * > li, dd li { margin: 1em 0; }

								   dt, dfn { font-weight: bold; font-style: normal; }

								   dt dfn { font-style: italic; }

								   pre, code { font-size: inherit; font-family: monospace; font-variant: normal; }

								   pre strong { color: black; font: inherit; font-weight: bold; background: yellow; }

								   pre em { font-weight: bolder; font-style: normal; }

								   @media screen { code { color: orangered; } code :link, code :visited { color: inherit; } }

								   var sub { vertical-align: bottom; font-size: smaller; position: relative; top: 0.1em; }

								   table { border-collapse: collapse; border-style: hidden hidden none hidden; }

								   table thead, table tbody { border-bottom: solid; }

								   table tbody th:first-child { border-left: solid; }

								   table tbody th { text-align: left; }

								   table td, table th { border-left: solid; border-right: solid; border-bottom: solid thin; vertical-align: top; padding: 0.2em; }

								   blockquote { margin: 0 0 0 2em; border: 0; padding: 0; font-style: italic; }


								   .bad, .bad *:not(.XXX) { color: gray; border-color: gray; background: transparent; }

								   .matrix, .matrix td { border: none; text-align: right; }

								   .matrix { margin-left: 2em; }

								   .dice-example { border-collapse: collapse; border-style: hidden solid solid hidden; border-width: thin; margin-left: 3em; }

								   .dice-example caption { width: 30em; font-size: smaller; font-style: italic; padding: 0.75em 0; text-align: left; }

								   .dice-example td, .dice-example th { border: solid thin; width: 1.35em; height: 1.05em; text-align: center; padding: 0; }


								   .toc dfn, h1 dfn, h2 dfn, h3 dfn, h4 dfn, h5 dfn, h6 dfn { font: inherit; }

								   img.extra { float: right; }

								   pre.idl { border: solid thin; background: #EEEEEE; color: black; padding: 0.5em 1em; }

								   pre.idl :link, pre.idl :visited { color: inherit; background: transparent; }

								   pre.css { border: solid thin; background: #FFFFEE; color: black; padding: 0.5em 1em; }

								   pre.css:first-line { color: #AAAA50; }

								   dl.domintro { color: green; margin: 2em 0 2em 2em; padding: 0.5em 1em; border: none; background: #DDFFDD; }

								   hr + dl.domintro, div.impl + dl.domintro { margin-top: 2.5em; margin-bottom: 1.5em; }

								   dl.domintro dt, dl.domintro dt * { color: black; text-decoration: none; }

								   dl.domintro dd { margin: 0.5em 0 1em 2em; padding: 0; }

								   dl.domintro dd p { margin: 0.5em 0; }

								   dl.switch { padding-left: 2em; }

								   dl.switch > dt { text-indent: -1.5em; }

								   dl.switch > dt:before { content: '\21AA'; padding: 0 0.5em 0 0; display: inline-block; width: 1em; text-align: right; line-height: 0.5em; }

								   dl.triple { padding: 0 0 0 1em; }

								   dl.triple dt, dl.triple dd { margin: 0; display: inline }

								   dl.triple dt:after { content: ':'; }

								   dl.triple dd:after { content: '\A'; white-space: pre; }

								   .diff-old { text-decoration: line-through; color: silver; background: transparent; }

								   .diff-chg, .diff-new { text-decoration: underline; color: green; background: transparent; }

								   a .diff-new { border-bottom: 1px blue solid; }


								   h2 { page-break-before: always; }

								   h1, h2, h3, h4, h5, h6 { page-break-after: avoid; }

								   h1 + h2, hr + h2.no-toc { page-break-before: auto; }


								   p  > span:not([title=""]):not([class="XXX"]):not([class="impl"]):not([class="note"]),

								   li > span:not([title=""]):not([class="XXX"]):not([class="impl"]):not([class="note"]), { border-bottom: solid #9999CC; }


								   div.head { margin: 0 0 1em; padding: 1em 0 0 0; }

								   div.head p { margin: 0; }

								   div.head h1 { margin: 0; }

								   div.head .logo { float: right; margin: 0 1em; }

								   div.head .logo img { border: none } /* remove border from top image */

								   div.head dl { margin: 1em 0; }

								   div.head p.copyright, div.head p.alt { font-size: x-small; font-style: oblique; margin: 0; }


								   body > .toc > li { margin-top: 1em; margin-bottom: 1em; }

								   body > .toc.brief > li { margin-top: 0.35em; margin-bottom: 0.35em; }

								   body > .toc > li > * { margin-bottom: 0.5em; }

								   body > .toc > li > * > li > * { margin-bottom: 0.25em; }

								   .toc, .toc li { list-style: none; }


								   .brief { margin-top: 1em; margin-bottom: 1em; line-height: 1.1; }

								   .brief li { margin: 0; padding: 0; }

								   .brief li p { margin: 0; padding: 0; }


								   .category-list { margin-top: -0.75em; margin-bottom: 1em; line-height: 1.5; }

								   .category-list::before { content: '\21D2\A0'; font-size: 1.2em; font-weight: 900; }

								   .category-list li { display: inline; }

								   .category-list li:not(:last-child)::after { content: ', '; }

								   .category-list li > span, .category-list li > a { text-transform: lowercase; }

								   .category-list li * { text-transform: none; } /* don't affect <code> nested in <a> */


								   .XXX { color: #E50000; background: white; border: solid red; padding: 0.5em; margin: 1em 0; }

								   .XXX > :first-child { margin-top: 0; }

								   p .XXX { line-height: 3em; }

								   .annotation { border: solid thin black; background: #0C479D; color: white; position: relative; margin: 8px 0 20px 0; }

								   .annotation:before { position: absolute; left: 0; top: 0; width: 100%; height: 100%; margin: 6px -6px -6px 6px; background: #333333; z-index: -1; content: ''; }

								   .annotation :link, .annotation :visited { color: inherit; }

								   .annotation :link:hover, .annotation :visited:hover { background: transparent; }

								   .annotation span { border: none ! important; }

								   .note { color: green; background: transparent; font-family: sans-serif; }

								   .warning { color: red; background: transparent; }

								   .note, .warning { font-weight: bolder; font-style: italic; }

								   p.note, div.note { padding: 0.5em 2em; }

								   span.note { padding: 0 2em; }

								   .note p:first-child, .warning p:first-child { margin-top: 0; }

								   .note p:last-child, .warning p:last-child { margin-bottom: 0; }

								   .warning:before { font-style: normal; }

								   p.note:before { content: 'Note: '; }

								   p.warning:before { content: '\26A0 Warning! '; }


								   .bookkeeping:before { display: block; content: 'Bookkeeping details'; font-weight: bolder; font-style: italic; }

								   .bookkeeping { font-size: 0.8em; margin: 2em 0; }

								   .bookkeeping p { margin: 0.5em 2em; display: list-item; list-style: square; }

								   .bookkeeping dt { margin: 0.5em 2em 0; }

								   .bookkeeping dd { margin: 0 3em 0.5em; }


								   h4 { position: relative; z-index: 3; }

								   h4 + .element, h4 + div + .element { margin-top: -2.5em; padding-top: 2em; }

								   .element {

								     background: #EEEEFF;

								     color: black;

								     margin: 0 0 1em 0.15em;

								     padding: 0 1em 0.25em 0.75em;

								     border-left: solid #9999FF 0.25em;

								     position: relative;

								     z-index: 1;

								   }

								   .element:before {

								     position: absolute;

								     z-index: 2;

								     top: 0;

								     left: -1.15em;

								     height: 2em;

								     width: 0.9em;

								     background: #EEEEFF;

								     content: ' ';

								     border-style: none none solid solid;

								     border-color: #9999FF;

								     border-width: 0.25em;

								   }


								   .example { display: block; color: #222222; background: #FCFCFC; border-left: double; margin-left: 2em; padding-left: 1em; }

								   td > .example:only-child { margin: 0 0 0 0.1em; }


								   ul.domTree, ul.domTree ul { padding: 0 0 0 1em; margin: 0; }

								   ul.domTree li { padding: 0; margin: 0; list-style: none; position: relative; }

								   ul.domTree li li { list-style: none; }

								   ul.domTree li:first-child::before { position: absolute; top: 0; height: 0.6em; left: -0.75em; width: 0.5em; border-style: none none solid solid; content: ''; border-width: 0.1em; }

								   ul.domTree li:not(:last-child)::after { position: absolute; top: 0; bottom: -0.6em; left: -0.75em; width: 0.5em; border-style: none none solid solid; content: ''; border-width: 0.1em; }

								   ul.domTree span { font-style: italic; font-family: serif; }

								   ul.domTree .t1 code { color: purple; font-weight: bold; }

								   ul.domTree .t2 { font-style: normal; font-family: monospace; }

								   ul.domTree .t2 .name { color: black; font-weight: bold; }

								   ul.domTree .t2 .value { color: blue; font-weight: normal; }

								   ul.domTree .t3 code, .domTree .t4 code, .domTree .t5 code { color: gray; }

								   ul.domTree .t7 code, .domTree .t8 code { color: green; }

								   ul.domTree .t10 code { color: teal; }


								   body.dfnEnabled dfn { cursor: pointer; }

								   .dfnPanel {

								     display: inline;

								     position: absolute;

								     z-index: 10;

								     height: auto;

								     width: auto;

								     padding: 0.5em 0.75em;

								     font: small sans-serif, Droid Sans Fallback;

								     background: #DDDDDD;

								     color: black;

								     border: outset 0.2em;

								   }

								   .dfnPanel * { margin: 0; padding: 0; font: inherit; text-indent: 0; }

								   .dfnPanel :link, .dfnPanel :visited { color: black; }

								   .dfnPanel p { font-weight: bolder; }

								   .dfnPanel * + p { margin-top: 0.25em; }

								   .dfnPanel li { list-style-position: inside; }


								   #configUI { position: absolute; z-index: 20; top: 10em; right: 1em; width: 11em; font-size: small; }

								   #configUI p { margin: 0.5em 0; padding: 0.3em; background: #EEEEEE; color: black; border: inset thin; }

								   #configUI p label { display: block; }

								   #configUI #updateUI, #configUI .loginUI { text-align: center; }

								   #configUI input[type=button] { display: block; margin: auto; }


								   fieldset { margin: 1em; padding: 0.5em 1em; }

								   fieldset > legend + * { margin-top: 0; }

								   fieldset > :last-child { margin-bottom: 0; }

								   fieldset p { margin: 0.5em 0; }


								   .stability {

								     position: fixed;

								     bottom: 0;

								     left: 0; right: 0;

								     margin: 0 auto 0 auto !important;

								    z-index: 1000;

								     width: 50%;

								     background: maroon; color: yellow;

								     -webkit-border-radius: 1em 1em 0 0;

								     -moz-border-radius: 1em 1em 0 0;

								     border-radius: 1em 1em 0 0;

								     -moz-box-shadow: 0 0 1em #500;

								     -webkit-box-shadow: 0 0 1em #500;

								     box-shadow: 0 0 1em red;

								     padding: 0.5em 1em;

								     text-align: center;

								   }

								   .stability strong {

								     display: block;

								   }

								   .stability input {

								     appearance: none; margin: 0; border: 0; padding: 0.25em 0.5em; background: transparent; color: black;

								     position: absolute; top: -0.5em; right: 0; font: 1.25em sans-serif; text-align: center;

								   }

								   .stability input:hover {

								     color: white;

								     text-shadow: 0 0 2px black;

								   }

								   .stability input:active {

								     padding: 0.3em 0.45em 0.2em 0.55em;

								   }

								   .stability :link, .stability :visited,

								   .stability :link:hover, .stability :visited:hover {

								     background: transparent;

								     color: white;

								   }


								  </style><link href="data:text/css,.impl%20%7B%20display:%20none;%20%7D%0Ahtml%20%7B%20border:%20solid%20yellow;%20%7D%20.domintro:before%20%7B%20display:%20none;%20%7D" id="author" rel="alternate stylesheet" title="Author documentation only"><link href="data:text/css,.impl%20%7B%20background:%20%23FFEEEE;%20%7D%20.domintro:before%20%7B%20background:%20%23FFEEEE;%20%7D" id="highlight" rel="alternate stylesheet" title="Highlight implementation

								requirements"><link href="http://www.w3.org/StyleSheets/TR/W3C-WD" rel="stylesheet" type="text/css"><style type="text/css">


								   .applies thead th > * { display: block; }

								   .applies thead code { display: block; }

								   .applies tbody th { white-space: nowrap; }

								   .applies td { text-align: center; }

								   .applies .yes { background: yellow; }


								   .matrix, .matrix td { border: hidden; text-align: right; }

								   .matrix { margin-left: 2em; }




								   td.eg { border-width: thin; text-align: center; }


								   #table-example-1 { border: solid thin; border-collapse: collapse; margin-left: 3em; }

								   #table-example-1 * { font-family: "Essays1743", serif; line-height: 1.01em; }

								   #table-example-1 caption { padding-bottom: 0.5em; }

								   #table-example-1 thead, #table-example-1 tbody { border: none; }

								   #table-example-1 th, #table-example-1 td { border: solid thin; }

								   #table-example-1 th { font-weight: normal; }

								   #table-example-1 td { border-style: none solid; vertical-align: top; }

								   #table-example-1 th { padding: 0.5em; vertical-align: middle; text-align: center; }

								   #table-example-1 tbody tr:first-child td { padding-top: 0.5em; }

								   #table-example-1 tbody tr:last-child td { padding-bottom: 1.5em; }

								   #table-example-1 tbody td:first-child { padding-left: 2.5em; padding-right: 0; width: 9em; }

								   #table-example-1 tbody td:first-child::after { content: leader(". "); }

								   #table-example-1 tbody td { padding-left: 2em; padding-right: 2em; }

								   #table-example-1 tbody td:first-child + td { width: 10em; }

								   #table-example-1 tbody td:first-child + td ~ td { width: 2.5em; }

								   #table-example-1 tbody td:first-child + td + td + td ~ td { width: 1.25em; }


								   .apple-table-examples { border: none; border-collapse: separate; border-spacing: 1.5em 0em; width: 40em; margin-left: 3em; }

								   .apple-table-examples * { font-family: "Times", serif; }

								   .apple-table-examples td, .apple-table-examples th { border: none; white-space: nowrap; padding-top: 0; padding-bottom: 0; }

								   .apple-table-examples tbody th:first-child { border-left: none; width: 100%; }

								   .apple-table-examples thead th:first-child ~ th { font-size: smaller; font-weight: bolder; border-bottom: solid 2px; text-align: center; }

								   .apple-table-examples tbody th::after, .apple-table-examples tfoot th::after { content: leader(". ") }

								   .apple-table-examples tbody th, .apple-table-examples tfoot th { font: inherit; text-align: left; }

								   .apple-table-examples td { text-align: right; vertical-align: top; }

								   .apple-table-examples.e1 tbody tr:last-child td { border-bottom: solid 1px; }

								   .apple-table-examples.e1 tbody + tbody tr:last-child td { border-bottom: double 3px; }

								   .apple-table-examples.e2 th[scope=row] { padding-left: 1em; }

								   .apple-table-examples sup { line-height: 0; }


								   .details-example img { vertical-align: top; }


								   #base64-table {

								     white-space: nowrap;

								     font-size: 0.6em;

								     column-width: 6em;

								     column-count: 5;

								     column-gap: 1em;

								     -moz-column-width: 6em;

								     -moz-column-count: 5;

								     -moz-column-gap: 1em;

								     -webkit-column-width: 6em;

								     -webkit-column-count: 5;

								     -webkit-column-gap: 1em;

								   }

								   #base64-table thead { display: none; }

								   #base64-table * { border: none; }

								   #base64-table tbody td:first-child:after { content: ':'; }

								   #base64-table tbody td:last-child { text-align: right; }


								   #named-character-references-table {

								     white-space: nowrap;

								     font-size: 0.6em;

								     column-width: 30em;

								     column-gap: 1em;

								     -moz-column-width: 30em;

								     -moz-column-gap: 1em;

								     -webkit-column-width: 30em;

								     -webkit-column-gap: 1em;

								   }

								   #named-character-references-table > table > tbody > tr > td:first-child + td,

								   #named-character-references-table > table > tbody > tr > td:last-child { text-align: center; }

								   #named-character-references-table > table > tbody > tr > td:last-child:hover > span { position: absolute; top: auto; left: auto; margin-left: 0.5em; line-height: 1.2; font-size: 5em; border: outset; padding: 0.25em 0.5em; background: white; width: 1.25em; height: auto; text-align: center; }

								   #named-character-references-table > table > tbody > tr#entity-CounterClockwiseContourIntegral > td:first-child { font-size: 0.5em; }


								   .glyph.control { color: red; }


								   @font-face {

								     font-family: 'Essays1743';

								     src: url('http://www.whatwg.org/specs/web-apps/current-work/fonts/Essays1743.ttf');

								   }

								   @font-face {

								     font-family: 'Essays1743';

								     font-weight: bold;

								     src: url('http://www.whatwg.org/specs/web-apps/current-work/fonts/Essays1743-Bold.ttf');

								   }

								   @font-face {

								     font-family: 'Essays1743';

								     font-style: italic;

								     src: url('http://www.whatwg.org/specs/web-apps/current-work/fonts/Essays1743-Italic.ttf');

								   }

								   @font-face {

								     font-family: 'Essays1743';

								     font-style: italic;

								     font-weight: bold;

								     src: url('http://www.whatwg.org/specs/web-apps/current-work/fonts/Essays1743-BoldItalic.ttf');

								   }


								  </style><style type="text/css">

								   .domintro:before { display: table; margin: -1em -0.5em -0.5em auto; width: auto; content: 'This box is non-normative. Implementation requirements are given below this box.'; color: black; font-style: italic; border: solid 2px; background: white; padding: 0 0.25em; }

								  </style><script type="text/javascript">

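								   // Returns the value of the named setting, or null if it is not set.
								   // Query-string parameters take precedence over cookies.
								   function getCookie(name) {
								     var params = location.search.substring(1).split("&");
								     for (var index = 0; index < params.length; index++) {
								       // A bare "?name" with no "=value" part counts as "1".
								       if (params[index] == name)
								         return "1";
								       var data = params[index].split("=");
								       if (data[0] == name)
								         return unescape(data[1]);
								     }
								     // No matching parameter: fall back to the document's cookies.
								     var cookies = document.cookie.split("; ");
								     for (var index = 0; index < cookies.length; index++) {
								       var data = cookies[index].split("=");
								       if (data[0] == name)
								         return unescape(data[1]);
								     }
								     return null;
								   }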

								  </script>

								  <script src="link-fixup.js" type="text/javascript"></script>

								  <link href="style.css" rel="stylesheet"><link href="syntax.html" title="8 The HTML syntax" rel="prev">

								  <link href="spec.html#contents" title="Table of contents" rel="index">

								  <link href="tokenization.html" title="8.2.4 Tokenization" rel="next">

								  </head><body><div class="head" id="head">

								<div id="multipage-common">

								  <p class="stability" id="wip"><strong>This is a work in

								  progress!</strong> For the latest updates from the HTML WG, possibly

								  including important bug fixes, please look at the <a href="http://dev.w3.org/html5/spec/Overview.html">editor's draft</a> instead.

								  There may also be a more

								  <a href="http://www.w3.org/TR/html5">up-to-date Working Draft</a>

								   with changes based on resolution of Last Call issues.

								  <input onclick="closeWarning(this.parentNode)" type="button" value="&#9587;&#8413;"></p>

								  <script type="text/javascript">

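								   // Removes the warning banner and remembers that choice for four days.
								   function closeWarning(element) {
								     element.parentNode.removeChild(element);
								     var date = new Date();
								     date.setDate(date.getDate()+4);
								     // The cookie's expires attribute takes a UTC date string.
								     document.cookie = 'hide-obsolescence-warning=1; expires=' + date.toUTCString();
								   }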

								   if (getCookie('hide-obsolescence-warning') == '1')

								     setTimeout(function () { document.getElementById('wip').parentNode.removeChild(document.getElementById('wip')); }, 2000);

								  </script></div>


								   <p><a href="http://www.w3.org/"><img alt="W3C" height="48" src="http://www.w3.org/Icons/w3c_home" width="72"></a></p>


								   <h1>HTML5</h1>

								   </div><div>

								   <a href="syntax.html" class="prev">8 The HTML syntax</a> &#8211;

								   <a href="spec.html#contents">Table of contents</a> &#8211;

								   <a href="tokenization.html" class="next">8.2.4 Tokenization</a>

								  <ol class="toc"><li><ol><li><a href="parsing.html#parsing"><span class="secno">8.2 </span>Parsing HTML documents</a>

								    <ol><li><a href="parsing.html#overview-of-the-parsing-model"><span class="secno">8.2.1 </span>Overview of the parsing model</a></li><li><a href="parsing.html#the-input-stream"><span class="secno">8.2.2 </span>The input stream</a>

								      <ol><li><a href="parsing.html#determining-the-character-encoding"><span class="secno">8.2.2.1 </span>Determining the character encoding</a></li><li><a href="parsing.html#character-encodings-0"><span class="secno">8.2.2.2 </span>Character encodings</a></li><li><a href="parsing.html#preprocessing-the-input-stream"><span class="secno">8.2.2.3 </span>Preprocessing the input stream</a></li><li><a href="parsing.html#changing-the-encoding-while-parsing"><span class="secno">8.2.2.4 </span>Changing the encoding while parsing</a></li></ol></li><li><a href="parsing.html#parse-state"><span class="secno">8.2.3 </span>Parse state</a>

								      <ol><li><a href="parsing.html#the-insertion-mode"><span class="secno">8.2.3.1 </span>The insertion mode</a></li><li><a href="parsing.html#the-stack-of-open-elements"><span class="secno">8.2.3.2 </span>The stack of open elements</a></li><li><a href="parsing.html#the-list-of-active-formatting-elements"><span class="secno">8.2.3.3 </span>The list of active formatting elements</a></li><li><a href="parsing.html#the-element-pointers"><span class="secno">8.2.3.4 </span>The element pointers</a></li><li><a href="parsing.html#other-parsing-state-flags"><span class="secno">8.2.3.5 </span>Other parsing state flags</a></li></ol></li></ol></li></ol></li></ol></div>


								  <div class="impl">


								  <h3 id="parsing"><span class="secno">8.2 </span>Parsing HTML documents</h3>


								  <p><i>This section applies only to user agents, data mining tools,

								  and conformance checkers.</i></p>


								  <p class="note">The rules for parsing XML documents into DOM trees

								  are covered by the next section, entitled "<a href="the-xhtml-syntax.html#the-xhtml-syntax">The XHTML

								  syntax</a>".</p>


								  <p>For <a href="dom.html#html-documents">HTML documents</a>, user agents must use the parsing

								  rules described in this section to generate the DOM trees. Together,

								  these rules define what is referred to as the <dfn id="html-parser">HTML

								  parser</dfn>.</p>


								  <div class="note">


								   <p>While the HTML syntax described in this specification bears a

								   close resemblance to SGML and XML, it is a separate language with

								   its own parsing rules.</p>


								   <p>Some earlier versions of HTML (in particular from HTML2 to

								   HTML4) were based on SGML and used SGML parsing rules. However, few

								   (if any) web browsers ever implemented true SGML parsing for HTML

								   documents; the only user agents to strictly handle HTML as an SGML

								   application have historically been validators. The resulting

								   confusion &#8212; with validators claiming documents to have one

								   representation while widely deployed web browsers interoperably

								   implemented a different representation &#8212; has wasted decades

								   of productivity. This version of HTML thus returns to a non-SGML

								   basis.</p>


								   <p>Authors interested in using SGML tools in their authoring

								   pipeline are encouraged to use XML tools and the XML serialization

								   of HTML.</p>


								  </div>


								  <p>This specification defines the parsing rules for HTML documents,

								  whether they are syntactically correct or not. Certain points in the

								  parsing algorithm are said to be <dfn id="parse-error" title="parse error">parse

								  errors</dfn>. The error handling for parse errors is well-defined:

								  user agents must either act as described below when encountering

								  such problems, or must abort processing at the first error that they

								  encounter for which they do not wish to apply the rules described

								  below.</p>


								  <p>Conformance checkers must report at least one parse error

								  condition to the user if one or more parse error conditions exist in

								  the document and must not report parse error conditions if none

								  exist in the document. Conformance checkers may report more than one

								  parse error condition if more than one parse error condition exists

								  in the document. Conformance checkers are not required to recover

								  from parse errors.</p>


								  <p class="note">Parse errors are only errors with the

								  <em>syntax</em> of HTML. In addition to checking for parse errors,

								  conformance checkers will also verify that the document obeys all

								  the other conformance requirements described in this

								  specification.</p>


								  <p>For the purposes of conformance checkers, if a resource is

								  determined to be in <a href="syntax.html#syntax">the HTML syntax</a>, then it is an

								  <a href="dom.html#html-documents" title="HTML documents">HTML document</a>.</p>


								  </div><div class="impl">


								  <h4 id="overview-of-the-parsing-model"><span class="secno">8.2.1 </span>Overview of the parsing model</h4>


								  <p>The input to the HTML parsing process consists of a stream of

								  Unicode characters, which is passed through a

								  <a href="tokenization.html#tokenization">tokenization</a> stage followed by a <a href="tree-construction.html#tree-construction">tree

								  construction</a> stage. The output is a <code><a href="infrastructure.html#document">Document</a></code>

								  object.</p>
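
<div class="example">

<p>As a rough illustration of the two stages, the following sketch
feeds a string of characters through a toy tokenizer and a toy tree
builder. The names <code>tokenize</code> and <code>buildTree</code>,
the token shapes, and the plain-object tree are assumptions made for
this example only; the actual tokenization and tree construction
rules are far more involved and are defined in the sections linked
above.</p>

<pre>
```javascript
// "\u003C" is the less-than sign, written as an escape so that this
// listing contains no raw markup.
const TAG_OR_TEXT = /\u003C(\/?)([a-z0-9]+)>|([^\u003C]+)/gi;

// Stage 1 (toy tokenizer): characters in, tokens out.
function* tokenize(characters) {
  for (const m of characters.matchAll(TAG_OR_TEXT)) {
    if (m[3] !== undefined) yield { type: "characters", data: m[3] };
    else yield { type: m[1] ? "end tag" : "start tag", name: m[2].toLowerCase() };
  }
}

// Stage 2 (toy tree construction): tokens in, tree out.
function buildTree(tokens) {
  const root = { name: "#document", children: [] };
  const stack = [root];
  for (const token of tokens) {
    const current = stack[stack.length - 1];
    if (token.type === "start tag") {
      const element = { name: token.name, children: [] };
      current.children.push(element);
      stack.push(element);
    } else if (token.type === "end tag") {
      // Toy behavior: any end tag closes the current element.
      if (stack.length > 1) stack.pop();
    } else {
      current.children.push({ name: "#text", data: token.data });
    }
  }
  return root;
}

const tree = buildTree(tokenize("\u003Cp>Hello \u003Cb>world\u003C/b>\u003C/p>"));
console.log(JSON.stringify(tree, null, 1));
```
</pre>

<p>The tokenizer is written as a generator so that the tree builder
pulls tokens one at a time, which loosely mirrors how the two stages
are interleaved rather than run to completion one after the
other.</p>

</div>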


								  <p class="note">Implementations that <a href="infrastructure.html#non-scripted">do not

								  support scripting</a> do not have to actually create a DOM

								  <code><a href="infrastructure.html#document">Document</a></code> object, but the DOM tree in such cases is

								  still used as the model for the rest of the specification.</p>


								  <p>In the common case, the data handled by the tokenization stage

								  comes from the network, but <a href="apis-in-html-documents.html#dynamic-markup-insertion" title="dynamic markup

								  insertion">it can also come from script</a> running in the user

								  agent, e.g. using the <code title="dom-document-write"><a href="apis-in-html-documents.html#dom-document-write">document.write()</a></code> API.</p>


								  <p><img alt="" height="554" src="parsing-model-overview.png" width="427"></p>


								  <p id="nestedParsing">There is only one set of states for the

								  tokenizer stage and the tree construction stage, but the tree

								  construction stage is reentrant, meaning that while the tree

								  construction stage is handling one token, the tokenizer might be

								  resumed, causing further tokens to be emitted and processed before

								  the first token's processing is complete.</p>


								  <div class="example">


								   <p>In the following example, the tree construction stage will be

								   called upon to handle a "p" start tag token while handling the

								   "script" end tag token:</p>


								   <pre>...

								&lt;script&gt;

								 document.write('&lt;p&gt;');

								&lt;/script&gt;

								...</pre>


								  </div>


								  <p>To handle these cases, parsers have a <dfn id="script-nesting-level">script nesting

								  level</dfn>, which must be initially set to zero, and a <dfn id="parser-pause-flag">parser

								  pause flag</dfn>, which must be initially set to false.</p>
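
<div class="example">

<p>The reentrancy described above can be pictured with a small toy
model. Everything in it is hypothetical: the <code>ToyParser</code>
class, its <code>write()</code> method (a stand-in for
<code>document.write()</code>), and the token objects are
illustrative assumptions, not the processing model defined by this
specification. The sketch only shows a second token being handled
before the handling of the first one has finished, with a counter
tracking how deeply script execution is nested.</p>

<pre>
```javascript
// Toy model only: ToyParser and its token objects are hypothetical.
class ToyParser {
  constructor() {
    this.scriptNestingLevel = 0;  // initially zero
    this.parserPauseFlag = false; // initially false (not otherwise used here)
    this.log = [];
  }

  // Stand-in for the tree construction stage: handles one token.
  handleToken(token) {
    this.log.push("begin " + token.name);
    if (token.name === "script end tag") {
      this.scriptNestingLevel += 1;
      // Running the script may call write(), which re-enters handleToken().
      token.run(this);
      this.scriptNestingLevel -= 1;
    }
    this.log.push("end " + token.name);
  }

  // Stand-in for document.write(): feeds further tokens to the parser.
  write(tokens) {
    for (const token of tokens) this.handleToken(token);
  }
}

const parser = new ToyParser();
parser.handleToken({
  name: "script end tag",
  run: (p) => p.write([{ name: "p start tag" }]),
});
console.log(parser.log.join("\n"));
```
</pre>

<p>In the resulting log, the entries for the "p" start tag fall
between the "begin" and "end" entries for the "script" end tag.</p>

</div>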


								  </div><div class="impl">


								  <h4 id="the-input-stream"><span class="secno">8.2.2 </span>The <dfn>input stream</dfn></h4>


								  <p>The stream of Unicode characters that comprises the input to the

								  tokenization stage will be initially seen by the user agent as a

								  stream of bytes (typically coming over the network or from the local

								  file system). The bytes encode the actual characters according to a

								  particular <em>character encoding</em>, which the user agent must

								  use to decode the bytes into characters.</p>
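
<div class="example">

<p>To see why the choice of character encoding matters, the sketch
below decodes the same two bytes under two different encodings using
the <code>TextDecoder</code> interface available in current browsers
and in Node.js. The helper name <code>codePoints</code> and the
choice of bytes are assumptions made for this illustration.</p>

<pre>
```javascript
// The same bytes yield different characters under different encodings.
const bytes = new Uint8Array([0xc3, 0xa9]);

// Hypothetical helper: decode the input, then list the code points in hex.
function codePoints(label, input) {
  const text = new TextDecoder(label).decode(input);
  return Array.from(text, (ch) => ch.codePointAt(0).toString(16));
}

const asUtf8 = codePoints("utf-8", bytes);     // one character, U+00E9
const asUtf16 = codePoints("utf-16le", bytes); // one character, U+A9C3
console.log(asUtf8, asUtf16);
```
</pre>

</div>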


								  <p class="note">For XML documents, the algorithm user agents must

								  use to determine the character encoding is given by the XML

								  specification. This section does not apply to XML documents. <a href="references.html#refsXML">[XML]</a></p>


								  <h5 id="determining-the-character-encoding"><span class="secno">8.2.2.1 </span>Determining the character encoding</h5>


								  <p>In some cases, it might be impractical to unambiguously determine

								  the encoding before parsing the document. Because of this, this

								  specification provides for a two-pass mechanism with an optional

								  pre-scan. Implementations are allowed, as described below, to apply

								  a simplified parsing algorithm to whatever bytes they have available

								  before beginning to parse the document. Then, the real parser is

								  started, using a tentative encoding derived from this pre-scan and

								  other out-of-band metadata. If, while the document is being loaded,

								  the user agent discovers an encoding declaration that conflicts with

								  this information, then the parser can be reinvoked to parse the

								  document again using the real encoding.</p>


								  <p id="documentEncoding">User agents must use the following

								  algorithm (the <dfn id="encoding-sniffing-algorithm">encoding sniffing algorithm</dfn>) to determine

								  the character encoding to use when decoding a document in the first

								  pass. This algorithm takes as input any out-of-band metadata

								  available to the user agent (e.g. the <a href="fetching-resources.html#content-type" title="Content-Type">Content-Type metadata</a> of the document)

								  and all the bytes available so far, and returns an encoding and a

								  <dfn id="concept-encoding-confidence" title="concept-encoding-confidence">confidence</dfn>. The

								  confidence is either <i>tentative</i>, <i>certain</i>, or

								  <i>irrelevant</i>. The encoding used, and whether the confidence in

								  that encoding is <i>tentative</i> or <i>certain</i>, is <a href="tree-construction.html#meta-charset-during-parse">used during the parsing</a> to

								  determine whether to <a href="#change-the-encoding">change the encoding</a>. If no

								  encoding is necessary, e.g. because the parser is operating on a

								  stream of Unicode characters and doesn't have to use an encoding at

								  all, then the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> is

								  <i>irrelevant</i>.</p>


								  <ol><li><p>If the user has explicitly instructed the user agent to

								   override the document's character encoding with a specific

								   encoding, optionally return that encoding with the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a>

								   <i>certain</i> and abort these steps.</p></li>


								   <li><p>If the transport layer specifies an encoding, and it is

								   supported, return that encoding with the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a>

								   <i>certain</i>, and abort these steps.</p></li>


								   <li>


								    <p>The user agent may wait for more bytes of the resource to be

								    available, either in this step or at any later step in this

								    algorithm. For instance, a user agent might wait 500ms or 1024

								    bytes, whichever comes first. In general, preparsing the source to

								    find the encoding improves performance, as it reduces the need to

								    throw away the data structures used when parsing upon finding the

								    encoding information. However, if the user agent delays too long

								    to obtain data to determine the encoding, then the cost of the

								    delay could outweigh any performance improvements from the

								    preparse.</p>


								    <p class="note">The authoring conformance requirements for

								    character encoding declarations limit them to only appearing <a href="semantics.html#charset1024">in the first 1024 bytes</a>. User agents are

								    therefore encouraged to use the preparse algorithm below (part of

								    these steps) on the first 1024 bytes, but not to stall beyond

								    that.</p>


								   </li>


								   <li><p>For each of the rows in the following table, starting with

								   the first one and going down, if there are as many or more bytes

								   available than the number of bytes in the first column, and the

								   first bytes of the file match the bytes given in the first column,

								   then return the encoding given in the cell in the second column of

								   that row, with the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a>

								   <i>certain</i>, and abort these steps:</p>


								    <table><thead><tr><th>Bytes in Hexadecimal

								       </th><th>Encoding

								     </th></tr></thead><tbody><tr><td>FE FF

								       </td><td>Big-endian UTF-16

								      </td></tr><tr><td>FF FE

								       </td><td>Little-endian UTF-16

								      </td></tr><tr><td>EF BB BF

								       </td><td>UTF-8

								    </td></tr></tbody></table><p class="note">This step looks for Unicode Byte Order Marks

								   (BOMs).</p></li>


								   <li><p>Otherwise, the user agent will have to search for explicit

								   character encoding information in the file itself. This should

								   proceed as follows:


								    </p><p>Let <var title="">position</var> be a pointer to a byte in the

								    input stream, initially pointing at the first byte. If at any

								    point during these substeps the user agent either runs out of

								    bytes or decides that scanning further bytes would not be

								    efficient, then skip to the next step of the overall character

								    encoding detection algorithm. User agents may decide that scanning

								    <em>any</em> bytes is not efficient, in which case these substeps

								    are entirely skipped.</p>


								    <p>Now, repeat the following "two" steps until the algorithm

								    aborts (either because the user agent aborts, as described above, or

								    because a character encoding is found):</p>


								    <ol><li><p>If <var title="">position</var> points to:</p>


								      <dl class="switch"><dt>A sequence of bytes starting with: 0x3C 0x21 0x2D 0x2D (ASCII '&lt;!--')</dt>

								       <dd>


								        <p>Advance the <var title="">position</var> pointer so that it

								        points at the first 0x3E byte which is preceded by two 0x2D

								        bytes (i.e. at the end of an ASCII '--&gt;' sequence) and comes

								        after the 0x3C byte that was found. (The two 0x2D bytes can be

								        the same as those in the '&lt;!--' sequence.)</p>


								       </dd>


								       <dt>A sequence of bytes starting with: 0x3C, 0x4D or 0x6D, 0x45 or 0x65, 0x54 or 0x74, 0x41 or 0x61, and finally one of 0x09, 0x0A, 0x0C, 0x0D, 0x20, 0x2F (case-insensitive ASCII '&lt;meta' followed by a space or slash)</dt>

								       <dd>


								        <ol><li><p>Advance the <var title="">position</var> pointer so

								         that it points at the next 0x09, 0x0A, 0x0C, 0x0D, 0x20, or

								         0x2F byte (the one in the sequence of characters matched

								         above).</p></li>


								         <li><p>Let <var title="">attribute list</var> be an empty

								         list of strings.</p></li>

								         <li><p>Let <var title="">got pragma</var> be false.</p></li>


								         <li><p>Let <var title="">need pragma</var> be null.</p></li>


								         <li><p>Let <var title="">charset</var> be the null value

								         (which, for the purposes of this algorithm, is distinct from

								         an unrecognised encoding or the empty string).</p></li>


								         <li><p><i>Attributes</i>: <a href="#concept-get-attributes-when-sniffing" title="concept-get-attributes-when-sniffing">Get an

								         attribute</a> and its value. If no attribute was sniffed,

								         then jump to the <i>processing</i> step below.</p></li>


								         <li><p>If the attribute's name is already in <var title="">attribute list</var>, then return to the step

								         labeled <i>attributes</i>.</p>


								         </li><li><p>Add the attribute's name to <var title="">attribute

								         list</var>.</p>


								         </li><li>


								          <p>Run the appropriate step from the following list, if one

								          applies:</p>


								          <dl class="switch"><dt>If the attribute's name is "<code title="">http-equiv</code>"</dt>


								           <dd><p>If the attribute's value is "<code title="">content-type</code>", then set <var title="">got

								           pragma</var> to true.</p></dd>


								           <dt>If the attribute's name is "<code title="">content</code>"</dt>


								           <dd><p>Apply the <a href="fetching-resources.html#algorithm-for-extracting-an-encoding-from-a-meta-element">algorithm for extracting an encoding

								           from a <code>meta</code> element</a>, giving the

								           attribute's value as the string to parse. If an encoding is

								           returned, and if <var title="">charset</var> is still set

								           to null, let <var title="">charset</var> be the encoding

								           returned, and set <var title="">need pragma</var> to

								           true.</p></dd>


								           <dt>If the attribute's name is "<code title="">charset</code>"</dt>


								           <dd><p>Let <var title="">charset</var> be the encoding

								           corresponding to the attribute's value, and set <var title="">need pragma</var> to false.</p></dd>


								          </dl></li>


								         <li><p>Return to the step labeled <i>attributes</i>.</p></li>


								         <li><p><i>Processing</i>: If <var title="">need pragma</var>

								         is null, then jump to the second step of the overall "two

								         step" algorithm.</p></li>


								         <li><p>If <var title="">need pragma</var> is true but <var title="">got pragma</var> is false, then jump to the second

								         step of the overall "two step" algorithm.</p></li>


								         <li><p>If <var title="">charset</var> is a UTF-16 encoding,

								         change the value of <var title="">charset</var> to

								         UTF-8.</p></li>


								         <li><p>If <var title="">charset</var> is not a supported

								         character encoding, then jump to the second step of the

								         overall "two step" algorithm.</p></li>


								         <li><p>Return the encoding given by <var title="">charset</var>, with <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a>

								         <i>tentative</i>, and abort all these steps.</p></li>


								        </ol></dd>


								       <dt>A sequence of bytes starting with a 0x3C byte (ASCII &lt;), optionally a 0x2F byte (ASCII /), and finally a byte in the range 0x41-0x5A or 0x61-0x7A (an ASCII letter)</dt>

								       <dd>


								        <ol><li><p>Advance the <var title="">position</var> pointer so

								         that it points at the next 0x09 (ASCII TAB), 0x0A (ASCII LF),

								         0x0C (ASCII FF), 0x0D (ASCII CR), 0x20 (ASCII space), or 0x3E

								         (ASCII &gt;) byte.</p></li>


								         <li><p>Repeatedly <a href="#concept-get-attributes-when-sniffing" title="concept-get-attributes-when-sniffing">get an

								         attribute</a> until no further attributes can be found,

								         then jump to the second step in the overall "two step"

								         algorithm.</p></li>


								        </ol></dd>


								       <dt>A sequence of bytes starting with: 0x3C 0x21 (ASCII '&lt;!')</dt>

								       <dt>A sequence of bytes starting with: 0x3C 0x2F (ASCII '&lt;/')</dt>

								       <dt>A sequence of bytes starting with: 0x3C 0x3F (ASCII '&lt;?')</dt>

								       <dd>


								        <p>Advance the <var title="">position</var> pointer so that it

								        points at the first 0x3E byte (ASCII &gt;) that comes after the

								        0x3C byte that was found.</p>


								       </dd>


								       <dt>Any other byte</dt>

								       <dd>


								        <p>Do nothing with that byte.</p>


								       </dd>


								      </dl></li>


								     <li>Move <var title="">position</var> so it points at the next

								     byte in the input stream, and return to the first step of this

								     "two step" algorithm.</li>


								    </ol><p>When the above "two step" algorithm says to <dfn id="concept-get-attributes-when-sniffing" title="concept-get-attributes-when-sniffing">get an

								    attribute</dfn>, it means doing this:</p>


								    <ol><li><p>If the byte at <var title="">position</var> is one of 0x09

								     (ASCII TAB), 0x0A (ASCII LF), 0x0C (ASCII FF), 0x0D (ASCII CR),

								     0x20 (ASCII space), or 0x2F (ASCII /) then advance <var title="">position</var> to the next byte and redo this

								     substep.</p></li>


								     <li><p>If the byte at <var title="">position</var> is 0x3E (ASCII

								     &gt;), then abort the "get an attribute" algorithm. There isn't

								     one.</p></li>


								     <li><p>Otherwise, the byte at <var title="">position</var> is the

								     start of the attribute name. Let <var title="">attribute

								     name</var> and <var title="">attribute value</var> be the empty

								     string.</p></li>


								     <li><p><i>Attribute name</i>: Process the byte at <var title="">position</var> as follows:</p>


								      <dl class="switch"><dt>If it is 0x3D (ASCII =), and the <var title="">attribute

								       name</var> is longer than the empty string</dt>


								       <dd>Advance <var title="">position</var> to the next byte and

								       jump to the step below labeled <i>value</i>.</dd>


								       <dt>If it is 0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0C (ASCII

								       FF), 0x0D (ASCII CR), or 0x20 (ASCII space)</dt>


								       <dd>Jump to the step below labeled <i>spaces</i>.</dd>


								       <dt>If it is 0x2F (ASCII /) or 0x3E (ASCII &gt;)</dt>


								       <dd>Abort the "get an attribute" algorithm. The attribute's

								       name is the value of <var title="">attribute name</var>, its

								       value is the empty string.</dd>


								       <dt>If it is in the range 0x41 (ASCII A) to 0x5A (ASCII

								       Z)</dt>


								       <dd>Append the Unicode character with code point <span title=""><var title="">b</var>+0x20</span> to <var title="">attribute

								       name</var> (where <var title="">b</var> is the value of the

								       byte at <var title="">position</var>).</dd>


								       <dt>Anything else</dt>


								       <dd>Append the Unicode character with the same code point as the

								       value of the byte at <var title="">position</var> to <var title="">attribute name</var>. (It doesn't actually matter how

								       bytes outside the ASCII range are handled here, since only

								       ASCII characters can contribute to the detection of a character

								       encoding.)</dd>


								      </dl></li>


								     <li><p>Advance <var title="">position</var> to the next byte and

								     return to the previous step.</p></li>


								     <li><p><i>Spaces</i>: If the byte at <var title="">position</var> is one of 0x09 (ASCII TAB), 0x0A (ASCII

								     LF), 0x0C (ASCII FF), 0x0D (ASCII CR), or 0x20 (ASCII space) then

								     advance <var title="">position</var> to the next byte and

								     repeat this step.</p></li>


								     <li><p>If the byte at <var title="">position</var> is

								     <em>not</em> 0x3D (ASCII =), abort the "get an attribute"

								     algorithm. The attribute's name is the value of <var title="">attribute name</var>, its value is the empty

								     string.</p></li>


								     <li><p>Advance <var title="">position</var> past the 0x3D (ASCII

								     =) byte.</p></li>


								     <li><p><i>Value</i>: If the byte at <var title="">position</var> is one of 0x09 (ASCII TAB), 0x0A (ASCII

								     LF), 0x0C (ASCII FF), 0x0D (ASCII CR), or 0x20 (ASCII space) then

								     advance <var title="">position</var> to the next byte and

								     repeat this step.</p></li>


								     <li><p>Process the byte at <var title="">position</var> as

								     follows:</p>


								      <dl class="switch"><dt>If it is 0x22 (ASCII ") or 0x27 (ASCII ')</dt>


								       <dd>


								        <ol><li>Let <var title="">b</var> be the value of the byte at

								         <var title="">position</var>.</li>


								         <li>Advance <var title="">position</var> to the next

								         byte.</li>


								         <li>If the value of the byte at <var title="">position</var>

								         is the value of <var title="">b</var>, then advance <var title="">position</var> to the next byte and abort the "get

								         an attribute" algorithm. The attribute's name is the value of

								         <var title="">attribute name</var>, and its value is the

								         value of <var title="">attribute value</var>.</li>


								         <li>Otherwise, if the value of the byte at <var title="">position</var> is in the range 0x41 (ASCII A) to

								         0x5A (ASCII Z), then append a Unicode character to <var title="">attribute value</var> whose code point is 0x20 more

								         than the value of the byte at <var title="">position</var>.</li>


								         <li>Otherwise, append a Unicode character to <var title="">attribute value</var> whose code point is the same as

								         the value of the byte at <var title="">position</var>.</li>


								         <li>Return to the second step in these substeps.</li>


								        </ol></dd>


								       <dt>If it is 0x3E (ASCII &gt;)</dt>


								       <dd>Abort the "get an attribute" algorithm. The attribute's

								       name is the value of <var title="">attribute name</var>, its

								       value is the empty string.</dd>


								       <dt>If it is in the range 0x41 (ASCII A) to 0x5A (ASCII

								       Z)</dt>


								       <dd>Append the Unicode character with code point <span title=""><var title="">b</var>+0x20</span> to <var title="">attribute

								       value</var> (where <var title="">b</var> is the value of the

								       byte at <var title="">position</var>). Advance <var title="">position</var> to the next byte.</dd>


								       <dt>Anything else</dt>


								       <dd>Append the Unicode character with the same code point as the

								       value of the byte at <var title="">position</var> to <var title="">attribute value</var>. Advance <var title="">position</var> to the next byte.</dd>


								      </dl></li>


								     <li><p>Process the byte at <var title="">position</var> as

								     follows:</p>


								      <dl class="switch"><dt>If it is 0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0C (ASCII

								       FF), 0x0D (ASCII CR), 0x20 (ASCII space), or 0x3E (ASCII

								       &gt;)</dt>


								       <dd>Abort the "get an attribute" algorithm. The attribute's

								       name is the value of <var title="">attribute name</var> and its

								       value is the value of <var title="">attribute value</var>.</dd>


								       <dt>If it is in the range 0x41 (ASCII A) to 0x5A (ASCII

								       Z)</dt>


								       <dd>Append the Unicode character with code point <span title=""><var title="">b</var>+0x20</span> to <var title="">attribute

								       value</var> (where <var title="">b</var> is the value of the

								       byte at <var title="">position</var>).</dd>


								       <dt>Anything else</dt>


								       <dd>Append the Unicode character with the same code point as the

								       value of the byte at <var title="">position</var> to <var title="">attribute value</var>.</dd>


								      </dl></li>


								     <li><p>Advance <var title="">position</var> to the next byte and

								     return to the previous step.</p></li>


								    </ol><p>For the sake of interoperability, user agents should not use a

								    pre-scan algorithm that returns different results than the one

								    described above. (But, if you do, please at least let us know, so

								    that we can improve this algorithm and benefit everyone...)</p>
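
								    <p class="note">As a non-normative illustration, the "get an attribute" steps and the attribute handling for a <code>meta</code> element might be sketched in Python as follows. The names <code title="">get_attribute</code> and <code title="">meta_charset</code>, the convention of reporting "ran out of bytes" as "no attribute", and the regular expression standing in for the algorithm for extracting an encoding from a <code>meta</code> element are all assumptions of this sketch. The check for supported encodings is omitted.</p>

```python
import re

WHITESPACE = {0x09, 0x0A, 0x0C, 0x0D, 0x20}

def _lower(b):
    # Bytes 0x41-0x5A (ASCII A-Z) are lowercased by adding 0x20.
    return chr(b + 0x20) if 0x41 <= b <= 0x5A else chr(b)

def get_attribute(data, pos):
    """Return ((name, value), new_pos), or (None, new_pos) if there is none.

    Simplification (assumption): running out of bytes is reported as None.
    """
    n = len(data)
    while pos < n and (data[pos] in WHITESPACE or data[pos] == 0x2F):
        pos += 1
    if pos >= n or data[pos] == 0x3E:
        return None, pos
    name, value = "", ""
    while True:  # "Attribute name" step
        if pos >= n:
            return None, pos
        b = data[pos]
        if b == 0x3D and name:
            pos += 1
            break
        if b in WHITESPACE:  # "Spaces" step
            while pos < n and data[pos] in WHITESPACE:
                pos += 1
            if pos >= n:
                return None, pos
            if data[pos] != 0x3D:
                return (name, ""), pos
            pos += 1
            break
        if b in (0x2F, 0x3E):
            return (name, ""), pos
        name += _lower(b)
        pos += 1
    while pos < n and data[pos] in WHITESPACE:  # "Value" step
        pos += 1
    if pos >= n:
        return None, pos
    b = data[pos]
    if b in (0x22, 0x27):  # quoted value: read up to the matching quote
        pos += 1
        while pos < n:
            if data[pos] == b:
                return (name, value), pos + 1
            value += _lower(data[pos])
            pos += 1
        return None, pos
    if b == 0x3E:
        return (name, ""), pos
    while pos < n and data[pos] not in WHITESPACE and data[pos] != 0x3E:
        value += _lower(data[pos])
        pos += 1
    return ((name, value), pos) if pos < n else (None, pos)

def meta_charset(data, pos):
    """Handle a meta element, with pos at the byte just after the tag name."""
    seen, got_pragma, need_pragma, charset = set(), False, None, None
    while True:
        attr, pos = get_attribute(data, pos)
        if attr is None:
            break
        name, value = attr
        if name in seen:
            continue
        seen.add(name)
        if name == "http-equiv" and value == "content-type":
            got_pragma = True
        elif name == "content" and charset is None:
            # Hypothetical stand-in for the extraction algorithm defined elsewhere.
            m = re.search(r"charset\s*=\s*[\"']?([^\"';\s]+)", value)
            if m:
                charset, need_pragma = m.group(1), True
        elif name == "charset":
            charset, need_pragma = value, False
    if need_pragma is None or (need_pragma and not got_pragma):
        return None
    if charset.startswith("utf-16"):
        charset = "utf-8"  # a UTF-16 declaration is treated as UTF-8
    return charset or None
```

								    <p class="note">In this sketch, a <code title="">content</code> attribute only counts when an <code title="">http-equiv</code> attribute with the value "<code title="">content-type</code>" is also present, whereas a <code title="">charset</code> attribute is accepted on its own.</p>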


								   </li>


								   <li><p>If the user agent has information on the likely encoding for

								   this page, e.g. based on the encoding of the page when it was last

								   visited, then return that encoding, with the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a>

								   <i>tentative</i>, and abort these steps.</p></li>


								   <li>


								    <p>The user agent may attempt to autodetect the character encoding

								    from applying frequency analysis or other algorithms to the data

								    stream. Such algorithms may use information about the resource

								    other than the resource's contents, including the address of the

								    resource. If autodetection succeeds in determining a character

								    encoding, then return that encoding, with the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a>

								    <i>tentative</i>, and abort these steps. <a href="references.html#refsUNIVCHARDET">[UNIVCHARDET]</a></p>


								    <p class="note">The UTF-8 encoding has a highly detectable bit

								    pattern. Documents that contain bytes with values greater than

								    0x7F which match the UTF-8 pattern are very likely to be UTF-8,

								    while documents with byte sequences that do not match it are very

								    likely not. User agents are therefore encouraged to search for

								    this common encoding. <a href="references.html#refsPPUTF8">[PPUTF8]</a> <a href="references.html#refsUTF8DET">[UTF8DET]</a></p>
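
								    <p class="note">A minimal, non-normative sketch of such a check is given below. The function name <code title="">looks_like_utf8</code> is hypothetical, and the sketch relies on Python's strict UTF-8 decoder rather than any algorithm defined here. It treats pure ASCII input as inconclusive, and would wrongly reject input that has been cut off in the middle of a multi-byte sequence.</p>

```python
def looks_like_utf8(data: bytes) -> bool:
    """Heuristic: True if data has bytes above 0x7F and decodes cleanly as UTF-8."""
    if all(b <= 0x7F for b in data):
        return False  # pure ASCII gives no evidence either way
    try:
        data.decode("utf-8")  # strict decoding raises on malformed sequences
    except UnicodeDecodeError:
        return False
    return True
```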


								   </li>


								   <li>


								    <p>Otherwise, return an implementation-defined or user-specified

								    default character encoding, with the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a>

								    <i>tentative</i>.</p>


								    <p>In controlled environments or in environments where the

								    encoding of documents can be prescribed (for example, for user

								    agents intended for dedicated use in new networks), the

								    comprehensive <code title="">UTF-8</code> encoding is

								    suggested.</p>


								    <p>In other environments, the default encoding is typically

								    dependent on the user's locale (an approximation of the languages,

								    and thus often encodings, of the pages that the user is likely to

								    frequent). The following table gives suggested defaults based on

								    the user's locale, for compatibility with legacy content. Locales

								    are identified by BCP 47 language tags. <a href="references.html#refsBCP47">[BCP47]</a></p>


								    <table><thead><tr><th>Locale language

								       </th><th>Suggested default encoding

								     </th></tr></thead><tbody><tr><td>ar

								       </td><td>UTF-8


								      </td></tr><tr><td>be

								       </td><td>ISO-8859-5


								      </td></tr><tr><td>bg

								       </td><td>windows-1251


								      </td></tr><tr><td>cs<!-- -CZ -->

								       </td><td>ISO-8859-2


								      </td></tr><tr><td>cy

								       </td><td>UTF-8


								      </td></tr><tr><td>fa<!-- -IR -->

								       </td><td>UTF-8


								      </td></tr><tr><td>he<!-- -IL -->

								       </td><td>windows-1255


								      </td></tr><tr><td>hr

								       </td><td>UTF-8


								      </td></tr><tr><td>hu<!-- -HU -->

								       </td><td>ISO-8859-2


								      </td></tr><tr><td>ja

								       </td><td>Windows-31J


								      </td></tr><tr><td>kk

								       </td><td>UTF-8


								      </td></tr><tr><td>ko<!-- -KR -->

								       </td><td>windows-949 <!-- EUC-KR -->


								      </td></tr><tr><td>ku

								       </td><td>windows-1254 <!-- ISO-8859-9 -->


								      </td></tr><tr><td>lt

								       </td><td>windows-1257


								      </td></tr><tr><td>lv<!-- -LV -->

								       </td><td>ISO-8859-13


								      </td></tr><tr><td>mk<!-- -MK -->

								       </td><td>UTF-8


								      </td></tr><tr><td>or

								       </td><td>UTF-8


								      </td></tr><tr><td>pl<!-- -PL -->

								       </td><td>ISO-8859-2


								      </td></tr><tr><td>ro

								       </td><td>UTF-8


								      </td></tr><tr><td>ru

								       </td><td>windows-1251


								      </td></tr><tr><td>sk

								       </td><td>windows-1250


								      </td></tr><tr><td>sl

								       </td><td>ISO-8859-2


								      </td></tr><tr><td>sr

								       </td><td>UTF-8


								      </td></tr><tr><td>th

								       </td><td>windows-874 <!-- TIS-620 -->


								      </td></tr><tr><td>tr<!-- -TR -->

								       </td><td>windows-1254 <!-- ISO-8859-9 -->


								      </td></tr><tr><td>uk

								       </td><td>windows-1251


								      </td></tr><tr><td>vi

								       </td><td>UTF-8


								      </td></tr><tr><td>zh-CN

								       </td><td>GB18030


								      </td></tr><tr><td>zh-TW

								       </td><td>Big5


								      </td></tr><tr><td>All other locales

								       </td><td>windows-1252


								    </td></tr></tbody></table></li>


								  </ol><p>The <a href="dom.html#document-s-character-encoding">document's character encoding</a> must immediately

								  be set to the value returned from this algorithm, at the same time

								  as the user agent uses the returned value to select the decoder to

								  use for the input stream.</p>
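
								  <p class="note">The order of the steps above might be sketched in Python as follows. This is a non-normative illustration: the function name <code title="">sniff_encoding</code>, its parameters, and the encoding labels used for the byte order marks are assumptions of this sketch, and the check that the transport-layer encoding is supported is omitted. The <code title="">prescan</code> and <code title="">autodetect</code> arguments are optional callables that take the bytes seen so far and return an encoding name or <code title="">None</code>.</p>

```python
# Byte order marks, checked in the order given in the table above.
BOMS = [
    (b"\xFE\xFF", "UTF-16BE"),
    (b"\xFF\xFE", "UTF-16LE"),
    (b"\xEF\xBB\xBF", "UTF-8"),
]

def sniff_encoding(data, override=None, transport=None, prescan=None,
                   remembered=None, autodetect=None, default="windows-1252"):
    """Return (encoding, confidence) for the first pass over the document."""
    if override:                      # explicit user instruction
        return override, "certain"
    if transport:                     # e.g. Content-Type metadata
        return transport, "certain"
    for bom, name in BOMS:            # byte order mark
        if data.startswith(bom):
            return name, "certain"
    found = prescan(data) if prescan else None
    if found:                         # declaration found in the file itself
        return found, "tentative"
    if remembered:                    # e.g. encoding used on the last visit
        return remembered, "tentative"
    found = autodetect(data) if autodetect else None
    if found:                         # frequency analysis or similar
        return found, "tentative"
    return default, "tentative"       # implementation-defined default
```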


								  <p class="note">This algorithm is a <a href="introduction.html#willful-violation">willful violation</a>

								  of the HTTP specification, which requires that the encoding be

								  assumed to be ISO-8859-1 in the absence of a <a href="semantics.html#character-encoding-declaration">character

								  encoding declaration</a> to the contrary, and of RFC 2046,

								  which requires that the encoding be assumed to be US-ASCII in the

								  absence of a <a href="semantics.html#character-encoding-declaration">character encoding declaration</a> to the

								  contrary. This specification's third approach is motivated by a

								  desire to be maximally compatible with legacy content. <a href="references.html#refsHTTP">[HTTP]</a> <a href="references.html#refsRFC2046">[RFC2046]</a></p>


								  <h5 id="character-encodings-0"><span class="secno">8.2.2.2 </span>Character encodings</h5>


								  <p>User agents must at a minimum support the UTF-8 and Windows-1252

								  encodings, but may support more. <a href="references.html#refsRFC3629">[RFC3629]</a> <a href="references.html#refsWIN1252">[WIN1252]</a></p>


								  <p class="note">It is not unusual for Web browsers to support dozens

								  if not upwards of a hundred distinct character encodings.</p>


								  <p>User agents must support the <a href="infrastructure.html#preferred-mime-name">preferred MIME name</a> of

								  every character encoding they support, and should support all the

								  IANA-registered names and aliases of every character encoding they

								  support. <a href="references.html#refsIANACHARSET">[IANACHARSET]</a></p>


								  <p>When comparing a string specifying a character encoding with the

								  name or alias of a character encoding to determine if they are

								  equal, user agents must remove any leading or trailing <a href="common-microsyntaxes.html#space-character" title="space character">space characters</a> in both names, and

								  then perform the comparison in an <a href="infrastructure.html#ascii-case-insensitive">ASCII

								  case-insensitive</a> manner.</p>
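
								  <p class="note">As a non-normative sketch, the comparison could be written in Python as below. The names <code title="">ascii_lower</code> and <code title="">same_encoding_name</code> are hypothetical. Only the letters A to Z are folded, so non-ASCII characters are compared unchanged.</p>

```python
# The space characters: space, tab, line feed, form feed, carriage return.
SPACE_CHARACTERS = " \t\n\x0c\r"

def ascii_lower(s: str) -> str:
    """Lowercase only the ASCII letters A-Z, leaving all other characters alone."""
    return "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in s)

def same_encoding_name(a: str, b: str) -> bool:
    """Compare two encoding names after trimming space characters."""
    a = a.strip(SPACE_CHARACTERS)
    b = b.strip(SPACE_CHARACTERS)
    return ascii_lower(a) == ascii_lower(b)
```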


								  <hr><p>When a user agent would otherwise use an encoding given in the

								  first column of the following table to either convert content to

								  Unicode characters or convert Unicode characters to bytes, it must

								  instead use the encoding given in the cell in the second column of

								  the same row. When a byte or sequence of bytes is treated

								  differently due to this encoding aliasing, it is said to have been

								  <dfn id="misinterpreted-for-compatibility">misinterpreted for compatibility</dfn>.</p>


								  <table id="table-encoding-overrides"><caption>Character encoding overrides</caption>

								   <thead><tr><th> Input encoding </th><th> Replacement encoding </th><th> References

								   </th></tr></thead><tbody><tr><td> EUC-KR </td><td> windows-949 </td><td>

								         <a href="references.html#refsEUCKR">[EUCKR]</a>

								         <a href="references.html#refsWIN949">[WIN949]</a>

								    </td></tr><tr><td> EUC-JP </td><td> CP51932 </td><td>

								         <a href="references.html#refsEUCJP">[EUCJP]</a>

								         <a href="references.html#refsCP51932">[CP51932]</a>

								    </td></tr><tr><td> GB2312 </td><td> GBK </td><td>

								         <a href="references.html#refsRFC1345">[RFC1345]</a>

								         <a href="references.html#refsGBK">[GBK]</a>

								    </td></tr><tr><td> GB_2312-80 </td><td> GBK </td><td>

								         <a href="references.html#refsRFC1345">[RFC1345]</a>

								         <a href="references.html#refsGBK">[GBK]</a>

								    </td></tr><tr><td> ISO-8859-1 </td><td> windows-1252 </td><td>

								         <a href="references.html#refsRFC1345">[RFC1345]</a>

								         <a href="references.html#refsWIN1252">[WIN1252]</a>

								    </td></tr><tr><td> ISO-8859-9 </td><td> windows-1254 </td><td>

								         <a href="references.html#refsRFC1345">[RFC1345]</a>

								         <a href="references.html#refsWIN1254">[WIN1254]</a>

								    </td></tr><tr><td> ISO-8859-11 </td><td> windows-874 </td><td>

								         <a href="references.html#refsISO885911">[ISO885911]</a>

								         <a href="references.html#refsWIN874">[WIN874]</a>

								    </td></tr><tr><td> KS_C_5601-1987 </td><td> windows-949 </td><td>

								         <a href="references.html#refsRFC1345">[RFC1345]</a>

								         <a href="references.html#refsWIN949">[WIN949]</a>

								    </td></tr><tr><td> Shift_JIS </td><td> Windows-31J </td><td>

								         <a href="references.html#refsSHIFTJIS">[SHIFTJIS]</a>

								         <a href="references.html#refsWIN31J">[WIN31J]</a>

								    </td></tr><tr><td> TIS-620 </td><td> windows-874 </td><td>

								         <a href="references.html#refsTIS620">[TIS620]</a>

								         <a href="references.html#refsWIN874">[WIN874]</a>

								    </td></tr><tr><td> US-ASCII </td><td> windows-1252 </td><td>

								         <a href="references.html#refsRFC1345">[RFC1345]</a>

								         <a href="references.html#refsWIN1252">[WIN1252]</a>

								   </td></tr></tbody></table><p class="note">The requirement to treat certain encodings as other

								  encodings according to the table above is a <a href="introduction.html#willful-violation">willful

								  violation</a> of the W3C Character Model specification, motivated

								  by a desire for compatibility with legacy content. <a href="references.html#refsCHARMOD">[CHARMOD]</a></p>
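
								  <p class="note">One way to apply the table above is a simple lookup performed before a decoder is selected. The Python sketch below is non-normative. The names <code title="">ENCODING_OVERRIDES</code> and <code title="">effective_encoding</code> are hypothetical, and the sketch assumes the name passed in is an ASCII encoding name.</p>

```python
# Input encoding (lowercased) -> replacement encoding, from the table above.
ENCODING_OVERRIDES = {
    "euc-kr": "windows-949",
    "euc-jp": "CP51932",
    "gb2312": "GBK",
    "gb_2312-80": "GBK",
    "iso-8859-1": "windows-1252",
    "iso-8859-9": "windows-1254",
    "iso-8859-11": "windows-874",
    "ks_c_5601-1987": "windows-949",
    "shift_jis": "Windows-31J",
    "tis-620": "windows-874",
    "us-ascii": "windows-1252",
}

def effective_encoding(name: str) -> str:
    """Return the encoding to actually use for a declared encoding name."""
    key = name.strip(" \t\n\x0c\r").lower()
    return ENCODING_OVERRIDES.get(key, name)
```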


								  <p>When a user agent is to use the UTF-16 encoding but no BOM has

								  been found, it must default to UTF-16LE.</p>


								  <p class="note">The requirement to default UTF-16 to LE rather than

								  BE is a <a href="introduction.html#willful-violation">willful violation</a> of RFC 2781, motivated by a

								  desire for compatibility with legacy content. <a href="references.html#refsRFC2781">[RFC2781]</a></p>
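
								  <p class="note">A non-normative Python sketch of this default follows. The function name <code title="">utf16_codec</code> is hypothetical, and the returned strings are Python codec names rather than names defined by this specification. The byte order is chosen explicitly because Python's generic <code title="">utf-16</code> codec falls back to the platform's native byte order when no BOM is present.</p>

```python
def utf16_codec(data: bytes) -> str:
    """Pick a byte-order-specific codec for data declared as UTF-16."""
    if data.startswith(b"\xFE\xFF"):
        return "utf-16-be"
    # Either a little-endian BOM was found, or there is no BOM at all:
    # in both cases little-endian is used.
    return "utf-16-le"
```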


								  <hr><p>User agents must not support the CESU-8, UTF-7, BOCU-1 and SCSU

								  encodings. <a href="references.html#refsCESU8">[CESU8]</a> <a href="references.html#refsUTF7">[UTF7]</a> <a href="references.html#refsBOCU1">[BOCU1]</a> <a href="references.html#refsSCSU">[SCSU]</a></p>


								  <p>Support for encodings based on EBCDIC is not recommended. These

								  encodings are rarely used for publicly-facing Web content.</p>


								  <p>Support for UTF-32 is not recommended. This encoding is rarely

								  used, and frequently implemented incorrectly.</p>


								  <p class="note">This specification does not make any attempt to

								  support EBCDIC-based encodings and UTF-32 in its algorithms; support

								  and use of these encodings can thus lead to unexpected behavior in

								  implementations of this specification.</p>
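								  <p>One way a user agent might encode these support rules is a small policy check, sketched below. The three result strings and the substring test for EBCDIC-based labels are assumptions made for illustration; a real implementation would need an explicit list of labels.</p>

```python
MUST_NOT_SUPPORT = frozenset({"cesu-8", "utf-7", "bocu-1", "scsu"})
NOT_RECOMMENDED = frozenset({"utf-32"})


def encoding_support_policy(label):
    """Classify a label as 'forbidden', 'discouraged' or 'allowed'."""
    key = label.strip().lower()
    if key in MUST_NOT_SUPPORT:
        return "forbidden"
    # EBCDIC-based encodings form a family, so this sketch matches them
    # by substring rather than by an exhaustive list of labels.
    if key in NOT_RECOMMENDED or "ebcdic" in key:
        return "discouraged"
    return "allowed"
```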


								  <h5 id="preprocessing-the-input-stream"><span class="secno">8.2.2.3 </span>Preprocessing the input stream</h5>


								  <p>Given an encoding, the bytes in the input stream must be

								  converted to Unicode characters for the tokenizer, as described by

								  the rules for that encoding, except that the leading U+FEFF BYTE

								  ORDER MARK character, if any, must not be stripped by the encoding

								  layer (it is stripped by the rule below).</p>

								  <p>Bytes or sequences of bytes in the original byte stream that

								  could not be converted to Unicode code points must be converted to

								  U+FFFD REPLACEMENT CHARACTERs. Specifically, if the encoding is

								  UTF-8, the bytes must be <a href="infrastructure.html#decoded-as-utf-8-with-error-handling" title="decoded as UTF-8, with error

								  handling">decoded with the error handling</a> defined in this

								  specification.</p>


								  <p class="note">Bytes or sequences of bytes in the original byte

								  stream that did not conform to the encoding specification

								  (e.g. invalid UTF-8 byte sequences in a UTF-8 input stream) are

								  errors that conformance checkers are expected to report.</p>


								  <p>Any byte or sequence of bytes in the original byte stream that is

								  <a href="#misinterpreted-for-compatibility">misinterpreted for compatibility</a> is a <a href="#parse-error">parse

								  error</a>.</p>


								  <p>One leading U+FEFF BYTE ORDER MARK character must be ignored if

								  any are present.</p>


								  <p class="note">The requirement to strip a U+FEFF BYTE ORDER MARK

								  character regardless of whether that character was used to determine

								  the byte order is a <a href="introduction.html#willful-violation">willful violation</a> of Unicode,

								  motivated by a desire to increase the resilience of user agents in

								  the face of na&#239;ve transcoders.</p>
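								  <p>The decoding and BOM rules above can be sketched as follows. The function name <code>decode_input_stream</code> is hypothetical, and the caller is assumed to pass a Python codec name that leaves a leading BOM in place (for example <code>utf-8</code> or <code>utf-16-le</code>), so that stripping happens in this step and not in the encoding layer. Python's <code>errors="replace"</code> policy only approximates the UTF-8 error handling this specification defines.</p>

```python
def decode_input_stream(data, codec):
    """Decode bytes for the tokenizer and drop one leading U+FEFF, if any."""
    # Undecodable bytes become U+FFFD REPLACEMENT CHARACTER. How many
    # replacement characters a malformed sequence yields is decided by
    # Python here, which may differ from this specification's rules.
    text = data.decode(codec, errors="replace")
    if text.startswith("\ufeff"):
        text = text[1:]  # exactly one BOM is ignored; any later ones stay
    return text
```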


								  <p>Any occurrences of any characters in the ranges U+0001 to U+0008,

								  U+000E to U+001F, U+007F to U+009F, U+FDD0

								  to U+FDEF, and characters U+000B, U+FFFE, U+FFFF, U+1FFFE, U+1FFFF,

								  U+2FFFE, U+2FFFF, U+3FFFE, U+3FFFF, U+4FFFE, U+4FFFF, U+5FFFE,

								  U+5FFFF, U+6FFFE, U+6FFFF, U+7FFFE, U+7FFFF, U+8FFFE, U+8FFFF,

								  U+9FFFE, U+9FFFF, U+AFFFE, U+AFFFF, U+BFFFE, U+BFFFF, U+CFFFE,

								  U+CFFFF, U+DFFFE, U+DFFFF, U+EFFFE, U+EFFFF, U+FFFFE, U+FFFFF,

								  U+10FFFE, and U+10FFFF are <a href="#parse-error" title="parse error">parse

								  errors</a>. These are all control characters or permanently

								  undefined Unicode characters (noncharacters).</p>
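								  <p>The list above is regular enough to be tested with a short predicate. The sketch below is one possible formulation; the names <code>CONTROL_RANGES</code> and <code>is_parse_error_code_point</code> are hypothetical, and the input is assumed to be a non-negative integer code point.</p>

```python
# Half-open ranges covering the contiguous spans named in the text.
CONTROL_RANGES = (
    range(0x0001, 0x0009),   # U+0001 to U+0008
    range(0x000E, 0x0020),   # U+000E to U+001F
    range(0x007F, 0x00A0),   # U+007F to U+009F
    range(0xFDD0, 0xFDF0),   # U+FDD0 to U+FDEF
)


def is_parse_error_code_point(cp):
    """Return True if the preprocessing stage flags this code point."""
    if cp == 0x000B:
        return True
    if any(cp in r for r in CONTROL_RANGES):
        return True
    # U+FFFE and U+FFFF, plus the last two code points of each later plane.
    return cp % 0x10000 in (0xFFFE, 0xFFFF) and 0x10FFFF >= cp
```

								  <p>Note that U+0009, U+000A, U+000C and U+000D are not in the list, so tabs, form feeds and newline characters pass this check.</p>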


								  <p>U+000D CARRIAGE RETURN (CR) characters and U+000A LINE FEED (LF)

								  characters are treated specially. Any CR characters that are

								  followed by LF characters must be removed, and any CR characters not

								  followed by LF characters must be converted to LF characters. Thus,

								  newlines in HTML DOMs are represented by LF characters, and there

								  are never any CR characters in the input to the

								  <a href="tokenization.html#tokenization">tokenization</a> stage.</p>
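								  <p>The newline rule amounts to two replacements applied in order, as in this minimal sketch (the function name <code>normalize_newlines</code> is hypothetical):</p>

```python
def normalize_newlines(text):
    """Remove each CR that precedes an LF; turn every other CR into LF."""
    # Handling CR LF pairs first ensures a pair yields one LF, not two.
    return text.replace("\r\n", "\n").replace("\r", "\n")
```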


								  <p>The <dfn id="next-input-character">next input character</dfn> is the first character in the

								  input stream that has not yet been <dfn id="consumed">consumed</dfn>. Initially,

								  the <i><a href="#next-input-character">next input character</a></i> is the first character in the

								  input. The <dfn id="current-input-character">current input character</dfn> is the last character

								  to have been <i><a href="#consumed">consumed</a></i>.</p>


								  <p>The <dfn id="insertion-point">insertion point</dfn> is the position (just before a

								  character or just before the end of the input stream) where content

								  inserted using <code title="dom-document-write"><a href="apis-in-html-documents.html#dom-document-write">document.write()</a></code> is actually

								  inserted. The insertion point is relative to the position of the

								  character immediately after it; it is not an absolute offset into

								  the input stream. Initially, the insertion point is

								  undefined.</p>


								  <p>The "EOF" character in the tables below is a conceptual character

								  representing the end of the <a href="#the-input-stream">input stream</a>. If the parser

								  is a <a href="apis-in-html-documents.html#script-created-parser">script-created parser</a>, then the end of the

								  <a href="#the-input-stream">input stream</a> is reached when an <dfn id="explicit-eof-character">explicit "EOF"

								  character</dfn> (inserted by the <code title="dom-document-close"><a href="apis-in-html-documents.html#dom-document-close">document.close()</a></code> method) is

								  consumed. Otherwise, the "EOF" character is not a real character in

								  the stream, but rather the lack of any further characters.</p>
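								  <p>A minimal sketch of the terms defined above, for a parser that is not script-created, so that "EOF" is simply the absence of further characters. The class name <code>InputStream</code> and the sentinel <code>EOF</code> are assumptions of this example; the insertion point and the explicit "EOF" character are not modelled.</p>

```python
EOF = object()  # conceptual end-of-stream marker, not a real character


class InputStream:
    def __init__(self, text):
        self._chars = list(text)
        self._pos = 0                        # index of the next input character
        self.current_input_character = None  # nothing has been consumed yet

    @property
    def next_input_character(self):
        """First character that has not yet been consumed, or EOF."""
        if self._pos >= len(self._chars):
            return EOF
        return self._chars[self._pos]

    def consume(self):
        """Consume the next input character and make it the current one."""
        ch = self.next_input_character
        if ch is not EOF:
            self._pos += 1
        self.current_input_character = ch
        return ch
```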


								  <h5 id="changing-the-encoding-while-parsing"><span class="secno">8.2.2.4 </span>Changing the encoding while parsing</h5>


								  <p>When the parser requires the user agent to <dfn id="change-the-encoding">change the

								  encoding</dfn>, it must run the following steps. This might happen

								  if the <a href="#encoding-sniffing-algorithm">encoding sniffing algorithm</a> described above

								  failed to find an encoding, or if it found an encoding that was not

								  the actual encoding of the file.</p>


								  <ol><li>If the new encoding is identical or equivalent to the encoding

								   that is already being used to interpret the input stream, then set

								   the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> to

								   <i>certain</i> and abort these steps. This happens when the

								   encoding information found in the file matches what the

								   <a href="#encoding-sniffing-algorithm">encoding sniffing algorithm</a> determined to be the

								   encoding, and in the second pass through the parser if the first

								   pass found that the encoding sniffing algorithm described in the

								   earlier section failed to find the right encoding.</li>


								   <li>If the encoding that is already being used to interpret the

								   input stream is a UTF-16 encoding, then set the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> to

								   <i>certain</i> and abort these steps. The new encoding is ignored;

								   if it was anything but the same encoding, then it would be clearly

								   incorrect.</li>


								   <li>If the new encoding is a UTF-16 encoding, change it to

								   UTF-8.</li>


								   <li>If all the bytes up to the last byte converted by the current

								   decoder have the same Unicode interpretations in both the current

								   encoding and the new encoding, and if the user agent supports

								   changing the converter on the fly, then the user agent may change

								   to the new converter for the encoding on the fly. Set the

								   <a href="dom.html#document-s-character-encoding">document's character encoding</a> and the encoding used to

								   convert the input stream to the new encoding, set the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> to

								   <i>certain</i>, and abort these steps.</li>


								   <li>Otherwise, <a href="history.html#navigate">navigate</a> to the

								   document again, with <a href="history.html#replacement-enabled">replacement enabled</a>, and using

								   the same <a href="history.html#source-browsing-context">source browsing context</a>, but this time skip

								   the <a href="#encoding-sniffing-algorithm">encoding sniffing algorithm</a> and instead just set

								   the encoding to the new encoding and the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> to

								   <i>certain</i>. Whenever possible, this should be done without

								   actually contacting the network layer (the bytes should be

								   re-parsed from memory), even if, e.g., the document is marked as

								   not being cacheable. If this is not possible and contacting the

								   network layer would involve repeating a request that uses a method

								   other than HTTP GET (<a href="fetching-resources.html#concept-http-equivalent-get" title="concept-http-equivalent-get">or

								   equivalent</a> for non-HTTP URLs), then instead set the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> to

								   <i>certain</i> and ignore the new encoding. The resource will be

								   misinterpreted. User agents may notify the user of the situation,

								   to aid in application development.</li>


								  </ol></div><div class="impl">
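								  <p>The steps for changing the encoding while parsing (section 8.2.2.4) can be sketched as a pure decision function. This is an illustration under stated assumptions, not a definitive implementation: the three boolean parameters and the returned action strings are hypothetical, "identical or equivalent" is reduced to a case-insensitive comparison, and the optional on-the-fly switch in step 4 is always taken when it is available.</p>

```python
def change_the_encoding(current, new, prefix_decodes_identically,
                        can_switch_on_the_fly, can_reparse_safely):
    """Return (encoding_to_use, confidence, action) for the steps above.

    can_reparse_safely stands for: the bytes can be re-parsed from memory,
    or refetching them would only repeat an HTTP GET (or equivalent).
    """
    current, new = current.lower(), new.lower()
    if new == current:                     # step 1 (exact match only)
        return current, "certain", "keep"
    if current.startswith("utf-16"):       # step 2: new encoding is ignored
        return current, "certain", "keep"
    if new.startswith("utf-16"):           # step 3
        new = "utf-8"
    if prefix_decodes_identically and can_switch_on_the_fly:  # step 4
        return new, "certain", "switch-decoder"
    if can_reparse_safely:                 # step 5: navigate again
        return new, "certain", "renavigate"
    # Step 5 fallback: the resource will be misinterpreted.
    return current, "certain", "keep"
```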


								  <h4 id="parse-state"><span class="secno">8.2.3 </span>Parse state</h4>


								  <h5 id="the-insertion-mode"><span class="secno">8.2.3.1 </span>The insertion mode</h5>


								  <p>The <dfn id="insertion-mode">insertion mode</dfn> is a state variable that controls

								  the primary operation of the tree construction stage.</p>


								  <p>Initially, the <a href="#insertion-mode">insertion mode</a> is "<a href="tree-construction.html#the-initial-insertion-mode" title="insertion mode: initial">initial</a>". It can change to

								  "<a href="tree-construction.html#the-before-html-insertion-mode" title="insertion mode: before html">before html</a>",

								  "<a href="tree-construction.html#the-before-head-insertion-mode" title="insertion mode: before head">before head</a>",

								  "<a href="tree-construction.html#parsing-main-inhead" title="insertion mode: in head">in head</a>", "<a href="tree-construction.html#parsing-main-inheadnoscript" title="insertion mode: in head noscript">in head noscript</a>",

								  "<a href="tree-construction.html#the-after-head-insertion-mode" title="insertion mode: after head">after head</a>", "<a href="tree-construction.html#parsing-main-inbody" title="insertion mode: in body">in body</a>", "<a href="tree-construction.html#parsing-main-incdata" title="insertion mode: text">text</a>", "<a href="tree-construction.html#parsing-main-intable" title="insertion

								  mode: in table">in table</a>", "<a href="tree-construction.html#parsing-main-intabletext" title="insertion mode: in

								  table text">in table text</a>", "<a href="tree-construction.html#parsing-main-incaption" title="insertion mode: in

								  caption">in caption</a>", "<a href="tree-construction.html#parsing-main-incolgroup" title="insertion mode: in column

								  group">in column group</a>", "<a href="tree-construction.html#parsing-main-intbody" title="insertion mode: in

								  table body">in table body</a>", "<a href="tree-construction.html#parsing-main-intr" title="insertion mode: in

								  row">in row</a>", "<a href="tree-construction.html#parsing-main-intd" title="insertion mode: in cell">in

								  cell</a>", "<a href="tree-construction.html#parsing-main-inselect" title="insertion mode: in select">in

								  select</a>", "<a href="tree-construction.html#parsing-main-inselectintable" title="insertion mode: in select in table">in

								  select in table</a>", "<a href="tree-construction.html#parsing-main-afterbody" title="insertion mode: after

								  body">after body</a>", "<a href="tree-construction.html#parsing-main-inframeset" title="insertion mode: in

								  frameset">in frameset</a>", "<a href="tree-construction.html#parsing-main-afterframeset" title="insertion mode: after

								  frameset">after frameset</a>", "<a href="tree-construction.html#the-after-after-body-insertion-mode" title="insertion mode:

								  after after body">after after body</a>", and "<a href="tree-construction.html#the-after-after-frameset-insertion-mode" title="insertion mode: after after frameset">after after

								  frameset</a>" during the course of the parsing, as described in

								  the <a href="tree-construction.html#tree-construction">tree construction</a> stage. The insertion mode affects

								  how tokens are processed and whether CDATA sections are

								  supported.</p>


								  <p>Several of these modes, namely "<a href="tree-construction.html#parsing-main-inhead" title="insertion mode: in

								  head">in head</a>", "<a href="tree-construction.html#parsing-main-inbody" title="insertion mode: in body">in

								  body</a>", "<a href="tree-construction.html#parsing-main-intable" title="insertion mode: in table">in

								  table</a>", and "<a href="tree-construction.html#parsing-main-inselect" title="insertion mode: in select">in

								  select</a>", are special, in that the other modes defer to them

								  at various times. When the algorithm below says that the user agent

								  is to do something "<dfn id="using-the-rules-for">using the rules for</dfn> the <var title="">m</var> insertion mode", where <var title="">m</var> is one

								  of these modes, the user agent must use the rules described under

								  the <var title="">m</var> <a href="#insertion-mode">insertion mode</a>'s section, but

								  must leave the <a href="#insertion-mode">insertion mode</a> unchanged unless the

								  rules in <var title="">m</var> themselves switch the <a href="#insertion-mode">insertion

								  mode</a> to a new value.</p>
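								  <p>To illustrate "using the rules for", the sketch below has one mode's handler call another mode's handler directly, without reassigning the insertion mode. The class <code>TreeBuilder</code>, its handlers, and the <code>"switch"</code> token are all hypothetical and stand in for the real rules of the tree construction stage.</p>

```python
class TreeBuilder:
    def __init__(self):
        self.insertion_mode = "in table"
        self.log = []

    def process(self, token):
        # Dispatch on the current insertion mode.
        name = "rules_" + self.insertion_mode.replace(" ", "_")
        getattr(self, name)(token)

    def rules_in_table(self, token):
        # Hypothetical rule: process the token using the rules for the
        # "in body" insertion mode. insertion_mode is NOT reassigned here.
        self.log.append(("in table", token))
        self.rules_in_body(token)

    def rules_in_body(self, token):
        self.log.append(("in body", token))
        if token == "switch":
            # Only the borrowed rules themselves may change the mode.
            self.insertion_mode = "in select"
```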


								  <p>When the insertion mode is switched to "<a href="tree-construction.html#parsing-main-incdata" title="insertion

								  mode: text">text</a>" or "<a href="tree-construction.html#parsing-main-intabletext" title="insertion mode: in table

								  text">in table text</a>", the <dfn id="original-insertion-mode">original insertion mode</dfn>

								  is also set. This is the insertion mode to which the tree

								  construction stage will return.</p>


								  <hr><p>When the steps below require the UA to <dfn id="reset-the-insertion-mode-appropriately">reset the insertion

								  mode appropriately</dfn>, it means the UA must follow these

								  steps:</p>


								  <ol><li>Let <var title="">last</var> be false.</li>


								   <li>Let <var title="">node</var> be the last node in the

								   <a href="#stack-of-open-elements">stack of open elements</a>.</li>


								   <li><i>Loop</i>: If <var title="">node</var> is the first node in

								   the stack of open elements, then set <var title="">last</var> to

								   true and set <var title="">node</var> to the <var title="concept-frag-parse-context"><a href="the-end.html#concept-frag-parse-context">context</a></var> element.

								   (<a href="the-end.html#fragment-case">fragment case</a>)</li>


								   <li>If <var title="">node</var> is a <code><a href="the-button-element.html#the-select-element">select</a></code> element,

								   then switch the <a href="#insertion-mode">insertion mode</a> to "<a href="tree-construction.html#parsing-main-inselect" title="insertion mode: in select">in select</a>" and abort these

								   steps. (<a href="the-end.html#fragment-case">fragment case</a>)</li>


								   <li>If <var title="">node</var> is a <code><a href="tabular-data.html#the-td-element">td</a></code> or

								   <code><a href="tabular-data.html#the-th-element">th</a></code> element and <var title="">last</var> is false, then

								   switch the <a href="#insertion-mode">insertion mode</a> to "<a href="tree-construction.html#parsing-main-intd" title="insertion

								   mode: in cell">in cell</a>" and abort these steps.</li>


								   <li>If <var title="">node</var> is a <code><a href="tabular-data.html#the-tr-element">tr</a></code> element, then

								   switch the <a href="#insertion-mode">insertion mode</a> to "<a href="tree-construction.html#parsing-main-intr" title="insertion

								   mode: in row">in row</a>" and abort these steps.</li>


								   <li>If <var title="">node</var> is a <code><a href="tabular-data.html#the-tbody-element">tbody</a></code>,

								   <code><a href="tabular-data.html#the-thead-element">thead</a></code>, or <code><a href="tabular-data.html#the-tfoot-element">tfoot</a></code> element, then switch the

								   <a href="#insertion-mode">insertion mode</a> to "<a href="tree-construction.html#parsing-main-intbody" title="insertion mode: in

								   table body">in table body</a>" and abort these steps.</li>


								   <li>If <var title="">node</var> is a <code><a href="tabular-data.html#the-caption-element">caption</a></code> element,

								   then switch the <a href="#insertion-mode">insertion mode</a> to "<a href="tree-construction.html#parsing-main-incaption" title="insertion mode: in caption">in caption</a>" and abort

								   these steps.</li>


								   <li>If <var title="">node</var> is a <code><a href="tabular-data.html#the-colgroup-element">colgroup</a></code> element,

								   then switch the <a href="#insertion-mode">insertion mode</a> to "<a href="tree-construction.html#parsing-main-incolgroup" title="insertion mode: in column group">in column group</a>" and

								   abort these steps. (<a href="the-end.html#fragment-case">fragment case</a>)</li>


								   <li>If <var title="">node</var> is a <code><a href="tabular-data.html#the-table-element">table</a></code> element,

								   then switch the <a href="#insertion-mode">insertion mode</a> to "<a href="tree-construction.html#parsing-main-intable" title="insertion mode: in table">in table</a>" and abort these

								   steps.</li>


								   <li>If <var title="">node</var> is a <code><a href="semantics.html#the-head-element">head</a></code> element,

								   then switch the <a href="#insertion-mode">insertion mode</a> to "<a href="tree-construction.html#parsing-main-inbody" title="insertion mode: in body">in body</a>" ("<a href="tree-construction.html#parsing-main-inbody" title="insertion mode: in body">in body</a>"! <em> not "<a href="tree-construction.html#parsing-main-inhead" title="insertion mode: in head">in head</a>"</em>!) and abort

								   these steps. (<a href="the-end.html#fragment-case">fragment case</a>)</li>

								   <li>If <var title="">node</var> is a <code><a href="sections.html#the-body-element">body</a></code> element,

								   then switch the <a href="#insertion-mode">insertion mode</a> to "<a href="tree-construction.html#parsing-main-inbody" title="insertion mode: in body">in body</a>" and abort these

								   steps.</li>


								   <li>If <var title="">node</var> is a <code><a href="obsolete.html#frameset">frameset</a></code> element,

								   then switch the <a href="#insertion-mode">insertion mode</a> to "<a href="tree-construction.html#parsing-main-inframeset" title="insertion mode: in frameset">in frameset</a>" and abort

								   these steps. (<a href="the-end.html#fragment-case">fragment case</a>)</li>


								   <li>If <var title="">node</var> is an <code><a href="semantics.html#the-html-element">html</a></code> element,

								   then switch the <a href="#insertion-mode">insertion mode</a>

								   to "<a href="tree-construction.html#the-before-head-insertion-mode" title="insertion mode: before head">before

								   head</a>" and abort these steps. (<a href="the-end.html#fragment-case">fragment

								   case</a>)</li>

								   <li>If <var title="">last</var> is true, then switch the

								   <a href="#insertion-mode">insertion mode</a> to "<a href="tree-construction.html#parsing-main-inbody" title="insertion mode: in

								   body">in body</a>" and abort these steps. (<a href="the-end.html#fragment-case">fragment

								   case</a>)</li>


								   <li>Let <var title="">node</var> now be the node before <var title="">node</var> in the <a href="#stack-of-open-elements">stack of open

								   elements</a>.</li>


								   <li>Return to the step labeled <i>loop</i>.</li>


								  </ol><h5 id="the-stack-of-open-elements"><span class="secno">8.2.3.2 </span>The stack of open elements</h5>
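								  <p>The "reset the insertion mode appropriately" steps in section 8.2.3.1 walk the stack of open elements from the last node towards the first. A minimal sketch follows, assuming elements are represented by their tag names as strings and the stack is a non-empty Python list whose index 0 is the first node. The names <code>MODE_BY_ELEMENT</code> and <code>reset_insertion_mode</code> are hypothetical, and when no context element is supplied the first node itself is examined.</p>

```python
MODE_BY_ELEMENT = {
    "select": "in select",
    "tr": "in row",
    "tbody": "in table body",
    "thead": "in table body",
    "tfoot": "in table body",
    "caption": "in caption",
    "colgroup": "in column group",
    "table": "in table",
    "head": "in body",        # deliberately "in body", not "in head"
    "body": "in body",
    "frameset": "in frameset",
    "html": "before head",
}


def reset_insertion_mode(stack, context=None):
    """Return the insertion mode selected by the reset steps."""
    index = len(stack) - 1            # start at the last node
    while True:
        last = index == 0
        # At the first node the context element is examined instead
        # (fragment case); without one, this sketch keeps the node itself.
        node = context if (last and context is not None) else stack[index]
        if node in ("td", "th") and not last:
            return "in cell"
        if node in MODE_BY_ELEMENT:
            return MODE_BY_ELEMENT[node]
        if last:
            return "in body"
        index -= 1                    # move to the node before this one
```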


								  <p>Initially, the <dfn id="stack-of-open-elements">stack of open elements</dfn> is empty. The

								  stack grows downwards; the topmost node on the stack is the first

								  one added to the stack, and the bottommost node of the stack is the

								  most recently added node in the stack (except when the

								  stack is manipulated in a random access fashion as part of <a href="tree-construction.html#adoptionAgency">the handling for misnested tags</a>).</p>


								  <p>The "<a href="tree-construction.html#the-before-html-insertion-mode" title="insertion mode: before html">before

								  html</a>" <a href="#insertion-mode">insertion mode</a> creates the

								  <code><a href="semantics.html#the-html-element">html</a></code> root element node, which is then added to the

								  stack.</p>


								  <p>In the <a href="the-end.html#fragment-case">fragment case</a>, the <a href="#stack-of-open-elements">stack of open

								  elements</a> is initialized to contain an <code><a href="semantics.html#the-html-element">html</a></code>

								  element that is created as part of <a href="the-end.html#html-fragment-parsing-algorithm" title="html fragment

								  parsing algorithm">that algorithm</a>. (The <a href="the-end.html#fragment-case">fragment

								  case</a> skips the "<a href="tree-construction.html#the-before-html-insertion-mode" title="insertion mode: before

								  html">before html</a>" <a href="#insertion-mode">insertion mode</a>.)</p>


								  <p>The <code><a href="semantics.html#the-html-element">html</a></code> node, however it is created, is the topmost

								  node of the stack. It only gets popped off the stack when the parser

								  <a href="the-end.html#stop-parsing" title="stop parsing">finishes</a>.</p>


								  <p>The <dfn id="current-node">current node</dfn> is the bottommost node in this

								  stack.</p>


								  <p>The <dfn id="current-table">current table</dfn> is the last <code><a href="tabular-data.html#the-table-element">table</a></code>

								  element in the <a href="#stack-of-open-elements">stack of open elements</a>, if there is

								  one. If there is no <code><a href="tabular-data.html#the-table-element">table</a></code> element in the <a href="#stack-of-open-elements">stack of

								  open elements</a> (<a href="the-end.html#fragment-case">fragment case</a>), then the

								  <a href="#current-table">current table</a> is the first element in the <a href="#stack-of-open-elements">stack

								  of open elements</a> (the <code><a href="semantics.html#the-html-element">html</a></code> element).</p>
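								  <p>The orientation of the stack and the definitions of the current node and current table can be sketched as below. Elements are again represented by tag-name strings, and the class <code>OpenElements</code> with its members is hypothetical. The current table is reported as an index into the stack so that two <code>table</code> elements can be told apart.</p>

```python
class OpenElements:
    """Stack of open elements; index 0 is the topmost (first added) node."""

    def __init__(self):
        self.nodes = []

    def push(self, name):
        # New nodes are added at the bottom of the stack.
        self.nodes.append(name)

    @property
    def current_node(self):
        # The bottommost node is the most recently added one.
        return self.nodes[-1]

    def current_table_index(self):
        # Search from the bottom for the last table element.
        for i in range(len(self.nodes) - 1, -1, -1):
            if self.nodes[i] == "table":
                return i
        # No table in the stack (fragment case): use the first element.
        return 0
```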


								  <p>Elements in the stack fall into the following categories:</p>


								  <dl><dt><dfn id="special">Special</dfn></dt>

								   <dd><p>The following elements have varying levels of special

								   parsing rules: HTML's <code><a href="sections.html#the-address-element">address</a></code>, <code><a href="obsolete.html#the-applet-element">applet</a></code>,

								   <code><a href="the-map-element.html#the-area-element">area</a></code>, <code><a href="sections.html#the-article-element">article</a></code>, <code><a href="sections.html#the-aside-element">aside</a></code>,

								   <code><a href="semantics.html#the-base-element">base</a></code>, <code><a href="obsolete.html#basefont">basefont</a></code>, <code><a href="obsolete.html#bgsound">bgsound</a></code>,

								   <code><a href="grouping-content.html#the-blockquote-element">blockquote</a></code>, <code><a href="sections.html#the-body-element">body</a></code>, <code><a href="text-level-semantics.html#the-br-element">br</a></code>,

								   <code><a href="the-button-element.html#the-button-element">button</a></code>, <code><a href="tabular-data.html#the-caption-element">caption</a></code>, <code><a href="obsolete.html#center">center</a></code>,

								   <code><a href="tabular-data.html#the-col-element">col</a></code>, <code><a href="tabular-data.html#the-colgroup-element">colgroup</a></code>, <code><a href="interactive-elements.html#the-command-element">command</a></code>,

								   <code><a href="grouping-content.html#the-dd-element">dd</a></code>, <code><a href="interactive-elements.html#the-details-element">details</a></code>, <code><a href="obsolete.html#dir">dir</a></code>,

								   <code><a href="grouping-content.html#the-div-element">div</a></code>, <code><a href="grouping-content.html#the-dl-element">dl</a></code>, <code><a href="grouping-content.html#the-dt-element">dt</a></code>,

								   <code><a href="the-iframe-element.html#the-embed-element">embed</a></code>, <code><a href="forms.html#the-fieldset-element">fieldset</a></code>, <code><a href="grouping-content.html#the-figcaption-element">figcaption</a></code>,

								   <code><a href="grouping-content.html#the-figure-element">figure</a></code>, <code><a href="sections.html#the-footer-element">footer</a></code>, <code><a href="forms.html#the-form-element">form</a></code>,

								   <code><a href="obsolete.html#frame">frame</a></code>, <code><a href="obsolete.html#frameset">frameset</a></code>, <code><a href="sections.html#the-h1-h2-h3-h4-h5-and-h6-elements">h1</a></code>,

								   <code><a href="sections.html#the-h1-h2-h3-h4-h5-and-h6-elements">h2</a></code>, <code><a href="sections.html#the-h1-h2-h3-h4-h5-and-h6-elements">h3</a></code>, <code><a href="sections.html#the-h1-h2-h3-h4-h5-and-h6-elements">h4</a></code>, <code><a href="sections.html#the-h1-h2-h3-h4-h5-and-h6-elements">h5</a></code>,

								   <code><a href="sections.html#the-h1-h2-h3-h4-h5-and-h6-elements">h6</a></code>, <code><a href="semantics.html#the-head-element">head</a></code>, <code><a href="sections.html#the-header-element">header</a></code>,

								   <code><a href="sections.html#the-hgroup-element">hgroup</a></code>, <code><a href="grouping-content.html#the-hr-element">hr</a></code>, <code><a href="semantics.html#the-html-element">html</a></code>,

								   <code><a href="the-iframe-element.html#the-iframe-element">iframe</a></code>,  <code><a href="embedded-content-1.html#the-img-element">img</a></code>, <code><a href="the-input-element.html#the-input-element">input</a></code>,

								   <code><a href="obsolete.html#isindex-0">isindex</a></code>, <code><a href="grouping-content.html#the-li-element">li</a></code>, <code><a href="semantics.html#the-link-element">link</a></code>,

								   <code><a href="obsolete.html#listing">listing</a></code>, <code><a href="obsolete.html#the-marquee-element">marquee</a></code>, <code><a href="interactive-elements.html#the-menu-element">menu</a></code>,

								   <code><a href="semantics.html#the-meta-element">meta</a></code>, <code><a href="sections.html#the-nav-element">nav</a></code>, <code><a href="obsolete.html#noembed">noembed</a></code>,

								   <code><a href="obsolete.html#noframes">noframes</a></code>, <code><a href="scripting-1.html#the-noscript-element">noscript</a></code>, <code><a href="the-iframe-element.html#the-object-element">object</a></code>,

								   <code><a href="grouping-content.html#the-ol-element">ol</a></code>, <code><a href="grouping-content.html#the-p-element">p</a></code>, <code><a href="the-iframe-element.html#the-param-element">param</a></code>,

								   <code><a href="obsolete.html#plaintext">plaintext</a></code>, <code><a href="grouping-content.html#the-pre-element">pre</a></code>, <code><a href="scripting-1.html#the-script-element">script</a></code>,

								   <code><a href="sections.html#the-section-element">section</a></code>, <code><a href="the-button-element.html#the-select-element">select</a></code>, <code><a href="semantics.html#the-style-element">style</a></code>,

								   <code><a href="interactive-elements.html#the-summary-element">summary</a></code>, <code><a href="tabular-data.html#the-table-element">table</a></code>, <code><a href="tabular-data.html#the-tbody-element">tbody</a></code>,

								   <code><a href="tabular-data.html#the-td-element">td</a></code>, <code><a href="the-button-element.html#the-textarea-element">textarea</a></code>, <code><a href="tabular-data.html#the-tfoot-element">tfoot</a></code>,

								   <code><a href="tabular-data.html#the-th-element">th</a></code>, <code><a href="tabular-data.html#the-thead-element">thead</a></code>, <code><a href="semantics.html#the-title-element">title</a></code>,

								   <code><a href="tabular-data.html#the-tr-element">tr</a></code>, <code><a href="grouping-content.html#the-ul-element">ul</a></code>, <code><a href="text-level-semantics.html#the-wbr-element">wbr</a></code>, and

								   <code><a href="obsolete.html#xmp">xmp</a></code>; MathML's <code title="">mi</code>, <code title="">mo</code>, <code title="">mn</code>, <code title="">ms</code>, <code title="">mtext</code>, and <code title="">annotation-xml</code>; and SVG's <code title="">foreignObject</code>, <code title="">desc</code>, and

								   <code title="">title</code>.</p></dd>

								   <dt><dfn id="formatting">Formatting</dfn></dt>

								   <dd><p>The following HTML elements are those that end up in the

								   <a href="#list-of-active-formatting-elements">list of active formatting elements</a>: <code><a href="text-level-semantics.html#the-a-element">a</a></code>,

								   <code><a href="text-level-semantics.html#the-b-element">b</a></code>, <code><a href="obsolete.html#big">big</a></code>, <code><a href="text-level-semantics.html#the-code-element">code</a></code>,

								   <code><a href="text-level-semantics.html#the-em-element">em</a></code>, <code><a href="obsolete.html#font">font</a></code>, <code><a href="text-level-semantics.html#the-i-element">i</a></code>,

								   <code><a href="obsolete.html#nobr">nobr</a></code>, <code><a href="text-level-semantics.html#the-s-element">s</a></code>, <code><a href="text-level-semantics.html#the-small-element">small</a></code>,

								   <code><a href="obsolete.html#strike">strike</a></code>, <code><a href="text-level-semantics.html#the-strong-element">strong</a></code>, <code><a href="obsolete.html#tt">tt</a></code>, and

								   <code><a href="text-level-semantics.html#the-u-element">u</a></code>.</p></dd>


								   <dt><dfn id="ordinary">Ordinary</dfn></dt>

								   <dd><p>All other elements found while parsing an HTML

								   document.</p></dd>


								  </dl><p>The <a href="#stack-of-open-elements">stack of open elements</a> is said to <dfn id="has-an-element-in-the-specific-scope" title="has an element in the specific scope">have an element in a

								  specific scope</dfn> consisting of a list of element types <var title="">list</var> when the following algorithm terminates in a

								  match state:</p>


								  <ol><li><p>Initialize <var title="">node</var> to be the <a href="#current-node">current

								   node</a> (the bottommost node of the stack).</p></li>


								   <li><p>If <var title="">node</var> is the target node, terminate in

								   a match state.</p></li>


								   <li><p>Otherwise, if <var title="">node</var> is one of the element

								   types in <var title="">list</var>, terminate in a failure

								   state.</p></li>


								   <li><p>Otherwise, set <var title="">node</var> to the previous

								   entry in the <a href="#stack-of-open-elements">stack of open elements</a> and return to step

								   2. (This will never fail, since the loop will always terminate in

								   the previous step if the top of the stack &#8212; an

								   <code><a href="semantics.html#the-html-element">html</a></code> element &#8212; is reached.)</p></li>


								  </ol><p>The <a href="#stack-of-open-elements">stack of open elements</a> is said to <dfn id="has-an-element-in-scope" title="has an element in scope">have an element in scope</dfn> when

								  it <a href="#has-an-element-in-the-specific-scope">has an element in the specific scope</a> consisting

								  of the following element types:</p>


								  <ul class="brief"><li><code><a href="obsolete.html#the-applet-element">applet</a></code> in the <a href="namespaces.html#html-namespace-0">HTML namespace</a></li>

								   <li><code><a href="tabular-data.html#the-caption-element">caption</a></code> in the <a href="namespaces.html#html-namespace-0">HTML namespace</a></li>

								   <li><code><a href="semantics.html#the-html-element">html</a></code> in the <a href="namespaces.html#html-namespace-0">HTML namespace</a></li>

								   <li><code><a href="tabular-data.html#the-table-element">table</a></code> in the <a href="namespaces.html#html-namespace-0">HTML namespace</a></li>

								   <li><code><a href="tabular-data.html#the-td-element">td</a></code> in the <a href="namespaces.html#html-namespace-0">HTML namespace</a></li>

								   <li><code><a href="tabular-data.html#the-th-element">th</a></code> in the <a href="namespaces.html#html-namespace-0">HTML namespace</a></li>

								   <li><code><a href="obsolete.html#the-marquee-element">marquee</a></code> in the <a href="namespaces.html#html-namespace-0">HTML namespace</a></li>

								   <li><code><a href="the-iframe-element.html#the-object-element">object</a></code> in the <a href="namespaces.html#html-namespace-0">HTML namespace</a></li>

								   <li><code title="">mi</code> in the <a href="namespaces.html#mathml-namespace">MathML namespace</a></li>

								   <li><code title="">mo</code> in the <a href="namespaces.html#mathml-namespace">MathML namespace</a></li>

								   <li><code title="">mn</code> in the <a href="namespaces.html#mathml-namespace">MathML namespace</a></li>

								   <li><code title="">ms</code> in the <a href="namespaces.html#mathml-namespace">MathML namespace</a></li>

								   <li><code title="">mtext</code> in the <a href="namespaces.html#mathml-namespace">MathML namespace</a></li>

								   <li><code title="">annotation-xml</code> in the <a href="namespaces.html#mathml-namespace">MathML namespace</a></li>

								   <li><code title="">foreignObject</code> in the <a href="namespaces.html#svg-namespace">SVG namespace</a></li>

								   <li><code title="">desc</code> in the <a href="namespaces.html#svg-namespace">SVG namespace</a></li>

								   <li><code title="">title</code> in the <a href="namespaces.html#svg-namespace">SVG namespace</a></li>

								  </ul><p>The <a href="#stack-of-open-elements">stack of open elements</a> is said to <dfn id="has-an-element-in-list-item-scope" title="has an element in list item scope">have an element in list

								  item scope</dfn> when it <a href="#has-an-element-in-the-specific-scope">has an element in the specific

								  scope</a> consisting of the following element types:</p>


								  <ul class="brief"><li>All the element types listed above for the <i><a href="#has-an-element-in-scope">has an element

								   in scope</a></i> algorithm.</li>

								   <li><code><a href="grouping-content.html#the-ol-element">ol</a></code> in the <a href="namespaces.html#html-namespace-0">HTML namespace</a></li>

								   <li><code><a href="grouping-content.html#the-ul-element">ul</a></code> in the <a href="namespaces.html#html-namespace-0">HTML namespace</a></li>

								  </ul><p>The <a href="#stack-of-open-elements">stack of open elements</a> is said to <dfn id="has-an-element-in-button-scope" title="has an element in button scope">have an element in button

								  scope</dfn> when it <a href="#has-an-element-in-the-specific-scope">has an element in the specific

								  scope</a> consisting of the following element types:</p>


								  <ul class="brief"><li>All the element types listed above for the <i><a href="#has-an-element-in-scope">has an element

								   in scope</a></i> algorithm.</li>

								   <li><code><a href="the-button-element.html#the-button-element">button</a></code> in the <a href="namespaces.html#html-namespace-0">HTML namespace</a></li>

								  </ul><p>The <a href="#stack-of-open-elements">stack of open elements</a> is said to <dfn id="has-an-element-in-table-scope" title="has an element in table scope">have an element in table

								  scope</dfn> when it <a href="#has-an-element-in-the-specific-scope">has an element in the specific

								  scope</a> consisting of the following element types:</p>


								  <ul class="brief"><li><code><a href="semantics.html#the-html-element">html</a></code> in the <a href="namespaces.html#html-namespace-0">HTML namespace</a></li>

								   <li><code><a href="tabular-data.html#the-table-element">table</a></code> in the <a href="namespaces.html#html-namespace-0">HTML namespace</a></li>

								  </ul><p>The <a href="#stack-of-open-elements">stack of open elements</a> is said to <dfn id="has-an-element-in-select-scope" title="has an element in select scope">have an element in select

								  scope</dfn> when it <a href="#has-an-element-in-the-specific-scope">has an element in the specific

								  scope</a> consisting of all element types <em>except</em> the

								  following:</p>


								  <ul class="brief"><li><code><a href="the-button-element.html#the-optgroup-element">optgroup</a></code> in the <a href="namespaces.html#html-namespace-0">HTML namespace</a></li>

								   <li><code><a href="the-button-element.html#the-option-element">option</a></code> in the <a href="namespaces.html#html-namespace-0">HTML namespace</a></li>

								  </ul><p>Nothing happens if at any time any of the elements in the

								  <a href="#stack-of-open-elements">stack of open elements</a> are moved to a new location in,

								  or removed from, the <code><a href="infrastructure.html#document">Document</a></code> tree. In particular, the

								  stack is not changed in this situation. This can cause, amongst

								  other strange effects, content to be appended to nodes that are no

								  longer in the DOM.</p>


								  <p class="note">In some cases (namely, when <a href="tree-construction.html#adoptionAgency">closing misnested formatting elements</a>),

								  the stack is manipulated in a random-access fashion.</p>
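
								  <p class="note">As a non-normative illustration, the scope definitions above can be sketched in Python roughly as follows. The names <code>Element</code>, <code>kinds</code>, and <code>in_specific_scope</code>, the namespace labels, and the choice to identify the target node by an HTML tag name are assumptions made for this sketch only; they are not defined by this specification.</p>

<pre>
```python
from dataclasses import dataclass

HTML, MATHML, SVG = "html", "mathml", "svg"  # namespace labels for this sketch only


@dataclass(frozen=True)
class Element:
    name: str
    namespace: str = HTML

    @property
    def kind(self):
        # An "element type" here is a (namespace, local name) pair.
        return (self.namespace, self.name)


def kinds(namespace, *names):
    return frozenset((namespace, n) for n in names)


DEFAULT_SCOPE = (
    kinds(HTML, "applet", "caption", "html", "table", "td", "th", "marquee", "object")
    | kinds(MATHML, "mi", "mo", "mn", "ms", "mtext", "annotation-xml")
    | kinds(SVG, "foreignObject", "desc", "title")
)
LIST_ITEM_SCOPE = DEFAULT_SCOPE | kinds(HTML, "ol", "ul")
BUTTON_SCOPE = DEFAULT_SCOPE | kinds(HTML, "button")
TABLE_SCOPE = kinds(HTML, "html", "table")
SELECT_SCOPE_EXCEPTIONS = kinds(HTML, "optgroup", "option")


def in_specific_scope(stack, target, is_boundary):
    """stack[0] is the html element; stack[-1] is the current node.

    target is assumed to be the tag name of an HTML-namespace element.
    """
    for node in reversed(stack):        # steps 1 and 4: walk up from the current node
        if node.kind == (HTML, target):
            return True                 # step 2: match state
        if is_boundary(node.kind):
            return False                # step 3: failure state
    return False                        # reached only if the stack has no html element


def in_scope(stack, target):
    return in_specific_scope(stack, target, DEFAULT_SCOPE.__contains__)


def in_list_item_scope(stack, target):
    return in_specific_scope(stack, target, LIST_ITEM_SCOPE.__contains__)


def in_button_scope(stack, target):
    return in_specific_scope(stack, target, BUTTON_SCOPE.__contains__)


def in_table_scope(stack, target):
    return in_specific_scope(stack, target, TABLE_SCOPE.__contains__)


def in_select_scope(stack, target):
    # Every element type is a boundary except optgroup and option.
    return in_specific_scope(
        stack, target, lambda kind: kind not in SELECT_SCOPE_EXCEPTIONS)
```
</pre>

								  <p class="note">In this sketch, a <code>p</code> element that sits below a <code>td</code> on the stack is not in scope, because the walk from the current node reaches the <code>td</code> boundary first; the enclosing <code>table</code> is still found by <code>in_table_scope</code>, since the target check precedes the boundary check.</p>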


								  <h5 id="the-list-of-active-formatting-elements"><span class="secno">8.2.3.3 </span>The list of active formatting elements</h5>


								  <p>Initially, the <dfn id="list-of-active-formatting-elements">list of active formatting elements</dfn> is

								  empty. It is used to handle mis-nested <a href="#formatting" title="formatting">formatting element tags</a>.</p>


								  <p>The list contains elements in the <a href="#formatting">formatting</a>

								  category, and scope markers. The scope markers are inserted when

								  entering <code><a href="obsolete.html#the-applet-element">applet</a></code> elements, buttons, <code><a href="the-iframe-element.html#the-object-element">object</a></code>

								  elements, marquees, table cells, and table captions, and are used to

								  prevent formatting from "leaking" <em>into</em> <code><a href="obsolete.html#the-applet-element">applet</a></code>

								  elements, buttons, <code><a href="the-iframe-element.html#the-object-element">object</a></code> elements, marquees, and

								  tables.</p>


								  <p class="note">The scope markers are unrelated to the concept of an

								  element being <a href="#has-an-element-in-scope" title="has an element in scope">in

								  scope</a>.</p>


								  <p>In addition, each element in the <a href="#list-of-active-formatting-elements">list of active formatting

								  elements</a> is associated with the token for which it was

								  created, so that further elements can be created for that token if

								  necessary.</p>


								  <p>When the steps below require the UA to <dfn id="push-onto-the-list-of-active-formatting-elements">push onto the list of

								  active formatting elements</dfn> an element <var title="">element</var>, the UA must perform the following steps:</p>


								  <ol><li><p>If there are already three elements in the <a href="#list-of-active-formatting-elements">list of

								   active formatting elements</a> after the last scope marker, if

								   any, or anywhere in the list if there are no scope markers, that

								   have the same tag name, namespace, and attributes as <var title="">element</var>, then remove the earliest such element from

								   the <a href="#list-of-active-formatting-elements">list of active formatting elements</a>. For these

								   purposes, the attributes must be compared as they were when the

								   elements were created by the parser; two elements have the same

								   attributes if all their parsed attributes can be paired such that

								   the two attributes in each pair have identical names, namespaces,

								   and values (the order of the attributes does not matter).</p>


								   <p class="note">This is the Noah's Ark clause, but with three per

								   family instead of two.</p></li>

								   <li><p>Add <var title="">element</var> to the <a href="#list-of-active-formatting-elements">list of active

								   formatting elements</a>.</p></li>


								  </ol><p>When the steps below require the UA to <dfn id="reconstruct-the-active-formatting-elements">reconstruct the

								  active formatting elements</dfn>, the UA must perform the following

								  steps:</p>


								  <ol><li>If there are no entries in the <a href="#list-of-active-formatting-elements">list of active formatting

								   elements</a>, then there is nothing to reconstruct; stop this

								   algorithm.</li>


								   <li>If the last (most recently added) entry in the <a href="#list-of-active-formatting-elements">list of

								   active formatting elements</a> is a marker, or if it is an

								   element that is in the <a href="#stack-of-open-elements">stack of open elements</a>, then

								   there is nothing to reconstruct; stop this algorithm.</li>


								   <li>Let <var title="">entry</var> be the last (most recently added)

								   element in the <a href="#list-of-active-formatting-elements">list of active formatting

								   elements</a>.</li>


								   <li>If there are no entries before <var title="">entry</var> in the

								   <a href="#list-of-active-formatting-elements">list of active formatting elements</a>, then jump to step

								   8.</li>


								   <li>Let <var title="">entry</var> be the entry one earlier than

								   <var title="">entry</var> in the <a href="#list-of-active-formatting-elements">list of active formatting

								   elements</a>.</li>


								   <li>If <var title="">entry</var> is neither a marker nor an element

								   that is also in the <a href="#stack-of-open-elements">stack of open elements</a>, go to step

								   4.</li>


								   <li>Let <var title="">entry</var> be the element one later than

								   <var title="">entry</var> in the <a href="#list-of-active-formatting-elements">list of active formatting

								   elements</a>.</li>


								   <li><a href="tree-construction.html#create-an-element-for-the-token">Create an element for the token</a> for which the

								   element <var title="">entry</var> was created, to obtain <var title="">new element</var>.</li>


								   <li>Append <var title="">new element</var> to the <a href="#current-node">current

								   node</a> and push it onto the <a href="#stack-of-open-elements">stack of open

								   elements</a> so that it is the new <a href="#current-node">current

								   node</a>.</li>


								   <li>Replace the entry for <var title="">entry</var> in the list

								   with an entry for <var title="">new element</var>.</li>


								   <li>If the entry for <var title="">new element</var> in the

								   <a href="#list-of-active-formatting-elements">list of active formatting elements</a> is not the last

								   entry in the list, return to step 7.</li>


								  </ol><p>This has the effect of reopening all the formatting elements that

								  were opened in the current body, cell, or caption (whichever is

								  youngest) that haven't been explicitly closed.</p>


								  <p class="note">The way this specification is written, the

								  <a href="#list-of-active-formatting-elements">list of active formatting elements</a> always consists of

								  elements in chronological order with the least recently added

								  element first and the most recently added element last (except

								  while steps 8 to 11 of the above algorithm are being executed).</p>
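
								  <p class="note">As a non-normative illustration, the two algorithms above (pushing onto the list, and reconstructing the active formatting elements) can be sketched in Python roughly as follows. The <code>Token</code> and <code>Element</code> classes, the <code>MARKER</code> sentinel, and the function names are assumptions made for this sketch; in particular, "create an element for the token" is reduced to constructing a new <code>Element</code> from the stored token, and the stack of open elements is assumed to be non-empty.</p>

<pre>
```python
from dataclasses import dataclass, field

MARKER = object()  # stands in for a scope marker (representation is an assumption)


@dataclass(frozen=True)
class Token:
    name: str
    attrs: tuple = ()          # parsed attributes as tuples, e.g. (("color", "red"),);
                               # a tuple may also carry the attribute's namespace
    namespace: str = "html"


@dataclass(eq=False)           # compared by identity, like DOM nodes
class Element:
    token: Token               # the token this element was created for
    children: list = field(default_factory=list)

    @property
    def name(self):
        return self.token.name


def same_family(a, b):
    # Attribute order does not matter, so compare the attributes as sets.
    return (a.name, a.namespace, frozenset(a.attrs)) == (
        b.name, b.namespace, frozenset(b.attrs))


def push_formatting(afe, element):
    start = 0                  # index just past the last marker, if any
    for i, entry in enumerate(afe):
        if entry is MARKER:
            start = i + 1
    matches = [i for i in range(start, len(afe))
               if same_family(afe[i].token, element.token)]
    if len(matches) >= 3:      # Noah's Ark clause: drop the earliest match
        del afe[matches[0]]
    afe.append(element)


def reconstruct(afe, open_stack):
    def settled(entry):
        return entry is MARKER or entry in open_stack

    if not afe or settled(afe[-1]):                  # steps 1-2: nothing to do
        return
    i = len(afe) - 1                                 # step 3
    while i != 0:                                    # step 4
        i -= 1                                       # step 5
        if settled(afe[i]):                          # step 6
            i += 1                                   # step 7
            break
    for j in range(i, len(afe)):                     # steps 8-11
        new_element = Element(afe[j].token)          # step 8
        open_stack[-1].children.append(new_element)  # step 9
        open_stack.append(new_element)
        afe[j] = new_element                         # step 10
```
</pre>

								  <p class="note">For markup such as <code>&lt;p&gt;&lt;b&gt;&lt;i&gt;one&lt;/p&gt;&lt;p&gt;two</code>, the list still holds the <code>b</code> and <code>i</code> entries once the first paragraph has been closed; calling <code>reconstruct</code> after the second <code>p</code> is opened creates fresh <code>b</code> and <code>i</code> elements nested inside it and swaps them into the list in place of the old entries.</p>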


								  <p>When the steps below require the UA to <dfn id="clear-the-list-of-active-formatting-elements-up-to-the-last-marker">clear the list of

								  active formatting elements up to the last marker</dfn>, the UA must

								  perform the following steps:</p>


								  <ol><li>Let <var title="">entry</var> be the last (most recently added)

								   entry in the <a href="#list-of-active-formatting-elements">list of active formatting elements</a>.</li>


								   <li>Remove <var title="">entry</var> from the <a href="#list-of-active-formatting-elements">list of active

								   formatting elements</a>.</li>


								   <li>If <var title="">entry</var> was a marker, then stop the

								   algorithm at this point. The list has been cleared up to the last

								   marker.</li>


								   <li>Go to step 1.</li>


								  </ol><h5 id="the-element-pointers"><span class="secno">8.2.3.4 </span>The element pointers</h5>


								  <p>Initially, the <dfn id="head-element-pointer"><code title="">head</code> element

								  pointer</dfn> and the <dfn id="form-element-pointer"><code title="">form</code> element

								  pointer</dfn> are both null.</p>


								  <p>Once a <code><a href="semantics.html#the-head-element">head</a></code> element has been parsed (whether

								  implicitly or explicitly) the <a href="#head-element-pointer"><code title="">head</code>

								  element pointer</a> gets set to point to this node.</p>


								  <p>The <a href="#form-element-pointer"><code title="">form</code> element pointer</a>

								  points to the last <code><a href="forms.html#the-form-element">form</a></code> element that was opened and

								  whose end tag has not yet been seen. For historical reasons, it is

								  used to associate form controls with forms even in the face of

								  dramatically bad markup.</p>


								  <h5 id="other-parsing-state-flags"><span class="secno">8.2.3.5 </span>Other parsing state flags</h5>


								  <p>The <dfn id="scripting-flag">scripting flag</dfn> is set to "enabled" if <a href="webappapis.html#concept-n-script" title="concept-n-script">scripting was enabled</a> for the

								  <code><a href="infrastructure.html#document">Document</a></code> with which the parser is associated when the

								  parser was created, and "disabled" otherwise.</p>


								  <p class="note">The <a href="#scripting-flag">scripting flag</a> can be enabled even

								  when the parser was originally created for the <a href="the-end.html#html-fragment-parsing-algorithm">HTML fragment

								  parsing algorithm</a>, even though <code><a href="scripting-1.html#the-script-element">script</a></code> elements

								  don't execute in that case.</p>


								  <p>The <dfn id="frameset-ok-flag">frameset-ok flag</dfn> is set to "ok" when the parser is

								  created. It is set to "not ok" after certain tokens are seen.</p>
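
								  <p class="note">As a non-normative illustration, the initial values of the element pointers and parsing state flags described above could be held in a small record like the following Python sketch. The class name <code>ParserState</code>, its field names, and the use of booleans for the "enabled"/"disabled" and "ok"/"not ok" values are assumptions made for this sketch; which tokens clear the frameset-ok flag is defined elsewhere and is not modelled here.</p>

<pre>
```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParserState:
    # True means "enabled"; fixed from the Document's scripting state
    # at the moment the parser is created.
    scripting_flag: bool
    # True means "ok"; later set to "not ok" after certain tokens are seen.
    frameset_ok: bool = True
    # Both element pointers are initially null.
    head_element_pointer: Optional[object] = None
    form_element_pointer: Optional[object] = None


def create_parser_state(document_scripting_enabled):
    # Hypothetical factory: snapshot the scripting state at parser creation.
    return ParserState(scripting_flag=bool(document_scripting_enabled))
```
</pre>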


								  </div></body></html>