

								<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"

								    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

								<html xmlns="http://www.w3.org/1999/xhtml">

								<head>

								<meta name="generator" content="HTML Tidy, see www.w3.org" />

								<title>Natural Language Semantics Markup Language: W3C Working

								Draft</title>

								<style type="text/css">

								body {

								font-family: sans-serif;

								margin-left: 10%;

								margin-right: 5%;

								font-family: Tahoma, Verdana, "Myriad Web", Syntax, sans-serif;

								}

								h1,h2,h3,h4,h5,h6 {

								 color: rgb(0,92,160);

								 font-weight: normal;

								 margin-left: -4%;

								  }

								img {

								    border-width: 0;

								    color: white;

								  }

								h1 { clear: both; margin-top: 2em }

								div.navbar { margin-bottom: 1em }

								div.head { margin-bottom: 1em }

								p.copyright {font-size: 70% }

								span.term { color: rgb(0,0,192); font-style: italic }

								blockquote {margin-left: 4% }

								.toc {

								  list-style: none;

								  marker-offset: 1em;

								  }

								.tocline { list-style: none }

								ul.toc a { text-decoration: none }

								.fig { text-align: center }

								pre {

								    background-color: rgb(204,204,255);

								    border: none;

								    margin-left: 0;

								    margin-right: 0;

								    font-family: monospace;

								    padding: 0.5em;

								    white-space: pre;

								    width: 100%

								  }

								code {

								    color: green;

								    font-family: monospace;

								    font-weight: bold

								  }

								code.greenmono {

								    color: green;

								    font-family: monospace;

								    font-weight: bold

								  }

								.good {

								    border-bottom: green 2px solid;

								    border-left: green 2px solid;

								    border-right: green 2px solid;

								    border-top: green 2px solid;

								    color: green;

								    font-weight: bold;

								    margin: 1em 5% 1em 0px

								  }

								.bad {

								    border-bottom: red 2px solid;

								    border-left: red 2px solid;

								    border-right: red 2px solid;

								    border-top: red 2px solid;

								    color: rgb(192,101,101);

								    margin: 1em 5% 1em 0px

								  }

								div.navbar { text-align: center }

								div.contents {

								    background-color: rgb(204,204,255);

								    border-bottom: medium none;

								    border-left: medium none;

								    border-right: medium none;

								    border-top: medium none;

								    margin-right: 5%;

								    padding: 0.5em;

								  }

								.tocline { list-style: none }

								table.exceptions { background-color: rgb(255,255,153) }

								.diff { color: rgb(128,0,0) }

								.issues { color: green; font-style: italic }

								.reqs { color: blue; font-style: italic }

								</style>


								<link rel="stylesheet" type="text/css"

								href="http://www.w3.org/StyleSheets/TR/W3C-WD" />

								</head>

								<body>

								<div class="head">

								<div class="banner"><a href="http://www.w3.org/"><img

								src="http://www.w3.org/Icons/WWW/w3c_home" alt="W3C"

								border="0" /></a></div>


								<h1 class="notoc">Natural Language Semantics Markup Language for

								the Speech Interface Framework</h1>


								<h2 class="notoc">W3C Working Draft <i>20 November 2000</i></h2>


								<dl>

								<dt>This version</dt>


								<dd><a

								href="http://www.w3.org/TR/2000/WD-nl-spec-20001120/">http://www.w3.org/TR/2000/WD-nl-spec-20001120</a></dd>


								<dt>Latest version</dt>


								<dd><a

								href="http://www.w3.org/TR/nl-spec/">http://www.w3.org/TR/nl-spec</a></dd>


								<dt><br />

								Previous versions:</dt>


								<dd><i>None - this is the first public version.</i></dd>


								<dt><br />

								Editor:</dt>


								<dd>Deborah A. Dahl, Unisys</dd>

								</dl>


								<p class="copyright"><a

								href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">Copyright</a>

								&#169;2000 <a href="http://www.w3.org/"><abbr

								title="World Wide Web Consortium">W3C</abbr></a><sup>&#174;</sup>

								(<a href="http://www.lcs.mit.edu/"><abbr

								title="Massachusetts Institute of Technology">MIT</abbr></a>, <a

								href="http://www.inria.fr/"><abbr lang="fr"

								title="Institut National de Recherche en Informatique et Automatique">

								INRIA</abbr></a>, <a href="http://www.keio.ac.jp/">Keio</a>), All

								Rights Reserved. W3C <a

								href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">

								liability</a>, <a

								href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">

								trademark</a>, <a

								href="http://www.w3.org/Consortium/Legal/copyright-documents-19990405">

								document use</a> and <a

								href="http://www.w3.org/Consortium/Legal/copyright-software-19980720">

								software licensing</a> rules apply.</p>


								<hr />

								</div>

								<h2 class="notoc">Abstract</h2>


								<p>The W3C Voice Browser working group aims to develop

								specifications to enable access to the Web using spoken

								interaction. This document is part of a set of specifications for

								voice browsers, and provides details of an XML markup language

								for describing the meanings of individual natural language

								utterances. It is expected to be automatically generated by

								semantic interpreters for use by components that act on the

								user's utterances, such as dialog managers.</p>


								<h2>Status of this Document</h2>


								<p>This document is a W3C Working Draft for review by W3C members

								and other interested parties. It is a draft document and may be

								updated, replaced, or obsoleted by other documents at any time.

								It is inappropriate to use W3C Working Drafts as reference

								material or to cite them as other than "work in progress". A list

								of current public W3C Working Drafts can be found at <a

								href="http://www.w3.org/TR/">http://www.w3.org/TR</a>.</p>


								<p>This specification describes markup for representing natural

								language semantics, and forms part of the proposals for the W3C

								Speech Interface Framework. This document has been produced as

								part of the <a href="http://www.w3.org/Voice/">W3C Voice Browser

								Activity</a>, following the procedures set out for the <a

								href="http://www.w3.org/Consortium/Process/">W3C Process</a>. The

								authors of this document are members of the <a

								href="http://www.w3.org/Voice/Group/">Voice Browser Working

								Group</a> (W3C Members only). This document is for public review,

								and comments and discussion are welcomed on the public mailing

								list &lt;<a

								href="mailto:www-voice@w3.org">www-voice@w3.org</a>&gt;. To

								subscribe, send an email to &lt;<a

								href="mailto:www-voice-request@w3.org">www-voice-request@w3.org</a>&gt;

								with the word <tt>subscribe</tt> in the subject line (include the

								word unsubscribe if you want to unsubscribe). The <a

								href="http://lists.w3.org/Archives/Public/www-voice/">archive</a>

								for the list is accessible online.</p>


								<!--

								<h2>Process</h2>


								<p>The specification development process will consist of the

								following steps:</p>


								<ol>

								<li>Collect requirements on natural language markup, prioritize

								those requirements and solicit public input.<br />

								 [Status: Requirements and priorities completed. Public feedback

								in process.]</li>


								<li>Analyze existing natural language markup languages against

								requirements and determine the starting point for specification

								development.<br />

								 [Status: The committee was unable to discover an existing

								XML-based semantics markup. The   XML format described in

								this document was prepared by the Voice Browser Working

								Group.]</li>


								<li>Develop a specification based on the requirements for

								delivery to the W3C Voice Browser Working Group. Iterate

								specification through review and discussion by the working

								group.<br />

								 [Status: initial draft complete.]</li>


								<li>Agreement by committee on public release draft followed by

								public review.<br />

								 [Status: agreed by committee vote.]</li>

								</ol>

								-->

								<h2>General Issues</h2>


								<p>The NL semantics representation uses the data models of the <a

								href="http://www.w3.org/TR/2000/WD-xforms-datamodel-20000406">W3C

								XForms</a> draft specification to represent application-specific

								semantics. While XForms syntax may change in future revisions of

								the specification, it is not expected to change in ways that

								affect the NL Semantics Markup Language significantly.</p>


								<h2 class="notoc">Table of Contents</h2>


								<ul class="toc">

								<li>1. <a href="#intro">Introduction</a>


								<ul class="tocline">

								<li>1.1 <a href="#uses">Uses</a></li>


								<li>1.2 <a href="#markup">Markup Functions</a></li>


								<li>1.3 <a href="#overview">Overview of Elements and

								their Relationships</a></li>

								</ul>

								</li>


								<li>2. <a href="#elements">Elements and Attributes</a>


								<ul class="tocline">

								<li>2.1 <a href="#result">"result" Root Element</a></li>


								<li>2.2 <a href="#interpret">"interpretation"

								Element</a></li>


								<li>2.3 <a href="#model">"model" Element</a></li>


								<li>2.4 <a href="#instance">"instance" Element</a></li>


								<li>2.5 <a href="#input">"input" Element</a></li>


								<li>2.6 <a href="#nomatch">"nomatch" Element</a></li>


								<li>2.7 <a href="#noinput">"noinput" Element</a></li>


								<li>2.8 <a href="#meta">Interpreting Meta-Dialog and Meta-Task

								Utterances</a></li>


								<li>2.9 <a href="#anaphora">Anaphora and Deixis</a></li>

								</ul>

								</li>


								<li>3. <a href="#ext">Extensibility</a></li>


								<li>4. <a href="#compliance">Compliance</a></li>


								<li>5. <a href="#dtd">Document Type Definition</a></li>


								<li>6. <a href="#examples">Examples</a>


								<ul class="tocline">

								<li>6.1 <a href="#simple">Simple Ambiguity</a></li>


								<li>6.2 <a href="#mixed">Mixed Initiative</a></li>


								<li>6.3 <a href="#dtmf">DTMF</a></li>

								</ul>

								</li>


								<li>7. <a href="#study">Future Study</a>


								<ul class="tocline">

								<li>7.1 <a href="#ambig">Representation of Ambiguities</a></li>


								<li>7.2 <a href="#source">Representation of the Source of an

								Ambiguity</a></li>


								<li>7.3 <a href="#dialog">Representing Information Collected over

								the Course of a Dialog</a></li>


								<li>7.4 <a href="#compos">Composition of Multiple Data Models

								within One Utterance</a></li>


								<li>7.5 <a href="#multi">Representation of Multi-modal

								Input</a></li>


								<li>7.6 <a href="#xforms">Extensibility of XForms Data

								Models</a></li>


								<li>7.7 <a href="#recurse">Representation of Recursive

								Structures</a></li>


								<li>7.8 <a href="#unanalyzed">Representing Unanalyzed

								Information: "unanalyzed" Element</a></li>

								</ul>

								</li>


								<li>8. <a href="#acks">Acknowledgements</a></li>

								</ul>


								<h2><a id="intro" name="intro">1. Introduction</a></h2>


								<p>This document presents an XML specification for a Natural

								Language Semantics Markup Language, responding to the

								requirements documented in <a

								href="http://www.w3.org/TR/voice-nlu-reqs/">W3C Natural Language

								Processing Requirements for Voice Browsers</a>. This markup

								language is intended for use by systems that provide semantic

								interpretations for a variety of inputs, including, but not

								necessarily limited to, speech and natural language text input.

								These systems include Voice Browsers, web browsers and accessible

								applications.</p>


								<p>It is expected that this markup will be used primarily as a

								standard data interchange format between Voice Browser

								components; in particular, it will normally be automatically

								generated by a semantic interpretation component to represent the

								semantics of users' utterances and will not be directly authored

								by developers.</p>


								<p>The language is focused on representing the semantic

								information of a single utterance, as opposed to (possibly

								identical) information that might have been collected over the

								course of a dialog. See the Future Study section for a detailed

								discussion of returning information from a dialog.</p>


								<p>The language provides a set of elements that are focused on

								accurately representing the semantics of a natural language

								input. The following are the key design criteria.</p>


								<ul>

								<li>

								<p><em>Fidelity:</em> The representation should be capable of

								accurately reflecting the user's intended meaning in terms of the

								application's goals. However, it should also provide a semantic

								interpreter with the means to represent vagueness and ambiguity

								when the user's meaning cannot be fully determined with the

								information available to the semantic interpreter.</p>

								</li>


								<li>

								<p><em>Interoperability:</em> The representation should support

								use along with other W3C specifications including (but not

								limited to) the Dialog Markup Language, <a

								href="http://www.w3.org/TR/grammar-spec">Speech Grammar Markup

								Language</a>, <a href="http://www.w3.org/AudioVideo/">SMIL</a>

								and <a

								href="http://www.w3.org/TR/2000/WD-xforms-datamodel-20000406">XForms.</a></p>

								</li>


								<li>

								<p><em>Implementability:</em> The required elements of the

								specification should be implementable with existing, generally

								available technology.</p>

								</li>


								<li>

								<p><i>Extensibility:</i> The specification should be extensible

								to accommodate emerging and future capabilities of

								automatic speech recognizers (ASRs), natural language

								interpreters, and voice browsers. For example, it should be

								compatible with statistical ASRs, mixed initiative dialogs and

								multi-modal components.</p>

								</li>


								<li>

								<p><i>Architectural Neutrality:</i> The specification should,

								wherever possible, avoid provisions that imply a commitment

								to a particular Voice Browser architecture, for example, to

								whether multi-modal integration takes place before or

								after natural language interpretation.</p>

								</li>


								<li>

								<p><i>Portability:</i> The specification should be able to

								support consistent behavior across platforms.</p>

								</li>

								</ul>


								<p>This specification includes a set of draft <a

								href="#elements">elements and attributes</a> and a <a

								href="#dtd">draft DTD</a>.</p>


								<h3><a id="uses" name="uses">1.1 Uses</a></h3>


								<p>The general purpose of the NL Semantics Markup is to represent

								information automatically extracted from a user's utterances by a

								semantic interpretation component, where <i>utterance</i> is to

								be taken in the general sense of a meaningful user input in any

								modality supported by the platform. Referring to the sample Voice

								Browser architecture in <a

								href="http://www.w3.org/Voice/Group/2000/voice-intro-20000911.html">

								Introduction and Overview of the W3C Speech Interface

								Framework</a>, a specific architecture can take advantage of this

								representation by using it to convey content among various system

								components that generate and make use of the markup.</p>


								<p>Components that generate NL Semantics Markup:</p>


								<ol>

								<li>ASR</li>


								<li>Natural language understanding</li>


								<li>Other input media interpreters (e.g. DTMF, pointing,

								keyboard)</li>


								<li>Reusable dialog component</li>


								<li>Multimedia integration component</li>

								</ol>


								<p>Components that use NL Semantics Markup:</p>


								<ol>

								<li>Dialog manager</li>


								<li>Multimedia integration component</li>

								</ol>


								<p>A platform may also choose to use this general format as the

								basis of a general semantic result that is carried along and

								filled out during each stage of processing. In addition, future

								systems may also potentially make use of this markup to convey

								abstract semantic content to be rendered into natural language by

								a natural language generation component.</p>


								<h3><a id="markup" name="markup">1.2 Markup Functions</a></h3>


								<p>A semantic interpretation system that supports the Natural

								Language Semantics Markup Language is responsible for

								interpreting natural language inputs and formatting the

								interpretation as defined in this document. Semantic

								interpretation is typically either included as part of the speech

								recognition process, or involves one or more additional

								components, such as natural language interpretation components

								and dialog interpretation components. See the Voice Browser

								Architecture described in <a

								href="http://www.w3.org/TR/voice-intro/">http://www.w3.org/TR/voice-intro/</a>

								for a sample architecture.</p>


								<p>The elements of the markup fall into the following general

								functional categories:</p>


								<p><em>Input formats and ASR information:</em></p>


								<p>The "<a href="#input">input</a>" element, representing the

								input to the semantic interpreter.</p>


								<p><i>Interpretation:</i></p>


								<p>Elements and attributes representing the semantics of the

								user's utterance, including the "<a href="#result">result</a>",

								"<a href="#interpret">interpretation</a>", "<a

								href="#model">model</a>", and "<a href="#instance">instance</a>"

								elements. The "result" element contains the full result of

								processing one utterance. It may contain multiple

								"interpretation" elements if the interpretation of the utterance

								results in multiple alternative meanings due to uncertainty in

								speech recognition or natural language understanding. There are

								at least two reasons for providing multiple interpretations:</p>


								<ol>

								<li>another component, such as a dialog manager, might have

								additional information, for example, information from a database,

								that would allow it to select a preferred interpretation from

								among the possible interpretations returned from the semantic

								interpreter.</li>


								<li>a dialog manager that was unable to select between several

								competing interpretations could use this information to go back

								to the user and find out what was intended. For example, <i>Did

								you say "Boston" or "Austin"?</i></li>

								</ol>
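
								<p>As a rough sketch of the second case, a result for an utterance
								that could be heard as either "Boston" or "Austin" might carry two
								alternative interpretations, listed best-first. The grammar and
								data model URIs, the "flight" and "to_city" element names, and the
								confidence values are hypothetical and serve only as
								illustration.</p>

								<pre>
								&lt;result grammar="http://theCityGrammar"
								 x-model="http://theFlightModel"
								 xmlns:xf="http://www.w3.org/2000/xforms"&gt;
								  &lt;!-- first alternative: higher confidence, so listed first --&gt;
								  &lt;interpretation confidence="60"&gt;
								    &lt;xf:instance&gt;
								      &lt;flight&gt;
								        &lt;to_city&gt;Boston&lt;/to_city&gt;
								      &lt;/flight&gt;
								    &lt;/xf:instance&gt;
								    &lt;input&gt;to Boston&lt;/input&gt;
								  &lt;/interpretation&gt;
								  &lt;!-- second alternative for the same utterance --&gt;
								  &lt;interpretation confidence="40"&gt;
								    &lt;xf:instance&gt;
								      &lt;flight&gt;
								        &lt;to_city&gt;Austin&lt;/to_city&gt;
								      &lt;/flight&gt;
								    &lt;/xf:instance&gt;
								    &lt;input&gt;to Austin&lt;/input&gt;
								  &lt;/interpretation&gt;
								&lt;/result&gt;
								</pre>

								<p>A dialog manager receiving such a result could either choose
								between the two alternatives using its own knowledge or ask the
								user a clarifying question.</p>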


								<p>The "model" is an XForms data model for the semantic

								information being returned in the interpretation. The "model" is

								a structured representation of the interpretation and allows for

								type checking. The "instance" is an instantiation of the data

								model containing the semantic information for a specific

								interpretation of a specific utterance. For example, the

								information in a travel application might include three groups of

								information: flights, car rental and hotels. The flight

								information, in turn, could contain values for "to_city",

								"from_city", "departure_date" and so on, which would be typed as

								strings.</p>
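
								<p>The travel example might look roughly as follows. This is only
								a sketch: the "xf:group" and "xf:string" element names are assumed
								from the XForms data model draft and may not match its exact
								syntax, and the group names, slot names and values are
								hypothetical.</p>

								<pre>
								&lt;xf:model&gt;
								  &lt;xf:group name="travel"&gt;
								    &lt;xf:group name="flight"&gt;
								      &lt;xf:string name="to_city"/&gt;
								      &lt;xf:string name="from_city"/&gt;
								      &lt;xf:string name="departure_date"/&gt;
								    &lt;/xf:group&gt;
								    &lt;!-- slots of the remaining groups omitted --&gt;
								    &lt;xf:group name="car_rental"/&gt;
								    &lt;xf:group name="hotel"/&gt;
								  &lt;/xf:group&gt;
								&lt;/xf:model&gt;

								&lt;xf:instance&gt;
								  &lt;travel&gt;
								    &lt;flight&gt;
								      &lt;to_city&gt;Boston&lt;/to_city&gt;
								      &lt;from_city&gt;Denver&lt;/from_city&gt;
								      &lt;departure_date&gt;Tuesday&lt;/departure_date&gt;
								    &lt;/flight&gt;
								  &lt;/travel&gt;
								&lt;/xf:instance&gt;
								</pre>

								<p>The instance fills in only the flight group, since the
								utterance being interpreted says nothing about car rental or
								hotels.</p>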


								<p><i>Side Information:</i></p>


								<p>Elements and attributes representing additional information

								about the interpretation, over and above the interpretation

								itself. Side information includes</p>


								<ol>

								<li>

								<p>Whether an interpretation was achieved (the "nomatch" element)

								and the system's confidence in an interpretation (the

								"confidence" attribute of "interpretation").</p>

								</li>


								<li>

								<p>Alternative interpretations ("<a

								href="#interpret">interpretation</a>")</p>

								</li>

								</ol>


								<p><i>Multi-modal integration:</i></p>


								<p>When more than one modality is available for input, the

								interpretation of the inputs needs to be coordinated. The "mode"

								attribute of "<a href="#input">input</a>" supports this by

								indicating whether the utterance was input by speech, DTMF,

								pointing, etc. The timestamp attributes of "input" also provide

								for temporal coordination by indicating when inputs occurred.</p>
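
								<p>As an illustration, a spoken input might be annotated as shown
								below. The "timestamp-start" and "timestamp-end" attribute names
								are those described for the "input" element in section 2.5; the
								spelling of the "mode" value, the times and the utterance text are
								assumptions made for this sketch.</p>

								<pre>
								&lt;input mode="speech"
								 timestamp-start="2000-11-20T10:15:02"
								 timestamp-end="2000-11-20T10:15:04"&gt;
								  I want to fly to Boston
								&lt;/input&gt;
								</pre>

								<p>A multi-modal integration component could compare these times
								with those of an input from another modality to decide whether the
								two inputs belong together.</p>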


								<h3><a id="overview" name="overview">1.3 Overview of Elements and

								their Relationships</a></h3>


								<p>This figure shows a graphical view of the relationships among

								the elements of the Natural Language Semantics markup.</p>


								<p style="MARGIN-LEFT: -10%"><img alt="" border="0"

								src="nl-spe8.gif" width="537" height="360" /></p>


								<p>The elements shown in the graphic fall into two

								categories:</p>


								<ol>

								<li>description of the input to be processed; shown in the left

								box, "incoming data" in blue.</li>


								<li>description of the meaning which was extracted from the

								input; shown in the right box, "meaning", in yellow.</li>

								</ol>


								<p>Next to each element in the graphic are its attributes in

								italics. In addition, some elements can contain multiple

								instances of other elements. For example, a "result" can contain

								multiple "interpretations", each of which is taken to be an

								alternative. The element "xf:model" is an XForms data model as

								specified in the <a

								href="http://www.w3.org/TR/xforms-datamodel/">XForms data

								model</a> draft, and therefore is not defined in this

								document.</p>


								<p>To illustrate the basic usage of these elements, consider the

								utterance <i>ok</i> (interpreted as "yes"). The following example

								shows how that utterance and its interpretation would be

								represented in the NL Semantics markup.</p>


								<pre>

								&lt;result x-model="http://theYesNoModel"

								 xmlns:xf="http://www.w3.org/2000/xforms"

								 grammar="http://theYesNoGrammar"&gt;

								  &lt;interpretation&gt;

								    &lt;xf:instance&gt;

								      &lt;myApp:yes_no&gt;

								        &lt;response&gt;yes&lt;/response&gt;

								      &lt;/myApp:yes_no&gt;

								    &lt;/xf:instance&gt;

								    &lt;input&gt;ok&lt;/input&gt;

								  &lt;/interpretation&gt;

								&lt;/result&gt;

								</pre>


								<p>This example includes only the minimum required information,

								i.e., it does not include any of the optional information defined

								in this document. There is an overall "result" element which

								includes one interpretation. The data model is defined externally

								by referring to the URI for "theYesNoModel". This external model

								defines a "response" element. The "myApp" namespace refers to the

								application-specific elements that are defined by the XForms data

								model.</p>


								<h2><a id="elements" name="elements">2. Elements and

								Attributes</a></h2>


								<h3><a id="result" name="result">2.1 "result" Root

								Element</a></h3>


								<h3>Attributes: grammar, x-model, xmlns</h3>


								<p>The root element of the markup is "result". The "result"

								element includes one or more "<a

								href="#interpret">interpretation</a>" elements. Multiple

								interpretations result from ambiguities in the input or in the

								semantic interpretation. If the "grammar", "x-model", and "xmlns"

								attributes do not apply to all of the interpretations in the

								result, they can be overridden for individual interpretations at

								the "interpretation" level.</p>


								<p>Attributes:</p>


								<ol>

								<li><b>grammar:</b> The grammar or recognition rule matched by

								this result. (The format of the grammar attribute will match the

								rule reference semantics defined in the <a

								href="http://www.w3.org/TR/grammar-spec">grammar

								specification.</a>) The grammar can be overridden by a grammar

								attribute in the "interpretation" element if the input was

								ambiguous as to which grammar it matched.</li>


								<li><b>x-model:</b> The URI of the XForms data model

								used for this result. The data model used by the interpretation

								can be specified either here or by an in-line data model using

								the "<a href="#model">model</a>" element. The x-model can be

								overridden by an x-model attribute in the "interpretation"

								element if the input was ambiguous as to which x-model it

								matched. (optional)</li>


								<li><b>xmlns:</b> An XML namespace declaration is required to

								define the namespace used by XForms elements and attributes. The

								DTD defaults the "xmlns" namespace declaration to a standard

								location, since it will rarely change.</li>

								</ol>


								<pre>

								&lt;result grammar="http://grammar" x-model="http://dataModel"

								xmlns:xf="http://www.w3.org/2000/xforms"&gt;

								  &lt;interpretation/&gt;

								&lt;/result&gt;

								</pre>


								<h3><a id="interpret" name="interpret">2.2 "interpretation"

								Element</a></h3>


								<h3>Attributes: confidence, grammar, x-model, xmlns</h3>


								<p>An "interpretation" element contains a single semantic

								interpretation.</p>


								<p>Attributes:</p>


								<ol>

								<li><b>confidence:</b> an integer from 0-100 indicating the

								semantic analyzer's confidence in this interpretation. At this

								point there is no formal, platform-independent, definition of

								confidence. (optional)</li>


								<li><b>grammar:</b> The grammar or recognition rule matched by

								this interpretation. The dialog markup

								interpreter needs to know the grammar rule that is matched by the

								utterance because multiple rules may be simultaneously active.

								The value that is filled in is the grammar URI used by the dialog

								markup interpreter to specify the grammar. The format of the

								grammar attribute will match the rule reference semantics defined

								in the <a href="http://www.w3.org/TR/grammar-spec">grammar

								specification</a>. Specifically, the rule reference will be in

								the <a href="http://www.w3.org/TR/grammar-spec#S2.2">external XML

								form for grammar rule references</a>. This attribute is only

								needed under "interpretation" if it is necessary to override a

								grammar that was defined at the "result" level. (optional)</li>


								<li><b>x-model:</b> The location of the XForms data model used

								for this interpretation. The XForms data model used by the

								interpretation may be specified either here or by an in-line data

								model using the "<a href="#model">model</a>" element. (As in the

								case of "grammar", this attribute only needs to be defined under

								"interpretation" if it is necessary to override the x-model

								specification at the "result" level.) (optional)</li>

								</ol>


								<p>Interpretations must be sorted best-first by some measure of

								"goodness". The goodness measure is "confidence" if present,

								otherwise, it is some platform-specific indication of

								quality.</p>


								<p>The x-model and grammar are expected to be specified most

								frequently at the "result" level, because most often one data

								model will be sufficient for the entire result. However, they can

								be overridden at the "interpretation" level because

								different interpretations may have different data

								models, perhaps because they match different grammar rules.</p>
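
								<p>The following sketch shows one way such an override might
								look. The first interpretation inherits the grammar and data model
								given on "result", while the second supplies its own. All URIs and
								confidence values are hypothetical.</p>

								<pre>
								&lt;result grammar="http://grammar" x-model="http://dataModel"
								 xmlns:xf="http://www.w3.org/2000/xforms"&gt;
								  &lt;!-- uses the grammar and x-model declared on "result" --&gt;
								  &lt;interpretation confidence="75"&gt;
								    ...
								  &lt;/interpretation&gt;
								  &lt;!-- matched a different rule, so overrides both attributes --&gt;
								  &lt;interpretation confidence="50"
								   grammar="http://otherGrammar"
								   x-model="http://otherDataModel"&gt;
								    ...
								  &lt;/interpretation&gt;
								&lt;/result&gt;
								</pre>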


								<p>The "interpretation" element includes an "<a

								href="#input">input</a>" element which contains the input being

								analyzed, optionally a "<a href="#model">model</a>" element

								defining the XForms data model and an "<a

								href="#instance">instance</a>" element containing the

								instantiation of the data model for this utterance. The data

								model would be empty if the interpreter was not able to produce

								any interpretation.</p>


								<pre>

								   &lt;interpretation confidence="75" grammar="http://grammar"

								    x-model="http://dataModel"

								    xmlns:xf="http://www.w3.org/2000/xforms"&gt;

								    ...

								   &lt;/interpretation&gt;

								</pre>


								<h3><a id="model" name="model">2.3 "model" Element</a></h3>


								<p>The "model" element contains an XForms data model for the data

								and is part of the XForms namespace. The XForms data model

								provides for a structured data model consisting of groups, which

								may contain other groups or simple types. Simple types can be one

								of: string, boolean, number, monetary values, date, time of day,

								duration, URI, binary. For further information on XForms data

								models see the <a

								href="http://www.w3.org/TR/2000/WD-xforms-datamodel-20000406">XForms

								data model specification</a>. Note that XForms fields default to

								optional.</p>


								<p>If no data model is supplied by either the "model" element or

								the "x-model" attribute, then it is assumed that the data model

								will be provided by the dialog (or whatever other process

								receives the NL semantics markup).</p>


								<p>It is an error to specify both an x-model attribute and a

								"model" element.</p>


								<p>Example: An XForms data model for name and address.</p>


								<pre>

								&lt;model&gt;

								  &lt;xf:group name="nameAddress"&gt;

								      &lt;string name="name"/&gt;

								      &lt;string name="street"/&gt;

								      &lt;string name="city"/&gt;

								      &lt;string name="state"/&gt;

								      &lt;string name="zip"&gt;

								        &lt;mask&gt;ddddd&lt;/mask&gt;

								      &lt;/string&gt;

								  &lt;/xf:group&gt;

								&lt;/model&gt;

								</pre>


								<h3><a id="instance" name="instance">2.4 "instance"

								Element</a></h3>


								<p>The "instance" element contains an instance of the XForms data

								model for the data and is part of the XForms namespace.</p>


								<p>Attributes:</p>


								<ol>

								<li><b>confidence:</b> All elements of the data instance may have

								an optional confidence attribute, defined in the NL semantics

								namespace. The confidence attribute contains an integer value in

								the range 0 to 100 reflecting the system's confidence in the

								analysis of that slot. The meaning of confidence scores has not

								been defined in a platform-independent way. (optional)</li>

								</ol>


								<p>The use of a confidence attribute from the NL semantics

								namespace does not appear to present any document validation

								problems. However, if future XForms specifications support an

								equivalent attribute then that would be preferable to the current

								proposal.</p>


								<pre>


								&lt;xf:instance name="nameAddress"&gt;

								  &lt;nameAddress&gt;

								      &lt;street confidence="75"&gt;123 Maple Street&lt;/street&gt;

								      &lt;city&gt;Mill Valley&lt;/city&gt;

								      &lt;state&gt;CA&lt;/state&gt;

								      &lt;zip&gt;90952&lt;/zip&gt;

								  &lt;/nameAddress&gt;

								&lt;/xf:instance&gt;

								&lt;input&gt;

								  My address is 123 Maple Street,

								  Mill Valley, California, 90952

								&lt;/input&gt;

								</pre>


								<h3><a id="input" name="input">2.5 "input" Element</a></h3>


								<p>The "input" element is the text representation of a user's

								input. It includes an optional "confidence" attribute which

								indicates the recognizer's confidence in the recognition result

								(not the confidence in the interpretation, which is indicated by

								the "confidence" attribute of "interpretation"). Optional

								"timestamp-start" and "timestamp-end" attributes indicate the

								start and end times of a spoken utterance, in ISO 8601 format (<a

								href="http://www.iso.ch/markete/8601.pdf">http://www.iso.ch/markete/8601.pdf</a>

								).</p>


								<p>Attributes:</p>


								<ol>

								<li><b>timestamp-start:</b> The time at which the input began.

								(optional)</li>


								<li><b>timestamp-end:</b> The time at which the input ended.

								(optional)</li>


								<li><b>mode:</b> The modality of the input, for example, speech,

								dtmf, etc. (optional)</li>


								<li><b>confidence:</b> The confidence of the recognizer in the

								correctness of the input. (optional)</li>

								</ol>


								<p>Note that it doesn't make sense for temporally overlapping

								inputs to have the same mode; however, this constraint is not

								expected to be enforced by platforms.</p>


								<p>When there is no time zone designator, ISO 8601 time

								representations default to local time.</p>
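

								<p>As an illustration, a platform that wants to avoid the

								local-time default can append an ISO 8601 time zone designator.

								The sketch below assumes the utterance was timestamped in UTC,

								written with the "Z" designator; the times themselves are

								invented for the example.</p>


								<pre>

								  &lt;input mode="speech" confidence="100"

								     timestamp-start="2000-04-03T00:00:00Z"

								     timestamp-end="2000-04-03T00:00:00.6Z"&gt;onions&lt;/input&gt;

								</pre>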


								<p>There are three possible formats for the "input" element.</p>


								<p>a) The "input" element can contain simple text:</p>


								<pre>

								  &lt;input confidence="100" mode="speech"&gt;onions&lt;/input&gt;

								</pre>


								<p>b) The "input" element can also contain additional "input"

								elements. Having additional input elements allows the

								representation to support future multi-modal inputs as well as

								finer-grained speech information, such as timestamps for

								individual words and word-level confidences.</p>


								<pre>

								&lt;input&gt;

								   &lt;input mode="speech" confidence="50"

								     timestamp-start="2000-04-03T00:00:00"

								     timestamp-end="2000-04-03T00:00:00.2"&gt;fried&lt;/input&gt;

								   &lt;input mode="speech" confidence="100"

								     timestamp-start="2000-04-03T00:00:00.25"

								     timestamp-end="2000-04-03T00:00:00.6"&gt;onions&lt;/input&gt;

								&lt;/input&gt;

								</pre>


								<p>c) Finally, the "input" element can contain "nomatch" and

								"noinput" elements, which describe situations in which the speech

								recognizer (or other media interpreter) received input that it

								was unable to process, or did not receive any input at all,

								respectively.</p>


								<h3><a id="nomatch" name="nomatch">2.6 "nomatch" Element</a></h3>


								<p>The "nomatch" element under "input" is used to indicate that

								the natural language interpreter was unable to successfully match

								any input. It can optionally contain the text of the best of the

								(rejected) matches.</p>


								<pre>

								&lt;interpretation&gt;

								   &lt;instance/&gt;

								   &lt;input&gt;

								      &lt;nomatch/&gt;

								   &lt;/input&gt;

								&lt;/interpretation&gt;

								</pre>
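

								<p>The optional text of the best rejected match could be carried

								as the content of "nomatch". This is a sketch only; the rejected

								text "fried onions" is invented for the example.</p>


								<pre>

								&lt;interpretation&gt;

								   &lt;instance/&gt;

								   &lt;input mode="speech"&gt;

								      &lt;nomatch&gt;fried onions&lt;/nomatch&gt;

								   &lt;/input&gt;

								&lt;/interpretation&gt;

								</pre>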


								<h3><a id="noinput" name="noinput">2.7 "noinput" Element</a></h3>


								<p>The "noinput" element under "input" is used to indicate that

								there was no input: a timeout occurred in the speech recognizer

								due to silence.</p>


								<pre>

								&lt;interpretation&gt;

								   &lt;instance/&gt;

								   &lt;input&gt;

								      &lt;noinput/&gt;

								   &lt;/input&gt;

								&lt;/interpretation&gt;

								</pre>


								<p>If there are multiple levels of inputs, it appears that the

								most natural place for the "nomatch" and "noinput" elements is

								under the highest level of "input" for "noinput", and under the

								appropriate level of "input" for "nomatch". So "noinput" means

								"no input at all" and "nomatch" means "no match in speech

								modality" or "no match in dtmf modality". For example, to

								represent garbled speech combined with dtmf "1 2 3 4", we would

								have the following:</p>


								<pre>

								&lt;input&gt;

								   &lt;input mode="speech"&gt;&lt;nomatch/&gt;&lt;/input&gt;

								   &lt;input mode="dtmf"&gt;1 2 3 4&lt;/input&gt;

								&lt;/input&gt;

								</pre>


								<h3><a id="meta" name="meta">2.8 Interpreting Meta-Dialog and

								Meta-Task Utterances</a></h3>


								<p>The natural language requirements state that the semantics

								specification must be capable of representing a number of types

								of meta-dialog and meta-task utterances. This specification is

								flexible enough so that meta utterances can be represented on an

								application-specific basis without defining specific formats in

								this specification.</p>


								<p>Here are two examples of how meta-task and meta-dialog

								utterances might be represented.</p>


								<blockquote>System: <i>What toppings do you want on your

								pizza?</i><br />

								User: <i>What toppings do you have?</i></blockquote>


								<pre>

								&lt;interpretation grammar="http://toppings"

								 xmlns:xf="http://www.w3.org/2000/xforms"&gt;

								  &lt;input mode="speech"&gt;

								    what toppings do you have?

								  &lt;/input&gt;

								  &lt;xf:model&gt;

								    &lt;xf:group xf:name="question"&gt;

								      &lt;xf:string xf:name="questioned_item"/&gt;

								      &lt;xf:string xf:name="questioned_property"/&gt;

								    &lt;/xf:group&gt;

								  &lt;/xf:model&gt;

								  &lt;xf:instance&gt;

								    &lt;xf:question&gt;

								      &lt;xf:questioned_item&gt;toppings&lt;/xf:questioned_item&gt;

								      &lt;xf:questioned_property&gt;

								    availability

								      &lt;/xf:questioned_property&gt;

								    &lt;/xf:question&gt;

								  &lt;/xf:instance&gt;

								&lt;/interpretation&gt;

								</pre>


								<blockquote>User: <i>slow down.</i></blockquote>


								<pre>

								&lt;interpretation grammar="http://generalCommandsGrammar"

								 xmlns:xf="http://www.w3.org/2000/xforms"&gt;

								  &lt;xf:model&gt;

								    &lt;group name="command"&gt;

								      &lt;string name="action"/&gt;

								      &lt;string name="doer"/&gt;

								    &lt;/group&gt;

								  &lt;/xf:model&gt;

								  &lt;xf:instance&gt;

								    &lt;myApp:command&gt;

								    &lt;action&gt;reduce speech rate&lt;/action&gt;

								    &lt;doer&gt;system&lt;/doer&gt;

								    &lt;/myApp:command&gt;

								  &lt;/xf:instance&gt;

								  &lt;input mode="speech"&gt;slow down&lt;/input&gt;

								&lt;/interpretation&gt;

								</pre>


								<br class="reqs" />

								<br />

								<h3><a id="anaphora" name="anaphora">2.9 Anaphora and

								Deixis</a></h3>


								<p>This specification can be used on an application-specific

								basis to represent utterances that contain unresolved anaphoric

								and deictic references. Anaphoric references, which include

								pronouns and definite noun phrases that refer to something that

								was mentioned in the preceding linguistic context, and deictic

								references, which refer to something that is present in the

								non-linguistic context, present similar problems in that there

								may not be sufficient unambiguous linguistic context to determine

								what their exact place in the data instance should be. In order

								to represent unresolved anaphora and deixis using this

								specification, the developer must define a more surface-oriented

								representation that leaves the interpretation of the reference

								open. (This assumes that a later component is responsible for

								actually resolving the reference.)</p>


								<p>Example: (ignoring the issue of representing the input from

								the pointing gesture.)</p>


								<blockquote>System: <i>What do you want to drink?</i><br />

								User: <i>I want this (clicks on picture of large root

								beer.)</i></blockquote>


								<pre>

								&lt;result&gt;

								   &lt;interpretation&gt;

								    &lt;xf:model&gt;

								      &lt;group name="genericAction"&gt;

								        &lt;string name="doer"/&gt;

								        &lt;string name="action"/&gt;

								        &lt;string name="object"/&gt;

								      &lt;/group&gt;

								    &lt;/xf:model&gt;

								    &lt;xf:instance&gt;

								       &lt;doer&gt;I&lt;/doer&gt;

								       &lt;action&gt;want&lt;/action&gt;

								       &lt;object&gt;this&lt;/object&gt;

								    &lt;/xf:instance&gt;

								    &lt;input&gt;

								       &lt;input mode="speech"&gt;I want this&lt;/input&gt;

								    &lt;/input&gt;

								   &lt;/interpretation&gt;

								&lt;/result&gt;

								</pre>


								<h2><a id="ext" name="ext">3. Extensibility</a></h2>


								<p>One of the natural language requirements states that the

								specification must be extensible. The specification supports this

								requirement through its flexibility, as described in the

								sections on meta utterances and anaphora. The markup can

								easily be used in sophisticated systems to convey

								application-specific information that more basic systems would

								not make use of, for example defining speech acts, if this is

								meaningful to the dialog manager. Defining standard

								representations for items such as dates, times, etc. could also

								be done.</p>


								<h2><a id="compliance" name="compliance">4. Compliance</a></h2>


								<p>Compliance issues are deferred until a later revision of the

								specification.</p>


								<h2><a id="dtd" name="dtd">5. Document Type Definition</a></h2>


								<p>(TBD)</p>


								<p>Leading and trailing spaces in utterances are not significant.

								This will be defined in the DTD by specifying

								"xml:space=default".</p>


								<h2><a id="examples" name="examples">6. Examples</a></h2>


								<h3><a id="simple" name="simple">6.1 Simple Ambiguity:</a></h3>


								<blockquote>System: <i>To which city will you be

								traveling?</i><br />

								User: <i>I want to go to Pittsburgh.</i></blockquote>


								<pre>

								&lt;result xmlns:xf="http://www.w3.org/2000/xforms"

								   grammar="http://flight"&gt;

								  &lt;interpretation confidence="60"&gt;

								    &lt;input mode="speech"&gt;

								      I want to go to Pittsburgh

								    &lt;/input&gt;

								    &lt;xf:model&gt;

								      &lt;group name="airline"&gt;

								        &lt;string name="to_city"/&gt;

								      &lt;/group&gt;

								    &lt;/xf:model&gt;

								    &lt;xf:instance&gt;

								      &lt;myApp:airline&gt;

								        &lt;to_city&gt;Pittsburgh&lt;/to_city&gt;

								      &lt;/myApp:airline&gt;

								    &lt;/xf:instance&gt;

								  &lt;/interpretation&gt;

								  &lt;interpretation confidence="40"&gt;

								    &lt;input&gt;I want to go to Stockholm&lt;/input&gt;

								    &lt;xf:model&gt;

								      &lt;group name="airline"&gt;

								        &lt;string name="to_city"/&gt;

								      &lt;/group&gt;

								    &lt;/xf:model&gt;

								    &lt;xf:instance&gt;

								      &lt;myApp:airline&gt;

								        &lt;to_city&gt;Stockholm&lt;/to_city&gt;

								      &lt;/myApp:airline&gt;

								    &lt;/xf:instance&gt;

								  &lt;/interpretation&gt;

								&lt;/result&gt;

								</pre>


								<br class="issues" />

								<br />

								<h3><a id="mixed" name="mixed">6.2 Mixed Initiative:</a></h3>


								<blockquote>System: <i>What would you like?</i><br />

								User: <i>I would like 2 pizzas, one with pepperoni and cheese,

								one with sausage and a bottle of coke, to go.</i></blockquote>


								<p>This representation includes an order object which in turn

								contains objects named "food_item", "drink_item" and

								"delivery_method". This representation assumes there are no

								ambiguities in the speech or natural language processing. Note

								that this representation also assumes some level of

								intrasentential anaphora resolution, i.e., to resolve the two

								occurrences of "one" as "pizza".</p>


								<pre>

								&lt;result xmlns:xf="http://www.w3.org/2000/xforms"

								   grammar="http://foodorder"&gt;

								  &lt;interpretation confidence="100" &gt;

								    &lt;xf:model&gt;

								      &lt;group name="order"&gt;

								        &lt;group name="food_item" maxOccurs="*"&gt;

								          &lt;group name="pizza" &gt;

								            &lt;string name="ingredients" maxOccurs="*"/&gt;

								          &lt;/group&gt;

								          &lt;group name="burger"&gt;

								            &lt;string name="ingredients" maxOccurs="*"/&gt;

								          &lt;/group&gt;

								        &lt;/group&gt;

								        &lt;group name="drink_item" maxOccurs="*"&gt;

								          &lt;string name="size"/&gt;

								          &lt;string name="type"/&gt;

								        &lt;/group&gt;

								        &lt;string name="delivery_method"/&gt;

								      &lt;/group&gt;

								    &lt;/xf:model&gt;

								    &lt;xf:instance&gt;

								      &lt;myApp:order&gt;

								        &lt;food_item confidence="100"&gt;

								          &lt;pizza&gt;

								            &lt;xf:ingredients confidence="100"&gt;

								              pepperoni

								            &lt;/xf:ingredients&gt;

								            &lt;xf:ingredients confidence="100"&gt;

								              cheese

								            &lt;/xf:ingredients&gt;

								          &lt;/pizza&gt;

								          &lt;pizza&gt;

								            &lt;ingredients&gt;sausage&lt;/ingredients&gt;

								          &lt;/pizza&gt;

								        &lt;/food_item&gt;

								        &lt;drink_item confidence="100"&gt;

								          &lt;size&gt;2-liter&lt;/size&gt;

								        &lt;/drink_item&gt;

								        &lt;delivery_method&gt;to go&lt;/delivery_method&gt;

								      &lt;/myApp:order&gt;

								    &lt;/xf:instance&gt;

								      &lt;input mode="speech"&gt;I would like 2 pizzas,

								         one with pepperoni and cheese, one with sausage

								         and a bottle of coke, to go.

								      &lt;/input&gt;

								  &lt;/interpretation&gt;

								&lt;/result&gt;

								</pre>


								<h3><a id="dtmf" name="dtmf">6.3 DTMF:</a></h3>


								<p>A combination of dtmf input and speech would be represented

								using nested input elements. For example:</p>


								<blockquote>User: <i>My pin is</i> (dtmf 1 2 3 4)</blockquote>


								<pre>

								&lt;input&gt;

								  &lt;input mode="speech" confidence="100"

								     timestamp-start="2000-04-03T00:00:00"

								     timestamp-end="2000-04-03T00:00:01.5"&gt;My pin is

								  &lt;/input&gt;

								  &lt;input mode="dtmf" confidence="100"

								     timestamp-start="2000-04-03T00:00:01.5"

								     timestamp-end="2000-04-03T00:00:02.0"&gt;1 2 3 4

								  &lt;/input&gt;

								&lt;/input&gt;

								</pre>


								<h2><a id="study" name="study">7. Future Study</a></h2>


								<h3><a id="ambig" name="ambig">7.1 Representation of

								ambiguities</a></h3>


								<p>In this mark-up ambiguities are only represented at the

								top-level, using separate interpretation elements. Representation

								of "local" ambiguities, for example, at the level of an ambiguity

								between two ingredients (<i>peppers</i> vs. <i>pepperoni</i>)

								would be useful, but presents validation problems because of

								multiple namespaces unless the XForms specification includes it.

								The more compact representation using local ambiguities has not

								been defined for three reasons:</p>


								<ol>

								<li>It is not possible to combine ambiguities with the XForms

								notation and retain the ability to validate NL semantics

								documents using XML schema or DTDs.</li>


								<li>When multiple filler elements are allowed, as for example

								with pizza toppings, representation of ambiguity can become very

								complex and confusing.</li>


								<li>Although fully spelling out ambiguities at the top level

								results in a more verbose representation, current practical

								systems seldom make use of more than 2 alternative

								interpretations, so the increase in verbosity from spelling out

								redundant information should not be too significant in

								practice.</li>

								</ol>


								<p>Local ambiguities may be supported in the future if

								representation of ambiguity becomes part of the XForms

								standard.</p>


								<h3><a id="source" name="source">7.2 Representing the source of

								an ambiguity</a></h3>


								<p>If there is more than one interpretation, it may be useful to

								add an attribute specifying the source of the ambiguity, for

								example, "natural_language", "speech", "ocr", or "handwriting".

								Speech ambiguities originate in uncertainties about the speech

								recognition result, for example, <i>Austin</i> vs. <i>Boston</i>.

								"handwriting" and "ocr" are analogous to speech. Natural language

								ambiguities result from syntactic, semantic, or pragmatic

								ambiguities in a single recognizer result. For example, in <i>I

								want fried onions and peppers,</i> there are two interpretations,

								one in which the peppers are to be fried and one in which they

								are not to be fried. This attribute would not be meaningful if

								there is only one interpretation. This information could be used,

								for example, by a dialog manager to construct a more helpful

								response (e.g. <i>I didn't hear that</i> vs. <i>I didn't

								understand that</i>) or by a scoring algorithm that treats

								different ambiguity sources differently.</p>
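

								<p>A sketch of how such an attribute might look follows. The

								attribute name "ambiguity-source" and its placement on the

								"result" element are hypothetical and are not defined by this

								specification; the model and instance content is elided.</p>


								<pre>

								&lt;result ambiguity-source="speech"&gt;

								  &lt;interpretation confidence="60"&gt;

								    ...

								    &lt;input mode="speech"&gt;I want to go to Austin&lt;/input&gt;

								  &lt;/interpretation&gt;

								  &lt;interpretation confidence="40"&gt;

								    ...

								    &lt;input mode="speech"&gt;I want to go to Boston&lt;/input&gt;

								  &lt;/interpretation&gt;

								&lt;/result&gt;

								</pre>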


								<h3><a id="dialog" name="dialog">7.3 Representing information

								collected over the course of a dialog</a></h3>


								<p>In many cases identical information can be conveyed in one

								utterance or over the course of several dialog turns. This

								situation can occur both in the case of a subdialog or in the

								case of a reusable component. For example, if the system's goal

								in the subdialog or the reusable component is to collect travel

								information from a user, the ultimate information is the same

								whether the user says <i>I want to go from Pittsburgh to Seattle

								on January 1, 2001</i>, in a single utterance or whether the same

								information is elicited from the user during several dialog

								turns, as in</p>


								<blockquote>

								<p>System: <i>Where will you be departing from?</i><br />

								 User: <i>Pittsburgh.</i><br />

								 System: <i>Where will you be traveling to?</i><br />

								 User: <i>Seattle.</i></p>

								</blockquote>


								<p>etc.</p>


								<p>It should be possible to use a substantially similar semantic

								representation in both of these situations. The main issue is

								that in the case of information collected over the course of a

								dialog it becomes very difficult to tie that information back to

								the original inputs. Elements such as "input" and attributes such

								as "timestamp-start", "timestamp-end", "grammar", and "mode"

								which relate the semantic interpretation directly to the input

								become less meaningful when the information is collected in a

								dialog. Moreover, they also become less useful to the main dialog

								component, since presumably it's the function of the subdialog or

								reusable component to make use of this low-level information

								internally to guide its own dialog and to shield the main dialog

								from these details. One strategy under consideration is simply to

								omit these aspects of the markup for dialog-based semantic

								information. This issue may also be dealt with in the reusable

								components group, since the issue of return information is key to

								its charter.</p>


								<h3><a id="compos" name="compos">7.4 Composition of multiple data

								models within one utterance</a></h3>


								<p>Some utterances could potentially make use of more than one

								data model in their semantic representations. For example, it is

								possible in a mixed initiative situation for the user to combine

								multiple functions in one utterance, as in:</p>


								<blockquote>

								<p>System: <i>I heard you say you want to go to Pittsburgh, is

								that correct?</i></p>


								<p>User: <i>Yes, and I'll be leaving around 8:00 a.m.</i></p>

								</blockquote>


								<p>It would be natural for there to be a generic data model for

								the "yes" and also an application-specific model for the flight

								arrangements. One possibility would be for the interpreter to

								create one joint data model on the fly from these models. Or, the

								developer could define one data model that includes both elements

								for "yes_no" and for the application-specific information. If

								there are two data models, and consequently two instances, then

								it is necessary to consider the problem of associating the

								instances with the correct data models.</p>
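

								<p>The second option, a single developer-defined model, might be

								sketched as follows. Only the name "yes_no" comes from the

								discussion above; the group name "confirm_and_flight" and the

								field "departure_time" are assumptions made for the example.</p>


								<pre>

								&lt;interpretation xmlns:xf="http://www.w3.org/2000/xforms"&gt;

								  &lt;xf:model&gt;

								    &lt;group name="confirm_and_flight"&gt;

								      &lt;string name="yes_no"/&gt;

								      &lt;string name="departure_time"/&gt;

								    &lt;/group&gt;

								  &lt;/xf:model&gt;

								  &lt;xf:instance&gt;

								    &lt;confirm_and_flight&gt;

								      &lt;yes_no&gt;yes&lt;/yes_no&gt;

								      &lt;departure_time&gt;8:00 a.m.&lt;/departure_time&gt;

								    &lt;/confirm_and_flight&gt;

								  &lt;/xf:instance&gt;

								  &lt;input mode="speech"&gt;

								    Yes, and I'll be leaving around 8:00 a.m.

								  &lt;/input&gt;

								&lt;/interpretation&gt;

								</pre>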


								<h3><a id="multi" name="multi">7.5 Representation of Multi-modal

								input</a></h3>


								<p>This is deferred until the specification for multi-modal

								inputs is better defined, except for dtmf (for dtmf, see the <a

								href="#dtmf">example</a> above).</p>


								<h3><a id="xforms" name="xforms">7.6 Extensibility of XForms data

								models</a></h3>


								<p>It would be highly desirable if components in the dialog

								system could extend the data model so that grammars or reusable

								components could return information that is additional to a base

								data model for, say, a time or date component or grammar. With

								the current XForms specification it would be necessary to provide

								a complete new data model in these cases. It is possible that the

								XForms working group may extend the XForms specification to

								include extensibility of the data model.</p>


								<p>Similarly, the current XForms data model definition does not

								provide for the re-use of complex type definitions, i.e. groups,

								in multiple locations. Thus, to represent travel information

								consisting of both an outbound flight and an inbound flight, it

								is not possible to define a single complex type "flight_details"

								that is used for both outbound and inbound flight information.

								(See the section on "Shared Datatype Libraries" in the <a

								href="http://www.w3.org/TR/2000/WD-xforms-datamodel-20000406/#shared">XForms Data

								Model</a> document for additional discussion.)</p>
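

								<p>In practice this means the group definition has to be

								repeated. The sketch below shows the duplication; the group and

								field names ("travel", "outbound_flight", "inbound_flight",

								"from_city", "to_city", "date") are assumptions chosen for

								illustration.</p>


								<pre>

								&lt;xf:model&gt;

								  &lt;group name="travel"&gt;

								    &lt;group name="outbound_flight"&gt;

								      &lt;string name="from_city"/&gt;

								      &lt;string name="to_city"/&gt;

								      &lt;string name="date"/&gt;

								    &lt;/group&gt;

								    &lt;group name="inbound_flight"&gt;

								      &lt;string name="from_city"/&gt;

								      &lt;string name="to_city"/&gt;

								      &lt;string name="date"/&gt;

								    &lt;/group&gt;

								  &lt;/group&gt;

								&lt;/xf:model&gt;

								</pre>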


								<h3><a id="recurse" name="recurse">7.7 Representation of

								recursive structures</a></h3>


								<p>Some systems may find it useful to represent generic syntactic

								parse trees in natural language output. Generic parse trees

								cannot be represented by current XForms data models because they

								do not support any recursion. However, it is not clear how

								frequently this capability would be required.</p>


								<h3><a id="unanalyzed" name="unanalyzed">7.8 Representing

								unanalyzed information: "unanalyzed" Element</a></h3>


								<p>An "unanalyzed" element could be used to represent a part of

								the input that was left unanalyzed in the current interpretation.

								This element could be used by a dialog manager to decide if

								enough of the input had been analyzed for the dialog to proceed,

								or if the dialog manager should ask for a clarification from the

								user. The dialog manager could also use the unanalyzed material

								to help it decide which of several alternative interpretations is

								correct. Each "unanalyzed" element would contain "input" elements

								which would contain the portions of the full utterance that were

								unanalyzed.</p>


								<p>"unanalyzed" has not been included in the current version of

								the spec for several reasons:</p>


								<ol>

								<li>

								<p>It's not clear that it has a platform-independent

								interpretation.</p>

								</li>


								<li>

								<p>It's not clear that current applications would make use of

								it.</p>

								</li>


								<li>

								<p>Although there is a requirement for representing "unanalyzed",

								this can be accommodated in the current specification if the

								developer incorporates "unanalyzed" into the data model in an

								application-specific manner. In addition, natural language

								interpreters can take unanalyzed information into account

								internally when they are computing confidences, so that this

								information is available indirectly to dialog managers through

								the confidence attributes.</p>

								</li>

								</ol>


								<p>The most important consideration appears to be whether in fact

								the ability to represent unanalyzed material is of interest to

								current or near future applications.</p>


								<p>Note that the use of "unanalyzed" would be mainly useful for

								systems with robust natural language interpreters which are

								capable of ignoring portions of the speech recognizer result that

								don't match the natural language grammar. In the case of tightly

								coupled ASR/NL systems which require that all of the input match

								a speech recognizer grammar the notion of "unanalyzed" isn't

								useful, since all of the input is required to be analyzed by the

								nature of the system. Similarly, keyword spotting systems with

								garbage models will not be able to make use of this element

								because the speech recognition process discards any

								unrecognizable speech before the natural language interpretation

								process begins.</p>


								<p>Example:</p>


								<blockquote>System: <i>Where do you want to go?</i><br />

								User: <i>I'd like to fly from Boston and then continue on to

								Philadelphia.</i></blockquote>


								<p>(assuming that <i>"and then continue on"</i> is not included

								in the speech grammar.)</p>


								<pre>

								&lt;unanalyzed&gt;

								   &lt;input&gt;and then continue on&lt;/input&gt;

								&lt;/unanalyzed&gt;

								</pre>
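

								<p>The application-specific alternative mentioned above, in which

								the developer adds an "unanalyzed" field to the data model, might

								look like the following sketch. The group name "flight" and the

								fields "from_city" and "to_city" are assumptions made for the

								example.</p>


								<pre>

								&lt;interpretation xmlns:xf="http://www.w3.org/2000/xforms"&gt;

								  &lt;xf:model&gt;

								    &lt;group name="flight"&gt;

								      &lt;string name="from_city"/&gt;

								      &lt;string name="to_city"/&gt;

								      &lt;string name="unanalyzed"/&gt;

								    &lt;/group&gt;

								  &lt;/xf:model&gt;

								  &lt;xf:instance&gt;

								    &lt;flight&gt;

								      &lt;from_city&gt;Boston&lt;/from_city&gt;

								      &lt;to_city&gt;Philadelphia&lt;/to_city&gt;

								      &lt;unanalyzed&gt;and then continue on&lt;/unanalyzed&gt;

								    &lt;/flight&gt;

								  &lt;/xf:instance&gt;

								  &lt;input mode="speech"&gt;

								    I'd like to fly from Boston and then continue on to Philadelphia

								  &lt;/input&gt;

								&lt;/interpretation&gt;

								</pre>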


								<p>If there is duplicated unanalyzed material, as in <i>Please

								get my email please,</i> every unanalyzed item should be

								represented individually, so <i>please</i> should be duplicated

								if both occurrences are unanalyzed.</p>
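

								<p>Under one possible reading of this rule, each occurrence gets

								its own "input" element inside "unanalyzed". This grouping is an

								assumption, since the element is not part of the current

								specification.</p>


								<pre>

								&lt;unanalyzed&gt;

								   &lt;input&gt;please&lt;/input&gt;

								   &lt;input&gt;please&lt;/input&gt;

								&lt;/unanalyzed&gt;

								</pre>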


								<h2><a id="acks" name="acks">8. Acknowledgements</a></h2>


								<p>This document was written with the participation of the

								members of the W3C Voice Browser Working Group <em>(listed in

								alphabetical order)</em>:</p>


								<blockquote>Daniel Austin, Ask Jeeves, Inc.<br />

								Dan Burnett, Nuance<br />

								Andrew Hunt, SpeechWorks<br />

								Robert Keiller, VoxSurf International<br />

								Andreas Kellner, Philips<br />

								Bruce Lucas, IBM<br />

								Dave Raggett, W3C/Phone.com<br />

								</blockquote>

								</body>

								</html>