

								<?xml version="1.0" encoding="utf-8"?>

								<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"

								    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

								<html xmlns="http://www.w3.org/1999/xhtml">

								<head>

								<meta name="generator" content=

								"HTML Tidy for Linux/x86 (vers 1 September 2005), see www.w3.org" />

								<meta http-equiv="Content-Type" content= "text/html; charset=utf-8" />

								<title>EMMA: Extensible MultiModal Annotation markup

								language</title>


								<style type="text/css">

								/*<![CDATA[*/

								span.term {

								  color: rgb(0,0,192);

								  font-style: italic

								  }

								blockquote { margin-left: 4% }

								.toc { list-style-type: none; marker-offset: 1em }

								.tocline { list-style-type: none }

								ul.toc a { text-decoration: none }

								.fig { text-align: center }

								pre { font-family: monospace }

								pre.example {

								  margin-left: 0;

								  padding: 0.5em;

								  width: 98%;

								  font-family: monospace;

								  white-space: pre;

								  border: none;

								  font-size: 95%;

								  background-color: rgb(230,230,255);

								  }

								.note { color: red }

								.new { color: green;}

								.old { text-decoration: line-through }

								.newer { text-decoration: underline }

								.change { color: red }

								.changeTable { color: orange }

								.remove { text-decoration: line-through }

								div.issues {

								  border-width: thin;

								  border-style: solid;

								  border-color: maroon;

								  background-color: #FFEECC;

								  color: maroon;

								  width: 95%; padding: 0.5em; }

								div.issues h4 { margin-top: 0 }

								code {

								  font-weight:bold;

								  color: green;

								  font-family: monospace;

								  font-size: 110%;

								  }

								.good {

								  border: green 2px solid;

								  font-weight: bold;

								  color: green;

								  margin: 1em 5% 1em 0px;

								  }

								.bad {

								  border: red 2px solid;

								  font-weight: bold;

								  color: rgb(192,101,101);

								  margin: 1em 5% 1em 0px;

								  }

								div.navbar { text-align: center }

								div.contents {

								  border: medium none;

								  padding: 0.5em;

								  margin-right: 5%;

								  background-color: rgb(230,230,255);

								  }

								table.exceptions {

								  background-color: rgb(255,255,153)

								  }

								table.modes { font-size: 90% }

								table.defn {

								  border-width: thin;

								  border-style: solid;

								  border-color: black;

								  color: black

								  }

								table.defn th { background-color: rgb(220,220,255);

								  border-style: solid; border-color: black; border-width: thin }

								table.defn td { background-color: rgb(230,230,255);

								  border-style: solid; border-color: black; border-width: thin }

								.diff { color: rgb(128,0,0) }

								.reqs {  color: blue; font-style: italic  }

								.editorial { color: maroon; font-style: italic }

								/*]]>*/

								</style>

								<link rel="stylesheet" type="text/css" href=

								"http://www.w3.org/StyleSheets/TR/W3C-REC.css" />

								</head>

								<body>

								<div class="head">

								<div class="banner"><a href="http://www.w3.org/"><img alt="W3C"

								src="http://www.w3.org/Icons/w3c_home" width="72" height=

								"48" /></a></div>

								<h1 class="notoc" id="s0">EMMA: Extensible MultiModal Annotation

								markup language</h1>

								<h2><a id="w3c-doctype" name="w3c-doctype"><acronym title=

								"World Wide Web Consortium">W3C</acronym> Recommendation

								10 February 2009</a></h2>

								<dl>

								<dt>This version:</dt>

								<dd><a href=

								"http://www.w3.org/TR/2009/REC-emma-20090210/">http://www.w3.org/TR/2009/REC-emma-20090210/</a></dd>

								<dt>Latest version:</dt>

								<dd><a href=

								"http://www.w3.org/TR/emma/">http://www.w3.org/TR/emma/</a></dd>

								<dt>Previous version:</dt>

								<dd><a href=

								"http://www.w3.org/TR/2008/PR-emma-20081215/">http://www.w3.org/TR/2008/PR-emma-20081215/</a></dd>

								</dl>

								<dl>

								<dt>Editor:</dt>

								<dd>Michael Johnston, AT&amp;T</dd>

								<dt>Authors:</dt>

								<dd>Paolo Baggia, Loquendo</dd>

								<dd>Daniel C. Burnett, Voxeo (formerly of Vocalocity and Nuance)</dd>

								<dd>Jerry Carter, Nuance</dd>

								<dd>Deborah A. Dahl, Invited Expert</dd>

								<dd>Gerry McCobb, Openstream</dd>

								<dd>Dave Raggett, (until 2007, while at W3C/Volantis and W3C/Canon)</dd>

								</dl>


								    <p>Please refer to the

								    <a href="http://www.w3.org/2009/02/emma-errata.html">

								    <strong>errata</strong></a>

								    for this document, which may include some normative

								    corrections.</p>


								    <p>See also

								    <a href="http://www.w3.org/2003/03/Translations/byTechnology?technology=emma">

								    <strong>translations</strong></a>.</p>


								<p class="copyright"><a href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">Copyright</a> &copy; 2009 <a href="http://www.w3.org/"><acronym title="World Wide Web Consortium">W3C</acronym></a><sup>&reg;</sup> (<a href="http://www.csail.mit.edu/"><acronym title="Massachusetts Institute of Technology">MIT</acronym></a>, <a href="http://www.ercim.org/"><acronym title="European Research Consortium for Informatics and Mathematics">ERCIM</acronym></a>, <a href="http://www.keio.ac.jp/">Keio</a>), All Rights Reserved. W3C <a href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">liability</a>, <a href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">trademark</a> and <a href="http://www.w3.org/Consortium/Legal/copyright-documents">document use</a> rules apply.</p>


								<hr title="Separator for header" /></div>

								<h2 class="notoc" id="abstract">Abstract</h2>

								<p>The W3C Multimodal Interaction Working Group aims to develop

								specifications to enable access to the Web using multimodal

								interaction. This document is part of a set of specifications for

								multimodal systems, and provides details of an XML markup language

								for containing and annotating the interpretation of user input.

								Examples of interpretation of user input are a transcription into

								words of a raw signal, for instance derived from speech, pen or

								keystroke input, a set of attribute/value pairs describing their

								meaning, or a set of attribute/value pairs describing a gesture.

								The interpretation of the user's input is expected to be generated

								by signal interpretation processes, such as speech and ink

								recognition, semantic interpreters, and other types of processors

								for use by components that act on the user's inputs such as

								interaction managers.</p>

								<h2 id="status">Status of this Document</h2>

								<p><em>This section describes the status of this document at the

								time of its publication. Other documents may supersede this

								document. A list of current W3C publications and the latest

								revision of this technical report can be found in the <a href=

								"http://www.w3.org/TR/">W3C technical reports index</a> at

								http://www.w3.org/TR/.</em></p>


								<p>This is the

								<a href="http://www.w3.org/2005/10/Process-20051014/tr.html#RecsW3C">

								Recommendation

								</a>

								of "EMMA: Extensible MultiModal Annotation markup language".


								It has been produced by the

								<a href="http://www.w3.org/2002/mmi/">Multimodal Interaction Working Group</a>,

								which is part of the

								<a href="http://www.w3.org/2002/mmi/Activity.html">Multimodal Interaction Activity</a>.

								</p>


								<p>Comments are welcome on <a href="mailto:www-multimodal@w3.org">www-multimodal@w3.org</a>

								(<a href="http://lists.w3.org/Archives/Public/www-multimodal/">archive</a>).


								See <a href="http://www.w3.org/Mail/">W3C mailing list and archive

								usage guidelines</a>.</p>


								<p>The design of EMMA has been widely reviewed

								(see the <a href="http://www.w3.org/TR/2008/PR-emma-20081215/emma-disp.html">

								disposition of comments</a>)

								and satisfies the Working Group's technical requirements.


								A list of implementations is included in the

								<a href="http://www.w3.org/2002/mmi/2008/emma-ir/">

								EMMA Implementation Report</a>.


								The Working Group made a few editorial changes to the

								<a href="http://www.w3.org/TR/2008/PR-emma-20081215/">

								15 December 2008 Proposed Recommendation</a>.

								Changes from the Proposed Recommendation can be found in

								<a href="#appF">Appendix F</a>.

								</p>


								<p>This document has been reviewed by W3C Members, by software

								  developers, and by other W3C groups and interested parties, and is

								  endorsed by the Director as a W3C Recommendation. It is a stable

								  document and may be used as reference material or cited from another

								  document. W3C's role in making the Recommendation is to draw

								  attention to the specification and to promote its widespread

								  deployment. This enhances the functionality and interoperability of

								  the Web.</p>


								<p>This specification describes markup for representing

								interpretations of user input (speech, keystrokes, pen input etc.)

								together with annotations for confidence scores, timestamps, input

								medium etc., and forms part of the proposals for the <a href=

								"http://www.w3.org/TR/mmi-framework/">W3C Multimodal Interaction

								Framework</a>.</p>


								<p>This document was produced by a group operating under the

								<a href="http://www.w3.org/Consortium/Patent-Policy-20040205/">5

								February 2004 W3C Patent Policy</a>. W3C maintains a <a rel=

								"disclosure" href=

								"http://www.w3.org/2004/01/pp-impl/34607/status">public list of any

								patent disclosures</a> made in connection with the deliverables of

								the group; that page also includes instructions for disclosing a

								patent. An individual who has actual knowledge of a patent which

								the individual believes contains <a href=

								"http://www.w3.org/Consortium/Patent-Policy-20040205/#def-essential">

								Essential Claim(s)</a> must disclose the information in accordance

								with <a href=

								"http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-Disclosure">

								section 6 of the W3C Patent Policy</a>.</p>


								<p>The sections in the main body of this document are normative unless

								otherwise specified.  The appendices in this document are informative

								unless otherwise indicated explicitly.</p>


								<h2 class="notoc" id="conv">Conventions of this Document</h2>

								<p>All sections in this specification are normative, unless

								otherwise indicated. The informative parts of this specification

								are identified by "Informative" labels within sections.</p>

								<p>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL

								NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"

								in this document are to be interpreted as described in [<a href=

								"#ref-rfc2119">RFC2119</a>].</p>

								<h2 class="notoc" id="toc">Table of Contents</h2>

								<ul class="tocline">

								<li>1. <a href="#s1">Introduction</a>

								<ul class="tocline">

								<li>1.1 <a href="#s1.1">Uses of EMMA</a></li>

								<li>1.2 <a href="#s1.2">Terminology</a></li>

								</ul>

								</li>

								<li>2. <a href="#s2">Structure of EMMA documents</a>

								<ul class="tocline">

								<li>2.<span>1</span> <a href="#s2.1">Data model</a></li>

								<li>2.<span>2</span> <a href="#s2.2">EMMA namespace

								prefixes</a></li>

								</ul>

								</li>

								<li>3. <a href="#s3">EMMA structural elements</a>

								<ul class="tocline">

								<li>3.1 <a href="#s3.1">Root element:

								<code>emma:emma</code></a></li>

								<li>3.2 <a href="#s3.2">Interpretation element:

								<code>emma:interpretation</code></a></li>

								<li>3.3 <a href="#s3.3">Container elements</a>

								<ul class="tocline">

								<li>3.3.1 <a href="#s3.3.1"><code>emma:one-of</code>

								element</a></li>

								<li>3.3.2 <a href="#s3.3.2"><code>emma:group</code> element</a>

								<ul class="tocline">

								<li>3.3.2.1 <a href="#s3.3.2.1">Indirect grouping criteria:

								<code>emma:group-info</code> element</a></li>

								</ul>

								</li>

								<li>3.3.3 <a href="#s3.3.3"><code>emma:sequence</code>

								element</a></li>

								</ul>

								</li>

								<li>3.4 <a href="#s3.4">Lattice element</a>

								<ul class="tocline">

								<li>3.4.1 <a href="#s3.4.1">Lattice markup:

								<code>emma:lattice</code>, <code>emma:arc</code>,

								<code>emma:node</code> elements</a></li>

								<li>3.4.2 <a href="#s3.4.2">Annotations on lattices</a></li>

								<li>3.4.3 <a href="#s3.4.3">Relative timestamps on

								lattices</a></li>

								</ul>

								</li>

								<li>3.5 <a href="#s3.5">Literal semantics:

								<code>emma:literal</code> element</a></li>

								</ul>

								</li>

								<li>4. <a href="#s4">EMMA annotations</a>

								<ul class="tocline">

								<li>4.1 <a href="#s4.1">EMMA annotation elements</a>

								<ul class="tocline">

								<li>4.1.1 <a href="#s4.1.1">Data model: <code>emma:model</code>

								element</a></li>

								<li>4.1.2 <a href="#s4.1.2">Interpretation derivation:

								<code>emma:derived-from</code> element and

								<code>emma:derivation</code> element</a></li>

								<li>4.1.3 <a href="#s4.1.3">Reference to grammar used:

								<code>emma:grammar</code> element</a></li>

								<li>4.1.4 <a href="#s4.1.4">Extensibility to application/vendor

								specific annotations: <code>emma:info</code> element</a></li>

								<li>4.1.5 <a href="#s4.1.5">Endpoint reference:

								<code>emma:endpoint-info</code> element and

								<code>emma:endpoint</code> element</a></li>

								</ul>

								</li>

								<li>4.2 <a href="#s4.2">EMMA annotation attributes</a>

								<ul class="tocline">

								<li>4.2.1 <a href="#s4.2.1">Tokens of input:

								<code>emma:tokens</code> attribute</a></li>

								<li>4.2.2 <a href="#s4.2.2">Reference to processing:

								<code>emma:process</code> attribute</a></li>

								<li>4.2.3 <a href="#s4.2.3">Lack of input:

								<code>emma:no-input</code> attribute</a></li>

								<li>4.2.4 <a href="#s4.2.4">Uninterpreted input:

								<code>emma:uninterpreted</code> attribute</a></li>

								<li>4.2.5 <a href="#s4.2.5">Human language of input:

								<code>emma:lang</code> attribute</a></li>

								<li>4.2.6 <a href="#s4.2.6">Reference to signal:

								<code>emma:signal</code> <span>and

								<code>emma:signal-size</code></span> attributes</a></li>

								<li>4.2.7 <a href="#s4.2.7">Media type:

								<code>emma:media-type</code> attribute</a></li>

								<li>4.2.8 <a href="#s4.2.8">Confidence scores:

								<code>emma:confidence</code> attribute</a></li>

								<li>4.2.9 <a href="#s4.2.9">Input source: <code>emma:source</code>

								attribute</a></li>

								<li>4.2.10 <a href="#s4.2.10">Timestamps</a>

								<ul class="tocline">

								<li>4.2.10.1 <a href="#s4.2.10.1">Absolute timestamps:

								<code>emma:start</code>, <code>emma:end</code> attributes</a></li>

								<li>4.2.10.2 <a href="#s4.2.10.2">Relative timestamps:

								<code>emma:time-ref-uri</code>,

								<code>emma:time-ref-anchor-point</code>,

								<code>emma:offset-to-start</code> attributes</a></li>

								<li>4.2.10.3 <a href="#s4.2.10.3">Duration of input:

								<code>emma:duration</code> attribute</a></li>

								<li><span>4.2.10.4 <a href="#s4.2.10.4">Composite Input and

								Relative Timestamps</a></span></li>

								</ul>

								</li>

								<li>4.2.11 <a href="#s4.2.11">Medium, mode, and function of user

								inputs: <code>emma:medium</code>, <code>emma:mode</code>,

								<code>emma:function</code>, <code>emma:verbal</code>

								attributes</a></li>

								<li>4.2.12 <a href="#s4.2.12">Composite multimodality:

								<code>emma:hook</code> attribute</a></li>

								<li>4.2.13 <a href="#s4.2.13">Cost: <code>emma:cost</code>

								attribute</a></li>

								<li>4.2.14 <a href="#s4.2.14">Endpoint properties:

								<code>emma:endpoint-role</code>,

								<code>emma:endpoint-address</code>, <code>emma:port-type</code>,

								<code>emma:port-num</code>, <code>emma:message-id</code>,

								<code>emma:service-name</code>, <code>emma:endpoint-pair-ref</code>,

								<code>emma:endpoint-info-ref</code>

								attributes</a></li>

								<li>4.2.15 <a href="#s4.2.15">Reference to

								<code>emma:grammar</code> element: <code>emma:grammar-ref</code>

								attribute</a></li>

								<li>4.2.16 <a href="#s4.2.16">Reference to <code>emma:model</code>

								element: <code>emma:model-ref</code> attribute</a></li>

								<li>4.2.17 <a href="#s4.2.17">Dialog turns:

								<code>emma:dialog-turn</code> attribute</a></li>

								</ul>

								</li>

								<li>4.3 <a href="#s4.3">Scope of EMMA annotations</a></li>

								</ul>

								</li>

								<li>5. <a href="#s5">Conformance</a>

								<ul class="tocline">

								<li>5.1 <a href="#s5.1">Conforming EMMA Documents</a></li>

								<li>5.2 <a href="#s5.2">Using EMMA with other Namespaces</a></li>

								<li>5.3 <a href="#s5.3">Conforming EMMA Processors</a></li>

								</ul>

								</li>

								<li><a href="#appendices">Appendices</a>

								<ul class="tocline">

								<li>Appendix A. <a href="#appA">XML and <span>RELAX NG</span>

								schemata</a> <span>(Normative)</span></li>

								<li>Appendix B. <a href="#appB">MIME type</a>

								<span>(Normative)</span>

								<ul>

								<li><span>B.1 <a href="#media-type-registration">Registration of

								MIME media type application/emma+xml</a></span></li>

								</ul>

								</li>

								<li>Appendix C. <a href="#appC"><code>emma:hook</code> and SRGS</a>

								<span>(Informative)</span></li>

								<li>Appendix D. <a href="#appD">EMMA event interface</a>

								<span>(Informative)</span></li>

								<li>Appendix E. <a href="#appE">References</a>

								<ul>

								<li>E.1 <a href="#appE1">Normative references</a></li>

								<li>E.2 <a href="#appE2"><span>Informative</span>

								references</a></li>

								</ul>

								</li>

								<li>Appendix F. <a href="#appF">Changes since last draft</a>

								<span>(Informative)</span></li>

								<li>Appendix G. <a href="#appG">Acknowledgements</a>

								<span>(Informative)</span></li>

								</ul>

								</li>

								</ul>

								<h2 id="s1">1. Introduction</h2>

								<p>This section is <span>I</span>nformative.</p>

								<p>This document presents an XML specification for EMMA, an

								Extensible MultiModal Annotation markup language, responding to the

								requirements documented in <span>Requirements for EMMA</span>

								[<a href="#EMMAreqs">EMMA <span>Requirements</span></a>]. This

								markup language is intended for use by systems that provide

								semantic interpretations for a variety of inputs, including but not

								necessarily limited to, speech, natural language text, GUI and ink

								input.</p>

								<p>It is expected that this markup will be used primarily as a

								standard data interchange format between the components of a

								multimodal system; in particular, it will normally be automatically

								generated by interpretation components to represent the semantics

								of users' inputs, not directly authored by developers.</p>

								<p>The language is focused on annotating single inputs from users,

								which may be either from a single mode or a composite input

								combining information from multiple modes, as opposed to

								information that might have been collected over multiple turns of a

								dialog. The language provides a set of elements and attributes that

								are focused on enabling annotations on user inputs and

								interpretations of those inputs.</p>

								<p>An EMMA document can be considered to hold three types of

								data:</p>

								<ul>

								<li>

								<p><b>instance data</b></p>

								<p>Application-specific markup corresponding to input information

								which is meaningful to the consumer of an EMMA document. Instances

								are application-specific and built by input processors at runtime.

								Given that utterances may be ambiguous with respect to input

								values, an EMMA document may hold more than one instance.</p>

								</li>

								<li>

								<p><b>data model</b></p>

								<p>Constraints on structure and content of an instance. The data

								model is typically pre-established by an application, and may be

								implicit, that is, unspecified.</p>

								</li>

								<li>

								<p><b>metadata</b></p>

								<p>Annotations associated with the data contained in the instance.

								Annotation values are added by input processors at runtime.</p>

								</li>

								</ul>
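								<p>As an illustrative sketch of how these three types of data fit
								together, the following document represents an ambiguous spoken
								flight query. The <code>origin</code> and <code>destination</code>
								elements, the identifiers, and all annotation values are
								assumptions chosen for this example; they are not defined by EMMA.
								The application-specific children of each
								<code>emma:interpretation</code> are instance data, the
								<code>emma:</code> attributes are metadata, and the data model is
								left implicit. The normative definitions of the elements and
								attributes used here appear in <a href="#s3">Section 3</a> and
								<a href="#s4">Section 4</a>.</p>
								<pre class="example">
&lt;emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma"&gt;
  &lt;!-- metadata: annotations shared by both competing instances --&gt;
  &lt;emma:one-of id="r1"
      emma:medium="acoustic" emma:mode="voice"&gt;
    &lt;emma:interpretation id="int1"
        emma:confidence="0.75"
        emma:tokens="flights from boston to denver"&gt;
      &lt;!-- instance data: hypothetical application markup --&gt;
      &lt;origin&gt;Boston&lt;/origin&gt;
      &lt;destination&gt;Denver&lt;/destination&gt;
    &lt;/emma:interpretation&gt;
    &lt;emma:interpretation id="int2"
        emma:confidence="0.68"
        emma:tokens="flights from austin to denver"&gt;
      &lt;origin&gt;Austin&lt;/origin&gt;
      &lt;destination&gt;Denver&lt;/destination&gt;
    &lt;/emma:interpretation&gt;
  &lt;/emma:one-of&gt;
&lt;/emma:emma&gt;
</pre>
								<p>Holding both instances in one document lets a consumer, such as
								an interaction manager, choose between them using the confidence
								annotations rather than receiving only a single best guess.</p>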

								<p>Given the assumptions above about the nature of data represented

								in an EMMA document, the following general principles apply to the

								design of EMMA:</p>

								<ul>

								<li>The main prescriptive content of the EMMA specification will

								consist of metadata: EMMA will provide a means to express the

								metadata annotations which require standardization. (Notice,

								however, that such annotations may express the relationship among

								all the types of data within an EMMA document.)</li>

								<li>The instance and its data model are assumed to be specified in

								XML, but EMMA will remain agnostic to the XML format used to

								express these. (The instance XML is assumed to be sufficiently

								structured to enable the association of annotative data.)</li>

								<li>The extensibility of EMMA lies in the ability for additional

								kinds of metadata to be included in application specific

								vocabularies. EMMA itself can be extended with application and

								vendor specific annotations contained within the

								<code>emma:info</code> element <span>(<a href="#s4.1.4">Section

								4.1.4</a>)</span>.</li>

								</ul>
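								<p>The extensibility principle can be sketched as follows. This is
								a minimal example under stated assumptions: the
								<code>ex:</code> prefix, its namespace URI, and the
								<code>ex:recognizer-build</code> and <code>ex:customer-type</code>
								elements are hypothetical vendor annotations invented for
								illustration, and the placement of <code>emma:info</code> shown
								here is only one possibility; <a href="#s4.1.4">Section 4.1.4</a>
								gives the actual rules.</p>
								<pre class="example">
&lt;emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma"
    xmlns:ex="http://www.example.com/ns/annotations"&gt;
  &lt;emma:info&gt;
    &lt;!-- hypothetical vendor-specific metadata, not defined by EMMA --&gt;
    &lt;ex:recognizer-build&gt;2.3.1&lt;/ex:recognizer-build&gt;
    &lt;ex:customer-type&gt;residential&lt;/ex:customer-type&gt;
  &lt;/emma:info&gt;
  &lt;emma:interpretation id="int1"
      emma:medium="acoustic" emma:mode="voice"&gt;
    &lt;!-- hypothetical application instance data --&gt;
    &lt;origin&gt;Boston&lt;/origin&gt;
  &lt;/emma:interpretation&gt;
&lt;/emma:emma&gt;
</pre>
								<p>Keeping such annotations in a separate namespace inside
								<code>emma:info</code> lets consumers that do not understand them
								skip them without affecting the standardized annotations.</p>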

								<p>The annotations of EMMA should be considered 'normative' in the

								sense that if an EMMA component produces annotations as described

								in <a href="#s3">Section 3</a> <span>and <a href="#s4">Section

								4</a></span>, these annotations must be represented using the EMMA

								syntax. The Multimodal Interaction Working Group may address in

								later drafts the issues of modularization and profiling; that is,

								which sets of annotations are to be supported by which classes of

								EMMA component.</p>

								<h3 id="s1.1">1.1 Uses of EMMA</h3>

								<p>The general purpose of EMMA is to represent information

								automatically extracted from a user's input by an interpretation

								component, where input is to be taken in the general sense of a

								meaningful user input in any modality supported by the platform.

								The reader should refer to the sample architecture in <span>W3C

								Multimodal Interaction Framework</span> <a href="#MMIF">[<span>MMI

								Framework</span>]</a>, which shows EMMA conveying content between

								user input modality components and an interaction manager.</p>

								<p>Components that generate EMMA markup:</p>

								<ol>

								<li>Speech recognizers</li>

								<li>Handwriting recognizers</li>

								<li>Natural language understanding engines</li>

								<li>Other input media interpreters (e.g. DTMF, pointing,

								keyboard)</li>

								<li>Multimodal integration component</li>

								</ol>

								<p>Components that use EMMA include:</p>

								<ol>

								<li>Interaction manager</li>

								<li>Multimodal integration component</li>

								</ol>

								<p>Although not a primary goal of EMMA, a platform may also choose

								to use this general format as the basis of a general semantic

								result that is carried along and filled out during each stage of

								processing. In addition, future systems may also potentially make

								use of this markup to convey abstract semantic content to be

								rendered into natural language by a natural language generation

								component.</p>

								<h3 id="s1.2">1.2 Terminology</h3>

								<dl>

								<dt id="anchor-point">anchor point</dt>

								<dd>When referencing an input interval with

								<code>emma:time-ref-uri</code>,

								<code>emma:time-ref-anchor-point</code> allows you to specify

								whether the referenced anchor is the start or end of the

								interval.</dd>

								<dt id="annotation">annotation</dt>

								<dd>Information about the interpreted input, for example,

								timestamps, confidence scores, links to raw input, etc.</dd>

								<dt id="composite-input">composite input</dt>

								<dd>An input formed from several pieces, often in different modes,

								for example, a combination of speech and pen gesture, such as

								saying "zoom in here" and circling a region on a map.</dd>

								<dt id="confidence">confidence</dt>

								<dd>A numerical score describing the degree of certainty in a

								particular interpretation of user input.</dd>

								<dt id="data-model">data model</dt>

								<dd>For EMMA, a data model defines a set of constraints on possible

								interpretations of user input.</dd>

								<dt id="derivation">derivation</dt>

								<dd>Interpretations of user input are said to be derived from that

								input, and higher level interpretations may be derived from lower

								level ones. EMMA allows you to reference the user input or

								interpretation a given interpretation was derived from, see

								<a href="#semantic-interpretation"><em>semantic

								interpretation</em></a>.</dd>

								<dt id="dialog">dialog</dt>

								<dd>For EMMA, dialog can be considered as a sequence of

								interactions between

								a user and the application.</dd>

								<dt id="endpoint">endpoint</dt>

								<dd>In EMMA, this refers to a network location which is the source

								or recipient of an EMMA document. It should be noted that the usage

								of the term "endpoint" in this context is different from the way

								that the term is used in speech processing, where it refers to the

								end of a speech input.</dd>

								<dt id="gestures">gestures</dt>

								<dd>In multimodal applications, gestures are communicative acts made

								by the user or application. An example is circling an area on a map

								to indicate a region of interest. Users may be able to gesture with

								a pen, keystrokes, hand movements, head

								movements, or sound. Gestures often form part of <a href=

								"#composite-input"><em>composite input</em></a>. Application

								gestures are typically animations and/or sound effects.</dd>

								<dt id="grammar">grammar</dt>

								<dd>A set of rules that describe a sequence of tokens expected in a

								given input. These can be used by speech and handwriting

								recognizers to increase recognition accuracy.</dd>

								<dt id="handwriting-recognition">handwriting recognition</dt>

								<dd>The process of converting pen strokes into text.</dd>

								<dt id="ink-recognition">ink recognition</dt>

								<dd>This includes the recognition of handwriting and pen

								gestures.</dd>

								<dt id="input-cost">input cost</dt>

								<dd>In EMMA, this refers to a numerical measure indicating the

								weight or processing cost associated with a user's input or part of

								their input.</dd>

								<dt id="input-device">input device</dt>

								<dd>The device providing a particular input, for example, a

								microphone, a pen, a mouse, a camera, or a keyboard.</dd>

								<dt id="input-function">input function</dt>

								<dd>In EMMA, this refers to <span>the</span> use a particular input

								is serving, for example, as part of a recording or transcription,

								as part of a dialog, or as a means to verify the user's

								identity.</dd>

								<dt id="input-medium">input medium</dt>

								<dd>Whether the input is acoustic, visual, or tactile. For
								instance, a spoken utterance is an example of an acoustic input, a
								hand gesture as seen by a camera is an example of a visual input,
								and pointing with a mouse or pen is an example of a tactile input.</dd>

								<dt id="input-mode">input mode</dt>

								<dd>This distinguishes a particular means of providing an input

								within a general input medium, for example, speech, DTMF, ink,
								keystrokes, video, photograph, etc.</dd>

								<dt id="input-source">input source</dt>

								<dd>This is the device that provided the input, for example a

								particular microphone or camera. EMMA allows you to identify these

								with a URI.</dd>

								<dt id="input-tokens">input tokens</dt>

								<dd>In EMMA, this refers to a sequence of characters, words or

								other discrete units of input.</dd>

								<dt id="instance-data">instance data</dt>

								<dd>A representation in XML of an interpretation of user

								input.</dd>

								<dt id="interaction-manager">interaction manager</dt>

								<dd>A processor that determines how an application interacts with a

								user. This can be at multiple levels of abstraction, for example,

								at a detailed level, determining what prompts to present to the

								user and what actions to take in response to user input, versus a

								higher level treatment in terms of goals and tasks for achieving

								those goals. Interaction managers are frequently event driven.</dd>

								<dt id="interpretation">interpretation</dt>

								<dd>In EMMA, an interpretation of user input refers to information

								derived from the user input that is meaningful to the

								application.</dd>

								<dt id="keystroke-input">keystroke input</dt>

								<dd>Input provided by the user pressing on a sequence of keys

								(buttons), such as a computer keyboard or keypad.</dd>

								<dt id="lattice">lattice</dt>

								<dd>A set of nodes interconnected with directed arcs such that by

								following an arc, you can never find yourself back at a node you

								have already visited (i.e. a directed acyclic graph). Lattices

								provide a flexible means to represent the results of speech and

								handwriting recognition, in terms of arcs representing words or

								character sequences. Different arcs from the same node represent

								different local hypotheses as to what the user said or wrote.</dd>

								<dt id="metadata">metadata</dt>

								<dd>Information describing another set of data, for instance, a

								library catalog card with information on the author, title and

								location of a book. EMMA is designed to support input processors in

								providing metadata for interpretations of user input.</dd>

								<dt id="multimodal-integration">multimodal integration</dt>

								<dd>The process of combining inputs from different modes to create

								an interpretation of composite input. This is also sometimes

								referred to as <em>multimodal fusion</em>.</dd>

								<dt id="multimodal-interaction">multimodal interaction</dt>

								<dd>The means for a user to interact with an application using more

								than one mode of interaction, for instance, offering the user the

								choice of speaking or typing, or in some cases, allowing the user

								to provide a composite input involving multiple modes.</dd>

								<dt id="natural-language-understanding">natural language

								understanding</dt>

								<dd>The process of interpreting text in terms that are useful for

								an application.</dd>

								<dt id="N-best-list">N-best list</dt>

								<dd>An N-best list is a list of the most likely hypotheses for what

								the user actually said or wrote, where N stands for an integral

								number such as 5 for the 5 most likely hypotheses.</dd>

								<dt id="raw-signal">raw signal</dt>

								<dd>An uninterpreted input, such as an audio waveform captured from

								a microphone.</dd>

								<dt id="semantic-interpretation">semantic interpretation</dt>

								<dd>A normalized representation of the meaning of a user input, for

								instance, mapping the speech for "San Francisco" into the airport

								code "SFO".</dd>

								<dt id="semantic-processor">semantic processor</dt>

								<dd>In EMMA, this refers to systems that can derive interpretations

								of user input, for instance, mapping the speech for "San Francisco"

								into the airport code "SFO".</dd>

								<dt id="signal-interpretation">signal interpretation</dt>

								<dd>The process of mapping a discrete or continuous signal into a

								symbolic representation that can be used by an application, for

								instance, transforming the audio waveform corresponding to someone

								saying "2005" into the number 2005.</dd>

								<dt id="speech-recognition">speech recognition</dt>

								<dd>The process of determining the textual transcription of a piece

								of speech.</dd>

								<dt id="speech-synthesis">speech synthesis</dt>

								<dd>The process of rendering a piece of text into the corresponding

								speech, i.e. synthesi<span>z</span>ing speech from text.</dd>

								<dt id="text-to-speech">text to speech</dt>

								<dd>The process of rendering a piece of text into the corresponding

								speech.</dd>

								<dt id="time-stamp">time stamp</dt>

								<dd>The time that a particular input or part of an input began or

								ended.</dd>

								<dt id="term-uri">URI: Uniform Resource Identifier</dt>

								<dd>A URI is a unifying syntax for the expression of names and

								addresses of objects on the network as used in the World Wide Web.

								<span>Within this specification, the term URI refers to a Uniform

								Resource Identifier as defined in [<a href="#RFC3986">RFC3986</a>]

								and extended in [<a href="#RFC3987">RFC3987</a>] with the new name

								IRI. The term URI has been retained in preference to IRI to avoid

								introducing new names for concepts such as "Base URI" that are

								defined or referenced across the whole family of XML

								specifications</span>. A URI is defined as any legal

								<code>anyURI</code> primitive as defined in XML Schema Part 2:

								Datatypes Second Edition Section 3.2.17 [<a href=

								"#XSD2">SCHEMA2</a>].</dd>

								<dt id="user-input">user input</dt>

								<dd>An input provided by a user as opposed to something generated

								automatically.</dd>

								</dl>

								<h2 id="s2">2. Structure of EMMA documents</h2>

								<p>This section is <span>I</span>nformative.</p>

								<p>As noted above, the main components of an interpreted user input

								in EMMA are the instance data, an optional data model, and the

								metadata annotations that may be applied to that input. The

								realization of these components in EMMA is as follows:</p>

								<ul>

								<li><b>instance data</b> is contained within an EMMA

								<i>interpretation</i></li>

								<li>the <b>data model</b> is optionally specified as an annotation

								of that instance</li>

								<li>EMMA <b>annotations</b> may be applied at different levels of

								an EMMA document.</li>

								</ul>

								<p>An EMMA <i>interpretation</i> is the primary unit for holding

								user input as interpreted by an EMMA processor. As will be seen

								below, multiple interpretations of a single input are possible.</p>

								<p>EMMA provides a simple structural syntax for the organization of

								interpretations and instances, and an annotative syntax to apply

								the annotation to the input data at different levels.</p>

								<p>An outline of the structural syntax and annotations found in

								EMMA documents is as follows. A fuller definition may be found in

								the description of individual elements and attributes in <a href=

								"#s3"><span>S</span>ection 3</a> and <a href=

								"#s4"><span>S</span>ection 4</a>.</p>

								<ul>

								<li><b><a href="#s3">EMMA <span>s</span>tructural

								<span>e</span>lements</a></b> (<a href="#s3">Section 3</a>)

								<ul>

								<li><b><a href="#s3.1">Root element</a></b>: The root node of an

								EMMA document, the <code>emma:emma</code> element, holds EMMA

								version and namespace information, and provides a container for one

								or more of the following interpretation and container elements

								(<a href="#s3.1">Section 3.1</a>)</li>

								<li><b><a href="#s3.2">Interpretation element</a></b>: The

								<code>emma:interpretation</code> element contains a given

								interpretation of the input and holds application specific markup

								(<a href="#s3.2">Section 3.2</a>)</li>

								<li><b><a href="#s3.3">Container elements</a>:</b>

								<ul>

								<li><code>emma:one-of</code> is a container for one or more

								interpretation elements or container elements and denotes that

								these are mutually exclusive interpretations (<a href=

								"#s3.3.1">Section 3.3.1</a>)</li>

								<li><code>emma:group</code> is a general container for one or more

								interpretation elements or container elements. It can be associated

								with arbitrary grouping criteria (<a href="#s3.3.2">Section

								3.3.2</a>).</li>

								<li><code>emma:sequence</code> is a container for one or more

								interpretation elements or container elements and denotes that

								these are sequential in time (<a href="#s3.3.3">Section

								3.3.3</a>).</li>

								</ul>

								</li>

								<li><b><a href="#s3.4">Lattice element</a></b>: The

								<code>emma:lattice</code> element is used to contain a series of

								<code>emma:arc</code> and <code>emma:node</code> elements that

								define a lattice of words, gestures, meanings or other symbols. The

								<code>emma:lattice</code> element appears within the

								<code>emma:interpretation</code> element (<a href="#s3.4">Section

								3.4</a>)</li>

								<li><b><a href="#s3.5">Literal element</a></b>: The

								<code>emma:literal</code> element is used as a wrapper when the

								application semantics is a string literal. (<a href="#s3.5">Section

								3.5</a>)</li>

								</ul>

								</li>

								<li><b><a href="#s4">EMMA annotations</a></b> (<a href=

								"#s4">Section 4</a>)

								<ul>

								<li><b><a href="#s4.1">EMMA annotation elements</a></b>: These are

								EMMA annotations such as <code>emma:derived-from</code>,

								<code>emma:endpoint-info</code>, and <code>emma:info</code> which

								are represented as elements so that they can occur more than once

								within an element and can contain internal structure. (<a href=

								"#s4.1">Section 4.1</a>)</li>

								<li><b><a href="#s4.2">EMMA annotation attributes</a></b>: These

								are EMMA annotations such as <code>emma:start</code>,

								<code>emma:end</code>, <code>emma:confidence</code>, and

								<code>emma:tokens</code> which are represented as attributes. They

								can appear on <code>emma:interpretation</code> elements<span>.

								S</span>ome can appear on container elements, lattice elements, and

								elements in the application-specific markup. (<a href=

								"#s4.2">Section 4.2</a>)</li>

								</ul>

								</li>

								</ul>

								<p>From the defined root node <code>emma:emma</code> the structure

								of an EMMA document consists of a tree of EMMA container elements

								(<code>emma:one-of</code>, <code>emma:sequence</code>,

								<code>emma:group</code>) terminating in a number of interpretation

								elements (<code>emma:interpretation</code>). The

								<code>emma:interpretation</code> elements serve as wrappers for

								either application namespace markup describing the interpretation

								of the user's input or an <code>emma:lattice</code> element or

								<code>emma:literal</code> element. A single

								<code>emma:interpretation</code> may also appear directly under the

								root node.</p>


								<p>

								The EMMA elements

								<code>emma:emma</code>,

								<code>emma:interpretation</code>,

								<code>emma:one-of</code>,

								and <code>emma:literal</code>

								and the EMMA attributes

								<code>emma:no-input</code>,

								<code>emma:uninterpreted</code>,

								<code>emma:medium</code>,

								and <code>emma:mode</code>

								are required of all

								implementations.  The remaining elements and attributes are optional

								and may be used in some implementations and not others, depending on the

								specific modalities and processing being represented.

								</p>


								<p>To illustrate this, here is an example <span class="new">of

								an</span> EMMA document <span class="new">representing</span> input

								to a flight reservation application. In this example there are two

								speech recognition results and associated semantic representations

								of the input. The system is uncertain whether the user meant

								"flights from Boston to Denver" or "flights from Austin to Denver".

								The annotations to be captured are timestamps and confidence scores

								for the two inputs.</p>

								<p>Example:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:one-of id="r1" emma:start="1087995961542" emma:end="1087995963542"

								<span>     emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;emma:interpretation id="int1" emma:confidence="0.75"

								    emma:tokens="flights from boston to denver"&gt;

								      &lt;origin&gt;Boston&lt;/origin&gt;

								      &lt;destination&gt;Denver&lt;/destination&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="int2" emma:confidence="0.68"

								    emma:tokens="flights from austin to denver"&gt;

								      &lt;origin&gt;Austin&lt;/origin&gt;

								      &lt;destination&gt;Denver&lt;/destination&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:one-of&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>Attributes on the root <code>emma:emma</code> element indicate

								the version and namespace. The <code>emma:emma</code> element

								contains an <code>emma:one-of</code> element which contains a

								disjunctive list of possible interpretations of the input. The

								actual semantic representation of each interpretation is within the

								application namespace. In the example here the application specific

								semantics involves elements <code>origin</code> and

								<code>destination</code> indicating the origin and destination

								cities for looking up a flight. The timestamp is the same for both

								interpretations and it is annotated using values in milliseconds in

								the <code>emma:start</code> and <code>emma:end</code> attributes on

								the <code>emma:one-of</code>. The confidence scores and tokens

								associated with each of the inputs are annotated using the EMMA

								annotation attributes <code>emma:confidence</code> and

								<code>emma:tokens</code> on each of the

								<code>emma:interpretation</code> elements.</p>
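
								<p>Interpretations need not hold application markup. As a further

								illustrative sketch (not a normative example), the same two

								hypotheses could be carried as plain strings, each wrapped in an

								<code>emma:literal</code> element. The <code>id</code> values, the

								confidence scores and the literal strings are assumptions chosen to

								mirror the example above:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"&gt;

								  &lt;emma:one-of id="r1" emma:medium="acoustic" emma:mode="voice"&gt;

								    &lt;emma:interpretation id="int1" emma:confidence="0.75"&gt;

								      &lt;emma:literal&gt;flights from boston to denver&lt;/emma:literal&gt;

								    &lt;/emma:interpretation&gt;

								    &lt;emma:interpretation id="int2" emma:confidence="0.68"&gt;

								      &lt;emma:literal&gt;flights from austin to denver&lt;/emma:literal&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:one-of&gt;

								&lt;/emma:emma&gt;

								</pre>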

								<h3 id="s2.1">2.<span>1</span> Data model</h3>

								<p>An EMMA data model expresses the constraints on the structure

								and content of instance data, for the purposes of validation. As

								such, the data model may be considered as a particular kind of

								annotation (although, unlike other EMMA annotations, it is not a

								feature pertaining <span>to</span> a specific user input at a

								specific moment in time, it is rather a static and, by its very

								definition, application-specific structure). <span>The</span>

								specification of <span>a data model</span> in EMMA is optional.</p>

								<p>Since Web applications today use different formats to specify

								data models, e.g. <span>XML Schema Part 1: Structures Second

								Edition</span> [<a href="#XSD1">XML Schema

								<span>Structures</span></a>], XForms <span>1.0 (Second

								Edition)</span> [<a href="#XFORMS">XFORMS</a>], <span>RELAX NG

								Specification</span> [<a href="#RELAXNG">RELAX-NG</a>], etc., EMMA

								itself is agnostic to the format of data model used.</p>

								<p>Data model definition and reference are described in <a href=

								"#s4.1.1">Section 4.1.1</a>.</p>

								<h3 id="s2.2">2.<span>2</span> EMMA namespace prefixes</h3>

								<p>An EMMA attribute is qualified with the EMMA namespace prefix if

								the attribute can also be used as an in-line annotation on elements

								in the application's namespace. Most of the EMMA annotation

								attributes in <a href="#s4.2">Section 4.2</a> are in this category.

								An EMMA attribute is not qualified with the EMMA namespace prefix

								if the attribute only appears on an EMMA element. This rule ensures

								consistent usage of the attributes across all examples.</p>

								<p>Attributes from other namespaces are permissible on all EMMA

								elements. As an example <code>xml:lang</code> may be used to

								annotate the human language of character data content.</p>
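
								<p>A minimal sketch of the prefixing rule, assuming the flight

								reservation markup used elsewhere in this document:

								<code>version</code> and <code>id</code> are unprefixed because they

								only appear on EMMA elements, while <code>emma:confidence</code>

								carries the prefix and is shown here in-line on application

								elements. Whether a particular annotation may appear on application

								markup is stated per attribute in <a href="#s4.2">Section 4.2</a>;

								the confidence values and the <code>xml:lang</code> value are

								invented for illustration.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="int1" xml:lang="en-US"

								      emma:medium="acoustic" emma:mode="voice"&gt;

								    &lt;origin emma:confidence="0.6"&gt;Boston&lt;/origin&gt;

								    &lt;destination emma:confidence="0.8"&gt;Denver&lt;/destination&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>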

								<h2 id="s3">3. EMMA structural elements</h2>

								<p>This section defines elements in the EMMA namespace which

								provide the structural syntax of EMMA documents.</p>

								<h3 id="s3.1">3.1 Root element: <code>emma:emma</code></h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:emma</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>The root element of an EMMA document.</td>

								</tr>

								<tr>

								<th>Children</th>

								<td>The <code>emma:emma</code> element MUST immediately contain a

								single <code>emma:interpretation</code> element or EMMA container

								element: <code>emma:one-of</code>, <code>emma:group</code>,

								<code>emma:sequence</code>. It MAY also contain an optional single

								<code>emma:derivation</code> element and an optional single

								<code>emma:info</code> annotation element. It MAY also contain

								multiple optional <code>emma:grammar</code> annotation elements,

								<code>emma:model</code> annotation elements, and

								<code>emma:endpoint-info</code> annotation elements.</td>

								</tr>

								<tr>

								<th>Attributes</th>

								<td>

								<ul>

								<li><b>Required</b>:

								<ul>

								<li><code>version</code>: the version of EMMA used for the

								interpretation(s). Interpretations expressed using this

								specification MUST use <code>1.0</code> for the value.</li>

								<li>Namespace declaration for EMMA, see below.</li>

								</ul>

								</li>

								<li><b>Optional</b>:

								<ul>

								<li>any other namespace declarations for application specific

								namespaces.</li>

								</ul>

								</li>

								</ul>

								</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td>None</td>

								</tr>

								</tbody>

								</table>

								<p>The root element of an EMMA document is named

								<code>emma:emma</code>. It holds a single

								<code>emma:interpretation</code> or EMMA container element

								(<code>emma:one-of</code>, <code>emma:sequence</code>,

								<code>emma:group</code>). It MAY also contain a single

								<code>emma:derivation</code> element containing earlier stages of

								the processing of the input (See <a href="#s4.1.2">Section

								4.1.2</a>). It MAY also contain an optional single annotation

								element: <code>emma:info</code> and multiple optional

								<code>emma:grammar</code>, <code>emma:model</code>, and

								<code>emma:endpoint-info</code> elements.</p>

								<p>It MAY hold attributes for information pertaining to EMMA

								itself, along with any namespaces which are declared for the entire

								document, and any other EMMA annotative data. The

								<code>emma:emma</code> element and other elements and attributes

								defined in this specification belong to the XML namespace

								identified by the URI "http://www.w3.org/2003/04/emma". In the

								examples, the EMMA namespace is generally declared using the

								attribute <code>xmlns:emma</code> on the root

								<code>emma:emma</code> element. EMMA processors MUST support the

								full range of ways of declaring XML namespaces as defined by the

								<span>Namespaces in XML 1.1 (Second Edition)</span> [<a href=

								"#XMLNS">XMLNS</a>]. Application markup MAY be declared in an

								explicit application namespace, or an undefined namespace

								(equivalent to setting xmlns="").</p>

								<p>For example:</p>

								<pre class="example">

								&lt;emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma"&gt;

								    ....

								&lt;/emma:emma&gt;

								</pre>

								<p>or</p>

								<pre class="example">

								&lt;emma version="1.0" xmlns="http://www.w3.org/2003/04/emma"&gt;

								    ....

								&lt;/emma&gt;

								</pre>
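
								<p>The choice between an explicit application namespace and an

								undefined namespace can be sketched in the same way. In the first

								fragment the application elements use a prefix bound to an

								application namespace; in the second, no default namespace is

								declared, so the unprefixed application elements are in no

								namespace. The prefix <code>app</code>, the namespace URI and the

								element names are assumptions for illustration only.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:app="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="int1" emma:medium="acoustic" emma:mode="voice"&gt;

								    &lt;app:origin&gt;Boston&lt;/app:origin&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"&gt;

								  &lt;emma:interpretation id="int1" emma:medium="acoustic" emma:mode="voice"&gt;

								    &lt;origin&gt;Boston&lt;/origin&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>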

								<h3 id="s3.2">3.2 Interpretation element:

								<code>emma:interpretation</code></h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:interpretation</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>The <code>emma:interpretation</code> element acts as a wrapper

								for application instance data or lattices.</td>

								</tr>

								<tr>

								<th>Children</th>

								<td>The <code>emma:interpretation</code> element MUST immediately

								contain either application instance data, or a single

								<code>emma:lattice</code> element, or a single

								<code>emma:literal</code> element, or in the case of uninterpreted

								input or no input <code>emma:interpretation</code>

								<span>MUST</span> be empty. It MAY also contain <span>multiple

								optional</span> <code>emma:derived-from</code>

								element<span>s</span> and <span>an optional single</span>

								<code>emma:info</code> <span>element</span>.</td>

								</tr>

								<tr>

								<th>Attributes</th>

								<td>

								<ul>

								<li><b>Required</b>: Attribute <code>id</code> of type

								<code>xsd:ID</code> that uniquely identifies the interpretation

								within the EMMA document.</li>

								<li><b>Optional</b>: The annotation attributes:

								<code>emma:tokens</code>, <code>emma:process</code>,

								<code>emma:no-input</code>, <code>emma:uninterpreted</code>,

								<code>emma:lang</code>, <code>emma:signal</code>,

								<code><span>emma:signal-size</span></code>,

								<code>emma:media-type</code>, <code>emma:confidence</code>,

								<code>emma:source</code>, <code>emma:start</code>,

								<code>emma:end</code>, <code>emma:time-ref-uri</code>,

								<code>emma:time-ref-anchor-point</code>,

								<code>emma:offset-to-start</code>, <code>emma:duration</code>,

								<code>emma:medium</code>, <code>emma:mode</code>,

								<code>emma:function</code>, <code>emma:verbal</code>,

								<code>emma:cost</code>, <code>emma:grammar-ref</code>,

								<code>emma:endpoint-info-ref</code>, <code>emma:model-ref</code>,

								<code>emma:dialog-turn</code>.</li>

								</ul>

								</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td>The <code>emma:interpretation</code> element is legal only as a

								child of <code>emma:emma</code>, <code>emma:group</code>,

								<code>emma:one-of</code>, <code>emma:sequence</code>, or

								<code>emma:derivation</code>.</td>

								</tr>

								</tbody>

								</table>

								<p>The <code>emma:interpretation</code> element holds a single

								interpretation represented in application specific markup, or a

								single <code>emma:lattice</code> element, or a single

								<code>emma:literal</code> element.</p>

								<p>The <code>emma:interpretation</code> element MUST be empty if it

								is marked with <code>emma:no-input="true"</code> <span>(<a href=

								"#s4.2.3">Section 4.2.3</a>)</span>. The

								<code>emma:interpretation</code> element <span>MUST</span> be empty

								if it has been annotated with

								<code>emma:uninterpreted="true"</code> <span>(<a href=

								"#s4.2.4">Section 4.2.4</a>)</span> or

								<code>emma:function="recording"</code> <span>(<a href=

								"#s4.2.11">Section 4.2.11</a>)</span>.</p>

								<p>Attributes:</p>

								<ol>

								<li><b>id</b> a REQUIRED <code>xsd:ID</code> value that uniquely

								identifies the interpretation within the EMMA document.</li>

								</ol>

								<pre class="example">

								&lt;emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="r1" emma:medium="acoustic" emma:mode="voice"&gt;

								    ...

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>
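
								<p>For the cases in which <code>emma:interpretation</code> has to

								be empty, a sketch of the no-input case might look as follows. The

								<code>id</code> value and the choice of medium and mode are

								assumptions for illustration.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"&gt;

								  &lt;emma:interpretation id="int1" emma:no-input="true"

								      emma:medium="acoustic" emma:mode="voice"/&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>An input that was received but could not be interpreted would,

								under the same assumptions, be marked with

								<code>emma:uninterpreted</code> instead:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"&gt;

								  &lt;emma:interpretation id="int1" emma:uninterpreted="true"

								      emma:medium="acoustic" emma:mode="voice"/&gt;

								&lt;/emma:emma&gt;

								</pre>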

								<p>While <code>emma:medium</code> and <code>emma:mode</code> are

								optional on <code>emma:interpretation</code>, note that all EMMA

								interpretations must be annotated for <code>emma:medium</code> and

								<code>emma:mode</code>, so either these attributes must appear

								directly on <code>emma:interpretation</code> or they must appear on

								an ancestor <code>emma:one-of</code> node or they must appear on an

								earlier stage of the derivation listed in

								<code>emma:derivation</code>.</p>
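
								<p>The last of these options can be sketched as follows. The final

								interpretation carries no <code>emma:medium</code> or

								<code>emma:mode</code> itself; they appear on the earlier stage held

								in <code>emma:derivation</code>. This is an illustrative sketch

								only: the <code>id</code> values are invented, and the

								<code>resource</code> and <code>composite</code> attributes of

								<code>emma:derived-from</code> are assumed here and should be

								checked against <a href="#s4.1.2">Section 4.1.2</a>.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="nlu1"&gt;

								    &lt;emma:derived-from resource="#asr1" composite="false"/&gt;

								    &lt;origin&gt;Boston&lt;/origin&gt;

								    &lt;destination&gt;Denver&lt;/destination&gt;

								  &lt;/emma:interpretation&gt;

								  &lt;emma:derivation&gt;

								    &lt;emma:interpretation id="asr1"

								        emma:medium="acoustic" emma:mode="voice"&gt;

								      &lt;emma:literal&gt;flights from boston to denver&lt;/emma:literal&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:derivation&gt;

								&lt;/emma:emma&gt;

								</pre>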

								<h3 id="s3.3">3.3 Container elements</h3>

								<h3 id="s3.3.1">3.3.1 <code>emma:one-of</code> element</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:one-of</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>A container element indicating a disjunction among a collection

								of mutually exclusive interpretations of the input.</td>

								</tr>

								<tr>

								<th>Children</th>

								<td>The <code>emma:one-of</code> element MUST immediately contain a

								collection of one or more <code>emma:interpretation</code> elements

								or container elements: <code>emma:one-of</code>,

								<code>emma:group</code>, <code>emma:sequence</code>. It MAY also

								contain <span>multiple optional</span>

								<code>emma:derived-from</code> element<span>s</span> and <span>an

								optional single</span> <code>emma:info</code>

								<span>element</span>.</td>

								</tr>

								<tr>

								<th>Attributes</th>

								<td>

								<ul>

								<li><b>Required</b>:

								<ul>

								<li>Attribute <code>id</code> of type <code>xsd:ID</code></li>

								<li>The attribute <code>disjunction-type</code> MUST be present if

								<code>emma:one-of</code> is embedded within

								<code>emma:one-of</code>. <span>The possible values of

								<code>disjunction-type</code> are {<code>recognition</code>,

								<code>understanding</code>, <code>multi-device</code>, and

								<code>multi-process</code>}.</span></li>

								</ul>

								</li>

								<li><b>Optional</b>:

								<ul>

								<li>On a single non-embedded <code>emma:one-of</code> the attribute

								<code>disjunction-type</code> is optional.</li>

								<li>The following annotation attributes are optional:

								<code>emma:tokens</code>, <code>emma:process</code>,

								<code>emma:lang</code>, <code>emma:signal</code>,

								<code><span>emma:signal-size</span></code>,

								<code>emma:media-type</code>, <code>emma:confidence</code>,

								<code>emma:source</code>, <code>emma:start</code>,

								<code>emma:end</code>, <code>emma:time-ref-uri</code>,

								<code>emma:time-ref-anchor-point</code>,

								<code>emma:offset-to-start</code>, <code>emma:duration</code>,

								<code>emma:medium</code>, <code>emma:mode</code>,

								<code>emma:function</code>, <code>emma:verbal</code>,

								<code>emma:cost</code>, <code>emma:grammar-ref</code>,

								<code>emma:endpoint-info-ref</code>, <code>emma:model-ref</code>,

								<code>emma:dialog-turn</code>.</li>

								</ul>

								</li>

								</ul>

								</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td>The <code>emma:one-of</code> element MAY only appear as a child

								of <code>emma:emma</code>, <code>emma:one-of</code>,

								<code>emma:group</code>, <code>emma:sequence</code>, or

								<code>emma:derivation</code>.</td>

								</tr>

								</tbody>

								</table>

								<p>The <code>emma:one-of</code> element acts as a container for a

								collection of one or more interpretation

								(<code>emma:interpretation</code>) or container elements

								(<code>emma:one-of</code>, <code>emma:group</code>,

								<code>emma:sequence</code>), and denotes that these are mutually

								exclusive interpretations.</p>

								<p>An N-best list of choices in EMMA MUST be represented as a set

								of <code>emma:interpretation</code> elements contained within an

								<code>emma:one-of</code> element. For instance, a series of

								different recognition results in speech recognition might be

								represented in this way.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:one-of id="r1" <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;emma:interpretation id="int1"&gt;

								      &lt;origin&gt;Boston&lt;/origin&gt;

								      &lt;destination&gt;Denver&lt;/destination&gt;

								      &lt;date&gt;03112003&lt;/date&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="int2"&gt;

								      &lt;origin&gt;Austin&lt;/origin&gt;

								      &lt;destination&gt;Denver&lt;/destination&gt;

								      &lt;date&gt;03112003&lt;/date&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:one-of&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>The function of the <code>emma:one-of</code> element is to

								represent a disjunctive list of possible interpretations of a user

								input. A disjunction of possible interpretations of an input can be

								the result of different kinds of processing or ambiguity. One

								source is multiple results from a recognition technology such as

								speech or handwriting recognition. Multiple results can also occur

								from parsing or understanding natural language. Another possible

								source of ambiguity is from the application of multiple different

								kinds of recognition or understanding components to the same input

								signal. For example, a single ink input signal might be processed

								by both handwriting recognition and gesture recognition. Another is

								the use of more than one recording device for the same input

								(multiple microphones).</p>

								<p>In order to make explicit these different kinds of multiple

								interpretations and allow for concise statement of the annotations

								associated with each, the <code>emma:one-of</code> element MAY

								appear within another <code>emma:one-of</code> element. If

								<code>emma:one-of</code> elements are nested then they MUST

								indicate the kind of disjunction using the attribute

								<code>disjunction-type</code>. The values of

								<code>disjunction-type</code> are <code>{recognition,

								understanding, multi-device, and multi-process}</code>. For the

								most common use case, where there are multiple recognition results

								and some of them have multiple interpretations, the top-level

								<code>emma:one-of</code> is

								<code>disjunction-type="recognition"</code> and the embedded

								<code>emma:one-of</code> has the attribute

								<code>disjunction-type="understanding"</code>.</p>
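
								<p>A single, non-nested <code>emma:one-of</code> can also state its

								kind of disjunction. As a sketch of the ink case mentioned earlier,

								where one signal is handled by two different processes, the result

								might be written as follows. The <code>emma:process</code> URIs, the

								<code>tactile</code> and <code>ink</code> values, the

								<code>command</code> element and the <code>id</code> values are all

								assumptions for illustration.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:one-of id="r1" disjunction-type="multi-process"

								      emma:medium="tactile" emma:mode="ink"&gt;

								    &lt;emma:interpretation id="int1"

								        emma:process="http://www.example.com/handwriting"&gt;

								      &lt;emma:literal&gt;O&lt;/emma:literal&gt;

								    &lt;/emma:interpretation&gt;

								    &lt;emma:interpretation id="int2"

								        emma:process="http://www.example.com/gesture"&gt;

								      &lt;command&gt;select&lt;/command&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:one-of&gt;

								&lt;/emma:emma&gt;

								</pre>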

								<p>As an example, suppose that in an interactive flight reservation

								application recognition yielded 'Boston' or 'Austin', and that each

								had a semantic interpretation as either the assertion of a city name

								or the specification of a flight query with the city as the

								destination. This would be represented as follows in EMMA:</p>

								<pre class="example">

								<span>

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:one-of id="r1" disjunction-type="recognition"

								      emma:start="12457990" emma:end="12457995"

								      <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								     &lt;emma:one-of id="r2" disjunction-type="understanding"

								         emma:tokens="boston"&gt;

								       &lt;emma:interpretation id="int1"&gt;

								          &lt;assert&gt;&lt;city&gt;boston&lt;/city&gt;&lt;/assert&gt;

								       &lt;/emma:interpretation&gt;

								       &lt;emma:interpretation id="int2"&gt;

								          &lt;flight&gt;&lt;dest&gt;&lt;city&gt;boston&lt;/city&gt;&lt;/dest&gt;&lt;/flight&gt;

								       &lt;/emma:interpretation&gt;

								     &lt;/emma:one-of&gt;

								     &lt;emma:one-of id="r3" disjunction-type="understanding"

								         emma:tokens="austin"&gt;

								       &lt;emma:interpretation id="int3"&gt;

								          &lt;assert&gt;&lt;city&gt;austin&lt;/city&gt;&lt;/assert&gt;

								       &lt;/emma:interpretation&gt;

								       &lt;emma:interpretation id="int4"&gt;

								          &lt;flight&gt;&lt;dest&gt;&lt;city&gt;austin&lt;/city&gt;&lt;/dest&gt;&lt;/flight&gt;

								       &lt;/emma:interpretation&gt;

								     &lt;/emma:one-of&gt;

								  &lt;/emma:one-of&gt;

								&lt;/emma:emma&gt;

								</span>

								</pre>
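								<p>As a rough illustration of how a consumer might use this nesting, the
								following Python sketch reads a trimmed copy of the example above and
								groups the understanding alternatives under each recognition alternative.
								The function name <code>alternatives_by_recognition</code> and the use of
								Python's standard <code>xml.etree.ElementTree</code> module are
								assumptions made for illustration; they are not part of EMMA.</p>

```python
import xml.etree.ElementTree as ET

EMMA = "{http://www.w3.org/2003/04/emma}"

# Trimmed copy of the nested example above (schema and timing attributes omitted).
DOC = """
<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma"
    xmlns="http://www.example.com/example">
  <emma:one-of disjunction-type="recognition"
      emma:medium="acoustic" emma:mode="voice">
    <emma:one-of disjunction-type="understanding" emma:tokens="boston">
      <emma:interpretation><assert><city>boston</city></assert></emma:interpretation>
      <emma:interpretation><flight><dest><city>boston</city></dest></flight></emma:interpretation>
    </emma:one-of>
    <emma:one-of disjunction-type="understanding" emma:tokens="austin">
      <emma:interpretation><assert><city>austin</city></assert></emma:interpretation>
      <emma:interpretation><flight><dest><city>austin</city></dest></flight></emma:interpretation>
    </emma:one-of>
  </emma:one-of>
</emma:emma>
"""


def alternatives_by_recognition(root):
    """Map each recognized token string to the element names of its readings."""
    result = {}
    top = root.find(EMMA + "one-of")  # the disjunction-type="recognition" level
    for inner in top.findall(EMMA + "one-of"):  # one per recognition result
        tokens = inner.get(EMMA + "tokens")
        readings = []
        for interp in inner.findall(EMMA + "interpretation"):
            payload = interp[0]  # first child: application-namespace instance data
            readings.append(payload.tag.split("}")[-1])  # drop the namespace part
        result[tokens] = readings
    return result


grouped = alternatives_by_recognition(ET.fromstring(DOC))
print(grouped)
```

								<p>Because the structure is preserved, a later processing stage can
								still tell which alternatives differ in recognition and which differ
								only in understanding.</p>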

								<p>EMMA MAY explicitly represent ambiguity resulting from different

								processes, devices, or sources using embedded

								<code>emma:one-of</code> and the <code>disjunction-type</code>

								attribute. Multiple different interpretations resulting from

								different factors MAY also be listed within a single unstructured

								<code>emma:one-of</code>, though in this case it is more complex or

								impossible to uncover the sources of the ambiguity if required by

								later stages of processing. If there is no embedding in

								<code>emma:one-of</code>, then the <code>disjunction-type</code>

								attribute is not required. If the <code>disjunction-type</code>

								attribute is missing then by default the source of disjunction is

								unspecified.</p>

								<p>The example case above could also be represented as:</p>

								<pre class="example">

								<span>

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:one-of  emma:start="12457990" emma:end="12457995"

								<span>         emma:medium="acoustic" emma:mode="voice"</span>&gt;

								     &lt;emma:interpretation emma:tokens="boston"&gt;

								        &lt;assert&gt;&lt;city&gt;boston&lt;/city&gt;&lt;/assert&gt;

								     &lt;/emma:interpretation&gt;

								     &lt;emma:interpretation emma:tokens="boston"&gt;

								        &lt;flight&gt;&lt;dest&gt;&lt;city&gt;boston&lt;/city&gt;&lt;/dest&gt;&lt;/flight&gt;

								     &lt;/emma:interpretation&gt;

								     &lt;emma:interpretation emma:tokens="austin"&gt;

								        &lt;assert&gt;&lt;city&gt;austin&lt;/city&gt;&lt;/assert&gt;

								     &lt;/emma:interpretation&gt;

								     &lt;emma:interpretation emma:tokens="austin"&gt;

								        &lt;flight&gt;&lt;dest&gt;&lt;city&gt;austin&lt;/city&gt;&lt;/dest&gt;&lt;/flight&gt;

								     &lt;/emma:interpretation&gt;

								  &lt;/emma:one-of&gt;

								&lt;/emma:emma&gt;

								</span>

								</pre>

								<p>But in this case information about which interpretations

								resulted from speech recognition and which resulted from language

								understanding is lost.</p>
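								<p>The loss of information can be made concrete with a small Python
								sketch that flattens a nested <code>emma:one-of</code> into a single
								unstructured one. The helper name <code>flatten</code> is hypothetical,
								and copying <code>emma:tokens</code> down onto each interpretation is
								one possible choice rather than something EMMA prescribes.</p>

```python
import copy
import xml.etree.ElementTree as ET

EMMA = "{http://www.w3.org/2003/04/emma}"

# Trimmed copy of the nested representation (schema and timing attributes omitted).
NESTED = """
<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma"
    xmlns="http://www.example.com/example">
  <emma:one-of disjunction-type="recognition"
      emma:medium="acoustic" emma:mode="voice">
    <emma:one-of disjunction-type="understanding" emma:tokens="boston">
      <emma:interpretation><assert><city>boston</city></assert></emma:interpretation>
      <emma:interpretation><flight><dest><city>boston</city></dest></flight></emma:interpretation>
    </emma:one-of>
    <emma:one-of disjunction-type="understanding" emma:tokens="austin">
      <emma:interpretation><assert><city>austin</city></assert></emma:interpretation>
      <emma:interpretation><flight><dest><city>austin</city></dest></flight></emma:interpretation>
    </emma:one-of>
  </emma:one-of>
</emma:emma>
"""


def flatten(top):
    """Return a new, unnested emma:one-of holding every embedded interpretation."""
    # Keep the annotations of the outer container, but drop disjunction-type:
    # once the embedding is gone the source of the disjunction is unspecified.
    attrs = {k: v for k, v in top.attrib.items() if k != "disjunction-type"}
    flat = ET.Element(EMMA + "one-of", attrs)
    for inner in top.findall(EMMA + "one-of"):
        tokens = inner.get(EMMA + "tokens")
        for interp in inner.findall(EMMA + "interpretation"):
            moved = copy.deepcopy(interp)
            if tokens is not None:
                moved.set(EMMA + "tokens", tokens)  # keep the recognized string
            flat.append(moved)
    return flat


top = ET.fromstring(NESTED).find(EMMA + "one-of")
flat = flatten(top)
flat_tokens = [interp.get(EMMA + "tokens") for interp in flat]
print(flat_tokens)
```

								<p>After flattening, the four interpretations are siblings, and nothing
								in the structure records that the first two share a recognition
								result while the last two share another.</p>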

								<p>A list of <code>emma:interpretation</code> elements within an

								<code>emma:one-of</code> MUST be sorted best-first by some measure

								of quality. The quality measure is <code>emma:confidence</code> if

								present; otherwise, the quality metric is platform-specific.</p>

								<p>With embedded <code>emma:one-of</code> structures there is no

								requirement for the confidence scores within different

								<code>emma:one-of</code> to be on the same scale. For example, the

								scores assigned by handwriting recognition might not be comparable

								to those assigned by gesture recognition. Similarly, if multiple

								recognizers are used there is no guarantee that their confidence

								scores will be comparable. For this reason the ordering requirement

								on <code>emma:interpretation</code> within <code>emma:one-of</code>

								only applies locally to sister <code>emma:interpretation</code>

								elements within each <code>emma:one-of</code>. There is no

								requirement on the ordering of embedded <code>emma:one-of</code>

								elements within a higher <code>emma:one-of</code> element.</p>
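								<p>The local nature of the ordering requirement can be sketched in
								Python as follows. The document, its <code>id</code> values, and its
								confidence scores are invented for illustration, and the function names
								<code>locally_sorted</code> and <code>check_document</code> are
								hypothetical. The sketch only inspects interpretations that carry
								<code>emma:confidence</code>, since a platform-specific quality metric
								cannot be checked generically.</p>

```python
import xml.etree.ElementTree as ET

EMMA = "{http://www.w3.org/2003/04/emma}"

# Hypothetical document: ids and confidence values are invented for illustration.
DOC = """
<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma"
    xmlns="http://www.example.com/example">
  <emma:one-of id="r1" disjunction-type="multi-process"
      emma:medium="tactile" emma:mode="ink">
    <emma:one-of id="handwriting" disjunction-type="recognition">
      <emma:interpretation id="h1" emma:confidence="0.9"><text>boston</text></emma:interpretation>
      <emma:interpretation id="h2" emma:confidence="0.8"><text>austin</text></emma:interpretation>
    </emma:one-of>
    <emma:one-of id="gesture" disjunction-type="recognition">
      <emma:interpretation id="g1" emma:confidence="0.4"><shape>line</shape></emma:interpretation>
      <emma:interpretation id="g2" emma:confidence="0.5"><shape>circle</shape></emma:interpretation>
    </emma:one-of>
  </emma:one-of>
</emma:emma>
"""


def locally_sorted(one_of):
    """True if sibling interpretations with emma:confidence are best-first."""
    scores = [
        float(child.get(EMMA + "confidence"))
        for child in one_of.findall(EMMA + "interpretation")  # direct children only
        if child.get(EMMA + "confidence") is not None
    ]
    return all(a >= b for a, b in zip(scores, scores[1:]))


def check_document(root):
    """Check each emma:one-of on its own; scores are never compared across containers."""
    return {o.get("id"): locally_sorted(o) for o in root.iter(EMMA + "one-of")}


report = check_document(ET.fromstring(DOC))
print(report)
```

								<p>In this invented document the <code>gesture</code> container is
								reported as out of order, while the fact that its scores are lower than
								those in <code>handwriting</code> is deliberately ignored.</p>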

								<p>While <code>emma:medium</code> and <code>emma:mode</code> are

								optional on <code>emma:one-of</code>, note that all EMMA

								interpretations must be annotated for <code>emma:medium</code> and

								<code>emma:mode</code>, so either these annotations must appear

								directly on all of the contained <code>emma:interpretation</code>

								elements within the <code>emma:one-of</code>, or they must appear

								on the <code>emma:one-of</code> element itself, or they must appear

								on an ancestor <code>emma:one-of</code> element, or they must

								appear on an earlier stage of the derivation listed in

								<code>emma:derivation</code>.</p>
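								<p>A consumer that needs the effective <code>emma:medium</code> and
								<code>emma:mode</code> of an interpretation might resolve them roughly as
								in the Python sketch below. The function name
								<code>resolve_medium_mode</code> and the sample documents are
								assumptions for illustration. The sketch looks only at the
								interpretation itself and its ancestor <code>emma:one-of</code>
								elements; it does not follow <code>emma:derivation</code>.</p>

```python
import xml.etree.ElementTree as ET

EMMA = "{http://www.w3.org/2003/04/emma}"

# Hypothetical document: annotations appear only on the outermost emma:one-of.
DOC = """
<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma"
    xmlns="http://www.example.com/example">
  <emma:one-of id="r1" disjunction-type="recognition"
      emma:medium="acoustic" emma:mode="voice">
    <emma:one-of id="u1" disjunction-type="understanding">
      <emma:interpretation id="int1"><assert><city>boston</city></assert></emma:interpretation>
      <emma:interpretation id="int2"><flight><dest><city>boston</city></dest></flight></emma:interpretation>
    </emma:one-of>
  </emma:one-of>
</emma:emma>
"""

# Hypothetical document with no medium or mode annotation anywhere.
BARE = """
<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma"
    xmlns="http://www.example.com/example">
  <emma:interpretation id="bare"><city>boston</city></emma:interpretation>
</emma:emma>
"""


def resolve_medium_mode(root):
    """Map each interpretation id to its (medium, mode), or None where not found."""
    # ElementTree has no parent pointers, so build a child-to-parent map first.
    parents = {child: parent for parent in root.iter() for child in parent}
    resolved = {}
    for interp in root.iter(EMMA + "interpretation"):
        found = {}
        node = interp
        while node is not None:
            if node is interp or node.tag == EMMA + "one-of":
                for name in ("medium", "mode"):
                    value = node.get(EMMA + name)
                    if value is not None and name not in found:
                        found[name] = value  # the nearest annotation wins
            node = parents.get(node)
        resolved[interp.get("id")] = (found.get("medium"), found.get("mode"))
    return resolved


resolved = resolve_medium_mode(ET.fromstring(DOC))
missing = resolve_medium_mode(ET.fromstring(BARE))
print(resolved)
print(missing)
```

								<p>A result of <code>(None, None)</code> signals that the sketch found no
								annotation, in which case a fuller implementation would go on to consult
								the derivation.</p>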

								<h3 id="s3.3.2">3.3.2 <code>emma:group</code> element</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:group</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>A container element indicating that a number of interpretations

								of distinct user inputs are grouped according to some

								criteria.</td>

								</tr>

								<tr>

								<th>Children</th>

								<td>The <code>emma:group</code> element MUST immediately contain a

								collection of one or more <code>emma:interpretation</code> elements

								or container elements: <code>emma:one-of</code>,

								<code>emma:group</code>, <code>emma:sequence</code>. It MAY also

								contain an <span>optional single</span>

								<code>emma:group-info</code> element. It MAY also contain

								<span>multiple optional</span> <code>emma:derived-from</code>

								element<span>s</span> and <span>an optional single</span>

								<code>emma:info</code> <span>element</span>.</td>

								</tr>

								<tr>

								<th>Attributes</th>

								<td>

								<ul>

								<li><b>Required</b>: Attribute <code>id</code> of type

								<code>xsd:ID</code></li>

								<li><b>Optional</b>: The annotation attributes:

								<code>emma:tokens</code>, <code>emma:process</code>,

								<code>emma:lang</code>, <code>emma:signal</code>,

								<code><span>emma:signal-size</span></code>,

								<code>emma:media-type</code>, <code>emma:confidence</code>,

								<code>emma:source</code>, <code>emma:start</code>,

								<code>emma:end</code>, <code>emma:time-ref-uri</code>,

								<code>emma:time-ref-anchor-point</code>,

								<code>emma:offset-to-start</code>, <code>emma:duration</code>,

								<code>emma:medium</code>, <code>emma:mode</code>,

								<code>emma:function</code>, <code>emma:verbal</code>,

								<code>emma:cost</code>, <code>emma:grammar-ref</code>,

								<code>emma:endpoint-info-ref</code>, <code>emma:model-ref</code>,

								<code>emma:dialog-turn</code>.</li>

								</ul>

								</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td>The <code>emma:group</code> element is legal only as a child of

								<code>emma:emma</code>, <code>emma:one-of</code>,

								<code>emma:group</code>, <code>emma:sequence</code>, or

								<code>emma:derivation</code>.</td>

								</tr>

								</tbody>

								</table>

								<p>The <code>emma:group</code> element is used to indicate that the

								contained interpretations are from distinct user inputs that are

								related in some manner. <code>emma:group</code> MUST NOT be used

								for containing the multiple stages of processing of a single user

								input. Those MUST be contained in the <code>emma:derivation</code>

								element instead <span>(<a href="#s4.1.2">Section 4.1.2</a>)</span>.

								For groups of inputs in temporal order the more specialized

								container <code>emma:sequence</code> MUST be used <span>(<a href=

								"#s3.3.3">Section 3.3.3</a>)</span>. The following example shows

								three interpretations derived from the speech input "Move this

								ambulance here" and the tactile input related to two consecutive

								points on a map.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:group id="grp"

								      emma:start="1087995961542"

								      emma:end="1087995964542"&gt;

								    &lt;emma:interpretation id="int1"

								      <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								      &lt;action&gt;move&lt;/action&gt;

								      &lt;object&gt;ambulance&lt;/object&gt;

								      &lt;destination&gt;here&lt;/destination&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="int2"

								      <span>emma:medium="tactile" emma:mode="ink"</span>&gt;

								      &lt;x&gt;0.253&lt;/x&gt;

								      &lt;y&gt;0.124&lt;/y&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="int3"

								      <span>emma:medium="tactile" emma:mode="ink"</span>&gt;

								      &lt;x&gt;0.866&lt;/x&gt;

								      &lt;y&gt;0.724&lt;/y&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:group&gt;

								&lt;/emma:emma&gt;


								</pre>

								<p>The <code>emma:one-of</code> and <code>emma:group</code>

								containers MAY be nested arbitrarily.</p>

								<h4 id="s3.3.2.1">3.3.2.1 Indirect grouping criteria:

								<code>emma:group-info</code> element</h4>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:group-info</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>The <code>emma:group-info</code> element contains or references

								criteria used in establishing the grouping of interpretations in an

								<code>emma:group</code> element.</td>

								</tr>

								<tr>

								<th>Children</th>

								<td>The <code>emma:group-info</code> element MUST either

								immediately contain inline instance data specifying grouping

								criteria or have the attribute <code>ref</code> referencing the

								criteria.</td>

								</tr>

								<tr>

								<th>Attributes</th>

								<td>

								<ul>

								<li><b>Optional</b>: <code>ref</code> of type

								<code>xsd:anyURI</code> referencing the grouping criteria;

								alternatively the criteria MAY be provided inline as the content of

								the <code>emma:group-info</code> element.</li>

								</ul>

								</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td>The <code>emma:group-info</code> element is legal only as a

								child of <code>emma:group</code>.</td>

								</tr>

								</tbody>

								</table>

								<p>Sometimes it may be convenient to indirectly associate a given

								group with information, such as grouping criteria. The

								<code>emma:group-info</code> element might be used to make explicit

								the criteria by which members of a group are associated. In the

								following example, a group of two points is associated with a

								description of grouping criteria based upon a sliding temporal

								window of two seconds duration.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"

								    xmlns:ex="http://www.example.com/ns/group"&gt;

								  &lt;emma:group id="grp"&gt;

								    &lt;emma:group-info&gt;

								      &lt;ex:mode&gt;temporal&lt;/ex:mode&gt;

								      &lt;ex:duration&gt;2s&lt;/ex:duration&gt;

								    &lt;/emma:group-info&gt;


								    &lt;emma:interpretation id="int1"

								<span>      emma:medium="tactile" emma:mode="ink"</span>&gt;

								      &lt;x&gt;0.253&lt;/x&gt;

								      &lt;y&gt;0.124&lt;/y&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="int2"

								      <span>emma:medium="tactile" emma:mode="ink"</span>&gt;

								      &lt;x&gt;0.866&lt;/x&gt;

								      &lt;y&gt;0.724&lt;/y&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:group&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>You might also use <code>emma:group-info</code> to refer to a

								named grouping criterion using external reference, for

								instance:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"

								    xmlns:ex="http://www.example.com/ns/group"&gt;

								  &lt;emma:group id="grp"&gt;

								    &lt;emma:group-info ref="http://www.example.com/criterion42"/&gt;

								    &lt;emma:interpretation id="int1"

								      <span>emma:medium="tactile" emma:mode="ink"</span>&gt;

								      &lt;x&gt;0.253&lt;/x&gt;

								      &lt;y&gt;0.124&lt;/y&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="int2"

								      <span>emma:medium="tactile" emma:mode="ink"</span>&gt;

								      &lt;x&gt;0.866&lt;/x&gt;

								      &lt;y&gt;0.724&lt;/y&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:group&gt;

								&lt;/emma:emma&gt;

								</pre>
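								<p>The two forms of <code>emma:group-info</code> shown above could be
								read by a consumer along the lines of the following Python sketch. The
								function name <code>grouping_criteria</code> and the returned tuple
								shape are assumptions for illustration. The sample documents are trimmed
								copies of the two examples, with the interpretations omitted.</p>

```python
import xml.etree.ElementTree as ET

EMMA = "{http://www.w3.org/2003/04/emma}"

# Trimmed copy of the inline example (interpretations omitted).
INLINE_DOC = """
<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma"
    xmlns="http://www.example.com/example"
    xmlns:ex="http://www.example.com/ns/group">
  <emma:group id="grp">
    <emma:group-info>
      <ex:mode>temporal</ex:mode>
      <ex:duration>2s</ex:duration>
    </emma:group-info>
  </emma:group>
</emma:emma>
"""

# Trimmed copy of the external-reference example (interpretations omitted).
REF_DOC = """
<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma"
    xmlns="http://www.example.com/example">
  <emma:group id="grp">
    <emma:group-info ref="http://www.example.com/criterion42"/>
  </emma:group>
</emma:emma>
"""


def grouping_criteria(group):
    """Return ("ref", uri), ("inline", {name: text}), or None if no group-info."""
    info = group.find(EMMA + "group-info")
    if info is None:
        return None
    ref = info.get("ref")  # ref is an unprefixed attribute
    if ref is not None:
        return ("ref", ref)
    # Inline criteria: collect each child element's local name and text.
    inline = {child.tag.split("}")[-1]: child.text for child in info}
    return ("inline", inline)


inline_result = grouping_criteria(ET.fromstring(INLINE_DOC).find(EMMA + "group"))
ref_result = grouping_criteria(ET.fromstring(REF_DOC).find(EMMA + "group"))
print(inline_result)
print(ref_result)
```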

								<h3 id="s3.3.3">3.3.3 <code>emma:sequence</code> element</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:sequence</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>A container element indicating that a number of interpretations

								of distinct user inputs are in temporal sequence.</td>

								</tr>

								<tr>

								<th>Children</th>

								<td>The <code>emma:sequence</code> element MUST immediately contain

								a collection of one or more <code>emma:interpretation</code>

								elements or container elements: <code>emma:one-of</code>,

								<code>emma:group</code>, <code>emma:sequence</code>. It MAY also

								contain <span>multiple optional</span>

								<code>emma:derived-from</code> element<span>s</span> and <span>an

								optional single</span> <code>emma:info</code>

								<span>element</span>.</td>

								</tr>

								<tr>

								<th>Attributes</th>

								<td>

								<ul>

								<li><b>Required</b>: Attribute <code>id</code> of type

								<code>xsd:ID</code></li>

								<li><b>Optional</b>: The annotation attributes:

								<code>emma:tokens</code>, <code>emma:process</code>,

								<code>emma:lang</code>, <code>emma:signal</code>,

								<code><span>emma:signal-size</span></code>,

								<code>emma:media-type</code>, <code>emma:confidence</code>,

								<code>emma:source</code>, <code>emma:start</code>,

								<code>emma:end</code>, <code>emma:time-ref-uri</code>,

								<code>emma:time-ref-anchor-point</code>,

								<code>emma:offset-to-start</code>, <code>emma:duration</code>,

								<code>emma:medium</code>, <code>emma:mode</code>,

								<code>emma:function</code>, <code>emma:verbal</code>,

								<code>emma:cost</code>, <code>emma:grammar-ref</code>,

								<code>emma:endpoint-info-ref</code>, <code>emma:model-ref</code>,

								<code>emma:dialog-turn</code>.</li>

								</ul>

								</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td>The <code>emma:sequence</code> element is legal only as a child

								of <code>emma:emma</code>, <code>emma:one-of</code>,

								<code>emma:group</code>, <code>emma:sequence</code>, or

								<code>emma:derivation</code>.</td>

								</tr>

								</tbody>

								</table>

								<p>The <code>emma:sequence</code> element is used to indicate that

								the contained interpretations are sequential in time, as in the

								following example, which indicates that two points made with a pen

								are in temporal order.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:sequence id="seq1"&gt;

								    &lt;emma:interpretation id="int1"

								        <span>emma:medium="tactile"</span> emma:mode="ink"&gt;

								      &lt;x&gt;0.253&lt;/x&gt;

								      &lt;y&gt;0.124&lt;/y&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="int2"

								        <span>emma:medium="tactile"</span> emma:mode="ink"&gt;

								      &lt;x&gt;0.866&lt;/x&gt;

								      &lt;y&gt;0.724&lt;/y&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:sequence&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>The <code>emma:sequence</code> container MAY be combined with

								<code>emma:one-of</code> and <code>emma:group</code> in arbitrary

								nesting structures. The order of children in the content of the

								<code>emma:sequence</code> element corresponds to a sequence of

								interpretations. This ordering does not imply any particular

								definition of sequentiality. EMMA processors are expected therefore

								to use the <code>emma:sequence</code> element to hold

								interpretations which are either strictly sequential in nature

								(e.g. the end-time of an interpretation precedes the start-time of

								its follower), or which overlap in some manner (e.g. the start-time

								of a follower interpretation precedes the end-time of its

								precedent). It is possible to use timestamps to provide fine

								grained annotation for the sequence of interpretations that are

								sequential in time <span>(see <a href="#s4.2.10">Section

								4.2.10</a>)</span>.</p>
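								<p>When timestamps are present, a consumer can distinguish the two
								cases described above. The following Python sketch is one way to do so;
								the timestamps are invented, the function name
								<code>classify_pairs</code> is hypothetical, and the label
								<code>abutting</code> for equal end and start times is an assumption,
								since the text above does not name that case.</p>

```python
import xml.etree.ElementTree as ET

EMMA = "{http://www.w3.org/2003/04/emma}"

# Hypothetical sequence: the millisecond timestamps are invented for illustration.
DOC = """
<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma"
    xmlns="http://www.example.com/example">
  <emma:sequence id="seq1">
    <emma:interpretation id="int1" emma:medium="tactile" emma:mode="ink"
        emma:start="1087995961000" emma:end="1087995961500">
      <x>0.253</x><y>0.124</y>
    </emma:interpretation>
    <emma:interpretation id="int2" emma:medium="tactile" emma:mode="ink"
        emma:start="1087995961600" emma:end="1087995962000">
      <x>0.866</x><y>0.724</y>
    </emma:interpretation>
    <emma:interpretation id="int3" emma:medium="tactile" emma:mode="ink"
        emma:start="1087995961900" emma:end="1087995962400">
      <x>0.500</x><y>0.500</y>
    </emma:interpretation>
  </emma:sequence>
</emma:emma>
"""


def classify_pairs(sequence):
    """Label each adjacent pair of interpretations as strict, overlap, or abutting."""
    items = [c for c in sequence if c.tag == EMMA + "interpretation"]
    labels = []
    for prev, follower in zip(items, items[1:]):
        prev_end = int(prev.get(EMMA + "end"))
        next_start = int(follower.get(EMMA + "start"))
        if prev_end < next_start:
            kind = "strict"    # end-time precedes the follower's start-time
        elif next_start < prev_end:
            kind = "overlap"   # follower starts before its precedent ends
        else:
            kind = "abutting"  # equal times; a label assumed for this sketch
        labels.append((prev.get("id"), follower.get("id"), kind))
    return labels


pairs = classify_pairs(ET.fromstring(DOC).find(EMMA + "sequence"))
print(pairs)
```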

								<p>In the following more complex example, a sequence of two pen

								gestures in <code>emma:sequence</code> and a speech input in

								<code>emma:interpretation</code> <span>are</span> contained in an

								<code>emma:group</code>.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:group id="grp"&gt;

								     &lt;emma:interpretation id="int1" emma:medium="acoustic"

								         emma:mode="voice"&gt;

								       &lt;action&gt;move&lt;/action&gt;

								       &lt;object&gt;this-battleship&lt;/object&gt;

								       &lt;destination&gt;here&lt;/destination&gt;

								     &lt;/emma:interpretation&gt;


								     &lt;emma:sequence id="seq1"&gt;

								       &lt;emma:interpretation id="int2" emma:medium="tactile"

								           emma:mode="ink"&gt;

								         &lt;x&gt;0.253&lt;/x&gt;

								         &lt;y&gt;0.124&lt;/y&gt;

								       &lt;/emma:interpretation&gt;


								       &lt;emma:interpretation id="int3" emma:medium="tactile"

								           emma:mode="ink"&gt;

								         &lt;x&gt;0.866&lt;/x&gt;

								         &lt;y&gt;0.724&lt;/y&gt;

								       &lt;/emma:interpretation&gt;

								     &lt;/emma:sequence&gt;

								  &lt;/emma:group&gt;

								&lt;/emma:emma&gt;

								</pre>

								<h3 id="s3.4">3.4 Lattice element</h3>

								<p>In addition to providing the ability to represent N-best lists

								of interpretations using <code>emma:one-of</code>, EMMA also

								provides the capability to represent lattices of words or other

								symbols using the <code>emma:lattice</code> element. Lattices

								provide a compact representation of large lists of possible

								recognition results or interpretations for speech, pen, or

								multimodal inputs.</p>

								<p>In addition to providing a representation for lattice output

								from speech recognition, another important use case for lattices is

								for representation of the results of gesture and handwriting

								recognition from a pen modality component. Lattices can also be

								used to compactly represent multiple possible meaning

								representations. Another use case for the lattice representation is

								for associating confidence scores and other annotations with

								individual words within a speech recognition result string.</p>

								<p>Lattices are compactly described by a list of transitions

								between nodes. For each transition the start and end nodes MUST be

								defined, along with the label for the transition. Initial and final

								nodes MUST also be indicated. The following figure provides a

								graphical representation of a speech recognition lattice which

								compactly represents eight different sequences of words.</p>

								<p><img alt="speech lattice" src="lattice.png" /></p>

								<p>which expands to:</p>

								<pre>

								a. flights to boston from portland today please

								b. flights to austin from portland today please

								c. flights to boston from oakland today please

								d. flights to austin from oakland today please

								e. flights to boston from portland tomorrow

								f. flights to austin from portland tomorrow

								g. flights to boston from oakland tomorrow

								h. flights to austin from oakland tomorrow

								</pre>

								<h4 id="s3.4.1">3.4.1 Lattice markup: <code>emma:lattice</code>,

								<code>emma:arc</code>, <code>emma:node</code> elements</h4>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:lattice</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An element which encodes a lattice representation of user

								input.</td>

								</tr>

								<tr>

								<th>Children</th>

								<td>The <code>emma:lattice</code> element MUST immediately contain

								one or more <code>emma:arc</code> elements and zero or more

								<code>emma:node</code> elements.</td>

								</tr>

								<tr>

								<th>Attributes</th>

								<td>

								<ul>

								<li><b>Required</b>:

								<ul>

								<li><code>initial</code> <span>of type

								<code>xsd:nonNegativeInteger</code></span> indicating the number of

								the initial node of the lattice.</li>

								<li><code>final</code> contains a space-separated list of

								<code>xsd:nonNegativeInteger</code> indicating the numbers of the

								final nodes in the lattice.</li>

								</ul>

								</li>

								<li><b>Optional</b>: <code>emma:time-ref-uri</code>,

								<code>emma:time-ref-anchor-point</code>.</li>

								</ul>

								</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td>The <code>emma:lattice</code> element is legal only as a child

								of the <code>emma:interpretation</code> element.</td>

								</tr>

								<tr>

								<th>Annotation</th>

								<th>emma:arc</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An element which encodes a transition between two nodes in a

								lattice. The label associated with the arc in the lattice is

								represented in the content of <code>emma:arc</code>.</td>

								</tr>

								<tr>

								<th>Children</th>

								<td>The <code>emma:arc</code> element MUST immediately contain

								either character data or a single application namespace element or

								be empty, in the case of epsilon transitions. It MAY contain an

								<code>emma:info</code> element containing application or vendor

								specific annotations.</td>

								</tr>

								<tr>

								<th>Attributes</th>

								<td>

								<ul>

								<li><b>Required</b>:

								<ul>

								<li><code>from</code> <span>of type

								<code>xsd:nonNegativeInteger</code></span> indicating the number of

								the starting node for the arc.</li>

								<li><code>to</code> <span>of type

								<code>xsd:nonNegativeInteger</code></span> indicating the number of

								the ending node for the arc.</li>

								</ul>

								</li>

								<li><b>Optional</b>: <code>emma:start</code>,

								<code>emma:end</code>, <code>emma:offset-to-start</code>,

								<code>emma:duration</code>, <code>emma:confidence</code>,

								<code>emma:cost</code>, <code>emma:lang</code>,

								<code>emma:medium</code>, <code>emma:mode</code>,

								<code>emma:source</code>.</li>

								</ul>

								</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td>The <code>emma:arc</code> element is legal only as a child of

								the <code>emma:lattice</code> element.</td>

								</tr>

								<tr>

								<th>Annotation</th>

								<th>emma:node</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An element which represents a node in the lattice. The

								<code>emma:node</code> elements are not required to describe a

								lattice but might be added to provide a location for annotations on

								nodes in a lattice. There MUST be at most one

								<code>emma:node</code> specification for each numbered node in the

								lattice.</td>

								</tr>

								<tr>

								<th>Children</th>

								<td>An OPTIONAL <code>emma:info</code> element for application or

								vendor specific annotations on the node.</td>

								</tr>

								<tr>

								<th>Attributes</th>

								<td>

								<ul>

								<li><b>Required</b>:

								<ul>

								<li><code>node-number</code> <span>of type

								<code>xsd:nonNegativeInteger</code></span> indicating the

								<span>node number</span> in the lattice.</li>

								</ul>

								</li>

								<li><b>Optional</b>: <code>emma:confidence</code>,

								<code>emma:cost</code>.</li>

								</ul>

								</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td>The <code>emma:node</code> element is legal only as a child of

								the <code>emma:lattice</code> element.</td>

								</tr>

								</tbody>

								</table>

								<p>In EMMA, a lattice is represented using an element

								<code>emma:lattice</code>, which has attributes

								<code>initial</code> and <code>final</code> for indicating the

								initial and final nodes of the lattice. For the lattice

								<span>below</span>, this will be: <code>&lt;emma:lattice

								initial="1" final="8"/&gt;</code>. The nodes are numbered with

								integers. If there is more than one distinct final node in the

								lattice the nodes MUST be represented as a space separated list in

								the value of the <code>final</code> attribute e.g.

								<code>&lt;emma:lattice initial="1" final="9 10 23"/&gt;</code>.

								There MUST only be one initial node in an EMMA lattice. Each

								transition in the lattice is represented as an element

								<code>emma:arc</code> with attributes <code>from</code> and

								<code>to</code> which indicate the nodes where the transition

								starts and ends. The arc's label is represented as the content of

								the <code>emma:arc</code> element and MUST be any well-formed

								character or XML content. In the example here the contents are

								words. Empty (epsilon) transitions in a lattice MUST be represented

								in the <code>emma:lattice</code> representation as

								<code>emma:arc</code> <span>empty</span> elements, e.g.

								<code>&lt;emma:arc from="1" to="8"/&gt;</code>.</p>
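								<p>The attribute conventions described above can be read with a few
								lines of Python, sketched below. The small lattice is invented for
								illustration and the function name <code>lattice_summary</code> is
								hypothetical. The sketch handles character-data labels and epsilon
								transitions only; arcs that contain application namespace elements are
								not interpreted.</p>

```python
import xml.etree.ElementTree as ET

EMMA = "{http://www.w3.org/2003/04/emma}"

# Hypothetical lattice with two final nodes and one epsilon transition.
DOC = """
<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma"
    xmlns="http://www.example.com/example">
  <emma:interpretation id="interp1"
      emma:medium="acoustic" emma:mode="voice">
    <emma:lattice initial="1" final="3 4">
      <emma:arc from="1" to="2">to</emma:arc>
      <emma:arc from="2" to="3">boston</emma:arc>
      <emma:arc from="2" to="4">austin</emma:arc>
      <emma:arc from="1" to="3"/>
    </emma:lattice>
  </emma:interpretation>
</emma:emma>
"""


def lattice_summary(lattice):
    """Return (initial, finals, arcs); an epsilon arc has the label None."""
    initial = int(lattice.get("initial"))  # exactly one initial node
    finals = [int(n) for n in lattice.get("final").split()]  # space-separated list
    arcs = []
    for arc in lattice.findall(EMMA + "arc"):
        text = (arc.text or "").strip()
        is_epsilon = not text and len(arc) == 0  # empty element: epsilon transition
        arcs.append(
            (int(arc.get("from")), int(arc.get("to")), None if is_epsilon else text)
        )
    return initial, finals, arcs


lattice = ET.fromstring(DOC).find(EMMA + "interpretation").find(EMMA + "lattice")
initial, finals, arcs = lattice_summary(lattice)
print(initial, finals, arcs)
```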

								<p>The example speech lattice above would be represented in EMMA

								markup as follows:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="interp1"

								    <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;emma:lattice initial="1" final="8"&gt;

								      &lt;emma:arc from="1" to="2"&gt;flights&lt;/emma:arc&gt;


								      &lt;emma:arc from="2" to="3"&gt;to&lt;/emma:arc&gt;

								      &lt;emma:arc from="3" to="4"&gt;boston&lt;/emma:arc&gt;

								      &lt;emma:arc from="3" to="4"&gt;austin&lt;/emma:arc&gt;

								      &lt;emma:arc from="4" to="5"&gt;from&lt;/emma:arc&gt;


								      &lt;emma:arc from="5" to="6"&gt;portland&lt;/emma:arc&gt;

								      &lt;emma:arc from="5" to="6"&gt;oakland&lt;/emma:arc&gt;

								      &lt;emma:arc from="6" to="7"&gt;today&lt;/emma:arc&gt;

								      &lt;emma:arc from="7" to="8"&gt;please&lt;/emma:arc&gt;


								      &lt;emma:arc from="6" to="8"&gt;tomorrow&lt;/emma:arc&gt;

								    &lt;/emma:lattice&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>
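								<p>To check that this markup does encode the eight word sequences
								listed earlier, the lattice can be expanded by walking every path from
								the initial node to a final node. The Python sketch below does this for
								a trimmed copy of the example; the function name <code>expand</code> is
								hypothetical, and the walk assumes that the lattice contains no
								cycles.</p>

```python
import xml.etree.ElementTree as ET

EMMA = "{http://www.w3.org/2003/04/emma}"

# Trimmed copy of the speech lattice example above (schema attributes omitted).
DOC = """
<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma"
    xmlns="http://www.example.com/example">
  <emma:interpretation id="interp1"
      emma:medium="acoustic" emma:mode="voice">
    <emma:lattice initial="1" final="8">
      <emma:arc from="1" to="2">flights</emma:arc>
      <emma:arc from="2" to="3">to</emma:arc>
      <emma:arc from="3" to="4">boston</emma:arc>
      <emma:arc from="3" to="4">austin</emma:arc>
      <emma:arc from="4" to="5">from</emma:arc>
      <emma:arc from="5" to="6">portland</emma:arc>
      <emma:arc from="5" to="6">oakland</emma:arc>
      <emma:arc from="6" to="7">today</emma:arc>
      <emma:arc from="7" to="8">please</emma:arc>
      <emma:arc from="6" to="8">tomorrow</emma:arc>
    </emma:lattice>
  </emma:interpretation>
</emma:emma>
"""


def expand(lattice):
    """List every word sequence from the initial node to a final node."""
    initial = int(lattice.get("initial"))
    finals = {int(n) for n in lattice.get("final").split()}
    outgoing = {}  # node number -> list of (target node, label)
    for arc in lattice.findall(EMMA + "arc"):
        label = (arc.text or "").strip()
        outgoing.setdefault(int(arc.get("from")), []).append(
            (int(arc.get("to")), label)
        )
    paths = []

    def walk(node, words):
        if node in finals:
            paths.append(" ".join(words))
        for target, label in outgoing.get(node, []):
            # Epsilon arcs have an empty label and contribute no word.
            walk(target, words + ([label] if label else []))

    walk(initial, [])
    return paths


lattice = ET.fromstring(DOC).find(EMMA + "interpretation").find(EMMA + "lattice")
sequences = expand(lattice)
for sequence in sequences:
    print(sequence)
```

								<p>Ten arcs are enough to encode the eight sequences, which is the
								compactness that the lattice representation is intended to provide.</p>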

								<p>Alternatively, if we wish to represent the same information as

								an N-best list using <code>emma:one-of</code>, we would have the

								more verbose representation:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:one-of id="nbest1" <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;emma:interpretation id="interp1"&gt;

								      &lt;text&gt;flights to boston from portland today please&lt;/text&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="interp2"&gt;

								      &lt;text&gt;flights to boston from portland tomorrow&lt;/text&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="interp3"&gt;

								      &lt;text&gt;flights to austin from portland today please&lt;/text&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="interp4"&gt;

								      &lt;text&gt;flights to austin from portland tomorrow&lt;/text&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="interp5"&gt;

								      &lt;text&gt;flights to boston from oakland today please&lt;/text&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="interp6"&gt;

								      &lt;text&gt;flights to boston from oakland tomorrow&lt;/text&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="interp7"&gt;

								      &lt;text&gt;flights to austin from oakland today please&lt;/text&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="interp8"&gt;

								      &lt;text&gt;flights to austin from oakland tomorrow&lt;/text&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:one-of&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>The lattice representation avoids the need to enumerate all of

								the possible word sequences. Also, as detailed below, the

								<code>emma:lattice</code> representation enables placement of

								annotations on individual words in the input.</p>

								<p>For use cases involving the representation of gesture/ink

								lattices and use cases involving lattices of semantic

								interpretations, EMMA allows for application namespace elements to

								appear within <code>emma:arc</code>.</p>

								<p>For example, a sequence of two gestures, each of which is

								recognized as either a line or a circle<span>,</span> might be

								represented as follows:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="interp1"

								    <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;emma:lattice initial="1" final="3"&gt;

								      &lt;emma:arc from="1" to="2"&gt;

								        &lt;circle radius="100"/&gt;

								      &lt;/emma:arc&gt;

								      &lt;emma:arc from="2" to="3"&gt;

								        &lt;line length="628"/&gt;

								      &lt;/emma:arc&gt;

								      &lt;emma:arc from="1" to="2"&gt;

								        &lt;circle radius="200"/&gt;

								      &lt;/emma:arc&gt;

								      &lt;emma:arc from="2" to="3"&gt;

								        &lt;line length="1256"/&gt;

								      &lt;/emma:arc&gt;

								    &lt;/emma:lattice&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>As an example of a lattice of semantic interpretations, in a

								travel application where the source is either "Boston" or

								"Austin" and the destination is either "Newark" or "New York", the

								possibilities might be represented in a lattice as follows:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="interp1"

								    <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;emma:lattice initial="1" final="3"&gt;

								      &lt;emma:arc from="1" to="2"&gt;

								        &lt;source city="boston"/&gt;

								      &lt;/emma:arc&gt;

								      &lt;emma:arc from="2" to="3"&gt;

								        &lt;destination city="newark"/&gt;

								      &lt;/emma:arc&gt;

								      &lt;emma:arc from="1" to="2"&gt;

								        &lt;source city="austin"/&gt;

								      &lt;/emma:arc&gt;

								      &lt;emma:arc from="2" to="3"&gt;

								        &lt;destination city="new york"/&gt;

								      &lt;/emma:arc&gt;

								    &lt;/emma:lattice&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>The <code>emma:arc</code> element MAY contain either an

								application namespace element or character data. It MUST NOT

								contain combinations of application namespace elements and

								character data. However, an <code>emma:info</code> element MAY

								appear within an <code>emma:arc</code> element alongside character

								data, in order to allow for the association of vendor or

								application specific annotations on a single word or symbol in a

								lattice.</p>

								<p>So, in summary, there are four groupings of content that can

								appear within <code>emma:arc</code>:</p>

								<ul>

								<li>Character data, e.g. a recognized word in a speech lattice.</li>

								<li>Character data and a single <code>emma:info</code> element
								
								providing vendor or application specific annotations that apply to

								the character data.</li>

								<li>An application namespace element, e.g. the gesture and

								<span>semantic interpretation</span> lattice examples above.</li>

								<li>An application namespace element and a single

								<code>emma:info</code> element providing vendor or application

								specific annotations that apply to the application namespace element.</li>

								</ul>
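
								<p>As a sketch of the second grouping, the arc below pairs a
								recognized word with a single <code>emma:info</code> element. The
								<code>phonemes</code> element and its content are hypothetical
								vendor markup, invented here for illustration and not defined by
								EMMA:</p>

								<pre class="example">
								&lt;emma:emma version="1.0"
								    xmlns:emma="http://www.w3.org/2003/04/emma"
								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
								    xsi:schemaLocation="http://www.w3.org/2003/04/emma
								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"
								    xmlns="http://www.example.com/example"&gt;
								  &lt;emma:interpretation id="interp1"
								    emma:medium="acoustic" emma:mode="voice"&gt;
								    &lt;emma:lattice initial="1" final="2"&gt;
								      &lt;emma:arc from="1" to="2"&gt;
								        boston
								        &lt;!-- hypothetical vendor annotation on this one word --&gt;
								        &lt;emma:info&gt;
								          &lt;phonemes&gt;b ao s t ax n&lt;/phonemes&gt;
								        &lt;/emma:info&gt;
								      &lt;/emma:arc&gt;
								    &lt;/emma:lattice&gt;
								  &lt;/emma:interpretation&gt;
								&lt;/emma:emma&gt;
								</pre>

								<p>The fourth grouping follows the same pattern, with an
								application namespace element in place of the character data.</p>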

								<h4 id="s3.4.2">3.4.2 Annotations on lattices</h4>

								<p>The encoding of lattice arcs as XML elements

								(<code>emma:arc</code>) enables arcs to be annotated with metadata

								such as timestamps, costs, or confidence scores:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="interp1"

								    <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;emma:lattice initial="1" final="8"&gt;

								      &lt;emma:arc

								       from="1"

								       to="2"

								       emma:start="1087995961542"

								       emma:end="1087995962042"

								       emma:cost="30"&gt;

								         flights

								      &lt;/emma:arc&gt;


								      &lt;emma:arc

								       from="2"

								       to="3"

								       emma:start="1087995962042"

								       emma:end="1087995962542"

								       emma:cost="20"&gt;

								         to

								      &lt;/emma:arc&gt;


								      &lt;emma:arc

								       from="3"

								       to="4"

								       emma:start="1087995962542"

								       emma:end="1087995963042"

								       emma:cost="50"&gt;

								         boston

								      &lt;/emma:arc&gt;


								      &lt;emma:arc

								       from="3"

								       to="4"

								       emma:start="1087995963042"

								       emma:end="1087995963742"

								       emma:cost="60"&gt;

								         austin

								      &lt;/emma:arc&gt;

								      ...

								    &lt;/emma:lattice&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>The following EMMA attributes MAY be placed on

								<code>emma:arc</code> elements: absolute timestamps

								(<code>emma:start</code>, <code>emma:end</code>), relative

								timestamps (<code>emma:offset-to-start</code>,

								<code>emma:duration</code>), <code>emma:confidence</code>,

								<code>emma:cost</code>, the human language of the input

								(<code>emma:lang</code>), <code>emma:medium</code>,

								<code>emma:mode</code>, and <code>emma:source</code>. The use case

								for <code>emma:medium</code>, <code>emma:mode</code>, and

								<code>emma:source</code> is lattices that contain content

								from different input modes. The <code>emma:arc</code> element MAY

								also contain an <code>emma:info</code> element for specification of

								vendor and application specific annotations on the arc.</p>
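
								<p>A minimal sketch of a lattice that contains content from
								different input modes, with <code>emma:medium</code>,
								<code>emma:mode</code>, and <code>emma:confidence</code> placed on
								the arcs themselves. The spoken word, the gesture, and the
								confidence values are assumptions chosen for illustration:</p>

								<pre class="example">
								&lt;emma:emma version="1.0"
								    xmlns:emma="http://www.w3.org/2003/04/emma"
								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
								    xsi:schemaLocation="http://www.w3.org/2003/04/emma
								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"
								    xmlns="http://www.example.com/example"&gt;
								  &lt;emma:interpretation id="interp1"
								    emma:medium="acoustic tactile" emma:mode="voice ink"&gt;
								    &lt;emma:lattice initial="1" final="3"&gt;
								      &lt;!-- spoken word: character data on a voice arc --&gt;
								      &lt;emma:arc from="1" to="2"
								       emma:medium="acoustic" emma:mode="voice"
								       emma:confidence="0.8"&gt;
								         zoom
								      &lt;/emma:arc&gt;
								      &lt;!-- pen gesture: application namespace element on an ink arc --&gt;
								      &lt;emma:arc from="2" to="3"
								       emma:medium="tactile" emma:mode="ink"
								       emma:confidence="0.6"&gt;
								        &lt;circle radius="100"/&gt;
								      &lt;/emma:arc&gt;
								    &lt;/emma:lattice&gt;
								  &lt;/emma:interpretation&gt;
								&lt;/emma:emma&gt;
								</pre>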

								<p>The timestamps that appear on <code>emma:arc</code> elements do

								not necessarily indicate the start and end of the arc itself. They

								MAY indicate the start and end of the signal corresponding to the

								label on the arc. As a result, there is no requirement that the

								<code>emma:end</code> timestamp on an arc going into a node should

								be equivalent to the <code>emma:start</code> of all arcs going out

								of that node. Furthermore, there is no guarantee that the left-to-right

								order of arcs in a lattice will correspond to the temporal

								order of the input signal. The lattice representation is an

								abstraction that represents a range of possible interpretations of

								a user's input and is not necessarily intended to be a

								representation of temporal order.</p>

								<p>Costs are typically application and device dependent. There are

								a variety of ways that individual arc costs might be combined to

								produce costs for specific paths through the lattice. This

								specification does not standardize the way for these costs to be

								combined; it is up to the applications and devices to determine how

								such derived costs would be computed and used.</p>

								<p>For some lattice formats, it is also desirable to annotate the

								nodes in the lattice themselves with information such as costs. For

								example in speech recognition, costs might be placed on nodes as a

								result of word penalties or redistribution of costs. For this

								purpose EMMA also provides an <code>emma:node</code> element which

								can host annotations such as <code>emma:cost</code>. The

								<code>emma:node</code> element MUST have an attribute

								<code>node-number</code> which indicates the number of the node.

								There MUST be at most one <code>emma:node</code> specification for

								a given numbered node in the lattice. In our example, if there were

								a cost of <b>100</b> on the final state, this could be represented

								as follows:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="interp1"

								    <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;emma:lattice initial="1" final="8"&gt;

								      &lt;emma:arc

								       from="1"

								       to="2"

								       emma:start="1087995961542"

								       emma:end="1087995962042"

								       emma:cost="30"&gt;

								         flights

								      &lt;/emma:arc&gt;

								      &lt;emma:arc

								       from="2"

								       to="3"

								       emma:start="1087995962042"

								       emma:end="1087995962542"

								       emma:cost="20"&gt;

								         to

								      &lt;/emma:arc&gt;


								      &lt;emma:arc

								       from="3"

								       to="4"

								       emma:start="1087995962542"

								       emma:end="1087995963042"

								       emma:cost="50"&gt;

								         boston

								      &lt;/emma:arc&gt;

								      &lt;emma:arc

								       from="3"

								       to="4"

								       emma:start="1087995963042"

								       emma:end="1087995963742"

								       emma:cost="60"&gt;

								         austin

								      &lt;/emma:arc&gt;

								        ...

								      &lt;emma:node node-number="8" emma:cost="100"/&gt;

								    &lt;/emma:lattice&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<h4 id="s3.4.3">3.4.3 Relative timestamps on lattices</h4>

								<p>The relative timestamp mechanism in EMMA is intended to provide

								temporal information about arcs in a lattice in relative terms

								using offsets in milliseconds. In order to do this the absolute

								time MAY be specified on <code>emma:interpretation</code>; both

								<code>emma:time-ref-uri</code> and

								<code>emma:time-ref-anchor-point</code> apply to

								<code>emma:lattice</code> and MAY be used there to set the anchor

								point for offsets to the start of the absolute time specified on

								<code>emma:interpretation</code>. The offset in milliseconds to the

								beginning of each arc MAY then be indicated on each

								<code>emma:arc</code> in the <code>emma:offset-to-start</code>

								attribute.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;


								  &lt;emma:interpretation id="interp1"

								          emma:start="1087995961542" emma:end="1087995963042"

								          <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;emma:lattice emma:time-ref-uri="#interp1"

								        emma:time-ref-anchor-point="start"

								        initial="1" final="4"&gt;

								      &lt;emma:arc

								       from="1"

								       to="2"

								       emma:offset-to-start="0"&gt;

								         flights

								      &lt;/emma:arc&gt;

								      &lt;emma:arc

								       from="2"

								       to="3"

								       emma:offset-to-start="500"&gt;

								         to

								      &lt;/emma:arc&gt;


								      &lt;emma:arc

								       from="3"

								       to="4"

								       emma:offset-to-start="1000"&gt;

								         boston

								      &lt;/emma:arc&gt;

								    &lt;/emma:lattice&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>Note that the offset for the first <code>emma:arc</code> MUST

								always be zero since the EMMA attribute

								<code>emma:offset-to-start</code> indicates the number of

								milliseconds from the anchor point to the <i>start</i> of the piece

								of input associated with the <code>emma:arc</code>, in this case

								the word "flights".</p>
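
								<p>Since <code>emma:duration</code> MAY also appear on
								<code>emma:arc</code>, an offset and a duration together describe
								the span of signal associated with an arc. The sketch below adds
								assumed durations of 500 milliseconds to the arcs of the preceding
								example. With the anchor point at the start of
								<code>interp1</code> (1087995961542), the piece of input for "to"
								begins at 1087995961542 + 500 = 1087995962042 and ends 500
								milliseconds later at 1087995962542:</p>

								<pre class="example">
								&lt;emma:emma version="1.0"
								    xmlns:emma="http://www.w3.org/2003/04/emma"
								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
								    xsi:schemaLocation="http://www.w3.org/2003/04/emma
								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"
								    xmlns="http://www.example.com/example"&gt;
								  &lt;emma:interpretation id="interp1"
								          emma:start="1087995961542" emma:end="1087995963042"
								          emma:medium="acoustic" emma:mode="voice"&gt;
								    &lt;emma:lattice emma:time-ref-uri="#interp1"
								        emma:time-ref-anchor-point="start"
								        initial="1" final="4"&gt;
								      &lt;!-- durations are illustrative values, in milliseconds --&gt;
								      &lt;emma:arc from="1" to="2"
								       emma:offset-to-start="0" emma:duration="500"&gt;
								         flights
								      &lt;/emma:arc&gt;
								      &lt;emma:arc from="2" to="3"
								       emma:offset-to-start="500" emma:duration="500"&gt;
								         to
								      &lt;/emma:arc&gt;
								      &lt;emma:arc from="3" to="4"
								       emma:offset-to-start="1000" emma:duration="500"&gt;
								         boston
								      &lt;/emma:arc&gt;
								    &lt;/emma:lattice&gt;
								  &lt;/emma:interpretation&gt;
								&lt;/emma:emma&gt;
								</pre>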

								<h3 id="s3.5">3.5 Literal semantics: <code>emma:literal</code>

								element</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:literal</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An element that contains string literal output.</td>

								</tr>

								<tr>

								<th>Children</th>

								<td>String literal</td>

								</tr>

								<tr>

								<th>Attributes</th>

								<td>None.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td>The <code>emma:literal</code> is a child of

								<code>emma:interpretation</code>.</td>

								</tr>

								</tbody>

								</table>

								<p>Certain EMMA processing components produce semantic results in

								the form of string literals without any surrounding application

								namespace markup. These MUST be placed in the EMMA element

								<code>emma:literal</code> within <code>emma:interpretation</code>.

								For example, if a semantic interpreter simply returned "boston"

								this could be represented in EMMA as:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation <span>id="r1" <br />

								     emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;emma:literal&gt;boston&lt;/emma:literal&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>Note that a raw recognition result of a sequence of words from

								speech recognition is also a kind of string literal and can be

								contained within <code>emma:literal</code>. For example,

								recognition of the string "flights to san francisco" can be

								represented in EMMA as follows:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation <span>id="r1" <br />

								     emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;emma:literal&gt;flights to san francisco&lt;/emma:literal&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<h2 id="s4">4. EMMA annotations</h2>

								<p>This section defines annotations in the EMMA namespace including

								both attributes and elements. The values are specified in terms of

								the data types defined by XML Schema Part 2: Datatypes <span>Second

								Edition</span> [<a href="#XSD2"><span>XML Schema

								Datatypes</span></a>].</p>

								<h3 id="s4.1">4.1 EMMA annotation elements</h3>

								<h4 id="s4.1.1">4.1.1 Data model: <code>emma:model</code>

								element</h4>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:model</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>The <code>emma:model</code> either references or provides

								inline the data model for the instance data.</td>

								</tr>

								<tr>

								<th>Children</th>

								<td>If a <code>ref</code> attribute is not specified then this

								element contains the data model inline.</td>

								</tr>

								<tr>

								<th>Attributes</th>

								<td>

								<ul>

								<li><b>Required</b>:

								<ul>

								<li><code>id</code> of type <code>xsd:ID</code>.</li>

								</ul>

								</li>

								<li><b>Optional</b>:

								<ul>

								<li><code>ref</code> of type <code>xsd:anyURI</code> that

								references the data model. Note that either a <code>ref</code>

								attribute or an in-line data model (but not both) MUST be

								specified.</li>

								</ul>

								</li>

								</ul>

								</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td>The <code>emma:model</code> element MAY appear only as a child

								of <code>emma:emma</code>.</td>

								</tr>

								</tbody>

								</table>

								<p>The data model that may be used to express constraints on the

								structure and content of instance data is specified as one of the

								annotations of the instance. Specifying the data model is OPTIONAL,

								in which case the data model can be said to be implicit. Typically

								the data model is pre-established by the application.</p>

								<p>The data model is specified with the <code>emma:model</code>

								annotation defined as an element in the EMMA namespace. If the data

								model for the contents of an <code>emma:interpretation</code>,

								container elements, or application namespace element is to be

								specified in EMMA, the attribute <code>emma:model-ref</code> MUST

								be specified on the <code>emma:interpretation</code>, container

								element, or application namespace element. Note that since multiple

								<code>emma:model</code> elements might be specified under the

								<code>emma:emma</code> it is possible to refer to multiple data

								models within a single EMMA document. For example, different

								alternative interpretations under an <code>emma:one-of</code> might

								have different data models. In this case, an

								<code>emma:model-ref</code> attribute would appear on each

								<code>emma:interpretation</code> element in the N-best list with

								its value being the <code>id</code> of the <code>emma:model</code>

								element for that particular interpretation.</p>

								<p>The data model is closely related to the interpretation data,

								and is typically specified as the annotation related to the

								<code>emma:interpretation</code> or <code>emma:one-of</code>

								elements.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:model id="model1" ref="http://example.com/models/city.xml"/&gt;

								  &lt;emma:interpretation id="int1" emma:model-ref="model1"

								    <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;city&gt; London &lt;/city&gt;

								    &lt;country&gt; UK &lt;/country&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>
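
								<p>Where alternative interpretations under an
								<code>emma:one-of</code> follow different data models, each
								interpretation can point at its own <code>emma:model</code>. In
								this sketch the second model URI and the <code>airport</code>
								element are hypothetical:</p>

								<pre class="example">
								&lt;emma:emma version="1.0"
								    xmlns:emma="http://www.w3.org/2003/04/emma"
								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
								    xsi:schemaLocation="http://www.w3.org/2003/04/emma
								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"
								    xmlns="http://www.example.com/example"&gt;
								  &lt;emma:model id="model1" ref="http://example.com/models/city.xml"/&gt;
								  &lt;emma:model id="model2" ref="http://example.com/models/airport.xml"/&gt;
								  &lt;emma:one-of id="nbest1"
								    emma:medium="acoustic" emma:mode="voice"&gt;
								    &lt;!-- each alternative names the model that constrains it --&gt;
								    &lt;emma:interpretation id="int1" emma:model-ref="model1"&gt;
								      &lt;city&gt;London&lt;/city&gt;
								    &lt;/emma:interpretation&gt;
								    &lt;emma:interpretation id="int2" emma:model-ref="model2"&gt;
								      &lt;airport&gt;LHR&lt;/airport&gt;
								    &lt;/emma:interpretation&gt;
								  &lt;/emma:one-of&gt;
								&lt;/emma:emma&gt;
								</pre>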

								<p>The <code>emma:model</code> annotation MAY reference any element

								or attribute in the application instance data, as well as any EMMA

								container element (<code>emma:one-of</code>,

								<code>emma:group</code>, or <code>emma:sequence</code>).</p>

								<p>The data model annotation MAY be used to either reference an

								external data model with the <code>ref</code> attribute or provide

								a data model as in-line content. Either a <code>ref</code>

								attribute or in-line data model (but not both) MUST be

								specified.</p>
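
								<p>The in-line alternative might look like the following sketch,
								in which <code>emma:model</code> has no <code>ref</code> attribute
								and carries the data model as its content. Expressing the model in
								XML Schema, and the particular declarations shown, are assumptions
								made for illustration:</p>

								<pre class="example">
								&lt;emma:emma version="1.0"
								    xmlns:emma="http://www.w3.org/2003/04/emma"
								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
								    xsi:schemaLocation="http://www.w3.org/2003/04/emma
								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"
								    xmlns="http://www.example.com/example"&gt;
								  &lt;!-- no ref attribute: the data model is given in-line --&gt;
								  &lt;emma:model id="model1"&gt;
								    &lt;xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
								        targetNamespace="http://www.example.com/example"
								        elementFormDefault="qualified"&gt;
								      &lt;xsd:element name="city" type="xsd:string"/&gt;
								      &lt;xsd:element name="country" type="xsd:string"/&gt;
								    &lt;/xsd:schema&gt;
								  &lt;/emma:model&gt;
								  &lt;emma:interpretation id="int1" emma:model-ref="model1"
								    emma:medium="acoustic" emma:mode="voice"&gt;
								    &lt;city&gt; London &lt;/city&gt;
								    &lt;country&gt; UK &lt;/country&gt;
								  &lt;/emma:interpretation&gt;
								&lt;/emma:emma&gt;
								</pre>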

								<h4 id="s4.1.2">4.1.2 Interpretation derivation:

								<code>emma:derived-from</code> element and

								<code>emma:derivation</code> element</h4>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:derived-from</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An empty element which provides a reference to the

								interpretation which the element it appears on was derived

								from.</td>

								</tr>

								<tr>

								<th>Children</th>

								<td>None</td>

								</tr>

								<tr>

								<th>Attributes</th>

								<td>

								<ul>

								<li><b>Required</b>:

								<ul>

								<li><code>resource</code> of type <code>xsd:anyURI</code> that

								references the interpretation from which the current interpretation

								is derived.</li>

								</ul>

								</li>

								<li><b>Optional</b>:

								<ul>

								<li><code>composite</code> of type <code>xsd:boolean</code> that is

								<code>"true"</code> if the derivation step combines multiple inputs

								and <code>"false"</code> if not. If <code>composite</code> is not

								specified the value is <code>"false"</code> by default.</li>

								</ul>

								</li>

								</ul>

								</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td>The <code>emma:derived-from</code> element is legal only as a

								child of <code>emma:interpretation</code>,

								<code>emma:one-of</code>, <code>emma:group</code>, or

								<code>emma:sequence</code>.</td>

								</tr>

								<tr>

								<th>Annotation</th>

								<th>emma:derivation</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An element which contains interpretation and container elements

								representing earlier stages in the processing of the input.</td>

								</tr>

								<tr>

								<th>Children</th>

								<td>One or more <code>emma:interpretation</code>,

								<code>emma:one-of</code>, <code>emma:sequence</code>, or

								<code>emma:group</code> elements.</td>

								</tr>

								<tr>

								<th>Attributes</th>

								<td>None</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td>The <code>emma:derivation</code> MAY appear only as a child of

								the <code>emma:emma</code> element.</td>

								</tr>

								</tbody>

								</table>

								<p>Instances of interpretations are in general derived from other

								instances of interpretation in a process that goes from raw data to

								increasingly refined representations of the input. The derivation

								annotation is used to link any two interpretations that are related

								by representing the source and the outcome of an interpretation

								process. For instance, a speech recognition process can return the

								following result in the form of raw text:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="raw"<br />

								    <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;answer&gt;From Boston to Denver tomorrow&lt;/answer&gt;

								  &lt;/emma:interpretation&gt;


								&lt;/emma:emma&gt;

								</pre>

								<p>A first interpretation process will produce:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="better"<br />

								    <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;origin&gt;Boston&lt;/origin&gt;

								    &lt;destination&gt;Denver&lt;/destination&gt;

								    &lt;date&gt;tomorrow&lt;/date&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>A second interpretation process, aware of the current date, will

								be able to produce a more refined instance, such as:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="best"

								    <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;origin&gt;Boston&lt;/origin&gt;

								    &lt;destination&gt;Denver&lt;/destination&gt;

								    &lt;date&gt;20030315&lt;/date&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>The interaction manager might need to have access to the three

								levels of interpretation. The <code>emma:derived-from</code>

								annotation element can be used to establish a chain of derivation

								relationships as in the following example:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:derivation&gt;

								    &lt;emma:interpretation id="raw"<br />

								<span>      emma:medium="acoustic" emma:mode="voice"</span>&gt;

								      &lt;answer&gt;From Boston to Denver tomorrow&lt;/answer&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="better"&gt;

								      &lt;emma:derived-from resource="#raw" composite="false"/&gt;

								      &lt;origin&gt;Boston&lt;/origin&gt;

								      &lt;destination&gt;Denver&lt;/destination&gt;

								      &lt;date&gt;tomorrow&lt;/date&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:derivation&gt;


								  &lt;emma:interpretation id="best"&gt;

								    &lt;emma:derived-from resource="#better" composite="false"/&gt;

								    &lt;origin&gt;Boston&lt;/origin&gt;

								    &lt;destination&gt;Denver&lt;/destination&gt;

								    &lt;date&gt;20030315&lt;/date&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>The <code>emma:derivation</code> element MAY be used as a

								container for representations of the earlier stages in the

								interpretation of the input. The latest stage of processing MUST be

								a direct child of <code>emma:emma</code>.</p>

								<p>The <code>resource</code> attribute on <code>emma:derived-from</code> is a

								URI which can reference IDs in the current or other EMMA

								documents.</p>
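
								<p>As a sketch of a reference into another EMMA document, the
								<code>resource</code> URI below combines a document address with a
								fragment identifier. The address
								<code>http://example.com/emma/stage1.xml</code> is hypothetical and
								is assumed to hold the interpretation whose <code>id</code> is
								<code>better</code>:</p>

								<pre class="example">
								&lt;emma:emma version="1.0"
								    xmlns:emma="http://www.w3.org/2003/04/emma"
								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
								    xsi:schemaLocation="http://www.w3.org/2003/04/emma
								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"
								    xmlns="http://www.example.com/example"&gt;
								  &lt;emma:interpretation id="best"
								    emma:medium="acoustic" emma:mode="voice"&gt;
								    &lt;!-- earlier stage lives in a separate (hypothetical) document --&gt;
								    &lt;emma:derived-from
								        resource="http://example.com/emma/stage1.xml#better"
								        composite="false"/&gt;
								    &lt;origin&gt;Boston&lt;/origin&gt;
								    &lt;destination&gt;Denver&lt;/destination&gt;
								    &lt;date&gt;20030315&lt;/date&gt;
								  &lt;/emma:interpretation&gt;
								&lt;/emma:emma&gt;
								</pre>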

								<p>In addition to representing sequential derivations, the EMMA

								<code>emma:derived-from</code> element can also be used to capture

								composite derivations. Composite derivations involve the combination of

								inputs from different modes.</p>

								<p>In order to indicate whether an <code>emma:derived-from</code>

								element describes a sequential derivation step or a composite

								derivation step, the <code>emma:derived-from</code> element has an

								attribute <code>composite</code> which has a boolean value. A

								composite <code>emma:derived-from</code> MUST be marked as

								<code>composite="true"</code> while a sequential

								<code>emma:derived-from</code> element is marked as

								<code>composite="false"</code>. If this attribute is not specified

								the value is <code>false</code> by default.</p>

								<p>In the following composite derivation example the user said

								"destination" using the voice mode and circled Boston on a map

								using the ink mode:</p>

								<div>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:derivation&gt;

								    &lt;emma:interpretation id="voice1"

								        emma:start="1087995961500"

								        emma:end="1087995962542"

								        emma:process="http://example.com/myasr.xml"

								        emma:source="http://example.com/microphone/NC-61"

								        emma:signal="http://example.com/signals/sg23.wav"

								        emma:confidence="0.6"

								        emma:medium="acoustic"

								        emma:mode="voice"

								        emma:function="dialog"

								        emma:verbal="true"

								        emma:lang="en-US"

								        emma:tokens="destination"&gt;

								      &lt;rawinput&gt;destination&lt;/rawinput&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="ink1"

								        emma:start="1087995961600"

								        emma:end="1087995964000"

								        emma:process="http://example.com/mygesturereco.xml"

								        emma:source="http://example.com/pen/wacom123"

								        emma:signal="http://example.com/signals/ink5.inkml"

								        emma:confidence="0.5"

								        emma:medium="tactile"

								        emma:mode="ink"

								        emma:function="dialog"

								        emma:verbal="false"&gt;

								      &lt;rawinput&gt;Boston&lt;/rawinput&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:derivation&gt;


								  &lt;emma:interpretation id="multimodal1"


								      emma:confidence="0.3"

								      <span>emma:start="1087995961500"</span>

								      <span>emma:end="1087995964000"</span>

								      emma:medium="<span>acoustic tactile</span>"

								      emma:mode="<span>voice ink</span>"

								      emma:function="dialog"

								      emma:verbal="true"

								      emma:lang="en-US"

								      emma:tokens="destination"&gt;

								    &lt;emma:derived-from resource="#voice1" composite="true"/&gt;

								    &lt;emma:derived-from resource="#ink1" composite="true"/&gt;

								    &lt;destination&gt;Boston&lt;/destination&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre></div>

								<p>In this example, annotations on the multimodal interpretation

								indicate the process used for the integration and there are two

								<code>emma:derived-from</code> elements, one pointing to the speech

								and one pointing to the pen gesture.</p>

								<p>The only constraints the EMMA specification places on the

								annotations that appear on a composite input are that the

								<code>emma:medium</code> attribute MUST contain the union of the

								<code>emma:medium</code> attributes on the combining inputs,

								represented as a space delimited set of <code>nmtokens</code> as

								defined in <a href="#s4.2.11">Section 4.2.11</a>, and that the

								<code>emma:mode</code> attribute MUST contain the union of the

								<code>emma:mode</code> attributes on the combining inputs,

								represented as a space delimited set of <span><code>nmtokens</code>

								as defined in <a href="#s4.2.11">Section 4.2.11</a></span>. In the

								example above this means that the <code>emma:medium</code> value

								is <code>"acoustic tactile"</code> and the <code>emma:mode</code>

								attribute is <code>"voice ink"</code>. How all other annotations

								are handled is author defined. In the following paragraph,

								informative examples on how specific annotations might be handled

								are given.</p>

								<p>With reference to the illustrative example above, this paragraph

								provides informative guidance regarding the determination of

								annotations (beyond <code>emma:medium</code> and

								<code>emma:mode</code>) on a composite multimodal interpretation.

								Generally the timestamp on a combined input should contain the

								intervals indicated by the combining inputs. For the absolute

								timestamps <code>emma:start</code> and <code>emma:end</code> this

								can be achieved by taking the earlier of the

								<code>emma:start</code> values

								(<code>emma:start="1087995961500"</code> in our example) and the

								later of the <code>emma:end</code> values

								(<code>emma:end="1087995964000"</code> in the example). The

								determination of relative timestamps for composite inputs is more complex;

								informative guidance is given in <a href="#s4.2.10.4">Section

								4.2.10.4</a>. Generally speaking the <code>emma:confidence</code>

								value will be some numerical combination of the confidence scores

								assigned to the combining inputs. In our example, it is the result

								of multiplying the voice and ink confidence scores

								(<code>0.3</code>). In other cases there may not be a confidence

								score for one of the combining inputs and the author may choose to

								copy the confidence score from the input which does have one.

								Generally, for <code>emma:verbal</code>, if either of the inputs

								has the value <code>true</code> then the multimodal interpretation

								will also be <code>emma:verbal="true"</code> as in the example. In

								other words the annotation for the composite input is the result of

								an inclusive OR of the boolean values of the annotations on the

								inputs. If an annotation is only specified on one of the combining

								inputs then it may in some cases be assumed to apply to the

								multimodal interpretation of the composite input. In the example,

								<code>emma:lang="en-US"</code> is only specified for the speech

								input, and this annotation appears on the composite result also.

								Similarly in our example, only the voice has

								<code>emma:tokens</code> and the author has chosen to annotate the

								combined input with the same <code>emma:tokens</code> value. In

								this example, the <code>emma:function</code> is the same on both

								combining inputs and the author has chosen to use the same

								annotation on the composite interpretation.</p>
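<p>The guidance above can be sketched in a few lines of Python. This is a
minimal, informative sketch and not part of EMMA: the function name
<code>combine_annotations</code> and the dictionary layout are assumptions made
for illustration. Only the unions of <code>emma:medium</code> and
<code>emma:mode</code> are required by the specification; the handling of the
other annotations reflects one possible author choice.</p>

```python
from math import prod


def combine_annotations(inputs):
    """Sketch: derive annotations for a composite interpretation.

    `inputs` is a list of dicts keyed by unprefixed annotation name
    (a hypothetical in-memory form, not defined by EMMA).
    """

    def union(name):
        # Space-delimited union of token sets, keeping first-seen order.
        seen = []
        for item in inputs:
            for token in item.get(name, "").split():
                if token not in seen:
                    seen.append(token)
        return " ".join(seen)

    combined = {
        "medium": union("medium"),  # MUST be the union of the inputs
        "mode": union("mode"),      # MUST be the union of the inputs
        # Informative choices: cover the whole interval, OR the booleans.
        "start": min((i["start"] for i in inputs if "start" in i), default=None),
        "end": max((i["end"] for i in inputs if "end" in i), default=None),
        "verbal": any(i.get("verbal", False) for i in inputs),
    }
    scores = [i["confidence"] for i in inputs if "confidence" in i]
    if scores:
        # Product of the available scores; a single score is simply copied.
        combined["confidence"] = prod(scores)
    return combined


voice = {"medium": "acoustic", "mode": "voice", "verbal": True,
         "start": 1087995961500, "end": 1087995962542, "confidence": 0.6}
ink = {"medium": "tactile", "mode": "ink", "verbal": False,
       "start": 1087995961600, "end": 1087995964000, "confidence": 0.5}

composite = combine_annotations([voice, ink])
print(composite["medium"], "|", composite["mode"])
```

<p>Multiplying confidence scores is only one possible combination; an author
may prefer a different function, since the specification leaves this
choice open.</p>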

								<p>In annotating derivations of the processing of the input, EMMA

								provides the flexibility of either coarse-grained or fine-grained

								annotation of relations among interpretations. For example, when

								relating two N-best lists held within <code>emma:one-of</code> elements,

								there can either be a single <code>emma:derived-from</code> element

								under the later <code>emma:one-of</code> referring to the ID of the

								<code>emma:one-of</code> for the earlier processing stage:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:derivation&gt;

								    &lt;emma:one-of id="nbest1"

								      <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								      &lt;emma:interpretation id="int1"&gt;

								       &lt;res&gt;from boston to denver on march eleven two thousand three&lt;/res&gt;

								      &lt;/emma:interpretation&gt;


								      &lt;emma:interpretation id="int2"&gt;

								       &lt;res&gt;from austin to denver on march eleven two thousand three&lt;/res&gt;

								      &lt;/emma:interpretation&gt;

								    &lt;/emma:one-of&gt;

								  &lt;/emma:derivation&gt;


								  &lt;emma:one-of id="nbest2"&gt;

								    &lt;emma:derived-from resource="#nbest1" composite="false"/&gt;

								    &lt;emma:interpretation id="int1b"&gt;

								      &lt;origin&gt;Boston&lt;/origin&gt;

								      &lt;destination&gt;Denver&lt;/destination&gt;

								      &lt;date&gt;03112003&lt;/date&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="int2b"&gt;

								      &lt;origin&gt;Austin&lt;/origin&gt;

								      &lt;destination&gt;Denver&lt;/destination&gt;

								      &lt;date&gt;03112003&lt;/date&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:one-of&gt;


								&lt;/emma:emma&gt;

								</pre>

								<p>Or there can be a separate <code>emma:derived-from</code>

								element on each <code>emma:interpretation</code> element referring

								to the specific <code>emma:interpretation</code> element it was

								derived from.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:one-of id="nbest2"&gt;

								    &lt;emma:interpretation id="int1b"&gt;

								     &lt;emma:derived-from resource="#int1" composite="false"/&gt;

								      &lt;origin&gt;Boston&lt;/origin&gt;

								      &lt;destination&gt;Denver&lt;/destination&gt;

								      &lt;date&gt;03112003&lt;/date&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="int2b"&gt;

								     &lt;emma:derived-from resource="#int2" composite="false"/&gt;

								      &lt;origin&gt;Austin&lt;/origin&gt;

								      &lt;destination&gt;Denver&lt;/destination&gt;

								      &lt;date&gt;03112003&lt;/date&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:one-of&gt;

								  &lt;emma:derivation&gt;

								    &lt;emma:one-of id="nbest1"

								      <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								      &lt;emma:interpretation id="int1"&gt;

								       &lt;res&gt;from boston to denver on march eleven two thousand three&lt;/res&gt;

								      &lt;/emma:interpretation&gt;


								      &lt;emma:interpretation id="int2"&gt;

								       &lt;res&gt;from austin to denver on march eleven two thousand three&lt;/res&gt;

								      &lt;/emma:interpretation&gt;

								    &lt;/emma:one-of&gt;

								  &lt;/emma:derivation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p><a href="#s4.3">Section 4.3</a> provides further examples of the

								use of <code>emma:derived-from</code> to represent sequential

								derivations and addresses the issue of the scope of EMMA

								annotations across derivations of user input.</p>
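<p>A consumer of an EMMA document will typically need to follow
<code>emma:derived-from</code> references back to the earlier stage. The
following Python sketch, using only the standard library, shows one way to do
this for the fine-grained case. The helper name
<code>derivation_sources</code> is hypothetical, the document is an abridged
form of the example above, and only same-document references of the form
<code>#id</code> are handled.</p>

```python
import xml.etree.ElementTree as ET

NS = {"emma": "http://www.w3.org/2003/04/emma"}

DOC = """<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma"
    xmlns="http://www.example.com/example">
  <emma:one-of id="nbest2">
    <emma:interpretation id="int1b">
      <emma:derived-from resource="#int1" composite="false"/>
      <origin>Boston</origin>
    </emma:interpretation>
    <emma:interpretation id="int2b">
      <emma:derived-from resource="#int2"/>
      <origin>Austin</origin>
    </emma:interpretation>
  </emma:one-of>
  <emma:derivation>
    <emma:one-of id="nbest1" emma:medium="acoustic" emma:mode="voice">
      <emma:interpretation id="int1"><res>from boston</res></emma:interpretation>
      <emma:interpretation id="int2"><res>from austin</res></emma:interpretation>
    </emma:one-of>
  </emma:derivation>
</emma:emma>"""


def derivation_sources(root, element_id):
    """Return (source_id, is_composite) for each emma:derived-from child."""
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    sources = []
    for link in by_id[element_id].findall("emma:derived-from", NS):
        resource = link.get("resource", "")
        # Only same-document fragment references ("#id") are resolved here.
        source_id = resource[1:] if resource.startswith("#") else None
        # composite defaults to false when the attribute is not specified.
        is_composite = link.get("composite", "false") in ("true", "1")
        sources.append((source_id, is_composite))
    return sources


root = ET.fromstring(DOC)
print(derivation_sources(root, "int1b"))
```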

								<h3 id="s4.1.3">4.1.3 Reference to grammar used:

								<code>emma:grammar</code> element</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:grammar</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An element used to provide a reference to the grammar used in

								processing the input.</td>

								</tr>

								<tr>

								<th>Children</th>

								<td>None</td>

								</tr>

								<tr>

								<th>Attributes</th>

								<td>

								<ul>

								<li><b>Required</b>:

								<ul>

								<li><code><span>ref</span></code> of type <code>xsd:anyURI</code>

								that references a grammar used in processing the input.</li>

								<li><code>id</code> of type <code>xsd:ID</code>.</li>

								</ul>

								</li>

								</ul>

								</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td>The <code>emma:grammar</code> is legal only as a child of the

								<code>emma:emma</code> element.</td>

								</tr>

								</tbody>

								</table>

								<p>The grammar that was used to derive the EMMA result MAY be

								specified with the <code>emma:grammar</code> annotation defined as

								an element in the EMMA namespace.</p>

								<p>Example:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:grammar id="gram1" <span>ref</span>="someURI"/&gt;

								  &lt;emma:grammar id="gram2" <span>ref</span>="anotherURI"/&gt;

								  &lt;emma:one-of id="r1"

								    <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;emma:interpretation id="int1" emma:grammar-ref="gram1"&gt;

								      &lt;origin&gt;Boston&lt;/origin&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="int2" emma:grammar-ref="gram1"&gt;

								        &lt;origin&gt;Austin&lt;/origin&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="int3" emma:grammar-ref="gram2"&gt;

								        &lt;command&gt;help&lt;/command&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:one-of&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>The <code>emma:grammar</code> annotation is a child of

								<code>emma:emma</code>.</p>
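<p>As an informative illustration, the following Python sketch looks up the
grammar URI for an interpretation by matching its
<code>emma:grammar-ref</code> value against the <code>id</code> of the
<code>emma:grammar</code> children of <code>emma:emma</code>. The helper name
<code>grammar_for</code> is an assumption, and the document is an abridged
form of the example above.</p>

```python
import xml.etree.ElementTree as ET

EMMA = "http://www.w3.org/2003/04/emma"
NS = {"emma": EMMA}

DOC = """<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma"
    xmlns="http://www.example.com/example">
  <emma:grammar id="gram1" ref="someURI"/>
  <emma:grammar id="gram2" ref="anotherURI"/>
  <emma:one-of id="r1" emma:medium="acoustic" emma:mode="voice">
    <emma:interpretation id="int1" emma:grammar-ref="gram1">
      <origin>Boston</origin>
    </emma:interpretation>
    <emma:interpretation id="int3" emma:grammar-ref="gram2">
      <command>help</command>
    </emma:interpretation>
  </emma:one-of>
</emma:emma>"""


def grammar_for(root, interpretation_id):
    """Return the grammar URI referenced by an interpretation, or None."""
    # emma:grammar is legal only as a direct child of emma:emma.
    grammars = {g.get("id"): g.get("ref") for g in root.findall("emma:grammar", NS)}
    for interp in root.iter(f"{{{EMMA}}}interpretation"):
        if interp.get("id") == interpretation_id:
            # emma:grammar-ref is a namespaced attribute.
            return grammars.get(interp.get(f"{{{EMMA}}}grammar-ref"))
    return None


root = ET.fromstring(DOC)
print(grammar_for(root, "int3"))
```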

								<h3 id="s4.1.4">4.1.4 Extensibility to application/vendor specific

								annotations: <code>emma:info</code> element</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:info</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>The <code>emma:info</code> element acts as a container for

								vendor and/or application specific metadata regarding a user's

								input.</td>

								</tr>

								<tr>

								<th>Children</th>

								<td><span>One or more</span> elements in the application namespace

								providing metadata about the input.</td>

								</tr>

								<tr>

								<th>Attributes</th>

								<td>

								<ul>

								<li><b>Optional</b>:

								<ul>

								<li><code>id</code> of type <code>xsd:ID</code>.</li>

								</ul>

								</li>

								</ul>

								</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td>The <code>emma:info</code> element is legal only as a child of

								the EMMA elements <code>emma:emma</code>,

								<code>emma:interpretation</code>, <code>emma:group</code>,

								<code>emma:one-of</code>, <code>emma:sequence</code>,

								<code>emma:arc</code>, or <code>emma:node</code>.</td>

								</tr>

								</tbody>

								</table>

								<p>In <a href="#s4.2">Section 4.2</a>, a series of attributes are

								defined for representation of metadata about user inputs in a

								standardized form. EMMA also provides an extensibility mechanism

								for annotation of user inputs with vendor or application specific

								metadata not covered by the standard set of EMMA annotations. The

								element <code>emma:info</code> MUST be used as a container for

								these annotations, UNLESS they are explicitly covered by

								<code>emma:endpoint-info</code>. For example, if an input to a

								dialog system needed to be annotated with the number that the call

								originated from, the caller's state, some indication of the type of

								customer, and the name of the service, these pieces of information

								could be represented within <code>emma:info</code> as in the

								following example:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:info&gt;

								    &lt;caller_id&gt;

								      &lt;phone_number&gt;2121234567&lt;/phone_number&gt;

								      &lt;state&gt;NY&lt;/state&gt;

								    &lt;/caller_id&gt;


								    &lt;customer_type&gt;residential&lt;/customer_type&gt;

								    &lt;service_name&gt;acme_travel_service&lt;/service_name&gt;

								  &lt;/emma:info&gt;


								  &lt;emma:one-of id="r1" emma:start="1087995961542"

								      emma:end="1087995963542"

								      <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;emma:interpretation id="int1" emma:confidence="0.75"&gt;

								      &lt;origin&gt;Boston&lt;/origin&gt;

								      &lt;destination&gt;Denver&lt;/destination&gt;

								      &lt;date&gt;03112003&lt;/date&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="int2" emma:confidence="0.68"&gt;

								      &lt;origin&gt;Austin&lt;/origin&gt;

								      &lt;destination&gt;Denver&lt;/destination&gt;

								      &lt;date&gt;03112003&lt;/date&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:one-of&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>It is important to have an EMMA container element for

								application/vendor specific annotations since EMMA elements provide

								a structure for representation of multiple possible interpretations

								of the input. As a result it is cumbersome to state

								application/vendor specific metadata as part of the application

								data within each <code>emma:interpretation</code>. An element is

								used rather than an attribute so that internal structure can be

								given to the annotations within <code>emma:info</code>.</p>

								<p>In addition to <code>emma:emma</code>, <code>emma:info</code>

								MAY also appear as a child of other structural elements such as

								<code>emma:interpretation</code>, <code>emma:one-of</code> and so on.

								When <code>emma:info</code> appears as a child of one of these

								elements the application/vendor specific annotations contained

								within <code>emma:info</code> are assumed to apply to all of the

								<code>emma:interpretation</code> elements within the containing

								element. The semantics of conflicting annotations in

								<code>emma:info</code>, for example when different values are found

								within <code>emma:emma</code> and <code>emma:interpretation</code>,

								are left to the developer of the vendor/application specific

								annotations.</p>
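<p>The scoping rule described above can be sketched as follows: for a given
interpretation, collect every <code>emma:info</code> found on the
interpretation itself and on each containing element. This is a minimal
Python sketch; the helper name <code>applicable_info</code> is hypothetical,
and returning the innermost annotations first is an assumption, since the
specification leaves conflict handling to the developer.</p>

```python
import xml.etree.ElementTree as ET

NS = {"emma": "http://www.w3.org/2003/04/emma"}
APP = "{http://www.example.com/example}"

DOC = """<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma"
    xmlns="http://www.example.com/example">
  <emma:info>
    <customer_type>residential</customer_type>
  </emma:info>
  <emma:one-of id="r1" emma:medium="acoustic" emma:mode="voice">
    <emma:interpretation id="int1" emma:confidence="0.75">
      <emma:info><customer_type>business</customer_type></emma:info>
      <origin>Boston</origin>
    </emma:interpretation>
    <emma:interpretation id="int2" emma:confidence="0.68">
      <origin>Austin</origin>
    </emma:interpretation>
  </emma:one-of>
</emma:emma>"""


def applicable_info(root, interpretation_id):
    """Return emma:info elements in scope for an interpretation, innermost first."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent = {child: p for p in root.iter() for child in p}
    node = next(el for el in root.iter() if el.get("id") == interpretation_id)
    found = []
    while node is not None:
        found.extend(node.findall("emma:info", NS))
        node = parent.get(node)
    return found


root = ET.fromstring(DOC)
for info in applicable_info(root, "int1"):
    print(info.find(APP + "customer_type").text)
```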

								<h3 id="s4.1.5" class="notoc">4.1.5 Endpoint reference:

								<code>emma:endpoint-info</code> element and

								<code>emma:endpoint</code> element</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:endpoint-info</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>The <code>emma:endpoint-info</code> element acts as a container

								for all application specific annotation regarding the communication

								environment.</td>

								</tr>

								<tr>

								<th>Children</th>

								<td>One or more <code>emma:endpoint</code> elements.</td>

								</tr>

								<tr>

								<th>Attributes</th>

								<td>

								<ul>

								<li><b>Required</b>:

								<ul>

								<li><code>id</code> of type <code>xsd:ID</code>.</li>

								</ul>

								</li>

								</ul>

								</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td>The <code>emma:endpoint-info</code> element is legal only as a

								child of <code>emma:emma</code>.</td>

								</tr>

								<tr>

								<th>Annotation</th>

								<th>emma:endpoint</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>The <code>emma:endpoint</code> element acts as a container for application specific

								endpoint information.</td>

								</tr>

								<tr>

								<th>Children</th>

								<td>Elements in the application namespace providing metadata about

								the input.</td>

								</tr>

								<tr>

								<th>Attributes</th>

								<td>

								<ul>

								<li><b>Required</b>:

								<ul>

								<li><code>id</code> of type <code>xsd:ID</code></li>

								</ul>

								</li>

								<li><b>Optional</b>: <code>emma:endpoint-role</code>,

								<code>emma:endpoint-address</code>, <code>emma:message-id</code>,

								<code>emma:port-num</code>, <code>emma:port-type</code>,

								<code>emma:endpoint-pair-ref</code>,

								<code>emma:service-name</code>, <code>emma:media-type</code>,

								<code>emma:medium</code>, <code>emma:mode</code>.</li>

								</ul>

								</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:endpoint-info</code></td>

								</tr>

								</tbody>

								</table>

								<p>To support multimodal interaction, EMMA needs a way to specify

								the properties of the endpoint that receives the input leading to

								the EMMA annotation. This allows subsequent components to use the

								endpoint properties, as well as the annotated inputs, to conduct

								meaningful multimodal interaction. The EMMA

								element <code>emma:endpoint</code> can be used for this purpose. It

								can specify the endpoint properties based on a set of common

								endpoint property attributes in EMMA, such as

								<code>emma:endpoint-address</code>, <code>emma:port-num</code>,

								<code>emma:port-type</code>, etc. (<a href="#s4.2.14">Section

								4.2.14</a>). Moreover, it provides an extensible annotation

								structure that allows the inclusion of application and vendor

								specific endpoint properties.</p>

								<p>Note that the usage of the term "endpoint" in this context is

								different from the way that the term is used in speech processing,

								where it refers to the end of a speech input. As used here,

								"endpoint" refers to a network location which is the source or

								recipient of an EMMA document.</p>

								<p>In multimodal interaction, multiple devices can be used and each

								device can open multiple communication endpoints at the same time.

								These endpoints are used to transmit and receive data, such as raw

								input, EMMA documents, etc. The EMMA element

								<code>emma:endpoint</code> provides a generic representation of

								endpoint information which is relevant to multimodal interaction.

								It allows the annotation to be interoperable, and it eliminates the

								need for EMMA processors to create their own specialized

								annotations for existing protocols, potential protocols or yet

								undefined private protocols that they may use.</p>

								<p>Moreover, <code>emma:endpoint-info</code> provides a container

								to hold all annotations regarding the endpoint information,

								including <code>emma:endpoint</code> and other application and

								vendor specific annotations that are related to the communication,

								allowing the same communication environment to be referenced and

								used in multiple interpretations.</p>

								<p>Note that EMMA provides two locations (i.e.

								<code>emma:info</code> and <code>emma:endpoint-info</code>) for

								specifying vendor/application specific annotations. If the

								annotation is specifically related to the description of the

								endpoint, then the vendor/application specific annotation SHOULD be

								placed within <code>emma:endpoint-info</code>, otherwise it SHOULD

								be placed within <code>emma:info</code>.</p>

								<p>The following example illustrates the annotation of endpoint

								reference properties in EMMA.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"

								    xmlns:ex="http://www.example.com/emma/port"&gt;

								  &lt;emma:endpoint-info id="audio-channel-1"&gt;

								    &lt;emma:endpoint id="endpoint1"

								        emma:endpoint-role="sink"

								        emma:endpoint-address="135.61.71.103"

								        emma:port-num="50204"

								        emma:port-type="rtp"

								        emma:endpoint-pair-ref="endpoint2"

								        emma:media-type="audio/dsr-202212; rate:8000; maxptime:40"

								        emma:service-name="travel"

								        emma:mode="voice"&gt;

								      &lt;ex:app-protocol&gt;SIP&lt;/ex:app-protocol&gt;

								    &lt;/emma:endpoint&gt;


								    &lt;emma:endpoint id="endpoint2"

								        emma:endpoint-role="source"

								        emma:endpoint-address="136.62.72.104"

								        emma:port-num="50204"

								        emma:port-type="rtp"

								        emma:endpoint-pair-ref="endpoint1"

								        emma:media-type="audio/dsr-202212; rate:8000; maxptime:40"

								        emma:service-name="travel"

								        emma:mode="voice"&gt;

								      &lt;ex:app-protocol&gt;SIP&lt;/ex:app-protocol&gt;

								    &lt;/emma:endpoint&gt;

								  &lt;/emma:endpoint-info&gt;


								  &lt;emma:interpretation id="int1"

								      emma:start="1087995961542" emma:end="1087995963542"

								      emma:endpoint-info-ref="audio-channel-1"

								      <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;destination&gt;Chicago&lt;/destination&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>The <code>ex:app-protocol</code> is provided by the application

								or the vendor specification. It specifies that the application

								layer protocol used to establish the speech transmission from the

								"source" port to the "sink" port is Session Initiation Protocol

								(SIP). This is specific to SIP based VoIP communication, in which

								the actual media transmission and the call signaling that controls

								the communication sessions are separated and typically based on

								different protocols. In the above example, the Real-time

								Transport Protocol (RTP) is used in the media transmission

								between the source port and the sink port.</p>
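<p>To illustrate how a downstream component might use these annotations, the
Python sketch below resolves the <code>emma:endpoint-info-ref</code> on an
interpretation and returns the address and port of each endpoint keyed by its
role. The helper names <code>q</code> and <code>endpoints_for</code> are
assumptions, and the document is an abridged form of the example above.</p>

```python
import xml.etree.ElementTree as ET

EMMA = "http://www.w3.org/2003/04/emma"
NS = {"emma": EMMA}

DOC = """<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma"
    xmlns="http://www.example.com/example"
    xmlns:ex="http://www.example.com/emma/port">
  <emma:endpoint-info id="audio-channel-1">
    <emma:endpoint id="endpoint1" emma:endpoint-role="sink"
        emma:endpoint-address="135.61.71.103" emma:port-num="50204"
        emma:port-type="rtp" emma:endpoint-pair-ref="endpoint2">
      <ex:app-protocol>SIP</ex:app-protocol>
    </emma:endpoint>
    <emma:endpoint id="endpoint2" emma:endpoint-role="source"
        emma:endpoint-address="136.62.72.104" emma:port-num="50204"
        emma:port-type="rtp" emma:endpoint-pair-ref="endpoint1">
      <ex:app-protocol>SIP</ex:app-protocol>
    </emma:endpoint>
  </emma:endpoint-info>
  <emma:interpretation id="int1" emma:endpoint-info-ref="audio-channel-1"
      emma:medium="acoustic" emma:mode="voice">
    <destination>Chicago</destination>
  </emma:interpretation>
</emma:emma>"""


def q(name):
    """Qualify a local name with the EMMA namespace for ElementTree lookups."""
    return f"{{{EMMA}}}{name}"


def endpoints_for(root, interpretation_id):
    """Map endpoint role -> (address, port) for an interpretation's endpoint-info."""
    # emma:endpoint-info is legal only as a direct child of emma:emma.
    infos = {i.get("id"): i for i in root.findall("emma:endpoint-info", NS)}
    info = None
    for interp in root.iter(q("interpretation")):
        if interp.get("id") == interpretation_id:
            info = infos.get(interp.get(q("endpoint-info-ref")))
            break
    if info is None:
        return {}
    return {
        ep.get(q("endpoint-role")): (ep.get(q("endpoint-address")), ep.get(q("port-num")))
        for ep in info.findall("emma:endpoint", NS)
    }


root = ET.fromstring(DOC)
print(endpoints_for(root, "int1"))
```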

								<h2 id="s4.2">4.2 EMMA annotation attributes</h2>

								<h3 id="s4.2.1">4.2.1 Tokens of input: <code>emma:tokens</code>

								attribute</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:tokens</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:string</code> holding a sequence

								of input tokens.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:group</code>,

								<code>emma:one-of</code>, <code>emma:sequence</code>, and

								application instance data.</td>

								</tr>

								</tbody>

								</table>

								<p>The <code>emma:tokens</code> annotation holds a list of input

								tokens. In the following description, the term <i>tokens</i> is

								used in the computational and syntactic sense of <i>units of

								input</i>, and not in the sense of <i>XML tokens</i>. The value

								held in <code>emma:tokens</code> is the list of the tokens of input

								as produced by the processor which generated the EMMA document;

								there is no language associated with this value.</p>

								<p>In the case where a grammar is used to constrain input, the

								value will correspond to tokens as defined by the grammar. So for

								an EMMA document produced by input to an SRGS grammar [<a href=

								"#SRGS">SRGS</a>], the value of <code>emma:tokens</code> will be

								the list of words and/or phrases that are defined as tokens in SRGS

								(<span>see</span> Section 2.1 <span>of [<a href=

								"#SRGS">SRGS</a>]</span>). Items in the <code>emma:tokens</code>

								list are delimited by white space and/or quotation marks for

								phrases containing white space. For example:</p>

								<pre class="example">

								emma:tokens="arriving at 'Liverpool Street'"

								</pre>

								<p>where the three tokens of input are <i>arriving</i>, <i>at</i>

								and <i>Liverpool Street</i>.</p>
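<p>As an informative sketch, a consumer could recover the individual tokens
from an <code>emma:tokens</code> value with Python's standard
<code>shlex</code> module, which splits on white space and keeps quoted
phrases whole. The function name <code>split_tokens</code> is hypothetical,
and this approach assumes that quotation marks are balanced; a token
containing a lone apostrophe would need different handling.</p>

```python
import shlex


def split_tokens(value):
    """Split an emma:tokens value on white space, keeping quoted phrases whole."""
    return shlex.split(value)


print(split_tokens("arriving at 'Liverpool Street'"))
# prints ['arriving', 'at', 'Liverpool Street']
```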

								<p>The <code>emma:tokens</code> annotation MAY be applied not just

								to the lexical words and phrases of language but to any level of

								input processing. Other examples of tokenization include phonemes,

								ink strokes, gestures and any other discrete units of input at any

								level.</p>

								<p>Example:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="int1"

								      emma:tokens="From Cambridge to London tomorrow"

								      <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;origin emma:tokens="From Cambridge"&gt;Cambridge&lt;/origin&gt;

								    &lt;destination emma:tokens="to London"&gt;London&lt;/destination&gt;

								    &lt;date emma:tokens="tomorrow"&gt;20030315&lt;/date&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<h3 id="s4.2.2">4.2.2 Reference to processing:

								<code>emma:process</code> attribute</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:process</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:anyURI</code> referencing the

								process used to generate the interpretation.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:one-of</code>,

								<code>emma:group</code>, <code>emma:sequence</code></td>

								</tr>

								</tbody>

								</table>

								<p>A reference to the information concerning the processing that

								was used for generating an interpretation MAY be made using the

								<code>emma:process</code> attribute. For example:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:derivation&gt;

								    &lt;emma:interpretation id="raw"

								      <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								      &lt;answer&gt;From Boston to Denver tomorrow&lt;/answer&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="better"

								        emma:process="http://example.com/mysemproc1.xml"&gt;

								      &lt;origin&gt;Boston&lt;/origin&gt;

								      &lt;destination&gt;Denver&lt;/destination&gt;

								      &lt;date&gt;tomorrow&lt;/date&gt;

								      &lt;emma:derived-from resource="#raw"/&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:derivation&gt;


								  &lt;emma:interpretation id="best"

								      emma:process="http://example.com/mysemproc2.xml"&gt;

								    &lt;origin&gt;Boston&lt;/origin&gt;

								    &lt;destination&gt;Denver&lt;/destination&gt;

								    &lt;date&gt;03152003&lt;/date&gt;

								    &lt;emma:derived-from resource="#better"/&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>The process description document referenced by the

								<code>emma:process</code> annotation MAY include information on the

								process itself, such as grammar, type of parser, etc. EMMA is not

								normative about the format of the process description document.</p>

								<h3 id="s4.2.3">4.2.3 Lack of input: <code>emma:no-input</code>

								attribute</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:no-input</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>Attribute holding <code>xsd:boolean</code> value that is true

								if there was no input.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code></td>

								</tr>

								</tbody>

								</table>

								<p>The case of lack of input MUST be annotated as follows:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="int1" emma:no-input="true"<br />

								   <span>emma:medium="acoustic" emma:mode="voice"</span>/&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>If the <code>emma:interpretation</code> is annotated with

								<code>emma:no-input="true"</code> then the

								<code>emma:interpretation</code> MUST be empty.</p>

								<h3 id="s4.2.4">4.2.4 Uninterpreted input:

								<code>emma:uninterpreted</code> attribute</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:uninterpreted</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>Attribute holding <code>xsd:boolean</code> value that is true

								if <span>no interpretation was produced in response to the

								input</span></td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code></td>

								</tr>

								</tbody>

								</table>

								<p>An <code>emma:interpretation</code> element representing input

								<span>for which no interpretation was produced</span> MUST be

								annotated with <code>emma:uninterpreted="true"</code>. For

								example:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								    http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="interp1" emma:uninterpreted="true"<br />

								   <span>emma:medium="acoustic" emma:mode="voice"</span>/&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>The notation for uninterpreted input MAY refer to any possible

								stage of interpretation processing, including raw transcriptions.

								For instance, no interpretation would be produced for stages

								performing pure signal capture such as audio recordings. Likewise,

								if a spoken input was recognized but cannot be parsed by a language

								understanding component, it can be tagged as

								<code>emma:uninterpreted</code> as in the following example:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="understanding"

								      emma:process="http://example.com/mynlu.xml"

								      emma:uninterpreted="true"

								      emma:tokens="From Cambridge to London tomorrow"<br />

								      <span>emma:medium="acoustic" emma:mode="voice"</span>/&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>The <code>emma:interpretation</code> MUST be empty <span class=

								"add">if</span> the <code>emma:interpretation</code> element is

								annotated with <code>emma:uninterpreted="true"</code>.</p>

								<h3 id="s4.2.5">4.2.5 Human language of input:

								<code>emma:lang</code> attribute</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:lang</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:language</code> indicating the

								language for the input.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:group</code>,

								<code>emma:one-of</code>, <code>emma:sequence</code>, and

								application instance data.</td>

								</tr>

								</tbody>

								</table>

								<p>The <code>emma:lang</code> annotation is used to indicate the

								human language for the input that it annotates. The values of the

								<code>emma:lang</code> attribute are language identifiers as

								defined by <span>IETF Best Current Practice 47 [<a href=

								"#BCP47">BCP47</a>]</span>. For example,

								<code>emma:lang="fr"</code> denotes French, and

								<code>emma:lang="en-US"</code> denotes US English.

								<code>emma:lang</code> MAY be applied to any

								<code>emma:interpretation</code> element. Its annotative scope

								follows the annotative scope of these elements. Unlike the

								<code>xml:lang</code> attribute in XML, <code>emma:lang</code> does

								not specify the language used by element contents or attribute

								values.</p>

								<p>The following example shows the use of <code>emma:lang</code>

								for annotating an input interpretation.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="int1" emma:lang="fr"<br />

								    <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;answer&gt;arretez&lt;/answer&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>
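
								<p>Because <code>emma:lang</code> also applies to container elements,

								a single annotation can cover several interpretations. The following

								is an illustrative sketch rather than an example from the

								specification: the <code>id</code> values and the

								<code>&lt;answer&gt;</code> content are assumptions, and the

								<code>emma:lang="fr"</code> annotation on <code>emma:one-of</code> is

								taken to apply to both contained interpretations.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:one-of id="nbest1" emma:lang="fr"

								      emma:medium="acoustic" emma:mode="voice"&gt;

								    &lt;emma:interpretation id="int1"&gt;

								      &lt;answer&gt;arretez&lt;/answer&gt;

								    &lt;/emma:interpretation&gt;

								    &lt;emma:interpretation id="int2"&gt;

								      &lt;answer&gt;achetez&lt;/answer&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:one-of&gt;

								&lt;/emma:emma&gt;

								</pre>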

								<p>Many kinds of input, including some inputs made through pen,

								computer vision, and other kinds of sensors, are inherently

								non-linguistic. Examples include areas or arrows drawn with a

								pen, and music input for tune recognition. If these non-linguistic

								inputs are annotated with <code>emma:lang</code>, then they MUST be

								annotated as <code>emma:lang="zxx"</code>. For example, pen input

								where a user circles an area on a map display could be represented as

								follows, where <code>emma:lang="zxx"</code> indicates that the ink

								input is not in any human language.</p>

								<pre class="example">

								<span>&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="pen1"

								      emma:medium="tactile"

								      emma:mode="ink"

								      emma:lang="zxx"&gt;

								    &lt;location&gt;

								      &lt;type&gt;area&lt;/type&gt;

								      &lt;points&gt;42.1345 -37.128 42.1346 -37.120 ... &lt;/points&gt;

								    &lt;/location&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;</span>

								</pre>

								<p>If inputs for which there is no information about whether the

								source input is in a particular human language, and if so which

								language, are annotated with <code>emma:lang</code>, then they MUST

								be annotated as <code>emma:lang=""</code>. Furthermore, in cases

								where there is no explicit <code>emma:lang</code> annotation, and

								none is inherited from a higher element in the document, the

								default value for <code>emma:lang</code> is <code>""</code>, meaning

								that there is no information about whether the source input is in a

								language and, if so, which language.</p>
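
								<p>As an illustrative sketch (the typed-input scenario, the

								<code>id</code> value, and the <code>&lt;answer&gt;</code> content are

								assumptions, not taken from the specification), a short typed input

								whose language could not be determined might be annotated explicitly

								as follows. Omitting <code>emma:lang</code> altogether would carry the

								same meaning, provided that no value is inherited from a higher

								element.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="int1" emma:lang=""

								      emma:medium="tactile" emma:mode="keys"&gt;

								    &lt;answer&gt;ok&lt;/answer&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>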

								<p>The <code>xml:lang</code> and <code>emma:lang</code> attributes

								serve distinct and equally important purposes. The role

								of the <code>xml:lang</code> attribute in XML 1.0 is to indicate

								the language used for character data content in an XML element or

								document. In contrast, the <code>emma:lang</code> attribute is used

								to indicate the language employed by a user when entering an input.

								Critically, <code>emma:lang</code> annotates the language of the

								signal originating from the user rather than the specific tokens

								used at a particular stage of processing. This is most clearly

								illustrated by an example involving multiple

								stages of processing of a user input. Consider the following

								scenario: EMMA is being used to represent three stages in the

								processing of a spoken input to a system for ordering products.

								The user input is in Italian. After speech recognition, the user

								input is first translated into English; then a natural language

								understanding system converts the English translation into a

								product ID (which is not in any particular language). Since the

								input signal is a user speaking Italian, the <code>emma:lang</code>

								will be <code>emma:lang="it"</code> on all three of these stages of

								processing. The <code>xml:lang</code> attribute, in contrast, will

								initially be <code>"it"</code>; after translation the

								<code>xml:lang</code> will be <code>"en-US"</code>; and after

								language understanding it will be <code>"zxx"</code>, since the

								product ID is non-linguistic content. The following are examples of

								EMMA documents corresponding to these three processing stages,

								abbreviated to show the critical attributes for discussion here.

								Note that <code>&lt;transcription&gt;</code>,

								<code>&lt;translation&gt;</code>, and

								<code>&lt;understanding&gt;</code> are application namespace

								elements, not part of the EMMA markup.</p>

								<pre class="example">

								<span>&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								   &lt;emma:interpretation emma:lang="it" emma:mode="voice" emma:medium="acoustic"&gt;<br />

								     &lt;transcription xml:lang="it"&gt;condizionatore&lt;/transcription&gt;<br />

								   &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</span>

								</pre>

								<pre class="example">

								<span>&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								    http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								    &lt;emma:interpretation emma:lang="it" emma:mode="voice" emma:medium="acoustic"&gt;

								       &lt;translation xml:lang="en-US"&gt;air conditioner&lt;/translation&gt;<br />

								    &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;</span>

								</pre>

								<pre class="example">

								<span>&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								    http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								    &lt;emma:interpretation emma:lang="it" emma:mode="voice" emma:medium="acoustic"&gt; <br />

								       &lt;understanding xml:lang="zxx"&gt;id1456&lt;/understanding&gt;<br />

								    &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;</span>

								</pre>

								<p>In order <span>to</span> handle inputs involving multiple

								languages, such as those arising from code switching, the

								<code>emma:lang</code> attribute MAY contain several language identifiers

								separated by spaces.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="int1"

								      emma:tokens="please stop arretez s'il vous plait"

								      emma:lang="en fr"

								      <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;command&gt; CANCEL &lt;/command&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<h3 id="s4.2.6">4.2.6 Reference to signal: <code>emma:signal</code>

								<span>and <code>emma:signal-size</code></span> attributes</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:signal</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:anyURI</code> referencing the

								input signal.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:one-of</code>,

								<code>emma:group</code>, <code>emma:sequence</code>,

								<span>and</span> application instance data.</td>

								</tr>

								<tr>

								<th>Annotation</th>

								<th>emma:signal-size</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute <span>of type <code>xsd:nonNegativeInteger</code>

								specifying</span> the size in eight bit octets of the referenced

								source.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:one-of</code>,

								<code>emma:group</code>, <code>emma:sequence</code>,

								<span>and</span> application instance data.</td>

								</tr>

								</tbody>

								</table>

								<p>A URI reference to the signal that originated the input

								recognition process MAY be represented in EMMA using the

								<code>emma:signal</code> annotation.</p>

								<p>Here is an example where the reference to a speech signal is

								represented using the <code>emma:signal</code> annotation on the

								<code>emma:interpretation</code> element:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="intp1"

								      emma:signal="http://example.com/signals/sg23.bin"<br />

								      <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;origin&gt;Boston&lt;/origin&gt;

								    &lt;destination&gt;Denver&lt;/destination&gt;

								    &lt;date&gt;03152003&lt;/date&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>The <code>emma:signal-size</code> annotation can be used to

								declare the exact size of the associated signal in 8-bit octets. The

								following EMMA document represents a recording,

								with <code>emma:signal-size</code> indicating the size of the

								recorded signal:</p>

								<pre class="example">

								<span>

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="intp1"

								      emma:medium="acoustic"

								      emma:mode="voice"

								      emma:function="recording"

								      emma:uninterpreted="true"

								      emma:signal="http://example.com/signals/recording.mpg"

								      emma:signal-size="82102"

								      emma:duration="10000"&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</span>

								</pre>

								<h3 id="s4.2.7">4.2.7 Media type: <code>emma:media-type</code>

								attribute</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:media-type</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:string</code> holding the MIME

								type associated with the signal's data format.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:one-of</code>,

								<code>emma:group</code>, <code>emma:sequence</code>,

								<code>emma:endpoint</code>, <span>and</span> application instance

								data.</td>

								</tr>

								</tbody>

								</table>

								<p>The data format of the signal that originated the input MAY be

								represented in EMMA using the <code>emma:media-type</code>

								annotation. An initial set of MIME media types is defined by

								[<a href="#RFC2046">RFC2046</a>].</p>

								<p>Here is an example where the media type for the ETSI ES 202 212

								audio codec for Distributed Speech Recognition (DSR) is applied to

								the <code>emma:interpretation</code> element. The example also

								specifies an optional sampling rate of 8 kHz and maxptime of 40

								milliseconds.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="intp1"<span>

								        emma:signal="http://example.com/signals/signal.dsr"</span>

								        emma:media-type="audio/dsr-<span>es</span>202212; rate=8000; maxptime=40"<br />

								        <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;origin&gt;Boston&lt;/origin&gt;

								    &lt;destination&gt;Denver&lt;/destination&gt;

								    &lt;date&gt;03152003&lt;/date&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<h3 id="s4.2.8">4.2.8 Confidence scores:

								<code>emma:confidence</code> attribute</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:confidence</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:decimal</code> in range 0.0 to

								1.0, indicating the processor's confidence in the result.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:one-of</code>,

								<code>emma:group</code>, <code>emma:sequence</code>, and

								application instance data.</td>

								</tr>

								</tbody>

								</table>

								<p>The confidence score in EMMA is used to indicate the quality of

								the input, and if confidence is annotated on an input it MUST be

								given as the value of <code>emma:confidence</code>. The confidence

								score MUST be a number in the range from 0.0 to 1.0 inclusive. A

								value of 0.0 indicates minimum confidence, and a value of 1.0

								indicates maximum confidence. Note that

								<code>emma:confidence</code> represents not only the confidence of

								the speech recognizer, but more generally the confidence of whatever

								processor was responsible for creating the EMMA result, based on

								whatever evidence it has. For a natural language interpretation,

								for example, this might include semantic heuristics in addition to

								speech recognition scores. Moreover, the confidence score values do

								not have to be interpreted as probabilities. In fact confidence

								score values are platform-dependent, since their computation is

								likely to differ between platforms and different EMMA processors.

								Confidence scores are annotated explicitly in EMMA in order to

								provide this information to the subsequent processes for multimodal

								interaction. The example below illustrates how confidence scores

								are annotated in EMMA.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:one-of id="nbest1"<br />

								    <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;emma:interpretation id="meaning1" emma:confidence="0.6"&gt;

								      &lt;location&gt;Boston&lt;/location&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="meaning2" emma:confidence="0.4"&gt;

								      &lt;location&gt; Austin &lt;/location&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:one-of&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>In addition to its use as an attribute on the EMMA

								interpretation and container elements, the

								<code>emma:confidence</code> attribute MAY also be used to assign

								confidences to elements in instance data in the application

								namespace. This can be seen in the following example, where the

								<code>&lt;destination&gt;</code> and <code>&lt;origin&gt;</code>

								elements have confidences.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="meaning1" emma:confidence="0.6"

								     <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								     &lt;destination emma:confidence="0.8"&gt; Boston&lt;/destination&gt;

								     &lt;origin emma:confidence="0.6"&gt; Austin &lt;/origin&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>Although in general instance data can be represented in XML

								using a combination of elements and attributes in the application

								namespace, EMMA does not provide a standard way to annotate

								processors' confidences in attributes. Consequently, instance data

								that is expected to be assigned confidences SHOULD be represented

								using elements, as in the above example.</p>
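
								<p>To illustrate this recommendation, the following sketch uses a

								hypothetical <code>&lt;flight&gt;</code> element that does not appear

								in the examples above. Because the origin and destination are encoded

								as attributes, a confidence can be attached only to the

								<code>&lt;flight&gt;</code> element as a whole, not to the individual

								values:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="meaning1" emma:confidence="0.6"

								     emma:medium="acoustic" emma:mode="voice"&gt;

								     &lt;flight origin="Austin" destination="Boston" emma:confidence="0.6"/&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>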

								<h3 id="s4.2.9">4.2.9 Input source: <code>emma:source</code>

								attribute</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:source</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:anyURI</code> referencing the

								source of input.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:one-of</code>,

								<code>emma:group</code>, <code>emma:sequence</code>, and

								application instance data.</td>

								</tr>

								</tbody>

								</table>

								<p>The source of an interpreted input MAY be represented in EMMA as

								a URI resource using the <code>emma:source</code> annotation.</p>

								<p>Here is an example that shows different input sources for

								different input interpretations.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"

								    xmlns:myapp="http://www.example.com/myapp"&gt;

								  &lt;emma:one-of id="nbest1"<br />

								    <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;emma:interpretation id="intp1"

								        emma:source="http://example.com/microphone/NC-61"&gt;

								      &lt;myapp:destination&gt;Boston&lt;/myapp:destination&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="intp2"

								        emma:source="http://example.com/microphone/NC-4024"&gt;

								      &lt;myapp:destination&gt;Austin&lt;/myapp:destination&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:one-of&gt;

								&lt;/emma:emma&gt;

								</pre>

								<h3 id="s4.2.10">4.2.10 Timestamps</h3>

								<p>The start and end times for input MAY be indicated using either

								absolute timestamps or relative timestamps. Both are in

								milliseconds for ease in processing timestamps. Note that the

								ECMAScript Date object's <code>getTime()</code> function is a

								convenient way to determine the absolute time.</p>

								<h4 id="s4.2.10.1">4.2.10.1 Absolute timestamps:

								<code>emma:start</code>, <code>emma:end</code> attributes</h4>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:start, emma:end</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>Attributes <span>of type

								<code>xsd:nonNegativeInteger</code></span> indicating the absolute

								starting and ending times of an input in terms of the number of

								milliseconds since 1 January 1970 00:00:00 GMT</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:group</code>,

								<code>emma:one-of</code>, <code>emma:sequence</code>,

								<code>emma:arc</code>, <span>and</span> application instance

								data</td>

								</tr>

								</tbody>

								</table>

								<p>Here is an example of a timestamp for an absolute time.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="int1"

								       emma:start="1087995961542"

								       emma:end="1087995963542"<br />

								       <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;destination&gt;Chicago&lt;/destination&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>The <code>emma:start</code> and <code>emma:end</code>

								annotations on an input MAY be identical; however, the

								<code>emma:end</code> value MUST NOT be less than the

								<code>emma:start</code> value.</p>
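
								<p>As a sketch of this boundary case (the GUI scenario, the

								<code>id</code> value, and the <code>&lt;command&gt;</code> element are

								illustrative assumptions), an effectively instantaneous input such as

								a button press could carry identical start and end times:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="click1"

								       emma:start="1087995961542"

								       emma:end="1087995961542"

								       emma:medium="tactile" emma:mode="gui"&gt;

								    &lt;command&gt;submit&lt;/command&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>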

								<h4 id="s4.2.10.2">4.2.10.2 Relative timestamps:

								<code>emma:time-ref-uri</code>,

								<code>emma:time-ref-anchor-point</code>,

								<code>emma:offset-to-start</code> attributes</h4>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:time-ref-uri</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>Attribute of type <code>xsd:anyURI</code> indicating the URI

								used to anchor the relative timestamp.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:group</code>,

								<code>emma:one-of</code>, <code>emma:sequence</code>,

								<code>emma:lattice</code>, <span>and</span> application instance

								data</td>

								</tr>

								<tr>

								<th>Annotation</th>

								<th>emma:time-ref-anchor-point</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>Attribute with a value of <code>start</code> or

								<code>end</code>, defaulting to <code>start</code>. It indicates

								whether to measure the time from the start or end of the interval

								designated with <code>emma:time-ref-uri</code>.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:group</code>,

								<code>emma:one-of</code>, <code>emma:sequence</code>,

								<code>emma:lattice</code>, <span>and</span> application instance

								data</td>

								</tr>

								<tr>

								<th>Annotation</th>

								<th>emma:offset-to-start</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>Attribute <span>of type <code>xsd:integer</code></span>,

								defaulting to zero. It specifies the offset in milliseconds for the

								start of input from the anchor point designated with

								<span><code>emma:time-ref-uri</code></span> and

								<span><code>emma:time-ref-anchor-point</code></span></td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:group</code>,

								<code>emma:one-of</code>, <code>emma:sequence</code>,

								<code>emma:arc</code>, <span>and</span> application instance

								data</td>

								</tr>

								</tbody>

								</table>

								<p>Relative timestamps define the start of an input relative to the

								start or end of a reference interval such as another input.</p>

								<p><img alt="relative timestamps" src=

								"relativetimestamps.png" /></p>

								<p>The reference interval is designated with the

								<code>emma:time-ref-uri</code> attribute. This MAY be combined with

								the <code>emma:time-ref-anchor-point</code> attribute to specify

								whether the anchor point is the start or end of this interval. The

								start of an input relative to this anchor point is then specified

								with the <code>emma:offset-to-start</code> attribute.</p>

								<p>Here is an example where the referenced input is in the same

								document:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:sequence&gt;

								    &lt;emma:interpretation id="int1"

								     <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;origin&gt;Denver&lt;/origin&gt;

								    &lt;/emma:interpretation&gt;

								    &lt;emma:interpretation id="int2"

								        <span>emma:medium="acoustic" emma:mode="voice"</span>

								        emma:time-ref-uri="#int1"

								        emma:time-ref-anchor-point="start"

								        emma:offset-to-start="5000"&gt;

								    &lt;destination&gt;Chicago&lt;/destination&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:sequence&gt;

								&lt;/emma:emma&gt;

								</pre>
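								<p>As a rough illustration of how a consumer might resolve such a

								relative timestamp, the following Python sketch computes the

								absolute start time of an input. It assumes that the absolute start

								and end of the reference interval are already known in

								milliseconds; the function name <code>resolve_start</code> and the

								sample times are hypothetical and are not defined by EMMA.</p>

```python
def resolve_start(ref_start_ms, ref_end_ms, offset_to_start=0, anchor_point="start"):
    """Return the absolute start time (ms) of an input from its relative timestamp.

    ref_start_ms / ref_end_ms: absolute bounds of the interval named by
    emma:time-ref-uri (assumed to be known to the caller).
    offset_to_start and anchor_point use the defaults given in the definitions
    (zero and "start").
    """
    if anchor_point not in ("start", "end"):
        raise ValueError("emma:time-ref-anchor-point must be 'start' or 'end'")
    anchor = ref_start_ms if anchor_point == "start" else ref_end_ms
    return anchor + offset_to_start


# Hypothetical absolute bounds for the reference interval (int1).
int1_start, int1_end = 1087995961542, 1087995963542

# int2 carries emma:time-ref-anchor-point="start" and emma:offset-to-start="5000".
int2_start = resolve_start(int1_start, int1_end, offset_to_start=5000, anchor_point="start")
print(int2_start)
```

								<p>With the attribute values from the example above (anchor point

								<code>start</code>, offset 5000), the second input is taken to

								start five seconds after the start of the first.</p>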

								<p>Note that the reference point refers to an input, but not

								necessarily to a complete input. For example, if a speech

								recognizer timestamps each word in an utterance, the anchor point

								might refer to the timestamp for just one word.</p>

								<p>The absolute and relative timestamps are not mutually exclusive;

								that is, it is possible to have both relative and absolute

								timestamp attributes on the same EMMA container element.</p>

								<p>Timestamps of inputs collected by different devices will be

								subject to variation if the times maintained by the devices are not

								synchronized. This concern is outside of the scope of the EMMA

								specification.</p>

								<h4 id="s4.2.10.3">4.2.10.3 Duration of input:

								<code>emma:duration</code> attribute</h4>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:duration</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>Attribute <span>of type

								<code>xsd:nonNegativeInteger</code></span>, defaulting to zero. It

								specifies the duration of the input in milliseconds.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:group</code>,

								<code>emma:one-of</code>, <code>emma:sequence</code>,

								<code>emma:arc</code>, <span>and</span> application instance

								data</td>

								</tr>

								</tbody>

								</table>

								<p>The duration of an input in milliseconds MAY be specified with

								the <code>emma:duration</code> attribute. The

								<code>emma:duration</code> attribute MAY be used either in

								combination with timestamps or independently, for example in the

								annotation of speech corpora.</p>

								<p>In the following example, the duration of the signal that gave

								rise to the interpretation is indicated using

								<code>emma:duration</code>.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								    &lt;emma:interpretation id="int1" emma:duration="2300"

								        <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;origin&gt;Denver&lt;/origin&gt;

								    &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<h4 id="s4.2.10.4">4.2.10.4 Composite Input and Relative

								Timestamps</h4>

								<p>This section is informative.</p>

								<p>The following table provides guidance on how to determine the

								values of relative timestamps on a composite input.</p>

								<div>

								<table summary="3 columns" border="1" cellpadding="3" cellspacing=

								"0">

								<caption>Informative Guidance on Relative Timestamps in Composite

								Derivations</caption>

								<tbody>

								<tr>

								<td><code>emma:time-ref-uri</code></td>

								<td>If the reference interval URI is the same for both inputs then

								it should be the same for the composite input. If it is not the

								same then relative timestamps will have to be resolved to absolute

								timestamps in order to determine the combined timestamp.</td>

								</tr>

								<tr>

								<td><code>emma:time-ref-anchor-point</code></td>

								<td>If the anchor value is the same for both inputs then it should

								be the same for the composite input. If it is not the same then

								relative timestamps will have to be resolved to absolute timestamps

								in order to determine the combined timestamp.</td>

								</tr>

								<tr>

								<td><code>emma:offset-to-start</code></td>

								<td>Given that the <code>emma:time-ref-uri</code> and

								<code>emma:time-ref-anchor-point</code> are the same for both

								combining inputs, then the <code>emma:offset-to-start</code> for

								the combination should be the lesser of the two. If they are not

								the same then relative timestamps will have to be resolved to

								absolute timestamps in order to determine the combined

								timestamp.</td>

								</tr>

								<tr>

								<td><code>emma:duration</code></td>

								<td>Given that the <code>emma:time-ref-uri</code> and

								<code>emma:time-ref-anchor-point</code> are the same for both

								combining inputs, then the <code>emma:duration</code> is calculated

								as follows. Add together the <code>emma:offset-to-start</code> and

								<code>emma:duration</code> for each of the inputs. Take whichever

								of these is greater and subtract from it the lesser of the

								<code>emma:offset-to-start</code> values in order to determine the

								combined duration. If <code>emma:time-ref-uri</code> and

								<code>emma:time-ref-anchor-point</code> are not the same then

								relative timestamps will have to be resolved to absolute timestamps

								in order to determine the combined timestamp.</td>

								</tr>

								</tbody>

								</table>

								</div>
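								<p>The guidance in the table can be sketched in Python as follows.

								This is only an illustration under stated assumptions: the

								<code>RelativeTime</code> class and the <code>combine</code>

								function are hypothetical names, and the case where the reference

								interval or anchor point differs is simply rejected rather than

								resolved to absolute timestamps.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelativeTime:
    """Relative timestamp annotations of one input (hypothetical helper type)."""
    time_ref_uri: str
    anchor_point: str      # "start" or "end"
    offset_to_start: int   # milliseconds
    duration: int          # milliseconds


def combine(a, b):
    """Combine the relative timestamps of two inputs into those of a composite."""
    if (a.time_ref_uri, a.anchor_point) != (b.time_ref_uri, b.anchor_point):
        # Different reference interval or anchor: the table says the values
        # have to be resolved to absolute timestamps first (not done here).
        raise ValueError("resolve to absolute timestamps before combining")
    # The composite starts at the lesser of the two offsets.
    offset = min(a.offset_to_start, b.offset_to_start)
    # The composite ends at the greater of the two (offset + duration) sums.
    latest_end = max(a.offset_to_start + a.duration,
                     b.offset_to_start + b.duration)
    return RelativeTime(a.time_ref_uri, a.anchor_point, offset, latest_end - offset)


# Hypothetical speech and ink inputs sharing one reference interval.
speech = RelativeTime("#int1", "start", 5000, 2300)
ink = RelativeTime("#int1", "start", 5500, 3000)
combined = combine(speech, ink)
print(combined.offset_to_start, combined.duration)  # 5000 3500
```

								<p>Here the composite input starts at offset 5000 and lasts 3500

								milliseconds, since the later of the two inputs ends 8500

								milliseconds after the anchor point.</p>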

								<h3 id="s4.2.11">4.2.11 Medium, mode, and function of user inputs:

								<code>emma:medium</code>, <code>emma:mode</code>,

								<code>emma:function</code>, <code>emma:verbal</code>

								attributes</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:medium</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <span><code>xsd:nmtokens</code></span>

								<span>which contains a space delimited set of values from the

								set</span> {<code>acoustic</code>, <code>tactile</code>,

								<code>visual</code>}.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:group</code>,

								<code>emma:one-of</code>, <code>emma:sequence</code>,

								<code>emma:endpoint</code>, and application instance data</td>

								</tr>

								<tr>

								<th>Annotation</th>

								<th>emma:mode</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <span><code>xsd:nmtokens</code></span>

								<span>which contains a space delimited set of values from</span> an

								open set of values including: {<span><code>voice</code>,

								<code>dtmf</code></span>, <code>ink</code>, <code>gui</code>,

								<code>keys</code>, <code>video</code>, <code>photograph</code>,

								...}.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:group</code>,

								<code>emma:one-of</code>, <code>emma:sequence</code>,

								<code>emma:endpoint</code>, and application instance data</td>

								</tr>

								<tr>

								<th>Annotation</th>

								<th>emma:function</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:string</code> constrained to

								values in the open set {<code>recording</code>,

								<code>transcription</code>, <code>dialog</code>,

								<code>verification</code>, ...}.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:group</code>,

								<code>emma:one-of</code>, <code>emma:sequence</code>, and

								application instance data</td>

								</tr>

								<tr>

								<th>Annotation</th>

								<th>emma:verbal</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:boolean</code>.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:group</code>,

								<code>emma:one-of</code>, <code>emma:sequence</code>, and

								application instance data</td>

								</tr>

								</tbody>

								</table>

								<p>EMMA provides two properties for the annotation of input

								modality: one indicating the broader medium or channel

								(<code>emma:medium</code>) and another indicating the specific mode

								of communication used on that channel (<code>emma:mode</code>). The

								input medium is defined from the user's perspective and indicates

								whether they use their voice (<code>acoustic</code>), touch

								(<code>tactile</code>), or visual appearance/motion

								(<code>visual</code>) as input. Tactile includes most

								<i>hands-on</i> input device types such as pen, mouse, keyboard, and

								touch screen. Visual is used for camera input.</p>

								<pre class="example">

								emma:medium = <span>space delimited sequence of values from the set: </span>

								            [acoustic|tactile|visual]

								</pre>

								<p>The mode property provides the ability to distinguish between

								different modes of communication that may be within a particular

								medium. For example, in the tactile medium, modes include

								electronic ink (<code>ink</code>), and pointing and clicking on a

								graphical user interface (<code>gui</code>).</p>

								<pre class="example">

								emma:mode = <span>space delimited sequence of values from the set: </span>

								            [<span>voice|dtmf</span>|ink|gui|keys|video|photograph| ... ]

								</pre>
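								<p>A consumer might check these two attributes along the following

								lines. The Python sketch below is illustrative only: the function

								names are hypothetical, and <code>gaze</code> is an invented mode

								token used to show that <code>emma:mode</code> draws on an open

								set while <code>emma:medium</code> draws on a closed one.</p>

```python
MEDIUM_VALUES = {"acoustic", "tactile", "visual"}  # closed set
KNOWN_MODES = {"voice", "dtmf", "ink", "gui", "keys", "video", "photograph"}  # open set


def parse_medium(value):
    """Split an emma:medium value and reject tokens outside the closed set."""
    tokens = value.split()
    unknown = [t for t in tokens if t not in MEDIUM_VALUES]
    if not tokens or unknown:
        raise ValueError("invalid emma:medium value: %r" % value)
    return tokens


def parse_mode(value):
    """Split an emma:mode value into (all tokens, tokens not listed in the spec).

    Unlisted tokens are reported, not rejected, because the set is open.
    """
    tokens = value.split()
    if not tokens:
        raise ValueError("emma:mode must contain at least one token")
    return tokens, [t for t in tokens if t not in KNOWN_MODES]


print(parse_medium("acoustic tactile"))
print(parse_mode("voice gaze"))  # "gaze" is a made-up, application-specific mode
```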

								<p>The <code>emma:medium</code> classification is based on the

								boundary between the user and the device that they use. For

								<code>emma:medium="tactile"</code> the user physically touches the

								device in order to provide input. For

								<code>emma:medium="visual"</code> the user's movement is captured

								by sensors (cameras, infrared) resulting in an input to the system.

								In the case where <code>emma:medium="acoustic"</code> the user

								provides input to the system by producing an acoustic signal. Note

								then that DTMF input will be classified as

								<code>emma:medium="tactile"</code> since in order to provide DTMF

								input the user physically presses keys on a keypad.</p>

								<p>While <code>emma:medium</code> and <code>emma:mode</code> are

								optional on specific elements such as

								<code>emma:interpretation</code> and <code>emma:one-of</code>, note

								that all EMMA interpretations must be annotated for

								<code>emma:medium</code> and <code>emma:mode</code>, so either

								these attributes must appear directly on

								<code>emma:interpretation</code> or they must appear on an ancestor

								<code>emma:one-of</code> node or they must appear on an earlier

								stage of the derivation listed in <code>emma:derivation</code>.</p>
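								<p>The lookup described above might be sketched as follows in

								Python, using the standard <code>xml.etree.ElementTree</code>

								module. This is a simplified illustration: the helper names

								<code>q</code> and <code>effective_annotation</code> are

								hypothetical, and the sketch checks only the interpretation itself

								and its <code>emma:one-of</code> ancestors; following

								<code>emma:derivation</code> is omitted.</p>

```python
import xml.etree.ElementTree as ET

EMMA = "http://www.w3.org/2003/04/emma"


def q(name):
    """Qualified name in the EMMA namespace, in ElementTree's {uri}local form."""
    return "{%s}%s" % (EMMA, name)


def effective_annotation(root, interp, attr):
    """Look up an EMMA attribute on interp, then on enclosing emma:one-of elements."""
    # ElementTree has no parent pointers, so build a child-to-parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = interp
    while node is not None:
        if node is interp or node.tag == q("one-of"):
            value = node.get(q(attr))
            if value is not None:
                return value
        node = parents.get(node)
    return None


# Build a small document in which the annotations sit on emma:one-of.
root = ET.Element(q("emma"), {"version": "1.0"})
one_of = ET.SubElement(root, q("one-of"),
                       {q("medium"): "acoustic", q("mode"): "voice"})
interp = ET.SubElement(one_of, q("interpretation"), {"id": "meaning1"})

print(effective_annotation(root, interp, "medium"),
      effective_annotation(root, interp, "mode"))
```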

								<p>Orthogonal to the mode, user inputs can also be classified with

								respect to their communicative function. This enables a simpler

								mode classification.</p>

								<pre class="example">

								emma:function = [recording|transcription|dialog|verification| ... ]

								</pre>

								<p>For example, speech can be used for recording (e.g. voicemail),

								transcription (e.g. dictation), dialog (e.g. interactive spoken

								dialog systems), and verification (e.g. identifying users through

								their voiceprints).</p>

								<p>EMMA also supports an additional property

								<code>emma:verbal</code> which distinguishes verbal use of an input

								mode from non-verbal. This MAY be used to distinguish the use of

								electronic ink to convey handwritten commands from the use of

								electronic ink for symbolic gestures such as circles and arrows.

								Handwritten commands, such as writing <i>downtown</i> in order to

								change a map display to show the downtown, are classified as verbal

								(<code>emma:function="dialog" emma:verbal="true"</code>). Pen

								gestures (arrows, lines, circles, etc.), such as circling a

								building, are classified as non-verbal dialog

								(<code>emma:function="dialog" emma:verbal="false"</code>). The use

								of handwritten words to transcribe an email message is classified

								as transcription (<code>emma:function="transcription"

								emma:verbal="true"</code>).</p>

								<pre class="example">

								emma:verbal = [true|false]

								</pre>

								<p>Handwritten words and ink gestures are typically recognized

								using different kinds of recognition components (handwriting

								recognizer vs. gesture recognizer) and the verbal annotation will

								be added by the recognition component which classifies the input.

								The original input source, a pen in this case, will not be aware of

								this difference. The input source identifier will tell you that the

								input was from a pen of some kind but will not tell you if the mode

								of input was handwriting (<i>show downtown</i>) or gesture (e.g.

								circling an object or area).</p>

								<p>Here is an example of the EMMA annotation for a pen input where

								the user's ink is recognized as either a word ("Boston") or as an

								arrow:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:one-of id="nbest1"&gt;

								    &lt;emma:interpretation id="interp1"

								     emma:confidence="0.6"

								     emma:medium="tactile"

								     emma:mode="ink"

								     emma:function="dialog"

								     emma:verbal="true"&gt;

								       &lt;location&gt;Boston&lt;/location&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="interp2"

								     emma:confidence="0.4"

								     emma:medium="tactile"

								     emma:mode="ink"

								     emma:function="dialog"

								     emma:verbal="false"&gt;

								       &lt;direction&gt;45&lt;/direction&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:one-of&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>Here is an example of the EMMA annotation for a spoken command

								which is recognized as either "Boston" or "Austin":</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:one-of&gt;

								    &lt;emma:interpretation id="interp1"

								     emma:confidence="0.6"

								     emma:medium="acoustic"

								     emma:mode="voice"

								     emma:function="dialog"

								     emma:verbal="true"&gt;

								       &lt;location&gt;Boston&lt;/location&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="interp2"

								     emma:confidence="0.4"

								     emma:medium="acoustic"

								     emma:mode="voice"

								     emma:function="dialog"

								     emma:verbal="true"&gt;

								       &lt;location&gt;Austin&lt;/location&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:one-of&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>The following table shows the relationship between the medium,

								mode, and function properties and serves as an aid for classifying

								inputs. For the dialog function it also shows some examples of the

								classification of inputs as verbal vs. non-verbal.</p>

								<table class="modes" summary="7 columns" border="1" cellpadding="3"

								cellspacing="0">

								<tbody>

								<tr>

								<th rowspan="2">Medium</th>

								<th rowspan="2">Device</th>

								<th rowspan="2">Mode</th>

								<th colspan="4">Function</th>

								</tr>

								<tr>

								<th>recording</th>

								<th>dialog</th>

								<th>transcription</th>

								<th>verification</th>

								</tr>

								<tr>

								<td rowspan="2">acoustic</td>

								<td rowspan="2">microphone</td>

								<td rowspan="2">voice</td>

								<td rowspan="2">audiofile (e.g. voicemail)</td>

								<td>spoken command / query / response (verbal = true)</td>

								<td rowspan="2">dictation</td>

								<td rowspan="2">speaker recognition</td>

								</tr>

								<tr>

								<td>singing a note (verbal = false)</td>

								</tr>

								<tr>

								<td rowspan="14">tactile</td>

								<td rowspan="2">keypad</td>

								<td rowspan="2">dtmf</td>

								<td rowspan="2">audiofile / character stream</td>

								<td>typed command / query / response (verbal = true)</td>

								<td rowspan="2">text entry (T9-tegic, word completion, or word

								grammar)</td>

								<td rowspan="2">password / pin entry</td>

								</tr>

								<tr>

								<td>command key "Press 9 for sales" (verbal = false)</td>

								</tr>

								<tr>

								<td rowspan="2">keyboard</td>

								<td rowspan="2">keys</td>

								<td rowspan="2">character / key-code stream</td>

								<td>typed command / query / response (verbal = true)</td>

								<td rowspan="2">typing</td>

								<td rowspan="2">password / pin entry</td>

								</tr>

								<tr>

								<td>command key "Press S for sales" (verbal = false)</td>

								</tr>

								<tr>

								<td rowspan="4">pen</td>

								<td rowspan="2">ink</td>

								<td rowspan="2">trace, sketch</td>

								<td>handwritten command / query / response (verbal = true)</td>

								<td rowspan="2">handwritten text entry</td>

								<td rowspan="2">signature, handwriting recognition</td>

								</tr>

								<tr>

								<td>gesture (e.g. circling building) (verbal = false)</td>

								</tr>

								<tr>

								<td rowspan="2">gui</td>

								<td rowspan="2">N/A</td>

								<td>tapping on named button (verbal = true)</td>

								<td rowspan="2">soft keyboard</td>

								<td rowspan="2">password / pin entry</td>

								</tr>

								<tr>

								<td>drag and drop, tapping on map (verbal = false)</td>

								</tr>

								<tr>

								<td rowspan="4">mouse</td>

								<td rowspan="2">ink</td>

								<td rowspan="2">trace, sketch</td>

								<td>handwritten command / query / response (verbal = true)</td>

								<td rowspan="2">handwritten text entry</td>

								<td rowspan="2">N/A</td>

								</tr>

								<tr>

								<td>gesture (e.g. circling building) (verbal = false)</td>

								</tr>

								<tr>

								<td rowspan="2">gui</td>

								<td rowspan="2">N/A</td>

								<td>clicking named button (verbal = true)</td>

								<td rowspan="2">soft keyboard</td>

								<td rowspan="2">password / pin entry</td>

								</tr>

								<tr>

								<td>drag and drop, clicking on map (verbal = false)</td>

								</tr>

								<tr>

								<td rowspan="2">joystick</td>

								<td>ink</td>

								<td>trace, sketch</td>

								<td>gesture (e.g. circling building) (verbal = false)</td>

								<td>N/A</td>

								<td>N/A</td>

								</tr>

								<tr>

								<td>gui</td>

								<td>N/A</td>

								<td>pointing, clicking button / menu (verbal = false)</td>

								<td>soft keyboard</td>

								<td>password / pin entry</td>

								</tr>

								<tr>

								<td rowspan="5">visual</td>

								<td rowspan="2">page scanner</td>

								<td rowspan="2">photograph</td>

								<td rowspan="2">image</td>

								<td>handwritten command / query / response (verbal = true)</td>

								<td rowspan="2">optical character recognition, object/scene

								recognition (markup, e.g. SVG)</td>

								<td rowspan="2">N/A</td>

								</tr>

								<tr>

								<td>drawings and images (verbal = false)</td>

								</tr>

								<tr>

								<td>still camera</td>

								<td>photograph</td>

								<td>image</td>

								<td>objects (verbal = false)</td>

								<td>visual object/scene recognition</td>

								<td>face id, retinal scan</td>

								</tr>

								<tr>

								<td rowspan="2">video camera</td>

								<td rowspan="2">video</td>

								<td rowspan="2">movie</td>

								<td>sign language (verbal = true)</td>

								<td rowspan="2">audio/visual recognition</td>

								<td rowspan="2">face id, gait id, retinal scan</td>

								</tr>

								<tr>

								<td>face / hand / arm / body gesture (e.g. pointing, facing)

								(verbal = false)</td>

								</tr>

								</tbody>

								</table>

								<h3 id="s4.2.12">4.2.12 Composite multimodality:

								<code>emma:hook</code> attribute</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:hook</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:string</code> constrained to

								values in the open set {<code>voice</code>, <code>dtmf</code>,

								<code>ink</code>, <code>gui</code>, <code>keys</code>,

								<code>video</code>, <code>photograph</code>, ...} or the wildcard

								<code>any</code>.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td>Application instance data</td>

								</tr>

								</tbody>

								</table>

								<p>The attribute <code>emma:hook</code> MAY be used to mark the

								elements in the application semantics within an

								<code>emma:interpretation</code> which are expected to be

								integrated with content from input in another mode to yield a

								complete interpretation. The <code>emma:mode</code> to be

								integrated at that point in the application semantics is indicated

								as the value of the <code>emma:hook</code> attribute. The possible

								values of <code>emma:hook</code> are the list of input modes that

								can be values of <code>emma:mode</code> <span>(see <a href=

								"#s4.2.11">Section 4.2.11</a>)</span>. In addition to these, the

								value of <code>emma:hook</code> can also be the wildcard

								<code>any</code> indicating that the other content can come from

								any source. The annotation <code>emma:hook</code> differs in

								semantics from <code>emma:mode</code> as follows. Annotating an

								element in the application semantics with

								<code>emma:mode="ink"</code> indicates that that part of the

								semantics came from the <code>ink</code> mode. Annotating an

								element in the application semantics with

								<code>emma:hook="ink"</code> indicates that part of the semantics

								needs to be integrated with content from the <code>ink</code>

								mode.</p>

								<p>To illustrate the use of <code>emma:hook</code> consider an

								example composite input in which the user says "zoom in here" in

								the speech input mode while drawing an area on a graphical display

								in the ink input mode. <span>The fact that the

								<code>location</code> element needs to come from the

								<code>ink</code> mode is indicated by annotating this application

								namespace element using <code>emma:hook</code>.</span></p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation <span>emma:medium="acoustic"</span> emma:mode="voice"&gt;

								    &lt;command&gt;

								      &lt;action&gt;zoom&lt;/action&gt;

								      &lt;location emma:hook="ink"&gt;

								        &lt;type&gt;area&lt;/type&gt;

								      &lt;/location&gt;

								    &lt;/command&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>
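								<p>An integration component might locate such hooks roughly as

								sketched below in Python. The sketch is an assumption about how a

								consumer could use the annotation and is not a procedure defined by

								EMMA; the function name <code>find_hooks</code> is hypothetical,

								and the document is built with the <code>xml.etree.ElementTree</code>

								API to mirror the example above.</p>

```python
import xml.etree.ElementTree as ET

EMMA = "http://www.w3.org/2003/04/emma"
APP = "http://www.example.com/example"
HOOK = "{%s}hook" % EMMA


def find_hooks(interpretation, mode):
    """Return elements whose emma:hook accepts content from the given mode.

    An element matches if its emma:hook names the mode or is the wildcard "any".
    """
    return [el for el in interpretation.iter()
            if el.get(HOOK) in (mode, "any")]


# Mirror of the "zoom in here" interpretation from the speech mode.
interp = ET.Element("{%s}interpretation" % EMMA,
                    {"{%s}medium" % EMMA: "acoustic", "{%s}mode" % EMMA: "voice"})
command = ET.SubElement(interp, "{%s}command" % APP)
ET.SubElement(command, "{%s}action" % APP).text = "zoom"
location = ET.SubElement(command, "{%s}location" % APP, {HOOK: "ink"})
ET.SubElement(location, "{%s}type" % APP).text = "area"

# Elements that still need content from the ink mode.
print([el.tag for el in find_hooks(interp, "ink")])
```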

								<p>For a more detailed explanation of this example, see <a href=

								"#appC">Appendix C</a>.</p>

								<h3 id="s4.2.13">4.2.13 Cost: <code>emma:cost</code> attribute</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:cost</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:decimal</code> in range 0.0 to

								10000000, indicating the processor's cost or weight associated with

								an input or part of an input.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:group</code>,

								<code>emma:one-of</code>, <code>emma:sequence</code>,

								<code>emma:arc</code>, <code>emma:node</code>, and application

								instance data.</td>

								</tr>

								</tbody>

								</table>

								<p>The cost annotation in EMMA indicates the weight or cost

								associated with a user's input or part of their input. The most

								common use of <code>emma:cost</code> is for representing the costs

								encoded on a lattice output from speech recognition or other

								recognition or understanding processes. <code>emma:cost</code> MAY

								also be used to indicate the total cost associated with particular

								recognition results or semantic interpretations.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:one-of <span>emma:medium="acoustic" emma:mode="voice"</span>&gt;

								    &lt;emma:interpretation id="meaning1" emma:cost="1600"&gt;

								      &lt;location&gt;Boston&lt;/location&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="meaning2" emma:cost="400"&gt;

								      &lt;location&gt; Austin &lt;/location&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:one-of&gt;

								&lt;/emma:emma&gt;

								</pre>

								<h3 id="s4.2.14">4.2.14 Endpoint properties:

								<code>emma:endpoint-role</code>,

								<code>emma:endpoint-address</code>, <code>emma:port-type</code>,

								<code>emma:port-num</code>, <code>emma:message-id</code>,

								<code>emma:service-name</code>, <code>emma:endpoint-pair-ref</code>,

								<code>emma:endpoint-info-ref</code>

								attributes</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:endpoint-role</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:string</code> constrained to

								values in the set {<code>source</code>, <code>sink</code>,

								<code>reply-to</code>, <code>router</code>}.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:endpoint</code></td>

								</tr>

								<tr>

								<th>Annotation</th>

								<th>emma:endpoint-address</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:anyURI</code> that uniquely

								specifies the network address of the

								<code>emma:endpoint</code>.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:endpoint</code></td>

								</tr>

								<tr>

								<th>Annotation</th>

								<th>emma:port-type</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:QName</code> that specifies the

								type of the port.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:endpoint</code></td>

								</tr>

								<tr>

								<th>Annotation</th>

								<th>emma:port-num</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:nonNegativeInteger</code> that

								specifies the port number.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:endpoint</code></td>

								</tr>

								<tr>

								<th>Annotation</th>

								<th>emma:message-id</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:anyURI</code> that specifies the

								message ID associated with the data.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:endpoint</code></td>

								</tr>

								<tr>

								<th>Annotation</th>

								<th>emma:service-name</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:string</code> that specifies the

								name of the service.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:endpoint</code></td>

								</tr>

								<tr>

								<th>Annotation</th>

								<th>emma:endpoint-pair-ref</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:anyURI</code> that specifies the

								pairing between sink and source endpoints.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:endpoint</code></td>

								</tr>

								<tr>

								<th>Annotation</th>

								<th>emma:endpoint-info-ref</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:IDREF</code> referring to the

								<code>id</code> attribute of an <code>emma:endpoint-info</code>

								element.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:group</code>,

								<code>emma:one-of</code>, <code>emma:sequence</code>, and

								application instance data.</td>

								</tr>

								</tbody>

								</table>

								<p>The <code>emma:endpoint-role</code> attribute specifies the role

								that the particular <code>emma:endpoint</code> performs in

								multimodal interaction. The role value <code>sink</code> indicates

								that the particular endpoint is the receiver of the input data. The

								role value <code>source</code> indicates that the particular

								endpoint is the sender of the input data. The role value

								<code>reply-to</code> indicates that the particular

								<code>emma:endpoint</code> is the intended endpoint for the reply.

								The same <code>emma:endpoint-address</code> MAY appear in multiple

								<code>emma:endpoint</code> elements, provided that the same

								endpoint address is used to serve multiple roles (e.g. sink,

								source, reply-to, or router) or is associated with multiple

								interpretations.</p>

								<p>The <code>emma:endpoint-address</code> specifies the network

								address of the <code>emma:endpoint</code>, and

								<code>emma:port-type</code> specifies the port type of the

								<code>emma:endpoint</code>. The <code>emma:port-num</code>

								annotates the port number of the endpoint (e.g. the typical port

								number for an http endpoint is 80). The

								<code>emma:message-id</code> annotates the message ID information

								associated with the annotated input. This meta information is used

								to establish and maintain the communication context for both

								inbound processing and outbound operation. The service

								specification of the <code>emma:endpoint</code> is annotated by

								<code>emma:service-name</code> which contains the definition of the

								service that the <code>emma:endpoint</code> performs. The matching

								of the <code>sink</code> endpoint and its pairing

								<code>source</code> endpoint is annotated by the

								<code>emma:endpoint-pair-ref</code> attribute. One sink endpoint

								MAY link to multiple source endpoints through

								<code>emma:endpoint-pair-ref</code>. Further bounding of the

								<code>emma:endpoint</code> is possible by using the annotation of

								<code>emma:group</code> (see <a href="#s3.3.2">Section

								3.3.2</a>).</p>

								<p>The <code>emma:endpoint-info-ref</code> attribute associates the

								EMMA result in the container element with an

								<code>emma:endpoint-info</code> element.</p>

								<p>The following example illustrates the use of these attributes in

								multimodal interactions where multiple modalities are used.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"

								    xmlns:ex="http://www.example.com/emma/port"&gt;

								  &lt;emma:endpoint-info id="audio-channel-1" &gt;

								    &lt;emma:endpoint id="endpoint1"

								        emma:endpoint-role="sink"

								        emma:endpoint-address="135.61.71.103"

								        emma:port-num="50204"

								        emma:port-type="rtp"

								        emma:endpoint-pair-ref="endpoint2"

								        emma:media-type="audio/dsr-202212; rate:8000; maxptime:40"

								        emma:service-name="travel"

								        emma:mode="voice"&gt;

								      &lt;ex:app-protocol&gt;SIP&lt;/ex:app-protocol&gt;

								    &lt;/emma:endpoint&gt;


								    &lt;emma:endpoint id="endpoint2" emma:endpoint-role="source"

								        emma:endpoint-address="136.62.72.104"

								        emma:port-num="50204"

								        emma:port-type="rtp"

								        emma:endpoint-pair-ref="endpoint1"

								        emma:media-type="audio/dsr-202212; rate:8000; maxptime:40"

								        emma:service-name="travel"

								        emma:mode="voice"&gt;

								      &lt;ex:app-protocol&gt;SIP&lt;/ex:app-protocol&gt;

								    &lt;/emma:endpoint&gt;

								  &lt;/emma:endpoint-info&gt;


								  &lt;emma:endpoint-info id="ink-channel-1"&gt;

								     &lt;emma:endpoint id="endpoint3" emma:endpoint-role="sink"

								         emma:endpoint-address="http://emma.example/sink"

								         emma:endpoint-pair-ref="endpoint4"

								         emma:port-num="80" emma:port-type="http"

								         emma:message-id="uuid:2e5678"

								         emma:service-name="travel"

								         emma:mode="ink"/&gt;

								     &lt;emma:endpoint id="endpoint4"

								         emma:endpoint-role="source"

								         emma:endpoint-address="http://emma.example/source"

								         emma:endpoint-pair-ref="endpoint3"

								         emma:port-num="80"

								         emma:port-type="http"

								         emma:message-id="uuid:2e5678"

								         emma:service-name="travel"

								         emma:mode="ink"/&gt;

								  &lt;/emma:endpoint-info&gt;


								  &lt;emma:group&gt;

								    &lt;emma:interpretation id="int1" emma:start="1087995961542"

								        emma:end="1087995963542"

								        emma:endpoint-info-ref="audio-channel-1"

								        emma:medium="acoustic" emma:mode="voice"&gt;

								      &lt;destination&gt;Chicago&lt;/destination&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="int2" emma:start="1087995961542"

								        emma:end="1087995963542"

								        emma:endpoint-info-ref="ink-channel-1"

								        emma:medium="tactile" emma:mode="ink"&gt;

								      &lt;location&gt;

								         &lt;type&gt;area&lt;/type&gt;

								         &lt;points&gt;34.13 -37.12 42.13 -37.12 ... &lt;/points&gt;

								      &lt;/location&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:group&gt;

								&lt;/emma:emma&gt;

								</pre>

								<h3 id="s4.2.15">4.2.15 Reference to <code>emma:grammar</code>

								element: <code>emma:grammar-ref</code> attribute</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:grammar-ref</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:IDREF</code> referring to the

								<code>id</code> attribute of an <code>emma:grammar</code>

								element.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:group</code>,

								<code>emma:one-of</code>, and <code>emma:sequence</code>.</td>

								</tr>

								</tbody>

								</table>

								<p>The <code>emma:grammar-ref</code> annotation associates the EMMA

								result in the container element with an <code>emma:grammar</code>

								element.</p>

								<p>Example:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:grammar id="gram1" ref="someURI"/&gt;


								  &lt;emma:grammar id="gram2" ref="anotherURI"/&gt;


								  &lt;emma:one-of id="r1"

								    emma:medium="acoustic" emma:mode="voice"&gt;

								    &lt;emma:interpretation id="int1" emma:grammar-ref="gram1"&gt;

								      &lt;origin&gt;Boston&lt;/origin&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="int2" emma:grammar-ref="gram1"&gt;

								      &lt;origin&gt;Austin&lt;/origin&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="int3" emma:grammar-ref="gram2"&gt;

								      &lt;command&gt;help&lt;/command&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:one-of&gt;

								&lt;/emma:emma&gt;

								</pre>

								<h3 id="s4.2.16">4.2.16 Reference to <code>emma:model</code>

								element: <code>emma:model-ref</code> attribute</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:model-ref</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:IDREF</code> referring to the

								<code>id</code> attribute of an <code>emma:model</code>

								element.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:group</code>,

								<code>emma:one-of</code>, <code>emma:sequence</code>, and

								application instance data.</td>

								</tr>

								</tbody>

								</table>

								<p>The <code>emma:model-ref</code> annotation associates the EMMA

								result in the container element with an <code>emma:model</code>

								element.</p>

								<p>Example:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:model id="model1" ref="someURI"/&gt;


								  &lt;emma:model id="model2" ref="anotherURI"/&gt;


								  &lt;emma:one-of id="r1"

								    emma:medium="acoustic" emma:mode="voice"&gt;

								    &lt;emma:interpretation id="int1" emma:model-ref="model1"&gt;

								      &lt;origin&gt;Boston&lt;/origin&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="int2" emma:model-ref="model1"&gt;

								      &lt;origin&gt;Austin&lt;/origin&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="int3" emma:model-ref="model2"&gt;

								      &lt;command&gt;help&lt;/command&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:one-of&gt;

								&lt;/emma:emma&gt;

								</pre>

								<h3 id="s4.2.17">4.2.17 Dialog turns: <code>emma:dialog-turn</code>

								attribute</h3>

								<table class="defn" summary="property definition" width="98%"

								cellpadding="5" cellspacing="0">

								<tbody>

								<tr>

								<th>Annotation</th>

								<th>emma:dialog-turn</th>

								</tr>

								<tr>

								<th>Definition</th>

								<td>An attribute of type <code>xsd:string</code> referring to the

								dialog turn associated with a given container element.</td>

								</tr>

								<tr>

								<th>Applies to</th>

								<td><code>emma:interpretation</code>, <code>emma:group</code>,

								<code>emma:one-of</code>, and <code>emma:sequence</code>.</td>

								</tr>

								</tbody>

								</table>

								<p>The <code>emma:dialog-turn</code> annotation associates the EMMA

								result in the container element with a dialog turn. The syntax and

								semantics of dialog turns are left open to suit the needs of

								individual applications. For example, some applications might use

								an integer value, where successive turns are represented by

								successive integers. Other applications might combine a name of a

								dialog participant with an integer value representing the turn

								number for that participant. Ordering semantics for comparison of

								<code>emma:dialog-turn</code> is deliberately unspecified and left

								for applications to define.</p>

								<p>Example:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="int1" emma:dialog-turn="u8"

								    emma:medium="acoustic" emma:mode="voice"&gt;

								    &lt;quantity&gt;3&lt;/quantity&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<h2 class="notoc" id="s4.3">4.3 Scope of EMMA annotations</h2>

								<p>The <code>emma:derived-from</code> element (<a href=

								"#s4.1.2">Section 4.1.2</a>) can be used to capture both sequential

								and composite derivations. This section concerns the scope of EMMA

								annotations across sequential derivations of user

								input connected using the <code>emma:derived-from</code> element

								(<a href="#s4.1.2">Section 4.1.2</a>). Sequential derivations

								involve processing steps that do not involve multimodal

								integration, such as applying natural language understanding and

								then reference resolution to a speech transcription. EMMA

								derivations describe only single turns of user input and are not

								intended to describe a sequence of dialog turns.</p>

								<p>For example, an EMMA document could contain

								<code>emma:interpretation</code> elements for the transcription,

								interpretation, and reference resolution of a speech input,

								utilizing the <code>id</code> values: <code>raw</code>,

								<code>better</code>, and <code>best</code> respectively:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								 &lt;emma:derivation&gt;

								  &lt;emma:interpretation id="raw"

								      emma:process="http://example.com/myasr1.xml"

								      emma:medium="acoustic" emma:mode="voice"&gt;

								    &lt;answer&gt;From Boston to Denver tomorrow&lt;/answer&gt;

								  &lt;/emma:interpretation&gt;


								  &lt;emma:interpretation id="better"

								      emma:process="http://example.com/mynlu1.xml"&gt;

								    &lt;emma:derived-from resource="#raw" composite="false"/&gt;

								    &lt;origin&gt;Boston&lt;/origin&gt;

								    &lt;destination&gt;Denver&lt;/destination&gt;

								    &lt;date&gt;tomorrow&lt;/date&gt;

								  &lt;/emma:interpretation&gt;

								 &lt;/emma:derivation&gt;


								  &lt;emma:interpretation id="best"

								      emma:process="http://example.com/myrefresolution1.xml"&gt;

								    &lt;emma:derived-from resource="#better" composite="false"/&gt;

								    &lt;origin&gt;Boston&lt;/origin&gt;

								    &lt;destination&gt;Denver&lt;/destination&gt;

								    &lt;date&gt;03152003&lt;/date&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>Each member of the derivation chain is linked to the previous

								one by an <code>emma:derived-from</code> element (<a href=

								"#s4.1.2">Section 4.1.2</a>), which has an attribute

								<code>resource</code> that provides a pointer to the

								<code>emma:interpretation</code> from which it is derived. The

								<code>emma:process</code> annotation (<a href="#s4.2.2">Section

								4.2.2</a>) provides a pointer to the process used for each stage of

								the derivation.</p>

								<p>The following EMMA example represents the same derivation as

								above but with a more fully specified set of annotations:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:derivation&gt;

								    &lt;emma:interpretation id="raw"

								        emma:process="http://example.com/myasr1.xml"

								        emma:source="http://example.com/microphone/NC-61"

								        emma:signal="http://example.com/signals/sg23.wav"

								        emma:confidence="0.6"

								        emma:medium="acoustic"

								        emma:mode="voice"

								        emma:function="dialog"

								        emma:verbal="true"

								        emma:tokens="from boston to denver tomorrow"

								        emma:lang="en-US"&gt;

								      &lt;answer&gt;From Boston to Denver tomorrow&lt;/answer&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="better"

								        emma:process="http://example.com/mynlu1.xml"

								        emma:source="http://example.com/microphone/NC-61"

								        emma:signal="http://example.com/signals/sg23.wav"

								        emma:confidence="0.8"

								        emma:medium="acoustic"

								        emma:mode="voice"

								        emma:function="dialog"

								        emma:verbal="true"

								        emma:tokens="from boston to denver tomorrow"

								        emma:lang="en-US"&gt;

								      &lt;emma:derived-from resource="#raw" composite="false"/&gt;

								      &lt;origin&gt;Boston&lt;/origin&gt;

								      &lt;destination&gt;Denver&lt;/destination&gt;

								      &lt;date&gt;tomorrow&lt;/date&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:derivation&gt;


								  &lt;emma:interpretation id="best"

								      emma:process="http://example.com/myrefresolution1.xml"

								      emma:source="http://example.com/microphone/NC-61"

								      emma:signal="http://example.com/signals/sg23.wav"

								      emma:confidence="0.8"

								      emma:medium="acoustic"

								      emma:mode="voice"

								      emma:function="dialog"

								      emma:verbal="true"

								      emma:tokens="from boston to denver tomorrow"

								      emma:lang="en-US"&gt;

								    &lt;emma:derived-from resource="#better" composite="false"/&gt;

								    &lt;origin&gt;Boston&lt;/origin&gt;

								    &lt;destination&gt;Denver&lt;/destination&gt;

								    &lt;date&gt;03152003&lt;/date&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>EMMA annotations on earlier stages of the derivation often

								remain accurate at later stages of the derivation. Although this

								can be captured in EMMA by repeating the annotations on each

								<code>emma:interpretation</code> within the derivation, as in the

								example above, there are two disadvantages of this approach to

								annotation. First, the repetition of annotations makes the

								resulting EMMA documents significantly more verbose. Second, EMMA

								processors used for intermediate tasks such as natural language

								understanding and reference resolution will need to read in all of

								the annotations and write them all out again.</p>

								<p>EMMA overcomes these problems by assuming that annotations on

								earlier stages of a derivation automatically apply to later stages

								of the derivation unless a new value is specified. Later stages of

								the derivation essentially inherit annotations from earlier stages

								in the derivation. For example, if there was an

								<code>emma:source</code> annotation on the transcription

								(<code>raw</code>) it would also apply to the later stages of the

								derivation such as the result of natural language understanding

								(<code>better</code>) or reference resolution

								(<code>best</code>).</p>

								<p>Because of the assumption in EMMA that annotations have scope

								over later stages of a sequential derivation, the example EMMA

								document above can be equivalently represented as follows:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:derivation&gt;

								    &lt;emma:interpretation id="raw"

								        emma:process="http://example.com/myasr1.xml"

								        emma:source="http://example.com/microphone/NC-61"

								        emma:signal="http://example.com/signals/sg23.wav"

								        emma:confidence="0.6"

								        emma:medium="acoustic"

								        emma:mode="voice"

								        emma:function="dialog"

								        emma:verbal="true"

								        emma:tokens="from boston to denver tomorrow"

								        emma:lang="en-US"&gt;

								      &lt;answer&gt;From Boston to Denver tomorrow&lt;/answer&gt;

								    &lt;/emma:interpretation&gt;


								    &lt;emma:interpretation id="better"

								        emma:process="http://example.com/mynlu1.xml"

								        emma:confidence="0.8"&gt;

								      &lt;emma:derived-from resource="#raw" composite="false"/&gt;

								      &lt;origin&gt;Boston&lt;/origin&gt;

								      &lt;destination&gt;Denver&lt;/destination&gt;

								      &lt;date&gt;tomorrow&lt;/date&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:derivation&gt;


								  &lt;emma:interpretation id="best"

								      emma:process="http://example.com/myrefresolution1.xml"&gt;

								    &lt;emma:derived-from resource="#better" composite="false"/&gt;

								    &lt;origin&gt;Boston&lt;/origin&gt;

								    &lt;destination&gt;Denver&lt;/destination&gt;

								    &lt;date&gt;03152003&lt;/date&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>The fully specified derivation illustrated above is equivalent

								to the reduced form derivation following it where only annotations

								with new values are specified at each stage. These two EMMA

								documents MUST yield the same result when processed by an EMMA

								processor.</p>

								<p>The <code>emma:confidence</code> annotation is respecified on

								the <code>better</code> interpretation. This indicates the

								confidence score for natural language understanding, whereas

								<code>emma:confidence</code> on the <code>raw</code> interpretation

								indicates the speech recognition confidence score.</p>

								<p>To determine the full set of annotations that apply to

								an <code>emma:interpretation</code> element, an EMMA processor or

								script needs to read the annotations specified directly on that

								element and, for any that are not specified there, follow the reference in the

								<code>resource</code> attribute of the

								<code>emma:derived-from</code> element to add in annotations from

								earlier stages of the derivation.</p>

								<p>The EMMA annotations break down into three groups with respect

								to their scope in sequential derivations. One group of annotations

								always holds true for all members of a sequential

								derivation. A second group is always respecified on

								each stage of the derivation. A third group may or may not be

								respecified.</p>

								<table summary="2 columns" border="1" cellpadding="3" cellspacing=

								"0">

								<caption>Scope of Annotations in Sequential Derivations</caption>

								<tbody>

								<tr>

								<th>Classification</th>

								<th>Annotation</th>

								</tr>

								<tr>

								<td rowspan="16">Applies to whole derivation</td>

								<td><code>emma:signal</code></td>

								</tr>

								<tr>

								<td><code>emma:signal-size</code></td>

								</tr>

								<tr>

								<td><code>emma:dialog-turn</code></td>

								</tr>

								<tr>

								<td><code>emma:source</code></td>

								</tr>

								<tr>

								<td><code>emma:medium</code></td>

								</tr>

								<tr>

								<td><code>emma:mode</code></td>

								</tr>

								<tr>

								<td><code>emma:function</code></td>

								</tr>

								<tr>

								<td><code>emma:verbal</code></td>

								</tr>

								<tr>

								<td><code>emma:lang</code></td>

								</tr>

								<tr>

								<td><code>emma:tokens</code></td>

								</tr>

								<tr>

								<td><code>emma:start</code></td>

								</tr>

								<tr>

								<td><code>emma:end</code></td>

								</tr>

								<tr>

								<td><code>emma:time-ref-uri</code></td>

								</tr>

								<tr>

								<td><code>emma:time-ref-anchor-point</code></td>

								</tr>

								<tr>

								<td><code>emma:offset-to-start</code></td>

								</tr>

								<tr>

								<td><code>emma:duration</code></td>

								</tr>

								<tr>

								<td rowspan="2">Specified at each stage of derivation</td>

								<td><code>emma:derived-from</code></td>

								</tr>

								<tr>

								<td><code>emma:process</code></td>

								</tr>

								<tr>

								<td rowspan="6">May be respecified</td>

								<td><code>emma:confidence</code></td>

								</tr>

								<tr>

								<td><code>emma:cost</code></td>

								</tr>

								<tr>

								<td><code>emma:grammar-ref</code></td>

								</tr>

								<tr>

								<td><code>emma:model-ref</code></td>

								</tr>

								<tr>

								<td><code>emma:no-input</code></td>

								</tr>

								<tr>

								<td><code>emma:uninterpreted</code></td>

								</tr>

								</tbody>

								</table>

								<p>One potential problem with this annotation scoping mechanism is

								that earlier annotations could be lost if earlier stages of a

								derivation were dropped in order to reduce message size. This

								problem can be overcome by considering annotation scope at the

								point where earlier derivation stages are discarded and populating

								the final interpretation in the derivation with all of the

								annotations which it could inherit. For example, if the

								<code>raw</code> and <code>better</code> stages were dropped the

								resulting EMMA document would be:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="best"

								      emma:start="1087995961542"

								      emma:end="1087995963542"

								      emma:process="http://example.com/myrefresolution1.xml"

								      emma:source="http://example.com/microphone/NC-61"

								      emma:signal="http://example.com/signals/sg23.wav"

								      emma:confidence="0.8"

								      emma:medium="acoustic"

								      emma:mode="voice"

								      emma:function="dialog"

								      emma:verbal="true"

								      emma:tokens="from boston to denver tomorrow"

								      emma:lang="en-US"&gt;

								    &lt;emma:derived-from resource="#better" composite="false"/&gt;

								    &lt;origin&gt;Boston&lt;/origin&gt;

								    &lt;destination&gt;Denver&lt;/destination&gt;

								    &lt;date&gt;03152003&lt;/date&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>Annotations on an <code>emma:one-of</code> element are assumed

								to apply to all of the container elements within the

								<code>emma:one-of</code>.</p>

								<p>If <code>emma:one-of</code> appears within another

								<code>emma:one-of</code> then annotations on the parent

								<code>emma:one-of</code> are assumed to apply to the children of

								the child <code>emma:one-of</code>.</p>

								<p>Annotations on <code>emma:group</code> or

								<code>emma:sequence</code> do not apply to their child

								elements.</p>
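
								<p>As an informal illustration of these scoping rules (the

								element content below is invented for this sketch and the

								structure is only one possible arrangement), the

								<code>emma:medium</code>, <code>emma:mode</code>, and

								<code>emma:lang</code> annotations on the outer

								<code>emma:one-of</code> are assumed to apply to

								<code>int1</code> and, through the nested

								<code>emma:one-of</code>, to <code>int2</code> and

								<code>int3</code>:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:one-of id="r1"

								      emma:medium="acoustic" emma:mode="voice" emma:lang="en-US"&gt;

								    &lt;emma:interpretation id="int1"&gt;

								      &lt;command&gt;help&lt;/command&gt;

								    &lt;/emma:interpretation&gt;

								    &lt;emma:one-of id="r2"&gt;

								      &lt;emma:interpretation id="int2"&gt;

								        &lt;origin&gt;Boston&lt;/origin&gt;

								      &lt;/emma:interpretation&gt;

								      &lt;emma:interpretation id="int3"&gt;

								        &lt;origin&gt;Austin&lt;/origin&gt;

								      &lt;/emma:interpretation&gt;

								    &lt;/emma:one-of&gt;

								  &lt;/emma:one-of&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>Had the outer container in this sketch been an

								<code>emma:group</code> or <code>emma:sequence</code>, the same

								annotations would not apply to the child interpretations and

								would need to be stated on each of them.</p>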

								<h2 id="s5">5. Conformance</h2>

								<p>The contents of this section are normative.</p>

								<h3 id="s5.1">5.1 Conforming EMMA Documents</h3>

								<p>A document is a Conforming EMMA Document if it meets both the

								following conditions:</p>

								<ul>

								<li>It is a well-formed XML document [<a href="#XML">XML</a>]

								conforming to Namespaces in XML [<a href="#XMLNS">XMLNS</a>].</li>

								<li>It adheres to the specification described in this document

								(EMMA Specification) including the constraints expressed in the

								Schema (see <a href="#appA">Appendix A</a>) and having an XML

								Prolog and root element as specified in <a href="#s3.1">Section

								3.1</a>.</li>

								</ul>
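
								<p>As an informative sketch only, a small document of the

								following shape is intended to illustrate the two conditions

								above. The application namespace

								<code>http://www.example.com/example</code> and the

								<code>command</code> element are placeholders chosen for this

								illustration, and <a href="#s3.1">Section 3.1</a> and the Schema

								(see <a href="#appA">Appendix A</a>) remain the authority on what

								is required:</p>

								<pre class="example">

								&lt;?xml version="1.0" encoding="UTF-8"?&gt;

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="int1"

								      emma:medium="acoustic" emma:mode="voice"&gt;

								    &lt;command&gt;help&lt;/command&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>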

								<p>The EMMA specification and these conformance criteria provide no

								designated size limits on any aspect of EMMA documents. There are

								no maximum values on the number of elements, the amount of

								character data, or the number of characters in attribute

								values.</p>

								<p>Within this specification, the term URI refers to a

								Uniform Resource Identifier as defined in [<a href=

								"#RFC3986">RFC3986</a>] and extended in [<a href=

								"#RFC3987">RFC3987</a>] with the new name IRI. The term URI has

								been retained in preference to IRI to avoid introducing new names

								for concepts such as "Base URI" that are defined or referenced

								across the whole family of XML specifications.</p>

								<h3 id="s5.2">5.2 Using EMMA with other Namespaces</h3>

								<p>The EMMA namespace is intended to be used with other XML

								namespaces as per the Namespaces in XML Recommendation [<a href=

								"#XMLNS">XMLNS</a>]. Future work by W3C is expected to address ways

								to specify conformance for documents involving multiple

								namespaces.</p>

								<h3 id="s5.3">5.3 Conforming EMMA Processors</h3>

								<p>An EMMA processor is a program that can process and/or generate

								Conforming EMMA documents.</p>

								<p>In a Conforming EMMA Processor, the XML parser MUST be able to

								parse and process all XML constructs defined by XML 1.1 [<a href=

								"#XML">XML</a>] and Namespaces in XML [<a href="#XMLNS">XMLNS</a>].

								It is not required that a Conforming EMMA Processor uses a

								validating XML parser.</p>

								<p>A Conforming EMMA Processor MUST correctly understand and apply

								the semantics of each markup element or attribute as described by

								this document.</p>

								<p>There is, however, no conformance requirement with respect to

								performance characteristics of the EMMA Processor. For instance, no

								statement is required regarding the accuracy, speed or other

								characteristics of output produced by the processor. No statement

								is made regarding the size of input that an EMMA Processor is

								required to support.</p>

								<h2 id="appendices">Appendices</h2>

								<h3 id="appA">Appendix A. XML and <span>RELAX NG</span>

								schemata</h3>

								<p>This section is Normative.</p>

								<p>This section defines the formal syntax for EMMA documents in

								terms of a normative XML Schema.</p>

								<p>There are both an XML Schema and a <span>RELAX NG</span> Schema

								for the EMMA markup. The latest version of the XML Schema for EMMA

								is available at <a href=

								"http://www.w3.org/TR/emma/emma.xsd">http://www.w3.org/TR/emma/emma.xsd</a>

								and the RELAX NG Schema can be found at <a href=

								"http://www.w3.org/TR/emma/emma.rng">http://www.w3.org/TR/emma/emma.rng</a>.</p>

								<p>For stability it is RECOMMENDED that you use the dated URI

								available at <a href=

								"http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd">http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd</a>

								and <a href=

								"http://www.w3.org/TR/2009/REC-emma-20090210/emma.rng">http://www.w3.org/TR/2009/REC-emma-20090210/emma.rng</a>.</p>

								<h2 id="appB">Appendix B. MIME type</h2>

								<p>This section is <span>N</span>ormative.</p>

								<p>This appendix registers a new MIME media type,

								"<code>application/emma+xml</code>".</p>


								<p>The "<code>application/emma+xml</code>" media type is

								registered with IANA at

								<a href="http://www.iana.org/assignments/media-types/application/">

								http://www.iana.org/assignments/media-types/application/</a>.

								</p>
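								<p>A receiver may wish to check the media type of an incoming message before passing it to an EMMA processor. The following Python sketch is illustrative only. It uses the standard-library <code>email.message.Message</code> class to split a <code>Content-Type</code> header value into the media type and the optional <code>charset</code> parameter; the function name <code>describe_content_type</code> is hypothetical.</p>

```python
from email.message import Message

EMMA_MEDIA_TYPE = "application/emma+xml"


def describe_content_type(header_value):
    """Return (is_emma, charset) for a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_type() returns the lower-cased "type/subtype" without parameters.
    is_emma = msg.get_content_type() == EMMA_MEDIA_TYPE
    # get_content_charset() returns None when no charset parameter is present.
    charset = msg.get_content_charset()
    return is_emma, charset


print(describe_content_type("application/emma+xml; charset=UTF-8"))
```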


								<div>

								<h3 id="media-type-registration">B.1 Registration of MIME media

								type application/emma+xml</h3>

								<dl>

								<dt>MIME media type name:</dt>

								<dd>

								<p><code>application</code></p>

								</dd>

								<dt>MIME subtype name:</dt>

								<dd>

								<p><code>emma+xml</code></p>

								</dd>

								<dt>Required parameters:</dt>

								<dd>

								<p>None.</p>

								</dd>

								<dt>Optional parameters:</dt>

								<dd>

								<dl>

								<dt><code>charset</code></dt>

								<dd>

								<p>This parameter has identical semantics to the

								<code>charset</code> parameter of the <code>application/xml</code>

								media type as specified in [<a href="#RFC3023">RFC3023</a>] or its

								successor.</p>

								</dd>

								</dl>

								</dd>

								<dt>Encoding considerations:</dt>

								<dd>

								<p>By virtue of EMMA content being XML, it has the same

								considerations when sent as "<code>application/emma+xml</code>" as

								does XML. See RFC 3023 (or its successor), section 3.2.</p>

								</dd>

								<dt>Security considerations:</dt>

								<dd>

								<p>Several features of EMMA require dereferencing arbitrary URIs.

								Implementers are advised to heed the security issues of [<a href=

								"#RFC3986">RFC3986</a>] section 7.</p>

								<p>In addition, because of the extensibility features for EMMA, it

								is possible that "<code>application/emma+xml</code>" will describe

								content that has security implications beyond those described here.

								However, if the processor follows only the normative semantics of

								this specification, this content will be ignored. Only in the case

								where the processor recognizes and processes the additional

								content, or where further processing of that content is dispatched

								to other processors, would security issues potentially arise. And

								in that case, they would fall outside the domain of this

								registration document.</p>

								</dd>

								<dt>Interoperability considerations:</dt>

								<dd>

								<p>This specification describes processing semantics that dictate

								the required behavior for dealing with, among other things,

								unrecognized elements.</p>

								<p>Because EMMA is extensible, conformant

								"<code>application/emma+xml</code>" processors MAY expect that

								content received is well-formed XML, but processors SHOULD NOT

								assume that the content is valid EMMA or expect to recognize all of

								the elements and attributes in the document.</p>

								</dd>

								<dt>Published specification:</dt>

								<dd>

								<p>

								This media type registration is extracted from Appendix B of the

								"<a href="http://www.w3.org/TR/emma/">EMMA: Extensible MultiModal Annotation markup language</a>"

								specification.

								</p>

								</dd>

								<dt>Additional information:</dt>

								<dd>

								<dl>

								<dt>Magic number(s):</dt>

								<dd>

								<p>There is no single initial octet sequence that is always present

								in EMMA documents.</p>

								</dd>

								<dt>File extension(s):</dt>

								<dd>

								<p>EMMA documents are most often identified with the extension

								"<code>.emma</code>"<!-- or "<code>.mma</code>"-->.</p>

								</dd>

								<dt>Macintosh File Type Code(s):</dt>

								<dd>

								<p>TEXT</p>

								</dd>

								</dl>

								</dd>

								<dt>Person &amp; email address to contact for further

								information:</dt>

								<dd>

								<p>Kazuyuki Ashimura, &lt;<a href=

								"mailto:ashimura@w3.org">ashimura@w3.org</a>&gt;.</p>

								</dd>

								<dt>Intended usage:</dt>

								<dd>

								<p>COMMON</p>

								</dd>

								<dt>Author/Change controller:</dt>

								<dd>

								<p>The EMMA specification is a work product of the World Wide Web

								Consortium's Multimodal Interaction Working Group. The W3C has

								change control over these specifications.</p>

								</dd>

								</dl>

								</div>

								<h2 id="appC">Appendix C. <code>emma:hook</code> and SRGS</h2>

								<p>This section is <span>I</span>nformative.</p>

								<div>

								<p>One of the most powerful aspects of multimodal interfaces is

								their ability to provide support for user inputs which are

								distributed over the available input modes. These <b>composite</b>

								inputs are contributions made by the user within a single turn

								which have component parts in different modes. For example, the

								user might say "zoom in here" in the speech mode while drawing an

								area on a graphical display in the ink mode. One of the central

								motivating factors for this kind of input is that different kinds

								of communicative content are best suited to different input modes.

								In the example of a user drawing an area on a map and saying "zoom

								in here", the zoom command is easiest to provide in speech but the

								spatial information, the specific area, is easier to provide in

								ink.</p>

								<p>Enabling composite multimodality is critical in ensuring that

								multimodal systems support more natural and effective interaction

								for users. In order to support composite inputs, a multimodal

								architecture must provide some kind of multimodal integration

								mechanism. In the W3C Multimodal Interaction Framework

								<span>[<a href="#MMIF">MMI Framework</a>]</span>, multimodal

								integration can be handled by an integration component which

								follows the application of speech understanding and other kinds of

								interpretation procedures for individual modes.</p>

								<p>Given the broad range of different techniques being employed for

								multimodal integration and the extent to which this is an ongoing

								research problem, standardization of the specific method or

								algorithm used for multimodal integration is not appropriate at

								this time. In order to facilitate the development and

								interoperation of different multimodal integration mechanisms, EMMA

								provides markup enabling application-independent

								specification of elements in the application markup where content

								from another mode needs to be integrated. These representation

								'hooks' can then be used by different kinds of multimodal

								integration components and algorithms to drive the process of

								multimodal integration. In the processing of a composite multimodal

								input, the result of applying a mode-specific interpretation

								component to each of the individual modes will be EMMA markup

								describing the possible interpretation of that input.</p>

								</div>

								<p>One way to build an EMMA representation of a spoken input such

								as "zoom in here" is to use grammar rules in the W3C Speech

								Recognition Grammar Specification [<a href="#SRGS">SRGS</a>] using

								the Semantic Interpretation <span>[<a href="#SI">SISR</a>]</span>

								tags to build the application semantics with the

								<code>emma:hook</code> attribute. In this approach <span>[<a href=

								"#ECMASCRIPT">ECMAScript</a>]</span> is specified in order to build

								up an object representing the semantics. The resulting ECMAScript

								object is then translated to XML.</p>

								<p>For our example case of "zoom in here", the following SRGS rule

								could be used. The <span>Semantic Interpretation for Speech

								Recognition</span> specification <span>[<a href=

								"#SI">SISR</a>]</span> provides a reserved property

								<b>_nsprefix</b> for indicating the namespace to be used with an

								attribute.</p>

								<pre class="example">

								&lt;rule id="zoom"&gt;

								  zoom in here

								  &lt;tag&gt;

								    $.command = new Object();

								    $.command.action = "zoom";

								    $.command.location = new Object();

								    $.command.location._attributes = new Object();

								    $.command.location._attributes.hook = new Object();

								    $.command.location._attributes.hook._nsprefix = "emma";

								    $.command.location._attributes.hook._value = "ink";

								    $.command.location.type = "area";

								  &lt;/tag&gt;

								&lt;/rule&gt;

								</pre>

								<p>Application of this rule will result in the following ECMAScript

								object being built.</p>

								<pre class="example">

								command: {

								      action: "zoom",

								      location: {

								        _attributes: {

								           hook: {

								             _nsprefix: "emma",

								             _value: "ink"

								             }

								           },

								        type: "area"

								      }

								}

								</pre>

								<p><a href="#SI">SI</a> processing in an XML environment would

								generate the following document:</p>

								<pre class="example">

								&lt;command&gt;

								  &lt;action&gt;zoom&lt;/action&gt;

								  &lt;location emma:hook="ink"&gt;

								     &lt;type&gt;area&lt;/type&gt;

								  &lt;/location&gt;

								&lt;/command&gt;

								</pre>
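								<p>The translation from the ECMAScript object to XML can be approximated as follows. This Python sketch follows only the conventions visible in the example above: ordinary properties become child elements, entries under <code>_attributes</code> become attributes, and <code>_nsprefix</code> selects the attribute's namespace. It is not the serialization algorithm defined by SISR, and the names <code>build</code> and <code>PREFIXES</code> are hypothetical.</p>

```python
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"
PREFIXES = {"emma": EMMA_NS}  # assumed prefix-to-namespace mapping


def build(name, value):
    """Build an element called `name` from a nested dict or a plain value."""
    elem = ET.Element(name)
    if not isinstance(value, dict):
        elem.text = str(value)
        return elem
    for key, child in value.items():
        if key == "_attributes":
            # Each attribute entry carries its value and an optional namespace prefix.
            for attr_name, spec in child.items():
                ns = PREFIXES.get(spec.get("_nsprefix"))
                qname = f"{{{ns}}}{attr_name}" if ns else attr_name
                elem.set(qname, spec["_value"])
        else:
            elem.append(build(key, child))
    return elem


# The object built by the "zoom" rule, written as a Python dict.
semantics = {
    "command": {
        "action": "zoom",
        "location": {
            "_attributes": {"hook": {"_nsprefix": "emma", "_value": "ink"}},
            "type": "area",
        },
    }
}

ET.register_namespace("emma", EMMA_NS)
command = build("command", semantics["command"])
print(ET.tostring(command, encoding="unicode"))
```

								<p>The serializer declares the <code>emma</code> prefix on the outermost element it writes, so the printed fragment carries an <code>xmlns:emma</code> declaration that the fragment above leaves to the enclosing EMMA document.</p>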

								<p>This XML fragment might then appear within an EMMA document as

								follows:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="voice1"

								      emma:medium="acoustic"

								      emma:mode="voice"&gt;

								    &lt;command&gt;

								      &lt;action&gt;zoom&lt;/action&gt;

								      &lt;location emma:hook="ink"&gt;

								         &lt;type&gt;area&lt;/type&gt;

								      &lt;/location&gt;

								    &lt;/command&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>The <code>emma:hook</code> annotation indicates that this speech

								input needs to be combined with ink input such as the

								following:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation id="pen1"

								      emma:medium="tactile"

								      emma:mode="ink"&gt;

								    &lt;location&gt;

								      &lt;type&gt;area&lt;/type&gt;

								      &lt;points&gt;42.1345 -37.128 42.1346 -37.120 ... &lt;/points&gt;

								    &lt;/location&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;


								</pre>

								<p>This representation could be generated by a pen modality

								component performing gesture recognition and interpretation. The

								input to the component would be an <span>Ink Markup Language</span>

								specification <span>[<a href="#InkML">INKML</a>]</span> of the ink

								trace and the output would be the EMMA document above.</p>

								<p>The combination will result in the following EMMA document for

								the combined speech and pen multimodal input.</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation

								      emma:medium="acoustic tactile"

								      emma:mode="<span>voice ink</span>"

								      emma:process="http://example.com/myintegrator.xml"&gt;

								    &lt;emma:derived-from resource="<span>http://example.com/voice1.emma/</span>#voice1" composite="true"/&gt;

								    &lt;emma:derived-from resource="<span>http://example.com/pen1.emma/</span>#pen1" composite="true"/&gt;

								    &lt;command&gt;

								       &lt;action&gt;zoom&lt;/action&gt;

								       &lt;location&gt;

								         &lt;type&gt;area&lt;/type&gt;

								         &lt;points&gt;42.1345 -37.128 42.1346 -37.120 ... &lt;/points&gt;

								        &lt;/location&gt;

								     &lt;/command&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<div>

								<p>There are two components to the process of integrating these two

								pieces of semantic markup. The first is to ensure that the two are

								compatible; that is, that no semantic constraints are violated. The

								second is to fuse the content from the two sources. In our example,

								the <code>&lt;type&gt;area&lt;/type&gt;</code> element is intended

								to indicate that this speech command requires integration with an

								area gesture rather than, for example, a line gesture, which would

								have the subelement <code>&lt;type&gt;line&lt;/type&gt;</code>.

								This constraint needs to be enforced by whatever mechanism is

								responsible for multimodal integration.</p>

								<p>Many different techniques could be used for achieving this

								integration of the semantic interpretation of the pen input, a

								<code>&lt;location&gt;</code> element, with the corresponding

								<code>&lt;location&gt;</code> element in the speech. The

								<span><code>emma:hook</code></span> simply serves to indicate the

								existence of this relationship.</p>

								<p>One way to achieve both the compatibility checking and fusion of

								content from the two modes is to use a well-defined general purpose

								matching mechanism such as unification. <span>Graph unification

								[</span><a href="#graphunification">Graph

								unification</a><span>]</span> is a mathematical operation defined

								over directed acyclic graphs which captures both of the components

								of integration in a single operation: the application of the

								semantic constraints and the fusing of content. One possible

								semantics for the <code>emma:hook</code> markup indicates that

								content from the required mode needs to be unified with that

								position in the application semantics. In order to unify, two

								elements must not have any conflicting values for subelements or

								attributes. This procedure can be defined recursively so that

								elements within the subelements must also not clash and so on. The

								result of unification is the union of all of the elements and

								attributes of the two elements that are being unified.</p>
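								<p>The recursive procedure just described can be sketched in Python as follows. This is an illustrative simplification, not a definition of graph unification: it works on XML element trees, assumes that each child element name occurs at most once under a given parent, and treats differing text content as a conflict. The names <code>unify</code> and <code>UnificationFailure</code> are hypothetical, and the <code>emma:hook</code> attribute is left out of the sample input for brevity.</p>

```python
import copy
import xml.etree.ElementTree as ET


class UnificationFailure(Exception):
    """Raised when two elements carry conflicting content."""


def unify(a, b):
    """Return a new element unifying `a` and `b`, or raise UnificationFailure."""
    if a.tag != b.tag:
        raise UnificationFailure(f"tag clash: {a.tag} vs {b.tag}")
    result = ET.Element(a.tag)
    # Attributes: take the union; an attribute present on both sides must agree.
    for source in (a, b):
        for name, value in source.attrib.items():
            if result.get(name, value) != value:
                raise UnificationFailure(f"attribute clash on {name}")
            result.set(name, value)
    # Text content: values on both sides must agree when both are present.
    text_a = (a.text or "").strip()
    text_b = (b.text or "").strip()
    if text_a and text_b and text_a != text_b:
        raise UnificationFailure(f"text clash in <{a.tag}>: {text_a!r} vs {text_b!r}")
    result.text = text_a or text_b or None
    # Children: unify those present on both sides, copy the rest unchanged.
    b_children = {child.tag: child for child in b}
    for child in a:
        other = b_children.pop(child.tag, None)
        if other is None:
            result.append(copy.deepcopy(child))
        else:
            result.append(unify(child, other))
    for child in b_children.values():
        result.append(copy.deepcopy(child))
    return result


speech = ET.fromstring("<location><type>area</type></location>")
ink = ET.fromstring(
    "<location><type>area</type>"
    "<points>42.1345 -37.128 42.1346 -37.120</points></location>"
)
merged = unify(speech, ink)
print(ET.tostring(merged, encoding="unicode"))
```

								<p>Unifying the speech element with a gesture whose subelement is <code>&lt;type&gt;line&lt;/type&gt;</code> raises <code>UnificationFailure</code> in this sketch, which corresponds to the compatibility check described above.</p>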

								<p>In addition to the unification operation, in the resulting

								<code>emma:interpretation</code> the <code>emma:hook</code>

								attribute needs to be removed and the <code>emma:mode</code>

								attribute changed to <span>the list of the modes of the individual

								inputs</span><span>, e.g. <code>"voice ink"</code></span>.</p>

								<p>Instead of the unification operation, for a specific application

								semantics, integration could be achieved using some other algorithm

								or script. The benefit of using the unification semantics for

								<code>emma:hook</code> is that it provides a general purpose

								mechanism for checking the compatibility of elements and fusing

								them, whatever the specific elements are in the application

								specific semantic representation.</p>

								<p>The benefit of using the <code>emma:hook</code> annotation for

								authors is that it provides an application independent method for

								indicating where integration with content from another mode is

								required. If a general purpose integration mechanism is used, such

								as the unification approach described above, authors should be able

								to use the same integration mechanism for a range of different

								applications without having to change the integration rules or

								logic. For each application the speech grammar rules [<a href=

								"#SRGS">SRGS</a>] need to assign <code>emma:hook</code> to the

								appropriate elements in the semantic representation of the speech.

								The general purpose multimodal integration mechanism will use the

								<code>emma:hook</code> annotations in order to determine where to

								add in content from other modes. Another benefit of the

								<code>emma:hook</code> mechanism is that it facilitates

								interoperability among different multimodal integration components,

								so long as they are all general purpose and utilize

								<code>emma:hook</code> in order to determine where to integrate

								content.</p>

								<p>The following provides a more detailed example of the use of the

								<code>emma:hook</code> annotation. In this example, spoken input is

								combined with two <span>ink</span> gestures. The semantic

								representation assigned to the spoken input "send this file to

								this" indicates two locations where content is required from ink

								input using <code>emma:hook="ink"</code>:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:interpretation<span> id="voice2"

								      emma:medium="acoustic"

								      emma:mode="voice"

								      emma:tokens="send this file to this"

								      emma:start="1087995961500"

								      emma:end="1087995963542"</span>&gt;

								    &lt;command&gt;

								      &lt;action&gt;send&lt;/action&gt;

								        &lt;arg1&gt;

								          &lt;object emma:hook="ink"&gt;

								            &lt;type&gt;file&lt;/type&gt;

								            &lt;number&gt;1&lt;/number&gt;

								          &lt;/object&gt;

								        &lt;/arg1&gt;

								       &lt;arg2&gt;

								         &lt;object emma:hook="ink"&gt;

								           &lt;number&gt;1&lt;/number&gt;

								         &lt;/object&gt;

								       &lt;/arg2&gt;

								    &lt;/command&gt;

								  &lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>The user gesturing on the two locations on the display can be

								represented using <code>emma:sequence</code>:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								  &lt;emma:sequence<span> id="ink2"</span>&gt;

								    &lt;emma:interpretation <span>emma:start="1087995960500"

								      emma:end="1087995960900"

								      emma:medium="tactile"

								      emma:mode="ink"</span>&gt;

								      &lt;object&gt;

								       &lt;type&gt;file&lt;/type&gt;

								       &lt;number&gt;1&lt;/number&gt;

								       &lt;id&gt;test.pdf&lt;/id&gt;

								      &lt;/object&gt;

								    &lt;/emma:interpretation&gt;

								    &lt;emma:interpretation <span>emma:start="1087995961000"

								      emma:end="1087995961100"

								      emma:medium="tactile"

								      emma:mode="ink"</span>&gt;

								      &lt;object&gt;

								        &lt;type&gt;printer&lt;/type&gt;

								        &lt;number&gt;1&lt;/number&gt;

								        &lt;id&gt;lpt1&lt;/id&gt;

								      &lt;/object&gt;

								    &lt;/emma:interpretation&gt;

								  &lt;/emma:sequence&gt;

								&lt;/emma:emma&gt;

								</pre>

								<p>A general purpose unification-based multimodal integration

								algorithm could use the <code>emma:hook</code> annotation as

								follows. It identifies the elements marked with

								<code>emma:hook</code> in document order. For each of those in

								turn, it attempts to unify the element with the corresponding

								element in order in the <code>emma:sequence</code>. Since none of

								the subelements conflict, the unification goes through and as a

								result, we have the following EMMA for the composite result:</p>

								<pre class="example">

								&lt;emma:emma version="1.0"

								    xmlns:emma="http://www.w3.org/2003/04/emma"

								    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

								    xsi:schemaLocation="http://www.w3.org/2003/04/emma

								     http://www.w3.org/TR/2009/REC-emma-20090210/emma.xsd"

								    xmlns="http://www.example.com/example"&gt;

								&lt;emma:interpretation<span> id="multimodal2"

								      emma:medium="acoustic tactile"

								      emma:mode="voice ink"

								      emma:tokens="send this file to this"

								      emma:process="http://example.com/myintegration.xml"

								      emma:start="1087995960500"

								      emma:end="1087995963542"</span>&gt;

								  &lt;emma:derived-from resource="<span>http://example.com/voice2.emma/</span>#voice2" composite="true"/&gt;

								  &lt;emma:derived-from resource="<span>http://example.com/ink2.emma/</span>#ink2" composite="true"/&gt;

								  &lt;command&gt;

								   &lt;action&gt;send&lt;/action&gt;

								    &lt;arg1&gt;

								     &lt;object&gt;

								       &lt;type&gt;file&lt;/type&gt;

								       &lt;number&gt;1&lt;/number&gt;

								        &lt;id&gt;test.pdf&lt;/id&gt;

								     &lt;/object&gt;

								    &lt;/arg1&gt;

								    &lt;arg2&gt;

								     &lt;object&gt;

								       &lt;type&gt;printer&lt;/type&gt;

								        &lt;number&gt;1&lt;/number&gt;

								       &lt;id&gt;lpt1&lt;/id&gt;

								     &lt;/object&gt;

								    &lt;/arg2&gt;

								  &lt;/command&gt;

								&lt;/emma:interpretation&gt;

								&lt;/emma:emma&gt;

								</pre></div>
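								<p>The hook-driven procedure described above can be sketched in Python as follows. This is one possible reading under stated assumptions, not a normative algorithm: hooked elements are not nested inside one another, each child element name occurs at most once per parent, the default application namespace is omitted from the sample data for brevity, and timestamps and <code>emma:derived-from</code> are not handled. Combining <code>emma:medium</code> in the same way as <code>emma:mode</code> is inferred from the composite example. The names <code>unify</code> and <code>integrate</code> are hypothetical.</p>

```python
import copy
import xml.etree.ElementTree as ET

EMMA = "http://www.w3.org/2003/04/emma"
HOOK = f"{{{EMMA}}}hook"


def unify(a, b):
    """Return the unification of elements a and b; raise ValueError on a clash."""
    text_a, text_b = (a.text or "").strip(), (b.text or "").strip()
    if a.tag != b.tag or (text_a and text_b and text_a != text_b):
        raise ValueError(f"cannot unify <{a.tag}> with <{b.tag}>")
    out = ET.Element(a.tag)
    out.text = text_a or text_b or None
    for name, value in list(a.attrib.items()) + list(b.attrib.items()):
        if out.get(name, value) != value:
            raise ValueError(f"attribute clash on {name}")
        out.set(name, value)
    others = {child.tag: child for child in b}  # assumes unique child names
    for child in a:
        match = others.pop(child.tag, None)
        out.append(copy.deepcopy(child) if match is None else unify(child, match))
    out.extend([copy.deepcopy(child) for child in others.values()])
    return out


def integrate(speech, ink_sequence, result_id):
    """Fill each emma:hook="ink" element of `speech`, in document order,
    with the gesture at the same position in `ink_sequence`."""
    result = copy.deepcopy(speech)
    gestures = [interp[0] for interp in ink_sequence]  # payload of each interpretation
    hooked = [e for e in result.iter() if e.get(HOOK) == "ink"]
    if len(hooked) != len(gestures):
        raise ValueError("number of hooks and gestures differ")
    parents = {child: parent for parent in result.iter() for child in parent}
    for elem, gesture in zip(hooked, gestures):
        merged = unify(elem, gesture)
        del merged.attrib[HOOK]  # the hook has been satisfied, so remove it
        parent = parents[elem]
        parent[list(parent).index(elem)] = merged
    result.set("id", result_id)
    for attr in ("medium", "mode"):
        key = f"{{{EMMA}}}{attr}"
        values = [speech.get(key)] + [interp.get(key) for interp in ink_sequence]
        # Space-separated list of the distinct values, in order of first appearance.
        result.set(key, " ".join(dict.fromkeys(v for v in values if v)))
    return result


SPEECH = f"""<emma:interpretation xmlns:emma="{EMMA}" id="voice2"
    emma:medium="acoustic" emma:mode="voice">
  <command>
    <action>send</action>
    <arg1><object emma:hook="ink"><type>file</type><number>1</number></object></arg1>
    <arg2><object emma:hook="ink"><number>1</number></object></arg2>
  </command>
</emma:interpretation>"""

INK = f"""<emma:sequence xmlns:emma="{EMMA}" id="ink2">
  <emma:interpretation emma:medium="tactile" emma:mode="ink">
    <object><type>file</type><number>1</number><id>test.pdf</id></object>
  </emma:interpretation>
  <emma:interpretation emma:medium="tactile" emma:mode="ink">
    <object><type>printer</type><number>1</number><id>lpt1</id></object>
  </emma:interpretation>
</emma:sequence>"""

combined = integrate(ET.fromstring(SPEECH), ET.fromstring(INK), "multimodal2")
print(combined.get(f"{{{EMMA}}}mode"))
print(combined.find("command/arg2/object/id").text)
```

								<p>Because unification does not depend on the order of subelements, the child order in the result of this sketch may differ from the order shown in the composite example.</p>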

								<h2 id="appD">Appendix D. EMMA event interface</h2>

								<p>This section is <span>I</span>nformative.</p>

								<p>The W3C Document Object Model [<a href="#DOM">DOM</a>] defines

								platform- and language-neutral interfaces that give programs and

								scripts the means to dynamically access and update the content,

								structure and style of documents. DOM Events define a generic event

								system which allows registration of event handlers, describes event

								flow through a tree structure, and provides basic contextual

								information for each event.</p>

								<p>This section of the EMMA specification extends the DOM Event

								interface for use with events that describe interpreted user input

								in terms of a DOM Node for an EMMA document.</p>

								<pre class="example">

								// File: emma.idl


								#ifndef _EMMA_IDL_

								#define _EMMA_IDL_


								#include "dom.idl"

								#include "views.idl"

								#include "events.idl"


								#pragma prefix "dom.w3c.org"

								module emma

								{

								  typedef dom::DOMString DOMString;

								  typedef dom::Node Node;


								  interface EMMAEvent : events::UIEvent {

								    readonly attribute dom::Node  node;

								    void               initEMMAEvent(in DOMString typeArg,

								                                   in boolean canBubbleArg,

								                                   in boolean cancelableArg,

								                                   in Node node);

								  };

								};


								#endif // _EMMA_IDL_

								</pre>

								<h2 id="appE">Appendix E. References</h2>

								<h3 id="appE1">E.1 Normative references</h3>

								<dl>

								<dt id="BCP47">BCP47</dt>

								<dd>A. Phillips and M. Davis, editors. <a href=

								"http://www.rfc-editor.org/rfc/bcp/bcp47.txt">Tags for the

								Identification of Languages</a>, IETF, September 2006.</dd>

								<dt id="RFC3023">RFC3023</dt>

								<dd>M. Murata et al.<span>,</span> editors. <a href=

								"http://www.ietf.org/rfc/rfc3023.txt">XML Media Types</a>. IETF RFC

								3023<span>, January 2001</span>.</dd>

								<dt id="RFC2046">RFC2046</dt>

								<dd>N. Freed and N. Borenstein<span>,</span> editors. <a href=

								"http://www.ietf.org/rfc/rfc2046.txt">Multipurpose Internet Mail

								Extensions (MIME) Part Two: Media Types</a>. IETF RFC 2046<span>,

								November 1996</span>.</dd>

								<dt><a id="ref-rfc2119" name="ref-rfc2119" shape=

								"rect">RFC2119</a></dt>

								<dd>S. Bradner, <span>e</span>ditor. <a href=

								"http://www.ietf.org/rfc/rfc2119.txt">Key words for use in RFCs to

								Indicate Requirement Levels</a>, IETF <span>RFC 2119</span>, March

								1997.</dd>

								<dt id="RFC3986">RFC3986</dt>

								<dd>T. Berners-Lee et al.<span>,</span> editors. <a href=

								"http://www.ietf.org/rfc/rfc3986.txt">Uniform Resource Identifier

								(URI): Generic Syntax</a>. IETF RFC 3986<span>, January

								2005</span>.</dd>

								<dt id="RFC3987">RFC3987</dt>

								<dd>M. Duerst and M. Suignard<span>,</span> editors. <a href=

								"http://www.ietf.org/rfc/rfc3987.txt">Internationalized Resource

								Identifiers (IRIs)</a>. IETF RFC 3987<span>, January

								2005</span>.</dd>

								<dt id="XML">XML</dt>

								<dd>Tim Bray <span>et al.,</span> editors. <a href=

								"http://www.w3.org/TR/2004/REC-xml11-20040204/">Extensible Markup

								Language (XML) 1.1</a>. World Wide Web Consortium, <span>W3C

								Recommendation,</span> 2004.</dd>

								<dt id="XMLNS">XMLNS</dt>

								<dd>Tim Bray <span>et al.</span>, editors<span>.</span> <a href=

								"http://www.w3.org/TR/xml-names11/">Namespaces in XML 1.1</a>,

								World Wide Web Consortium, <span>W3C Recommendation,</span>

								200<span>6</span>.</dd>

								<dt id="XSD1">XML Schema Structures</dt>

								<dd>Henry S. Thompson <span>et al.</span>, editors. <a href=

								"http://www.w3.org/TR/xmlschema-1/">XML Schema Part 1: Structures

								Second Edition</a>, World Wide Web Consortium<span>, W3C

								Recommendation</span>, 2004.</dd>

								<dt id="XSD2">XML Schema Datatypes</dt>

								<dd>Paul V. Biron <span>and</span> Ashok Malhotra, editors.

								<a href="http://www.w3.org/TR/xmlschema-2/">XML Schema Part 2:

								Datatypes Second Edition</a>, World Wide Web Consortium, <span>W3C

								Recommendation,</span> 2004.</dd>

								</dl>

								<h3 id="appE2">E.2 Informative references</h3>

								<dl>

								<dt id="DOM">DOM</dt>

								<dd><a href="http://www.w3.org/DOM/">Document Object Model</a>,

								World Wide Web Consortium, 2005.</dd>

								<dt id="ECMASCRIPT">ECMAScript</dt>

								<dd><a href=

								"http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf">

								ECMAScript</a></dd>

								<dt id="InkML">INKML</dt>

								<dd>Yi-Min Chee, Max Froumentin, Stephen M. Watt, editors. <a href=

								"http://www.w3.org/TR/InkML/">Ink Markup Language (InkML)</a>,

								World Wide Web Consortium, W3C Working Draft, 2006.</dd>

								<dt id="SI">SI<span>SR</span></dt>

								<dd>Luc Van Tichelen <span>and Dave Burke</span>,

								editor<span>s</span>. <a href=

								"http://www.w3.org/TR/semantic-interpretation/">Semantic

								Interpretation for Speech Recognition</a>, World Wide Web

								Consortium, <span>W3C Proposed Recommendation, 2007</span>.</dd>

								<dt id="SRGS">SRGS</dt>

								<dd>Andrew Hunt, Scott McGlashan, editors. <a href=

								"http://www.w3.org/TR/speech-grammar/">Speech Recognition Grammar

								Specification Version 1.0</a>, World Wide Web Consortium<span>, W3C

								Recommendation,</span> 2004.</dd>

								<dt id="XFORMS">XFORMS</dt>

								<dd><span>John M. Boyer et al., editors.</span> <a href=

								"http://www.w3.org/TR/2006/REC-xforms-20060314/">XForms <span>1.0

								(Second Edition)</span></a>, World Wide Web Consortium, <span>W3C

								Recommendation,</span> 2006.</dd>

								<dt id="RELAXNG">RELAX-NG</dt>

								<dd><span>James Clark and Makoto Murata, editors.</span> <a href=

								"http://www.oasis-open.org/committees/relax-ng/spec-20011203.html"><span>

								RELAX NG Specification</span></a><span>, OASIS, Committee

								Specification, 2001.</span></dd>

								<dt id="EMMAreqs">EMMA Requirements</dt>

								<dd>Stephane H. Maes and Stephen Potter, editors. <a href=

								"http://www.w3.org/TR/EMMAreqs/">Requirements for EMMA</a>, World

								Wide Web Consortium, <span>W3C Note,</span> 2003<span>.</span></dd>

								<dt id="graphunification">Graph Unification</dt>

								<dd>Bob Carpenter. <cite>The Logic of Typed Feature

								Structures</cite>, Cambridge Tracts in Theoretical Computer Science

								32, Cambridge University Press, 1992.</dd>

								<dd>Kevin Knight. <cite>Unification: A Multidisciplinary

								Survey</cite>, ACM Computing Surveys, 21(1), 1989.</dd>

								<dd>Michael Johnston. <cite>Unification-based Multimodal

								Parsing</cite>, Proceedings of Association for Computational

								Linguistics, pp. 624-630, 1998.</dd>

								<dt id="MMIF">MMI Framework</dt>

								<dd>James A. Larson, T.V. Raman and Dave Raggett, editors. <a href=

								"http://www.w3.org/TR/mmi-framework/">W3C Multimodal Interaction

								Framework</a>, World Wide Web Consortium<span>, W3C Note</span>,

								2003<span>.</span></dd>

								<dt id="MMIreqs">MMI Requirements</dt>

								<dd>Stephane H. Maes and Vijay Saraswat, editors. <a href=

								"http://www.w3.org/TR/mmi-reqs/">Multimodal Interaction

								Requirements</a>, World Wide Web Consortium<span>, W3C Note</span>,

								2003<span>.</span></dd>

								</dl>

								<h2 id="appF">Appendix F. Changes since last draft</h2>

								<p>This section is <span>I</span>nformative.</p>

								<p>

								Since the publication of the Proposed Recommendation of the EMMA

								specification, the following minor editorial changes have been

								made to the draft.

								</p>

								<ul>

								<li>

								Fixed incorrect text styling.

								(<a href="#s1.2">1.2 Terminology</a>)

								</li>


								<li>

								Changed the schemaLocation URI in the code examples

								  from

								  "http://www.w3.org/TR/2008/PR-emma-20081215/"

								  to

								  "http://www.w3.org/TR/2009/REC-emma-20090210/".

								(<a href="#s2">2. Structure of EMMA documents</a>,

								<a href="#s3">3. EMMA structural elements</a>

								and

								<a href="#s4">4. EMMA annotations</a>)

								</li>


								<li>

								Changed the note on the status of MIME type registration from

								  "being submitted to the IESG for review, approval, and registration

								  with IANA" to "registered with IANA at

								  http://www.iana.org/assignments/media-types/application/" because

								  the EMMA MIME type has now been registered with IANA.

								(<a href="#appB">Appendix B</a>)

								</li>

								</ul>


								<h2 id="appG">Appendix G. Acknowledgements</h2>

								<p>This section is <span>I</span>nformative.</p>

								<p>The editors would like to recognize the contributions of the

								current and former members of the W3C Multimodal Interaction Group

								<em>(listed in alphabetical order)</em>:</p>

								<dl>

								<dd>Kazuyuki Ashimura, W3C</dd>

								<dd>Patrizio Bergallo (until 2008, while at Loquendo)</dd>

								<dd>Wu Chou, Avaya</dd>

								<dd>Max Froumentin (until 2006, while at W3C)</dd>

								<dd>Katriina Halonen, Nokia</dd>

								<dd>Jin Liu, T-Systems</dd>

								<dd>Roberto Pieraccini, Speechcycle</dd>

								<dd>Stephen Potter, Microsoft</dd>

								<dd>Massimo Romanelli, DFKI</dd>

								<dd>Yuan Shao, Canon</dd>

								</dl>

								</body>

								</html>