Extending the Web to support multiple modes of interaction.
News
- 20 September 2011: Ink Markup Language (InkML) is a W3C Recommendation.
- 6 September 2011: The second Last Call Working Draft of Multimodal Architecture and Interfaces is published.
The main normative change from the previous draft is the removal of the 'immediate' field from the following life-cycle events: CancelRequest and PauseRequest.
- 10 May 2011: Ink Markup Language (InkML) is a W3C Proposed Recommendation.
- 8 April 2011: The Last Call Working Draft of Emotion Markup Language (EmotionML) 1.0 is published. The main change from the previous draft is the inclusion of a mechanism for defining emotion vocabularies. A precise list of changes from the previous draft is also available for comparison purposes.
Also, Vocabularies for EmotionML is published as a First Public Working Draft. This document is a public collection of emotion vocabularies that can be used with EmotionML to represent emotions and related states. It was originally part of an earlier draft of the EmotionML specification, but was moved out so that the list of vocabularies can easily be updated, extended and corrected as required.
- 1 March 2011: Best practices for creating MMI Modality Components is published as a Working Group Note.
- 25 January 2011: The Last Call Working Draft of Multimodal Architecture and Interfaces is published.
The main change from the previous draft is a tightening of the language to make the requirements more precise. A diff-marked version is also available for comparison purposes.
- 11 January 2011: Ink Markup Language (InkML) is a W3C Candidate Recommendation (see also the group's Implementation Report Plan).
- 5-6 October 2010: The EmotionML Workshop was held in Paris, France, hosted by Telecom ParisTech. The summary and detailed minutes are available online. Participants from 12 organizations discussed use cases of possible emotion-ready applications and clarified several key requirements for the current EmotionML to make the specification even more useful.
- 21 September 2010: The seventh Working Draft of Multimodal Architecture and Interfaces is published.
The main changes from the previous draft are (1) the inclusion of state charts for modality components, (2) the addition of a 'confidential' field to life-cycle events and (3) the removal of the 'media' field from life-cycle events. A diff-marked version is also available for comparison purposes.
- 29 July 2010: The second Working Draft of Emotion Markup Language (EmotionML) 1.0 is published. A diff-marked version is also available for comparison purposes.
Please send your comments to the Multimodal Interaction public mailing list (<www-multimodal@w3.org>).
- 18-19 June 2010: The workshop on Conversational Applications was held in Somerset, NJ (USA), hosted by Openstream. The summary and detailed minutes are available online. Participants from 12 organizations focused discussion on use cases for possible conversational applications and clarified limitations of the current W3C language model in order to develop a more comprehensive one.
- 10 February 2009: EMMA: Extensible MultiModal Annotation markup language is a W3C Recommendation. (press release)
The Multimodal Interaction Activity seeks to extend the Web to allow users to dynamically select the most appropriate mode of interaction for their current needs, including any disabilities, whilst enabling developers to provide an effective user interface for whichever modes the user selects. Depending upon the device, users will be able to provide input via speech, handwriting, and keystrokes, with output presented via displays, pre-recorded and synthetic speech, audio, and tactile mechanisms such as mobile phone vibrators and Braille strips.
Multimodal interaction offers significant ease-of-use benefits over uni-modal interaction, for instance when hands-free operation is needed, for mobile devices with limited keypads, and for controlling other devices when a traditional desktop computer is unavailable to host the application user interface. This interest is being driven by advances in embedded and network-based speech processing, which are creating opportunities both for integrated multimodal Web browsers and for solutions that separate the handling of visual and aural modalities, for example by coupling a local XHTML user agent with a remote VoiceXML user agent.
The Multimodal Interaction Working Group (member only link) should be of interest to a range of organizations in different industry sectors.
The Multimodal Interaction Working Group was launched in 2002 following a joint workshop between the W3C and the WAP Forum. The Working Group's initial focus was on use cases and requirements. This led to the publication of the W3C Multimodal Interaction Framework, and in turn to work on extensible multi-modal annotations (EMMA), and InkML, an XML language for ink traces. The Working Group has also worked on integration of composite multimodal input; dynamic adaptation to device configurations, user preferences and environmental conditions (now transferred to the Device Independence Activity); modality component interfaces; and a study of current approaches to interaction management. The Working Group has now been re-chartered through 31 January 2009 under the terms of the W3C Patent Policy (5 February 2004 Version). To promote the widest adoption of Web standards, W3C seeks to issue Recommendations that can be implemented, according to this policy, on a Royalty-Free basis. The Working Group is chaired by Deborah Dahl. The W3C Team Contact is Kazuyuki Ashimura.
We are very interested in your comments and suggestions. If you have implemented multimodal interfaces, please share your experiences with us, as we are particularly interested in reports on implementations and their usability for both end-users and application developers. We welcome comments on any of our published documents. If you have a proposal for a multimodal authoring language, please let us know. To subscribe to the discussion list, send an email to www-multimodal-request@w3.org with the word subscribe in the subject header. Previous discussion can be found in the public archive. To unsubscribe, send an email to www-multimodal-request@w3.org with the word unsubscribe in the subject header.
If your organization is already a member of W3C, ask your W3C Advisory Committee Representative (member only link) to fill out the online registration form to confirm that your organization is prepared to commit the time and expense involved in participating in the group. You will be expected to attend all Working Group meetings (about 3 or 4 times a year) and to respond in a timely fashion to email requests. Further details about joining are available on the Working Group (member only link) page. Requirements for patent disclosures, as well as terms and conditions for licensing essential IPR, are given in the W3C Patent Policy.
More information about the W3C is available, as is information about joining W3C.
W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent.
| Specification | FPWD | LC | CR | PR | Rec |
|---|---|---|---|---|---|
| Multimodal Architecture and Interfaces | Completed (2nd through 7th WDs also published) | Completed | TBD | TBD | TBD |
| EMMA 2.0 | 4Q 2009 | January 2011 | TBD | TBD | TBD |
| EMMA | Completed | Completed | Completed | Completed | Completed (10 Feb. 2009) |
| InkML | Completed | Completed (1st and 2nd LC) | Completed | April 2010 | June 2010 |
| EmotionML | Completed (1st and 2nd WD) | Completed | June 2011 | TBD | TBD |
| Ink Modality Component Definition | Completed (as a WG Note) | - | - | - | - |
| Voice Modality Component Definition | December 2009 (as a WG Note) | - | - | - | - |
This is intended to give you a brief summary of each of the major work items under development by the Multimodal Interaction Working Group. The suite of specifications is known as the W3C Multimodal Interaction Framework.
The following indicates current work items. Additional work is expected on topics described in section 4 of the charter, including multimodal authoring, modality component interfaces, composite multimodal input, and coordinated multimodal output.
A loosely coupled architecture for the Multimodal Interaction Framework that focuses on providing a general means for components to communicate with each other, plus basic infrastructure for application control and platform services. Work is continuing on how the architecture can be realized in terms of well-defined component interfaces and eventing models.
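To make the eventing model more concrete, the sketch below shows roughly how a StartRequest life-cycle event from an Interaction Manager to a modality component might be serialized in XML. This is an illustration only, not text from the specification: the element and attribute spellings, and the source, target and context values, are approximations and should be checked against the current draft.

```xml
<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <!-- Illustrative sketch: the Interaction Manager (source) asks a modality
       component (target) to start running the referenced content within an
       existing interaction context; identifiers here are made up. -->
  <mmi:startRequest source="IM-1" target="voiceMC-1"
      context="ctx-1" requestID="req-1">
    <mmi:contentURL href="http://example.com/dialog.vxml"/>
  </mmi:startRequest>
</mmi:mmi>
```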
EMMA has been developed as a data exchange format for the interface between input processors and interaction management systems. It defines the means for recognizers to annotate application-specific data with information such as confidence scores, time stamps, input mode (e.g. keystrokes, speech or pen), alternative recognition hypotheses, and partial recognition results. EMMA is a target data format for the semantic interpretation specification being developed in the Voice Browser Activity, which describes annotations to speech grammars for extracting application-specific data as a result of speech recognition. EMMA supersedes the earlier work on the Natural Language Semantics Markup Language in the Voice Browser Activity.
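As a minimal sketch, an EMMA 1.0 document annotating a spoken travel request might look like the following. The application payload (the origin and destination elements and their namespace) is invented for illustration; the emma:* annotations are the kind of metadata the specification defines.

```xml
<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma"
    xmlns="http://www.example.com/travel">
  <!-- A single interpretation of the user's spoken input, annotated with
       medium, mode, confidence, timestamps and the recognized tokens. -->
  <emma:interpretation id="int1"
      emma:medium="acoustic"
      emma:mode="voice"
      emma:confidence="0.75"
      emma:start="1087995961542"
      emma:end="1087995963542"
      emma:tokens="flights from boston to denver">
    <origin>Boston</origin>
    <destination>Denver</destination>
  </emma:interpretation>
</emma:emma>
```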
Since EMMA 1.0 became a W3C Recommendation, a number of new possible use cases for the EMMA language have emerged. These include the use of EMMA to represent multimodal output, biometrics, emotion, sensor data, multi-stage dialogs, and interactions with multiple users. The Working Group has therefore decided to work on a document capturing use cases and issues for a series of possible extensions to EMMA, and has published a Working Group Note to seek feedback on the various use cases.
This work item sets out to define an XML data exchange format for ink entered with an electronic pen or stylus as part of a multimodal system. This will enable the capture and server-side processing of handwriting, gestures, drawings, and specific notations for mathematics, music, chemistry and other fields, as well as supporting further research on this processing. The Ink subgroup maintains a separate public page devoted to W3C's work on pen and stylus input.
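For illustration, a minimal InkML document might look like the sketch below: a single pen trace recorded as a comma-separated sequence of X Y sample points. Real applications would typically also declare trace formats, channels and capture-device metadata.

```xml
<ink xmlns="http://www.w3.org/2003/InkML">
  <!-- One pen trace: successive X Y coordinates sampled along the stroke. -->
  <trace>
    10 0, 9 14, 8 28, 7 42, 6 56, 6 70, 8 84, 8 98, 8 112, 9 126
  </trace>
</ink>
```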
EmotionML will provide representations of emotions and related states for technological applications. As the web is becoming ubiquitous, interactive, and multimodal, technology needs to deal increasingly with human factors, including emotions. The language is conceived as a "plug-in" language suitable for use in three different areas: (1) manual annotation of data; (2) automatic recognition of emotion-related states from user behavior; and (3) generation of emotion-related system behavior.
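As a rough sketch, a minimal EmotionML 1.0 annotation might look like the example below, which labels some data with the "happiness" category from the "big6" vocabulary in the Vocabularies for EmotionML draft; the vocabulary URI and attribute names should be checked against the current documents.

```xml
<emotionml version="1.0"
    xmlns="http://www.w3.org/2009/10/emotionml"
    category-set="http://www.w3.org/TR/emotion-voc/xml#big6">
  <!-- One annotated emotion: the "happiness" category with a confidence score. -->
  <emotion>
    <category name="happiness" confidence="0.8"/>
  </emotion>
</emotionml>
```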
For more details on other organizations see the Multimodal Interaction Charter.