W3C

Voice Current Status

This page summarizes the relationships among specifications, whether they are finished standards or drafts. Below, each title links to the most recent version of a document. For related introductory information, see: Voice Browsing.

Completed Work

W3C Recommendations have been reviewed by W3C Members, by software developers, and by other W3C groups and interested parties, and are endorsed by the Director as Web Standards. Learn more about the W3C Recommendation Track.

Group Notes are not standards and do not have the same level of W3C endorsement.

Standards

2011-07-05

Voice Browser Call Control: CCXML Version 1.0

translations · errata

The Call Control Extensible Markup Language (CCXML) provides declarative markup to describe telephony call control. CCXML can be used in conjunction with a dialog system such as VoiceXML.

2010-09-07

Speech Synthesis Markup Language (SSML) Version 1.1

translations · errata

Provides a rich, XML-based markup language for assisting the generation of synthetic speech in Web and other applications. The essential role of the markup language is to provide authors of synthesizable content a standard way to control aspects of speech such as pronunciation, volume, pitch, rate, etc. across different synthesis-capable platforms.
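As a rough illustration of the kind of control described above, the following Python sketch builds a minimal SSML 1.1 document with the standard library. The helper name `build_ssml`, its default values, and the sample text are hypothetical and chosen only for illustration; the `speak` and `prosody` elements and the namespace URI are the ones defined by SSML.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_ssml(text, rate="slow", pitch="high", volume="loud", lang="en-US"):
    """Return an SSML document that speaks `text` inside one <prosody> element.

    Hypothetical helper for illustration; not part of any W3C deliverable.
    """
    # Serialize SSML elements without a prefix (xmlns="...").
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(f"{{{SSML_NS}}}speak", {"version": "1.1"})
    speak.set(XML_LANG, lang)
    # <prosody> carries the rate, pitch and volume hints for the synthesizer.
    prosody = ET.SubElement(
        speak,
        f"{{{SSML_NS}}}prosody",
        {"rate": rate, "pitch": pitch, "volume": volume},
    )
    prosody.text = text
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Welcome to the voice browser."))
```

How a given synthesis platform renders these hints is up to that platform; the sketch only shows the markup structure.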

2008-10-14

Pronunciation Lexicon Specification (PLS) Version 1.0

translations · errata

This document defines the syntax for specifying pronunciation lexicons to be used by Automatic Speech Recognition and Speech Synthesis engines in voice browser applications.
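To make the idea of a pronunciation lexicon concrete, here is a minimal Python sketch that generates a PLS document from a mapping of written forms to pronunciations. The function name `build_lexicon` and the sample entry are assumptions made for illustration, and the IPA transcription is illustrative rather than authoritative; the `lexicon`, `lexeme`, `grapheme` and `phoneme` elements are the ones PLS defines.

```python
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_lexicon(entries, lang="en-US"):
    """Return a PLS document with one <lexeme> per (written form, IPA) pair.

    Hypothetical helper for illustration; `entries` maps text to an IPA string.
    """
    ET.register_namespace("", PLS_NS)
    lexicon = ET.Element(
        f"{{{PLS_NS}}}lexicon", {"version": "1.0", "alphabet": "ipa"}
    )
    lexicon.set(XML_LANG, lang)
    for written, ipa in entries.items():
        lexeme = ET.SubElement(lexicon, f"{{{PLS_NS}}}lexeme")
        # <grapheme> holds the written form, <phoneme> its pronunciation.
        ET.SubElement(lexeme, f"{{{PLS_NS}}}grapheme").text = written
        ET.SubElement(lexeme, f"{{{PLS_NS}}}phoneme").text = ipa
    return ET.tostring(lexicon, encoding="unicode")


if __name__ == "__main__":
    # Illustrative transcription only.
    print(build_lexicon({"tomato": "təˈmeɪtoʊ"}))
```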

2007-06-19

Voice Extensible Markup Language (VoiceXML) 2.1

translations · errata

VoiceXML 2.1 specifies a set of features commonly implemented by Voice Extensible Markup Language platforms. This specification is designed to be fully backwards-compatible with VoiceXML 2.0 [VXML2]. This specification describes only the set of additional features.

2007-04-05

Semantic Interpretation for Speech Recognition (SISR) Version 1.0

translations · errata

Grammar processors, and in particular speech recognizers, use a grammar to define the input language they can accept, that is, the words and sequences of words. The major task of a grammar processor is to find the sequence of words described by the grammar that best matches a given utterance, or to report that no such sequence exists. This document defines the syntax and the semantics of Semantic Interpretation Tags for use with the Speech Recognition Grammar Specification.

2004-09-07

Speech Synthesis Markup Language (SSML) Version 1.0

translations · errata

Provides a rich, XML-based markup language for assisting the generation of synthetic speech in Web and other applications. The essential role of the markup language is to provide authors of synthesizable content a standard way to control aspects of speech such as pronunciation, volume, pitch, rate, etc. across different synthesis-capable platforms.

2004-03-16

Voice Extensible Markup Language (VoiceXML) Version 2.0

translations · errata

This document specifies VoiceXML, the Voice Extensible Markup Language. VoiceXML is designed for creating audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony, and mixed initiative conversations. Its major goal is to bring the advantages of Web-based development and content delivery to interactive voice response applications.
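The shape of a simple VoiceXML dialog can be sketched as follows. This Python example emits a one-field form that asks a yes/no question; the function name `build_confirm_dialog`, the `id` and `name` values, and the prompt wording are hypothetical, and the sketch assumes a platform that supports the built-in `boolean` field type.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def build_confirm_dialog(question):
    """Return a VoiceXML 2.0 document with a single yes/no field.

    Hypothetical helper for illustration; assumes the platform provides
    the built-in "boolean" field type.
    """
    ET.register_namespace("", VXML_NS)
    vxml = ET.Element(f"{{{VXML_NS}}}vxml", {"version": "2.0"})
    form = ET.SubElement(vxml, f"{{{VXML_NS}}}form", {"id": "confirm"})
    field = ET.SubElement(
        form, f"{{{VXML_NS}}}field", {"name": "answer", "type": "boolean"}
    )
    # The prompt is played (synthesized) before the field listens for input.
    ET.SubElement(field, f"{{{VXML_NS}}}prompt").text = question
    # <filled> runs once the caller's speech or DTMF input fills the field.
    filled = ET.SubElement(field, f"{{{VXML_NS}}}filled")
    ET.SubElement(filled, f"{{{VXML_NS}}}prompt").text = "Thank you."
    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_confirm_dialog("Would you like to hear the weather?"))
```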

2004-03-16

Speech Recognition Grammar Specification Version 1.0

translations · errata

This document defines syntax for representing grammars for use in speech recognition so that developers can specify the words and patterns of words to be listened for by a speech recognizer.
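For a sense of what "words and patterns of words" looks like in practice, the Python sketch below writes a small grammar in the XML form of SRGS that accepts exactly one word from a list. The function name `build_grammar`, the rule name and the word list are assumptions made for illustration; `grammar`, `rule`, `one-of` and `item` are elements defined by SRGS.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_grammar(rule_id, words, lang="en-US"):
    """Return an SRGS XML grammar whose root rule matches one of `words`.

    Hypothetical helper for illustration; not part of any W3C deliverable.
    """
    ET.register_namespace("", SRGS_NS)
    grammar = ET.Element(
        f"{{{SRGS_NS}}}grammar",
        {"version": "1.0", "root": rule_id, "mode": "voice"},
    )
    grammar.set(XML_LANG, lang)
    rule = ET.SubElement(grammar, f"{{{SRGS_NS}}}rule", {"id": rule_id})
    # <one-of> lists alternatives; each <item> is one acceptable utterance.
    one_of = ET.SubElement(rule, f"{{{SRGS_NS}}}one-of")
    for word in words:
        ET.SubElement(one_of, f"{{{SRGS_NS}}}item").text = word
    return ET.tostring(grammar, encoding="unicode")


if __name__ == "__main__":
    print(build_grammar("color", ["red", "green", "blue"]))
```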

Group Notes

2009-12-08

Mobile Web for Social Development Roadmap

The purpose of this document is to help people understand the current challenges of deploying development-oriented services on mobile phones, evaluate existing technologies, and identify the most promising directions for lowering the barriers to developing, deploying and accessing services on mobile phones, thereby creating an enabling environment in which more social-oriented services can appear.

2005-05-26

SSML 1.0 say-as attribute values

1998-01-28

Voice Browsers

Drafts

Below are draft documents: Last Call Drafts and other Working Drafts. Some of these may become Web Standards through the W3C Recommendation Track process. Others may be published as Group Notes or become obsolete specifications.

Last Call Drafts

2011-08-18

CSS Speech Module

CSS (Cascading Style Sheets) is a language for describing the rendering of HTML and XML documents on screen, on paper, in speech, etc. CSS defines aural properties that give control over rendering XML to speech. This draft describes the text-to-speech properties proposed for CSS level 3. These are designed to match the model described in the Speech Synthesis Markup Language (SSML) Version 1.0 [SSML10].

The CSS3 Speech Module is a community effort. If you would like to help with implementation and with driving the specification forward along the W3C Recommendation track, please contact the editors.

Other Working Drafts

2011-04-26

State Chart XML (SCXML): State Machine Notation for Control Abstraction

This document describes SCXML, or the "State Chart extensible Markup Language". SCXML provides a generic state-machine-based execution environment based on CCXML and Harel State Tables.

2010-12-16

Voice Extensible Markup Language (VoiceXML) 3.0

VoiceXML 3.0 is a modular XML language for creating interactive media dialogs that feature synthesized speech, recognition of spoken and DTMF key input, telephony, mixed initiative conversations, and recording and presentation of a variety of media formats, including digitized audio and digitized video. The primary goal of the specification is to bring the advantages of Web-based development and content delivery to interactive voice response applications.

2008-08-08

Voice Extensible Markup Language (VoiceXML) 3.0 Requirements

The W3C Voice Browser Working Group aims to develop specifications to enable access to the Web using spoken interaction. This document is part of a set of requirements studies for voice browsers, and provides details of the requirements for marking up spoken dialogs.

2007-06-11

Speech Synthesis Markup Language Version 1.1 Requirements

Obsolete Specifications

These specifications have either been superseded by others, or have been abandoned. They remain available for archival purposes, but are not intended to be used.

Retired

2004-10-29

Pronunciation Lexicon Specification (PLS) Version 1.0 Requirements

The W3C Voice Browser Working Group aims to develop specifications to enable access to the Web using spoken interaction. This document is part of a set of requirements studies for voice browsers, and provides details of the requirements for markup used for specifying application-specific pronunciation lexicons.

Application-specific pronunciation lexicons are required in many situations where the default lexicon supplied with a speech recognition or speech synthesis processor does not cover the vocabulary of the application. A pronunciation lexicon is a collection of words or phrases together with their pronunciations, specified using an appropriate pronunciation alphabet.

2002-08-08

Voice Browser Interoperation: Requirements

A voice browser provides the means for people to use their voice to interact with appropriately designed applications. Users generally connect to voice browsers by dialing an access number. The voice browser in turn retrieves markup (e.g. VoiceXML) and other resources from an application server. In some situations it is appropriate to transfer the user from one voice browser to another. In other situations, the user may start from a visual web page and then transfer to a voice browser. Yet another possibility is a transfer from a voice browser to a human operator.

This document describes the requirements for how voice browsers and other call sites can cooperate by sharing data to create a seamless caller experience. An example of a potential resulting benefit to a caller is not having to re-enter the same information repeatedly at different call sites. A potential benefit for service providers is a flexible architecture for deploying and interconnecting disparate call sites.

2001-04-13

Call Control Requirements in a Voice Browser Framework

2001-01-03

Stochastic Language Models (N-Gram) Specification

2000-12-04

Voice Browsers, Introduction

The World Wide Web Consortium's Voice Browser Working Group is defining several markup languages for applications supporting speech input and output. These markup languages will enable speech applications across a range of hardware and software platforms. Specifically, the Working Group is designing markup languages for dialog, speech recognition grammar, speech synthesis, natural language semantics, and a collection of reusable dialog components. These markup languages make up the W3C Speech Interface Framework. The speech community is invited to review and comment on the working draft requirement and specification documents.

2000-11-20

Natural Language Semantics Markup Language for the Speech Interface Framework

The W3C Voice Browser working group aims to develop specifications to enable access to the Web using spoken interaction. This document is part of a set of specifications for voice browsers, and provides details of an XML markup language for describing the meanings of individual natural language utterances. It is expected to be automatically generated by semantic interpreters for use by components that act on the user's utterances, such as dialog managers.

2000-07-10

Multimodal Requirements for Voice Markup Languages

Multimodal browsers allow users to interact via a combination of modalities, for instance, speech recognition and synthesis, displays, keypads and pointing devices. The Voice Browser working group is interested in adding multimodal capabilities to voice browsers. This document sets out a prioritized list of requirements for multimodal dialog interaction, which any proposed markup language (or extension thereof) should address.

2000-04-26

Reusable Dialog Requirements for Voice Markup Language

The W3C Voice Browser working group aims to develop specifications to enable access to the Web using spoken interaction. This document is part of a set of requirements studies for voice browsers, and provides details of the requirements for reusable components for spoken dialogs.

1999-12-23

Grammar Representation Requirements for Voice Markup Languages

The W3C Voice Browser working group aims to develop specifications to enable access to the Web using spoken interaction. This document is part of a set of requirements studies for voice browsers, and provides details of the requirements for grammars for speech recognition.

1999-12-23

Dialog Requirements for Voice Markup Languages

The W3C Voice Browser working group aims to develop specifications to enable access to the Web using spoken interaction. This document is part of a set of requirements studies for voice browsers, and provides details of the requirements for marking up spoken dialogs.

1999-12-23

Model Architecture for Voice Browser Systems

The W3C Voice Browser working group aims to develop specifications to enable access to the Web using spoken interaction. This document is part of a set of requirements studies for voice browsers, and provides a model architecture for processing speech within voice browsers.

1999-12-23

Natural Language Processing Requirements for Voice Markup Languages

1999-12-23

Speech Synthesis Markup Requirements for Voice Markup Languages
