VoiceXML for speech-activated information
retrieval
The focus of recent developments in the fast-growing field of
mobile Internet is on presenting Web content on portable devices
(mobile telephones, handheld computers and personal digital
assistants, or PDAs) much as it appears in a Web browser. But
technologies such as Wireless Application Protocol (WAP), Wireless
Markup Language (WML) and Handheld Device Markup Language (HDML)
share a drawback: content is displayed on far smaller screens.
Fortunately, squinting at screens the size of a postage stamp is not
the only way to access the Internet's vast information offering. The
convenient alternative is voice-enabled Web access. In this
installment of the newsletter we'll take a closer look at VoiceXML,
the remarkable technology that makes this possible.
Providing the underpinning for VoiceXML is eXtensible Markup
Language (XML), a standard sanctioned by the W3C, the consortium
responsible for developing the standards that underlie the Web.
Strictly speaking, VoiceXML is an XML vocabulary that serves a single
purpose: standardizing voice-activated retrieval of Internet content
as far as possible. This means conventional phone systems may be used
to access information and applications on the Internet via spoken
selection dialogues, voice commands and interactive replies
(including touch-tone, or DTMF, key presses). VoiceXML is particularly
well-suited for two tasks:
- delivering the contents of an Internet site in speech form (for
example, to enable access via mobile phones)
- facilitating development of new interactive and voice-controlled
phone services based on standard open architecture.
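
To make the idea of a spoken selection dialogue more concrete, here is a minimal sketch in Python that generates a small VoiceXML menu accepting either a spoken phrase or a key press. The VoiceXML 2.0 namespace, the helper name build_menu and the example.com URLs are assumptions chosen for illustration; they do not come from any particular product.

```python
import xml.etree.ElementTree as ET

# Assumption: VoiceXML 2.0 markup and its W3C namespace.
VXML_NS = "http://www.w3.org/2001/vxml"


def build_menu(choices):
    """Return a VoiceXML menu document as a string.

    `choices` is a list of (dtmf_key, spoken_phrase, target_url) tuples.
    build_menu is a hypothetical helper, not part of any library.
    """
    ET.register_namespace("", VXML_NS)  # serialize without a prefix
    vxml = ET.Element(f"{{{VXML_NS}}}vxml", {"version": "2.0"})
    menu = ET.SubElement(vxml, f"{{{VXML_NS}}}menu")
    prompt = ET.SubElement(menu, f"{{{VXML_NS}}}prompt")
    prompt.text = "Welcome. Please say or press: "
    # <enumerate/> asks the interpreter to read out the choices below.
    ET.SubElement(prompt, f"{{{VXML_NS}}}enumerate")
    for key, phrase, url in choices:
        choice = ET.SubElement(
            menu, f"{{{VXML_NS}}}choice", {"dtmf": key, "next": url}
        )
        choice.text = phrase
    return ET.tostring(vxml, encoding="unicode")


# Placeholder URLs; a real service would point at its own documents.
document = build_menu([
    ("1", "share prices", "http://www.example.com/shares.vxml"),
    ("2", "sports results", "http://www.example.com/sports.vxml"),
])
print(document)
```

Each choice element pairs a phrase the caller may say with a key the caller may press, so the same dialogue works by voice or by keypad.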
VoiceXML considerably simplifies the development of voice-controlled
applications because it builds on existing tools, Web infrastructure
and servers. Instead of a Web browser, a VoiceXML Interpreter and a
telephony server give users telephone access to content, for example
information residing in Tamino, Software AG's XML server.
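
Because VoiceXML documents travel over ordinary HTTP, an existing Web application can hand them out much as it would hand out HTML pages. The following is a rough sketch, assuming a Python WSGI application; the name voice_app, the prompt text and the single-page layout are illustrative assumptions, and the application is called directly here rather than through a real Web server.

```python
from wsgiref.util import setup_testing_defaults

# Assumption: a fixed VoiceXML 2.0 page with one spoken prompt.
VXML_PAGE = """<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <prompt>{message}</prompt>
    </block>
  </form>
</vxml>
"""


def voice_app(environ, start_response):
    """Hypothetical WSGI application answering every request with VoiceXML."""
    body = VXML_PAGE.format(
        message="Your share price service is ready."
    ).encode("utf-8")
    start_response("200 OK", [
        # Registered media type for VoiceXML documents.
        ("Content-Type", "application/voicexml+xml; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


# Call the application directly, the way a WSGI server would.
environ = {}
setup_testing_defaults(environ)
captured = {}


def start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)


response_body = b"".join(voice_app(environ, start_response))
print(captured["status"], captured["headers"]["Content-Type"])
```

The only visible differences from serving a Web page are the markup and the content type; the server-side tooling stays the same.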
Special features that delight developers
Voice portals are among the most popular VoiceXML applications.
These are seeing ever more widespread use in the areas of customer
services such as automated human-machine voice dialogues (help desk,
support) and information services (share prices, sports results). But
VoiceXML can do much more. A case in point: it lets users access
intranets via voice commands, or it can be used to create a
"reminder" service to alert users to upcoming events,
appointments, etc. Notably, there is a strict division between
applications that run on a standard Web server and voice dialogues
provided by a telephony server. This opens up an entirely new
business opportunity: "voice service providing". What makes this such
an attractive business proposition is that developers need not buy or
maintain additional hardware and software, and are instead free to
direct all their attention towards developing and rolling out
telephone voice services. Another area with
a promising future is consumer applications, where voice recognition
will eventually bring unrivalled ease of use and convenience to
consumers.
The architecture of VoiceXML
Every VoiceXML solution is based on several components:
- An application server. Generally this is the Web server on which
all the applications and databases run. It can also serve as an
interface to external databases or transaction servers.
- A VoiceXML telephony server. The VoiceXML Interpreter runs on
this platform. Acting as an interface to the calling client, it
translates all VoiceXML dialogues, natural language, and commands,
ensuring intelligible communication between the most diverse
VoiceXML-enabled end devices.
- A data network. This is a TCP/IP-based packet network that
connects the various application servers and telephony servers.
- A telephone network. This may be the public telephone network or
an enterprise's own private network. VoiceXML solutions are also
able to communicate across VoIP (Voice over IP) networks. Users
make their calls using standard telephones.
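
The division of labour among these components can be illustrated with a toy sketch of what the interpreter on the telephony server does with a menu it has fetched from the application server: it matches the caller's recognized speech or key press against the choices and determines which document to request next. This is a simplification under stated assumptions; the function name resolve and the sample menu are hypothetical, and real interpreters add speech recognition, grammars and event handling.

```python
import xml.etree.ElementTree as ET

# Assumption: VoiceXML 2.0 namespace, mapped to the prefix "v" for lookups.
NS = {"v": "http://www.w3.org/2001/vxml"}

# A menu as it might arrive from the application server (placeholder URLs).
MENU = """<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu>
    <prompt>Say share prices or sports results.</prompt>
    <choice dtmf="1" next="http://www.example.com/shares.vxml">share prices</choice>
    <choice dtmf="2" next="http://www.example.com/sports.vxml">sports results</choice>
  </menu>
</vxml>"""


def resolve(menu_xml, caller_input):
    """Map recognized speech or a pressed key to the next document URL.

    Returns None when nothing matches; a real interpreter would raise a
    nomatch event and prompt the caller again.
    """
    root = ET.fromstring(menu_xml)
    wanted = caller_input.strip().lower()
    for choice in root.findall("./v:menu/v:choice", NS):
        phrase = (choice.text or "").strip().lower()
        if wanted in (choice.get("dtmf"), phrase):
            return choice.get("next")
    return None


print(resolve(MENU, "2"))             # key press
print(resolve(MENU, "Share Prices"))  # recognized speech
```

The application logic behind the returned URL stays on the Web server; the telephony side only interprets the dialogue and relays the caller's selection.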
VoiceXML is a powerful language for developing and using
voice-controlled dialogues and commands. It allows developers to
bring together the Internet architecture, tools and technologies of
leading vendors and create innovative new solutions. Thanks to
VoiceXML's high degree of standardization, these solutions may be
created by means of native XML development tools and deployed with
ease in conjunction with databases like Tamino by Software AG. This
affords developers entirely new possibilities for designing products
and solutions for both corporate customers and private consumers.