Listener-Controlled Navigation of VoiceXML Documents
Gopal Gupta
N. Annamalai, H. Reddy
Dept. of Computer Science
UT Dallas
VoiceWEB
VoiceXML: The open-standard language
for serving voice/audio documents
Voice/audio documents can be browsed
using a voice browser with speaker &
microphone or using the regular phone
Voice browser : VoiceXML :: Browser : HTML
VoiceXML (Cont’d)
VoiceXML allows scripts/CGIs etc.
Can take input from the listener via speech
(fill out forms like in HTML)
Used extensively for automated call
handling.
Makes information accessible over (cell) phones.
The next revolution on the WEB.
Problem with VoiceXML
Navigation of the voice document is
completely controlled by the page author
After each dialog (form) the author has to
ask where the listener would like to go next
Listener has absolutely no control over
navigation.
Result: tedium; advanced applications are not possible
Analogy: Scroll vs a book
Our Solution: Voice Anchors
Voice anchors are speech labels that
listeners can place on a dialog.
Listener can return to that dialog later by
uttering that label.
Hard to implement this concept, as free-form speech recognition is not possible.
Need to incorporate it in the voice browser
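The anchor idea above can be pictured as a small lookup table kept by the browser. The sketch below is only an illustration under stated assumptions: the class name `AnchorTable`, its methods, and the dialog ids are hypothetical and are not taken from the authors' system.

```python
class AnchorTable:
    """Maps listener-chosen labels to dialog ids (hypothetical structure)."""

    def __init__(self):
        self._anchors = {}

    def place(self, label, dialog_id):
        # Normalize so "Fares" and "fares" name the same anchor.
        self._anchors[label.strip().lower()] = dialog_id

    def resolve(self, utterance):
        # Return the dialog id for an uttered label, or None if unknown.
        return self._anchors.get(utterance.strip().lower())


# The listener labels the current dialog, then later utters the label.
anchors = AnchorTable()
anchors.place("Fares", "dialog_3")
target = anchors.resolve("fares")
```

A browser built this way would jump to `target` when the uttered label resolves, and otherwise fall back to the author's own navigation.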
Voice Anchors
We have developed a number of methods
for attaching voice anchors.
Most practical method: via spelling
The user can state the anchor as a whole
word and return to the labeled dialog
Can also have default anchors (turning a
scroll into a book).
Can also have a number of default navigation
strategies, e.g., skim section headings first
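A rough sketch of the spelling method and of default anchors follows. It assumes the recognizer can return single letters reliably, and that each dialog carries a heading that can serve as a default anchor. All function names and the sample dialogs are hypothetical.

```python
def label_from_spelling(letters):
    """Join letters recognized one at a time into an anchor label."""
    cleaned = [ch.strip().lower() for ch in letters]
    if not cleaned or not all(len(ch) == 1 and ch.isalpha() for ch in cleaned):
        raise ValueError("expected a non-empty sequence of single letters")
    return "".join(cleaned)


def default_anchors(dialogs):
    """Build default anchors from (dialog_id, heading) pairs."""
    return {heading.strip().lower(): dialog_id for dialog_id, heading in dialogs}


def skim_order(dialogs):
    """Headings in document order, for a 'skim section headings first' pass."""
    return [heading for _, heading in dialogs]


dialogs = [("d1", "Flights"), ("d2", "Hotels"), ("d3", "Car Rental")]
label = label_from_spelling(["H", "o", "t"])
anchors = default_anchors(dialogs)
anchors[label] = "d2"  # listener-placed anchor alongside the defaults
```

Spelling keeps the recognition task small (a letter grammar), which sidesteps the free-form recognition problem; the assembled word can then be treated as one more label in the anchor set.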
Competing Products
None that do this for VoiceXML (at
least none that we are aware of).
The concept of voice anchors has not
been implemented for speech
There are anchors & bookmarks in HTML
AsTeR is a system developed for rendering
LaTeX documents as audio for blind users
Applications of the Technology
Talking books (published in VoiceXML)
Making the WEB accessible to the blind (ADA
compliance); our lab has already
developed an HTML -> VXML converter
Advanced interaction over the phone (e.g.
travel reservation can be done).
Interactive direction giving.
Numerous other applications.
Buyers/Licensees
Makers of voice browsers: IBM + others
Big users of voice applications: Airlines,
Retailers, etc.
Developers of voice applications: Tellme,
BeVocal, InterVoice.
Govt.: make its websites ADA compliant
Publishers of e-books.
Size of the Market
If we have patent protection, market size can
be huge. Potentially millions of dollars (companies
such as InterVoice are quite big)
Voice based WEB: the next revolution.
VoiceXML has revolutionized phone services,
but not the Internet yet. That will happen in the
next 2-3 years
Costs & Commercialization
Currently we have a system developed as
a proxy server; cost to develop: not much
We could incorporate voice anchors in a
voice browser: approx. 1 year effort
Or wrap our system around an existing
voice browser (six months effort)
Time to commercialization: less than 1 yr.
Interested Companies
We haven’t disclosed it to any company
yet. The idea can be easily copied.
Our ideas are very simple. This simplicity
is the source of power of our ideas.
Potentially Interested Cos.
Computer & Communication Cos: IBM,
Motorola, Lucent, Verizon, HP
Voice Solutions Cos: InterVoice
Critically important to secure a patent,
since once the idea is known it is easily
implementable.
Best strategy: license the technology