VoiceXML Implementation: voxBuilder vs. OpenVXI

VoiceXML Overview
– Provides an automated user interface to web-like content via telephone or VoIP
– Uses synthesized speech or pre-recorded audio as output
– Recognition of spoken words and DTMF keys serves as input
– Control mechanisms allow for serial access to data

“Hello World” Example

    <?xml version="1.0"?>
    <vxml version="1.0">
      <form>
        <block>Hello World!</block>
      </form>
    </vxml>

Form tag and child tags
● Like a paper form, each field must be filled out before accessing the next
● Field, child of form, supplies a location for the user response
● Grammar, child of field, specifies the expected audio response
● Prompt, child of field, asks the user for input
● Filled, child of field, executes when user input matches the grammar

Form Example
This form repeats back to the user any number between 0 and 9999 that the user says.

    <form id="hello_form">
      <field name="first">
        <grammar> NATURAL_NUMBER_THRU_9999 </grammar>
        <prompt>
          <audio>Say a number.</audio>
        </prompt>
        <filled>
          <audio> You said </audio>
          <audio><value expr="first"/></audio>
        </filled>
      </field>
    </form>

System Overview
(Diagram components, in the order shown)
● Web Server
● HTML and VXML Content
● IP Network
● VoiceXML Interpreter
● HTML Scraper
● Synthesis & Recognition
● Telephony Services

OpenVXI
● Advantages
– Open source, for easy modification and verification of code
– Dedicated use of a server may allow for better performance
– Running on your own server allows explicit and direct control
– Allows the user's choice of speech synthesis and voice recognition packages, telephony integration, and operating system
● Other notes
– Tested with a number of telephony and speech APIs, including JTAPI, TAPI, JSAPI, SAPI and Sphinx III
– Expects Natural Language Semantic Markup Language (NLSML) as a recognition response
– Supports VXML v1.0 and v2.0
– Does not explicitly require certain telephony API calls, such as one for the source number
● Performance
– Using a separate server for speech synthesis and recognition speeds processing
– Caching remote documents and scripts can improve response time in VXML parsing

voxBuilder
● Remotely hosted, Internet-based VoiceXML service provider
– Provides all required equipment and software for adding VoiceXML to an existing Web presence
– Offers hosting for VXML, scripts and audio; these may optionally be hosted remotely instead
– Provides a Web interface for all development and management
– Multiple project support
● Focused on European deployment
– Access to the European phone system
– Multilingual support: can prompt in one language and accept responses in another
– 13 different languages supported
– Integrated phone management for different billing rates

Comparing voxBuilder and OpenVXI
● For small projects, particularly multi-country, multi-language projects, voxBuilder is the right choice
– Easy, quick deployment on a proven platform with little initial cash outlay and minimal effort
● For large corporations interested in high volumes or more precise control, OpenVXI works best
– Your choice of text-to-speech and recognition software, and your choice of setup for optimal performance under heavy load

Information Retrieval and VXML
● Broadens the IR searching toolkit
– Typical web browsing involves a large display, which is not practical for phones or a small PDA
– Viewing a display while driving is not recommended
– A typical web page needs a high-speed connection to download its images
● Adds a layer of difficulty for search engine designers
– Current voice recognition technology requires either a limited grammar set or a long training period
– A search that can involve any word (including foreign words) is not possible today
– Future improvements in voice recognition algorithms could alleviate this problem
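
The limited-grammar constraint noted above can be illustrated with a minimal Python sketch. It is not how any particular recognizer or VoiceXML platform works internally. The vocabulary `SEARCH_GRAMMAR` and the helper `match_grammar` are hypothetical names introduced only for illustration, and the input is assumed to be text already produced by a recognizer.

```python
from typing import Optional, Set

# Hypothetical closed vocabulary, standing in for a field's grammar.
SEARCH_GRAMMAR: Set[str] = {"weather", "news", "sports", "stock quotes"}


def match_grammar(utterance: str, grammar: Set[str]) -> Optional[str]:
    """Return the grammar phrase the utterance matches, or None if it is out of grammar."""
    # Normalize case and collapse whitespace before the lookup.
    normalized = " ".join(utterance.lower().split())
    return normalized if normalized in grammar else None


if __name__ == "__main__":
    for heard in ["Weather", "stock   quotes", "Zeitgeist"]:
        result = match_grammar(heard, SEARCH_GRAMMAR)
        print(f"{heard!r}: {result if result is not None else 'out of grammar'}")
```

Any word outside the fixed vocabulary, such as a foreign word, is simply rejected. This is why an open-ended voice search is harder to build than a menu of known choices.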