A Document Skimmer
Overcoming the soda-straw effect
Alex Krstic, Kelly Van Busum, Suzanne Vogel

Outline
- Problem Overview
- Prior Work (briefly)
- Our Work
- Demo
- Study
- Follow up

Overview: Problem
- Listening is slower than reading, but speeding up decreases comprehension
- Speed up only by increasing reading rate, with NO scanning or skimming
- Skip ahead only by one line or one page

Overview: Goal
- Identify features to increase speed
- Enable the user to adjust these features
- Trade off speed and comprehension

Prior Work: Features
- Scan at levels of detail (LODs): Speech Skimmer [1] & Aster [2]
- Skip 1 segment within a level: Speech Skimmer [1]

Refs
1. Speech Skimmer (Arons, 1993)
2. Aster (Raman, 1994)

Prior Work: Implementation
- Segment document, semantically
  - Speech divisions: Long pauses [1]
  - Text divisions: Structure boundaries [2]
- Filter out words or sounds within segments
  - Spaces [1]
  - Latter P number of words or seconds [1]
  - Detailed (lower-level) info [2]

Our Work: Features
- Hierarchy
- Dropping Words/Phonemes
- Spatial Sound

Our Work: LOD Hierarchy

Our Work: Dropping Words/Sounds
- Dropping common words
- Change text to phonemes
- Remove phonemes without lexical stress
  - toz, suhn
  - computing -> mpyootng
- Blending phonemes (Drop spaces)
  - what up -> whuhtuhp

Our Work: Spatial Sound
- Hearing more than one sound source at the same time
  - 2, 3 or 4
- Each source plays different segments of the file
- Some sources dominant over the others
- Spatial orientation

Our Work: Screenshot
Copyright 2003, ASK (Alex, Suzanne, Kelly)

User Evaluations
- 3 informal, 4 systematic
- Asked questions, navigate to answer
- Hear text in various forms, then asked questions

User Evaluations, 2
- Hierarchy
  - Difficult to explain "hierarchy concept", underused
- Sound (Word) Removal
  - Removing common words was liked (29% of words)
  - Either really liked or hated phonemes (29%, 10%)
- Spatial Sound
  - 2 sounds worked ok, 3 or more didn't
*Lots of different perspectives!

New Questions…
- How much does voice selection matter?
- How much would training help?
- What is the relationship between phonemes and speed?
- What is the role of prior knowledge?
- How does this relate to Ctrl-F?

Acknowledgements
- Peter Parente
  - Pointed us to programming resources (BATS; wxPython, Python Numeric 22.0, Win32 libraries)
  - Gave us Python sample code for speech synthesis and spatial sound
- Experiment participants (Informed consent requires confidentiality)

Programming Resources
- BATS NCDemo – http://www.sourceforge.net
  - OpenAL.dll, MSVRTD.dll, pyTTS.py, pyOpenAL.py (I think)
- Python – http://www.python.org/
- Win32 library for Python – http://starship.python.net/crew/mhammond/
- Python Numeric 22.0 library – http://www.pfdubois.com/numpy/
- wxPython GUI library – http://www.wxpython.org/
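
Sketch: Dropping common words

The "Dropping common words" feature could look roughly like the Python below. This is only a minimal sketch, not the project's actual code: the stop-word list `COMMON_WORDS` and the function name `drop_common_words` are hypothetical, since the slides do not say which words were dropped or how the text was tokenized. It also reports the fraction of words removed, the kind of figure quoted in the user evaluations.

```python
import re

# Hypothetical stop-word list; the slides do not name the words that were dropped.
COMMON_WORDS = frozenset(
    {"a", "an", "the", "of", "to", "in", "is", "and", "but", "than"}
)


def drop_common_words(text, common_words=COMMON_WORDS):
    """Return (kept_words, fraction_dropped) for a piece of text."""
    # Split into word tokens, ignoring punctuation.
    words = re.findall(r"[A-Za-z0-9']+", text)
    # Keep only the words that are not in the common-word list.
    kept = [w for w in words if w.lower() not in common_words]
    dropped = len(words) - len(kept)
    fraction = dropped / len(words) if words else 0.0
    return kept, fraction


kept, fraction = drop_common_words(
    "Listening is slower than reading, but speeding up decreases comprehension"
)
print(" ".join(kept))
print(f"{fraction:.0%} of words dropped")
```

The kept words would then be handed to the speech synthesizer in place of the full sentence. A larger word list shortens the spoken text further, at a likely cost in comprehension.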