A Document Skimmer
Overcoming the soda-straw effect
Alex Krstic, Kelly Van Busum, Suzanne Vogel

Outline
- Problem Overview
- Prior Work (briefly)
- Our Work
- Demo
- Study
- Follow up

Overview: Problem
- Listening is slower than reading, but speeding up decreases comprehension
- Speed up only by increasing reading rate, with NO scanning or skimming
- Skip ahead only by one line or one page

Overview: Goal
- Identify features to increase speed
- Enable the user to adjust these features
- Trade off speed and comprehension

Prior Work: Features
- Scan at levels of detail (LODs)
- Skip 1 segment within a level
- Speech Skimmer [1] & Aster [2]

Speech Skimmer [1]

Refs
1. Speech Skimmer (Arons, 1993)
2. Aster (Raman, 1994)

Prior Work: Implementation
- Segment document, semantically
  - Speech divisions: Long pauses [1]
  - Text divisions: Structure boundaries [2]
- Filter out words or sounds within segments
  - Spaces [1]
  - Latter P number of words or seconds [1]
  - Detailed (lower-level) info [2]

Our Work: Features
- Hierarchy
- Dropping Words/Phonemes
- Spatial Sound

Our Work: LOD Hierarchy

Our Work: Dropping Words/Sounds
- Dropping common words
- Change text to phonemes
  - Remove phonemes without lexical stress
    - toz, suhn
    - computing → mpyootng
  - Blending phonemes (Drop spaces)
    - what up → whuhtuhp

Our Work: Spatial Sound
- Hearing more than one sound source at the same time
  - 2, 3 or 4
  - Each source plays different segments of the file
  - Some sources dominant over the others
  - Spatial orientation

Our Work: Screenshot
Copyright 2003, ASK (Alex, Suzanne, Kelly)

User Evaluations
- 3 informal, 4 systematic
- Asked questions, navigate to answer
- Hear text in various forms, then asked questions

User Evaluations, 2
- Hierarchy
  - Difficult to explain “hierarchy concept”, underused
- Sound (Word) Removal
  - Removing common words was liked (29% of words)
  - Either really liked or hated phonemes (29%, 10%)
- Spatial Sound
  - 2 sounds worked ok, 3 or more didn’t
* Lots of different perspectives!

New Questions…
- How much does voice selection matter?
- How much would training help?
- What is the relationship between phonemes and speed?
- What is the role of prior knowledge?
- How does this relate to Ctrl-F?

Acknowledgements
- Peter Parente
  - Pointed us to programming resources (BATS; wxPython, Python Numeric 22.0, Win32 libraries)
  - Gave us Python sample code for speech synthesis and spatial sound
- Experiment participants
  - (Informed consent requires confidentiality)

Programming Resources
- BATS NCDemo – http://www.sourceforge.net
  - OpenAL.dll, MSVRTD.dll, pyTTS.py, pyOpenAL.py (I think)
- Python – http://www.python.org/
- Win32 library for Python – http://starship.python.net/crew/mhammond/
- Python Numeric 22.0 library – http://www.pfdubois.com/numpy/
- wxPython GUI library – http://www.wxpython.org/
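
Sketch: Dropping Words/Sounds

The text filters listed under "Our Work: Dropping Words/Sounds" could look roughly like the sketch below. It is an illustration only, not the authors' code: the stop-word list COMMON_WORDS and the function names drop_common_words, blend_phonemes and fraction_removed are hypothetical, and the slides do not say which words counted as "common". The phoneme step assumes the text has already been converted to a space-separated phoneme spelling by some other tool.

```python
import re

# Hypothetical stop-word list; the slides do not say which words were "common".
COMMON_WORDS = frozenset(
    {"a", "an", "the", "of", "to", "and", "in", "is", "it", "that"}
)


def drop_common_words(text, common=COMMON_WORDS):
    """Return text with common words removed, keeping the original word order."""
    kept = []
    for word in text.split():
        # Compare on a lowercase copy with punctuation stripped.
        bare = re.sub(r"\W", "", word).lower()
        if bare not in common:
            kept.append(word)
    return " ".join(kept)


def blend_phonemes(phoneme_text):
    """Drop the spaces between phoneme spellings so they are spoken as one run."""
    return "".join(phoneme_text.split())


def fraction_removed(original, filtered):
    """Fraction of whitespace-separated words that the filter removed."""
    total = len(original.split())
    if total == 0:
        return 0.0
    return 1 - len(filtered.split()) / total


if __name__ == "__main__":
    sentence = "The cat sat in the hat."
    shortened = drop_common_words(sentence)
    print(shortened)
    print(fraction_removed(sentence, shortened))
    print(blend_phonemes("whuht uhp"))
```

The fraction_removed helper measures the kind of figure quoted in the evaluation slide (share of words removed); the value it returns depends entirely on the stop-word list chosen.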
