Download 19-Audition

Motor Theory Remnants
April 3, 2012

Dirty Work
• Project Reports #5 to turn in.
• On Thursday, we’ll talk about the muscles that control articulation…
• And do a slightly messy static palatography demo
• At the end of today, we’ll do the USRI evaluations.

Another Piece of the Puzzle
• Another interesting finding which has been used to argue for the “speech is special” theory is duplex perception.
• Take an isolated F3 transition:
• and present it to one ear…

Do the Edges First!
• While presenting this spectral frame to the other ear:

Two Birds with One Spectrogram
• The resulting combo is perceived in duplex fashion:
  • One ear hears the F3 “chirp”;
  • The other ear hears the combined stimulus as “da”.

Duplex Interpretation
• Check out the spectrograms in Praat.
• Mann and Liberman (1983) found:
  • Discrimination of the F3 chirps is gradient when they’re in isolation…
  • but categorical when combined with the spectral frame.
  • (Compare with the F3 discrimination experiment with Japanese and American listeners)
• Interpretation: the “special” speech processor puts the two pieces of the spectrogram together.

fMRI data
• Benson et al. (2001)
• Non-speech stimuli = notes, chords, and chord progressions on a piano

fMRI data
• Benson et al. (2001)
• Difference in activation for natural speech stimuli versus activation for sinewave speech stimuli

Mirror Neurons
• In the 1990s, researchers in Italy discovered what they called mirror neurons in the brains of macaques.
• Macaques had been trained to make grasping motions with their hands.
• Researchers recorded the activity of single neurons while the monkeys were making these motions.
• Serendipity:
  • the same neurons fired when the monkeys saw the researchers making grasping motions.
• → a neurological link between perception and action.
• Motor theory claim: the same links exist in the human brain, for the perception of speech gestures

Moving On…
• One important lesson to take from the motor theory perspective is:
• The dynamics of speech are generally more important to perception than static acoustic cues.
• Note: visual chimerism and March Madness.

Auditory Chimeras
• Speech waveform + music spectrum: frequency bands 1 2 4 8 16 32
• Music waveform + speech spectrum: frequency bands 1 2 4 8 16 32
Originals:
Source: http://research.meei.harvard.edu/chimera/chimera_demos.html

Auditory Chimeras
• Speech1 waveform + speech2 spectrum: frequency bands 1 2 4 6 8 16
• Speech2 waveform + speech1 spectrum: frequency bands 1 2 4 6 8 16
Originals:

Motor Theory, in a nutshell
• The big idea:
  • We perceive speech as abstract “gestures”, not sounds.
• Evidence:
  1. The perceptual interpretation of speech differs radically from the acoustic organization of speech sounds
  2. Speech perception is multi-modal
  3. Direct (visual, tactile) information about gestures can influence/override indirect (acoustic) speech cues
  4. Limited top-down access to the primary, acoustic elements of speech

Audition (or, how we hear things)
April 3, 2012

How Do We Hear?
• The ear is the organ of hearing. It converts sound waves into electrical signals in the brain.
  • the process of “audition”
• The ear has three parts:
  • The Outer Ear
    • sound is represented acoustically (in the air)
  • The Middle Ear
    • sound is represented mechanically (in solid bone)
  • The Inner Ear
    • sound is represented in a liquid

The Ear

Outer Ear Fun Facts
• The pinna, or auricle, is a bit more receptive to sounds from the front than sounds from the back.
  • It functions primarily as “an earring holder”.
• Sound travels down the ear canal, or auditory meatus.
  • Length ≈ 2 - 2.5 cm
  • Sounds between ≈ 3500-4000 Hz resonate in the ear canal
• The tragus protects the opening to the ear canal.
  • Optionally provides loudness protection.
• The outer ear dead-ends at the eardrum, or tympanic membrane.

The Middle Ear
[Figure labels: the hammer (malleus), the anvil (incus), the stirrup (stapes), eardrum]

The Middle Ear
• The bones of the middle ear are known as the ossicles.
• They function primarily as an amplifier.
  • = increase sound pressure by about 20-25 dB
• Works by focusing sound vibrations into a smaller area
  • area of eardrum = .55 cm²
  • area of footplate of stapes = .032 cm²
• Think of a thumbtack...

Concentration
• Pressure (on any given area) = Force / Area
• Pushing on a cylinder provides no gain in pressure at the other end...
  • Areas are equal on both sides.
• Pushing on a thumbtack provides a gain in pressure equal to A1 / A2.
• For the middle ear, pressure gain ≈ .55 / .032 ≈ 17

Leverage
• The middle ear also exerts a lever action on the inner ear.
• Think of a crowbar...
• Force difference is proportional to the ratio of handle length to end length.
• For the middle ear:
  • malleus length / stapes length
  • ratio ≈ 1.3

Conversions
• Total amplification of middle ear ≈ 17 * 1.3 ≈ 22
  • increases sound pressure by 20 - 25 dB
• Note: people who have lost their middle ear bones can still hear...
  • With a 20-25 dB loss in sensitivity.
  • (Fluid in inner ear absorbs 99.9% of acoustic energy)
• For loud sounds (> 85-90 dB), a reflex kicks in to attenuate the vibrations of the middle ear.
  • this helps prevent damage to the inner ear.

The Attenuation Reflex
• Requires 50-100 msec of reaction time.
• Poorly attenuates sudden loud noises
• Muscles fatigue after 15 minutes or so
• Also triggered by speaking
[Figure labels: tensor tympani, stapedius]

The Inner Ear
• In the inner ear there is a snail-shaped structure called the cochlea.
• The cochlea:
  • is filled with fluid
  • consists of several different membranes
  • terminates in membranes called the oval window and the round window.

Cochlea Cross-Section
• The inside of the cochlea is divided into three sections.
• In the middle of them all is the basilar membrane.
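Looking back at the middle-ear numbers (the Concentration, Leverage, and Conversions slides), the arithmetic can be checked with a short sketch. The areas and the lever ratio come from the slides; converting the resulting ratio to decibels with 20 · log10 treats it as a sound-pressure ratio, which is an assumption on my part, and the names used here are illustrative only.

```python
import math

# Values quoted on the slides.
EARDRUM_AREA_CM2 = 0.55
STAPES_FOOTPLATE_AREA_CM2 = 0.032
LEVER_RATIO = 1.3  # malleus length / stapes length


def pressure_gain_db(ratio):
    """Convert a pressure ratio to decibels (assumes 20 * log10 applies)."""
    return 20.0 * math.log10(ratio)


# "Thumbtack" effect: the same force concentrated onto a smaller area.
area_gain = EARDRUM_AREA_CM2 / STAPES_FOOTPLATE_AREA_CM2

# Combined with the "crowbar" lever action of the ossicles.
total_gain = area_gain * LEVER_RATIO

print(f"area gain   = {area_gain:.1f}")
print(f"total gain  = {total_gain:.1f}")
print(f"area only   = {pressure_gain_db(area_gain):.1f} dB")
print(f"area+lever  = {pressure_gain_db(total_gain):.1f} dB")
```

Under these assumptions the area ratio alone works out to just under 25 dB and the combined figure to about 27 dB, slightly above the 20-25 dB range quoted on the slides. The sketch is idealized and models no losses, so it should be read as a sanity check rather than a measurement.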

Contact
• On top of the basilar membrane are rows of hair cells.
• We have about 3,500 “inner” hair cells...
• and 15,000-20,000 “outer” hair cells.

How does it work?
• On top of each hair cell is a set of about 100 tiny hairs (stereocilia).
• Upward motion of the basilar membrane pushes these hairs into the tectorial membrane.
• The deflection of the hairs opens up channels in the hair cells.
  • ...allowing the electrically charged endolymph to flow into them.
• This sends a neurochemical signal to the brain.

An Auditory Fourier Analysis
• Individual hair cells in the cochlea respond best to particular frequencies.
• General limits: 20 Hz - 20,000 Hz
• Cells at the base respond to high frequencies;
• Cells at the apex respond to low.
[Figure: tonotopic organization of the cochlea]

How does this work?
• Hermann von Helmholtz (again!) first proposed the place theory of cochlear organization.
  • Original idea: one hair cell for each frequency.
  • a.k.a. the “resonance theory”
• But...we can perceive more frequencies than we have hair cells for.
• The rate theory emerged as an alternative:
  • Frequency of cell firing encodes frequencies in the acoustic signal.
  • a.k.a. the “frequency theory”
• Problem: cell firing rate is limited to 1000 Hz...

Synthesis
• The volley theory attempted to salvage the rate theory proposal.
• Idea: frequencies higher than 1000 Hz are “volleyed” back and forth between individual hair cells.
• There is apparently considerable evidence for this proposal.

Traveling Waves (in the ear!)
• Last but not least, there is the traveling wave theory.
• Idea: waves of different frequencies travel to a different extent along the cochlea.
• Like wavelength:
  • Higher frequency waves are shorter
  • Lower frequency waves are longer

The Traveling Upshot
• Lower frequency waves travel the length of the cochlea...
• but higher frequencies cut off after a short distance.
• All cells respond to lower frequencies (to some extent),
• but fewer cells respond to high frequency waves.
• Individual hair cells thus function like low-pass filters.

Hair Cell Bandwidth
• Each hair cell responds to a range of frequencies, centered around an optimal characteristic frequency.

Frequency Perception
• In reality, there is (unfortunately?) more than one truth:
  • Place-encoding (traveling wave theory) is probably more important for frequencies above 1000 Hz;
  • Rate-encoding (volley theory) is probably more important for frequencies below 1000 Hz.
• Interestingly, perception of frequencies above 1000 Hz is much less precise than perception of frequencies below 1000 Hz.
• Match this tone:
• To the tone that is twice the frequency:

Higher Up
• Now try it with this tone:
• Compared to these tones:
• Idea: listeners interpret pitch differences as (absolute) distances between hair cells in the cochlea.
• Perceived pitch is expressed in units called mels.
  • Twice the number of mels = twice as high a perceived pitch.
• Mels = 1127.01048 * ln(1 + F/700)
  • where acoustic frequency (F) is expressed in Hertz.

The Mel Scale

Equal Loudness Curves
• Perceived loudness also depends on frequency.
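
The mel formula from the Higher Up slide, Mels = 1127.01048 * ln(1 + F/700), can be tried out with a minimal sketch. The forward conversion is taken directly from the slide; the inverse function is simply that formula solved for F and is not given in the slides. Function names are illustrative.

```python
import math

MEL_SCALE_FACTOR = 1127.01048  # constant from the slide
MEL_BREAK_HZ = 700.0           # constant from the slide


def hz_to_mel(f_hz):
    """Mels = 1127.01048 * ln(1 + F/700), with F in Hertz."""
    return MEL_SCALE_FACTOR * math.log(1.0 + f_hz / MEL_BREAK_HZ)


def mel_to_hz(mel):
    """Inverse of hz_to_mel (derived algebraically, not from the slides)."""
    return MEL_BREAK_HZ * (math.exp(mel / MEL_SCALE_FACTOR) - 1.0)


for f in (250.0, 500.0, 1000.0, 2000.0, 4000.0):
    print(f"{f:7.0f} Hz -> {hz_to_mel(f):7.1f} mels")

# Frequency that should sound "twice as high" as 1000 Hz under this formula.
print(f"{mel_to_hz(2.0 * hz_to_mel(1000.0)):.0f} Hz")
```

With this formula 1000 Hz maps to almost exactly 1000 mels, and doubling the mel value from there lands near 3430 Hz rather than 2000 Hz, which illustrates the point that doubling acoustic frequency does not double perceived pitch.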
