What do we hear for?
Seeing is knowing what is where by looking (David Marr)
Seeing is predicting what is where, verified by looking, in order to drink that cup of coffee (Reza Shadmehr)
Hearing is predicting what will happen next, verified by listening, in order to know as much as possible about what’s out there (Eli Nelken)

Even simple sounds tell stories

A stupid story

The calm of the sea
Vox balaenae (Voice of the whale), for flute, cello and piano (cello and piano playing)
George Crumb

A shout of despair
Wozzeck, orchestral transition between scenes 2 and 3 of act 3
Alban Berg

Auditory worlds
• What are sounds?
• What do we hear?
• How do we hear?

Sound As a Pressure Wave
Vibrations of objects set up pressure waves in the surrounding air. The “elastic” property of air allows these pressure waves to propagate (spread).

Structure of sounds
What happens without structure?
Introducing structure
The bird and Chopin
© Gabriel J. Arsante

What are sounds?
• Structure at many time scales
• Perceptual correlates:
  – Melodies (1 s)
  – Notes (0.1 s)
  – Pitch (much faster than 0.01 s)

Peripheral processing of sounds
[Figure: the ear, with labels Outer Ear, Middle Ear, Inner Ear]

Cross Section of Cochlea

“Travelling Wave” Along the Basilar Membrane (Von Békésy)

Travelling Wave Peaks at Different Locations As the Frequency Changes

Inner Hair Cells, Outer Hair Cells

A simple neuron in the auditory system
[Figure label: BF]

The auditory pathways

Responses of simple neurons to complex sounds
A set of complex sounds
[Figure labels: Orig, Slow]

In consequence…

The neurogram
We get a very rich and precise representation of the incoming sound at the level of the auditory nerve

The sound and its components
[Labels: full, 337, 600, 2000]
Brahms, Geistliches Wiegenlied Op. 91 no. 2
Kathleen Ferrier, Phyllis Spurr, Max Gilbert

Is that enough?
(do we hear the spectrogram?)

What are the perceptual qualities of sounds?
“The basic elements of any sound are loudness, pitch, contour, duration (or rhythm), tempo, timbre, spatial location, and reverberation.”
(D.J. Levitin, This is Your Brain on Music: The Science of a Human Obsession, p.14)

The Long Road from Spectrogram to Perception
• How do we go from the ‘neurogram’ to ‘loudness, pitch, contour, duration (or rhythm), tempo, timbre, spatial location, and reverberation’?

Relationships with low-level features…
• Loudness with sound intensity
  – Encoded by some population-averaged activity
• Pitch with periodicity
  – Periodicity IS NOT frequency!
• Contour with slow amplitude modulations
  – Encoded in the range of 1-10 Hz very clearly at the level of A1 (e.g. Shamma and collaborators)
  – But not slower than that (probably)
• Duration/rhythm with ???
• Tempo with ???
• Timbre with spatial activation patterns (e.g. in A1)
• Spatial location with ITD/ILD/spectral activation patterns
  – Low-level information available at the CN/SOC
  – But requires integration
• Reverberation with ???????

Pitch: examples
[Figure: pure tones, filtered clicks, SAM (AM, 3 kHz) and IRN (iterated ripple noise), plotted against time]

The Long Road from Spectrogram to Perception
• Pitch, timbre, phonemic identity, and so on are ‘separable’: they are independent of each other
• They represent high-level generalizations
  – Many different sounds have the same pitch (violin and trumpet), same timbre (trumpet on two different tones), same phonemic identity (two different people talking)
  – The neurograms of these pairs of sounds are very different from each other
• The generalizations should be derivable from the neurogram, but are not explicitly represented at that level

The Long Road from Spectrogram to Perception
Problem no. 1: we do not hear the physics of sounds, but rather their derived properties
(Reverse hierarchies: we perceive high representation levels unless we make serious efforts to go down into the details)

The Long Road from Spectrogram to Perception
Problem no. 2: In natural conditions, sounds rarely occur by themselves
We have to group and segregate ‘bits of sounds’ in order to form representations of ‘auditory objects’

What comes first, the sound or its properties?
• We may need to start by forming objects (solve problem no. 2) and only later assign properties to them (solve problem no. 1)
Hypothesis: the early auditory system (presumably up to the level of primary auditory cortex) deals with the formation of auditory objects

Evidence A: Object representation in primary auditory cortex

The auditory pathways
Primary auditory cortex is a higher brain area!
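
A side note on the earlier claim that periodicity is not frequency (under "Relationships with low-level features"): the sketch below builds a "missing fundamental" stimulus, a standard textbook demonstration that is not taken from these slides. It sums harmonics 3, 4 and 5 of 200 Hz, so the spectrum has no energy at 200 Hz, yet the waveform still repeats every 5 ms. The sampling rate, the 200 Hz repetition rate, the choice of harmonics and the autocorrelation-based estimate are all assumptions made for illustration; nothing here is meant as a model of how the auditory system extracts pitch.

```python
import numpy as np

fs = 16000                 # sampling rate in Hz (assumed for this sketch)
f0 = 200.0                 # repetition rate of the waveform in Hz (assumed)
n_samples = fs // 5        # 0.2 s of signal
t = np.arange(n_samples) / fs

# Harmonic complex built from harmonics 3, 4 and 5 only (600, 800, 1000 Hz).
# There is deliberately no component at f0 itself.
x = sum(np.sin(2 * np.pi * k * f0 * t) for k in (3, 4, 5))

# Magnitude spectrum: which frequencies carry energy?
spectrum = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(n_samples, d=1 / fs)


def magnitude_at(freq_hz):
    """Spectral magnitude at the FFT bin closest to freq_hz."""
    return spectrum[np.argmin(np.abs(freqs - freq_hz))]


# Autocorrelation: at which lag does the waveform repeat?
acf = np.correlate(x, x, mode="full")[n_samples - 1:]   # lags 0 .. n_samples-1

# Search lags corresponding to periodicities between 50 and 500 Hz.
min_lag, max_lag = fs // 500, fs // 50
best_lag = min_lag + int(np.argmax(acf[min_lag:max_lag + 1]))
estimated_periodicity = fs / best_lag

print("magnitude at 200 Hz:", magnitude_at(200.0))
print("magnitude at 600 Hz:", magnitude_at(600.0))
print("estimated periodicity (Hz):", estimated_periodicity)
```

The spectrum shows energy only at 600, 800 and 1000 Hz, while the strongest autocorrelation peak in the search range sits at a lag of 5 ms, that is, a periodicity of 200 Hz. A description of the sound in terms of its frequency content alone would therefore miss the property that the slides associate with pitch.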
Visual system: Photoreceptors, Bipolar cells, Retinal ganglion cells, LGN, V1, IT
Auditory system: Hair cells, Auditory nerve fibers, Cochlear nucleus, Superior Olive, Inferior Colliculus, MGB, Auditory cortex
[Figure annotations: Frequency detection; Localization and binaural; Species-specific calls?; Auditory scene analysis?; Face cells]

The auditory pathways

A1 Neurons have a large variety of frequency response areas (FRAs)

Memory in primary auditory cortex

Neurons in auditory cortex represent the weak components of sounds
(evidence for the representation of auditory objects in primary auditory cortex)

Strong effects of weak backgrounds…
[Figure: axes in dB Attn, kHz and ms]

Some cortical neurons respond to weak noise in mixture with high-level tones

Tones in modulated and unmodulated background

Weak tones in strong noise
Noise (bandwidth: BF, 10 Hz trapezoidal envelope)
Tone (BF)
Tone+Noise
Las et al. 2005

Responses to high-level tones in silence and to low-level tones in noise are similar

Evidence B: coding of surprising events in primary auditory cortex
[Figure: tone sequences over time; low- and high-frequency tones presented with probabilities of 95%, 50% and 5%; responses labelled Standard and Deviant (values 0.34, 0.32)]
SSA = 0.23

…Also with spikes…

Evidence C: Perceptual qualities such as pitch are coded outside primary auditory cortex
Activation of auditory cortex by noise and pitched stimuli
Activation by intelligible speech

Take-home messages
• Auditory perception is far removed from the ‘physical’, low-level representation of sounds
• A major problem of early processing is the definition of the ‘objects’ to which properties will be assigned
• There is evidence that objects are defined first, and that properties are assigned in higher brain areas

Reverse Hierarchy Theory
• The hierarchical trade-offs that dictate the relations between processing and perception
• We perceive the high-order constructs rather than the low-level physics

Interactions between high- and low-level representations
From Hochstein and Ahissar 2002

Change blindness

Name the color of the letters
נשר (eagle)  אדום (red)  כחול (blue)

Visual Reverse Hierarchy Theory (RHT)
(Ahissar & Hochstein, 1997; Hochstein & Ahissar, 2002)

Phonological/semantic level: day, bay, dream, ……
Low levels are sensitive to fine temporal cues, at μs resolution
Initial perception is based on high levels, which represent phonological entities
See: Nahum, Nelken and Ahissar, PLoS 2008

We can either hear the sounds or understand the words, but not both at the same time
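
A side note on the SSA value quoted under Evidence B: the slides give the number without a definition. One contrast index commonly used for stimulus-specific adaptation compares the response to a tone when it is the rare deviant, d, with the response to the same tone when it is the common standard, s, as (d − s)/(d + s). Whether the value on the slide was computed this way is an assumption. The function name `ssa_index` and the response values below are hypothetical and are not data from the talk.

```python
def ssa_index(deviant_response, standard_response):
    """Contrast index (d - s) / (d + s).

    Assumed definition, not confirmed by the slides. The index is 0 when a
    tone evokes the same response as deviant and as standard, and approaches
    1 when the response to the standard is almost completely adapted.
    """
    total = deviant_response + standard_response
    if total == 0:
        raise ValueError("no response in either condition; index is undefined")
    return (deviant_response - standard_response) / total


# Hypothetical mean responses (for example, spikes per trial); not real data.
hypothetical_deviant = 12.0
hypothetical_standard = 6.0

index = ssa_index(hypothetical_deviant, hypothetical_standard)
print("SSA index for the hypothetical responses:", index)
```

With this definition a positive index means that the same tone evokes a larger response when it is rare than when it is common, which is the sense in which the slide heading speaks of coding surprising events.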