
A brief course on … Auditory Perception of Amplitude and Frequency Modulations in Sounds

Christian Lorenzi
Laboratoire des Systèmes Perceptifs, UMR CNRS 8248
Dépt d’Etudes Cognitives, Institut d’Etude de la Cognition
Ecole normale supérieure, Paris, PSL Research University

Overall plan of this presentation
• Short introduction
• Assessment of modulation perception in humans
  - Characterizing deficits in modulation perception
  - Auditory mechanisms of modulation perception
• Role of temporal modulations in sound recognition & auditory scene analysis
  - Effects of cochlear damage on the perception of speech modulation cues
• Conclusions

A short introduction to the study of auditory perception of temporal modulations in sounds …

The ear is a frequency analyzer
[Diagram: peripheral and central auditory system, with labels for the outer ear, middle ear, inner ear, basilar membrane, hair cells, cochlear filters, auditory nerve fibers, brainstem, and auditory cortex.]
[Figure: a steady sound and its excitation pattern (the ‘internal power spectrum’).]
The (peripheral) auditory system (the basilar membrane in the cochlea) decomposes sounds, such as this steady sound, into their audio-frequency components.

Steady vs modulated sounds
[Figure: an example of a steady sound (oboe). Axes: Time (s), Amplitude (linear units).]

Steady vs modulated sounds
However, most natural sounds, such as this speech signal /ababa/, show pronounced amplitude and frequency modulations (AM and FM components).
[Figure: waveform of /ababa/ and its envelopes (ERBx1; 3 bands, unprocessed). Axes: Time (s), Amplitude (linear units), Center Frequency (Hz).]
Natural sounds, e.g.
speech sounds, show salient modulations.
The ear is not only a frequency analyzer; it is also a demodulator.

Speech sounds as modulated sounds
These modulations can be studied by taking speech sounds, passing them through a bank of bandpass (analysis) filters (as the cochlea does), and extracting the AM and FM components of each narrowband signal.
[Figure: a narrowband speech signal and its AM component, the temporal envelope (E). Axes: Time, Amplitude.]
[Figure: the same narrowband signal and its FM component, the temporal fine structure (TFS). Axes: Time, Amplitude.]

Speech sounds as modulated sounds
The narrowband signal at the output of a given analysis (cochlear) filter:
• $S_k(t) = E_k(t) \cos(\varphi_k(t))$
• $\tilde{S}_k(t)$: analytic signal of $S_k(t)$
• The envelope (AM component): $E_k(t) = |\tilde{S}_k(t)|$
• The temporal fine structure (FM component): $\cos(\varphi_k(t))$, with $\varphi_k(t) = \arg(\tilde{S}_k(t))$
There are many ways of decomposing a given signal into an AM component and an FM carrier. Here, narrowband signals are modelled as the product of an AM component (the envelope) and an FM carrier (the temporal fine structure). The AM component is real and positive. The AM and FM components are obtained by using the Hilbert transform. However, alternative AM/FM decompositions have been considered.
Gilbert & Lorenzi (JASA, 2006); Sheft et al. (JASA, 2008)

Speech sounds as modulated sounds
Modulation spectra
[Figure: AM and FM modulation spectra of speech, labelled with the syllabic rate (3-4 syllables/sec).] Sheft et al. (JASA, 2008)
These modulations range between ~2 and 500 Hz.
These modulation spectra are obtained by taking the Fourier transform of AM and FM patterns extracted from minutes of speech signals. Comparable modulation spectra are obtained in French, English, German, etc.
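As a minimal sketch of the Hilbert decomposition described above, the following Python computes the analytic signal of a synthetic narrowband signal and reads off its envelope and temporal fine structure. The test signal, the parameter values, and the helper names (`analytic_signal`, `envelope_and_tfs`) are assumptions made for illustration, not taken from the presentation. The naive DFT keeps the example dependency-free; an FFT-based routine such as `scipy.signal.hilbert` would normally be used instead.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real sequence x, using a naive O(N^2) DFT."""
    n = len(x)
    w = [cmath.exp(-2j * math.pi * i / n) for i in range(n)]  # forward twiddles
    w_conj = [v.conjugate() for v in w]                        # inverse twiddles
    spectrum = [sum(x[t] * w[(k * t) % n] for t in range(n)) for k in range(n)]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0   # keep DC and Nyquist unchanged
        elif k < n / 2:
            gain = 2.0   # double the positive frequencies
        else:
            gain = 0.0   # remove the negative frequencies
        spectrum[k] *= gain
    return [sum(spectrum[k] * w_conj[(k * t) % n] for k in range(n)) / n
            for t in range(n)]


def envelope_and_tfs(x):
    """Return E(t) = |analytic signal| and the TFS carrier cos(arg(analytic signal))."""
    z = analytic_signal(x)
    env = [abs(v) for v in z]
    tfs = [math.cos(cmath.phase(v)) for v in z]
    return env, tfs


# Hypothetical test signal: a 50-Hz carrier with 4-Hz sinusoidal AM at 50% depth.
fs, n, fc, fm, m = 400, 400, 50.0, 4.0, 0.5
t = [i / fs for i in range(n)]
target_env = [1.0 + m * math.cos(2 * math.pi * fm * ti) for ti in t]
x = [e * math.cos(2 * math.pi * fc * ti) for e, ti in zip(target_env, t)]

env, tfs = envelope_and_tfs(x)
print("largest envelope error:", max(abs(a - b) for a, b in zip(env, target_env)))
```

For this idealised signal the recovered envelope matches the imposed AM and the recovered carrier matches the unmodulated cosine; for real speech bands the split is less clean, which is one reason alternative decompositions have been considered.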
Speech sounds as modulated sounds
Over the last decades, several groups of researchers have attempted to test the validity of the following 3 assumptions.
[Figure: AM and FM modulation spectra of speech; syllabic rate (3-4 syllables/sec); modulations range between ~2 and 500 Hz.]
H1: These modulations carry useful timbre (phonetic) information.

Effects of cochlear damage & ageing
[Figure: oral comprehension (speech identification performance) under ageing and cochlear damage. NH: normal hearing; HI: hearing impaired.] Sheft et al. (Ear & Hearing, 2012)
H2: Deficits in modulation perception explain poor speech perception for hearing-impaired (HI) and elderly persons.

Rehabilitation
[Figure: a hearing aid and a cochlear implant; oral comprehension (speech identification performance) for cochlear implantees (CI).] Gnansia et al. (2014)
70% correct for CI versus 100% for NH listeners.
H3: Poor transmission of modulations explains poor speech perception despite rehabilitation via hearing aids.

How do we assess modulation perception in humans?
[Figure: the ‘listener’ and the ‘experimenter’.]

Assessment of modulation perception in humans: sine AM and sine FM
[Figure: waveforms of an unmodulated sine-tone carrier, an AM tone, and an FM tone. Axes: Time (s), Amplitude.]
The modulation detection threshold can be tracked by systematically varying modulation depth.
In a typical modulation-detection task, each listener is presented with successive trials, each consisting of two successive sounds. On each trial, the two sounds, an unmodulated carrier and a modulated carrier, are presented in random order. The listener is then asked to indicate which sound is modulated.
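The stimuli and the two-interval trial described above can be sketched as follows. This is an illustrative sketch only: the sampling rate, durations, modulation parameters, and the function names (`am_tone`, `fm_tone`, `make_trial`) are assumptions, not values from the studies cited.

```python
import math
import random


def am_tone(fc, fm, depth, dur, fs):
    """Sine carrier at fc (Hz) with sinusoidal AM at rate fm (Hz); depth m in [0, 1]."""
    return [(1.0 + depth * math.sin(2 * math.pi * fm * i / fs))
            * math.sin(2 * math.pi * fc * i / fs)
            for i in range(int(dur * fs))]


def fm_tone(fc, fm, excursion, dur, fs):
    """Sine carrier whose instantaneous frequency is fc + excursion*cos(2*pi*fm*t)."""
    beta = excursion / fm  # modulation index
    return [math.sin(2 * math.pi * fc * i / fs
                     + beta * math.sin(2 * math.pi * fm * i / fs))
            for i in range(int(dur * fs))]


def make_trial(modulated, unmodulated, rng):
    """Place the two sounds in random order; return them and the target interval (1 or 2)."""
    target = rng.randint(1, 2)
    intervals = (modulated, unmodulated) if target == 1 else (unmodulated, modulated)
    return intervals, target


# Hypothetical parameters: 500-Hz carrier, 5-Hz modulation, 0.5-s sounds.
fs, dur, fc, fm = 8000, 0.5, 500.0, 5.0
carrier = am_tone(fc, fm, 0.0, dur, fs)   # depth 0 gives the unmodulated carrier
am = am_tone(fc, fm, 0.5, dur, fs)        # 50% AM depth
fm_sig = fm_tone(fc, fm, 10.0, dur, fs)   # +/- 10 Hz frequency excursion

intervals, target = make_trial(am, carrier, random.Random(0))
print("modulated sound is in interval", target)
```

The listener's response is scored as correct when it matches `target`; the modulation depth (or frequency excursion) is then adjusted from trial to trial.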
Assessment of modulation perception in humans: sine AM and sine FM
[Figure: waveforms of the unmodulated sine-tone carrier, AM tones at 50%, 25%, and 12% depth, and an FM tone. Axes: Time (s), Amplitude.]
Track modulation depth at threshold for a given modulation rate, fm.

Assessment of modulation perception in humans: Sensitivity to AM
These ‘temporal modulation transfer functions’ show AM detection thresholds as a function of AM rate, for a given carrier.
[Figure: temporal modulation transfer functions. Axes: Modulation rate, fm (Hz); Modulation depth, m (%).] Lorenzi et al. (2001a,b)

Assessment of modulation perception in humans: Sensitivity to AM
These ‘temporal modulation transfer functions’ are lowpass in shape; sensitivity drops (degrades) when the AM rate is greater than about 50-100 Hz.
The auditory system operates as a lowpass filter for AM, smearing out fast (>50-100 Hz) AM fluctuations.

Assessment of modulation perception in humans: Sensitivity to FM
[Figure: FM detection thresholds (frequency excursion, Hz) as a function of the number of cycles of modulation (stimulus duration), for fm=2 Hz and fm=20 Hz; carrier frequency fc=500 Hz.]
The auditory system is also ‘sluggish’ for FM detection.
Wallaert et al.
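Tracking modulation depth at threshold, as mentioned above, is commonly done with an adaptive staircase. The sketch below assumes a 2-down/1-up rule (which converges on about 70.7% correct), a 2-dB step, and depth expressed as 20·log10(m); none of these choices is stated in the presentation, and the deterministic `ideal_listener` is a hypothetical stand-in for a real listener, used only to keep the example reproducible.

```python
def track_threshold(respond, start_db=0.0, step_db=2.0, n_reversals=8, max_trials=500):
    """2-down/1-up staircase on modulation depth in dB (20*log10(m)).

    respond(depth_db) must return True when the listener picks the modulated sound.
    Returns the mean depth at the reversal points (assumed threshold estimate).
    """
    depth_db = start_db
    run = 0                 # consecutive correct responses at the current depth
    last_direction = 0      # -1: depth was decreased, +1: depth was increased
    reversals = []
    for _ in range(max_trials):
        if len(reversals) >= n_reversals:
            break
        if respond(depth_db):
            run += 1
            if run < 2:
                continue            # need two correct responses before going down
            run, direction = 0, -1
        else:
            run, direction = 0, +1  # one error makes the task easier
        if last_direction and direction != last_direction:
            reversals.append(depth_db)
        last_direction = direction
        depth_db = min(0.0, depth_db + direction * step_db)  # m cannot exceed 100%
    return sum(reversals) / len(reversals) if reversals else depth_db


def ideal_listener(depth_db, threshold_db=-20.0):
    """Hypothetical listener: always correct at or above threshold, wrong below it."""
    return depth_db >= threshold_db


estimate_db = track_threshold(ideal_listener)
estimate_m = 10 ** (estimate_db / 20)
print(f"estimated threshold: {estimate_db:.1f} dB (m = {100 * estimate_m:.1f}%)")
```

Repeating the track at several modulation rates gives one threshold per rate, which is how a temporal modulation transfer function is assembled.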
Characterizing deficits in modulation sensitivity
[Diagram: peripheral and central auditory system (outer ear, middle ear, inner ear, basilar membrane, hair cells, auditory nerve fibers, brainstem, auditory cortex), with cochlear damage and ageing effects marked.]
The auditory system can be damaged:
• peripherally (lesions of inner and outer hair cells, damage to the auditory nerve)
• centrally (brain lesions, ageing effects)
Peripheral damage causes sensorineural hearing loss (SNHL). Central damage causes central auditory processing disorders. However, the effects of cochlear lesions are often associated with ageing effects, as in the case of presbycusis, a common form of sensorineural hearing loss observed in elderly people. It is important to separate the effects of cochlear damage from ageing effects on the ability to perform a given auditory task.

Characterizing deficits in modulation sensitivity
Cochlear damage is often associated with reduced audibility, as shown by the pure-tone audiogram (below).
[Figure: pure-tone audiogram.]
‘Suprathreshold’ auditory deficits
However, cochlear damage is also associated with suprathreshold auditory deficits, that is, deficits in the ability to discriminate audible sounds. As an example, speech comprehension can be impaired even in regions of near-normal hearing (below).
[Figure: speech comprehension in regions of near-normal hearing.] Léger et al. (Hear Res, 2012)
Abnormal sensitivity to AM and FM: two suprathreshold auditory deficits caused by cochlear damage and/or ageing.

Characterizing deficits in AM sensitivity
Individual and mean data from NHy (young normal-hearing listeners), NHe (elderly normal-hearing listeners), HIy (young hearing-impaired listeners), and HIe (elderly hearing-impaired listeners).
[Figure: AM detection thresholds (modulation depth, dB) as a function of carrier frequency (Hz), fm=5 Hz, with the ageing effect and the SNHL effect marked.]
Ageing degrades AM sensitivity.
Hearing loss preserves AM sensitivity.
Wallaert et al.
(ARO, 2015)

Characterizing deficits in FM sensitivity
[Figure: FM detection thresholds (frequency excursion, Hz) as a function of carrier frequency (Hz), fm=5 Hz, with the ageing effect and the SNHL effect marked.]
Ageing preserves FM sensitivity.
Hearing loss degrades FM sensitivity.
Wallaert et al. (ARO, 2015)

Characterizing deficits in modulation perception
In summary, cochlear damage and ageing affect AM and FM sensitivity differently:
- Cochlear lesions degrade FM sensitivity while sparing AM sensitivity
- Ageing degrades AM sensitivity while sparing FM sensitivity
Distinct auditory mechanisms for AM and FM processing?
[Diagram: auditory periphery (cochlear filters) and auditory centers, with an AM (envelope) path and an FM (temporal fine structure) path. Separate processing paths?]

Auditory mechanisms of modulation perception
[Diagram: auditory periphery (cochlear filters) and auditory centers, with AM and FM paths and a question mark.]
Are AM and FM cues processed by totally distinct auditory mechanisms, or do they share a common code?

Auditory mechanisms of AM perception
Adaptation effects
AM detection thresholds were measured in young normal-hearing listeners, at 3 modulation rates. Measurements were made before and after exposure to a 15-min 16-Hz AM tone (with the same carrier).
[Figure: AM detection thresholds before and after exposure.] Bruckert et al. (JASA, 2006)

Auditory mechanisms of AM perception
Adaptation effects
AM detection thresholds are selectively degraded (increased) at the exposure AM rate (16 Hz). This adaptation effect is often interpreted as evidence for the existence of neural units tuned to specific AM rates.
Bruckert et al. (JASA, 2006)

Auditory mechanisms of AM perception
Masking effects
Millman et al. (JASA, 2002)
AM detection thresholds were measured in young normal-hearing listeners, in the presence or in the absence of a secondary (masking) AM applied to the same carrier as the target AM. The rates of target and masker AMs were systematically varied.

Auditory mechanisms of AM perception
Millman et al.
(JASA, 2002)
Masking effects
Detection thresholds for the target AM were found to increase (degrade) in the presence of the secondary (masking) AM. The modulation masking effect was greater when the rates of target and masker AMs were close to each other.
BMF ∈ [2-100 Hz]; BW 1 oct.
The peaked (bandpass) shape of modulation masking patterns is interpreted as evidence for the existence of tuned modulation channels in the auditory system.

Auditory mechanisms of AM perception
Masking effects
[Figure: target modulation detection threshold as a function of masker modulation rate, with and without a modulation masker; target modulation rate fixed at 100 Hz.] Lorenzi et al. (JSLHR, 1997)
Peaked modulation-masking patterns can be found for AM rates between about 2 and 100 Hz, suggesting that AM channels are tuned between 2 and 100 Hz. These channels are believed to be broadly tuned (Q=1).

Auditory mechanisms of AM perception
Masking effects
Carrier effects on AM detection thresholds can be predicted from the modulation spectrum of the carrier. The modulation spectrum of a narrowband noise is triangular (see insert). This explains why AM detection thresholds are selectively degraded for low AM rates only when the carrier is a narrowband noise (instead of a sine tone).
[Insert: modulation spectrum of a narrowband noise. Axes: Modulation rate of inherent fluctuations (Hz), Magnitude (dB).] Lorenzi et al. (JASA, 2001)

Auditory mechanisms of AM perception
Cortical neurons are tuned to a given ‘best modulation rate’
[Figure: two recording sites in the primary auditory cortex (PAC).]
Neurophysiological evidence supporting the notion of central AM channels was found in humans. SEEG and fMRI studies have found that cortical units are tuned to best modulation rates below 100 Hz.
The (central) auditory system is selectively tuned for AM.
Giraud et al. (J Neurophysiol, 2000); Liégeois-Chauvel et al.
(Cereb Cortex, 2004)

Auditory mechanisms of AM perception
Cortical neurons are tuned to a given ‘best modulation rate’
[Figure: two recording sites in the primary auditory cortex (PAC).]
An SEEG study was conducted on epileptic patients with intracranial electrodes in primary and secondary auditory areas. AM stimuli with varying AM rates were presented in free field to these patients, who listened to them passively. Auditory evoked responses (local field potentials) were measured on each electrode in response to these AM sounds, and analyzed to build neural ‘temporal modulation transfer functions’. These functions were found to be tuned. These SEEG data are consistent with fMRI data obtained in another study.
The (central) auditory system is selectively tuned for AM.
Giraud et al. (J Neurophysiol, 2000); Liégeois-Chauvel et al. (Cereb Cortex, 2004)

Auditory mechanisms of FM perception
[Figure: FM detection thresholds (frequency excursion, Hz) as a function of the number of cycles of modulation, for fm=2 Hz and fm=20 Hz; carrier frequency fc=500 Hz.]
FM processing was investigated by measuring FM detection thresholds in young normal-hearing listeners. FM detection thresholds were measured in the absence or in the presence of a superimposed (masking) AM at the same rate as the FM.
Wallaert et al.

Auditory mechanisms of FM perception
AM interferes with FM processing → FM is encoded as (converted into) AM (envelope cues).
Wallaert et al.

Auditory mechanisms of FM perception
[Figure: a cochlear filter (amplitude vs frequency) converting an FM frequency excursion ∆f into AM over time.]
FM can be converted into AM by cochlear filtering. The differential attenuation of cochlear filters transforms frequency excursions into changes in excitation at the output of the cochlear filter.
FM is encoded as (converted into) AM (envelope cues).
This suggests a common code for AM and FM. In other words, FM may not be encoded as such in the auditory system.
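The FM-to-AM conversion described above can be illustrated with a quasi-static sketch: for slow FM, the envelope at the output of a filter roughly follows the filter's gain at the instantaneous frequency. The filter shape used here (a rounded-exponential, roex(p), filter with the Glasberg and Moore equivalent rectangular bandwidth), the parameter values, and the function names are assumptions chosen for illustration, not the model used in the studies cited.

```python
import math


def erb_hz(f0):
    """Equivalent rectangular bandwidth (Hz) at centre frequency f0 (Glasberg & Moore)."""
    return 24.7 * (4.37 * f0 / 1000.0 + 1.0)


def roex_gain(f, f0):
    """Amplitude gain at frequency f of an assumed roex(p) filter centred on f0."""
    p = 4.0 * f0 / erb_hz(f0)
    g = abs(f - f0) / f0                      # normalised distance from the filter tip
    return math.sqrt((1.0 + p * g) * math.exp(-p * g))


def induced_envelope(fc, fm, excursion, f0, dur, fs):
    """Quasi-static output envelope for an FM tone passed through the filter at f0."""
    env = []
    for i in range(int(dur * fs)):
        f_inst = fc + excursion * math.cos(2 * math.pi * fm * i / fs)
        env.append(roex_gain(f_inst, f0))
    return env


def am_depth(env):
    """Modulation depth m = (max - min) / (max + min) of an envelope."""
    hi, lo = max(env), min(env)
    return (hi - lo) / (hi + lo)


# Hypothetical stimulus: 500-Hz carrier, 2-Hz FM, +/- 5 Hz excursion.
fc, fm, excursion, dur, fs = 500.0, 2.0, 5.0, 1.0, 1000
on_depth = am_depth(induced_envelope(fc, fm, excursion, 500.0, dur, fs))   # filter on the carrier
off_depth = am_depth(induced_envelope(fc, fm, excursion, 550.0, dur, fs))  # carrier on the filter slope
print(f"induced AM depth: on-frequency {on_depth:.3f}, off-frequency {off_depth:.3f}")
```

Under these assumptions, the filter centred away from the carrier produces the larger envelope fluctuation, because the carrier sweeps along the steep filter slope rather than across the flat tip.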
Auditory mechanisms of FM perception
The temporal fine structure (TFS, < ~1-2 kHz) is encoded via neural phase locking in auditory-nerve fibers
[Figure: TFS waveform and spike trains in auditory-nerve fibers, with a time interval ∆t marked.]
However, auditory neuroscientists have demonstrated the existence of another mechanism able to encode FM accurately, for carrier frequencies below about 1-2 kHz and FM rates below about 10 Hz: neural phase locking.

Auditory mechanisms of FM perception
How can we demonstrate that FM is also encoded via neural phase locking?
Changes in TFS interaural phase elicit ‘spatial percepts’ (changes in perceived sound laterality).
Auditory sensitivity to changes in TFS interaural phase is constrained by neural phase locking in auditory nerve fibers.
Track IPD at
threshold
[Figure: a change in interaural phase.]

Auditory mechanisms of FM perception
Auditory sensitivity to AM, FM, and IPD was measured in the same young, normal-hearing listeners.
[Figure: three scatter plots. FM vs IPD: r=0.6; p<.01. AM vs IPD: ns. AM vs FM: ns. Axes: Interaural Phase Difference IPD (radians), FM detection threshold (Hz), AM detection threshold (%).]
FM sensitivity is the only measure correlated with IPD thresholds.
FM detection is constrained by the phase-locking properties of auditory neurons.
Paraouty et al. (ARO, 2015)

Auditory mechanisms of modulation perception
Proposed architecture for AM and FM processing in the early and central auditory system
[Diagram: auditory periphery (cochlear filters) and auditory centers; AM path carrying Ek(t); FM path carrying TFSk(t) via phase locking; features: pitch, timbre.]
In conclusion: partially distinct auditory mechanisms for AM and FM perception.
Shamma & Lorenzi (JASA, 2013)

Role of temporal modulations in sound recognition & auditory scene analysis
What are the respective roles of AM and FM cues in sound recognition and sound-source separation?
[Figure: waveforms of AM speech, AM+FM speech, and FM speech. Axis: Time (s).]
Vocoders are used to selectively degrade AM and/or FM components in speech signals.

Effects of degrading AM and FM cues on speech recognition in quiet
Normal-hearing listeners
AM speech:
- Perfect recognition for syllables or sentences
- Limited training required
FM speech:
- Intensive training often required
- Poor recognition for sentences
AM cues are the most salient cues for speech recognition in quiet.
Gilbert et al. (JASA, 2006, 2007); Sheft et al. (JASA, 2008); Ardoint et al. (Hear Res, IJA, 2010)
Lorenzi et al.
(PNAS, 2006)

Effects of degrading AM and FM cues on speech recognition in quiet
It follows that speech processors of hearing aids and cochlear implants should not unduly restrict the transmission of AM cues relevant to human hearing …
[Diagram: cochlear implant (CI) speech processor, with labels Micro, AGC, AM, Mapping, Pulse Generator, and Electrode outputs.]
… as illustrated here by the detrimental effects of amplitude compression on the transmission of AM cues.
[Figure: effects of varying the compression ratio (CR).] Won et al. (JARO, 2014)

Effects of degrading AM and FM cues on speech recognition in quiet
This is consistent with studies showing a significant correlation between AM detection thresholds and speech recognition in quiet (measured in free field) in CI patients.
Cochlear implantees (CI): r~0.5; p<.05
Good transmission/reception of AM cues is often associated with good speech recognition in quiet.
Gnansia et al. (IJA, 2014)

Effects of degrading FM cues on speech recognition in masking noise
Speech recognition is degraded when speech is presented against a background noise masker. The masking effect depends on the AM content of the background noise masker.
[Figure: waveforms of clean speech, speech + a notionally ‘steady’ noise masker, and speech + an AM noise masker. Axis: Time (s).]
NH: normal hearing; HI: hearing impaired
Benefit from noise fluctuations for NH listeners; limited benefit for HI listeners.
Lorenzi et al. (IJA, 2006); Gnansia et al.
(Hear Res, 2008; JASA, 2009)

Effects of degrading FM cues on speech recognition in masking noise
Speech recognition is substantially improved when a slow AM is superimposed on the noise masker. This effect is called ‘speech masking release’.

Effects of degrading FM cues on speech recognition in masking noise
[Figure: waveforms of AM speech and AM+FM speech. Axis: Time (s).]
The speech masking release effect can be studied by passing the speech+noise mixtures [i.e., the (notionally) steady or AM noise added to speech] through a vocoder that selectively degrades FM cues while preserving AM cues.

Effects of degrading FM cues on speech recognition in masking noise
Speech masking release is measured for young normal-hearing listeners, as a function of masker AM depth, for unprocessed and processed (vocoded) speech+noise mixtures. Maximum release is found for a 100% AM depth. Poorer masking release is found when FM cues are degraded.
[Figure: masking release (%) as a function of modulation depth (%, from 12.5 to 100), for unprocessed and processed mixtures, fm = 8 Hz.]
FM cues help to separate speech from noise.
Degrading FM cues simulates the limited benefit from noise fluctuations for HI listeners.
Gnansia et al. (Hear Res, 2008; JASA, 2009)

Effects of degrading FM cues on speech recognition in masking noise
Gnansia et al.
(2014)
[Figure: speech identification performance for cochlear implantees (CI).]
This is consistent with CI data (knowing that CI processors do not transmit FM cues).
Degrading FM cues simulates the limited benefit from noise fluctuations for CI listeners.

Effects of cochlear damage and ageing on speech recognition in masking noise
[Figure: complex FM patterns.]
This is consistent with studies showing that FM detection and discrimination are significantly correlated with speech recognition scores in noise for hearing-impaired listeners (r~0.5; p<.05).
Deficits in FM sensitivity partly explain poor speech recognition in noise.
Sheft et al. (Ear & Hearing, 2012)

Conclusions
• The auditory system is able to extract temporal modulations in complex sounds such as speech
• AM (‘envelope’) cues convey useful timbre (phonetic) information
• FM (‘temporal fine structure’) cues convey useful segregation cues
• AM and FM cues are processed by partially independent auditory mechanisms

Conclusions
• Cochlear damage and ageing affect AM and FM processing differently
• Cochlear damage degrades FM processing but preserves AM processing; ageing alters AM processing
• Deficits in FM perception explain, at least partially, the poorer-than-normal speech perception in noise typically associated with cochlear damage
• Hearing aids and cochlear implants should not restrict the transmission of AM and FM cues relevant for human hearing
