The Benefit of Temporal Fine Structure in Spatial Release From Masking for Older Hearing-Impaired Listeners

Andrew King1,2, Lars Bramsløw1, Renskje K Hietkamp1, Marianna Vatti1, Atefeh Hafez1, Niels Henrik Pontoppidan1, Kathryn Hopkins2
1 Eriksholm Research Centre, Oticon A/S, Snekkersten, Denmark.
2 School of Psychological Sciences, University of Manchester, Manchester, United Kingdom.

Listeners gained a 3.7 dB benefit to spatial release from masking with the original, compared to tone-vocoded, temporal fine structure. The benefit was best predicted by low-frequency audiometric thresholds.
QUESTIONS AND ANSWERS:
Question 1. Can older, hearing-impaired (HI) listeners benefit from spatially separating target and masker sentences?
Answer 1. Older, HI listeners benefited from spatial separation, although it was not as marked as for younger, normal-hearing listeners and varied between listeners.
Question 2. How much benefit is lost when tone vocoding (TVC) removes cues of spatial separation in the temporal fine structure (TFS)?
Answer 2. TFS information contributed significantly to SRM: when it was disrupted, the listeners did not benefit so much from spatial separation of the sentences.
[Figure 2 diagram labels: T, M1, M2; Separated; 45°; MHA; I/O Correction; Linear Gain; Original TFS; TVC. Figure 1 axis labels: Right ear; Left ear; Hearing Threshold (dB HL).]
IPD500 used 500 Hz pure tones, presented with a 0° starting
phase in both ears in the reference stimulus. Onset and offset
ramps were synchronous across ears.
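As a small worked illustration of the IPD500 stimulus, an interaural phase shift of a pure tone can be expressed as an equivalent time shift. The sketch below assumes only the 500 Hz tone frequency stated above; the function name and the example phase values are hypothetical.

```python
def ipd_to_time_shift_us(ipd_deg: float, freq_hz: float = 500.0) -> float:
    """Convert an interaural phase difference (degrees) into a time shift (microseconds)."""
    period_us = 1e6 / freq_hz           # one cycle of a 500 Hz tone lasts 2000 microseconds
    return ipd_deg / 360.0 * period_us  # fraction of a cycle times the period

# Illustrative phase shifts only; these are not thresholds from the study.
for ipd_deg in (9.0, 90.0, 180.0):
    print(f"{ipd_deg:6.1f} deg -> {ipd_to_time_shift_us(ipd_deg):7.1f} us")
```

At 500 Hz a half-cycle shift (180°) corresponds to 1000 μs, which is the largest unambiguous interaural shift for this tone.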
HYPOTHESES & RESULTS
Listeners will perform better with the maskers spatially separated from the target than with the maskers co-located with the target:
TMR50% was lower (better) when the speakers were separated (1.1 dB) than when they were co-located (3.3 dB); (F[1,19]=37.76, p<0.001).

Discrimination target shifts and shift limits:
F0DL: Components are shifted by component number multiplied by modulation rate before filtering. Shift limit: N/A.
FCk12 & FCk6: Components are shifted by a constant frequency before filtering, making them inharmonic. Shift limit: modulation rate / 2.
IPD500: Phase in the left ear is shifted positively. Shift limit: 180°.
The difference in listeners' performance between the separated and
co-located conditions will be greater without TVC than
with TVC:
Spatial separation had a greater effect without TVC (4.0 dB) than
with TVC (0.3 dB); (F[1,19]=38.58, p<0.001).
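The spatial-separation effects reported above combine into the headline TFS benefit by simple subtraction. The following sketch reproduces that arithmetic from the rounded means quoted in the text; the function and variable names are illustrative, and SRM is taken here to be the co-located TMR50% minus the separated TMR50%.

```python
def spatial_release_db(tmr50_colocated_db: float, tmr50_separated_db: float) -> float:
    """SRM in dB: how much lower (better) the threshold is when talkers are separated."""
    return tmr50_colocated_db - tmr50_separated_db

# Rounded mean effects of spatial separation quoted in the text.
SRM_NO_TVC_DB = 4.0  # original TFS
SRM_TVC_DB = 0.3     # tone vocoded

# TFS benefit to SRM: the part of SRM that is lost when TVC disrupts the TFS.
tfs_benefit_db = SRM_NO_TVC_DB - SRM_TVC_DB

# Overall SRM from the main-effect means (co-located 3.3 dB, separated 1.1 dB).
overall_srm_db = spatial_release_db(3.3, 1.1)

print(f"TFS benefit to SRM: {tfs_benefit_db:.1f} dB")
print(f"Overall SRM across TVC conditions: {overall_srm_db:.1f} dB")
```

The overall SRM of about 2.2 dB sits close to the average of the two per-condition values (2.15 dB), as expected for rounded means.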
Figure 1. The mean (±1 SD) audiometric thresholds in dB HL at the 11 tested frequencies for the 20 listeners in the study. Horizontal axis: Frequency (Hz), 125 to 8000.
METHODS
Speech stimuli
A special version of the Danish Dantale II corpus designed for spatial speech-on-speech testing was used (Behrens et al., 2007).
Recorded words were spoken by three females. Sentences followed the structure:
<Name> <Verb> <Number> <Adjective> <Object>.
E.g. Henning købte tre flotte ringe.
Three sentences are played simultaneously, masker 1 (M1),
masker 2 (M2), and target (T). The name in sentence T is the
Callsign, displayed to the listener via a computer screen hanging
above the front loudspeaker. Maximum SPL at the centre of the
listener’s head position was 70 dB for M1, M2 and T together.
Procedure
Listeners wore the MHAs, which provided linear gain following
CAMEQ specifications (Moore and Glasberg, 1998). In one test
block the MHA applied TVC before the gain; in the other test
block it applied gain only (No TVC). The order of these blocks
was randomised, as was the order of the Separated and
Co-located sub-blocks within each block (see Figure 2 for the
test condition setup). Approximately a week before testing, the
listeners completed a training program to familiarize them with
the procedure and stimuli.
For each test condition, a listener's psychometric function was
estimated from the number of correctly recalled words over
54 trials that spanned a TMR range of approximately 20 dB.
The TMR that would give 50% correct was taken as threshold
(TMR50%).
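The threshold estimate described above can be sketched as a psychometric-function fit. The poster does not state the function shape or the fitting method, so the logistic form, the grid-search maximum-likelihood fit, and the scores below are all assumptions made for illustration; none of the numbers are data from the study.

```python
import math

# Hypothetical scores for one test condition: (TMR in dB, words correct, words scored).
# The counts are invented for illustration; they are not data from the poster.
SCORES = [(-8, 0, 20), (-6, 0, 20), (-4, 1, 20), (-2, 2, 20), (0, 5, 20), (2, 10, 20),
          (4, 15, 20), (6, 18, 20), (8, 19, 20), (10, 20, 20), (12, 20, 20)]

def logistic(tmr_db, midpoint_db, slope_per_db):
    """Proportion correct predicted at a given TMR (assumed logistic shape)."""
    return 1.0 / (1.0 + math.exp(-slope_per_db * (tmr_db - midpoint_db)))

def negative_log_likelihood(scores, midpoint_db, slope_per_db):
    """Binomial negative log-likelihood of the scores under the logistic model."""
    total = 0.0
    for tmr_db, n_correct, n_scored in scores:
        p = logistic(tmr_db, midpoint_db, slope_per_db)
        p = min(max(p, 1e-9), 1.0 - 1e-9)  # keep log() finite at 0 and 1
        total -= n_correct * math.log(p) + (n_scored - n_correct) * math.log(1.0 - p)
    return total

def fit_tmr50(scores):
    """Grid-search the midpoint (TMR50%) and slope that best fit the scores."""
    best = None
    for m in range(-200, 201):   # midpoints from -20.0 to +20.0 dB in 0.1 dB steps
        for s in range(1, 41):   # slopes from 0.05 to 2.00 per dB in 0.05 steps
            midpoint_db, slope = m / 10.0, s / 20.0
            nll = negative_log_likelihood(scores, midpoint_db, slope)
            if best is None or nll < best[0]:
                best = (nll, midpoint_db, slope)
    return best[1], best[2]

tmr50_db, slope_per_db = fit_tmr50(SCORES)
print(f"TMR50% = {tmr50_db:.1f} dB, slope = {slope_per_db:.2f} per dB")
```

A grid search is used only to keep the sketch dependency-free; a numerical optimiser would normally replace the two nested loops.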
Contact information: Andrew King, University of Manchester
Figure 3. TMR50% in the four test conditions for each listener (open circles) and on average (filled diamonds) ± 1 SD. From left to right, conditions are co-located without TVC (CL-Lin), separated without TVC (LR-Lin), co-located with TVC (CL-TVC) and separated with TVC (LR-TVC). Vertical axis: SNR (dB, estimated 50% correct).
DISCUSSION
The average SRM was approximately 8-14 dB less than that achieved by NH listeners (Marrone et al., 2008; Neher et al., 2011). SRM was greater in the conditions without TVC than in the conditions with TVC. This suggests that disrupting the original TFS limits the ability of old, HI listeners to benefit from spatial separation of talkers when understanding speech. Despite evidence that both age and hearing loss are associated with poor TFS sensitivity (Hopkins and Moore, 2011), some old, HI listeners can still benefit from the TFS in speech. However, the benefit is roughly half that achieved by young NH listeners (Andersen et al., 2010).
Without TVC, the variation in SRM for HI listeners (−0.9 dB to 8.4 dB) was very similar to that found by Marrone et al. (2008). This is generally greater than the variability seen in NH listeners (Behrens et al., 2008).
Performance on FCk12, FCk6 and/or IPD500 will account for variance in the benefit of TFS to SRM that is not accounted for by audiometric thresholds between 0.125 and 1.5 kHz (PTALF):
Although FCk6 scores were significantly correlated with the benefit of TFS to SRM, a stepwise multiple linear regression (entering and removing predictor variables) retained only PTALF; it excluded FCk6, F0DL and IPD500 scores and listener age.
Figure 6. Scatterplots of the difference in SRM with TVC and without TVC (TFS benefit to SRM) and (clockwise from top left) low-frequency audiometric thresholds (PTALF), harmonic and frequency-shifted complex discrimination (FCk6), IPD discrimination (IPD500) and fundamental frequency difference limens (F0DL). Correlation coefficients are given in each panel.
[Panel horizontal axes: PTALF, Hearing Loss (dB HL); FCk6, Carrier Frequency Shift (Hz); IPD500, Phase shift (μs); F0DL, Modulation Rate Shift (Hz). Vertical axis: TFS benefit to SRM (dB).]
Thresholds on some or all the psychoacoustic tests will correlate negatively with benefit of TFS to SRM:
TFS benefit to SRM was correlated with FCk6 scores (r=−0.64, p<0.01) and with PTALF (r=−0.76, p<0.001), but not with IPD500 scores (r=−0.17, p>0.05) or F0DL scores (Spearman R=−0.40, p>0.05). Figure 6 shows the scatterplots illustrating these correlations. None of the listeners performed the FCk12 test better than chance.
The PTALF model provided an adjusted R² of 0.55.
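As a consistency check on this figure, the adjusted R² of a single-predictor model can be recomputed from the reported PTALF correlation. The sketch assumes the regression used all 20 listeners and PTALF as its only predictor, so that R² equals the squared Pearson correlation; the function name is illustrative.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Standard adjustment of R-squared for the number of predictors in the model."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

R_PTALF = -0.76    # correlation between PTALF and TFS benefit to SRM reported in the text
N_LISTENERS = 20   # number of listeners in the study (assumed all entered the model)

adj_r2 = adjusted_r_squared(R_PTALF ** 2, N_LISTENERS, n_predictors=1)
print(f"Adjusted R^2 = {adj_r2:.2f}")
```

Under these assumptions the value rounds to 0.55, in line with the reported model fit.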
Figure 4. A schematic of the stimuli in the three monaural psychoacoustic tasks (top three panels) and the IPD discrimination task (bottom two panels). Light turquoise indicates the target stimuli, whilst the dark turquoise indicates the reference stimuli. Recovered axis label: Time (ms).
Peissig and Kollmeier (1997) found audiometric threshold to be a poor predictor of SRM. However, Neher et al. (2011) found a moderate correlation between low-frequency audiometric thresholds and SRM. The current results suggest that PTALF is a good predictor of the benefit of TFS in SRM. Neher et al. also found a relationship between IPD discrimination and SRM. However, the current study did not find IPD500 to be a good predictor of the TFS benefit to SRM, despite the IPD500 task relying on binaural TFS cues.
Eriksholm Research Centre, Rørtangvej 20, DK - 3070 Snekkersten. Phone +45 4829 8900. eriksholm.com

REFERENCES:
Andersen, M. R., Kristensen, M. S., Neher, T., and Lunner, T. (2010). Effect of Binaural Tone Vocoding on Recognising Target Speech Presented Against Spatially Separated Speech Maskers. Poster at IHCON.
Behrens, T., Neher, T., and Johannesson, R. B. (2007). ERH-42-08-05 Evaluation of a Danish speech corpus for assessment of spatial unmasking. Poster at ISAAR.
Behrens, T., Neher, T., and Johannesson, R. B. (2008). Evaluation of speech corpus for assessment of spatial release from masking. In: T. Dau et al. (eds.) Auditory Signal Processing in Hearing-Impaired Listeners. Copenhagen, Denmark: Centertryk A/S, pp. 449–457.
Duquesnoy, A. J. (1983). Effect of a single interfering noise or speech source on the binaural sentence intelligibility of aged persons, J. Acoust. Soc. Am. 74, 739-743.
Gatehouse, R. W., and Noble, W. (2004). The speech, spatial and qualities of hearing scale (SSQ), Int. J. Audiol. 43, 85-99.
F0DL, FCk12 and FCk6 were presented monaurally to the left ear, whilst IPD500 was presented binaurally. See Figure 4 for diagrams of the stimuli.
A geometric adaptive procedure tracked 71% correct (Levitt, 1971). Up to three tracks were performed for each test by each listener, and the geometric mean of these results was used as that listener's threshold. If the listener could not discriminate the maximally-shifted stimulus from the reference stimulus, 40 trials were presented with the maximum shift; the adaptive track was then run again if the listener scored better than 65% correct.
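The adaptive tracking described above can be sketched as a transformed up-down staircase. The poster states only that 71% correct was tracked with a geometric procedure; the two-down, one-up rule (which converges on 70.7% correct in Levitt, 1971), the step factor, the number of reversals, and the use of the geometric mean of reversal points within a track are assumptions made for this illustration, and the deterministic listener is hypothetical.

```python
import math

def geometric_mean(values):
    """Geometric mean, computed via the mean of the logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def two_down_one_up(respond, start, factor, max_value, n_reversals=8, max_trials=500):
    """Run one track; return the geometric mean of the reversal points, or None.

    respond(value) returns True for a correct response at that shift size.
    The shift is divided by `factor` after two consecutive correct responses
    and multiplied by `factor` (capped at max_value) after each error.
    """
    value, correct_in_a_row, direction, reversals = start, 0, 0, []
    for _ in range(max_trials):
        if respond(value):
            correct_in_a_row += 1
            if correct_in_a_row == 2:
                correct_in_a_row = 0
                if direction == +1:        # track was going up, now turns down
                    reversals.append(value)
                direction = -1
                value = value / factor
        else:
            correct_in_a_row = 0
            if direction == -1:            # track was going down, now turns up
                reversals.append(value)
            direction = +1
            value = min(value * factor, max_value)
        if len(reversals) == n_reversals:
            return geometric_mean(reversals)
    return None  # the track never produced enough reversals

# Hypothetical listener who is correct whenever the shift is at least 10 units.
track_threshold = two_down_one_up(lambda shift: shift >= 10.0, start=40.0,
                                  factor=2.0, max_value=80.0)
print(track_threshold)

# A listener's final threshold is the geometric mean across up to three tracks.
print(geometric_mean([6.0, 7.5, 9.0]))
```

Returning `None` when the track never reverses loosely mirrors the case in the text where a listener cannot discriminate the maximally shifted stimulus and is given fixed-level trials instead.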
Figure 2. The four test conditions for the SRM experiment. Each listener performed all four conditions.

Figure 5. A schematic of the two-interval, two-alternative forced choice discrimination task used for F0DL, FCk12, FCk6 and IPD500 tests. Both intervals included noise. Light turquoise indicates the target stimulus, whilst the dark turquoise indicates the reference stimulus. Horizontal axis: Time (s).
Stimuli
F0DL and FCk12 used harmonic complexes with a modulation rate of 100 Hz. FCk6 used a modulation rate of 200 Hz. Stimuli were bandpass filtered around a centre frequency of 1.2 kHz, passing five components with a 30 dB/octave rolloff. Therefore F0DL and FCk12 passed the 10th to the 14th harmonic components and FCk6 passed the 4th to the 8th harmonic components.
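The passband arithmetic for the harmonic complexes described above, and the two kinds of target shift, can be illustrated as follows. This is a sketch of the component frequencies only: it ignores the filter slopes, the function names are hypothetical, and the 16 Hz and 4 Hz shift sizes are arbitrary examples rather than values from the study.

```python
def passband_components(f0_hz, centre_hz=1200.0, n_components=5):
    """Harmonic numbers and frequencies of the components nearest the filter centre."""
    centre_harmonic = round(centre_hz / f0_hz)
    half = n_components // 2
    return [(k, k * f0_hz) for k in range(centre_harmonic - half, centre_harmonic + half + 1)]

def frequency_shifted(components, shift_hz):
    """FCk-style target: every component moves by the same number of hertz (inharmonic)."""
    return [(k, f + shift_hz) for k, f in components]

def f0_shifted(components, f0_hz, delta_f0_hz):
    """F0DL-style target: component k moves by k times the change in modulation rate."""
    return [(k, k * (f0_hz + delta_f0_hz)) for k, _ in components]

fck12 = passband_components(100.0)  # modulation rate 100 Hz
fck6 = passband_components(200.0)   # modulation rate 200 Hz
print([k for k, _ in fck12], [k for k, _ in fck6])
print(frequency_shifted(fck12, 16.0))   # component spacing stays at 100 Hz
print(f0_shifted(fck12, 100.0, 4.0))    # component spacing grows to 104 Hz
```

A constant frequency shift leaves the component spacing, and so the envelope repetition rate, unchanged, which is why such tasks are taken to probe TFS rather than envelope cues.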
Listeners will perform better without TVC than with TVC:
TMR50% was higher (worse) with TVC (3.6 dB) than without (0.8 dB); (F[1,19]=87.00, p<0.001).
LISTENERS
Listeners ranged from 64 to 86 years old and had bilateral, gently sloping sensorineural hearing loss (see Figure 1). Asymmetry across ears was < 15 dB at any audiometric frequency tested (125 Hz to 8 kHz). All listeners spoke Danish as their first language.

PREDICTING TFS BENEFIT TO SRM
Neher et al. (2011) found that IPD discrimination and low-frequency audiometric thresholds predicted SRM performance in listeners with hearing loss. To explain the wide variation in TFS benefit to SRM for HI listeners, a battery of tests, including tests of TFS sensitivity, was conducted with the same listeners.
In addition to the age and hearing loss data already collected, we measured listeners' fundamental frequency difference limens (F0DL), discrimination thresholds of IPDs (IPD500), and discrimination thresholds of harmonic and frequency-shifted bandpass complexes centred around the twelfth harmonic (FCk12) and around the sixth harmonic (FCk6).
[Figure 5 diagram label: 2 interval, 2 alternative forced-choice + Threshold Equalizing Noise.]

BACKGROUND
SRM is the ability to hear target sounds over maskers more easily when the sounds come from different directions than when all sounds come from the same direction.
Old, HI people struggle to understand a target sound when they are in a noisy environment (Duquesnoy, 1983; Gatehouse and Noble, 2004). This may be because hearing loss is associated with poorer SRM (Marrone et al., 2008; Neher et al., 2009).
Andersen et al. (2010) showed that normal-hearing (NH) listeners achieve greater SRM when the speech signals contain their original TFS than when the speech has TVC applied to it. Andersen et al. split the speech signal into 32 ERBN frequency bands (Glasberg and Moore, 1990), took the envelope from each band and multiplied it by a pure tone at the band centre frequency.
The current study extends the study of Andersen et al. (2010) by using a master hearing aid system (MHA) to apply TVC and amplify the sounds presented from loudspeakers in an anechoic room, rather than processing the sounds with generic head-related transfer functions before presenting them over headphones. Using the MHA is more ecologically valid than headphone presentation for hearing aid users because it processes the sounds at the listener's ear, allowing the listener's own head and torso, and any head movements, to affect the sounds.
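The tone-vocoding scheme attributed to Andersen et al. above (32 ERBN-spaced bands, each band's envelope multiplied by a pure tone at the band centre) can be partly sketched in code. The sketch below lays out the bands using the ERB-number scale of Glasberg and Moore (1990) and applies a tone carrier to an envelope that is assumed to be already extracted. The 100 Hz to 8000 Hz range, the function names, and the omission of band filtering and envelope extraction are assumptions; this is not the MHA implementation.

```python
import math

def hz_to_erb_number(f_hz):
    """ERB-number scale of Glasberg and Moore (1990)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)

def erb_number_to_hz(erb_number):
    """Inverse of hz_to_erb_number."""
    return (10.0 ** (erb_number / 21.4) - 1.0) * 1000.0 / 4.37

def band_layout(low_hz, high_hz, n_bands=32):
    """Lower edge, centre and upper edge (Hz) of bands equally wide on the ERB-number scale."""
    lo, hi = hz_to_erb_number(low_hz), hz_to_erb_number(high_hz)
    step = (hi - lo) / n_bands
    return [(erb_number_to_hz(lo + i * step),
             erb_number_to_hz(lo + (i + 0.5) * step),
             erb_number_to_hz(lo + (i + 1) * step)) for i in range(n_bands)]

def tone_carrier_band(envelope, centre_hz, sample_rate_hz):
    """Multiply a band envelope by a pure tone at the band centre frequency."""
    return [env * math.sin(2.0 * math.pi * centre_hz * n / sample_rate_hz)
            for n, env in enumerate(envelope)]

# Hypothetical analysis range; the poster does not state the band limits.
bands = band_layout(100.0, 8000.0)
print(len(bands), [round(centre) for _, centre, _ in bands[:4]])

# A flat envelope turns into a plain tone at the carrier frequency.
print(tone_carrier_band([1.0, 1.0, 1.0, 1.0], centre_hz=250.0, sample_rate_hz=1000.0))
```

Summing the carrier-modulated bands would give the vocoded signal: the envelopes are kept, while the original fine structure in each band is replaced by the tone.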
Question 3. Can discrimination tasks designed to measure TFS processing predict individual differences in how much TVC affects spatial release from masking (SRM)?
Answer 3. Benefit from TFS to SRM was not correlated with interaural phase difference (IPD) discrimination of pure tones in noise. Rather, the best predictor of the TFS benefit to SRM was audiometric hearing loss below 1.5 kHz, over which TFS measures accounted for no extra variability.
Glasberg, B. R., and Moore, B. C. J. (1990). Derivation of auditory filter shapes from notched-noise data, Hear. Res. 47, 103-138.
Hopkins, K., and Moore, B. C. J. (2011). The effects of age and cochlear hearing loss on temporal fine structure sensitivity, frequency selectivity, and speech reception in noise, J. Acoust. Soc. Am. 130, 334-349.
Levitt, H. (1971). Transformed up-down methods in psychoacoustics, J. Acoust. Soc. Am. 49, 467-477.
Marrone, N., Mason, C. R., and Kidd, G., Jr. (2008). The effects of hearing loss and age on the benefit of spatial separation between multiple talkers in reverberant rooms, J. Acoust. Soc. Am. 124, 3064-3075.
Moore, B. C. J., and Glasberg, B. R. (1998). Use of a loudness model for hearing-aid fitting. I. Linear hearing aids, Brit. J. Audiol. 32(5), 317-325.
Neher, T., Behrens, T., Carlile, S., Jin, C., Kragelund, L., Petersen, A. S., and Schaik, A. (2009). Benefit from spatial separation of multiple talkers in bilateral hearing-aid users: Effects of hearing loss, age, and cognition, Int. J. Audiol. 48, 758-774.
Peissig, J., and Kollmeier, J. (1997). Directivity of binaural noise reduction in spatial multiple noise-source arrangements for normal and impaired listeners, J. Acoust. Soc. Am. 101, 1660-1670.