J Neurophysiol 92: 1295–1311, 2004.
First published May 5, 2004; 10.1152/jn.00034.2004.
Neural Sensitivity to Periodicity in the Inferior Colliculus: Evidence
for the Role of Cochlear Distortions
David McAlpine
Department of Physiology and University College London Centre for Auditory Research, London WC1E 6BT, United Kingdom
Submitted 12 January 2004; accepted in final form 30 April 2004
Pitch is defined as “that attribute of auditory sensation in
terms of which sounds may be ordered on a musical scale.” As
with many subjective phenomena, neural mechanisms that
contribute to the ability to determine the pitch of a sound
remain elusive, particularly for sounds with relatively complex
spectra. It is likely that multiple neural mechanisms contribute
to our perception of the pitch of complex sounds. The place
theory of pitch suggests that the pitch of a sound is related to
the position along the basilar membrane activated by the
spectral components present in the sound. For pure tones, or for
complex sounds where individual spectral components are
resolved by the peripheral auditory system, i.e., pass through
separate auditory filters, information can be combined across
different auditory filters to derive the pitch. The temporal
theory of pitch suggests that the pitch of a sound may be
derived from the temporal pattern of nerve action potentials
generated by the periodic nature of the stimulus. In temporal
pitch, the pitch of a complex sound is determined from the
interval between successive (in time) peaks in the stimulus
waveform. Depending on the spectral components, the pitch
approximates the inverse of the periodicity of the stimulus.
This occurs even in the absence of spectral information present
at the frequency at which the pitch is heard—the fundamental
frequency (f0).
Of the two theories, the temporal theory of pitch has gained
most credence, particularly over the past decade, in part due to
its ability to explain a greater range of observed pitch phenomena. Mechanisms that account for temporal pitch have been the
subject of numerous psychophysical (Krumbholz et al. 2000;
Plack and White 2000; Pressnitzer et al. 2001; Shackleton and
Carlyon 1994; Wiegrebe et al. 1998) and modeling (de Cheveigne 1998; Meddis and O’Mard 1997) studies, and electrophysiological investigations in experimental animals (Biebel
and Langner 2002; Cariani and Delgutte 1996a,b; Wiegrebe
and Winter 2001). A number of models posit mechanisms that
could account for the processing of temporal pitch cues, in
particular the implementation of auto-correlation on the stimulus waveform (de Cheveigne 1998; Licklider 1951). However, electrophysiological studies have largely been unsuccessful in revealing potential neural candidates for such a role.
A critical feature of temporal pitch theories is their unique
ability to explain the pitch of unresolved harmonics, where all
of the spectral components of a complex sound pass through a
single auditory filter. In such cases, pitch information is
confined to the temporal pattern of neural activity within
auditory filters. A recent neural theory proposed to account for
the pitch of unresolved harmonics (Langner 1997) suggests
that maps of periodicity [quantified by sensitivity to amplitude-modulated (AM) tones] run orthogonal to the main tonotopic
axis in the inferior colliculus (IC). Neurons with tuning for
similar best modulation frequencies (BMFs) project along
iso-modulation contours to synapse on low characteristic-frequency [characteristic frequency (CF) <2.0 kHz] neurons in
the IC dorsal laminae. These neurons act as modulation extractors, responding to high-frequency AM tones outside their
traditional frequency-versus-level response area, with preferred BMFs matching their CFs (Biebel and Langner 2002).
Evidence from primary auditory cortex (Schulze and Langner
1997, 1999; Schulze et al. 2002) appears to confirm the
existence of periodicity maps in response to high-frequency
AM tones including one overlying the low-frequency region of
primary auditory cortex. Note that this proposed mechanism
can account only for the pitch of unresolved harmonics, since
resolved harmonics will not evoke temporally-modulated responses within a single neural filter. However, unlike psychophysics experiments that use low-pass masking noise to reduce
or abolish the contribution of distortions in temporal pitch
mechanisms (Bernstein and Oxenham 2003; Grimault et al.
Address for reprint requests and other correspondence: D. McAlpine, Dept.
of Physiology, Univ. College London, Gower St., London WC1E 6BT, UK
(E-mail: [email protected]).
The costs of publication of this article were defrayed in part by the payment
of page charges. The article must therefore be hereby marked “advertisement”
in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
INTRODUCTION
www.jn.org
0022-3077/04 $5.00 Copyright © 2004 The American Physiological Society
Downloaded from http://jn.physiology.org/ by 10.220.33.1 on June 18, 2017
McAlpine, David. Neural sensitivity to periodicity in the inferior
colliculus: evidence for the role of cochlear distortions. J Neurophysiol 92: 1295–1311, 2004. First published May 5, 2004; 10.1152/jn.00034.2004.
Responses of low characteristic-frequency (CF) neurons in the inferior colliculus were obtained to amplitude-modulated
(AM) high-frequency tones in which the modulation rate was equal to
the neuron’s CF. Despite all spectral components lying outside the
pure-tone-evoked response areas, discharge rates were modulated by
the AM signals. Introducing a low-frequency tone (CF − 1 Hz) to the
same ear as the AM tones produced a 1-Hz beat in the neural response.
Introducing a tone (CF − 1 Hz) to the opposite ear to the AM tone
also produced a beat in the neural response, with the beat at the period
of the interaural phase difference between the CF − 1 Hz tone in one
ear, and the AM rate in the other ear. The monaural and interaural
interactions of the AM signals with introduced pure tones suggest that
AM tones generate combination tones (inter-modulation distortion)
on the basilar membrane. These interact with low-frequency tones
presented to the same ear to produce monaural beats on the basilar
membrane, modulating the responses of inferior colliculus (IC) neurons on the 1-Hz period of the monaural beats or interacting binaurally
with neural input generated in response to stimulation of the opposite
ear. The auditory midbrain appears to show a robust representation of
cochlear distortions generated by amplitude-modulated sounds.
METHODS
Surgical preparation
All experiments were carried out in accordance with the guidelines
of the UK Home Office, under the control of the Animals (Scientific
Procedures) Act 1986. Pigmented guinea pigs (Cavia porcellus, 315–
505 g) were anesthetized with intraperitoneal urethane (1.0–1.5 g/kg
in 20% solution). Additional analgesia was administered as required
using 0.1 ml im injections of fentanyl citrate/fluanisone (Hypnorm,
Janssen, Beerse, Belgium). Atropine sulfate (0.06 mg; Animalcare,
York, UK) was administered subcutaneously to reduce bronchial
secretions. All animals had a tracheal cannula inserted. Body temperature was maintained using a thermostatically controlled heating
blanket and rectal probe (Harvard Instruments, Edenbridge, UK).
Most animals breathed spontaneously, but some were respired artificially with air or 95% O2-5% CO2. All animals were clear of any signs
of infection in the ear canals and tympanic membranes. Any obstructing particles were carefully removed from the ear canals before
proceeding. Animals were mounted in a modified Kopf Instruments
(Bilaney Consultants, Sevenoaks, UK) stereotaxic frame situated
inside a sound attenuating booth (IAC, Winchester, UK). Hollow ear
speculae allowed the insertion of custom-made earphones and probe
tube microphones to form a sealed pressure-field sound delivery
system. Pressure equalization of the middle ear was achieved by
sealing high acoustic impedance cannulae into the bulla via small
holes drilled on both sides. Following subcutaneous injection of
lignocaine (2%; Astra, Kings Langley, UK) into the scalp, skin and
muscle were retracted, and a craniotomy was performed, extending
2–3 mm rostral and caudal of the interaural axis and 1– 4 mm lateral
from the midline on the right side. The dura overlying the cortex was
removed, allowing microelectrode access through the cortex to the
right inferior colliculus, and the cranium was sealed with 2% agar
(Oxoid, Basingstoke, UK).
Single unit recording
Recordings were made from single neurons using parylene-coated
tungsten microelectrodes (1–5 MOhm impedance; WPI, Stevenage,
UK), mounted on a piezo-electric stepper motor and positioned
stereotaxically into the inferior colliculus. Electrical activity from the
microelectrode was filtered (300 Hz–3 kHz) and amplified (variable
gain) using a DAM-80 ac differential amplifier (WPI, Stevenage, UK)
and a PC1 spike conditioner (Tucker Davis Technologies, Gainesville,
FL). Units were isolated using variable frequency and intensity diotic
tone probe stimuli. Single spikes were discriminated from background
noise using an SD1 spike discriminator (Tucker Davis Technologies),
linked to the computer system to allow accurate time stamping (1 μs)
of the spike events via an ET1 event timer (Tucker Davis Technologies). Single unit isolation was confirmed by the consistency of the
discriminated spike waveform displayed on a Tektronix TDS-210
digital oscilloscope.
Stimulus presentation and data analysis
Acoustic stimuli were produced and presented under computer
control, using software developed at the Medical Research Council
Institute of Hearing Research (by Prof. Alan Palmer and Dr. Trevor
Shackleton) and Tucker Davis Technologies System II hardware.
Digitally generated dichotic stimuli (AP2 digital signal processor; at
100- or 48-kHz sampling rate; Tucker Davis Technologies) were
converted to analogue signals (DA3-2, Tucker Davis Technologies).
The signals were filtered (FT6; corner frequency = 40 kHz; Tucker
Davis Technologies), and attenuated (PA4, Tucker Davis Technologies), before being amplified (RB-971, Rotel, Tokyo, Japan) and
delivered to Beyerdynamic DT-48 (Burgess Hill, UK) loudspeakers
fitted with brass tube attachments sealed into the hollow ear speculae
supporting the animal. The sound field inside the sealed system was
sampled using FG3452 (Knowles Electronics, Burgess Hill, UK)
microphones via a probe tube inserted to within a few millimeters of
the tympanic membrane. The probe microphones had been previously
calibrated against type 4136 1/8-in microphone (Bruel and Kjaer,
Stevenage, UK). The sound systems for each ear were flat to within
±5 dB from 50 to 12,000 Hz and were matched to within ±5 dB for
this range. All sounds were generated using the maximum range of the
DSP to ensure a high signal-to-noise ratio. A fixed, in-line end
attenuation of 60 dB was applied to each signal following signal
generation, digital attenuation, and amplification, to give a maximum
output at 1 kHz of 106 dB SPL for 0 dB digital attenuation. For
several experiments, end attenuation was set to 40 dB, and the
total output of the system was reduced by 20 dB. For the
response areas shown in the RESULTS section, the ordinate decibel scale
reflects the maximum 106 dB SPL output at 1 kHz.
The measured f0 distortion in the sound delivery/recording system
was examined to ensure that any such distortions were lower than the
level of presumed cochlear distortions generated by the AM signals.
Possible sources of distortion are the sound generation system
(unlikely, given that appropriate filtering was applied to the D/A signal
conversion), the sound delivery system, or the in-line sound calibration system. The sum total of all measured system distortions was significantly lower
than would be needed to account for the neural responses if they arose from distortion in the stimulus
generation/delivery system. For example, f0 distortion, measured by
the on-line spectrum analyser, was 51 dB lower than a 3-kHz carrier
AM at 363 Hz (100% depth) and presented at 70 dB SPL, 45 dB lower
when the carrier level was increased to 75 dB SPL and within the
noise floor (≈65 dB lower than the 3 kHz carrier at 70 dB SPL) when
2002; Plack and White 2000), these studies were not able to
exclude the contribution to neural responses of low-frequency
combination tones (distortions) generated by the nonlinear
mechanics of the basilar membrane in response to relatively
spectrally complex (compared with pure tones) AM tones.
Given the potential interest in central auditory mechanisms that
extract modulation and/or pitch information, it is imperative
that the contributions of peripheral mechanisms to neural
responses be discounted before central mechanisms are assumed. Combination tones (CTs) were suggested by Helmholtz
(1885) more than a century ago as a possible means of
accounting for the pitch of the missing fundamental, and
although several lines of evidence indicate that CTs cannot
account for all of the pitch percept of periodic signals, recent
psychophysical evidence from Pressnitzer and Patterson (2001)
indicates that CTs likely contribute to the perception of complex pitches.
This study shows that low-frequency IC neurons readily
respond to high-frequency AM tones, as previously reported
(Biebel and Langner 2002), but that a likely explanation for
this responsiveness is the generation of distortion on the basilar
membrane. The data indicate that a significant distortion is
generated when high-frequency AM tones of moderate (<70
dB Sound Pressure Level) sound level are presented and that
this distortion interacts with tones presented to the same ear on
the basilar membrane to produce monaural beating in the
responses of low-frequency IC neurons, and with neural activity generated by tones presented to the opposite ear in the
binaural brain stem nuclei to generate binaural beating in
interaural delay-sensitive IC neurons. These data suggest that
recent reports of central neural mechanisms for temporal pitch
extraction may, instead, be demonstrations of the contribution
of cochlear-generated distortions to the neural representation of
complex sounds.
RESULTS
The contribution of cochlear-generated distortions to the
response of low-CF neurons to high-frequency AM tones was
examined for 32 IC neurons with CFs ranging from 123 to
1,086 Hz, with a mean of 443 ± 252 (SD) Hz. Two neurons
were also recorded in the dorsal nucleus of the lateral lemniscus (DNLL). All neurons tested were sensitive to the IPDs of
binaural beats. Neurons were assessed for evidence of CTs
generating their response to AM tones in which all of the
spectral components lay well outside the upper cut-off frequency of the pure-tone frequency-versus-level response area
at the stimulus level tested.
Evidence for the contribution of combination tones in
AM-generated neural responses
Figure 1 shows the responses of an IC neuron to low-frequency pure tones and high-frequency AM tones. A frequency-versus-level response area (Fig. 1A) obtained over six
octaves (2 below CF and 4 above), using diotic, 50-ms tone
bursts, confirms the audio-visually determined CF of 292 Hz.
The top PSTH in Fig. 1B shows the response to a 3-s, 2-kHz
pure tone presented to the contralateral ear. This frequency lay
well outside (above) the pure-tone response area (Fig. 1A, large
white circle), and no activity was evoked above the spontaneous discharge rate (indicated by the arrow to the right of the
panel). The second from top PSTH in Fig. 1B shows the
response of the same neuron to a 2-kHz tone AM at 292 Hz, the
neuron’s CF. This AM produced sidebands in the spectrum at
2.292 and 1.708 kHz (Fig. 1A, small white circles), 6 dB lower
than the level of the carrier. Despite all spectral components,
including the lower sideband, lying above the response area,
evoked discharge rates were higher than the spontaneous discharge rate. This is consistent with previous reports (Biebel
and Langner 2002) in which IC neurons with CFs below 2 kHz
responded to AM tones that lay outside their pure-tone response areas. As might be predicted from Fig. 1A, a 291-Hz
tone presented to the ipsilateral ear alone (Fig. 1B, 2nd from
bottom) also evoked activity above spontaneous discharge
rates.
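The sideband frequencies and levels quoted above follow from the standard expansion of a sinusoidally amplitude-modulated tone, s(t) = [1 + m cos(2π fm t)] cos(2π fc t), which contains a carrier at fc and two sidebands at fc ± fm, each with amplitude m/2 relative to the carrier. The sketch below is illustrative only: the function and parameter names are not from the study, and it assumes 100% modulation depth (m = 1), the value implied by sidebands 6 dB below the carrier.

```python
import math


def am_components(fc_hz, fm_hz, depth=1.0):
    """Spectral components of s(t) = [1 + depth*cos(2*pi*fm*t)] * cos(2*pi*fc*t).

    Returns (frequency_hz, level_db_re_carrier) tuples for the lower
    sideband, the carrier, and the upper sideband.
    """
    # Each sideband has amplitude depth/2 relative to the carrier.
    sideband_db = 20.0 * math.log10(depth / 2.0)
    return [
        (fc_hz - fm_hz, sideband_db),  # lower sideband
        (fc_hz, 0.0),                  # carrier
        (fc_hz + fm_hz, sideband_db),  # upper sideband
    ]


# Stimulus of Fig. 1: 2-kHz carrier modulated at the neuron's CF (292 Hz).
components = am_components(2000.0, 292.0)
for freq, level in components:
    print(f"{freq:7.1f} Hz  {level:6.2f} dB re carrier")
```

For this stimulus the components fall at 1,708, 2,000, and 2,292 Hz, with the sidebands about 6.02 dB below the carrier. None of these components lies at or near the 292-Hz CF, which is why a response to the AM tone is not expected from the pure-tone response area alone.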
Figure 1B, bottom, shows the response to simultaneous
presentation of the contralateral 2-kHz AM tone and the
ipsilateral 291-Hz pure tone. Here, the discharge rate was
modulated between higher and lower values than evoked by
either stimulus alone. Peaks in the PSTH indicate a 1-Hz
modulation in the neuron’s discharge pattern (3 complete
response cycles over the 3-s stimulus presentation). The rate of
the response modulation equates to the difference between the
pure-tone frequency in one ear and the AM rate in the other,
suggesting the possibility that it results from binaural integration of temporally phase-locked inputs, a process usually
considered to arise in the medial superior olive of the brain
stem, below the level of the IC, and that requires specialized
cell types (Smith 1995; Smith et al. 1998) and synaptic mechanisms (Agmon-Snir et al. 1998; Forsythe and Barnes-Davies
1993) considered unique to the lower binaural brain stem.
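The relationship between the two stimulus frequencies and the rate of response modulation can be made explicit with a short calculation. This is a minimal sketch, assuming that both signals start in phase and that the effective contralateral frequency equals the AM rate; the function names are illustrative and not part of the study's software.

```python
def ipd_cycles(t_s, f_contra_hz, f_ipsi_hz):
    """Interaural phase difference at time t_s, in cycles wrapped to [0, 1).

    Assumes both signals start at zero phase (an assumption of this sketch).
    """
    return ((f_contra_hz - f_ipsi_hz) * t_s) % 1.0


def beat_cycles(duration_s, f_contra_hz, f_ipsi_hz):
    """Number of complete IPD sweeps (beat cycles) within duration_s."""
    return abs(f_contra_hz - f_ipsi_hz) * duration_s


# Fig. 1B, bottom: 292-Hz AM rate contralaterally, 291-Hz tone ipsilaterally.
n_cycles = beat_cycles(3.0, 292.0, 291.0)       # whole 3-s stimulus
middle_cycles = beat_cycles(2.0, 292.0, 291.0)  # middle 2 s used for analysis
quarter = ipd_cycles(1.25, 292.0, 291.0)        # IPD 1.25 s after onset

print(n_cycles, middle_cycles, quarter)
```

With a 1-Hz frequency difference, the 3-s stimulus spans three beat cycles, matching the three response cycles seen in the PSTH, and the middle 2 s spans two full sweeps of IPD.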
Figure 2 shows the responses of another IC neuron, with an
audio-visually determined CF of 364 Hz, to low-frequency
pure tones and high-frequency AM tones. Frequencies >1.5
kHz evoked no activity, even at levels >90 dB SPL. The
neuron’s response to binaural beats (Fig. 2B), where the contralateral stimulus was a CF tone (364 Hz) and the ipsilateral
stimulus was a CF − 1 Hz tone (363 Hz), indicates the neuron
to be strongly sensitive to IPD cues: responses were significantly modulated (2nR² > 13.815; P < 0.001) with IPD over
a wide range of sound levels. For the highest sound level of 42
dB SPL (Fig. 2B, top), approximately +15 dB at threshold at
CF (Fig. 2B, top row), the mean best IPD was −0.44 cycles of
IPD, i.e., the neuron responded maximally when the stimulus at
the right ear led the left ear by 0.44 cycles of the stimulus
the carrier level was reduced to 65 dB SPL. All of the presumed
cochlear-generated distortions were significantly higher in level than
the measured system distortions; on average, the threshold cochlear-generated distortion was 25 dB lower than the carrier level used, and
all but two recordings were made with maximum carrier levels 70 dB
SPL or below. Thus the distortions generated by the cochlea were
≥26 dB above the level of distortions generated in the combined
signal generation, calibration, and measurement system. The noise
floor of this combined system was 65 dB below the level of the 3-kHz
carrier tone. The site of generation of distortions in the sound
generation/delivery/calibration system was not determined. Although
it is possible that the speakers generate some distortion, it is also likely
that the Knowles microphones used for on-line analysis of the sound
signals account for at least some of this distortion. The speakers and
the probe tubes used to measure sound at the eardrum, and to which
the Knowles microphones were attached throughout recordings, were
calibrated against an 1/8-in B and K microphone. However, although
these microphones show a very flat transfer function, they have a high
noise floor [low signal-to-noise ratio (SNR)] and cannot be used to
measure distortions for low sound level signals.
Once a single neuron was isolated, its CF was estimated audiovisually. In most cases, CF was confirmed by the generation of
frequency-versus-level response areas, extending two octaves above
and four octaves below the estimated CF, using diotic 50-ms tones
covering a range of 60–100 dB, presented at a rate of 5 Hz in
randomized order. Response areas were also obtained over the range
of four octaves above to two octaves below CF to confirm that
neurons were unresponsive to pure tone stimulation at frequencies
used as carrier signals using AM.
Additionally, neurons were characterized by the generation of
peristimulus time histograms (PSTHs) to 50-ms CF tones presented
diotically and monaurally to both ears, at 20 dB above threshold for
150 repetitions at 5 Hz. Each neuron's binaural sensitivity to interaural
phase disparity (IPD) cues was assessed using 3-s duration binaural
beat stimuli, with a 1-Hz difference between the ears. The 3-s beat
contains two full sweeps of IPD during the middle 2 s and sweeps of
0.5 cycles of IPD during the first and last 500 ms of the stimulus. The
initial and final 500-ms periods of the response were omitted from
analysis to exclude contamination of IPD sensitive responses by the
often large, but rapidly adapting, discharge rates evoked at the onset
of the stimulus. PSTHs of the two complete cycles of IPD were
averaged and plotted on-line with respect to the IPD of the stimulus to
form period histograms. Using the methodology of Goldberg and
Brown (1969), the vector strength (R) of the response was calculated
from the period histogram. A vector strength of 1.0 reports perfect
locking to the phase (IPD) of the stimulus, with all spikes occurring
in 1 bin (bin width in this study was 20 ms), and 0.0 reflects an even
distribution of spikes across all bins of the IPD phase plot. The
average best phase was calculated as a vector average of the response
magnitudes at each point in the cyclic IPD phase histograms. Vector
strengths were assessed for their statistical significance by measuring
the Rayleigh coefficient, 2nR², where n is the total spike count, and R
is the vector strength. Responses were considered significantly modulated with IPD for Rayleigh coefficients > 13.815 (Yin and Kuwada
1983), i.e., P < 0.001.
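The vector strength, mean best phase, and Rayleigh coefficient described above can be sketched as follows. This is not the analysis software used in the study. It is a minimal illustration that works on individual spike phases rather than on a binned period histogram, reports best phase on [0, 1) cycles rather than as signed values, and uses hypothetical spike data. The 13.815 criterion is consistent with the large-sample approximation P ≈ exp(−nR²), since −2 ln(0.001) ≈ 13.8155.

```python
import math

RAYLEIGH_CRITERION = 13.815  # 2nR^2 threshold corresponding to P < 0.001


def vector_strength(spike_phases):
    """Return (R, best_phase) for spike phases given in cycles of IPD.

    R is the length of the mean resultant vector (0 = uniform, 1 = perfect
    locking); best_phase is its angle in cycles, wrapped to [0, 1).
    """
    n = len(spike_phases)
    c = sum(math.cos(2.0 * math.pi * p) for p in spike_phases)
    s = sum(math.sin(2.0 * math.pi * p) for p in spike_phases)
    r = math.hypot(c, s) / n
    best_phase = (math.atan2(s, c) / (2.0 * math.pi)) % 1.0
    return r, best_phase


def rayleigh_coefficient(r, n):
    """Rayleigh coefficient 2nR^2 for vector strength r and spike count n."""
    return 2.0 * n * r * r


# Hypothetical data: 20 spikes all at an IPD of 0.25 cycles (perfect locking),
# and 20 spikes spread evenly across the IPD cycle (no locking).
locked = [0.25] * 20
uniform = [i / 20.0 for i in range(20)]

r_locked, phase_locked = vector_strength(locked)
r_uniform, _ = vector_strength(uniform)

significant = rayleigh_coefficient(r_locked, len(locked)) > RAYLEIGH_CRITERION
print(r_locked, phase_locked, r_uniform, significant)
```

Because the coefficient scales with spike count, a weakly locked response can reach significance when many spikes are collected, whereas a strongly locked response with very few spikes may not.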
period. For the lowest sound level of 32 dB SPL (+5 dB at
threshold), the best IPD was −0.40 cycles (Fig. 2B, bottom).
Figure 2C shows the responses of the same neuron in which
the CF tone to the contralateral ear was replaced with a
3.0-kHz tone AM at a rate of 364 Hz. The white circles in Fig.
2A indicate that all components lay above the pure-tone response area and did not evoke neural activity when presented
in isolation. The highest carrier sound level used was ~65 dB
SPL (Fig. 2A, top arrow) and the lowest was 45 dB SPL (Fig.
2A, bottom arrow; intermediate levels in 5-dB steps). Despite
the large spectral differences between the ears, however, the
neural response was significantly modulated on the period of
the difference between the CF in one ear and the AM rate in the
other, at least for the three higher sound levels (Fig. 2C, top 3
rows), with a mean best IPD ranging from −0.25 to −0.35
cycles at CF.
A final example of this behavior is shown in Fig. 3 for a
neuron with a CF of 888 Hz (Fig. 3A). As for the neuron in Fig.
2, this neuron was sensitive to the IPDs of 1-Hz binaural beats
generated by a CF tone in the contralateral ear and an 887-Hz
tone in the ipsilateral ear, showing a best IPD of 0.31 cycles
(Fig. 3B), with significant IPD sensitivity (P < 0.001) observed for the two highest levels (51 and 46 dB SPL) of the
contralateral pure tone (Fig. 3B, top 2 PSTHs). In Fig. 3C, the
CF tone is replaced with a 5-kHz tone, AM at 888 Hz. This
stimulus configuration also evoked a response that, when
binned on the period of the difference between the AM rate in
the contralateral ear (at CF) and the CF − 1 Hz tone in the
ipsilateral ear, indicated the neuron to be sensitive to the IPD
between these components. Significant IPD sensitivity (P <
0.001) was obtained for the four highest levels of AM tones
shown (Fig. 3C, top 4 PSTHs).
FIG. 1. A: frequency-vs.-level response area of an inferior colliculus (IC) neuron with characteristic frequency (CF) = 292 Hz.
B: peristimulus time histograms (PSTHs) of the responses to the stimulus configurations on the right. Horizontal arrow on the right
of each indicates spontaneous discharge rate: (top) response to 2-kHz, 64 dB SPL pure-tone in the contralateral ear at the level
indicated by the large white star in A; (2nd from top) response to 2-kHz tone amplitude modulated at 292 Hz in the contralateral
ear. Level and frequency of the sidebands are indicated by small white circles in A; (2nd from bottom) response to 291-Hz tone
in the ipsilateral ear at 53 dB SPL; (bottom) response to 2-kHz tone amplitude modulated at 292 Hz in the contralateral ear and
291-Hz tone in the ipsilateral ear.
FIG. 2. A: frequency-vs.-level response area of an IC neuron with CF = 364 Hz. B: response to 3 different levels of binaural
beats: (top) 42 dB SPL, (middle) 37 dB SPL, and (bottom) 32 dB SPL, where the tone in the contralateral ear was 364 Hz and the
tone in the ipsilateral ear was 363 Hz. C: response to 5 different levels (60, 55, 50, 45, and 40 dB SPL from top to bottom) of a
3-kHz tone amplitude-modulated at 364 Hz in the contralateral ear, and a fixed level (66 dB SPL) 363-Hz tone in the ipsilateral
ear. The levels of the amplitude-modulated (AM) carrier are indicated by the arrows to the right of the response area in A.
Where is the generation site for neural sensitivity to the
periodicity of AM sounds?
What explanation might be posited for the sensitivity of IC
neurons to AM high-frequency tones? For the examples above,
one can hypothesize that the combination of AM tones in one
ear and CF − 1-Hz tones in the other produces a binaural
distortion beat; action potentials generated by vibration of the
contralateral basilar membrane at the place tuned to CF, equal
to the AM rate, and by vibration of the ipsilateral basilar
membrane at the place tuned to CF − 1 Hz, propagate via
low-frequency, phase-locking neurons in the cochlear nucleus
to converge on delay-sensitive neurons in the superior olivary
complex (Fig. 4, left). Here, the quadratic nonlinearity of the
basilar membrane produces a traveling wave (horizontal red
arrow) that is resolved by the basilar membrane at the (low-frequency) place tuned to the AM frequency. So, since there is
no contralateral stimulus component at 364 Hz for the neuron
in Fig. 2, or at 888 Hz for the neuron in Fig. 3, a parsimonious
explanation is that the response modulation arises due to a
low-frequency (364 or 888 Hz) component, namely the difference
FIG. 3. Similar format to Fig. 2. A: frequency response area indicating CF of 888 Hz. B: responses to binaural beats;
contralateral stimulus = 888 Hz, ipsilateral stimulus = 887 Hz, with binaural signal levels: (top) 51 dB SPL, (middle) 46 dB SPL,
and (bottom) 36 dB SPL. C: response to contralateral 5.0-kHz tone amplitude modulated at 888 Hz (65 dB SPL). Levels of
ipsilateral 887 Hz tone, from top to bottom, are 70, 60, 55, 50, and 45 dB SPL. All responses, except bottom (45 dB SPL), were
significantly modulated (P < 0.001) with the interaural phase disparity (IPD) between the ipsilateral carrier and the contralateral
modulator.
tone at fc ± fm, generated by the nonlinear response of the
basilar membrane to the AM complex. Phase-locked activity
generated by this difference tone is propagated through fibers
tuned to these low frequencies to binaural coincidence detectors in the brain stem nuclei of the superior olive, where it
interacts with phase-locked activity generated by the tone 1 Hz
lower in the other ear. This activation propagates to the
low-frequency laminae of the IC, where it is recorded as an
IPD sensitivity in low-CF neurons.
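The way a quadratic nonlinearity creates energy at the modulation frequency can be illustrated numerically. The sketch below is a toy demonstration and not a model of basilar membrane mechanics: it passes an AM tone through a memoryless nonlinearity y = x + a·x², where the coefficient a = 0.1 is arbitrary, and measures the amplitude at the modulation frequency before and after. The 48-kHz sampling rate and the 3-kHz carrier modulated at 364 Hz follow the stimuli described in the text; 100% modulation depth and all function names are assumptions of this example.

```python
import cmath
import math

FS_HZ = 48_000          # sampling rate
DURATION_S = 1.0        # 1 s gives whole cycles of both 3000 Hz and 364 Hz
FC_HZ, FM_HZ = 3000.0, 364.0
QUADRATIC_GAIN = 0.1    # arbitrary strength of the toy nonlinearity


def am_tone(fc_hz, fm_hz, depth=1.0, fs_hz=FS_HZ, duration_s=DURATION_S):
    """Samples of [1 + depth*cos(2*pi*fm*t)] * cos(2*pi*fc*t)."""
    n = int(fs_hz * duration_s)
    return [
        (1.0 + depth * math.cos(2.0 * math.pi * fm_hz * i / fs_hz))
        * math.cos(2.0 * math.pi * fc_hz * i / fs_hz)
        for i in range(n)
    ]


def amplitude_at(signal, freq_hz, fs_hz=FS_HZ):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * i / fs_hz)
        for i, x in enumerate(signal)
    )
    return 2.0 * abs(acc) / len(signal)


stimulus = am_tone(FC_HZ, FM_HZ)
distorted = [x + QUADRATIC_GAIN * x * x for x in stimulus]

amp_before = amplitude_at(stimulus, FM_HZ)   # no energy at fm in the stimulus
amp_after = amplitude_at(distorted, FM_HZ)   # difference tone appears at fm
print(amp_before, amp_after)
```

Squaring the AM tone gives a low-frequency term m·cos(2π fm t), so for m = 1 the distorted signal contains a component at 364 Hz with amplitude equal to the quadratic gain, even though the stimulus itself has no energy there. A component at this frequency falls within the pure-tone response area of the low-CF neuron in Fig. 2.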
This distortion hypothesis is in marked contrast to the pitch
extraction hypothesis recently posited by Biebel and Langner
(2002). In that hypothesis, neural activity generated by the
high-frequency AM complex is transmitted via high-frequency
tuned fibers to high-frequency laminae of the IC (Fig. 4, right).
Here, neurons are arranged within each lamina according to
their BMF. These neurons project to IC neurons in the low-frequency laminae whose pure-tone frequency tuning is similar
to their envelope-modulation frequency tuning.
Sensitivity to the periodicity of AM sounds below the IC
The argument that the generation site for the sensitivity of
auditory neurons to high-frequency AM tones is lower than the
level of the IC is supported by the response of the neuron in
Fig. 5, which differs from the previous examples in two
respects. Although sensitive to IPDs (Fig. 5A, top) and high-frequency tones (3 kHz) modulated at CF (208 Hz; Fig. 5A,
middle), and also showing a modulated response to binaural
stimulation on the period of the difference between the AM
rate in the contralateral ear (208 Hz) and the pure-tone frequency in the ipsilateral ear (207 Hz; Fig. 5A, bottom), this
neuron was recorded in the DNLL rather than the IC. The
DNLL was located using the stereotaxic coordinates of Medvedeva (1977); the neuron in Fig. 5 was recorded 6.6 mm
below the cortical surface and 3.5 mm lateral to the midline.
This recording position was 3.1 mm deeper than (ventral to)
the location in the same electrode penetration at which a
low-frequency excitatory drive was recorded as the electrode
passed the lateral edge of the IC. Histological reconstruction of
the midbrain and brain stem of this animal confirmed the
location of an electrolytic lesion at this neuron’s recording site
to be in the DNLL. Neurons in the DNLL, which is a
GABAergic projection nucleus to the IC (Adams and Mugnaini
1984; Batra and Fitzpatrick 2002; Kelly and Li 1997; Zhang et
al. 1998), show a high degree of phase-locking and high,
sustained discharge rates (Aitkin et al. 1970; Brugge et al.
1970) compared with IC neurons, as Fig. 5B suggests. The
second difference lies in the form of binaural interaction,
which appears to be based on phase-locked inhibition as well
as excitation. Diotic presentation of 50-ms tone bursts evoked
instantaneous discharge rates peaking around 400 spikes/s
(Fig. 5B, left). However, contralateral stimulation alone evoked
significantly higher discharge rates (Fig. 5B, right top). Conversely, ipsilateral stimulation alone reduced discharge rates
below the low spontaneous rate (Fig. 5B, right bottom; note
change of scale). This suggests input from neurons in the
lateral superior olive (LSO) that show evidence of an exquisitely timed (on the microsecond scale) glycinergic input
from the medial nucleus of the trapezoid body (MNTB) (Smith
et al. 1991, 1998; Tsuchitani 1997). The LSO sends excitatory
projections to the DNLL (Huffman and Covey 1995; Oliver
2000). Consistent with this, the influence of the 207-Hz tone
presented to the ipsilateral ear (Fig. 5A, bottom) is to reduce, in
an IPD-sensitive manner, the response to the contralateral ear
alone (cf. Fig. 5A, middle). The 3-kHz AM tone presented to
the contralateral ear evokes ~30 spikes/s, after an initial onset
response over the first 200–300 ms of ~40 spikes/s. Simultaneously presenting the 207-Hz tone to the ipsilateral ear modulates the response between a maximum of 40 and ~5 spikes/s.
The observation that low-frequency neurons below the level
of the IC also show evidence of sensitivity to the periodicity of
high-frequency signals indicates that the generation site for
such sensitivity lies below the level of the midbrain. Furthermore, that responses are seemingly modulated with the interaural phase difference between the pure-tone frequency in one
ear and the AM frequency in the other ear suggests that the site
of production lies below the brain stem nuclei at which binaural information converges. It might be argued that phase-locked action potentials generated in low-frequency channels
at one ear could converge with action potentials phase-locked
to the high-frequency stimulus envelope at the other ear,
eventually creating interaural time difference (ITD) sensitivity
de novo in low-CF neurons. However, the requirement for
specialized synaptic and cellular mechanisms underpinning
low-frequency binaural hearing makes this unlikely. Finally,
the ability to phase-lock to monaural inputs had no bearing on
whether neurons were sensitive to the presumed IPD between
the pure tone at the ipsilateral ear and the distortion tone on the
FIG. 4. Two hypotheses suggested to account for the sensitivity of low-frequency IC neurons to high-frequency AM tones. The
distortion hypothesis (left) suggests a peripheral site, and the pitch extraction hypothesis (right) suggests a central site for the effect.
See RESULTS for further information.
contralateral ear (see Fig. 6, A and B). Only 5 of the 32 IC
neurons showed phase-locking to the tones at either ear alone,
and none of the examples in Figs. 1–3 showed evidence of
monaural phase-locking.
Assessing the magnitude and phase of distortions
In their study of the contribution of CTs to human pitch
perception, Pressnitzer and Patterson (2001) showed, using the
cancellation of beats method (Goldstein 1967), the presence of
distortions ~15–20 dB lower than the level of the primary tones,
depending on the number of harmonic components. In this
study, a pure-tone probe with frequency equal to CF − 1 Hz
was introduced to the left ear to beat monaurally with the
presumed CF distortion tone, producing a 1-Hz monaural
distortion beat. The phase and amplitude of the difference tone
were estimated by varying the level of the probe tone required
to modulate maximally the neural response.
Figure 7A shows the response area of an IC neuron (CF =
185 Hz). The panels in Fig. 7B indicate the response of this
neuron to a 2.5-kHz tone in the contralateral ear AM at 185 Hz
with, from bottom to top, increasing levels of a 184-Hz tone
presented to the same ear. None of the spectral components of
the AM signal (Fig. 7A, white circles) evoked a response when
presented in isolation. For low levels of the 184-Hz tone (Fig.
7B, bottom; 31 dB SPL), the discharge rate was essentially that
evoked by the high-frequency AM signal. As the level of the
184-Hz tone was increased (Fig. 7B, bottom to top), however,
the neural response was modulated above and below the rate
evoked by the AM tone alone. For a tone level of 48 dB SPL
(Fig. 7B, 3rd PSTH from top), the response was maximally
FIG. 5. Response of a neuron recorded in the dorsal nucleus of the lateral lemniscus (DNLL). A: (top) response to low-frequency
binaural beats at CF (208 Hz); (middle) response to high-frequency tone (3 kHz) modulated at CF (208 Hz) in the contralateral ear;
(bottom) response to binaural stimulation with high-frequency tone (3 kHz) modulated at CF (208 Hz) in the contralateral ear and
a pure tone (207 Hz) in the ipsilateral ear. B: response to (left) diotic presentation of 50-ms tones at CF; (right top) contralateral
stimulation alone; and (right bottom) ipsilateral stimulation alone (note different ordinate scales). C: response to monaural distortion
beats in the contralateral ear, with a fixed-level 3-kHz AM tone (modulated at 208 Hz) and increasing levels of a 207-Hz tone
(bottom to top). The 4th panel from the bottom indicates the level at which the response was maximally modulated.
modulated. Further increasing the level of the tone also increased the discharge rate, but the response was less modulated. For the highest level used (59 dB SPL; Fig. 7B, top
PSTH), the response was completely unmodulated. The explanation for the modulation in the neural response, which is at the
period of the difference between the modulation rate of the
high-frequency AM tone and low-frequency pure tone, is the
same as for the binaural responses described above: the AM
tone generates a distortion at the frequency of the difference
tone (185 Hz in this case). The addition of a 184-Hz tone into
the same ear causes a beating on the basilar membrane—a
monaural distortion beat—at the 1-Hz rate observed in the
neural response. The two basilar membrane waves add when in
phase and cancel when out of phase, producing modulated
neural activity at 1 Hz in the auditory nerves innervating this
region of the cochlea. When the levels of the two waves differ,
for example, when the pure tone is lower than, or higher than,
the level of the distortion, one frequency component dominates
the basilar membrane output, and the IC neural response is less
modulated as a result. When the level of the 184-Hz tone is
equal to that of the 185-Hz distortion, the two basilar membrane responses cancel completely when out of phase (3 times
over the 3 s of the stimulus), and the response is maximally
modulated. The level at which this occurred (48 dB SPL) was
17 dB lower than the level of the 2.5-kHz carrier.
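The level dependence of the monaural distortion beat described above can be illustrated with a minimal sketch. It assumes simple linear superposition of two tones 1 Hz apart, so that the beat envelope swings between the sum and the difference of the two amplitudes. The function name `beat_modulation_depth`, the 48 dB SPL distortion level (taken from the level of maximal modulation reported for Fig. 7B) and the intermediate probe levels of 40 and 54 dB SPL are illustrative assumptions, and the sketch does not model neural thresholds or saturation.

```python
def beat_modulation_depth(probe_db, distortion_db):
    """Envelope modulation depth (0 to 1) of two summed tones 1 Hz apart.

    Assumes linear superposition: the envelope maximum is the sum of the
    two amplitudes and the envelope minimum is their absolute difference.
    """
    a_probe = 10 ** (probe_db / 20)
    a_dist = 10 ** (distortion_db / 20)
    env_max = a_probe + a_dist
    env_min = abs(a_probe - a_dist)
    return (env_max - env_min) / (env_max + env_min)


# Assumed effective distortion level, matching the probe level at which
# modulation was maximal in Fig. 7B.
DISTORTION_DB = 48.0

# 31, 48 and 59 dB SPL appear in the text; 40 and 54 are hypothetical.
for probe_db in (31, 40, 48, 54, 59):
    depth = beat_modulation_depth(probe_db, DISTORTION_DB)
    print(f"probe {probe_db} dB SPL: envelope depth {depth:.2f}")
```

Under these assumptions the depth reduces to the amplitude ratio of the weaker to the stronger component, so it peaks at 1 only when the two levels match and falls off symmetrically (in dB) on either side, which is the logic used to read the distortion level from the probe level at maximal modulation.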
Figure 8A shows the frequency-versus-level response area of
an IC neuron with a much higher level of spontaneous activity
than in the previous examples, as well as evidence of an
inhibitory sideband above the excitatory response area. This
neuron was assessed for sensitivity to high-frequency AM
sounds using a 2.5-kHz tone modulated at 210 Hz. Note that all
of these components lie beyond both the excitatory and the
inhibitory regions of the response area. Figure 8B shows the
response of the same neuron when a pure tone (209 Hz) is
presented to the same ear. Figure 8B, bottom, shows the
response to the AM complex alone, which did not evoke
discharge rates higher than the spontaneous rate (denoted by
arrow to the right). As the 209-Hz tone was introduced to the
same ear and increased in level to 76 dB SPL (Fig. 8B, 2nd
FIG. 6. Responses of 2 IC neurons showing (A and C) frequency-vs.-level response areas. Period histograms show responses to
binaural distortion beats with 3 different levels of carrier. A: (contra) 4.0-kHz tone amplitude modulated at 138 Hz and (ipsi) 137-Hz
tone; carrier levels in period histograms, top to bottom: 76, 66, and 61 dB SPL. C: (contra) 3.0-kHz tone amplitude modulated at 394 Hz
and (ipsi) 393-Hz tone; carrier levels in period histograms, top to bottom: 65, 59, and 55 dB SPL. B and D: (left) PSTH of response
to pure tone binaural beats; (top right) response locked to contralateral stimulus; and (bottom right) response locked to ipsilateral
stimulus.
panel from bottom), the response magnitude increased, but was
not significantly modulated. Modulation was maximal and
significant when the 209-Hz tone was presented at 40 dB SPL
(Fig. 8B; 3rd panel from bottom), 25 dB lower than the level of
the AM carrier. Above this level, the response was less,
although still significantly, modulated until, for the highest
level used (55 dB SPL; Fig. 8B, top), the neural response was
completely unmodulated.
The average distortion level, as judged by the maximum
modulation of beats method in the neural response (n = 16),
was 25 ± 4.6 dB lower than the level of the carrier, with values
ranging from 34 to 17 dB (Fig. 9A). One confounding issue is
that, with increasing CF, an increasingly higher carrier frequency was required to ensure that the lower frequency sideband did not encroach into the frequency region delineated by
the pure-tone frequency-versus-level response area. In Fig. 9A,
the magnitude of the distortion is plotted as a function of neural
CF, and the different symbols indicate the carrier frequency of
the AM signal used to elicit the distortion. There was some
tendency for the lower frequencies around 2–3 kHz to generate
the largest distortions (see Fig. 9 for details of carrier frequencies).
Phase and level specificity of distortions
A prediction from the cochlear distortion hypothesis is that
the distortion shows high specificity to the level and phase of
a masker tone presented to the same ear. As shown above,
neural activity is modulated in a manner consistent with monaural beating when a tone 1 Hz lower than the AM rate is
presented to the same ear. Here, by adjusting the level and
phase of a cancellation tone of equal frequency to the AM rate
FIG. 7. A: frequency-vs.-level response area of an IC neuron with CF = 185 Hz. B: PSTHs (left) and period histograms (right)
in response to the stimulus configurations shown to the right. In each case, a 2.5-kHz tone modulated at 185 Hz and
a 184-Hz tone were presented to the contralateral ear to create a monaural distortion beat. White circles indicate frequency and level
of 2.5-kHz carrier and sidebands. The level of the 184-Hz tone was increased (B; bottom to top) from 31 to 59 dB SPL, and the response was maximally
modulated at 48 dB SPL, 17 dB lower than the level of the carrier (3rd from top PSTH).
presented to the same ear, I show that the neural response is
altered in a highly specific, and predictable, manner that
suggests the interaction occurs at a peripheral stage in auditory
processing. Such specificity is considerably less likely if the
generation site for the neural response to the high-frequency
AM signal is central, i.e., if it occurs after the response to the
signal has been converted into a train of action potentials.
Figure 9B plots the binaural phase of the distortion (equal to
the difference between the best IPD to binaural beats and the
best IPD to binaural distortion beats) as a function of the
phase obtained from the monaural beats method. The values
obtained by each method are clearly very similar (the line
indicates unity) and argue for a peripheral rather than a
central generation site for the phenomenon.
The specificity of the level and phase of the presumed
distortion product was further examined using a CF “masker”
tone presented to the same ear as the AM complex modulated
at CF, and a pure tone at CF − 1 Hz was presented to the
opposite ear. The level and phase of the CF masker were
altered systematically, and the neural response was assessed.
Figure 10 shows the responses of an IC neuron with a CF of
437 Hz to a 2.5-kHz tone in the contralateral ear AM at 437 Hz,
a pure-tone “masker” of 437 Hz in the same ear, and a pure
tone of 436 Hz in the ipsilateral ear. From top to bottom along
the central spine of Fig. 10, responses are shown for decreasing
levels of the CF masker tone in the contralateral ear. In each
case, the masker starting phase is −0.1 cycles with respect to
the tone in the ipsilateral ear, which is also the cancellation
phase obtained from the monaural distortion beats experiment
above. At high levels, the 437-Hz masker dominates the
contralateral-evoked response on the basilar membrane and
beats binaurally with the neural response generated by the
436-Hz tone in the ipsilateral ear to produce the observed IPD
sensitivity. The phase of the binaural response at the highest
masker level (7 dB lower than the AM carrier level; top panel)
is +0.2 cycles, almost exactly the difference between the
cancellation phase (−0.1 cycles) and the response to binaural
distortion beats (−0.31 cycles) for this neuron. As the level of
the 437-Hz masker is reduced, however, the response magnitude decreases systematically, although the phase of the response remains constant (Fig. 10, 2nd and 3rd from top; mean
best IPDs of +0.17 and +0.18 cycles, respectively), until, for
FIG. 8. A: frequency-vs.-level response area of an IC neuron
with CF = 210 Hz. B: PSTHs (left) and period histograms
(right) in response to the stimulus configurations shown to the
right. In each case, a 2.5-kHz tone (AM rate = 210 Hz) and
a 209-Hz tone were presented to the contralateral ear. White
circles indicate frequency and level of 2.5-kHz carrier and
sidebands. Response was maximally modulated at a level of 48
dB SPL, 30 dB lower than the level of the carrier (3rd from top
PSTH).
a masker level 19 dB lower than the AM carrier level (Fig. 10,
middle), the IPD-modulated response was abolished. When the
level of the 437-Hz tone is reduced further (Fig. 10, bottom 3
panels), however, the response increases again, and it is once
more significantly modulated with IPD for a masker level of 43
dB SPL (Fig. 10, 3rd panel from bottom) but with a different
best IPD (−0.29 cycles in each of the bottom 2 panels) to that
for high levels of the 437-Hz tone. At these low masker levels,
the distortion tone, which, like the masker, is also 437 Hz,
presumably dominates the response in the contralateral ear and
beats binaurally, with the 436-Hz tone in the ipsilateral ear, to
produce interaural phase characteristics that reflect the phase of
the distortion not the phase of the masker.
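The reversal of best IPD with masker level can be sketched as the vector sum of two components at the same frequency. This is a minimal illustration under stated assumptions, not a model of the recorded neuron: the distortion is assigned a level equivalent to 42 dB SPL and a phase exactly half a cycle from the −0.1 cycle cancellation phase of the masker, the masker levels in the loop are hypothetical, and the helper names `phasor` and `combined` are introduced here only for illustration.

```python
import cmath
import math


def phasor(level_db, phase_cycles):
    """Complex amplitude of a tone from its level (dB) and phase (cycles)."""
    return cmath.rect(10 ** (level_db / 20), 2 * math.pi * phase_cycles)


def combined(masker_db, masker_phase, distortion_db, distortion_phase):
    """Amplitude and phase (cycles) of masker plus distortion at one frequency."""
    total = phasor(masker_db, masker_phase) + phasor(distortion_db, distortion_phase)
    return abs(total), cmath.phase(total) / (2 * math.pi)


# Assumed values: distortion equivalent to 42 dB SPL, exactly half a cycle
# away from the -0.1 cycle cancellation phase of the masker.
DISTORTION_DB, DISTORTION_PHASE = 42.0, 0.4
MASKER_PHASE = -0.1

for masker_db in (54, 48, 42, 36, 30):  # hypothetical masker levels
    amp, phase = combined(masker_db, MASKER_PHASE, DISTORTION_DB, DISTORTION_PHASE)
    print(f"masker {masker_db} dB: amplitude {amp:8.2f}, phase {phase:+.2f} cycles")
```

In this sketch the summed component carries the masker phase while the masker is the stronger of the two, vanishes when the levels match (where the printed phase is numerically meaningless), and takes the distortion phase, half a cycle away, once the masker is weaker. That is qualitatively the pattern described for the best IPDs at high and low masker levels.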
In like manner, the response is dependent on the phase of the
masker (Fig. 10, period histograms along the horizontal dimension). Only for one phase of the masker (the −0.1 cycles
determined from monaural beats) is the neural response abolished (Fig. 10, middle). As the phase of the cancellation tone is
advanced or lagged, the neural response increases, becoming
maximal for phases one-half a cycle from the cancellation
phase (Fig. 10, right) when the masker is in the same phase as
the difference tone. Because the signals at the contralateral ear
are now in additive phase, the response is greater than when
DISCUSSION
This study shows that cochlear distortions generated in
response to moderate level AM sounds are sufficient to provide
the auditory system with a central representation of the rate at
which high-frequency tones are modulated. These presumed
distortions interact with tones presented to the same ear and
interact binaurally with tones in the opposite ear to produce
sensitivity to interaural timing differences, the main binaural
localization cue.
This study also examines claims made in recent studies
suggesting a central neural mechanism of temporal pitch extraction (Biebel and Langner 2002; Schulze and Langner 1997,
1999; Schulze et al. 2002). The data show that low-CF neurons
respond to high-frequency AM tones in which all spectral
components lie outside the pure-tone response area, confirming
the basic observation of Biebel and Langner (2002), and
apparently consistent with the conclusion of both studies that
periodicity pitch is mapped in the CNS. Given the importance
FIG. 9. A: level of a (CF − 1 Hz) tone required to modulate maximally the
response of IC neurons to high-frequency tones amplitude modulated at CF.
Different symbols refer to different carrier frequencies: ●, 2.0 kHz; ▲, 2.5 kHz;
■, 3.0 kHz; ○, 5 kHz; △, 7.0 kHz. B: comparison of monaural and binaural
estimates of the phase of the f0 distortion for 16 IC neurons. The monaural
estimate was obtained from the monaural beats method and the binaural
estimate from the difference between the mean best IPD to binaural beats and
binaural distortion beats. Line indicates unity.
just the distortion is beating binaurally with the 436-Hz tone in
the opposite ear (Fig. 10, bottom). Thus both the level and the
phase of a CF masker that abolishes the neural response are
highly specific.
A second example of this behavior is shown in Fig. 11 (the
same DNLL neuron shown previously in Fig. 5). Once more,
only a narrow range of CF masker levels and phases abolishes
the modulated neural response. This is consistent with the
hypothesis that a distortion tone at f0, generated on the basilar
membrane, is responsible for generating the responses of
low-frequency IC neurons to high-frequency AM sounds.
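How a distortion tone at f0 can arise from a stimulus with no energy at f0 can be illustrated with a toy calculation. The sketch below passes a 2.5-kHz tone, 100% amplitude modulated at 185 Hz, through a memoryless quadratic nonlinearity and measures the component at 185 Hz before and after. The quadratic form, its coefficient of 0.1, the sample rate and the helper names are all assumptions chosen for illustration; the real cochlear nonlinearity is compressive and frequency dependent, and nothing here is fitted to the recorded data.

```python
import math

FS = 20000                 # sample rate in Hz (assumed)
DURATION = 1.0             # seconds; integer-Hz components then fall on exact bins
FC, FM = 2500.0, 185.0     # carrier and modulation frequency in Hz


def am_sample(n):
    """Sample n of a carrier at FC, 100% amplitude modulated at FM."""
    t = n / FS
    return (1 + math.cos(2 * math.pi * FM * t)) * math.cos(2 * math.pi * FC * t)


def component_amplitude(signal, freq):
    """Amplitude of one frequency component, from a single-bin DFT."""
    re = sum(v * math.cos(2 * math.pi * freq * n / FS) for n, v in enumerate(signal))
    im = sum(v * math.sin(2 * math.pi * freq * n / FS) for n, v in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)


x = [am_sample(n) for n in range(int(FS * DURATION))]
# Toy memoryless quadratic distortion; the coefficient 0.1 is arbitrary.
y = [v + 0.1 * v * v for v in x]

for name, sig in (("AM tone", x), ("after toy nonlinearity", y)):
    print(f"{name}: amplitude at {FM:.0f} Hz = {component_amplitude(sig, FM):.4f}")
```

The AM tone itself contains only the carrier and the two sidebands at 2315 and 2685 Hz, so its 185-Hz component is zero to within rounding error. Squaring the signal introduces difference-frequency terms between carrier and sidebands, which is why a component appears at the modulation rate after the nonlinearity.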
A total of 10 IC neurons and 1 DNLL neuron were systematically examined using different levels and phases of the
cancellation tone in the binaural condition. All showed similar
sensitivity to the level and phase of the tonal masker as the
neurons in Figs. 10 and 11, with the response modulation
gradually disappearing as masker level was reduced, only to
reappear with further reductions in masker level.
Figure 12 shows the response of an IC neuron to three
different levels of AM tone (3 kHz modulated at 263 Hz) in the
contralateral ear and a pure tone (262 Hz) in the ipsilateral ear.
The carrier level of the AM tone in the left column (Fig. 12A)
was 10 dB higher than the middle column (Fig. 12B) and 20 dB
higher than that in the right column (Fig. 12C). In the top row,
a high-level CF masker (263 Hz) beats binaurally with the
262-Hz tone in the ipsilateral ear, producing modulated discharge patterns with similar best IPDs. The middle row shows
the response of the same neuron when the level of the 263-Hz
masker is sufficient to offset the distortion produced by the AM
tone, which was 19 dB lower than the level of the carrier in Fig.
12A. For AM levels 10 (Fig. 12B) or 20 dB (Fig. 12C) lower
than in Fig. 12A, the masker level required to abolish the
response was also reduced by 10 dB (Fig. 12, A and B, middle).
In each case, the phase of the masker lagged the tone in the
ipsilateral ear by −0.3 cycles. When the level of the masker
was reduced further, the IPD-modulated response in Fig. 12, A
and B, but not Fig. 12C, reappeared with altered best IPD,
reflecting the relative phases of the contralateral distortion tone
and the ipsilateral pure tone. In Fig. 12C, the AM carrier level
was below the level required to generate a distortion that could
interact binaurally with a tone in the other ear.
FIG. 10. Response of an IC neuron (CF = 437 Hz) to binaural distortion beats (contra, 2.5-kHz AM at 437 Hz; ipsi, 436 Hz)
and a 437-Hz masker in the contralateral ear. Top–bottom: response to different levels of masker. For high masker levels, binaural
response was dominated by the contralateral masker and the ipsilateral tone. When the masker level (42 dB SPL) was equivalent
to that of the distortion (middle), the response was abolished when the masker was presented at the appropriate cancellation phase.
Response modulation was also insignificant for the masker level of 40 dB SPL. When the phase of the masker was adjusted to lag
or lead that of the distortion, the response reappeared with the appropriate phase shift. Response modulation was only insignificant
for masker phase of −0.1 at 42 dB SPL. Responses to maskers in additive phase to the distortion (far left and far right) were
enhanced. For masker levels lower than the distortion, the binaural response reappeared, but with altered best IPD. These responses
reflect the interaction of the contralateral distortion and the ipsilateral pure tone.
of pitch in acoustic processing, particularly in auditory grouping and stream segregation, such findings potentially have
immense importance in the field. The interpretation of the data
in this study, however, suggests that these claims should be
treated cautiously. None of the studies cited above tested the
possibility that cochlear-generated distortions contributed to
neural responses of low-CF neurons to high-frequency AM
tones. Thus such distortions cannot be excluded in these
FIG. 11. Response of DNLL neuron (CF = 208 Hz) shown in Fig. 5 to binaural distortion beats (contra, 3.0-kHz AM at 208
Hz; ipsi, 207 Hz) and a 208-Hz masker in the contralateral ear. Format is identical to Fig. 10. Top–bottom: response to different
levels of masker. For high masker levels, binaural response was dominated by the contralateral masker and the ipsilateral tone.
When the masker level (46 dB SPL) was equivalent to that of the distortion (middle), the response was abolished when the masker
was presented at the appropriate cancellation phase (+0.65 cycles). Responses at 43 dB SPL masker levels at +0.65 masker phase
were not significantly modulated. All other responses, except for a 46 dB SPL masker at +0.6 cycles, were significantly modulated.
When the phase of the masker was adjusted to lag or lead that of the distortion, the response reappeared with the appropriate phase
shift. Responses to maskers in additive phase to the distortion (far left and far right) were enhanced. For masker levels lower than
the distortion, the binaural response reappeared, but with altered best IPD. These responses reflect the interaction of the contralateral
distortion and the ipsilateral pure tone.
studies. Psychophysical experiments investigating temporal
pitch mechanisms employ low-frequency masking noise specifically designed to attenuate or remove the contribution of
low-frequency spectral components at the fundamental frequency that are generated by distortion (Carlyon et
al. 2002; Moore and Sek 2000; Plack and White 2000). Given
the importance to psychophysical studies of masking the potential contribution of cochlear-generated distortions, it is imperative that the contribution of such distortions be excluded
when examining potential neural mechanisms that generate
sensitivity to temporal pitch. Obviously low-frequency masking noise cannot be used to mask cochlear-generated distortions in single-neuron recordings, as the masking noise will
excite the neuron directly through the vibration of the low-frequency end of the basilar membrane. However, low-frequency sounds can be used to assess the presence of cochlear-generated distortions, as was performed in this study. None of
the cited studies (Biebel and Langner 2002; Schulze and
Langner 1997, 1999; Schulze et al. 2002) purporting to examine the existence of periodotopic representations in the brain
examined this possibility. It was found that all of the data that
indicate sensitivity of low-CF neurons to high-frequency periodic stimuli can readily be explained by the generation of CTs
on the basilar membrane at a frequency corresponding to the
AM rate. Pure tones of frequency 1 Hz lower than the frequency of the presumed CT produced a beating pattern in the
neural response. This response could be abolished using a pure
tone of identical frequency to the presumed CT with appropriate phase and amplitude, and the response interacted binaurally, being sensitive to interaural phase differences between
the presumed CT and a tone presented to the other ear. A
parsimonious explanation for these observations is that low-CF
neurons respond to high-frequency AM complexes by means
of being activated through low-frequency auditory channels
from the level of the cochlear nerve to the level of the IC.
Consequently, a central mechanism of periodicity, or pitch,
extraction is not required to explain these data.
inputs, and there is no evidence that IC neurons show any
intrinsic temporal rate preference of inputs that would match
their CF. As such, a rate code for periodicity extracted from
high-frequency laminae in the IC does not require, nor does it
have any apparent template on which to map, a preference for
similar tuning to carrier and envelope modulations. A rate (i.e.,
discharge rate) map of AM preferences could just as well be
arranged orthogonal to the tonotopic organization in the IC.
In those studies in primary auditory cortex where the relationship between CF tuning and tuning for periodicity was not
as clear as in the IC (Schulze and Langner 1997, 1999), the
authors suggested that this indicated a complex pattern of
across-frequency integration. However, a simpler explanation
for these data, but one that does not appear to have been
accounted for in their studies, is the contribution of the head-related transfer function to the spectral shaping of the sound at
the eardrum. Stimuli in these studies were presented under
free-field listening conditions, in which the frequency-dependent gain function of the outer ear potentially alters sound
levels at the eardrum by ≤20 dB, depending on the exact
frequency and the location of the sound source relative to the
head (May and Huang 1996). This would inevitably have
consequences for the sound levels within each frequency component that reaches the eardrum and could alter the apparent
tuning of the neuron when presented with complex sounds
compared with simple tuning.
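The effect of frequency-dependent gain on an AM stimulus can be illustrated with a short calculation of envelope depth from the relative levels of carrier and sidebands. For sinusoidal AM at 100% depth, each sideband lies about 6 dB (a factor of 0.5 in amplitude) below the carrier. The sketch below assumes sidebands in AM phase relationship to the carrier and expresses gains relative to the carrier; the function name `modulation_depth` and the example gain changes of 10 and 5 dB are hypothetical and are not measurements of any outer-ear transfer function.

```python
import cmath
import math


def modulation_depth(lower_db, upper_db, steps=3600):
    """Envelope depth for a carrier (0 dB) plus two sidebands.

    lower_db and upper_db are the sideband levels relative to the carrier.
    The envelope is evaluated over one modulation cycle, with the sidebands
    assumed to be in AM phase relationship to the carrier.
    """
    a_lo = 10 ** (lower_db / 20)
    a_up = 10 ** (upper_db / 20)
    env = [
        abs(1 + a_lo * cmath.exp(-1j * phi) + a_up * cmath.exp(1j * phi))
        for phi in (2 * math.pi * k / steps for k in range(steps))
    ]
    return (max(env) - min(env)) / (max(env) + min(env))


FULL_AM_DB = 20 * math.log10(0.5)  # about -6.02 dB per sideband at 100% depth

print(f"flat response:            depth {modulation_depth(FULL_AM_DB, FULL_AM_DB):.2f}")
# Hypothetical gain changes: lower sideband cut by 10 dB, upper boosted by 5 dB.
print(f"unequal sideband gains:   depth {modulation_depth(FULL_AM_DB - 10, FULL_AM_DB + 5):.2f}")
```

With equal sidebands the depth equals twice the sideband-to-carrier amplitude ratio, so any gain that shifts the sidebands relative to the carrier changes the depth, and unequal gains on the two sidebands additionally distort the envelope shape.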
Are neurons tuned for periodicity?
In the study by Biebel and Langner (2002), the tuning of
low-CF IC neurons for AM rates was remarkably similar to
their tuning for low-frequency pure tones. Since tuning for CF
is determined at the level of the cochlea and imposed on central
auditory neurons by means of axonal connections (a labeled-line code), there is no intrinsic requirement that such neurons
favor an AM rate similar to their CF. The majority of low-CF
IC neurons do not show phase-locked responses to monaural
Comparison with previous studies: free-field stimulation and
binaural sensitivity
There is no categorical way of determining whether responses recorded in this study and those reported previously in
the IC (Biebel and Langner 2002) are derived from the same or
similar population of neurons, apart from the phenomenological observation that low-CF neurons respond to high-frequency AM tones. In this sense, this study concurs completely
FIG. 12. Response of an IC neuron (CF = 263 Hz) to 3
different levels of 3-kHz tone amplitude modulated at 263 Hz
in the contralateral ear, a 262-Hz tone in the ipsilateral ear, and a
263-Hz masker tone in the contralateral ear. Carrier levels:
(column A) 74 dB SPL, (column B) 64 dB SPL, and (column C)
54 dB SPL. Top row: response to high levels of the 263-Hz
masker, where the best IPD presumably reflects the binaural
interaction of the masker and the ipsilateral tone. Middle row:
response when the level of the masker is (A) 19 dB lower, (B)
29 dB lower, and (C) 39 dB lower than the level of the carrier.
The masker phase in each case was obtained from the monaural
distortion beat for the 74 dB SPL carrier. Bottom row: response
to lower masker levels than in the middle row. Note that only
responses to carriers of 74 and 64 dB SPL are apparent; 54 dB
SPL was lower than the threshold for generating the distortion,
which was assessed from monaural distortion beats as being 55
dB SPL.
quency-dependent interaural level difference. The sensitivity to
these interaural cues and their influence on responses were
unknown and untested, whereas this study examined monaural
and binaural influences separately. The inference from their
study is that they studied monaural effects. However, it is only
possible to state categorically that Biebel and Langner studied
uncontrolled binaural influences and did not address monaural
responses at all, since all recordings were made essentially
with free-field stimulation. Such binaural stimulation could
also account for the reported inhibitory influences of high-frequency AM signals, which they took to be inconsistent with
the distortion hypothesis. Since stimulation of the ear ipsilateral to the IC is well documented to provide significant inhibitory drive to IC neurons, the source of inhibition may have
been sound-evoked activity from the ipsilateral ear.
Contribution of cochlear-generated distortions to
pitch processing
Several lines of evidence have been taken to indicate that
distortions cannot account for all of the pitch percept of
spectrally complex sounds. For example, they cannot explain
the pitch shift that occurs when the fundamental frequency is
changed but the spacing of the partials is held constant
(Schouten et al. 1962), and the perception of the residue pitch
is maintained even in the presence of low-pass masking noise
designed to eliminate low-frequency spectral cues for the pitch
of the missing fundamental. Although this is often taken as
evidence against the role of cochlear distortions in the perception of the pitch of complex sounds, recent modeling studies
suggest that even the pitch shift may be explicable in terms of
the nonlinear dynamics of the cochlea itself (Cartwright et al.
1999). Further evidence for an important role for CTs in
processing complex pitches was obtained by Pressnitzer and
Patterson (2001) by examining the lower limit of melodic pitch
(LLMP). When low-frequency masking noise was added to the
stimulus, as is common to many psychophysical investigations
of temporal pitch, the lowest pitched note that could contribute
to a melody was reduced, and the (high) frequency region at which
harmonics contributed to the experience of melodic pitch was
increased. This suggests that cochlear-generated combination
tones provide an important contribution to the pitch perception
of complex sounds under natural listening conditions. In response to higher-frequency harmonic series at moderate intensities (55–65 dB SPL), these authors found a significant
low-frequency distortion spectrum: approximately −10 to −15
dB relative to the level of the pure tone components in the 11-component harmonic series in cosine phase. In this study, AM
signals with just three spectral components appear to produce
sufficient distortion to evoke a strong neural response. Thus, in
the absence of low-frequency masking noise usually provided
in psychophysical studies examining purely temporal mechanisms of pitch, the contribution of cochlear-generated distortions to pitch processing could be substantial, independent of
whether harmonics are resolved or not.
GRANT
This work was supported by a Medical Research Council Career Establishment Grant to the author.
with the phenomenon observed by Biebel and Langner, but
goes further by showing that it occurs in low-frequency ITD
sensitive neurons that might be considered to be specialized for
spatial, rather than spectral, processing. However, before valid
comparisons can be made between this study and that of Biebel
and Langner, or between their own study and psychophysical
studies, it is important to understand the significant differences
in methodology that potentially contribute to their data and
their interpretation of it.
In particular, the method of sound stimulation in Biebel
and Langner’s study is less controlled for sound level at
either ear compared with this study, in which all sounds
were presented by placement of speakers within a few
millimeters of the eardrum, and a calibrated probe tube was
used to record the sound level at this point. Biebel and
Langner reported responses to free-field stimulation, with a
speaker positioned some 4 cm external to one ear, and sound
levels measured 1 cm from the ear canal. In at least two
important ways this constitutes an uncontrolled stimulus and
does not allow for comparison with sound levels in this
study or for comparison with sound levels used in the
carefully controlled psychophysical experiments cited by
them. First, the contribution of the head-related transfer
function (HRTF) on sound levels at the eardrum is not
considered in their study. This is despite the fact that the
pinna boosts sound pressure at the eardrum in a frequency- and position-dependent manner, by 10 dB or more at 4 kHz
in the chinchilla (Murphy and Davis 1998). Additionally,
any complex tuning, with respect to AM rate, of the response to high-frequency AM tones is potentially confounded by the frequency-dependent HRTF. Related to the
frequency dependence of the HRTF is that the relative levels
of the individual spectral components, and thus the form and
depth of the AM stimulus, are not well-controlled in Biebel
and Langner’s study. Although speaker output was reported
as flat to within 6 dB over the range tested, the HRTF is
likely to alter this considerably, depending on the frequency
of the carrier and sidebands. This potentially impacts on the
modulated waveform arriving at the eardrum. For example,
100% AM depth occurs when the sidebands (in appropriate
phase relationship to the carrier) are 6 dB lower than the
carrier. A frequency-dependent HRTF in which one or both
of the sidebands are relatively boosted or attenuated compared with the carrier, for example, could significantly alter
the modulation waveform. This is also relevant to their
observation that strong neural responses could be elicited by
very low modulation depths, since the relative levels of the
spectral components in the AM complex, and thus the AM
depth, at the eardrum are potentially very different to those
output by the speaker. This also impacts on the reported
responsiveness of IC neurons to low modulation depths of
high-frequency carriers, which was taken as evidence for a
non-distortion-related phenomenon.
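The 6-dB figure quoted above follows from expanding a sinusoidally amplitude-modulated tone into its carrier and two sidebands: for modulation depth m, each sideband has amplitude m/2 relative to the carrier, i.e., 20 log10(m/2) dB, which is about −6 dB at m = 1. The Python sketch below is illustrative only and is not part of the original analysis. The function names and the example level changes are hypothetical, and no measured HRTF values are used. It assumes that the sidebands stay in AM phase relative to the carrier, and it shows how a frequency-dependent gain applied to one or both sidebands could change the envelope depth reaching the eardrum.

```python
import math


def sideband_level_db(depth):
    """Level of each sideband re the carrier (dB) for sinusoidal AM of the given depth."""
    if depth <= 0:
        raise ValueError("modulation depth must be positive")
    # Each sideband has amplitude depth/2 relative to the carrier.
    return 20.0 * math.log10(depth / 2.0)


def envelope_depth(lower_db, upper_db, n=3600):
    """Envelope index (max - min) / (max + min) for a three-component complex.

    The carrier is at 0 dB; lower_db and upper_db are the sideband levels
    re the carrier. Sidebands are assumed to be in AM phase (an assumption
    of this sketch, not a property of any measured stimulus).
    """
    a_l = 10.0 ** (lower_db / 20.0)
    a_u = 10.0 ** (upper_db / 20.0)
    env = []
    for k in range(n):
        theta = 2.0 * math.pi * k / n  # modulator phase over one cycle
        # Envelope is the magnitude of 1 + a_u*exp(j*theta) + a_l*exp(-j*theta).
        re = 1.0 + (a_u + a_l) * math.cos(theta)
        im = (a_u - a_l) * math.sin(theta)
        env.append(math.hypot(re, im))
    return (max(env) - min(env)) / (max(env) + min(env))


full = sideband_level_db(1.0)  # sideband level for 100% AM
half = sideband_level_db(0.5)  # a further ~6 dB down

# Hypothetical case: a transfer function attenuates only the upper sideband.
print(round(full, 2))
print(round(envelope_depth(full, full), 3))
print(round(envelope_depth(full, half), 3))
```

With both sidebands about 6 dB below the carrier the envelope index is 1 (100% depth); attenuating only the upper sideband by a further 6 dB or so reduces it to about 0.75 and makes the envelope no longer purely sinusoidal. The same reasoning applies in reverse: boosting the sidebands relative to the carrier would increase the depth at the eardrum above that specified at the speaker.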
A second issue concerns the binaural nature of sound stimulation in Biebel and Langner’s study, where all recordings
were made in response to free-field stimulation. All signals
were subject to an ITD created by the sound arriving at one ear
earlier than the other ear (which was not blocked in their
experiments) in both the low-frequency carriers and high-frequency envelopes of the modulated sounds. In addition, all
high-frequency signals would have been subject to a fre-
REFERENCES
May BJ and Huang AY. Sound orientation behavior in cats. I. Localization
of broadband noise. J Acoust Soc Am 100: 1059–1069, 1996.
Meddis R and O’Mard L. A unitary model of pitch perception. J Acoust Soc
Am 102: 1811–1820, 1997.
Medvedeva MV. Stereotaxic coordinates of midbrain structures of the guinea
pig. Neurosci Behav Physiol 8: 168–171, 1977.
Moore BC and Sek A. Effects of relative phase and frequency spacing on the
detection of three-component amplitude modulation. J Acoust Soc Am 108:
2337–2344, 2000.
Murphy WJ and Davis RR. The role of the chinchilla pinna and ear canal in
electrophysiological measures of hearing thresholds. J Acoust Soc Am 103:
1951–1956, 1998.
Oliver DL. Ascending efferent projections of the superior olivary complex.
Microsc Res Tech 51: 355–363, 2000.
Plack CJ and White LJ. Perceived continuity and pitch perception. J Acoust
Soc Am 108: 1162–1169, 2000.
Pressnitzer D and Patterson RD. Distortion products and the perceived pitch
of harmonic complex tones. In: Physiological and Psychophysical Bases of
Auditory Function, edited by Breebaart DJ, Houtsma AJM, Kohlrausch A,
Prijs VF, and Schoonhoven R. Maastricht, The Netherlands: Shaker Publishing BV, 2001, p. 97–104.
Pressnitzer D, Patterson RD, and Krumbholz K. The lower limit of melodic
pitch. J Acoust Soc Am 109: 2074–2084, 2001.
Schouten JF, Ritsma RJ, and Cardozo BL. Pitch of the residue. J Acoust Soc
Am 34: 1418–1424, 1962.
Schulze H, Hess A, Ohl FW, and Scheich H. Superposition of horseshoe-like
periodicity and linear tonotopic maps in auditory cortex of the Mongolian
gerbil. Eur J Neurosci 15: 1077–1084, 2002.
Schulze H and Langner G. Representation of periodicity pitch in the primary
auditory cortex of the Mongolian gerbil. Acta Otolaryngol Suppl 532:
89–95, 1997.
Schulze H and Langner G. Auditory cortical responses to amplitude modulations with spectra above frequency receptive fields: evidence for wide
spectral integration. J Comp Physiol [A] 185: 493–508, 1999.
Shackleton TM and Carlyon RP. The role of resolved and unresolved
harmonics in pitch perception and frequency modulation discrimination. J
Acoust Soc Am 95: 3529–3540, 1994.
Smith PH. Structural and functional differences distinguish principal from
nonprincipal cells in the guinea pig MSO slice. J Neurophysiol 73: 1653–
1667, 1995.
Smith PH, Joris PX, Carney LH, and Yin TC. Projections of physiologically
characterized globular bushy cell axons from the cochlear nucleus of the cat.
J Comp Neurol 304: 387–407, 1991.
Smith PH, Joris PX, and Yin TC. Anatomy and physiology of principal cells
of the medial nucleus of the trapezoid body (MNTB) of the cat. J Neurophysiol 79: 3127–3142, 1998.
Tsuchitani C. Input from the medial nucleus of trapezoid body to an interaural
level detector. Hear Res 105: 211–224, 1997.
Wiegrebe L, Patterson RD, Demany L, and Carlyon RP. Temporal dynamics of pitch strength in regular interval noises. J Acoust Soc Am 104:
2307–2313, 1998.
Wiegrebe L and Winter IM. Temporal representation of iterated rippled
noise as a function of delay and sound level in the ventral cochlear nucleus.
J Neurophysiol 85: 1206–1219, 2001.
Yin TC and Kuwada S. Binaural interaction in low-frequency neurons in
inferior colliculus of the cat. II. Effects of changing rate and direction of
interaural phase. J Neurophysiol 50: 1000–1019, 1983.
Zhang DX, Li L, Kelly JB, and Wu SH. GABAergic projections from the
lateral lemniscus to the inferior colliculus of the rat. Hear Res 117: 1–12,
1998.
Adams JC and Mugnaini E. Dorsal nucleus of the lateral lemniscus: a
nucleus of GABAergic projection neurons. Brain Res Bull 13: 585–590,
1984.
Agmon-Snir H, Carr CE, and Rinzel J. The role of dendrites in auditory
coincidence detection. Nature 393: 268–272, 1998.
Aitkin LM, Anderson DJ, and Brugge JF. Tonotopic organization and
discharge characteristics of single neurons in nuclei of the lateral lemniscus
of the cat. J Neurophysiol 33: 421–440, 1970.
Batra R and Fitzpatrick DC. Monaural and binaural processing in the ventral
nucleus of the lateral lemniscus: a major source of inhibition to the inferior
colliculus. Hear Res 168: 90–97, 2002.
Bernstein JG and Oxenham AJ. Pitch discrimination of diotic and dichotic
tone complexes: harmonic resolvability or harmonic number? J Acoust Soc
Am 113: 3323–3334, 2003.
Biebel UW and Langner G. Evidence for interactions across frequency
channels in the inferior colliculus of awake chinchilla. Hear Res 169:
151–168, 2002.
Brugge JF, Anderson DJ, and Aitkin LM. Responses of neurons in the
dorsal nucleus of the lateral lemniscus of cat to binaural tonal stimulation.
J Neurophysiol 33: 441–458, 1970.
Cariani PA and Delgutte B. Neural correlates of the pitch of complex tones.
I. Pitch and pitch salience. J Neurophysiol 76: 1698–1716, 1996a.
Cariani PA and Delgutte B. Neural correlates of the pitch of complex tones.
II. Pitch shift, pitch ambiguity, phase invariance, pitch circularity, rate pitch,
and the dominance region for pitch. J Neurophysiol 76: 1717–1734, 1996b.
Carlyon RP, van Wieringen A, Long CJ, Deeks JM, and Wouters J.
Temporal pitch mechanisms in acoustic and electric hearing. J Acoust Soc
Am 112: 621–633, 2002.
Cartwright JHE, Gonzalez DL, and Piro O. Nonlinear dynamics of the
perceived pitch of complex sounds. Phys Rev Lett 82: 5389–5392, 1999.
de Cheveigne A. Cancellation model of pitch perception. J Acoust Soc Am
103: 1261–1271, 1998.
Forsythe ID and Barnes-Davies M. The binaural auditory pathway: excitatory amino acid receptors mediate dual timecourse excitatory postsynaptic
currents in the rat medial nucleus of the trapezoid body. Proc R Soc Lond B
Biol Sci 251: 151–157, 1993.
Goldberg JM and Brown PB. Response of binaural neurons of dog superior
olivary complex to dichotic tonal stimuli: some physiological mechanisms
of sound localization. J Neurophysiol 32: 613–636, 1969.
Goldstein JL. Auditory nonlinearity. J Acoust Soc Am 41: 676–689, 1967.
Grimault N, Micheyl C, Carlyon RP, and Collet L. Evidence for two pitch
encoding mechanisms using a selective auditory training paradigm. Percept
Psychophys 64: 189 –197, 2002.
Helmholtz HLFv. On the Sensations of Tone as the Physiological Basis for
the Theory of Music. New York: Dover Publications, 1885.
Huffman RF and Covey E. Origin of ascending projections to the nuclei of
the lateral lemniscus in the big brown bat, Eptesicus fuscus. J Comp Neurol
357: 532–545, 1995.
Kelly JB and Li L. Two sources of inhibition affecting binaural evoked
responses in the rat’s inferior colliculus: the dorsal nucleus of the lateral
lemniscus and the superior olivary complex. Hear Res 104: 112–126, 1997.
Krumbholz K, Patterson RD, and Pressnitzer D. The lower limit of pitch as
determined by rate discrimination. J Acoust Soc Am 108: 1170–1180, 2000.
Langner G. Neural processing and representation of periodicity pitch. Acta
Otolaryngol Suppl 532: 68–76, 1997.
Licklider JCR. A duplex theory of pitch perception. Experientia 7: 128–134,
1951.