A Cochlear-Implant Processor for Encoding Music and Lowering Stimulation Power

Implantable Electronics

This 75 dB, 357 μW analog cochlear-implant processor encodes fine-phase-timing spectral information in its asynchronous stimulation outputs to convey music to deaf patients.
Ji-Jon Sit, Advanced Bionics
Rahul Sarpeshkar, Massachusetts Institute of Technology

Cochlear implants (CIs), or bionic ears, restore hearing in profoundly deaf (greater than 90 dB hearing loss) patients. They function by transforming frequency patterns in sound into corresponding spatial electrode-stimulation patterns for the auditory nerve. Over the past 20 years, improvements in sound-processing strategies, in the number of electrodes and channels, and in the rate of stimulation have yielded improved sentence and word recognition scores in patients.1 Next-generation implants will be fully implanted inside the patient's body. Consequently, power consumption requirements for signal processing will be very stringent.
The processor we discuss in this article is intended for use in such next-generation implants. It can operate on a 100 mA-hr battery with a 1,000 charge-and-discharge cycle lifetime for 30 years, while allowing nearly 1 mW of electrode-stimulation power. It provides more than an order-of-magnitude power reduction over an A/D-then-DSP (analog/digital, then digital signal processor) solution, which often consumes 5 mW or more. Our processor's digital outputs, its immunity to power-supply noise and temperature variations, and its high programmability level ensure ease of use with an implant system's other parts, such as the wireless communication link and the programming interface.
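As a sanity check, the battery-lifetime claim above can be sketched as a short calculation. The 100 mA-hr capacity, 1,000 cycles, and 30-year lifetime come from the article; the 3.6 V nominal cell voltage and lossless power conversion are assumptions, not stated values.

```python
# Rough power budget implied by the battery-lifetime claim.
YEARS = 30
CYCLES = 1000
CAPACITY_MAH = 100
V_NOMINAL = 3.6  # assumed Li-ion nominal voltage (not stated in the article)

hours_per_cycle = YEARS * 365 * 24 / CYCLES   # time between recharges
energy_mwh = CAPACITY_MAH * V_NOMINAL         # energy per full charge
avg_power_mw = energy_mwh / hours_per_cycle   # sustainable average draw

print(f"{hours_per_cycle:.0f} h per cycle -> {avg_power_mw:.2f} mW average budget")
```

Under these assumptions the budget works out to roughly 1.4 mW on average, which is consistent with a 357 μW processor plus nearly 1 mW of stimulation power.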
CI users and music perception
A common speech-processing strategy, used
in implants and speech-recognition systems,
employs a mel-cepstrum filter bank with eight to
20 channels. The mel scale maps frequencies to
a perceptually linear scale.2 Filter banks based
on the mel scale use linearly spaced filter center
frequencies of up to 1 KHz and logarithmically
spaced center frequencies above 1 KHz. (The
center frequency is the frequency of maximum
response in a bandpass filter output.) Ubiquitous cepstral techniques use a logarithmic measure of the spectral energy in each filter bank
channel for further processing. Implants also
use eight to 20 functioning electrodes for stimulation; because of spatial interactions among
the electrodes, having more electrodes is often
not useful.
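A mel-like center-frequency spacing of the kind described above can be sketched as follows. The 183 Hz and 6,280 Hz endpoints are taken from the per-channel center frequencies listed later in the article's figure 6; the even split between the linear and logarithmic regions is an illustrative assumption.

```python
import numpy as np

def center_frequencies(n_channels=16, f_lo=183.0, f_break=1000.0, f_hi=6280.0):
    """Linear center-frequency spacing up to f_break, logarithmic above it."""
    n_lin = n_channels // 2  # assumed half-and-half split between regions
    lin = np.linspace(f_lo, f_break, n_lin, endpoint=False)
    log = np.geomspace(f_break, f_hi, n_channels - n_lin)
    return np.concatenate([lin, log])

cf = center_frequencies()
print(np.round(cf).astype(int))  # 16 monotonically increasing center frequencies
```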
Published by the IEEE CS ■ 1536-1268/08/$25.00 © 2008 IEEE

Compared to normal-hearing listeners, deaf patients who use a cochlear implant have only a very limited ability to perceive music.3 In an earlier work, we showed that a low-power algorithm providing asynchronous interleaved sampling (AIS) in cochlear implants is well-suited for encoding fine-phase-timing information.4 The ability to encode such information is important for music perception by CI users,5 because even the best-performing CI users appear unable to use more than seven to 10 channels of spectral information.6 A recent study of two stimulation strategies that don't include fine-phase-timing information, the Advanced Combination Encoder (ACE) and Spectral Peak (Speak) strategies, confirms that music appreciation is less than satisfactory even with the latest implants.7
Consequently, researchers have recently proposed several strategies, in addition to AIS, for delivering fine-phase-timing information in CI stimulation. These include frequency amplitude modulation encoding (FAME)8 and peak-derived timing (PDT).9 However, none of these have successfully presented fine-phase-timing information to CI users in a way that can improve music perception. Hence, further tests on CI users are necessary to investigate each technique's efficacy.
Our 16-channel cochlear-implant
processor implements our earlier AIS
algorithm.4 This processor moves the
AIS strategy a step closer to testing
on CI users. It is suitable for encoding
music, and it operates with very low
power consumption because of its use
of analog processing techniques.10,11
The AIS algorithm allows high-rate
sampling of high-intensity channels
while maintaining a low average rate of
sampling for all channels, thus allowing lower stimulation power as well.4
Analog versus digital
We estimate that an A/D-then-DSP implementation of traditional cochlear-implant processing would use about 0.25 mW to 0.5 mW for the microphone front end and A/D converter, and use 250 μW/MIP × 20 MIPS = 5 mW for the other processing, yielding a total power consumption of about 5.5 mW. These numbers are representative of state-of-the-art cochlear-implant processing,
although many commercial processors’
power consumption is significantly
worse because of various system inefficiencies. The power consumption for
stimulation can range from 1 mW to
10 mW, depending on the patient and
stimulation strategy. Our algorithm’s
digital implementation, unlike that
of a traditional processing algorithm,
will likely be extremely power hungry,
owing to its need for asynchronous
processing and high-speed sampling of
certain channels.
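The digital power estimate above can be restated as a short calculation. All of the numbers come from the preceding paragraph; the 357 μW analog figure is the measured chip value quoted later in the article.

```python
# Sketch of the article's A/D-then-DSP power estimate.
front_end_mw = 0.5                 # microphone front end + A/D (upper bound of 0.25-0.5 mW)
dsp_uw_per_mip = 250               # DSP efficiency estimate from the article
mips = 20                          # assumed processing workload
dsp_mw = dsp_uw_per_mip * mips / 1000.0   # DSP processing power
total_mw = front_end_mw + dsp_mw          # total digital-solution power

analog_uw = 357                    # measured analog-processor power
print(f"digital: {total_mw:.1f} mW, analog: {analog_uw / 1000:.3f} mW, "
      f"ratio: {total_mw * 1000 / analog_uw:.1f}x")
```

The ratio exceeds an order of magnitude, which is the comparison the article draws.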
Our analog, 357 μW processor can improve performance, reduce processing power consumption by more than an order of magnitude, and significantly lower stimulation power consumption. This processor makes a fully implantable system feasible and practical. Thanks to the use of analog processing, its effective computational efficiency is between 1 μW/MIP and 5 μW/MIP in a 1.5 μm process. This is considerably better than even the most power-efficient DSPs, dominated by switching capacitance only in the DSP core, implementing their most favorable applications, and implemented in an advanced submicron process. The effective computational efficiency for such DSPs is between 50 μW/MIP and 100 μW/MIP.

Needless to say, the efficiency for DSPs will continually improve with Moore's law, but such improvements are increasingly more modest. Even if we generously assume that the power consumption of the DSP is actually zero at the end of Moore's law, the power consumption of a very low-power microphone front end, anti-alias filter, and A/D converter would still likely exceed 357 μW. Moreover, A/D scaling in speed, power, and precision is far slower than Moore's law, and some circuits in our analog implementation could also benefit from these improvements. A custom digital solution would certainly narrow the gap between a DSP's and our processor's power consumption, but the high cost of the A/D converter, the microphone, and the asynchronous processing in the digital domain would still give our analog processor a significant advantage.

It's useful to understand why our processor operates more efficiently than an A/D-then-DSP implementation. An A/D converter immediately creates a representation of the incoming information as a series of relatively high-precision and high-speed numbers (16 bits at 44 KHz is typical in such applications) that by themselves carry very little meaningful information. This digitization consumes considerable power because doing any task with high speed and high precision is expensive. (High precision is necessary if an operation requires a wide dynamic range and all computations, including gain control, are performed in the digital domain. High speed is necessary to avoid aliasing.) Then, a DSP takes all these numbers and crunches them with millions of multiply-accumulate operations per second, burning power in several switching transistors. It finally extracts more meaningful log spectral-energy information, but, because of speech data's
high variability, at a far slower rate of 100 Hz to 1 KHz in 16 parallel bands and at 6-bit-to-8-bit precision.

Figure 1. Matlab simulation of the asynchronous interleaved sampling (AIS) algorithm: (a) the half-wave-rectified (HWR) filter outputs as inputs to the algorithm (dashed black lines) and the spike outputs (solid red lines); (b) the algorithm's internal state variables, including the neuronal capacitor voltage (solid blue) and the inhibition current (dashed green). The AIS algorithm turns on an inhibition current as soon as a spike (pulse) fires.
In contrast, analog preprocessing lets
our processor efficiently compress the
incoming data such that low-speed and
low-precision A/D converters at a later
stage of the computation quantize the
meaningful information. Some of our
prior work analyzes the optimal point
for digitizing information in more general systems.12 Too much analog preprocessing before digitization is inefficient because the costs required to
maintain precision begin to rise steeply.
Too little analog preprocessing before
digitization is inefficient because the
digital system ignores analog degrees
of freedom that can be exploited to
improve computational efficiency.
Analog systems are more efficient
than digital systems at low output precision, whereas digital systems are more
efficient than analog systems at high
output precision.12 In our processor,
the output precision in each channel is
7 bits, and we intentionally limited the
maximum output firing rate to 1 KHz,
to lower stimulation power and to
avoid a firing rate in the auditory nerve that is limited by the refractory period of recovery (when the nerve is overstimulated at a rate that is too high).
Our processor's internal dynamic range (IDR) is near 55 dB, with gain control allowing 75 dB of input dynamic range. An analog solution can therefore compete with a digital solution if the entire system maintains the necessary precision. If a task required 14 bits of output precision, 72 dB IDR, and 100 KHz bandwidth at each channel, the A/D-then-DSP strategy would definitely be more efficient than our solution.
An analog solution must preserve
its efficiency advantage by carefully
monitoring robustness (that is, immunity or insensitivity to process variation, power supply noise, crosstalk
between signals, and pickup of other
interfering noise sources). Such robustness need not be present in every device
and every signal, as in a digital solution,
but only at important locations in the
signal-flow chain, where it truly matters. Our processor is robust in the face
of power-supply noise, thermal noise,
temperature variations, and transistor
mismatches, owing to its use of feedforward and feedback calibration circuits,
robust biasing techniques, and careful
analog design. Thus, an analog system
addresses the robustness-efficiency
trade-off very differently than a digital
system does.
Programmability is certainly not as
great in an analog system as in a digital one. However, as in our case, this is
less of an issue when implementing an
algorithm that’s known to work. Our
processor’s programmability of 165
parameters with 546 bits allows sufficient but not excessive programmability. Our processor’s efficiency is high
because it exploits the transistor’s analog degrees of freedom for computation
without treating it as a mere switch.
The AIS algorithm
Figure 2. The AIS processor. The FG3329 microphone picks up sound, which goes through a preamplifier and then to a broadband automatic gain control (AGC) circuit. The AGC compresses this sound and converts the input dynamic range of 75 dB to an internal dynamic range of 55 dB. A bank of bandpass filters then filters the AGC's compressed output. Envelope detectors perform rectification and peak detection on the filter outputs to create inputs for the AIS circuit and log A/D (analog/digital) blocks, respectively. The AIS circuit then generates the asynchronous timing events, while the log A/D converter digitizes the envelope of each channel.

This algorithm uses half-wave-rectified (HWR) and phase-locked current outputs from spectral-analysis channels to charge an array of neuronal capacitors that compete with one another in a race-to-spike paradigm: The first neuron to reach a fixed voltage threshold wins the race and gets to fire a spike
(pulse). Thus, the AIS algorithm prevents simultaneous channel stimulation,
to avoid spectral smearing through electrode interactions.13 Once a spike fires,
all the capacitors reset, and the race
to spike begins again, except that the
algorithm applies a negative current to
the neuron that just spiked, to inhibit it
from winning in subsequent races. This
inhibition current remains active for the
duration of a predetermined relaxation
time constant. The algorithm thereby
enforces a minimum interspike interval,
which the relaxation time constant sets.
This prevents the maximum stimulation
rate from ever exceeding the refractory
rate of neuronal recovery, which would
otherwise cause unnatural distortions
in the temporal discharge patterns of
cochlear implants.
However, stimulation is not constrained to fire only at the maximum
rate. The algorithm naturally adapts the
stimulation rate (effectively, the rate at
which the algorithm samples the input)
in both time and spectral space to the
signal’s information content, so that the
processor doesn’t spend any power during quiet periods or on quiet channels.
Therefore, high-intensity channels win
the race to spike more frequently and
are sampled at a high phase-encoded
rate, whereas low-intensity channels
win less frequently and are sampled
at a lower phase-encoded rate. This
adaptable stimulation rate lowers the
average stimulation power and allows
more natural, asynchronous stimulation of the auditory nerve.4 Figure 1
shows a Matlab simulation of the AIS
algorithm on a segment of speech.
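The race-to-spike behavior described in this section can be sketched behaviorally as follows. The capacitance, threshold, current, and inhibition values are illustrative choices, not the chip's.

```python
import numpy as np

def ais(hwr_currents, dt=1e-4, c=1.0, v_thresh=1.0,
        i_inhibit=80.0, t_relax=0.035):
    """Behavioral sketch of the AIS race-to-spike algorithm.

    Each channel's HWR current charges a 'neuronal' capacitor; the first
    channel to cross v_thresh fires a spike and resets every capacitor,
    and the winner then sees a negative inhibition current for t_relax
    seconds. All parameter values are illustrative.
    """
    n_ch, n_steps = hwr_currents.shape
    v = np.zeros(n_ch)              # neuronal capacitor voltages
    inhibit_until = np.zeros(n_ch)  # per-channel inhibition deadlines (s)
    events = []                     # (time, channel) spike events
    for k in range(n_steps):
        t = k * dt
        i_in = hwr_currents[:, k].copy()
        i_in[t < inhibit_until] -= i_inhibit    # inhibit recent winners
        v = np.maximum(v + i_in * dt / c, 0.0)  # charge, clamped at 0 V
        if v.max() >= v_thresh:
            winner = int(np.argmax(v))
            events.append((t, winner))
            v[:] = 0.0                              # all capacitors reset
            inhibit_until[winner] = t + t_relax     # refractory inhibition
    return events

# Two constant drives: a high-intensity and a low-intensity channel (0.2 s).
drive = np.vstack([np.full(2000, 100.0), np.full(2000, 30.0)])
events = ais(drive)
rates = [sum(1 for _, ch in events if ch == i) for i in range(2)]
print(rates)  # the high-intensity channel fires more often, but both fire
```

With constant drives the sketch reproduces the qualitative behavior the article describes: the stronger channel is sampled at a higher rate, while the inhibition window lets the weaker channel win occasionally rather than never.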
The AIS processor
Figure 2 shows a block diagram of
the AIS processor, which implements
the AIS algorithm, building on our prior work.10,11 We modified the envelope detector from that work so that it could quickly output HWR currents to
the AIS circuit. When a spike fires, the
spike activates tristate buffers within
the winning channel to report the log
envelope amplitude as a 7-bit digital
number onto a common output bus,
thus providing both amplitude information and fine-phase-timing information in a single output event. The only
constraint on the rate of spikes arriving
from multiple channels onto the output
bus is that they not overlap. Hence, our
16-channel analog AIS processor provides high temporal resolution, without
the need for a high-rate sampling clock
that constantly runs whether or not
events occur.
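One way to picture a single output event is as a small packed word carrying the channel index alongside its 7-bit log-envelope value. This is purely a conceptual illustration: the article specifies only 7-bit amplitudes driven onto a shared bus by the winning channel's tristates, so the 4-bit channel field and word layout below are assumptions.

```python
# Hypothetical packing of one asynchronous output event:
# [4-bit channel index | 7-bit log-envelope amplitude].
def pack_event(channel, log_amp):
    assert 0 <= channel < 16 and 0 <= log_amp < 128
    return (channel << 7) | log_amp

def unpack_event(word):
    return word >> 7, word & 0x7F  # (channel, amplitude)

word = pack_event(11, 93)
print(f"event word: 0x{word:03X}")
```

The key property the sketch captures is that one event simultaneously conveys amplitude (the 7-bit value) and fine phase timing (the asynchronous moment the event appears on the bus).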
Figure 3. The voltage-mode winner-take-all (WTA) circuit that implements the AIS algorithm.

Figure 4. One channel of the AIS circuit, with the attack-and-release subcircuit shown in bold.

Figure 3 shows a 16-channel, voltage-mode winner-take-all (WTA) circuit that forms the AIS circuit's core by detecting the first channel whose neuronal state variable, Vix, crosses a fixed voltage threshold, Vthresh. (Throughout our description, the letter x in a signal variable denotes the signal variable corresponding to a channel with a channel number x. The value of x can range from 1 to 16.) Output voltage Vox goes high only in the winning channel, thereby suppressing all other channel outputs from rising by pulling up strongly on the common source voltage Vs through positive feedback in the
loop of transistors Tx02 , Tx03, Tx04, and
Tx05. The current sink gate voltage Vsink,
which falls in the vicinity of a threshold
crossing, reduces the pull-down current
from all the Tx15 transistors. This action
enhances the circuit's positive-feedback
loop gain, making it greater than that
of an earlier voltage-mode WTA circuit
that Gert Cauwenberghs and Volnei
Pedroni described.14
When a channel wins and Vox rises, signaling an asynchronous firing event, a method of setting a one-shot pulse width on Vox is required. It's also necessary to immediately inhibit the winning channel from firing again until after an absolute refractory period. Figure 4 shows a single channel of the AIS circuit. The attack-and-release subcircuit (shown in bold in the figure) defines both the Vox pulse width Ta and the absolute refractory period of inhibition Tr immediately after a pulse fires. This subcircuit works as follows: A two-transistor superbuffer of cascaded n and p source followers (Tx07 and Tx08 in figure 4) is biased with a large pull-up current Ia and a small pull-down current Ir. By design, the superbuffer
output voltage Vax initially sits below comparator input threshold voltage VinhTH, so comparator output voltage Vhx is low, and devices Tx10 and Tx11 are off. At the rising edge of Vox, Vax initially undershoots because Tx07 shorts out the threshold drop on Tx08 such that Vax falls to approximately the value of the release voltage Vrx. Current Ia then charges up the release capacitor Cr, and Vax ramps up from a minimum value until it crosses VinhTH and causes Vhx to go high, turning on Tx10 and Tx11 and terminating the pulse by pulling Vox low again. Fixing Cr and VinhTH lets us make pulse width Ta programmable by varying Ia. The pulse width's programmability is necessary to accommodate different time profiles for charge transfer in each stimulation event.

Turning on Tx10 and Tx11 pulls Vox low and inhibits input voltage Vix from rising. Upon the falling edge of Vox, Tx07 shuts off, causing Vax to step back up to a threshold drop above Vrx. Vax then follows Vrx, which ramps down as Ir discharges Cr. The inhibition from Vhx, therefore, remains high until Vax falls below VinhTH again, and the time
it takes to do so sets Tr, programmable by varying Ir. Programmability in Tr is necessary to enforce a minimum interpulse interval, which prevents a channel that has just won the race from immediately winning again. This programmability is also necessary for setting a minimum refractory period, which allows the auditory nerves stimulated by the winning channel to recover.

Figure 5. Measured chip waveforms showing the range of programmability in the AIS circuit: (a) superbuffer output voltage Va waveforms with increasing pulse width Ta and absolute refractory period of inhibition Tr, where Ia is the pull-up current and Ir is the pull-down current; (b) input voltage Vi waveforms with increasing current gain A, where I3 is the parameter used to set A.
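To first order, both intervals follow the constant-current capacitor-charging relation T = C·ΔV/I, which is why Ta scales with 1/Ia and Tr with 1/Ir. A sketch with assumed component values (none of which are the chip's):

```python
# First-order timing of the attack-and-release one-shot: a capacitor ramped
# by a constant current crosses a fixed comparator threshold after C*dV/I.
def ramp_time(c_farads, delta_v, i_amps):
    return c_farads * delta_v / i_amps

C_R = 10e-12   # release capacitor Cr (assumed 10 pF)
DELTA_V = 0.5  # swing from the post-edge minimum up to VinhTH (assumed)

t_a = ramp_time(C_R, DELTA_V, 50e-9)  # large pull-up Ia = 50 nA -> pulse width Ta
t_r = ramp_time(C_R, DELTA_V, 5e-9)   # small pull-down Ir = 5 nA -> refractory Tr
print(f"Ta = {t_a * 1e6:.0f} us, Tr = {t_r * 1e6:.0f} us")
```

Because Ir is deliberately much smaller than Ia, Tr comes out much longer than Ta, matching the millisecond-scale refractory periods visible in figure 5.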
To perform pre-emphasis or equalization across channels, which might
be necessary due to patient variability or fabrication offsets, a translinear input stage programs the effective
threshold in each channel by varying
the current gain, A, applied to each
input. Rather than fixing A and varying
Vthresh across channels, we equivalently
fix Vthresh and vary A. Because PMOS
(p-channel metal-oxide semiconductor) devices Tx21, Tx22 , Tx23, and Tx24
are biased below threshold, the translinear loop yields output current Io as
Iin(I2/I3); thus, A = I2/I3, and we make
A programmable by fixing I2 and varying I3.
The AIS algorithm requires resetting all input capacitors, Cin, to ground
whenever a stimulation pulse is generated. A reset digital signal, which is a
Schmitt-triggered, buffered version
of analog signal Vs, accomplishes this
resetting. Thus, Cin discharges through
Tx09 when Vs goes high, for a stimulation pulse’s duration. We program
Ia, Ir, and I3 using 3-bit current D/A
converters. Figure 5 shows the range
of programmability in the AIS circuit
from measured chip waveforms. The
chip has a total of 546 programmable
bits, allowing the adjustment of 165
spectral-analysis and AIS parameters
through a three-wire serial peripheral
interface. We employed robust biasing of D/A currents, immune to both
power supply noise and temperature, as
described in our earlier work.10,11
Performance comparisons
We played various sound clips from
the computer into the AIS processor,
and we recorded the asynchronous
stimulation pulses along with their 7-bit log envelope values. One of these clips, taken from Handel's Messiah, was analyzed by a bandpass filter bank in Matlab that was mathematically equivalent to the filters on the chip. Figure 6a shows this clip as a spectrogram.
Figure 6b shows the asynchronous
spike pulses scaled by the log envelope
energy, which we reconstructed into
a continuous-time signal (shown in
figure 6c) using low-pass filtering.4 As
the spectrograms show, the chip reconstruction matches the ideal Matlab
simulation very well.
Figure 7 compares the performance
–0.2
Time (ms)
–0.1
0
0.1
of the AIS processor (tested with various speech and music sound clips), an
ideal Matlab AIS simulation, and a traditional non-phase-based tone-vocoding simulation representing continuous-interleaved-sampling (CIS) synchronous stimulation. The tone-vocoding simulation represents only
a traditional CIS strategy; to achieve
higher performance, most modern
cochlear implants implement more
sophisticated strategies, which could
be based on CIS7 or on some other
sampling technique.9 Nevertheless,
cochlear implant simulations are still
helpful for normal-hearing listeners
to gauge the best possible outcomes
in CI users. The reason is that electrical stimulation creates many artificial
problems, such as cross-fiber synchrony
and perceptual dissonance,15 that don’t
exist in natural acoustic stimulation.
In Figure 7a, we correlated the sound
reconstruction (given by the summation of all channels) with the original
sound signal, and the vertical bars represent the correlation coefficient, r. A
high correlation coefficient between
Figure 6. Spectrograms comparing (a) an ideal Matlab simulation of the bandpass filter outputs in each channel, where f indicates the center frequency of each filter; (b) asynchronous spike outputs from the AIS processor, where N is the number of spikes recorded in that channel; (c) spike-reconstructed filter outputs from the AIS processor, where r shows the correlation coefficient of each channel, ranging from 0 to 1, or NaN (meaning "not a number," resulting from N in that channel being 0). Note that fine phase-timing information is preserved. (Sound source: 1.46 s of the Hallelujah chorus from Handel's Messiah.)
the reconstruction and original sound
captures the fidelity of both envelope
and fine-phase-timing preserved in
the signal.4 A high correlation coefficient can also predict a normal-hearing
listener’s increased ability to recognize
words and music while listening to a
reconstruction. The AIS processor’s
correlation coefficients are comparable
to those from the Matlab AIS simulation. Thus, the AIS processor, unlike traditional CIS, encodes fine-phase-timing information, which is necessary
for preserving music.
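The correlation metric itself is straightforward to reproduce. The toy signals below are illustrative, not the article's sound clips, but they show why a phase-preserving reconstruction scores far higher than an envelope-only one.

```python
import numpy as np

# Synthetic "original": a decaying 440 Hz tone sampled at 16 kHz for 1 s.
fs = 16000
t = np.arange(fs) / fs
original = np.sin(2 * np.pi * 440 * t) * np.exp(-3 * t)

phase_preserving = 0.8 * original        # keeps fine phase timing
envelope_only = 0.5 * np.exp(-3 * t)     # keeps the envelope, discards phase

r_phase = np.corrcoef(original, phase_preserving)[0, 1]
r_env = np.corrcoef(original, envelope_only)[0, 1]
print(f"r (phase preserved) = {r_phase:.2f}, r (envelope only) = {r_env:.2f}")
```

Because r measures sample-by-sample agreement with the waveform, any reconstruction that discards fine phase timing correlates poorly with the original even when its envelope is exact, which is the fidelity distinction figure 7a quantifies.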
Figure 7b compares the average firing rate (AFR) of the AIS processor,
the ideal AIS Matlab simulation, and
the CIS tone-vocoding simulation. The
CIS stimulation rate is the fixed rate at
which a conventional CIS processor
samples the envelope of each analysis
channel. In practice, clinicians typically set this fixed rate to between 800
Hz and 2.5 KHz and then adjust it to
maximize performance.1 So, for comparison purposes, we chose a rate of 2
KHz in figure 7b. AIS achieves a lower
AFR than conventional CIS, without
compromising signal fidelity; in fact,
AIS increases this fidelity. Hence, the
AIS processor demonstrates that adapting the sampling rate to the signal’s
needs can substantially save stimulation power. This benefit comes at the
cost of increased signal-processing
power. However, in our analog implementation with the AIS processor, the increase is a modest 106 μW (or 6.6 μW per channel) over our analog CIS processor, which consumes 251 μW.

Figure 7. Performance comparison between the AIS processor outputs, an ideal AIS simulation in Matlab, and a traditional tone-vocoding simulation representing traditional synchronous continuous-interleaved-sampling (CIS) stimulation that doesn't preserve phase information: (a) coefficients of correlation between signal reconstruction and original sound using each method; (b) average rates of stimulation using each method.

Figure 8 shows a die photo of the 9.23 mm × 9.58 mm AIS processor, with labels describing the various blocks. The entire processor, built in a 1.5 μm process, consumes 357 μW. This is very efficient compared to typical A/D-then-DSP cochlear-implant processors, which often consume 5 mW or more.

Figure 8. Die photo of the AIS processor, showing the various blocks.
The AIS processor demonstrates an example of how simple analog circuit-building blocks can help implement a complex signal-processing algorithm with minimal resources of power and silicon area. In this example, the implemented algorithm is one that encodes music and lowers stimulation power, making fully implanted cochlear implants with good performance possible. Future work needs to combine work such as ours with other improvements to allow fully implanted systems to enter clinical practice. These improvements include lowering electrode impedances to further reduce stimulation power, improving the performance of implantable microphones to a level that is comparable to high-quality external microphones, and reducing spectral and temporal smearing due to current-spreading interactions among electrode channels.

The Authors

Ji-Jon Sit is an RFIC and systems engineer at Advanced Bionics. He completed the work described in this article while pursuing his PhD at the Massachusetts Institute of Technology. His research interests focus on emerging cochlear-implant technology. He received his PhD in electrical engineering from MIT. Contact him at Advanced Bionics Corp., 12740 San Fernando Rd., Sylmar, CA 91342.

Rahul Sarpeshkar heads a research group on Analog VLSI and Biological Systems on the faculty of the Electrical Engineering and Computer Science Department at the Massachusetts Institute of Technology. His research interests include analog and mixed-signal VLSI, biomedical systems, ultra-low-power circuits and systems, biologically inspired circuits and systems, molecular biology, neuroscience, and control theory. He received his PhD in computation and neural systems from the California Institute of Technology. He is an associate editor of IEEE Transactions on Biomedical Circuits and Systems. Contact him at Massachusetts Inst. of Technology, 77 Massachusetts Ave., Cambridge, MA 02139; [email protected].
References

1. P.C. Loizou, "Mimicking the Human Ear," IEEE Signal Processing Magazine, vol. 15, no. 5, 1998, pp. 101–130.
2. J.W. Picone, "Signal Modeling Techniques in Speech Recognition," Proc. IEEE, vol. 81, no. 9, 1993, pp. 1215–1247.
3. H.J. McDermott, "Music Perception with Cochlear Implants: A Review," Trends in Amplification, vol. 8, no. 2, 2004, pp. 49–82.
4. J.-J. Sit et al., "A Low-Power Asynchronous Interleaved Sampling Algorithm for Cochlear Implants that Encodes Envelope and Phase Information," IEEE Trans. Biomedical Eng., vol. 54, no. 1, 2007, pp. 138–149.
5. Z.M. Smith, B. Delgutte, and A.J. Oxenham, "Chimaeric Sounds Reveal Dichotomies in Auditory Perception," Nature, 7 Mar. 2002, pp. 87–90.
6. L.M. Friesen et al., "Speech Recognition in Noise as a Function of the Number of Spectral Channels: Comparison of Acoustic Hearing and Cochlear Implants," J. Acoustical Soc. of America, vol. 110, no. 2, 2001, pp. 1150–1163.
7. V. Looi et al., "Comparisons of Quality Ratings for Music by Cochlear Implant and Hearing Aid Users," Ear and Hearing, vol. 28, no. 2, 2007, pp. 59S–61S.
8. K. Nie, G. Stickney, and F.-G. Zeng, "Encoding Frequency Modulation to Improve Cochlear Implant Performance in Noise," IEEE Trans. Biomedical Eng., vol. 52, no. 1, 2005, pp. 64–73.
9. A.E. Vandali et al., "Pitch Ranking Ability of Cochlear Implant Recipients: A Comparison of Sound-Processing Strategies," J. Acoustical Soc. of America, vol. 117, no. 5, 2005, pp. 3126–3138.
10. R. Sarpeshkar et al., "An Analog Bionic Ear Processor with Zero-Crossing Detection," Proc. IEEE Int'l Solid-State Circuits Conf. (ISSCC 05), IEEE Press, 2005, pp. 78–79.
11. R. Sarpeshkar et al., "An Ultra-Low-Power Programmable Analog Bionic Ear Processor," IEEE Trans. Biomedical Eng., vol. 52, no. 4, 2005, pp. 711–727.
12. R. Sarpeshkar, "Analog Versus Digital: Extrapolating from Electronics to Neurobiology," Neural Computation, vol. 10, no. 7, 1998, pp. 1601–1638.
13. B.S. Wilson et al., "Better Speech Recognition with Cochlear Implants," Nature, 18 July 1991, pp. 236–238.
14. G. Cauwenberghs and V.A. Pedroni, "A Charge-Based CMOS Parallel Analog Vector Quantizer," Proc. Advances in Neural Information Processing Systems 7 (NIPS 94), MIT Press, 1994, pp. 779–786.
15. G.E. Loeb, "Are Cochlear Implant Patients Suffering from Perceptual Dissonance?" Ear and Hearing, vol. 26, no. 5, 2005, pp. 435–450.