International Journal of Enhanced Research Publications, ISSN: XXXX-XXXX
Vol. 2 Issue 4, April-2013, pp: (1-4), Available online at: www.erpublications.com
Low-power low-area beamformer design
using switched-current ARAM for In-Probe
Ultrasound Imaging
Surya Sharma1, Trond Ytterdal2
1,2 Department of Electronics and Telecommunications
Norwegian University of Science and Technology, Trondheim, Norway
Abstract: A switched-current (SI) based beamformer, suitable for low-power and low-area applications, is
presented. The most important feature of the proposed beamformer is that the current consumption does not
increase linearly with memory size. It is a fully differential design implemented using the delay-and-sum approach.
An analog RAM (ARAM) implements the delay, and each memory cell in the ARAM is implemented using an SI
circuit. There are 16 memory cells in each ARAM, consuming 35.15 μA/cell. A signal-to-noise ratio (SNR) of 63 dB
is measured after summation of 16 channels. The total current consumption of the beamformer, including the bias
and digital control, is 9 mA and the area is 1.27 mm².
Keywords: Analog beamformer, Analog RAM, Low-power design, Switched-current circuits, Ultrasound.
Introduction
Ultrasound imaging is an important tool in medical diagnostics [1, 2], providing a low-cost and portable imaging
solution for non-invasive applications [3]. For invasive applications, however, better image quality demands a large
number of transducers on the probe. As a result, in-probe beamforming becomes necessary to reduce the number
of cables in the probe. Beamforming also reduces the number of high-speed output drivers required inside the probe.
For the proposed beamformer an SI implementation has been used because summation is easy when the signal is a
current. Also, unlike switched-capacitor circuits, SI circuits can use non-linear capacitors (e.g. MOSFET gate capacitors),
which can be much smaller in area. Unlike the conventional SI memory, a new analog RAM (ARAM) architecture is
proposed in which external capacitors are used and the bias current is shared to reduce the overall power consumption.
In-probe Ultrasound Imaging and Beamforming
A. In-probe ultrasound imaging
Figure 1. Typical In-probe ultrasound imaging system. It consists of both transmitter and receiver.
Non-invasive ultrasound has been used for gynecology, obstetrics, and cardiology applications.
However, invasive ultrasound, also known as intravascular ultrasound (IVUS), is emerging as a very useful tool for plaque
determination inside the coronary arteries. In IVUS the transducer is placed on the tip of a probe and is sent inside the body
using a catheter. The diameter of the catheter is typically 1 mm. Usually a small amount of electronics (for transmitting and
receiving the signal) is also placed inside the probe along with the transducer, for initial signal processing before the signal
is sent out to the external processing unit. This system is called an in-probe imaging system. Fig. 1 shows a typical in-probe
imaging system. The red path is the transmitter path, which is usually a high-voltage system. The green one is the receiver
path. It must be noted that each transducer is connected to one LNA+TGC assembly. This means that as the number of
transducers increases, the number of LNAs and TGCs also increases.
There are certain design challenges associated with in-probe systems. Area and power are the most important design
parameters. The size of the artery strongly limits the circuit area. Even as the number of transducers increases, the
total area of the in-probe system must be kept as small as possible. Because the area is limited, the total heat dissipation
capability is also limited, and since the electronics are placed inside the body, the temperature rise must be kept below
body temperature. This means that the total power consumption of the circuit must be kept very small.
B. Beamforming
A beamformer is the circuit block in the receiver chain that sums the signals from multiple channels. Each
channel consists of a transducer, the LNA and the TGC. The beamformer performs two very important tasks: first, it
performs receive beam focusing, and second, it reduces the number of cables from the in-probe system to the external
processing unit. The process of beamforming is explained in Fig. 2. The reflected wavefront from a target is usually
spherical or elliptical in shape, which means that the wavefront hits the various transducers at different times. In order to
focus the beam, all the received signals must be added in phase. To align the received signals, different delays must be
added to the various channels. This process is called the delay-and-sum method. It is typically done by sampling the signal
and storing the value in an analog RAM (ARAM). The stored signals can be retrieved after the required delay time and sent
directly to the summation circuitry. The delays are discrete and depend on the sampling frequency (fsample). The advantage
of discrete-time delays is that they are independent of input amplitude and frequency. The maximum delay is (N-1)/fsample,
where N is the number of memory cells in the ARAM. Multiple ARAM designs have been proposed in the literature [4]-[7],
but they tend to consume a lot of power. In this paper a new switched-current based ARAM architecture is proposed.
External capacitors are used and the bias current is shared to reduce power consumption.
Figure 2. Beamforming process using the delay-and-sum method. The wavefront from the target is received at different
channels at different times. Delays are added to each channel such that the signals align after the delay and can be summed in phase.
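As an illustration of the delay-and-sum method described above, the following minimal Python sketch aligns sampled channel signals with integer sample delays and sums them. The function names and the list-based signal representation are assumptions made for this example only; in the beamformer itself these steps are performed in the analog domain by the ARAM and the current summation.

```python
def delay_and_sum(channels, delays_in_samples):
    """Delay each channel by a whole number of samples, then sum.

    channels: list of equally long lists of samples (hypothetical data layout).
    delays_in_samples: one non-negative integer delay per channel.
    """
    n_samples = len(channels[0])
    summed = [0.0] * n_samples
    for signal, delay in zip(channels, delays_in_samples):
        # Output sample n of a delayed channel is input sample n - delay.
        for n in range(delay, n_samples):
            summed[n] += signal[n - delay]
    return summed


def max_delay_seconds(n_cells, f_sample):
    """Maximum delay (N-1)/fsample for an ARAM with n_cells memory cells."""
    return (n_cells - 1) / f_sample


# The same pulse reaches three channels at sample 3, 2 and 0.
channels = [
    [0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
]
aligned_sum = delay_and_sum(channels, [0, 1, 3])
print(aligned_sum)
print(max_delay_seconds(16, 25e6))
```

With the delays chosen to compensate the arrival times, the three pulses add at the same sample index instead of being spread over three.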
Analog RAM Design using Switched-Current Memory
Switched-current (SI) memories have been used in the past for medium-accuracy applications. SI memory cells have two
major advantages. First, they use non-linear capacitors (typically MOS gate capacitors), which occupy a very small
area in comparison with switched-capacitor circuits, which require MIM capacitors. Second, since the ultimate aim in the
beamforming application is summation of signals, this becomes very easy when the signal is a current. The problem with
existing SI memory implementations is that they are all power hungry. There have been a few Class-AB implementations
of the SI cell, but they tend to have higher distortion and are slow.
The authors have proposed a new Class-A SI memory architecture for ARAM implementation which reduces the power
consumption without the distortion of Class-AB cells [9]. First we describe a traditional SI memory cell, shown in Fig. 3.
The figure shows only the scheme of the cell and not the actual implementation, in order to keep the explanation simple.
M3 is the memory transistor and the gate capacitor of M3 acts as the storage capacitor. clk1 and clk2 are the write and read
clocks, respectively, and I1 is the bias current. During clk1 (write mode) M3 is configured as a diode-connected transistor;
I1+Iin flows through it and charges the gate of M3 to a certain Vgs. When clk2 turns on, the gate of M3 is still charged to the
same Vgs and, assuming that Vds across M3 is the same in both cases, M3 continues to sink I1+Iin. Hence an output current
similar to the input current but with opposite phase is obtained.
Figure 3. Scheme of basic SI cell.
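The write and read phases of the basic cell can be sketched numerically. The example below assumes an ideal square-law MOSFET, I = (beta/2)(Vgs - Vth)^2, which the text does not specify; the parameter values for beta and Vth and the function names are hypothetical and chosen only for illustration.

```python
import math


def write_phase_vgs(i_bias, i_in, beta, v_th):
    """Gate voltage stored while the diode-connected transistor carries i_bias + i_in.

    Assumes an ideal square-law device (an assumption of this sketch).
    """
    i_drain = i_bias + i_in
    if i_drain <= 0:
        # The transistor cannot carry a negative current, so the input
        # range is bounded by the bias current.
        raise ValueError("input current must stay above -i_bias")
    return v_th + math.sqrt(2.0 * i_drain / beta)


def read_phase_output(i_bias, v_gs, beta, v_th):
    """Output current when the transistor holds v_gs and the bias supplies i_bias."""
    i_drain = 0.5 * beta * (v_gs - v_th) ** 2
    return i_bias - i_drain


# Hypothetical values: 100 uA bias, 20 uA input, beta = 1 mA/V^2, Vth = 0.5 V.
v_stored = write_phase_vgs(100e-6, 20e-6, 1e-3, 0.5)
i_out = read_phase_output(100e-6, v_stored, 1e-3, 0.5)
print(v_stored, i_out)
```

In this idealized model the output current equals the input current with opposite sign, and the check on the drain current shows why the usable input range is tied to the bias current.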
The problem with the basic cell is that the input current range depends directly on the bias current, so a large input
range needs a larger bias current. When such a cell is used in a memory array, many cells operate at the same time,
causing a large current consumption. The proposed architecture shares the bias current between N memory cells,
where N is the size of an ARAM. This reduces the effective power consumption per memory cell by a factor of N. Since
only one cell is read or written at any time, a single bias current suffices and passive memory cells can be used. Here
external capacitors are used, and this capacitor bank stores the gate voltage of the memory transistor at different
sampling instants. Any number of such capacitors can be connected without adding to the power consumption, as
long as the bandwidth of the cells is high enough. Fig. 4 shows the scheme for the proposed architecture.
Figure 4. Scheme for proposed ARAM architecture with external capacitor bank.
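The effect of sharing one bias current across the memory depth can be shown with a short calculation. In the sketch below, the per-ARAM current of 562.4 µA is not stated in the text; it is the value implied by the reported 35.15 µA/cell and 16 cells, and the deeper memories are hypothetical cases that assume the bandwidth remains sufficient.

```python
def current_per_cell(i_core, n_cells):
    """Effective current per memory cell when one core current is shared by n_cells."""
    return i_core / n_cells


# Implied per-ARAM current: 16 cells x 35.15 uA/cell (derived, not quoted).
I_CORE = 16 * 35.15e-6

for depth in (16, 32, 64):  # 32 and 64 are hypothetical depths
    per_cell_ua = current_per_cell(I_CORE, depth) * 1e6
    print(f"depth {depth}: {per_cell_ua:.2f} uA per cell")

# Sixteen such ARAMs (one per channel) give the total beamformer current.
print(f"16 channels: {16 * I_CORE * 1e3:.2f} mA")
```

The total for 16 channels comes out close to the 9 mA quoted for the whole beamformer.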
The core is similar to the basic SI memory cell. The capacitor bank is connected at the gate of the memory transistor. The
schematic implementation is shown in Fig. 5. Typically for SI memories M1 is used as the memory transistor [2] and the
signal is stored on the gate of M1 (Cgs,M1), but for the proposed design additional capacitors (C1-C4) have been added.
Each capacitor acts as memory storage and can be selected using the ‘sel’ signals. The memory size/depth is determined
by the number of such capacitors in parallel.
There are several advantages to sharing the bias current. First, the effective current consumption per memory cell becomes
Ibias/N, where N is the number of memory cells, or depth of the memory. Second, a very high input current range and
bandwidth can be maintained, which results in linearity comparable to Class-A circuits with power consumption close to
that of Class-AB designs. Since the memory elements are passive components, the memory size can be increased without
increasing the bias current, so a large memory can be obtained for a small current consumption. The upper limit on the
number of capacitors that can be connected to the core depends on the bandwidth of the ARAM. The unity-gain bandwidth
is given by
$$f_{ugb} = \frac{g_{m,M1}}{2\pi \left( C_{gs,M1} + N \cdot C_{select} + C_{MEM} \right)} \qquad (1)$$
where Cselect is the capacitance of the select switches and CMEM is the memory capacitor. As N increases, the bandwidth
of the ARAM reduces. To ensure proper settling, the bandwidth must be 5-6 times the reciprocal of the write cycle length
(twrite = 1/(2·fsample)):
$$f_{ugb} \geq 10 \cdot f_{sample} \qquad (2)$$
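Equations (1) and (2) together bound the memory depth N. The following sketch, which reads (2) as a lower bound on the unity-gain bandwidth, computes the largest N that still meets it. All component values in the example (gm, Cselect, CMEM, Cgs) are hypothetical and are not taken from the design; the function names are likewise illustrative.

```python
import math


def unity_gain_bandwidth(gm, c_gs, c_select, c_mem, n_cells):
    """Unity-gain bandwidth from (1), in Hz."""
    return gm / (2 * math.pi * (c_gs + n_cells * c_select + c_mem))


def max_memory_depth(gm, c_gs, c_select, c_mem, f_sample, margin=10.0):
    """Largest N for which f_ugb >= margin * f_sample; 0 if no N satisfies it."""
    # Total capacitance allowed at the gate node by the bandwidth requirement.
    c_total_max = gm / (2 * math.pi * margin * f_sample)
    n = math.floor((c_total_max - c_gs - c_mem) / c_select)
    return max(n, 0)


# Hypothetical values: gm = 2 mS, Cgs = 15 fF, Cselect = 5 fF, CMEM = 1 pF.
depth = max_memory_depth(2e-3, 15e-15, 5e-15, 1e-12, 25e6)
print(depth, unity_gain_bandwidth(2e-3, 15e-15, 5e-15, 1e-12, depth))
```

Because only the N·Cselect term grows with memory depth, smaller select switches directly raise the achievable depth for a given gm.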
Because an external capacitor is used instead of the MOSFET gate capacitor, there is a small error due to charge
sharing between the two capacitors:
$$\varepsilon = V_{in} \cdot \frac{C_{gs,M1}}{C_{gs,M1} + C_{MEM}} \qquad (3)$$
As can be seen from (3), the error is input dependent and must therefore be small to maintain linearity. There are two
ways to reduce this error. One way is to use two different cores for read and write. The core used for read mode is
reset during write, so Cgs1 adds a finite, known amount of charge every time. The error in this case is not input
dependent and calibration can be used. The only drawback is that using two cores doubles the power consumption of
the circuit. The other way is to make the memory capacitor much larger than Cgs1, which results in some
area penalty. The advantage here is the ability to use the MOSFET oxide capacitance as the external capacitor, thus saving
area compared with a metal capacitor. If N bits of accuracy are required from the beamformer, the memory capacitor can
be sized such that Cgs1/(Cgs1+CMEM) << 1/2^N. The choice between the two options depends on power and
area constraints. For this work the single-core implementation with larger capacitors was used: making CMEM very large
compared to Cgs,M1 reduces the error. For the current design CMEM is close to 15 pF while Cgs,M1 is around 15 fF.
Figure 5. Schematic implementation of the proposed architecture showing a memory depth of 4 cells. The actual
implementation has 16 cells in each ARAM. Read and write clocks and select signals are generated on chip using a
controller. All the cascode biases are also generated on chip inside a global bias generation block.
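The sizing rule can be checked with the capacitor values quoted above. The sketch below evaluates the charge-sharing ratio Cgs/(Cgs+CMEM) from (3) and expresses it as an equivalent number of bits; the helper names, and the boundary-case function that returns the CMEM at which the ratio equals exactly 1/2^N, are assumptions made for illustration.

```python
import math


def charge_sharing_ratio(c_gs, c_mem):
    """Fraction of the input that appears as charge-sharing error, from (3)."""
    return c_gs / (c_gs + c_mem)


def equivalent_bits(c_gs, c_mem):
    """Number of bits n for which the ratio equals 1/2^n."""
    return -math.log2(charge_sharing_ratio(c_gs, c_mem))


def boundary_memory_capacitor(c_gs, n_bits):
    """CMEM at which the ratio equals exactly 1/2^n_bits; the design needs much more."""
    return c_gs * (2 ** n_bits - 1)


# Values quoted in the text: CMEM close to 15 pF, Cgs,M1 around 15 fF.
print(charge_sharing_ratio(15e-15, 15e-12))
print(equivalent_bits(15e-15, 15e-12))
print(boundary_memory_capacitor(15e-15, 8))
```

With the quoted values the ratio is about 1/1000, that is, roughly 2^-10.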
Current-mode fully differential designs may have a higher contribution from the second harmonic than their voltage-domain
counterparts. This is due to the strong dependence of gain on gm in current-mode circuits as compared to voltage mode.
The input current in fully differential mode is different for the two halves, and so are the corresponding gm values. It
is therefore imperative that a high overdrive be used for the memory biasing transistor (M1 in this case). This reduces
the change in gm with input current and hence suppresses the second harmonic. Noise is another important factor in
this design. Current-based circuits tend to have a lower signal-to-noise ratio (SNR) owing to the
small voltage swing. The cascode transistors do not add a significant amount of noise current to the output at the
frequencies under consideration. The memory and bias transistors must be sized such that the output noise current
is minimized, which can be done by keeping gm small for both transistors. The drawback of reducing gm is that it
limits the signal current amplitude and hence the overall dynamic range. In this design the transistor sizes have
therefore been optimized to maximize the dynamic range.
The sequence of signals is very important in this circuit. To avoid charge injection from the read/write switches, the ‘select’
signals are turned off first. Moreover, the switches for the ‘select’ signals have dummy transistors to avoid charge injection.
Similarly, ϕwrite_e turns off before ϕwrite and has a dummy transistor to avoid charge injection. A digital controller generates
all the switching signals, including clocks and select signals. There is a block to generate bias voltages for M2-M4. This block
can easily be shared between all the channels of the beamformer, so its current consumption and noise contribution are
negligible. The offset between write and read clocks must be varied for beamforming.
Four bits (Offset1-4) control the delay between read and write. Code ‘0000’ means no delay, whereas
‘1111’ indicates maximum delay. For ease of testing these bits are currently provided externally. There is a voltage-to-current
converter (V2I) block [3] before the ARAM so that a voltage input can be used during testing. For the current
beamformer 16 channels have been summed together, each implemented using one ARAM with a depth of 16. The
sampling frequency is 25 MHz, providing a unit delay of 40 ns and a maximum delay of 600 ns. To perform summation the
output signals can simply be connected together.
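The relation between the four offset bits and the resulting delay can be sketched as follows. The example assumes that the code is binary weighted and written most significant bit first, which the text does not state explicitly; it is chosen because it reproduces the stated end points (‘0000’ for no delay, ‘1111’ for 600 ns). The function name is hypothetical.

```python
def delay_from_offset_code(code, f_sample=25e6):
    """Delay in seconds for a 4-bit offset code given as a string such as '0101'.

    Assumes a binary-weighted code, MSB first (an assumption of this sketch).
    """
    if len(code) != 4 or any(bit not in "01" for bit in code):
        raise ValueError("code must be a string of four bits")
    # Each code step corresponds to one sampling period.
    return int(code, 2) / f_sample


for code in ("0000", "0001", "1000", "1111"):
    print(code, round(delay_from_offset_code(code) * 1e9, 1), "ns")
```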
Results
The circuit implementation and layout were done in a 180 nm single-poly, six-metal process. The area of one channel
with 16 memory cells is 250 × 150 µm². There are 16 channels, whose output currents are summed and converted to a
voltage using an external trans-impedance amplifier (TIA), implemented with an AD8139 with resistive feedback.
An AD8130 has been used for differential to single-ended conversion. For all the SNR and FFT
measurements the input frequency is 10 MHz and the sampling frequency is 25 MHz. The beamformer achieves 63 dB SNR
for summation of 16 channels. An FFT plot after summation of 16 channels is shown in Fig. 6. The input p-p voltage for
the FFT is 1.2 V.
Figure 6. FFT plot after summation of 16 channels when the input p-p signal is 1.2 V and the input frequency is 10 MHz.
The sampling frequency is 25 MHz.
Because this is a fully differential architecture, HD2 is very small, but HD3 can be seen close to 5 MHz after being
folded back into the Nyquist band. Some spurs are visible in the spectrum besides the harmonics. These are
intermodulation products due to coupling between the clock and the signal frequency. The spurs increase in magnitude as
the number of channels increases, which means they act as coherent noise sources.
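The location of the folded harmonic follows from ideal sampling arithmetic. The short sketch below folds a frequency into the first Nyquist band (0 to fsample/2); the function name is an assumption, and the calculation ignores any filtering in the measurement chain.

```python
def folded_frequency(f, f_sample):
    """Frequency at which a tone at f appears after ideal sampling at f_sample."""
    f_mod = f % f_sample
    # Tones in the upper half of the band mirror around f_sample / 2.
    return min(f_mod, f_sample - f_mod)


F_IN = 10e6
F_SAMPLE = 25e6
for harmonic in (1, 2, 3):
    alias_mhz = folded_frequency(harmonic * F_IN, F_SAMPLE) / 1e6
    print(f"harmonic {harmonic}: {alias_mhz} MHz")
```

The third harmonic at 30 MHz lands at 5 MHz, as observed. Under the same arithmetic the second harmonic at 20 MHz also maps to 5 MHz.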
Table 1: Beamformer measured results and comparison with the state-of-the-art
Parameter                    | This work    | [10]         | [11]         | [12]
Technology                   | 0.18 µm CMOS | 0.13 µm CMOS | 0.35 µm CMOS | 0.8 µm CMOS
Supply                       | 1.8 V        | 1.2 V        | 3.3 V        | 5 V
# of channels                | 16           | 32           | 9            | 16
# unit delays per channel*   | 16           | 192          | 8            | 72
Current/unit delay (µA)      | 35.15        | 33**         | 10           | 194
Area/unit delay (µm²)        | 4960         | 2673         | 3472         | 62500
Input frequency (MHz)        | 10           | 1.25         | 5            | 2
Sampling frequency (MHz)     | 25           | 5            | 25           | 64
Unit delay (ns)              | 40           | 4            | 40           | 15.62
Max. delay                   | 600 ns       | 748 ns       | 280 ns       | 1.09 µs
SNR (dB)                     | 63           | -            | -            | 60
NCF (dB)                     | 13           | -            | -            | -
To quantify the effect of these spurs on beamformer performance, the noise correlation factor (NCF) needs to be measured.
It is defined as
$$\mathrm{NCF} = 10 \log_{10} \left( \frac{P_{noise,16channels}}{P_{noise,1channel}} \right) \qquad (4)$$
where Pnoise,16channels is the noise power for 16 channels and Pnoise,1channel is the noise power for a single channel. NCF is
measured by turning off the input signal source while the clock is kept on. The total noise power at the output is measured
while increasing the number of channels. When only uncorrelated noise is present, the NCF is 12 dB (10log16). When
correlated noise sources are present, the NCF increases. For the proposed design the NCF is measured to be 13 dB. This
small value of NCF implies that the summation is linear and that the impact of spurs on the summation is not significant.
Table 1 shows the measured results for the beamformer and compares them with state-of-the-art designs [10-12]. The
performance of the proposed beamformer is comparable to existing work. It must be noted, though, that the current
consumption in all the other works increases linearly with memory size, whereas in this work it does not. In fact, as long as
the bandwidth requirement from (2) is met, the memory size can be increased without any increase in current, resulting
in a smaller current/unit delay. This is the biggest advantage of the proposed architecture.
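The reference points for the NCF in (4) can be reproduced numerically. The sketch below evaluates (4) and the two limiting cases for n summed channels: fully uncorrelated noise, where powers add, and fully correlated noise, where amplitudes add. The fully correlated limit and the function names are additions made for illustration and do not appear in the measurements.

```python
import math


def ncf_db(p_noise_n_channels, p_noise_one_channel):
    """Noise correlation factor from (4), in dB."""
    return 10 * math.log10(p_noise_n_channels / p_noise_one_channel)


def ncf_uncorrelated_db(n_channels):
    """Noise powers add, so the total power grows as n."""
    return 10 * math.log10(n_channels)


def ncf_fully_correlated_db(n_channels):
    """Noise amplitudes add, so the total power grows as n squared."""
    return 20 * math.log10(n_channels)


print(round(ncf_uncorrelated_db(16), 2))
print(round(ncf_fully_correlated_db(16), 2))
```

The measured 13 dB lies close to the uncorrelated limit for 16 channels and far from the fully correlated one.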
Acknowledgment
This work was supported by the Research Council of Norway through the project Microsystems for Medical Ultrasound
Video Cameras (192456) and Maxim Integrated, USA.
Conclusions
A low-power and low-area SI based beamformer is presented. Scalability is the most important aspect of the proposed
architecture: the current consumption does not increase linearly with memory size. Non-linear MOS capacitors are used
as passive memory elements. The current/unit delay is 35.15 μA and it can be further reduced. The beamformer achieves
63 dB SNR after summation, and the summation is linear with respect to the number of channels and the input signal.
References
[1]. K. K. Shung, “Diagnostic Ultrasound: Past, Present, and Future,” Journal of Medical and Biomedical Engineering, vol. 31, no. 6,
pp. 371-371, 2011.
[2]. B. Goldberg and B. Kimmelman, “Medical diagnostic ultrasound: a retrospective on its 40th anniversary.”
[3]. D. A. Christensen, Ultrasonic Bioinstrumentation. New York: John Wiley and Sons, Jan. 1988.
[4]. K. A. Nishimura and P. R. Gray, “A monolithic analog video comb filter in 1.2 µm CMOS,” IEEE J. Solid-State Circuits, vol. 28,
pp. 1331-1339, Dec. 1993.
[5]. A. Gerosa, G. M. Cortelazzo, A. Baschirotto, and E. Malavasi, “2-D video SC FIR filters based on analog RAMs,” IEEE Trans.
Circuits Syst. II, Analog Digit. Signal Process., vol. 7, pp. 1348-1360, Dec. 1999.
[6]. C. Ricardo, A. G. Vazquez, S. Espejo, and R. D. Castro, “A 0.5 µm CMOS analog RAM chip for real-time video processing,”
Proceedings of the European Solid-State Circuits Conference, pp. 162-165, 1999.
[7]. J. B. Hughes and K. W. Moulding, “Switched-current signal processing for video frequencies and beyond,” IEEE J. Solid-State
Circuits, vol. 28, no. 3, pp. 314-322, March 1993.
[8]. S. J. Daubert, D. Vallancourt, and Y. P. Tsividis, “Current copier cells,” Electronics Letters, vol. 24, no. 25, pp. 1560-1562, 1988.
[9]. S. Sharma and T. Ytterdal, “Low power front end electronics for in-probe beamforming in ultrasound imaging,” Proceedings of
the IEEE Norchip Conference, pp. 1-4, 2013.
[10]. J. Y. Um, Y. J. Kim, S. E. Cho, M. K. Chae, B. Kim, J. Y. Sim, and H. J. Park, “A single-chip 32-channel analog beamformer with
4-ns delay resolution and 768-ns maximum delay range for ultrasound medical imaging with a linear array transducer,” IEEE
Trans. Biomedical Circuits and Systems, vol. 9, no. 1, pp. 138-151, 2015.
[11]. Z. Yu, M. A. P. Pertijs, and G. C. M. Meijer, “Ultrasound beamformer using pipeline operated S/H delay stages and charge
mode summation,” Electronics Letters, vol. 47, no. 18, pp. 1011-1012, September 2011.
[12]. B. Stefanelli, I. O'Connor, L. Quiquerez, A. Kaiser, and D. Billet, “An analog beam-forming circuit for ultrasound imaging using
switched-current delay lines,” IEEE J. Solid-State Circuits, vol. 35, no. 2, pp. 202-211, February 2000.