
International Journal of Enhanced Research Publications, ISSN: XXXX-XXXX Vol. 2 Issue 4, April-2013, pp: (1-4), Available online at: www.erpublications.com

Low-power low-area beamformer design using switched-current ARAM for In-Probe Ultrasound Imaging

Surya Sharma1, Trond Ytterdal2
1,2 Department of Electronics and Telecommunications, Norwegian University of Science and Technology, Trondheim, Norway

Abstract: A switched-current (SI) based beamformer, suitable for low-power and low-area applications, is presented. The most important feature of the proposed beamformer is that the current consumption does not increase linearly with memory size. The design is fully differential and uses the delay-and-sum approach. An analog RAM (ARAM) implements the delay, and each memory cell in the ARAM is implemented using an SI circuit. There are 16 memory cells in each ARAM, consuming 35.15 μA/cell. A signal-to-noise ratio (SNR) of 63 dB is measured after summation of 16 channels. The total current consumption of the beamformer, including the bias and digital control, is 9 mA, and the area is 1.27 mm².

Keywords: Analog beamformer, Analog RAM, Low-power design, Switched-current circuits, Ultrasound.

Introduction

Ultrasound imaging is an important tool in medical diagnostics [1, 2] that provides a low-cost and portable imaging solution for non-invasive applications [3]. For invasive applications, however, better image quality demands a large number of transducers on the probe. As a result, in-probe beamforming becomes necessary to reduce the number of cables in the probe. Beamforming also reduces the number of high-speed output drivers required inside the probe. The proposed beamformer uses an SI implementation because summation is easy when the signal is a current. Also, unlike switched-capacitor circuits, SI circuits can use non-linear capacitors (e.g. the MOSFET gate capacitor), which can be much smaller in area.
Unlike the conventional SI memory, the proposed analog RAM (ARAM) architecture uses external capacitors and shares the bias current to reduce the overall power consumption.

In-probe Ultrasound Imaging and Beamforming

A. In-probe ultrasound imaging

Non-invasive ultrasound has been used in gynecology, obstetrics, and cardiology. Invasive ultrasound, also known as intravascular ultrasound (IVUS), is emerging as a very useful tool for plaque determination inside the coronary arteries. In IVUS the transducer is placed on the tip of a probe and sent inside the body using a catheter. The diameter of the catheter is typically 1 mm. Usually a small amount of electronics (for transmitting and receiving signals) is also placed inside the probe along with the transducer for initial signal processing before the signal is sent out to an external processing unit. Such a system is called an in-probe imaging system.

Figure 1. Typical in-probe ultrasound imaging system. It consists of both transmitter and receiver.

Fig. 1 shows a typical in-probe imaging system. The red path is the transmitter path, which is usually a high-voltage system. The green path is the receiver path. Note that each transducer is connected to one LNA+TGC assembly, so as the number of transducers increases, the number of LNAs and TGCs also increases.

There are certain design challenges associated with in-probe systems. Area and power are the most important design parameters. The size of the artery strongly limits the circuit area. Even as the number of transducers increases, the total area of the in-probe system must be kept as small as possible.
Because the area is limited, the total heat dissipation capability is also limited, and since the electronics are placed inside the body, the temperature rise must be kept below body temperature. The total power consumption of the circuit must therefore be kept very small.

B. Beamforming

The beamformer is the circuit block in the receiver chain that sums the signals from multiple channels. Each channel consists of a transducer, the LNA, and the TGC. The beamformer performs two important tasks: it focuses the receive beam, and it reduces the number of cables from the in-probe system to the external processing unit.

The process of beamforming is explained in Fig. 2. The reflected wavefront from a target is usually spherical or elliptical in shape, which means that it reaches the various transducers at different times. To focus the beam, all the received signals must be added in phase. To align the received signals, different delays must be added to the various channels. This process is called the delay-and-sum method.

Figure 2. Beamforming process using the delay-and-sum method. The wavefront from the target is received at different channels at different times. Delays are added to each channel such that the signals align after the delay and can be summed in phase.

The delay is typically implemented by sampling the signal and storing the value in an analog RAM (ARAM). The stored signals can be retrieved after the required delay time and sent directly to the summation circuitry. The delays are discrete and depend on the sampling frequency (fsample). The advantage of discrete-time delays is that they are independent of input amplitude and frequency. The maximum delay is (N-1)/fsample, where N is the number of memory cells in the ARAM. Multiple ARAM designs have been proposed in the literature [4]-[7], but they tend to consume a lot of power. This paper proposes a new ARAM architecture that is switched-current based.
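The delay-and-sum idea described above can be illustrated with a short numerical sketch. This is only a software model under stated assumptions: the function names `max_delay_s` and `delay_and_sum` are hypothetical, the samples are ideal floating-point values rather than stored analog currents, and the arrival times in the example are made up for illustration.

```python
def max_delay_s(n_cells, f_sample_hz):
    """Maximum discrete delay (N-1)/fsample for an ARAM with N memory cells."""
    return (n_cells - 1) / f_sample_hz


def delay_and_sum(channels, delays):
    """Delay each channel by a whole number of samples, then sum.

    channels: list of equal-length lists of samples (one list per channel)
    delays:   per-channel delay in sample periods (non-negative integers)
    """
    n = len(channels[0])
    out = [0.0] * n
    for samples, d in zip(channels, delays):
        # A sample written at index k - d is read back at index k.
        for k in range(d, n):
            out[k] += samples[k - d]
    return out


# Illustrative data: the same echo arrives at samples 3, 5 and 4 on three channels.
echo = [
    [0, 0, 0, 1.0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1.0, 0, 0],
    [0, 0, 0, 0, 1.0, 0, 0, 0],
]
# Delay the early channels so that every echo lines up with the latest arrival.
aligned = delay_and_sum(echo, delays=[2, 0, 1])
print(aligned)
print(max_delay_s(16, 25e6))
```

With these assumed arrival times the three echoes add in phase at a single sample. In the actual circuit the summation is performed on currents, so no arithmetic block is needed.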
External capacitors are used and the bias current is shared to reduce power consumption.

Analog RAM Design using Switched-Current Memory

Switched-current (SI) memories have been used in the past for medium-accuracy applications. SI memory cells have two major advantages. First, they use non-linear capacitors (typically MOS gate capacitors), which occupy a very small area compared to switched-capacitor circuits that require MIM capacitors. Second, since the ultimate aim in beamforming is summation of signals, the summation becomes very easy when the signal is a current. The problem with existing SI memory implementations is that they are all power hungry. There have been a few Class-AB implementations of the SI cell, but they tend to have higher distortion and are slow. The authors have proposed a new Class-A SI memory architecture for ARAM implementation which reduces the power consumption without the distortion of Class-AB cells [9].

First we describe a traditional SI memory cell, as shown in Fig. 3. To keep the explanation simple, the figure shows only the scheme of the cell and not the actual implementation. M3 is the memory transistor, and the gate capacitor of M3 acts as the storage capacitor. clk1 and clk2 are the write and read clocks, respectively, and I1 is the bias current. During clk1 (write mode) M3 is configured as a diode-connected transistor; I1+Iin flows through it and charges the gate of M3 to a certain Vgs. When clk2 turns on, the gate of M3 is still charged to the same Vgs, and assuming that Vds across M3 is the same in both cases, M3 will continue to sink I1+Iin. Hence an output current equal to the input current but with opposite phase is obtained.

Figure 3. Scheme of basic SI cell.
The problem with the basic cell is that the input current range depends directly on the bias current, so a large input range requires a large bias current. When such a cell is used in a memory array, many cells operate at the same time, causing a very large current consumption. The proposed architecture shares the bias current between N memory cells, where N is the size of an ARAM. This reduces the effective power consumption per memory cell by a factor of N. Since only one cell is read or written at any time, a single bias current suffices and the memory cells can be passive. Here external capacitors are used, and this capacitor bank stores the gate voltage of the memory transistor at different sampling instances. Any number of such capacitors can be connected without adding to the power consumption, as long as the bandwidth of the cells is high enough.

Figure 4. Scheme for proposed ARAM architecture with external capacitor bank.

Fig. 4 shows the scheme for the proposed architecture. The core is similar to the basic SI memory cell. The capacitor bank is connected at the gate of the memory transistor. The schematic implementation is shown in Fig. 5. Typically in SI memories M1 is used as the memory transistor [2] and the signal is stored on the gate of M1 (Cgs,M1), but in the proposed design additional capacitors (C1-C4) have been added. Each capacitor acts as memory storage and can be selected using the 'sel' signals. The memory size (depth) is determined by the number of such capacitors in parallel.

There are several advantages of sharing the bias current. First, the effective current consumption per memory cell becomes Ibias/N, where N is the number of memory cells, or depth of the memory. Second, a very high input current range and bandwidth can be maintained, which results in linearity comparable to Class-A circuits with power consumption close to that of Class-AB designs. Since the memory elements are passive components, the memory size can be increased without increasing the bias current.
This means that a large memory size can be obtained for a small current consumption.

The upper limit on the number of capacitors that can be connected to the core depends on the bandwidth of the ARAM. The unity-gain bandwidth is given by

$$ f_{ugb} = \frac{g_{m,M1}}{2\pi \left( C_{gs,M1} + N \cdot C_{select} + C_{MEM} \right)} \qquad (1) $$

where Cselect is the capacitance of the select switches and CMEM is the memory capacitor. As N increases, the bandwidth of the ARAM reduces. To ensure proper settling, the bandwidth must be 5-6 times the inverse of the write-cycle length (twrite = 1/(2·fsample)):

$$ f_{ugb} \geq 10 \cdot f_{sample} \qquad (2) $$

Because an external capacitor is used instead of the MOSFET gate capacitor alone, there is a small error due to charge sharing between the two capacitors:

$$ \varepsilon = V_{in} \left( \frac{C_{gs,M1}}{C_{gs,M1} + C_{MEM}} \right) \qquad (3) $$

As can be seen from (3), the error is input dependent, so it must be small to maintain linearity. There are two ways to reduce this error. One way is to use two different cores for read and write. The core used for read mode is reset during write, so Cgs1 adds a fixed, known amount of charge every time. The error in this case is not input dependent, and calibration can be used. The only drawback is that using two cores doubles the power consumption of the circuit. Another way to reduce the error is to make the memory capacitor much larger than Cgs1, which results in some area penalty.

Figure 5. Schematic implementation of the proposed architecture showing a memory depth of 4 cells. The actual implementation has 16 cells in each ARAM. Read and write clocks and select signals are generated on the chip using a controller. All the cascode biases are also generated on the chip inside a global bias generation block.

The advantage here is the ability to use the MOSFET oxide capacitance as the external capacitor, thus saving
area compared to a metal capacitor. If N bits of accuracy are required from the beamformer, the memory capacitor can be sized such that Cgs1/(Cgs1+CM) << 1/2^N. The choice between the two options depends on power and area constraints. For this work, a single-core implementation with larger capacitors was used: making CMEM very large compared to Cgs,M1 reduces the charge-sharing error. In the current design CMEM is close to 15 pF while Cgs,M1 is around 15 fF.

Current-mode fully differential designs may have a higher contribution from the second harmonic than their voltage-domain counterparts. This is due to the strong dependence of gain on gm in current-mode circuits compared to voltage-mode circuits. The input current in fully differential mode is different for the two halves, and so are the corresponding gm values. It is therefore imperative to use a high overdrive for the memory biasing transistor (M1 in this case). This reduces the change in gm with input current and hence suppresses the second harmonic.

Noise is another important factor that must be considered in this design. Current-based circuits tend to have a lower signal-to-noise ratio (SNR) owing to the small voltage swing. The cascode transistors do not add a significant amount of noise current to the output at the frequencies under consideration. The memory and bias transistors must be sized such that the output noise current is minimized, which can be done by keeping gm small for both transistors. However, reducing gm limits the signal current amplitude and hence the overall dynamic range. In this design the transistor sizes have therefore been optimized to maximize the dynamic range.

The sequence of signals is very important in this circuit. To avoid any charge injection due to the read/write switches, the 'select' signals are turned off first. Moreover, the switches for the 'select' signals have dummy transistors to avoid charge injection.
Similarly, ϕwrite_e turns off before ϕwrite and has a dummy transistor to avoid charge injection. A digital controller generates all the switching signals, including the clocks and select signals. There is a block to generate the bias voltages for M2-M4. This block can easily be shared between all the channels of the beamformer, so the current consumption and the noise contribution from this block are negligible.

The offset between the write and read clocks must be varied for beamforming. Four bits (Offset1-4) control the delay between read and write. Code '0000' means no delay, whereas '1111' indicates maximum delay. For ease of testing, these bits are currently provided externally. There is a voltage-to-current converter (V2I) block [3] before the ARAM so that a voltage input can be used during testing. In the current beamformer 16 channels are summed together, each implemented using one ARAM with a depth of 16. The sampling frequency is 25 MHz, providing a unit delay of 40 ns and a maximum delay of 600 ns. To perform summation, the output signals can simply be connected together.

Results

The circuit implementation and layout were done in a 180 nm single-poly, six-metal process. The area of one channel with 16 memory cells is 250 × 150 µm². There are 16 channels, whose output currents are summed and converted to voltage using a TIA. An external trans-impedance amplifier (TIA) has been implemented using an AD8139 with resistive feedback to obtain a voltage output. An AD8130 has been used for differential to single-ended conversion. For all the SNR and FFT measurements the input frequency is 10 MHz and the sampling frequency is 25 MHz.

Figure 6. FFT plot after summation of 16 channels when the input p-p signal is 1.2 V and the input frequency is 10 MHz. The sampling frequency is 25 MHz.
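The delay figures quoted for the offset control (40 ns unit delay, 600 ns maximum delay at 25 MHz sampling) can be cross-checked with a few lines of arithmetic. The sketch below assumes that the four-bit offset code maps linearly to a number of sample periods; the text only states the two end points ('0000' and '1111'), so the linear mapping and the name `delay_ns` are assumptions made for illustration.

```python
F_SAMPLE_HZ = 25e6   # sampling frequency stated in the text
OFFSET_BITS = 4      # Offset1-4


def delay_ns(code, f_sample_hz=F_SAMPLE_HZ):
    """Read/write delay in nanoseconds for an integer offset code.

    Assumes (hypothetically) that each code step adds one sample period.
    """
    if not 0 <= code < 2 ** OFFSET_BITS:
        raise ValueError("offset code must fit in 4 bits")
    return code * 1e9 / f_sample_hz


unit_delay = delay_ns(0b0001)   # one sample period
max_delay = delay_ns(0b1111)    # (N-1)/fsample with N = 16
print(unit_delay, max_delay)
```

Under this assumption the largest code reproduces the (N-1)/fsample limit for a memory depth of 16.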
The beamformer achieves 63 dB SNR for summation of 16 channels. An FFT plot after summation of 16 channels is shown in Fig. 6. The input p-p voltage for the FFT is 1.2 V. Because this is a fully differential architecture, HD2 is very small, but HD3 can be seen close to 5 MHz after being folded back into the Nyquist band (the third harmonic at 30 MHz aliases to 5 MHz at a 25 MHz sampling rate). Some spurs are visible in the spectrum besides the harmonics. These are intermodulation products due to coupling between the clock and the signal frequency. These spurs increase in magnitude as the number of channels increases, which means they act as coherent noise sources.

Table 1: Beamformer measured results and comparison with the state-of-the-art

| Parameter | This work | [10] | [11] | [12] |
|---|---|---|---|---|
| Technology | 0.18 µm CMOS | 0.13 µm CMOS | 0.35 µm CMOS | 0.8 µm CMOS |
| Supply | 1.8 V | 1.2 V | 3.3 V | 5 V |
| # of channels | 16 | 32 | 9 | 16 |
| # unit delay per channel* | 16 | 192 | 8 | 72 |
| Current/unit delay (µA) | 35.15 | 33** | 10 | 194 |
| Area/unit delay (µm²) | 4960 | 2673 | 3472 | 62500 |
| Input frequency (MHz) | 10 | 1.25 | 5 | 2 |
| Sampling frequency (MHz) | 25 | 5 | 25 | 64 |
| Unit delay (ns) | 40 | 4 | 40 | 15.62 |
| Max. delay | 600 ns | 748 ns | 280 ns | 1.09 µs |
| SNR (dB) | 63 | - | - | 60 |
| NCF (dB) | 13 | - | - | - |

To quantify the effect of these spurs on beamformer performance, the noise correlation factor (NCF) needs to be measured. It is defined as

$$ NCF = 10 \log_{10} \left( \frac{P_{noise,16channels}}{P_{noise,1channel}} \right) \qquad (4) $$

where Pnoise,16channels is the noise power for 16 channels and Pnoise,1channel is the noise power for a single channel. The NCF is measured by turning off the input signal source while the clock is kept on. The total noise power at the output is measured while increasing the number of channels. When only uncorrelated noise is present, the NCF is 12 dB (10 log 16). When correlated noise sources are present, the NCF increases. For the proposed design the NCF is measured to be 13 dB.
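As a rough numerical check of definition (4), the following sketch evaluates the NCF for two idealized cases. The function name `ncf_db` and the unit noise power are illustrative assumptions; the fully correlated case (identical noise on every channel, so amplitudes add) is included only as a reference bound and is not a figure reported in this work.

```python
import math


def ncf_db(p_noise_multi, p_noise_single):
    """Noise correlation factor of (4): ratio of summed to single-channel noise power, in dB."""
    return 10.0 * math.log10(p_noise_multi / p_noise_single)


n_channels = 16
p_single = 1.0  # arbitrary unit noise power (assumed)

# Uncorrelated noise: powers add, so the summed power is N times the single-channel power.
ncf_uncorrelated = ncf_db(n_channels * p_single, p_single)

# Fully correlated noise (assumed identical on all channels): amplitudes add, power scales as N^2.
ncf_correlated = ncf_db(n_channels ** 2 * p_single, p_single)

# Power ratio implied by the measured 13 dB.
measured_ratio = 10 ** (13.0 / 10.0)

print(round(ncf_uncorrelated, 2), round(ncf_correlated, 2), round(measured_ratio, 1))
```

The measured 13 dB sits about 1 dB above the uncorrelated value and well below the fully correlated bound of this idealized model.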
This small value of NCF implies that the summation is linear and that the impact of the spurs on summation is not significant.

Table 1 shows the measured results for the beamformer and compares them to state-of-the-art designs [10-12]. The performance of the proposed beamformer is comparable to existing work. It must be noted, though, that the current consumption in all the other works increases linearly with memory size, whereas in this work it does not. In fact, as long as the bandwidth requirement from (2) is met, the memory size can be increased without any increase in current, resulting in a smaller current per unit delay. This is the biggest advantage of the proposed architecture.

Acknowledgment

This work was supported by the Research Council of Norway through the project Microsystems for Medical Ultrasound Video Cameras (192456) and Maxim Integrated, USA.

Conclusions

A low-power and low-area SI-based beamformer is presented. Scalability is the most important aspect of the proposed architecture: the current consumption does not increase linearly with memory size. Non-linear MOS capacitors are used as passive memory elements. The current per unit delay is 35.15 μA, and it can be further reduced. The beamformer achieves 63 dB SNR after summation, and the summation is linear with respect to the number of channels and the input signal.

References

[1]. K. K. Shung, "Diagnostic Ultrasound: Past, Present, and Future," Journal of Medical and Biomedical Engineering, vol. 31, no. 6, pp. 371-371, 2011.
[2]. B. Goldberg and B. Kimmelman, "Medical diagnostic ultrasound: a retrospective on its 40th anniversary."
[3]. D. A. Christensen, Ultrasonic Bioinstrumentation. New York: John Wiley and Sons, Jan. 1988.
[4]. K. A. Nishimura and P. R. Gray, "A monolithic analog video comb filter in 1.2µm CMOS," IEEE J. Solid-State Circuits, vol. 28, pp. 1331-1339, Dec. 1993.
[5]. A. Gerosa, G. M. Cortelazzo, A. Baschirotto, and E. Malavasi, "2-D video SC FIR filters based on analog RAMs," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 7, pp. 1348-1360, Dec. 1999.
[6]. C. Ricardo, A. G. Vazquez, S. Espejo, and R. D. Castro, "A 0.5µm CMOS analog RAM chip for real-time video processing," Proceedings of the European Solid-State Circuits Conference, pp. 162-165, 1999.
[7]. J. B. Hughes and K. W. Moulding, "Switched-current signal processing for video frequencies and beyond," IEEE J. Solid-State Circuits, vol. 28, no. 3, pp. 314-322, March 1993.
[8]. S. J. Daubert, D. Vallancourt, and Y. P. Tsividis, "Current copier cells," Electronics Letters, vol. 24, no. 25, pp. 1560-1562, 1988.
[9]. S. Sharma and T. Ytterdal, "Low power front end electronics for in-probe beamforming in ultrasound imaging," Proceedings of the IEEE Norchip Conference, pp. 1-4, 2013.
[10]. J. Y. Um, Y. J. Kim, S. E. Cho, M. K. Chae, B. Kim, J. Y. Sim, and H. J. Park, "A single-chip 32-channel analog beamformer with 4ns delay resolution and 768-ns maximum delay range for ultrasound medical imaging with a linear array transducer," IEEE Trans. Biomedical Circuits and Systems, vol. 9, no. 1, pp. 138-151, 2015.
[11]. Z. Yu, M. A. P. Pertijs, and G. C. M. Meijer, "Ultrasound beamformer using pipeline operated S/H delay stages and charge mode summation," Electronics Letters, vol. 47, no. 18, pp. 1011-1012, September 2011.
[12]. B. Stefanelli, I. O'Connor, L. Quiquerez, A. Kaiser, and D. Billet, "An analog beam-forming circuit for ultrasound imaging using switched-current delay lines," IEEE J. Solid-State Circuits, vol. 35, no. 2, pp. 202-211, February 2000.
