Deep-Submicron CMOS Design Methodology for High-Performance Low-Power Analog-to-Digital Converters

Abstract

In this paper, we present a complete design methodology for high-performance, low-power Analog-to-Digital Converters (ADCs) in deep-submicron CMOS. The methodology is demonstrated on two ADC architectures, Flash and Folding and Interpolating (F&I). The designs were implemented in 0.18 µm CMOS technology and achieve conversion rates of 2.6 GSamples/s for the Flash converter and 1 GSample/s for the F&I converter, with power consumptions of 47 mW and 8 mW, respectively. Compared to previously published designs, this represents a 62.5% improvement in speed and an 86% drop in power consumption for the Flash design, and 3 times improved sensitivity (DNL) and a 3.2 times reduction in power for the F&I design.

1. Introduction

With the advancement of System-on-Chip (SOC) architectures, the demand for high-performance, low-power designs is ever increasing. To meet these demands, designers are turning to deep-submicron technologies to satisfy both the high-speed and the low-power requirements. For A/D converters in particular, SOC integration forces designers to cope with low-voltage and low-power specifications while the frequency requirements keep rising. Thus, the use of deep-submicron technology in high-speed, low-power data converters is increasing. The voltage limitations of these technologies, however, make it more difficult to produce accurate conversion devices because the operating range is reduced.

In this paper, we present two common techniques for realizing high-speed data converters in 0.18 µm CMOS technology. While these architectures are well known, the design methodologies for them in these technologies are not.

2. ADC Development Methodology

2.1 Component Design

2.1.1 Comparator

In developing a high-speed data converter, the main requirement is a well-designed comparator. The comparator is the core component of an A/D converter: it dictates the characteristics of most of the design, including the operating range, and it can be the frequency-limiting device in high-speed designs. It therefore needs as large an operating range as possible. Some designs, such as the Folding converter discussed here, ease some of these requirements. For the Flash converter, however, the comparator can be the bottleneck of the design for both speed and power consumption.

With a limited supply voltage, such as those used in deep-submicron technologies, the available voltage range has to be used as fully as possible. Threshold voltages of the transistors in these technologies tend to consume two-thirds of the operating range. To overcome this, we suggest the use of a full-swing differential gain stage for the comparator. While rail-to-rail operation is not realistic for any high-speed converter, a much larger operating range can be realized using this stage. The drawback of this technique is that the input capacitance of the circuit effectively doubles, which implies that a large buffer amplifier may be needed to drive the circuit in SOC applications.

The comparator consists of two differential pairs tied in parallel, a decision circuit, and a gain/buffer stage, and is depicted in Figure 1. In the first stage, an NFET differential pair is tied to a PFET differential pair and biased so that the stage has a larger operating range than either differential pair alone could produce (large-swing design). Following the differential gain stage is the decision circuit.
Figure 1 – Comparator Schematic

A decision circuit with positive feedback is relatively simple to design, and Baker et al. provide a standard technique for developing these circuits for low-resolution applications [7]. While the decision circuit can be very fast, we followed it with an additional gain stage and a buffer. The gain stage is based upon a self-biasing differential structure. The final buffer stage is implemented using a basic inverter. Sizing of the inverter depends upon the decoder design, and the methodology is typical of any standard digital circuit [7].

This comparator design is more complicated than typical designs, but it is responsible for the high accuracy and speed of the converter. The large operating range created by using both PFET and NFET differential pairs compensates for the low supply voltage that deep-submicron technologies require. While this is not necessarily the lowest-power solution, we found that it provided a high degree of performance.

2.1.2 Folding Amplifiers

The folding amplifiers are the main design consideration in this type of A/D converter. Increasing the folding factor reduces the number of comparators. Although this initially seems inherently desirable, it also implies that the folding amplifiers must have a bandwidth proportional to the input frequency multiplied by the folding factor.

In addition, using a 1.8 V supply limited our dynamic range. However, we were able to achieve a 1 V operating range (520 mV – 1.52 V). This is the common-mode range of our folding circuit, within which the folding amplifier behaves linearly. At the lower end, we were able to operate the folding circuit slightly above the threshold voltage of the current source transistors. At the higher end, we were limited by the voltage drop across the load resistors. This reduced the requirement on the comparator's operating range.
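The bandwidth penalty of folding can be made concrete with a short calculation. The sketch below is illustrative only: it assumes the proportionality is an equality (required bandwidth = input frequency × folding factor) and that the worst-case input is at the Nyquist frequency of the 1 GSample/s converter. The function name and the list of folding factors are hypothetical.

```python
def required_folder_bandwidth(f_in_hz, folding_factor):
    """Bandwidth each folding amplifier must handle: input frequency times folding factor."""
    return f_in_hz * folding_factor


F_SAMPLE_HZ = 1e9          # 1 GSample/s F&I converter (from the paper)
F_IN_HZ = F_SAMPLE_HZ / 2  # assumption: worst case is a Nyquist-rate input

for folding_factor in (2, 4, 8):
    bw = required_folder_bandwidth(F_IN_HZ, folding_factor)
    print(f"folding factor {folding_factor}: {bw / 1e9:.1f} GHz")
```

Under these assumptions, each doubling of the folding factor halves the number of fine comparators but doubles the bandwidth the folding amplifiers must deliver.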
Our folding amplifier design consisted of five identical differential pairs with the outputs cross-connected. In designing folding and interpolating A/D converters, the bias current is one of the most important issues to be dealt with. A high bias current is required in order to meet the specified bandwidth. The upper 3 dB frequency is given by

$$f_{OUT} = \frac{1}{2\pi\,\tau_{OUT}} \qquad (1)$$

where

$$\tau_{OUT} = R_L\,C_{OUT}. \qquad (2)$$

The total capacitance at the output of the folding circuit can be found by summing the load capacitance with the drain-bulk and gate-drain capacitances. The load resistance is given by

$$R_L = \frac{1}{\lambda\,I_{SS}} \qquad (3)$$

where $\lambda$ is the channel-length modulation parameter. From these three equations, a high bias current ($I_{SS}$) reduces the load resistance, which reduces the output time constant and therefore raises the 3 dB frequency.

Additionally, there must be an equal amount of bias current flowing in each differential pair [1]. Different bias currents result in different gains in the folding amplifiers. This distorts the shape of the folded signals and thus changes the positions of their zero-crossings. Therefore, as can be seen in Figure 2 (See Note 1), we used a regulated cascode current source to provide the same amount of bias current to each pair.

Figure 2 – Schematic of Folding Block

If there are mismatch errors in the resistor loads, the position errors of the folded signals increase further. This can be remedied by designing the differential pairs to have a small $V_{GS}$, close to the threshold voltage. When calculating this value, one first has to define the gate-source biasing voltage in terms of the excess gate-source voltage $\Delta V$ as

$$V_{GS} = \Delta V + V_{TH}. \qquad (4)$$

In a standard cascode current source, the voltage seen at the gate marked in Figure 2 (See Note 4) is

$$2(\Delta V + V_{TH}). \qquad (5)$$

If this voltage can be reduced to

$$2\Delta V + V_{TH} \qquad (6)$$

then, because the cascode device drops $V_{GS} = \Delta V + V_{TH}$ from its gate to its source, the voltage on the drain of the device marked in Figure 2 (See Note 5) becomes

$$V_{DS} = \Delta V \qquad (7)$$

which is the desired value.

We used large transistors in the differential pairs to avoid mismatch errors such as an offset. The value of the offset voltage is given by the following equation:

$$V_{OS} = \left(V_{TH1} + \sqrt{\frac{2\,I_{D1}\,L_1}{K'\,W_1}}\right) - \left(V_{TH2} + \sqrt{\frac{2\,I_{D2}\,L_2}{K'\,W_2}}\right)$$

where $K'$ is the process transconductance parameter. By increasing the size of the transistors (making them wider) in the differential pairs (See Note 2), we can reduce the offset voltage. Although there are five differential pairs, the fifth differential pair is used only to eliminate the DC component from the output (See Note 3).

2.1.3 Encoders

Fat Tree encoders have excellent performance compared with ROM encoding techniques in both speed and power [4]. Often this encoder can be faster than the comparator, and thus low-power techniques can be applied to the digital encoder logic to further reduce power consumption. A couple of techniques can be used to reduce the power of the encoder until its performance matches that of the comparator, such as reducing the Vdd voltage and gate resizing. However, these techniques are often unnecessary, as the encoder consumes only a small fraction of the power of the analog circuitry. In our Flash implementation, we were able to reduce the power consumed by 7% by reducing the operating voltage of the encoder to 1 V. At this voltage, the encoding circuit speed matched the comparator operating speed.

2.2 Architecture Design

2.2.1 Flash Converter Design Example

The Flash converter is composed of a series of comparators connected in parallel and biased to reference voltages that are created by a resistive ladder. A block diagram is depicted in Figure 3. On the left-hand side of the figure is the typical resistive ladder that is used to fix the reference voltage for each of the comparators. The analog input is tied to each comparator input and is compared to the fixed reference voltage appropriate for the level to be resolved by that comparator. The output of each comparator is then latched using D-latches on each clock cycle [5,6,10].

Figure 3 – Flash Converter Block Diagram

These latched outputs are passed into the Fat Tree encoding logic [4]. This logic takes the thermometer code from the comparators and converts it into a 6-bit binary representation of the input signal, using a NOR-NAND structure to generate the desired 6-bit output [4].

The initial converter design used a ROM logic encoder, which consisted of a thermometer decoder followed by an OR-gate tree structure that generated the desired 6-bit representation. However, due to the lack of symmetry of the logic, this technique proved to be very slow. The maximum speed at which the OR-gate logic allowed the converter to run was 1.67 GHz. In addition, the logic had asynchronous race conditions within its structure, which caused glitches on the output. To get around these glitches, we latched the output of the logic with a clock that was out of phase with the clock driving the D-latches.

We found that the Fat Tree encoder eliminated the asynchronous race conditions and improved the speed of the converter to 2.6 GHz. Further simulations indicated that the encoder logic would have allowed the converter to run at up to 5.5 GHz. Thus, it was determined that the comparator is the frequency-limiting component in this design.

2.2.2 Folding and Interpolating

It is well known that Flash A/D converters offer extremely fast conversion rates at the expense of power consumption and large chip area. These disadvantages arise from the fact that the complexity of the Flash converter grows exponentially with resolution, because the number of comparators is $2^n - 1$ (where $n$ is the number of resolved bits). Most of the comparators are saturated, while only the ones "near" a given input voltage level are required to resolve a small difference. Folding and interpolating ADCs decrease the number of comparators to about $2^{n/2} + 1$ [8].

Figure 4 presents the basic block diagram of a folding and interpolating ADC. In our design, the fine Flash ADC resolves 4 bits by generating a triangular waveform, while the coarse Flash ADC determines if the output is above or below the mid-scale [3]. In other words, the fine Flash ADC locates where in the fold the input voltage is, and the coarse Flash ADC determines which fold the input is in. The number of comparators is reduced by the degree of folding [2].

Figure 4 – Basic Block Diagram of a Folding and Interpolating ADC

For a 6-bit implementation, the folding factor needs to be a multiple of two. As the folding factor is increased, each folding block must have a bandwidth of $f_{input} \times$ folding factor. Therefore, increasing the folding factor limits the input signal bandwidth and deteriorates the linearity of the folded signals. For these reasons, we were limited to a folding factor of four using the 0.18 µm process.

Ideally, a saw-tooth waveform would be used (see Figure 5a); however, this tends to be difficult to implement. A triangular waveform is much more realistic to produce, and that is what we have utilized here with folding amplifiers (see Figure 5b) [3].

Figure 5a and 5b – The output characteristic of a Folding ADC using a saw-tooth (a) and triangular waveform (b), respectively.

There are 64 zero-crossings utilized in the folding/interpolating component (see Figure 6, and note the number of intersections along any horizontal line, particularly near 932 mV). This comes from 8 folding blocks × folding factor of 4 × interpolation factor of 2 = 64. For any input voltage within the specified range, only one differential pair is active, while all the others are saturated. The "rounding" of the folds is a result of the limited bandwidth of the folding amplifiers. Each folder output is shifted by approximately one LSB (17.12 mV).

Figure 6 – Offset Folded Signals

This allowed for a more robust design, because our comparator now only needs to detect the zero-crossings. Our final design included 19 comparators: 16 in the LSB encoder and 3 in the 2-bit coarse Flash A/D converter. This design enables us to have a large operating range due to the reduced requirements on the comparator. The Flash reference levels are not at the extremes of the device, and the folding circuit only uses the comparator as a zero-crossing detector. Thus, the required operating range of the comparator is less than that of the overall device.

In both the coarse and the fine Flash A/D converters, we used the Fat Tree implementation to encode the thermometer code [4]. However, the fine encoder produces cyclic thermometer code. In order to go from cyclic thermometer code to thermometer code, we used the LSB of the coarse Flash to toggle between the inverted and non-inverted comparator output values (see Figure 7).

Figure 7 – Cyclic Code to Thermometer Code Conversion

The final requirement in this design was to institute some delay in the coarse ADC, because these bits would be resolved much faster than the LSBs. Because the delay was so small, a simple implementation using a series of inverters was utilized.

3. Results

3.1 Flash ADC

The Flash converter that was designed operates at 2.6 GHz with the Fat Tree logic structure. Figure 8 depicts the output timing diagram of the converter with a ramp as the input signal. The clock was running at 2.6 GHz, and there are 20 samples per "step" of the output signal. As can be seen from inspection, the converter is fairly linear within its operating range. In fact, the INL for this converter was measured to be 0.1 LSB and the DNL was 0.2 LSB.

Figure 8 – Flash ADC Timing Diagram

The Flash converter gains its speed from utilizing a simple architecture; however, this simplicity comes at a price. The architecture is highly repetitive, and a large number of comparators and latches are used.
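To put the "large number of comparators" in concrete terms, the following behavioral sketch models the 6-bit Flash path as one comparator per ladder tap followed by a thermometer-to-binary step. It is an illustration under stated assumptions, not a circuit model: the ladder is assumed to be uniform over the 700-1100 mV operating range listed in Table III, the encoder is represented by simply counting ones rather than by the Fat Tree gate structure, and all function names are hypothetical.

```python
N_BITS = 6
V_MIN, V_MAX = 0.700, 1.100  # assumed ladder span, taken from the Flash operating range


def ladder_references(n_bits=N_BITS, v_min=V_MIN, v_max=V_MAX):
    """Return the 2**n_bits - 1 reference voltages of a uniform resistive ladder."""
    lsb = (v_max - v_min) / 2 ** n_bits
    return [v_min + k * lsb for k in range(1, 2 ** n_bits)]


def thermometer_code(v_in, refs):
    """One comparator per reference: 1 if the input is above it, else 0."""
    return [1 if v_in > ref else 0 for ref in refs]


def encode(thermo):
    """Thermometer-to-binary by counting ones (behavioral stand-in for the encoder)."""
    return sum(thermo)


refs = ladder_references()
print(len(refs))                              # 63 comparators for 6 bits
print(encode(thermometer_code(0.903, refs)))  # output code for a 903 mV input
```

Every one of the 63 comparators, and the latch behind it, is active on each clock cycle, which is where the repetition in the architecture turns into power.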
Individually, the comparators and latches do not consume a large amount of power. However, in the numbers needed to build a converter of reasonable resolution, their combined power consumption is significant. The converter consumed an average of 47 mW with a full-swing sine wave input at the Nyquist frequency of the converter. The Fat Tree encoder's power consumption is higher than that of the ROM encoding architecture, but its performance is far superior. When the A/D converter was run at the maximum speed of the ROM encoder, it consumed 30 mW versus 43 mW for the Fat Tree upgraded design (both were run at 1.67 GHz).

This converter offers both lower power and superior INL/DNL performance when compared to the work presented by Scholtens et al. [10]. While Scholtens et al. used extensive techniques to improve performance, we focused on an improved comparator design and utilized a more effective encoding circuit. This resulted in a simplified design that consumed less power and delivered higher overall performance. Table III compares the overall performance of the Flash converter with that of [10].

Table III – Performance Summary for Flash ADC

Metric             Previous [10]                   Current
Technology         0.18 µm CMOS                    0.18 µm CMOS
Max. Sampl. Rate   1.6 GSample/s                   2.6 GSample/s
Resolution         6 bits                          6 bits
Operating Range    N/A                             700-1100 mV
INL                0.4 LSB                         0.1 LSB
DNL                N/A                             0.2 LSB
Av. Power          328 mW                          47 mW
Power Supply       1.95 V analog, 2.35 V digital   1.8 V

3.2 Folding and Interpolating ADC

The Folding and Interpolating converter appeared to have a significant amount of non-linear behavior, which is due in part to the large operating range of this device. This non-linearity can be observed in Figure 6, as the zero-crossings do not all occur at the same voltage. However, this non-linearity was well within our 1 LSB constraint.
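For readers who want to relate shifted zero-crossings to the INL and DNL figures quoted in this section, the sketch below applies one common textbook definition: DNL is the deviation of each code width from one LSB, and INL is the running sum of the DNL. It is not necessarily the procedure used to obtain the numbers in this paper, and the transition levels in the example are hypothetical, expressed directly in units of LSB.

```python
def dnl_inl(transitions, lsb):
    """Return (dnl, inl) in LSB from a list of code-transition voltages.

    dnl[k] is the width of code k relative to one LSB, minus 1.
    inl[k] is the cumulative sum of dnl up to and including code k.
    """
    dnl = [(hi - lo) / lsb - 1.0 for lo, hi in zip(transitions, transitions[1:])]
    inl = []
    running = 0.0
    for d in dnl:
        running += d
        inl.append(running)
    return dnl, inl


# Hypothetical transition levels: the third transition sits 0.2 LSB too high.
transitions = [0.0, 1.0, 2.2, 3.0, 4.0]
dnl, inl = dnl_inl(transitions, lsb=1.0)
print([round(d, 3) for d in dnl])
print([round(i, 3) for i in inl])
```

In this example a single misplaced transition widens one code and narrows the next by the same amount, so the error shows up in the DNL of two adjacent codes but the INL returns to zero afterwards.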
Although this converter exhibited more non-linearity than its Flash counterpart, it performed far better than previous implementations, with a reduction of 0.5 LSB, as can be seen in Table IV below. The converter has a high degree of performance, with a conversion rate of over 1 GSample/s. Due to the intensely analog nature of this device, it was difficult to achieve the gain and bandwidth necessary to make a viable device in this process, and this is directly reflected by a scarcity of recent publications in this area. Table IV summarizes the performance of this device against the most recent comparable device [2] and shows that the new device delivers excellent performance, reflected in the low power consumption of 8 mW in conjunction with a high sampling rate.

Table IV – Performance Summary for F/I ADC

Performance Metric   Previous [2]    Achieved
Technology           0.5 µm BiCMOS   0.18 µm CMOS
Max. Sampl. Rate     400 MS/s        1 GS/s
Resolution           6 bits          6 bits
Operating Range      N/A             520-1520 mV
INL                  N/A             0.4 LSB
DNL                  0.9 LSB         +0.3/-0.2 LSB
Av. Power            200 mW          8 mW
Power Supply         3.2 V           1.8 V

4. Conclusion

A design methodology for high-performance, low-power A/D converters in deep-submicron technology has been proposed and demonstrated on two ADC architectures. The Flash converter is the faster of the two, running at 2.6 GSamples/s, but at the expense of 47 mW of power consumption. The folding converter, though less accurate (higher INL and DNL), provides a 1 GSample/s sampling rate at less than one-fifth of the power. The Folding device is therefore the more power-efficient design, while the Flash converter provides maximum performance. When both designs are compared to previously published works, we have achieved a significant improvement in speed, power, and sensitivity for both types of architecture.
We have achieved a 62.5% improvement in speed, an 86% drop in power consumption, and 4 times improved sensitivity (INL) for the Flash design, and 3 times improved sensitivity (DNL), a 3.2 times reduction in power consumption, and higher power efficiency for the F&I design.

5. References

[1] T. Kim, J. Sung, S. Kim, W. Joo, S. You, and S. Kim. "A 10-bit 40 MSamples/s Cascading Folding & Interpolating A/D Converter with Wide Range Error Correction." IEEE Circuits and Systems, 2001.
[2] Michael P. Flynn and Ben Sheahan. "A 400-MSample/s, 6b CMOS Folding and Interpolating ADC." IEEE Journal of Solid-State Circuits, Vol. 33, No. 12, pp. 1932-1938, 1998.
[3] Robert M. Senger, Paul M. Walsh, and Jerome Le Ny. "A 150 Msamples/s Folding and Current Mode Interpolating ADC in 0.35µm CMOS." EECS 598-02 Analog to Digital Integrated Circuits, 1-7, 2002.
[4] D. Lee, J. Yoo, K. Choi, and J. Ghaznavi. "Fat Tree Encoder Design For Ultra-High Speed Flash A/D Converters." IEEE Circuits and Systems, 2001.
[5] Paul G.A. Jespers. Integrated Converters: D to A and A to D Architectures, Analysis, and Simulation. Oxford: New York, 2001.
[6] Alfi Moscovici. High Speed A/D Converters: Understanding Data Converters Through Spice. Kluwer Academic Publishers: Massachusetts, 2001.
[7] R. Jacob Baker, Harry W. Li, and David E. Boyce. CMOS Circuit Design, Layout, and Simulation. IEEE Press: New York, 1998.
[8] T. Kim, J. Sung, and S. Kim. "A Low Power 10-bit 40Msamples/s CMOS Folding and Interpolating ADC with a Novel Architecture." Korean Conference on Semiconductors, Jan. 2000.
[9] C. Lin and B. Liu. "A New Successive Approximation Architecture for Low-Power Low-Cost CMOS A/D Converter." IEEE Journal of Solid-State Circuits, Vol. 38, No. 1, pp. 54-62, 2003.
[10] Peter C.S. Scholtens and Maarten Vertregt. "A 6-b 1.6 Gsample/s Flash ADC in 0.18µm CMOS Using Averaging Termination." IEEE Journal of Solid-State Circuits, Vol. 37, No. 12, pp. 1599-1609, December 2002.
