
314 JOURNAL OF ELECTRONIC SCIENCE AND TECHNOLOGY, VOL. 10, NO. 4, DECEMBER 2012

A Phase Interpolator CDR with Low-Voltage CML Circuits

Li-Nan Li and Wei-Peng Cai

Abstract⎯In this paper, a phase interpolator clock and data recovery (CDR) circuit with low-voltage current mode logic (CML) latches, buffers, and muxes is presented. Because it uses CML circuits, the CDR can operate at a low supply voltage, and the swing of the differential inputs and outputs is less than that of CMOS logic. The power supply voltage is 1.2 V, and the static current consumption is about 20 mA. In this phase interpolator CDR, the charge pump and loop filter are replaced by a digital filter. This structure offers the benefits of increased system stability and faster acquisition.

Index Terms⎯Clock and data recovery, current mode logic, low voltage, phase interpolator.

1. Introduction

Recently, with the fast-increasing requirements on information capacity and data rate, high-speed transmission has become a popular issue. In receiver designs[1], clock and data recovery (CDR), which is used to retime and sample the bits, is very important. To achieve a better bit error rate (BER), the CDR helps the receiver sample the bits in the middle of the data eye.

This paper presents a phase interpolator CDR (PI CDR) with low-voltage current mode logic (CML) circuits in 0.13 μm CMOS. The CDR in this design has the following functions: 1:10 demultiplexing, clock and data recovery, and 1 GHz data and clock drivers. The design considerations for a PI CDR are the PI resolution, the PI phase shift linearity, and the loop latency. In this design, the PI takes 64 steps to travel from phase 0 to 2π, and the digital filter uses 8 UI (unit intervals) to process the up/down signal from the phase detector (PD).

The paper is organized as follows. Section 2 illustrates the CDR architecture. Section 3 discusses the CML circuits that make up the CDR circuit. The simulation results are presented in Section 4.
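The PI resolution and filter latency quoted in the Introduction can be turned into concrete numbers with a few lines of arithmetic. The sketch below is only an illustration: the 64 steps, the 1 GHz clock, and the 8 UI come from the paper, while the constant names and the choice of 1 ns per UI (taken from the 1 Gb/s data rate) are assumptions made here.

```python
import math

# Values stated in the paper.
PI_STEPS = 64          # PI steps from phase 0 to 2*pi
CLOCK_HZ = 1e9         # 1 GHz recovered clock
FILTER_LATENCY_UI = 8  # UI used by the digital filter

# Assumption for illustration: 1 Gb/s data, so one unit interval is 1 ns.
UI_NS = 1.0

step_rad = 2 * math.pi / PI_STEPS        # phase resolution in radians
step_deg = 360.0 / PI_STEPS              # phase resolution in degrees
step_ps = (1e12 / CLOCK_HZ) / PI_STEPS   # time resolution in picoseconds
latency_ns = FILTER_LATENCY_UI * UI_NS   # filter latency in nanoseconds

print(f"PI step: {step_deg} deg = {step_rad:.4f} rad = {step_ps} ps")
print(f"Digital filter latency: {latency_ns} ns")
```

Under these assumptions one PI step is 5.625° of the 1 GHz clock, or 15.625 ps, and the filter latency is 8 ns.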
Manuscript received August 14, 2012; revised November 2, 2012. This work was supported by the Fundamental Research Funds for the Central Universities under Grant No. 2009JBM001. L.-N. Li and W.-P. Cai are with the School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China (e-mail: [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://www.intl-jest.com. Digital Object Identifier: 10.3969/j.issn.1674-862X.2012.04.006

2. PI CDR Design

2.1 Architecture of PI CDR

Fig. 1 shows the architecture of the PI CDR. References [2]–[4] reported some PI CDR designs, and the design methods of those works are used in our design. Firstly, the SLICER samples the incoming DATA into the data and the phase. Secondly, the data and phase are demuxed, and the differential signal is converted into a single-ended signal. The data and phase are then used to determine whether the phase of the recovered clock is ahead: the PD generates an up/down signal that asks the DIGITAL FILTER to change the 64 control bits of the recovered clock phase. Finally, the PI changes the clock phase according to the 64 control bits from the digital filter.

Fig. 1. Architecture of PI CDR. (Blocks: SLICER, DEMUX, PD, DIGITAL FILTER, PI. Signals: DATA+, DATA−, D<4:1>, P<4:1>, Filter<0:63>, CLK+, and the CLK from PLL at 0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°.)

2.2 SLICER

Fig. 2 shows the architecture of the SLICER. The recovered clock samples DATA into the data on the rising edge and into the phase on the falling edge. In 8B/10B encoding, the maximum run length of consecutive 1’s or 0’s is 5. To obtain a higher confidence of correct detection, we use a demultiplexer to deserialize the data bits and the phase bits. The data and phase are output on the rising edge and are deserialized to data<1:4> and phase<1:4>, which are used by the PD to determine whether the phase of the clock is ahead.

Fig. 2. SLICER. (Latches clocked by CLK+ and CLK−; inputs Data+ and Data−; outputs Data+, Data−, Phase+, Phase−.)
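The sampling scheme of the SLICER can be illustrated with a small behavioral model. This is a sketch rather than the authors' circuit: it assumes an ideal NRZ waveform without jitter, measures time in UI, and uses the hypothetical names `nrz_level` and `slicer_samples`. The rising edge takes the data sample and the falling edge, half a UI later, takes the phase sample.

```python
def nrz_level(bits, t_ui):
    """Ideal NRZ waveform: bit k is held for k <= t < k + 1 (time in UI)."""
    return bits[int(t_ui)]


def slicer_samples(bits, clk_offset_ui):
    """Return (data, phase) sample lists for a clock offset inside the UI.

    The rising edge at n + clk_offset_ui samples the data; the falling
    edge half a UI later samples the phase. An offset of 0.5 would put
    the rising edge at the centre of the eye.
    """
    data, phase = [], []
    for n in range(len(bits) - 1):
        data.append(nrz_level(bits, n + clk_offset_ui))
        phase.append(nrz_level(bits, n + clk_offset_ui + 0.5))
    return data, phase


bits = [0, 1, 1, 0, 1, 0, 0, 1]
early = slicer_samples(bits, 0.3)  # rising edge before the eye centre
late = slicer_samples(bits, 0.7)   # rising edge after the eye centre
```

In this model an early clock makes every phase sample equal to the data sample taken just before it, while a late clock makes it equal to the next data bit. That agreement is the information the PD extracts from data<1:4> and phase<1:4>.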
2.3 Demux

The demux takes the recovered data and clock from the CDR, deserializes the 1 Gb/s data down to a 10-bit-wide parallel data stream, and deserializes the data and phase to 5-bit-wide parallel data streams. A 1:2 CML demux follows the SLICER to deserialize the data and phase streams. Since both data and clock propagate downstream in the demux, and the data and clock have already been reshaped in the SLICER and the 1:2 CML demux, the timing constraints on the following stages are less demanding than those on the SLICER and the 1:2 CML demux. In order to obtain a smaller layout area, the CML demux is used to deserialize the 2-bit-wide data stream into a 10-bit-wide parallel data stream and to convert the data and phase into 5-bit-wide parallel data streams, while the differential signal is converted into a single-ended signal. Then the CMOS circuit is used to deserialize the 1 Gb/s data down to a 10-bit-wide parallel data stream, and to deserialize the data and phase into 5-bit-wide parallel data streams.

2.4 PD

The PD[5] is made up of eight XNOR gates:

UP<n> = data<n> XNOR phase<n>
DN<n> = phase<n> XNOR data<n+1>

The digital logic of the PD operates as follows: for any data transition, if the phase bit agrees with the previous data bit, the phase sample is early; if the phase bit agrees with the next data bit, the phase sample is late.

2.5 Digital Filter

The digital filter is composed of an input decoder, a digital filter core, and an output decoder. Table 1 shows how the input decoder processes UP<1:4> and DN<1:4>.

Fig. 3. Structure of digital filter. (UP<1:4> and DN<1:4> enter the input decoder, which drives the digital filter core with signed<3:0>; the core output df<5:0> drives the output decoder, which produces filter<63:0>.)

Table 1: Input decoder process

(number of ‘1’ in UP<1:4>) - (number of ‘1’ in DN<1:4>)    signed[3:0]
-4                                                         1100
-3                                                         1011
-2                                                         1010
-1                                                         1001
 0                                                         0000
 1                                                         0001
 2                                                         0010
 3                                                         0011
 4                                                         0100

The digital filter core is a 2nd order digital filter with 2 poles and 1 zero. Its transfer function is

$$ H(z) = K \cdot \frac{1 - (1 - 1/M)\, z^{-1}}{1 - 2 z^{-1} + z^{-2}} = K \cdot \left( \frac{1 - 1/M}{1 - z^{-1}} + \frac{1}{M} \cdot \frac{1}{\left(1 - z^{-1}\right)^{2}} \right) \quad (1) $$

where K = 1/2, M = 32, and T = 4 ns.

Table 2 shows how the output decoder generates the control bits of the phase interpolator.

Table 2: Output decoder process

df<5:0>    filter<63:0>
0          000000…001111111111111111
1          000000…011111111111111110
…          …
62         110000…000011111111111111
63         100000…000111111111111111

2.6 PI

The conventional PI[6] generates the recovered clock from two pairs of quadrature clock signals: 0°, 90°, 180°, and 270°. In this paper, we use two PIs to generate two clock signals that have the same phase and then use the SUM blocks to add them, as shown in Fig. 7. This is the way to shape the waveform of the clock.

Filter<63:0> is the output of the output decoder. Filter_1<63:0> is equal to filter<63:0>, and filter<63:0> is moved right by 16 bits to make up filter_2<63:0>. Filter_2<63:0> is therefore different from filter_1<63:0>, but it also has sixteen 1’s.

Fig. 7. PI. (Labels: filter<63:0>, filter_1<63:0>, filter_2<63:0>; two PI blocks; clock phases 0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°; SUM blocks; outputs CLK+ and CLK-.)

Fig. 8. CML buffer.

3. CML Circuits Design

CML circuits are widely used in high-speed applications. They can operate with a low signal voltage at a low supply voltage, and at a higher operating frequency than CMOS circuits. CML circuits use differential signals; the output of a differential stage is more linear than that of a single-ended stage because the even harmonics are absent from its input-output characteristic.
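The absence of even harmonics can be seen from a short derivation. As an assumption made only for this illustration, let both halves of the differential stage follow the same memoryless nonlinearity y with coefficients a_0, a_1, a_2, a_3 (symbols that do not appear in the paper), driven by +x and −x, and let y_d denote the differential output:

```latex
\begin{aligned}
y(x)   &= a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots \\
y_d(x) &= y(x) - y(-x) = 2 a_1 x + 2 a_3 x^3 + \cdots \\
x = A\cos\omega t \;&\Rightarrow\; a_2 x^2 = \tfrac{1}{2} a_2 A^2 \left(1 + \cos 2\omega t\right)
\end{aligned}
```

The a_2 term, which would produce the second harmonic in a single-ended output, cancels in y_d together with every other even-order term, so only odd harmonics remain. In a real circuit mismatch between the two halves leaves a small residual, which this idealized sketch ignores.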
Reference [7] introduced a way to reduce the power consumption of CML circuits by making the continuous bias current smaller. In this paper, the CML circuits operate with a 400 mV signal voltage at a 1.2 V supply voltage. The static power consumption is Pstatic = Vdd I, where Vdd is the supply voltage and I is the continuous bias current.

3.1 CML Buffer

Fig. 8 shows the CML buffer structure[8]. It can be seen that Vout_p and Vout_n vary from Vdd − IR to Vdd. The maximum output swing of a CML buffer is therefore less than that of a CMOS inverter, which makes the CML buffer a good choice for low-voltage integrated circuit design. The range of the input common-mode level Vcm,in is given as

$$ V_{gs2,3} + V_{gs1} + V_{thn} \le V_{cm,in} \le \min\left[ V_{dd} - \frac{I}{2} R + V_{thn},\; V_{dd} \right] \quad (5) $$

where Vgs2,3 is the common-mode overdrive voltage of transistors M2 and M3, and Vgs1 is the overdrive voltage of transistor M1.

To meet the requirement of the next buffer, we need IR ≥ ΔVin_next,max. If the next stage is the same CML buffer, the formula becomes

$$ IR \ge \sqrt{\frac{2I}{\mu_n C_{ox} (W/L)}} \quad (6) $$

where the right-hand side is the differential input voltage that fully switches a square-law differential pair with tail current I.

The static power consumption is Pstatic = Vdd I, so I should be small to save static power. The load resistor R should be small to reduce the RC delay and increase the bandwidth. The values of I and R must therefore be chosen carefully. To reach a high operating frequency, the NMOS transistors of the differential pair must operate in saturation, Vgs2 − Vthn ≤ Vds2, which gives

$$ V_{in,max} - V_{thn} \le V_{out} \le V_{dd} \quad (7) $$

Fig. 9. CML latch.

3.2 CML Latch

The latch has two levels: Vin_n and Vin_p at the first level, and CLK+ and CLK- at the second level, as shown in Fig. 9. The track and latch modes are selected by CLK+ and CLK-, which drive a second differential pair, M2 and M3. When CLK+ is high and CLK- is low, Ibias flows through M4 and M5. If Vin_n and Vin_p change, Vout_n and Vout_p change too; this is the track mode.
In the latch mode, CLK+ is low and CLK- is high, and Ibias flows through M6 and M7. If Vout_n is low and Vout_p is high at this point, Vout_n stays low and Vout_p stays high in the next stage. In the CML latch design, the tail current Ibias must be large to achieve a wide range of linearity and a large transconductance. However, high-frequency operation does not by itself need a large tail current, and a larger tail current causes a larger power consumption.

3.3 CML MUX

Fig. 10 shows the CML MUX. When Sel_p is high and Sel_n is low, Ibias flows through M4 and M5, and Vout_n and Vout_p track Vin1_n and Vin1_p. When Sel_p is low and Sel_n is high, Ibias flows through M6 and M7, and Vout_p and Vout_n track Vin2_p and Vin2_n.

Fig. 10. CML MUX.

4. Simulation Result

Simulations were performed in a SMIC 0.13 μm technology. Fig. 11 compares the CML buffer with a buffer of the same structure in conventional CMOS. Both work with the same incoming data and the same power supply voltage. Fig. 11 shows that both buffers work well with 1 Gb/s incoming data, but with 10 Gb/s incoming data the swing of the CMOS buffer is obviously reduced. The use of PMOS transistors degrades the maximum operating frequency of the circuit.

Fig. 11. CML and CMOS structure of buffer: (a) 1 Gb/s incoming data and (b) 10 Gb/s incoming data. (Axes: Y0 in mV versus time in ns; traces: CMOS, CML, IN.)

Fig. 12 shows the clock and data eye traces. The clock rising edge samples the data at the centre of the data eye, which ensures that the recovered data is correct.

Fig. 12. 1 GHz clock and 1 GHz data eye traces with random data input.

Verilog-A is used to build the data generator model, jitter model, cable model, and pad model. ISI is the inter-symbol interference, which is caused by the channel loss, dispersion, and reflections. Sinusoidal jitter (SJ) and random jitter (RJ) are unbounded and modeled with Gaussian distribution. Sign P/N is the jitter of the data from the data generator. Cableout p/n is the jitter of the data from the cable. Vckipi/ni is the recovered clock jitter. Vrxi p/n is the jitter of the input data for the CDR. Table 3 shows all the jitter results.

Table 3: Jitter result (columns: ISI, SJ, RJ)

Sign P/N: 77 ps, 137 ps, 140 ps, 277 ps
Cableout p/n: 354 ps
Vrxi p/n: 380 ps
Vffeo p/n: 380 ps
Vckipi/ni (recovered clock): 149 ps

The power supply voltage is 1.2 V and the static current consumption is about 20 mA. The input clock frequency of the CDR is 1 GHz and the input clock duty cycle is 50%. The CDR bandwidth is 2 MHz. The common-mode level is 800 mV and the swing is 400 mV. The eye opening is 200 mV, RJ (peak to peak) is 140 ps, SJ (peak to peak) is 137 ps, and the phase centre delay is about 50 ps.

5. Conclusion

This paper discusses a PI CDR circuit which uses low-voltage CML latches, buffers, and MUXs. CML circuits are well suited to high-speed applications. The maximum output swing of a CML circuit is less than that of a CMOS circuit, which makes the CML circuit a good choice for low-voltage circuit design. The power supply voltage is 1.2 V and the static current consumption is about 20 mA.

References

[1] L. Henrickson, D. Shen, U. Nellore, et al., “Low-power fully integrated 10-Gb/s SONET/SDH transceiver in 0.13-μm CMOS,” IEEE Journal of Solid-State Circuits, vol. 38, no. 10, pp. 1595–1601, 2003.
[2] B. Abiri, R. Shivnaraine, A. Sheikholeslami, et al., “A 1-to-6Gb/s phase-interpolator-based burst-mode CDR in 65nm CMOS,” in Proc. 2011 IEEE Int. Solid-State Circuits Conf., San Francisco, 2011, pp. 154–156.
[3] J. Park, J.-F. Liu, L. R. Carley, and C.-P. Yue, “A 1-V, 1.4-2.5 GHz charge-pump-less PLL for a phase interpolator based CDR,” in Proc. of IEEE Custom Integrated Circuits Conf., San Jose, 2007, pp. 281–284.
[4] S.-W. Lee, C.-K. Seong, W.-Y. Choi, and B.-C. Lee, “Clock and data recovery circuit using digital phase aligner and phase interpolator,” in Proc. of the 49th IEEE Midwest Symposium on Circuits and Systems, Puerto Rico, 2006, pp. 690–693.
[5] M.-T. Hsieh and G. E. Sobelman, “Clock and data recovery with adaptive loop gain for spread spectrum SerDes applications,” in Proc. of 2005 IEEE Int. Symposium on Circuits and Systems, Kobe, 2005, pp. 4883–4886.
[6] Y. Jiang and A. Piovaccari, “A compact phase interpolator for 3.125G serdes application,” in Proc. of Southwest Symposium on Mixed-Signal Design, Las Vegas, 2003, pp. 249–252.
[7] H.-J. Hsu, C.-T. Chiu, and Y. Hsu, “Design of ultra low power CML MUXs and latches with forward body bias,” in Proc. of IEEE Int. SOC Conf., Tampere, 2007, pp. 141–144.
[8] P. Heydari and R. Mohanavelu, “Design of ultra high-speed CMOS CML buffers and latches,” in Proc. of IEEE Int. Symposium on Circuits and Systems, Bangkok, 2003, pp. 208–211.

Li-Nan Li was born in Shanxi Province, China in 1969. He received the B.S. degree from Harbin Institute of Technology, Harbin in 1991, the M.S. degree from Shanxi Institute of Microelectronics, Xi’an in 1994, and the Ph.D. degree from Institute of Microelectronics of Chinese Academy of Sciences, Beijing in 2001. Now he is an associate professor with Beijing Jiaotong University. His research interests are focused on RF and analog circuit design, submicron CMOS process, and VLSI IC design.

Wei-Peng Cai was born in Fujian Province, China in 1989. He received the B.S. degree from Beijing Jiaotong University, Beijing in 2011. He is currently pursuing the M.S. degree with Beijing Jiaotong University. His research interest is focused on analog circuit design.
