C5-2

A 10Gb/s 10mm On-Chip Serial Link in 65nm CMOS Featuring a Half-Rate Time-Based Decision Feedback Equalizer

Po-Wei Chiu, Somnath Kundu, Qianying Tang and Chris H. Kim
Dept. of ECE, University of Minnesota, 200 Union Street SE, Minneapolis, MN 55455, USA (Email: [email protected])

Abstract

An all-digital 2-tap half-rate time-based decision feedback equalizer (TB-DFE) was demonstrated on a 10mm on-chip serial link. Implemented in a 65nm GP technology, the transmitter and receiver achieve an energy-efficiency of 31.9 and 45.3 fJ/b/mm, respectively, at a data rate of 10Gb/s. A bit error rate (BER) below $10^{-12}$ was verified for an eye width of 0.43 unit interval (UI) using an in-situ BER monitor.

Introduction

On-chip serial links are attractive for high-speed point-to-point applications because they can achieve data rates of 10Gb/s or higher without power-hungry, floorplan-disrupting repeaters. Several circuit techniques have been proposed to improve the communication speed and energy-efficiency of on-chip serial links. Recent examples include charge-injection-based feed-forward equalization [1], capacitive de-emphasis [2], and a current-mode transceiver with an active inductor [3]. However, these analog-intensive approaches are susceptible to PVT variation and suffer from headroom issues at low operating voltages. Moreover, they do not take full advantage of technology scaling and require considerable design effort to port to a new technology.

Decision feedback equalization (DFE) has become indispensable for improving the performance of off-chip links. However, it has not been widely adopted in on-chip links because of the large power consumed by the current mode logic (CML) circuits required for high-speed operation. To make DFE more digital-friendly and amenable to technology scaling, we propose a time-based DFE technique in which the weighted-sum filter operation is performed entirely in the time domain.
Our digital-intensive approach uses inverters and digitally controlled delay elements that can be readily designed in advanced technologies. Another advantage of the proposed TB-DFE is that more taps can be incorporated simply by adding more delay stages.

Time-Based DFE

A comparison between the conventional DFE and the proposed TB-DFE is shown in Fig. 1. The output voltage of a conventional CML-based DFE can be expressed as

$$V_{DFE} = V_{RX}(t) + \sum_i x[n-i] \cdot w_i$$

where $x[n-i]$ is the preceding binary data and $w_i$ is the weight of each tap. This voltage is determined by the current through the multiple pull-up and pull-down paths in the CML circuit, and is converted to a binary value by a slicer. The proposed time-based scheme converts the voltage signal into a time delay, and the basic operation can be described as

$$T_{DFE} = T_{RX}(t) + \sum_i x[n-i] \cdot w_i.$$

The delay $T_{RX}(t)$ is a function of the incoming channel voltage $V_{RX}(t)$, and is equalized using the previous binary outputs $x[n-i]$ and weights $w_i$. The phase detector (PD) compares the phase difference between the DFE path and the reference path to generate a binary output. This is equivalent to the slicer operation in a conventional DFE, but in the time domain.

Fig. 2 shows an example of TB-DFE operation for the bit sequence "0100". Due to channel inter-symbol interference (ISI), the received voltage signal $V_{RX}(t)$ is distorted, which results in a smaller sensing margin between data "1" and data "0" in the time domain. The reduced phase difference between the DFE path signal $N_{DFE}$ and the reference path signal $N_{REF}$ may lead to an incorrect decision. The TB-DFE feedback loop enlarges the phase difference by using the previous decision data and the DFE weights $w_i$.

978-4-86348-606-5 ©2017 JSAP

65nm Transceiver Design

Fig. 3 shows the 10mm on-chip transceiver system we implemented in a 65nm GP test chip.
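As a rough software illustration of the time-domain decision rule from the Time-Based DFE section (a delay sum followed by a phase comparison against the reference path), the following Python sketch replays the "0100" example. It is not the authors' circuit: the delay values, tap weights, reference delay, the mapping of bits to ±1, and the PD polarity are all hypothetical choices made for the example, and the half-rate interleaving of the actual design is not modeled.

```python
from typing import List, Sequence


def tb_dfe_delay(t_rx: float, prev_bits: Sequence[int], weights: Sequence[float]) -> float:
    """Compute T_DFE = T_RX + sum_i x[n-i] * w_i.

    Assumption: previous bits are mapped to symbols of +1 (bit 1) and -1 (bit 0).
    """
    symbols = [1 if b else -1 for b in prev_bits]
    return t_rx + sum(x * w for x, w in zip(symbols, weights))


def phase_detect(t_dfe: float, t_ref: float) -> int:
    """Model the PD as an ideal comparator.

    Assumed polarity: a DFE-path edge arriving before the reference edge decodes as 1.
    """
    return 1 if t_dfe < t_ref else 0


def receive(t_rx_seq: Sequence[float], weights: Sequence[float],
            t_ref: float, use_dfe: bool = True) -> List[int]:
    """Decide each bit, feeding previous decisions back into the delay sum."""
    history = [0] * len(weights)  # history[0] is x[n-1]; assume the line idled at 0
    decisions = []
    for t_rx in t_rx_seq:
        t = tb_dfe_delay(t_rx, history, weights) if use_dfe else t_rx
        bit = phase_detect(t, t_ref)
        decisions.append(bit)
        history = [bit] + history[:-1]  # shift the newest decision in
    return decisions


# Hypothetical ISI-distorted path delays (ps) for the transmitted pattern 0, 1, 0, 0.
T_RX_PS = [60.0, 52.0, 49.0, 58.0]
WEIGHTS_PS = [3.0, 1.0]  # hypothetical 2-tap weights, expressed as delays in ps
T_REF_PS = 50.0          # hypothetical reference-path delay in ps

print(receive(T_RX_PS, WEIGHTS_PS, T_REF_PS, use_dfe=False))  # [0, 0, 1, 0]
print(receive(T_RX_PS, WEIGHTS_PS, T_REF_PS, use_dfe=True))   # [0, 1, 0, 0]
```

With these made-up numbers, the unequalized comparison misreads the second and third bits, while the feedback terms move each delay back to the correct side of the reference.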
The transceiver consists of a half-rate feed-forward equalizer (FFE) on the transmitter side, and an inverter-based transimpedance amplifier (TIA) followed by a TB-DFE on the receiver side. TX data is generated by a $2^{15}-1$ pseudo-random bit sequence (PRBS) generator. A high-swing voltage-mode output driver with a 3-tap half-rate FFE de-emphasizes the output signal and transmits data over the 10mm channel. The output driver is implemented using a bank of inverters, with resistors connected between the driving transistors and the channel for TX impedance matching. An inverter-based TIA containing a bank of programmable transmission gates enables RX impedance matching. An in-situ BER monitor was implemented for BER bathtub and eye diagram measurements.

Details of the 2-tap half-rate TB-DFE are shown in Fig. 4. Each delay stage is composed of three parallel tri-state inverters, an always-enabled inverter, and three MOS capacitors to support high-speed operation and fine-grain programmable delays. The first delay stage is controlled directly by the analog voltage $V_{RX}(t)$, while the second delay stage is controlled by the binary feedback data. The third stage provides the offset delay $\Delta T$ for eye diagram measurements, which is equivalent to the offset voltage in conventional DFEs. The inverter-based implementation significantly reduces circuit complexity and makes it easier to redesign the DFE circuit in a new technology. To further optimize the circuit, one of the two taps was folded into the reference path with a complementary weight configuration. With this modification, the number of inverters and capacitors is halved, which also reduces power consumption and chip area. A zero-offset aperture PD was adopted in our design [4].

In-situ BER Monitor and Measurement Results

Fig. 5 shows the proposed circuit for in-situ BER bathtub and eye diagram measurements. The programmable phase delay allows the clock to sample data over a 2 UI range.
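The $2^{15}-1$ PRBS source and a bit-by-bit error tally can be modeled in a few lines of Python, which may help when reasoning about BER measurements of this kind. This is a sketch under stated assumptions: the paper does not name the generator polynomial, so the commonly used PRBS15 polynomial $x^{15} + x^{14} + 1$ is assumed here, and `prbs15` and `count_errors` are hypothetical helper names rather than blocks of the test chip.

```python
from itertools import islice
from typing import Iterable, Iterator


def prbs15(seed: int = 0x7FFF) -> Iterator[int]:
    """Fibonacci LFSR for a length 2^15 - 1 sequence.

    Assumption: polynomial x^15 + x^14 + 1 (the paper does not specify one).
    """
    state = seed & 0x7FFF
    if state == 0:
        raise ValueError("seed must be non-zero, otherwise the LFSR locks up")
    while True:
        new_bit = ((state >> 14) ^ (state >> 13)) & 1  # XOR of the two tap bits
        state = ((state << 1) | new_bit) & 0x7FFF      # shift the new bit in
        yield new_bit


def count_errors(tx_bits: Iterable[int], rx_bits: Iterable[int]) -> int:
    """Tally positions where the received bit differs from the transmitted bit."""
    return sum(t != r for t, r in zip(tx_bits, rx_bits))


PERIOD = 2**15 - 1
tx = list(islice(prbs15(), 2 * PERIOD))

# Emulate a receiver output with two injected bit errors (hypothetical positions).
rx = list(tx)
rx[100] ^= 1
rx[2000] ^= 1

print(tx[:PERIOD] == tx[PERIOD:])  # the sequence repeats after 2^15 - 1 bits
print(count_errors(tx, rx))        # number of injected errors
```

A hardware monitor would regenerate the reference pattern locally and align it to the received stream; the sketch skips alignment by comparing two lists that are already in step.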
Since the proposed TB-DFE converts voltage into time, we sweep the time offset $\Delta T$, rather than a voltage offset, to create the BER eye diagram. This was implemented by adding a third delay stage in the TB-DFE delay lines, which allows the delay offset to be swept up to one tap delay. The BER monitor compares the TX data from the $2^{15}-1$ PRBS with the DFE output. Finally, the error count is tallied and serially read out.

Fig. 6 shows the bathtub curves with and without TB-DFE for two consecutive bits. The TB-DFE lowers the BER floor from $10^{-10}$ to $10^{-12}$ while ensuring an eye width of 0.43 UI. Fig. 7 shows the BER eye diagram. The delay offset was swept from 0ps to 18ps using a 6-bit control code. Fig. 8 shows the energy-efficiency and data rate measured at different supply voltages. To save test time, only the data point at 1.2V is based on a BER criterion of $10^{-12}$, while the other data points are for a BER of $10^{-9}$. The die photo and summary table are shown in Figs. 9 and 10, respectively. The transmitter and receiver blocks (not including the BER monitor) achieve an energy-efficiency of 31.9 and 45.3 fJ/b/mm, respectively, at a data rate of 10Gb/s.

References

[1] B. Kim et al., ISSCC, 2009.
[2] D. Walter et al., ISSCC, 2012.
[3] S. Lee et al., ISSCC, 2013.
[4] S. Kundu et al., ISSCC, 2016.
[5] M. Chen et al., VLSI, 2015.

2017 Symposium on VLSI Circuits Digest of Technical Papers

Fig. 1. Conventional DFE and time-based DFE. (Panels: conventional CML-based DFE; proposed time-based DFE.)
Fig. 2. Illustration of time-based DFE compensation for data stream "0100". The time difference window expands with DFE.
Fig. 3. Block diagram of 10mm on-chip serial link test chip. (Channel annotation: W=2μm, L=10mm, M9.)
Fig. 4. Half-rate 2-tap time-based DFE circuit implementation. (W1, W2: 6-bit DFE weights.)
Fig. 5. Circuit blocks used for bathtub and BER eye diagram measurement.
Fig. 6. Measured BER bathtub curves ($<10^{-12}$) for two consecutive bits. (65nm GP, 1.2V, 25°C; axes: phase in UI versus bit error rate.)
Fig. 7. Measured BER eye diagram for two consecutive bits. (65nm GP, 1.2V, 25°C; axes: phase in UI versus delay offset code.)
Fig. 8. Supply voltage versus data rate and energy-efficiency. (65nm GP, 25°C.)
Fig. 9. 65nm chip micrograph.
Fig. 10. Comparison with state-of-the-art on-chip serial links.

| | ISSCC'09 [1] | ISSCC'12 [2] | ISSCC'13 [3] | VLSI'15 [5] | This work |
|---|---|---|---|---|---|
| Technology | 90nm | 65nm | 65nm | 65nm | 65nm |
| Circuit | Current mode driver + TIA | Voltage mode driver + sense amp. | Current mode driver + sense amp. | CTLE-based repeater | Voltage mode driver + TIA |
| Features | No DFE | No DFE | No DFE | No DFE | 2-tap TB-DFE |
| Data Rate | 4Gb/s | 10Gb/s | 3Gb/s | 4Gb/s | 10Gb/s |
| Throughput (Gb/s/μm) | 2 | 2.56 | 0.75 | 4 | 2 |
| Link Length | 10mm | 6mm | 10mm | 2.5mm+2.5mm | 10mm |
| BER Bathtub | $<10^{-6}$ | $<10^{-12}$ | $<10^{-12}$ | $<10^{-12}$ | $<10^{-12}$ |
| BER Eye | Yes ($<10^{-6}$) | No | Yes ($<10^{-12}$) | No | Yes ($<10^{-11}$) |
| Energy Efficiency (fJ/b/mm) | 35.6 | 174 | 9.5 | 48.4 | TIA 14.4, DFE 30.9, FFE 31.9 |
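The per-block figures in the last row of Fig. 10 can be cross-checked against the headline numbers with simple arithmetic. The Python sketch below assumes that the receiver figure is the sum of the TIA and DFE entries and that total power equals energy-efficiency times link length times data rate; the resulting power values are implied by that normalization rather than quoted from the paper, and `link_power_mw` is a hypothetical helper name.

```python
import math

# Values taken from Fig. 10 (fJ/b/mm) for this work.
FFE_FJ_B_MM = 31.9  # transmitter (3-tap FFE driver)
TIA_FJ_B_MM = 14.4  # receiver front end
DFE_FJ_B_MM = 30.9  # receiver TB-DFE

LINK_LENGTH_MM = 10.0
DATA_RATE_GBPS = 10.0


def link_power_mw(efficiency_fj_per_bit_mm: float, length_mm: float, data_rate_gbps: float) -> float:
    """Convert a length-normalized energy per bit into average power in mW."""
    energy_per_bit_j = efficiency_fj_per_bit_mm * 1e-15 * length_mm
    return energy_per_bit_j * data_rate_gbps * 1e9 * 1e3


rx_fj_b_mm = TIA_FJ_B_MM + DFE_FJ_B_MM  # assumed to be how the 45.3 fJ/b/mm figure is formed
tx_mw = link_power_mw(FFE_FJ_B_MM, LINK_LENGTH_MM, DATA_RATE_GBPS)
rx_mw = link_power_mw(rx_fj_b_mm, LINK_LENGTH_MM, DATA_RATE_GBPS)

print(f"RX efficiency: {rx_fj_b_mm:.1f} fJ/b/mm")
print(f"Implied TX power: {tx_mw:.2f} mW, implied RX power: {rx_mw:.2f} mW")
```

Under these assumptions the transmitter and receiver would draw roughly 3.19 mW and 4.53 mW, respectively, at 10Gb/s over the 10mm link.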