C5-2
A 10Gb/s 10mm On-Chip Serial Link in 65nm CMOS Featuring a Half-Rate Time-Based Decision Feedback Equalizer
Po-Wei Chiu, Somnath Kundu, Qianying Tang and Chris H. Kim
Dept. of ECE, University of Minnesota, 200 Union Street SE, Minneapolis, MN 55455, USA (Email:[email protected])
Abstract
An all-digital 2-tap half-rate time-based decision feedback
equalizer (TB-DFE) was demonstrated on a 10mm on-chip serial
link. Implemented in a 65nm GP technology, the transmitter and
receiver achieve an energy-efficiency of 31.9 and 45.3 fJ/b/mm,
respectively, at a data rate of 10Gb/s. A Bit Error Rate (BER) less
than 10^-12 was verified for an eye width of 0.43 Unit Interval (UI)
using an in-situ BER monitor.
Introduction
On-chip serial links are attractive for high-speed point-to-point
applications as they can achieve 10Gb/s or higher data rates
without using power-hungry and floorplan-disrupting repeaters.
Several circuit techniques have been proposed to improve the
communication speed and energy-efficiency of on-chip serial
links. Some recent examples include charge-injection based
feedforward equalization [1], capacitive based de-emphasis [2],
and current mode transceiver with active inductor [3]. However,
these analog-intensive approaches are susceptible to PVT
variation and suffer from headroom issues at low operating
voltages. Moreover, they do not take full advantage of the
technology scaling benefits and require considerable design effort
to implement in a new technology.
Decision feedback equalization (DFE) has become
indispensable for improving the performance of off-chip links.
However, it has not been widely adopted in on-chip links due
to the large power consumed by the current mode logic (CML)
circuits required for high-speed operation. In an effort to make
DFE more digital-friendly and amenable to technology scaling, we
propose a time-based DFE technique where the weighted sum
filter operation is performed entirely in the time domain. Our
digital-intensive approach utilizes inverters and digitally-controlled delay elements that can be readily designed in advanced
technologies. Another advantage of the proposed TB-DFE is that
a higher number of taps can be incorporated by simply adding
more delay stages.
Time-Based DFE
A comparison between the conventional DFE and the proposed TB-DFE is
shown in Fig. 1. The output voltage of a conventional CML based
DFE can be expressed as $V_{DFE} = V_{RX}(t) + \sum_i x[n-i] \cdot w_i$, where
x[n-i] is the preceding binary data and wi is the weight of each tap.
This voltage is determined by the current through the multiple
pull-up and pull-down paths in the CML circuit, and is converted
to a binary value by a slicer. The proposed time-based scheme
converts the voltage signal into a time delay, and the basic
operation can be described as $T_{DFE} = T_{RX}(t) + \sum_i x[n-i] \cdot w_i$.
Delay TRX(t) is a function of the incoming channel voltage VRX(t)
which is equalized using the previous binary outputs x[n-i] and
weights wi. The phase detector (PD) compares the phase
difference between the DFE path and the reference path to
generate a binary output. This is equivalent to the slicer operation
in a conventional DFE, but in the time domain.
978-4-86348-606-5 ©2017 JSAP
Fig. 2 shows an operation example of the proposed TB-DFE
for a bit sequence of '0100'. Due to channel inter-symbol
interference (ISI), the received voltage signal VRX(t) will be
distorted, which results in a smaller sensing margin between data
'1' and data '0' in the time domain. The reduced phase difference
between the DFE path signal NDFE and reference path signal NREF
may lead to an incorrect decision. The TB-DFE feedback loop can
enhance the phase difference by utilizing the previous decision
data and DFE weights wi.
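The '0100' case can be reproduced with a small behavioral model. The sketch below is illustrative only and is not the authors' circuit: it assumes a linear voltage-to-delay mapping with gain K, a nominal delay T0, a three-cursor pulse response H, and ideal weights chosen to cancel the two post-cursors. All of these names and values are hypothetical.

```python
# Behavioral sketch of a 2-tap time-based DFE. All numeric values are assumed.
T0 = 100.0           # nominal path delay in ps (assumed); also used as T_REF
K = 10.0             # assumed linear voltage-to-delay gain, ps per unit amplitude
H = [1.0, 0.7, 0.4]  # assumed pulse response: main cursor + 2 post-cursors


def received_delay(bits, n):
    """Delay T_RX for bit n; bits before the sequence are taken as 0."""
    v = 0.0
    for k, h in enumerate(H):
        b = bits[n - k] if n - k >= 0 else 0
        v += h * (1 if b else -1)
    return T0 + K * v


def tb_dfe(bits, weights):
    """Return (decisions, margins), where margin = T_DFE - T_REF in ps."""
    history = [0] * len(weights)        # previous decisions, most recent first
    decisions, margins = [], []
    for n in range(len(bits)):
        t_dfe = received_delay(bits, n)
        for w, x in zip(weights, history):
            t_dfe += w * (1 if x else -1)   # T_DFE = T_RX + sum_i x[n-i] * w_i
        margin = t_dfe - T0                 # phase detector: sign of the difference
        d = 1 if margin > 0 else 0
        decisions.append(d)
        margins.append(margin)
        history = [d] + history[:-1]
    return decisions, margins


bits = [0, 1, 0, 0]
no_dfe, m0 = tb_dfe(bits, [0.0, 0.0])
with_dfe, m1 = tb_dfe(bits, [-K * H[1], -K * H[2]])
print(no_dfe, with_dfe)  # [0, 0, 0, 0] [0, 1, 0, 0]
```

With these assumed numbers the isolated '1' arrives about 1 ps on the wrong side of the reference when the weights are zero, and about 10 ps on the correct side once the two taps are applied, which is the widening of the time window that Fig. 2 illustrates.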
65nm Transceiver Design
Fig. 3 shows the 10mm on-chip transceiver system we
implemented in a 65nm GP test chip. It consists of a half-rate feed
forward equalizer (FFE) on the transmitter side, and an inverter
based TIA followed by a TB-DFE on the receiver side. TX data is
generated by a 2^15-1 pseudo-random bit sequence (PRBS) generator. A high-swing voltage-mode output driver with a 3-tap half-rate FFE de-emphasizes the output signal and transmits data over a 10 mm
channel. The output driver is implemented using a bank of
inverters with resistors connected between the driving transistors
and channel for TX impedance matching. An inverter based TIA
containing a bank of programmable transmission gates enables
RX impedance matching. An in-situ BER monitor was
implemented for BER bathtub and eye diagram measurements.
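As a software analogue of the test pattern and error check, the sketch below generates a 2^15-1 PRBS with a linear feedback shift register and counts mismatches against a received stream. The paper does not state the generator polynomial; x^15 + x^14 + 1 (the common PRBS15 choice) is assumed here, and the function names are hypothetical.

```python
from itertools import islice


def prbs15(seed=0x7FFF):
    """Yield bits of a PRBS15 stream (polynomial x^15 + x^14 + 1, assumed)."""
    state = seed & 0x7FFF
    if state == 0:
        raise ValueError("seed must be non-zero")
    while True:
        new_bit = ((state >> 14) ^ (state >> 13)) & 1   # XOR of the two tap bits
        state = ((state << 1) | new_bit) & 0x7FFF       # shift in the feedback bit
        yield new_bit


def count_errors(tx_bits, rx_bits):
    """Count positions where the received bit differs from the transmitted bit."""
    return sum(t != r for t, r in zip(tx_bits, rx_bits))


PERIOD = 2**15 - 1
tx = list(islice(prbs15(), PERIOD))
rx = list(tx)
rx[100] ^= 1                   # inject a single bit error for illustration
print(count_errors(tx, rx))    # 1
```

The on-chip monitor performs the comparison in hardware against the DFE output; this model only mirrors the compare-and-count step.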
Details of the 2-tap half-rate TB-DFE are shown in Fig. 4. Each
delay stage is composed of three parallel tri-state inverters, an
always-enabled inverter, and three MOS capacitors to support
high speed operation and fine-grain programmable delays. The
first delay stage is controlled directly by the analog voltage VRX(t)
while the second delay stage is controlled by the binary feedback
data. The third stage serves as the offset delay ΔT for eye diagram
measurements, which is equivalent to the offset voltage in
conventional DFEs. The inverter based implementation
significantly reduces the circuit complexity and makes it easier to
re-design the DFE circuit in a new technology. To further optimize
the circuit, one of the two taps was folded into the reference path
with a complementary weight configuration. With the
modification, the number of inverters and capacitors is cut down
by half which also reduces power consumption and chip area. A
zero-offset aperture PD was adopted in our design [4].
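One way to read the tap folding is as moving a term across the phase comparison. The derivation below is a sketch under stated assumptions: it uses the two-tap form of the time-domain sum given earlier, and it writes the unmodified reference delay as $T_0$, a symbol introduced here for illustration.

```latex
\begin{align*}
T_{DFE} - T_{REF}
  &= \bigl(T_{RX}(t) + w_1\,x[n-1] + w_2\,x[n-2]\bigr) - T_0 \\
  &= \bigl(T_{RX}(t) + w_1\,x[n-1]\bigr) - \bigl(T_0 - w_2\,x[n-2]\bigr)
\end{align*}
```

The phase detector sees the same difference in both forms, so applying the second tap to the reference path with the complementary weight $-w_2$ leaves the decision unchanged.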
In-situ BER Monitor and Measurement Results
Fig. 5 shows the proposed circuit for in-situ BER bathtub and
eye diagram measurements. The programmable phase delay
allows the clock to sample data over a 2 UI range. Since the
proposed TB-DFE converts voltage into time, instead of sweeping
the voltage offset, we sweep the time offset ΔT to create the BER
eye diagram. This was implemented by adding a third delay stage
in the TB-DFE delay lines which allows the delay offset to be
swept up to one tap delay. The BER monitor compares the TX data
from the 2^15-1 PRBS with the DFE output. Finally, the error count
is tallied and serially read out. Fig. 6 shows the bathtub curves with
and without TB-DFE for two consecutive bits. The TB-DFE
lowers the BER floor from 10^-10 to 10^-12 while ensuring an eye
width of 0.43 UI. Fig. 7 shows the BER eye diagram. The delay
offset was swept from 0ps to 18ps using a 6-bit control code. Fig.
8 shows the energy-efficiency and data rate measured at different
supply voltages. To save test time, only the data point at 1.2V is
based on a BER criterion of 10^-12 while the other data points are for
a BER of 10^-9. The die photo and summary table are shown in Figs.
9 and 10, respectively. The transmitter and receiver blocks (not
including the BER monitor) achieve an energy-efficiency of 31.9
and 45.3 fJ/b/mm, respectively, at a data rate of 10 Gb/s.
References
[1] B. Kim et al., ISSCC, 2009.
[2] D. Walter et al., ISSCC, 2012.
[3] S. Lee et al., ISSCC, 2013.
[4] S. Kundu et al., ISSCC, 2016.
[5] M. Chen et al., VLSI, 2015.
2017 Symposium on VLSI Circuits Digest of Technical Papers
Fig. 1. Conventional DFE and time-based DFE.
(Panels: conventional CML-based DFE with slicer; proposed time-based DFE with phase detector, PD.)
Fig. 2. Illustration of time-based DFE compensation for data stream '0100'. The time difference window expands with DFE.
Fig. 3. Block diagram of 10mm on-chip serial link test chip.
(Blocks: 2^15-1 PRBS generator, TX with 3-tap FFE (half-rate), channel with W=2µm, L=10mm, M9, TIA, 2-tap time-based DFE (half-rate), CLK generator, in-situ BER monitor.)
Fig. 4. Half rate 2-tap time-based DFE circuit implementation.
(Annotations: W1, W2: 6 bit DFE weights; CLK (5GHz); zero-offset aperture PD [4]; programmable delay offset = ΔT.)
Fig. 5. Circuit blocks used for bathtub and BER eye diagram measurement.
(Blocks: programmable phase delay, PD, 2^15-1 PRBS, 11b counter, error count. Programmable delays are used for bathtub and BER eye measurements.)
Fig. 6. Measured BER bathtub curves (<10^-12) for two consecutive bits.
(Axes: Phase (UI) versus Bit Error Rate, with and without DFE. 65nm GP, 1.2V, 25°C.)
Fig. 7. Measured BER eye diagram for two consecutive bits.
(Axes: Phase (UI) versus ΔT Delay Offset (Code). 65nm GP, 1.2V, 25°C.)
Fig. 8. Supply voltage versus data rate and energy-efficiency.
(Axes: Supply Voltage (V) versus Data Rate (Gb/s) and Energy-Efficiency (fJ/b/mm). 65nm GP, 25°C.)
Fig. 9. 65nm chip micrograph.
(Labels: TX, RX, BER Mon., 10mm Link; annotated block areas 30×59 µm² and 23×24 µm²; scale bar 100 µm.)

                            | ISSCC'09 [1]            | ISSCC'12 [2]                   | ISSCC'13 [3]                   | VLSI'15 [5]         | This work
Technology                  | 90nm                    | 65nm                           | 65nm                           | 65nm                | 65nm
Transceiver circuits        | Current mode driver+TIA | Voltage mode driver+sense amp. | Current mode driver+sense amp. | CTLE-based repeater | Voltage mode driver+TIA
Features                    | No DFE                  | No DFE                         | No DFE                         | No DFE              | 2-tap TB-DFE
Data Rate                   | 4Gb/s                   | 10Gb/s                         | 3Gb/s                          | 4Gb/s               | 10Gb/s
Throughput (Gb/s/µm)        | 2                       | 2.56                           | 0.75                           | 4                   | 2
Link Length                 | 10mm                    | 6mm                            | 10mm                           | 2.5mm+2.5mm         | 10mm
BER Bathtub                 | < 10E-6                 | < 10E-12                       | < 10E-12                       | < 10E-12            | < 10E-12
BER Eye                     | Yes (< 10E-6)           | No                             | Yes (< 10E-12)                 | No                  | Yes (< 10E-11)
Energy Efficiency (fJ/b/mm) | 35.6                    | 174                            | 9.5                            | 48.4                | TIA 14.4, DFE 30.9, FFE 31.9
Fig. 10. Comparison with state-of-the-art on-chip serial links.