Chapter 7
High-Speed Signal
Prof. Lei He
Electrical Engineering Department
University of California, Los Angeles
URL: eda.ee.ucla.edu
Email: [email protected]
High-speed Links are Everywhere
[Figure: high-speed links at every scale: backbone, router, rack, PC or console]
[Sredojevic:ICCAD’08]
High-Speed Links: Applications








Chip-to-chip signaling
– Computers, games: SDRAM (DDR, DDR2) 100-700MHz, RDRAM 800-1600MHz, DDR3 800-1600MHz, DDR4 1.6-3.2GHz, XDR DRAM 3.2-6.4GHz
Board-to-board
– Computers: peripherals: PCI (66-133-400MHz), PCIe (250M-500M-1GHz), Infiniband (2.5Gb/s)
Networks
– LAN: Fast Ethernet, Gigabit Ethernet, 10G Ethernet
– WAN: OC-12 (625MHz), OC-192 (12.5GHz)
– Routers: 625Mb/s – 2.5Gb/s
Outline
Link Design Basics




Signal Integrity
High Speed Signaling Architectures
Equalization
Post-Silicon Tuning of High-Speed Signaling
Noise








Signals may be corrupted from many sources
Inter-symbol interference (ISI)
– Frequency-dependent attenuation (dispersion)
– Reflection
– Oscillation
Crosstalk
Power supply noise
Real noise
– Thermal and shot noise
Parameter variation
Noise measure
Eye diagram
– Timing jitter
– Amplitude noise
Inter-Symbol Interference

A signal interfering with itself

Ideally a transmission system is time invariant


No history of previous bits
In reality, the state of the system is affected by previous bits



Signals that don’t reach the rails by the end of cycle
– Signal’s transition time is limited by channel bandwidth
Reflections on the transmission lines
Magnitude and phase of excited resonances
ISI - Dispersion




Frequency-dependent attenuation
In general, channel is low pass
Our nice short pulse gets spread out
Example: a 101 pattern
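The spreading of a 101 pattern can be illustrated with a minimal sketch. It assumes the channel behaves like a single-pole low-pass filter whose time constant equals one bit time; the function names and the `tau_bits` parameter are hypothetical and chosen only for illustration, and a real channel is more complicated.

```python
import math

def lowpass_response(bits, samples_per_bit=20, tau_bits=1.0):
    """Drive a first-order low-pass 'channel' with an NRZ bit pattern.

    tau_bits is the channel time constant in units of one bit time
    (an assumed, illustrative parameter).
    """
    # Exact per-sample update for a first-order system with a held input.
    alpha = 1.0 - math.exp(-1.0 / (samples_per_bit * tau_bits))
    y, out = 0.0, []  # the line is assumed to start at the low rail
    for b in bits:
        for _ in range(samples_per_bit):
            y += alpha * (float(b) - y)
            out.append(y)
    return out

def value_at_bit_ends(wave, samples_per_bit=20):
    """Channel output at the end of each bit time."""
    return [wave[i + samples_per_bit - 1]
            for i in range(0, len(wave), samples_per_bit)]

ends = value_at_bit_ends(lowpass_response([1, 0, 1]))
print([round(v, 3) for v in ends])
```

Under these assumptions the first 1 does not reach the high rail and the middle 0 falls only part of the way back, so the level reached by the second 1 depends on the bits before it.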
ISI - Reflection
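Reflections of previous bits travel up and down transmission lines
A mismatch of δ gives (to the first order) a reflection of ρ ≈ δ/2:
$$\rho = \frac{Z_t - Z_0}{Z_t + Z_0}, \qquad Z_t = Z_0(1 + \delta) \;\Rightarrow\; \rho = \frac{\delta}{2 + \delta} \approx \frac{\delta}{2}$$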
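As a rough numerical check of the first-order statement, the sketch below uses the standard reflection coefficient for a line of impedance Z0 terminated in Zt. The function names and the 10% mismatch are assumptions made for illustration.

```python
def reflection_coefficient(z_term, z0):
    """Reflection coefficient of a line of impedance z0 terminated in z_term."""
    return (z_term - z0) / (z_term + z0)

def first_order_reflection(delta):
    """First-order estimate for a fractional mismatch delta (z_term = z0 * (1 + delta))."""
    return delta / 2.0

z0 = 50.0
delta = 0.10                      # assumed 10% termination mismatch
exact = reflection_coefficient(z0 * (1 + delta), z0)
approx = first_order_reflection(delta)
print(round(exact, 4), round(approx, 4))
```

For small mismatches the two values agree closely; a matched termination gives no reflection at all.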
ISI - Resonances




Oscillations are excited by signal
transitions and may interfere with later
transitions
Excitation of resonant circuits is reduced
with longer transition times
Slower edge has less high frequency
spectral content
Resistance damps oscillation
Crosstalk
• Crosstalk is the coupling of energy from one line to another via:
• Mutual capacitance (electric field)
• Mutual inductance (magnetic field)
• One signal interfering with another signal
[Figure: two coupled transmission lines of impedance Zo with source impedance Zs, near and far ends marked, coupled by mutual inductance Lm and mutual capacitance Cm]
Crosstalk Induced Noise
• The mutual inductance will induce current on the victim line opposite
of the driving current (Lenz’s Law)
• The mutual capacitance will pass current through the mutual
capacitance that flows in both directions on the victim line
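$$I_{Cm} = C_m \frac{dV}{dt}, \qquad V_{Lm} = L_m \frac{dI}{dt}$$
$$I_{near} = I_{Cm} + I_{Lm}, \qquad I_{far} = I_{Cm} - I_{Lm}$$
[Figure: coupled lines (Zs, Zo) showing the currents ICm and ILm at the near and far ends of the victim line]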
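The sign behaviour of the near-end and far-end currents can be sketched numerically. The component values below are hypothetical, and the inductively induced current `i_lm` is passed in directly because converting the induced voltage into a current depends on the victim-line terminations, which this sketch does not model.

```python
def capacitive_current(cm, dv_dt):
    """Current injected through the mutual capacitance: I_Cm = Cm * dV/dt."""
    return cm * dv_dt

def near_far_currents(i_cm, i_lm):
    """Combine the capacitive and inductive components at each end of the victim."""
    i_near = i_cm + i_lm   # components add at the near end
    i_far = i_cm - i_lm    # components oppose at the far end
    return i_near, i_far

# Hypothetical values: 1 pF mutual capacitance, 1 V edge in 100 ps.
i_cm = capacitive_current(1e-12, 1.0 / 100e-12)
i_lm = 15e-3               # assumed inductive component, larger than i_cm
i_near, i_far = near_far_currents(i_cm, i_lm)
print(i_near > 0, i_far < 0)
```

With the inductive component larger than the capacitive one, the far-end current comes out negative while the near-end current stays positive.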
Voltage Profile of Coupled Noise
• Near end crosstalk is always positive
• Currents from Lm and Cm always add and flow into the node
• For PCB’s, the far end crosstalk is “usually” negative
• Current due to Lm larger than current due to Cm
• Note that far end crosstalk can be positive
[Figure: driven line (driver, Zs) next to the un-driven "victim" line, with near end and far end marked and Zo terminations]
Power Supply Noise
The power supply network has parasitic
elements
On-chip: resistive
Off-chip: inductive
Current draw across these elements induces a noise voltage:
$$R\,i + L\,\frac{d}{dt}\,i(t)$$

Instantaneous current is what matters
May be many times the DC current
– 10W chip draws 4A at 2.5V
– Peak current may be 10-20A
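The arithmetic behind these numbers can be sketched as follows. The resistance, inductance, and current slew rate are hypothetical values picked for illustration; only the 10W and 2.5V figures come from the slide.

```python
def dc_current(power_w, vdd):
    """Average supply current for a given power and supply voltage."""
    return power_w / vdd

def supply_noise(r_ohm, l_henry, i_amp, di_dt):
    """Noise voltage across the supply parasitics: R*i + L*di/dt."""
    return r_ohm * i_amp + l_henry * di_dt

i_dc = dc_current(10.0, 2.5)                 # 10W chip at 2.5V
# Assumed parasitics: 10 milliohm, 10 pH, current changing by 10 A in 1 ns.
noise = supply_noise(10e-3, 10e-12, i_dc, 10.0 / 1e-9)
print(i_dc, round(noise, 3))
```

Even with a small assumed inductance, the L di/dt term dominates the resistive drop here, which is why the instantaneous current matters more than the DC value.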

Simultaneous Switching Outputs (SSO)







When several outputs switch
simultaneously, significant current is
drawn from the supply or sent into
ground
Supply connections have inductance
SSO currents produce a voltage drop
across these inductances
On-chip, the VDD to VSS voltage
difference decreases
Effect grows with number of drivers
switching
Quadratic with the inverse of
transition time
Between chips, the drops across VSS
inductances can affect driver timing
and shift the receiver threshold
Other Noise Sources
Alpha particles
5MeV particle injects 730fC of charge into substrate
One node typically collects less than 50fC


Thermal and shot noise
Proportional to bandwidth – typically in the µV range

Parameter mismatch
VT and β have deviation proportional to 1/sqrt(WL)
Systematic variations depend on layout


Eye Diagram
This is a “1”
This is a “0”
Eye – space between 1 and 0
With voltage noise
With timing noise
With both!!
Eye Diagram (cont’d)
Standard measure for signaling
– Synchronized superposition of all possible realizations of the signal viewed within a particular interval
Timing jitter
– Deviation of the zero-crossing from its ideal occurrence time
Amplitude noise
– Set by signal-to-noise ratio (SNR)
– The amount of noise at the sampling time
[Figure: TX, channel, RX]
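The vertical eye opening at the sampling instant can be estimated by overlaying the sampled values of every bit. The sketch below assumes a single-pole low-pass channel sampled at the end of each bit time; the bit pattern, the function names, and `tau_bits` (channel time constant in bit times) are hypothetical, and timing jitter is not modeled.

```python
import math

def channel_samples(bits, tau_bits):
    """Value of a first-order low-pass channel at the end of each bit time."""
    decay = math.exp(-1.0 / tau_bits)   # fraction of the old error left after one bit
    y, samples = 0.0, []                # the line is assumed to start at the low rail
    for b in bits:
        y = b + (y - b) * decay
        samples.append(y)
    return samples

def eye_height(bits, tau_bits):
    """Gap between the lowest sampled '1' and the highest sampled '0'."""
    samples = channel_samples(bits, tau_bits)
    ones = [s for s, b in zip(samples, bits) if b == 1]
    zeros = [s for s, b in zip(samples, bits) if b == 0]
    return min(ones) - max(zeros)

pattern = [1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
fast = eye_height(pattern, tau_bits=0.5)
slow = eye_height(pattern, tau_bits=1.0)
print(round(fast, 3), round(slow, 3))
```

A slower channel (larger `tau_bits`) leaves more residue from earlier bits, so the eye closes vertically.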
Outline
Link Design Basics




Signal Integrity
High Speed Signaling Architectures
Equalization
Post-Silicon Tuning of High-Speed Signaling
Signaling – Main Idea
A good signaling system isolates the signal from noise rather than
trying to overpower the noise




Crosstalk
– Terminate both ends, use homogeneous media
ISI
– Matched terminations, no resonators, rise-time control
Power supply noise
– Avoid coupling into signal or reference
– Differential signaling
– Current mode
– Stable reference
Architecture of Signaling
[Figure: data ...1001... enters a multiplexer (Tx clock), then driver, channel, and amp, then a demultiplexer (Rx clock) that recovers ...1001...]
Signaling Architecture Tradeoffs










Signal modulation
PAM (Pulse-amplitude modulation)
Pulsed (Return-to-Zero, RZ) signaling
Binary (ex:NRZ) or Multiple-level signaling (MLS)
Uni-directional or Bidirectional
Time-multiplexed bidirectional or simultaneous bidir.
Single-ended or differential
Current mode or voltage mode
Bus or single-trace
Point-to-point or multi-drop
Example System - Trade-offs
Transmitter choices:
output impedance,
bipolar/unipolar driver,
amplitude, rise time,
single-ended/differential
Line considerations:
length, impedance,
attenuation, discontinuities
Termination choices:
source, destination,
both, underterm
Reference considerations:
VDD-div, Tx, Rx, diff
Receiver choices:
offset, sensitivity, BW
Voltage Mode vs Current Mode
Main differences are
Ease of control and generation
– Much easier to generate a small current than a small voltage
Coupling of supply noise
– 50% of supply noise shows up on the data line in the matched
voltage mode; potentially much less in a high-Z current-mode
driver
Generation of high-Z switches easier than controlled-Z switches



Single-ended vs Differential
Single-ended signaling
compare to shared reference
Often used with a bus
Issues
– Generates SSO noise
– How to make reference
– How to quiet reference
– Crosstalk cannot be made
common-mode



• Differential signaling
• compare between two lines
• Noise immunity
• Many noise sources
become common mode
• Issues
• Differential must run > 2x as fast as single-ended to make sense
• Otherwise, power x2, pins x2
Binary vs Multiple-level (4-PAM)
Binary (NRZ) is 2-PAM
• Uses 2 levels to send one bit per symbol
4-PAM uses 4 levels to send 2 bits per symbol
• Each level encodes a 2-bit value
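A minimal sketch of the 2-bits-per-symbol mapping follows. The Gray-coded assignment of bit pairs to levels and the level values -3, -1, +1, +3 are assumptions for illustration; the slides do not specify a mapping.

```python
# Assumed Gray mapping: adjacent levels differ in exactly one bit.
GRAY_LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}
LEVEL_TO_BITS = {level: pair for pair, level in GRAY_LEVELS.items()}

def pam4_encode(bits):
    """Map an even-length bit list to 4-PAM levels, two bits per symbol."""
    if len(bits) % 2:
        raise ValueError("4-PAM needs an even number of bits")
    return [GRAY_LEVELS[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]

def pam4_decode(levels):
    """Recover the bit list from ideal (noise-free) 4-PAM levels."""
    bits = []
    for level in levels:
        bits.extend(LEVEL_TO_BITS[level])
    return bits

data = [1, 0, 0, 1, 1, 1, 0, 0]
symbols = pam4_encode(data)
print(symbols, pam4_decode(symbols) == data)
```

The symbol count is half the bit count, so the symbol rate halves for the same data rate. The cost is that, for the same peak swing, the spacing between adjacent levels is one third of the binary spacing.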
When Does 4-PAM Make Sense?
Simultaneous Bidirectional Signaling

Wires can transmit waves in both directions

It seems a shame to only use one direction at a time




Simultaneous Bidirectional Signaling
Transmit waves in both directions at the same time
Waveform on wire is superposition of forward and reverse
traveling wave
Subtract transmitted wave at each end to recover received
wave

There are 3-levels on the line but it’s still 2-level signaling

Much more sensitive to reflections and crosstalk
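The superposition and subtraction steps can be sketched with ideal waveforms. This assumes a lossless, perfectly matched line and ignores propagation delay, so both transmitted waves are aligned symbol by symbol; the function names are hypothetical.

```python
def line_levels(tx_a, tx_b):
    """Voltage on the wire: superposition of the two traveling waves."""
    return [a + b for a, b in zip(tx_a, tx_b)]

def recover(line, own_tx):
    """Each end subtracts its own transmitted wave to get the received wave."""
    return [v - t for v, t in zip(line, own_tx)]

tx_a = [0, 1, 1, 0, 1]     # bits sent from end A (unit swing)
tx_b = [1, 1, 0, 0, 1]     # bits sent from end B (unit swing)
line = line_levels(tx_a, tx_b)
print(sorted(set(line)), recover(line, tx_a) == tx_b)
```

The wire carries three distinct levels (0, 1, 2) even though each direction is still binary. Any reflection or crosstalk adds directly to the value left after subtraction, which is one way to see why this scheme is more sensitive to both.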
Outline
Link Design Basics




Signal Integrity
High Speed Signaling Architectures
Equalization
Post-Silicon Tuning of High-Speed Signaling
Equalization
channel
equalizer

Channels are band-limited, and most are low-pass

Goal is to flatten the overall response

Equalization: Boost higher frequencies relative to lower frequencies

Can be done at Tx or RX or both
Receiver Linear Equalizer
Amplifies high-frequencies attenuated
by the channel


Pre-decision

Digital or Analog FIR filter





Issues
Also amplifies noise!
Precision
Tuning delays (if analog)
Setting coefficients (adaptive filter)
– Adaptive algorithms such as LMS
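As an illustration of setting receiver FIR coefficients adaptively, the sketch below runs the standard LMS update on a toy channel. The two-tap channel (main cursor plus a 0.5 post-cursor), the step size, the tap count, and the use of known training symbols are all assumptions made for this example.

```python
import random

def lms_equalizer(received, desired, n_taps=5, mu=0.01):
    """Adapt FIR equalizer taps with the LMS rule w <- w + mu * e * x."""
    w = [0.0] * n_taps
    history = [0.0] * n_taps            # most recent received samples first
    errors = []
    for x, d in zip(received, desired):
        history = [x] + history[:-1]
        y = sum(wi * xi for wi, xi in zip(w, history))
        e = d - y                       # error against the training symbol
        w = [wi + mu * e * xi for wi, xi in zip(w, history)]
        errors.append(e)
    return w, errors

rng = random.Random(0)
symbols = [rng.choice((-1.0, 1.0)) for _ in range(5000)]
# Toy channel with ISI: each sample also contains half of the previous symbol.
received = [s + 0.5 * p for s, p in zip(symbols, [0.0] + symbols[:-1])]
weights, errors = lms_equalizer(received, symbols)
print([round(w, 2) for w in weights])
```

The adapted taps approach a truncated inverse of the channel. In a real link the received samples also carry noise, and the equalizer amplifies it along with the high-frequency signal content.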
Transmitter Linear Equalizer
Tx Pre-emphasis Filter




Attenuates low-frequencies
Need to be careful about output amplitude: output power is limited
– If you could make bigger swings, you would
– EQ really attenuates low-frequencies to
match high frequencies
Also FIR filter: D/A converter
Can get better precision than RX




Issues
How to set EQ weights?
Doesn’t help loss at high f
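The effect of a peak-limited transmitter can be sketched with a two-tap pre-emphasis filter. The tap weights and function names are hypothetical; the taps are scaled so that the worst-case output never exceeds the available swing.

```python
def normalize_taps(taps):
    """Scale taps so the worst-case output magnitude is 1 (peak-limited driver)."""
    total = sum(abs(t) for t in taps)
    return [t / total for t in taps]

def tx_fir(symbols, taps):
    """Apply an FIR pre-emphasis filter to +/-1 symbols."""
    out = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, t in enumerate(taps):
            if n - k >= 0:              # no symbols exist before the start
                acc += t * symbols[n - k]
        out.append(acc)
    return out

taps = normalize_taps([1.0, -0.25])     # assumed 2-tap weights
out = tx_fir([-1, -1, -1, 1, 1, 1, 1], taps)
print([round(v, 2) for v in out])
```

Under these assumptions a transition is sent at full swing, while a run of identical symbols settles to a reduced level. That is the sense in which the equalizer attenuates low frequencies rather than boosting high ones.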
Tx Linear EQ: Single Bit Response
Outline
Link Design Basics




Signal Integrity
High Speed Signaling Architectures
Equalization
Post-Silicon Tuning of High-Speed Signaling
Process Variation vs Analog Circuits
[ITRS]

Threshold voltage variation is increasingly dominant and is primarily random


Due to increasing random dopant fluctuation
Corner-based design is not effective for the device matching that analog circuits widely rely on

Often results in over-sized circuits and excessive area/power
Post-Silicon Tuning is Effective
[Li:ICCAD’08]


Post-silicon tuning is effective to compensate random process variation
Digitally tunable circuit is commonly adopted
Insensitivity to noise and variation
Suitable for process migration


Post-Silicon Tuning of High-Speed Signaling

Algorithm Framework


Problem formulation
Branch and bound based algorithm

Case Study I: Transmitter

Case Study II: PLL

Conclusions
Unit Cell Based Design Methodology

Pre-characterize different types of unit cell, e.g., a transistor with a given threshold voltage and unit W/L.

A transistor of larger W/L can be synthesized by connecting unit cells of the same type in parallel

Design variables simply become
– type of unit cell α (threshold)
– number of unit cells in parallel (sizing)
Constraints such as output swing must be satisfied for correct operation

Apply to other circuit elements such as unit capacitance and resistance

Makes design better and modeling more accurate
Digitally Tunable Circuits
[Figure: analog output versus digital input (bits D[0], D[1], D[2]), with and without variation. Labels: A, A+ΔA, A′, LSB size, D, D′. One tap in a pre-emphasis filter: the current source can be implemented by a current-division DAC]

Current-division DAC is commonly used to combat process variation
Two tuning parameters
– LSB size (γ): minimum step during digital-to-analog conversion
– Resolution (β): number of bits used
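The interplay of the two tuning parameters can be sketched as a quantization problem: the DAC can only cancel a deviation in multiples of the LSB size, up to a range set by the number of bits. The function names and numeric values below are hypothetical, and the deviation is treated as a simple non-negative offset.

```python
def tuning_code(target, lsb, bits):
    """Digital code whose DAC output (code * lsb) is closest to the target."""
    max_code = 2 ** bits - 1
    code = round(target / lsb)
    return max(0, min(max_code, code))  # clamp to the available codes

def residual(target, lsb, bits):
    """Deviation left over after tuning with the best available code."""
    return target - tuning_code(target, lsb, bits) * lsb

# Assumed 3-bit DAC; compare a fine and a coarse LSB.
in_range = residual(0.12, lsb=0.05, bits=3)      # within range: error <= lsb / 2
out_of_range = residual(0.50, lsb=0.05, bits=3)  # beyond full scale (7 * 0.05)
coarse = residual(0.50, lsb=0.10, bits=3)        # larger LSB widens the range
print(round(in_range, 3), round(out_of_range, 3), round(coarse, 3))
```

A small LSB gives fine correction but limited range, a large LSB gives range but coarse correction, and adding bits buys both at the cost of area.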


Impact of Post-Silicon Tuning
(a) Without Tuning

(b) With Tuning
Example: BER for a high-speed link
4-tap pre-emphasis filter in a transmitter
0% (3σ) variation in Vt



Design-time optimization and the post-silicon tuning circuit both need area, so
joint optimization is a must
Joint Optimization
Maximize parametric yield subject to
– power constraint (process variation changes power)
– area constraint (process variation does not change layout area)
– bound on design parameters
– bound on the total number of unit cell types
– bound on the LSB size and resolution
Optimization Challenges
3000 Monte Carlo runs over different unit cell design α, resolution β,
and LSB size for one tap of FIR

Discrete problem with non-convex objective and constraints

Solution space surface is rough and many local maxima exist

Significant improvement can be expected
Overall Algorithm
Algorithm framework:
– Partition the solution space by LSB size (γ) and unit cell type (α)
– Develop a bound on the parametric yield
– Discard (fathom) if bound worse than the current best solution
Use gradient ascent method to find the local maxima
– Sequentially take steps in the direction proportional to the gradient.
Bound estimation
– Remove the area and power constraints
– Use LMS algorithm to find optimal yield value
[Figure: solution space partitioned into regions by unit cell type (αi, αj, αk) and LSB size (γi, γj); some regions are all infeasible, others are pruned by upper bound check]
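The partition, bound, and fathom steps can be sketched generically. This is a toy illustration, not the authors' implementation: regions are integer intervals, the objective is a simple concave function, the bound is computed exactly, and the local search is a brute-force scan. All of these are assumptions made so that the example runs on its own.

```python
def branch_and_bound(regions, upper_bound, local_search):
    """Visit regions, skipping any whose bound cannot beat the best found so far."""
    best_val, best_sol, pruned = float("-inf"), None, 0
    for region in regions:
        if upper_bound(region) <= best_val:
            pruned += 1                 # fathom: the region cannot improve the result
            continue
        sol, val = local_search(region)
        if val > best_val:
            best_val, best_sol = val, sol
    return best_sol, best_val, pruned

def objective(x):
    """Toy stand-in for the yield, peaking at x = 5."""
    return -(x - 5) ** 2

def upper_bound(region):
    """Exact bound for the toy objective: evaluate the point nearest to 5."""
    lo, hi = region
    return objective(min(max(5, lo), hi))

def local_search(region):
    """Brute-force scan of the region (stand-in for a local optimizer)."""
    lo, hi = region
    best = max(range(lo, hi + 1), key=objective)
    return best, objective(best)

sol, val, pruned = branch_and_bound([(0, 3), (4, 7), (8, 11)],
                                    upper_bound, local_search)
print(sol, val, pruned)
```

In this toy run the last region is discarded without being searched, because its bound is already worse than the best value found in an earlier region.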
Gradient Ascent Method
• In each un-pruned region, sequentially take steps in the direction proportional to the gradient, until a local maximum of the objective function is reached.
• At each step, increase/decrease each variable by 1 in turn and check the change of the objective function.
• Always take the change (direction) that causes the maximum increase.
• Termination of the algorithm indicates that one of the local maxima has been reached or that we have reached the boundary.
• The initial guess for the gradient ascent can be arbitrarily chosen. In our experiments, we find that it did not influence runtime or quality significantly.
• We also observed that the algorithm always converges to a local optimum within two or three iterations.
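The stepping rule described above (try each variable at plus or minus 1, keep the single change with the largest increase, stop when nothing improves) can be sketched as a discrete hill climb. The toy objective and the bounds are hypothetical stand-ins for the yield function and the design-parameter bounds.

```python
def discrete_gradient_ascent(objective, start, lower, upper):
    """Climb on an integer grid by the best single +/-1 move per step."""
    x = list(start)
    best = objective(x)
    while True:
        best_move = None
        for i in range(len(x)):
            for step in (-1, 1):
                cand = x.copy()
                cand[i] += step
                if not (lower[i] <= cand[i] <= upper[i]):
                    continue            # stay inside the region boundary
                val = objective(cand)
                if val > best:          # keep only the largest increase seen
                    best, best_move = val, cand
        if best_move is None:           # local maximum or boundary reached
            return x, best
        x = best_move

def toy_yield(v):
    """Assumed smooth stand-in for the yield, peaking at (3, 5)."""
    return -(v[0] - 3) ** 2 - (v[1] - 5) ** 2

inner, inner_val = discrete_gradient_ascent(toy_yield, [0, 0], [0, 0], [10, 10])
edge, edge_val = discrete_gradient_ascent(toy_yield, [0, 0], [0, 0], [2, 10])
print(inner, inner_val, edge, edge_val)
```

The first call stops at the interior maximum, and the second stops on the boundary because the first variable is capped at 2. On a rough surface with many local maxima, this search returns whichever local maximum it reaches first.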
Post-Silicon Tuning of High-Speed Signaling

Algorithm Framework

Case study 1: transmitter



Knobs for design-time and post-silicon
Modeling and formulation
Experimental results

Case Study 2: PLL

Conclusions
Knobs for Optimization
[Figure: link block diagram. Transmitter: data, pre-driver, FIR pre-emphasis filter/driver with load resistors RD. Channel. Receiver: pre-amplifier, slicer, CDR, out. The N-tap FIR filter has filter coefficients a0, a1, ..., an set by tap currents IC0, IC1, ..., ICn]
Given transmission channel → filter coefficient → transistor size
change channel behavior ← parasitic capacitance
Knobs for Optimization (cont’d)
[Figure: the same link block diagram, with the tap current sources IC0, IC1, ..., ICn shown as digitally tuned DACs (inputs D[0], D[1], D[2]); inset plots analog output versus digital input with and without variation, with the LSB size marked]
Problem Formulation
For transmitter

,
random variable
BER Distribution Comparison


20% (3σ) variation in Vth with 10000 Monte Carlo runs
Design 1 - without tuning circuit
– All resources are used for filter
– Unavoidable large variation
Design 2 - one tap filter
– All resources are used for DAC
– Has extremely small variance but suffers severe ISI
Design 3 - heuristic design
– Assume 4-tap filter
– Assume LSB size is equal for each tap
– Limit the solution space
– Good improvement compared to two extreme cases
Design 4 - our algorithm
– Provides better solution (mean,
Yield Rate
Experiment setting
Channel – 30cm differential microstrip line with FR-4 substrate
5GHz data rate
Yield is set by BER=1e-15 (estimated by EVM)



Yield comparison for different area constraints




Our algorithm always provides better yield than the design heuristic
With aggressive area constraint, our algorithm has much less yield degradation
Saturation effect
Up to 47% improvement
[Figure: yield plotted against area]
Yield with Power Constraint
[Figure: axes labeled power and Vt variation]
Post-Silicon Tuning of High-Speed Signaling

Algorithm Framework

Case study 1: Transmitter

Case study 2: PLL Design

Conclusions
Jitter Modeling
PLL output clock jitter
– Hnin and HnVCO are the noise transfer functions of reference clock noise and VCO noise
E.g., tunable PLL
– Jitter can be changed by tuning the charge pump current ratio [Mansuri:JSSC’02]
[Figure: PLL block diagram. CLKref (ω0) with input noise ФnIN drives the PFD (KPFD), which produces UP/DN for charge pumps CP1 (ICP) and CP2 (η×ICP); capacitor CCP and an op-amp (1/gmOP) sit between VINT and VCtrl; the VCO (KVCO/S) with noise ФnVCO produces ФOUT (CLKVCO)]
Joint Optimization
Design-time optimization
– Two charge pumps Icp1, Icp2
– Ratio (Icp1/Icp2) determines output RMS jitter
– Optimal ratio can be found using design-time optimization
Again, process variation would cause performance degradation
Digitally tuned current mirror
Small reference current
– Consumes less power
– η needs to be far less than unity
– Limited tuning resolution
Large reference current
– Good tunability
– Power and area penalty
[Horowitz:JSSC’00]
Same Formulation Applies
For PLL
objective function becomes
and area can be computed in a way similar to the transmitter case.
Experimental Results




PLL with digitally controlled charge pump current
Yield is defined by output clock RMS jitter
Design heuristic using minimized biasing current
Consider 30% Vth variation
Improve the yield by up to 56%
Conclusions

Formulate a joint optimization problem for digitally tuned analog
circuits
Consider both design-time optimization and post-silicon tuning
Maximize performance yield s.t. power and area constraints
Propose a general optimization framework
Combine branch-and-bound and gradient-ascent algorithm
Effectively find the global optimum
Two joint optimization design examples for high-speed serial link
Transmitter design
PLL design
Experiments show great (>47%) yield improvement over common
circuit design heuristic








