Chapter 7 High-Speed Signal
Prof. Lei He
Electrical Engineering Department
University of California, Los Angeles
URL: eda.ee.ucla.edu
Email: [email protected]

High-speed Links are Everywhere
[Figure: backbone router, rack, PC or console] [Sredojevic:ICCAD’08]

High-Speed Links: Applications
• Chip-to-chip signaling
  – Computers, games: SDRAM (DDR, DDR2) 100-700MHz, RDRAM 800-1600MHz, DDR3 800-1600MHz, DDR4 1.6-3.2GHz, XDR DRAM 3.2-6.4GHz
• Board-to-board
  – Computers: peripherals: PCI (66-133-400MHz), PCIe (250M-500M-1GHz), Infiniband (2.5Gb/s)
• Networks
  – LAN: Fast Ethernet, Gigabit Ethernet, 10G Ethernet
  – WAN: OC-12 (625MHz), OC-192 (12.5GHz)
  – Routers: 625Mb/s – 2.5Gb/s

Outline
• Link Design Basics
• Signal Integrity
• High Speed Signaling Architectures
• Equalization
• Post-Silicon Tuning of High-Speed Signaling

Noise
• Signals may be corrupted from many sources
• Inter-symbol interference (ISI)
  – Frequency-dependent attenuation (dispersion)
  – Reflection
  – Oscillation
• Crosstalk
• Power supply noise
• Real noise
  – Thermal and shot noise
• Parameter variation
• Noise measure: eye diagram
  – Timing jitter
  – Amplitude noise

Inter-Symbol Interference
• A signal interfering with itself
• Ideally, a transmission system is time invariant
  – No history of previous bits
• In reality, the state of the system is affected by previous bits
  – Signals that don’t reach the rails by the end of the cycle (the signal’s transition time is limited by channel bandwidth)
  – Reflections on the transmission lines
  – Magnitude and phase of excited resonances

ISI - Dispersion
• Frequency-dependent attenuation
• In general, the channel is low pass
• Our nice short pulse gets spread out
• Example: a 101 pattern

ISI - Reflection
• Reflections of previous bits travel up and down transmission lines
• A mismatch of δ gives (to the first order) a reflection of ρ ≈ δ/2. Taking $Z_t = Z_0(1+\delta)$:
  $\rho = \frac{Z_t - Z_0}{Z_t + Z_0} = \frac{Z_t/Z_0 - 1}{Z_t/Z_0 + 1} = \frac{\delta}{2+\delta} \approx \frac{\delta}{2}$

ISI - Resonances
• Oscillations are excited by signal transitions and may interfere with later transitions
• Excitation of resonant circuits is reduced with longer transition times
  – A slower edge has less high-frequency spectral content
• Resistance damps oscillation

Crosstalk
• Crosstalk is the coupling of energy from one line to another via:
  – Mutual capacitance (electric field)
  – Mutual inductance (magnetic field)
• One signal interfering with another signal
[Figure: two coupled lines with mutual inductance Lm and mutual capacitance Cm; source impedance Zs at the near end, Zo terminations at the near and far ends]

Crosstalk Induced Noise
• The mutual inductance will induce current on the victim line opposite to the driving current (Lenz’s Law)
• The mutual capacitance will pass current through the mutual capacitance that flows in both directions on the victim line
• Coupled quantities: $I_{Cm} = C_m \frac{dV}{dt}$, $V_{Lm} = L_m \frac{dI}{dt}$
• Near and far end currents: $I_{near} = I_{Cm} + I_{Lm}$, $I_{far} = I_{Cm} - I_{Lm}$

Voltage Profile of Coupled Noise
• Near-end crosstalk is always positive
  – Currents from Lm and Cm always add and flow into the node
• For PCBs, the far-end crosstalk is “usually” negative
  – Current due to Lm is larger than current due to Cm
  – Note that far-end crosstalk can be positive
[Figure: driven line and un-driven “victim” line; driver and Zs at the near end, Zo terminations, far end]

Power Supply Noise
• The power supply network has parasitic elements
  – On-chip: resistive
  – Off-chip: inductive
• Current draw across these elements induces a noise voltage: $v(t) = R\,i(t) + L\,\frac{d\,i(t)}{dt}$
• Instantaneous current is what matters
  – May be many times the DC current
  – A 10W chip draws 4A at 2.5V
  – Peak current may be 10-20A

Simultaneous Switching Outputs (SSO)
• When several outputs switch simultaneously, significant current is drawn from the supply or sent into ground
• Supply connections have inductance
• SSO currents produce a voltage drop across these inductances
• On-chip, the VDD to VSS voltage difference decreases
  – Effect grows with the number of drivers switching
  – Quadratic with the inverse of transition time
• Between chips, the drops across VSS inductances can affect driver timing and shift the receiver threshold

Other Noise Sources
• Alpha particles
  – A 5MeV particle injects 730fC of charge into the substrate
  – One node typically collects less than 50fC
• Thermal and shot noise
  – Proportional to bandwidth
  – Typically in the µV range
• Parameter mismatch
  – VT and β have deviation proportional to 1/sqrt(WL)
  – Systematic variations depend on layout

Eye Diagram
• This is a “1”; this is a “0”
• Eye: the space between 1 and 0
• Shown with voltage noise, with timing noise, and with both
[Figure: eye diagrams for each case]

Eye Diagram (cont’d)
• Standard measure for signaling
  – Synchronized superposition of all possible realizations of the signal viewed within a particular interval
• Timing jitter
  – Deviation of the zero-crossing from its ideal occurrence time
• Amplitude noise
  – Set by signal-to-noise ratio (SNR)
  – The amount of noise at the sampling time
[Figure: TX, channel, RX]

Outline
• Link Design Basics
• Signal Integrity
• High Speed Signaling Architectures
• Equalization
• Post-Silicon Tuning of High-Speed Signaling

Signaling – Main Idea
• A good signaling system isolates the signal from noise rather than trying to overpower the noise
  – Crosstalk: terminate both ends, use homogeneous media
  – ISI: matched terminations, no resonators, rise-time control
  – Power supply noise: avoid coupling into signal or reference
• Differential signaling
• Current mode
• Stable reference

Architecture of Signaling
[Figure: block diagram with multiplexer, driver, channel, amp, and demultiplexer; Tx clock and Rx clock; data pattern ...1001... at the input and output]

Signaling Architecture Tradeoffs
• Signal modulation
  – PAM (pulse-amplitude modulation)
  – Pulsed (return-to-zero, RZ) signaling
  – Binary (e.g., NRZ) or multiple-level signaling (MLS)
• Uni-directional or bidirectional
  – Time-multiplexed bidirectional or simultaneous bidir.
• Single-ended or differential
• Current mode or voltage mode
• Bus or single-trace
• Point-to-point or multi-drop

Example System - Trade-offs
• Transmitter choices: output impedance, bipolar/unipolar driver, amplitude, rise time, single-ended/differential
• Line considerations: length, impedance, attenuation, discontinuities
• Termination choices: source, destination, both, underterminated
• Reference considerations: VDD-div, Tx, Rx, diff
• Receiver choices: offset, sensitivity, BW

Voltage Mode vs Current Mode
• Main differences are
  – Ease of control and generation: it is much easier to generate a small current than a small voltage
  – Coupling of supply noise: 50% of supply noise shows up on the data line in the matched voltage mode; potentially much less in a high-Z current-mode driver
  – Generation of high-Z switches is easier than controlled-Z switches

Single-ended vs Differential
• Single-ended signaling
  – Compares to a shared reference
  – Often used with a bus
  – Issues: generates SSO noise; how to make the reference; how to quiet the reference; crosstalk cannot be made common-mode
• Differential signaling
  – Compares between two lines
  – Noise immunity: many noise sources become common mode
  – Issues: differential must run > 2x as fast as single-ended to make sense; otherwise, power x2, pins x2

Binary vs Multiple-level (4-PAM)
• Binary (NRZ) is 2-PAM
  – Uses 2 levels to send one bit per symbol
• 4-PAM uses 4 levels to send 2 bits per symbol
  – Each level has a 2-bit value

When Does 4-PAM Make Sense?
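
A rough way to approach the 4-PAM question is to compare what the scheme gains (two bits per symbol, so half the symbol rate) against what it costs (smaller eyes for the same swing). The sketch below is illustrative only: the Gray-coded level mapping and the names `pam4_encode` and `eye_height_penalty_db` are assumptions, not taken from the slides.

```python
import math

# Assumed Gray-coded mapping of bit pairs onto four equally spaced levels in [-1, 1].
# The slides only state that each level carries a 2-bit value.
GRAY_LEVELS = {(0, 0): -1.0, (0, 1): -1 / 3, (1, 1): 1 / 3, (1, 0): 1.0}


def pam4_encode(bits):
    """Map a flat bit sequence (even length) to 4-PAM symbol levels."""
    if len(bits) % 2:
        raise ValueError("4-PAM needs an even number of bits")
    return [GRAY_LEVELS[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


def eye_height_penalty_db(levels):
    """Eye-height loss versus binary signaling when a fixed swing is split into `levels` levels."""
    return 20 * math.log10(levels - 1)


symbols = pam4_encode([0, 0, 0, 1, 1, 1, 1, 0])
print(symbols)
print(f"{eye_height_penalty_db(4):.2f} dB")
```

With the same peak swing, each of the three 4-PAM eyes is one third the height of the binary eye, roughly 9.5 dB. A common rule of thumb (not stated in the slides) is that 4-PAM pays off when halving the symbol rate saves more channel loss than that penalty.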
Simultaneous Bidirectional Signaling
• Wires can transmit waves in both directions
• It seems a shame to only use one direction at a time

Simultaneous Bidirectional Signaling (cont’d)
• Transmit waves in both directions at the same time
• Waveform on the wire is the superposition of the forward and reverse traveling waves
• Subtract the transmitted wave at each end to recover the received wave
• There are 3 levels on the line but it’s still 2-level signaling
• Much more sensitive to reflections and crosstalk

Outline
• Link Design Basics
• Signal Integrity
• High Speed Signaling Architectures
• Equalization
• Post-Silicon Tuning of High-Speed Signaling

Equalization
[Figure: channel followed by equalizer]
• Channel is band-limited; most channels are low-pass
• Goal is to flatten the overall response
• Equalization: boost higher frequencies relative to lower frequencies
• Can be done at Tx or Rx or both

Receiver Linear Equalizer
• Amplifies high frequencies attenuated by the channel
• Pre-decision
• Digital or analog FIR filter
• Issues
  – Also amplifies noise!
  – Precision
  – Tuning delays (if analog)
  – Setting coefficients (adaptive filter): adaptive algorithms such as LMS

Transmitter Linear Equalizer
• Tx pre-emphasis filter
• Attenuates low frequencies
  – Need to be careful about output amplitude: limited output power
  – If you could make bigger swings, you would
  – EQ really attenuates low frequencies to match high frequencies
• Also an FIR filter: D/A converter
• Can get better precision than Rx
• Issues
  – How to set EQ weights?
  – Doesn’t help loss at high f

Tx Linear EQ: Single Bit Response

Outline
• Link Design Basics
• Signal Integrity
• High Speed Signaling Architectures
• Equalization
• Post-Silicon Tuning of High-Speed Signaling

Process Variation vs Analog Circuits
• Threshold voltage variation is increasingly dominant and is primarily random [ITRS]
  – Due to increasing and random doping fluctuation
• Corner-based design is not effective for the matching used widely in analog circuits
  – Often results in over-sized circuits and excessive area/power

Post-Silicon Tuning is Effective [Li:ICCAD’08]
• Post-silicon tuning is effective to compensate random process variation
• Digitally tunable circuits are commonly adopted
  – Insensitivity to noise and variation
  – Suitable for process migration

Post-Silicon Tuning of High-Speed Signaling
• Algorithm Framework
  – Problem formulation
  – Branch and bound based algorithm
• Case Study I: Transmitter
• Case Study II: PLL
• Conclusions

Unit Cell Based Design Methodology
• Pre-characterize different types of unit cell, e.g., a transistor with a given threshold voltage and unit W/L.
• A transistor of larger W/L can be synthesized by connecting unit cells of the same type in parallel
• Design variables simply become
  – Type of unit cell α (threshold)
  – Number of unit cells in parallel (sizing)
  – Constraints such as output swing are satisfied for correct operation
• Applies to other circuit elements such as unit capacitance and resistance
• Makes design better and modeling more accurate

Digitally Tunable Circuits
[Figure: analog output versus digital input (bits D[0], D[1], D[2]) with and without variation, with the LSB size marked]
• One tap in a pre-emphasis filter: the current source can be implemented by a current-division DAC
• Current-division DAC is commonly used to combat process variation
• Two tuning parameters
  – LSB size (γ): minimum step during digital-to-analog conversion
  – Resolution (β): number of bits used

Impact of Post-Silicon Tuning
[Figure: (a) without tuning, (b) with tuning]
• Example: BER for a high-speed link
  – 4-tap pre-emphasis filter in a transmitter
  – 0% (3σ) variation in Vt
• Design-time optimization and the post-silicon tuning circuit both need area, so joint optimization is a must

Joint Optimization
[Formulation not recovered; its annotations are listed below]
• Objective: parametric yield
• Power constraint: process variation changes power
• Area constraint: process variation does not change layout area
• Bound on design parameters
• Bound on the total number of unit cell types
• Bound on the LSB size and resolution

Optimization Challenges
[Figure: 3000 Monte Carlo runs over different unit cell design α, resolution β, and LSB size γ for one tap of the FIR]
• Discrete problem with non-convex objective and constraints
• Solution space surface is rough and many local maxima exist
• Significant improvement can be expected

Overall Algorithm
• Algorithm framework:
  – Partition the solution space by LSB size (γ) and unit cell type (α)
  – Develop a bound on the parametric yield
  – Discard (fathom) a region if its bound is worse than the current best solution
  – Use the gradient ascent method to find the local maxima: sequentially take steps in the direction proportional to the gradient.
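
The framework above can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: `toy_yield` and `toy_bound` are hypothetical stand-ins for the Monte Carlo yield and its relaxed upper bound, the integer variables (resolution bits and cell count) are invented for the example, and the local search is a greedy ±1 step on each variable in turn.

```python
def coordinate_ascent(objective, start, lower, upper):
    """Greedy integer ascent: try +/-1 on each variable and take the best improving step."""
    x = list(start)
    best = objective(x)
    while True:
        best_x, best_val = None, best
        for i in range(len(x)):
            for step in (-1, 1):
                cand = list(x)
                cand[i] += step
                if not (lower[i] <= cand[i] <= upper[i]):
                    continue  # stay inside the region's bounds
                val = objective(cand)
                if val > best_val:
                    best_x, best_val = cand, val
        if best_x is None:  # no neighbor improves: local maximum or boundary
            return x, best
        x, best = best_x, best_val


def branch_and_bound(regions, upper_bound, local_search):
    """Visit regions by optimistic bound; fathom those that cannot beat the incumbent."""
    best_val, best_sol, searched = float("-inf"), None, 0
    for region in sorted(regions, key=upper_bound, reverse=True):
        if upper_bound(region) <= best_val:
            continue  # fathomed by the upper-bound check
        sol, val = local_search(region)
        searched += 1
        if val > best_val:
            best_val, best_sol = val, (region, sol)
    return best_sol, best_val, searched


# Hypothetical toy model: a region is (gamma, alpha); x = [resolution_bits, n_cells].
def toy_yield(region, x):
    gamma, alpha = region
    loss_x = 0.02 * (x[0] - 4) ** 2 + 0.01 * (x[1] - 6) ** 2
    return 1.0 - loss_x - 0.1 * abs(gamma - 2) - 0.05 * alpha


def toy_bound(region):
    gamma, alpha = region
    return 1.0 - 0.1 * abs(gamma - 2) - 0.05 * alpha  # yield with the x-dependent loss dropped


regions = [(g, a) for g in (1, 2, 3) for a in (0, 1)]
solution, value, searched = branch_and_bound(
    regions,
    toy_bound,
    lambda r: coordinate_ascent(lambda x: toy_yield(r, x), [1, 1], [1, 1], [8, 10]),
)
print(solution, value, searched)
```

Sorting regions by their bound is a design choice of this sketch: a strong incumbent found early lets the upper-bound check discard most of the remaining regions without running the local search.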
Overall Algorithm (cont’d)
• Bound estimation
  – Remove the area and power constraints
  – Use the LMS algorithm to find the optimal yield value
[Figure: solution space partitioned by unit cell type (αi, αj, αk) and LSB size (γi, γj); regions are marked “All infeasible” or “Pruned by upper bound check”]

Gradient Ascent Method
• In each un-pruned region, sequentially take steps in the direction proportional to the gradient, until a local maximum of the objective function is reached.
• At each step, increase/decrease each variable by 1 in turn and check the change of the objective function. Always take the change (direction) that causes the maximum increase.
• Termination of the algorithm indicates that one of the local maxima has been reached or that we have reached the boundary.
• The initial guess for the gradient ascent can be arbitrarily chosen. In our experiments, we find that it did not influence runtime or quality significantly.
• We also observed that the algorithm always converges to a local optimum within two or three iterations.

Post-Silicon Tuning of High-Speed Signaling
• Algorithm Framework
• Case Study 1: Transmitter
  – Knobs for design-time and post-silicon
  – Modeling and formulation
  – Experimental results
• Case Study 2: PLL
• Conclusions

Knobs for Optimization
[Figure: link block diagram with transmitter (pre-driver, FIR pre-emphasis filter/driver), channel, receiver (pre-amplifier, slicer, CDR, data out), and terminations RD; N-tap FIR filter with coefficients a0, a1, ..., an and tap currents IC0, IC1, ..., ICn]
• Given transmission channel → filter coefficient → transistor size
• Change channel behavior ← parasitic capacitance

Knobs for Optimization (cont’d)
[Figure: the same link diagram with filter coefficients under variation, plus the analog output versus digital input curve (bits D[0], D[1], D[2]) with and without variation and the LSB size marked]

Problem Formulation
• For the transmitter: [objective and random variable definitions not recovered]

BER Distribution Comparison
• 20% (3σ) variation in Vth with 10000 Monte Carlo runs
• Design 1: without tuning circuit
  – All resources are used for the filter
  – Unavoidable large variation
• Design 2: one-tap filter
  – All resources are used for the DAC
  – Has extremely small variance but suffers severe ISI
• Design 3: heuristic design
  – Assume 4-tap filter
  – Assume LSB size is equal for each tap
  – Limit the solution space
  – Good improvement compared to the two extreme cases
• Design 4: our algorithm
  – Provides better solution (mean,

Yield Rate
• Experiment setting
  – Channel: 30cm differential microstrip line with FR-4 substrate
  – 5GHz data rate
  – Yield is set by BER=1e-15 (estimated by EVM)
• Yield comparison for different area constraints
  – Our algorithm always provides better yield than the design heuristic
  – With an aggressive area constraint, our algorithm has much less yield degradation
  – Saturation effect
  – Up to 47% improvement
[Figure: yield versus area]

Yield with Power Constraint
[Figure: axes labeled power and Vt variation]

Post-Silicon Tuning of High-Speed Signaling
• Algorithm Framework
• Case Study 1: Transmitter
• Case Study 2: PLL Design
• Conclusions

Jitter Modeling
• PLL output clock jitter [expression not recovered]
• Hnin and HnVCO are the noise transfer functions of the reference clock noise and the VCO noise

Tunable PLL
• Jitter can be changed by tuning the charge pump current ratio [Mansuri:JSSC’02]
[Figure: PLL block diagram with CLKref, PFD (KPFD), UP/DN, charge pumps CP1 (ICP) and CP2 (η×ICP), CCP, op-amp (1/gmOP), VINT, VCtrl, VCO (KVCO/s), CLKVCO; noise inputs ФnIN and ФnVCO; output ФOUT]

Joint Optimization
• Design-time optimization
  – Two charge pumps Icp1, Icp2
  – Ratio (Icp1/Icp2) determines output RMS jitter
  – Optimal ratio can be found using design-time optimization
• Again, process variation would cause performance degradation
• Digitally tuned current mirror [Horowitz:JSSC’00]
  – Small reference current: consumes less power; η needs to be far less than unity; limited tuning resolution
  – Large reference current: good tunability; power and area penalty

Same Formulation Applies
• For the PLL, the objective function becomes [expression not recovered], and area can be computed in a way similar to the transmitter case.
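
The parametric yield that both case studies maximize can be estimated by Monte Carlo sampling, as the slides do. The sketch below is a minimal illustration with a toy performance model, not the authors' BER or jitter model: `residual_after_tuning`, `estimate_yield`, the tolerance, and the signed DAC code range are all assumptions made for the example.

```python
import random


def residual_after_tuning(error, lsb, bits):
    """Cancel as much of `error` as a signed DAC with step `lsb` and `bits` of resolution allows."""
    max_code = 2 ** (bits - 1)  # assumed code range: -max_code .. +max_code
    code = max(-max_code, min(max_code, round(error / lsb)))
    return error - code * lsb


def estimate_yield(samples, tolerance, lsb=None, bits=0):
    """Fraction of sampled chips whose (optionally tuned) error meets the spec."""
    passed = 0
    for error in samples:
        if lsb is not None:
            error = residual_after_tuning(error, lsb, bits)
        if abs(error) <= tolerance:
            passed += 1
    return passed / len(samples)


rng = random.Random(0)   # fixed seed so runs are repeatable
sigma = 0.20 / 3         # 20% (3-sigma) variation, as in the transmitter study
samples = [rng.gauss(0.0, sigma) for _ in range(10000)]

untuned = estimate_yield(samples, tolerance=0.02)
tuned = estimate_yield(samples, tolerance=0.02, lsb=0.02, bits=3)
print(f"yield without tuning: {untuned:.3f}, with tuning: {tuned:.3f}")
```

Both estimates reuse the same samples, so the difference between them reflects the tuning range and step size rather than sampling noise. A larger LSB widens the correctable range but leaves a coarser residual, which is the trade-off the joint optimization explores.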
Experimental Results
• PLL with digitally controlled charge pump current
• Yield is defined by output clock RMS jitter
• Design heuristic using minimized biasing current
• Consider 30% Vth variation
• Improve the yield by up to 56%

Conclusions
• Formulate a joint optimization problem for digitally tuned analog circuits
  – Consider both design-time optimization and post-silicon tuning
  – Maximize performance yield s.t. power and area constraints
• Propose a general optimization framework
  – Combine branch-and-bound and gradient-ascent algorithms
  – Effectively find the global optimum
• Two joint optimization design examples for high-speed serial link
  – Transmitter design
  – PLL design
• Experiments show great (>47%) yield improvement over common circuit design heuristic