Download Advanced VLSI Design - Washington State University

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Electrical substation wikipedia , lookup

Spectral density wikipedia , lookup

Wireless power transfer wikipedia , lookup

History of electric power transmission wikipedia , lookup

Buck converter wikipedia , lookup

Voltage optimisation wikipedia , lookup

Audio power wikipedia , lookup

Power over Ethernet wikipedia , lookup

Electric power system wikipedia , lookup

Electrification wikipedia , lookup

Immunity-aware programming wikipedia , lookup

Pulse-width modulation wikipedia , lookup

Distribution management system wikipedia , lookup

Switched-mode power supply wikipedia , lookup

Atomic clock wikipedia , lookup

Rectiverter wikipedia , lookup

Electrical grid wikipedia , lookup

Alternating current wikipedia , lookup

Power engineering wikipedia , lookup

Mains electricity wikipedia , lookup

Islanding wikipedia , lookup

Flip-flop (electronics) wikipedia , lookup

Time-to-digital converter wikipedia , lookup

Transcript
EE 587
SoC Design & Test
Partha Pande
School of EECS
Washington State University
[email protected]
1
SoC Physical Design Issues
Power and Clock Distribution
2
Overview
Reading
HJS - Chapter 11 Power Grid and Clock Design
For background information refer to chapter 5 of the text book
3
Purpose of Power Distribution
• Goal of power distribution system is to deliver the required
current across the chip while maintaining the voltage levels
necessary for proper operation of logic circuits
• Must route both power and ground to all gates
• Design Challenges:
– How many power and ground pins should we allocate?
– Which layers of metal should be used to route
power/ground?
– How wide should be make the wire to minimize voltage
drops and reliability problems
– How do we maintain VDD and Gnd within noise budget?
– How do we verify overall power distribution system?
4
Design Example
•
Target Impedance of the power grid
Z
VDD  ( FractionalNoiseBudge t )
I DD
• For a supply voltage of 1.2 V and a supply current of 100 A,
with a 10 % noise budget the power grid impedance can be
1.2 mΩ
5
Power Distribution Issues - IR Drop
< Vdd
Vdd
•
•
n4
n3
< Vdd
n2
•
n1
n8
n5
n6
n7
•
•
Narrow line widths
increase metal line
resistance
As current flows through
power grid, voltage drops
occur
Actual voltage supplied to
transistors is less than
Vdd
Impacts speed and
functionality
Need to choose wire
widths to handle current
demands of each
segment
6
Block Interaction yields IR Drop
Plots courtesy of Simplex Solutions, Inc.
7
Power Grid Issues – Static IR Drop
• Block placement and
global power routing
determines IR drop
on the chip
• Possible solutions
– Rearrange blocks
– More Vdd pins
– Connect bottom
portion of grid to
top portion
Plot courtesy of Simplex Solutions, Inc.
8
Power Grid Issues – Static IR Drop
• If we connect bottom
portion of grid to top
portion, the IR drop is
reduced significantly
• However, this is only
one part of the
problem
• We must also
examine
electromigration
Plot courtesy of Simplex Solutions, Inc.
9
Power Grid Issues - Electromigration
• As current flows down
narrow wires, metal begins
to migrate
• Metal lines break over time
due to metal fatigue
• Based on average/peak
current density
• Need to widen wires
enough to avoid this
phenomenon
n4
n3
n2
n1
n8
n5
n6
n7
10
Case Study – IR and EM Tradeoff
11
Ldi/dt Effects in the Power Supply
•
•
•
In addition to IR drop, power system inductance is also an issue
Inductance may be due to power pin, power bump or power grid
Overall voltage drop is:
Vdrop = IR + Ldi/dt
•
Distribute decoupling capacitors (decaps) liberally throughout design
– Capacitors store up charge
– Can provide instantaneous source of current for switching
12
On-chip Decoupling Caps
•
•
•
On-chip decaps help to stabilize the power grid voltage
First line of defense against noise which can extend beyond 10GHz
Simple Example:
•
•
– Drop across inductors = 2 x L x di/dt = 2 x 0.2nH x 20mA/100ps =
80mV (problematic if supply is 1.2V)
Actual power pad or bump may need to support thousands of inverters
Use capacitors to supply instantaneous charge to inverters
13
Making a Decoupling Cap
•
•
•
•
Decaps are basically NMOS transistors. Top plate is polysilicon,
bottom-plate is inverted channel, insulator is gate oxide.
Connect poly to Vdd and source/drain to Vss
Low-frequency capacitance is roughly COX W L.
Since these are large capacitance to be used at high frequencies, more
accurate representation is needed
14
How much Decoupling Cap?
•
To estimate required decap value, run SPICE on patch of chip area
with power grid, part of logic block, and sprinkle of decaps
•
Amount of decap depends on:
– Acceptable ripple on Vdd-Vss (typically 10% noise budget)
– Switching activity of logic circuits (usually need 10X switched cap)
– Current provided by power grid (di/dt)
– Required frequency response (high frequency operation)
– How much decap exists ( non-switching diffusion, gate, wire caps)
15
Simulation with Decoupling Caps
• Plot center region of grid
• Local Vdd-Vss in worst-case
location in patch (center)
• First dip can be dealt with by a
low-frequency on-chip voltage
regulator and low-frequency
decaps
• Steady-state ripple is
controlled by high-frequency
decoupling caps
• Adjust location of decaps until
the ripple is within noise
budget
16
Designing Power Distribution
• Floorplanner should be aware of IR+Ldi/dt drop and EM
problems and design accordingly
– Requires knowledge of current distributions and voltage drop
constraints of blocks being placed
• Provide adequate number of VDD and Gnd pins
• Route power distribution system according to current demands
of the blocks
• Widen wires based on expected current density in branches
• Distribute decoupling capacitors liberally throughout design
• Verify full chip with IR/EM tools
17
Reducing the Effects of IR drop and Ldi/dt
• Stagger the firing of buffers (bad idea: increases skew)
• Use different power grid tap points for clock buffers (but it makes
routing more complicated for automated tools)
• Use smaller buffers (but it degrades edge rates/increases delay)
• Make power busses wider (requires area but should do it)
• Use more Vdd/Vss pins; adjust locations of Vdd/Vss pins
• Put in power straps where needed to deliver current
• Place decoupling capacitors wherever there is free space
• Integrate decoupling capacitors into buffer cells
These caps act
as decoupling
caps when they
are not
switching
18
Power Routing Examples
Block A
Single Trunk
Block B
Block A
Block B
Multiple Trunks
19
Simple Routing Examples
Block A
Block B
Double-Ended Connections
Block A
Block B
Wider Trunks
20
Interleaved Power/Ground Routing
Interleaved Vdd/Vss
21
Flip-Flop and Clock Design
• Flip-flops and latches are used to gate signals in sequential
logic designs.
• The critical parameters of setup and hold times
• Clock design is also a complex issue in DSM due to RC delay
components in the interconnect and the power dissipation.
• We will look at clock trees, H-trees and clock grids.
• An overall examination of the issues of clocks skew, IR drop and
signal integrity, and how to manage them using circuit
techniques.
• Let’s start by revisiting the expected limits of clock speed
22
Clocked D Flip-flop
• Very useful FF
• Widely used in IC design for temporary storage of data
• May be edge-triggered (Flip-flop) or level-sensitive (transparent
latch)
data
CK
D
Q
Clk Q
output
D
Qn+1
0
1
0
1
23
Latch vs. Flip-flop
Latch (level-sensitive, transparent)
When the clock is high it passes In value to Out
When the clock is low, it holds value that In had when the clock fell
Flip-Flop (edge-triggered, non transparent)
On the rising edge of clock (pos-edge trig), it transfers the value of In to Out
It holds the value at all other times.
In
Out
In
In
In
Clk
Out
Out
CLK
Latch
Out
Clk
CLK
Flip-Flop
24
Alternative View
Clocks serve to slow down signals that are too fast
• Flip-flops / latches act as barriers
Latch
Flip-Flop
DQ
DQ
Clk
Ld
Clk
Clk
Soft Barrier
Hard Barrier
Clk 0->1
Clk=1
DQ
DQ
Ld
Ld
Clk
– With a latch, a signal can propagate through while the clock is high
– With a Flip-flop, the signal only propagates through on the rising edge
Flip-flops consist of two latch like elements (master and slave latch)
25
Clocking Overhead
FF and Latches have setup and hold times that must be satisfied:
Flip Flop
won’t work
will work
Latch
may work
Din
Din
Tsetup
Clk
Thold
Qout
Clk
Thold
Qout
Tsetup + Tclk-q
Td-q
If Din arrives before setup time and is stable after the hold time, FF will work; if Din
arrives after hold time, it will fail; in between, it may or may not work; FF delays the
slowest signal by the setup + clk-q delay in the worst case
Latch has small setup and hold times; but it delays the late arriving signals by Td-q
26
Clock Skew
• Not all clocks arrive at the same time, i.e., they may be skewed.
• SKEW = mismatch in the delays between arrival times of clock edges at FF’s
SKEW causes two problems:
Tclk-q
Fix critical path
Logic
Td
Tcycle = Td +Tsetup + Tclk-q + Tskew
Shows up as a SETUP time violation
Late
Flop
Shows up as a HOLD time violation
Flop
when Tskew + Thold > Tclk-q
Early
Td=0
• The part can get the wrong answer
Flop
Flop
• The cycle time gets longer by the skew
Tsetup
Insert buffer
Delay elements
Early
Late
27
Overhead for a Clock
• CMOS FO4 delay is roughly 425ps/um x Leff
• For 0.13um, FO4 delay ≈ 40 - 50ps
– For a 1GHz clock, this allows < 20 FO4 gate delays/cycle
• Clock overhead (including margins for setup/hold)
– 2 FF/Latches cost about 2-3 FO4 delays
– skew costs approximately 2-3 FO4 delays
• Overhead of clock is roughly 4-6 FO4 delays
• 14-16 FO4 delays left to work with for logic
• Need to reduce skew and FF cost
Tcycle
Skew Tclk-q
Tlogic
CLOCK
28
Signal Integrity Issues at FF’s
• What happens if a glitch occurs in a clock signal?
Positive-Edge Triggered Flip-Flop
DQ
Clk
Clk
• Flip-flop captures and propagates incorrect data
• Could view any signal that, if glitched, could cause a logic upset
as a “clock” signal
• Need to space out clocks/signals or shield them
29
Signal Integrity Issues at FF’s
• What happens if a glitch occurs in data signal?
Positive-Edge Triggered Flip-Flop
DQ
Clk
Clk
• Flip-flop captures and propagates incorrect data
• Need to insure that data signal is stable during FF setup time
• Shielding with stable signals or spacing is needed
30
Sources of Clock Skew
Main sources:
1. Imbalance between different paths from clock source to FF’s
– interconnect length determines RC delays
– capacitive coupling effects cause delay variations
– buffer sizing
– number of loads driven
2. Process variations across die
– interconnect and devices have different statistical variations
Secondary Sources:
1. IR drop in power supply
2. Ldi/dt drop in supply
31
IR Drop Impacts on Clock Skew
Ideal Vdd
- Low delay
- Low skew
Delay (latency)
Skew
Conservative Vdd
- High delay
- Low skew
Actual IR drop impact
- delay about 5-15% larger
- skew about 25-30% larger
32
Power dissipation in Clocks
• Significant power dissipation can occur in clocks in highperformance designs:
•
•
•
•
•
clock switches on every cycle so P= CV2f (i.e., a=1)
clock capacitance can be ~nF range, say 1nF = 1000pF
assuming a power supply of 1.8V, CV = 1800pC of charge
if clock switches every 2ns (500MHz), that’s 0.9A
for VDD = 1.8V, P=IV=0.9(1.8)=1.6W in the clock circuit alone
• Much of the power (and the skew) occurs in the final drivers due
to the sizing up of buffers to drive the flip-flops
• Key to reducing the power is to examine equation CV2f and
reduce the terms wherever possible
– VDD is usually given to us; would not want to reduce swing
due to coupling noise, etc.
– Look more closely at C and f
33
Reducing Power in Clocking
• Gated Clocks:
– can gate clock signals through AND gate before applying to
flip-flop; this is more of a total chip power savings
– all clock trees should have the same type of gating whether
they are used or not, and at the same level - total balance
• Reduce overall capacitance (again, shielding vs. spacing)
shield
clock
shield
(a) higher total cap./less area
Signal 1
clock
Signal 2
(b) lower cap./ more area
– Tradeoff between the two approaches due to coupling noise
– approach (a) is better for inductive noise; (b) is better for
capacitive noise
34
Clock Design Objectives
• Now that we understand the role of the clock and some of the
key issues, how do we design it?
– Minimize the clock skew (in presence of IR drop)
– Minimize the clock delay (latency)
– Minimize the clock power (and area)
– Maximize noise immunity (due to coupling effects)
– Maximize the clock reliability (signal EM)
• Problems that we will have to deal with
– Routing the clock to all flip-flops on the chip
– Driving unbalanced loading, which will not be known until
the chip is nearly completed
– On-chip process/temperature variations
35
Clock Design
Tree
• Minimal area cost
• Requires clock-tree
management
• Use a large superbuffer to
drive downstream buffers
• Balancing may be an
issue
Multi-stage clock tree
Secondary clock drivers
Main clock
driver
36
Clock Configurations
H-Tree
• Place clock root at
center of chip and
distribute as an H
structure to all areas of
the chip
• Clock is delayed by an
equal amount to every
section of the chip
• Local skew inside blocks
is kept within tolerable
limits
37
Clock Configurations
Grid





Greater area cost
Easier skew control
Increased power
consumption
Electromigration risk
increased at drivers
Severely restricts
floorplan and routing
38
Clock Design and Verification
• Many design styles
– Low-speed designs: regular signals, symmetric tree
– Medium-speed designs: balanced H-tree
– High-speed designs
• Balanced buffered H-tree
• Grid
• time-domain variation of a given clock signal due to
random noise, IR drop, temperature, etc.
Normal Distribution
Jitter
1
0.8
0.6
Prob
• Clock verification is more complex in DSM
– RC Interconnect delays
– Signal integrity (capacitive coupling, inductance)
– IR drop
– Signal Electromigration
– Clock Jitter
Series1
0.4
0.2
0
-3
-2
-1
0
1
2
Clock
edge
sigma
39
3
Clock Design Today
•
•
•
•
•
Old methodology
Advanced Clock Verification
Route clock
Route rest of nets
Extract clock parasitics
Perform timing verification
Balance clock by “snaking”
route in reserved areas







IR Drop and Ldi/dt effects
Coupling capacitance
Electromigration checks
Full-chip skew/slew
analysis
Jitter analysis
Inductance Effects
Process variations
40
Good Practices in Clock Design
• Try to achieve the lowest Latency (Super Buffer/H-tree)
• Control transition times (keep edge rates sharp)
• Use 1 type of clock buffer for good matching (except perhaps in
the last leg where you need to have adjustable buffers)
• Have min/max line lengths for good matching
• Determine whether spacing or shielding provides better tradeoff
• Use integral decoupling in buffers to reduce IR and Ldi/dt
41