Download Creating your design: Timing (part2)

Document related concepts

Buck converter wikipedia , lookup

Transmission line loudspeaker wikipedia , lookup

Switched-mode power supply wikipedia , lookup

Opto-isolator wikipedia , lookup

Immunity-aware programming wikipedia , lookup

Field-programmable gate array wikipedia , lookup

Rectiverter wikipedia , lookup

Flip-flop (electronics) wikipedia , lookup

Time-to-digital converter wikipedia , lookup

Transcript
System Design: Timing
ET 4054
12-1-2016
Delft
University of
Technology
Challenge the future
Acknowledgement
• All slides used from
• Michel Berkelaar, ex-CAS Group
ASIC Design: Backend
2 | 104
Timing Paths
• Four types of paths:
Combinational logic
PI
Reg
Start point
Reg
End point
PO
ASIC Design: Backend
3 | 104
Digital Circuit – Sequential Path
Data Path
Signal
ASIC Design: Backend
4 | 104
Static Timing Analysis (STA)
Data Path
Signal
Stage by stage delay calculation
Limited signal information is stored after each stage
ASIC Design: Backend
5 | 104
From Signal to Delay and Slew
tdelay = tout - tin
Slew = |t2 – t1 |
Limited signal information is stored after each stage
ASIC Design: Backend
6 | 104
Digital Circuit – Delay
Variation in
Process
Voltage
Temperature
(PVT)
ASIC Design: Backend
7 | 104
Digital Circuit – Delay Variation
Variation in
Process
Voltage
Temperature
(PVT)
ASIC Design: Backend
8 | 104
Delay Variation
Delay is not deterministic
Its distributions is PVT
dependent
So, there is a need of to
handle this. Methods:
Statistical STA
Corner-based
ASIC Design: Backend
9 | 104
Digital Circuit – Delay PDF
ASIC Design: Backend
10 | 104
Statistical Moments
•
Measure the appearance of a distribution
1.
Mean (µ) – Location
2.
Variance (σ2) – Spread
3.
Skewness (γ) – Symmetry
4.
Kurtosis (κ) – Flatness
Circuit Simulation Results
ASIC Design: Backend
11 | 104
Statistical Delay
50% Signal Crossing Time Difference
ASIC Design: Backend
12 | 104
Statistical Slew
10/90 or 20/80 or 30/70 % Signal Crossing Time Difference
ASIC Design: Backend
13 | 104
ASIC Design: Backend
14 | 104
ASIC Design: Backend
15 | 104
ASIC Design: Backend
16 | 104
Handling Variations
• Variability is huge! [-50%, +100%] is normal.
• Corners: provide best (µ−3σ), typical, worst (µ+3σ) case values
• With n varying parameters 3n corners (!)
• Simple calculations
• Pessimistic
• Statistical Analysis:
• Complex calculations (correlations!)
• Result hard to interpret
ASIC Design: Backend
17 | 104
Stochastic Process Variation in Deep-Submicron
CMOS: Circuits and Algorithms
Recommended reading
(available online via University Library and site of book)
ASIC Design: Backend
18 | 104
Deterministic timing analysis
Static Timing Analysis
Problem
Given a transistor level description of combinational circuit, find
arrival times at all gate outputs (pattern independent)
Solution
Clump transistors together into fundamental gates
(“channel-connected components”)
Propagate timing information (low and high polarities from
primary inputs (PI’s) to primary outputs (PO’s)
(“critical path method (CPM)”)
Overcome miscellaneous hurdles along the way
20
Channel-connected Components
A set of transistors interconnected by drain/source nodes
A
B
E
D
C
A
C
B
D
E
21
The Critical Path Method (CPM)
Respond to minor changes in event-driven manner
Propagate changes
by propagating events
a
2/1
2/1
3/2
8/5
b 4/2 4/2
1/1
c
d
1/3
3/5
3/1
3/1
4/2
4/2
2/2
7/6
7/11
incremental_analyze(g)
GATE g;
{
if (arr_time@output(g) changes) {
arr_time = newvalue;
for (i ∈ fanout(g))
incremental_analyze(i);
}
22
Questions
What simplifications are made here?
What are the implications?
23
False Paths (Briefly!)
Classic example
a
1
b
2
0 M
U
1 X
1
2
0 M
U
1 X
c
24
False paths II
No efficient algorithms exist to automatically flag false
paths
Knowledge like “this block will not run in two modes at
the same time” is often crucial to determine false paths
So: you need to specify them by hand…
25
Gate delay models
Build delay models for individual gates
In reality, Delay = f(widths, transition times, loads,…)
Similar idea used in standard cell characterization:
Delay = f (transition times, load)
Table lookup models: storage/accuracy tradeoff (e.g. .lib
format)
Fast circuit simulation – used in many delay calculators
Current source/voltage source models – model gates as
controlled sources and solve for delay using some iterative
methods
26
Interconnect modeling
Precise model requires transmission line analysis
dx
Break up wire into segments
Each segment can be modeled as
L-model
π-model
R(+sL)
R(+sL)
C
C/2
C/2
T-model
R/2(+sL/2)
R/2(+sL/2)
C
Other issues (crosstalk etc.) modeled using coupling caps.
27
Effective capacitance model
Includes the effects of gate nonlinearities
Gate driving RC interconnect
x
x
Determine waveform at gate output; analyze interconnect as a linear
system after that
Possible model for waveform at x
Gate driving total capacitance of net?
Gives erroneous results due to resistive shielding
Actual effective capacitance < total wiring capacitance
Techniques exist for determining Ceffective
R
C1 C2
28
RC Models
Delays can be calculated easily
For example: RC driven by a step excitation
R
V(t)
C
Response V(t) = ( 1 - e-t/RC )
Time constant = RC
Time constants for more complicated circuits?
29
Elmore Delay Computations
For an RC tree:
td = Σi on path Ri C downstream,i
R2
R1
1
C1
R4
2
C4
C2
R3
3
C3
4
R5
5
C5
td,4 = R1 (C1+C2+C3+C4+C5) + R2 (C2+C4+C5) + R4C4
30
Notes
• Elmore Delay inaccurate for the very resistive wires of modern
technologies!
• Crosstalk has significant impact as well in modern technologies,
as the wires are so close together.
ASIC Design: Backend
31 | 104
Logic Scaling
0
10
Power [W], P
10
10-6 J
-3
10-9 J
10
10-12 J
-6
10-15J
10
10-18 J
-9
10
-12
10
-15
Ptp ~ 1/S3
10
-12
10
-9
10
-6
10
Delay [s], tp
-3
10
0
Logic Scaling
Parameter
Relation
Constant Field Scaling: S = U
General Scaling
W, L, tox
1/S
VDD, VT
1/U
Cgate
Cox W L
1/S
Isat
Cox W V
1/U
Ron
V / Isat
1
Power / Device
Isat V
1/U2
Power Dissipation Sources
P ~ α ⋅ (CL + CCS )⋅Vswing ⋅VDD ⋅ f + (I DC + I Leak )⋅VDD
α – switching activity
CL – load capacitance
CCS – short-circuit
capacitance
Vswing – voltage swing
f – frequency
IDC – static current
Ileak – leakage current
C SC = k (a
τ in
+ b)
τ out
energy
P=
× rate + static power
operation
Interconnect Scaling
10
(Length)-2 [cm-2], L-2
10
10
10
10
10
10
10
-5
10
-4
8
10
6
-3
L-2
τ=
10-5 [s/cm-2]
(F = 0.1µ)
-7
10
4
10-9
10-11
2
-2
10
(1µ)
(10µ)
-1
10
(100µ)
10-13
10
(1000µ)
0
-0
10
-2
10
L -2 τ ~S 2
2
10
-4
-18
10
-15
10
-12
10
-9
10
Delay [s], τ
-6
10
-3
10
(Length)[cm], L
10
τ
ρε
2
=
r
c
=
∝
S
L2
ht
w
w
Idealized Wire Scaling Model
Parameter
Relation
Local Wire
Constant Length
Global Wire
W, H, t
1/S
1/S
1/S
L
1/S
1
1/SC
C
LW/t
1/S
1
1/SC
R
L/WH
S
S2
S2/SC
tp ~ CR
L2/Ht
1
S2
S2/SC2
E
CV2
1/SU2
1/U2
1/(SCU2)
Lower Bounds on Interconnect Energy
Shannon’s theorem on maximum capacity of
communication channel
PS
C ≤ B log 2(1 +
)
E = P / C kTB
bit
S
C: capacity in bits/sec
B: bandwidth
Ps: average signal power
Claude Shannon
Ebit (min) = Ebit (C / B → 0) = kT ln(2)
Valid for an “infinitely long” bit transition (C/B→0)
Equals 4.10-21J/bit at room temperature
Ex: 1V, 1mm wire: 200 fJ/bit
(CMOS90)
Reducing Interconnect Power/Energy
Same philosophy as with logic: reduce capacitance,
voltage (or voltage swing) and/or activity
A major difference: sending a bit(s) from one point to
another is fundamentally a communications
/networking problem, and it helps to consider it as
such.
Abstraction layers are different:
For computation: device, gate, logic, micro-architecture
For communication: wire, link, network, transport
Helps to organize along abstraction layers, well
understood in the networking world: the OSI protocol
stack
Crosstalk
• Increasing crosstalk effects
Coupling capacitance
w : width
t : thickness
h : height
s : space
Interconnect Structure and RC model
January 12, 2016
39
Motivations―Method―Results―Conclusion
The Impact of Crosstalk on Delay
• Increasing crosstalk effects
Input skew: arrival time difference
January 12, 2016
40
Motivations―Method―Results―Conclusion
Crosstalk and Variability
• Increasing crosstalk effects
January 12, 2016
41
Motivations―Method―Results―Conclusion
Dealing with Capacitive Cross Talk
•
•
•
•
•
•
Avoid floating nodes
Protect sensitive nodes
Differential signaling
Do not run wires together for a long distance
Use shielding wires
Use shielding layers
Shielding
Shielding
wire
GND
V DD
GND
Substrate (GND )
Shielding
layer
What About Clock Distribution?
• Clock easily the most energy-consuming signal
of a chip
• Largest length
• Largest fanout
• Most activity (α = 1)
• Skew control adding major overhead
• Intermediate clock repeaters
• De-skewing elements
• Opportunities
• Reduced swing
• Alternative clock distribution schemes
• Avoiding a global clock altogether
Interconnect
• Interconnect important component of overall power
dissipation
• Structured approach with exploration at different
abstraction layers most effective
• Lot to be learned from communications and
networking community – yet, techniques must be
applied judiciously
• Cost relationship between active and passive
components different
• Some exciting possibilities for the future: 3Dintegration, novel interconnect materials, optical or
wireless I/O
Timing Metrics Reminder
tc-q
tsu
thold
tplogic
tcd
T
46
:
:
:
:
:
delay from clock (edge) to Q
setup time
hold time
worst case propagation delay of logic
best case propagation delay
(contamination delay)
: clock period
T ≥ tc-q + tplogic + tsu
thold≤ tcdregister + tcdlogic
D-Latch
[Taskin, Kourtev & Friedman, The VLSI Handbook]
47
D-Register
(more commonly known as D-Flip-Flop)
[Taskin, Kourtev & Friedman, The VLSI Handbook]
48
Sequential Circuit Timing
+
Tclk
> tc-q + tplogic + tsu
tplogic
= tp,add + tp,abs + tp,log
How to reduce Tclk?
Pipelining
+
Tclk
> tc-q + tp,logic + tsu
Tp,logic = tp,add + tp,abs +tp,log
Non-pipelined
Tclk > tc-q + max(tp,add, tp,abs, tp,log) + tsu
Increase functional throughput
Pipelined
Pipelining Observations
•
•
•
•
Very popular/effective measure to increase functional throughput and
resource utilization
At the cost of increased latency
All high performance microprocessors excessively use pipelining in
instruction fetch-decode-execute sequence
Pipelining efficiency may fall dramatically because of branches in program
flow
• Requires emptying of pipeline and restarting
• Partially remedied by advanced branch prediction techniques
Bottom line: more flip-flops, greater timing design problems
Edge-Triggered Slow Path Skew Constraint
One clock tick later
Τ+δ
clock
tmax
R1
data
R2
Data at R2
should be stable
before clock pulse
is applied
(ti = interconnect delay)
Timing constraint
T + δ ≥ tmax = tp,logic+ ti+ tc-q,max+ tsu
Internal delay of
flip-flop
T ≥ tmax - δ
Minimum Clock Period Determined by Maximum Delay
between Latches minus skew δ
Edge-Triggered Fast Path Skew Constraint
δ
Same clock tick
clock
tmin
R1
data
R2
Race between
clock and data
δ +t hold < t cq,min + t cd,logic + ti
δ
thold
δ + thold < t cq,min + t cd,logic + t i
δ < t min = t cq,min + t cd,logic + t i −t hold
Maximum Clock Skew
Determined by Minimum Delay
between Latches
Edge-triggered long path constraints
Data must not reach destination FF too late
d(i,j)
i
j
si + max[d(i,j)] + Tsetup ≤ sj+ P
si
sj
sj + P
d(i,j)
Tsetup
54
Edge-triggered short path constraints
Data must not reach destination FF too early
d(i,j)
i
j
si + min[d(i,j)] ≥ sj + Thold
si
sj
sj + P
Thold
d(i,j)
55
Setup and Hold
If your design does not meet the setup timing constraints,
it will work at a lower clock frequency
If your design does not meet hold timing constraints, it
will not work at any clock frequency!
56
Network-on-a-Chip (NoC)
or
Dedicated networks with reserved links preferable for
high traffic channels – but: limited connectivity, area
overhead
Flexibility an increasing requirement in multi (many) –
core chip implementations
Point-to-point or time-multiplexed bus network
57
The Network Trade-off’s
Interconnect-oriented architecture trades off flexibility,
latency, energy and area-efficiency through the following
concepts
Locality - eliminate global structures
Hierarchy - expose locality in communication requirements
Concurrency/Multiplexing
Very Similar to Architectural Space Trade-off’s
Dedicated wiring
[Courtesy: B. Dally, Stanford]
Network-on-a-Chip
58
Networking Topology
Homogeneous
Bus, Star, Tree, Ring, Crossbar,
Mesh (granularity), …
Heterogeneous
Crossbar
Hierarchy
Tree
Mesh (FPGA)
59
Energy x Delay
Network Topology Exploration
Mesh
Binary Tree
Short connections in tree are redundant
Energy x Delay
Manhattan Distance
Inverse clustering complements mesh
Mesh
Binary Tree
Mesh + Inverse
Manhattan Distance
[Ref: V. George, Springer’01]
60
FPGA Architecture
61
FPGA PSM
62
FPGA Delay
The switches in the switch matrix are small (pass-)
transistors.
In FPGA’s, the switch matrices in the connections add
considerable resistance and hence delay!
63
Pass Transistor Logic
n
t p (Vn ) = 0.69∑ kCReq = 0.69CReq
k =0
n(n + 1)
2
Propagation delay is proportional to n2!
Insert buffers
mopt = 1.7
t pbuf
CReq
In current technologies, mopt is typically 3 or 4
64
64
Signaling Protocols
Network
din
reqin ackin
dout
Globally
Asynchronous
reqout ackout
self-timed handshaking protocol
Din
REQ
Processor
Module
(µProc, ALU, MPY, SRAM…)
in
Allows individual modules
to dynamically
trade-off performancedone
for energy-efficiency
1 cm chip: signal need 66 ps from one side to the other
2 GHz clock
65
Signaling Protocols
Network
din
reqin ackin
dout
Globally Asynchronous
reqout ackout
Physical Layer
Interface Module
din
dout
clk
Din
REQ
done
in
Clk
done
Processor
Module
(mProc, ALU, MPY, SRAM…)
Locally synchronous
66
Collaborative Networks
Metcalfe’s Law
to the rescue of
Moore’s Law!
Networks are intrinsically robust → exploit it!
Massive ensemble of cheap, unreliable components
Network Properties:
Bio-inspired
Local information exchange → global resiliency
Randomized topology & functionality → fits nano
properties
Distributed nature → lacks an “Achilles heel”
67
Traditional corner-based analysis
Given a set of parameters p1, p2, … , pk
Each parameter varies between [pi,min, pi,max]
The variational region forms a multidimensional box
Corner-based analysis performs simulations at each corner
Typically parameters correspond to process parameters,
temperature and voltage
68
Corner checks
69
Corner-based analysis: problems
The number of corners we need to examine grows
exponentially with the number of parameters
It is only conservative if there is a monotone relationship
between the parameter and the delay; otherwise the
worst behavior may be somewhere in between!
For very advanced technologies, monotonicity is no longer
true for all parameters!
70
Statistical Timing Analysis
Intra-die vs. inter-die variation
Flip-Flop
…
…
…
Intra-die
Inter-die only
P
P
Circuit delay is
maximum of individual
path delays
Ignoring intra-die
variation underestimates circuit delay
Dpath
Dpath
P
P
Dpath
Dpath
P
P
Dpath
Dpath
72
Statistical static timing analysis
Intra-die variations in addition to inter-die variations
Deterministic timing analysis
Statistical timing analysis
Path-based analysis: find variability along a single path
D1
PO
D2
…
…
PI
max(D1,D2…Dn)
Dnpath
Block-based analysis: Find the distribution (PDF/CDF) of:
Dmax = max(D1, D2, … , Dnpaths)
Di: distribution of ith path delay
73
Difficulties in statistical timing analysis
Path correlation due to reconvergent fanouts
a→b→d
Max func. of
corr. paths
b
d
a
c
a→c→d
Correlated
paths
circuit delay
distribution
Spatial correlations between nearby gates
74
Modeling spatial correlations
Chip area divided into rectangles
Nearby squares are correlated
Alternative hierarchical model
1,1
2,1
∆Lg1= ∆L2,1+ ∆L1,1+ ∆L0,1
∆Lg2 = ∆L2,4+ ∆L1,1+ ∆L0,1
∆Lg3 = ∆L2,15+ ∆L1,4+ ∆L0,1
2,2
2,5
2,3
1,2
1,3
2,6
2,4
2,7
1,4
2,8
2,14
2,13
2,10
2,16
2,15
2,9
2,12
2,11
75
Modeling spatial correlations
76
Problem statement
Find the PDF/CDF for circuit delay distribution:
Dmax = max(D1, D2, … , Dnpaths)
where Di : delay distribution of i
th
path in the circuit
Assume normal distributions on process parameter values
Why?
Is this reasonable? If not, what is?
Parameter correlations
Leff shows high spatial correlations
Tox, Nd are largely uncorrelated
77
Some basics of probability
Mean, variance
Covariance
Covariance matrix is positive semidefinite
Correlation coefficient
Independence
78
Commonly-encountered distributions
Gaussian or normal distribution N(µ,Σ)
In one variable
PDF
CDF
For a Gaussian, independence is identical to uncorrelatedness
[http://users.isr.ist.utl.pt/~mir/pub/probability.pdf]
79
Commonly-encountered distributions (contd.)
Multivariate Gaussian
PDF
80
Commonly-encountered distributions (contd.)
Bivariate Gaussian
PDF
The sum of two Gaussians is a Gaussian
Z= X+Y, where X, Y are uncorrelated Gaussians
81
The ellipsoid
Locus of equiprobable points for most distributions is not
a box
Gaussian: locus of equiprobable points forms an ellipsoid
Ellipse centered at (mX ,mY), axes along the eigenvectors of the
covariance matrix
Uncorrelated case
Correlated case
y
PC2
PC1
x
82
Idea of orthogonal transformations
PC2
y
Set of equiprobable
points for a 2D Gaussian
= ellipse
PC1
Correlated
basis
Gaussian with a diagonal
covariance matrix
x
y
PC2
Uncorrelated
basis
PC1
[Adapted from Tim Marks http://cogsci.ucsd.edu/~desa/pca.pdf]
x
83
Berkelaar’s method
[Ber97a,Ber97b,JB00]
Types of operations in STA
SUM: Ta→out = Ta + da→out ; Tb→out = Tb + db→out
MAX: Tout = max(Ta→out , Tb→out )
a
b
out
Gate delay modeled as a Gaussian
SUM is easy: sum of Gaussians = Gaussian
S = A+B: µS = µA + µB, σs2 = σA2 + σB2
MAX of Gaussians is not a Gaussian
Approach: approximate max by a Gaussian
Analytic expressions for mean, variance in [JB00]
Note: Calculating “MIN” for early mode analysis is analogous to calculating
“MAX” since MIN(f) = MAX(-f)
84
Why I didn’t write the precise expression
in the last slide…
85
Why I didn’t write the precise expression
in the last slide… (proof)
86
Clock Tree Synthesis
Clock problem:
•
•
•
•
•
•
•
Heavy clock net loading
Long clock insertion delay
Clock skew
Skew across clocks
Clock to signal coupling effect
Clock is power hungry
Electromigration on clock net
ASIC Design: Backend
87 | 104
Arguments for Sleep Mode Management
• Many computational applications operate in burst modes,
interchanging active and non-active modes
• General purposes computers, cell phones, interfaces,
embedded processors, consumer applications, …
• Prime concept: Power dissipation in standby should
absolutely minimum, if not zero
• Sleep mode management has gained importance with
increasing leakage
ASIC Design: Backend
88 | 104
Standby Power - Was Not A Concern In Earlier Days
Pentium-1: 15 Watt (5V - 66MHz)
Pentium-2: 8 Watt (3.3V- 133 MHz)
Processor in idle mode!
Floating Point Unit and Cache powered down when not in use
ASIC Design: Backend
[Source: Intel]
89 | 104
Dynamic Power - Clock Gating
• Turn off clocks to idle modules
• Ensure that spurious activity is set to zero
• Must ensure that data inputs to module are in stable
mode
• Primary inputs are from gated latches or registers
• Or, disconnected from interconnect network
• Can be done at different levels of system hierarchy
ASIC Design: Backend
90 | 104
Clock Gating
Turning off the clock for non-active components
Bus
Clk
Register File
Enable
Enable
Logic Module
Logic Module
Disconnecting the inputs
ASIC Design: Backend
91 | 104
Clock-gating Efficiently Reduces Power
Without clock gating
30.6mW
MPEG4 decoder
With clock gating
8.5mW
VDE
0
5
10
15
20
Power [mW]
25
MIF
DSP/
HIF
90% of F/F’s clock-gated.
70% power reduction by clock-gating alone.
DEU
896Kb SRAM
ASIC Design: Backend
92 | 104
Clock Gating
• Challenges to skew management and clock distribution (load on clock
network varies dynamically)
• Fortunately state-of-the-art design tools are starting to do a better job
• For example, physically aware clock-gating inserts gaters in clock-tree
based on timing constraints and physical layout
Power savings
Simpler skew management, less area
CG
CG
CG
CG
CG
ASIC Design: Backend
93 | 104
Trade-Off between Sleep-Modes and Sleep-Time
Typical operation modes
Active mode
normal processing
Standby mode
fast resume
high passive power
Sleep mode
slower resume
low passive power
Resume time from clock gating determined by the time it takes to turn
on the clock distribution network
Standby Options:
Just gate the clock to the module in question
Turn off phased-locked loop(s)
Turn off clock completely
ASIC Design: Backend
94 | 104
Sleep Modes in µProcessors and µControllers
ASIC Design: Backend
95 | 104
Wake-up Delay
The Standby Design Exploration Space
Sleep
Nap
Doze
Standby
Standby Power
Trade-off between different operational modes
Should blend smoothly with run-time optimizations
ASIC Design: Backend
96 | 104
Default Clock Behavior
• Defining the clock in a single-clock design constrains all timing
paths between registers for single-cycle, setup time
• By default the clock rises at 0ns and has a 50% duty cycle
• By default DC will not “buffer up” the clock network, even when
connected to many clock/enable pins of flip-flops/latches
• The clock network is treated as “ideal” - infinite drive capability
•
Zero rise/fall transition times
•
Zero skew
•
Zero insertion delay or latency
• Estimated skew, latency and transition times can, and should be modeled for
a more accurate representation of clock behavior
ASIC Design: Backend
97 | 104
Defining a Clock
Default clock with
50% duty cycle
Clock with specified
create_clock -period 2 –waveform {0 0.6} –name My_CLK [get_ports Clk]
duty cycle
ASIC Design: Backend
98 | 104
Modeling Clock
set_clock_uncertainty –setup TU [get_clocks CLK]
Pre-Layout: clock skew + jitter + margin
reset_design
create_clock -p 5 -n MCLK Clk
set_clock_uncertainty 0.5 MCLK
set_clock_transition 0.08 MCLK
set_clock_latency -source –max 4 MCLK
set_clock_latency –max 2 MCLK
ASIC Design: Backend
99 | 104
Specifying Setup-Timing Constraints
• Objective: Define setup timing constraints for all paths
within a sequential design
• All input logic paths (starting at input ports)
• The internal (register to register) paths
• All output paths (ending at output ports)
ASIC Design: Backend
100 | 104
Constraining Input Paths
The user must specify the latest arrival time of the data at input A
What is Tmax for N ?
create_clock –period 2 [get_ports Clk]
set_input_delay –max 0.6 –clock Clk [get_ports A]
The maximum delay for path N = 1.5 ns
create_clock –period 2.5 [get_ports Clk]
set_input_delay –max 0.9 –clock Clk [get_ports A]
ASIC Design: Backend
101 | 104
Constraining Output Paths
The user must specify the latest arrival time of the data at output B
What is Tmax through S ?
create_clock –period 2 [get_ports Clk]
set_output_delay –max 0.8 –clock Clk [get_ports B]
The maximum delay to port B = 0.7 ns
create_clock –period 2 [get_ports Clk]
set_output_delay –max 1.3 –clock Clk [get_ports B]
ASIC Design: Backend
102 | 104
Environmental attributes
• Input drivers and transition times
• Capacitive output loads
• Process/Voltage/Temperature
(PVT) operating conditions
• Interconnect parasitic RCs
ASIC Design: Backend
103 | 104
Input drivers and transition times
• Rise and fall transition times on an input port affect the cell
delay of the input gate
set_input_transition 0.6 [get_ports A]
set_driving_cell –lib_cell OR3B [get_ports A]
set_driving_cell –lib_cell FD1 –pin Qn [get_ports A]
ASIC Design: Backend
104 | 104
Capacitive output loads
Capacitive loading on an output port affects the transition time
and thereby the cell delay of the output driver
set_load [expr 30.0/1000] [get_ports B]
ASIC Design: Backend
105 | 104
Load budget
Assumptions:
1.The maximum fanout capacitance of any block’s input port is
limited to the equivalent of 10 “and2a1” gates
2.Output ports can drive a maximum of 3 other blocks
3.The driving gate of every output is the cell inv1a1
ASIC Design: Backend
106 | 104
Multi-Mode Multi-Corner
• Sometimes a block/IC needs to be able to work in different
modes:
•
•
•
•
•
Test mode
Low power / low performance/frequency mode
High power / high performance/frequency mode
Sleep mode
Etc…
• Each mode needs a separate set of timing constraints!
• The circuit needs to work for all modes in all corners…
ASIC Design: Backend
107 | 104