High-Speed Digital Architectures
Chris Allen ([email protected])
Course website URL
people.eecs.ku.edu/~callen/713/EECS713.htm
Overview
Topics include
• Pipelining
• Latency
• Demultiplexing
• Multiplexing
• Clock fanout and distribution
• Clock skew and fine timing adjustments
• Clock signal sources
Logic devices and high-speed designs
Pipelining & latency
Consider the multi-bit adder, A + B
We want to add two 18-bit binary numbers
(unsigned binary)
A17 A16 … A1 A0 + B17 B16 … B1 B0
A0 and B0 are the least significant bits (LSBs)
A17 and B17 are the most significant bits (MSBs)
How fast can we add two 18-bit numbers?
A 6-bit ECL adder will be the building block for this design
Inputs: A5:A0, B5:B0, Cn (carry input)
Outputs: F5:F0, G (carry output)
A: 0 to 63
B: 0 to 63
Cn: 0 or 1
F: 0 to 63
G: 0 or 1
Cn + A + B (maximum case: 1 + 63 + 63 = 127) → F = 63, G = 1
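As a rough behavioral sketch (not a model of any particular ECL part), the 6-bit block and an 18-bit adder cascaded from three such blocks might look like this. The names `add6` and `add18` are illustrative and do not come from the slides.

```python
def add6(a: int, b: int, cn: int) -> tuple[int, int]:
    """Model one 6-bit adder block: returns (F, G) = (6-bit sum, carry out)."""
    assert 0 <= a < 64 and 0 <= b < 64 and cn in (0, 1)
    total = a + b + cn
    return total & 0x3F, total >> 6


def add18(a: int, b: int) -> tuple[int, int]:
    """Cascade three 6-bit blocks; each block's G drives the next block's Cn."""
    result, carry = 0, 0
    for stage in range(3):
        shift = 6 * stage
        f, carry = add6((a >> shift) & 0x3F, (b >> shift) & 0x3F, carry)
        result |= f << shift
    return result, carry


if __name__ == "__main__":
    print(add18(200000, 62143))
```

The carry chain in `add18` is the path whose propagation delay limits the add rate.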
Pipelining & latency
18-bit adder
Consider the propagation delay (typical)
An, Bn, Cn → Fn: 3 ns
An, Bn, Cn → G: 2.5 ns
Note: The adder is combinational logic, not sequential; there is no clock signal
How to find T
Identify critical path (longest delay)
Bits    Path        Delay
5:0     A, B → F    3 ns
5:0     A, B → G    2.5 ns
11:6    A, B → F    3 ns
11:6    Cn → F      5.5 ns
11:6    A, B → G    5 ns
17:12   A, B → F    3 ns
17:12   Cn → F      8 ns
17:12   A, B → G    7.5 ns
For this configuration, the output is stable after 8 ns → 125 MHz is the
max rate for this 18-bit adder
A greater number of bits (e.g., 36-bit adder) would
further increase the delay, reducing the add rate
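The critical-path arithmetic can be checked with a short sketch. It assumes the carry must ripple through every preceding 6-bit block (2.5 ns each) before the last block forms its sum (3 ns). The function name and its defaults are illustrative.

```python
def ripple_adder_delay(n_blocks: int, t_sum: float = 3.0, t_carry: float = 2.5) -> float:
    """Worst-case settling time in ns for n_blocks cascaded 6-bit adders."""
    # The carry crosses (n_blocks - 1) blocks, then the final block produces F.
    return (n_blocks - 1) * t_carry + t_sum


if __name__ == "__main__":
    for bits in (6, 18, 36):
        delay = ripple_adder_delay(bits // 6)
        print(f"{bits}-bit adder: {delay:.1f} ns, about {1e3 / delay:.0f} MHz")
```

For three blocks this reproduces the 8 ns (125 MHz) figure, and it shows how the delay grows linearly as more blocks are cascaded.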
Pipelining & latency
Multiplexed 18-bit adder
twice as much hardware to produce results twice as fast
A clock signal
has been added to
synchronize the
registers
In this configuration, the
adding rate is 250 MHz
and the latency is 8 ns
Propagation delay for each channel is still 8 ns
2:1 multiplexing cuts time in half → 4 ns (250 MHz)
4:1 multiplexing cuts time by four → 2 ns
But the latency is still 8 ns
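A minimal sketch of the throughput-versus-latency trade for interleaved (multiplexed) adders, assuming ideal interleaving with no register or mux overhead. The function name is illustrative.

```python
def multiplexed_adder(t_prop_ns: float, ways: int) -> tuple[float, float]:
    """Return (result period in ns, latency in ns) for `ways` interleaved adders."""
    # Each adder still needs t_prop_ns per result; interleaving only
    # shortens the time between successive results.
    return t_prop_ns / ways, t_prop_ns


if __name__ == "__main__":
    for ways in (1, 2, 4):
        period, latency = multiplexed_adder(8.0, ways)
        print(f"{ways}:1 -> {period:.0f} ns per result "
              f"({1e3 / period:.0f} MHz), latency {latency:.0f} ns")
```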
Pipelining & latency
Pipelined 18-bit adder
While a single 6-bit add takes
only 3 ns (333 MHz), the
propagation of the carry bit
slows the 18-bit addition to 8 ns
(125 MHz)
A pipelined architecture allows
the adder to operate with a 3-ns
add time plus 0.5 ns for setup
and propagation through the
register → 3.5 ns cycle time
(286 MHz)
This scheme is expandable to
N-bit adds with same rate
In this configuration, the
adding rate is 286 MHz
and the latency is
6 clock cycles or 21 ns
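The pipelined figures follow from simple arithmetic, sketched here. The latency of 6 clock cycles is taken from the slide as a given rather than derived, and the function name is illustrative.

```python
def pipelined_adder(t_add_ns: float = 3.0, t_reg_ns: float = 0.5,
                    latency_cycles: int = 6) -> tuple[float, float, float]:
    """Return (cycle time in ns, add rate in MHz, latency in ns)."""
    cycle = t_add_ns + t_reg_ns          # one 6-bit add plus register overhead
    return cycle, 1e3 / cycle, latency_cycles * cycle


if __name__ == "__main__":
    cycle, rate, latency = pipelined_adder()
    print(f"cycle {cycle} ns, rate {rate:.0f} MHz, latency {latency} ns")
```

Because the cycle time depends only on one block's add time, widening the adder changes the latency but not the rate.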
Pipelining & latency
The price to be paid for achieving this speed is
• Circuit complexity
• Data latency
Some applications can tolerate large latency
Examples include one-way data transfers
such as TV broadcast signal
Other applications cannot tolerate much latency
Examples include two-way data exchange
such as voice communications
(calls via satellite have latency of ~ 0.5 s)
Techniques to further speed up the add process
If a 6-bit add takes 3 ns
a 2-bit add should take ~ 1 ns
a 1-bit add should take ~ 0.8 ns
Theoretically, adds could be performed as fast as 1.5 ns (667 MHz) with 18 add stages
and 36 clock cycles of latency
Note also that this approach requires a large number of clock signals (not
shown)
High-speed digital design examples
Consider a data acquisition (DAQ) system
Analog signals are digitized and recorded
• Example applications include oscilloscopes and radar receivers
The maximum bandwidth of the
acquired signal is limited by the
ADC clock frequency
The precision of the digitized signal
is limited by the number of bits
in the ADC
The length of the data record is limited
by the memory size (2^N)
WE: write enable
High-speed digital design examples
Consider an arbitrary waveform generator (AWG) system
Analog waveforms are produced from
stored digital records
• Example application –
radar waveform generation
The maximum bandwidth of the
output signal is limited by the
DAC clock frequency
The precision of the digitized signal
is limited by the number of bits in the DAC
The length of the waveform is limited by the memory size (2^N)
In both cases, the maximum data vector size is X × 2^N
i.e., X-bit wide word, 2^N word vector length
High-speed digital design examples
Data acquisition system
Consider the case where X = 8 bits, N = 16 → 64k word vector
At a 1-GHz clock rate, the maximum record length is 65.5 µs
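The record length follows from the memory depth and the sample clock, as this short sketch shows. It assumes one word is stored per clock cycle, and the function name is illustrative.

```python
def record_length_s(n_addr_bits: int, f_clk_hz: float) -> float:
    """Time span covered by a 2^N-word record sampled at f_clk_hz."""
    return (2 ** n_addr_bits) / f_clk_hz


if __name__ == "__main__":
    words = 2 ** 16
    print(f"{words} words x 8 bits = {words * 8} bits of storage")
    print(f"record length = {record_length_s(16, 1e9) * 1e6:.1f} us")
```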
DAQ high-level timing
Within the 1-ns clock period
• The acquired data must stabilize
• The memory must be addressed
• The Write Enable line must toggle
All in compliance with the memory’s timing requirements
• Setup and hold times for Data and Address relative to Write Enable
Key to the DAQ operation are the address generator and the memory
This design requires a 16-bit synchronous counter with preset inputs
– Not a ripple counter
The Addr_CLK must be the system clock (1 GHz)
The memory write cycle time < 1 ns
It is difficult to achieve the required timing with available technology
High-speed digital design examples
The timing requirements can be relaxed with demultiplexing
The SPT7760 ADC has an integrated 1:2 demux that reduces the
effective output data rate (per channel) to 500-MSa/s
High-speed digital design examples
The ADC’s 1:2 demux doubles the time available for each memory write to 2 ns
Consider a design using the following devices:
8-bit, 1 GSa/s ADC with 1:2 demux
700 MHz, 8-bit sync counter
1k x 4 RAM with 5-ns write cycle time
Since the RAM’s 5-ns write cycle time > ADC’s 2-ns demux’d update time,
further demultiplexing is required
A 4:1 demux will stretch the data update period to 8 ns,
a rate the RAM can accommodate
One cost of this approach is the added complexity
both in terms of added hardware
and in terms of signal formatting for output
4:1 DEMUX
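The choice of demux factor can be sketched as follows. Restricting the factor to powers of two is an assumption made here (it matches the factor of 4 chosen on the slide), and the function name is illustrative.

```python
import math


def extra_demux_factor(update_period_ns: float, write_cycle_ns: float) -> int:
    """Smallest power-of-two demux factor that stretches the data update
    period to at least the RAM write cycle time."""
    ratio = write_cycle_ns / update_period_ns
    if ratio <= 1:
        return 1                      # memory already keeps up
    return 2 ** math.ceil(math.log2(ratio))


if __name__ == "__main__":
    factor = extra_demux_factor(update_period_ns=2.0, write_cycle_ns=5.0)
    print(f"demux factor {factor}, update period {2.0 * factor:.0f} ns")
```

With the 2-ns ADC output period and the 5-ns RAM write cycle, a factor of 2 would give only 4 ns, so the next step up is 4, giving 8 ns.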
High-speed digital design examples
High-level timing for system with 2:1 ADC demux and 4:1 demux on PCB
High-speed digital design examples
Just as demultiplexing relaxed the DAQ timing requirements,
multiplexing eases the arbitrary waveform generator’s timing challenges
4:1 MUX
A 4:1 mux will reduce the data rate from each memory device by a factor of 4
High-speed digital design examples
Integrating a multiplexer in the digital-to-analog converter
allows the converter to operate at higher rates
Integrated
1:2 Mux
High-speed digital design examples
General design rules for these high-speed applications
• Keep uniform line lengths within a data bus to ensure constant signal
latency
• Keep analog signal lines away from digital lines
digital lines contain significant broadband ‘noise’ that can degrade the
analog signal through crosstalk
• Clock signal distribution design is critical to achieve maximum
operating speed
• Jitter in the clock signal (due to clock generator circuit) will result in
phase noise in the data
Clock signal issues
Clock signals provide a time reference for the entire system
Issues to consider regarding clock signals
Clock fanout and distribution
Clock skew and fine timing adjustments
Clock division: fCLK/2, fCLK/4, …
Clock signal generation
Clock fanout
Consider case where multiple registers must be clocked simultaneously
However the fanout limit of the
technology is ~ 5 (3 to 10)
Clock fanout buffers
Intended to provide multiple
copies of the clock signal
with equal latencies
Clock skew
Clock skew describes when timing signals arrive at
different components at different times
Possible causes include
Clock buffer skew
Mismatched trace propagation delay
Capacitive loading or coupling
Clock buffer skew
Gate-to-gate skew:
20 ps (typ), 50 ps (max)
Clock skew
Even with low-skew clock buffers, some clock skew will remain
Timing variations can compound as devices are cascaded, leading to
increased uncertainty
Impact?
System timing variations → reduced timing margin
How to compensate for clock skew?
For critical timing applications, we can employ delay adjustments
Delay line (passive): delay depends on length
Gate delay (active): delay depends on gate characteristics
Example
Consider two clock (or data) lines we wish to synchronize using delay line
variations
By changing jumper connections we can make
tB < tA, tB = tA, or tB > tA
Clock skew
Similar schemes for varying signal delay.
Clock skew
Problem
Jumpers may cause impedance mismatch → reflections
Using surface-mount strips to close the gaps helps reduce mismatches
Problem
Occupies significant board area
Hard to implement at chip level or in MCM
Implement jumper selection electronically
Consider implementing the variable delay with a simple gate
(OR, XOR, AND, … 300 to 1500 ps)
and a multiplexer
The delay is controlled
electronically by
bits S0 and S1
S1  S0  F  Delay
0   0   A  0
0   1   B  Tp
1   0   C  2Tp
1   1   D  3Tp
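The select-bit mapping can be sketched as a small lookup, assuming each tap A through D adds one more gate delay Tp than the previous one. The function name and the 300 ps value used in the example are illustrative.

```python
def selected_delay(s1: int, s0: int, tp_ps: float) -> tuple[str, float]:
    """Return (selected mux input, added delay in ps) for select bits S1, S0."""
    taps = "ABCD"                 # A = undelayed, B = one gate, C = two, D = three
    index = (s1 << 1) | s0
    return taps[index], index * tp_ps


if __name__ == "__main__":
    for s1 in (0, 1):
        for s0 in (0, 1):
            tap, delay = selected_delay(s1, s0, tp_ps=300.0)
            print(f"S1={s1} S0={s0}: F={tap}, delay={delay:.0f} ps")
```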
Clock skew
Single-chip programmable delay lines available
Clock division
Subharmonics of the clock signal (fCLK/2, fCLK/4, …) can be produced using
simple flip-flops configured as clock frequency dividers
The output signals have a 50% duty cycle regardless of the input signal’s
duty cycle
Shift registers can be used to divide the signal frequency by other integer
factors (known as ring counters or Johnson counters)
Various duty cycles can be produced from these configurations
Ring counter
Johnson counter
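A behavioral sketch of the three divider types, with one list entry per rising clock edge. These are idealized models with no setup, hold, or propagation delay, and the function names are illustrative.

```python
def divide_by_two(n_edges: int) -> list[int]:
    """Toggle flip-flop: the output changes state on every rising clock edge."""
    q, out = 0, []
    for _ in range(n_edges):
        q ^= 1
        out.append(q)
    return out


def ring_counter(n_stages: int, n_edges: int) -> list[int]:
    """Circulate a single 1 around the shift register; report the last stage."""
    state = [1] + [0] * (n_stages - 1)
    out = []
    for _ in range(n_edges):
        state = [state[-1]] + state[:-1]
        out.append(state[-1])
    return out


def johnson_counter(n_stages: int, n_edges: int) -> list[int]:
    """Feed the inverted last stage back to the first; report the last stage."""
    state = [0] * n_stages
    out = []
    for _ in range(n_edges):
        state = [1 - state[-1]] + state[:-1]
        out.append(state[-1])
    return out


if __name__ == "__main__":
    print(divide_by_two(8))
    print(ring_counter(4, 8))
    print(johnson_counter(2, 8))
```

In this model an N-stage ring counter repeats every N clocks with a 1/N duty cycle, while an N-stage Johnson counter repeats every 2N clocks with a 50% duty cycle.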
Clock sources
Clock signals are used to provide a timing reference
Typically only one clock oscillator is used per system
In computers, higher frequency signals may be derived from a single
oscillator through frequency multiplication (e.g., PLL)
In radar systems, the radar frequency, the A/D sample clock, and other
timing and frequency signals are derived from a master clock oscillator
(an exception would be the clock that drives the DSP which operates
asynchronously from the rest of the system)
When specifying the clock oscillator for digital applications, consider several parameters
• Output voltage level: TTL or ECL, not sinusoidal with zero mean
• Frequency (MHz, GHz): nominal operating frequency at nominal temperature and voltage
• Stability (ppm): long-term frequency drift driven by temperature, aging, voltage
• Rise/fall time (ps)
• Waveform symmetry (%): may want to use CLK and its complement for split-phase timing
• Environmental factors: temperature range, shock/vibration
• Package: DIP vs. SMT, metal vs. plastic or ceramic
Clock sources
Stability factors
Temperature – quartz crystals used as resonant elements
• Piezoelectric effect
Resonance frequency determined by physical dimensions
Temperature induces expansion/contraction → frequency changes
Several varieties
Non-compensated – large Δf/ΔT
Temperature compensated – less Δf/ΔT
Oven-controlled – T is constant
Short-term frequency variations
Characterized in terms of phase noise or timing jitter
Phase noise refers to a random, uncorrelated clock-period variation
• Introduces timing variations that reduce timing margin
• Frequency multiplication amplifies the phase noise
Clock sources
Various methods available to characterize clock jitter (phase noise)
Spectral analysis
An ideal clock signal has spectral energy at the fundamental and harmonic
frequencies only
Jitter (phase noise) causes a broadening of the spectral lines
Specified as the power level below the fundamental at an offset Δf
• For example, -50 dBc @ 100 kHz
Clock sources
Converting jitter from measured phase noise
W. Kester, “Converting Oscillator Phase Noise to Time Jitter,”
MT-008 TUTORIAL, Rev. A, Oct. 2008, Analog Devices, Inc.
$\text{jitter}_{RMS} = \dfrac{\phi_{RMS}}{\omega_o}$, where $\omega_o = 2\pi f_o$
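One way to carry out this conversion numerically is sketched below, following the general approach of the cited tutorial: integrate the single-sideband phase noise over the offset band, convert the area to RMS phase in radians, then divide by 2π·fo. The sketch assumes the phase noise is piecewise linear on a log-log plot. The function name and the profile values are hypothetical and are not taken from any data sheet.

```python
import math


def rms_jitter_s(points: list[tuple[float, float]], f0_hz: float) -> float:
    """RMS jitter in seconds from (offset_hz, dBc_per_Hz) points, ascending offsets."""
    area = 0.0                                   # integrated SSB noise power ratio
    for (f1, l1_db), (f2, l2_db) in zip(points, points[1:]):
        l1 = 10 ** (l1_db / 10)                  # dBc/Hz to linear ratio per Hz
        slope = (l2_db - l1_db) / (10 * math.log10(f2 / f1))
        if abs(slope + 1) < 1e-9:                # 1/f segment integrates to a log
            area += l1 * f1 * math.log(f2 / f1)
        else:
            area += l1 * f1 / (slope + 1) * ((f2 / f1) ** (slope + 1) - 1)
    phase_rms_rad = math.sqrt(2 * area)          # factor 2 counts both sidebands
    return phase_rms_rad / (2 * math.pi * f0_hz)


if __name__ == "__main__":
    # Hypothetical phase-noise profile for a 100-MHz oscillator.
    profile = [(1e3, -120.0), (1e4, -140.0), (1e7, -150.0)]
    print(f"{rms_jitter_s(profile, 100e6) * 1e15:.0f} fs RMS")
```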
Clock sources
Delay line method of characterizing clock jitter
Beat a sample of the clock signal
with a delayed version of itself
$v_1 = \cos(\omega t + \phi_1)$
$v_2 = \cos[\omega (t - \tau) + \phi_2]$
Mixer produces sum (Σ) and difference (Δ) terms
the LPF rejects the Σ term leaving v0
$v_0 \propto \cos(\omega\tau + \phi_1 - \phi_2) = \cos(\omega\tau + \Delta\phi)$
For fixed delay value, τ, and a stable ω
v0 varies as Δφ changes
To relate Δφ to time
$\text{jitter} = \Delta\phi / \omega$
Example, for fo = 300 MHz, Δφ = 2° (35 mrad), jitter = 18.5 ps
this value will vary with delay line length
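The example figure can be reproduced with a few lines, reading the relation as jitter = Δφ / (2π·fo). The mixer-output helper models the low-pass-filtered difference term as cos(ωτ + Δφ) with unit amplitude, which is a simplifying assumption. Both function names are illustrative.

```python
import math


def phase_to_jitter_s(delta_phi_rad: float, f0_hz: float) -> float:
    """Convert a phase deviation in radians to time jitter in seconds."""
    return delta_phi_rad / (2 * math.pi * f0_hz)


def mixer_output(f0_hz: float, tau_s: float, delta_phi_rad: float) -> float:
    """Normalized low-pass-filtered mixer output for delay tau_s."""
    return math.cos(2 * math.pi * f0_hz * tau_s + delta_phi_rad)


if __name__ == "__main__":
    jitter = phase_to_jitter_s(math.radians(2.0), 300e6)
    print(f"jitter = {jitter * 1e12:.1f} ps")
    # A quarter-period delay puts the mixer at its most phase-sensitive point.
    print(mixer_output(300e6, 1 / (4 * 300e6), math.radians(2.0)))
```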
Clock sources
Example data sheet
Clock sources
For testing purposes, it is useful to vary the clock frequency
Finding the maximum operating clock frequency
In laboratory testing we can use a variable clock generator (if you have one)
Older versions have a maximum clock output frequency of 250 MHz
However, we can use a standard laboratory oscillator (sinusoidal) if we set the
amplitude to V (logic levels) and apply a DC bias = threshold voltage
Example, with ECL and GaAs devices
C1: AC-couples the oscillator to the circuit
RT: provides impedance matching and level shifting to VBB
L: DC-couples VBB to CLK while blocking AC
C2: AC path for return current
RT = Zo (50 Ω)
C1, C2: (2π f C)^-1 << RT (< 1 Ω)
L: 2π f L >> RT (> 1 kΩ)