Energy Efficient and High Speed On-Chip Ternary Bus
Chunjie Duan
Mitsubishi Electric Research Labs, Cambridge, MA, USA
Sunil P. Khatri
Texas A&M University, College Station, TX, USA
Motivation
• Trends in VLSI design
– Shrinking feature size
• Deep SubMicron (DSM) and Very Deep SubMicron (VDSM) processes
– Scaling down supply voltage
– Increasing die-size (e.g. SoC, NoC, CMP)
• Impacts
– ✓ Smaller gate delay (high speed logic)
– ✓ Lower switching power per gate
– ✓ High complexity (>billion gates)
– ✗ Increasing power consumption
– ✗ Higher leakage current (standby power)
– ✗ Reduced noise margin
– ✗ Increasing interconnect delay
• Interconnect delay >> gate delay
• Global interconnect becomes the performance bottleneck
03/13/2008
On-chip Bus Interconnects
• The impact of DSM / VDSM:
– W↓, P↓
– L↑, T↑
• T is increased to avoid a quadratic increase in the resistance of the wire: $R \propto L / (W \cdot T)$
• Inter-wire capacitance CI is much greater than substrate capacitance CL → crosstalk becomes dominant
– λ = CI / CL > 10 for metal 4 in a 0.1 µm CMOS process
[Figure: wire cross-sections in an earlier process and in a DSM process, labeled with W, P, L, T, CI and CL]
Ternary Bus and Mapping
• Advantage of a ternary bus
– Low voltage step: Vdd/2 instead of Vdd
• We propose a bit-to-bit binary-ternary mapping scheme
– Each binary bit is mapped directly to a line on the ternary bus.
– A binary 0 is mapped to the middle value on the ternary bus, i.e. 0b → 0t.
– A binary 1 is mapped to either the high or the low value on the ternary bus, i.e. 1b → +t or -t.
• Disadvantage: lower bit density (1 bit/line vs 1.58 bit/line for true ternary bus)
• Advantages: direct mapping and flexible polarity
– Ternary to binary conversion is very slow and complex; direct mapping avoids it
– Flexible polarity results in low crosstalk, e.g., the ternary vectors +0+, -0-, +0- and -0+ all represent the same binary value 101.
• Each ternary value is represented by the polarity Pj and the magnitude Dj

Dj   Pj   Tj   Vout
0    X    0    V0
1    0    -    V-
1    1    +    V+
Ternary driver truth table
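The mapping and the driver truth table above can be sketched in a few lines of Python. This is only an illustration: the function names `driver_symbol` and `decode` are hypothetical, and the symbols '+', '0', '-' stand in for the voltages V+, V0 and V-.

```python
# Illustrative sketch; function names are hypothetical, not from the slides.

def driver_symbol(d, p):
    """Ternary driver truth table: D=0 -> '0' (P is don't-care),
    D=1,P=0 -> '-', D=1,P=1 -> '+'."""
    if d == 0:
        return "0"
    return "+" if p == 1 else "-"


def decode(ternary):
    """Bit-to-bit decoding: the middle level is binary 0,
    either extreme level is binary 1."""
    return "".join("0" if t == "0" else "1" for t in ternary)


# All four polarity choices carry the same binary word.
for vec in ["+0+", "-0-", "+0-", "-0+"]:
    print(vec, "->", decode(vec))
```

Because the decoder ignores polarity, the encoder is free to pick whichever polarity produces the least crosstalk.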
Crosstalk in a Multi-valued Bus
• Define the effective crosstalk as
$X_{eff,j} = \left| 2|\delta_j| - \delta_{j,j-1} - \delta_{j,j+1} \right|$
– where $\delta_{j,k} = \mathrm{sgn}(\delta_j) \, \Delta V_k / V_{step}$ is the normalized voltage change, $V_{step} = V_{dd} / (N_{OL} - 1)$ and $\delta_j = \Delta V_j / V_{step}$. $N_{OL}$ is the number of logic levels.
• Delay can be approximated as
$\tau_j = k \cdot C_L \cdot V_{step} \cdot (|\delta_j| + \lambda \cdot X_{eff,j})$
– for λ >> 1, $\tau_j \approx k \cdot C_L \cdot V_{step} \cdot \lambda \cdot X_{eff,j}$
• Energy consumption is
$E_{total} = \sum_{j=1}^{n} (|\delta_j| + \lambda \cdot X_{eff,j}) \cdot C_L \cdot V_{step}^2$
– when λ >> 1, $E_{total} \approx \lambda \cdot C_L \sum_{j=1}^{n} X_{eff,j} \cdot V_{step}^2$
• For ternary bus, Vstep = Vdd/2, we know
– max(Xeff,j) = 8
– min(Xeff,j) = 0
• Bus speed/power is highly data pattern dependent!

Table 1. Examples of Total Crosstalk
Vt-1   Vt    Xeff
000    +++   0
000    0++   1
000    0+-   5
+0+    0+0   4
+0+    0-0   0
-+0    +-0   6
+-+    -+-   8
+++    ---   0
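As a rough check of the effective-crosstalk definition, the sketch below computes Xeff for one line of a ternary bus from two consecutive bus words. It rests on assumptions that are not in the slides: a line that does not switch is taken to see zero effective crosstalk, a missing neighbor at the bus edge is treated as static, and the names `deltas` and `x_eff` are hypothetical.

```python
# Illustrative sketch under the assumptions stated in the text above.
LEVEL = {"-": -1, "0": 0, "+": 1}  # ternary levels in units of Vstep


def deltas(prev, curr):
    """Normalized voltage change of every line between two bus words."""
    return [LEVEL[c] - LEVEL[p] for p, c in zip(prev, curr)]


def x_eff(prev, curr, j):
    """Effective crosstalk seen by line j (0-based index)."""
    d = deltas(prev, curr)
    if d[j] == 0:
        return 0  # assumption: a static line sees no effective crosstalk
    left = d[j - 1] if j > 0 else 0            # assumption: edge neighbor is static
    right = d[j + 1] if j + 1 < len(d) else 0  # assumption: edge neighbor is static
    return abs(2 * d[j] - left - right)


# Middle line of a few 3-wire transitions
for prev, curr in [("000", "+++"), ("+0+", "0+0"), ("+-+", "-+-")]:
    print(prev, "->", curr, "Xeff =", x_eff(prev, curr, 1))
```

For a switching line this is the magnitude of twice its own change minus the changes of its two neighbors, so neighbors moving in the same direction reduce Xeff and neighbors moving in the opposite direction increase it.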
A Low Power, High Speed 4X Ternary Bus
• Using direct bit-to-bit mapping
• Coding rules:
– Rule #1: A direct - ↔ + transition is prohibited.
– Rule #2: A 1b→0b transition is mapped as -t→0t or +t→0t, depending only on the current polarity of the 1b.
– Rule #3: For a 0b→1b transition on bj, if bj-1 is transitioning, Pj is coded so both lines transition in the same direction.
– Rule #4: For a 0b→1b transition on bj, if bj-1 is not transitioning and bj+1 is transitioning from 1 to 0, Pj is coded so that the jth and (j+1)th lines transition in the same direction.
– Rule #5: For a 0b→1b transition on bj, if there is no transition on either neighbor, Pj is coded so that Pj = Pj-1 or Pj = Pj+1, with Pj = Pj-1 having the higher priority.
• The 1st rule guarantees max(Xeff,j) = 4, therefore a 2X speed up from a conventional binary bus
• The other rules are designed to lower the probability of high Xeff,j values occurring on the bus
• Identical encoder/decoder logic for each bit

An example of 4X ternary sequences
Binary:  11110111, 00110101, 11100011, 01010100, 10101110, 01110001, 00000011, 00011110
Ternary: ++-000-+, 00--0+0+, ++-000-+, 0+0+0+00, -0-0-+-0, 0+-+000000000-000+++-0
Xeff:    01100121, 01220111, 10112122, 00001021, 01212200, 13431121, 00110121
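One possible reading of the five coding rules is sketched below. It is an illustration only, not the authors' encoder. The function names are hypothetical, lines are processed left to right so that a left neighbor's new polarity is already known, and a default polarity of '+' is assumed whenever the rules leave the choice open. The polarities it produces therefore need not match the example sequences on the slide.

```python
# Illustrative sketch; not the authors' encoder. Ternary levels are -1, 0, +1.
UP = +1  # assumed default polarity when the rules leave the choice open


def pick_polarity(j, prev_t, bits, out):
    """Choose the polarity for a 0b->1b transition on line j."""
    has_left, has_right = j > 0, j + 1 < len(bits)
    # Rule 3: follow the direction of a switching left neighbor.
    if has_left and bits[j - 1] != abs(prev_t[j - 1]):
        return out[j - 1] - prev_t[j - 1]
    # Rule 4: follow a right neighbor that is returning from 1 to 0.
    if has_right and prev_t[j + 1] != 0 and bits[j + 1] == 0:
        return -prev_t[j + 1]
    # Rule 5: copy a neighbor's polarity, left neighbor first.
    if has_left and out[j - 1] != 0:
        return out[j - 1]
    if has_right and prev_t[j + 1] != 0 and bits[j + 1] == 1:
        return prev_t[j + 1]
    return UP  # assumption: no neighbor provides a polarity


def encode_4x(prev_t, bits):
    """Encode a binary word given the previous ternary word."""
    out = [0] * len(bits)
    for j, b in enumerate(bits):
        if b == 0:
            out[j] = 0                 # Rule 2: a 1 returns to the middle level
        elif prev_t[j] != 0:
            out[j] = prev_t[j]         # Rule 1: a held 1 keeps its polarity
        else:
            out[j] = pick_polarity(j, prev_t, bits, out)
    return out


def to_str(word):
    return "".join("+" if t > 0 else "-" if t < 0 else "0" for t in word)


state = [0] * 8  # assumption: the bus starts at the middle level
for word in ["11110111", "00110101", "11100011"]:
    state = encode_4x(state, [int(c) for c in word])
    print(word, "->", to_str(state))
```

Whatever polarity is chosen, two properties hold by construction: the magnitude of every ternary symbol equals the binary bit, and no line ever moves directly between - and +.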
An Even Faster 3X Ternary Bus
• Partition the bus into 5-bit groups
• Insert a shield wire between groups
• Apply the same rules as for the 4X bus
• It can be proven that such a configuration guarantees max(Xeff) = 3
– Additional 33% speed up over 4X ternary bus
– At the cost of 20% additional wires
[Figures: 4X bus encoder and driver circuit; 3X bus encoder and driver circuit. Each input bit Bj feeds an encoder (Enc) that produces Pj and Dj for a ternary driver with output Tj.]
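The two percentages quoted for the 3X bus follow from simple ratios. The block below assumes that, for λ >> 1, delay scales with the worst-case Xeff at a fixed Vstep, and that one shield wire is added per 5-bit group.

```latex
\[
\frac{\tau_{4X}}{\tau_{3X}} \approx \frac{\max X_{eff}^{(4X)}}{\max X_{eff}^{(3X)}}
= \frac{4}{3} \approx 1.33,
\qquad
\frac{\text{shield wires}}{\text{signal wires}} = \frac{1}{5} = 20\%
\]
```

The 20% wire overhead agrees with the bus area of 1.2 listed for the 3XT bus in the performance comparison.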
Circuit Implementations
• Encoder implemented based on the 5 rules
• Decoder is extremely simple (implemented with two 2-input gates)
• Ternary driver and receiver can be implemented in current or voltage mode
– Current mode is more power hungry (static current)
– Voltage mode requires a low impedance Vdd/2 supply
[Figure: (A) current mode, with I-driver and I-receiver; (B) voltage mode, with V-driver, V-receiver and shared reference voltages Vref1 and Vref2]
Experimental Results
• The power saving comes from the redistribution of the Xeff
– More transitions are pushed towards lower Xeff
• The average power saving is ~27%
Crosstalk distribution and normalized energy consumption comparison
(T: coded ternary vs. B: half-swing binary)

Bus Size  Bus  0X     1X     2X     3X     4X    EF (x10^4)  Saving (%)
5         B    52821  81837  46056  20289  3792  25.0
5         T    74712  99228  28101  2754   0     16.3        34.5
8         B    16924  26509  14432  6123   1540  7.99
8         T    21792  31373  11104  1259   0     5.73        28.2
16        B    15541  25637  15437  7264   1641  8.49
16        T    19843  31302  12685  1690   2     6.17        27.2
32        B    14852  25109  15949  7771   1823  8.76
32        T    18976  31285  13550  1691   2     6.35        27.5

4X: ternary bus using 4X code;
HB: half-swing binary bus;
RP: ternary bus with random polarity;
TT: true ternary bus
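The EF column appears to be the crosstalk-weighted transition count, i.e. the sum of k times the number of kX transitions. This is an inference from the numbers, not something the slides state. The sketch below recomputes EF and the saving from the distribution; the name `weighted_energy` is hypothetical.

```python
# Illustrative sketch; the EF formula is inferred from the data, not stated in the slides.
# Transition counts per crosstalk class 0X..4X, copied from the table.
rows = {
    5:  {"B": [52821, 81837, 46056, 20289, 3792], "T": [74712, 99228, 28101, 2754, 0]},
    8:  {"B": [16924, 26509, 14432, 6123, 1540],  "T": [21792, 31373, 11104, 1259, 0]},
    16: {"B": [15541, 25637, 15437, 7264, 1641],  "T": [19843, 31302, 12685, 1690, 2]},
    32: {"B": [14852, 25109, 15949, 7771, 1823],  "T": [18976, 31285, 13550, 1691, 2]},
}


def weighted_energy(counts):
    """Sum of (crosstalk class k) x (number of transitions in class kX)."""
    return sum(k * n for k, n in enumerate(counts))


def saving_percent(size):
    """Relative reduction of the ternary EF with respect to the binary EF."""
    ef_b = weighted_energy(rows[size]["B"])
    ef_t = weighted_energy(rows[size]["T"])
    return 100.0 * (1.0 - ef_t / ef_b)


for size in rows:
    ef_b = weighted_energy(rows[size]["B"]) / 1e4
    ef_t = weighted_energy(rows[size]["T"]) / 1e4
    print(f"{size:>2}-bit bus: EF_B={ef_b:.2f} EF_T={ef_t:.2f} saving={saving_percent(size):.1f}%")
```

The recomputed values agree with the tabulated EF and saving figures to within rounding, which supports reading the saving as a shift of transitions toward lower Xeff classes.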
Experimental Results
• The proposed 4X and 3X busses are advantageous over
other bus coding schemes.
• EF: Normalized total energy
• PDP: power delay product
Bus type         4XT    3XT    SB     HB     RP     TT
EF (x10^4)       6.13   6.67   19.7   8.38   12.1   7.55
Delay            4x     3x     4x     4x     8x     8x
PDP (x10^5)      2.45   2.00   7.88   3.35   9.68   6.04
Pwr saving (%)   68.9   66.1   0      57.5   38.6   61.7
PDP gain (%)     68.9   74.6   0      57.5   -22.8  23.4
Bus Area         1      1.2    1.97   1      1      0.68
Bus performance comparison

4XT: ternary bus using 4X code;
3XT: ternary bus with 3X code;
SB: binary bus with shielding;
HB: half-swing binary bus;
RP: ternary bus with random polarity;
TT: true ternary bus
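The derived rows of the comparison can be reproduced from the EF and delay rows, assuming PDP is EF multiplied by the delay factor and that both percentages are measured against the shielded binary bus (SB). These relations are inferred from the numbers; the helper names are hypothetical.

```python
# Illustrative sketch; the relations between rows are inferred from the table.
ef = {"4XT": 6.13, "3XT": 6.67, "SB": 19.7, "HB": 8.38, "RP": 12.1, "TT": 7.55}  # x10^4
delay = {"4XT": 4, "3XT": 3, "SB": 4, "HB": 4, "RP": 8, "TT": 8}                  # in "x" units


def pdp(bus):
    """Power-delay product in units of 10^5 (EF is given in units of 10^4)."""
    return ef[bus] * delay[bus] / 10.0


def gain_percent(value, baseline):
    """Relative improvement of `value` over `baseline`, in percent."""
    return 100.0 * (1.0 - value / baseline)


for bus in ef:
    print(f"{bus:>3}: PDP={pdp(bus):.2f}  "
          f"power saving={gain_percent(ef[bus], ef['SB']):.1f}%  "
          f"PDP gain={gain_percent(pdp(bus), pdp('SB')):.1f}%")
```

The negative PDP gain for RP shows that a lower energy figure alone does not help when the worst-case delay doubles.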
Experimental Results
Eye diagrams for uncoded and coded busses (10mm)
Summary
• Crosstalk classification was extended to multi-valued buses
• We proposed a direct bit-to-bit binary-ternary mapping scheme which
results in a simple CODEC design.
• We proposed a 4X coding scheme that allows us to double the speed
of a conventional ternary bus and save energy.
• We proposed a coding scheme (3X coding) to attain an additional
33% speed gain at the cost of 20% area overhead.
• We designed and implemented the CODEC and ternary
driver/receiver.
• Our experimental results show significant power saving (27%) and
speed gain (2X or more) over other schemes