Download talk

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Integrated Circuit Design with
NEM Relays
Fred Chen2, Hei Kam1, Dejan Markovic3,
Tsu--Jae King Liu1, Vladimir Stojanovic2, Elad Alon1
Tsu
1Department
of EECS, University of California, Berkeley, CA
2Department of EECS, Massachusetts Institute of Technology, Cambridge, MA
3EE Department, University of California, Los Angeles, CA
CMOS is Scaling, Power Density is Not
Power Density Prediction circa 2000
1000

10.1
100
Sun’s Surface
Rocket Nozzle
Nuclear Reactor
Normalized Energy/op
Power Density (W/cm2)
10000
100
8086 Hot Plate
10 4004
P6
80088085 386
Pentium® proc
286
486
8080
1
1970
1980
1990
2000
2010
Year
S. Borkar
80
60
40
20
0 0
10
1
10
2
3
10
10
1/throughput (ps/op)
4
10
Vdd and Vt not scaling well  power/area not scaling
ICCAD 2008 – November 12, 2008
2
CMOS is Scaling, Power Density is Not
Power Density Prediction circa 2000
1000


10.1
100
Sun’s Surface
Rocket Nozzle
Nuclear Reactor
Normalized Energy/op
Power Density (W/cm2)
10000
100
Core 2
8086 Hot Plate
10 4004
P6
80088085 386
Pentium® proc
286
486
8080
1
1970
1980
1990
2000
2010
Year
S. Borkar
80
60
Operate at a
lower energy
point
40
20
0 0
10
Run in parallel to
recoup performance
1
10
2
3
10
10
1/throughput (ps/op)
4
10
Vdd and Vt not scaling well  power/area not scaling
Parallelism to improve performance within power budget
ICCAD 2008 – November 12, 2008
3
Where Parallelism Doesn’t Help
25
100
20
80
15
Etotal
10
Edynamic
5
Eleak
0.1

0.2
0.3
Vdd (V)
0.4
0.5
60
Scale Vdd & VT:
40
20
0 1
10
2
10
3
4
10
10
1/throughput (ps/op)
5
10
CMOS circuits have wellwell-defined minimum energy



10.1
Energy/op vs. 1/throughput
Normalized Energy/op
Normalized Energy/cycle
Energy/op vs. Vdd
Caused by leakage and finite sub-threshold swing
Need to balance leakage and active energy
Limits energy-efficiency, regardless how slow the circuit runs
ICCAD 2008 – November 12, 2008
4
What if There Was No Leakage?
Normalized Energy/cycle in CMOS
Measured relay I-V
1.2
W x L = 2mm x 20mm
1.0
20
15
IDS (mA)
Normalized Energy/cycle
25
Etotal
10
0.6
VDS = 1V
0.0
34
0.1


10.1
0.2
0.3
Vdd (V)
0.4
compliance
limit
0.4
0.2
Edynamic
5
H = g = 0.6mm
0.8
0.5
35
36
VGB (V)
37
No longer need to balance increasing component of
energy as Vdd decreases
Relays offer near infinite subsub-threshold slope and no
leakage current
ICCAD 2008 – November 12, 2008
5
Outline


If we could reliably build relays, would circuits
implemented using relays offer superior
energy--efficiency to CMOS?
energy
Compact Circuit Model for NEM Relays


Circuit Design Paradigms for NEM Relays


How to deal with ―slow‖ relays
Compare vs. CMOS


10.1
Fast yet key functionality captured
Digital Logic: 32-bit adder
Mixed Signal I/O: DAC, ADC
ICCAD 2008 – November 12, 2008
6
NEM Relay Structure & Operation
Schematic Cross-Section
Electrical Gate
Metal Channel
Gate Dielectric
W HH
Source
t gap
Closed Relay (“ON”):
|Vgb| > Vpi (pull-in voltage)
Air gapH
Drain
Drain
tdimple
Body Dielectric
Bodyody
Body

Plan-View
L
Source
Open Relay (“OFF”):
|Vgb| < Vpo (pull-out voltage)
Anchor
Electrical
Gate
Body Electrode
(under body
dielectric)
W
Drain
Channel

Contact Dimples
10.1
Relatively low on-state resistance
(100Ω < Ron < 1kΩ @ 90nm)
Infinite off-state resistance  Zero
off-state leakage
ICCAD 2008 – November 12, 2008
7
NEM Relay Model
Anchor
Mechanical
Model
Spring k
Mass m
Damper b
C
0.5Rc
Cs
S
Rsd
0.5Rc
Cd
Go
Cgc
Rcs
So
Cgb
D
Rcd
Rsd
Do
Csd_b
Csd_b
B

Compact Verilog
Verilog--A model for circuit simulation:



10.1
Mechanical dynamics: spring (k), damper (b), mass (m)
Electrical parasitics: non-linear gate-body (Cgb), gate-channel
(Cgc), and source/drain-body cap (Csd_b), contact resistance (Rcs,d)
Vpi and delay verified against device measurements
ICCAD 2008 – November 12, 2008
8
NEM Relay Scaling

Constant EE-field scaling has similar benefits as CMOS
Pull-in Voltage with Beam Scaling

Measured pullpull-in voltages scale linearly
If we can overcome reliability issues & surface forces scale:
{W,L,tgap} = {90nm,2.3um,10nm}  Vpi = 200mV


10.1
Mechanical delay also scales linearly (~10ns @90nm)
ICCAD 2008 – November 12, 2008
9
NEM Relay as a Logic Element

4-terminal design mimics MOSFET operation
 Electrostatic actuation is ambipolar
 Non
Non--inverting logic possible: actuation independent of
drain/source voltages

Mechanical delay (~10ns) >> electrical t (~1ps)
 Can stack 200 devices (with 1kW on-resistance) before
electrical delay is comparable to mechanical delay
10.1
ICCAD 2008 – November 12, 2008
10
Digital Circuit Design with Relays

CMOS: delay set by electrical time constant



Relays: delay dominated by mechanical movement


10.1
Quadratic delay penalty for stacking devices (pass transistor logic)
Buffer & distribute logical/electrical effort over many stages
Want all relays to switch simultaneously
Implement logic as a single complex gate
ICCAD 2008 – November 12, 2008
11
Digital Circuit Design with Relays
4 gate delays

Delay Comparison vs. CMOS



Single mechanical delay vs. several electrical gate delays
For reasonable loads, relay delay unaffected by fan-out/fan-in
Area Comparison vs. CMOS


10.1
1 mechanical delay
Larger individual devices
Fewer devices needed to implement the same logic function
ICCAD 2008 – November 12, 2008
12
CMOS vs. Relays: Digital Logic
Normalized Energy/op
100
80
60
40
20
0 1
10

2
10
3
4
10
10
1/throughput (ps/op)
5
10
Most processor components exhibit the energy vs.
performance tradeoff of static CMOS
 Control, Datapath
Datapath,, Clock

10.1
Adder energy performance tradeoff is representative
ICCAD 2008 – November 12, 2008
13
Relay--based Adder
Relay

Full adder cell:




10.1
12 relays vs. 24
transistors
XOR for free
Use fully complementary
signals to avoid
additional mechanical
delay (to invert)
Relays all sized
minimally
ICCAD 2008 – November 12, 2008
14
N-bit RelayRelay-based Adder



10.1
Ripple carry
configuration
Cascade full adder
cells to create larger
complex gate
Stack of N relays—
relays—
still 1 mechanical
delay
ICCAD 2008 – November 12, 2008
15
Snapshot: NEM Relays vs. CMOS
Energy/op vs. Delay/op across Vdd
9x
10x
32-bit adder
CMOS
Relay
Supply
Voltage
0.5 V
0.32 - 0.9 V
Load Cap
per Output
25 fF
25 fF
Total Gate
Cap
4.0 pF
125 fF
600 mm2
480 mm2
Area
Energy is for adder only


Compare vs. Sklansky CMOS adder1
For similar area:
 >9x lower energy/op, >10x greater delay
1D.
10.1
Patil et. al., “Robust Energy-Efficient Adder Topologies,” in Proc. 18th IEEE Symp. on Computer Arithmetic (ARITH'07).
ICCAD 2008 – November 12, 2008
16
Energy--Delay Results
Energy

Relays always provide a more energy efficient solution

> 10x better, even in the GOPS range
Energy/op vs. Delay/op across Vdd & CL

30x capacitance gain



2.4x Vdd gain

10.1
ICCAD 2008 – November 12, 2008
Lower device Cg, Cd
Fewer devices
No leakage energy
17
Area vs. Throughput Tradeoff
30
W Relay (buffered, 25fF)
A
Relay
/A
CMOS
25
20
W Relay (buffered, 100fF)
15
Au Relay (25fF)
10
Au Relay (100fF)
5
Erelay = 20fJ/op
0.5


2.5
3
For high throughputs, relays require area overhead to
achieve similar throughputs as CMOS
Area overhead is bounded (max. 6x6x-25x) :

10.1
1
1.5
2
Throughput (GOPS)
To maintain minimum energy/op for throughputs > 1GOPS,
CMOS needs parallelism too
ICCAD 2008 – November 12, 2008
18
CMOS vs. Relays: Mixed Signal

I/O energy dominant if core is ~30x more efficient
 See if I/O circuits can benefit as well
 Representative examples: DAC & ADC

How to process/generate analog signals?
 No ―linear gain‖ in relays – rely on switching
10.1
ICCAD 2008 – November 12, 2008
19
NEM Relay Based DAC
DAC Architecture
Voltage Output
AC Impedance
EBIT 



Need to add passive R to set impedance
Don’t have to buffer up or size relays to drive output
Relays’ gate voltage independent of I/O voltage
Energy dominated by I/O energy:

10.1
2
VIO2 CL
Same topology as CMOS except:



@VIO=200mV, CL=1pF: 62.8fJ/bit vs. ~780fJ/bit for CMOS
ICCAD 2008 – November 12, 2008
20
NEM Relay Based Flash ADC

Topologically similar to CMOS



Key differences…


10.1
Sample/Hold input
Flash Converter – bank of
comparators
Sample & Hold: Unbalanced ―on‖ vs.
―off‖ switching times
Comparators: Body terminal used as
threshold, ambipolar actuation,
hysteretic switching
ICCAD 2008 – November 12, 2008
21
Relay Based Comparators

Body terminal used to set
threshold
VT

Hysteretic switching impact


10.1
Comparator will have different
threshold if previous state varies
Need to reset all comparators to
the same state before each
evaluation  dynamic logic
ICCAD 2008 – November 12, 2008
22
Relay Based Comparators
60
50
Output Code
Max
40
30
VT
Min
20
10
0

-0.2
0.8
1
Comparators above & below input will evaluate  need a different
decoder
Each comparator has a ―dead―dead-zone‖ where it stays off

10.1
0.2
0.4
0.6
Input Voltage (V)
Ambipolar actuation impact


0
Can use ―Max‖ or ―Min‖ of this range to determine code
ICCAD 2008 – November 12, 2008
23
ADC Results
Vcore = 0.3V
C = 500fF
R = 4kW
VREF = 1V
fsamp = 10 MS/s
6-bit res: 5.5fJ/conv

Energy dominated by resistor string (320fJ out of 350fJ)



10.1
Largely dependent on input dynamic range (VREF), not core voltage
FOM improves with more resolution as resistor string still
dominates
Most advantages stem from lower Ron at smaller Cg’s
ICCAD 2008 – November 12, 2008
24
Conclusions

NEM relays offer unique characteristics

Superior Ion/Ioff ratio vs. CMOS
Superior Ron*Cg vs. CMOS

Switching delay largely independent of electrical t


Relay circuits show order of magnitude energy
efficiency improvements over CMOS



10.1
Use parallelism (and area) to achieve comparable
throughputs in digital circuits
Mixed signal circuits see similar benefits from low Ron at
small Cgs
Key challenge is scaling devices reliably
ICCAD 2008 – November 12, 2008
25
Acknowledgements





10.1
Berkeley Wireless Research Center
NSF
DARPA
FCRP (C2S2)
Intel Foundation Graduate Fellowship
ICCAD 2008 – November 12, 2008
26