Implementation of a Simple 8-bit Microprocessor with Reversible Energy Recovery Logic
Seokkee Kim and Soo-Ik Chae
System Design Group
School of Electrical Engineering
Seoul National University
2005 / 05 / 05
Contents
• Introduction to nRERL
• 8-bit nRERL Microprocessor
• Phase Scheduling
• Reversibility Breaking
• Measurement Results
• Future Works
SDGroup, School of Electrical Engineering, SNU
Introduction to nRERL (1)
• nRERL is nMOS Reversible Energy Recovery Logic *)
– A fully adiabatic circuit using reversible logic
– Only nMOS switches are used, by exploiting bootstrapping
– Phase pipelining using 6-phase clocked power
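[Figure: phase-pipelined nRERL stages. Forward blocks F, G, H and reverse blocks F⁻¹, G⁻¹, H⁻¹ connect the signals Xi and Xi+1; the blocks are driven by clocked-power phases φi through φi+5.]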
*) J. Lim, D.-G. Kim, and S.-I Chae, “nMOS reversible energy recovery logic for ultra-low-energy
applications,” IEEE Journal of Solid-State Circuits, vol. 35, no. 6, pp. 865-875, June, 2000.
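The pairing of forward blocks (F, G, H) with reverse blocks (F⁻¹, G⁻¹, H⁻¹) relies on each logic function being invertible. As a minimal sketch of that idea only, the snippet below uses a Feynman (controlled-NOT) gate, which is an illustrative choice and not a gate taken from these slides; the function names are hypothetical.

```python
# Illustrative reversible function: the Feynman (CNOT) gate.
# This gate is an assumption chosen for illustration, not the slides' circuit.

def feynman(a: int, b: int) -> tuple[int, int]:
    """Forward function F: (a, b) -> (a, a XOR b)."""
    return a, a ^ b


def feynman_inverse(p: int, q: int) -> tuple[int, int]:
    """Reverse function F^-1: recovers (a, b) from F's outputs."""
    return p, p ^ q


inputs = [(a, b) for a in (0, 1) for b in (0, 1)]
outputs = [feynman(a, b) for a, b in inputs]

# A reversible function maps distinct inputs to distinct outputs.
is_bijection = len(set(outputs)) == len(inputs)

# Applying the reverse function after the forward one restores the inputs.
round_trip_ok = all(feynman_inverse(*feynman(a, b)) == (a, b) for a, b in inputs)

print(is_bijection, round_trip_ok)
```

Because no two inputs share an output, the input can always be recomputed from the output, which is the property a reverse block needs in order to return charge along the path it came from.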
Introduction to nRERL (2)
[Figure: nRERL gate circuit and timing diagram (T0 to T6). The circuit comprises a forward logic switch (MFL, MFLB), a forward isolation switch (MFI, MFIB), a reverse logic switch (MRL, MRLB), a reverse isolation switch (MRI, MRIB), and a clamp, with internal nodes n1 to n4 between Xi and Xi+1. Waveforms are shown for φi, φi+1, φi+2, φi+3, Xi, Xi+1, n1, and n3, with levels 0, Vdd−Vthb, and Vdd.]
8-bit nRERL Microprocessor (1)
• Issues
– Area vs. Reversibility
: How should we control the reversibility to integrate the microprocessor in the limited silicon area?
– Pipelining vs. Energy
: How should we schedule the phase pipelining to minimize the total energy consumption of the microprocessor?
– Energy vs. Reversibility
: How can we control the reversibility without increasing the total energy consumption of the microprocessor?
8-bit nRERL Microprocessor (2)
• A subset of DLX Instruction Set Architecture
– No floating point instructions
– 19 instructions
• 5 macro-blocks:
– IF → ID → EXE → MEM → WB
– Fully adiabatic circuit
• 6-phase CPG is also integrated
– A shared off-chip inductor is used
[Figure: block diagram of the 8-bit adiabatic microprocessor: Controller, ALU, Program Counter (PC), Branch PC Generator, Register File (16w x 8b), RAM (128w x 8b), and ROM (64w x 20b), with a 6-phase Clocked Power Generator (signals fREF and fOSC, off-chip connection). Legend: clocked power, data flow path.]
Phase scheduling (1)
[Figure: phase scheduling over time steps T0 to T17, with clocked-power phases 0 to 5 repeating. CASE I: cycle-based scheduling. CASE II: phase-based scheduling (best case). CASE III: phase-based scheduling (worst case). Each case shows the Register File, ALU, Memory, and Writeback data stages; one case also includes a Buffer stage.]
Phase scheduling (2)
Time T0 T1 T2 T3 T4 T5 T6 T7 T8 T9 T10 T11 T12 T13 T14 T15 T16 T17 T18 T19 T20 T21 T22 T23
Phase 0 1 2 3 4 5 0 1 2 3 4 5 0 1 2 3 4 5 0 1 2 3 4 5
[Figure: phase-level schedule of the pipeline. Legend: Buffer, MUX, data path, control signal. Stages: Instruction Fetch, Instruction Decoding / Register Fetch, Execution, Memory Access, Writeback, Overhead. Blocks and signals shown: ROM, Register File, ALU, RAM, PC register, page register, Control, Eqcheck, branch PC generation, PC increment, decoded instructions, external instruction, forward data, write data, Memdata, write to register, Branch, Flush.]
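The Time and Phase rows of the schedule pair each time step with one of six repeating phases. A minimal sketch of that mapping follows; `phase_of` and `cycle_of` restate the rows shown on the slide, while `hold_steps` is a hypothetical helper (not from the slides) that counts how long a value would have to be held if a consumer accepted it only at a given phase, which is one way to think about phase-aligning buffers.

```python
NUM_PHASES = 6  # six-phase clocked power, as stated in the slides


def phase_of(t: int) -> int:
    """Phase index active at time step T<t> (T0 -> 0, T6 -> 0, T23 -> 5)."""
    return t % NUM_PHASES


def cycle_of(t: int) -> int:
    """Index of the full six-phase cycle that time step T<t> falls in."""
    return t // NUM_PHASES


def hold_steps(t_ready: int, target_phase: int) -> int:
    """Hypothetical: steps a value ready at T<t_ready> waits for target_phase."""
    return (target_phase - phase_of(t_ready)) % NUM_PHASES


# Rebuild the Time/Phase rows for T0..T23.
schedule = [(f"T{t}", phase_of(t), cycle_of(t)) for t in range(24)]
for label, phase, cycle in schedule:
    print(label, phase, cycle)
```

Under this model, a value ready at T7 that must wait for phase 0 is held for 5 steps, while one ready at T6 is not held at all.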
Reversibility Breaking (1)
• SERC: Self-Energy Recovery Circuit
– Energy recovery with its own data instead of using reversible logic
– Nonadiabatic loss exists ($\frac{1}{2} C V_{thb}^{2}$)
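To give a feel for the size of the SERC loss term, the sketch below reads it as (1/2)·C·Vthb² and compares it with the (1/2)·C·Vdd² dissipated when a node is charged conventionally. The capacitance and Vthb values are hypothetical placeholders, since the slides do not give them; only Vdd = 1.8 V comes from the measurement slide.

```python
def serc_nonadiabatic_loss(capacitance_f: float, vthb_v: float) -> float:
    """Nonadiabatic loss in joules, read as (1/2) * C * Vthb^2."""
    return 0.5 * capacitance_f * vthb_v ** 2


C_NODE_F = 10e-15  # hypothetical node capacitance (10 fF), not from the slides
VTHB_V = 0.5       # hypothetical Vthb value, not from the slides
VDD_V = 1.8        # supply voltage quoted in the measurement results

loss = serc_nonadiabatic_loss(C_NODE_F, VTHB_V)

# Reference point: conventional charging dissipates (1/2) * C * Vdd^2.
conventional = 0.5 * C_NODE_F * VDD_V ** 2

print(f"SERC loss: {loss:.3e} J")
print(f"Fraction of conventional loss: {loss / conventional:.3f}")
```

The ratio depends only on (Vthb / Vdd)², so the assumed capacitance cancels out of the comparison.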
[Figure: SERC circuit and timing diagram (T2 to T8). Signals: Data, Data*, internal nodes n7 and n8, and phases φ0 to φ5; waveform levels 0, Vthb, Vdd−Vthb, and Vdd.]
Reversibility Breaking (2)
• Infinite memory cannot be implemented in the limited silicon area
→ SERC in Memory Cell
• SERC is used for unwrite and refresh operations.
[Figure: memory cell with a read port (bit[n]_out, wd[m]_rd_iso_f2, wd[m]_rd_f3) and a write port (wd[m]_unwr_f4 (ref_f4), wd[m]_unwr_f5 (ref_f4)).]
Measurement Results
• ANAM 0.18 µm (1P6M)
– Core: 2.62 x 2.03 mm²
– CPG: 1.0 x 0.6 mm²
– Vdd = 1.8 V, Vth0 = 0.35 V
– E = 8.5 pJ/cycle (P = 7.5 µW) @ Vdd = 1.8 V, f = 880 kHz
• E_cpg = 4.97 pJ/cycle (58.5%)
• E_core = 3.53 pJ/cycle (41.5%)
[Figure: chip photograph with labeled blocks: microprocessor core (ROM & PC, ALU & register file, Memory, Control), bias generator, CPG, and 6-phase clocked power routing.]
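The measured figures are linked by simple arithmetic: average power is energy per cycle times frequency, and the CPG and core energies add up to the total. A short check, using only the numbers quoted on this slide (the variable names are mine):

```python
E_TOTAL_PJ = 8.5   # total energy per cycle
E_CPG_PJ = 4.97    # clocked power generator
E_CORE_PJ = 3.53   # microprocessor core
FREQ_HZ = 880e3    # operating frequency

# P = E * f, converted from watts to microwatts.
power_uw = E_TOTAL_PJ * 1e-12 * FREQ_HZ * 1e6

cpg_share = 100 * E_CPG_PJ / E_TOTAL_PJ
core_share = 100 * E_CORE_PJ / E_TOTAL_PJ

print(f"P = {power_uw:.2f} uW")          # P = 7.48 uW
print(f"CPG share = {cpg_share:.1f}%")    # CPG share = 58.5%
print(f"Core share = {core_share:.1f}%")  # Core share = 41.5%
```

The product comes out at about 7.5 microwatts, which is consistent with the power quoted next to the 8.5 pJ/cycle figure.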
Hardware Complexity
Block                    # of transistors      Area
                         (portion of core)     (portion of core)
ROM (64w x 20b)          10,000 (13.3%)        0.60 x 0.50 mm² (7.9%)
PC                       17,000 (22.6%)        0.60 x 0.58 mm² (9.2%)
ALU                      5,200 (6.9%)          0.50 x 0.60 mm² (7.9%)
Reg. file (16w x 8b)     7,600 (10.1%)         0.36 x 0.50 mm² (4.8%)
Forward                  400 (0.5%)            0.70 x 0.24 mm² (4.4%)
RAM (128w x 8b)          28,000 (37.2%)        0.65 x 1.30 mm² (22.3%)
Control                  5,700 (7.6%)          1.60 x 0.70 mm² (22.2%)
Phase aligning buffers   1,400 (1.9%)          -
Microprocessor core      75,300 (100%)         2.62 x 2.03 mm² (100%)
CPG                      2,700                 1.00 x 0.60 mm²
Clock routing            -                     0.4 x 7.0 mm²
Total chip               78,000                4.0 x 4.0 mm²
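The transistor counts in the table can be cross-checked against the quoted core and chip totals. The sketch below assumes the two rows whose labels were separated from their values (400 and 28,000 transistors) belong to the Forward and RAM blocks respectively; that assignment is my reading of the table, not something the slide states explicitly.

```python
# Transistor counts per core block, as listed in the hardware complexity table.
core_blocks = {
    "ROM (64w x 20b)": 10_000,
    "PC": 17_000,
    "ALU": 5_200,
    "Reg. file (16w x 8b)": 7_600,
    "Forward": 400,              # assumed row assignment
    "RAM (128w x 8b)": 28_000,   # assumed row assignment
    "Control": 5_700,
    "Phase aligning buffers": 1_400,
}
CPG_TRANSISTORS = 2_700

core_total = sum(core_blocks.values())
chip_total = core_total + CPG_TRANSISTORS

# Share of the core taken by each block, in percent (one decimal place).
shares = {name: round(100 * n / core_total, 1) for name, n in core_blocks.items()}

print(core_total, chip_total)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The sums reproduce the 75,300 and 78,000 totals, and the computed shares match the percentages printed in the table.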
Energy Partitions
• The CPG accounts for more than half of the total energy.
– More optimization is required for the CPG design.
• At the optimal condition, the adiabatic, leakage, and CPG rail-driver energy losses should be equal.
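One way to see why loss terms balance at an optimum is the following sketch. It assumes the usual scaling in which adiabatic loss per cycle grows in proportion to frequency and leakage energy per cycle falls as 1/f; the slides do not state these models, the coefficients are hypothetical, and the CPG rail-driver term is not modeled because its dependence is not given.

```python
import math


def energy_terms(f_hz: float, a: float, b: float) -> tuple[float, float]:
    """Return (adiabatic, leakage) energy per cycle for E(f) = a*f + b/f."""
    return a * f_hz, b / f_hz


# Hypothetical coefficients chosen only for illustration.
A_COEF = 2.0e-18  # adiabatic coefficient, J per Hz
B_COEF = 1.5e-6   # leakage coefficient, J * Hz

# Setting dE/df = a - b/f^2 = 0 gives the minimum-energy frequency.
f_opt = math.sqrt(B_COEF / A_COEF)
adiabatic, leakage = energy_terms(f_opt, A_COEF, B_COEF)

# Numerical cross-check: scan frequencies from 0.5x to 1.5x of f_opt.
grid = [f_opt * (0.5 + 0.01 * k) for k in range(101)]
f_best = min(grid, key=lambda f: sum(energy_terms(f, A_COEF, B_COEF)))

print(f"f_opt = {f_opt:.0f} Hz")
print(f"adiabatic = {adiabatic:.3e} J, leakage = {leakage:.3e} J")
```

At the analytic optimum the two terms come out equal, and the grid scan lands on the same frequency.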
<nRERL microprocessor>
E_total (8.5 pJ/cycle): E_cpg (58.5%), E_core (41.5%)
<Partitioned by functional blocks>
– CPG (clk. driver): 58%
– 128x8b RAM: 21%
– 64x20b ROM: 9%
– ALU8b & reg. file: 6%
– Control & others: 6%
<Partitioned by energy components>
– CPG, rail-driver: 35%
– adiabatic: 21%
– leakage: 20%
– CPG, controller: 16%
– SERC: 8%
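The two pie charts can be turned into absolute energies by scaling each percentage by the 8.5 pJ/cycle total. A small sketch using only the percentages printed on the slide (the dictionary and function names are mine):

```python
E_TOTAL_PJ = 8.5  # total energy per cycle from the slide

by_block_pct = {
    "CPG (clk. driver)": 58,
    "128x8b RAM": 21,
    "64x20b ROM": 9,
    "ALU8b & reg. file": 6,
    "Control & others": 6,
}
by_component_pct = {
    "CPG, rail-driver": 35,
    "adiabatic": 21,
    "leakage": 20,
    "CPG, controller": 16,
    "SERC": 8,
}


def to_pj(shares_pct: dict[str, int]) -> dict[str, float]:
    """Convert percentage shares of E_total into pJ per cycle."""
    return {name: round(E_TOTAL_PJ * pct / 100, 3) for name, pct in shares_pct.items()}


for title, shares in (("blocks", by_block_pct), ("components", by_component_pct)):
    print(title, sum(shares.values()), to_pj(shares))
```

Both partitions sum to 100%, so each is a complete breakdown of the same total.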
Comparisons (1): CMOS vs. nRERL
• Minimum Energy Consumption
[Figure: stacked bar chart of energy loss per cycle [pJ/cycle] for CMOS (52.0 pJ) and nRERL (8.5 pJ), partitioned into ALU8b & reg. file, 64x20b ROM, Control & others, 128x8b RAM, and CPG (clk. driver). Segment labels on the chart: 4.6%, 9.7%, 16.3%, 22.1%, 47.3%.]
Summary
                              8-bit nRERL microprocessor   8-bit CMOS microprocessor
Hardware Complexity
  # of Tr’s                   78,000                       15,000
  Core Area                   2.62 x 2.03 mm²              0.82 x 0.51 mm²
Operating Region
  Supply voltage              1.8 V                        0.8 V ~ 1.8 V
  Frequency                   200 kHz ~ 10 MHz             ~ 1 GHz
Minimum energy consumption    8.5 pJ/cycle                 52.0 pJ/cycle
(optimal condition)           @ Vdd=1.8 V, Vbias=1.5 V,    @ Vdd=0.65 V,
                              f=880 kHz                    f=200 kHz ~ 1 MHz
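The summary figures imply a trade-off that is easy to quantify: lower energy per cycle in exchange for more transistors and more area. The ratios below are computed directly from the numbers in the summary; the dictionary keys are my own naming.

```python
nrerl = {"transistors": 78_000, "core_area_mm2": 2.62 * 2.03, "energy_pj": 8.5}
cmos = {"transistors": 15_000, "core_area_mm2": 0.82 * 0.51, "energy_pj": 52.0}

# How many times less energy per cycle nRERL uses at its minimum-energy point.
energy_saving = cmos["energy_pj"] / nrerl["energy_pj"]

# How many times more hardware nRERL needs.
transistor_overhead = nrerl["transistors"] / cmos["transistors"]
area_overhead = nrerl["core_area_mm2"] / cmos["core_area_mm2"]

print(f"Energy saving:       {energy_saving:.1f}x")        # 6.1x
print(f"Transistor overhead: {transistor_overhead:.1f}x")  # 5.2x
print(f"Core area overhead:  {area_overhead:.1f}x")        # 12.7x
```

Note that the two minimum-energy points are quoted at different supply voltages and frequencies, so the energy ratio compares each design at its own optimum rather than at a common operating point.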
Future Works
• More energy-efficient CPG design is required.
• More study on the complexity reduction is required
for the implementation of more complex circuits.*)
*) Seokkee Kim and S.-I Chae, “Complexity reduction in an adiabatic microprocessor using reversible logic,” to appear in Proc. International Symposium on Low Power Electronics and Design, Aug. 2005.