Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints
Igor Lemberski
Baltic International Academy
Riga, Latvia
e-mail: [email protected]
Petr Fišer
Czech Technical University in Prague
Faculty of Information Technology
e-mail: [email protected]
Outline
Asynchronous circuits model used
Motivation & proposed method
Experimental results
Conclusions
EUROMICRO DSD 2010, Lille
Asynchronous Circuits Model Used
Unbounded delay model
Gate and wire delays are not limited
The circuit is able to recognize the moment when input states have changed
Dual-rail encoding
Positive and negative values of each signal are provided
f(0) = 1, f(1) = 0 – log. 1
f(0) = 0, f(1) = 1 – log. 0
f(0) = 0, f(1) = 0 – space state (spacer)
f(0) = 1, f(1) = 1 – not allowed
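The encoding above can be sketched in a few lines of Python. This is only an illustration: the function names and the use of None for "no data" are assumptions, and rail pairs are written in the order (f(0), f(1)) used on the slide.

```python
from typing import Optional, Tuple

# Rail pairs are written (f0, f1), following the order used on the slide.
SPACER = (0, 0)
LOGIC_1 = (1, 0)
LOGIC_0 = (0, 1)


def encode(bit: Optional[int]) -> Tuple[int, int]:
    """Map a logic value (or None for 'no data') to a dual-rail pair."""
    if bit is None:
        return SPACER
    if bit not in (0, 1):
        raise ValueError("bit must be 0, 1, or None")
    return LOGIC_1 if bit == 1 else LOGIC_0


def decode(pair: Tuple[int, int]) -> Optional[int]:
    """Map a dual-rail pair back to a logic value; None means spacer."""
    if pair == SPACER:
        return None
    if pair == LOGIC_1:
        return 1
    if pair == LOGIC_0:
        return 0
    raise ValueError("(1, 1) is not an allowed dual-rail state")
```

Decoding the pair (1, 1) raises an error, which mirrors the "not allowed" row.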



Four-Phase Discipline
Cycle: inputs in working state (10, 01), then outputs in working state (10, 01), then inputs in space state (00), then outputs in space state (00), then repeat
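As a rough sketch, the four phases can be written as a repeating sequence. The ordering below is the usual four-phase handshake order (inputs become valid, outputs follow, inputs return to the spacer, outputs follow); the names FOUR_PHASE and phases are illustrative assumptions.

```python
from itertools import cycle, islice

# One handshake cycle; each step is (which side changes, its new state).
FOUR_PHASE = [
    ("inputs", "working"),   # inputs leave the spacer: each pair is 10 or 01
    ("outputs", "working"),  # outputs follow with valid data
    ("inputs", "space"),     # inputs return to 00
    ("outputs", "space"),    # outputs return to 00
]


def phases(n_steps):
    """Return the first n_steps phases of the endlessly repeating cycle."""
    return list(islice(cycle(FOUR_PHASE), n_steps))
```

Calling phases(5) walks once around the cycle and returns to the first phase as its fifth element.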
Seitz’s Constraints
Strong constraints
Each output changes its state only when all inputs have changed their state
In contrast to weak constraints
Some outputs are permitted to change their state when some inputs have changed their state
Seitz’s Strong Constraints
Pros
Regularity
Extra completion detection logic not needed
Circuit delay is based on actual gate delays
No additional synchronization chains
Cons
Rather high area and delay
DIMS (Delay-Insensitive Minterm Synthesis)
NCL (Null Convention Logic)
Direct Logic
DIMS (Delay-Insensitive Minterm Synthesis)
2-level implementation
2^n n-input C-elements + n-input OR
Function implemented as sum-of-minterms
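A minimal behavioural sketch of a DIMS node in Python, assuming one Muller C-element per input minterm and an OR per output rail. The class names and the (true_rail, false_rail) pair order are assumptions made for this illustration, not notation from the slides.

```python
from itertools import product


class CElement:
    """Muller C-element: output follows the inputs only when they all agree."""

    def __init__(self):
        self.out = 0

    def update(self, *inputs):
        if all(inputs):
            self.out = 1
        elif not any(inputs):
            self.out = 0
        return self.out  # mixed inputs: hold the previous value


class DimsNode:
    """Dual-rail node for an n-input Boolean function, one C-element per minterm."""

    def __init__(self, func, n):
        self.func = func
        self.minterms = list(product((0, 1), repeat=n))
        self.cells = [CElement() for _ in self.minterms]

    def update(self, rails):
        """rails: one (true_rail, false_rail) pair per input; returns the output pair."""
        true_out = false_out = 0
        for minterm, cell in zip(self.minterms, self.cells):
            # Pick the rail that matches each literal of this minterm.
            literals = [t if bit else f for bit, (t, f) in zip(minterm, rails)]
            fired = cell.update(*literals)
            if self.func(*minterm):
                true_out |= fired   # OR of minterms where the function is 1
            else:
                false_out |= fired  # OR of minterms where the function is 0
        return (true_out, false_out)
```

With node = DimsNode(lambda a, b: a & b, 2), the output stays at the spacer while only one input is valid, and holds its value until both inputs have returned to the spacer. That is the behaviour required by the strong constraints.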
NCL (Null Convention Logic)
Library of 27 special gates
Based on threshold functions
Any function up to 4 inputs can be
implemented
… but in dual-rail, 4 inputs = 2 variables only
Direct Logic
Two-level C-OR DIMS logic implemented as a single gate
Both positive and complemented outputs are provided
Different delays for each input
Comparison
Inputs   DIMS Trans.   DIMS Delay   Direct logic Trans.   Direct logic Delay
2        24            8.2          22                    8.2
3        64            15.1         34                    12.3
4        160           21.3         54                    19.4
5        384           N/A          90                    N/A
6        896           N/A          158                   N/A

NCL 2-input gate   Trans.   Delay
AND, OR            21       5.8
XOR                24       8.6
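The transistor columns can be compared numerically. In the sketch below the counts are copied from the comparison table. The closed form in dims_fit is only an empirical fit noticed in the DIMS column, not a formula given in the slides.

```python
# Transistor counts per gate, copied from the comparison table.
DIMS_TRANSISTORS = {2: 24, 3: 64, 4: 160, 5: 384, 6: 896}
DIRECT_TRANSISTORS = {2: 22, 3: 34, 4: 54, 5: 90, 6: 158}


def dims_fit(n):
    """Empirical fit to the DIMS column (an observation, not a slide formula)."""
    return (n + 1) * 2 ** (n + 1)


# The fit reproduces every DIMS entry in the table.
for n, count in DIMS_TRANSISTORS.items():
    assert dims_fit(n) == count

# Ratio of DIMS to Direct logic transistor counts for each gate size.
ratios = {n: DIMS_TRANSISTORS[n] / DIRECT_TRANSISTORS[n] for n in DIMS_TRANSISTORS}
```

The ratio grows with every added input, so the area penalty of DIMS relative to Direct logic widens as gates get larger.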
Multi-Level Dual-Rail Network
Positive and complemented values of each signal provided
Each node implemented as DIMS, NCL, or Direct logic
Motivation & Proposed Method
State-of-the-art

Nodes are implemented as simple gates (NAND, XOR)
4x 2-input gate = 22*4 = 88 transistors in Direct logic
Motivation & Proposed Method
Proposed

Nodes are implemented as complex gates
1x 2-input gate + 1x 3-input gate = 22 + 34 = 56 transistors
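The two transistor totals (88 for the simple-gate network, 56 for the complex-gate one) follow from the Direct logic column of the comparison table. A small sketch of the arithmetic, where the helper name network_transistors is an assumption for illustration:

```python
# Direct logic transistors per gate, by number of inputs (comparison table).
DIRECT_TRANSISTORS = {2: 22, 3: 34, 4: 54, 5: 90, 6: 158}


def network_transistors(gate_inputs):
    """Total transistors for a network, given each node's input count."""
    return sum(DIRECT_TRANSISTORS[k] for k in gate_inputs)


simple = network_transistors([2, 2, 2, 2])  # four 2-input gates
complex_ = network_transistors([2, 3])      # one 2-input and one 3-input gate
saving = 1 - complex_ / simple              # fraction of transistors saved
```

For this example, simple is 88 and complex_ is 56, a saving of roughly 36%.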
Motivation & Proposed Method
State-of-the-art
Nodes are implemented as simple gates (NAND, XOR)
Proposed
Nodes are implemented as complex gates, i.e. gates of a given number of inputs and any function
Can be implemented both in DIMS and Direct logic
Like FPGA LUTs
Tools for synchronous synthesis can be used: FPGA mapping
Where’s the Problem?
Facts:
Increasing the number of node inputs will:
 Decrease the number of nodes
 Decrease the number of levels
 Increase the node size
 Increase the node delay
Question:
Where is the trade-off?
Experimental Setup
228 circuits processed (MCNC, ISCAS)
Optimized by ABC choice script
1. Mapped into k-input NANDs (ABC map command): state-of-the-art (k-NAND)
2. Mapped into k-LUTs (ABC fpga command): complex gates (k-CG)
3. Mapped into MCNC standard cells (ABC map): something in-between (SC)
k = 2…6
Implemented as DIMS, Direct logic, and NCL
Results – DIMS – Area
[Bar chart: transistors (axis 0 to 16.0M) for 2-NAND to 6-NAND, 2-CG to 6-CG, and SC]
Results – DIMS – Area
[Bar chart: "Best in" percentage (axis 0% to 70%) for 2-NAND to 6-NAND, 2-CG to 6-CG, and SC]
Results – DIMS – Delay
[Bar chart: delay (axis 0 to 25.0k) for 2-NAND to 4-NAND, 2-CG to 4-CG, and SC]
Results – DIMS – Delay
[Bar chart: "Best in" percentage (axis 0% to 60%) for 2-NAND to 4-NAND, 2-CG to 4-CG, and SC]
Discussion - DIMS
Implementation using arbitrary 2-input gates is the best one, both in area and delay
No big surprise: complexity (and delay) of DIMS grows exponentially with the number of gate inputs
Results are consistent – the more node inputs, the higher the area and delay
Results – Direct Logic – Area
[Bar chart: transistors (axis 0 to 3.0M) for 2-NAND to 6-NAND, 2-CG to 6-CG, NCL, and SC]
Results – Direct Logic – Area
[Bar chart: "Best in" percentage (axis 0% to 50%) for 2-NAND to 6-NAND, 2-CG to 6-CG, NCL, and SC]
Results – Direct Logic – Delay
[Bar chart: delay (axis 0 to 20.0k) for 2-NAND to 4-NAND, 2-CG to 4-CG, NCL, and SC]
Results – Direct Logic – Delay
[Bar chart: "Best in" percentage (axis 0% to 70%) for 2-NAND to 4-NAND, 2-CG to 4-CG, NCL, and SC]
Discussion - Direct Logic
Implementation using 3-input complex gates is the best one, both in area and delay
This is a good result confirming our theory
Results are consistent – no coincidence
State-of-the-art 2-NAND implementation is extremely inefficient; 3-CG gives:
 21% area improvement
 19% delay improvement
3-CG implementation is even better than NCL:
 10% area improvement
 19% delay improvement
Conclusions
Efficient implementation of asynchronous logic operating under strong constraints proposed
Tools (& methods) for synchronous synthesis are used for asynchronous synthesis
3-input complex nodes implemented using Direct logic
Extensive experiments confirmed the theory
Approx. 20% area and delay improvement vs. all state-of-the-art methods