Standard Cell Library Design and Optimization with CDM for Deeply Scaled FinFET
Devices.
by
Ashish Joshi, B.E
A Thesis
In
Electrical Engineering
Submitted to the Graduate Faculty
of Texas Tech University in
Partial Fulfillment of
the Requirements for
the Degree of
MASTER OF SCIENCE
IN
ELECTRICAL ENGINEERING
Approved
Dr. Tooraj Nikoubin
Chair of Committee
Dr. Brian Nutter
Dr. Stephen Bayne
Mark Sheridan
Dean of the Graduate School
May, 2016
© Ashish Joshi, 2016
ACKNOWLEDGEMENTS
I would like to sincerely thank my supervisor, Dr. Nikoubin, for giving me the
opportunity to pursue my thesis under his guidance. He has been a commendable
source of support and guidance throughout the journey, and his thoughtful ideas for the
problems I faced have been a tremendous help. His immense knowledge of VLSI design
is a rich source that I have drawn on since the beginning of my research.
I am especially indebted to my thesis committee members, Dr. Bayne and Dr. Nutter.
They have been very gracious and generous with their time, ideas and support. I
appreciate Dr. Nutter’s insights in discussing my ideas and the depth to which he forces
me to think.
I would like to thank Texas Instruments and my colleagues there, Mayank Garg, Jun,
Alex, Amber, William, Wenxiao, Shyam, Toshio and Suchi, for giving me the opportunity
to do a summer internship with them. I continue to be inspired by their hard work and
innovative thinking. I learned a lot during that tenure, and it helped me identify my
field of interest. The internship not only helped me with the technical aspects but also
built my confidence to accept challenges and come up with innovative solutions.
I have had the great pleasure of working with Dr. Li. He helped me understand the
intricacies of analog design, which further strengthened my interest in mixed-signal
design and verification. His guidance to his students goes well beyond the regular
duty of a course instructor.
I am highly indebted and thankful to my family, Dr. Surinder Kumar Joshi, Mrs. Renu
Joshi and Ashinder Joshi, for their continual moral support, encouragement and
confidence in me, without which this would not have been possible.
Lastly, to all my friends, thank you for your understanding and encouragement in many
moments of crisis. I cannot list all the names here, but you are always on my mind.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
I. INTRODUCTION
1.1 Motivation
1.1.1 Why Do We Need Low Power
1.1.2 Why Improve the Standard Cell Based Design flow
1.2 Contribution of the Thesis
1.3 Organization of the Thesis
II. DIGITAL DESIGN FLOW
2.1 Full Custom Design Flow
2.2 Semi-Custom Design Flow
2.2.1 Introduction
2.2.2 ASIC Cell based design flow
2.2.3 Advantages and Limitations of ASIC
2.2.4 Application and Trends
2.2.5 Standard Cell library design in Industry
2.3 Power Dissipation in the CMOS Circuits
2.3.1 Dynamic Power
2.3.1.1 Hazards and Glitch Power
2.3.2 Short Circuit Power Dissipation
2.3.3 Leakage Power Dissipation
III. FinFET vs PLANAR BULK MOSFET DEVICES
3.1 I-V Characteristics
3.2 Drain Induced Barrier Lowering
3.3 Subthreshold Swing
IV. CDM LOGIC STYLE
4.1 CDM with Complementary Outputs Cells
4.2 Feedback and Correction Mechanism
4.3 Performance comparison in CDM and CMOS for complementary outputs
4.3.1 AND-NAND gate implemented in CDM and CMOS
4.3.2 OR-NOR gate implemented in CDM and CMOS Logic Style
4.3.3 3-Input AND-NAND gate in CMOS and CDM Logic Style
4.3.4 3-Input NOR-OR gate implemented in CMOS and CDM Logic Style
4.4 Power Saving with CDM
4.5 CDM with Single Output Cells
4.5.1 AND Gate
4.5.2 OR gate
4.5.3 3-Input AND gate in CDM single output logic style
4.5.4 3-Input OR Gate
4.5.5 Half Adder Comparison
4.5.6 Full Adder Comparison
4.5.7 4:2 Compressor comparison
4.5.8 4 bit by 4 bit Multiplier
4.6 Data Analysis
V. SINGLE OUTPUT CDM STANDARD CELL LIBRARY DESIGN
5.1 Standard Cell Library Design Flow
5.2 Benchmark Circuits
5.3 Synthesis Results with CDM standard cell library
5.4 Data Analysis
5.5 Binary to BCD Converter
5.5.1 Binary to BCD converter in CBLD
5.5.1.1 Introduction
5.5.1.2 Binary to BCD converter architecture
5.5.1.3 Synthesis Results
5.5.1.4 Synthesis Results with CDM standard cell library
VI. CONCLUSION AND FUTURE WORK
Appendix A
DESIGN COMPILER SCRIPT
C-CMOS LOGIC CELL NETLIST
CDM LOGIC CELL NETLIST
SILICON SMART STANDARD CELL LIBRARY CHARACTERIZATION SCRIPT
References
ABSTRACT
In this thesis, we propose a new methodology to achieve a minimum-glitch standard
cell based design. The standard cell library has been built from logic cells
designed in the CDM logic style. The CDM logic style has been analyzed and compared
with the conventional CMOS logic style using FinFET devices in super-threshold
operation. Standard cell libraries with FinFET logic gates in the CDM and static CMOS
logic styles have been developed in five selected technology nodes (7nm, 10nm, 14nm,
16nm & 20nm) and used to synthesize the ISCAS’85 benchmark designs to evaluate the
performance improvement. The Synopsys SiliconSmart and Library Compiler tools have
been used to generate the standard cell libraries from the PTM FinFET device models,
and Design Compiler to synthesize the designs with the developed standard cell libraries.
The simulation results show that the CDM based standard cell library achieves an average
power improvement of 17-21% and an average PDP improvement of 7-26% across all
benchmark designs compared with the conventional CMOS standard cell library in the
7nm, 10nm, 14nm, 16nm and 20nm technology nodes. Hence we demonstrate
that our low power standard cell design is comparable to the contemporary custom
design optimization techniques used to save power in the design.
LIST OF TABLES
Table 4.1. Performance parameter for AND/NAND in CDM & C-CMOS
Table 4.2. Performance parameter for OR/NOR in CDM & C-CMOS
Table 4.3. Performance parameter for 3 input AND/NAND in CDM & C-CMOS
Table 4.4. Performance parameter for 3 input OR/NOR in CDM & C-CMOS
Table 4.5. Performance parameter for 2 input AND in CDM & C-CMOS
Table 4.6. Performance parameter for 2 input OR in CDM & C-CMOS
Table 4.7. Performance parameter for 3 input AND in CDM & C-CMOS
Table 4.8. Performance parameter for 3 input OR in CDM & C-CMOS
Table 4.9. Performance parameter for half adder in CDM & C-CMOS
Table 4.10. Performance parameter for full adder in CDM & C-CMOS
Table 4.11. Performance parameter for 4:2 compressor in CDM & C-CMOS
Table 4.12. Performance parameter for 4 bit multiplier in CDM & C-CMOS
Table 5.1. Logic functions from Single level Single output CDM basic cell
Table 5.2. ISCAS’85 Benchmark Designs
Table 5.3 Synthesis Results with 7nm Standard Cell Library
Table 5.4 Synthesis Results with 10nm Standard Cell Library
Table 5.5 Synthesis Results with 14nm Standard Cell Library
Table 5.6 Synthesis Results with 16nm Standard Cell Library
Table 5.7 Synthesis Results with 20nm Standard Cell Library
Table 5.8. Post-Layout synthesis results with 90nm technology node.
Table 5.9. Pre-Layout Synthesis with 90, 45 & 32nm technology node.
Table 5.10. Pre-Layout synthesis with 14(CMOS), 7 & 5nm (FinFET) technology.
Table 5.11. Power dissipation with CDM and CMOS in 7nm technology node
Table 5.12. Power Delay Product with CDM and CMOS in 7nm
LIST OF FIGURES
Fig.2.1. Full-Custom Design Flow
Fig 2.2. ASIC Design Process [20]
Fig 2.3. An example of the static hazard
Fig 2.4. An Example of the dynamic hazard
Fig 2.5. Effect of load capacitance on short circuit power dissipation
Fig 2.6. Short Circuit Energy Dissipation vs Input rise/fall time
Fig 2.7. Leakage Power in the inverter before occurrence of the transition at Input.
Fig 2.8. Leakage Current vs Drain voltage characteristic of the MOSFET
Fig 3.1. Different structures of the FinFET Devices
Fig 3.2. I-V characteristics of the bulk CMOS (22nm) and FinFET (20nm) devices
Fig 3.3. Ion/Ioff variation for CMOS and FinFET with different supply voltages
Fig 3.4. Drain Current vs Gate Source Voltage for the FinFET and bulk CMOS
Fig 3.5. Drain current versus Gate source voltage while Vds=VDD [13]
Fig 3.6. Gate Dielectric tunneling current in NMOS bulk planar device.
Fig 4.1. Basic Cell representation in CDM logic Style [4]
Fig 4.2. Different feedback circuits to get full swing outputs [4]
Fig 4.3. Schematic of AND-NAND in CDM logic Style
Fig 4.4. Schematic of NAND-AND logic gate in CMOS logic Style
Fig 4.5. Test Bench for NAND/AND implemented in CMOS and CDM logic style
Fig 4.6. Output NAND/AND waveforms for CDM logic gate.
Fig 4.7. Output waveforms from CMOS NAND-AND gate.
Fig 4.8. PDP between CMOS and CDM with varying load capacitance.
Fig 4.9. Delay vs Power plot for AND-NAND in CMOS and CDM logic style.
Fig 4.10. Schematic of NOR-OR in CDM Logic Style
Fig 4.11. Schematic of NOR-OR in CMOS Logic Style
Fig 4.12. Test Bench for performance comparison between CMOS and CDM
Fig 4.13. Output Waveforms for the CDM Logic Style
Fig 4.14. Output Waveform for CMOS logic Style
Fig 4.15. PDP between CMOS & CDM OR-NOR with varying load capacitance
Fig 4.16. Delay vs Power plot for NOR-OR in CMOS and CDM logic style.
Fig 4.17. Schematic of 3 input NAND-AND in CDM Logic Style
Fig 4.18. Test Bench for the 3 input NAND-AND in CMOS and CDM Logic Style
Fig 4.19. Output Waveforms for 3 input CDM NAND-AND logic gate.
Fig 4.20. Output Waveforms for 3 input CMOS NAND-AND logic gate.
Fig 4.21. PDP for 3-NAND-AND in CMOS and CDM with varying load cap.
Fig 4.22. Delay vs Power comparison for 3 Input CMOS and CDM AND-NAND.
Fig 4.23. Schematic of 3 input NOR-OR gate in CDM logic Style
Fig 4.24. Test Bench for 3 input CMOS and CDM NOR-OR gate
Fig 4.25. Output waveforms from the CDM NOR-OR Logic cell.
Fig 4.26. Output Waveforms from CMOS NOR-OR Logic cell.
Fig 4.27. PDP for the CMOS and CDM NOR-OR with varying load capacitance
Fig 4.28. Delay vs Power comparison for CMOS and CDM OR-NOR.
Fig 4.29. Single Output CDM basic cells (a) Single Level (b)-(d) Two Level
Fig 4.30. Schematic of AND gate with CDM single output logic style
Fig 4.31. Test Bench for CMOS and CDM AND gate
Fig 4.32. Output Waveforms from CMOS and CDM AND gate
Fig 4.33. Schematic of the OR gate in the single output CDM logic style
Fig 4.34. Test Bench for CMOS and CDM OR gate.
Fig 4.35. Output waveforms from the CMOS and CDM OR gate.
Fig 4.36. Schematic of 3 input AND gate in CDM single output logic style.
Fig 4.37. Test Bench for CMOS and CDM 3 Input AND gate.
Fig 4.38. Output waveforms from 3 input AND gate in CDM and CMOS.
Fig 4.39. Schematic of 3 Input OR gate in CDM single output logic style.
Fig 4.40. Test Bench for 3 Input CMOS and CDM OR gate.
Fig 4.41. Output Waveforms from 3 Input OR gate in CMOS and CDM.
Fig 4.42. Schematic of Half Adder in single output CDM logic style
Fig 4.43. Test Bench for CDM and CMOS half adder
Fig 4.44. Output waveforms from the half adder in CMOS and CDM logic style.
Fig 4.45. Schematic of Full Adder in CDM single output logic style.
Fig 4.46. Test Bench for CDM and CMOS Full adder.
Fig 4.47. Output waveforms from the Full adder in CMOS and CDM.
Fig 4.48. Schematic of 4:2 compressor in single output CDM logic style.
Fig 4.49. Test Bench for CMOS and CDM 4:2 compressor design
Fig 4.50. Output Waveforms from 4:2 compressor in CMOS and CDM.
Fig 4.51. Schematic of 4 bit by 4 bit multiplier in CDM logic style.
Fig 4.52. Test Bench for CDM and CMOS 4 bit by 4 bit multiplier.
Fig 4.53. Output Waveforms from the Multiplier in CMOS logic style.
Fig 4.54. Output Waveforms from the Multiplier in CDM logic style
Fig 5.1. CDM Logic Cells (a) Single Level (b)-(d) Two Level
Fig 5.2. Standard Cell Library Design Flow
Fig 5.3. ISCAS-85 c6288 16x16 multiplier
Fig 5.4. Full adder module for ISCAS-85 c6288 16x16 multiplier
Fig 5.5. Power Improvement with CDM over CMOS standard cell libraries
Fig 5.6. PDP Improvement with CDM over CMOS standard cell libraries
Fig.5.7. Binary to BCD converter design with CBLD algorithm
Fig 5.8. Normalized Pre-Layout synthesis with 90nm technology.
Fig 5.9 Normalized Post-Layout Synthesis with 90nm technology.
Fig 5.10 Pre-Layout delay result for 14, 7 and 5nm technology node.
Fig 5.11 Pre-Layout PDP results for 14nm (cmos), 7nm & 5nm (FinFET).
Fig 5.12 Power Dissipation with CDM and CMOS in 7nm technology.
Fig 5.13. Power Delay Product with CDM and CMOS in 7nm technology.
CHAPTER 1
INTRODUCTION
The primary contribution of this work is a low-power-driven design methodology based
on a standard cell library. We have designed a standard cell library with a
new logic style. The results obtained are comparable to the power saving figures from
various glitch reduction methodologies tailored for the full custom design flow, thus
reducing the performance gap between the two design styles.
1.1 Motivation
1.1.1 Why Do We Need Low Power
The continual decrease in feature size, the corresponding increase in device density
and high operating frequencies have made power consumption a major concern in
VLSI design. Excessive power consumption in integrated circuits discourages
their use in portable systems. It also causes overheating, which reduces the
reliability and lifetime of the chip. To control the temperature within the chip,
specialized cooling and packaging techniques are used, which further increases the
chip cost. The growing need for portable communication and computing systems has
increased the need for power optimization within the chip. Hence low power design
is a critical technology in the semiconductor industry today. At the same time, we
need to decrease the critical path delay while reducing the overall power consumption
of the chip.
1.1.2 Why Improve the Standard Cell Based Design flow
Standard cell design is a semi-custom design style based on a set of predesigned,
precharacterized standard cells. The design flow uses highly automated synthesis and
place-and-route tools built on highly optimized algorithms. This reduces
the manual effort required to complete the design in silicon. The existing semi-custom
design flow leaves no flexibility to optimize power consumption by reducing
glitch power. Hence there is a strong need to design a standard cell library whose cells
have non-skewed (balanced) outputs, to minimize glitches and their propagation
within the design and thereby minimize the power consumption.
1.2 Contribution of the Thesis
In this thesis, we have successfully designed a standard cell library whose cells have
balanced outputs (for multi-output cells) using a new logic style called CDM, hence
minimizing the glitches within the design. We have applied the proposed technique to
the ISCAS’85 benchmark circuits and found that our methodology is capable of producing
a minimum transient energy design. The standard cell library has been designed with
the FinFET device models from PTM, based on BSIM-CMG, at five different technology
nodes (7nm, 10nm, 14nm, 16nm & 20nm) and has been used to synthesize the benchmark
designs. Simulation results have shown a power improvement of 20-21% and an energy
(PDP) improvement of 7-21% compared with the standard cell libraries designed in the
C-CMOS logic style at the same technology nodes.
1.3 Organization of the Thesis
A detailed explanation of the thesis work is provided in the following chapters.
Chapter 2 reviews the basic digital design flows and the main sources of power
consumption in CMOS digital ICs. Chapter 3 demonstrates the advantages of
FinFET devices over planar bulk CMOS devices with short channels, and hence the
reason for using FinFET devices in designing the standard cell library for technology
nodes such as 7nm and 10nm. Chapter 4 introduces the new logic style, Cell Design
Methodology (CDM). Cells are designed in the CDM logic style with both
complementary outputs and single outputs and compared with their C-CMOS
implementations. The power and PDP improvement obtained for each cell in CDM over
C-CMOS is shown with various simulation results. Chapter 5 describes the flow
used for the standard cell library design and the experimental setup used to prove the
proposed concept, and presents the results from the ISCAS’85 benchmark circuits.
Finally, Chapter 6 presents the conclusions from our experiments and the proposed
future work.
CHAPTER 2
DIGITAL DESIGN FLOW
2.1 Full Custom Design Flow
In the full custom design flow, the design is divided into smaller submodules and each
submodule is designed at the transistor level. The W/L ratio of each transistor and
other parameters are chosen to meet the required module level specifications. The full
custom design flow gives the designer full control over all design parameters,
so highly efficient designs can be produced. Even today,
the full custom design flow is used for building analog blocks and analog ICs.
It can provide the highest performance but has a longer
development time. To complete an application specific integrated circuit quickly
and so reduce the time to market, the cell based (semi-custom)
design flow is gaining more traction. The steps involved in the full custom design flow
are shown in Fig.2.1.
Fig.2.1. Full-Custom Design Flow
2.2 Semi-Custom Design Flow
2.2.1 Introduction
Cell based design is a widely adopted design approach in current application specific
integrated circuit (ASIC) and System on Chip (SoC) designs. Standard cell libraries
are a critical component of the cell based design flow. A standard cell library is a
collection of primitive gates that are used in synthesizing the
behavioral RTL into the gate level netlist for cell based designs. Using a
standard cell library shortens development time, reduces design errors and makes
the design easier to maintain.
2.2.2 ASIC Cell based design flow
An ASIC design flow typically starts with a VHDL or Verilog description of the
design. The description is then synthesized into a gate level netlist, which is placed
and routed to generate the layout. Standard cells are the building blocks of the gate
level netlist and the layout. The design described in Verilog is first simulated for
verification and, if it meets the design specifications, synthesized with the standard cell
library into a gate level netlist and verified again for functionality. If the gate level
netlist meets the design specifications, the netlist is imported into Cadence integrated
circuit front to back (ICFB) as the schematic view and into Cadence SoC Encounter to
generate the place and route view. Once the layout is imported into Cadence SoC
Encounter, physical verification consisting of DRC checks and LVS matching is
done to verify that the layout matches the synthesized netlist. The I/O pads are then
added manually to the design, and the design is submitted to the foundry for fabrication.
Power analysis and static timing analysis (STA) can also be done after the design
synthesis stage. Fig.2.2 shows the complete flow of the ASIC design process.
Fig 2.2. ASIC Design Process [20]
2.2.3 Advantages and Limitations of ASIC
The cell based ASIC top-down approach shortens the design time compared with full
custom design and promotes design reuse to lower the cost. However, ASIC
design performance cannot match full custom design performance in terms of speed.
Microprocessors designed with the full custom approach can work at much higher
frequencies than microprocessors designed using the ASIC/semi-custom design flow.
2.2.4 Application and Trends
ASIC designs have been used in many applications, including home appliances,
telecommunications and medical applications. The increasing demand for more
capability in smaller designs with shorter development time is transforming the
ASIC design methodology from a single processor specific to one application into multiple
systems with a processor on chip. This new generation of ASIC is called SoC, and it is
the trend now leading the design community.
As designers move from ASIC to SoC, standard cell libraries are regaining interest
because they are used extensively in both design approaches and remain
the basic building blocks of both.
2.2.5 Standard Cell library design in Industry
Independent of the methodology adopted, IC design companies use standard cell
libraries to reduce design time and improve the reusability of designs. Because
developing a standard cell library requires large resources, many use libraries
developed by other companies such as Synopsys and Nangate. Some companies
develop their own standard cells for internal use.
At the logic level, the design takes the form of a network of logic gates and their
interconnections. This is a good representation of the design but cannot be used to
determine performance precisely. Library mapping binds the logic level netlist to
the cells available in the standard cell library, which includes primitive gates such as
AND, NAND, NOR and OR. The design representation is still a network, but it
now consists of standard cells with known characteristics, which makes it possible to
analyze the performance of the design more accurately, since the power, delay and area
of each standard cell are known. This level of abstraction gives an
estimate; a more precise one comes from the physical level, which also includes
information about parasitics. Hence the physical design that follows (placement and
routing), called the back end, can change the design performance drastically.
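As a rough illustration of why a mapped netlist can be analyzed once each cell is characterized, the following Python sketch sums per-cell figures over a tiny netlist. The cell names, the area and delay numbers, and the helper functions are all hypothetical placeholders chosen for the example; they are not taken from any real library or from the libraries developed in this thesis, and real tools use load- and slew-dependent lookup tables rather than a single delay per cell.

```python
from collections import Counter

# Hypothetical per-cell characterization data (placeholder values only,
# not from any real standard cell library).
CELL_LIBRARY = {
    "INV": {"area_um2": 0.5, "delay_ps": 10.0},
    "AND2": {"area_um2": 1.0, "delay_ps": 20.0},
    "OR2": {"area_um2": 1.0, "delay_ps": 22.0},
}


def total_area(netlist):
    """Sum the area of every cell instance in a mapped netlist."""
    counts = Counter(cell for _instance, cell in netlist)
    return sum(CELL_LIBRARY[cell]["area_um2"] * n for cell, n in counts.items())


def path_delay(path):
    """Add up the fixed cell delays along one path (a crude first-order model)."""
    return sum(CELL_LIBRARY[cell]["delay_ps"] for cell in path)


# A mapped netlist is modeled here as (instance name, library cell) pairs.
netlist = [("u1", "INV"), ("u2", "AND2"), ("u3", "OR2")]
area = total_area(netlist)
delay = path_delay(["INV", "AND2", "OR2"])
print(f"area = {area} um^2, path delay = {delay} ps")
```

Because this estimate ignores wire parasitics, it corresponds to the pre-layout view; placement and routing can change the numbers considerably.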
2.3 Power Dissipation in the CMOS Circuits
There are three main sources for the power dissipation in the CMOS Circuits:
• Dynamic Power
• Short Circuit Power
• Leakage Power
2.3.1 Dynamic Power
Dynamic power in CMOS circuits is due to the charging and discharging of
the load capacitance driven by the gate. This capacitance consists of the wiring
capacitance of the fanout net and the capacitance of the gate terminals of the transistors
driven by the fanout net. The power dissipation can be calculated from the following
equation:
$$P_{dyn} = \frac{1}{2} C_{load} V_{dd}^{2} f D \qquad (1)$$
Where
• Pdyn: Dynamic Power Dissipation of the Gate
• Cload: Load capacitance of the Gate
• Vdd: Power Supply
• f: Clock frequency
• D: Switching probability
The dynamic power dissipation is thus proportional to the number of transitions
occurring at the gate output, so an accurate estimate of the switching probability in the
circuit provides an estimate of the dynamic power dissipation. In earlier technologies,
dynamic power accounted for most of the power dissipation within the circuit.
With the advent of deep sub-micron technologies, however, the other components of the
total power consumption are also becoming significant. Dynamic power can be
divided into the switching activity necessary for correct functionality and the unnecessary
transitions caused by unbalanced paths and skewed (unbalanced) outputs from
the cells in the circuit. The latter component is called glitch
power and is explained in the next section.
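Equation (1) is easy to evaluate numerically. The short Python sketch below does so for two switching probabilities to show how extra, unnecessary transitions raise the dynamic power in direct proportion. The load, supply, frequency and probability values are assumptions picked only for illustration; they are not measurements from this thesis.

```python
def dynamic_power(c_load, vdd, freq, switching_prob):
    """Equation (1): P_dyn = 0.5 * C_load * Vdd^2 * f * D, in watts."""
    return 0.5 * c_load * vdd ** 2 * freq * switching_prob


# Illustrative values only (assumed, not from the thesis):
# 1 fF load, 0.7 V supply, 1 GHz clock.
C_LOAD = 1e-15
VDD = 0.7
FREQ = 1e9

# Same gate, first with functional switching only, then with added glitches.
necessary = dynamic_power(C_LOAD, VDD, FREQ, switching_prob=0.10)
with_glitches = dynamic_power(C_LOAD, VDD, FREQ, switching_prob=0.15)

print(f"functional switching only: {necessary * 1e9:.2f} nW")
print(f"with glitch activity:      {with_glitches * 1e9:.2f} nW")
print(f"increase: {100 * (with_glitches / necessary - 1):.0f}%")
```

Since the power is linear in D, any reduction in glitch-induced switching activity translates one-to-one into dynamic power savings under this model.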
2.3.1.1 Hazards and Glitch Power
Before the signals of a digital circuit reach steady state, gate outputs can make multiple
transitions. Since the power consumed is proportional to the number of transitions,
these unnecessary transitions increase the power dissipation. They are
called glitches or hazards. Glitches occur because signals arrive at the inputs of a gate
at unequal times. Glitch power contributes significantly to the
overall power dissipation in some typical circuits, such as adders.
Fig 2.3. An example of the static hazard
Fig 2.4. An Example of the dynamic hazard
Consider the example of Fig. 2.3 with each gate having one unit of delay. Due to the
unequal arrival times of the inputs to the AND gate, the output of the AND gate shows
a glitch and transmits a pulse of 1 unit width, which equals the inverter delay. This
is known as a static hazard. In Fig. 2.4, the OR gate introduces a static hazard of 1
unit width, but the output transition consists of three edges (2 rising and 1 falling). This is
called a dynamic hazard with 2 unit widths, which equals the delay of the OR and NOT
gates.
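The static hazard can be reproduced with a small unit-delay simulation. The sketch below assumes a circuit in which an AND gate receives a signal A directly and its complement through an inverter; this is an assumed stand-in, since the exact circuit of Fig. 2.3 is not reproduced here. Ideally the output is always 0, so any 1 at the output is a glitch.

```python
def unit_delay(signal, initial):
    """Delay a sampled waveform by one time unit."""
    return [initial] + signal[:-1]


def static_hazard_demo(a):
    # Inverter with one unit of delay, assumed settled before t = 0.
    not_a = unit_delay([1 - v for v in a], initial=1 - a[0])
    # AND gate with one unit of delay on (A AND NOT A).
    y = unit_delay([x & n for x, n in zip(a, not_a)], initial=0)
    return not_a, y


a = [0, 0, 1, 1, 1, 1]  # A rises at t = 2
not_a, y = static_hazard_demo(a)
print("A     :", a)
print("NOT A :", not_a)
print("Y     :", y)
```

The rising edge of A reaches the AND gate one unit before the inverted signal falls, so both inputs are briefly 1 and the output shows a pulse one unit wide.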
2.3.2
Short Circuit Power Dissipation
Short circuit power dissipation occurs when the gate switches. During the transition,
there is a short time when both the nMOS and pMOS transistors conduct. This effect is equivalent
to shorting the power supply and ground for a short amount of time. The current
flowing during these transitions dissipates power, called the short circuit power
dissipation.
The value of the short circuit current depends on the capacitance connected
to the output of the gate. Consider the example shown in Fig. 2.5. For a larger load
capacitance, the output fall time is significantly larger than the input rise time;
conversely, for a low load capacitance, the output fall time is substantially smaller than
the input rise time. The amount of short circuit power dissipation can be calculated
by the following formula:
$$ P_{short\text{-}circuit} = \frac{\beta}{12}\, (V_{dd} - 2V_{T})^{3}\, \frac{\tau}{T} \tag{2} $$
Where

Pshort-circuit=Short Circuit Dissipation

β : gain factor of the gate

Vdd: Power Supply

τ : rise or fall time of the Input Signal

T: Clock period
Fig 2.5. Effect of load capacitance on short circuit power dissipation
Fig 2.6. Short Circuit Energy Dissipation vs Input rise/fall time
Fig. 2.6 plots the energy dissipation versus the ratio of the input rise/fall time to
the propagation delay. The graph shows that short circuit power
dissipation increases with the input rise and fall time. For most ICs,
the short circuit power dissipation is 5-10% of the overall power dissipation.
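The dependence of equation (2) on the input transition time can be checked numerically. The sketch below uses hypothetical parameter values (gain factor, threshold voltage, transition time and clock period are all assumptions for illustration) and returns zero when Vdd does not exceed 2VT, since in that case the two transistors never conduct at the same time.

```python
def short_circuit_power(beta, v_dd, v_t, tau, period):
    """Equation (2): P = beta/12 * (Vdd - 2*VT)^3 * tau / T, in watts."""
    if v_dd <= 2 * v_t:
        # nMOS and pMOS are never on together, so no short circuit current.
        return 0.0
    return beta / 12.0 * (v_dd - 2 * v_t) ** 3 * tau / period


# Hypothetical values for illustration only.
BETA = 1e-4      # gain factor in A/V^2
V_DD = 1.0       # supply voltage in volts
V_T = 0.3        # threshold voltage in volts
PERIOD = 1e-9    # clock period: 1 ns

p_fast = short_circuit_power(BETA, V_DD, V_T, tau=50e-12, period=PERIOD)
p_slow = short_circuit_power(BETA, V_DD, V_T, tau=100e-12, period=PERIOD)
print(f"tau =  50 ps: {p_fast * 1e9:.2f} nW")
print(f"tau = 100 ps: {p_slow * 1e9:.2f} nW")
```

Under these assumptions, a slower input edge increases the short circuit power in direct proportion, in line with the trend described for Fig. 2.6.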
2.3.3
Leakage Power Dissipation
There are two types of leakage current: reverse bias leakage at the transistor drains,
and sub-threshold leakage through the channel of the device. The magnitude of these
currents is predominantly determined by the processing technology. However, there are
some things that the designer can play with to minimize their contribution.
Diode leakage current flows when a transistor is turned off and another active device
charges up/down the drain with respect to the bulk potential. Consider the inverter in
Fig.3.5 when it is given a high input. The pMOS transistor is off and the nMOS
transistor pulls the drain-to-bulk potential of the pMOS to -Vdd. The diode current
thus flowing through the junction is given by the expression:
$$ I = A_{d}\, J_{s} \tag{3} $$
Where

Ad: area of the diffusion at the junction of drain and body

Js: leakage current density; set by the technology
It is desirable to reduce both quantities. The leakage current density increases
with temperature.
The other component of the leakage power dissipation is sub-threshold conduction. In
the inverter shown in Fig. 2.7, when the transistor is turned off there is still current
flowing through the channel due to the drain-source voltage Vds; this current is
called the subthreshold current. The plot of the drain current versus Vds is shown in
Fig. 2.8 and it exhibits an exponential relation in the sub-threshold conduction
region. This is due to the decrease in the threshold voltage with increasing Vds. In
other words, the width of the drain junction depletion region increases with increasing
Vds. This effect is known as Drain Induced Barrier Lowering and causes a sharp
increase in the current. The magnitude of the sub-threshold current is a function
of the process, device size and supply voltage and is given by the following formula:
$$ I = \mu_{0}\, C_{ox}\, \frac{W_{eff}}{L_{eff}}\, V_{t}^{2}\, e^{1.8}\, e^{\frac{V_{gs} - V_{th}}{n V_{t}}} \left[ 1 - e^{-V_{ds}/V_{t}} \right] \tag{4} $$
In the above equation, the process parameter that affects the value of the
subthreshold current is the threshold voltage. Reducing the threshold voltage
significantly increases the subthreshold current. The subthreshold current also
increases with the supply voltage and the device size.
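The exponential sensitivity to the threshold voltage can be made concrete with a short calculation. The sketch below assumes the usual form of the subthreshold equation, in which the exponent is divided by n times the thermal voltage; the device values (mobility-capacitance product, W/L ratio, slope factor n, threshold voltage) are hypothetical and chosen only for illustration.

```python
import math


def subthreshold_current(mu0_cox, w_over_l, v_gs, v_th, v_ds,
                         n=1.5, v_thermal=0.0259):
    """Subthreshold drain current in amperes (standard textbook form)."""
    return (mu0_cox * w_over_l * v_thermal ** 2 * math.exp(1.8)
            * math.exp((v_gs - v_th) / (n * v_thermal))
            * (1.0 - math.exp(-v_ds / v_thermal)))


# Hypothetical device values (assumptions, not PTM model parameters).
MU0_COX = 3e-4   # A/V^2
W_OVER_L = 2.0
N = 1.5          # subthreshold slope factor
V_THERMAL = 0.0259  # thermal voltage near room temperature, in volts

i_ref = subthreshold_current(MU0_COX, W_OVER_L, v_gs=0.0, v_th=0.30, v_ds=0.7,
                             n=N, v_thermal=V_THERMAL)
# Lowering Vth by n * Vt * ln(10) should raise the off current by one decade.
delta = N * V_THERMAL * math.log(10)
i_low_vth = subthreshold_current(MU0_COX, W_OVER_L, v_gs=0.0, v_th=0.30 - delta,
                                 v_ds=0.7, n=N, v_thermal=V_THERMAL)
print(f"Vth shift of {delta * 1e3:.1f} mV changes the current by "
      f"{i_low_vth / i_ref:.2f}x")
```

With these assumed values, a threshold reduction of roughly 90 mV is enough to increase the off-state current tenfold.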
Fig 2.7. Leakage Power in the inverter before occurrence of the transition at Input.
Fig 2.8. Leakage Current vs Drain voltage characteristic of the MOSFET
CHAPTER 3
FinFET vs PLANAR BULK MOSFET DEVICES
As CMOS devices shrink into the nanometer regime, short channel effects, subthreshold
swing, gate induced drain leakage and variation in the process parameters become more
severe, degrading both the reliability and the performance of the circuit. FinFET is one
of the promising technologies for addressing these issues without sacrificing the
reliability and performance of the design. Moore’s law
motivates technology scaling in order to improve performance features such as
power, area, and speed. While circuits and systems benefit from
scaling down the technology, undesired features such as short channel effects (SCE) and
sensitivity to the process parameters increase. SCE, which include limitations imposed on
electron drift characteristics in the channel and threshold voltage variation, together with
the reduction in Ion/Ioff and the increase in leakage current, have made the use of bulk CMOS
transistors in sub 22nm technologies impossible. The increase in leakage current also
raises the static power consumption of the circuits.
SCE can be reduced by using a thinner gate oxide, but this leads to higher gate leakage
current due to tunneling, which increases the total power consumption and reduces the device
reliability. FinFET is considered the best candidate compared with bulk CMOS in
deep sub-micron technologies. The thin silicon fin in a FinFET plays the role of the
channel and conducts electrons between source and drain. Fig. 3.1 shows the different
structures of the FinFET. As shown, the channel is surrounded on three sides by the
gate, which results in superior control of the channel. It also reduces the short channel effect
because the fully depleted channel is less sensitive to process variations.
Fig 3.1. Different structures of the FinFET Devices
3.1
I-V Characteristics
Fig. 3.2 shows the I-V characteristics of the bulk CMOS (22nm) and FinFET (20nm)
transistors when Vgs is swept from 0 to 0.9 V. It can be seen that in the strong
inversion region the ON current of the FinFET is higher than that of bulk CMOS,
and the FinFET also has a higher output resistance because the channel is surrounded by the gate
on 3 sides, providing better control.
Fig 3.2. I-V characteristics of the bulk CMOS (22nm) and FinFET (20nm) devices
Fig. 3.3 shows the Ion/Ioff ratio versus supply voltage for both devices. For lower
supply voltages the Ion/Ioff ratio is higher for the FinFET than for CMOS, while for higher
supply voltages (above 0.72 V) it is higher for CMOS. This is because at
higher supply voltages CMOS has the lower Ioff, whereas at lower supply voltages the Ioff of
CMOS is comparable with that of the FinFET while the FinFET has a higher Ion than
CMOS.
Fig 3.3. Ion/Ioff variation for CMOS and FinFET with different supply voltages
3.2
Drain Induced Barrier Lowering
Fig 3.4. Drain Current vs Gate Source Voltage for the FinFET and bulk CMOS
Fig. 3.4 shows the variation of the drain current with gate-to-source voltage for the
FinFET and bulk CMOS devices when Vds is 0.1 V and 1.1 V. It can be observed from
the figure that the threshold voltage decreases with increase in the drain source voltage for
short channel devices. This effect is called Drain Induced Barrier Lowering (DIBL).
An increase in the drain voltage causes the depletion region at the drain to penetrate further into
the channel. DIBL is higher for the bulk CMOS devices
(124 mV/V) than for the FinFET devices (58 mV/V), which shows the lower
threshold variation due to short channel effects in the FinFET devices. Another important observation
from the figure is the lower threshold voltage of the FinFET devices compared with
CMOS devices, which is one of the reasons for the higher Ion/Ioff ratio.
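The DIBL figures quoted above can be read as a threshold voltage shift per volt of drain bias. The sketch below shows the conventional two-point definition and applies the quoted coefficients to the two drain voltages used in Fig. 3.4 (0.1 V and 1.1 V). The absolute threshold voltages in the example are hypothetical and serve only to exercise the function.

```python
def dibl_mv_per_v(vth_at_low_vds, vth_at_high_vds, vds_low, vds_high):
    """DIBL coefficient in mV/V from two threshold voltage measurements."""
    return (vth_at_low_vds - vth_at_high_vds) * 1e3 / (vds_high - vds_low)


def threshold_shift_mv(dibl, vds_low, vds_high):
    """Threshold voltage reduction in mV for a given DIBL coefficient."""
    return dibl * (vds_high - vds_low)


VDS_LOW, VDS_HIGH = 0.1, 1.1  # drain voltages used in Fig. 3.4

shift_bulk = threshold_shift_mv(124.0, VDS_LOW, VDS_HIGH)
shift_finfet = threshold_shift_mv(58.0, VDS_LOW, VDS_HIGH)
print(f"bulk CMOS threshold shift: {shift_bulk:.0f} mV")
print(f"FinFET threshold shift:    {shift_finfet:.0f} mV")

# Hypothetical threshold voltages chosen to give back the bulk coefficient.
example = dibl_mv_per_v(0.400, 0.276, VDS_LOW, VDS_HIGH)
```

Because the two drain voltages are 1 V apart, the quoted coefficients translate directly into threshold shifts of 124 mV for bulk CMOS and 58 mV for the FinFET.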
3.3
Subthreshold Swing
Fig. 3.4 also shows that the subthreshold swing of the FinFET is 21% lower than that of bulk
CMOS at room temperature. The subthreshold swing of a device is defined as the change
in gate voltage required to increase the drain current by a decade. A lower swing indicates
a stronger dependence of the drain current on the gate voltage in the FinFET devices. Hence the
drain current increases at a faster rate with the gate source voltage for the
FinFET devices.
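The definition of subthreshold swing can be applied directly to two points on the subthreshold part of an Id-Vgs curve. The sketch below uses hypothetical data points (the thesis gives only the relative 21% improvement, not absolute swing values) to show the calculation.

```python
import math


def subthreshold_swing_mv_per_dec(vgs1, id1, vgs2, id2):
    """Gate voltage change (mV) needed for a tenfold change in drain current."""
    decades = math.log10(id2) - math.log10(id1)
    return (vgs2 - vgs1) * 1e3 / decades


# Hypothetical bulk CMOS points: one decade of current over 90 mV of Vgs.
ss_bulk = subthreshold_swing_mv_per_dec(0.10, 1e-9, 0.19, 1e-8)
# A swing 21% lower, as reported for the FinFET relative to bulk CMOS.
ss_finfet = ss_bulk * (1.0 - 0.21)
print(f"bulk CMOS (assumed): {ss_bulk:.1f} mV/decade")
print(f"FinFET (21% lower):  {ss_finfet:.1f} mV/decade")
```

A smaller swing means that fewer millivolts of gate drive are needed per decade of current, so the device switches between its off and on states more sharply.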
3.4
Gate Induced Drain Leakage
Gate leakage current is a major concern in nanoscale devices. Gate Induced
Drain Leakage (GIDL) in bulk CMOS devices occurs due to the lateral diffusion of the
source and drain regions. To evaluate the gate induced drain leakage, the gate voltage
is swept from negative to positive values.
Fig 3.5. Drain current versus Gate source voltage while Vds=VDD [13]
It can be seen from Fig. 3.5 that the behavior of the FinFET devices differs from that of
the bulk CMOS devices for negative gate source voltages, with the FinFET showing
better GIDL behavior. With negative Vgs, the drain current of both devices
decreases, but for CMOS it stays constant until Vgs < -0.1 V and increases rapidly at
Vgs < -0.3 V. Higher negative gate voltages in the bulk CMOS devices result in band
bending at the polysilicon, oxide and p well interface as shown in Fig. 3.6, so that
electrons from the valence band in the p well tunnel to the conduction band of the n+ region,
increasing the gate leakage current.
Fig 3.6. Gate Dielectric tunneling current in NMOS bulk planar device.
From all the above results, it can be observed that the FinFET possesses better characteristics
than bulk CMOS devices for short channels. FinFETs provide better
control over the channel with less leakage and subthreshold conduction, both of which
contribute significantly to the overall power dissipation. Also, with independent
gate FinFET devices, the threshold voltage of the device can be changed dynamically by
connecting the back gate to a reverse bias voltage, further reducing the off state current.
The better off state current performance of FinFET devices is the motivation for designing
the standard cell libraries in sub-micron technologies with FinFET devices.
CHAPTER 4
CDM LOGIC STYLE
Power reduction is a serious concern nowadays. As MOS devices are widespread,
there is a need for low power designs, mainly for portable devices which run on
batteries. CDM is a better way to implement circuits designed for low
power applications. The CDM logic style can have a single output or two complementary outputs.
The complementary outputs from CDM logic cells are balanced with each other,
which results in no glitches since both complementary outputs are available at the
same instant. It has also been observed that cells with multiple outputs (e.g. half adder,
full adder) in the CDM logic style have non-skewed outputs [9]. Non-skewed outputs
help minimize glitch propagation and hence save power. We have analyzed
the performance of primitive logic cells with complementary outputs and single outputs
implemented in the CDM logic style and compared them with their C-CMOS logic style
implementations.
The CDM logic style allows inputs to be tied to the source and drain of the transistors,
creating possible situations where an NFET has to drive logic 1 and a PFET has to
drive logic 0. Since an NFET is not a good pull-up device, the output voltage suffers
from a threshold voltage drop; therefore, different feedback mechanisms have been devised
to achieve a full swing output with sufficient drive capability.
4.1
CDM with Complementary Outputs Cells
Fig 4.1 shows the typical CDM design with complementary outputs.
Fig 4.1. Basic Cell representation in CDM logic Style [4]
In the process of designing balanced complementary circuits we have two independent
inputs and two complementary outputs. The elementary basic cell
presented in Fig. 4.1 has four elements that determine the two outputs (two elements
per output). Each element is a transistor and has two input controls, i.e., the
gate and either the drain or the source.
The input signals (applied to the two input terminals of these transistors) and the
selection of pMOS and nMOS transistors decide the various output states. As presented in
Fig. 4.1, we refer to the input pins (IN1 to IN4) as A or B, or their complements
respectively. We also assume that pins G1 to G4 can be A or B or their
complements. This form of the circuit (the elementary basic cell) is power-less and
ground-less (P-/G-). Therefore, the complementary outputs are affected only by the input
drivability, which charges or discharges them.
4.2
Feedback and Correction Mechanism
All circuits with complementary outputs have the ability to optionally determine the
state of one output, or amplify it, through the use of the other output and a suitable transistor.
The transistor or transistors placed between the two outputs, so that activating the first
output influences the second, are called feedback networks. This
feedback network is placed between the two complementary outputs; it eliminates the high
impedance output states and replaces them with the desired levels. It also makes it
possible to ensure full swing operation at the outputs. As the different basic cell versions
presented in this work come with different shortcomings, the required feedback
network differs. In Fig. 4.2 we present four such networks: Fp, Fn, Fc and
Fnp. Fp is a feedback network using two pMOS transistors. Fn is a feedback network
with two nMOS transistors. Fc is a complementary feedback network, and Fnp includes
nMOS and pMOS transistors placed between the two complementary outputs Y and 𝑌̅.
Note that the driving capability of the feedback networks is improved because they use VDD and
GND connections.
Fig 4.2. Different feedback circuits to get full swing outputs [4]
In the next section, various logic cells are designed in the complementary output
CDM logic style and compared with the same cells designed in the C-CMOS logic style.
4.3
Performance comparison in CDM and CMOS for complementary outputs
Basic logic cells have been implemented with 7nm FinFET devices in both the CDM and
CMOS logic styles with complementary outputs. The schematics, output waveforms,
PDP variation with load, and power consumption and delay comparison for the various cells
are shown in the following figures.
4.3.1
AND-NAND gate implemented in CDM and CMOS
Fig 4.3. Schematic of AND-NAND in CDM logic Style
Fig 4.4. Schematic of NAND-AND logic gate in CMOS logic Style.
Fig 4.5. Test Bench for NAND/AND implemented in CMOS and CDM logic style
Fig 4.6. Output NAND/AND waveforms for CDM logic gate.
Fig 4.7. Output waveforms from CMOS NAND-AND gate.
Fig 4.8. PDP between CMOS and CDM with varying load capacitance.
Fig 4.9. Delay vs Power plot for AND-NAND in CMOS and CDM logic style.
Table 4.1 shows the various performance parameters for the AND-NAND logic cell in
CMOS and CDM with 1 fF load capacitance.
Table 4.1. Performance parameters for AND/NAND in CDM & C-CMOS
Parameters           CDM      CMOS
Delay (in ps)        22.98    18.22
Power (in nW)        129.6    136.8
PDP (in 10^-18 J)    2.979    2.492
4.3.2
OR-NOR gate implemented in CDM and CMOS Logic Style
Fig 4.10. Schematic of NOR-OR in CDM Logic Style
Fig 4.11. Schematic of NOR-OR in CMOS Logic Style
Fig 4.12. Test Bench for performance comparison between CMOS and CDM
Fig 4.13. Output Waveforms for the CDM Logic Style
Fig 4.14. Output Waveform for CMOS logic Style
Fig 4.15. PDP between CMOS & CDM OR-NOR with varying load capacitance
Fig 4.16. Delay vs Power plot for NOR-OR in CMOS and CDM logic style.
Table 4.2 shows the performance parameter comparison of the 2 input OR-NOR between
CMOS and CDM with 1 fF load capacitance.
Table 4.2. Performance parameters for OR/NOR in CDM & C-CMOS
Parameters           CDM      CMOS
Delay (in ps)        7.609    12.08
Power (in nW)        100.1    98.25
PDP (in 10^-18 J)    0.7617   1.037
4.3.3
3-Input AND-NAND gate in CMOS and CDM Logic Style
Fig 4.17. Schematic of 3 input NAND-AND in CDM Logic Style
Fig 4.18. Test Bench for the 3 input NAND-AND in CMOS and CDM Logic Style
Fig 4.19. Output Waveforms for 3 input CDM NAND-AND logic gate.
Fig 4.20. Output Waveforms for 3 input CMOS NAND-AND logic gate.
Fig 4.21. PDP for 3-NAND-AND in CMOS and CDM with varying load cap.
Fig 4.22. Delay vs Power comparison for 3 Input CMOS and CDM AND-NAND.
Table 4.3 shows the performance parameter comparison of the 3 input AND/NAND
between CMOS and CDM with 1 fF load capacitance.
Table 4.3. Performance parameters for 3 input AND/NAND in CDM & C-CMOS
Parameters           CDM      CMOS
Delay (in ps)        13.74    16.71
Power (in nW)        114.5    89.83
PDP (in 10^-18 J)    1.573    1.501
4.3.4
3-Input NOR-OR gate implemented in CMOS and CDM Logic Style
Fig 4.23. Schematic of 3 input NOR-OR gate in CDM logic Style
Fig 4.24. Test Bench for 3 input CMOS and CDM NOR-OR gate
Fig 4.25. Output waveforms from the CDM NOR-OR Logic cell.
Fig 4.26. Output Waveforms from CMOS NOR-OR Logic cell.
Fig 4.27. PDP for the CMOS and CDM NOR-OR with varying load capacitance
Fig 4.28. Delay vs Power comparison for CMOS and CDM OR-NOR.
Table 4.4 shows the performance parameter comparison of the 3 input OR/NOR between
CMOS and CDM with 1 fF load capacitance.
Table 4.4. Performance parameters for 3 input OR/NOR in CDM & C-CMOS
Parameters           CDM      CMOS
Delay (in ps)        10.59    16.04
Power (in nW)        131.9    104
PDP (in 10^-18 J)    1.398    1.667
From the above simulation results, it can be seen that CDM logic is more efficient than
CMOS logic for complementary outputs with balanced and
symmetrical designs, as the lower PDP of the OR-NOR gates shows (Tables 4.2 and 4.4).
The CMOS logic style works better for the NAND-AND logic gates (Tables 4.1 and 4.3);
hence CMOS standard cell libraries are more efficient for NAND/AND intensive
design synthesis.
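The comparison can be summarized by recomputing the power-delay product from the delay and power entries of Tables 4.1 to 4.4. The sketch below is only a bookkeeping aid: it multiplies the tabulated delay (ps) and power (nW) and expresses the result in units of 10^-18 J. The recomputed product need not match the tabulated PDP exactly, since the tabulated values come from the simulator.

```python
# (delay in ps, power in nW) as listed in Tables 4.1 to 4.4.
RESULTS = {
    "AND-NAND (2 input)": {"CDM": (22.98, 129.6), "CMOS": (18.22, 136.8)},
    "OR-NOR (2 input)":   {"CDM": (7.609, 100.1), "CMOS": (12.08, 98.25)},
    "AND-NAND (3 input)": {"CDM": (13.74, 114.5), "CMOS": (16.71, 89.83)},
    "OR-NOR (3 input)":   {"CDM": (10.59, 131.9), "CMOS": (16.04, 104.0)},
}


def pdp_attojoule(delay_ps, power_nw):
    """ps * nW = 1e-21 J, so divide by 1000 to get units of 1e-18 J."""
    return delay_ps * power_nw / 1000.0


lower_pdp = {}
for gate, styles in RESULTS.items():
    pdp = {style: pdp_attojoule(*values) for style, values in styles.items()}
    lower_pdp[gate] = min(pdp, key=pdp.get)
    print(f"{gate}: CDM {pdp['CDM']:.3f}, CMOS {pdp['CMOS']:.3f} "
          f"-> lower PDP: {lower_pdp[gate]}")
```

On these numbers the OR-NOR cells favor CDM and the AND-NAND cells favor CMOS, which is the pattern described in the text.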
4.4
Power Saving with CDM
The Power consumption of the circuit can be reduced by considering the following
parameters:

Switching Activity in the circuit.

Switching capacitance of each node.

Supply Voltage

Short Circuit current

Leakage current
Now, the advantage of CDM comes from the fact that it is well suited to implementing
all of the above power reduction techniques:
1. Switching activity in the circuit can be reduced by eliminating glitches.
CDM designs provide balanced complementary outputs and non-skewed
outputs for cells with multiple outputs, hence reducing the chance of
glitches and the power dissipated by glitch propagation.
2. The switching capacitance of a node in CDM is small compared with the node
in the CMOS design, due to the smaller size of the transistors in the CDM
implementation, which follows from the smaller number of transistors in the critical
path (less parasitic capacitance).
3. As in CMOS technology, the supply voltage can be reduced, but at the cost of
increased circuit delay.
4. There are few ground and power connections, which means fewer VDD to GND
paths during switching. The CDM implementation should therefore draw the least
short circuit power.
5. Leakage current contributes significantly as the feature size shrinks;
to address this problem, FinFET devices have been used in place of
bulk MOS transistors to minimize the leakage power. FinFET devices have better
Ioff performance than bulk MOS transistors.
4.5
CDM with Single Output Cells
Fig 4.29 shows the basic single level and two level logic cells in the single output CDM
logic style.
Fig 4.29. Single Output CDM basic cells (a) Single Level (b)-(d) Two Level
A single output CDM basic cell can be seen as half of the complementary output CDM
basic cell. All the feedback networks used with the complementary output CDM cells
require the outputs to be complementary, which is not the case for single output CDM
cells. Hence we have used an inverter at the output of the cells to get full swing outputs
with enough drive capability.
Single output CDM is observed to be more efficient than static CMOS for
implementing arithmetic circuits such as adders, multipliers and other XOR
intensive circuits. This is shown by the following simulation results for
arithmetic modules such as the half adder, full adder and 4 bit multiplier, and for primitive
gates such as the 2 input and 3 input AND, NAND, OR, NOR and XOR gates.
4.5.1
AND Gate
Fig 4.30. Schematic of AND gate with CDM single output logic style
Fig 4.31. Test Bench for CMOS and CDM AND gate
Fig 4.32. Output Waveforms from CMOS and CDM AND gate
Table 4.5 shows the delay and power consumption of the AND gate implemented in
both logic styles.
Table 4.5. Performance parameters for 2 input AND in CDM & C-CMOS
Parameters           CDM      CMOS
Delay (in ps)        13.15    12.53
Power (in nW)        75.96    75.95
PDP (in 10^-18 J)    0.998    0.951
4.5.2
OR gate
Fig 4.33. Schematic of the OR gate in the single output CDM logic style
Fig 4.34. Test Bench for CMOS and CDM OR gate.
Fig 4.35. Output waveforms from the CMOS and CDM OR gate.
Table 4.6 shows the delay and power consumption of the OR gate implemented in
both logic styles.
Table 4.6. Performance parameters for 2 input OR in CDM & C-CMOS
Parameters           CDM      CMOS
Delay (in ps)        12.21    12.92
Power (in nW)        92.81    87.88
PDP (in 10^-18 J)    1.103    1.135
4.5.3
3-Input AND gate in CDM single output logic style
Fig 4.36. Schematic of 3 input AND gate in CDM single output logic style.
Fig 4.37. Test Bench for CMOS and CDM 3 Input AND gate.
Fig 4.38. Output waveforms from 3 input AND gate in CDM and CMOS.
Table 4.7 shows the delay and power consumption of the 3 input AND gate implemented in
both logic styles.
Table 4.7. Performance parameters for 3 input AND in CDM & C-CMOS
Parameters           CDM      CMOS
Delay (in ps)        20.89    13.14
Power (in nW)        66.1     59.36
PDP (in 10^-18 J)    1.381    0.780
4.5.4
3-Input OR Gate
Fig 4.39. Schematic of 3 Input OR gate in CDM single output logic style.
Fig 4.40. Test Bench for 3 Input CMOS and CDM OR gate.
Fig 4.41. Output Waveforms from 3 Input OR gate in CMOS and CDM.
Table 4.8 shows the delay and power consumption of the 3 input OR gate implemented in
both logic styles.
Table 4.8. Performance parameters for 3 input OR in CDM & C-CMOS
Parameters           CDM      CMOS
Delay (in ps)        18.1     12.81
Power (in nW)        92.95    102.4
PDP (in 10^-18 J)    1.6      1.3
4.5.5
Half Adder Comparison
Fig 4.42. Schematic of Half Adder in single output CDM logic style
Fig 4.43. Test Bench for CDM and CMOS half adder
Fig 4.44. Output waveforms from the half adder in CMOS and CDM logic style.
Table 4.9 shows the delay and power consumption of the half adder
implemented in both logic styles.
Table 4.9. Performance parameters for half adder in CDM & C-CMOS
Parameters           CDM      CMOS
Delay (in ps)        16.23    14.16
Power (in nW)        164      197
PDP (in 10^-18 J)    2.66     2.79
4.5.6
Full Adder Comparison
Fig 4.45. Schematic of Full Adder in CDM single output logic style.
Fig 4.46. Test Bench for CDM and CMOS Full adder.
Fig 4.47. Output waveforms from the Full adder in CMOS and CDM.
Table 4.10 shows the delay and power consumption of the full adder
implemented in both logic styles.
Table 4.10. Performance parameters for full adder in CDM & C-CMOS
Parameters           CDM      CMOS
Delay (in ps)        28.66    19.08
Power (in nW)        462.5    690.2
PDP (in 10^-18 J)    12.7     13.17
4.5.7
4:2 Compressor comparison
A 4:2 compressor has been designed using full adders in both the CMOS and CDM
logic styles. The following figure shows the compressor schematic in the CDM logic
style, built from full adders that are themselves implemented in the CDM logic style.
Fig 4.48. Schematic of 4:2 compressor in single output CDM logic style.
Fig 4.49. Test Bench for CMOS and CDM 4:2 compressor design
Fig 4.50. Output Waveforms from 4:2 compressor in CMOS and CDM.
From Fig 4.50, it can be observed that the CMOS outputs have more glitches than
those of the CDM logic cell; hence the CMOS implementation results in more power
consumption.
Table 4.11 shows the delay and power consumption of the 4:2 compressor
implemented in both logic styles.
Table 4.11. Performance parameters for 4:2 compressor in CDM & C-CMOS
Parameters           CDM      CMOS
Delay (in ps)        26.71    20.65
Power (in nW)        668.4    1092
PDP (in 10^-18 J)    17.85    22.54
4.5.8
4 bit by 4 bit Multiplier
Fig 4.51. Schematic of 4 bit by 4 bit multiplier in CDM logic style.
Fig 4.52. Test Bench for CDM and CMOS 4 bit by 4 bit multiplier.
Fig 4.53. Output Waveforms from the Multiplier in CMOS logic style.
Fig 4.54. Output Waveforms from the Multiplier in CDM logic style
Again for the multiplier design, the figures above show that the
CMOS multiplier has more glitches in its output than the CDM multiplier
and hence consumes more power.
Table 4.12 shows the delay and power consumption of the multiplier implemented in
both logic styles.
Table 4.12. Performance parameters for 4 bit multiplier in CDM & C-CMOS
Parameters           CDM      CMOS
Delay (in ps)        30       24
Power (in nW)        2K       2.8K
PDP (in 10^-18 J)    60       67.2
4.6
Data Analysis
Single output CDM is not universally better than static CMOS for all types of
designs; for NAND/AND intensive circuits, static CMOS can give a better
implementation than single output CDM at smaller load capacitance.
For larger load capacitance, however, the CDM implementation of the 2 and 3 input NAND gates
works better than the C-CMOS logic style. This is demonstrated in the simulation
results for the 2 & 3 input AND/NAND gates (Fig. 4.8 and Fig. 4.21). The single output
CDM implementation is better than the static CMOS logic style in XOR rich or MUX rich
designs. Motivated by the performance of the single output CDM
logic gates, there is a need to design a standard cell library with these cells and
integrate them into existing design flows for optimal performance.
Single output CDM and CMOS each have their respective advantages and
disadvantages in terms of performance characteristics such as area, power and delay.
Standard cell libraries with C-CMOS and single output CDM logic cells have therefore
been designed in this thesis on various technology nodes, and the performance improvement
of the designs synthesized with each of the CMOS and single output CDM standard cell
libraries is analyzed. In addition to the CMOS standard cell library design, the proposed single
output CDM standard cell library contains a wide variety of complicated logic cells
constructed from only a few basic 1 & 2 level single output CDM cells. Note that a
traditional CMOS standard cell library usually consists of thousands of logic cells with
individual layouts, whereas the single output CDM standard cell library consists of various
complicated logic functions that can be derived by changing the signals (VDD, GND,
and variable) at the input lines of the same layout. Hence single output CDM standard
cell libraries result in symmetrical synthesized designs. With the same footprint,
different logic functions can be generated in the single
output CDM logic style by changing the input signals. This reduces the manual design
effort for the standard cell library design.
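The trade-off for the arithmetic circuits can be quantified from Tables 4.9 to 4.12. The sketch below computes, for each circuit, the power saved by the single output CDM implementation relative to CMOS and the accompanying delay penalty. The multiplier power entries of 2K and 2.8K nW are taken as 2000 and 2800 nW, which is an interpretation of the rounded table values.

```python
# (delay in ps, power in nW) from Tables 4.9 to 4.12.
ARITHMETIC = {
    "half adder":     {"CDM": (16.23, 164.0),  "CMOS": (14.16, 197.0)},
    "full adder":     {"CDM": (28.66, 462.5),  "CMOS": (19.08, 690.2)},
    "4:2 compressor": {"CDM": (26.71, 668.4),  "CMOS": (20.65, 1092.0)},
    "4x4 multiplier": {"CDM": (30.0, 2000.0),  "CMOS": (24.0, 2800.0)},
}


def percent_change(new, reference):
    """Relative change of `new` with respect to `reference`, in percent."""
    return (new - reference) / reference * 100.0


summary = {}
for name, styles in ARITHMETIC.items():
    cdm_delay, cdm_power = styles["CDM"]
    cmos_delay, cmos_power = styles["CMOS"]
    power_saving = -percent_change(cdm_power, cmos_power)
    delay_penalty = percent_change(cdm_delay, cmos_delay)
    summary[name] = (power_saving, delay_penalty)
    print(f"{name}: power saving {power_saving:.1f}%, "
          f"delay penalty {delay_penalty:.1f}%")
```

In every case the CDM version is slower but consumes less power, which is why the libraries in the next chapter are compared on PDP rather than on delay or power alone.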
CHAPTER 5
SINGLE OUTPUT CDM STANDARD CELL LIBRARY DESIGN
We have already explored various CDM logic cells with complementary outputs and
compared their performance parameters with those of CMOS logic cells with complementary
outputs. We observed that CDM logic cells are more efficient than CMOS
logic cells, but the industry standard tools available for the ASIC design flow are designed
for single output (not complementary output) standard cell libraries. Hence, to
enjoy the benefits of the complementary output CDM logic cells, algorithmic level
modification of the standard tools is required, which is outside the scope of this thesis.
We therefore worked on designing a single output CDM standard cell library with
FinFET devices and compared the performance parameters after synthesizing the
benchmark designs with both the designed CDM and CMOS standard cell libraries.
Fig 5.1 shows the basic single level and two level logic cells in the single output CDM
logic style.
Fig 5.1. CDM Logic Cells (a) Single Level (b)-(d) Two Level
The basic expressions for the output of the single level and two level CDM single
output logic cells can be written as follows:
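$$ Y_{(a)} = \overline{\bar{A}\,(In1) + A\,(In2)} \tag{5} $$
$$ Y_{(b)} = \overline{\bar{A}\,(In1) + A\,\big(\bar{B}\,(In2) + B\,(In3)\big)} \tag{6} $$
$$ Y_{(c)} = \overline{\bar{A}\,\big(\bar{B}\,(In1) + B\,(In2)\big) + A\,(In3)} \tag{7} $$
$$ Y_{(d)} = \overline{\bar{A}\,\big(\bar{B}\,(In1) + B\,(In2)\big) + A\,\big(\bar{B}\,(In3) + B\,(In4)\big)} \tag{8} $$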
The major advantage of the CDM logic cells is that they support automatic logic
design. Different algorithms can be defined to extract different logic functionalities
from just the basic cells above. From the single level CDM basic cell, a total of 3^2 = 9
different logic functions can be generated, as shown in the following Table 5.1.
Table 5.1. Logic functions from Single level Single output CDM basic cell
In1    In2    Y
0      0      1
0      1      $\bar{A}$
0      B      $\overline{AB}$
1      0      A
1      1      0
1      B      $\overline{\bar{A}+B}$
C      0      $\overline{\bar{A}C}$
C      1      $\overline{A+C}$
C      B      $\overline{\bar{A}C+AB}$
Similarly, with the two level CDM logic cell, a total of 3^4 = 81 different logic functions
can be generated. Even after removing the repeated/redundant logic cells, a total of 55
different logic cells can be generated in the CDM logic style with just the two level
implementation. The layout of the basic cells remains the same; changing only the input
lines changes the functionality of the cell. This makes the CDM cell library
richer than the CMOS cell library, with reduced area and power consumption and less
manual effort. We have confined the CDM logic cells to two levels, though it is
feasible to extend them to 3 levels to make the library richer in terms of logic
functions. A CMOS standard cell library has also been generated, consisting of the
primitive logic gates along with arithmetic modules such as the full adder and half adder. All
the cells in both designed standard cell libraries have a single instance; device sizes
have not been scaled to multiples, which would give better drive capability at the cost of
more power consumption.
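The way one layout yields several logic functions can be illustrated by enumerating the single level cell. The sketch below is a behavioral model only and rests on an assumption: the cell is treated as a selector controlled by A that passes In1 when A = 0 and In2 when A = 1, followed by the output inverter used in the single output cells. This reading agrees with the rows of Table 5.1 in which both inputs are tied to constants (Y = 1 for In1 = In2 = 0, Y = 0 for In1 = In2 = 1), but it is not derived from the transistor schematic.

```python
from itertools import product


def cell_output(a, in1, in2):
    """Assumed behavior: inverting selector, In1 when A=0, In2 when A=1."""
    selected = in2 if a else in1
    return 1 - selected


def truth_table(in1_source, in2_source):
    """Output for every (A, B, C) combination with the given input wiring."""
    rows = []
    for a, b, c in product((0, 1), repeat=3):
        signals = {"0": 0, "1": 1, "A": a, "B": b, "C": c}
        rows.append(cell_output(a, signals[in1_source], signals[in2_source]))
    return tuple(rows)


# Three choices per input line (GND, VDD or a variable) give 3^2 = 9 wirings.
IN1_CHOICES = ("0", "1", "C")
IN2_CHOICES = ("0", "1", "B")
functions = {
    (in1, in2): truth_table(in1, in2)
    for in1, in2 in product(IN1_CHOICES, IN2_CHOICES)
}

print("wirings:", len(functions))
print("distinct functions:", len(set(functions.values())))
```

Under this model all nine wirings give different functions; for example, tying In1 to 0 and In2 to B yields a NAND of A and B without changing the layout.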
The designed standard cell libraries are optimized for energy (PDP); hence the logic
cells are sized for minimum PDP in both the static CMOS and single output CDM
standard cell libraries. Since the libraries are designed using the FinFET
device models from PTM, only the number of fins of the FinFET
devices can be changed; the other parameters are fixed by the model library. Therefore the
numbers of fins for the PFETs and NFETs used in the logic cells are selected for minimum
PDP. The SEA algorithm [6] has been used for FinFET sizing in the single output
CDM standard cell library to minimize the PDP. The FinFETs at identical positions
in the basic cells are grouped together as one fin sizing variable, and a sweep
is then performed over all the variables to find the combination with minimum PDP.
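The grouped sweep can be pictured as an exhaustive search over fin counts. The sketch below is a plain brute-force sweep and is not the SEA algorithm of [6]; the cost function `toy_pdp` is a hypothetical stand-in for a circuit simulation, with made-up coefficients, and the fin range of 1 to 5 is likewise an assumption.

```python
from itertools import product


def toy_pdp(n_fins_p, n_fins_n):
    """Hypothetical cost model, NOT a FinFET model or a simulator result."""
    drive = 1.0 / n_fins_p + 1.0 / n_fins_n    # more fins: lower delay
    load = 1.0 + 0.2 * (n_fins_p + n_fins_n)   # more fins: more capacitance
    delay = drive * load
    power = load
    return delay * power


def sweep_min_pdp(evaluate, fin_choices, n_groups):
    """Try every fin count combination and keep the one with the lowest PDP."""
    best = None
    for combo in product(fin_choices, repeat=n_groups):
        cost = evaluate(*combo)
        if best is None or cost < best[1]:
            best = (combo, cost)
    return best


# Two sizing variables: one group of PFETs and one group of NFETs.
best_combo, best_pdp = sweep_min_pdp(toy_pdp, fin_choices=range(1, 6), n_groups=2)
print("fins (PFET group, NFET group):", best_combo)
print("toy PDP:", round(best_pdp, 4))
```

Grouping transistors at identical positions keeps the number of variables small, so the number of combinations to evaluate grows with the number of groups rather than with the number of transistors.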
5.1
Standard Cell Library Design Flow
Fig 5.2 shows the complete flow for the standard cell library characterization for both
CDM and C-CMOS logic cells
Fig 5.2. Standard Cell Library Design Flow
Once the number of fins has been decided for the Single Output CDM and CMOS logic
cells, the netlists for the logic cells were generated, formatted in HSPICE format, and
fed into SiliconSmart together with the device models for standard cell library
characterization using the HSPICE simulator. SiliconSmart generates the standard cell
library in the Liberty format (.lib), containing all the timing and power information of the
cells in the library. The Liberty file is then converted into the database format (.db)
using Library Compiler. All the tools mentioned are from Synopsys. The standard cell
libraries in database (.db) format are then used to synthesize various benchmark circuits
using Synopsys Design Vision. All the scripts used in this flow for the various tools, and
the logic cell netlists for both CMOS and Single Output CDM cells, are included in
Appendix A.
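As an illustration of the netlist-generation step, the following Python sketch emits an HSPICE subcircuit for an inverter with a chosen number of fins. The pin order, terminal connections and model names (pfet, nfet) copy the inverter netlist listed in Appendix A; the helper function itself and its parameter names are hypothetical and are not part of the scripts used in this work.

```python
def inverter_subckt(name="INV", nfin_p=1, nfin_n=1):
    """Return an HSPICE .subckt block for an inverter with the given fin counts."""
    lines = [
        f".subckt {name} in VSS VDD out",
        # Terminal order follows the Appendix A netlists.
        f"M0 out in VDD in pfet nfin={nfin_p}",
        f"M1 out in VSS in nfet nfin={nfin_n}",
        f".ends {name}",
    ]
    return "\n".join(lines)


print(inverter_subckt())
```

Generating the netlist from the selected fin counts keeps the sizing result and the characterized cell in step when the sweep is repeated for another technology node.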
BSIM-CMG FinFET device models for the 7nm, 10nm, 14nm, 16nm and 20nm feature sizes are
available from PTM in HP (high performance) and LSTP (low standby power) versions. Hence
standard cell libraries have been designed for all the available device models in the
different feature sizes. All the designed standard cell libraries have been used to
synthesize the benchmark designs, to show that the design methodology is independent of
the technology feature size.
5.2 Benchmark Circuits
Benchmark circuits are collections of circuits used to evaluate the performance of
synthesis tools more objectively. Popular benchmark suites include ISCAS’85, ISCAS’89
and ITC’02. ISCAS’85 is the suite generally used for combinational logic circuits. Since
the designed standard cell libraries consist of combinational cells only, we use ISCAS’85
for our experiments. Table 5.2 shows the functionality of the benchmark designs
synthesized with the designed standard cell libraries for the performance comparison.
Table 5.2. ISCAS’85 Benchmark Designs
5.3 Synthesis Results with CDM standard cell library
Synthesis results for the benchmark circuits with both the C-CMOS and the Single Output
CDM standard cell libraries are shown in the following tables.
Table 5.3 shows the synthesis results for the benchmark circuits with 7nm CDM and
CMOS FinFET standard cell libraries.
Table 5.3 Synthesis Results with 7nm Standard Cell Library
Architecture   CMOS Power   CMOS Delay   CDM Power   CDM Delay   CMOS PDP    CDM PDP
C1355          14.27        124.68       9.3         136.27      1779.18     1267.31
c1908a         7.2          110.77       5.6         152.77      797.54      855.51
c3540a         13.84        197.46       11.21       241.38      2732.85     2705.87
c499           14.7         103.85       9.8         120.79      1526.60     1183.74
c432           2.72         162.16       2.06        211.99      441.08      436.70
c6288          95.94        608.74       82.89       754.88      58402.52    62572.00
c880           4.6          130.21       4.15        165.74      598.97      687.82
c17            0.064        16           0.057       18.61       1.02        1.06
c2670          10.63        148          8.5         159.33      1573.24     1354.31
c5315          27.5         146.5        22.5        175.5       4028.75     3948.75
c7552          39.8         337.19       31.3        251.53      13420.16    7872.89
Table 5.4 shows the synthesis results for the benchmark circuits with 10nm CDM and
CMOS FinFET standard cell libraries.
Table 5.4 Synthesis Results with 10nm Standard Cell Library
Architecture   CMOS Power   CMOS Delay   CDM Power   CDM Delay   CMOS PDP    CDM PDP
C1355          22.03        146          13.7        153.12      3216.38     2097.744
c1908a         10.6         129.45       8           158.56      1372.17     1268.48
c3540a         19.4         226.6        16.4        214.34      4396.04     3515.176
c499           16.8         125.09       14.2        132.42      2101.512    1880.364
c432           3.83         183.86       3           213.51      704.1838    640.53
c6288          121          683.8        121.9       825.75      82739.8     100658.9
c880           6.73         143.34       5.78        171.31      964.6782    990.1718
c17            0.0832       18.24        0.0814      19.36       1.517568    1.575904
c2670          15.4         171.86       12.2        222.58      2646.644    2715.476
c5315          40.15        166.12       32.77       175.69      6669.718    5757.361
c7552          59           382.08       45          322.64      22542.72    14518.8
Table 5.5 shows the synthesis results for the benchmark circuits with 14nm CDM and
CMOS FinFET standard cell libraries.
Table 5.5 Synthesis Results with 14nm Standard Cell Library
Architecture   CMOS Power   CMOS Delay   CDM Power   CDM Delay   CMOS PDP    CDM PDP
C1355          28.1         171          17.1        171.77      4805.1      2937.267
c1908a         14.3         148.57       10.44       182.26      2124.551    1902.794
c3540a         26.4         263.08       22.1        262.57      6945.312    5802.797
c499           22.8         135.84       19.3        144.72      3097.152    2793.096
c432           5.23         207          4           229.43      1082.61     917.72
c6288          162.9        765.18       164.2       868.42      124647.82   142594.6
c880           9            162.41       7.94        170.6       1461.69     1354.564
c17            0.1118       19.98        0.11        20.56       2.233764    2.2616
c2670          20.8         193.13       17.1        179.3       4017.104    3066.03
c5315          53.6         180.57       43.8        194.39      9678.552    8514.282
c7552          80           441.12       60          311.24      35289.6     18674.4
Table 5.6 shows the synthesis results for the benchmark circuits with 16nm CDM and
CMOS FinFET standard cell libraries.
Table 5.6 Synthesis Results with 16nm Standard Cell Library
Architecture   CMOS Power   CMOS Delay   CDM Power   CDM Delay   CMOS PDP     CDM PDP
C1355          46.3         234.48       26.4        220.8       10856.424    5829.12
c1908a         21.8         213.79       15.7        237.77      4660.622     3732.989
c3540a         40.6         355.32       33.3        338.85      14425.992    11283.71
c499           35.2         189.45       28.4        196.28      6668.64      5574.352
c432           8.23         282.97       6.52        304.22      2328.8431    1983.514
c6288          252.2        1038.66      259.3       1204.05     261950.05    312210.2
c880           14.1         225.92       12          220.7       3185.472     2648.4
c17            0.1736       25.92        0.19918     24.49       4.499712     4.877918
c2670          32.4         264.69       24.2        281.84      8575.956     6820.528
c5315          83.6         260.24       67.3        250.28      21756.064    16843.84
c7552          126          611.92       90          398.18      77101.92     35836.2
Table 5.7 shows the synthesis results for the benchmark circuits with 20nm CDM and
CMOS FinFET standard cell libraries.
Table 5.7 Synthesis Results with 20nm Standard Cell Library
Architecture   CMOS Power   CMOS Delay   CDM Power   CDM Delay   CMOS PDP     CDM PDP
C1355          72.6         336.57       40          301.41      24434.982    12056.4
c1908a         33.8         298.92       24.4        309.5       10103.496    7551.8
c3540a         65.6         528.88       49.4        418.72      34694.528    20684.77
c499           55.3         273.83       41.9        254.83      15142.799    10677.38
c432           13.2         403.29       10.12       431.25      5323.428     4364.25
c6288          396.7        1463.13      371.1       1499.75     580423.67    556557.2
c880           22.1         325.96       18.4        295.56      7203.716     5438.304
c17            0.272        35.25        0.305       31.67       9.588        9.65935
c2670          51.7         375.79       39          390.43      19428.343    15226.77
c5315          133          363.04       100.2       360.83      48284.32     36155.17
c7552          195.1        918.48       140         627.01      179195.45    87781.4
From the above results, obtained with the designed standard cell libraries in different
technologies with both the single output CDM and CMOS logic styles, it can be observed
that the CDM standard cell library results in more power and PDP efficient designs than
the CMOS standard cell libraries.
5.4 Data Analysis
The power and PDP (Power Delay Product) savings with CDM compared with C-CMOS
have been calculated from the data presented in the tables and are shown in Fig 5.5 & Fig
5.6. From the figures it can be observed that power and PDP are saved with CDM for
all the benchmark designs except c17 and c6288. Design c17 is a small six-NAND-gate
circuit and c6288 is a 16x16 bit multiplier, whose schematic is shown in Fig 5.3. We have
already observed that the CDM NAND/AND gates are not optimized compared with the
C-CMOS logic style; hence c17, being a NAND intensive design, has not been optimized
in terms of power and energy compared with the C-CMOS standard cell libraries.
Fig 5.3. ISCAS-85 c6288 16x16 multiplier
The full adder and half adder logic cells used in the multiplier design are implemented
as shown in Fig 5.4.
Fig 5.4. Full adder module for ISCAS-85 c6288 16x16 multiplier
The full adder module is implemented with primitive logic gates (NOR) in the benchmark
design. Hence, during synthesis with the CDM standard cell library, even though there is
a full adder cell in the library, the compiler chose to build the full adder from logic gates
and use that in the multiplier design. If the compiler had chosen the CDM full adder
directly from the standard cells rather than building it from logic gates, then power and
PDP savings would be possible with the CDM standard cell libraries for c6288 as well.
This requires a change in the HDL modelling of c6288, but since we use only the original
benchmark design, the results are not optimized compared with C-CMOS.
For the rest of the benchmark designs, the percentage savings in power and PDP are
significant.
Synthesis with the CDM standard cell libraries has resulted in an average power saving of
17-21% and an average PDP saving of 7-26% compared with the C-CMOS standard cell
libraries for all the benchmark designs.
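The per-design savings can be recomputed directly from the tabulated power and delay values. The Python sketch below does this for the 7nm data of Table 5.3. It assumes that a saving is defined as (CMOS - CDM) / CMOS and that PDP is the product of the tabulated power and delay; the text does not spell out either formula, so both are assumptions, and the variable names are illustrative.

```python
# (CMOS power, CMOS delay, CDM power, CDM delay), copied from Table 5.3 (7nm).
TABLE_5_3 = {
    "C1355": (14.27, 124.68, 9.3, 136.27),
    "c1908a": (7.2, 110.77, 5.6, 152.77),
    "c3540a": (13.84, 197.46, 11.21, 241.38),
    "c499": (14.7, 103.85, 9.8, 120.79),
    "c432": (2.72, 162.16, 2.06, 211.99),
    "c6288": (95.94, 608.74, 82.89, 754.88),
    "c880": (4.6, 130.21, 4.15, 165.74),
    "c17": (0.064, 16, 0.057, 18.61),
    "c2670": (10.63, 148, 8.5, 159.33),
    "c5315": (27.5, 146.5, 22.5, 175.5),
    "c7552": (39.8, 337.19, 31.3, 251.53),
}


def saving_percent(cmos, cdm):
    """Relative saving of CDM over CMOS in percent (assumed definition)."""
    return 100.0 * (cmos - cdm) / cmos


def savings(table):
    """Map each design to (power saving %, PDP saving %)."""
    result = {}
    for name, (p_cmos, d_cmos, p_cdm, d_cdm) in table.items():
        power = saving_percent(p_cmos, p_cdm)
        pdp = saving_percent(p_cmos * d_cmos, p_cdm * d_cdm)
        result[name] = (power, pdp)
    return result


SAVINGS = savings(TABLE_5_3)
for name, (power, pdp) in SAVINGS.items():
    print(f"{name:8s} power {power:6.1f}%  PDP {pdp:6.1f}%")
```

A negative PDP value marks a design where the longer CDM delay outweighs the power saving, as happens for c17 and c6288 in this table.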
Fig 5.5. Power Improvement with CDM over CMOS standard cell libraries
Fig 5.6. PDP Improvement with CDM over CMOS standard cell libraries
5.5 Binary to BCD Converter
Other than the benchmark circuits, we have also used the designed standard cell
libraries for the synthesis of various binary to BCD converter designs []. A new
architecture for the binary to BCD converter has also been devised using the Complement
Based Logic Design (CBLD) algorithm []. The CBLD algorithm tends to make the design
more XOR gate intensive, and the CDM standard cell library has been shown to be efficient
compared with C-CMOS for XOR intensive designs. Hence synthesis of the CBLD based
designs with the CDM standard cell libraries results in more savings in energy and power.
The following sections explain the design of the binary to BCD converter with CBLD and
its comparison with various other state of the art designs. The comparison has been
completed in various technologies, and the results show that the CBLD algorithm is capable
of designing fast and energy efficient modules. Later, synthesis of all the binary to BCD
architectures has been completed with the designed C-CMOS and CDM standard cell
libraries, and the results show that the CBLD designs with the CDM standard cell libraries
achieve 50% energy saving compared with the design of nearest performance.
5.5.1 Binary to BCD converter in CBLD
5.5.1.1 Introduction
The goal of this new method of converter design is to optimize the conversion speed,
power dissipation and area. Most of the recently proposed multiplier designs use 7-bit
binary to BCD converters. The binary to BCD converter is a critical component of these
multiplier designs, and hence the proposed algorithm, which is based on Complement
Based Logic Design, has been designed for such multipliers. For better understanding,
let us assume an arbitrary truth table with three inputs A, B, C and three outputs
Y1, Y2, Y3, such that:
Y1 = f(A, B, C)    (9)
Y2 = f(A, B, C)    (10)
Y3 = f(A, B, C)    (11)
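Therefore, so as to implement all our output functions in terms of the inputs, we can use
the identity matrix multiplied with all our outputs and further multiply the output
identity matrix with the inputs, as shown in the following equation:
$$
\begin{bmatrix} y_1 \oplus & 0 & 0 \\ 0 & y_2 \oplus & 0 \\ 0 & 0 & y_3 \oplus \end{bmatrix}
*
\begin{bmatrix} A & B & C \\ B & C & A \\ C & A & B \end{bmatrix}
=
\begin{bmatrix}
y_1 \oplus A & y_1 \oplus B & y_1 \oplus C \\
y_2 \oplus B & y_2 \oplus C & y_2 \oplus A \\
y_3 \oplus C & y_3 \oplus A & y_3 \oplus B
\end{bmatrix}
\cong
\begin{bmatrix}
F_{11} & F_{12} & F_{13} \\
F_{21} & F_{22} & F_{23} \\
F_{31} & F_{32} & F_{33}
\end{bmatrix}
$$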
With functions F11, F12 and F13, output Y1 is expressed in terms of the inputs A, B, C,
and out of those three functions one is selected for less area, less power and higher
speed. The proposed algorithm is scalable, and all the possible functions with respect to
the inputs can be generated by integrating MATLAB with Quine-McCluskey software.
Let us assume we select functions F11, F21 and F32 out of all the available ones because
they are simpler and smaller than the others. The final outputs can then be expressed
with the following equations:
Y1 = F11 ⊕ A    (12)
Y2 = F21 ⊕ B    (13)
Y3 = F32 ⊕ A    (14)
Using the conventional methods of logic realization, we can only get SOP and POS
functions, but here we define each output with the help of a final XOR gate, which either
buffers or complements the input line depending on the value of the control function
defined above.
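The idea can be checked on a small truth table. The Python sketch below tabulates the candidate control functions F_i = Y ⊕ X_i for one output of a three-input function and confirms that the final XOR stage recovers Y. The example output (a majority function) and the use of the minterm count as a measure of simplicity are assumptions made only for illustration; this work selects the functions through MATLAB and Quine-McCluskey minimization instead.

```python
from itertools import product


def cbld_candidates(output, n_inputs):
    """For each input index i, tabulate F_i = Y xor X_i over all input rows."""
    rows = list(product((0, 1), repeat=n_inputs))
    columns = {i: [output(*row) ^ row[i] for row in rows] for i in range(n_inputs)}
    return rows, columns


def majority(a, b, c):
    # Arbitrary example output Y (assumption, not one of the converter outputs).
    return (a & b) | (b & c) | (a & c)


ROWS, CANDIDATES = cbld_candidates(majority, 3)

# Crude simplicity proxy (assumption): the number of minterms of each candidate.
ONES = {i: sum(column) for i, column in CANDIDATES.items()}
BEST = min(ONES, key=ONES.get)

# Final XOR stage: Y = F_best xor X_best, evaluated row by row.
RECOVERED = [f ^ row[BEST] for f, row in zip(CANDIDATES[BEST], ROWS)]
print(ONES, BEST, RECOVERED)
```

Because X ⊕ X = 0, any of the candidates reproduces the output exactly; the choice between them only affects the cost of the SOP stage that generates the control function.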
5.5.1.2 Binary to BCD converter architecture
Let ABCDEFG be the seven binary bits to be converted into two BCD digits
(C3C2C1C0B3B2B1B0). We applied the above algorithm to the binary to BCD converter;
each output is expressed in terms of the 7-bit binary input, and the optimized functions
selected out of the seven possible functions are as follows:
F1 = AC    (15)
F2 = AC' + BC'D'    (16)
F3 = BCG' + CD'E' + BD' + A    (17)
F4 = BCG + DFG' + (A'CD'E' + C'DE + BC'D) + AD    (18)
F5 = CDE' + CEF + B'CD'F' + A'B'DE'F'    (19)
F6 = AC'D' + BFG + A'B'CF' + B'DEF' + CD    (20)
F7 = BCD' + E'FG + C'D'F + (A'CD'E' + C'DE + BC'D)F' + AD    (21)
F8 = G    (22)
The final architecture of the binary to BCD converter is represented in Fig. 5.7.
The final output functions are shown in the following equations (23-30):
C3 = AC    (23)
C2 = (AC' + BC'D') ⊕ B    (24)
C1 = (BCG' + CD'E' + BD' + A) ⊕ C    (25)
C0 = (BCG + DFG' + (A'CD'E' + C'DE + BC'D) + AD) ⊕ B    (26)
B3 = (CDE' + CEF + B'CD'F' + A'B'DE'F') ⊕ C    (27)
B2 = (AC'D' + BFG + A'B'CF' + B'DEF' + CD) ⊕ E    (28)
B1 = (BCD' + E'FG + C'D'F + (A'CD'E' + C'DE + BC'D)F' + AD) ⊕ B    (29)
B0 = G    (30)
This architecture is based on three stages: the first two stages have a Sum of Products
(SOP) structure that produces the control functions (Fi), and the last stage contains
two-input XOR gates.
[Schematic: control functions F1 to F7 feeding the output XOR gates for C3, C2, C1, C0,
B3, B2, B1 and B0]
Fig.5.7. Binary to BCD converter design with CBLD algorithm
5.5.1.3 Synthesis Results
We have compared nine different designs for binary to BCD conversion (Table 5.8).
These designs are: (i) four different architectures proposed in [30], (ii) Three-Four split
[31], (iii) Four-Three split [31], (iv) the design proposed in [33], (v) the version of the
architecture of [32], and (vi) the proposed CBLD design [29]. We describe all architectures
using Verilog HDL structural modelling. The designs were verified with all possible input
combinations using ISim. All designs were synthesized using Synopsys Design Vision and
IC Compiler with the saed90nm_typ_ht cell library from Synopsys for both the pre-layout
and post-layout analyses (Fig. 5.8 & 5.9), with saed32hvt_tt1p05v25c for 32nm (from
Synopsys), with the OSU Standard Cell Libraries [37] for 45nm, and with the standard
cell libraries from USC [28] for 14nm, 7nm and 5nm, for pre-layout analysis (Fig. 5.10 &
5.11). The synthesis results are shown in Table 5.8 and Tables 5.9-5.10. From the results,
we can confirm that the proposed architecture is consistently fast and energy efficient
for all the technologies from 90nm to 14nm (using CMOS based standard cells) and for
7nm and 5nm using FinFET based standard cell libraries.
Table 5.8. Post-Layout synthesis results with 90nm technology node.
Architecture             Area     Power    Delay
3-4 [31]                 293      90       1.23
4-3 [31]                 295      67.2     1.52
Binary New (BN) [32]     524      85.7     1.18
Shift Add by 3 [30]      257      84.8     2.52
3-3-1 Design [33]        411      97.8     1.62
331 modified 1 [30]      405.5    110.2    1.95
331 modified 2 [30]      387.0    98.1     2.06
Range Detect (RD) [30]   330.8    69       1.8
CBLD [29]                348      70.3     1.03
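The composite metrics plotted in Fig 5.8 and Fig 5.9 (PDP, APDP and EDP) can be derived from the area, power and delay columns. The Python sketch below does so for the post-layout 90nm values of Table 5.8. The definitions used here, PDP = power x delay, EDP = PDP x delay and APDP = area x PDP, as well as normalization to the largest value in each column, are assumptions; the text does not define them.

```python
# (area, power, delay), copied from Table 5.8 (post-layout, 90nm).
TABLE_5_8 = {
    "3-4": (293, 90, 1.23),
    "4-3": (295, 67.2, 1.52),
    "Binary New": (524, 85.7, 1.18),
    "Shift Add by 3": (257, 84.8, 2.52),
    "3-3-1": (411, 97.8, 1.62),
    "331 modified 1": (405.5, 110.2, 1.95),
    "331 modified 2": (387.0, 98.1, 2.06),
    "Range Detect": (330.8, 69, 1.8),
    "CBLD": (348, 70.3, 1.03),
}


def metrics(area, power, delay):
    """Return PDP, EDP and APDP under the assumed definitions."""
    pdp = power * delay
    return {"PDP": pdp, "EDP": pdp * delay, "APDP": area * pdp}


def normalize(values):
    """Scale a {name: value} mapping so that the largest value becomes 1."""
    peak = max(values.values())
    return {name: value / peak for name, value in values.items()}


ALL_METRICS = {name: metrics(*row) for name, row in TABLE_5_8.items()}
PDP = {name: m["PDP"] for name, m in ALL_METRICS.items()}
PDP_NORMALIZED = normalize(PDP)

for name in sorted(PDP, key=PDP.get):
    print(f"{name:16s} PDP {PDP[name]:8.2f}  normalized {PDP_NORMALIZED[name]:.2f}")
```

Sorting by PDP puts the design with the lowest energy per operation first, which makes the comparison in the normalized bar charts easy to reproduce.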
Table 5.9. Pre-Layout Synthesis with 90, 45 & 32nm technology node.
                     90nm                     45nm                     32nm
Architecture         Area    Power   Delay    Area    Power   Delay    Area    Power   Delay
                     (µm2)   (µW)    (ns)     (µm2)   (µW)    (ns)     (µm2)   (µW)    (ns)
3-4 [31]             324     94.53   1.41     122.4   51      0.44     93.52   8.13    0.57
4-3 [31]             294     68      1.47     114     40      0.48     76.75   6.42    0.69
Binary New [32]      509     84      1.15     232     62.25   0.43     135     10.4    0.54
Sh-Add-3 [30]        267.2   81.36   2.53     95.2    43.14   0.71     74.71   6.99    1.13
3-3-1 [33]           434     103     1.76     193.3   69      0.58     117.2   10.1    0.77
331 mod 1 [30]       451     108.5   1.62     185     68.4    0.49     124     10.47   0.75
331 mod 2 [30]       428.5   103.8   1.71     174.1   62.12   0.55     119     9.92    0.77
Range Detect [30]    304     54      1.81     151     45      0.64     86.66   6.97    0.74
CBLD [29]            354     76      1.08     158.6   46.24   0.32     98.60   6.38    0.52
Table 5.10. Pre-Layout synthesis with 14(CMOS), 7 & 5nm (FinFET) technology.
                     14nm                     7nm                      5nm
Architecture         Area    Power   Delay    Area    Power   Delay    Area    Power   Delay
                     (µm2)   (µW)    (ps)     (µm2)   (µW)    (ps)     (µm2)   (µW)    (ps)
3-4 [31]             11.44   2.65    121.48   1.27    1.05    63.98    0.468   0.45    12.51
4-3 [31]             11.03   2.48    181      1.28    0.98    74.59    0.476   0.42    13.18
Binary New [32]      18.44   2.4     130.32   2.08    1.38    70.4     0.898   0.52    11.79
Sh-Add-3 [30]        9.62    2.67    266.8    1.08    0.78    120.6    0.396   0.4     22.46
3-3-1 [33]           15.85   3.26    165.8    1.7     1.31    73       0.696   0.61    13.73
331 mod 1 [30]       16.67   4.12    211.14   1.87    1.35    68.74    0.72    0.60    14.44
331 mod 2 [30]       15.35   3.13    175.6    1.72    1.22    68.35    0.706   0.61    14.79
Range Detect [30]    10.81   2.32    160.07   1.25    0.84    70.09    0.498   0.32    17.13
CBLD [29]            12.76   1.50    119.9    1.47    0.99    56.19    0.58    0.30    7.28
[Bar chart "Pre-Layout_90nm": Area, Power, Delay, PDP, APDP and EDP on a normalized
0 to 1 axis for 3-4 [31], 4-3 [31], Binary New [32], Shift Add by 3 [30], 3-3-1 Design [33],
331 modified 1 [30], 331 modified 2 [30], Range Detection [30] and CBLD [29]]
Fig 5.8. Normalized Pre-Layout synthesis with 90nm technology.
[Bar chart "Post_Layout_90nm": Area, Power, Delay, PDP, APDP and EDP on a normalized
0 to 1 axis for 3-4 [31], 4-3 [31], Binary New [32], Shift Add by 3 [30], 3-3-1 Design [33],
331 modified 1 [30], 331 modified 2 [30], Range Detection [30] and CBLD [29]]
Fig 5.9 Normalized Post-Layout Synthesis with 90nm technology.
[Bar chart "Delay with different Technologies": delay (in ps) of the nine architectures
at 14nm, 7nm and 5nm]
Fig 5.10 Pre-Layout delay result for 14, 7 and 5nm technology node.
[Bar chart "PDP with different Technologies": PDP (µW.ps) of the nine architectures
at 14nm, 7nm and 5nm]
Fig 5.11 Pre-Layout PDP results for 14nm (CMOS), 7nm & 5nm (FinFET).
5.5.1.4 Synthesis Results with CDM standard cell library
Further synthesis results with the newly designed CDM and C-CMOS standard cell libraries
in the 7nm technology node for all the binary to BCD converter designs are shown in Table
5.11 & Table 5.12 and in Fig 5.12 & Fig 5.13. It can be observed that synthesis with the
CDM standard cell library results in power savings for all the referenced architectures,
and achieves 10% more PDP efficiency than the C-CMOS logic style for the proposed
CBLD based binary to BCD converter design.
Table 5.11. Power dissipation with CDM and CMOS in 7nm technology node
Power Dissipation (7nm) (µW)
Architecture     CMOS     CDM
CBLD [29]        0.878    0.676
3-4 [31]         1.268    0.864
BN [32]          1.466    1.078
RD [30]          0.877    0.741
3-3-1 [33]       1.434    1.129
331mod2 [30]     1.32     1.078
4-3 [31]         0.896    0.773
331mod1 [30]     1.489    1.266
Shiftadd [30]    1.085    0.919
Table 5.12. Power Delay Product with CDM and CMOS in 7nm
PDP (7nm) (µW.ps)
Architecture     CMOS      CDM
CBLD [29]        33.90     31.30
3-4 [31]         79.49     51.84
BN [32]          70.87     75.63
RD [30]          60.96     62.93
3-3-1 [33]       108.87    96.47
331mod2 [30]     93.34     99.55
4-3 [31]         59.10     72.73
331mod1 [30]     107.31    113.76
Shiftadd [30]    122.06    139.46
[Bar chart "Power Dissipation with CDM & CMOS (7nm)": CMOS and CDM power for each
binary to BCD architecture]
Fig 5.12 Power Dissipation with CDM and CMOS in 7nm technology.
[Bar chart "PDP with CDM and CMOS Cells(7nm)": CMOS and CDM PDP (µW.ps) for each
binary to BCD architecture]
Fig 5.13. Power Delay Product with CDM and CMOS in 7nm technology.
CHAPTER 6
CONCLUSION AND FUTURE WORK
FinFET technology is becoming a prominent VLSI technology due to its advantages over
planar bulk transistors. A standard cell library facilitates circuit synthesis and the
timing and power analyses of the circuits. Standard cell libraries with primitive gates and
arithmetic modules such as the full adder and half adder have been designed in the CMOS
and CDM logic styles in various technology nodes using FinFET devices. Synthesis results
for the benchmark circuits show an average power improvement of 17-21% and an average
PDP improvement of 7-26% with the CDM standard cell library compared with the CMOS
standard cell library across the 7nm, 10nm, 14nm, 16nm and 20nm nodes. It can be
observed that the CDM standard cell library results in more power and PDP efficient
designs than the CMOS standard cell libraries. The CDM implementation is area efficient
compared with CMOS standard cells because the CDM logic style results in a symmetrical
implementation of the logic cells, and from a few basic cells (single level, two level &
three level) a CDM standard cell library with a massive number of logic cells and
complicated functionality can be generated. This makes the CDM standard cell library
richer and more area efficient than its CMOS counterpart. Results from the synthesis of
the binary to BCD converter designs with the designed CDM standard cell libraries show a
power efficiency of 15-31% for all referenced architectures and a further 10% PDP saving
compared with the C-CMOS logic style for the proposed binary to BCD converter design
using the CBLD algorithm. The proposed architecture is XOR intensive due to the CBLD
algorithm, which is the reason for its power and PDP efficiency when synthesized with the
CDM standard cell library.
Future work includes making the CDM cell library richer with sequential logic cells
along with combinational cells, and synthesizing the benchmark designs using both
combinational and sequential cells. We have seen that the CDM NAND/AND gates are not
efficient compared with the C-CMOS AND/NAND gates; hence the design of hybrid
standard cell libraries, which include logic cells from both the C-CMOS and CDM logic
styles, has been planned. Hybrid standard cell libraries can provide PDP efficient
designs by exploiting the power efficiency of the CDM cells and the timing efficiency of
the C-CMOS cells during logic synthesis.
Appendix A
DESIGN COMPILER SCRIPT
#/**************************************************/
#/* Compile Script for Synopsys                    */
#/* dc_shell-t -f <name_of_file.tcl>               */
#/**************************************************/
#/* All verilog files, separated by spaces         */
set my_verilog_files [list c6288.v]
#/* Top-level Module                               */
set my_toplevel c6288
#/* Reserved time for output signals (Holdtime etc.) */
#/**************************************************/
#/* No modifications needed below                  */
#/**************************************************/
set link_library /home/Ashish/lib/tsmc018/imp/cmos7nm_hp.db
set target_library /home/Ashish/lib/tsmc018/imp/cmos7nm_hp.db
define_design_lib WORK -path ./WORK
analyze -f verilog $my_verilog_files
elaborate $my_toplevel
current_design $my_toplevel
link
ungroup -all -flatten -simple_names
compile -map_effort medium
set filename [format "%s%s" $my_toplevel ".vh"]
write -f verilog -output $filename
set filename [format "%s%s" $my_toplevel ".sdc"]
write_sdc $filename
redirect timing.rep { report_timing }
redirect cell.rep { report_cell }
redirect power.rep { report_power }
quit
C-CMOS LOGIC CELL NETLIST
Inverter
.subckt INV in VSS VDD out
M0 out in VDD in pfet nfin=1
M1 out in VSS in nfet nfin=1
.ends INV
2 Input AND gate
.subckt ANDx2 A B VSS VDD AND
M1 net07 B VDD B pfet nfin=1
M0 net07 A VDD A pfet nfin=1
M3 net19 B VSS B nfet nfin=2
M2 net07 A net19 A nfet nfin=2
X0 net07 VSS VDD AND INV
.ends ANDx2
3 Input AND gate
.subckt ANDx3 A B C VSS VDD AND
M2 net07 C VDD C pfet nfin=1
M1 net07 B VDD B pfet nfin=1
M0 net07 A VDD A pfet nfin=1
M5 net31 C VSS C nfet nfin=3
M4 net32 B net31 B nfet nfin=3
M3 net07 A net32 A nfet nfin=3
X0 net07 VSS VDD AND INV
.ends ANDx3
2 Input XOR gate
.subckt XORx2 A B VSS VDD XOR
M3 net29 B VSS B nfet nfin=2
M2 net05 A_Bar net29 A_Bar nfet nfin=2
M1 net28 B_Bar VSS B_Bar nfet nfin=2
M0 net05 A net28 A nfet nfin=2
M7 net05 A_Bar net5 A_Bar pfet nfin=2
M6 net05 B net5 B pfet nfin=2
M5 net5 B_Bar VDD B_Bar pfet nfin=2
M4 net5 A VDD A pfet nfin=2
X8 net05 VSS VDD XOR INV
X1 B VSS VDD B_Bar INV
X0 A VSS VDD A_Bar INV
.ends XORx2
2 Input OR gate
.subckt ORx2 A B VSS VDD OR
M1 net07 B net27 B pfet nfin=2
M0 net27 A VDD A pfet nfin=2
M3 net07 B VSS B nfet nfin=1
M2 net07 A VSS A nfet nfin=1
X0 net07 VSS VDD OR INV
.ends ORx2
3 Input OR gate
.subckt ORx3 A B C VSS VDD OR
M2 net010 C net34 C pfet nfin=3
M1 net34 B net35 B pfet nfin=3
M0 net35 A VDD A pfet nfin=3
M5 net010 C VSS C nfet nfin=1
M4 net010 B VSS B nfet nfin=1
M3 net010 A VSS A nfet nfin=1
X0 net010 VSS VDD OR INV
.ends ORx3
Full Adder
.subckt full_adder1 A B C VSS VDD Carry Sum
X1 net19 C VSS VDD Sum XORx2
X0 A B VSS VDD net19 XORx2
X8 net19 C VSS VDD net25 ANDx2
X9 A B VSS VDD net24 ANDx2
X10 net25 net24 VSS VDD Carry ORx2
.ends full_adder1
Half Adder
.subckt half_adder1 A B VSS VDD Carry Sum
X0 A B VSS VDD Sum XORx2
X7 A B VSS VDD Carry ANDx2
.ends half_adder1
CDM LOGIC CELL NETLIST
2 Input OR gate
.subckt ORx2 A B VSS VDD OR
M10 net6 B A_bar B pfet nfin=2
M7 net6 B VSS B nfet nfin=2
X0 A VSS VDD A_bar INV
X2 net6 VSS VDD OR INV
.ends ORx2
2 Input XOR gate
.subckt XORx2 A B VSS VDD XOR
M3 net7 A_bar B_bar A_bar nfet nfin=2
M0 net7 A B A nfet nfin=2
X10 B VSS VDD B_bar INV
X2 A VSS VDD A_bar INV
X8 net7 VSS VDD XOR INV
.ends XORx2
2 Input AND gate
.subckt ANDx2 A B VSS VDD AND
M14 A_bar B net4 B nfet nfin=2
M13 net4 B VDD B pfet nfin=2
X0 A VSS VDD A_bar INV
X3 net4 VSS VDD AND INV
.ends ANDx2
3 Input AND gate
.subckt ANDx3 A B C VSS VDD AND
M6 net02 A VDD A pfet nfin=1
M5 net02 A_bar net1 A_bar pfet nfin=4
M4 net1 B VDD B pfet nfin=4
M20 net1 B_bar C_bar B_bar pfet nfin=4
X2 C VSS VDD C_bar INV
X1 B VSS VDD B_bar INV
X0 A VSS VDD A_bar INV
X3 net02 VSS VDD AND INV
.ends ANDx3
3 Input OR gate
.subckt ORx3 A B C VSS VDD OR
M7 net02 A VSS A nfet nfin=1
M6 net02 A_bar net2 A_bar nfet nfin=4
M5 net2 B_bar C_bar B_bar nfet nfin=4
M4 net2 B VSS B nfet nfin=4
X4 net02 VSS VDD OR INV
X2 C VSS VDD C_bar INV
X1 B VSS VDD B_bar INV
X0 A VSS VDD A_bar INV
.ends ORx3
Full Adder
.subckt full_adder1 A B C VSS VDD Carry Sum
M17 net012 B_bar VSS B_bar pfet nfin=4
M16 net012 B C_bar B pfet nfin=4
M15 net07 B_bar C_bar B_bar pfet nfin=4
M14 net07 B VDD B pfet nfin=4
M13 net072 A_bar net012 A_bar pfet nfin=4
M12 net072 A net07 A pfet nfin=4
M11 net6 B_bar C_bar B_bar pfet nfin=4
M10 net071 A_bar net6 A_bar pfet nfin=4
M9 net6 B C B pfet nfin=4
M8 net071 A net1 A pfet nfin=2
M7 net1 B_bar C B_bar pfet nfin=4
M1 net1 B C_bar B pfet nfin=2
X13 net072 VSS VDD Carry INV
X11 net071 VSS VDD Sum INV
X9 B VSS VDD B_bar INV
X8 A VSS VDD A_bar INV
X16 C VSS VDD C_bar INV
.ends full_adder1
Half Adder
.subckt half_adder1 A B VSS VDD Carry Sum
M8 net36 A_bar B_bar A_bar pfet nfin=2
M7 net36 A VDD A pfet nfin=2
M4 net35 A_bar B A_bar pfet nfin=2
M3 net35 A B_bar A pfet nfin=2
X11 net36 VSS VDD Carry INV
X13 net35 VSS VDD Sum INV
X5 B VSS VDD B_bar INV
X4 A VSS VDD A_bar INV
.ends half_adder1
Different functions generated by changing the input lines to single output CDM single
level and two level basic cell
Func1
.subckt func1 A B E VSS VDD output
M4 net11 B VDD B nfet nfin=1
M0 net12 A E A nfet nfin=1
M2 VSS B net11 B pfet nfin=1
M1 net11 A net12 A pfet nfin=1
X0 net12 VSS VDD output not
.ends func1
Func2
.subckt func2 A B E VSS VDD output
M4 net11 B E B nfet nfin=1
M0 net12 A VDD A nfet nfin=1
M2 VSS B net11 B pfet nfin=1
M1 net11 A net12 A pfet nfin=1
X0 net12 VSS VDD output not
.ends func2
Func3
.subckt func3 A B E D VSS VDD output
M4 net11 B D B nfet nfin=1
M0 net12 A E A nfet nfin=1
M2 VSS B net11 B pfet nfin=1
M1 net11 A net12 A pfet nfin=1
X0 net12 VSS VDD output not
.ends func3
Func4
.subckt func4 A B E D VSS VDD output
M4 net11 B VDD B nfet nfin=1
M0 net12 A E A nfet nfin=1
M2 D B net11 B pfet nfin=1
M1 net11 A net12 A pfet nfin=1
X0 net12 VSS VDD output not
.ends func4
Func5
.subckt func5 A B E D VSS VDD output
M4 net11 B E B nfet nfin=1
M0 net12 A VDD A nfet nfin=1
M2 D B net11 B pfet nfin=1
M1 net11 A net12 A pfet nfin=1
X0 net12 VSS VDD output not
.ends func5
Func6
.subckt func6 A B E C D VSS VDD output
M4 net11 B D B nfet nfin=1
M0 net12 A E A nfet nfin=1
M2 C B net11 B pfet nfin=1
M1 net11 A net12 A pfet nfin=1
X0 net12 VSS VDD output not
.ends func6
Func7
.subckt func7 A B E C D F VSS VDD output
M5 net012 B F B nfet nfin=1
M4 net11 B D B nfet nfin=1
M0 net12 A net012 A nfet nfin=1
M3 E B net012 B pfet nfin=1
M2 C B net11 B pfet nfin=1
M1 net11 A net12 A pfet nfin=1
X0 net12 VSS VDD output not
.ends func7
Func8
.subckt func8 A B VSS VDD output
M0 net12 A B A nfet nfin=1
M1 VSS A net12 A pfet nfin=1
X0 net12 VSS VDD output INV
.ends func8
Func9
.subckt func9 A B VSS VDD output
M0 net12 A B A nfet nfin=1
M1 VDD A net12 A pfet nfin=1
X0 net12 VSS VDD output INV
.ends func9
Func10
.subckt func10 A B VSS VDD output
M0 net12 A VSS A nfet nfin=1
M1 B A net12 A pfet nfin=1
X0 net12 VSS VDD output INV
.ends func10
Func11
.subckt func11 A B VSS VDD output
M0 net12 A VDD A nfet nfin=1
M1 B A net12 A pfet nfin=1
X0 net12 VSS VDD output INV
.ends func11
Func12
.subckt func12 A B E VSS VDD output
M4 net11 B VSS B nfet nfin=1
M0 net12 A E A nfet nfin=1
M2 VDD B net11 B pfet nfin=1
M1 net11 A net12 A pfet nfin=1
X0 net12 VSS VDD output INV
.ends func12
Func13
.subckt func13 A B E VSS VDD output
M4 net11 B E B nfet nfin=1
M0 net12 A VSS A nfet nfin=1
M2 VDD B net11 B pfet nfin=1
M1 net11 A net12 A pfet nfin=1
X0 net12 VSS VDD output INV
.ends func13
Func14
.subckt func14 A B E VSS VDD output
M4 net11 B E B nfet nfin=1
M0 net12 A VDD A nfet nfin=1
M2 VDD B net11 B pfet nfin=1
M1 net11 A net12 A pfet nfin=1
X0 net12 VSS VDD output INV
.ends func14
Func15
.subckt func15 A B E D VSS VDD output
M4 net11 B D B nfet nfin=1
M0 net12 A E A nfet nfin=1
M2 VDD B net11 B pfet nfin=1
M1 net11 A net12 A pfet nfin=1
X0 net12 VSS VDD output INV
.ends func15
Func16
.subckt func16 A B E D VSS VDD output
M4 net11 B VSS B nfet nfin=1
M0 net12 A E A nfet nfin=1
M2 D B net11 B pfet nfin=1
M1 net11 A net12 A pfet nfin=1
X0 net12 VSS VDD output INV
.ends func16
Func17
.subckt func17 A B E VSS VDD output
M4 net11 B VDD B nfet nfin=1
M0 net12 A VSS A nfet nfin=1
M2 E B net11 B pfet nfin=1
M1 net11 A net12 A pfet nfin=1
X0 net12 VSS VDD output INV
.ends func17
SILICON SMART STANDARD CELL LIBRARY CHARACTERIZATION
SCRIPT
# See SiliconSmart User Guide Appendix B for a complete list of parameters and definitions
#################################
# OPERATING CONDITIONS DEFINITION
#################################
#
# Create one or more operation conditions here. Example:
#
create_operating_condition op_cond
set_opc_process op_cond {
{.lib "~/standard/PTM-MG/model" ptm7hp}
}
add_opc_supplies op_cond VDD 0.7
add_opc_grounds op_cond VSS 0.0
set_opc_temperature op_cond 25
#
#################################
# GLOBAL CONFIGURATION PARAMETERS
#################################
define_parameters default {
set_parameter pmos_model_names pfet
set_parameter nmos_model_names nfet
# List of operating conditions as defined by create_operation_condition
set active_pvts { op_cond }
# HSPICE
set simulator hspice
set simulator_cmd {hspice <input_deck> -o <listing_file>}
# HSPICE (client/server mode)
# set simulator hspice_cs
# set simulator_cmd {hspice -CC <input_deck> -port <port_num> -o <listing_file>}
# SPECTRE
# set simulator spectre6
# set simulator_cmd {spectremdl -tab -batch <mdl_file> -design <input_deck> <listing_file> >&/dev/null}
# ELDO
# set simulator eldo
# set simulator_cmd {eldo -compat -i <input_deck> > <listing_file> >&/dev/null}
# MSIM
# set simulator msim
# (csh)
# set simulator_cmd {msim -hsp -i <input_deck> -o <listing_file> >&/dev/null}
# (sh)
# set simulator_cmd {msim -hsp -i <input_deck> -o <listing_file> 2>/dev/null}
# Default simulator options for Finesim, Hspice, Spectre, Msim, and Eldo
set simulator_options {
"common,finesim: finesim_mode=spicehd finesim_method=gear finesim_speed=0 finesim_dvmax=0.1"
"common,hspice: probe=1 runlvl=5 numdgt=7 measdgt=7 acct=1 nopage"
"common,spectre6: compression=yes step=10ps maxstep=1ns relref=allglobal"
"common,spectre6: method=trap lteratio=4 gmin=1e-18 autostop=0 save=none"
"common,msim: probe=1 accurate=1"
"common,eldo: gmindc=1n gmin=1p itl1=500 ingold=1 numdgt=4 measout=0 cptime=18000 relvar=0.01"
"op,eldo: dv=0.5 method=gear"
"tran,eldo: brief=0 relvar=0.001"
"optimize,eldo: lvltim=3 relvar=0.001"
"power,eldo: method=gear"
}
# Simulation resolution
set time_res_high 1e-12
# Controls which supplies are measured for power consumption
set power_meas_supplies { VDD }
# list of ground supplies used (required for Functional Recognition)
set power_meas_grounds { VSS }
# specifies which multi-rail format to be used in Liberty model; none, v1, or v2.
set liberty_multi_rail_format none
# LOAD SHARE PARAMETERS
# job_scheduler: 'lsf' (Platform), 'grid' (SunGrid), or 'standalone' (local machine)
set job_scheduler standalone
set run_list_maxsize 1
set normal_queue "lsf_queue_name"
}
############################
# DEFAULT PINTYPE PARAMETERS
############################
pintype default {
set logic_high_name VDD
set logic_high_threshold 0.8
set logic_low_name VSS
set logic_low_threshold 0.2
set prop_delay_level 0.5
# Number of slew and load indices
# (when importing with -use_default_slews -use_default_loads)
set numsteps_slew 5
set numsteps_load 5
set constraint_numsteps_slew 3
# Operating load ranges
set smallest_load 1e-15
set largest_load 50e-15
# Operating slew ranges
set smallest_slew 10e-12
set largest_slew 1.2e-9
set max_tout 1.0e-9
# Automatically determine largest_load based on max_tout; off or on
set autorange_load off
# Number of points for input noise height
set numsteps_height 8
# Number of points for input noise width
set numsteps_width 5
# driver model: pwl, emulated, active, active-waveform, custom
set driver_mode pwl
# driver cell name (relevant only when driver_mode is "active")
set driver pwl
}
#####################################
# LIBERTY MODEL GENERATION PARAMETERS
#####################################
define_parameters liberty_model {
# Add Liberty header attributes here for use with "model -create_new_model"
set_parameter liberty_time_unit "1ps"
set delay_model "table_lookup"
set default_fanout_load 0.0
set default_inout_pin_cap 0.0
set default_input_pin_cap 0.0
set default_output_pin_cap 0.0
set default_cell_leakage_power 0.0
set default_leakage_power_density 0.0
}
#######################
# VALIDATION PARAMETERS
#######################
define_parameters validation {
# Add validation parameters here
}
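The pintype thresholds in the listing above (logic_low_threshold 0.2, logic_high_threshold 0.8, prop_delay_level 0.5) can be read as fractions of the supply voltage at which slew and propagation delay are measured. The following Python sketch illustrates that interpretation on piecewise-linear waveforms. It is not part of the characterization flow: the function names, the 1.0 V supply, and the example waveforms are assumptions chosen only for illustration.

```python
# Illustrative sketch only: helper names, VDD value, and waveforms are assumed.
LOGIC_LOW_THRESHOLD = 0.2    # from "set logic_low_threshold 0.2"
LOGIC_HIGH_THRESHOLD = 0.8   # from "set logic_high_threshold 0.8"
PROP_DELAY_LEVEL = 0.5       # from "set prop_delay_level 0.5"
VDD = 1.0                    # assumed supply voltage, not taken from the listing


def crossing_time(t, v, level, rising=True):
    """Return the first time the sampled waveform crosses `level`.

    Uses linear interpolation between adjacent samples.
    """
    for i in range(1, len(t)):
        v0, v1 = v[i - 1], v[i]
        crosses_up = rising and (v0 < level <= v1)
        crosses_down = (not rising) and (v0 > level >= v1)
        if crosses_up or crosses_down:
            frac = (level - v0) / (v1 - v0)
            return t[i - 1] + frac * (t[i] - t[i - 1])
    raise ValueError("waveform never crosses the requested level")


def rise_slew(t, v, vdd=VDD):
    """Time taken to rise from the low threshold to the high threshold."""
    t_low = crossing_time(t, v, LOGIC_LOW_THRESHOLD * vdd, rising=True)
    t_high = crossing_time(t, v, LOGIC_HIGH_THRESHOLD * vdd, rising=True)
    return t_high - t_low


def prop_delay(t_in, v_in, t_out, v_out, in_rising, out_rising, vdd=VDD):
    """Delay between input and output crossings of the delay level."""
    level = PROP_DELAY_LEVEL * vdd
    t_i = crossing_time(t_in, v_in, level, rising=in_rising)
    t_o = crossing_time(t_out, v_out, level, rising=out_rising)
    return t_o - t_i


# Hypothetical inverter-like example: rising input ramp, falling output ramp.
t_in, v_in = [0.0, 100e-12], [0.0, VDD]
t_out, v_out = [0.0, 50e-12, 150e-12], [VDD, VDD, 0.0]

slew = rise_slew(t_in, v_in)                                   # about 60 ps
delay = prop_delay(t_in, v_in, t_out, v_out, True, False)      # about 50 ps
print(f"input slew = {slew:.3e} s, propagation delay = {delay:.3e} s")
```

With these thresholds, a full-swing linear ramp of 100 ps gives a measured slew of about 60 ps, since only the portion between 20% and 80% of the supply is counted.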
References
[1] Q. Xie, X. Lin, Y. Wang, S. Chen, M.J. Dousti, and M. Pedram. “Performance
Comparisons between 7nm FinFET and Conventional Bulk CMOS Standard Cell
Libraries,” IEEE Trans. on Circuits and Systems II, Vol. 62, No. 8, Aug. 2015, pp.
761-765.
[2] Q. Xie, X. Lin, Y. Wang, M.J. Dousti, A. Shafaei, M. Ghasemi-Gol, and M. Pedram.
“5nm FinFET standard cell library optimization and circuit synthesis in near- and
super-threshold voltage regimes,” Proc. of IEEE Computer Society Annual Symp.
on VLSI, Jul. 2014.
[3] Shen-Fu Hsiao, Ming-Yu Tsai, and Chia-Sheng Wen. "Low Area/Power Synthesis Using
Hybrid Pass Transistor/CMOS Logic Cells in Standard Cell-Based Design
Environment," IEEE Trans. on Circuits and Systems II: Express Briefs, Vol. 57,
No. 1, Jan. 2010.
[4] T. Nikoubin, F. Eslami, A. Baniasadi, and K. Navi, A new cell design methodology
for balanced XOR-XNOR circuits for hybrid-CMOS logic. Journal of Low Power
Electronics 5, 2 (2009).
[5] T. Nikoubin, M. Grailoo, and H. Mozafari (2010). Cell design methodology based
on transmission gate for low-power high-speed balanced XOR-XNOR circuits in
hybrid-CMOS logic. Journal of Low Power Electronics, 6, 1–10.
[6] Tooraj Nikoubin, Poona Bahrebar, Sara Pouri, Keivan Navi, and Vaez Iravani.
"Simple Exact Algorithm for Transistor Sizing of Low-Power High-Speed
Arithmetic Circuits." Hindawi Publishing Corporation, VLSI Design, Volume 2010,
Article ID 264390.
[7] K. Yano, Y. Sasaki, K. Rikino, and K. Seki, “Top-down pass-transistor logic
design,” IEEE J. Solid-State Circuits, vol. 31, no. 6, pp. 792–803, Jun. 1996.
[8] C. Yang and M. Ciesielski, “Bds: a bdd-based logic optimization system,”
Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on,
vol. 21, no. 7, pp. 866–876, Jul 2002.
[9] T. Nikoubin, M. Grailoo, and C. Li, “Cell design methodology (cdm) for balanced
carry-inversecarry circuits in hybrid-cmos logic style,” International Journal of
Electronics, vol. 101, no. 10, pp. 1357–1374, 2014.
[10] http://web.eecs.umich.edu/~jhayes/iscas.restore/c6288.html
[11] Uppalapati, Siri, Michael L. Bushnell, and Vishwani D. Agrawal. "Glitch-free
design of low power ASICS using customized resistive feedthrough cells." Proc. of
the 9th VLSI Design and Test Symposium. 2005.
[12] http://venividiwiki.ee.virginia.edu/mediawiki/index.php/Main_Page
[13] Farkhani, Hooman, et al. "Comparative study of FinFETs versus 22nm bulk
CMOS technologies: SRAM design perspective." System-on-Chip Conference
(SOCC), 2014 27th IEEE International. IEEE, 2014.
[14] http://ptm.asu.edu/
[15] Brunvand, E. “Digital VLSI Chip Design with Cadence and Synopsys CAD
Tool,” Addison-Wesley, 2010.
[16] Synopsys, Design Compiler User Guide, Product Version 13.3, April 2013.
[17] Liberty User Guides and Reference Manual Suite, Version 2013.03.
[18] Synopsys Inc., "Liberty™ ncx user guide," F-2011.06 ed., 2011.
[19] https://www.coursera.org/course/vlsicad
[20] Standard Cell Library Design, Lecture Notes, Advanced VLSI Design, CMPE-641,
UMBC.
[21] http://www.ecs.umass.edu/ece/labs/vlsicad/bds/bds.html
[22] https://embedded.eecs.berkeley.edu/pubs/downloads/sis/
[23] J. Rabaey, Low Power Design Essentials (Integrated Circuits and Systems),
2009.
[24] Sung-Mo Kang and Yusuf Leblebici, CMOS Digital Integrated Circuits
(Analysis and Design), 2nd Edition.
[25] T. Nikoubin, N. Navi, and O. Kavei, A new method in reorganization of the timing
behavior of symmetric XOR/XNOR circuits. CSI J. Computer Science and
Engineering 5, 276 (2007).
[26] K. Yano et al., A 3.8ns CMOS 16×16-b multiplier using complementary
pass-transistor logic. IEEE J. Solid-State Circuits 25, 388 (1990).
[27] S. Rapolu and T. Nikoubin, "Fast and energy efficient FinFET full adders with
Cell Design Methodology (CDM)," 2015 6th International Conference on
Computing, Communication and Networking Technologies (ICCCNT), Denton, TX,
2015, pp. 1-5.
[28] http://sportlab.usc.edu/
[29] Ashish Joshi, Sri Rathan Rangisetti, and Tooraj Nikoubin. "Fast and Energy efficient
binary to BCD converter with Complement based logic design," IEEE Trans. on
Circuits and Systems II: Express Briefs (submitted).
[30] Sri Rathan Rangisetti, Ashish Joshi, and Tooraj Nikoubin, “Area-Efficient and
Power-Efficient Binary to BCD Converters,” IEEE, Sixth International Conference
on Computing, Communications and Networking Technologies 6th ICCCNT–
35239, Denton, U.S.A, July 13 - 15, 2015.
[31] Osama Al-Khaleel, Zakaria Al-Qudah, and Mohammad Al-Khaleel, “Fast and
compact binary-to-BCD conversion circuits for decimal multiplication,” IEEE 29th
International Conf. on Computer Design, pp. 226 – 231, Oct. 2011.
[32] Tso-Bing Juang and Yu-Ming Chiu. "Fast Binary to BCD Converters for Decimal
Communications Using New Recoding Circuits." IEEE International Symposium
on Integrated Circuits (ISIC), pp. 188 – 191, 2014.
[33] Arvind Kumar Mehta, Mukesh Gupta, Vipin Jain, and Sudhir Kumar. "High
Performance Vedic BCD Multiplier and Modified Binary to BCD Converter." IEEE
Annual India Conference (INDICON), pp. 1 – 6, 2013.
[34] J. Bhattacharya, A. Gupta, and A. Singh. “A high performance binary to BCD
converter for decimal multiplication.” IEEE International Symposium on VLSI
Design, Automation and Test (VLSI-DAT), pp. 315 – 318, 2010.
[35] G. Jaberipur and A. Kaivani, “Improving the Speed of Parallel Decimal
Multiplication,” IEEE Transactions on Computers, vol. 58, issue 11, pp. 1539 - 1552,
2009.
[36] Tso-Bing Juang and Yu-Ming Chiu. "High-speed binary to binary-coded-decimal
converters for decimal multiplications." IEEE International SoC Design
Conference (ISOCC), pp. 370 – 371, 2013.
[37] 45nm Standard Cell Library. [Online]. Available:
http://www.eda.ncsu.edu/wiki/FreePDK45:Contents