Brad Bridgeman
Yanqing Zhang
ECE 6332
October 8, 2009
ECE 6332
Design Review 1
1. Summaries:
Variation Driven Device Sizing for Minimum Energy Sub-threshold Circuits
This paper discusses methods for choosing the device size and supply voltage that keep a
system functional. In addition to discussing the effects of modifying the supply voltage
and size of the devices, this paper discusses the idea of designing a circuit based on a chosen
failure rate. By calculating optimal values of widths and voltages, you can design your circuit to
have a high success rate, and you can still adjust the values to meet the
variability constraints imposed on the design. The goal is that within a design you should
have the smallest possible sizes and voltages that satisfy the design constraints.
Additionally, one major problem that was discussed was functionality error. In a design,
if the NMOS is strengthened enough, its leakage current becomes strong enough that the
PMOS cannot drive a strong logic high, which causes both a functionality error and a
leakage problem. To solve this, the paper recommends using a butterfly plot
to ensure proper noise margins for each device, rather than using a set standard.
Advantages in ultra-low voltage design
This paper discusses various topics within the field of ultra-low power systems, and
introduces several projects that have met recent success within the field. The major point that
was discussed was the importance of optimizing Vdd and Vt as a method of optimizing system
performance. As you lower Vdd, the delay increases exponentially and the leakage energy per
operation increases; as you increase Vdd, the switching energy increases along with circuit
performance. This is the tradeoff. Two key challenges were mentioned, relating to Ion and the ratio
of Ion/Ioff, both coming from the fact that Ion is small compared to a strong-inversion current.
This allows leakage current to affect the output logic, as well as output voltage values.
A second topic, variation, was also discussed within this paper. Variation will affect the
strength of the devices, producing strong or weak MOSFETs. Looking specifically at sub-threshold,
the main variation comes from random dopant fluctuations, which cause shifts in
Vt. These shifts in Vt exponentially influence the resulting current in the circuit. In
addition to causing unknown currents, variation also introduces uncertainty in the delay
(this is where a Monte Carlo simulation will come in handy). Other topics that were mentioned
include using a butterfly plot to determine stability, using a read buffer to assist SRAM
reads, introducing redundancy into a circuit for managing variation, and finally using a sensor
mechanism that adjusts the supply voltage to keep the circuit at the minimum energy
point (this can be done using the switched capacitor DC-DC converter).
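The exponential sensitivity of sub-threshold delay to Vt can be illustrated with a small Monte Carlo experiment. This is only a sketch under stated assumptions, not the model used in the paper: the on-current is taken as Ion = I0 * exp((Vdd - Vt) / (n * vT)), the delay as C * Vdd / Ion, and the values of I0, C, n, the mean Vt and its sigma are hypothetical numbers chosen for illustration.

```python
import math
import random
import statistics

N_VT = 1.5 * 0.026  # slope factor n times thermal voltage in volts (assumed)

def subvt_delay(vdd, vt, c_load=1e-15, i0=1e-6):
    """Gate delay ~ C * Vdd / Ion, with Ion = I0 * exp((Vdd - Vt) / (n * vT))."""
    i_on = i0 * math.exp((vdd - vt) / N_VT)
    return c_load * vdd / i_on

def delay_samples(vdd, vt_mean=0.40, vt_sigma=0.03, trials=5000, seed=1):
    """Draw Vt from a normal distribution and return the resulting delays."""
    rng = random.Random(seed)
    return [subvt_delay(vdd, rng.gauss(vt_mean, vt_sigma)) for _ in range(trials)]

# Hypothetical operating point: Vdd = 0.30 V, below the assumed Vt of 0.40 V.
delays = sorted(delay_samples(vdd=0.30))
median = statistics.median(delays)
print(f"median delay       : {median:.3e} s")
print(f"mean / median      : {statistics.fmean(delays) / median:.2f}")
print(f"99th pct / median  : {delays[int(0.99 * len(delays))] / median:.2f}")
```

Because Vt sits in the exponent, a normally distributed Vt gives a skewed, long-tailed delay distribution, which is why a worst-case corner alone says little about the spread.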
Nanometer MOSFET Variation in Minimum Energy Subthreshold Circuits
This paper discusses a wide assortment of topics dealing with the design and optimization
of subthreshold circuits. The paper begins by introducing the primary uses of a low power IC,
and proceeds to discuss the common points of failure in low power
operation. The main problem that is introduced comes from the SRAM cells.
The SRAM in a low power design is typically the first set of
devices to fail; additionally, it generally consumes the most power. One additional question
that was raised was what happens to the circuit when you reduce Vdd. The answer is
that the switching speed decreases, which increases the leakage energy per operation because
the circuit leaks over a longer delay; additionally, the DIBL effect is reduced, due to a smaller
value of Vds.
One of the major solutions proposed to improve the SRAM was to introduce a buffer,
more specifically a read buffer. This buffer eliminates the limitation imposed by the read static
noise margin, which improves the overall operation of the device. The problem with this comes
from the size of the buffer that must be introduced, and the additional leakage current
that comes along with it. Overall, the read buffer was determined to improve the operation of the
circuit; however, the design would improve further if this buffer could be reduced
or eliminated. In addition to SRAM, other topics that were discussed include
variation, guidelines for circuit design, and possible testing schemes that can be incorporated.
A 65nm Sub-Vt Microcontroller with Integrated SRAM and Switched Capacitor DC-DC
Converter
This paper introduces several key topics related to Sub-Vt operation. The major topics
discussed in this paper include the effects of logic depth, as well as the results of modifying
the supply voltage and width of the devices. When you are analyzing a circuit in Sub-Vt, one
major factor in the performance is the number of external connections driven by the output of the
device, or in other words, the fan-out. The larger the fan-out of a system, the slower the system
will perform. Additionally, when you are operating in the Sub-Vt region, the delay of the system
will show large variability.
There are two methods that were introduced to reduce the delay variability in sub-Vt
operation: increasing the width of the devices, and increasing the supply voltage. Both of these,
however, will increase the amount of energy consumed by the device. This brings up the topic of
the optimum voltage. For every design, there will be an optimum supply voltage that will
provide the minimum energy requirement based on the resulting performance to energy
consumption ratio. This is where the DC-DC converter is introduced; it was designed to act as a
control system that maintains the supply voltage at the optimum level for the minimum energy.
Other topics that were introduced within this paper include: low-power suspended states, hold
static noise margins, usage of clock buffers, effect of fan-ins, and usage of read-buffers.
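The effect of device width on variability can be sketched with the widely used Pelgrom mismatch relation, sigma(Vt) = A_vt / sqrt(W * L). This relation is brought in here as an assumption (the summary above does not state which model the paper uses), and the coefficient A_vt, the channel length, and the slope factor below are hypothetical values.

```python
import math

def sigma_vt(width_um, length_um, a_vt_mv_um=3.5):
    """Pelgrom-style mismatch: sigma(Vt) in mV = A_vt / sqrt(W * L)."""
    return a_vt_mv_um / math.sqrt(width_um * length_um)

def sigma_ln_delay(width_um, length_um, n=1.5, v_thermal_mv=26.0):
    """Standard deviation of ln(delay) when delay is proportional to exp(Vt / (n * vT))."""
    return sigma_vt(width_um, length_um) / (n * v_thermal_mv)

# Hypothetical 60 nm channel length; each step doubles the width.
for w in (0.12, 0.24, 0.48):
    print(f"W = {w:.2f} um: sigma_Vt = {sigma_vt(w, 0.06):.1f} mV, "
          f"sigma ln(delay) = {sigma_ln_delay(w, 0.06):.2f}")
```

Under this assumption, quadrupling the width halves sigma(Vt), so the variability improvement grows only with the square root of the extra area and capacitance spent.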
A 65 nm Sub-Vt Microcontroller with Integrated SRAM and Switched Capacitor DC-DC
Converter
In this paper, a 65-nm sub-Vt microcontroller design is presented, along with
crucial design problems in the sub-threshold regime and methods of solving them. Methods to
optimize energy and power consumption are given. More important for our project, concerns
over delay modeling in sub-Vt and a method of good approximation are given. Also, important
factors to consider during synthesis, including fan-in and constructing a custom library for sub-Vt
operation, are described. Points that are very valuable to us include constructing a process
variation aware library that works in sub-Vt, considering statistical timing analysis with respect to
variation, and deriving delay as a function of logic depth and device sizes. On the energy
efficiency side, it is shown that SRAM energy consumption is huge, and the design should take
notice of this.
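The dependence of delay variation on logic depth can be illustrated with a short simulation. This is a sketch under assumptions that are not taken from the paper: each stage delay is proportional to exp(dVt / (n * vT)), the Vt shifts of different stages are independent, and the sigma and slope values are hypothetical.

```python
import math
import random
import statistics

def path_delay_cv(logic_depth, vt_sigma=0.03, n_vt=0.039, trials=2000, seed=7):
    """Coefficient of variation (sigma / mean) of a path built from
    `logic_depth` stages with independent, normalized sub-Vt stage delays."""
    rng = random.Random(seed)
    totals = []
    for _ in range(trials):
        total = sum(math.exp(rng.gauss(0.0, vt_sigma) / n_vt)
                    for _ in range(logic_depth))
        totals.append(total)
    return statistics.stdev(totals) / statistics.fmean(totals)

for depth in (1, 4, 16, 64):
    print(f"logic depth {depth:3d}: sigma/mean = {path_delay_cv(depth):.2f}")
```

With independent stages the relative spread shrinks roughly as one over the square root of the depth, so short paths are the ones whose delay is hardest to predict.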
Energy Efficient Design for Subthreshold Supply Voltage Operation
In this paper, an analysis method for minimum energy is given. We are able to see the
different energy scenarios for different low power approaches. More specifically, the paper
compares energy consumption with or without clock and power gating. A specific model of
variation impact wraps up the end of the paper, where we learn an analytical way of determining
the ideal Vmin giving consideration to logic depth and variation.
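A minimum energy analysis of this kind can be sketched numerically. The model below is a common textbook simplification and an assumption on our part, not the paper's equations: energy is normalized to the switched capacitance, the dynamic term is activity * Vdd^2, and the leakage term is logic_depth * Vdd^2 * exp(-Vdd / (n * vT)), which follows from integrating sub-threshold leakage over a cycle whose length grows exponentially as Vdd drops. The activity factor, logic depth, and sweep range are hypothetical.

```python
import math

N_VT = 1.5 * 0.026  # slope factor n times thermal voltage in volts (assumed)

def energy_per_cycle(vdd, activity=0.1, logic_depth=20):
    """Normalized energy per cycle: switching energy plus leakage energy."""
    dynamic = activity * vdd ** 2
    leakage = logic_depth * vdd ** 2 * math.exp(-vdd / N_VT)
    return dynamic + leakage

def minimum_energy_point(activity=0.1, logic_depth=20,
                         v_min=0.10, v_max=0.60, step=0.001):
    """Sweep Vdd over the assumed functional range and return the best value."""
    steps = int(round((v_max - v_min) / step))
    candidates = [v_min + i * step for i in range(steps + 1)]
    return min(candidates, key=lambda v: energy_per_cycle(v, activity, logic_depth))

for act in (0.1, 0.01):
    v_opt = minimum_energy_point(activity=act)
    print(f"activity {act}: minimum energy near Vdd = {v_opt:.3f} V")
```

In this simplified model a lower activity factor or a deeper logic path pushes the minimum energy point to a higher Vdd, because leakage then makes up a larger share of the total.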
Modeling Energy of the Clock Generation and Distribution Circuitry
In this paper, equations for calculating the clock load are given. We learn that a large part
of the clock load comes from the cache and register files. Once we know what technology we will be
using, we can easily approximate the clock load with the model derived in this paper. (Note: I do
not know how useful this will be to us.)
Energy-Efficient Subthreshold Processor Design
In this paper, architectural and circuit level considerations are given for designing an energy
efficient processor. The point is made that variation makes it difficult to ensure we have the
correct Vdd,min and operating frequency. The paper goes on to show that variation will have
more effect on operating frequency than on Vdd, due to the exponential relationship between Vt and
operating frequency. This is important to consider because, as it is presented, processors have the
tendency to operate at higher frequencies than are expected. Therefore, we can save energy by
finding the optimal Vdd and lowering the operating frequency. However, the tradeoff is that this will
induce greater variation, which in turn will complicate timing closure. Once again, random
dopant variation is the main culprit. Monte Carlo simulations in the paper support
this conclusion by fixing Vdd at the optimal voltage and varying the operating period t,
and comparing the result with simulations where t is fixed and Vdd is varied. The variation of t is much
more substantial than that of Vdd. We learn that we will have to give much attention to this
spread in frequency to ensure proper operation when making our two-phase clock.
2. Synthesis Flow and Corresponding Schematics
Since we have not yet determined what we will be designing our two-phase clock for, here we
will describe the synthesis flow that we are now familiar with, and what the important steps will
mean for our project.
a) Synthesis flow begins with the VHDL architecture; here a 16-bit accumulator is used
for simplicity:
ENTITY Acc IS
GENERIC(width : POSITIVE := 16);
PORT( data_in : IN std_logic_vector(width -1 DOWNTO 0);
data_out : OUT std_logic_vector(width -1 DOWNTO 0);
clock, control : IN std_logic);
END ENTITY Acc;
…
FOR ALL : Reg USE ENTITY work.Reg;
FOR ALL : Adder USE ENTITY work.Adder;
BEGIN
RX : Reg
GENERIC MAP(width => width)
PORT MAP (D => data_in, Q => x, clock =>clock, enable =>'1', reset =>control);
RY : Reg
GENERIC MAP(width =>width)
PORT MAP (D => z, Q => y, clock =>clock, enable =>'1', reset =>control);
AD : Adder
GENERIC MAP (width =>width)
PORT MAP (x =>x, y => y, z =>z);
data_out <= z;
END ARCHITECTURE structure;
b) The next step is to choose a technology to implement a design, and we will use Cadence’s
RTL Compiler to generate a Verilog netlist of our design mapped to the technology. This is a
script version (there is a GUI for RTL Compiler) to accomplish this task. Important
comments are in bold:
#------------------------------------------
# synthesis scripts
# This script is by Zhenyu(Jerry) QI ([email protected])
#------------------------------------------
#You may source other .tcl files here
#######################################
# Set global variables and directories
#######################################
set TOPMODULE Acc
#######################################
# Technology Library Info
#######################################
set_attribute library /app/lib/hcmos9gp_920/CORE9GPLL_SNPS_AVT_4.1/SIGNOFF/bc_1.32V_m40C_wc_1.08V_125C/PT_LIB/CORE9GPLL_Worst.lib
#We’ll use the ST130 technology, considering that the design we will ultimately be using was implemented in this way
#######################################
# don't ignore DRC for timing
#######################################
set_attribute drc_first true
#######################################
# Dont-Use Cells
#Perhaps we will be adding items in this category depending on our implementation
#######################################
# Global Attributes Settings
#######################################
…
###########################################
# Elaborating Design and Generate Object
###########################################
read_hdl -vhdl Reg.vhd
read_hdl -vhdl Adder16.vhd
read_hdl -vhdl Acc.vhd
elaborate ${TOPMODULE}
#######################################
# Timing Constraints
#######################################
#read_sdc -stop_on_errors ${TOPMODULE}.sdc
##########################################
# More Attributes on the Design Object
##########################################
set_attribute max_leakage_power 0.0 ${TOPMODULE}
#This field will perhaps be changed
set_attribute lp_optimize_dynamic_power_first true ${TOPMODULE}
set_attribute lp_power_analysis_effort low
########################################
…
# Synthesize to Technology Mapped Gates
#########################################
synthesize -to_mapped -eff low
###write -m > {TOPMODULE}_mapped.v
write -m > Acc_mapped.v
#This is the outputted file
c) This mapped Verilog file should be simulated in a digital simulator, for example ModelSim.
d) Next, we will be using Cadence’s Encounter to do our Place and Route steps. First, we will
set up the tool using Standard Cell Files, Common Timing Libraries, and Timing Constraint
Files, with the suffixes .lef, .tlf (or .lib), and .sdc respectively. After setup, we are able to
check the standard cells used and the schematic block diagram for our design, which may
prove quite valuable:
Schematic View
Standard Cell Lookup List
e) The next step is Placement, where we specify the core size, ground and power supplies, and
place our standard cells within the core. We can see the results of placement in two ways.
There is an overall architecture view, where we can see specifically where our modules
are placed, and a standard cell view, where we can see the outline of individual cells:
Architectural View
Standard Cell View
f) Our final step before implementing the clock is Routing:
Cell View After Routing
g) These steps are actually the preamble to the research we will be looking into. Once we have
done enough research and have decided how we will implement our two-phase clock, we will
then go on to place and route that clock within the design. We will then go on to check
timing closure. Chances are that we will not be successful the first time, so we will
have to be able to extract timing reports, and figure out a way to meet timing requirements
from them. Once that succeeds, we will find a way to extract power and
performance (speed) parameters from our finished circuit. How we optimize that will also be
based on our reading and understanding of sub-Vt operation (which has been summarized in
the first part of this design review).
3. Conclusion of Progress and Looking Ahead (Remaining Tasks)
With our reading (so that’s why it’s called Reading Days), we have established some very
important concepts about design in the sub-Vt regime that will ultimately guide us in our design.
Specifically, these are the method of finding the optimal Vdd, the horrendous effects of variation at such
a low supply voltage, and what that means for meeting timing constraints. We will pay attention to a
statistical method of approximating delay based on logic depth and device sizing. We also
note that recent successful sub-Vt designs always verify functionality against leakage
effects in sub-threshold.
We have also familiarized ourselves with the tools that we will be using for our project. We
have been able to ‘test run’ our flow with a simple 16-bit accumulator design. Namely, we are
now conscious of what happens at every stage of the synthesis flow, so that we will know where
to turn when we encounter a problem. For example, if we find that energy and power can be
optimized by not using gates with fan-ins greater than 3, we know that we will have to specify
this in the RTL Compiler. We also have a conceptual idea of what a chip finally looks like, with
supply and ground rails and rings obvious around the core, and our cell placement in a stacked
view in the core. We also notice that the same module in our schematic may be placed across the
core, even for such a simple design as an accumulator! This conceptual understanding reinforces
for us how important it is to synthesize a quality clock that guarantees correct
timing.
The tasks that remain for our design prior to the proposal primarily relate to research on the
topic. As of now we have a strong basis when it comes to the introduction of variability into our
design; however we still have room to discover more options on how to reduce the effects of that
variation. We currently know that the project we are working on has the goal of reducing the
energy and area spent on a device, while ideally operating at the same speed or faster than the
current model. The question now becomes how we can convert our knowledge into something
feasible.
We will also need to explore ideas of how to synthesize our clock. Some suggestions that have
been made are the classic clock tree style, but having two trees, one for each of our phases. Another is
adding huge buffers from the sides or top and bottom of our design that can fully drive the two
phases. Yet another is clock buffering, and then optimizing the buffering for two phases. As
found in our research thus far, the buffers that are introduced purely in an SRAM bitcell account
for up to 40% of the area of that bitcell; additionally, they add a leakage current
that must be addressed. On the other hand, they are currently one of the major building blocks
of a working design. We must continue our research in an attempt to find alternative
methods that have been used to reduce the need for these buffers.
A ‘microscopic’ optimization method that has been suggested is to take long paths with bad
variation from the timing report, and route our two phase clock specifically to those places that
need it, whilst feeding the ‘normal’ clock to other places that already meet timing requirements.
In any case that we decide, another good idea would be to fiddle further with the Place and Route
Tool, namely know how to get our timing report, how to place our clock, and how to get
simulation data for power and speed from the tool.
Looking at the upcoming proposal deadline, we will be working on continuing our simulation
efforts trying various synthesis options to familiarize ourselves with the system; additionally we
will continue researching the topics of subthreshold designs looking at delays, variations, and
buffer alternatives. Once we have sufficient knowledge on the subject we will begin determining
the best option when it comes to a design to test paths of various lengths as a way to measure the
effects of clock skew on the overall delay of the circuit.
Looking ahead beyond the proposal, once we have familiarized ourselves with synthesizing,
and we have chosen a valid device to test, we will begin weighing our options on what
methods we should incorporate. Based on the synthesis method we take, we know we will begin
by finding the value of Vdd and the minimum width that would provide the optimum performance;
however, beyond that is where our ‘real’ research begins to take hold. Once the background
information has been decided we will create a design of experiments (DOE), which will serve
as a way to measure the results based on an independent change. After that, the next step is
incorporating our design choices into the current model we choose, and reporting the data.
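A design of experiments of this kind can be enumerated with a few lines of code. The sketch below is only an illustration: the factor names and levels are hypothetical placeholders (the clock styles are loosely named after the options listed earlier), and the baseline is assumed to be the first level of each factor.

```python
from itertools import product

# Hypothetical factor levels, chosen only for illustration.
FACTORS = {
    "vdd_volts": [0.25, 0.30, 0.35],
    "width_multiplier": [1, 2, 4],
    "clock_style": ["two_trees", "edge_buffers", "optimized_buffering"],
}

def full_factorial(factors):
    """Return one dict per run covering every combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]

def one_factor_at_a_time(factors):
    """Change a single factor away from the baseline (first level) per run."""
    baseline = {name: levels[0] for name, levels in factors.items()}
    runs = [dict(baseline)]
    for name, levels in factors.items():
        for level in levels[1:]:
            runs.append({**baseline, name: level})
    return runs

runs = full_factorial(FACTORS)
print(len(runs), "full-factorial runs; first:", runs[0])
print(len(one_factor_at_a_time(FACTORS)), "one-factor-at-a-time runs")
```

The one-factor-at-a-time plan matches the idea of measuring results from an independent change and needs far fewer runs, while the full factorial plan is the one that can also expose interactions between Vdd, width, and clock style.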
List of Documented Notes:
• If the NMOS is strengthened, then the PMOS will not drive VDD due to leakage currents from the
  PDN
• VOH and VOL must be sufficient to prevent functionality errors
• A good idea is to use a butterfly plot as a guideline for designing VOH and VOL parameters
• A good way to test design variability is to perform a Monte Carlo simulation; it will provide
  data showing the average, best and worst case scenarios.
• As you increase the width of a device, the variability will decrease
• As you decrease the voltage, the delay of the circuit will exponentially increase
• Devices should always be made as small as possible, while still abiding by the design constraints
  imposed due to yield and variability
• When looking at stacked devices, the output swing is more important than current variability
• Given a desired yield, you can calculate the optimal Vdd and width of the minimum sized
  device
• As you increase the size of devices, you will offset the effect of variations due to lower voltages;
  however, you will also increase the required energy due to an increase in leakage current
• One way to model the delay variation of a system is to analyze the delay distribution of each
  stage.
• Using the optimum Vdd and Vt will minimize the energy consumption of the design
• The optimum Vdd is not always the lowest operating voltage; there are also performance factors
• As you lower Vdd, the delay increases exponentially, and the leakage energy also increases;
  around 300 mV, the leakage current dominates the circuit
• Two key challenges to ultra low voltage designs that impact the circuit functionality:
  o Ion is << I-strong inversion; this can allow the leakage current to affect the on-state
  o Ion/Ioff is reduced; this weakens the output values, even on strong devices, that is VOH =
    VDD – delta, and VOL = VSS + delta
• To address this, models were derived for output and input requirements, as well as minimum
  operating voltages
• Variation affects the strengths of devices, one major type being a strong PMOS/weak NMOS, or
  vice versa
• The main variation in sub-Vt is from random dopant fluctuation (RDF), where dopants in the
  channel shift Vt, causing exponential effects on Ion/Ioff
• Process variation increases the uncertainty in delay; at a low voltage, an increased sensitivity to
  local variation causes the distribution of delay to widen
• SRAMs typically are the dominant portion of area and power of a system
• Use a butterfly plot to determine if a bitcell is stable; the static noise margin (SNM) indicates the
  maximum noise that can be applied before a state is destroyed. If variation causes both the
  PMOS and NMOS to shift out of the SNM, then the device fails
• As VDD decreases, the read and hold SNM decrease. This introduces the need for a read buffer,
  which can be introduced using an additional 2T on the SRAM.
• Redundancy is good for managing variation
• The MEP changes often, so a detector system can be used to ensure the supply voltage will
  constantly be working at the MEP; this relates back to the DC-DC converter that was discussed
  in another paper.
• There are three main types of systems that benefit from voltage scaling to the sub-threshold
  level: low-speed systems, dynamic speed systems and constant high speed systems -> through
  the use of parallel networks, most speed requirements can be satisfied.
• In low voltage designs, SRAMs require the most power, and typically are the first devices to fail
• In modern designs, active switching and leakage are the dominant sources of energy
• When you reduce the Vdd of a circuit, it will impede the switching speed; additionally it will also
  affect the charge on the capacitor. E = C * Vdd^2; while Vdd is decreasing, the total time to
  charge the capacitor is increasing, thus the integral form of this equation will show that the
  leakage energy actually increases.
• In sub-threshold designs, looking at the short-channel model, VDS will be decreased, which
  results in a reduced DIBL effect. Decreasing the effect of DIBL significantly decreases the
  leakage current.
• Random dopant fluctuation (RDF) and processing variation are dominating effects in modern
  designs; both result in a change in Vt on the devices. As Vt changes, the drain current
  changes, which results in overpowering
• Geometric variation is also a problem, and it has a stronger effect on the channel length,
  resulting in problems involving drift and short-channel effects such as DIBL.
• RDF is the dominating source of variability in sub-Vt systems.
• Another problem is the degradation in drain current due to variation. Due to leakage paths,
  active nodes may provide weaker logic levels than expected (VOH = Vdd – delta, VOL = Vss +
  delta)
• The worst case noise margins can be analyzed through NAND and NOR gates
• Gates with more than two inputs should be avoided in sub-Vt designs
• Variation can compromise logic functionality, which shows the tradeoff: Increase Width ->
  lower Vt variation, increased yield and increased energy consumption. Increase Vdd ->
  increased energy, increased signal levels, decreased variability.
• Failures in SRAM were discussed, along with possible solutions and their tradeoffs, one of which
  uses a read buffer, which gives an 8T design instead of 6T
• The read buffer eliminates the read SNM limitation, which improves the operating margin that is
  now set by the hold SNM and the write margin. The problem, however, is that this buffer
  increases the size of the cell by 25-40%, and adds an additional leakage path
• At the end of the paper, it does discuss possible solutions to the sub-Vt read current
  degradation
• Larger fan-in gates can decrease transistor area
• We will consider statistical timing analysis, with delay dependent upon logic depth and device sizes
• Dynamic frequency is more important than dynamic voltage optimization