ECE260B / CSE241A
Winter 2005
University of California, San Diego
Project List
Note:
1. There is a lot of infrastructure code (UCLApack or OpenAccess) that takes care of most of the
coding required for projects.
2. Matlab, HSpice, Cadence/Synopsys tools are available in CS/EE computer systems.
3. Some literature search engines are at http://vlsicad.ucsd.edu/Resources/litsearch.html
Rules:
1. No project can have more than 2 students.
2. No project can be chosen by more than two different groups. If a project is chosen by more
than two groups then the two groups with lexicographically smallest student ID numbers
will be assigned to the project.
3. You have to form groups and submit each group's member names together with the student IDs to
[email protected] by 12PM on February 7. In the email, indicate up to 5
desired projects for your group, listed in order of priority.
4. You will be sent an email on February 8 before 12PM indicating the projects you are
assigned to according to rule 3.
5. If you have a project suggestion other than the ones listed below, you are very
welcome to talk to Dr. Bao Liu about it before February 6.
6. Feel free to ask if you have questions regarding some project(s).
7. Deadline for project delivery is Wednesday March 16 before 6PM.
1. Transistor sizing / multi-Vth design
The task here is to take a placed and routed design, generate a transistor level netlist for it and
optimize it. Optimization can be for example adjusting transistor widths, or assigning different
transistor threshold voltages in a dual threshold voltage design. This can be done algorithmically
(e.g. TILOS) or in an ad-hoc fashion. See how much timing improvement you get vs. just gate-level
optimization. Also notice the "error" in STA methodology and highlight it. Following timing
optimization, further optimize the design to reduce power consumption without losing any timing.
Deliverables: transistor level netlists before and after optimization,
description of approach, timing/power reports.
2. Transistor level technology remapping
In this project you are asked to apply technology remapping, for example, by combining several
cells to form new library cells, to optimize a transistor level netlist. This can be done
algorithmically (e.g. pattern matching) or in an ad-hoc fashion. Verify your area saving, timing
improvement and power consumption reduction.
Deliverables: netlists before and after optimization, description of approach,
timing/power reports.
3. Low power design
Given a transistor level netlist design, its working frequency and input vector statistics based on its
functionality, reduce its power consumption, for example, by applying dual supply voltages. Make
sure you do not compromise its performance. This can be done algorithmically (e.g. pattern
matching) or in an ad-hoc fashion. There could be additional placement and routing efforts to apply
dual supply voltages. Verify your power consumption reduction.
Deliverables: netlists before and after optimization, description of approach,
timing/power reports.
4. Synthesis/layout of arithmetic modules
In this project you will use Verilog to model various kinds of multipliers (e.g., Booth and Wallace
multipliers). The output should be a comparison table detailing the wirelength, placement
area, speed, etc. of the different multipliers. Is there a trade-off curve? Or are there some multipliers
that completely dominate others? Can you add Built-In Self Test (BIST) hardware to test your
designs?
Deliverables: description of the various multipliers you implemented + results
5. Re-timing
Re-timing reduces longest combinational logic paths by relocating some of the flip-flops, both
logically and physically. You are asked to take a test case and form a working flow to evaluate the
effectiveness of re-timing, with existing tools and/or some of your own scripts. Comparing with
useful clock skew is a plus.
Deliverables: report + scripts + results from implementing some of these techniques
6. Investigation on timing analysis inaccuracies
Investigate timing analysis inaccuracies due to crosstalk, multiple gate input switching, supply
voltage variation, temperature, manufacturing variation, etc. Verification tools can be SPICE,
PrimeTime, VoltageStorm. Try different designs and summarize your results.
Deliverables: Report + script + results from your technique.
7. Distributions in statistical timing
The objective here is to observe and highlight the impact of assumptions on gate-length variability
distributions (if any) on final design timing distributions. Implement a simple statistical timer by
(say) 500 Monte Carlo runs of STA (e.g. PrimeTime). Assume independent gate delays. Assume
gate-delay distributions and generate circuit delay distributions. Delays can be changed in the SDF
file. Interconnect may be ignored. Try this for a few probability distributions (e.g. Gaussian,
asymmetric Gamma, Triangular, etc).
Deliverables: Report + Scripts (random number generators, etc)
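The Monte Carlo timer described in project 7 can be prototyped before any STA tool is involved. The following is a minimal sketch, not the required flow: it replaces the PrimeTime/SDF loop with a plain longest-path computation on a gate-level DAG, and the function names, the 10% spread, and the clamping of negative samples to zero are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def longest_path_delay(edges, gate_delay):
    """Latest arrival time over a DAG of gates (interconnect ignored).

    edges: list of (driver, receiver) gate-name pairs.
    gate_delay: dict mapping every gate name to its delay.
    """
    fanout = defaultdict(list)
    pending = {g: 0 for g in gate_delay}  # unprocessed fan-ins per gate
    for driver, receiver in edges:
        fanout[driver].append(receiver)
        pending[receiver] += 1
    latest_input = {g: 0.0 for g in gate_delay}
    arrival = {}
    ready = [g for g, n in pending.items() if n == 0]
    while ready:  # topological traversal
        g = ready.pop()
        arrival[g] = latest_input[g] + gate_delay[g]
        for r in fanout[g]:
            latest_input[r] = max(latest_input[r], arrival[g])
            pending[r] -= 1
            if pending[r] == 0:
                ready.append(r)
    return max(arrival.values())


# Hypothetical samplers: each maps (rng, nominal delay) to one sample.
SAMPLERS = {
    "gaussian": lambda rng, d: rng.gauss(d, 0.1 * d),
    # Gamma with shape 100 and scale d/100: mean d, std 0.1*d, mild skew.
    "gamma": lambda rng, d: rng.gammavariate(100.0, d / 100.0),
    "triangular": lambda rng, d: rng.triangular(0.8 * d, 1.2 * d, d),
}


def monte_carlo_sta(edges, nominal, sampler, runs=500, seed=0):
    """Circuit-delay samples assuming independent gate delays."""
    rng = random.Random(seed)
    samples = []
    for _ in range(runs):
        # Assumption: negative samples are clamped to zero.
        delays = {g: max(0.0, sampler(rng, d)) for g, d in nominal.items()}
        samples.append(longest_path_delay(edges, delays))
    return samples
```

Calling `monte_carlo_sta` once per entry of `SAMPLERS` on the same netlist gives one circuit-delay sample set per assumed gate-delay distribution, which can then be compared as histograms.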
8. Correlations in statistical timing
Investigate the impact of parameter correlations on delay, i.e., compare statistical timing
analysis results obtained with independent variables against those obtained with correlated variables.
Deliverables: Report + Scripts (random number generators, etc)
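To get a feel for what correlation does in project 8 before running a full statistical timer, one can look at a single path of identical gates. The sketch below is an illustration under stated assumptions: Gaussian gate delays, a single shared variation source, and a pairwise correlation coefficient `rho` between any two gates. None of these modeling choices come from the project description.

```python
import math
import random
import statistics


def path_delay_samples(n_gates, mean, sigma, rho, runs=2000, seed=0):
    """Sample the delay of a chain of n_gates gates.

    Each gate delay is mean + sigma * z, where z mixes one shared
    standard normal (die-wide variation) with one local standard
    normal so that any two gates have correlation rho (0 <= rho <= 1).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(runs):
        shared = rng.gauss(0.0, 1.0)
        total = 0.0
        for _ in range(n_gates):
            local = rng.gauss(0.0, 1.0)
            z = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * local
            total += mean + sigma * z
        samples.append(total)
    return samples


def spread(samples):
    """Sample standard deviation of the path delay."""
    return statistics.stdev(samples)
```

Under this model the path-delay standard deviation grows like `sigma * sqrt(n_gates)` for `rho = 0` and like `sigma * n_gates` for `rho = 1`, so comparing `spread(...)` across a few values of `rho` shows how much an independence assumption can understate the spread.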
9. Worst case crosstalk ATPG
Existing ATPG (automatic test pattern generation) techniques focus on stuck-at faults. In this
project you are asked to automatically generate test patterns that activate the worst-case crosstalk
scenario. In the presence of inductance, the worst-case crosstalk scenario may not be simultaneous
switching of the aggressors in the same direction. Develop your scheme.
Deliverables: Report + script + results from your technique.
10. Crosstalk aggressor alignment for worst case gate delay
The timing of aggressor signal transitions affects the crosstalk-induced delay variation on the victim wire.
The aim here is to find rules of aggressor alignment that lead to worst-case gate delays for
the driver and the receiver cells. You are supposed to run SPICE simulations and collect your
observations. Based on your observations, a small utility program should be able to predict the
aggressor alignment that produces the worst-case gate delays. Verify your aggressor alignment
prediction with SPICE simulation.
Deliverables: Report + Scripts + SPICE simulation results.
11. Noise propagation
Crosstalk-induced noise could trigger the receiver gate and propagate along a netlist. Find an
efficient abstraction of the noise waveforms which could lead to efficient yet accurate calculation
of noise propagation, e.g., through lookup tables. You are supposed to run SPICE simulations to
verify your results.
Deliverables: Report + results from SPICE simulations.
12. Clock skew variation estimation
Clock meshes are used in state-of-the-art designs for clock routing, in contrast to the clock trees of
older designs. This shift in clock network construction methodology is motivated by the fact that meshes
cope better with variability effects. In this project you are asked to use SPICE simulations to
measure the skew of tree routing versus grid clock routing while taking variability effects into
consideration. Can you extend your study to non-tree routings, i.e., clock trees with added
shortcuts? Also include a delay comparison between tree and non-tree structures.
Deliverables: description of mesh and tree techniques + results from SPICE simulations
13. Clock design
A clock network is usually formed by a top-level mesh/network and bottom-level Steiner minimum
trees. The objectives of clock network design are (1) minimum or bounded skew, (2) minimum delay, and
(3) bounded process variation. In this project, you are asked to compare different clock topologies,
or you can evaluate the effectiveness of clock boosters and feedback loops. Verify with SPICE
simulations. Find rules of thumb regarding clock design for different applications.
Deliverables: Reports + results from SPICE simulations
14. Clock driver input alignment
A modern clock network includes several drivers, whose delays are affected by the timing of their
input signal transitions. You are supposed to run SPICE simulations and find the input alignment
of clock network drivers that leads to worst-case driver gate delays.
Deliverables: Report + results from SPICE simulations.
15. Characterization of clock tree synthesis
We want to characterize the performance of Cadence’s clock tree synthesis tool CKSynthesis in
SOCE. You have to evaluate the impact of the skew, insertion delay, buffer library, and FF locations on
the performance of the final tree. Is the tool stable? Or does the outcome change completely under small
perturbations in the input?
Deliverables: a short survey of clock tree routing techniques + results of the aforementioned study.
16. Power/Ground network construction
A power/ground network is usually constructed as a top-level grid with bottom-level tree structures.
Compare tree, grid, and whole-plate power/ground structures, and find optimum parameters of a
power/ground structure to minimize supply voltage drop, for DC or AC supply current sources.
Verify your results.
Deliverables: Report + results from implementing some of these techniques
17. IR drop driven placement
The objective here is to explore placement techniques that can lead to a reduction in IR drop. One
way to do this is to place high-current cells towards the periphery in a peripheral I/O design. A simple
way to implement this is to have a fixed dummy block at the center of the chip and attach fake nets
from it to cell instances in a DEF file. A commercial placer can then be used to place this netlist.
After placement, the fake blocks and nets are deleted. This can lead to IR drop reduction.
Deliverables: Report + scripts + IR drop maps
Dependency: Contact Swamy Muddu ([email protected]) to confirm a working IR drop
analysis flow with VoltageStorm/AstroRail.
18. Impact of dummy fill on timing
The aim is to quantify the impact of dummy fill on post-layout timing. Dummy fill can be inserted
into a layout using SOC Encounter or post-tapeout tools like Calibre/Assura. You should then
extract dummy fill using Fire-n-Ice extractor and compare pre-fill and post-fill timing. Compare
the impact of filling approaches (grounded vs. floating). Talk to Puneet Gupta ([email protected])
if you need help.
Deliverables: Report + Timing Reports + Extracted RSPFs of the design + GDSIIs
(if possible)
Dependency: Contact Swamy Muddu ([email protected]) to confirm a working
fill extraction flow with Fire-n-Ice.
19. Pin ordering impact on wirelength and timing
The objective of this work is to evaluate the impact of I/O pin placement on the final wirelength
and timing results. You are supposed to try the default ordering and random pin orderings, as well as
techniques from the literature or your own, to re-order pins and evaluate the impact on wirelength and
timing.
Deliverables: description of the various I/O placement techniques + results of these techniques
(timing/wirelength/vias/routing violations)
20. The effect of whitespace and aspect ratio on wirelength and timing
Whitespace (empty space) is inserted in layouts in order to increase the routing resources of the
chip. In this project you are required to study the impact of whitespace (and aspect ratio) on timing
and wirelength, say by increasing the whitespace from 0% to 100% and evaluating the impact on both
wirelength and timing. Can you predict what this will look like? For a 300 mm wafer, can you
parameterize the relationship between the number of dies produced, timing, die aspect ratio,
wirelength and whitespace?
Deliverables: quantitative and qualitative description of the relationship between the
aforementioned parameters.
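For the wafer question in project 20, the die-count side of the relationship can be estimated geometrically. The sketch below is one possible starting point under explicit assumptions: whitespace is defined as the empty fraction of the die area, dies sit on a regular grid with a grid corner at the wafer center, and edge exclusion, scribe lines, and yield are ignored. The function names are hypothetical.

```python
import math


def die_dimensions(cell_area, whitespace, aspect_ratio):
    """Return (width, height) of a die in mm.

    Assumes die area = cell_area / (1 - whitespace), with whitespace a
    fraction in [0, 1), and aspect_ratio = width / height.
    """
    area = cell_area / (1.0 - whitespace)
    height = math.sqrt(area / aspect_ratio)
    return aspect_ratio * height, height


def gross_dies(die_w, die_h, wafer_diameter=300.0):
    """Count grid-aligned dies that fit entirely inside the wafer."""
    r = wafer_diameter / 2.0
    nx = int(r // die_w) + 1
    ny = int(r // die_h) + 1
    count = 0
    for i in range(-nx, nx):
        for j in range(-ny, ny):
            x0, y0 = i * die_w, j * die_h
            # The corner farthest from the center decides whether the
            # whole rectangle lies inside the circle.
            fx = max(abs(x0), abs(x0 + die_w))
            fy = max(abs(y0), abs(y0 + die_h))
            if fx * fx + fy * fy <= r * r:
                count += 1
    return count
```

Sweeping `whitespace` and `aspect_ratio` through `die_dimensions` and feeding the result to `gross_dies` gives the die-count axis; the timing and wirelength axes still have to come from the place-and-route experiments.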
21. Implementing the Kernighan-Lin technique in multi-level partitioners
Multi-level partitioners (e.g., MLPart) are the state-of-the-art technique for partitioning large-scale
hypergraphs. At their core, the Fiduccia-Mattheyses (FM) technique is used to partition the
clustered hypergraph. The FM method is preferred over the Kernighan-Lin (KL) method since it is faster;
however, the KL method examines a larger search space and it is probable that KL partitioners
produce solutions of better quality than FM. The objective of this project is to replace the FM
engine of MLPart with a KL engine and evaluate the partitioning results. In addition, the project
will extend the KL engine by implementing S. Dutt's quick KL variant, i.e., the
QuickCut technique described in "New Faster Kernighan-Lin-Type Graph Partitioning
Algorithms".
Deliverables: source code + results + summary of techniques used
22. Evaluating the impact of net models in partitioning-based placers
Net models are traditionally used to transform multi-pin hyperedges into two-pin nets in analytical
placers. In this project we want to evaluate the impact of the different net models in partitioning-based
placers on the final routed wirelength. The motivation for this work is that a multi-pin
hyperedge cut usually translates into a number of cuts in final routing. This project evaluates this
discrepancy between hypergraph partitioning objectives and the routed wirelength. The project
requires converting multi-pin nets in DEF files into a number of 2-pin nets according to the
different net models and evaluating the impact on the placement/final routed wirelength. You will
be using the partitioning-based placer Capo as your placer and Cadence’s WRoute for evaluating
wirelength.
Deliverables: Description of the various net models + the routed wirelength results for the net
models.
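As an illustration of what decomposing a multi-pin net into 2-pin nets means in project 22, here is a minimal sketch of two commonly used net models. It works on plain pin lists rather than DEF, and the edge weight `1/(k-1)` for the clique model and the choice of the first pin as the star center (assumed to be the driver) are assumptions; other weightings appear in the literature.

```python
from itertools import combinations


def clique_model(pins):
    """Replace a k-pin net by all k*(k-1)/2 two-pin edges.

    Each edge gets weight 1/(k-1) (an assumed, common choice) so that
    large nets do not dominate the objective.
    """
    k = len(pins)
    if k < 2:
        return []
    weight = 1.0 / (k - 1)
    return [(a, b, weight) for a, b in combinations(pins, 2)]


def star_model(pins):
    """Replace a k-pin net by k-1 edges from the first pin to the rest.

    Assumes pins[0] is the driver and uses it as the star center.
    """
    if len(pins) < 2:
        return []
    center = pins[0]
    return [(center, p, 1.0) for p in pins[1:]]
```

Each returned `(pin_a, pin_b, weight)` triple would become one 2-pin net in the rewritten netlist handed to the placer.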
23. Simulated annealing placer
In this project you are asked to implement a simulated annealer. The input is a placement produced by
Cadence's placer; the output is a placement that is wirelength-optimized by simulated annealing.
Can you improve upon the placement quality of the commercial placer you used in the lab?
Deliverables: results + algorithm used + tool with source code.
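The core loop of the annealer asked for in project 23 can be sketched in a few lines. This is a toy version under stated assumptions, not a usable placer: cells are equal-sized and only swap slots, the cost is half-perimeter wirelength (HPWL) recomputed from scratch on every move, and the cooling schedule parameters are arbitrary. A real implementation would need incremental cost updates and legal moves for cells of different widths.

```python
import math
import random


def hpwl(nets, pos):
    """Half-perimeter wirelength; pos maps cell name to (x, y)."""
    total = 0
    for net in nets:
        xs = [pos[c][0] for c in net]
        ys = [pos[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def anneal(nets, pos, t_start=10.0, t_end=0.01, alpha=0.95,
           moves_per_t=200, seed=0):
    """Return (best placement, best cost) found by swap-based annealing."""
    rng = random.Random(seed)
    pos = dict(pos)  # work on a copy of the input placement
    cells = list(pos)
    cost = hpwl(nets, pos)
    best, best_cost = dict(pos), cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            a, b = rng.sample(cells, 2)
            pos[a], pos[b] = pos[b], pos[a]  # propose a swap
            new_cost = hpwl(nets, pos)
            delta = new_cost - cost
            # Metropolis rule: always accept improvements, sometimes
            # accept uphill moves depending on the temperature.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best, best_cost = dict(pos), cost
            else:
                pos[a], pos[b] = pos[b], pos[a]  # reject: undo the swap
        t *= alpha  # geometric cooling
    return best, best_cost
```

Because the best placement seen so far is tracked separately, the returned cost is never worse than the cost of the input placement.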
24. Benchmarking for placement
Establishing wirelength lower bounds for circuit netlists remains an open problem. Recently, the
PEKO benchmarks were released. These are benchmarks with known optimal wirelength.
Experimental results on these benchmarks indicate that placers are far from optimal. In this
project we want to evaluate the effect of the number of pins per net on the optimality results. For
example, we want to answer the following questions. Is the placer placing 2-pin nets optimally
while failing to place 5-pin nets optimally? What percentage of 2-pin nets, 3-pin nets, etc. is placed
optimally?
Deliverables: description of the problem + results of the required statistics + your opinion on how
to propose new benchmarking techniques
25. Mixed-block recursive bisection
State-of-the-art designs feature standard cells and large macro blocks. In this project we want to
study the effect of the partitioning algorithm on the final placement of mixed-block designs.
Specifically, you are required to tweak Capo's partitioner to implement (1) plain partitioning
(FengShui) and (2) the ratio cut approach (by Wei and Cheng). Which performs better?
Deliverables: description of the algorithms + implementation of the two approaches (approach 1 is
already implemented) + results
26. Effect of WLM and target frequency on performance
In this project you are required to quantify the effect of wire load models (WLM) and target
frequency on the post-routing timing results.
Deliverables: A survey of WLM methods and your results of evaluating the various WLMs.
27. Test pattern compactor
Automatic Test Pattern Generation (ATPG) tools generate a set of test vectors to test a design's
functionality. However, these patterns often contain redundancies that can be exploited to
reduce the number of test patterns. In this project you are required to survey the test compaction
literature and implement a test compactor. Your input will be the ISCAS benchmarks and their test
vectors. Your output should be a test vector set that is smaller than the input and detects the same
number of faults.
Deliverables: brief survey of the test compaction technique + source code + results + your new
ideas if possible
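One simple baseline for the compactor in project 27 is to treat static compaction as a set cover problem and solve it greedily. The sketch below assumes that a fault simulator has already produced, for every test pattern, the set of faults it detects; that input format and the function name are hypothetical, and greedy set cover is only one of several compaction techniques in the literature.

```python
def compact_patterns(detects):
    """Greedy set cover over test patterns.

    detects: dict mapping pattern id -> set of fault ids it detects
             (assumed to come from a prior fault simulation run).
    Returns a list of pattern ids that together detect every fault
    detected by the full pattern set.
    """
    remaining = set().union(*detects.values())
    chosen = []
    while remaining:
        # Pick the pattern that detects the most still-uncovered faults.
        best = max(detects, key=lambda p: len(detects[p] & remaining))
        chosen.append(best)
        remaining -= detects[best]
    return chosen
```

The result preserves fault coverage by construction, so the quantity to report is how much shorter the returned list is than the original pattern set.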
28. Dynamic power supply
Power gating adds enabling signals to a power supply network; dynamic power supply
management adjusts supply voltage according to data path criticality. You are asked to take a test
case and upgrade its power supply network to dynamic power supply. Verify power reduction of
your technique.
Deliverables: Report + your verification results
29. Clock tree theory
Constructing a zero-skew clock tree can be formulated as constructing a path-length balanced tree
(assuming path delay is proportional to path length), i.e., one with identical path length between the
root and any leaf of the tree. The problem can be posed in a Euclidean plane, a rectilinear plane, or with
other distance metrics. This problem’s computational complexity is open. Can you find an
approximation algorithm for the problem which guarantees a given error bound?
Deliverables: Report + your proof
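Under the path-length delay model stated in project 29, balancing two subtrees at a merge point is a small calculation that any bottom-up construction would repeat. The sketch below shows only that merge step, as a possible building block; it does not address the complexity or approximation question the project poses, and the function name and return convention are assumptions.

```python
def balanced_merge(d_left, d_right, distance):
    """Balance two subtrees under a path-length delay model.

    d_left, d_right: root-to-leaf path length inside each subtree
                     (each subtree is assumed already balanced).
    distance: distance between the two subtree roots in the metric used.
    Returns (len_left, len_right, d_merged): wire lengths from the merge
    point to each subtree root, and the balanced path length from the
    merge point to every leaf.
    """
    # Solve d_left + x == d_right + (distance - x) for x.
    x = (d_right - d_left + distance) / 2.0
    if 0.0 <= x <= distance:
        return x, distance - x, d_left + x
    if x < 0.0:
        # Left subtree is too deep: merge at its root and lengthen
        # (detour) the wire to the right subtree.
        return 0.0, d_left - d_right, d_left
    # Right subtree is too deep: the symmetric case.
    return d_right - d_left, 0.0, d_right
```

In the detour cases the returned wire length exceeds `distance`, which is where extra wirelength enters the total tree cost.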
30. Statistical clock tree design
Clock skew is a function of process variation, i.e., the delay from the clock source to a leaf of the
clock tree is a statistical function. A rule of thumb for minimum process variation clock tree design
is to have balanced branches, i.e., identical buffers from identical distances to the clock source, and
symmetric clock routing branches with identical capacitive loads. Can you have a more flexible
clock tree design scheme, while maintaining a minimized/bounded clock skew from a statistical
point of view?
Deliverables: Report + your proof (theoretical and/or practical verification)
31. Randomized algorithm/approximation scheme for statistical timing analysis
Statistical timing analysis gives a distribution for signal delay at each node in a netlist. A Monte
Carlo simulation can give discrete distribution functions. Can there be a randomized algorithm or
approximation scheme for statistical timing analysis with a guaranteed error bound?
Deliverables: Report + your proof (theoretical and/or practical verification)
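As a possible starting point for project 31, plain Monte Carlo already comes with a simple probabilistic guarantee when the target is the value of the delay CDF at one fixed time point. By Hoeffding's inequality, with independent runs the empirical fraction of runs meeting the time point is within `eps` of the true probability with confidence at least `1 - delta` once the number of runs reaches `ln(2/delta) / (2 * eps^2)`. The helper below just evaluates that bound; its name is hypothetical, and the bound says nothing about the whole distribution at once or about smarter sampling schemes.

```python
import math


def samples_needed(eps, delta):
    """Runs needed so that an empirical probability is within eps of the
    true value with probability at least 1 - delta (Hoeffding bound)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps * eps))
```

The count depends only on `eps` and `delta`, not on the size of the netlist, although the cost of each run does.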
Good Luck!