ECE260B / CSE241A Winter 2005
University of California, San Diego

Project List

Note:
1. There is a lot of infrastructure code (UCLApack or OpenAccess) that takes care of most of the coding required for the projects.
2. Matlab, HSpice, and Cadence/Synopsys tools are available on the CS/EE computer systems.
3. Some literature search engines are listed at http://vlsicad.ucsd.edu/Resources/litsearch.html

Rules:
1. No project can have more than 2 students.
2. No project can be chosen by more than two different groups. If a project is chosen by more than two groups, the two groups with the lexicographically smallest student ID numbers will be assigned to the project.
3. You have to form groups and submit the group members' names together with their student IDs to [email protected] by 12PM on February 7. In the email, indicate up to 5 desired projects for your group, listed in order of priority.
4. You will be sent an email on February 8 before 12PM indicating the project you are assigned to according to rule 3.
5. If you have a project suggestion different from the ones listed below, you are very welcome to talk to Dr. Bao Liu about it before February 6.
6. Feel free to ask if you have questions regarding any project.
7. The deadline for project delivery is Wednesday, March 16, before 6PM.

1. Transistor sizing / multi-Vth design
The task here is to take a placed and routed design, generate a transistor-level netlist for it, and optimize it. Optimization can be, for example, adjusting transistor widths or assigning different transistor threshold voltages in a dual-threshold-voltage design. This can be done algorithmically (e.g. TILOS) or in an ad-hoc fashion. See how much timing improvement you get vs. just gate-level optimization. Also notice the "error" in STA methodology and highlight it. Following timing optimization, further optimize the design to reduce power consumption without losing any timing.
Deliverables: transistor-level netlists before and after optimization, description of approach, timing/power reports.

2. Transistor level technology remapping
In this project you are asked to apply technology remapping, for example by combining several cells to form new library cells, to optimize a transistor-level netlist. This can be done algorithmically (e.g. pattern matching) or in an ad-hoc fashion. Verify your area saving, timing improvement, and power consumption reduction.
Deliverables: netlists before and after optimization, description of approach, timing/power reports.

3. Low power design
Given a transistor-level netlist, its working frequency, and input vector statistics based on its functionality, reduce its power consumption, for example by applying dual supply voltages. Make sure you do not compromise its performance. This can be done algorithmically (e.g. pattern matching) or in an ad-hoc fashion. Additional placement and routing effort may be needed to apply dual supply voltages. Verify your power consumption reduction.
Deliverables: netlists before and after optimization, description of approach, timing/power reports.

4. Synthesis/layout of arithmetic modules
In this project you will use Verilog to model various kinds of multipliers (e.g., Booth and Wallace multipliers). The output should be a comparison table detailing the wirelength, placement area, speed, etc. of the different multipliers. Is there a trade-off curve? Or are there some multipliers that completely dominate others? Can you add Built-In Self Test (BIST) hardware to test your designs?
Deliverables: description of the various multipliers you implemented + results

5. Re-timing
Re-timing reduces the longest combinational logic paths by relocating some of the flip-flops, both logically and physically. You are asked to take a test case and form a working flow to evaluate the effectiveness of re-timing, with existing tools and/or some of your own scripts.
Comparing with useful clock skew is a plus.
Deliverables: report + scripts + results from implementing some of these techniques

6. Investigation on timing analysis inaccuracies
Investigate timing analysis inaccuracies due to crosstalk, multiple gate input switching, supply voltage variation, temperature, manufacturing variation, etc. Verification tools can be SPICE, PrimeTime, VoltageStorm. Try different designs and summarize your results.
Deliverables: Report + script + results from your technique.

7. Distributions in statistical timing
The objective here is to observe and highlight the impact of assumptions on gate-length variability distributions (if any) on final design timing distributions. Implement a simple statistical timer by (say) 500 Monte Carlo runs of STA (e.g. PrimeTime). Assume independent gate delays. Assume gate-delay distributions and generate circuit delay distributions. Delays can be changed in the SDF file. Interconnect may be ignored. Try this for a few probability distributions (e.g. Gaussian, asymmetric Gamma, Triangular, etc.).
Deliverables: Report + Scripts (random number generators, etc.)

8. Correlations in statistical timing
Investigate the impact of parameter correlations on delay, i.e., compare statistical timing analysis results obtained with independent variables against those obtained with correlated variables.
Deliverables: Report + Scripts (random number generators, etc.)

9. Worst case crosstalk ATPG
Existing ATPG (automatic test pattern generation) techniques focus on stuck-at faults. In this project you are asked to automatically generate test patterns that activate the worst-case crosstalk scenario. In the presence of inductance, the worst-case crosstalk scenario may not be simultaneous switching of the aggressors in the same direction. Develop your scheme.
Deliverables: Report + script + results from your technique.

10. Crosstalk aggressor alignment for worst case gate delay
The timing of aggressor signal transitions affects the crosstalk-induced delay variation on the victim wire. The aim here is to figure out rules of aggressor alignment that lead to worst-case gate delays for the driver and the receiver cells. You are supposed to run SPICE simulations and collect your observations. Based on your observations, a small utility program should be able to predict the aggressor alignment that produces the worst-case gate delays. Verify your aggressor alignment prediction with SPICE simulation.
Deliverables: Report + Scripts + SPICE simulation results.

11. Noise propagation
Crosstalk-induced noise can trigger the receiver gate and propagate along a netlist. Find an efficient abstraction of the noise waveforms that leads to efficient yet accurate calculation of noise propagation, e.g., through lookup tables. You are supposed to run SPICE simulations to verify your results.
Deliverables: Report + results from SPICE simulations.

12. Clock skew variation estimation
Clock meshes are used in state-of-the-art designs to construct clock routing, in contrast to the clock trees of older designs. This shift in clock construction methodology is motivated by the fact that meshes cope better with variability effects. In this project you are asked to use SPICE simulations to measure the skew of tree routing versus grid clock routing while taking variability effects into consideration. Can you extend your study to non-tree routings, i.e., clock trees with added shortcuts? Also include a delay comparison between tree and non-tree structures.
Deliverables: description of mesh and tree techniques + results from SPICE simulations

13. Clock design
A clock network is usually formed by a top-level mesh/network and bottom-level Steiner minimum trees. The objectives of clock network design are (1) minimum or bounded skew, (2) minimum delay, and (3) bounded process variation.
In this project, you are asked to compare different clock topologies, or you can evaluate the effectiveness of clock boosters and feedback loops. Verify with SPICE simulations. Find rules of thumb regarding clock design for different applications.
Deliverables: Reports + results from SPICE simulations

14. Clock driver input alignment
A modern clock network includes several drivers, whose delays are affected by the timing of their input signal transitions. You are supposed to run SPICE simulations and find the input alignment of clock network drivers that leads to worst-case driver gate delays.
Deliverables: Report + results from SPICE simulations.

15. Characterization of clock tree synthesis
We want to characterize the performance of Cadence's clock tree synthesis tool CKSynthesis in SOCE. You have to evaluate the impact of the skew, insertion delay, buffer library, and FF locations on the performance of the final tree. Is the tool stable? Or does the outcome change completely under small perturbations in the input?
Deliverables: a short survey of clock tree routing techniques + results of the aforementioned study.

16. Power/Ground network construction
A power/ground network is usually constructed as a top-level grid and bottom-level tree structures. Compare tree, grid, and whole-plate power/ground structures, and find the optimum parameters of a power/ground structure to minimize supply voltage drop, for DC or AC supply current sources. Verify your results.
Deliverables: Report + results from implementing some of these techniques

17. IR drop driven placement
The objective here is to explore placement techniques that can lead to a reduction in IR drop. One way to do this is to place high-current cells towards the periphery in a peripheral I/O design. A simple way to implement this is to have a fixed dummy block at the center of the chip and attach fake nets from it to cell instances in a DEF file. A commercial placer can then be used to place this netlist. After placement, the fake blocks and nets are deleted. This can lead to IR drop reduction.
Deliverables: Report + scripts + IR drop maps
Dependency: Contact Swamy Muddu ([email protected]) to confirm a working IR drop analysis flow with VoltageStorm/AstroRail.

18. Impact of dummy fill on timing
The aim is to quantify the impact of dummy fill on post-layout timing. Dummy fill can be inserted into a layout using SOC Encounter or post-tapeout tools like Calibre/Assura. You should then extract the dummy fill using the Fire-n-Ice extractor and compare pre-fill and post-fill timing. Compare the impact of filling approaches (grounded vs. floating). Talk to Puneet Gupta ([email protected]) if you need help.
Deliverables: Report + Timing Reports + Extracted RSPFs of the design + GDSIIs (if possible)
Dependency: Contact Swamy Muddu ([email protected]) to confirm a working fill extraction flow with Fire-n-Ice.

19. Pin ordering impact on wirelength and timing
The objective of this work is to evaluate the impact of I/O pin placement on the final wirelength and timing results. You are supposed to try the default ordering and random pin orderings, as well as techniques from the literature or of your own, to re-order pins and evaluate the impact on wirelength and timing.
Deliverables: description of the various I/O placement techniques + results of these techniques (timing/wirelength/vias/routing violations)

20. The effect of whitespace and aspect ratio on wirelength and timing
Whitespace (empty space) is inserted in layouts in order to increase the routing resources of the chip. In this project you are required to study the impact of whitespace (and aspect ratio) on timing and wirelength, say by increasing the whitespace from 0% to 100% and evaluating the impact on both wirelength and timing. Can you predict what this will look like? For a 300 mm wafer, can you parameterize the relationship between the number of dies produced, timing, die aspect ratio, wirelength, and whitespace?
Deliverables: quantitative and qualitative description of the relationship between the aforementioned parameters.

21. Implementing the Kernighan-Lin technique in multi-level partitioners
Multi-level partitioners (e.g., MLPart) are the state-of-the-art technique for partitioning large-scale hypergraphs. At their core, the Fiduccia-Mattheyses (FM) technique is used to partition the clustered hypergraph. The FM method is preferred over Kernighan-Lin (KL) because it is faster; however, the KL method examines a larger search space, and it is probable that KL partitioners produce solutions of better quality than FM. The objective of this project is to replace the FM engine of MLPart with a KL engine and evaluate the partitioning results. In addition, the project will extend the KL engine by implementing the quick KL implementation of S. Dutt, i.e., the QuickCut technique described in "New Faster Kernighan-Lin-Type Graph Partitioning Algorithms".
Deliverables: source code + results + summary of techniques used

22. Evaluating the impact of net models in partitioning-based placers
Net models are traditionally used to transform multi-pin hyperedges into two-pin nets in analytical placers. In this project we want to evaluate the impact of the different net models in partitioning-based placers on the final routed wirelength. The motivation for this work is that a multi-pin hyperedge cut usually translates into a number of cuts in the final routing. This project evaluates this discrepancy between hypergraph partitioning objectives and the routed wirelength. The project requires processing multi-pin nets in DEF files into a number of 2-pin nets according to the different net models and evaluating the impact on the placement and final routed wirelength. You will be using the partitioning-based placer Capo as your placer and Cadence's WRoute for evaluating wirelength.
Deliverables: Description of the various net models + the routed wirelength results for the net models.

23. Simulated annealing placer
In this project you are asked to implement a simulated annealer. The input is a placement produced by the Cadence placer; the output is a placement that is wirelength-optimized by simulated annealing. Can you improve upon the placement quality of the commercial placer you used in the lab?
Deliverables: results + algorithm used + tool with source code.

24. Benchmarking for placement
Establishing wirelength lower bounds for circuit netlists remains an open problem. Recently, the PEKO benchmarks were released. These are benchmarks with known optimal wirelength. Experimental results on these benchmarks indicate that placers are far from optimal. In this project we want to evaluate the effect of the number of net pins on the optimality results. For example, we want to answer the following questions. Is the placer placing 2-pin nets optimally while failing to place 5-pin nets optimally? What percentage of 2-pin nets, 3-pin nets, etc. is placed optimally?
Deliverables: description of the problem + results of the required statistics + your opinion on how to propose new benchmarking techniques

25. Mixed-block recursive bisection
State-of-the-art designs feature standard cells and large macro blocks. In this project we want to study the effect of the partitioning algorithm on the final placement of mixed-block designs. Specifically, you are required to tweak Capo's partitioner to implement (1) plain partitioning (FengShui) and (2) the ratio cut approach (by Wei and Cheng). Which performs better?
Deliverables: description of the algorithms + implementation of the two approaches (approach 1 is already implemented) + results

26. Effect of WLM and target frequency on performance
In this project you are required to quantify the effect of the WireLength Models (WLM) and target frequency on the post-routing timing results.
Deliverables: A survey of WLM methods and your results of evaluating the various WLMs.

27. Test pattern compactor
Automatic Test Pattern Generation (ATPG) tools generate a set of test vectors to test a design's functionality. However, these patterns often contain redundancies that can be exploited to reduce the number of test patterns. In this project you are required to survey the test compaction literature and implement a test compactor. Your input will be the ISCAS benchmarks and their test vectors. Your output should be a test vector set that is smaller than the input and tests the same number of faults.
Deliverables: brief survey of the test compaction techniques + source code + results + your new ideas if possible

28. Dynamic power supply
Power gating adds enabling signals to a power supply network; dynamic power supply management adjusts the supply voltage according to data path criticality. You are asked to take a test case and upgrade its power supply network to a dynamic power supply. Verify the power reduction of your technique.
Deliverables: Report + your verification results

29. Clock tree theory
Constructing a zero-skew clock tree can be formulated as constructing a path-length-balanced tree (assuming path delay is proportional to path length), i.e., a tree with identical path length between the root and any leaf. The problem can be posed in a Euclidean plane, a rectilinear plane, or with other distance metrics. The computational complexity of this problem is open. Can you find an approximation algorithm for the problem that guarantees a given error bound?
Deliverables: Report + your proof

30. Statistical clock tree design
Clock skew is a function of process variation, i.e., the delay from the clock source to a leaf of the clock tree is a statistical function. A rule of thumb for minimum-process-variation clock tree design is to have balanced branches, i.e., identical buffers at identical distances from the clock source, and symmetric clock routing branches with identical capacitive loads.
Can you devise a more flexible clock tree design scheme while maintaining a minimized/bounded clock skew from a statistical point of view?
Deliverables: Report + your proof (theoretical and/or practical verification)

31. Randomized algorithm/approximation scheme for statistical timing analysis
Statistical timing analysis gives a distribution for the signal delay at each node in a netlist. A Monte Carlo simulation can give discrete distribution functions. Can there be a randomized algorithm or approximation scheme for statistical timing analysis with a guaranteed error bound?
Deliverables: Report + your proof (theoretical and/or practical verification)

Good Luck!
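
As a starting point for project 7, the Monte Carlo flow it describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the five-gate netlist, the nominal delays, the 10% sigma, and all function names are hypothetical, and a longest-path computation over a topologically ordered netlist stands in for an actual STA run (e.g. PrimeTime on a modified SDF file). Gate delays are sampled independently and interconnect is ignored, as the project text allows.

```python
import random
import statistics

# Hypothetical toy netlist: gate -> (nominal delay in ns, fan-in gates).
# Gates are listed in topological order, so fan-ins always appear first.
NETLIST = {
    "g1": (1.0, []),
    "g2": (1.2, []),
    "g3": (0.8, ["g1", "g2"]),
    "g4": (1.5, ["g2"]),
    "g5": (1.0, ["g3", "g4"]),
}
SIGMA_FRAC = 0.1  # assumption: standard deviation is 10% of the nominal delay


def sample_delay(nominal, dist, rng):
    """Draw one gate delay with mean `nominal` and std SIGMA_FRAC * nominal."""
    sigma = SIGMA_FRAC * nominal
    if dist == "gaussian":
        return max(0.0, rng.gauss(nominal, sigma))  # clip negative delays
    if dist == "gamma":
        # Shape k and scale theta chosen so that k*theta = nominal
        # and k*theta^2 = sigma^2.
        k = (nominal / sigma) ** 2
        theta = sigma ** 2 / nominal
        return rng.gammavariate(k, theta)
    if dist == "triangular":
        # A symmetric triangular distribution on [m - w, m + w]
        # has variance w^2 / 6.
        w = sigma * 6 ** 0.5
        return rng.triangular(nominal - w, nominal + w, nominal)
    raise ValueError(f"unknown distribution: {dist}")


def circuit_delay(delays):
    """Longest-path arrival time over the netlist for one set of gate delays."""
    arrival = {}
    for gate, (_, fanins) in NETLIST.items():
        latest_input = max((arrival[f] for f in fanins), default=0.0)
        arrival[gate] = latest_input + delays[gate]
    return max(arrival.values())


def monte_carlo(dist, runs=500, seed=0):
    """Return `runs` circuit-delay samples with independent gate delays."""
    rng = random.Random(seed)
    samples = []
    for _ in range(runs):
        delays = {g: sample_delay(nom, dist, rng)
                  for g, (nom, _) in NETLIST.items()}
        samples.append(circuit_delay(delays))
    return samples


if __name__ == "__main__":
    for dist in ("gaussian", "gamma", "triangular"):
        s = monte_carlo(dist)
        print(dist, round(statistics.mean(s), 3), round(statistics.stdev(s), 3))
```

Because the three distributions are matched in mean and standard deviation, any difference in the resulting circuit-delay histograms comes from their shape alone, which is the comparison project 7 asks for. For project 8, the independent draws in `monte_carlo` would be replaced by correlated ones.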