Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
School of Engineering A Clustering Utility Based Approach for ASIC Design S. Areibi, M. Thompson, A. Vannelli [email protected] September 2001 14th ASIC/SOC 2001 Outline This paper introduces several approaches for circuit clustering that are used for hierarchical Standard Cell VLSI Placement problem. Introduction (VLSI Design Cycle). Motivation. Circuit Layout (Placement Problem) Clustering Based Standard Cell Placement. Weighted Hyper-edge Clustering. De-Clustering. Top Level Improvement. Numerical Testing and Comparison. Conclusions & Future Work. 14th ASIC/SOC 2001 Introduction Design Automation: is the task of automatically designing a circuit using software tools. The ultimate goal: is to fully automate the tasks of designing, verifying, and testing a circuit. The VLSI design process is a complicated task, and the only feasible approach to solving the VLSI design problem is a divide-andconquer strategy. One of these tasks is physical design, which is still incredibly complex. Not surprisingly, this complexity is handled by dividing the physical design task into more tractable sub-tasks. One sub-task within physical design is placement, in which technology-mapped logic components are arranged on a chip. 14th ASIC/SOC 2001 Motivation As we move to deep sub-micron designs below 0.18 microns, the delay of a circuit, as well as power dissipation and area, is dominated by interconnections between logical elements (i.e. transistors). To deal with the complexity of millions of components and to achieve a turn around time in a couple of months, VLSI design tools must not only be computationally fast but also generate optimal layouts. Since the delay of a circuit cannot be ignored, work must still be done to reduce the area of placement and routing of very high performance designs. There is a great need for DA tools that operate in a reasonable amount of time, while still arriving at “reasonably good’’ solutions. 14th ASIC/SOC 2001 Phases of VLSI System Design 14th ASIC/SOC 2001 Physical Design Process • Physical design is a complex process, therefore, it is broken down into various sub-steps. 14th ASIC/SOC 2001 Layout Styles The VLSI design process includes logical and physical designs of a circuit. The logical design of a circuit is independent of an implementation, while the physical design is inherently linked to the layout style of the target technology that will implement the desired behavior. The layout style dictates many design constraints for physical design, since it must be possible to fabricate the physical design in the desired technology. Different styles are used to alter the design style to achieve some quality gain. 14th ASIC/SOC 2001 Cont .. Layout Styles Different technologies that can implement a VLSI design. 14th ASIC/SOC 2001 Circuit Placement Description: Given a set of modules and nets, assign the modules to legal positions within a placement area such that the interconnection cost is minimized. The set of modules in the network is denoted by M and the set of nets by N. The modules in a net n N is denoted by Mn and the set of nets connected to a module m M is denoted by Nm. In the standard-cell layout, cell modules are placed within R parallel rows in the chip core area such that no cells in a row are overlapping, and a maximum row length is not exceeded. The objective in the VLSI circuit placement is to minimize the total wire-length (x,y) = Wij [(xi - xj )2 + (yi - yj )2] 14th ASIC/SOC 2001 Interconnection Cost/HPWL The cost (HPWL) is given by Cl (x) = i=1..N (H + V ) i i where N is the number of nets, Hi and Vi are the span of the net i in the horizontal and vertical direction separately. 14th ASIC/SOC 2001 Overlap Penalty The cost CO (x) is the overlap penalty function, and given by: CO (x) = pO I<j O(i , j) Where pO is a penalty parameter. The function O(i , j) returns the total amount of overlap area between cells i and j. Certainly, by checking every other cell on the same row as cell $i$ it can be determined which of these cells overlap with cell i. However, the complexity is O(Mi) , where Mi is the number of cells in the row. The time spent doing overlap computation can be substantial. Another way to compute the cost CO (x) is to search toward Wmax (maximum cell width) left away from cell i and toward Wmax right away from cell $i$, all other cells which overlap with cell i can be found. 14th ASIC/SOC 2001 Row Length Penalty The cost Cr (x) is the row length penalty function. It is given by: Cr (x) = pr I=1 ..R || Lai - Ldi || Here pr is a row penalty parameter, R is the number Of rows, Lai and Ldi are the actual and desired row length for row i. pr = 5 is approximately the smallest value which would yield uniform row lengths without placing excessively emphasis on Cr (x) in the objective function. 14th ASIC/SOC 2001 Approaches for Solving Placement The most basic division among approaches is between exact solution methods and approximation methods (Heuristics). 14th ASIC/SOC 2001 Advantages/Disadvantages of Approaches The first approach to a placement problem is to solve it in a top-down fashion, by considering globally the best positions for cells in a placement. The more conventional approach is to use a bottom-up, iterative improvement approach, which attempts to find a good overall solution by looking at one or a few cells movement at at time. A more recent approach to combining these techniques is called hierarchal improvement, and is a two step procedure, first proceeding bottom-up, and then top-down. The bottom-up technique is clustering (reduce complexity) while the goal of the top-down method is to determine the location for all clusters. 14th ASIC/SOC 2001 Multilevel Clustering Hierarchy Early methods of clustering performed the desired circuit size reduction in a single level. Clustering in steps produces superior results permitting more gradual de-clustering. 14th ASIC/SOC 2001 Weighted Hyper-edge Clustering Our work is based on Karypis (METIS) technique used for circuit partitioning. The major addition to simple hyper-edge clustering is the development of cluster size control. Our method is divided into passes: 1. In the first pass, cells hyper edges are greedily clustered together, but only if they are within width limits. 2. In the second pass, remaining cells on hyper edges are also greedily clustered together. 3. A third pass is performed to assign any remaining cells to a new cluster. 14th ASIC/SOC 2001 Weighted Hyper-edge Clustering Sort nets by increasing size For each sorted net If no cell is clustered and sum of cell width is within limits Cluster all cells on nets End If End For For each sorted net If sum of un-clustered cell widths is within limits Cluster all un-clustered cells on net End If End For For each un-clustered cell in circuit Create a new cluster from cell End For 14th ASIC/SOC 2001 Flatten De-Clustering Heuristic Previous methods for clustering-based placement flattened the circuit by placing the cells of a cluster randomly within the physical confines of the cluster at the previous hierarchal level. Since relative positions between cells in a cluster were not considered at any clustered level of the hierarchy, they are not implied at the flattened level, some method must be used to determine legal relative positions for the flattened cells to occupy. To minimize the quality deterioration during circuit flattening, further improvement is performed on the circuit at each flattening stage, using localized search heuristic. In our approach, results were obtained by using the ARP global placer at the top hierarchical level, without any iterative improvement, and then using only FLATTEN during de-clustering. 14th ASIC/SOC 2001 Circuit De-Clustering The average position of all connected pins is calculated for each cell in a cluster. The cells within the cluster are then sorted by their average pin xcoordinate, and given a relative order as they are flattened. 14th ASIC/SOC 2001 Numerical Testing and Comparison Test circuits used: MCNC '91 benchmarks. Ten circuits ranging in size from 125 cells to over 25,000 cells. Circuit Cells Pads Nets Pins Rows Fract 125 24 147 462 6 Prim1 752 81 904 5526 16 Struct 1888 64 1920 5471 21 Ind1 2271 814 2478 8513 15 Prim2 2907 107 3029 18407 28 Bio 6471 97 5742 26947 46 Ind2 12142 495 13419 125555 72 Ind3 15059 374 21940 176584 54 Avq.s 21854 64 22124 82601 80 Avq.l 25114 64 25384 82751 86 14th ASIC/SOC 2001 Clustering Depth Effects of different clustering depths on solution quality using different clustering methods. Results clearly show that, for WHEC, three levels of clustering gives good results for all sizes of circuits. 14th ASIC/SOC 2001 Circuit Size Reduction As we increase the number of clustering levels there is a gradual reduction in the number of cells and nets. 14th ASIC/SOC 2001 Contd.. Circuit Size Reduction As we increase the number of clustering levels there is a gradual reduction in the number of cells and nets. 14th ASIC/SOC 2001 Flatten De-Clustering Results Small Benchmarks 1.4 Wire Length 1.2 1 % - 0.23 0.8 % 0.09 Two Three 0.6 % 3.08 0.4 0.2 0 % 1.71 % 7.09 % 2.50 Fract Prim1 Struct 14th ASIC/SOC 2001 Flatten De-Clustering Results Medium Benchmarks 6 % 0.51 Wire Length 5 4 % 0.76 3 2 1 0 % 0.25 % 0.95 % 0.43 ind1 prim2 % 1.29 Two Three bio 14th ASIC/SOC 2001 Wire Length Flatten De-Clustering Results Large Benchmarks 100 90 80 70 60 50 40 30 20 10 0 % % 0.34 0.40 Two Three % % 0.02 0.14 ind2 % 0.54 ind3 % 1.56 avq. S % 7.67 % 0.97 avq. L 14th ASIC/SOC 2001 Wire-Length Comparison Small Benchmarks 1.4 Wire Length 1.2 1 Flat Ec MHEC WHEC 0.8 0.6 0.4 0.2 0 Fract Prim1 Struct 14th ASIC/SOC 2001 Wire-Length Comparison Medium Benchmarks 6 Wire Length 5 4 Flat Ec MHEC WHEC 3 2 1 0 Ind1 Prim2 Bio 14th ASIC/SOC 2001 Wire-Length Comparison Large Benchmarks 80 Wire Length 70 60 50 Flat Ec MHEC WHEC 40 30 20 10 0 Ind2 Ind3 Avq.S Avg.L 14th ASIC/SOC 2001 Run Time Comparison Small Benchmarks 250 Time 200 Flat Ec MHEC WHEC 150 100 50 0 Fract Prim1 Struct 14th ASIC/SOC 2001 Run Time Comparison Medium Benchmarks 1400 1200 Time 1000 Flat Ec MHEC WHEC 800 600 400 200 0 ind1 Prim2 bio 14th ASIC/SOC 2001 Time Run Time Comparison Large Benchmarks 8000 7000 6000 5000 4000 3000 2000 1000 0 Flat Ec MHEC WHEC ind2 ind3 avq.S avq.L 14th ASIC/SOC 2001 Conclusions ARP is very useful for initial solutions but requires further tuning. The WHEC method of clustering was shown to be very effective when applied to the standard-cell placement problem. Edge clustering and modified hyper edge clustering were shown to perform well but the size limiting feature of WHEC provided results superior to other methods. Average improvement using the FLATTEN heuristic was between 12% for most benchmarks. Little improvement was obtained but the heuristic has a linear time complexity. Comparison with flat placement indicates that the hierarchical placement based on WHEC method reduced wire-length on average by 9% for all circuits and the execution time was lower by 70%. 14th ASIC/SOC 2001 Future Work Our future work attempts to incorporate utility into a stochastic heuristic or a hill climbing technique such as Tabu Search. Also involves further investigation of the modified WHEC clustering technique in addition to enabling the advanced features of the ARP (initial placer) by adaptively tuning the parameters. For further information, contact me: [email protected] Presentation, is available: www.uoguelph.ca/~sareibi 14th ASIC/SOC 2001