Evolution-based Standard Cell Placement (ESP)
ECE 556 Final Project
Jason Woods
12/15/2004
I. Introduction
I.1 Motivation and overview
VLSI CAD tool development is an interesting area of study that poses many challenges.
These stem from the fact that most problems that need to be addressed in VLSI CAD tool
development cannot feasibly be solved exactly in a reasonable amount of time due to the
immense size of the solution space. These problems are classified as NP-hard or
NP-complete, and they require some form of heuristic in order to generate an approximate
solution.
Standard cell placement is one such problem in VLSI CAD tool design that requires
heuristics to solve. Many such heuristics have been introduced to generate approximate
placement solutions. The most notable are Simulated Annealing, which was successfully
implemented in the TimberWolf package, followed by Min-Cut placement and genetic
algorithms.
Evolution-based Standard Cell Placement (ESP) is offered as an alternative heuristic that
promises better performance than its predecessors. ESP is described as a cross between a
genetic algorithm and Simulated Annealing. ESP uses a natural selection process similar
to the one found in genetic algorithms to eliminate ill-suited placements. It also uses
iterative improvements similar to Simulated Annealing in order to converge on a
near-optimal solution.
I.2 Comparison to Simulated Annealing and Genetic Algorithms
Although ESP is an amalgam of Simulated Annealing and genetic placement, there are
several major differences between the ESP heuristic and the two previously mentioned
heuristics.
Simulated Annealing (SA) has proven to be a versatile algorithm and, given enough
CPU time, will generate near-optimal solutions. However, the problem with Simulated
Annealing is that it is dependent upon parameter optimization. There are no clearly
defined patterns for parameter optimization, which means the parameters have to be
tailored to each problem through trial and error. ESP requires no parameter optimization
to generate a near-optimal solution.
Another difference between ESP and SA is that SA uses a pairwise exchange of cells to
alter the layout, whereas ESP permits the simultaneous removal and redistribution of a
large number of cells from their current positions.
Genetic Placement (GP) also produces near-optimal solutions by evolving
solutions from one generation to the next. GP maintains a set of solutions called parents
for each generation. Merging the placements of two parents into a child generates new
solutions. This is where ESP differs from GP: ESP generates one child from one parent
during each generation.
ESP also differs from GP in the method of selecting cells to be passed on to the next
generation. GP uses a random process to select a set of cells from the parent, while ESP
determines the “goodness” of each cell at its current position to determine its survival
between generations. This allows ESP to converge much faster than GP, thereby
performing far fewer iterations.
The strengths of SA and GP, coupled together, give ESP its strength. Its main
advantages are that it is uncomplicated and easy to adapt, and that it achieves
near-optimal solutions with a comparatively fast execution time.
II. The Algorithm and Implementation Details
The algorithm consists of two main phases. The first phase is the setup phase in which
the parameters are set, the input file is read and the initial setup is performed. The second
phase is the main iterative process in which a near optimal solution is generated. The
algorithm flow is depicted below.
Figure 1: Block Diagram of Algorithm
II.1 Initial Setup
The first two steps (shown above) are the initial I/O. The initial I/O consists of the
interactive parameter input, where the user can specify optimization parameters, and the
reading of the input file containing the circuit description. The input file format is as
follows:
[parameters]
aspectration=<width>:<height>
rowlength=<length>
cellheight=<height>
[nets]
N0 <weight>
N1 <weight>
.
Figure 2: Format of Input File
Worth noting is the format of the cell descriptions. Each cell is represented by C
followed by a cell index (e.g. C0, C1, etc.), then a width, and lastly a connection list. The
connection list contains a list of nets (N0, N1, …, Nw), each with a port assignment
([<x>,<y>]). The coordinates defined in the port assignment are relative to the position
of the cell itself. After the input parameters and circuit description are read in, all internal
data structures are initialized.
The next step in the initial setup is chip floorplanning. During this stage the layout height
and width are determined in accordance with the specified aspect ratio. Two important
values to consider are the average row length and the maximum row length. The average
row length is computed directly from the aspect ratio and serves as a guideline for cell
placement. In order to maintain the aspect ratio, rows should not exceed the average
row length. The maximum row length is set directly as a parameter and acts as a hard
cap: if a row must exceed the average row length, it still cannot exceed the maximum
row length.
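As an illustration of the floorplanning step, the following sketch derives a row count and an average row length from an aspect ratio. The report does not give the exact formula, so this assumes the layout area equals the total cell area (no routing space) and that all rows share one cell height; the function name `floorplan` and its parameters are hypothetical.

```python
import math

def floorplan(cell_widths, cell_height, aspect_w, aspect_h):
    """Return (num_rows, average_row_length) for a width:height target.

    Assumption: layout area == total cell area, no routing channels.
    """
    total_width = sum(cell_widths)
    total_area = total_width * cell_height
    # Solve W * H = total_area together with W / H = aspect_w / aspect_h.
    target_width = math.sqrt(total_area * aspect_w / aspect_h)
    target_height = total_area / target_width
    # Rows are stacked vertically, each one cell_height tall.
    num_rows = max(1, round(target_height / cell_height))
    average_row_length = math.ceil(total_width / num_rows)
    return num_rows, average_row_length
```

For sixteen cells of width 4 and height 4 with a 1:1 aspect ratio, this yields four rows with an average row length of 16.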
Once floorplanning is completed, an initial placement is generated by placing each cell
sequentially in each row of the placement grid, making sure not to exceed the average
row length when possible.
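A minimal sketch of such a sequential initial placement is given here. It assumes cells are identified by their index in a list of widths and that any overflow is absorbed by the last row; the function name `initial_placement` is hypothetical, and the maximum row length check is left out for brevity.

```python
def initial_placement(cell_widths, num_rows, average_row_length):
    """Fill rows left to right; returns a list of rows of cell indices."""
    rows = [[] for _ in range(num_rows)]
    lengths = [0] * num_rows
    row = 0
    for index, width in enumerate(cell_widths):
        would_overflow = lengths[row] + width > average_row_length
        # Start a new row when the current one is non-empty and would
        # overflow, unless we are already in the last row.
        if would_overflow and lengths[row] > 0 and row < num_rows - 1:
            row += 1
        rows[row].append(index)
        lengths[row] += width
    return rows
```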
After the initial placement is generated, precomputation is done. Precomputation
calculates each net's optimal wire length. In order to compute the optimal wire length for
a net, all the cells the net connects are assumed to be placed right next to each other in a
quasi-square configuration. The optimal wire length is then the semi-perimeter of the
bounding rectangle of that cell configuration. The process is depicted in the figure below.
To further increase the accuracy of this approximation, the optimal wire length is allowed
to adjust during the main iterative process if a new optimal wire length is found.
Figure 3: Example of a net's optimal wire length calculation
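One way to approximate this quasi-square estimate is sketched below. The report does not specify how the configuration is built, so this is only an assumed idealisation: the cells are imagined stacked in 1 to k rows, the row width is taken as an even split of the total width (never narrower than the widest cell), and the smallest resulting semi-perimeter is kept. The name `optimal_wire_length` is hypothetical.

```python
import math

def optimal_wire_length(cell_widths, cell_height):
    """Estimated semi-perimeter of a quasi-square block of the given cells."""
    total_width = sum(cell_widths)
    widest = max(cell_widths)
    best = None
    for rows in range(1, len(cell_widths) + 1):
        # Idealised row width: even split, but at least the widest cell.
        row_width = max(widest, math.ceil(total_width / rows))
        semi_perimeter = row_width + rows * cell_height
        if best is None or semi_perimeter < best:
            best = semi_perimeter
    return best
```

For four cells of width 4 and height 4, the best arrangement found is two rows of width 8, giving a semi-perimeter of 16.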
II.2 Main Iterative Process
The main iterative process consists of routines which form the evolutionary algorithm.
The iterative process is executed until no further improvement can be achieved.
The first step in the main iterative process is the mutation procedure. Mutation is
performed by simply selecting two cells at random and swapping them. The mutation
procedure is followed by a collision resolving routine that cleans up any overlaps or
spaces resulting from the mutation.
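The mutation step itself can be sketched in a few lines. This assumes the placement is held as a flat list of cell identifiers; the function name `mutate` and the optional `rng` argument (useful for reproducible runs) are illustrative, and collision resolution is not shown.

```python
import random

def mutate(placement, rng=random):
    """Swap two distinct, randomly chosen cells in place and return the list."""
    if len(placement) < 2:
        return placement
    # sample() guarantees two different indices.
    i, j = rng.sample(range(len(placement)), 2)
    placement[i], placement[j] = placement[j], placement[i]
    return placement
```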
The next step is the evaluation of the current layout. Evaluation works by computing a
placement value for each cell. The placement value is determined by averaging the
“goodness” of each net connected to the cell and normalizing the value from 0 to 100%.
A net's “goodness” is computed from the ratio of the net's current wire length to its
optimal wire length; the closer the current length is to the optimal length, the higher the
goodness. The current wire length of the net is determined by calculating
the semi-perimeter of the bounding rectangle of all the cell ports the net connects.
Figure 4: Example of a net's current wire length calculation
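The evaluation step might look roughly like the sketch below. The semi-perimeter calculation follows the description above; the exact normalization is not given in the report, so the form used here (optimal length divided by current length, capped at 1.0) is an assumption chosen so that a net at its optimal length scores 100%. All function names are hypothetical.

```python
def net_wire_length(ports):
    """Semi-perimeter of the bounding rectangle of absolute (x, y) ports."""
    xs = [x for x, _ in ports]
    ys = [y for _, y in ports]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def net_goodness(current, optimal):
    """Assumed normalization: optimal / current, capped at 1.0 (100%)."""
    if current <= optimal:
        return 1.0
    return optimal / current

def placement_value(net_lengths):
    """Average goodness over the (current, optimal) pairs of a cell's nets."""
    if not net_lengths:
        return 1.0  # assumption: an unconnected cell is trivially well placed
    total = sum(net_goodness(c, o) for c, o in net_lengths)
    return total / len(net_lengths)
```

A cell on two nets, one at twice its optimal length and one at its optimal length, would score (0.5 + 1.0) / 2 = 0.75 under this assumption.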
After the evaluation process is complete, each cell is judged to decide whether it will
retain its current position in the next generation or be removed and scheduled for
allocation. Each cell's chance of survival is equal to its placement value. A random
number between 0 and 100% is generated for each cell. If the cell's placement value is
greater than the random number, the cell will survive. If the cell does not survive, it is
removed from the layout and placed in an allocation queue for redistribution. Two
user-defined parameters are available to tailor the judgment routine. First, a user can put an
upper limit on the number of cells that can be scheduled for allocation each iteration.
Secondly, the user can globally increase or decrease the survival chances of the cells.
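The judgment routine can be sketched as follows. Placement values are taken as fractions in [0, 1]; the report does not say how the two user parameters are applied, so the additive `bias` and the `max_removed` cap are assumed forms, and the function name `judge` is hypothetical.

```python
import random

def judge(placement_values, bias=0.0, max_removed=None, rng=random):
    """Return the ids of cells that fail the survival roll.

    placement_values: dict of cell id -> placement value in [0, 1].
    bias: assumed additive shift applied to every survival chance.
    max_removed: assumed cap on removals per iteration (None = no cap).
    """
    removed = []
    for cell, value in placement_values.items():
        if max_removed is not None and len(removed) >= max_removed:
            break
        # The cell survives only if its value exceeds the random number.
        if value + bias <= rng.random():
            removed.append(cell)
    return removed
```

With this rule a cell at 100% always survives and a cell at 0% is always removed, subject to the cap.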
The allocation queue is a priority queue ordered by the number of connections a cell has,
so the cell with the most connections is placed first. To eliminate the need to sort
the queue, a modified binary search algorithm determines the index at which to insert a
cell into the queue.
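The idea of inserting by binary search instead of sorting can be illustrated with Python's standard `bisect` module. This is a stand-in for the report's own modified binary search, which is not shown; the class name `AllocationQueue` and its methods are hypothetical.

```python
import bisect

class AllocationQueue:
    """Keeps cells ordered by connection count, highest first."""

    def __init__(self):
        self._keys = []   # negated connection counts, kept ascending
        self._cells = []  # cell ids, parallel to _keys

    def insert(self, cell, connections):
        # Negating the count turns "most connections first" into an
        # ascending order that bisect can search. bisect_right keeps
        # cells with equal counts in insertion order.
        index = bisect.bisect_right(self._keys, -connections)
        self._keys.insert(index, -connections)
        self._cells.insert(index, cell)

    def pop(self):
        """Remove and return the cell with the most connections."""
        self._keys.pop(0)
        return self._cells.pop(0)

    def __len__(self):
        return len(self._cells)
```

Because every insertion lands at its sorted position, the queue is always ready to be drained from the front without a separate sort.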
Because no separate sorting step is needed, the allocation phase comes next. During the
allocation phase each cell in the allocation queue is redistributed onto the layout. The
redistribution process is fairly complex. A cell is removed from the queue and tentatively
placed at every empty location in the layout. For each location at which the cell is placed,
a modified evaluation routine is run to determine the placement value of the cell at that
location. The modified evaluation routine simply determines the wire length of the nets
connecting to the cell. Only nets that connect to cells already placed on the grid are
considered in the wire length calculation. After the cell has been tried in each empty
location, the location with the smallest wire length is chosen and the cell is placed there.
Cells are only added to rows if the total row length does not exceed the average row
length. If cells remain after this process, they are added to rows in which there is enough
room, provided the row does not exceed the maximum row length.
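The core of the redistribution step, choosing the empty location with the smallest wire length, can be sketched as below. The data layout (dictionaries keyed by net name) and the function name `best_location` are assumptions made for illustration, and the row length checks described above are omitted to keep the example short.

```python
def best_location(empty_slots, cell_ports, nets):
    """Pick the empty slot that gives the smallest total wire length.

    empty_slots: list of (x, y) candidate positions for the cell.
    cell_ports: dict of net name -> (dx, dy) port offset on the cell.
    nets: dict of net name -> absolute (x, y) ports of cells already placed.
    """
    def wire_length(origin):
        total = 0
        for net, (dx, dy) in cell_ports.items():
            placed = nets.get(net, [])
            if not placed:
                continue  # no placed cell on this net yet, so it adds nothing
            points = placed + [(origin[0] + dx, origin[1] + dy)]
            xs = [x for x, _ in points]
            ys = [y for _, y in points]
            total += (max(xs) - min(xs)) + (max(ys) - min(ys))
        return total

    return min(empty_slots, key=wire_length)
```

Trying every empty slot for every queued cell is what makes this phase expensive, which is why tracking empty slots efficiently matters.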
After all the cells are placed a row realignment routine is run to fix any cell overlaps and
empty spaces remaining in each row. This is necessary because the redistribution routine
can place cells anywhere an empty slot exists.
Figure 5: Examples of row realignment
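A simple form of row realignment is sketched here, assuming each row is a list of (cell id, x, width) entries and that cells are packed toward the start of the row in order of their current x position. The report's actual routine may differ; `realign_row` is a hypothetical name.

```python
def realign_row(cells, row_start=0):
    """Pack a row left to right, removing overlaps and gaps.

    cells: list of (cell_id, x, width), possibly overlapping or spaced out.
    Returns a list of (cell_id, new_x, width) ordered by position.
    """
    packed = []
    cursor = row_start
    # sorted() is stable, so cells sharing an x keep their relative order.
    for cell_id, _, width in sorted(cells, key=lambda c: c[1]):
        packed.append((cell_id, cursor, width))
        cursor += width
    return packed
```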
The I/O portion of the main iterative process consists of printing the current process
status to the standard output. Similarly, postprocessing consists of printing the final
layout to the standard output.
III. Program Structure
III.1 Data Structures
The program consists of six main data structures: Net, Cell, Port, Row, Slot, and Layout.
A Net represents a net connecting ports on cells. Nets have weights and a list of
connections. A Cell represents a standard cell. Cells consist of a width and a list of
ports. A Port represents a physical location on a cell that a net can connect to. Ports
consist of an x and a y coordinate that are relative to the cell's position. A Row represents a
row in the layout. Rows consist of a current length and a list of slots. A Slot represents a
placement in a row. Slots consist of an absolute x and y coordinate within the
placement grid and a cell. Slots can either be empty or occupied. Lastly, a Layout
represents the placement grid as well as general layout parameters. Layouts consist of an
aspect ratio, average and maximum row lengths, cell height, and a list of rows.
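The six structures could be expressed along the following lines. The field names and types are assumptions based on the descriptions above, not the report's actual declarations; in particular, representing a net connection as a (Cell, Port) pair and adding a `port_position` helper are illustrative choices.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Port:
    x: int  # offset relative to the owning cell's position
    y: int

@dataclass
class Cell:
    width: int
    ports: List[Port] = field(default_factory=list)

@dataclass
class Net:
    weight: float
    connections: List[Tuple[Cell, Port]] = field(default_factory=list)

@dataclass
class Slot:
    x: int  # absolute position within the placement grid
    y: int
    cell: Optional[Cell] = None  # None marks an empty slot

    def port_position(self, port: Port) -> Tuple[int, int]:
        """Absolute position of a port of the cell held in this slot."""
        return (self.x + port.x, self.y + port.y)

@dataclass
class Row:
    length: int = 0
    slots: List[Slot] = field(default_factory=list)

@dataclass
class Layout:
    aspect_ratio: Tuple[int, int]
    average_row_length: int
    maximum_row_length: int
    cell_height: int
    rows: List[Row] = field(default_factory=list)
```

Keeping port coordinates relative to the cell means a cell can be moved by updating only its slot, with absolute port positions recomputed on demand.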
III.2 Program Flow
The program flow follows the algorithm block diagram of Figure 1. First the optimization
parameters are processed, and then an initial solution is generated from a specified input
file. After the initial solution is generated, precomputation is done. Next, the iterative
process begins and the solution is mutated, evaluated, judged, allocated and realigned.
Upon completion the final solution is printed. The program flow is depicted below.
main() {
    - Process Input Parameters
    - Construct Initial Solution
    - Perform Precomputations
    - Perform Iterative ESP Process
        - Mutate
        - Resolve Collisions
        - Evaluate
        - Judge
        - Allocate
        - Realign Rows
    - Print Final Solution
}
Figure 6: General Program Flow
IV. Results
The performance results of the ESP algorithm were compared to those of the previously
implemented TimberWolf algorithm. General results showed that for small placement
problems (<= 10 cells) the two algorithms performed roughly the same. As the number
of cells increased, ESP began to outperform TimberWolf. For large placement problems
(>= 100 cells) ESP clearly outperformed TimberWolf, producing roughly 31% less wire
length on the 100-cell circuit (16701 versus 24310).
However, it is important to note that there are discrepancies in directly comparing the
implementation of the ESP algorithm and the implementation of the TimberWolf
algorithm. Namely, the TimberWolf algorithm only deals with fixed cell widths, and
only allows net connections on the perimeter of the cell. ESP allows variable width cells
and allows net connections anywhere on the cell.
The common test parameters between the two algorithms are the cell height, which is set to
four; the number of nets that can be connected to a cell, which is restricted to five; and the
number of cells a single net can connect, which is restricted to three.
The results are broken down into two tables. The first one is the comparison of wire
lengths. The second one is the comparison of execution time.
                     ESP      TimberWolf
10-cell circuit      400      385
50-cell circuit      4716     5549
100-cell circuit     16701    24310
Table 1: Wire lengths of resulting placements
                     ESP      TimberWolf
10-cell circuit      0.045    0.04
50-cell circuit      0.107    0.147
100-cell circuit     0.208    0.311
Table 2: Execution time of resulting placements in seconds
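To make the comparison easier to read, the relative differences can be computed directly from the reported figures. The short script below only restates the numbers from the two tables; the helper name `relative_change` is hypothetical, and negative values mean ESP is lower (better) than TimberWolf.

```python
def relative_change(esp, timberwolf):
    """Percentage by which the ESP figure differs from the TimberWolf figure."""
    return (esp - timberwolf) / timberwolf * 100

# (ESP, TimberWolf) pairs keyed by circuit size, taken from the tables.
wire_lengths = {10: (400, 385), 50: (4716, 5549), 100: (16701, 24310)}
run_times = {10: (0.045, 0.04), 50: (0.107, 0.147), 100: (0.208, 0.311)}

for cells in (10, 50, 100):
    wire = relative_change(*wire_lengths[cells])
    time = relative_change(*run_times[cells])
    print(f"{cells:>3} cells: wire length {wire:+.1f}%, run time {time:+.1f}%")
```

On these figures ESP is slightly worse than TimberWolf on the 10-cell circuit in both measures and better on the 50-cell and 100-cell circuits.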
V. Discussion
V.1 General Comments
- Due to the high exchange rate of cells at the beginning of the algorithm, the
  quality of the final result is largely independent of the initial solution.
  However, if the initial placement is already a good placement, then the overall
  computation time is reduced. ESP might therefore be a good follow-up to a
  previously generated placement.
V.2 Features
- This implementation of ESP has the ability to enforce an aspect ratio, allowing
  for optimization of area.
- User-defined parameters to control the allocation process.
- User-defined parameters to penalize aspect ratio violations.
- Cells are broken down into grids, allowing net connections anywhere on the cell
  grid. This is an enhancement over the previous implementation of TimberWolf, in
  which nets could only be placed on the cell perimeter.
- The precomputation of the optimal wire length is well tuned to generate the
  closest approximation of an optimal placement of each cell into a square
  bounding box for the semi-perimeter wire length calculation.
- Due to the implementation of a binary search for the insertion point of a cell in the
  allocation queue, there is no need to sort the queue.
V.3 Challenges
- How exactly to accomplish the precomputation of the optimal wire length,
  specifically how to approximate the initial square bounding box.
- How to efficiently redistribute the cells from the allocation queue, including how
  to keep track of empty slots and how to optimize the evaluation routine to
  efficiently determine the placement value for each cell in the allocation queue.