HiCap: A Fast Hierarchical Algorithm for 3D Capacitance Extraction

Weiping Shi
Department of Computer Science
University of North Texas

Outline
- Introduction
- Previous Research
- Integral Equation & N-Body Problem
- New Algorithm
- Experimental Results
- Conclusion
- Future Work

Introduction
- Capacitance extraction: given a set of conductors in 3-D space, compute the capacitance between all pairs of conductors.
[Figure: conductors at a 1 V potential difference with the induced charges; C = Q.]
- Signal delay = gate delay + interconnect delay.
- Interconnect delay is caused by RC (resistance and capacitance) parasitics.
[Figure: interconnect modeled with resistance R and capacitances C.]
- Interconnect delay dominates gate delay in deep sub-micron VLSI.
[Figure: delay (ps) versus technology generation (micron), from 0.85 down to 0.11, with curves for gate, interconnect (Al+SiO2), interconnect (Cu+low-k), sum (Al+SiO2), and sum (Cu+low-k).]

Importance in VLSI
- Fast and accurate capacitance extraction is crucial in the design and verification of VLSI circuits and packaging.
- Current 3D tools are too slow: FastCap, Raphael, QuickCap, etc.
- 2D/2.5D/quasi-3D tools use 3D engines to generate their libraries, so their accuracy depends on the 3D engines: Dracula, HyperExtract, Arcadia, Fire&Ice, StarRC, Columbus, etc.
- For critical nets and clock trees, 3D accuracy is necessary.

Importance in MEMS
- Accurate capacitance extraction of complex 3-D structures is also important in the design of MEMS (MicroElectroMechanical Systems).
- The design of most motion sensors needs an accurate estimate of capacitance.
- The design of most drivers requires solving a similar potential problem.
- A recent ARPA report estimates the market for the above applications at 1 to 3 billion dollars by 2004.
[Figure: enlarged comb driver.]

Previous Research
- Differential Maxwell equation (finite difference method or finite element method)
  - Raphael field solver
- Integral Laplace equation (boundary element method)
  - Multipole algorithm: FastCap by Nabors & White. O(N) time. Kernel dependent.
  - Pre-corrected FFT algorithm by Phillips & White. O(N log N) time. Kernel independent.
  - SVD algorithm: IES3 by Kapur & Long. O(N log N) time. Kernel independent.

Integral Equation Approach
$$\phi(x) = \int_{\text{surfaces}} G(x, x')\, \sigma(x')\, da'$$
where $\phi(x)$ is the known surface potential, $\sigma(x')$ is the charge density, $da'$ is an incremental conductor surface area, $x'$ is on $da'$, and $G(x, x')$ is the kernel.
- Partition the conductor surfaces into N panels and assume a uniform charge density on each panel. This gives a linear system
  Pq = v
  where P is an NxN matrix of potential coefficients, q is an N-vector of panel charges, and v is an N-vector of known panel potentials.
- Each entry p_ij of the potential coefficient matrix P represents the potential at panel A_i due to a unit charge on panel A_j.
- The solution q of the linear system Pq = v gives the capacitance.

Challenge
Can we
- partition the conductor surfaces into N panels,
- calculate and store the dense NxN matrix P, and
- solve the linear system Pq = v
in O(N) time?

N-body Problem
- N-body problem: given N particles in 3D space, compute all forces between the particles.
- Hierarchical algorithm (Appel 85)
  - O(N) time (Esselink)
  - Radiosity (Hanrahan, Salzman & Aupperle)
- Multipole algorithm (Greengard & Rokhlin 87)
  - O(N) time
  - FastCap

Appel's Key Ideas
- For practical purposes, forces acting on a particle need only be calculated to within the given precision.
- The force due to a cluster of particles at some distance can be approximated with a single term.

Outline of New Algorithm
- Adaptively partition the conductor surfaces into small panels according to a user-supplied error bound Pe.
- Approximate the potential coefficient matrix P and store it in a hierarchical data structure of size O(N).
- The data structure permits an O(N)-time matrix-vector product Px for any N-vector x.
- Solve the linear system Pq = v using iterative methods.

Adaptive Panel Partition
- If the potential coefficient estimate between two panels is greater than Pe, then partition the panels. Otherwise, record the coefficient.
[Figure: conductor surfaces adaptively partitioned into a hierarchy of panels labeled A to N.]

Coefficient Matrix Representation
- Entries of P are stored in a hierarchical data structure as links.
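The adaptive partition and the recording of coefficients as links can be illustrated with a small toy. This is only a sketch under stated assumptions, not the HiCap implementation: the names `Panel`, `estimate`, `refine`, and `min_size` are hypothetical, the single conductor is a flat unit square split into four at each step, units are chosen so that 4*pi*eps0 = 1, the coefficient estimate is simply 1/distance between panel centers, and the self term uses the center potential of a uniformly charged square. Because self terms are refined down to `min_size` in this toy, the leaf panels come out uniform, but the links are still recorded at several levels of the hierarchy.

```python
import math
from dataclasses import dataclass, field

@dataclass(eq=False)
class Panel:
    """Hypothetical square panel in the z = 0 plane (center and side length)."""
    cx: float
    cy: float
    size: float
    children: list = field(default_factory=list)

    def subdivide(self):
        # Split into four equal squares the first time this is called.
        if not self.children:
            h, q = self.size / 2, self.size / 4
            self.children = [Panel(self.cx + dx, self.cy + dy, h)
                             for dx in (-q, q) for dy in (-q, q)]
        return self.children

def estimate(a, b):
    """Assumed estimate: potential of a unit point charge at center distance r."""
    return 1.0 / math.hypot(a.cx - b.cx, a.cy - b.cy)

def refine(a, b, pe, min_size, links):
    """Record a link (target a, source b, coefficient) or partition further."""
    can_a, can_b = a.size > min_size, b.size > min_size
    if a is b:
        if can_a:
            kids = a.subdivide()
            for ci in kids:
                for cj in kids:
                    refine(ci, cj, pe, min_size, links)
        else:
            # Center potential of a uniformly charged square with unit charge.
            self_coeff = 4.0 * math.log(1.0 + math.sqrt(2.0)) / a.size
            links.append((a, b, self_coeff))
        return
    est = estimate(a, b)
    if est <= pe or not (can_a or can_b):
        links.append((a, b, est))          # small enough: record the coefficient
    elif can_a and (a.size >= b.size or not can_b):
        for c in a.subdivide():            # partition the larger panel
            refine(c, b, pe, min_size, links)
    else:
        for c in b.subdivide():
            refine(a, c, pe, min_size, links)

def count_leaves(p):
    return 1 if not p.children else sum(count_leaves(c) for c in p.children)

root = Panel(0.5, 0.5, 1.0)
links = []
refine(root, root, pe=2.5, min_size=0.125, links=links)

n = count_leaves(root)
# Each link stands for a block of leaf-by-leaf entries of the expanded matrix.
covered = sum(count_leaves(a) * count_leaves(b) for a, b, _ in links)
print("leaf panels:", n, "links:", len(links), "entries covered:", covered)
```

In this toy every ordered pair of leaf panels is covered by exactly one link, so the links stand for the full NxN matrix while far fewer than NxN coefficients are stored.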
[Figure: panel hierarchy (panels A to N) with links, and the corresponding matrix with block entries.]
- It can be shown that the matrix contains O(N) block entries, where N is the number of panels.
- If expanded explicitly, the matrix would contain NxN entries.
- If panel sizes were uniform, the matrix would be much larger than NxN.

Matrix-Vector Product Px
- Compute the charge for all panels in O(N) time.
- Compute the potential for all panels in O(N) time.
- Distribute the potential to the leaf panels in O(N) time.
[Figures: the panel hierarchy (panels A to N) at each of the three steps.]

Solving Linear Systems
- Use iterative methods such as GMRES or MINRES.
- Each iteration requires a matrix-vector product Px and can be completed in O(N) time.
- The number of iterations needed is very small, normally 10-20 regardless of N.

Error and Complexity
- The approximation error can be controlled by the user-supplied error bound Pe.
- Time complexity is O(N) because each of the above steps is O(N).

Experimental Results
- Test examples: bus crossings 2x2, 3x3, ..., 6x6.
- In commercial tools, thousands of these crossings are computed to build the library.
[Figure: 2x2 bus crossing.]

Previous 3D Algorithms
- FastCap expansion order 2 (assumed accurate).
- FastCap expansion order 0.
- Pre-corrected FFT: 40% faster than FastCap(2) and uses 1/4 of the memory of FastCap(2).
- IES3: 60% faster than FastCap(2) and uses 1/5 of the memory of FastCap(2).

CPU time (in seconds)
[Figure: CPU time of FastCap(2), FastCap(0), and the new algorithm on the 2x2 to 6x6 bus crossings.]
- 40 to 100 times faster than FastCap(2).
- 14 to 40 times faster than FastCap(0).

Memory (in MB)
[Figure: memory use of FastCap(2), FastCap(0), and the new algorithm on the 2x2 to 6x6 bus crossings.]
- 1/60 to 1/100 of the memory of FastCap(2).
- 1/80 to 1/280 of the memory of FastCap(0).

Error with respect to FastCap(2)
[Figure: error (%) of FastCap(0) and the new algorithm on the 2x2 to 6x6 bus crossings.]
- Less than 2.7% error with respect to FastCap(2).
- 3 times more accurate than FastCap(0).
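The three O(N) passes of the matrix-vector product Px described earlier (compute charges, compute potentials, distribute potentials to the leaves) can be sketched on a toy hierarchy. Everything here is an assumption made for illustration rather than the HiCap code: panels are 1D segments split in half, `refine` uses 1/distance as a stand-in coefficient estimate, the self coefficient `2.0 / size` is a placeholder value, and the names `Panel`, `gather`, `interact`, `distribute`, and `expand` are hypothetical. The result is checked against an explicitly expanded dense matrix.

```python
class Panel:
    """Hypothetical 1D panel: the segment [lo, hi] on a line."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.children = []
        self.links = []        # (source panel, coefficient) pairs
        self.charge = 0.0
        self.potential = 0.0

    @property
    def size(self):
        return self.hi - self.lo

    @property
    def center(self):
        return 0.5 * (self.lo + self.hi)

    def split(self):
        if not self.children:
            self.children = [Panel(self.lo, self.center), Panel(self.center, self.hi)]
        return self.children

def refine(a, b, pe, min_size):
    """Toy link construction; a is the target panel, b the source panel."""
    can_a, can_b = a.size > min_size, b.size > min_size
    if a is b:
        if can_a:
            kids = a.split()
            for ci in kids:
                for cj in kids:
                    refine(ci, cj, pe, min_size)
        else:
            a.links.append((a, 2.0 / a.size))   # placeholder self coefficient
        return
    est = 1.0 / abs(a.center - b.center)
    if est <= pe or not (can_a or can_b):
        a.links.append((b, est))
    elif can_a and (a.size >= b.size or not can_b):
        for c in a.split():
            refine(c, b, pe, min_size)
    else:
        for c in b.split():
            refine(a, c, pe, min_size)

def leaves(p):
    return [p] if not p.children else [l for c in p.children for l in leaves(c)]

def gather(p):
    # Pass 1: the charge of an internal panel is the sum of its children's charges.
    if p.children:
        p.charge = sum(gather(c) for c in p.children)
    return p.charge

def interact(p):
    # Pass 2: every link adds coefficient * source charge to its target panel.
    p.potential = sum(coeff * src.charge for src, coeff in p.links)
    for c in p.children:
        interact(c)

def distribute(p, inherited=0.0):
    # Pass 3: push the potential accumulated on ancestors down to the leaves.
    p.potential += inherited
    for c in p.children:
        distribute(c, p.potential)

def matvec(root, x):
    for leaf, xi in zip(leaves(root), x):
        leaf.charge = xi
    gather(root)
    interact(root)
    distribute(root)
    return [leaf.potential for leaf in leaves(root)]

def count_links(p):
    return len(p.links) + sum(count_links(c) for c in p.children)

def expand(root):
    # Explicit dense matrix: each link fills one block of leaf-by-leaf entries.
    index = {id(l): i for i, l in enumerate(leaves(root))}
    dense = [[0.0] * len(index) for _ in index]
    def visit(t):
        for s, coeff in t.links:
            for lt in leaves(t):
                for ls in leaves(s):
                    dense[index[id(lt)]][index[id(ls)]] = coeff
        for c in t.children:
            visit(c)
    visit(root)
    return dense

root = Panel(0.0, 1.0)
refine(root, root, pe=4.0, min_size=1.0 / 16)
n = len(leaves(root))
x = [float(i + 1) for i in range(n)]
fast = matvec(root, x)
dense = expand(root)
slow = [sum(dense[i][j] * x[j] for j in range(n)) for i in range(n)]
print("panels:", n, "links:", count_links(root))
```

Each pass visits every panel or link once, which is where the linear cost per product comes from. An iterative solver such as GMRES would call a routine like `matvec` once per iteration instead of forming the dense matrix.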
Conclusion
- A new algorithm significantly faster than the previous best algorithms.
  - It provides the possibility of 3D extraction of clock trees and critical nets.
  - It can also be used to generate libraries for commercial 2D/2.5D tools.
- Kernel independent. Can be applied to multi-layered dielectrics.
- The adaptive refinement scheme produces a good partition of the conductor surfaces.
- The hierarchical data structure is much more efficient than previous data structures.

Future Research
- Capacitance extraction
  - High-order basis functions
  - Bottom-up construction of the hierarchy
  - Full-chip and critical-net extraction
- Inductance extraction
  - FastHenry is too slow.
  - No commercial tool for mutual inductance.
- Variational parasitic extraction
- MEMS applications