Parallel Solution of the Poisson Problem Using MPI
Jason W. DeGraw
Department of Mechanical & Nuclear Engineering (M&NucE)
The Pennsylvania State University

Introduction
• The Poisson equation arises frequently in many areas of mechanical engineering:
  – Heat transfer (heat conduction with internal heat generation)
  – Solid mechanics (beam torsion)
  – Fluid mechanics (potential flow)
• There are several reasons to use the Poisson equation as a test problem:
  – It is (relatively) easy to solve
  – Many different exact solutions are available for comparison

A Basic Poisson Problem
• Our first case is the simple two-dimensional Poisson equation on a square:

  −u_xx − u_yy = 1  in [−1,1] × [−1,1],  with u = 0 on the boundary

Finite Difference Approach
• The standard second-order difference equation is:

  ( 4u_{i,j} − u_{i−1,j} − u_{i+1,j} − u_{i,j−1} − u_{i,j+1} ) / h² = 1

• This formulation leads to a system of equations that is:
  – symmetric
  – banded
  – sparse

Solving the System
• The system may be solved in many ways:
  – Direct methods (Gaussian elimination)
  – Simple iterative methods (Jacobi, Gauss-Seidel, etc.)
  – Advanced iterative methods (CG, GMRES)
• We will only consider iterative methods here, as we would like to test ways to solve big problems. We will use:
  – Jacobi
  – Gauss-Seidel
  – Conjugate gradient

Jacobi and Gauss-Seidel
• The Jacobi and Gauss-Seidel iterative methods are easy to understand and easy to implement.
• Some advantages:
  – No explicit storage of the matrix is required
  – The methods are fairly robust and reliable
• Some disadvantages:
  – Really slow (Gauss-Seidel)
  – REALLY slow (Jacobi)
• Relaxation methods (which are related) are usually better, but can be less reliable.

The Conjugate Gradient Method
• CG is a much more powerful way to solve the problem.
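The Jacobi sweep described above can be sketched in a few lines. This is a Python illustration, not code from the slides; the grid size, tolerance, and sweep limit are illustrative choices:

```python
# Jacobi iteration for -u_xx - u_yy = 1 on [-1,1]^2 with u = 0 on the
# boundary, using the five-point stencil on an (N+2) x (N+2) grid.
# Sketch only: N, the tolerance, and the sweep limit are illustrative.
N = 20                     # interior points per direction
h = 2.0 / (N + 1)          # grid spacing on [-1, 1]
f = 1.0                    # right-hand side of the Poisson problem
u = [[0.0] * (N + 2) for _ in range(N + 2)]   # includes boundary zeros

for sweep in range(5000):
    diff = 0.0
    new = [row[:] for row in u]
    for i in range(1, N + 1):
        for j in range(1, N + 1):
            # Jacobi update: solve the stencil equation for u[i][j]
            # using only values from the previous sweep.
            new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                + u[i][j - 1] + u[i][j + 1] + h * h * f)
            diff = max(diff, abs(new[i][j] - u[i][j]))
    u = new
    if diff < 1e-8:        # stop when the update is negligible
        break

center = u[(N + 1) // 2][(N + 1) // 2]
```

Note that the matrix is never formed: the stencil coefficients appear directly in the update, which is the "no explicit storage" advantage listed above.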
• Some advantages:
  – Easy to program (compared to other advanced methods)
  – Fast (theoretical convergence in N steps for an N-by-N system)
• Some disadvantages:
  – Explicit representation of the matrix is probably necessary
  – Applies only to SPD matrices (not an issue here)
• Note: In this particular case, preconditioning is not an issue. In more general problems, preconditioning must be addressed.

CG (cont’d)
• As discussed in class, the CG algorithm requires two types of matrix-vector operations:
  – Vector dot product
  – Matrix-vector product
• The matrix-vector product is more expensive, so we must be careful how we implement it.
• Also keep in mind that both operations will require communication.

Matrix Storage
• We could try to take advantage of the banded nature of the system, but a more general solution is the adoption of a sparse matrix storage strategy.
• Complicated structures are possible, but we will use a very simple one: each row stores its nonzero values (A_s), their column indices (j), and its number of nonzeros (n). For example:

      | 1 0 0 0 |        | 1     |      | 1     |      | 1 |
  A = | 0 2 3 4 |  A_s = | 2 3 4 |  j = | 2 3 4 |  n = | 3 |
      | 0 0 5 6 |        | 5 6   |      | 3 4   |      | 2 |
      | 0 0 0 7 |        | 7     |      | 4     |      | 1 |

Matrix Storage (cont’d)
• The previous example is pretty trivial, but consider the following:
  – An N-by-N grid leads to an N²-by-N² system
  – Each row in the system has no more than 5 nonzero entries
• Example: N = 64
  – Full storage: 64⁴ = 16,777,216 entries
  – Sparse storage: 64² × 5 × 3 = 61,440 entries
  – “Sparseness” ratio (sparse/full): 0.00366
• For large N, storing the full matrix is incredibly wasteful.
• This form is easy to use in a matrix-vector multiply routine.

Sparse Matrix Multiplication
• The obligatory chunk of FORTRAN 77 code:

c     Compute y = A*x in the row-stored sparse format:
c     A(i,k) is the k-th nonzero value of row i, j(i,k) is its
c     column index, and n(i) is the number of nonzeros in row i.
      do i = 1, M
         y(i) = 0.0
         do k = 1, n(i)
            y(i) = y(i) + A(i,k)*x(j(i,k))
         enddo
      enddo

Limitations of Finite Differences
• Unfortunately, it is not easy to use finite differences in complex geometries.
• While it is possible to formulate curvilinear finite difference methods, the resulting equations are usually pretty nasty.
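The pieces above fit together as follows: assemble the five-point Poisson matrix in the row-stored sparse format (values, column indices, counts) and hand it to unpreconditioned CG. This is a Python sketch with illustrative names and sizes, not the deck's FORTRAN/MPI implementation:

```python
# Sketch: build the N^2 x N^2 five-point Poisson matrix in the
# row-stored format from the slides (per-row values A_s, column
# indices jcol, counts n), then solve A u = h^2 * 1 with plain CG.
N = 16                      # interior grid points per direction
M = N * N                   # system size
A_s, jcol = [], []
for row in range(M):
    i, j = divmod(row, N)
    vals, cols = [4.0], [row]        # diagonal of the stencil
    if i > 0:     vals.append(-1.0); cols.append(row - N)
    if i < N - 1: vals.append(-1.0); cols.append(row + N)
    if j > 0:     vals.append(-1.0); cols.append(row - 1)
    if j < N - 1: vals.append(-1.0); cols.append(row + 1)
    A_s.append(vals); jcol.append(cols)
n = [len(v) for v in A_s]            # nonzeros per row (at most 5)

def matvec(x):
    # Row-by-row sparse product, mirroring the FORTRAN 77 loop.
    y = [0.0] * M
    for i in range(M):
        for k in range(n[i]):
            y[i] += A_s[i][k] * x[jcol[i][k]]
    return y

h = 2.0 / (N + 1)
b = [h * h] * M                      # right-hand side f = 1
x = [0.0] * M
r = b[:]                             # residual (x = 0 initially)
p = r[:]
rs = sum(v * v for v in r)
for it in range(M):                  # CG needs at most M steps in theory
    Ap = matvec(p)
    alpha = rs / sum(pi * api for pi, api in zip(p, Ap))
    x = [xi + alpha * pi for xi, pi in zip(x, p)]
    r = [ri - alpha * api for ri, api in zip(r, Ap)]
    rs_new = sum(v * v for v in r)
    if rs_new < 1e-20:               # residual is negligible; stop
        break
    p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
    rs = rs_new
```

The two communication points noted above are visible here: the dot products (`rs`, `alpha` denominator) become global reductions in an MPI version, and the matrix-vector product requires exchanging boundary rows between neighboring processes.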
• The simplicity that makes finite differences attractive in simple geometries disappears in more complicated geometries.

Finite Element Methods
• The finite element method, while more complicated than finite difference methods, easily extends to complex geometries.
• A simple (and short) description of the finite element method is not easy to give. Here goes anyway:

  PDE → Weak Form → Matrix System

Finite Element Methods (cont’d)
• The following steps usually take place:
  1. Load mesh and boundary conditions
  2. Compute element stiffness matrices
  3. Assemble global stiffness matrix and right-hand side
  4. Solve system of equations
  5. Output results
• Where can we improve things?
  – Steps 1 and 5 are somewhat fixed
  – Step 4: use parallel solution techniques
  – Avoid step 3 entirely: use an element-by-element solution technique

An Element-by-Element Algorithm
• The assembly process can be considered as a summation:

  K = Σ_e K^e

• Then:

  K x = (Σ_e K^e) x = K^{e1} x + K^{e2} x + …

• We can compute each element stiffness matrix and store it. Computing the matrix-vector product is then done via the above element-by-element (EBE) formula.

Results
[The original slides show plots of Computation Time, Communication Ratio, Speedup, Parallel Efficiencies, MFLOPS, and the computed Solution; the plots are not reproduced in this transcript.]

A Potential Flow Problem
• A simple test problem for which there is an “exact” solution is potential flow over a wavy wall:

  φ_xx + φ_yy = 0

  with ∂φ/∂n = 0 on the wavy wall y = 0.05 sin(πx) and the upper boundary, and ∂φ/∂n = ±U on the inflow and outflow boundaries, where U = 100 ft/s.

A Potential Flow Problem (cont’d)
• The exact solution is

  φ = U [ x + 0.05 sin(πx) exp(−πy) ]

• This problem is easily solved with finite element methods.
• The computed solution (shown on the original slide) is accurate to within 1%.

Conclusions
• CG is a “better” iterative method than Jacobi or Gauss-Seidel.
• Finite element methods, while more complicated than finite difference methods, are more geometrically flexible.
• Finite element methods are generally more difficult to implement, and the setup process is particularly expensive.
• The EBE technique reduced the communication ratio, but took longer to actually produce results.

Conclusions (cont’d)
• Clearly, communication issues are important when using CG-type methods.
• Implementing a fully parallel finite element scheme is quite difficult. Some issues that were not addressed here:
  – Distributed assembly
  – Better mesh distribution schemes (i.e., fewer broadcasts)
• Good results were obtained using both FEM and finite difference methods.
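As a closing illustration of the EBE product discussed above, the identity K x = Σ_e K^e x can be checked against an explicitly assembled matrix. This is a Python sketch using an illustrative 1D mesh of two-node linear elements, not the deck's actual implementation:

```python
# Sketch: element-by-element (EBE) matrix-vector product K x = sum_e K^e x,
# compared against multiplication by the assembled global matrix.
# Illustrative setup: 1D mesh of linear two-node elements.
E = 8                                  # number of elements
nodes = E + 1
h = 1.0 / E
ke = [[1.0 / h, -1.0 / h], [-1.0 / h, 1.0 / h]]   # element stiffness
conn = [(e, e + 1) for e in range(E)]  # element -> global node numbers

def ebe_matvec(x):
    # EBE product: never form K; scatter each element contribution.
    y = [0.0] * nodes
    for e in range(E):
        g = conn[e]
        for a in range(2):
            for b in range(2):
                y[g[a]] += ke[a][b] * x[g[b]]
    return y

# Reference: assemble the global K (step 3 of the FEM pipeline),
# then multiply the usual way.
K = [[0.0] * nodes for _ in range(nodes)]
for e in range(E):
    g = conn[e]
    for a in range(2):
        for b in range(2):
            K[g[a]][g[b]] += ke[a][b]

x = [float(i * i) for i in range(nodes)]
y_ebe = ebe_matvec(x)
y_ref = [sum(K[i][j] * x[j] for j in range(nodes)) for i in range(nodes)]
```

The trade-off observed in the conclusions shows up here: the EBE loop touches each element matrix every time, so it does more arithmetic per product, in exchange for skipping global assembly.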