Department of Mechanical & Nuclear Engineering

Parallel Solution of the Poisson Problem Using MPI

Jason W. DeGraw
Dept. of Mechanical and Nuclear Engineering
The Pennsylvania State University
M&NucE

Introduction
• The Poisson equation arises frequently in many areas of mechanical engineering:
  – Heat transfer (heat conduction with internal heat generation)
  – Solid mechanics (beam torsion)
  – Fluid mechanics (potential flow)
• There are several reasons to use the Poisson equation as a test problem:
  – It is (relatively) easy to solve
  – Many different exact solutions are available for comparison

A Basic Poisson Problem
• Our first case is the simple two-dimensional Poisson equation on a square:

    -(u_xx + u_yy) = 1   in [-1,1] x [-1,1]
    u = 0                on the boundary

Finite Difference Approach
• The standard second-order difference equation is:

    (4 u_{i,j} - u_{i-1,j} - u_{i+1,j} - u_{i,j-1} - u_{i,j+1}) / h^2 = 1

• This formulation leads to a system of equations that is
  – symmetric
  – banded
  – sparse

Solving the System
• The system may be solved in many ways:
  – Direct methods (Gaussian elimination)
  – Simple iterative methods (Jacobi, Gauss-Seidel, etc.)
  – Advanced iterative methods (CG, GMRES)
• We will only consider iterative methods here, as we would like to test ways to solve big problems. We will use
  – Jacobi
  – Gauss-Seidel
  – Conjugate gradient

Jacobi and Gauss-Seidel
• The Jacobi and Gauss-Seidel iterative methods are easy to understand and easy to implement.
• Some advantages:
  – No explicit storage of the matrix is required
  – The methods are fairly robust and reliable
• Some disadvantages:
  – Really slow (Gauss-Seidel)
  – REALLY slow (Jacobi)
• Relaxation methods (which are related) are usually better, but can be less reliable.

The Conjugate Gradient Method
• CG is a much more powerful way to solve the problem.
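The Jacobi iteration described above can be sketched for the model Poisson problem. The following is a minimal Python illustration, not part of the original slides; the grid size, tolerance, and iteration cap are arbitrary choices made here:

```python
# Jacobi iteration for -(u_xx + u_yy) = 1 on [-1,1] x [-1,1], u = 0 on the
# boundary, using the 5-point stencil. Illustrative sketch only.

def jacobi_poisson(n=20, tol=1e-8, max_iter=10000):
    h = 2.0 / n                       # grid spacing on [-1, 1]
    f = 1.0                           # right-hand side
    # (n+1) x (n+1) grid; boundary rows/columns stay at 0
    u = [[0.0] * (n + 1) for _ in range(n + 1)]
    for it in range(max_iter):
        diff = 0.0
        new = [row[:] for row in u]
        for i in range(1, n):
            for j in range(1, n):
                # Jacobi update: average of the old neighbors plus source term
                new[i][j] = 0.25 * (u[i-1][j] + u[i+1][j]
                                    + u[i][j-1] + u[i][j+1] + h * h * f)
                diff = max(diff, abs(new[i][j] - u[i][j]))
        u = new
        if diff < tol:
            return u, it + 1
    return u, max_iter

u, iters = jacobi_poisson()
print(iters, u[10][10])   # iteration count and center value of the solution
```

The slow convergence noted on the slide is visible here: even this small grid needs on the order of a thousand sweeps to reach the tolerance.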
• Some advantages:
  – Easy to program (compared to other advanced methods)
  – Fast (theoretical convergence in N steps for an N-by-N system)
• Some disadvantages:
  – Explicit representation of the matrix is probably necessary
  – Applies only to SPD matrices (not an issue here)
• Note: In this particular case, preconditioning is not an issue. In more general problems, preconditioning must be addressed.

CG (cont'd)
• As discussed in class, the CG algorithm requires two types of matrix-vector operations:
  – Vector dot product
  – Matrix-vector product
• The matrix-vector product is more expensive, so we must be careful how we implement it.
• Also keep in mind that both operations will require communication.

Matrix Storage
• We could try to take advantage of the banded nature of the system, but a more general solution is the adoption of a sparse matrix storage strategy.
• Complicated structures are possible, but we will use a very simple one, storing for each row the nonzero values (A_s), their column indices (j), and the number of nonzeros (n):

        | 1 0 0 0 |          | 1     |        | 1     |        | 1 |
    A = | 0 2 3 4 |    A_s = | 2 3 4 |    j = | 2 3 4 |    n = | 3 |
        | 0 0 5 6 |          | 5 6   |        | 3 4   |        | 2 |
        | 0 0 0 7 |          | 7     |        | 4     |        | 1 |

Matrix Storage (cont'd)
• The previous example is pretty trivial, but consider the following:
  – An N-by-N grid leads to an N^2-by-N^2 system
  – Each row in the system has no more than 5 nonzero entries
• Example: N = 64
  – Full storage: 64^4 = 16777216
  – Sparse storage: 64^2 * 5 * 3 = 61440
  – "Sparseness" ratio (sparse / full): 0.00366
• For large N, storing the full matrix is incredibly wasteful.
• This form is easy to use in a matrix-vector multiply routine.

Sparse Matrix Multiplication
• The obligatory chunk of FORTRAN 77 code:

  c     Compute y = A*x using the sparse row storage
        do i = 1, M
           y(i) = 0.0
           do k = 1, n(i)
              y(i) = y(i) + As(i,k)*x(j(i,k))
           enddo
        enddo

Limitations of Finite Differences
• Unfortunately, it is not easy to use finite differences in complex geometries.
• While it is possible to formulate curvilinear finite difference methods, the resulting equations are usually pretty nasty.
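Before leaving the solvers: the CG algorithm, driven by a matrix-vector product over the per-row sparse storage described above, can be sketched as follows. This is a Python illustration written for these notes (a serial sketch, without the MPI communication the slides discuss); the Poisson-matrix assembly at the bottom is a standalone example with parameters chosen here:

```python
# Conjugate gradient using per-row sparse storage: for each row we keep the
# nonzero values and their column indices (the A_s and j arrays above).

def matvec(vals, cols, x):
    # y = A*x, mirroring the Fortran loop: sum over the stored nonzeros per row
    return [sum(v * x[c] for v, c in zip(vals[i], cols[i]))
            for i in range(len(vals))]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cg(vals, cols, b, tol=1e-10, max_iter=1000):
    x = [0.0] * len(b)
    r = b[:]                          # residual r = b - A*x with x = 0
    p = r[:]
    rr = dot(r, r)
    for _ in range(max_iter):
        Ap = matvec(vals, cols, p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        if rr_new < tol * tol:
            break
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x

# Assemble the 2D Poisson matrix (5-point stencil, N x N interior grid on
# [-1,1] x [-1,1]) in the sparse row format, then solve A u = h^2 * f.
N, h, f = 9, 0.2, 1.0
vals, cols, b = [], [], []
for i in range(N):
    for j in range(N):
        v, c = [4.0], [i * N + j]     # diagonal entry first
        for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            if 0 <= i + di < N and 0 <= j + dj < N:
                v.append(-1.0)
                c.append((i + di) * N + (j + dj))
        vals.append(v)
        cols.append(c)
        b.append(h * h * f)
u = cg(vals, cols, b)
print(max(u))                         # peak of the discrete solution
```

In the parallel setting, both `dot` and `matvec` are the points where communication enters: the dot product needs a global reduction, and the matrix-vector product needs neighbor values of `x`.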
• The simplicity that makes finite differences attractive in simple geometries disappears in more complicated geometries.

Finite Element Methods
• The finite element method, while more complicated than finite difference methods, easily extends to complex geometries.
• A simple (and short) description of the finite element method is not easy to give. Here goes anyway:

    PDE -> Weak Form -> Matrix System

Finite Element Methods (cont'd)
• The following steps usually take place:
  1. Load mesh and boundary conditions
  2. Compute element stiffness matrices
  3. Assemble global stiffness matrix and right-hand side
  4. Solve system of equations
  5. Output results
• Where can we improve things?
  – Steps 1 and 5 are somewhat fixed
  – Step 4 – use parallel solution techniques
  – Avoid step 3 entirely – use an element-by-element solution technique

An Element-by-Element Algorithm
• The assembly process can be considered as a summation:

    K = sum_e K^e

• Then:

    K x = (sum_e K^e) x = K^{e1} x + K^{e2} x + ...

• We can compute each element stiffness matrix and store it. Computing the matrix-vector product is then done via the above element-by-element (EBE) formula.

[Figure slides: Computation Time (two slides), Communication Ratio, Speedup, Parallel Efficiencies, MFLOPS, Solution]

A Potential Flow Problem
• A simple test problem for which there is an "exact" solution is potential flow over a wavy wall:

    phi_xx + phi_yy = 0
    d(phi)/dn = 0 on the solid boundaries, inflow velocity U = 100 ft/s
    wavy wall at y = 0.05 sin(x)

A Potential Flow Problem (cont'd)
• The exact solution is

    phi = U [ x + 0.05 sin(x) exp(-y) ]

• This problem is easily solved with finite element methods
• The computed solution at right is accurate to within 1%

Conclusions
• CG is a "better" iterative method than Jacobi or Gauss-Seidel.
• Finite element methods, while more complicated than finite difference methods, are more geometrically flexible.
• Finite element methods are generally more difficult to implement, and the setup process is particularly expensive.
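The element-by-element product K x = sum_e K^e x described above can be sketched in a few lines: gather the local values of x, apply the element matrix, and scatter-add the result, so the global K is never assembled. This is a Python illustration written for these notes; the 1D linear-element mesh is an invented example, not from the slides:

```python
# Element-by-element (EBE) matrix-vector product: K x = sum_e K^e x,
# computed without assembling the global stiffness matrix K.

def ebe_matvec(elements, x):
    # elements: list of (node_indices, element_matrix) pairs
    y = [0.0] * len(x)
    for nodes, Ke in elements:
        xe = [x[n] for n in nodes]            # gather local values
        for a, na in enumerate(nodes):
            # apply K^e and scatter-add into the global result
            y[na] += sum(Ke[a][b] * xe[b] for b in range(len(nodes)))
    return y

# Invented test mesh: four 1D linear elements on nodes 0..4, spacing h = 1,
# with the standard linear-element stiffness [[1, -1], [-1, 1]] / h.
h = 1.0
Ke = [[1.0 / h, -1.0 / h], [-1.0 / h, 1.0 / h]]
elements = [((e, e + 1), Ke) for e in range(4)]

x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = ebe_matvec(elements, x)
print(y)   # -> [-1.0, 0.0, 0.0, 0.0, 1.0]
```

The result matches what the assembled tridiagonal K would give; in parallel the scatter-add on shared nodes is exactly where the extra communication of the EBE scheme appears.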
• The EBE technique reduced the communication ratio, but took longer to actually produce results.

Conclusions (cont'd)
• Clearly, communication issues are important when using CG-type methods.
• Implementing a fully parallel finite element scheme is quite difficult. Some issues that were not addressed here:
  – Distributed assembly
  – Better mesh distribution schemes (i.e., fewer broadcasts)
• Good results were obtained using both FEM and finite difference methods.