SIAM J. NUMER. ANAL., Vol. 31, No. 2, pp. 552-571, April 1994. (c) 1994 Society for Industrial and Applied Mathematics.

L-STABLE PARALLEL ONE-BLOCK METHODS FOR ORDINARY DIFFERENTIAL EQUATIONS*

P. CHARTIER†

Abstract. In this contribution, the author considers the one-block methods designed by Sommeijer, Couzy, and van der Houwen for the purpose of solving ordinary differential equations (ODEs) on a parallel computer. The author also derives a new set of order conditions, studies the stability, and exhibits a new class of parallel methods which contains L-stable schemes up to order eleven.

Key words. ODEs, one-block methods, L-stability, high order, parallelism

AMS subject classifications. 65M10, 65M20

1. Introduction. The purpose of this paper is to numerically solve the initial-value problem

(1) $y'(x) = f(x, y(x)), \quad y(x_0) = y_0, \quad x \in [x_0, X]$,

where $f$ is assumed to be continuous and to satisfy a Lipschitz condition on the region $[x_0, X] \times \mathbb{R}^m$. In the literature only a few parallel algorithms for solving (1) have been proposed. Generally speaking, speeding up the integration of (1) can be achieved by partitioning the tasks either "across the system of equations" or "across the method," as exemplified by Franklin in [10]. In addition to these two types of parallelism, Bellen and Zennaro have introduced a third one called "across the time" [1]. The equation segmentation approach is straightforward and widely used by engineers in the field of dynamic system simulation. However, its scope is rather limited, since automated partitioning seems feasible only for application-oriented codes in which the structure of $f$ is accessible. Parallelism "across the time" means that each processor evaluates $f$ for different values of $x$. These values are combined into a recurrence that carries the information from the initial point. This kind of parallelism may yield large speedups as long as a large number of processors is available.
However, Vermiglio reports numerical simulations that suggest severe limitations [19]. Actually, this approach seems feasible only for equations whose solution does not depend too strongly on the initial condition [5]. In this paper, we focus on parallelism "across the method." More specifically, we consider parallel block methods. Block methods can be seen as a set of linear multistep methods simultaneously applied to (1) and then combined to yield a "better" approximation. Numerous block methods have been proposed: Shampine and Watts have constructed A-stable implicit one-block methods for very high orders [17], whereas Chu and Hamilton have studied the predictor-corrector formulation of multiblock schemes [7]. More recently, Sommeijer, Couzy, and van der Houwen have extended some of the strong stability properties of the backward differentiation formulae (BDF) (A-stability or A(α)-stability) to a new class of parallel one-block methods. Their procedure consists in segmenting the total work per step into a few independent tasks, so that each task actually requires the same amount of work as a sequential execution of a BDF. By numerically scanning the space of free coefficients, they have obtained very promising results with respect to A-stability. From a slightly different point of view, the search for higher-order A-stable multistep methods has been carried out in two directions: the use of the second derivative of the solution [3], [4], [8], [9] and the study of new "general linear methods" [15].

* Received by the editors May 11, 1992; accepted for publication (in revised form) February 19, 1993. This work was supported by SIMULOG, rue James Joule, 78182 St. Quentin Yvelines Cedex, France.

† Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Campus de Beaulieu, 35042 Rennes Cedex, France.
These methods have a better numerical behavior (A-stability up to order six) but are fully implicit and consequently not appropriate for parallel computers. Our purpose is to carry on the work of [18] and more specifically to construct high-order A-stable schemes that are easy to implement on a parallel computer. In §2, we recall the definition of the parallel implicit one-block methods introduced in [18] and we derive a new set of order conditions. Besides, we apply some fairly classic results on stability to get a practical criterion for A-stability. This criterion and the order conditions are then built into a Maple program which enables us to solve the resulting system of algebraic equations. This leads to the explicit construction of a third-order L-stable block method. Section 3 discusses a rather different approach. The complexity of the order and stability conditions has led us to restrict ourselves to a subset of the methods under investigation. Rather than trying to solve this large number of algebraic conditions, we designed a new class of methods by making strong simplifying assumptions. These methods exhibit surprisingly good stability properties for very high orders. An analytical proof of their L-stability up to order 11 has been given. Finally, §4 reports the results of accuracy tests and stability tests. These two sets of tests are in perfect agreement with the theoretical results.

2. Parallel one-block methods. We present a class of methods, introduced by Sommeijer, Couzy, and van der Houwen in [18], which are a direct generalization of the implicit one-step method

$$y_{n+1} = a\,y_n + h\,b\,f(y_n) + h\,d\,f(y_{n+1})$$

as well as a restriction of the multiblock methods of Chu and Hamilton [7].

2.1. Definition.
Let us introduce the vector $c = (c_1, \ldots, c_k)^T$ with $c_1 = 1$ (the $c_i$'s are assumed to be pairwise distinct), let $y_{n,i}$ denote the numerical approximation of the exact solution value $y(x_n + (c_i - 1)h)$ at step $n$, and let $Y_n$ denote the vector $(y_{n,1}^T, \ldots, y_{n,k}^T)^T$. The final approximation of the solution $y$ at the point $x_{n+1}$ is given by $y_{n+1,1}$, i.e., the first component of the vector $Y_{n+1}$. Finally, let $I$ denote the $m \times m$ identity matrix.

DEFINITION 2.1. We call a k-dimensional parallel block method a method defined by the recursion

$$Y_{n+1} = (A \otimes I)Y_n + h(B \otimes I)F(Y_n) + h(D \otimes I)F(Y_{n+1}),$$

where A, B, and D are k-by-k real matrices and D is assumed to be diagonal.

According to the usual convention, for any given vector $v = (v_1^T, \ldots, v_k^T)^T$, $F(v)$ denotes the vector $(f(v_1)^T, \ldots, f(v_k)^T)^T$. Let us emphasize that D is diagonal, so that it is possible to decouple the various components of the solution (see Remark 2.2 below). We can also notice that $c$ is allowed to have $k-1$ noninteger components and that no restriction on their localization in $\mathbb{R}$ is assumed (provided they are distinct).

Remark 2.2. Parallelism comes from the possibility to decouple the nonlinear system that has to be solved at each step (this represents the bulk of the computations):

$$Y_{n+1} - h(D \otimes I)F(Y_{n+1}) = (A \otimes I)Y_n + h(B \otimes I)F(Y_n).$$

Its Jacobian matrix is indeed of the form

$$\mathrm{diag}\left(I - hD_{1,1}\frac{\partial f}{\partial y}(x_n + c_1 h, Y_{n+1,1}),\; \ldots,\; I - hD_{k,k}\frac{\partial f}{\partial y}(x_n + c_k h, Y_{n+1,k})\right),$$

i.e., block diagonal. Each block is of dimension $m$ and corresponds to one subvector of $Y_{n+1}$. The corresponding subsystems can be solved independently.

2.2. Formulation as a "general linear method". In the sequel, we will frequently refer to results derived for "general linear methods." Hence, for ease of presentation, we briefly give the main characteristics of one-block methods considered as "general linear methods." Following the book of Hairer, Nørsett, and Wanner (see [12, p. 386]), we may define a forward step procedure and a correct value function as follows.

Forward step procedure:

$$U_n = \begin{pmatrix} Y_n \\ h\,F(Y_n) \end{pmatrix}, \qquad U_{n+1} = (\mathcal{A} \otimes I)\,U_n + h(\mathcal{B} \otimes I)\,F(V_n), \qquad V_n = (\hat{\mathcal{A}} \otimes I)\,U_n + h(\mathcal{D} \otimes I)\,F(V_n),$$

where

$$\mathcal{A} = \begin{pmatrix} A & B \\ 0 & 0 \end{pmatrix} \in \mathbb{R}^{2k \times 2k}, \qquad \mathcal{B} = \begin{pmatrix} D \\ I \end{pmatrix} \in \mathbb{R}^{2k \times k}, \qquad \hat{\mathcal{A}} = (A \;\; B) \in \mathbb{R}^{k \times 2k}, \qquad \mathcal{D} = D.$$

Correct value function:

$$z(x,h) = \left(y^T(x + (c_1-1)h), \ldots, y^T(x + (c_k-1)h),\; h\,y'^T(x + (c_1-1)h), \ldots, h\,y'^T(x + (c_k-1)h)\right)^T.$$

The correct value function gives the interpretation of the method and is to be approximated by $U_n$: $U_n \approx z(x_n, h)$.

2.3. Reduction to an autonomous system. Let $e$ denote the vector $(1, \ldots, 1)^T \in \mathbb{R}^k$. If we assume that A, B, and D satisfy the relations

(3) $Ae = e, \qquad A(c-e) + Be + De = c$,

then it is equivalent to apply a block method to problem (1) or to the reformulated autonomous system

(4) $Y' = F(Y), \qquad Y_0 = (x_0, y_0^T)^T$,

where $Y = (x, y^T)^T \in \mathbb{R}^{m+1}$ and $F(Y) = (1, f(x,y)^T)^T$. When applying a k-dimensional block method to (4), we obtain

(5) $\quad Y_{n+1} = (A \otimes I)Y_n + h(B \otimes I)F(Y_n) + h(D \otimes I)F(Y_{n+1}), \qquad x_n e + hc = x_n Ae + h[A(c-e) + Be + De]$.

Owing to (3), the last line of (5) is satisfied for any $h$, so that (5) reduces to

$$Y_{n+1} = (A \otimes I)Y_n + h(B \otimes I)F(Y_n) + h(D \otimes I)F(Y_{n+1}),$$

which is exactly the recursion we get when we apply a block method to (1). It should be emphasized that conditions (3) are not too restrictive, since they are satisfied for any method of order $p \ge 2$ (see Theorem 2.6). In the sequel, we therefore restrict ourselves to the study of k-dimensional block methods on autonomous systems. In particular, this will allow us to apply Theorem 8.14 from [12], which is only valid for autonomous systems.

2.4. pth-order consistency conditions. Further results will require the expansion of the components of the correct value function as Butcher series. This motivates the following computations. Readers unfamiliar with Butcher series can refer to [2], [12], and [13].
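As an illustration of Definition 2.1 and Remark 2.2, the following sketch (our own, not from the paper) advances one step of a k-dimensional parallel block method for a scalar ODE: since D is diagonal, the k implicit subsystems are solved by independent Newton iterations, one per would-be processor. The demonstration coefficients are our own transcription of the two-dimensional method $\mathcal{M}(2,4)$ constructed in §3; the function names and tolerances are our choices.

```python
# Sketch (not from the paper): one step of a k-dimensional parallel block
# method  Y_{n+1} = (A x I) Y_n + h (B x I) F(Y_n) + h (D x I) F(Y_{n+1})
# for a scalar ODE (m = 1).  Because D is diagonal, the k implicit
# subsystems are independent and could be handed to k processors.
import numpy as np

def block_step(f, x_n, Y_n, h, A, B, D, c):
    """Advance one block step; each component gets its own Newton iteration."""
    rhs = A @ Y_n + h * (B @ f(x_n + (c - 1.0) * h, Y_n))
    Y_next = Y_n.copy()                      # initial guess for Y_{n+1}
    for i in range(len(c)):                  # independent -> parallelizable
        xi = x_n + c[i] * h                  # Y_{n+1,i} approximates y(x_n + c_i h)
        u = Y_next[i]
        for _ in range(50):                  # scalar Newton iteration
            g = u - h * D[i, i] * f(xi, u) - rhs[i]
            eps = 1e-8                       # finite-difference derivative
            dg = 1.0 - h * D[i, i] * (f(xi, u + eps) - f(xi, u)) / eps
            du = g / dg
            u -= du
            if abs(du) < 1e-14:
                break
        Y_next[i] = u
    return Y_next

# Demonstration with our transcription of M(2,4) from Section 3
# (c = (1,2)^T, B = 0, D = diag(1/2, 3/4)); test problem y' = -y on [0,1].
c = np.array([1.0, 2.0])
A = np.array([[0.5, 0.5], [-0.25, 1.25]])
B = np.zeros((2, 2))
D = np.diag([0.5, 0.75])
f = lambda x, y: -y

h, N = 0.01, 100
Y = np.array([1.0, np.exp(-h)])              # exact starting values
for n in range(N):
    Y = block_step(f, n * h, Y, h, A, B, D, c)
print(abs(Y[0] - np.exp(-1.0)))              # small second-order error
```

In a real parallel code the loop over `i` would run on separate processors; the point of the sketch is only that each subsystem closes over its own diagonal entry of D.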
As a first observation, we can notice that the coefficients $z_i(t)$ in the expansion of the ith component of the correct value function depend only on the order of the tree $t$. In terms of the exact solution $y(x)$ (which is assumed to be smooth enough), we have

$$y^j(x + (c_i-1)h) = y^j(x) + (c_i-1)h\,(y^j(x))^{(1)} + \cdots + \frac{(c_i-1)^n h^n}{n!}\,(y^j(x))^{(n)} + \cdots,$$

$$h\,(y^j(x + (c_i-1)h))^{(1)} = h\,(y^j(x))^{(1)} + \cdots + \frac{(n+1)(c_i-1)^n h^{n+1}}{(n+1)!}\,(y^j(x))^{(n+1)} + \cdots.$$

Let $\varnothing$ denote the empty tree, $\tau$ the only tree of order one, and $\tau^n$ any tree of order n. According to Theorem II.2.6 and Definition II.11.1 of [12] we consequently have

(6) $z_i(\varnothing) = 1, \qquad z_i(\tau^n) = (c_i - 1)^n$ for all $i$ from 1 to k,

and

(7) $z_i(\varnothing) = 0, \qquad z_i(\tau) = 1, \qquad z_i(\tau^n) = n\,(c_{i-k} - 1)^{n-1}$ for all $i$ from k+1 to 2k.

Expressing relations (6) and (7) in vector form, we get

(8) $z(\varnothing) = \begin{pmatrix} e \\ 0 \end{pmatrix}, \qquad z(\tau) = \begin{pmatrix} c - e \\ e \end{pmatrix}, \qquad \forall n \ge 2, \; z(\tau^n) = \begin{pmatrix} (c-e)^n \\ n(c-e)^{n-1} \end{pmatrix}$,

where powers of vectors are meant to be componentwise powers. The so-called preconsistency conditions (see [12, p. 397]), which ensure that the method makes sense, have here a very simple form:

$$\mathcal{A}\,z(\varnothing) = z(\varnothing) \iff Ae = e.$$

Now, in order to present the order conditions, we need two additional definitions.

DEFINITION 2.3. A k-dimensional block method with matrix $\mathcal{A}$ is called zero-stable if $\mathcal{A}^n$ is uniformly bounded for all n.

DEFINITION 2.4. Let us consider a zero-stable and preconsistent k-dimensional block method, and let T be such that the Jordan canonical form of $\mathcal{A}$ is

$$\mathcal{A} = T \cdot \mathrm{diag}(\zeta_1, \ldots, \zeta_l, J) \cdot T^{-1},$$

where the $\zeta_i$ are the eigenvalues of $\mathcal{A}$ of modulus one and J is the Jordan block associated with the eigenvalues of modulus strictly less than one. Then E is defined as the spectral projector onto the invariant subspace associated with 1:

$$E = T \cdot \mathrm{diag}(I, 0, \ldots, 0) \cdot T^{-1}.$$

Remark 2.5. No eigenvalue of $\mathcal{A}$ can lie outside the unit disc since the method is assumed to be zero-stable. Moreover, the preconsistency conditions imply that 1 is an eigenvalue of $\mathcal{A}$. It is consequently relevant to decompose $\mathcal{A}$ as in Definition 2.4.
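Definition 2.3 can be checked by brute force for a given method, since $\mathcal{A}$ has the block form with A and B in the first row and zeros below. A minimal sketch (our example: the implicit trapezoidal rule written as a one-dimensional block method, which is not taken from the paper):

```python
# Sketch: checking zero-stability (Definition 2.3) by brute force.  The
# powers of the 2k x 2k matrix  calA = [[A, B], [0, 0]]  must stay bounded.
# Example method (ours): trapezoidal rule as a 1-dimensional block method.
import numpy as np

A = np.array([[1.0]]); B = np.array([[0.5]])
k = A.shape[0]
calA = np.block([[A, B], [np.zeros((k, k)), np.zeros((k, k))]])

P = np.eye(2 * k)
norms = []
for n in range(200):                 # ||calA^n|| for n = 1..200
    P = P @ calA
    norms.append(np.linalg.norm(P, ord=np.inf))
print(max(norms))                    # bounded, hence zero-stable
```

For an unstable choice (e.g. A = (1.01)) the norms would grow geometrically, so this crude test distinguishes the two cases even without computing the Jordan form.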
The matrix E of Definition 2.4 plays an important role in the definition of the order of convergence according to Skeel (see Definition 8.10 of [12, p. 393]) that we have retained here. Although it seems more complex, this definition is less restrictive than the componentwise definition of the order given in [18] and leads to a global error of the same order. Finally, we define, as in [18], the following vectors, which will be used for the sake of clarity:

$$C_0 = Ae - e \qquad \text{and} \qquad C_j = A(c-e)^j + j\left[B(c-e)^{j-1} + Dc^{j-1}\right] - c^j, \quad j = 1, 2, \ldots$$

We are now prepared to give the following.

THEOREM 2.6. A zero-stable and preconsistent k-dimensional block method is of order $p \ge 1$ if and only if

(9) $\forall j, \; 0 \le j \le p-1, \quad C_j = 0$,

(10) $E\,(C_p^T, 0)^T = (0, 0)^T$.

Proof. Suppose that the components of the local error (see [12, p. 391] for a definition of the local error of a general linear method) possess an expansion as a Butcher series, let $d_i(t)$ denote the coefficient of the ith component, and let $d(t) = (d_1(t), \ldots, d_{2k}(t))^T$. Similarly, let $v(t) = (v_1(t), \ldots, v_k(t))^T$ denote the coefficients of the expansion of $V_n$, and let $v'(t)$ denote the coefficients of the expansion of $hF(V_n)$. Theorem 2.6 is nothing else but Theorem 8.14 from [12, p. 398], where $d(t)$ is computed as follows. As mentioned above, the coefficients $z(t)$ of the correct value function depend only on the order $\mu(t)$ of the tree t, so that relations (8.27) of [12, p. 398] can be written as expansions (11) and (12) of $d(t)$ and $v(t)$ in terms of $z$ and $v'$. Inserting (8) into (11) and (12), we get

(13) $d(t) = \begin{pmatrix} A(c-e)^{\mu(t)} + \mu(t)B(c-e)^{\mu(t)-1} + Dv'(t) - c^{\mu(t)} \\ v'(t) - \mu(t)\,c^{\mu(t)-1} \end{pmatrix}$

and

(14) $v(t) = A(c-e)^{\mu(t)} + \mu(t)B(c-e)^{\mu(t)-1} + Dv'(t)$.

Now, let us consider a zero-stable pth-order k-dimensional block method. The conditions

$$\forall t, \; \mu(t) \le p-1, \quad d(t) = 0$$

of Theorem 8.14 of [12, p. 398], together with (13), imply on the one hand

(15) $\forall j \in [0, p-1], \quad C_j = 0$,

i.e., conditions (9), and on the other hand

(16) $\forall t, \; \mu(t) \le p-1, \quad v'(t) = \mu(t)\,c^{\mu(t)-1}$.

If t is a tree of order at most p-1, (14) and (16) lead to $v(t) = C_{\mu(t)} + c^{\mu(t)}$, so that from (15) we get $v(t) = c^{\mu(t)}$. If t is now a tree of order p, according to Corollary 11.7 from [12, p. 246], we have $v'(t) = p\,v(t_1)\cdots v(t_m)$, where $t = [t_1, \ldots, t_m]$ denotes the tree which leaves over the trees $t_i$ when its root and adjacent branches are chopped off (see Definition 2.12 from [12, p. 152]). Since $v(t) = c^{\mu(t)}$ for any tree of order at most p-1, we have $v'(t) = p\,c^{p-1}$, so that the condition $Ed(t) = 0$ of Theorem 8.14 is equivalent to condition (10).

Conversely, let us consider a zero-stable method for which conditions (9) and (10) are satisfied for an integer $p \ge 1$. Then an induction argument based on (13) and (14) implies that

$$\forall t, \; \mu(t) \le p-1, \quad v(t) = c^{\mu(t)},$$

and consequently that

$$\forall t, \; \mu(t) \le p, \quad d(t) = \begin{pmatrix} C_{\mu(t)} \\ 0 \end{pmatrix},$$

so that, according to Theorem 8.14, the method is of order p. □

COROLLARY 2.7. A sufficient condition for a k-dimensional block method to be of order $p \ge 1$ is given by

(17) $\forall j \in [0, p], \quad C_j = 0$.

Remark 2.8. The result stated in Corollary 2.7 has been obtained by Sommeijer, Couzy, and van der Houwen in [18] by using a componentwise definition of the order. However, conditions (9) and (10) of Theorem 2.6 are less restrictive than conditions (17). The number of scalar equations involved in (17) with j = p is indeed k; this number drops to rank(E) for (10) (which is less than or equal to k and in most cases strictly less than k, since E is a projection matrix). The additional free parameters can be used to improve stability for a given k and a given order. The method of Example 2.17 benefits from this approach and compares favorably with method (3.10) of [18]. Incidentally, method (3.11) of [18] implicitly uses the conditions of Theorem 2.6. Another application of Theorem 2.6 is given in §3, where the methods $\mathcal{M}(k,r)$ reach order k by using condition (10). In order to completely define the order conditions, we need to compute the matrix E.
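The vectors $C_j$ are cheap to evaluate, so Corollary 2.7 lends itself to a direct numerical check. In the sketch below (our example, not from the paper) the implicit trapezoidal rule is viewed as a 1-dimensional block method with A = (1), B = (1/2), D = (1/2), c = (1):

```python
# Sketch: numerical check of the order conditions of Corollary 2.7,
#   C_0 = Ae - e,   C_j = A(c-e)^j + j [B(c-e)^{j-1} + D c^{j-1}] - c^j.
# Example method (ours): implicit trapezoidal rule as a 1-dim block method,
#   y_{n+1} = y_n + (h/2) f(y_n) + (h/2) f(y_{n+1}).
import numpy as np

def C(j, A, B, D, c):
    e = np.ones_like(c)
    if j == 0:
        return A @ e - e
    return (A @ (c - e)**j
            + j * (B @ (c - e)**(j - 1) + D @ c**(j - 1))
            - c**j)

A = np.array([[1.0]]); B = np.array([[0.5]]); D = np.array([[0.5]])
c = np.array([1.0])

vanishing = [np.allclose(C(j, A, B, D, c), 0.0) for j in range(4)]
print(vanishing)   # C_0 = C_1 = C_2 = 0 but C_3 = 1/2: order 2, as expected
```

The first nonvanishing $C_j$ stops the chain, so scanning j upward recovers the classical order of the scheme without any symbolic work.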
However, its decomposition as $T \cdot \mathrm{diag}(I, 0, \ldots, 0) \cdot T^{-1}$ is not of practical interest, since we aim at deriving purely formal algebraic conditions. The following theorem, together with the next proposition, gives us an easy way to compute E from the characteristic polynomial of A.

THEOREM 2.9. Let us assume that the method is zero-stable and preconsistent. Let $l$ be the multiplicity of the eigenvalue 1 of A, and let q be the polynomial defined by

(18) $(\lambda - 1)^l\, q(\lambda) = \det(A - \lambda I)$,

where I is here the k × k identity matrix. We may then write E as

(19) $E = \frac{1}{q(1)} \begin{pmatrix} q(A) & q(A)B \\ 0 & 0 \end{pmatrix}$.

Proof. The spectral projector E onto the invariant subspace associated with the eigenvalue 1 of $\mathcal{A}$ is characterized by

(20) $E^2 = E \qquad \text{and} \qquad E(\mathcal{A} - \mathcal{I}) = (\mathcal{A} - \mathcal{I})E = 0$,

where $\mathcal{I}$ is the 2k × 2k identity matrix. From (18) and the assumption on the multiplicity of 1 we have $(A - I)q(A) = 0$. Therefore, $Aq(A) = q(A)$ and $q(A)^2 = q(1)q(A)$. Using these properties, it is easy to check that the matrix given by (19) satisfies (20). □

2.5. Stability.

PROPOSITION 2.10. A k-dimensional block method is zero-stable if and only if there exists a constant K such that $\forall n \in \mathbb{N}, \; \|A^n\| \le K$.

Proof. Since

$$\forall n \in \mathbb{N}, \qquad \mathcal{A}^{n+1} = \begin{pmatrix} A^{n+1} & A^n B \\ 0 & 0 \end{pmatrix},$$

it is easily seen that this property is equivalent to Definition 2.3. □

The linear stability of block methods can be investigated by applying the method to the test equation $y' = \lambda y$. This leads to a recursion of the form

(21) $Y_{n+1} = M(z)\,Y_n$,

where $M(z) = (I - zD)^{-1}(A + zB)$ is called the amplification matrix and where $z = \lambda h$ is defined as usual. Following the familiar notions of stability, we will successively define the following.

DEFINITION 2.11. The stability region of a block method is the region of the complex plane where the sequence of vectors $(Y_n)_{n \in \mathbb{N}}$ remains bounded for any initial vector $Y_0$.

DEFINITION 2.12. A k-dimensional block method is called A-stable if the stability region contains the left half-plane $\mathbb{C}^- = \{z \in \mathbb{C};\; \mathrm{Re}(z) \le 0\}$.

DEFINITION 2.13. A k-dimensional block method is called L-stable if it is A-stable and if its amplification matrix has vanishing eigenvalues at infinity.

Remark 2.14. It is easily seen by considering the form of M(z) that

$$\lim_{z \to \infty} M(z) = -D^{-1}B;$$

therefore, the amplification matrix of an L-stable method is such that $\rho(D^{-1}B) = 0$.

Here we observe that a point z belongs to the stability region if and only if the eigenvalues of the amplification matrix M(z) have a modulus less than or equal to one and the eigenvalues of modulus one are nondefective. However, it seems difficult to investigate the stability region from this single characterization. A more appropriate result establishes sufficient conditions for A-stability. This will enable us to "verify" that a method is A-stable.

PROPOSITION 2.15. Let a k-dimensional zero-stable block method be defined by the matrices A, B, and D, with D > 0. Let $P(\zeta, z) = \det(M(z) - \zeta I)$ denote the characteristic polynomial of M(z), and let $\alpha_j(y)$, $j = 0, \ldots, 2k$, be the coefficient of $w^j$ in

$$W(w, y) = (w - 1)^{2k}\, P\!\left(\frac{w+1}{w-1},\, iy\right) P\!\left(\frac{w+1}{w-1},\, -iy\right).$$

Finally, let $\mathcal{H}(y)$ be the Hurwitz-type matrix

$$\mathcal{H}(y) = \left(\alpha_{2k-2j+i}(y)\right)_{1 \le i, j \le 2k}, \qquad \text{with } \alpha_l = 0 \text{ for } l < 0 \text{ or } l > 2k,$$

and let $\Omega(y) = \det\left((\alpha_{2k-2j+i}(y))_{1 \le i, j \le 2k-1}\right)$ be its (2k-1)th principal minor. Then the following set of four conditions is sufficient for A-stability:

(22) $\exists y^* \in\, ]0, +\infty[, \; \forall w \in \mathbb{C}, \; W(w, y^*) = 0 \Rightarrow \mathrm{Re}(w) < 0$;
$\forall y \in\, ]0, +\infty[, \; \alpha_0(y) \ne 0$;
$\forall y \in\, ]0, +\infty[, \; \alpha_{2k}(y) \ne 0$;
$\forall y \in\, ]0, +\infty[, \; \Omega(y) \ne 0$.

Proof. Let us assume that conditions (22) are satisfied. Similarly to Corollary 1 of [11], we can prove that for any $y \in\, ]0, +\infty[$, the 2k roots $w_j(y)$ of $W(w, y)$ lie in the left half-plane $\mathrm{Re}(w) < 0$. Now, since the transformation $(w+1)/(w-1)$ maps the unit disc (of the ζ-plane) onto the left half-plane, the roots of $P(\zeta, iy)P(\zeta, -iy)$ lie in the unit disc for any $y \in\, ]0, +\infty[$. The same obviously holds for the roots of $P(\zeta, iy)$. This implies that

$$\forall y \in\, ]0, +\infty[, \qquad \rho(M(iy)) < 1.$$
In particular, the method is stable on the imaginary axis (by assumption, 0 belongs to the stability region). The conclusion now follows from the maximum principle applied to $\rho(M(z))$. □

COROLLARY 2.16. If $\rho(D^{-1}B) = 0$, then a sufficient condition for L-stability is that all coefficients of $\alpha_0(y)$, $\alpha_{2k}(y)$, and $\Omega(y)$ are strictly positive.

For small values of k (k ≤ 3), the criteria of Proposition 2.15 give the opportunity to derive formal stability conditions in terms of the free parameters. The set of all conditions (stability and order conditions) leads to a large but solvable system, so that methods can be constructed explicitly (see Example 2.17). However, it does not seem possible to compute and solve the corresponding system for large values of k, owing to the expensive computations involved by conditions (22) when all parameters are kept free. In the next section, Proposition 2.15 and its corollary will be used to verify the stability of a restricted class of methods (the methods $\mathcal{M}(k,r)$).

Example 2.17. Solving the order and stability conditions (see [6]) for k = 2 leads to a third-order method with $c = (1, 3/2)^T$, in which the entries of A, B, and D are rational functions of the free parameters $b_1$ and $b_2$, with

$$b_2 = \mathrm{RootOf}(4361 - 36480\,Z + 158800\,Z^2 + 48000\,Z^3) \approx -3.530865570574838.$$

The conditions of Proposition 2.15 are fulfilled, so that A-stability is achieved. Since the eigenvalues of the amplification matrix vanish at infinity, the method is L-stable.

3. Generalized backward differentiation formulas. It soon becomes impossible to solve the system of order conditions and stability constraints. Even powerful symbolic manipulation software such as Maple cannot tackle such a huge task, for the time of computation increases exponentially.
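The criterion of Proposition 2.15 is formal; a crude numerical surrogate, useful before attempting the symbolic computation, is to sample $\rho(M(iy))$ along the imaginary axis and to evaluate the stiff limit of Remark 2.14. A sketch (our example method is again the trapezoidal rule as a 1-dimensional block method, a choice not made in the paper):

```python
# Sketch: numerical probe of A-stability via the amplification matrix
#   M(z) = (I - zD)^{-1}(A + zB)   of (21),
# plus the stiff limit  M(inf) = -D^{-1}B  of Remark 2.14.
import numpy as np

def amplification(z, A, B, D):
    k = A.shape[0]
    return np.linalg.solve(np.eye(k) - z * D, A + z * B)

A = np.array([[1.0]]); B = np.array([[0.5]]); D = np.array([[0.5]])

# spectral radius sampled along the imaginary axis z = iy
rho_axis = max(
    max(abs(np.linalg.eigvals(amplification(1j * y, A, B, D))))
    for y in np.linspace(1e-3, 1e3, 2000)
)
rho_inf = max(abs(np.linalg.eigvals(-np.linalg.solve(D, B))))
print(rho_axis, rho_inf)   # radius 1 on the axis, radius 1 at infinity
```

For the trapezoidal rule the sampled radius stays at 1 on the whole axis (A-stable by the maximum principle), while $\rho(D^{-1}B) = 1 \ne 0$, so the method is not L-stable, which is exactly the defect the construction of §3 removes by taking B = 0.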
A way to overcome these obstacles is to perform a numerical search in the space of free coefficients, once the conditions for zero-stability and order p have been derived. This technique has been investigated in [18]. The authors have reached the fifth order with almost A-stability. However, no formal proof of these results was given. In addition to this, it seems hard to go further, owing to the considerable amount of work needed to find optimal coefficients. This is why we turned to the simpler case where B is null, even if it dramatically reduces the number of free coefficients: for a given block size k and a given vector $c \in \mathbb{R}^k$, there are (k+1)k unknown coefficients in A and D instead of (2k+1)k. Since every order condition $C_j = 0$ corresponds to k linear equations, we do not expect more than kth-order methods. However, by sacrificing many coefficients, we brought the number of free parameters down to a manageable level.

3.1. Construction. We now attack the explicit construction of a restricted class of one-block methods. An interesting question is how to choose the matrices B and D of Definition 2.1 and the vector c so as to obtain good stability properties in addition to high order. Our search for a good choice of parameters is based on the following ideas:

1. Set B = 0 to ensure stability at infinity (see Remark 2.14).

2. Choose A such that $A \cdot U = V - D \cdot W$, where

$$U = [e, (c-e), (c-e)^2, \ldots, (c-e)^{k-1}], \qquad V = [e, c, c^2, \ldots, c^{k-1}], \qquad W = [0, e, 2c, \ldots, (k-1)c^{k-2}],$$

to ensure a reasonably high order with respect to the "Daniel-Moore" conjecture (see Theorem 4.17 of [13]). The relation $A \cdot U = V - D \cdot W$ is nothing else but conditions (17) of Corollary 2.7 for $j = 0, \ldots, k-1$ expressed in matrix form. The assumption $c_i \ne c_j$ for $i \ne j$ ensures that U is nonsingular, so that A is determined once D and c are given.

3. Set $D = (1/r)(I + \mathrm{diag}(c))$, where I is the k × k identity matrix and diag(c) the k × k diagonal matrix with diagonal elements $c_i$, to ensure zero-stability (see Remark 3.1 below).

4. Set $c = (1, \ldots, k)^T$ in order to make the implementation easier. The ith component is indeed a good initial guess for the determination of the (i-1)th component of $Y_{n+1}$ by Newton's method (the determination of the kth component of $Y_{n+1}$ will require special attention).

5. Determine the remaining coefficient r so as to maximize the order of accuracy. Within the class of values of r for which L-stability is achieved, choose the value that leads to the smallest error constant (the procedure will be described in detail in §3.5).

Remark 3.1. This particular and somewhat surprising choice of D as a linear combination of the two matrices I and diag(c) is motivated by the following observation. A close look at the order conditions of Corollary 2.7 for order (k-1) suggests that A can be made upper-triangular in the basis $\{e, c, c^2, \ldots, c^{k-1}\}$ if $Dc^{j-1}$ belongs to $\mathrm{Span}\{e, c, \ldots, c^j\}$ for $j = 1, \ldots, k$. This goal is easily reached by taking D as a first-degree polynomial of the matrix diag(c), $D = \alpha I + \beta\,\mathrm{diag}(c)$. Generally speaking, $D = \alpha I + \beta\,\mathrm{diag}(c)$ leads to a zero-stable method for $0 < \beta \le 2/(k-1)$ (see Theorem 3.3 for the details). Furthermore, setting $\beta = 0$, we get

$$\forall \alpha > 0, \; \forall z \in \mathbb{C}^- \setminus \{0\}, \qquad \rho(M(z)) = \rho\!\left((I - \alpha z I)^{-1} A\right) = |1 - \alpha z|^{-1}\,\rho(A) \le \rho(A),$$

so that stability is then ensured in the left half-plane except at z = 0. Taking this result into account, the term $\beta\,\mathrm{diag}(c)$ appears as a perturbation of $\alpha I$ which brings zero-stability. The determination of stability domains by numerical simulations has shown that among the best sets of parameters was $\alpha = \beta = 1/r$.

Combining these ideas, we finally give the definition of the methods $\mathcal{M}(k,r)$.

DEFINITION 3.2. By $\mathcal{M}(k,r)$ we mean the special k-dimensional parallel block method for (1) defined by

$c = (1, 2, \ldots, k)^T$;
$B = 0$;
$D = (1/r)(I + \mathrm{diag}(c)), \quad r \in\, ]0, +\infty[$;
$A = V \cdot U^{-1} - D \cdot W \cdot U^{-1}$.

3.2. Zero-stability and stability at infinity. The following theorem sets minimal properties for the methods $\mathcal{M}(k,r)$.

THEOREM 3.3. Let us assume that $r \ge (k-1)/2$.
Then the method $\mathcal{M}(k,r)$ satisfies:
$\mathcal{M}(k,r)$ is of order at least k-1;
$\mathcal{M}(k,r)$ is zero-stable;
$\mathcal{M}(k,r)$ is stable in the half-plane $x \le r/(k+1) - \|D^{-1}A\|$.

Proof. By definition of A, the method is of order $p = k-1$. Since the $(c_i)_{1 \le i \le k}$ are pairwise distinct, $\{e, c, \ldots, c^{k-1}\}$ is a basis of $\mathbb{R}^k$ in which the matrix A has a reduced form. According to the order conditions of Corollary 2.7 we indeed have

$$Ae = e \qquad \text{and} \qquad \forall j \in [1, \ldots, p], \quad Ac^j = \left(1 - \frac{j}{r}\right)c^j - \frac{j}{r}\,c^{j-1} - \sum_{l=0}^{j-1} \binom{j}{l}(-1)^{j-l}\,Ac^l.$$

Hence, an easy induction argument shows that A is upper-triangular in this basis. From the expression of A we can see that its eigenvalues are $1 - j/r$ for $j = 0, \ldots, p$. They are pairwise distinct and of modulus less than or equal to one, so that $\mathcal{M}(k,r)$ is zero-stable. Finally, by considering the Laplace transform of $g(t) = e^{-tD^{-1}}D^{-1}A$, we get

$$\int_0^\infty e^{-tD^{-1}}D^{-1}A\,e^{zt}\,dt = \int_0^\infty e^{-t(D^{-1} - zI)}D^{-1}A\,dt = (I - zD)^{-1}A = M(z).$$

Provided $\|\cdot\|$ is subordinated to a vector norm, it follows that

$$\|M(z)\| \le \|D^{-1}A\| \int_0^\infty e^{-(r/(\bar{c}+1) - x)t}\,dt = \frac{\|D^{-1}A\|}{r/(\bar{c}+1) - x},$$

where $\bar{c} = \max_{1 \le i \le k} c_i = k$ and $z = x + iy$. □

3.3. Error constant. In [18] normalized error vectors have been introduced in order to compare their components with the error constants corresponding to conventional linear multistep methods:

$$\tilde{C}_j = \frac{C_j}{j!\,De},$$

where the division of vectors is meant componentwise. The motivation for such a choice was the similarity with the definition of the error constant for multistep methods. Incidentally, BDF methods written in block format have an error vector of $(0, \ldots, 0, -1/(k+1))^T$, whose last component is nothing but the usual error constant of the BDFs. However, it seems more relevant to define an error constant according to the asymptotic expansion of the global error. Since our block methods fit into the class of general linear methods, we may study the asymptotic expansion for a strictly zero-stable method, i.e., zero-stable and having 1 as the only eigenvalue of $\mathcal{A}$ with modulus one.
This previous assumption greatly simplifies the discussion and is fulfilled for all methods $\mathcal{M}(k,r)$, provided $r > (k-1)/2$. Theorem 9.1 of [12, p. 406] now ensures the existence of an asymptotic expansion of the global error with principal term $e_p(x)$ (see [12, p. 409]) for a method of order p satisfying

$$e_p'(x) = EG(x)\,e_p(x) - E\,d_{p+1}(x) - (I - \mathcal{A} + E)^{-1}d_p'(x), \qquad e_p(x_0) = E\gamma_p - (I - \mathcal{A} + E)^{-1}d_p(x_0),$$

where $d_i(x_n)$ is the coefficient of $h^i$ in the Taylor expansion of the local error at $x = x_n$, $\gamma_p$ is the coefficient of $h^p$ in the expansion of $\delta_0 = z_0 - \Phi(h)$ ($\Phi$ is the starting procedure), and G is a function depending on both the problem and the method. If we assume that $\mathcal{M}(k,r)$ is of order $p = k-1$, then $d_p(x) \equiv 0$ and $d_{p+1}(x) = -(1/k!)\,y^{(k)}(x)\,C_k$. An appropriate definition of the error constant C is consequently given by

$$C\,e = -\frac{1}{k!}\,E\,C_k.$$

However, this definition cannot be used when p = k, i.e., $EC_k = 0$, since $d_p(x)$ is then no longer null and $d_{p+1}(x)$ no longer depends only on derivatives of y. From a different point of view, we may define an error constant from the defect of the exponential when inserted into the numerical scheme $\mathcal{M}(k,r)$. For the test equation $y' = \lambda y$ and stepsize h, let $Y_0 = (1, e^z, \ldots, e^{(k-1)z})^T$ ($z = \lambda h$). Since $C_i = 0$ for all $i$, $0 \le i \le k-1$, we have

(23) $(e^z I - A - z e^z D)\,Y_0 = -\frac{1}{k!}\,C_k\,z^k + O(z^{k+1})$.

Let $R(\zeta, z) = \det(I - zD)\,P(\zeta, z)$, where $P(\zeta, z)$ is the characteristic polynomial of M(z). Relation (23) implies that $R(e^z, z) = O(z^k)$. Since 1 is a simple eigenvalue of A, we can consider the analytic continuation $\zeta_1(z)$ of 1 in a neighborhood of the origin (called the principal root). This gives

$$e^z - \zeta_1(z) = C\,z^k + O(z^{k+1}), \qquad C = \left(\frac{\partial R}{\partial \zeta}(1, 0)\right)^{-1} \times \left[\text{coefficient of } z^k \text{ in } R(e^z, z)\right].$$

Now, considering the Laurent series of $(A - \zeta I)^{-1}$ in a neighborhood of $\zeta = 1$, one can easily show that the two definitions of the error constant are consistent when the order is k-1.
When we get an extra order with $EC_k = 0$, the second definition carries over since 1 is still a simple eigenvalue of A, so that we will retain the definition of C given by the principal term of $e^z - \zeta_1(z)$. It is clear that this definition is fully acceptable only when the order of $\mathcal{M}(k,r)$ is k-1. As for Runge-Kutta methods, the principal error term is indeed composed of several "elementary differentials" when the order is k.

3.4. Algebraic stability. One can ask whether $\mathcal{M}(k,r)$ can be made algebraically stable or not (see, for instance, Definition 9.3 of [13, p. 386] for a definition of algebraic stability). The following proposition shows that the free parameter r cannot be used to achieve algebraic stability.

PROPOSITION 3.4. For $k \ge 3$, $\mathcal{M}(k,r)$ is not algebraically stable.

To prove Proposition 3.4 we need the following lemma.

LEMMA 3.5. For $k \ge 3$ and for all x in the set $\{1, 2, \ldots, k\}$, the following inequality holds:

(24) $(k-1)\left(x^{k-1} + x^{k-2}\right) \ge (k+1)\left(x^{k-1} - (x-1)^{k-1}\right)$.

Proof. Let $y = x - 1$. Straightforward calculations lead to

$$(k+1)y^{k-1} - 2(y+1)^{k-1} + (k-1)(y+1)^{k-2} = (k-1)y^{k-2}(y - 1) + \sum_{l=0}^{k-3}\left[(k-1)\binom{k-2}{l} - 2\binom{k-1}{l}\right]y^l.$$

Now, for $0 \le l \le k-3$, it is clear that $(k-1)\binom{k-2}{l} \ge 2\binom{k-1}{l}$, since $(k-1)\binom{k-2}{l} = (k-1-l)\binom{k-1}{l}$. Consequently, (24) is ensured provided $y \ge 1$. As for $y = 0$, we may see directly that (24) holds. □

Proof of Proposition 3.4. A necessary condition for a preconsistent general linear method to be algebraically stable is given by Lemma 9.5 of [13]: a diagonal positive matrix and a symmetric positive definite matrix G should exist such that conditions (25) of that lemma hold; here the data of the lemma are $\mathcal{A} = A$, $\mathcal{B} = D$, and the preconsistency vector e. Let us now denote $v = Ge$. The vector v is a nonzero vector whose components are all positive (due to (25)). Now, (25) implies that $v = A^T v$. Using the order conditions (17) for j = 1 and for j = k-1, we get

(26) $v^T\left[(c-e) + \frac{1}{r}(e + c) - c\right] = 0 \qquad \text{and} \qquad v^T\left[(c-e)^{k-1} + \frac{k-1}{r}\left(c^{k-1} + c^{k-2}\right) - c^{k-1}\right] = 0.$

For the first equation of (26) to hold, the components of $(c-e) + \frac{1}{r}(e+c) - c$ should not all have the same sign, so that r should belong to the interval $]2, k+1[$. A similar argument for the second equation implies that

$$(k-1)\,\min_{1 \le i \le k}\frac{c_i^{k-1} + c_i^{k-2}}{c_i^{k-1} - (c_i - 1)^{k-1}} \;<\; r \;<\; (k-1)\,\max_{1 \le i \le k}\frac{c_i^{k-1} + c_i^{k-2}}{c_i^{k-1} - (c_i - 1)^{k-1}}.$$

Now, due to Lemma 3.5 (with $c_i = i$), the lower bound is at least k+1, and this leads to a contradiction. □

3.5. Survey of methods. The estimation of Theorem 3.3 shows that when Re(z) tends to $-\infty$, $\|M(z)\|$ tends to zero. In fact, this result is also valid when $|z|$ tends to infinity, since $\rho(D^{-1}B) = 0$ (see Remark 2.14). It follows that L-stability is implied by A-stability for the methods $\mathcal{M}(k,r)$. Nevertheless, A-stability cannot be easily obtained, and we actually did not find any direct proof. All we can show is that the methods $\mathcal{M}(k,r)$ satisfy the conditions of Proposition 2.15 for certain values of r with respect to k. This can be done by using an efficient symbolic calculus system. We present in [6] a program written in Maple whose purpose is to build the methods $\mathcal{M}(k,r)$ for any value of k and r. This program then assembles the matrix $\mathcal{H}$ and computes the (2k-1)th principal minor, which is known to be a polynomial in the real variable y. This program is used in the sequel to investigate the L-stability of the methods $\mathcal{M}(k,r)$ for particular values of r. According to §3.1, it still remains to choose the parameter r. One special choice seems interesting: since 1 is a simple eigenvalue of A, the projection matrix E is of rank one. As already mentioned in Remark 2.8, conditions (10) of Theorem 2.6 for order k consequently sum up to the following scalar equation:

(27) $\left[E(r)\,C_k(r)\right]_1 = 0$,

where E(r) and $C_k(r)$ stand for E and $C_k$ as functions of r and the index indicates that we consider the first component only. For $r \ne 0$, the left part of (27) becomes a kth-degree polynomial whose roots can be easily computed: for any value r of these roots, $\mathcal{M}(k,r)$ is of order k.
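Equation (27) can be probed numerically. The sketch below (our own computation, assuming the construction of Definition 3.2 and the projector formula of Theorem 2.9 for a simple eigenvalue) builds $\mathcal{M}(k,r)$ and evaluates $[E(r)C_k(r)]_1$ for k = 2; in this sketch the residual vanishes at r = 4, the value $r_2$ listed in Table 1, and at r = 3/2. It also re-checks the claims of Theorem 3.3 for k = 5.

```python
# Sketch (our computation): build M(k, r) per Definition 3.2 and evaluate
# the scalar order-k condition (27), [E(r) C_k(r)]_1 = 0, for k = 2.
import numpy as np

def build_M(k, r):
    c = np.arange(1.0, k + 1.0); e = np.ones(k)
    U = np.column_stack([(c - e)**j for j in range(k)])
    V = np.column_stack([c**j for j in range(k)])
    W = np.column_stack([j * c**(j - 1) if j else np.zeros(k)
                         for j in range(k)])
    D = (1.0 / r) * np.diag(1.0 + c)         # D = (1/r)(I + diag(c))
    A = (V - D @ W) @ np.linalg.inv(U)       # A U = V - D W
    return A, D, c, e

def residual_27(r, k=2):
    """First component of E(r) C_k(r); E via Theorem 2.9, 1 being simple."""
    A, D, c, e = build_M(k, r)
    Ck = A @ (c - e)**k + k * D @ c**(k - 1) - c**k
    mu = 1.0 - 1.0 / r                       # the other eigenvalue for k = 2
    E = (A - mu * np.eye(k)) / (1.0 - mu)    # projector onto eigenvalue 1
    return (E @ Ck)[0]

print(residual_27(4.0), residual_27(1.5))   # both vanish: roots of (27)

# Side check of Theorem 3.3 for k = 5, r = r_5 = 4.37 (Table 1):
A, D, c, e = build_M(5, 4.37)
assert np.allclose(A @ e, e)                             # preconsistency
assert np.allclose(np.sort(np.linalg.eigvals(A).real),
                   np.sort(1.0 - np.arange(5) / 4.37))   # eigenvalues 1 - j/r
```

Between its two roots the residual is a smooth function of r, so point 5 of §3.1 (pick the root with the smallest error constant, subject to L-stability) can be carried out on top of exactly this kind of scan.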
We now choose the solution of (27) according to point 5 of §3.1, and we denote it by r_k.

2 ≤ k ≤ 6. The values r = 4 and r = 5 are the solutions of (27), respectively for k = 2 and k = 4, which lead to the smallest error constant. Since they are rational and since B ≠ 0, one can easily test the condition of Corollary 2.16. For the other values of k, the smallest error constant is obtained for an irrational solution of (27), so that g, c_0, and c_{2k} have to be computed in terms of the formal parameter r. In that case, we have to test the positivity of the coefficients of g, c_0, and c_{2k} by applying the following technique: for a given k, the coefficients of g, c_0, and c_{2k} are computed in terms of r. Then, using an idea of Roy and Szpirglas (see [16]) and the corresponding Maple program from F. Cucker¹, it is possible to determine the exact sign of each coefficient for every solution of (27). It turned out that the solutions of (27) minimizing the error constant lead to L-stable methods.

7 ≤ k ≤ 10. There is no rational solution of (27), and the size of the polynomials involved in conditions (22) prevents us from using the technique previously explained on the machines at our disposal. It is, however, possible to choose r among the solutions of (27) for which M(k, r) is stable in a neighborhood of the origin (see Lemma 3 of [6]) and then to pick the value r_k minimizing the error constant. For those methods, L-stability is consequently not proven but only conjectured (owing to numerical tests). Besides, we have approximated r_k with rational values up to 17 digits and checked L-stability for the resulting method. With such an approximation, the method obtained is numerically equivalent to the method M(k, r_k), since its error constant (corresponding to order k − 1) is less than the machine precision (10^{−16}).

We now conclude with a survey of the results obtained up to k = 10 (cf.
Table 1) and we give the matrix A as well as (27) for the different methods in the appendix, parts (a) to (g) (from k = 2 to k = 8). For k ≤ 6, the methods presented have been completely constructed according to the strategy described in §3.1. As mentioned above, this strategy had to be partly modified for higher values of k. For k > 8 the coefficients of A are large and may cause rounding errors, so that methods for k > 8 have not been included.

Remark 3.6. For higher values of k we have investigated the L-stability of methods M(k, r) for a particular set of values of r. It appeared that conditions (22) are satisfied for k = 11 and k = 12. In particular, we have obtained a method of order 11.

¹ This program is to be found in the "share" directory of Maple.

TABLE 1
Error constants for methods M(k, r_k) and BDFs of the same order.

  k    r_k     Error constants (M(k, r_k), BDF_k)    L-stability
  2    4       (-5/6, -1/3)     Proved.
  3    3.18    (1.15, -1/4)     Proved.
  4    5       (93/60, -1/5)    Proved.
  5    4.37    (-1.55, 1/6)     Proved.
  6    3.92    (1.67, 1/7)      Proved.
  7    5.54    (1.83, n/a)      Conjectured (proved for r = 5.5438664370799515).
  8    7.25    (2.53, n/a)      Conjectured (proved for r = 7.2474044915485680).
  9    6.72    (-2.06, n/a)     Conjectured (proved for r = 6.7217218560398991).
  10   8.40    n/a              Conjectured (proved for r = 8.4009592670313347).

TABLE 2
Number of correct digits (Δ) for Kaps's problem with ε = 10^{-8}: methods M(k, r_k) and BDF_k, k = 2, ..., 8, at stepsizes h = 1/4, 1/8, ..., 1/256.

3.6. Accuracy tests.
In order to numerically verify the order of methods M(k, r_k), we integrate the test problem proposed by Kaps [14] and used in [18]:

    y_1' = -(2 + ε^{-1}) y_1 + ε^{-1} y_2^2,    y_1(0) = 1,
    y_2' = y_1 - y_2(1 + y_2),                  y_2(0) = 1,    0 ≤ x ≤ X,

with the following exact solution for any value of ε: y_1 = e^{-2x}, y_2 = e^{-x}.

The methods M(k, r_k) used in this test are those of Table 1 and are exactly defined at the end of the paper. In Table 2 we have listed the values of Δ, where Δ denotes the number of correct digits of the numerical solution at the end of the interval (i.e., Δ = -log_{10}(||y(X) - y_N||_∞ / ||y_N||_∞)). All experiments have been done with ε = 10^{-8} and X = 4, in order to avoid the influence of the starting procedure (in this case, the k first approximations are actually exact)². These values have been plotted on a graph (see Fig. 1), and a good agreement between theoretical and observed order can be seen.

FIG. 1. Error distributions for the methods M(k, r_k) of Table 1 (number of correct digits versus stepsize in logarithmic scale).

3.7. Stability tests. The stability of the methods is tested by integrating a problem whose Jacobian matrix J has purely imaginary eigenvalues:

(28)
    y_1' = -a y_2 + (1 + a) cos(x),    y_1(0) = 0,
    y_2' = a y_1 - (1 + a) sin(x),     y_2(0) = 1,    0 ≤ x ≤ X,

with exact solution y_1 = sin(x), y_2 = cos(x) for any value of the parameter a. Therefore, J has the following expression:

    J = [ 0  -a
          a   0 ],

so that the eigenvalues of J are ia and -ia. In Table 3, we give the values of Δ, which has been defined above, as a function of the stepsize. We finally illustrate the better behavior of methods M(k, r_k) over BDFs (backward differentiation formulas) for orders 3, 4, 5, and 6. The second-order BDF is indeed A-stable, whereas for orders higher than 6, the BDFs are no longer zero-stable. Overflow is indicated by ∞, and values of Δ corresponding to stepsizes that are theoretically unstable are underlined. It should be noted that all methods behave correctly, even those for which stability is only conjectured.
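To illustrate how Δ is computed for Kaps's problem, the sketch below uses backward Euler with a Newton iteration as a stand-in integrator (the actual M(k, r_k) runs would require the coefficient matrices of the appendix, and the digit counts here do not reproduce Table 2). The exact solution y_1 = e^{-2x}, y_2 = e^{-x} serves as the reference:

```python
import math

EPS = 1e-8  # the stiffness parameter epsilon of Kaps's problem

def f(y1, y2):
    # right-hand side: y1' = -(2 + 1/eps) y1 + (1/eps) y2^2,
    #                  y2' = y1 - y2 (1 + y2)
    return (-(2.0 + 1.0 / EPS) * y1 + (1.0 / EPS) * y2 * y2,
            y1 - y2 * (1.0 + y2))

def backward_euler(h, X):
    # integrate from (1, 1) at x = 0 up to x = X with fixed step h
    y1, y2 = 1.0, 1.0
    for _ in range(round(X / h)):
        z1, z2 = y1, y2                      # Newton guess: previous step
        for _ in range(50):
            f1, f2 = f(z1, z2)
            r1 = z1 - y1 - h * f1            # residual of the implicit step
            r2 = z2 - y2 - h * f2
            # Newton matrix I - h J(z), with J the Jacobian of f
            m11 = 1.0 + h * (2.0 + 1.0 / EPS)
            m12 = -h * 2.0 * z2 / EPS
            m21 = -h
            m22 = 1.0 + h * (1.0 + 2.0 * z2)
            det = m11 * m22 - m12 * m21
            d1 = (r1 * m22 - r2 * m12) / det
            d2 = (m11 * r2 - m21 * r1) / det
            z1, z2 = z1 - d1, z2 - d2
            if abs(d1) + abs(d2) < 1e-15:
                break
        y1, y2 = z1, z2
    return y1, y2

def delta(y1, y2, X):
    # number of correct digits at x = X, relative infinity norm
    e1, e2 = math.exp(-2.0 * X), math.exp(-X)
    err = max(abs(y1 - e1), abs(y2 - e2))
    return -math.log10(err / max(e1, e2))

print(delta(*backward_euler(1 / 64, 4.0), 4.0))
```

Halving h gains about log10(2) ≈ 0.3 digits for this first-order stand-in; the higher-order methods of Table 2 gain correspondingly more per halving.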
² Experiments were performed with a code written in Fortran in quadruple precision (32 digits).

TABLE 3
Number of correct digits (Δ) for problem (28) with X = 100 and a = 10: methods M(k, r_k), k = 2, ..., 8, and BDF_k, k = 2, ..., 6, at stepsizes h = 4/5, 2/5, 1/5, 1/10, 1/20, 1/40.

4. Concluding remarks. In this paper, we have tried to improve the results obtained by Sommeijer, Couzy, and van der Houwen in [18]. They introduced a class of seemingly very promising block methods suitable for integrating ordinary differential equations on parallel computers. As a first step, we analyzed these methods by considering them as "general linear methods." We applied the different results of the Butcher and Skeel theory, and we obtained a refinement of the order conditions. As a second step, we derived a practical condition for A-stability by applying well-known results of complex analysis.

Based on this analysis, we have constructed a class of diagonally implicit block methods denoted by M(k, r_k). The structure of these schemes is such that k processors can be exploited to run them with the same effective computational cost as a BDF of order k. The implicit system to be solved at each step can indeed be decoupled into k independent subsystems of dimension equal to the number of ODEs.

With respect to a real implementation, it is of interest to consider load balancing. In order to avoid any degradation of the parallelism, we need to compute an initial guess for the Newton process of the subsystem associated with the last component of our vector of approximations. This may be done by using an explicit scheme. The extra work required for this represents only a small part of the amount of computation needed for the Newton process of a large system of ODEs.
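The decoupling into k independent subsystems can be illustrated schematically. Everything in the sketch below is a hypothetical stand-in, not the actual stage equations of M(k, r): each "subsystem" is the scalar implicit relation z = y + h·d_i·f(z) with an arbitrary nonlinear f and made-up diagonal entries d_i. The point is only that the k Newton processes share no data and can run on a thread pool:

```python
from concurrent.futures import ThreadPoolExecutor

def f(z):
    # arbitrary nonlinear right-hand side for the illustration
    return -z**3

def df(z):
    return -3.0 * z**2

def solve_stage(args):
    # Newton iteration for the scalar implicit relation z = y + h*d*f(z)
    y, h, d = args
    z = y  # initial guess: previous value
    for _ in range(50):
        g = z - y - h * d * f(z)       # residual
        dg = 1.0 - h * d * df(z)       # scalar Newton "matrix"
        step = g / dg
        z -= step
        if abs(step) < 1e-14:
            break
    return z

k, y, h = 4, 1.0, 0.1
d = [0.5 + 0.1 * i for i in range(k)]   # made-up diagonal of D
tasks = [(y, h, d[i]) for i in range(k)]

with ThreadPoolExecutor(max_workers=k) as pool:
    parallel = list(pool.map(solve_stage, tasks))

print(parallel)
```

Since the subsystems are independent, the threaded result coincides exactly with a sequential loop over `solve_stage`; in the method itself each subsystem has the dimension of the full ODE system, so the per-processor work is one Newton process of that size.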
Another possible improvement could consist in requesting different convergence tolerances for the Newton processes of different components. All components of the vector of approximations but the first one are indeed to be recomputed at the next step.

The main advantage of methods M(k, r) is definitely their robustness. Using our formula-manipulation approach, we have proved that they are L-stable at least up to order 6. Owing to their numerical behavior, L-stability has been conjectured for methods of order 7 to 10. In addition to this, for orders 7 and 8, we have proved the existence of L-stable methods which are practically equivalent to methods M(7, r_7) and M(8, r_8), respectively. With respect to accuracy, the comparison with BDFs of the same order shows that methods M(k, r_k) lose about one digit. In a forthcoming paper, we intend to check L-stability for orders 7 and 8 and to confirm this comparable accuracy with extensive numerical testing.

Appendix.

(a) M(2, 4).

(b) M(3, r_3). r_3 is the root of 6r^3 - 63r^2 + 185r - 144 = 0 whose approximated value is 3.18.

(c) M(4, 5).

(d) M(5, r_5). r_5 is the root of 120r^5 - 3000r^4 + 26650r^3 - 102850r^2 + 166904r - 86400 = 0 whose approximated value is 4.37.

(e) M(6, r_6). r_6 is the root of 360r^6 - 12420r^5 + 161250r^4 - 992475r^3 + 2987647r^2 - 4037154r + 1814400 = 0 whose approximated value is 3.92.

(f) M(7, r_7).
r_7 is the root of -2520r^7 + 114660r^6 - 2044770r^5 + 18305175r^4 - 87442264r^3 + 218041817r^2 - 254393394r + 101606400 = 0 whose approximated value is 5.54.

(g) M(8, r_8). r_8 is the root of -5040r^8 + 292320r^7 - 6850200r^6 + 83919360r^5 - 580819883r^4 + 2289061754r^3 - 4907763545r^2 + 5066650482r - 1828915200 = 0 whose approximated value is 7.25.

Acknowledgments. I would like to thank Bernard Philippe of IRISA and Michel Crouzeix of Université de Rennes I, who provided the author with numerous ideas and therefore made an important contribution to this paper.

REFERENCES

[1] A. BELLEN AND M. ZENNARO, Parallel algorithms for initial value problems, J. Comput. Appl. Math., 25 (1989), pp. 341-350.
[2] J. BUTCHER, The Numerical Analysis of Ordinary Differential Equations: Runge-Kutta and General Linear Methods, Wiley-Interscience, New York, 1987.
[3] J.
CASH, On the integration of stiff systems of ODEs using extended backward differentiation formulae, Numer. Math., 34 (1980), pp. 235-246.
[4] J. CASH, Second derivative extended backward differentiation formulas for the numerical integration of stiff systems, SIAM J. Numer. Anal., 18 (1981), pp. 21-36.
[5] P. CHARTIER, Application of Bellen's method to ODEs with dissipative right-hand side, in Computing Methods in Applied Sciences and Engineering, R. Glowinski, ed., 1992, pp. 537-546.
[6] P. CHARTIER, L-stable parallel one-block methods for ordinary differential equations, Tech. Rep. 1650, INRIA, March 1993.
[7] M. CHU AND H. HAMILTON, Parallel solution of ODEs by multi-block methods, SIAM J. Sci. Statist. Comput., 8 (1987), pp. 342-353.
[8] W. ENRIGHT, Optimal second derivative methods for stiff systems, in Stiff Differential Systems, R. Willoughby, ed., Plenum Press, New York, 1974.
[9] W. ENRIGHT, Second derivative multistep methods for stiff ordinary differential equations, SIAM J. Numer. Anal., 11 (1974), pp. 321-331.
[10] M. FRANKLIN, Parallel solution of ordinary differential equations, IEEE Trans. Comput., C-27 (1978), pp. 413-420.
[11] A. FRIEDLI AND R. JELTSCH, An algebraic test for A0-stability, BIT, 18 (1978), pp. 402-414.
[12] E. HAIRER, S. NORSETT, AND G. WANNER, Solving Ordinary Differential Equations I. Nonstiff Problems, Springer-Verlag, New York, 1987.
[13] E. HAIRER AND G. WANNER, Solving Ordinary Differential Equations II. Stiff and Differential-Algebraic Problems, Springer-Verlag, New York, 1991.
[14] P. KAPS, Rosenbrock-type methods, in Numerical Methods for Stiff Initial Value Problems, G. Dahlquist and R. Jeltsch, eds., Inst. für Geometrie und Praktische Mathematik der RWTH Aachen, Germany, 1981.
[15] I. LIE AND S. NORSETT, Superconvergence for multistep collocation, Math. Comput., 57 (1989), pp. 65-79.
[16] M. ROY AND A.
SZPIRGLAS, Complexity of computation on real algebraic numbers, J. Symbolic Comput., 10 (1990), pp. 39-51.
[17] L. SHAMPINE AND H. WATTS, A-stable implicit one-step methods, BIT, 12 (1972), pp. 252-266.
[18] B. SOMMEIJER, W. COUZY, AND P. VAN DER HOUWEN, A-stable parallel block methods for ordinary and integro-differential equations, Ph.D. thesis, Universiteit van Amsterdam, CWI, Amsterdam, February 1992.
[19] R. VERMIGLIO, Parallel Step Methods for Difference and Differential Equations, Tech. Rep., C.N.R. Progetto Finalizzato "Sistemi Informatici e Calcolo Parallelo," University of Trieste, 1989.