A Faddeev Sequence Method for Solving Lyapunov and Sylvester Equations

Bernard Hanzon (Dept. Econometrics, Free University Amsterdam, De Boelelaan 1105, 1081 HV Amsterdam; Fax: +31-20-4446020; Email: [email protected])
Ralf L.M. Peeters (Dept. Mathematics, University of Limburg, P.O. Box 616, 6200 MD Maastricht, The Netherlands; Fax: +31-43211889; Email: [email protected])

Submitted to Lin. Alg. Appl. Discussions with J.M. Maciejowski and C.T. Chou are gratefully acknowledged.

Abstract

Lyapunov equations and, more generally, Sylvester equations play an important role in linear systems theory. They attain the form AP + PB = K (continuous-time) and P − APB = K (discrete-time), where the matrix P (of size m × n) is the matrix to solve for. Here we present a method of solving such equations by exploiting the matrix-algebra structure of the problem. No use is made of Kronecker products, and the largest matrices occurring in the algorithms are of sizes m × m, m × n and n × n. The Faddeev method for matrix inversion lies at the very heart of the algorithms presented. The idea is to associate sequences of linear operators with the sequences of matrices occurring in the Faddeev algorithms applied to the matrices A and B, and then to use these operators to represent the Faddeev sequence for the Lyapunov and Sylvester operators. The resulting algorithms are capable of exactly solving the equations in a finite number of recursion steps. They turn out to be well-suited for symbolic calculation.

The concept of a Faddeev reachability matrix introduced here turns out to be useful. It establishes a close connection between the controller canonical (companion) form of a reachable pair (A, b) and the Faddeev sequence of A. If A is already in controller form, then its Faddeev sequence takes on an especially simple form. Further simplifications arise in the case of symmetry, i.e., if B = A^T. Alternative algorithms can then be developed which require fewer iterations. Using Faddeev reachability matrices, it is shown how a solution can be quickly obtained for an equation with an arbitrary right-hand side K, provided a solution is known for a rank-one right-hand side xy^T where (A, x) and (B^T, y) are reachable pairs. These results are of interest also outside the scope of Faddeev sequence methods. We conclude with some examples concerning the symbolic solution of the continuous-time Lyapunov equation AP + PA^T = −bb^T with (A, b) in controller form.

1 Introduction

In the theory of linear dynamical systems, linear matrix equations like Lyapunov and, more generally, Sylvester equations play an important role. There are many numerically well-tested algorithms for such equations. However, for several applications it is important to obtain symbolic solutions of such equations. For example, in model reduction theory one encounters optimization problems in which the criterion function can be expressed in terms of the solutions of one or several Lyapunov equations. In order to apply gradient search algorithms, one needs the derivatives of the criterion function. Also, if one wants to solve the first order conditions algebraically, the derivatives of the criterion function are required. Therefore having an explicit symbolic expression of the criterion function is very useful. Also in the application of techniques from Riemannian geometry to problems in systems theory, like system identification, model reduction, parametrization etc., the calculation of Riemannian metric tensors plays an important role and often involves the solution of several Lyapunov and Sylvester equations. Again, in that case it is very useful to have the answers in symbolic form, because
further calculations to obtain curvature tensors etc. require taking derivatives. For stochastic linear dynamical models the Fisher information matrix is in fact a Riemannian metric tensor, and it can also be obtained in symbolic form by solving a number of Lyapunov and Sylvester equations. For further information on these issues the reader is referred to [9, 4, 5].

One straightforward approach to solving such equations symbolically is to use the fact that the equations are linear, and to use Kronecker products to transform the problem into one in which an mn × mn matrix has to be inverted, if the matrix sought is m × n. This usually works in practice, i.e., with a computer algebra package like Maple or Mathematica (Maple is a registered trademark of Waterloo Maple Software; Mathematica is a registered trademark of Wolfram Research, Inc.), only for problems in which the values of m and n are small. In this paper an alternative approach is presented, which exploits the matrix-algebra structure of the problem. In existing algorithms of this kind (cf. [3, 7]) one first needs to perform a number of polynomial calculations, starting with the characteristic polynomial. Here a different, but theoretically related, approach is taken in which calculations with the characteristic polynomial are avoided. The idea behind the presented algorithm is to apply Faddeev's algorithm for the inversion of a finite dimensional linear operator to Lyapunov and Sylvester equations. This leads to a recursive procedure which ends after a finite number of steps, related to the size of the problem. In the algorithms the use of Kronecker products is avoided, and the largest matrices occurring are of sizes n × n, m × m and m × n. Due to the recursive structure of the algorithm, it can be programmed in a concise way.

One can calculate the L2 norm of a SISO stable linear dynamical system by solving a related Lyapunov equation. An important special case of our method occurs if the SISO stable linear system is in controller canonical form. This form is not to be confused with the controllability canonical form. It turns out that the controller canonical form can in fact be understood as the canonical form obtained by choosing the basis of the state space in a way that is directly related to the Faddeev sequence of the dynamical matrix of the system. Although the controller canonical form is one of the best-known canonical forms, to the best of our knowledge this has not been noted before. In the controller canonical form the dynamical matrix is in companion form. It turns out that the Lyapunov equation can in most cases be reduced to the special case in which the dynamical matrix is in companion form. We give the symbolic solution to the Lyapunov equation for this case for a number of choices of n.

Some experience with the algorithm in computer algebra calculations is reported. We encountered cases which could not be handled by the Kronecker product method because of memory problems, but which could be handled by the method proposed here. At least for numerical calculations this method appears to require fewer operations; however, experience shows that it is then numerically unreliable. Of course this does not play a role in computer algebra applications, where exact arithmetic is used.
2 The Faddeev sequence of a matrix and matrix inversion

One of the basic problems in linear algebra is the calculation of the inverse of a square nonsingular n × n matrix A. An interesting matrix-algebra method to calculate the inverse can be obtained by exploiting the properties of the Faddeev sequence of A, which is recursively defined as follows:

  A(0) := In,   Ã(0) := A(0)A,    (2.1)
  A(k) := Ã(k−1) − (tr Ã(k−1)/k) In,   Ã(k) := A(k)A,   k = 1, 2, ....    (2.2)

Let the characteristic polynomial of A be given by p(s) = det(sIn − A) = s^n + p1 s^{n−1} + ... + pn. Then the Faddeev sequence has the following nice properties, derived from the Newton identities (cf. [2, p. 87]):

  pk = −tr Ã(k−1)/k,   k = 1, 2, ..., n,    (2.3)

and

  A(k) = A^k + p1 A^{k−1} + p2 A^{k−2} + ... + p_{k−1} A + pk In,   k = 0, 1, 2, ..., n.    (2.4)

So, due to the Cayley–Hamilton theorem, A(n) = p(A) = 0. (From this it follows directly that A(k) = 0 for all k ≥ n.) Therefore, if A is nonsingular,

  −tr Ã(n−1)/n = pn = det(−A) ≠ 0

and

  0 = A(n) = Ã(n−1) − (tr Ã(n−1)/n) In,

from which the following formula for the inverse of A is easily derived:

  A^{−1} = n A(n−1) / tr Ã(n−1).    (2.5)

Note that the inverse is obtained by a sequence of operations consisting of multiplication by A, taking the trace, subtraction of a scalar multiple of the identity matrix, and division by a scalar. Therefore the algorithm can in fact be applied to any finite dimensional linear endomorphism (i.e., a linear operator whose domain equals its codomain) and is independent of the choice of basis of the vector space on which the operator acts. Therefore one can speak of the Faddeev sequence of a linear endomorphism, and the inverse of a linear endomorphism can be constructed from the Faddeev sequence in the way described. This will be important in the following sections.

It may be good to note at this point that in the literature the Faddeev sequence is usually presented as a means to calculate the resolvent (sIn − A)^{−1} of a matrix A. The relevant formula for the resolvent is

  (sIn − A)^{−1} = (A(0)s^{n−1} + A(1)s^{n−2} + ... + A(n−2)s + A(n−1)) / p(s),

where p(s) also follows from the Faddeev sequence calculations as noted before. (Compare, e.g., [1, Sections 3.4, 3.5].)
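To make the recursion concrete, here is a minimal Python/NumPy sketch of (2.1)-(2.5); the function names faddeev_sequence and faddeev_inverse are ours, and the code favours clarity over numerical robustness. It is reused in the sketches of the later sections.

```python
import numpy as np

def faddeev_sequence(A):
    """Faddeev sequence (2.1)-(2.2): returns [A(0), ..., A(n-1)] together with
    the characteristic-polynomial coefficients [p1, ..., pn] from (2.3)."""
    n = A.shape[0]
    seq, p = [np.eye(n)], []
    for k in range(1, n + 1):
        At = seq[-1] @ A                        # A~(k-1) = A(k-1) A
        p.append(-np.trace(At) / k)             # p_k = -tr A~(k-1) / k
        if k < n:
            seq.append(At + p[-1] * np.eye(n))  # A(k) = A~(k-1) - (tr A~(k-1)/k) I
    return seq, p

def faddeev_inverse(A):
    """Matrix inverse via (2.5); note that n A(n-1)/tr A~(n-1) = -A(n-1)/p_n."""
    seq, p = faddeev_sequence(A)
    return -seq[-1] / p[-1]

A = np.array([[2.0, 1.0], [0.0, 3.0]])
assert np.allclose(faddeev_inverse(A) @ A, np.eye(2))
```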
3 Solving Lyapunov and Sylvester equations using Faddeev sequences

3.1 The matrix-algebra approach to Lyapunov and Sylvester equations

Consider Sylvester equations of the form

  AP + PB = K    (3.1)

and

  P − APB = K,    (3.2)

where A is a given m × m matrix, B is a given n × n matrix, K is a given m × n matrix, and P is an unknown m × n matrix for which we want to solve the equation. To this end, consider the linear matrix operators L = LA, R = RB and Imn defined by

  LA : P ↦ AP,   RB : P ↦ PB   and   Imn : P ↦ P.

The last one is clearly the identity on the vector space of m × n matrices. Define the linear operators C := L + R and D := Imn − LR. Then the Sylvester equations (3.1) and (3.2) can be written as

  C(P) = K    (3.3)

and

  D(P) = K,    (3.4)

respectively. Because C and D are linear endomorphisms (on the vector space of m × n matrices), the solution is given abstractly by P = C^{−1}(K) and P = D^{−1}(K), respectively, if C and D are invertible. This is known to be the case iff A and −B have no eigenvalues in common, and iff no eigenvalue of A is the reciprocal of an eigenvalue of B, respectively (cf., e.g., [2]).

Therefore the question arises how C^{−1} and D^{−1} can be calculated. In principle the techniques of the previous section can be applied: one can define the Faddeev sequence {C(k), C̃(k) | k = 0, 1, 2, ...}, for which C(k) = 0 and C̃(k) = 0 for all k ≥ mn. Then C^{−1} = (mn / tr C̃(mn−1)) C(mn−1), and similarly for D. In order to apply this idea one needs an explicit representation of C and its Faddeev sequence. Here we propose to represent C and the elements of its Faddeev sequence as linear combinations of the operators Li Rj, where Li := L_{A(i)} denotes left multiplication of an m × n matrix by the matrix A(i) from the Faddeev sequence of A, and where similarly Rj := R_{B(j)} denotes right multiplication of an m × n matrix by the matrix B(j) from the Faddeev sequence of B. The reason why C and the elements of its Faddeev sequence can be represented in this way is twofold:

(i) Left multiplication of an m × n matrix by an m × m matrix Ā and right multiplication of an m × n matrix by an n × n matrix B̄ commute (due to the associativity of matrix multiplication): LĀ RB̄ = RB̄ LĀ for all Ā and B̄ of the correct sizes.

(ii) An arbitrary polynomial in A can be written as a linear combination of the elements A(i), i = 0, 1, 2, ..., of the Faddeev sequence of A, because A(i) = A^i + lower degree terms for each i = 0, 1, 2, .... Because A(i) = 0 for all i ≥ m, it can actually be written as a linear combination of A(0), A(1), ..., A(m−1). It follows directly that an arbitrary polynomial in L can be written as a linear combination of L0, L1, ..., L_{m−1}. Similarly, an arbitrary polynomial in R can be written as a linear combination of R0, R1, ..., R_{n−1}.

Combining (i) and (ii) one finds that each polynomial in C can be rewritten as a linear combination of {Li Rj | i = 0, 1, 2, ..., m−1; j = 0, 1, 2, ..., n−1}. Note that the linear combination does not necessarily have to be unique. (It is not iff the degree of the minimal polynomial of C is less than the degree of its characteristic polynomial.) Even if there is linear dependence of this kind, the algorithm presented below still works without any changes, although a more careful analysis may then produce a quicker algorithm. The symmetric case B = A^T is an important example of such a situation, to which we return in Section 5.

3.2 Derivation of the Faddeev sequence formulae

An important ingredient in the calculation of the Faddeev sequences of C and D is the calculation of the traces of C̃(k) and D̃(k). Because C̃(k) is represented as a linear combination of endomorphisms of the form Li Rj, the question arises how one can calculate the trace of Li Rj. The answer is given in the following lemma.

Lemma 3.1 tr{Li Rj} = tr A(i) tr B(j).

Proof. For each i ∈ {0, ..., m−1}, j ∈ {0, ..., n−1}, Li Rj is a linear endomorphism of the vector space of m × n matrices. Therefore its trace is well-defined, independent of the specific choice of a basis of the vector space. Consider the inner product ⟨·,·⟩ on the vector space R^{m×n} of m × n matrices given by

  ⟨P, Q⟩ = tr{P^T Q};   P, Q ∈ R^{m×n}.

Then an orthonormal basis is given by {E_kl = e_k f_l^T | k = 1, ..., m; l = 1, ..., n}, with e_k the kth standard basis vector in R^m and f_l the lth standard basis vector in R^n. The trace of the endomorphism Li Rj of R^{m×n} is equal to

  Σ_{k=1}^{m} Σ_{l=1}^{n} ⟨E_kl, Li Rj(E_kl)⟩ = Σ_{k=1}^{m} Σ_{l=1}^{n} tr{(e_k f_l^T)^T A(i) e_k f_l^T B(j)}
  = Σ_{k=1}^{m} Σ_{l=1}^{n} (e_k^T A(i) e_k)(f_l^T B(j) f_l) = (Σ_{k=1}^{m} e_k^T A(i) e_k)(Σ_{l=1}^{n} f_l^T B(j) f_l) = tr A(i) tr B(j). □
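Although the point of the method is precisely to avoid Kronecker products, they provide a convenient independent check of Lemma 3.1: under column-major vectorization the operator Li Rj is represented by B(j)^T ⊗ A(i), and tr(X ⊗ Y) = tr X · tr Y. A small numerical sketch, reusing faddeev_sequence from the sketch in Section 2 (the test data are arbitrary):

```python
rng = np.random.default_rng(0)
A, B = rng.standard_normal((3, 3)), rng.standard_normal((4, 4))
seqA, _ = faddeev_sequence(A)
seqB, _ = faddeev_sequence(B)
for Ai in seqA:
    for Bj in seqB:
        # vec(A(i) P B(j)) = (B(j)^T kron A(i)) vec(P), so the operator trace
        # equals tr(B(j)^T kron A(i)) = tr A(i) tr B(j), as Lemma 3.1 states
        assert np.isclose(np.trace(np.kron(Bj.T, Ai)),
                          np.trace(Ai) * np.trace(Bj))
```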
Theorem 3.2 Define the m × n coefficient matrices

  C(k) = (c_{ij}(k)), i = 0, ..., m−1; j = 0, ..., n−1,   k = 0, 1, ..., mn−1,    (3.5)
  C̃(k) = (c̃_{ij}(k)), i = 0, ..., m−1; j = 0, ..., n−1,   k = 0, 1, ..., mn−1,    (3.6)

(we denote the coefficient matrices by the same symbols as the corresponding operators; which is meant will be clear from the context) by the following recursive formulas: C(0) = E11 and, for each k = 0, 1, 2, ..., mn−1,

  c̃_{00}(k) = Σ_{i=0}^{m−1} (tr Ã(i)/(i+1)) c_{i0}(k) + Σ_{j=0}^{n−1} (tr B̃(j)/(j+1)) c_{0j}(k),    (3.7)
  c̃_{0j}(k) = Σ_{i=0}^{m−1} (tr Ã(i)/(i+1)) c_{ij}(k) + c_{0,j−1}(k)   if j > 0,    (3.8)
  c̃_{i0}(k) = Σ_{j=0}^{n−1} (tr B̃(j)/(j+1)) c_{ij}(k) + c_{i−1,0}(k)   if i > 0,    (3.9)
  c̃_{ij}(k) = c_{i−1,j}(k) + c_{i,j−1}(k)   if i > 0 and j > 0,    (3.10)
  c_{00}(k+1) = c̃_{00}(k) − (1/(k+1)) Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} c̃_{ij}(k) tr A(i) tr B(j),    (3.11)
  c_{ij}(k+1) = c̃_{ij}(k)   if (i, j) ≠ (0, 0).    (3.12)

Then for each k = 0, 1, 2, ..., mn−1,

  C(k) = Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} c_{ij}(k) Li Rj,    (3.13)
  C̃(k) = Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} c̃_{ij}(k) Li Rj.    (3.14)

Proof. Induction on k = 0, 1, 2, ..., mn−1. Clearly

  C(0) = Imn = L0 R0 = Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} c_{ij}(0) Li Rj,

with c_{00}(0) = 1 and c_{ij}(0) = 0 for (i, j) ≠ (0, 0). Now suppose (this is the induction hypothesis) that C(k) = Σ_{i,j} c_{ij}(k) Li Rj. Then

  C̃(k) = C C(k) = (L + R) Σ_{i,j} c_{ij}(k) Li Rj = Σ_{i,j} c_{ij}(k) (L Li Rj + Li R Rj).

Applying L Li to an m × n matrix means multiplication on the left by A A(i) = A(i) A = Ã(i) = A(i+1) + (tr Ã(i)/(i+1)) A(0), by the definitions of A(i+1) and A(0). Therefore

  L Li = L_{i+1} + (tr Ã(i)/(i+1)) L0,   and similarly   R Rj = R_{j+1} + (tr B̃(j)/(j+1)) R0

(with L_m = 0 and R_n = 0, because A(m) = 0 and B(n) = 0). It follows that

  C̃(k) = Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} c_{ij}(k) ( L_{i+1} Rj + (tr Ã(i)/(i+1)) L0 Rj + Li R_{j+1} + (tr B̃(j)/(j+1)) Li R0 ).

Collecting the coefficient of each operator Li Rj in this expansion, one obtains precisely C̃(k) = Σ_{i,j} c̃_{ij}(k) Li Rj when the c̃_{ij}(k) are assigned as in (3.7)-(3.10). Next consider the equation

  C(k+1) = C̃(k) − (tr C̃(k)/(k+1)) Imn.

First note that application of Lemma 3.1 gives

  tr C̃(k)/(k+1) = (1/(k+1)) Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} c̃_{ij}(k) tr A(i) tr B(j).

It follows that

  C(k+1) = Σ_{i,j} c̃_{ij}(k) Li Rj − (1/(k+1)) ( Σ_{i,j} c̃_{ij}(k) tr A(i) tr B(j) ) L0 R0,

which equals Σ_{i,j} c_{ij}(k+1) Li Rj when the c_{ij}(k+1) are assigned as in (3.11)-(3.12). □

The analogous result for the Faddeev sequence of D is as follows.

Theorem 3.3 Define the m × n coefficient matrices

  D(k) = (d_{ij}(k)), i = 0, ..., m−1; j = 0, ..., n−1,   k = 0, 1, ..., mn−1,    (3.15)
  D̃(k) = (d̃_{ij}(k)), i = 0, ..., m−1; j = 0, ..., n−1,   k = 0, 1, ..., mn−1,    (3.16)

by the following recursive formulas: D(0) = E11 and, for each k = 0, 1, 2, ...
, mn−1,

  d̃_{00}(k) = d_{00}(k) − Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} (tr Ã(i)/(i+1))(tr B̃(j)/(j+1)) d_{ij}(k),    (3.17)
  d̃_{0j}(k) = d_{0j}(k) − Σ_{i=0}^{m−1} (tr Ã(i)/(i+1)) d_{i,j−1}(k)   if j > 0,    (3.18)
  d̃_{i0}(k) = d_{i0}(k) − Σ_{j=0}^{n−1} (tr B̃(j)/(j+1)) d_{i−1,j}(k)   if i > 0,    (3.19)
  d̃_{ij}(k) = d_{ij}(k) − d_{i−1,j−1}(k)   if i > 0 and j > 0,    (3.20)
  d_{00}(k+1) = d̃_{00}(k) − (1/(k+1)) Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} d̃_{ij}(k) tr A(i) tr B(j),    (3.21)
  d_{ij}(k+1) = d̃_{ij}(k)   if (i, j) ≠ (0, 0).    (3.22)

Then, for each k = 0, 1, 2, ..., mn−1, D(k) = Σ_{i,j} d_{ij}(k) Li Rj and D̃(k) = Σ_{i,j} d̃_{ij}(k) Li Rj.

Proof. Analogous to the proof of the previous theorem; this is left to the reader. □

3.3 The Faddeev sequence formulae in matrix-vector notation

The recursive equations for the Faddeev sequences of the matrix operators C and D can also be cast in matrix-vector notation. Let p(s) = det(sIm − A) = s^m + p1 s^{m−1} + ... + pm be the characteristic polynomial of A and q(s) = det(sIn − B) = s^n + q1 s^{n−1} + ... + qn the characteristic polynomial of B. Let Ac be the m × m matrix given by

  Ac := [ −p1  −p2  ...  −p_{m−1}  −pm ]
        [  1    0   ...    0        0  ]
        [  0    1   ...    0        0  ]
        [ ...            ...           ]
        [  0    0   ...    1        0  ]    (3.23)

and let Bc be the n × n matrix given by

  Bc := [ −q1  −q2  ...  −q_{n−1}  −qn ]
        [  1    0   ...    0        0  ]
        [  0    1   ...    0        0  ]
        [ ...            ...           ]
        [  0    0   ...    1        0  ]    (3.24)

Furthermore, let τA ∈ R^m denote the vector of traces of the elements A(0), A(1), ..., A(m−1) of the Faddeev sequence of A:

  τA := (tr A(0), tr A(1), ..., tr A(m−1))^T,    (3.25)

and let τB ∈ R^n be defined analogously. Note that for k > 0, tr A(k) = tr Ã(k−1) − m tr Ã(k−1)/k = (m−k) pk, where the equality pk = −tr Ã(k−1)/k is used. Of course, defining p0 := 1, tr A(0) = m = m p0. So one can express the elements of the vector τA in terms of the pk, k = 0, 1, 2, ..., m−1, as follows:

  τA = (m p0, (m−1) p1, ..., 2 p_{m−2}, p_{m−1})^T.    (3.26)

An analogous formula holds for τB.

It is now straightforward to verify that the recursive formulae given in Theorems 3.2 and 3.3 for the coefficient matrices of the Faddeev sequences of C and D can be rewritten as (with C(k), C̃(k), D(k), D̃(k) now denoting the coefficient matrices):

  C(0) = E11,   C̃(0) = Ac C(0) + C(0) Bc^T,    (3.27)
  C(k) = C̃(k−1) − (τA^T C̃(k−1) τB / k) E11,   C̃(k) = Ac C(k) + C(k) Bc^T,   k = 1, 2, ..., mn−1,    (3.28)

and

  D(0) = E11,   D̃(0) = D(0) − Ac D(0) Bc^T,    (3.29)
  D(k) = D̃(k−1) − (τA^T D̃(k−1) τB / k) E11,   D̃(k) = D(k) − Ac D(k) Bc^T,   k = 1, 2, ..., mn−1.    (3.30)

It follows that the inverse of C can be expressed as

  C^{−1} = Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} γ_{ij} Li Rj,    (3.31)

where the m × n matrix Γ = (γ_{i−1,j−1}), i = 1, ..., m; j = 1, ..., n, is given by the formula

  Γ = (mn / (τA^T C̃(mn−1) τB)) C(mn−1).    (3.32)

It may be interesting to note that Γ has a structure that could perhaps be called alternating-Hankel:

  γ_{ij} = −γ_{i−1,j+1},   for i = 1, 2, 3, ..., m−1; j = 0, 1, 2, ..., n−2.

This can easily be derived from the equality

  Ac Γ + Γ Bc^T = (mn / (τA^T C̃(mn−1) τB)) (Ac C(mn−1) + C(mn−1) Bc^T) = E11,

together with the special structure of Ac and Bc. The inverse of D can be expressed as

  D^{−1} = Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} δ_{ij} Li Rj,    (3.33)

where the m × n matrix ∆ = (δ_{i−1,j−1}), i = 1, ..., m; j = 1, ..., n, is given by the formula

  ∆ = (mn / (τA^T D̃(mn−1) τB)) D(mn−1).    (3.34)

It may be interesting to note that ∆ is a Toeplitz matrix. This can easily be derived from the equality

  ∆ − Ac ∆ Bc^T = (mn / (τA^T D̃(mn−1) τB)) (D(mn−1) − Ac D(mn−1) Bc^T) = E11,

together with the special structure of Ac and Bc.
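The recursion (3.27)-(3.28) translates almost line by line into code. The following sketch (the helper names are ours) computes Γ from the coefficient vectors of p(s) and q(s) alone; the recursion (3.29)-(3.30) for ∆ differs only in using D̃(k) = D(k) − Ac D(k) Bc^T in place of C̃(k) = Ac C(k) + C(k) Bc^T. As a sanity check, it verifies the identity Ac Γ + Γ Bc^T = E11 noted above on arbitrary (Hurwitz) example data.

```python
def companion(p):
    """Controller-form companion matrix (3.23) built from [p1, ..., pm]."""
    m = len(p)
    Ac = np.zeros((m, m))
    Ac[0, :] = -np.asarray(p, float)   # first row (-p1, ..., -pm)
    Ac[1:, :-1] = np.eye(m - 1)        # ones on the subdiagonal
    return Ac

def tau(p):
    """Trace vector (3.26): entries (m - k) p_k, k = 0, ..., m-1, with p_0 := 1."""
    m = len(p)
    coeffs = np.concatenate(([1.0], np.asarray(p, float)[:m - 1]))
    return (m - np.arange(m)) * coeffs

def gamma_matrix(p, q):
    """Coefficient matrix Gamma of C^{-1}, via (3.27)-(3.28) and (3.32)."""
    m, n = len(p), len(q)
    Ac, Bc, tA, tB = companion(p), companion(q), tau(p), tau(q)
    C = np.zeros((m, n)); C[0, 0] = 1.0   # C(0) = E11
    Ct = Ac @ C + C @ Bc.T                # C~(0)
    for k in range(1, m * n):
        C = Ct.copy()
        C[0, 0] -= tA @ Ct @ tB / k       # C(k) = C~(k-1) - (tau_A' C~ tau_B / k) E11
        Ct = Ac @ C + C @ Bc.T            # C~(k)
    return m * n / (tA @ Ct @ tB) * C     # Gamma, eq. (3.32)

p, q = [3.0, 2.0], [5.0, 4.0, 1.0]
G = gamma_matrix(p, q)
E11 = np.zeros((2, 3)); E11[0, 0] = 1.0
assert np.allclose(companion(p) @ G + G @ companion(q).T, E11)
```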
3.4 Solution of the Lyapunov and Sylvester equations

The solution of the matrix equation AP + PB = K can now be given by the formula

  P = C^{−1}(K) = Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} γ_{ij} Li Rj(K) = Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} γ_{ij} A(i) K B(j).    (3.35)

Similarly, the solution of the matrix equation P − APB = K is given by

  P = D^{−1}(K) = Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} δ_{ij} A(i) K B(j).    (3.36)

In order to write this in a concise way, the following definition will be helpful.

Definition 3.4 Consider a pair (A, b) with A an m × m matrix and b an m × 1 column vector. The matrix

  FA(b) := [A(0)b, A(1)b, ..., A(m−1)b]

will be called the Faddeev reachability matrix of the pair (A, b).

The name of this matrix is clarified in the next section, where relations with system-theoretic concepts are treated. If K = xy^T is a rank-one matrix, where x and y are column vectors, then the solution of AP + PB = K can be rewritten as

  P = FA(x) Γ (F_{B^T}(y))^T.    (3.37)

This follows directly from (3.35) using the basic rules of matrix multiplication. Similarly, the solution of P − APB = xy^T is given by

  P = FA(x) ∆ (F_{B^T}(y))^T.    (3.38)

Any matrix K can of course be written as a linear combination Σ_{k=1}^{r} x_k y_k^T of rank-one matrices, where r ≤ min(m, n). (For instance, one can write K = Σ_{i=1}^{m} e_i k_i^T with k_i^T denoting the ith row of K.) The solution of AP + PB = K is then given by

  P = Σ_{k=1}^{r} FA(x_k) Γ (F_{B^T}(y_k))^T,    (3.39)

and similarly the solution of P − APB = K is then given by

  P = Σ_{k=1}^{r} FA(x_k) ∆ (F_{B^T}(y_k))^T.    (3.40)
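The pieces above assemble into a complete solver. The following sketch (reusing faddeev_sequence and gamma_matrix from the earlier sketches; the names faddeev_reachability and solve_sylvester are ours) implements (3.35), Definition 3.4 and the rank-one formula (3.37):

```python
def faddeev_reachability(A, b):
    """Faddeev reachability matrix F_A(b) = [A(0)b, ..., A(m-1)b] (Definition 3.4)."""
    seq, _ = faddeev_sequence(A)
    return np.column_stack([Ak @ b for Ak in seq])

def solve_sylvester(A, B, K):
    """Solve AP + PB = K via (3.35): P = sum_{i,j} gamma_ij A(i) K B(j)."""
    seqA, p = faddeev_sequence(A)
    seqB, q = faddeev_sequence(B)
    G = gamma_matrix(p, q)
    return sum(G[i, j] * seqA[i] @ K @ seqB[j]
               for i in range(len(seqA)) for j in range(len(seqB)))

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3)) - 3 * np.eye(3)   # shifted so A and -B share no eigenvalues
B = rng.standard_normal((4, 4)) - 3 * np.eye(4)
K = rng.standard_normal((3, 4))
P = solve_sylvester(A, B, K)
assert np.allclose(A @ P + P @ B, K)

# Rank-one right-hand side K = x y^T: formula (3.37) gives the same solution.
x, y = rng.standard_normal(3), rng.standard_normal(4)
_, p = faddeev_sequence(A); _, q = faddeev_sequence(B)
P1 = faddeev_reachability(A, x) @ gamma_matrix(p, q) @ faddeev_reachability(B.T, y).T
assert np.allclose(P1, solve_sylvester(A, B, np.outer(x, y)))
```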
Formulas (3.39) and (3.40) have some interesting consequences, which are of interest even outside the scope of Faddeev sequence methods and which apparently have not been noted before.

Corollary 3.5 If (A, x) and (B^T, y) are pairs such that FA(x) and F_{B^T}(y) are nonsingular (i.e., (A, x) and (B^T, y) are reachable pairs, cf. the next section), then:

(i) If Σ is the solution of the equation AΣ + ΣB = xy^T, then

  Γ = FA(x)^{−1} Σ (F_{B^T}(y)^{−1})^T.

Similarly, if Σ is the solution of Σ − AΣB = xy^T, then

  ∆ = FA(x)^{−1} Σ (F_{B^T}(y)^{−1})^T.

(ii) If Σ is the solution of the equation AΣ + ΣB = xy^T, then the solution of AP + PB = K = Σ_{k=1}^{r} x_(k) y_(k)^T is equal to

  P = Σ_{k=1}^{r} FA(x_(k)) FA(x)^{−1} Σ (F_{B^T}(y)^{−1})^T (F_{B^T}(y_(k)))^T,

and similarly, if Σ is the solution of Σ − AΣB = xy^T, then the solution of P − APB = K = Σ_{k=1}^{r} x_(k) y_(k)^T is equal to

  P = Σ_{k=1}^{r} FA(x_(k)) FA(x)^{−1} Σ (F_{B^T}(y)^{−1})^T (F_{B^T}(y_(k)))^T.

The second part of this corollary tells how to find a solution for an arbitrary right-hand side K in case a solution is known for a specific right-hand side xy^T such that (A, x) and (B^T, y) are reachable. This takes on an especially simple form if K = x_(1) y_(1)^T; the solution P is then equal to

  P = FA(x_(1)) FA(x)^{−1} Σ (F_{B^T}(y)^{−1})^T (F_{B^T}(y_(1)))^T.

This can also be shown directly, bypassing the Faddeev sequences of C and D, as follows. Note that A(i) commutes with A; therefore, multiplication of the equation AΣ + ΣB = xy^T on the left by A(i) gives

  A(A(i)Σ) + (A(i)Σ)B = (A(i)x) y^T,

which shows that if x_(1) = A(i)x then P = A(i)Σ is the solution. Taking a linear combination x_(1) = Σ_{i=0}^{m−1} ξ_{i+1} A(i)x, one finds the solution P = Σ_{i=0}^{m−1} ξ_{i+1} A(i)Σ of the corresponding matrix equation with right-hand side K = x_(1) y^T. The m × 1 column vector ξ = (ξ_i), i = 1, ..., m, has the property that, on the one hand, ξ = FA(x)^{−1} x_(1) and, on the other hand, P = Σ_{i=1}^{m} ξ_i A(i−1)Σ. Considering the lth column of the matrix Σ_{i=1}^{m} ξ_i A(i−1), one finds that it is equal to FA(e_l)ξ = FA(e_l) FA(x)^{−1} x_(1), for l = 1, 2, ..., m.

In order to make the next step, the following remarkable lemma is required.

Lemma 3.6 Let (A, x) ∈ R^{m×m} × R^m be such that FA(x) is a nonsingular m × m matrix (i.e., (A, x) is a reachable pair). Then for all u, v ∈ R^m the equality

  FA(u) FA(x)^{−1} v = FA(v) FA(x)^{−1} u

holds.

Proof. Because FA(x) is nonsingular, the vectors A(0)x, A(1)x, ..., A(m−1)x form a basis of R^m. As FA(u) FA(x)^{−1} v is a bilinear form in u and v, it suffices to show the equality for the cases u = A(k)x and v = A(l)x, with k = 0, 1, ..., m−1; l = 0, 1, ..., m−1. Because all elements of the Faddeev sequence of A commute, one has

  FA(A(k)x) FA(x)^{−1} A(l)x = A(k) FA(x) FA(x)^{−1} A(l)x = A(k)A(l)x = A(l)A(k)x = FA(A(l)x) FA(x)^{−1} A(k)x. □

Applying this lemma to the lth column of the matrix Σ_{i=1}^{m} ξ_i A(i−1), one finds that it is equal to FA(e_l) FA(x)^{−1} x_(1) = FA(x_(1)) FA(x)^{−1} e_l. Therefore the matrix is in fact equal to FA(x_(1)) FA(x)^{−1}. A similar reasoning can be applied if y^T is replaced by y_(1)^T in the right-hand side of the matrix equation, using the reachability of the pair (B^T, y). In this way a direct proof of the second part of the corollary above is obtained.

Remark. In the case m = n and B = A^T, there are several parametrized families of matrices A for which an accompanying vector b is known such that (A, b) is reachable (i.e., FA(b) nonsingular) and such that the equation AP + PA^T = −bb^T, respectively P − APA^T = bb^T, has a known (simple) solution. For example, if A stems from a balanced parametrization then AP + PA^T = −bb^T has a known diagonal solution matrix P = Σ (cf., e.g., [8]). It follows that in that case the solution of AP + PA^T = K can be found directly from the second part of the corollary, without going through the calculation of the Faddeev sequence of C. Another example is when A is in Schwarz form, i.e., A = (−b1²/2) E11 + A_sk, where A_sk is an arbitrary tridiagonal skew-symmetric matrix. Then A + A^T = −bb^T if b = b1 e1, implying that the identity matrix is the solution of the Lyapunov equation. The solution of AP + PA^T = x_(1) y_(1)^T is then obtained from the corollary as

  P = −FA(x_(1)) FA(b)^{−1} (FA(b)^{−1})^T (FA(y_(1)))^T.

Also for the discrete-time Lyapunov equation such parametrized families are known (cf., e.g., [10]).

4 Relations with the controller canonical form of a pair (A, b)

In linear systems theory (cf., e.g., [6]) a pair (A, b) ∈ R^{m×m} × R^m is called reachable if the reachability matrix

  RA(b) := [b, Ab, A²b, ..., A^{m−1}b]

is nonsingular. Because for the elements A(k), k = 0, 1, 2, ..., of the Faddeev sequence of A the equality A(k) = A^k + p1 A^{k−1} + ... + p_{k−1} A + pk A^0 holds, the reachability matrix RA(b) is related to the matrix FA(b) of the previous section by the formula

  RA(b) U = FA(b),

where U is the upper triangular m × m Toeplitz matrix with first row equal to (1, p1, p2, ..., p_{m−1}). It follows immediately that RA(b) and FA(b) have equal rank, and that if one of these matrices is nonsingular then so is the other. Therefore the matrix FA(b) is called the Faddeev reachability matrix in this paper.

In linear systems theory two reachable pairs (A_(1), b_(1)) and (A_(2), b_(2)) are considered to be equivalent if there exists a nonsingular transformation matrix T such that A_(2) = T A_(1) T^{−1} and b_(2) = T b_(1). Such a transformation corresponds to a change of basis in the space R^m on which A operates as an endomorphism and in which b lies as a vector. This space is called the state space.
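The relation RA(b) U = FA(b) is easy to confirm numerically; in the sketch below (reusing the helpers defined earlier, with arbitrary test data), U is assembled explicitly as the upper triangular Toeplitz matrix with first row (1, p1, ..., p_{m−1}):

```python
rng = np.random.default_rng(1)
m = 4
A, b = rng.standard_normal((m, m)), rng.standard_normal(m)
_, p = faddeev_sequence(A)
R = np.column_stack([np.linalg.matrix_power(A, i) @ b for i in range(m)])
U = np.zeros((m, m))
for i in range(m):
    U[i, i] = 1.0                 # p_0 = 1 on the diagonal
    U[i, i + 1:] = p[:m - 1 - i]  # p_1, p_2, ... along each row
assert np.allclose(R @ U, faddeev_reachability(A, b))
```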
A special choice of the state space basis can lead to a particularly simple form of the pair (A, b), which can simplify certain calculations and from which certain properties can be deduced more easily. One then speaks of a canonical form for the pair (A, b). Choosing the columns of the reachability matrix as a basis for the state space leads to the well-known controllability canonical form (cf. [6], pp. 335-6):

  Ay = [ 0  0  ...  0  −pm       ]
       [ 1  0  ...  0  −p_{m−1}  ]
       [ 0  1  ...  0  −p_{m−2}  ]
       [ ...      ...            ]
       [ 0  0  ...  1  −p1       ]    (4.1)

  by = (1, 0, ..., 0)^T.    (4.2)

It is easy to verify that R_{Ay}(by) = Im. Choosing the columns of the Faddeev reachability matrix as a basis for the state space leads to the well-known controller form. The relation of this form with the Faddeev sequence of the dynamical matrix seems not to have been noticed in the literature. The controller form will be denoted by (Ac, bc):

  Ac = [ −p1  −p2  ...  −p_{m−1}  −pm ]
       [  1    0   ...    0        0  ]
       [  0    1   ...    0        0  ]
       [ ...            ...           ]
       [  0    0   ...    1        0  ]    (4.3)

  bc = (1, 0, ..., 0)^T.    (4.4)

(Note that the notation Ac was already used in the previous section for matrices with this structure.) That this is indeed the form of a reachable pair when the columns of the Faddeev reachability matrix are chosen as basis for the state space follows simply from the fact that for k = 0, 1, 2, ..., m−1:

  A(A(k)b) = Ã(k)b = A(k+1)b + (tr Ã(k)/(k+1)) b,

and, as was stated before, tr Ã(k)/(k+1) = −p_{k+1}. So if the columns of the Faddeev reachability matrix are chosen as the basis of the state space, then of course b = e1 and A(k−1)b = e_k, k = 1, 2, ..., m, and A(m)b = 0 because A(m) = 0. So A e_m = −pm e1 and A e_k = e_{k+1} − pk e1, k = 1, ..., m−1, and so (A, b) is indeed in controller form. Furthermore, if (Ac, bc) is a reachable pair in controller form, then the characteristic polynomial is det(sI − Ac) = p(s), and therefore choosing the columns of the Faddeev reachability matrix as a basis of the state space keeps the reachable pair unchanged, which implies that the Faddeev reachability matrix is the identity matrix: F_{Ac}(bc) = Im.

This leads to the following interpretation of the coefficient matrices Γ and ∆ of C^{−1} and D^{−1}, respectively. Let A = Ac have characteristic polynomial p(s) of degree m and B = Bc^T have characteristic polynomial q(s) of degree n. Then the equation Ac P + P Bc^T = E11 = e1 f1^T has solution Γ as defined in (3.32). This follows simply from (3.37), because F_{Ac}(e1) = Im and F_{Bc}(f1) = In. Similarly, the equation P − Ac P Bc^T = E11 = e1 f1^T has solution ∆. Note that Γ and ∆ clearly depend only on the coefficients p1, p2, ..., pm; q1, ..., qn, which are system invariants; therefore Γ and ∆ are themselves system invariants. If Bc = Ac and Ac has its spectrum in the open left half plane, then −Γ is the positive definite continuous-time reachability Gramian of the pair (Ac, e1). And if Bc = Ac and Ac has its spectrum in the open unit disk, then ∆ is the positive definite discrete-time reachability Gramian of the same pair.

If A = Ac and B = Bc^T in the Sylvester equations studied here, a remarkable simplification of the solution formulas can be obtained by studying the Faddeev sequence of an endomorphism in controller form Ac more closely. Firstly, as a corollary of Lemma 3.6 one can obtain the following equality:

  Ac(i−1) e_j = Ac(j−1) e_i,   i = 1, ..., m; j = 1, ..., m.    (4.5)

In order to derive this from Lemma 3.6, use that F_{Ac}(e1) = Im and that F_{Ac}(e_j) e_i = Ac(i−1) e_j.
Combining these facts, one obtains

  Ac(i−1) e_j = F_{Ac}(e_j) F_{Ac}(e1)^{−1} e_i = F_{Ac}(e_i) F_{Ac}(e1)^{−1} e_j = Ac(j−1) e_i.

This implies that F_{Ac}(e_i) = Ac(i−1), i = 1, ..., m. Secondly, inspection of the Faddeev sequence of Ac (for small values of m; here computer algebra has turned out to be very helpful) shows that it has a remarkably simple structure: so simple, in fact, that one need not calculate it at all; one can just apply the following theorem.

Theorem 4.1 Let (Ac, e1) be in controller form and let Ac have characteristic polynomial p(s) of degree m. Then the matrix A(k), k = 0, 1, 2, ..., m−1, of the Faddeev sequence of A = Ac can be partitioned as A1(k) stacked on top of A2(k), where A1(k) is a k × m Toeplitz matrix (absent if k = 0) and A2(k) is an (m−k) × m Toeplitz matrix. A1(k) is given by its first row and first column: the first row is (0, −p_{k+1}, −p_{k+2}, ..., −pm, 0, ..., 0), so the number of zeroes at the end is k−1; the first column is zero. A2(k) is given by its last row and last column: its last row is (0, ..., 0, p0, p1, ..., pk), with p0 := 1, where the number of zeroes at the beginning is of course m−1−k; the last column is (0, ..., 0, pk)^T, and the number of zeroes above pk is m−1−k.

Proof. By induction. A(0) = Im is clearly of this form, and

  A(1) = Ac + p1 Im = [ 0  −p2  ...  −p_{m−1}  −pm ]
                      [ 1   p1   0    ...       0  ]
                      [ 0   1    p1   ...       0  ]
                      [ ...          ...           ]
                      [ 0   0    ...   1        p1 ]

is also clearly of this form. Suppose (induction hypothesis) that A(k) is of this form. Then the first row of Ã(k) = A(k)A is equal to

  e1^T A(k) A = (0, −p_{k+1}, −p_{k+2}, ..., −pm, 0, ..., 0) A = (−p_{k+1}, −p_{k+2}, ..., −pm, 0, ..., 0).

The other rows of Ã(k) consist of the first m−1 rows of A(k), shifted down by one row, since we can write

  Ã(k) = A A(k) = e1 (−p1, −p2, ..., −pm) A(k) + S A(k),    (4.6)

where S denotes the m × m down-shift matrix with ones on the subdiagonal and zeroes elsewhere. We find that Ã(k) can be partitioned into two Toeplitz matrices Ã1(k) and Ã2(k), where the first is of size (k+1) × m and the second of size (m−k−1) × m. Ã1(k) is characterized by its first row and column: the first row is (−p_{k+1}, −p_{k+2}, ..., −pm, 0, ..., 0), the first column is (−p_{k+1}, 0, ..., 0)^T. Ã2(k) is characterized by its last row and last column: its last row is (0, ..., 0, p0, p1, ..., pk, 0), with p0 := 1, where the number of zeroes at the beginning is m−2−k; the last column is zero. Because we have that

  A(k+1) = Ã(k) − (tr Ã(k)/(k+1)) Im = Ã(k) + p_{k+1} Im,    (4.7)

it follows that A(k+1) is obtained from Ã(k) by adding p_{k+1} Im, which clearly gives the partitioning into two Toeplitz matrices as described in the theorem. □

Remark. From the proof of the theorem it also follows that the matrices Ã(k) allow for a similar partitioning into two Toeplitz blocks, with an equally simple structure. In view of this block-Toeplitz structure, the matrices A(k) and Ã(k) may alternatively be called Sylvester matrices.

These properties of the Faddeev sequence of Ac can be applied to simplify the solution formulas for Ac P + PB = K and P − Ac PB = K. Let k_i^T denote the ith row of K, i.e., K = Σ_{i=1}^{m} e_i k_i^T. Using the fact that F_{Ac}(e_i) = Ac(i−1), one obtains that the solution of Ac P + PB = K is given by

  P = Σ_{i=1}^{m} Ac(i−1) Γ (F_{B^T}(k_i))^T,

where the matrices Ac(i−1) require no calculations.
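Theorem 4.1 means that, for A = Ac, the whole Faddeev sequence can be written down without any recursion. The sketch below (the function name is ours) builds A(k) from the two Toeplitz blocks of the theorem and checks it against the generic recursion of Section 2:

```python
from scipy.linalg import toeplitz

def faddeev_term_companion(p, k):
    """A(k) of the controller-form Ac from Theorem 4.1, 0 <= k <= m-1."""
    m, pa = len(p), np.asarray(p, float)
    blocks = []
    if k > 0:
        # A1(k): k x m Toeplitz block, zero first column, first row
        # (0, -p_{k+1}, ..., -p_m, 0, ..., 0) with k-1 trailing zeroes
        row1 = np.concatenate(([0.0], -pa[k:], np.zeros(k - 1)))
        blocks.append(toeplitz(np.zeros(k), row1))
    # A2(k): (m-k) x m Toeplitz block with last row (0,...,0, 1, p1, ..., pk)
    # and last column (0, ..., 0, pk)^T; built by flipping a standard Toeplitz
    last_row = np.concatenate((np.zeros(m - 1 - k), [1.0], pa[:k]))
    last_col = np.concatenate((np.zeros(m - 1 - k), [pa[k - 1] if k > 0 else 1.0]))
    blocks.append(np.flip(toeplitz(last_col[::-1], last_row[::-1])))
    return np.vstack(blocks)

p = [2.0, -1.0, 3.0, 0.5]                 # arbitrary example coefficients
seq, _ = faddeev_sequence(companion(p))
for k in range(len(p)):
    assert np.allclose(faddeev_term_companion(p, k), seq[k])
```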
5 A special algorithm for the case of equal characteristic polynomials

Consider again the equations AP + PB = K and P − APB = K. If the characteristic polynomials p(s) and q(s) of A and B are the same, then it follows from the results in the previous sections that Γ and ∆ are the solutions of the equations Ac P + P Ac^T = E11 and P − Ac P Ac^T = E11, where Ac is in controller form and has characteristic polynomial p(s). Let Cc denote here the linear mapping given by P ↦ Ac P + P Ac^T and similarly let Dc denote the mapping P ↦ P − Ac P Ac^T. Then Cc and Dc have the property that they map symmetric matrices to symmetric matrices and that Γ = Cc^{−1}(E11) and ∆ = Dc^{−1}(E11), where obviously E11 is symmetric. Therefore the Faddeev sequence method can be applied to invert Cc and Dc considered as endomorphisms of the m(m+1)/2-dimensional vector space of symmetric m × m matrices.

Let Σm denote the set of m × m symmetric matrices, and let us denote the Faddeev sequence of Cc|Σm by {Cc^Σ(k), C̃c^Σ(k) | k = 0, 1, ...} and similarly the Faddeev sequence of Dc|Σm by {Dc^Σ(k), D̃c^Σ(k) | k = 0, 1, ...}. The corresponding coefficient matrices are denoted by C^Σ(k), C̃^Σ(k) (for k = 0, 1, ...) and D^Σ(k), D̃^Σ(k) (for k = 0, 1, ...), respectively. In order to calculate these, one needs a formula for the trace of a linear combination of the linear matrix operators (½ Lk Rl + ½ Ll Rk)|Σm, where Lk stands for multiplication on the left by Ac(k) and Rl stands for multiplication on the right by Ac(l)^T, with k, l ∈ {0, 1, 2, ..., m−1}. This is given in the following lemma.

Lemma 5.1 The trace of the linear matrix operator (½ Lk Rl + ½ Ll Rk)|Σm, considered as an endomorphism of the set Σm of symmetric m × m matrices, is equal to ½ tr Ac(k) tr Ac(l) + ½ tr{Ac(k) Ac(l)}.

Proof. Choose the usual matrix inner product ⟨P, Q⟩ = tr{P^T Q} as inner product on Σm. (Of course the choice of inner product does not affect the value of the trace.) An orthonormal basis with respect to this inner product is given by {E_ii, i = 1, ..., m} ∪ {(E_ij + E_ji)/√2 | i > j}. It contains m(m+1)/2 elements, all of which are of course symmetric. The trace of (½ Lk Rl + ½ Ll Rk)|Σm is now given by

  Σ_{i=1}^{m} ⟨E_ii, (½ Lk Rl + ½ Ll Rk)(E_ii)⟩ + Σ_{i>j} ⟨(E_ij + E_ji)/√2, (½ Lk Rl + ½ Ll Rk)((E_ij + E_ji)/√2)⟩.

Writing out all inner products as traces of products of the matrices E_ij, Ac(k) and Ac(l)^T, expanding, and collecting terms, one finds that this equals

  ½ Σ_{i,j} (e_i^T Ac(k) e_i)(e_j^T Ac(l)^T e_j) + ½ Σ_{i,j} (e_i^T Ac(k) e_j)(e_i^T Ac(l)^T e_j),

which is clearly equal to ½ tr Ac(k) tr Ac(l) + ½ tr{Ac(k) Ac(l)}. □

The sequence of coefficient matrices C^Σ(k) is now given by

  C^Σ(0) = E11,   C̃^Σ(0) = Ac C^Σ(0) + C^Σ(0) Ac^T,    (5.1)
  C^Σ(k) = C̃^Σ(k−1) − ((τA^T C̃^Σ(k−1) τA + ⟨TA, C̃^Σ(k−1)⟩)/(2k)) E11,
  C̃^Σ(k) = Ac C^Σ(k) + C^Σ(k) Ac^T,   k = 1, 2, ..., m(m+1)/2 − 1,    (5.2)

and Γ is obtained by using the formula

  Γ = m(m+1) C^Σ(m(m+1)/2 − 1) / (τA^T C̃^Σ(m(m+1)/2 − 1) τA + ⟨TA, C̃^Σ(m(m+1)/2 − 1)⟩),    (5.3)

where ⟨X, Y⟩ = Σ_{ij} X_ij Y_ij denotes the inner product of two matrices X and Y of the same size, and where TA denotes the m × m matrix which has tr{Ac(k−1) Ac(l−1)} as its (k, l)th element.
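The recursion (5.1)-(5.3) is again mechanical. In the sketch below (reusing companion, tau and faddeev_sequence from the earlier sketches; the name gamma_symmetric is ours), TA is assembled from the Faddeev sequence of Ac, and the result is compared with the general m²-step recursion of Section 3.3 on an arbitrary Hurwitz example:

```python
def gamma_symmetric(p):
    """Gamma for the case q(s) = p(s), via (5.1)-(5.3); m(m+1)/2 iterations."""
    m = len(p)
    Ac, tA = companion(p), tau(p)
    seq, _ = faddeev_sequence(Ac)
    # TA has tr{Ac(k) Ac(l)} as its (k+1, l+1) entry, cf. Lemma 5.1
    TA = np.array([[np.trace(seq[k] @ seq[l]) for l in range(m)]
                   for k in range(m)])
    C = np.zeros((m, m)); C[0, 0] = 1.0                 # C_Sigma(0) = E11
    Ct = Ac @ C + C @ Ac.T                              # C~_Sigma(0)
    for k in range(1, m * (m + 1) // 2):
        C = Ct.copy()
        C[0, 0] -= (tA @ Ct @ tA + np.sum(TA * Ct)) / (2 * k)   # eq. (5.2)
        Ct = Ac @ C + C @ Ac.T
    return m * (m + 1) / (tA @ Ct @ tA + np.sum(TA * Ct)) * C   # eq. (5.3)

p = [2.0, 3.0, 1.0]
assert np.allclose(gamma_symmetric(p), gamma_matrix(p, p))
```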
Similarly, the sequence of coefficient matrices D^Σ(k) is given recursively by

  D^Σ(0) = E11,   D̃^Σ(0) = D^Σ(0) − Ac D^Σ(0) Ac^T,    (5.4)
  D^Σ(k) = D̃^Σ(k−1) − ((τA^T D̃^Σ(k−1) τA + ⟨TA, D̃^Σ(k−1)⟩)/(2k)) E11,
  D̃^Σ(k) = D^Σ(k) − Ac D^Σ(k) Ac^T,   k = 1, 2, ..., m(m+1)/2 − 1,    (5.5)

and ∆ is obtained from

  ∆ = m(m+1) D^Σ(m(m+1)/2 − 1) / (τA^T D̃^Σ(m(m+1)/2 − 1) τA + ⟨TA, D̃^Σ(m(m+1)/2 − 1)⟩).    (5.6)

Both algorithms require just m(m+1)/2 iterations to obtain Γ and ∆, instead of the previous m² iterations. The calculation of the traces of the operators is more involved, but this does not increase the overall complexity of the algorithms.

6 Examples

In this section some outcomes of the algorithm are presented. In fact we give the formula for Γ in the case p(s) = q(s), in terms of the coefficients of p(s), for several values of m. One reason for doing this, apart from showing what kind of results can be obtained with this algorithm, is that by substitution of pk = −tr Ã(k−1)/k one can obtain the matrix Γ for arbitrary parametrizations of A, and, using our formulas, one can then relatively easily obtain the solution of the Sylvester or Lyapunov equation involved. Because Γ is 'alternating-Hankel' it suffices to give a common denominator, together with the elements of the first row and last column. Furthermore, because it is symmetric as well, each element of the matrix for which i + j is odd is zero. The formulae presented below all apply to the continuous-time case, because experience shows that in the discrete-time case the outcomes are usually more involved. They have been calculated on a 486-based personal computer. Once the formulae are available, it is possible within the Maple and Mathematica software packages to substitute numerical values for the coefficients and then to obtain the outcomes with prespecified numerical accuracy.

For m = 4 and A given by

  A = [ −p1  −p2  −p3  −p4 ]
      [  1    0    0    0  ]
      [  0    1    0    0  ]
      [  0    0    1    0  ]    (6.1)

the coefficient matrix Γ satisfying AΓ + ΓA^T = E11 is calculated as:

  Γ = [  r1   0    r2   0  ]
      [  0   −r2   0   −r3 ]
      [  r2   0    r3   0  ]
      [  0   −r3   0   −r4 ]    (6.2)

where

  r1 = (−p2 p3 + p1 p4)/(2d),    (6.3)
  r2 = p3/(2d),    (6.4)
  r3 = −p1/(2d),    (6.5)
  r4 = (p1 p2 − p3)/(2 p4 d),    (6.6)

and where the polynomial d appearing in the denominators of r1, ..., r4 is given by

  d = p1 p2 p3 − p3² − p1² p4.    (6.7)

These formulae were obtained with Mathematica in about 15 seconds, using the general algorithm which does not exploit the available symmetry. The direct approach with Kronecker products, using the Mathematica routines 'LinearSolve' and 'Factor' (to obtain simplified results), took about 15 seconds as well. (However, in the discrete-time case the direct approach needed 5 minutes to obtain the solution and many more minutes for simplification, while the algorithm of this paper was capable of yielding the simplified results in less than 2 minutes.)
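These formulae can be verified mechanically. The following sketch checks with sympy (the variable names are ours) that the stated Γ indeed satisfies AΓ + ΓA^T = E11 for m = 4:

```python
import sympy as sp

p1, p2, p3, p4 = sp.symbols('p1 p2 p3 p4')
A = sp.Matrix([[-p1, -p2, -p3, -p4],
               [1, 0, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 1, 0]])
d = p1*p2*p3 - p3**2 - p1**2*p4                      # eq. (6.7)
r1 = (-p2*p3 + p1*p4) / (2*d)                        # eqs. (6.3)-(6.6)
r2, r3, r4 = p3/(2*d), -p1/(2*d), (p1*p2 - p3)/(2*p4*d)
G = sp.Matrix([[r1, 0, r2, 0],
               [0, -r2, 0, -r3],
               [r2, 0, r3, 0],
               [0, -r3, 0, -r4]])
E11 = sp.zeros(4, 4); E11[0, 0] = 1
assert sp.simplify(A*G + G*A.T - E11) == sp.zeros(4, 4)
```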
For m = 7 and A given by

  A = [ −p1  −p2  −p3  −p4  −p5  −p6  −p7 ]
      [  1    0    0    0    0    0    0  ]
      [  0    1    0    0    0    0    0  ]
      [  0    0    1    0    0    0    0  ]
      [  0    0    0    1    0    0    0  ]
      [  0    0    0    0    1    0    0  ]
      [  0    0    0    0    0    1    0  ]    (6.8)

the coefficient matrix Γ satisfying AΓ + ΓA^T = E11 is calculated as:

  Γ = [  r1   0    r2   0    r3   0    r4 ]
      [  0   −r2   0   −r3   0   −r4   0  ]
      [  r2   0    r3   0    r4   0    r5 ]
      [  0   −r3   0   −r4   0   −r5   0  ]
      [  r3   0    r4   0    r5   0    r6 ]
      [  0   −r4   0   −r5   0   −r6   0  ]
      [  r4   0    r5   0    r6   0    r7 ]    (6.9)

where

  r1 = −(−p1 p4³ p7 − p2² p4 p5 p7 + p1 p4² p6 p5 + p4² p7 p5 + p6 p7² + p6 p2² p5² + p2³ p7² − p4 p6 p5 p2 p3 − 2 p6 p2² p7 p3 + p4² p7 p2 p3 − p1 p4 p6² p3 + p6² p5 p3 − p6 p4 p5² + p1² p6³ − 2 p7 p1 p6² − 2 p7² p4 p2 + 3 p7 p1 p6 p4 p2 − 2 p1 p5 p6² p2 + p5 p6 p7 p2 + p6² p2 p3²)/(2d),    (6.10)

  r2 = (p6 p3² + p7² p2² + p5 p6 p7 − 2 p3 p7 p2 p6 + p5² p2 p6 − p1 p5 p6² − p7² p4 − p7 p5 p2 p4 + p7 p1 p6 p4 − p5 p6 p3 p4 + p7 p4² p3)/(2d),    (6.11)

  r3 = (−p2² p7 − p1 p4² p7 + p4 p7 p5 + p1 p4 p6 p5 − p6 p5² + p6 p7 p3 − p1 p6² p3 + p6 p7 p1 p2)/(2d),    (6.12)

  r4 = (p7² + p6 p5 p3 − p7 p3 p4 − 2 p6 p7 p1 − p6 p2 p5 p1 + p7 p2 p4 p1 + p6² p1²)/(2d),    (6.13)

  r5 = −(p6 p3² − p2 p3 p7 + p5 p7 + p2² p7 p1 − p2 p3 p6 p1 − p5 p6 p1 − p7 p1 p4 + p6 p1² p4)/(2d),    (6.14)

  r6 = −(−p5² + 2 p1 p5 p4 − p1² p4² − p1 p6 p3 + p7 p3 − p3² p4 + p1² p6 p2 − p1 p7 p2 + p5 p3 p2 + p1 p4 p3 p2 − p1 p5 p2²)/(2d),    (6.15)

  r7 = −(−p5² p3 p2 + p7 p3² p2 + p1³ p6² + p1 p5² p2² + p1 p5 p7 p2 + p1² p4 p7 p2 − 2 p1² p5 p6 p2 − p1 p5 p4 p3 p2 + p1 p6 p3² p2 − p7 p1 p3 p2² + p4 p5 p3² + p5³ − p6 p1² p4 p3 − 2 p5 p7 p3 + p1² p4² p5 − 2 p1 p4 p5² + p1 p7² − 2 p1² p7 p6 − p6 p3³ + 3 p5 p6 p1 p3)/(2 d p7),    (6.16)

and where the polynomial d appearing in the denominators of r1, ..., r7 is given by

  d = −p7² p3 p2² + 2 p7 p1 p5 p4² + p7 p3 p1 p2 p4² − p3² p7 p4² + p3 p7 p5 p2 p4 − p7 p3 p1 p6 p4 + 3 p1² p7 p2 p6 p4 + 2 p7² p3 p4 − p7 p5² p4 + p1 p2 p3² p6² − p1² p7 p4³ − p2 p3 p5² p6 − 2 p1² p2 p5 p6² − p1² p3 p4 p6² + p1² p4² p5 p6 − p1 p2 p3 p4 p5 p6 + 3 p1 p3 p5 p6² − 3 p1² p7 p6² + p5³ p6 + p1³ p6³ − p3³ p6² − p7 p1 p5 p2² p4 − 3 p1 p7² p2 p4 − 2 p4 p1 p5² p6 + p7 p1 p5 p2 p6 − 3 p3 p7 p5 p6 − 2 p7 p3 p1 p2² p6 + 2 p3² p7 p2 p6 + p1 p5² p2² p6 + p1 p7² p2³ − p7³ + p5 p7² p2 + 3 p1 p7² p6 + p3² p4 p5 p6.    (6.17)

These formulae were obtained using Maple in less than 1 hour, again using the general algorithm which does not exploit the symmetry. The direct approach using Kronecker products turned out to be infeasible for this particular case m = 7, because it amounts to solving a linear system of equations of size 49 × 49. As one may notice, the formulae for the case m = 4 can be re-obtained by restricting to the 4 × 4 upper-left block of Γ, substituting p5 = p6 = p7 = x, cancelling common factors x, and then setting x to zero.

7 Conclusions and further research

The algorithms for solving Lyapunov and Sylvester equations developed in this paper have turned out to be well-suited for symbolic calculations. As the dimensions of the problem increase, the Faddeev-based recursive algorithms increasingly outperform a direct approach using Kronecker products. If the algorithms are to be used for numerical calculation only, care should be taken, because numerical round-off errors tend to rapidly destroy the accuracy of final and intermediate results. (On a computer with 16-digit accuracy, the outcomes are generally unreliable for m or n larger than 4.) However, if A and B^T are given in controller form, the analytic structure of the Faddeev sequences of A and B is known, which may help to improve the numerical performance of the algorithms.
Symbolic formulae obtained with the algorithms (such as given in the previous section for m = 4 and m = 7) could also be taken as a starting point, thus bypassing the numerically most ill-conditioned computations.

If m = n, the numerical complexity of the Faddeev-based algorithms is easily shown to be of order O(n⁴), both with respect to the Lyapunov or Sylvester equations and for the Faddeev sequences of the matrices A and B. Since, in general, inversion of an n × n matrix requires an algorithm of order O(n^2.5), it is interesting to note that the Kronecker product approach requires inversion of an n² × n² matrix, thus amounting to a complexity of order O(n⁵). About the complexity of the algorithm for symbolic computation little can be said. Here complexity is mainly determined by the representation of all intermediate quantities in the algorithm, and therefore depends on the parametrization of A, B and K. However, from the results of this paper we feel there is good reason to believe that the Faddeev-based algorithms are well-suited for symbolically solving Sylvester equations if A and B^T are in controller form.

The matrix-algebra approach of this paper, in conjunction with Faddeev's algorithm, has also turned out to be a powerful instrument to gain more theoretical insight into the structure and the solution of Lyapunov and Sylvester equations. In particular the role of the 'Faddeev reachability matrix' has been commented upon, as it provides the basis for the state space of a linear SISO system that leads to its controller canonical form. It also becomes manifest in the formulae which specify how the solution to Lyapunov and Sylvester equations can be quickly obtained for an arbitrary right-hand side K, provided a solution is available for a rank-one right-hand side xy^T with (A, x) and (B^T, y) reachable pairs. Such situations are quite common when working with balanced realizations, both in the continuous-time and the discrete-time case. Important simplifications have been shown to occur when A and B have the same characteristic polynomial p(s) = q(s), of which the symmetric case B = A^T is an example. An alternative algorithm has been developed for this case, which requires only m(m+1)/2 iterations instead of the usual m².

The situation where A and B^T are in controller form and K = E11 has been recognized as the central issue to be further investigated. Several alternative approaches for obtaining Γ and ∆ which exploit their alternating-Hankel and Toeplitz structure are currently being studied. Other topics under research include the symbolic calculation of various Riemannian metrics on spaces of linear systems (which are calculated by repeated solution of a number of Lyapunov equations, cf. [9, 4]), including the Fisher information metric for stable linear SISO systems driven by Gaussian white noise. Other subjects of interest are the generalization of Faddeev reachability matrices to the multivariate case and the extension of the methods in this paper to linear matrix equations of the more general form APB + CPD = K.

References

[1] B. Friedland. Control System Design. McGraw-Hill, New York, 1986.
[2] F.R. Gantmacher. The Theory of Matrices, volumes I, II. Chelsea, New York, 1959.
[3] B. Hanzon. Some new results on and applications of an algorithm of Agashe. In R.F. Curtain, editor, Modelling, Robustness and Sensitivity Reduction in Control Systems, pages 285-303. Springer Verlag, Berlin, 1987.
[4] B. Hanzon.
Identifiability, Recursive Identification and Spaces of Linear Dynamical Systems. CWI Tracts 63 and 64. Centrum voor Wiskunde en Informatica (CWI), Amsterdam, 1989.
[5] B. Hanzon and J.M. Maciejowski. Constructive algebra methods for the L2 problem for stable linear systems, with examples. Technical Report CUED/F-INFENG/TR 178, Department of Engineering, Cambridge University, June 1994. Submitted to Automatica.
[6] T. Kailath. Linear Systems. Prentice-Hall, Englewood Cliffs, New Jersey, 1980.
[7] R.E. Kalman, J. Coffy, and P. Nicholson. Méthodes algébriques modernes appliquées à la théorie des systèmes linéaires. Technical report, Centre d'Automatique, Fontainebleau, 1969.
[8] R.J. Ober. Balanced realizations: canonical form, parametrization, model reduction. International Journal of Control, 46:643-670, 1987.
[9] R.L.M. Peeters. System Identification Based on Riemannian Geometry: Theory and Algorithms, volume 64 of Tinbergen Institute Research Series. Thesis Publishers, Amsterdam, 1994.
[10] R.L.M. Peeters and B. Hanzon. A balanced canonical form for discrete-time stable all-pass systems. In U. Helmke, R. Mennicken, and J. Saurer, editors, Systems and Networks: Mathematical Theory and Applications, volume II, pages 417-420. Akademie Verlag, 1994.