A Faddeev Sequence Method for Solving Lyapunov and
Sylvester Equations
Bernard Hanzon∗ and Ralf L.M. Peeters†
Abstract
Lyapunov equations and more generally Sylvester equations play an important role in linear
systems theory. They take the form AP + PB = K (continuous-time) and P − APB = K
(discrete-time), where the m × n matrix P is the unknown. Here we present a
method of solving such equations by exploiting the matrix-algebra structure of the problem.
No use is made of Kronecker products and the largest matrices occurring in the algorithms are
of sizes m × m, m × n and n × n.
The Faddeev method for matrix inversion lies at the very heart of the algorithms presented.
The idea is to associate sequences of linear operators with the sequences of matrices occurring
in the Faddeev algorithms applied to matrices A and B, and then to use these operators to
represent the Faddeev sequence for the Lyapunov and Sylvester operators. The resulting algorithms are capable of exactly solving the equations in a finite number of recursion steps. They
turn out to be well-suited for symbolic calculation.
The concept of a Faddeev reachability matrix introduced here turns out to be useful. It
establishes a close connection between the controller canonical (companion) form of a reachable
pair (A, b) and the Faddeev sequence of A. If A is already in controller form, then its Faddeev
sequence takes on an especially simple form. Further simplifications arise in the case of symmetry,
i.e., if B = A^T. Alternative algorithms can then be developed which require fewer iterations.
Using Faddeev reachability matrices, it is shown how a solution can be quickly obtained for
an equation with an arbitrary right-hand side K, provided a solution is known for a right-hand
side xy T of rank-one where (A, x) and (B T , y) are reachable pairs. These results are of interest
also outside the scope of Faddeev sequence methods.
We conclude with some examples concerning the symbolic solution of the continuous-time
Lyapunov equation AP + PA^T = −bb^T with (A, b) in controller form.
1 Introduction
In the theory of linear dynamical systems, linear matrix equations like Lyapunov and more generally
Sylvester equations play an important role. There are many numerically well-tested algorithms for
such equations. However, for several applications it is important to obtain symbolic solutions of such
equations. For example, in model reduction theory one encounters optimization problems in which
the criterion function can be expressed in terms of the solutions of one or several Lyapunov equations.
In order to apply gradient search algorithms, one needs the derivatives of the criterion function. Also,
if one wants to solve the first-order conditions algebraically, the derivatives of the criterion function
are required. Therefore having an explicit symbolic expression for the criterion function is very useful.
Also in the application of techniques from Riemannian geometry to problems in systems theory, like
system identification, model reduction, parametrization etc., the calculation of Riemannian metric
tensors plays an important role and often involves the solution of several Lyapunov and Sylvester
equations. Again in that case it is very useful to have the answers in symbolic form, because
∗ Dept. Econometrics, Free University Amsterdam, De Boelelaan 1105, 1081 HV Amsterdam, Fax: +31-20-4446020, Email: [email protected]
† Dept. Mathematics, University of Limburg, P.O. Box 616, 6200 MD Maastricht, The Netherlands, Fax: +31-43211889, Email: [email protected]
Submitted to Lin. Alg. Appl.
Discussions with J.M. Maciejowski and C.T. Chou are gratefully acknowledged.
further calculations to obtain curvature tensors etc., require taking derivatives. For stochastic
linear dynamical models the Fisher information matrix is in fact a Riemannian metric tensor and
it can also be obtained in symbolic form by solving a number of Lyapunov and Sylvester equations.
For further information on these issues the reader is referred to [9, 4, 5].
One straightforward approach to solving such equations symbolically is to exploit the fact that
the equations are linear and to use Kronecker products to transform the problem into one in which an
mn × mn matrix has to be inverted, if the matrix sought for is m × n. This usually works in practice,
i.e., with a computer algebra package like Maple¹ or Mathematica², only for problems in which the
values of m and n are small. In this paper an alternative approach is presented, which exploits the
matrix algebra structure of the problem. In existing algorithms of this kind (cf. [3, 7]) one needs to
perform first a number of polynomial calculations starting with the characteristic polynomial. Here
a different, but theoretically related approach is taken in which calculations with the characteristic
polynomial are avoided. The idea behind the presented algorithm is to apply Faddeev’s algorithm
for inversion of a linear finite dimensional operator to Lyapunov and Sylvester equations. This leads
to a recursive procedure which ends after a finite number of steps, related to the size of the problem.
In the algorithms the use of Kronecker products is avoided and the largest matrices occurring are
of sizes n × n, m × m, m × n. Due to the recursive structure of the algorithm, it can be programmed
in a concise way.
One can calculate the L2 norm of a SISO stable linear dynamical system by solving a related
Lyapunov equation. An important special case of our method occurs if the SISO stable linear system
is in controller canonical form. This form is not to be confused with the controllability canonical
form. It turns out that the controller canonical form can in fact be understood as the canonical
form that is obtained by choosing the basis of the state space in a way that is directly related to the
Faddeev sequence of the dynamical matrix of the system. Although the controller canonical form is
one of the most well-known canonical forms, to the best of our knowledge this has not been noted
before. In the controller canonical form the dynamical matrix is in companion form. It turns out
that the Lyapunov equation can in most cases be reduced to the special case in which the dynamical
matrix is in companion form. We give the symbolic solution to the Lyapunov equation for this case
for a number of choices for n.
Some experiences with the algorithm in computer algebra calculations are reported upon. We
encountered cases which could not be handled by the Kronecker products method because of memory problems, but which could be handled by the method proposed here. For numerical calculations this method appears to require fewer operations; however, experience shows that it is then numerically
unreliable. Of course this plays no role in computer algebra applications, where exact arithmetic is used.
2 The Faddeev sequence of a matrix and matrix inversion
One of the basic problems in linear algebra is the calculation of the inverse of a square n × n
nonsingular matrix A. An interesting matrix-algebra method to calculate the inverse can be obtained
by exploiting the properties of the Faddeev sequence of A. The Faddeev sequence of the matrix A
is recursively defined as follows:
$$A(0) := I_n, \qquad \tilde{A}(0) := A(0)A \tag{2.1}$$

$$A(k) := \tilde{A}(k-1) - \frac{\operatorname{tr}(\tilde{A}(k-1))}{k}\, I_n, \qquad \tilde{A}(k) := A(k)A, \qquad k = 1, 2, \ldots \tag{2.2}$$
Let the characteristic polynomial of A be given by p(s) = det(sI_n − A) = s^n + p_1 s^{n−1} + ... + p_n.
Then the Faddeev sequence has the following nice properties, derived from the Newton identities
(cf. [2, p. 87]):

$$p_k = -\frac{\operatorname{tr}(\tilde{A}(k-1))}{k}, \qquad k = 1, 2, \ldots, n \tag{2.3}$$
¹ Maple is a registered trademark of Waterloo Maple Software.
² Mathematica is a registered trademark of Wolfram Research, Inc.
and

$$A(k) = A^k + p_1 A^{k-1} + p_2 A^{k-2} + \ldots + p_{k-1}A + p_k I_n, \qquad k = 0, 1, 2, \ldots, n \tag{2.4}$$

So due to the Cayley–Hamilton theorem A(n) = p(A) = 0. (From this it follows directly that
A(k) = 0 for all k ≥ n.) Therefore if A is nonsingular,

$$-\frac{\operatorname{tr}(\tilde{A}(n-1))}{n} = p_n = \det(-A) \neq 0$$

and

$$0 = A(n) = \tilde{A}(n-1) - \frac{\operatorname{tr}(\tilde{A}(n-1))}{n}\, I_n$$

from which the following formula for the inverse of A can be derived easily:

$$A^{-1} = \frac{n}{\operatorname{tr}(\tilde{A}(n-1))}\, A(n-1) \tag{2.5}$$
Note that the inverse is obtained by a sequence of operations consisting of multiplication by A, taking
the trace, subtraction of a scalar multiple of the identity matrix, and division by a scalar. Therefore
the algorithm can in fact be applied to any finite dimensional linear endomorphism (i.e., a linear
operator whose domain is equal to its codomain) and is independent of the choice of basis
in the vector space on which the operator acts. Hence one can speak of the Faddeev sequence
of a linear endomorphism, and the inverse of a linear endomorphism can be constructed from the
Faddeev sequence in the way described. This will be important in the following sections.
It may be good to note at this point that in the literature the Faddeev sequence is usually
presented as a means to calculate the resolvent (sI_n − A)^{−1} of a matrix A. The relevant formula
for the resolvent is

$$(sI_n - A)^{-1} = \frac{A(0)s^{n-1} + A(1)s^{n-2} + \ldots + A(n-2)s + A(n-1)}{p(s)}$$

where p(s) also follows from the Faddeev sequence calculations as noted before. (Compare e.g. [1,
Sections 3.4, 3.5].)
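To fix ideas, the following is a minimal numerical sketch of this inversion procedure (our illustration, not part of the paper; it assumes Python with numpy and floating-point scalars, whereas the symbolic applications discussed in this paper would use exact arithmetic):

```python
import numpy as np

def faddeev_inverse(A):
    """Invert a nonsingular n x n matrix via the Faddeev sequence.

    Implements (2.1)-(2.2); the coefficients p_k of the characteristic
    polynomial are recovered from (2.3) as a by-product, and the inverse
    is obtained from (2.5).
    """
    n = A.shape[0]
    Ak = np.eye(n)                     # A(0) = I_n
    p = []                             # p_1, ..., p_n
    for k in range(1, n + 1):
        At = Ak @ A                    # A~(k-1) = A(k-1) A
        p.append(-np.trace(At) / k)    # (2.3): p_k = -tr A~(k-1) / k
        A_prev = Ak                    # keep A(k-1); at k = n this is A(n-1)
        Ak = At + p[-1] * np.eye(n)    # (2.2): A(k) = A~(k-1) + p_k I_n
    # (2.5): A^{-1} = n A(n-1) / tr A~(n-1) = -A(n-1) / p_n
    return -A_prev / p[-1], p

# Example: for A = [[2, 1], [0, 3]] this returns [[1/2, -1/6], [0, 1/3]]
# together with p = [-5, 6], i.e. p(s) = s^2 - 5s + 6.
```

Note that only matrix products, traces and scalar operations are used, which is precisely why the procedure carries over verbatim to the operators C and D of the next section.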
3 Solving Lyapunov and Sylvester equations using Faddeev sequences

3.1 The matrix-algebra approach to Lyapunov and Sylvester equations
Consider Sylvester equations of the form

$$AP + PB = K \tag{3.1}$$

and

$$P - APB = K, \tag{3.2}$$

where A is a given m × m matrix, B is a given n × n matrix, K is a given m × n matrix and P is an
unknown m × n matrix for which we want to solve the equation. In order to do that, consider the
linear matrix operators L = L_A, R = R_B and I_mn defined by

$$L_A : P \mapsto AP, \qquad R_B : P \mapsto PB \qquad \text{and} \qquad I_{mn} : P \mapsto P$$

The last one is clearly the identity on the vector space of m × n matrices. Define the linear operators
C := L + R and D := I_{mn} − LR. Then the Sylvester equations (3.1) and (3.2) can be written as

$$C(P) = K \tag{3.3}$$

and

$$D(P) = K \tag{3.4}$$
respectively. Because C and D are linear endomorphisms (on the vector space of m × n matrices),
the solution is given abstractly by

$$P = C^{-1}(K) \qquad \text{and} \qquad P = D^{-1}(K)$$

respectively, if C and D are invertible. This is known to be the case iff A and −B have no eigenvalues
in common, and iff no eigenvalue of A is the reciprocal of an eigenvalue of B, respectively (cf., e.g.,
[2]). Therefore the question arises how C^{−1} and D^{−1} can be calculated. In principle the techniques of
the previous section can be applied: one can define the Faddeev sequence {C(k), C̃(k) | k = 0, 1, 2, ...},
for which C(k) = 0, C̃(k) = 0 for all k ≥ mn. Then

$$C^{-1} = \frac{mn}{\operatorname{tr}\tilde{C}(mn-1)}\, C(mn-1)$$

and similarly for D. In order to apply this idea one needs an explicit representation of C and its Faddeev sequence.
Here we propose to represent C and the elements of its Faddeev sequence as a linear combination
of operators Li Rj , where Li := LA(i) denotes left multiplication of an m × n matrix by the matrix
A(i) from the Faddeev sequence of A, and where similarly Rj := RB(j) denotes right multiplication
of an m × n matrix by the matrix B(j) from the Faddeev sequence of B. The reason why C and the
elements of its Faddeev sequence can be represented in this way is twofold:
(i) Left-multiplication of an m × n matrix by an m × m matrix Ā and right-multiplication of an m × n matrix
by an n × n matrix B̄ commute (due to the associativity of matrix multiplication):
L_Ā R_B̄ = R_B̄ L_Ā for all Ā and B̄ of the correct sizes.
(ii) An arbitrary polynomial in A can be written as a linear combination of elements A(i), i =
0, 1, ... of the Faddeev sequence of A, because A(i) = A^i + lower degree terms for each
i = 0, 1, 2, .... Because A(i) = 0 for all i ≥ m, this implies that it can actually be written as a linear combination of A(0), A(1), ..., A(m−1). It follows directly that an arbitrary
polynomial in L can be written as a linear combination of L_0, L_1, ..., L_{m−1}. Similarly an
arbitrary polynomial in R can be written as a linear combination of R_0, R_1, ..., R_{n−1}.

Combining (i) and (ii) one finds that each polynomial in C can be rewritten as a linear combination
of {L_iR_j | i = 0, 1, 2, ..., m−1; j = 0, 1, 2, ..., n−1}. Note that the linear combination does not
necessarily have to be unique. (It is not iff the degree of the minimal polynomial of C is less than
the degree of its characteristic polynomial.) Even if there is linear dependence of this kind, the
algorithm presented below still works without any changes, although a more careful analysis may
then produce a quicker algorithm. The symmetric case B = AT is an important example of such a
situation, to which we return in Section 5.
3.2 Derivation of the Faddeev sequence formulae

An important ingredient in the calculation of the Faddeev sequences of C and D is the calculation of
the traces of C̃(k) and D̃(k). Because C̃(k) is represented as a linear combination of endomorphisms
of the form L_iR_j, the question arises how one can calculate the trace of L_iR_j. The answer is given
in the following lemma.
Lemma 3.1
$$\operatorname{tr}\{L_iR_j\} = \operatorname{tr}A(i)\,\operatorname{tr}B(j)$$
Proof. For each i ∈ {0, ..., m−1}, j ∈ {0, ..., n−1}, L_iR_j is a linear endomorphism of the vector
space of m × n matrices. Therefore its trace is well-defined, independent of the specific choice of
a basis in the vector space. Consider the inner product ⟨·,·⟩ on the vector space R^{m×n} of m × n
matrices given by

$$\langle P, Q\rangle = \operatorname{tr}\{P^TQ\}; \qquad P, Q \in \mathbb{R}^{m\times n}$$

Then an orthogonal basis is given by {E_kl = e_k f_l^T | k = 1, ..., m; l = 1, ..., n} with e_k the kth standard basis vector in R^m and f_l the lth standard basis vector in R^n. The trace of an endomorphism
L_iR_j of R^{m×n} is equal to

$$\sum_{k=1}^{m}\sum_{l=1}^{n} \langle E_{kl}, L_iR_j(E_{kl})\rangle = \sum_{k=1}^{m}\sum_{l=1}^{n} \operatorname{tr}\{(e_kf_l^T)^T A(i)\,e_kf_l^T B(j)\} = \sum_{k=1}^{m}\sum_{l=1}^{n} e_k^TA(i)e_k\; f_l^TB(j)f_l$$
$$= \left(\sum_{k=1}^{m} e_k^TA(i)e_k\right)\left(\sum_{l=1}^{n} f_l^TB(j)f_l\right) = \operatorname{tr}A(i)\,\operatorname{tr}B(j) \qquad \Box$$
Theorem 3.2 Define the m × n coefficient matrices

$$C(k) = (c_{ij}(k))_{i=0,j=0}^{m-1,n-1}, \qquad k = 0, 1, \ldots, mn-1 \tag{3.5}$$

$$\tilde{C}(k) = (\tilde{c}_{ij}(k))_{i=0,j=0}^{m-1,n-1}, \qquad k = 0, 1, \ldots, mn-1 \tag{3.6}$$

by the following recursive formulas: C(0) = E_{11} and for each k = 0, 1, 2, ..., mn−1

$$\tilde{c}_{00}(k) = \sum_{i=0}^{m-1} \frac{\operatorname{tr}\tilde{A}(i)}{i+1}\, c_{i0}(k) + \sum_{j=0}^{n-1} \frac{\operatorname{tr}\tilde{B}(j)}{j+1}\, c_{0j}(k) \tag{3.7}$$

$$\tilde{c}_{0j}(k) = \sum_{i=0}^{m-1} \frac{\operatorname{tr}\tilde{A}(i)}{i+1}\, c_{ij}(k) + c_{0,j-1}(k) \qquad \text{if } j > 0 \tag{3.8}$$

$$\tilde{c}_{i0}(k) = \sum_{j=0}^{n-1} \frac{\operatorname{tr}\tilde{B}(j)}{j+1}\, c_{ij}(k) + c_{i-1,0}(k) \qquad \text{if } i > 0 \tag{3.9}$$

$$\tilde{c}_{ij}(k) = c_{i-1,j}(k) + c_{i,j-1}(k) \qquad \text{if } i > 0 \text{ and } j > 0 \tag{3.10}$$

$$c_{00}(k+1) = \tilde{c}_{00}(k) - \frac{1}{k+1}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1} \tilde{c}_{ij}(k)\operatorname{tr}A(i)\operatorname{tr}B(j) \tag{3.11}$$

$$c_{ij}(k+1) = \tilde{c}_{ij}(k) \qquad \text{if } (i,j) \neq (0,0) \tag{3.12}$$

Then for each k = 0, 1, 2, ..., mn−1

$$C(k) = \sum_{i=0}^{m-1}\sum_{j=0}^{n-1} c_{ij}(k)L_iR_j \tag{3.13}$$

$$\tilde{C}(k) = \sum_{i=0}^{m-1}\sum_{j=0}^{n-1} \tilde{c}_{ij}(k)L_iR_j \tag{3.14}$$
Proof. Induction is used for k = 0, 1, 2, ..., mn − 1. Clearly

$$C(0) = I_{mn} = L_0R_0 = \sum_{i=0}^{m-1}\sum_{j=0}^{n-1} c_{ij}(0)L_iR_j$$

with c_00(0) = 1 and c_ij(0) = 0 if (i, j) ≠ (0, 0). Now suppose (this is the induction hypothesis) that

$$C(k) = \sum_{i=0}^{m-1}\sum_{j=0}^{n-1} c_{ij}(k)L_iR_j;$$

then

$$\tilde{C}(k) = C\,C(k) = (L+R)\sum_{ij} c_{ij}(k)L_iR_j = \sum_{ij} c_{ij}(k)\left(LL_iR_j + L_iRR_j\right)$$

Applying LL_i to an m × n matrix means multiplication on the left by AA(i) = A(i)A = Ã(i) = A(i+1) + (tr Ã(i)/(i+1)) I_m = A(i+1) + (tr Ã(i)/(i+1)) A(0), by definition of A(i+1) and A(0). Therefore

$$LL_i = L_{i+1} + \frac{\operatorname{tr}\tilde{A}(i)}{i+1}\, L_0$$

and similarly

$$RR_j = R_{j+1} + \frac{\operatorname{tr}\tilde{B}(j)}{j+1}\, R_0$$

It follows that

$$\tilde{C}(k) = \sum_{i=0}^{m-1}\sum_{j=0}^{n-1} c_{ij}(k)\left(L_{i+1}R_j + \frac{\operatorname{tr}\tilde{A}(i)}{i+1}L_0R_j + L_iR_{j+1} + \frac{\operatorname{tr}\tilde{B}(j)}{j+1}L_iR_0\right)$$

Because A(m) = 0 and B(n) = 0 one has L_m = 0 and R_n = 0, so re-indexing the terms involving L_{i+1}R_j and L_iR_{j+1} yields

$$\tilde{C}(k) = \sum_{i=1}^{m-1}\sum_{j=1}^{n-1}\left(c_{i-1,j}(k) + c_{i,j-1}(k)\right)L_iR_j + \sum_{j=1}^{n-1}\left(\sum_{i=0}^{m-1}\frac{\operatorname{tr}\tilde{A}(i)}{i+1}c_{ij}(k) + c_{0,j-1}(k)\right)L_0R_j$$
$$\quad + \sum_{i=1}^{m-1}\left(\sum_{j=0}^{n-1}\frac{\operatorname{tr}\tilde{B}(j)}{j+1}c_{ij}(k) + c_{i-1,0}(k)\right)L_iR_0 + \left(\sum_{i=0}^{m-1}\frac{\operatorname{tr}\tilde{A}(i)}{i+1}c_{i0}(k) + \sum_{j=0}^{n-1}\frac{\operatorname{tr}\tilde{B}(j)}{j+1}c_{0j}(k)\right)L_0R_0$$

This is equal to $\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\tilde{c}_{ij}(k)L_iR_j$ if the c̃_ij(k) are assigned according to (3.7)–(3.10). Next consider the equation

$$C(k+1) = \tilde{C}(k) - \frac{\operatorname{tr}\tilde{C}(k)}{k+1}\, I_{mn}$$

First note that application of Lemma 3.1 gives

$$\frac{\operatorname{tr}\tilde{C}(k)}{k+1} = \frac{1}{k+1}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\tilde{c}_{ij}(k)\operatorname{tr}(L_iR_j) = \frac{1}{k+1}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\tilde{c}_{ij}(k)\operatorname{tr}A(i)\operatorname{tr}B(j)$$

It follows that

$$C(k+1) = \sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\tilde{c}_{ij}(k)L_iR_j - \left(\frac{1}{k+1}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\tilde{c}_{ij}(k)\operatorname{tr}A(i)\operatorname{tr}B(j)\right)L_0R_0$$

This is equal to $C(k+1) = \sum_{i=0}^{m-1}\sum_{j=0}^{n-1} c_{ij}(k+1)L_iR_j$ if the c_ij(k+1) are assigned according to (3.11)–(3.12). □
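The recursion of Theorem 3.2 together with Lemma 3.1 translates directly into code. The following sketch (ours, not the authors'; Python with numpy, all function and variable names our own) carries the coefficient matrices through k = mn − 1 and returns the coefficient matrix of C^{−1} in the L_iR_j basis:

```python
import numpy as np

def tilde_traces(M):
    """Return [tr M~(0), ..., tr M~(m-1)] for the Faddeev sequence of M."""
    m = M.shape[0]
    Mk, out = np.eye(m), []
    for k in range(1, m + 1):
        Mt = Mk @ M
        out.append(np.trace(Mt))
        Mk = Mt - (out[-1] / k) * np.eye(m)
    return out

def inverse_coefficients(A, B):
    """Coefficient matrix of C^{-1} via the recursion (3.7)-(3.12)."""
    m, n = A.shape[0], B.shape[0]
    ta, tb = tilde_traces(A), tilde_traces(B)
    a = [ta[i] / (i + 1) for i in range(m)]   # tr A~(i)/(i+1) = -p_{i+1}
    b = [tb[j] / (j + 1) for j in range(n)]
    trA = [float(m)] + [-(m - i) * a[i - 1] for i in range(1, m)]  # tr A(i)
    trB = [float(n)] + [-(n - j) * b[j - 1] for j in range(1, n)]  # tr B(j)
    C = np.zeros((m, n)); C[0, 0] = 1.0        # C(0) = E_11
    for k in range(m * n):
        Ct = np.zeros((m, n))
        Ct[0, 0] = sum(a[i] * C[i, 0] for i in range(m)) \
                 + sum(b[j] * C[0, j] for j in range(n))            # (3.7)
        for j in range(1, n):                                       # (3.8)
            Ct[0, j] = sum(a[i] * C[i, j] for i in range(m)) + C[0, j - 1]
        for i in range(1, m):                                       # (3.9)
            Ct[i, 0] = sum(b[j] * C[i, j] for j in range(n)) + C[i - 1, 0]
        Ct[1:, 1:] = C[:-1, 1:] + C[1:, :-1]                        # (3.10)
        trCt = sum(Ct[i, j] * trA[i] * trB[j]
                   for i in range(m) for j in range(n))             # Lemma 3.1
        if k == m * n - 1:
            return (m * n / trCt) * C   # mn C(mn-1) / tr C~(mn-1)
        C = Ct.copy(); C[0, 0] -= trCt / (k + 1)                    # (3.11)-(3.12)
```

Only the traces tr Ã(i), tr B̃(j), tr A(i), tr B(j) of the Faddeev sequences enter the recursion, so the coefficients depend on A and B only through their characteristic polynomials; this is exploited in the matrix-vector form of Section 3.3.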
The analogous result for the Faddeev sequence of D is as follows.

Theorem 3.3 Define the m × n coefficient matrices

$$D(k) = (d_{ij}(k))_{i=0,j=0}^{m-1,n-1}, \qquad k = 0, 1, 2, \ldots, mn-1 \tag{3.15}$$

$$\tilde{D}(k) = (\tilde{d}_{ij}(k))_{i=0,j=0}^{m-1,n-1}, \qquad k = 0, 1, 2, \ldots, mn-1 \tag{3.16}$$

by the following recursive formulas: D(0) = E_{11} and for each k = 0, 1, 2, ..., mn−1

$$\tilde{d}_{00}(k) = d_{00}(k) - \sum_{i=0}^{m-1}\sum_{j=0}^{n-1} \frac{\operatorname{tr}\tilde{A}(i)}{i+1}\,\frac{\operatorname{tr}\tilde{B}(j)}{j+1}\, d_{ij}(k) \tag{3.17}$$

$$\tilde{d}_{0j}(k) = d_{0j}(k) - \sum_{i=0}^{m-1} \frac{\operatorname{tr}\tilde{A}(i)}{i+1}\, d_{i,j-1}(k) \qquad \text{if } j > 0 \tag{3.18}$$

$$\tilde{d}_{i0}(k) = d_{i0}(k) - \sum_{j=0}^{n-1} \frac{\operatorname{tr}\tilde{B}(j)}{j+1}\, d_{i-1,j}(k) \qquad \text{if } i > 0 \tag{3.19}$$

$$\tilde{d}_{ij}(k) = d_{ij}(k) - d_{i-1,j-1}(k) \qquad \text{if } i > 0,\ j > 0 \tag{3.20}$$

$$d_{00}(k+1) = \tilde{d}_{00}(k) - \frac{1}{k+1}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1} \tilde{d}_{ij}(k)\operatorname{tr}A(i)\operatorname{tr}B(j) \tag{3.21}$$

$$d_{ij}(k+1) = \tilde{d}_{ij}(k) \qquad \text{if } (i,j) \neq (0,0) \tag{3.22}$$

Then for each k = 0, 1, 2, ..., mn−1 one has D(k) = Σ_{i,j} d_{ij}(k)L_iR_j and D̃(k) = Σ_{i,j} d̃_{ij}(k)L_iR_j.

Proof. Analogous to the proof of the previous theorem; this is left to the reader. □

3.3 The Faddeev sequence formulae in matrix-vector notation
The recursive equations for the Faddeev sequences of the matrix operators C and D can also be cast
in matrix-vector notation. Let p(s) = det(sI_m − A) = s^m + p_1s^{m−1} + ... + p_m be the characteristic
polynomial of A and q(s) = det(sI_n − B) = s^n + q_1s^{n−1} + ... + q_n be the characteristic polynomial
of B. Let A_c be the m × m matrix given by

$$A_c := \begin{pmatrix} -p_1 & -p_2 & \cdots & -p_{m-1} & -p_m \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \ddots & \vdots & \vdots \\ \vdots & & \ddots & 0 & 0 \\ 0 & \cdots & 0 & 1 & 0 \end{pmatrix} \tag{3.23}$$

and let B_c be the n × n matrix given by

$$B_c := \begin{pmatrix} -q_1 & -q_2 & \cdots & -q_{n-1} & -q_n \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \ddots & \vdots & \vdots \\ \vdots & & \ddots & 0 & 0 \\ 0 & \cdots & 0 & 1 & 0 \end{pmatrix} \tag{3.24}$$
Furthermore let τ_A ∈ R^m denote the vector of traces of the elements A(0), A(1), ..., A(m−1) of
the Faddeev sequence of A:

$$\tau_A := \begin{pmatrix} \operatorname{tr}A(0) \\ \operatorname{tr}A(1) \\ \vdots \\ \operatorname{tr}A(m-1) \end{pmatrix} \tag{3.25}$$

and let τ_B ∈ R^n be defined analogously. Note that for k > 0, tr A(k) = tr Ã(k−1) − (tr Ã(k−1)/k) tr I_m = (m−k)p_k, where the equality p_k = −tr Ã(k−1)/k is used. Of course, defining p_0 := 1, tr A(0) = m = mp_0.
So one can express the elements of the vector τ_A in terms of the p_k, k = 0, 1, 2, ..., m−1, as follows:

$$\tau_A = \begin{pmatrix} mp_0 \\ (m-1)p_1 \\ \vdots \\ 2p_{m-2} \\ p_{m-1} \end{pmatrix} \tag{3.26}$$
An analogous formula holds for τ_B. It is now straightforward to verify that the recursive formulae
given in Theorems 3.2 and 3.3 for the coefficient matrices of the Faddeev sequences of C and D can be
rewritten as:

$$C(0) = E_{11}, \qquad \tilde{C}(0) = A_cC(0) + C(0)B_c^T \tag{3.27}$$

$$C(k) = \tilde{C}(k-1) - \frac{\tau_A^T\tilde{C}(k-1)\tau_B}{k}\, E_{11}, \qquad \tilde{C}(k) = A_cC(k) + C(k)B_c^T, \qquad k = 1, 2, \ldots, mn-1 \tag{3.28}$$

and

$$D(0) = E_{11}, \qquad \tilde{D}(0) = D(0) - A_cD(0)B_c^T \tag{3.29}$$

$$D(k) = \tilde{D}(k-1) - \frac{\tau_A^T\tilde{D}(k-1)\tau_B}{k}\, E_{11}, \qquad \tilde{D}(k) = D(k) - A_cD(k)B_c^T, \qquad k = 1, 2, \ldots, mn-1 \tag{3.30}$$
It follows that the inverse of C can be expressed as

$$C^{-1} = \sum_{i=0}^{m-1}\sum_{j=0}^{n-1} \gamma_{ij}L_iR_j \tag{3.31}$$

where the m × n matrix Γ = (γ_{i−1,j−1})_{i=1,j=1}^{m,n} is given by the formula

$$\Gamma = \frac{mn}{\tau_A^T\tilde{C}(mn-1)\tau_B}\, C(mn-1) \tag{3.32}$$

It may be interesting to note that Γ has a structure that could perhaps be called alternating-Hankel:

$$\gamma_{ij} = -\gamma_{i-1,j+1}, \qquad \text{for } i = 1, 2, 3, \ldots, m-1;\ j = 0, 1, 2, \ldots, n-2$$

This can easily be derived from the equality

$$A_c\Gamma + \Gamma B_c^T = \frac{mn}{\tau_A^T\tilde{C}(mn-1)\tau_B}\left(A_cC(mn-1) + C(mn-1)B_c^T\right) = E_{11}$$

together with the special structure of A_c and B_c.
The inverse of D can be expressed as

$$D^{-1} = \sum_{i=0}^{m-1}\sum_{j=0}^{n-1} \delta_{ij}L_iR_j \tag{3.33}$$

where the m × n matrix ∆ = (δ_{i−1,j−1})_{i=1,j=1}^{m,n} is given by the formula

$$\Delta = \frac{mn}{\tau_A^T\tilde{D}(mn-1)\tau_B}\, D(mn-1) \tag{3.34}$$

It may be interesting to note that ∆ is a Toeplitz matrix. This can easily be derived from the
equality

$$\Delta - A_c\Delta B_c^T = \frac{mn}{\tau_A^T\tilde{D}(mn-1)\tau_B}\left(D(mn-1) - A_cD(mn-1)B_c^T\right) = E_{11}$$

together with the special structure of A_c and B_c.
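The matrix-vector form is compact enough to transcribe directly. The sketch below (ours, with the same assumptions as the earlier snippets) computes Γ from the characteristic polynomial coefficients alone, via (3.26)–(3.28) and (3.32):

```python
import numpy as np

def companion(coeffs):
    """Companion matrix in controller form, cf. (3.23)-(3.24)."""
    m = len(coeffs)
    M = np.zeros((m, m))
    M[0, :] = -np.asarray(coeffs, dtype=float)
    if m > 1:
        M[1:, :-1] = np.eye(m - 1)
    return M

def gamma_matrix_vector(p, q):
    """Gamma via (3.27)-(3.28) and (3.32); p = [p1,...,pm], q = [q1,...,qn]."""
    m, n = len(p), len(q)
    Ac, Bc = companion(p), companion(q)
    # (3.26): tau_A = (m p0, (m-1) p1, ..., p_{m-1})^T with p0 := 1
    tauA = np.array([(m - k) * pk for k, pk in enumerate([1.0] + list(p))][:m])
    tauB = np.array([(n - k) * qk for k, qk in enumerate([1.0] + list(q))][:n])
    E11 = np.zeros((m, n)); E11[0, 0] = 1.0
    C = E11.copy()
    Ct = Ac @ C + C @ Bc.T                        # (3.27)
    for k in range(1, m * n):
        C = Ct - (tauA @ Ct @ tauB / k) * E11     # (3.28)
        Ct = Ac @ C + C @ Bc.T
    return (m * n / (tauA @ Ct @ tauB)) * C       # (3.32)
```

The same loop with the updates of (3.29)–(3.30), i.e. with C̃ replaced by D − A_cDB_c^T, and the final formula (3.34), yields ∆ for the discrete-time case.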
3.4 Solution of the Lyapunov and Sylvester equations

The solution of the matrix equation AP + PB = K can now be given by the formula

$$P = C^{-1}(K) = \sum_{i=0}^{m-1}\sum_{j=0}^{n-1} \gamma_{ij}L_iR_j(K) = \sum_{i=0}^{m-1}\sum_{j=0}^{n-1} \gamma_{ij}A(i)KB(j) \tag{3.35}$$

Similarly the solution of the matrix equation P − APB = K is given by

$$P = D^{-1}(K) = \sum_{i=0}^{m-1}\sum_{j=0}^{n-1} \delta_{ij}L_iR_j(K) = \sum_{i=0}^{m-1}\sum_{j=0}^{n-1} \delta_{ij}A(i)KB(j) \tag{3.36}$$

In order to write this in a concise way, the following definition will be helpful.

Definition 3.4 Consider a pair (A, b) with A an m × m matrix and b an m × 1 column vector. The
matrix

$$F_A(b) := [A(0)b, A(1)b, \ldots, A(m-1)b]$$

will be called the Faddeev reachability matrix of the pair (A, b).

The name of this matrix is clarified in the next section, where relations with system-theoretical
concepts are treated. If K = xy^T is a rank-one matrix, where x and y are column vectors, then the
solution of AP + PB = K can be rewritten as

$$P = F_A(x)\,\Gamma\,(F_{B^T}(y))^T \tag{3.37}$$

This follows directly from (3.35) using the basic rules of matrix multiplication. Similarly the solution
of P − APB = xy^T is given by

$$P = F_A(x)\,\Delta\,(F_{B^T}(y))^T \tag{3.38}$$
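A short sketch of the rank-one solution formula (3.37), assuming a Γ computed as in the snippet above (function names are again ours):

```python
import numpy as np

def faddeev_reachability(M, v):
    """F_M(v) = [M(0)v, M(1)v, ..., M(m-1)v], cf. Definition 3.4."""
    m = M.shape[0]
    Mk, cols = np.eye(m), []
    for k in range(1, m + 1):
        cols.append(Mk @ v)            # column k is M(k-1) v
        Mt = Mk @ M
        Mk = Mt - (np.trace(Mt) / k) * np.eye(m)
    return np.column_stack(cols)

def solve_rank_one(A, B, x, y, Gamma):
    """Solution of A P + P B = x y^T via (3.37)."""
    return faddeev_reachability(A, x) @ Gamma @ faddeev_reachability(B.T, y).T
```

A general right-hand side K is then handled by summing this expression over a rank-one decomposition of K, as in (3.39) below.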
Any matrix K can of course be written as a linear combination $\sum_{k=1}^{r} x_ky_k^T$ of rank-one matrices,
where r ≤ min(m, n). (For instance, one can write $K = \sum_{i=1}^{m} e_ik_i^T$ with k_i^T denoting the ith row of
K.) The solution of AP + PB = K is then given by

$$P = \sum_{k=1}^{r} F_A(x_k)\,\Gamma\,(F_{B^T}(y_k))^T \tag{3.39}$$

and similarly the solution of P − APB = K is then given by

$$P = \sum_{k=1}^{r} F_A(x_k)\,\Delta\,(F_{B^T}(y_k))^T \tag{3.40}$$
This formula has some noteworthy consequences, which are of interest even outside the scope of
Faddeev sequence methods and which apparently have not been noted before.
Corollary 3.5 If (A, x) and (B^T, y) are pairs such that F_A(x) and F_{B^T}(y) are nonsingular (i.e.,
(A, x) and (B^T, y) are reachable pairs, cf. the next section) then:

(i) If Σ is the solution of the equation AΣ + ΣB = xy^T then

$$\Gamma = F_A(x)^{-1}\,\Sigma\,\left(F_{B^T}(y)^{-1}\right)^T$$

Similarly, if Σ is the solution of Σ − AΣB = xy^T then

$$\Delta = F_A(x)^{-1}\,\Sigma\,\left(F_{B^T}(y)^{-1}\right)^T$$

(ii) If Σ is the solution of the equation AΣ + ΣB = xy^T then the solution of AP + PB = K =
$\sum_{k=1}^{r} x_{(k)}y_{(k)}^T$ is equal to

$$P = \sum_{k=1}^{r} F_A(x_{(k)})F_A(x)^{-1}\,\Sigma\,\left(F_{B^T}(y)^{-1}\right)^T F_{B^T}(y_{(k)})^T$$

and similarly if Σ is the solution of Σ − AΣB = xy^T then the solution of P − APB = K =
$\sum_{k=1}^{r} x_{(k)}y_{(k)}^T$ is equal to

$$P = \sum_{k=1}^{r} F_A(x_{(k)})F_A(x)^{-1}\,\Sigma\,\left(F_{B^T}(y)^{-1}\right)^T F_{B^T}(y_{(k)})^T$$
The second part of this corollary tells how to find a solution for an arbitrary right-hand side K in case a
solution is known for a specific right-hand side xy^T such that (A, x) and (B^T, y) are reachable. This
takes on an especially simple form if $K = x_{(1)}y_{(1)}^T$; then the solution P is equal to

$$P = F_A(x_{(1)})F_A(x)^{-1}\,\Sigma\,\left(F_{B^T}(y)^T\right)^{-1} F_{B^T}(y_{(1)})^T$$

This can also be shown directly, bypassing the Faddeev sequences of C and D, as follows. Note
that A(i) commutes with A; therefore multiplication on the left of the equation AΣ + ΣB = xy^T
by A(i) gives A(A(i)Σ) + A(i)ΣB = (A(i)x)y^T, which shows that if x_{(1)} = A(i)x then P = A(i)Σ
is the solution. Taking a linear combination $x_{(1)} = \sum_{i=0}^{m-1}\xi_{i+1}A(i)x$ one finds the solution $P = \sum_{i=0}^{m-1}\xi_{i+1}A(i)\Sigma$ of the corresponding matrix equation with right-hand side K = x_{(1)}y^T. The m × 1
column vector ξ = (ξ_i)_{i=1}^m has the property that on the one hand ξ = F_A(x)^{−1}x_{(1)} and on the other
hand $P = \sum_{i=1}^{m}\xi_iA(i-1)\Sigma$. Now considering the lth column of the matrix $\sum_{i=1}^{m}\xi_iA(i-1)$ one
finds that it is equal to F_A(e_l)ξ = F_A(e_l)F_A(x)^{−1}x_{(1)}, for l = 1, 2, ..., m. In order to make the next
step, the following remarkable lemma is required.

Lemma 3.6 Let (A, x) ∈ R^{m×m} × R^m be such that F_A(x) is a nonsingular m × m matrix (i.e.,
(A, x) is a reachable pair); then for all u, v ∈ R^m the equality

$$F_A(u)F_A(x)^{-1}v = F_A(v)F_A(x)^{-1}u$$

holds.

Proof. Because F_A(x) is nonsingular, the vectors A(0)x, A(1)x, ..., A(m−1)x form a basis of R^m.
As F_A(u)F_A(x)^{−1}v is a bilinear form in u and v, it suffices to show the equality for the cases where
u = A(k)x and v = A(l)x, with k = 0, 1, ..., m−1; l = 0, 1, ..., m−1. Because all the elements of
the Faddeev sequence of A commute, one has

$$F_A(A(k)x)F_A(x)^{-1}A(l)x = A(k)F_A(x)F_A(x)^{-1}A(l)x = A(k)A(l)x = A(l)A(k)x = F_A(A(l)x)F_A(x)^{-1}A(k)x \qquad \Box$$

Applying this lemma to the lth column of the matrix $\sum_{i=1}^{m}\xi_iA(i-1)$ one finds that it is equal to
F_A(e_l)F_A(x)^{−1}x_{(1)} = F_A(x_{(1)})F_A(x)^{−1}e_l. Therefore the matrix is in fact equal to F_A(x_{(1)})F_A(x)^{−1}.
A similar reasoning can be applied if y^T is replaced by y_{(1)}^T in the right-hand side of the matrix
equation, using the reachability of the pair (B^T, y). In this way a direct proof of the second part of
the corollary above has been obtained.
Remark. In case m = n and B = A^T, there are several parametrized families of matrices A for
which an accompanying vector b is known such that (A, b) is reachable (i.e., F_A(b) nonsingular) and
such that the equation AP + PA^T = −bb^T, respectively P − APA^T = bb^T, has a known (simple)
solution. For example, if A stems from a balanced parametrization then AP + PA^T = −bb^T has
a known diagonal solution matrix P = Σ (cf., e.g., [8]). It follows that in that case the solution
of AP + PA^T = K can be found directly from the second part of the corollary, without going
through the calculation of the Faddeev sequence of C. Another example is when A is in Schwarz
form, i.e., A = (−b_1^2/2)E_{11} + A_{sk} where A_{sk} is an arbitrary tridiagonal skew-symmetric matrix.
Then A + A^T = −bb^T if b = b_1e_1, implying that the identity matrix is the solution of the Lyapunov equation. The solution of AP + PA^T = x_{(1)}y_{(1)}^T is then obtained from the corollary as

$$P = -F_A(x_{(1)})F_A(b)^{-1}\left(F_A(b)^{-1}\right)^T F_A(y_{(1)})^T$$

Also for the discrete-time Lyapunov equation such parametrized families are known (cf., e.g., [10]).
4 Relations with the controller canonical form of a pair (A, b)
In linear systems theory (cf., e.g., [6]) a pair (A, b) ∈ R^{m×m} × R^m is called reachable if the reachability
matrix

$$R_A(b) := [b, Ab, A^2b, \ldots, A^{m-1}b]$$

is nonsingular. Because for the elements A(k), k = 0, 1, 2, ... of the Faddeev sequence of A the
equality A(k) = A^k + p_1A^{k−1} + ... + p_kA^0 holds, the reachability matrix R_A(b) is related to the
matrix F_A(b) of the previous section by the formula

$$R_A(b)\,U = F_A(b)$$

where U is the upper triangular m × m Toeplitz matrix with first row equal to (1, p_1, p_2, ..., p_{m−1}).
It follows immediately that R_A(b) and F_A(b) have equal rank and that if one of these matrices is
nonsingular then so is the other. Therefore the matrix F_A(b) is called the Faddeev reachability matrix
in this paper. In linear systems theory two reachable pairs (A^{(1)}, b^{(1)}) and (A^{(2)}, b^{(2)}) are considered
to be equivalent if there exists a nonsingular transformation matrix T such that A^{(2)} = TA^{(1)}T^{−1}
and b^{(2)} = Tb^{(1)}. Such a transformation corresponds to a change of basis in the space R^m on which
A operates as an endomorphism and in which b lies as a vector. This space is called the state space.
A special choice of the state-space basis can lead to a particularly simple form of the pair (A, b),
which can simplify certain calculations and from which certain properties can be deduced more
easily. One speaks of a canonical form for the pair (A, b). Choosing the columns of the reachability
matrix as a basis for the state space leads to the well-known controllability canonical form (cf. [6],
pp. 335–6):


$$A_y = \begin{pmatrix} 0 & 0 & \cdots & 0 & -p_m \\ 1 & 0 & \cdots & 0 & -p_{m-1} \\ 0 & 1 & \ddots & \vdots & \vdots \\ \vdots & & \ddots & 0 & -p_2 \\ 0 & 0 & \cdots & 1 & -p_1 \end{pmatrix} \tag{4.1}$$

$$b_y = (1, 0, \ldots, 0)^T \tag{4.2}$$

It is easy to verify that R_{A_y}(b_y) = I_m. Choosing the columns of the Faddeev reachability matrix as
a basis for the state space leads to the well-known controller form. The relation of this form with
the Faddeev sequence of the dynamical matrix seems not to have been noticed in the literature. The
controller form will be denoted by (A_c, b_c):


$$A_c = \begin{pmatrix} -p_1 & -p_2 & \cdots & -p_{m-1} & -p_m \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \ddots & \vdots & \vdots \\ \vdots & & \ddots & 0 & 0 \\ 0 & \cdots & 0 & 1 & 0 \end{pmatrix} \tag{4.3}$$

$$b_c = (1, 0, \ldots, 0)^T \tag{4.4}$$

(Note that the notation A_c was already used in the previous section for matrices with this structure.)
That this is indeed the form of the reachable pair if the columns of the Faddeev reachability matrix
are chosen as basis for the state space follows simply from the fact that for k = 0, 1, 2, ..., m−1:

$$A(A(k)b) = \tilde{A}(k)b = A(k+1)b + \frac{\operatorname{tr}\tilde{A}(k)}{k+1}\, b$$

and, as was stated before, tr Ã(k)/(k+1) = −p_{k+1}. So if the columns of the Faddeev reachability matrix are
chosen as the basis of the state space, then of course b = e_1 and A(k−1)b = e_k, k = 1, 2, ..., m, and
A(m)b = 0 because A(m) = 0. So Ae_m = −p_me_1 and Ae_k = e_{k+1} − p_ke_1, k = 1, ..., m−1, and
so (A, b) is indeed in controller form. Furthermore, if (A_c, b_c) is a reachable pair in controller form,
then the characteristic polynomial is det(sI − A_c) = p(s), and therefore choosing the columns of the
Faddeev reachability matrix as a basis of the state space keeps the reachable pair unchanged, which
implies that the Faddeev reachability matrix is the identity matrix: F_{A_c}(b_c) = I_m. This leads to
an interpretation of the coefficient matrices Γ and ∆ of C^{−1} and D^{−1} respectively, as follows. Let
A = A_c have characteristic polynomial p(s) of degree m and B = B_c^T have characteristic polynomial
q(s) of degree n; then the equation

$$A_cP + PB_c^T = E_{11} = e_1f_1^T$$

has solution Γ as defined in (3.32). This follows simply from (3.37) because F_{A_c}(e_1) = I_m and
F_{B_c}(f_1) = I_n. Similarly the equation

$$P - A_cPB_c^T = E_{11} = e_1f_1^T$$

has solution ∆. Note that Γ and ∆ clearly depend only on the coefficients p_1, p_2, ..., p_m; q_1, ..., q_n,
which are system invariants, and therefore Γ and ∆ are themselves system invariants. If B_c = A_c
and A_c has its spectrum in the open left half plane, then −Γ is the positive definite continuous-time
reachability Gramian of the pair (A_c, e_1). And if B_c = A_c and A_c has its spectrum in the open unit
disk, then ∆ is the positive definite discrete-time reachability Gramian of the same pair.
If A = A_c and B = B_c^T in the Sylvester equations studied here, a remarkable simplification of
the solution formulas can be obtained by studying the Faddeev sequence of an endomorphism in
controller form A_c more closely.

Firstly, as a corollary of Lemma 3.6 one can obtain the following equality:

$$A_c(i-1)e_j = A_c(j-1)e_i, \qquad i = 1, \ldots, m;\ j = 1, \ldots, m \tag{4.5}$$

In order to derive this from Lemma 3.6, use that F_{A_c}(e_1) = I_m and that F_{A_c}(e_j)e_i = A_c(i−1)e_j.
Combining this one obtains

$$A_c(i-1)e_j = F_{A_c}(e_j)F_{A_c}(e_1)^{-1}e_i = F_{A_c}(e_i)F_{A_c}(e_1)^{-1}e_j = A_c(j-1)e_i$$

This implies that F_{A_c}(e_i) = A_c(i−1), i = 1, ..., m.

Secondly, inspection of the Faddeev sequence of A_c (for small values of m — here computer
algebra has turned out to be very helpful) shows that it has a remarkably simple structure! So
simple, in fact, that one need not calculate it at all: one can just apply the following theorem.
Theorem 4.1 Let (A_c, e_1) be in controller form and let A_c have characteristic polynomial p(s) of
degree m. Then the matrix A(k), k = 0, 1, 2, ..., m−1, from the Faddeev sequence of A = A_c can
be partitioned as

$$A(k) = \begin{pmatrix} A_1(k) \\ A_2(k) \end{pmatrix}$$

where A_1(k) is a k × m Toeplitz matrix (it is not there if k = 0) and A_2(k) is an (m−k) × m
Toeplitz matrix.
A_1(k) is given by its first row and column: the first row is (0, −p_{k+1}, −p_{k+2}, ..., −p_m, 0, ..., 0), so
the number of zeroes at the end is k − 1; the first column is zero. A_2(k) is given by its last row
and last column: its last row is (0, ..., 0, p_0, p_1, ..., p_k), with p_0 := 1, the number of zeroes at the
beginning being of course m − 1 − k; the last column is (0, ..., 0, p_k)^T and the number of zeroes above
p_k is m − 1 − k.
Proof. By induction. A(0) = I_m is clearly of this form and

$$A(1) = A_c + p_1I_m = \begin{pmatrix} 0 & -p_2 & \cdots & -p_{m-1} & -p_m \\ 1 & p_1 & & & 0 \\ 0 & 1 & \ddots & & \vdots \\ \vdots & & \ddots & p_1 & 0 \\ 0 & 0 & \cdots & 1 & p_1 \end{pmatrix}$$

is also clearly of this form. Suppose (induction hypothesis) that A(k) is of this form. Then
the first row of Ã(k) = A(k)A is equal to e_1^TA(k)A = (0, −p_{k+1}, −p_{k+2}, ..., −p_m, 0, ..., 0)A =
(−p_{k+1}, −p_{k+2}, ..., −p_m, 0, ..., 0).
The other rows of Ã(k) consist of the first m − 1 rows of A(k), shifted down by one row, since we
can write

$$\tilde{A}(k) = AA(k) = e_1(-p_1, -p_2, \ldots, -p_m)A(k) + \begin{pmatrix} 0 & \cdots & \cdots & 0 \\ 1 & \ddots & & \vdots \\ & \ddots & \ddots & \vdots \\ 0 & & 1 & 0 \end{pmatrix} A(k) \tag{4.6}$$

We find that Ã(k) can be partitioned into two Toeplitz matrices Ã_1(k) and Ã_2(k), where the first is
of size (k+1) × m and the second of size (m−k−1) × m. Ã_1(k) is characterized by its first row and
column: the first row is (−p_{k+1}, −p_{k+2}, ..., −p_m, 0, ..., 0), the first column is (−p_{k+1}, 0, ..., 0)^T.
Ã_2(k) is characterized by its last row and last column: its last row is (0, ..., 0, p_0, p_1, ..., p_k, 0), with
p_0 := 1, the number of zeroes at the beginning being m − 2 − k; the last column is zero. Because we
have that

$$A(k+1) = \tilde{A}(k) - \frac{\operatorname{tr}\tilde{A}(k)}{k+1}\, I_m = \tilde{A}(k) + p_{k+1}I_m \tag{4.7}$$

it follows that A(k+1) is obtained from Ã(k) by adding p_{k+1}I_m, which clearly gives the partitioning
into two Toeplitz matrices as described in the theorem. □
Remark. From the proof of the theorem it also follows that the matrices Ã(k) allow for a similar
partitioning into two Toeplitz blocks, with an equally simple structure. In view of this block-Toeplitz
structure, the matrices A(k) and Ã(k) may alternatively be called Sylvester matrices.
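The theorem can be turned into a direct construction. The following sketch (ours) assembles A(k) for A = A_c from p_1, ..., p_m without any recursion, following the Toeplitz description above:

```python
import numpy as np

def faddeev_element_controller(p, k):
    """A(k) from the Faddeev sequence of the controller-form matrix A_c,
    assembled from the two Toeplitz blocks of Theorem 4.1;
    p = [p1, ..., pm], with 0 <= k <= m-1."""
    m = len(p)
    pp = [1.0] + [float(x) for x in p]            # p_0 := 1, p_1, ..., p_m
    Ak = np.zeros((m, m))
    for r in range(k):                            # upper block A_1(k):
        # shifted copies of the row (0, -p_{k+1}, ..., -p_m, 0, ..., 0)
        for c in range(r + 1, min(r + m - k, m - 1) + 1):
            Ak[r, c] = -pp[k + c - r]
    for r in range(k, m):                         # lower block A_2(k):
        # shifted copies of the row (0, ..., 0, p_0, p_1, ..., p_k)
        for c in range(r - k, r + 1):
            Ak[r, c] = pp[k + c - r]
    return Ak

# Sanity check against the recursive definition (2.1)-(2.2): for k = 1 this
# returns A_c + p_1 I_m, in agreement with the proof of Theorem 4.1.
```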
These properties of the Faddeev sequence of A_c can be applied to simplify the solution formulas for
A_cP + PB = K and P − A_cPB = K. Let k_i^T denote the ith row of K, i.e., $K = \sum_{i=1}^{m} e_ik_i^T$. Using
the fact that F_{A_c}(e_i) = A_c(i−1), one obtains that the solution of A_cP + PB = K is given by

$$P = \sum_{i=1}^{m} A_c(i-1)\,\Gamma\,(F_{B^T}(k_i))^T$$

where the matrices A_c(i−1) require no calculations.
5 A special algorithm for the case of equal characteristic polynomials
Consider again the equations AP + PB = K and P − APB = K. If the characteristic polynomials
p(s) and q(s) of A and B are the same, then it follows from the results in the previous sections that
Γ and ∆ are the solutions of the equations A_cP + PA_c^T = E_{11} and P − A_cPA_c^T = E_{11}, where A_c is in
controller form and has characteristic polynomial p(s). Let C_c denote here the linear mapping given
by P ↦ A_cP + PA_c^T and similarly let D_c denote the mapping P ↦ P − A_cPA_c^T. Then C_c and D_c
have the property that they map symmetric matrices to symmetric matrices, and Γ = C_c^{−1}(E_{11})
and ∆ = D_c^{−1}(E_{11}), where obviously E_{11} is symmetric. Therefore the Faddeev sequence method can
be applied to invert C_c and D_c considered as endomorphisms of the ½m(m+1)-dimensional vector
space of symmetric m × m matrices.

Let Σ_m denote the set of m × m symmetric matrices and let us denote the Faddeev sequence of C_c|_Σ by {C_c^Σ(k), C̃_c^Σ(k) | k = 0, 1, ...} and similarly the Faddeev sequence of D_c|_Σ by
{D_c^Σ(k), D̃_c^Σ(k) | k = 0, 1, ...}. The corresponding coefficient matrices are denoted by C^Σ(k), C̃^Σ(k)
(for k = 0, 1, ...) and D^Σ(k), D̃^Σ(k) (for k = 0, 1, ...), respectively. In order to calculate these,
one needs to know the formula for the trace of a linear combination of the linear matrix operators
(½L_kR_l + ½L_lR_k)|_{Σ_m}, where L_k stands for multiplication on the left by A_c(k) and R_l stands for
multiplication on the right by A_c(l)^T, with k, l ∈ {0, 1, 2, ..., m−1}. This is given in the following
lemma.
Lemma 5.1 The trace of the linear matrix operator (½L_kR_l + ½L_lR_k)|_{Σ_m}, considered as an endomorphism of the space Σ_m of symmetric m × m matrices, is equal to ½ tr A_c(k) tr A_c(l) + ½ tr{A_c(k)A_c(l)}.
Proof. Choose the usual matrix inner product ⟨P, Q⟩ = tr{P^TQ} as inner product on Σ_m. (Of
course the choice of inner product does not affect the value of the trace.) An orthonormal basis
with respect to this inner product is given by {E_ii}_{i=1}^m ∪ {(E_ij + E_ji)/√2 | i > j}. It contains ½m(m+1)
elements, all of which are of course symmetric. The trace of (½L_kR_l + ½L_lR_k)|_{Σ_m} is now given by:

$$\sum_{i=1}^{m}\left\langle E_{ii}, (\tfrac12 L_kR_l + \tfrac12 L_lR_k)(E_{ii})\right\rangle + \sum_{i>j}\left\langle \frac{E_{ij}+E_{ji}}{\sqrt{2}},\ (\tfrac12 L_kR_l + \tfrac12 L_lR_k)\!\left(\frac{E_{ij}+E_{ji}}{\sqrt{2}}\right)\right\rangle$$
$$= \tfrac12\sum_{i=1}^{m}\operatorname{tr}\{E_{ii}A_c(k)E_{ii}A_c(l)^T\} + \tfrac12\sum_{i=1}^{m}\operatorname{tr}\{E_{ii}A_c(l)E_{ii}A_c(k)^T\}$$
$$\quad + \tfrac14\sum_{i>j}\operatorname{tr}\{E_{ij}A_c(k)E_{ij}A_c(l)^T + E_{ij}A_c(k)E_{ji}A_c(l)^T + E_{ji}A_c(k)E_{ij}A_c(l)^T + E_{ji}A_c(k)E_{ji}A_c(l)^T\}$$
$$\quad + \tfrac14\sum_{i>j}\operatorname{tr}\{E_{ij}A_c(l)E_{ij}A_c(k)^T + E_{ij}A_c(l)E_{ji}A_c(k)^T + E_{ji}A_c(l)E_{ij}A_c(k)^T + E_{ji}A_c(l)E_{ji}A_c(k)^T\}$$
$$= \sum_{i=1}^{m} e_i^TA_c(k)e_i\; e_i^TA_c(l)^Te_i + \tfrac12\sum_{i>j}\Big( e_j^TA_c(k)e_i\; e_j^TA_c(l)^Te_i + e_j^TA_c(k)e_j\; e_i^TA_c(l)^Te_i$$
$$\qquad\qquad + e_i^TA_c(k)e_i\; e_j^TA_c(l)^Te_j + e_i^TA_c(k)e_j\; e_i^TA_c(l)^Te_j\Big)$$

Working this out shows that this is equal to

$$\tfrac12\sum_{ij} e_i^TA_c(k)e_i\; e_j^TA_c(l)^Te_j + \tfrac12\sum_{ij} e_i^TA_c(k)e_j\; e_i^TA_c(l)^Te_j$$

which is clearly equal to ½ tr A_c(k) tr A_c(l) + ½ tr{A_c(k)A_c(l)}. □
The sequence of coefficient matrices C^Σ(k) is now given by

$$C^\Sigma(0) = E_{11}, \qquad \tilde{C}^\Sigma(0) = A_cC^\Sigma(0) + C^\Sigma(0)A_c^T \tag{5.1}$$

$$C^\Sigma(k) = \tilde{C}^\Sigma(k-1) - \frac{\tau_A^T\tilde{C}^\Sigma(k-1)\tau_A + \langle T_A, \tilde{C}^\Sigma(k-1)\rangle}{2k}\, E_{11}, \qquad \tilde{C}^\Sigma(k) = A_cC^\Sigma(k) + C^\Sigma(k)A_c^T, \qquad k = 1, 2, \ldots, \tfrac12 m(m+1)-1 \tag{5.2}$$

and Γ is obtained by using the formula

$$\Gamma = \frac{m(m+1)}{\tau_A^T\tilde{C}^\Sigma(\tfrac12 m(m+1)-1)\tau_A + \langle T_A, \tilde{C}^\Sigma(\tfrac12 m(m+1)-1)\rangle}\, C^\Sigma(\tfrac12 m(m+1)-1) \tag{5.3}$$

where ⟨X, Y⟩ = Σ_{ij}X_{ij}Y_{ij} denotes the inner product of two matrices X and Y of the same size, and
where T_A denotes the m × m matrix which has tr{A_c(k−1)A_c(l−1)} as its (k, l)th element.
Similarly the sequence of coefficient matrices D^Σ(k) is given recursively by

$$D^\Sigma(0) = E_{11}, \qquad \tilde{D}^\Sigma(0) = D^\Sigma(0) - A_cD^\Sigma(0)A_c^T \tag{5.4}$$

$$D^\Sigma(k) = \tilde{D}^\Sigma(k-1) - \frac{\tau_A^T\tilde{D}^\Sigma(k-1)\tau_A + \langle T_A, \tilde{D}^\Sigma(k-1)\rangle}{2k}\, E_{11}, \qquad \tilde{D}^\Sigma(k) = D^\Sigma(k) - A_cD^\Sigma(k)A_c^T, \qquad k = 1, 2, \ldots, \tfrac12 m(m+1)-1 \tag{5.5}$$

and ∆ is obtained from

$$\Delta = \frac{m(m+1)}{\tau_A^T\tilde{D}^\Sigma(\tfrac12 m(m+1)-1)\tau_A + \langle T_A, \tilde{D}^\Sigma(\tfrac12 m(m+1)-1)\rangle}\, D^\Sigma(\tfrac12 m(m+1)-1) \tag{5.6}$$
Both algorithms require just ½m(m+1) iterations to obtain Γ and ∆, instead of the previous m²
iterations. The calculation of the traces of the operators is more involved, but this does not
increase the overall complexity of the algorithms.
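As before, the recursion is straightforward to transcribe. The sketch below (ours, numerical; the Faddeev sequence of A_c could equally be taken from Theorem 4.1 instead of being recomputed) computes Γ via (5.1)–(5.3):

```python
import numpy as np

def gamma_symmetric(p):
    """Gamma solving A_c P + P A_c^T = E_11 via (5.1)-(5.3),
    in (1/2) m (m+1) iterations; p = [p1, ..., pm]."""
    m = len(p)
    Ac = np.zeros((m, m)); Ac[0, :] = -np.asarray(p, dtype=float)
    if m > 1:
        Ac[1:, :-1] = np.eye(m - 1)
    seq, Mk = [], np.eye(m)                 # Faddeev sequence A_c(0), ..., A_c(m-1)
    for k in range(1, m + 1):
        seq.append(Mk)
        Mt = Mk @ Ac
        Mk = Mt - (np.trace(Mt) / k) * np.eye(m)
    tauA = np.array([np.trace(S) for S in seq])                   # (3.25)
    TA = np.array([[np.trace(Sk @ Sl) for Sl in seq] for Sk in seq])
    E11 = np.zeros((m, m)); E11[0, 0] = 1.0
    C = E11.copy()
    Ct = Ac @ C + C @ Ac.T                                        # (5.1)
    for k in range(1, m * (m + 1) // 2):
        num = tauA @ Ct @ tauA + np.sum(TA * Ct)   # trace via Lemma 5.1
        C = Ct - (num / (2 * k)) * E11                            # (5.2)
        Ct = Ac @ C + C @ Ac.T
    return m * (m + 1) / (tauA @ Ct @ tauA + np.sum(TA * Ct)) * C # (5.3)
```

The extra work per step, compared to the general recursion, is the inner product with the fixed matrix T_A; this is computed once and reused in every iteration.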
6 Examples

In this section some outcomes of the algorithm are presented. In fact we will give the formula for
Γ in the case p(s) = q(s) in terms of the coefficients of p(s) for several values of m. One reason
for doing this, apart from showing what kind of results can be obtained with this algorithm, is
that by substitution of p_k = −tr Ã(k−1)/k one can obtain the matrix Γ for arbitrary parametrizations
of A; using our formulas, one can then relatively easily obtain the solution of the Sylvester or
Lyapunov equation involved from Γ. Because Γ is 'alternating-Hankel' it suffices to give a common
denominator, together with the elements of the first row and last column. Furthermore, because it
is symmetric as well, each element of the matrix for which i + j is odd is zero.
The formulae presented below all apply to the continuous-time case, because experience shows
that in the discrete-time case the outcomes are usually more involved. They have been calculated
on a 486-based personal computer. Once the formulae are available, it is possible within the Maple
and Mathematica software packages to substitute numerical values for the coefficients and then to
obtain the outcomes with prespecified numerical accuracy.
For m = 4 and A given by

$$A = \begin{pmatrix} -p_1 & -p_2 & -p_3 & -p_4 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \tag{6.1}$$

the coefficient matrix Γ satisfying AΓ + ΓA^T = E_{11} is calculated as:

$$\Gamma = \begin{pmatrix} r_1 & 0 & r_2 & 0 \\ 0 & -r_2 & 0 & -r_3 \\ r_2 & 0 & r_3 & 0 \\ 0 & -r_3 & 0 & -r_4 \end{pmatrix} \tag{6.2}$$

where

$$r_1 = \tfrac12(-p_2p_3 + p_1p_4)/d \tag{6.3}$$
$$r_2 = \tfrac12 p_3/d \tag{6.4}$$
$$r_3 = -\tfrac12 p_1/d \tag{6.5}$$
$$r_4 = \tfrac12(p_1p_2 - p_3)/(p_4d) \tag{6.6}$$

and where the polynomial d appearing in the denominators of r_1, ..., r_4 is given by

$$d = p_1p_2p_3 - p_3^2 - p_1^2p_4 \tag{6.7}$$
These formulae were obtained with Mathematica in about 15 seconds, using the general algorithm
which does not exploit the available symmetry. The direct approach with Kronecker products, using
the Mathematica routines 'LinearSolve' and 'Factor' (to obtain simplified results), took about 15
seconds as well. (However, in the discrete-time case the direct approach needed 5 minutes to obtain
the solution and many more minutes for simplification, while the algorithm of this paper was capable
of yielding the simplified results in less than 2 minutes.)
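The m = 4 formulae are easy to verify with a computer algebra system. The following snippet (our verification aid in Python/sympy, rather than the authors' Maple or Mathematica code) checks that (6.2)–(6.7) indeed satisfy AΓ + ΓA^T = E_{11}:

```python
import sympy as sp

p1, p2, p3, p4 = sp.symbols('p1 p2 p3 p4')
A = sp.Matrix([[-p1, -p2, -p3, -p4],
               [1, 0, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 1, 0]])
d = p1*p2*p3 - p3**2 - p1**2*p4                      # (6.7)
r1 = (-p2*p3 + p1*p4) / (2*d)                        # (6.3)
r2 = p3 / (2*d)                                      # (6.4)
r3 = -p1 / (2*d)                                     # (6.5)
r4 = (p1*p2 - p3) / (2*p4*d)                         # (6.6)
Gamma = sp.Matrix([[r1, 0, r2, 0],
                   [0, -r2, 0, -r3],
                   [r2, 0, r3, 0],
                   [0, -r3, 0, -r4]])                # (6.2)
E11 = sp.zeros(4, 4); E11[0, 0] = 1
# Prints the 4 x 4 zero matrix, confirming A*Gamma + Gamma*A^T = E_11:
print(sp.simplify(A*Gamma + Gamma*A.T - E11))
```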
For m = 7 and A given by

$$A = \begin{pmatrix} -p_1 & -p_2 & -p_3 & -p_4 & -p_5 & -p_6 & -p_7 \\ 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix} \tag{6.8}$$
the coefficient matrix Γ satisfying AΓ + ΓA^T = E_{11} is calculated as:

$$\Gamma = \begin{pmatrix} r_1 & 0 & r_2 & 0 & r_3 & 0 & r_4 \\ 0 & -r_2 & 0 & -r_3 & 0 & -r_4 & 0 \\ r_2 & 0 & r_3 & 0 & r_4 & 0 & r_5 \\ 0 & -r_3 & 0 & -r_4 & 0 & -r_5 & 0 \\ r_3 & 0 & r_4 & 0 & r_5 & 0 & r_6 \\ 0 & -r_4 & 0 & -r_5 & 0 & -r_6 & 0 \\ r_4 & 0 & r_5 & 0 & r_6 & 0 & r_7 \end{pmatrix} \tag{6.9}$$
where

$$r_1 = -\tfrac12(-p_1p_4^3p_7 - p_2^2p_4p_5p_7 + p_1p_4^2p_6p_5 + p_4^2p_7p_5 + p_6p_7^2 + p_6p_2^2p_5^2 + p_2^3p_7^2 - p_4p_6p_5p_2p_3 - 2p_6p_2^2p_7p_3 + p_4^2p_7p_2p_3 - p_1p_4p_6^2p_3 + p_6^2p_5p_3 - p_6p_4p_5^2 + p_1^2p_6^3 - 2p_7p_1p_6^2 - 2p_7^2p_4p_2 + 3p_7p_1p_6p_4p_2 - 2p_1p_5p_6^2p_2 + p_5p_6p_7p_2 + p_6^2p_2p_3^2)/d \tag{6.10}$$

$$r_2 = \tfrac12(p_6^2p_3^2 + p_7^2p_2^2 + p_5p_6p_7 - 2p_3p_7p_2p_6 + p_5^2p_2p_6 - p_1p_5p_6^2 - p_7^2p_4 - p_7p_5p_2p_4 + p_7p_1p_6p_4 - p_5p_6p_3p_4 + p_7p_4^2p_3)/d \tag{6.11}$$

$$r_3 = \tfrac12(-p_2p_7^2 - p_1p_4^2p_7 + p_4p_7p_5 + p_1p_4p_6p_5 - p_6p_5^2 + p_6p_7p_3 - p_1p_6^2p_3 + p_6p_7p_1p_2)/d \tag{6.12}$$

$$r_4 = \tfrac12(p_7^2 + p_6p_5p_3 - p_7p_3p_4 - 2p_6p_7p_1 - p_6p_2p_5p_1 + p_7p_2p_4p_1 + p_6^2p_1^2)/d \tag{6.13}$$

$$r_5 = -\tfrac12(p_6p_3^2 - p_2p_3p_7 + p_5p_7 + p_2^2p_7p_1 - p_2p_3p_6p_1 - p_5p_6p_1 - p_7p_1p_4 + p_6p_1^2p_4)/d \tag{6.14}$$

$$r_6 = -\tfrac12(-p_5^2 + 2p_1p_5p_4 - p_1^2p_4^2 - p_1p_6p_3 + p_7p_3 - p_3^2p_4 + p_1^2p_6p_2 - p_1p_7p_2 + p_5p_3p_2 + p_1p_4p_3p_2 - p_1p_5p_2^2)/d \tag{6.15}$$

$$r_7 = -\tfrac12(-p_5^2p_3p_2 + p_7p_3^2p_2 + p_1^3p_6^2 + p_1p_5^2p_2^2 + p_1p_5p_7p_2 + p_1^2p_4p_7p_2 - 2p_1^2p_5p_6p_2 - p_1p_5p_4p_3p_2 + p_1p_6p_3^2p_2 - p_7p_1p_3p_2^2 + p_4p_5p_3^2 + p_5^3 - p_6p_1^2p_4p_3 - 2p_5p_7p_3 + p_1^2p_4^2p_5 - 2p_1p_4p_5^2 + p_1p_7^2 - 2p_1^2p_7p_6 - p_6p_3^3 + 3p_5p_6p_1p_3)/(dp_7) \tag{6.16}$$

and where the polynomial d appearing in the denominators of r_1, ..., r_7 is given by

$$d = -p_7^2p_3p_2^2 + 2p_7p_1p_5p_4^2 + p_7p_3p_1p_2p_4^2 - p_3^2p_7p_4^2 + p_3p_7p_5p_2p_4 - p_7p_3p_1p_6p_4 + 3p_1^2p_7p_2p_6p_4 + 2p_7^2p_3p_4 - p_7p_5^2p_4 + p_1p_2p_3^2p_6^2 - p_1^2p_7p_4^3 - p_2p_3p_5^2p_6 - 2p_1^2p_2p_5p_6^2 - p_1^2p_3p_4p_6^2 + p_1^2p_4^2p_5p_6 - p_1p_2p_3p_4p_5p_6 + 3p_1p_3p_5p_6^2 - 3p_1^2p_7p_6^2 + p_5^3p_6 + p_1^3p_6^3 - p_3^3p_6^2 - p_7p_1p_5p_2^2p_4 - 3p_1p_7^2p_2p_4 - 2p_4p_1p_5^2p_6 + p_7p_1p_5p_2p_6 - 3p_3p_7p_5p_6 - 2p_7p_3p_1p_2^2p_6 + 2p_3^2p_7p_2p_6 + p_1p_5^2p_2^2p_6 + p_1p_7^2p_2^3 - p_7^3 + p_5p_7^2p_2 + 3p_1p_7^2p_6 + p_3^2p_4p_5p_6 \tag{6.17}$$
These formulae were obtained using Maple in less than 1 hour, again using the general algorithm
which does not exploit the symmetry. The direct approach using Kronecker products turned out
to be infeasible for this particular case m = 7, because it amounts to solving a linear system of
equations of size 49 × 49. As one may notice, the formulae for the case m = 4 can be reobtained
by restricting to the 4 × 4 left-upper block of Γ, substituting p5 = p6 = p7 = x, cancelling common
factors x and then setting x to zero.
7 Conclusions and further research

The algorithms for solving Lyapunov and Sylvester equations developed in this paper have turned out
to be well-suited for symbolic calculations. As the dimensions of the problem increase, the Faddeev-based recursive algorithms increasingly outperform a direct approach using Kronecker
products. If the algorithms are to be used for numerical calculation only, care should be taken,
because numerical round-off errors tend to rapidly destroy the accuracy of final and intermediate
results. (On a computer with 16-digit accuracy, the outcomes are generally unreliable for m or n
larger than 4.) However, if A and B^T are given in controller form, the analytic structure of the
Faddeev sequences of A and B is known, which may help to improve the numerical performance
of the algorithms. Symbolic formulae obtained with the algorithms (such as given in the previous
section for m = 4 and m = 7) could also be taken as a starting point, thus bypassing the numerically
most ill-conditioned computations.
If m = n, the numerical complexity of the Faddeev-based algorithms is easily shown to be
of order O(n⁴), both with respect to the Lyapunov or Sylvester equations and for the Faddeev
sequences of the matrices A and B. Since, in general, inversion of an n × n matrix requires an
algorithm of order O(n^{2.5}), it is interesting to note that the Kronecker product approach requires
inversion of an n² × n² matrix, thus amounting to a complexity of order O(n⁵). About the complexity
of the algorithm for symbolic computation little can be said. Here, complexity is mainly determined
by the representation of all intermediate quantities in the algorithm and therefore depends on the
parametrization of A, B and K. However, from the results of this paper we feel there is good
reason to believe that the Faddeev-based algorithms are well-suited for symbolically solving Sylvester
equations if A and B^T are in controller form.
The matrix-algebra approach of this paper, in conjunction with Faddeev's algorithm, has also
turned out to be a powerful instrument to gain more theoretical insight into the structure and the
solution of Lyapunov and Sylvester equations. In particular the role of the 'Faddeev reachability
matrix' has been commented upon, as it provides the basis for the state space of a linear SISO system
that leads to its controller canonical form. It also becomes manifest in the formulae which specify
how the solution to Lyapunov and Sylvester equations can be quickly obtained for an arbitrary right-hand side K, provided a solution is available for a right-hand side xy^T of rank one with (A, x) and
(B^T, y) reachable pairs. Such situations are quite common when working with balanced realizations,
both in the continuous-time and the discrete-time case.
Important simplifications have been shown to occur when A and B have the same characteristic
polynomial p(s) = q(s), of which the symmetric case where B = A^T is an example. An alternative
algorithm has been developed for this case, which requires only ½m(m+1) iterations instead of the
usual m².
The situation where A and B^T are in controller form and K = E_{11} has been recognized as
the central issue to be further investigated. Several alternative approaches for obtaining Γ and ∆
which exploit their alternating-Hankel and Toeplitz structure are currently being studied. Other
topics under research include the symbolic calculation of various Riemannian metrics on spaces of
linear systems (which are calculated by repeated solution of a number of Lyapunov equations, cf.
[9, 4]), including the Fisher information metric for stable linear SISO systems driven by Gaussian
white noise. Other subjects of interest are the generalization of Faddeev reachability matrices to
the multivariate case and the extension of the methods in this paper to linear matrix equations of
the more general form AP B + CP D = K.
References
[1] B. Friedland. Control System Design. McGraw-Hill, New York, 1986.
[2] F.R. Gantmacher. The Theory of Matrices, volume I, II. Chelsea, 1959.
[3] B. Hanzon. Some new results on and applications of an algorithm of Agashe. In R.F. Curtain,
editor, Modelling, Robustness and Sensitivity Reduction in Control Systems, pages 285–303.
Springer Verlag, Berlin, 1987.
[4] B. Hanzon. Identifiability, recursive identification and spaces of linear dynamical systems. CWI
Tracts 63 and 64. Centrum voor Wiskunde en Informatica (CWI), Amsterdam, 1989.
[5] B. Hanzon and J.M. Maciejowski. Constructive algebra methods for the L2 problem for stable
linear systems, with examples. Technical Report CUED/F-INFENG/TR 178, Department of
Engineering, Cambridge University, June 1994. Submitted to Automatica.
[6] T. Kailath. Linear Systems. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1980.
[7] R.E. Kalman, J. Coffy, and P. Nicholson. Méthodes algebriques modernes appliquées à la
theorie des systèmes linéaires. Technical report, Centre d’Automatique, Fontainebleau, 1969.
[8] R.J. Ober. Balanced realizations: Canonical form, parametrization, model reduction. International Journal of Control, 46:643–670, 1987.
[9] R.L.M. Peeters. System Identification Based on Riemannian Geometry: Theory and Algorithms,
volume 64 of Tinbergen Institute Research Series. Thesis Publishers, Amsterdam, 1994.
[10] R.L.M. Peeters and B. Hanzon. A balanced canonical form for discrete-time stable all-pass systems. In U. Helmke, R. Mennicken, and J. Saurer, editors, Systems and Networks: Mathematical
Theory and Applications, volume II, pages 417–420. Akademie Verlag, 1994.
Errata

To: Bernard Hanzon and Ralf L.M. Peeters, A Faddeev Sequence Method for Solving Lyapunov and
Sylvester Equations, Report M 95-01, Dept. Mathematics, University of Limburg, Maastricht, 1995.

• On p. 4, one line has been misprinted. It should read:

of {L_iR_j | i = 0, 1, 2, ..., m−1; j = 0, 1, 2, ..., n−1}. Note that the linear combination does not

• Formulae (3.8) and (3.9) in Theorem 3.2, p. 5, should read:

$$\tilde{c}_{0j}(k) = \sum_{i=0}^{m-1} \frac{\operatorname{tr}\tilde{A}(i)}{i+1}\, c_{ij}(k) + c_{0,j-1}(k) \qquad \text{if } j > 0 \tag{3.8}$$

$$\tilde{c}_{i0}(k) = \sum_{j=0}^{n-1} \frac{\operatorname{tr}\tilde{B}(j)}{j+1}\, c_{ij}(k) + c_{i-1,0}(k) \qquad \text{if } i > 0 \tag{3.9}$$

• In the proof of Theorem 3.2, p. 6, the calculations with respect to C̃(k) should read:

$$\tilde{C}(k) = \sum_{i=0}^{m-1}\sum_{j=0}^{n-1} c_{ij}(k)\left(L_{i+1}R_j + \frac{\operatorname{tr}\tilde{A}(i)}{i+1}L_0R_j + L_iR_{j+1} + \frac{\operatorname{tr}\tilde{B}(j)}{j+1}L_iR_0\right)$$
$$= \sum_{i=1}^{m-1}\sum_{j=1}^{n-1}\left(c_{i-1,j}(k) + c_{i,j-1}(k)\right)L_iR_j + \sum_{j=1}^{n-1}\left(\sum_{i=0}^{m-1}\frac{\operatorname{tr}\tilde{A}(i)}{i+1}c_{ij}(k) + c_{0,j-1}(k)\right)L_0R_j$$
$$\quad + \sum_{i=1}^{m-1}\left(\sum_{j=0}^{n-1}\frac{\operatorname{tr}\tilde{B}(j)}{j+1}c_{ij}(k) + c_{i-1,0}(k)\right)L_iR_0 + \left(\sum_{i=0}^{m-1}\frac{\operatorname{tr}\tilde{A}(i)}{i+1}c_{i0}(k) + \sum_{j=0}^{n-1}\frac{\operatorname{tr}\tilde{B}(j)}{j+1}c_{0j}(k)\right)L_0R_0$$

• and in the subsequent calculations one should accordingly have the new expressions as in formulae
(3.8) and (3.9) above.

• At the end of formula (4.6), p. 13, a factor A(k) should be added.

• Remark: on p. 13, the matrices A(k) and Ã(k) may alternatively be called Sylvester matrices, as
can be seen from their block-Toeplitz structure.