Numerical methods for Vandermonde systems with particular points distribution

T. Tommasini
Dipartimento di Matematica
Università di Bologna

A.A. 2002-2003
Revised February 2004
INDEX
1. Introduction
2. Inverse matrix, determinant and matrix-vector product
3. System solution
3.1. Primal system
3.2. Dual system
4. Parallel implementation and computational analysis
5. Errors and perturbation bounds
6. Rectangular Vandermonde Matrices
7. Least squares solution
8. Computational analysis
9. Perturbation bounds for the QRD factorization
Appendix
Conclusions
References
1. Introduction
In this paper we will consider column-wise Vandermonde matrices of order n = kq of the type

(1.1)   V(ω_1, ω_2, …, ω_n) =  [ 1          1          …   1          ]
                               [ ω_1        ω_2        …   ω_n        ]
                               [ ...        ...            ...        ]
                               [ ω_1^{n-1}  ω_2^{n-1}  …   ω_n^{n-1}  ]

where the points ω_{(j-1)k+s} (s = 1,…,k) are the k-th roots of q distinct complex numbers a_j ≠ 0 (j = 1,…,q), that is,

(1.2)   ω_{(j-1)k+t}^k = a_j ,   t = 1,…,k.

In other words, they are equidistant points on concentric circles in the complex plane. The transpose of V is called a row-wise Vandermonde matrix, and it arises in the polynomial interpolation problem on particular points distributions.
We will also consider the simpler case of a Vandermonde matrix of order n = 2q (k = 2), where ±ω_j (j = 1,…,q) are q distinct pairs of nonzero complex numbers, the two square roots of the complex numbers a_j (j = 1,…,q):

(1.3)   V(-ω_1, ω_1, …, -ω_q, ω_q) =  [ 1             1          …   1             1          ]
                                      [ -ω_1          ω_1        …   -ω_q          ω_q        ]
                                      [ ...           ...            ...           ...        ]
                                      [ (-ω_1)^{n-1}  ω_1^{n-1}  …   (-ω_q)^{n-1}  ω_q^{n-1}  ]

To simplify the notation we use the same index for the two roots of a_j, which are opposite in sign. We will call this the symmetric case.
In the next pages we derive a compact expression of V by means of the Kronecker product and
then formulas and methods for the solution of the following problems:
a. Inverse matrix and determinant
b. Primal and dual system solution
c. Generalized inverse
d. Least squares problem
e. Errors and perturbation bounds.
In the general case (1.1), if we call V_j the k-order matrices

(1.4)   V_j =  [ 1                    1                    …   1                ]
               [ ω_{(j-1)k+1}         ω_{(j-1)k+2}         …   ω_{jk}           ]
               [ ...                  ...                      ...              ]
               [ ω_{(j-1)k+1}^{k-1}   ω_{(j-1)k+2}^{k-1}   …   ω_{jk}^{k-1}     ] ,   j = 1,…,q,

which in the symmetric case (k = 2, n = 2q) are simply

(1.5)   V_j =  [ 1     1   ]
               [ -ω_j  ω_j ] ,   j = 1,…,q,
then the matrix V can be written in the block form (see [10]) as

(1.6)   V =  [ V_1            V_2            …   V_q            ]
             [ a_1 V_1        a_2 V_2        …   a_q V_q        ]
             [ ...            ...                ...            ]
             [ a_1^{q-1} V_1  a_2^{q-1} V_2  …   a_q^{q-1} V_q  ] .
Let A_q = {a_j^{i-1}}_{i=1,…,q; j=1,…,q} denote the q×q column-wise Vandermonde matrix built on the points a_j (j = 1,…,q). By means of the Kronecker product it is possible to express V as

(1.7)   V = (A_q ⊗ I_k) D ,

where I_k is the identity matrix of order k and D is the block diagonal matrix

(1.8)   D = diag(V_1, V_2, …, V_q) .

From (1.7) we obtain the following formulation of the conjugate transpose of V:

(1.9)   V^H = D^H (A_q^H ⊗ I_k) .

In the symmetric case the expressions (1.7), (1.8) and (1.9) hold with k = 2 and with the V_j in D given by (1.5).
The extension to the rectangular case is discussed in Section 6.
2. Inverse matrix, determinant and matrix-vector product
From the previous formulation and from well-known Kronecker product properties (see [4], [5]), the inverse matrices of V and V^H can be expressed as

(2.1)   V^{-1} = D^{-1} (A_q^{-1} ⊗ I_k) ,

(2.2)   V^{-H} = (A_q^{-H} ⊗ I_k) D^{-H} ,

where

(2.3)   D^{-1} = diag(V_1^{-1}, V_2^{-1}, …, V_q^{-1}) ,

(2.4)   D^{-H} = diag(V_1^{-H}, V_2^{-H}, …, V_q^{-H}) ,

involving the inversion of the Vandermonde matrices A_q and A_q^H of order q = n/k (q = n/2 in the symmetric case) and of the well-known k-order matrices V_j and their conjugate transposes.
Moreover, we can use formula (1.7) to evaluate the determinant of the Vandermonde matrix V. By means of the Binet theorem and of a property of the Kronecker product it is possible to write

(2.5)   det(V) = det[(A_q ⊗ I_k) D] = det(A_q)^k det(D) ,

where

(2.6)   det(D) = Π_{j=1,…,q} det(V_j) .

In other words, the determinant of V may be obtained from the determinant of the reduced-dimension matrix A_q, with a computational cost of order q²/2. The computation of det(D) is of order qk²/2 and is not relevant when k << q.
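As an illustration, the following minimal NumPy sketch (ours, not part of the original text; it fixes the principal k-th root of each a_j when building the points, which is one admissible choice) checks (2.5) on a small random example:

```python
import numpy as np

rng = np.random.default_rng(0)
k, q = 3, 4
n = k * q

# q distinct nonzero complex numbers a_j and all of their k-th roots
a = rng.standard_normal(q) + 1j * rng.standard_normal(q)
eps = np.exp(2j * np.pi * np.arange(k) / k)              # k-th roots of unity
omega = np.concatenate([aj ** (1.0 / k) * eps for aj in a])

V = np.vander(omega, n, increasing=True).T               # column-wise Vandermonde (1.1)
Aq = np.vander(a, q, increasing=True).T                  # reduced q x q matrix
blocks = [np.vander(omega[j*k:(j+1)*k], k, increasing=True).T for j in range(q)]

det_direct = np.linalg.det(V)
det_factored = np.linalg.det(Aq) ** k * np.prod([np.linalg.det(B) for B in blocks])
print(abs(det_direct - det_factored) / abs(det_direct))  # small relative error
```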
We can also use formula (1.7) to compute the matrix-vector product p = Vu, where u is an arbitrary n-dimensional vector:

(2.7)   p = Vu = (A_q ⊗ I_k) Du .

Let z = Du; we call Z the q×k matrix obtained from the vector z row-wise, that is,

(2.8)   z = vect(Z) ,   Z =  [ z_1           …   z_k    ]
                             [ z_{k+1}       …   z_{2k} ]
                             [ ...               ...    ]
                             [ z_{(q-1)k+1}  …   z_{qk} ] ,

and in the same way we call P the q×k matrix obtained from the vector p row-wise, p = vect(P). Then

(2.9)   p = vect(P) = (A_q ⊗ I_k) z = (A_q ⊗ I_k) vect(Z) .

If M, N and Q^T are matrices of dimensions such that their product is defined, and vect(·) denotes the vector obtained from a matrix by rows, then (see [1])

(2.10)   vect(M N Q^T) = (M ⊗ Q) vect(N) .

Applying (2.10) to (2.9), the matrix-vector multiplication p = Vu can be expressed as

(2.11)   p = vect(P) = vect(A_q Z) .
Let us consider the following partition of the vectors u and z into subvectors of dimension k:

(2.12)   u = (u_1^T, u_2^T, …, u_q^T)^T ,   z = (z_1^T, z_2^T, …, z_q^T)^T .

Then, in order to reduce the computational effort, the product p = Vu can be computed in the following way:

Algorithm 2.1
   ω_{(j-1)k+t}^k = a_j ,   t = 1,…,k ,  j = 1,…,q
   A_q = {a_j^{i-1}}_{i=1,…,q; j=1,…,q} ,   V_j (j = 1,…,q) as in (1.4)
   u = (u_1^T, …, u_q^T)^T
   Compute z_j = V_j u_j ,  j = 1,…,q ;  form Z from z = vect(Z) (by rows)
   Compute P = A_q Z ;  read off p = vect(P) (by rows)

This means that the product Vu may be computed by executing the product A_q Z, with a computational cost c_new = 2kq² + 2k²q instead of c = 2n² = 2k²q². Disregarding the lower-order cost 2k²q of the computation of z = Du (generally k << q), an approximate speed-up factor c/c_new ≈ k is obtained.
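A minimal NumPy sketch of Algorithm 2.1 follows (the function name and the principal-root construction of the points are our illustrative assumptions):

```python
import numpy as np

def fast_vandermonde_matvec(a, k, u):
    """Compute p = V u as in (2.11); a minimal sketch of Algorithm 2.1."""
    a = np.asarray(a, dtype=complex)
    u = np.asarray(u, dtype=complex)
    q = len(a)
    eps = np.exp(2j * np.pi * np.arange(k) / k)       # k-th roots of unity
    Z = np.empty((q, k), dtype=complex)
    for j in range(q):                                # z_j = V_j u_j, block by block
        roots_j = a[j] ** (1.0 / k) * eps             # the k roots of a_j
        Vj = np.vander(roots_j, k, increasing=True).T
        Z[j] = Vj @ u[j*k:(j+1)*k]
    Aq = np.vander(a, q, increasing=True).T           # reduced q x q matrix
    P = Aq @ Z                                        # 2kq^2 operations instead of 2n^2
    return P.reshape(-1)                              # p = vect(P), row-wise
```

Only the q×q matrix A_q and the k×k blocks V_j are ever formed, which is where the factor-k saving comes from.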
3. System solution
3.1. Primal system
We now consider linear systems of equations of the type

(3.1.1)   V x = b ,

where V is the matrix (1.1) and x, b are n-dimensional vectors. Using the expression (2.1) of the inverse of V, we can write the solution x of (3.1.1) as

(3.1.2)   x = D^{-1} (A_q^{-1} ⊗ I_k) b .
In order to reduce the computational effort of the solution, we call B the q×k matrix obtained from the vector b row-wise,

(3.1.3)   b = vect(B) ,   B =  [ b_1           …   b_k    ]
                               [ b_{k+1}       …   b_{2k} ]
                               [ ...               ...    ]
                               [ b_{(q-1)k+1}  …   b_{qk} ]  = [b_{·1}, b_{·2}, …, b_{·k}] ,

and then, using some known formulas for the Kronecker product (see [1]), we obtain the following expression for x:

(3.1.4)   x = D^{-1} (A_q^{-1} ⊗ I_k) vect(B) = D^{-1} vect(A_q^{-1} B) .
In [10], starting from this formulation of the solution, a less expensive algorithm to compute x is derived as follows.
Let C = A_q^{-1} B, and let c_{·i} and b_{·i} denote the columns of C and B respectively; then we may compute C by solving the k Vandermonde systems

(3.1.5)   A_q c_{·i} = b_{·i} ,   i = 1,…,k ,

with the Björck-Pereyra algorithm.
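Since both the primal and the dual reduced systems in this paper are solved by the Björck-Pereyra algorithm, we record here a minimal NumPy transcription of its two variants (our transcription of the classical schemes of [2], [4]; the index bookkeeping should be checked against those references before serious use):

```python
import numpy as np

def bp_primal(x, b):
    """Solve the primal Vandermonde system V z = b, with V[i, j] = x[j]**i."""
    n = len(x)
    z = np.array(b, dtype=complex)
    for k in range(n - 1):                    # progressive elimination
        for i in range(n - 1, k, -1):
            z[i] -= x[k] * z[i - 1]
    for k in range(n - 2, -1, -1):            # back substitution
        for i in range(k + 1, n):
            z[i] /= x[i] - x[i - k - 1]
        for i in range(k, n - 1):
            z[i] -= z[i + 1]
    return z

def bp_dual(x, f):
    """Solve the dual system V^T y = f (Newton divided differences + Horner)."""
    n = len(x)
    y = np.array(f, dtype=complex)
    for k in range(n - 1):                    # divided differences
        for i in range(n - 1, k, -1):
            y[i] = (y[i] - y[i - 1]) / (x[i] - x[i - k - 1])
    for k in range(n - 2, -1, -1):            # Horner-type recombination
        for i in range(k, n - 1):
            y[i] -= x[k] * y[i + 1]
    return y
```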
Let c = vect(C), and consider the following partition of the vectors c and x into k-dimensional subvectors c_j (j = 1,…,q) and x_j (j = 1,…,q) respectively; then we may solve the q systems of order k

(3.1.6)   V_j x_j = c_j ,   j = 1,…,q .

Moreover,

(3.1.7)   V_j = B_j E ,   j = 1,…,q ,

where B_j = diag(1, ω_j, …, ω_j^{k-1}) is a diagonal matrix (ω_j = ω_{(j-1)k+1} is a fixed k-th root of a_j) and E = V(ε_1, ε_2, …, ε_k) is the Vandermonde matrix on the k-th roots ε_i (i = 1,…,k) of 1. Since E^{-1} = (1/k) E^H, instead of (3.1.6) we may compute the DFT of the vectors d_j = B_j^{-1} c_j (j = 1,…,q):

(3.1.8)   x_j = (1/k) E^H d_j ,   j = 1,…,q .
Given the Vandermonde matrix V (1.1) and the right-hand side vector b, the algorithm for the solution of the primal system on the special points distribution satisfying (1.2) can be summarized in this way:

Algorithm SV
   b = vect(B) ,   B = [b_{·1}, b_{·2}, …, b_{·k}]
   C = A_q^{-1} B ,   c = vect(C) ,   C = [c_{·1}, c_{·2}, …, c_{·k}]
   c = (c_1^T, …, c_q^T)^T ,   x = (x_1^T, …, x_q^T)^T
   B_j = diag(1, ω_j, …, ω_j^{k-1}) ,   j = 1,…,q   (ω_j a fixed k-th root of a_j)
   E = V(ε_1, ε_2, …, ε_k)   (ε_i, i = 1,…,k, the k-th roots of 1)
   Solve with the Björck-Pereyra algorithm:   A_q c_{·i} = b_{·i} ,   i = 1,…,k
   Compute:   d_j = B_j^{-1} c_j ,   j = 1,…,q
   Compute by DFT:   x_j = (1/k) E^H d_j ,   j = 1,…,q
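A compact NumPy sketch of Algorithm SV, assuming the bp_primal routine above is in scope (the block ordering ω_j ε_t of the points and the principal-root choice for ω_j are our assumptions; with the standard root ordering the last step is a scaled length-k DFT):

```python
import numpy as np

def solve_primal_sv(a, k, b):
    """Sketch of Algorithm SV for V x = b, with V as in (1.1)-(1.2)."""
    a = np.asarray(a, dtype=complex)
    q = len(a)
    eps = np.exp(2j * np.pi * np.arange(k) / k)       # k-th roots of unity
    omega = a ** (1.0 / k)                            # one fixed k-th root of each a_j
    B = np.asarray(b, dtype=complex).reshape(q, k)    # b = vect(B), row-wise
    # k reduced Vandermonde systems A_q c_i = b_i, eq. (3.1.5)
    C = np.column_stack([bp_primal(a, B[:, i]) for i in range(k)])
    # d_j = B_j^{-1} c_j, with B_j = diag(1, omega_j, ..., omega_j^{k-1})
    Dmat = C / omega[:, None] ** np.arange(k)[None, :]
    # x_j = (1/k) E^H d_j: one order-k DFT per block, eq. (3.1.8)
    E = np.vander(eps, k, increasing=True).T          # E[i, t] = eps_t^i
    X = Dmat @ np.conj(E) / k                         # row j holds x_j^T
    return X.reshape(-1)
```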
In the symmetric case, when k = 2 and V is the matrix (1.3), we have to solve only two Vandermonde systems (3.1.5) of halved dimension q = n/2. Then we may use the inverse of V_j, obtained directly from (1.5), to compute the solution as

(3.1.9)   x_j = V_j^{-1} c_j = (1/2) [ 1   -1/ω_j ] c_j ,   j = 1,…,q .
                                     [ 1    1/ω_j ]
Given the Vandermonde matrix V (1.3) and the right-hand side vector b, the algorithm for the primal system solution on a symmetric points distribution can be expressed as

Algorithm SSV
   b = vect(B) ,   B = [b_{·1}, b_{·2}]
   C = A_q^{-1} B ,   c = vect(C) ,   C = [c_{·1}, c_{·2}]
   c = (c_1^T, …, c_q^T)^T ,   x = (x_1^T, …, x_q^T)^T
   V_j = [ 1     1   ]
         [ -ω_j  ω_j ] ,   j = 1,…,q
   Solve with the Björck-Pereyra algorithm:   A_q c_{·i} = b_{·i} ,   i = 1, 2
   Compute:   x_j = V_j^{-1} c_j ,   j = 1,…,q
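A minimal sketch of Algorithm SSV under the same assumptions (ω_j taken as the principal square root of a_j, blocks ordered as (ω_j, -ω_j) is replaced here by the explicit form (3.1.9), which does not depend on the sign convention once V is built consistently):

```python
import numpy as np

def solve_primal_ssv(a, b):
    """Sketch of Algorithm SSV: V x = b with points ±w_j, w_j**2 = a_j (k = 2)."""
    a = np.asarray(a, dtype=complex)
    q = len(a)
    omega = np.sqrt(a)                                # one square root of each a_j
    B = np.asarray(b, dtype=complex).reshape(q, 2)
    C = np.column_stack([bp_primal(a, B[:, i]) for i in range(2)])
    # x_j = V_j^{-1} c_j with V_j^{-1} = (1/2) [[1, -1/w_j], [1, 1/w_j]], eq. (3.1.9)
    X = np.empty_like(C)
    X[:, 0] = 0.5 * (C[:, 0] - C[:, 1] / omega)
    X[:, 1] = 0.5 * (C[:, 0] + C[:, 1] / omega)
    return X.reshape(-1)
```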
Alternatively, it is possible to compute the inverse of A_q directly, for example with the Parker-Traub algorithm, and then the system solution is again

(3.1.10)   x = D^{-1} (A_q^{-1} ⊗ I_k) vect(B) = D^{-1} vect(A_q^{-1} B) .

Therefore a second algorithm for the primal system solution may be the following:

Algorithm SV1
   1. compute the inverse of A_q with the Parker-Traub algorithm;
   2. compute the matrix product A_q^{-1} B;
   3. compute x by DFT using (3.1.7) and (3.1.8).
3.2. Dual system
In a similar way, if we consider the dual system

(3.2.1)   V^H y = f ,

where V is the Vandermonde matrix of order n = kq given by (1.1) and f the right-hand side vector, we can write the solution vector y using the expression of the inverse of V^H:

(3.2.2)   y = (A_q^{-H} ⊗ I_k) D^{-H} f ,

where D^{-H} = diag(V_1^{-H}, V_2^{-H}, …, V_q^{-H}). If we set

(3.2.3)   g = D^{-H} f ,   g = vect(G) ,

where G is the q×k matrix obtained from the vector g row-wise, then (3.2.2) may be written as

(3.2.4)   y = (A_q^{-H} ⊗ I_k) vect(G) = vect(A_q^{-H} G) .
In order to compute g, we consider the following partition of g and f into k-dimensional subvectors,

(3.2.5)   g = (g_1^T, …, g_q^T)^T ,   f = (f_1^T, …, f_q^T)^T ,

and the q Vandermonde systems of order k

(3.2.6)   V_j^H g_j = f_j ,   j = 1,…,q .

Using (3.1.7) in (3.2.6), we obtain

(3.2.7)   E^H B_j^H g_j = f_j ,   j = 1,…,q .

Setting h_j = B_j^H g_j (j = 1,…,q), we can compute h_j by the DFT algorithm,

(3.2.8)   h_j = (1/k) E f_j ,   j = 1,…,q ,

and then g_j = B_j^{-H} h_j (j = 1,…,q).
Then, if y = vect(Y), (3.2.4) is equivalent to

(3.2.9)   Y = A_q^{-H} G .

Considering the columns of the matrices Y and G,

(3.2.10)   Y = [y_{·1}, y_{·2}, …, y_{·k}] ,   G = [g_{·1}, g_{·2}, …, g_{·k}] ,

(3.2.9) may be written as the k dual Vandermonde systems of order q = n/k

(3.2.11)   A_q^H y_{·i} = g_{·i} ,   i = 1,…,k ,

which can be solved by the Björck-Pereyra algorithm.
Given the Vandermonde matrix V^H and the right-hand side vector f, the algorithm for the solution of the dual system (3.2.1) can be summarized in this way:

Algorithm SVH
   y = vect(Y) ,   g = D^{-H} f ,   g = vect(G)
   g = (g_1^T, …, g_q^T)^T ,   f = (f_1^T, …, f_q^T)^T
   B_j = diag(1, ω_j, …, ω_j^{k-1}) ,   j = 1,…,q   (ω_j a fixed k-th root of a_j)
   E = V(ε_1, ε_2, …, ε_k)   (ε_i, i = 1,…,k, the k-th roots of 1)
   h_j = B_j^H g_j ,   j = 1,…,q
   Y = [y_{·1}, y_{·2}, …, y_{·k}] ,   G = [g_{·1}, g_{·2}, …, g_{·k}]
   Compute by DFT:   h_j = (1/k) E f_j ,   j = 1,…,q
   Compute:   g_j = B_j^{-H} h_j ,   j = 1,…,q
   Solve with the Björck-Pereyra algorithm:   A_q^H y_{·i} = g_{·i} ,   i = 1,…,k
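A sketch of Algorithm SVH under the same assumptions as before, reusing bp_dual from above (note that A_q^H y = g is a dual Vandermonde system on the conjugated points, since A_q^H is the transpose of the Vandermonde matrix built on the conjugates of the a_j):

```python
import numpy as np

def solve_dual_svh(a, k, f):
    """Sketch of Algorithm SVH for V^H y = f, with V as in (1.1)-(1.2)."""
    a = np.asarray(a, dtype=complex)
    q = len(a)
    eps = np.exp(2j * np.pi * np.arange(k) / k)
    omega = a ** (1.0 / k)                            # one fixed k-th root of each a_j
    F = np.asarray(f, dtype=complex).reshape(q, k)    # f partitioned row-wise
    E = np.vander(eps, k, increasing=True).T          # E[i, t] = eps_t^i
    H = F @ E.T / k                                   # h_j = (1/k) E f_j, eq. (3.2.8)
    # g_j = B_j^{-H} h_j, with B_j^H = diag(conj(1, w_j, ..., w_j^{k-1}))
    G = H / np.conj(omega[:, None] ** np.arange(k)[None, :])
    # A_q^H y_i = g_i: dual systems on the conjugated points, eq. (3.2.11)
    Y = np.column_stack([bp_dual(np.conj(a), G[:, i]) for i in range(k)])
    return Y.reshape(-1)
```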
In the symmetric case, the q systems (3.2.6) can be solved easily by means of the expression of the inverse of V_j^H:

(3.2.12)   g_j = V_j^{-H} f_j ,   j = 1,…,q .

This method for the solution of the dual system on a symmetric points distribution can be summarized in this way:
Algorithm SSVH
   y = vect(Y) ,   g = D^{-H} f ,   g = vect(G)
   g = (g_1^T, …, g_q^T)^T ,   f = (f_1^T, …, f_q^T)^T
   Y = [y_{·1}, y_{·2}] ,   G = [g_{·1}, g_{·2}]
   V_j = [ 1     1   ]
         [ -ω_j  ω_j ] ,   j = 1,…,q
   Compute:   g_j = V_j^{-H} f_j ,   j = 1,…,q
   Solve with the Björck-Pereyra algorithm:   A_q^H y_{·i} = g_{·i} ,   i = 1, 2
In the same manner as for the primal system, a second algorithm for the solution of (3.2.1) may be the following:

Algorithm SVH1
   1. compute the inverse of A_q^H with the Parker-Traub algorithm;
   2. compute g by DFT using (3.2.7) and (3.2.8);
   3. compute the matrix product A_q^{-H} G as in (3.2.9).
4. Parallel implementation and computational analysis
The compact matrix formulation of the algorithms SV, SSV and SVH, SSVH derived in the previous sections is particularly suitable for parallel implementation. We therefore suppose that k parallel processors are available, with k << q and, to simplify, q = kl; moreover, every processor may be vectorial. Under these hypotheses the parallel formulation of the algorithm SV can be expressed in the following way:

Algorithm PARSV
   1. compute simultaneously on the k processors the k columns c_{·i} (i = 1,…,k) of C with the Björck-Pereyra algorithm, applied to the systems (3.1.5);
   2. compute simultaneously k subvectors d_j = B_j^{-1} c_j (j = 1,…,kl) in l major steps;
   3. compute simultaneously the DFTs of k vectors d_j; after l major steps all the q systems are solved.
The parallel implementation of the dual system solution algorithm is analogous. For both the primal and the dual system, when k = 2, the parallel version also simplifies.
It is well known that an n×n Vandermonde system can be solved sequentially with O(n²) operations. The sequential implementation of the SV algorithm requires the following number of operations:

(4.1)   C_1 = αkq² + βqk log k + qk ,

where α and β are constants. With k processors the parallel algorithm PARSV requires

(4.2)   C_k = αq² + βq log k + q ,

that is, on k processors a theoretical speed-up

(4.3)   S_k = C_1/C_k = k

is obtained and, consequently, maximum efficiency.
Moreover, it is important to underline that the previous algorithm SV is competitive also in sequential computation. Indeed, if we consider the speed-up of its implementation with respect to the Björck-Pereyra algorithm, the following gain is obtained:

(4.4)   S = αk²q²/(αkq² + βqk log k + qk)
          = αkq/(αq + β log k + 1)
          = k (1 + (β log k + 1)/(αq))^{-1} .

If we use, for example, the coefficients α = 5/2 and β = 5 of the known solution algorithms (see [2], [4]), the following values for the number of operations (×, /, +, -) are obtained:

(4.5)   C_k = (5/2)q² + 5q log k + q ,

(4.6)   C_1 = (5/2)kq² + 5qk log k + qk ,

(4.7)   S = k (1 + 2(5 log k + 1)/(5q))^{-1} .

This also means that the sequential implementation of the presented algorithm is about k times faster than the Björck-Pereyra algorithm for Vandermonde matrices of this type when k << q. In other words, the asymptotic speed-up is k.
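For example, taking k = 8 and q = 512 (so n = kq = 4096) and logarithms to base 2 (log k = 3), formula (4.7) gives S = 8·(1 + 2(5·3 + 1)/(5·512))^{-1} = 8/1.0125 ≈ 7.9, already close to the asymptotic value k = 8.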
Similar considerations apply in the symmetric case k = 2. In practice we solve two Vandermonde systems of halved dimension q = n/2 instead of the n-order system (3.1.1), and the cost of the computation of the x_j is negligible. Therefore, whatever O(n²) algorithm we use for the Vandermonde system solution, the computational cost is one half of the original one (speed-up 2). If we use the procedure given in [2], of complexity 5q²/2, the SSV algorithm requires 5q² multiplications.
Recently, some new procedures of order n (log n)² have been devised for the solution of Vandermonde systems. By using them within the proposed method, and omitting the computational cost of the small systems solution (see (3.1.6) and (3.2.6)), we obtain a speed-up of

(4.8)   S' = [log n / log(n/2)]² .

Moreover, step 1 can be performed in parallel using 2 processors with a speed-up (k = 2) S_k = 2 with respect to the sequential implementation. We observe that also the less expensive step 2 may be executed in parallel.
When the ω_j (j = 1,…,q) are real and the b_j (j = 1,…,n) are real, all the steps may be performed using only real arithmetic.
On the other hand, the number of operations (×, /) of the algorithm SV1 is

(4.9)   C_1^(1) = (k+6)q² + βqk log k + qk ;

with k processors the computational cost is

(4.10)   C_k^(1) = 7q² + βq log k + q .

Therefore, when k is small with respect to q, this algorithm turns out to be more expensive.
Similar remarks as for the algorithms SV, SSV can be made on the computational cost of the dual system algorithms SVH and SSVH.
5. Errors and perturbation bounds
Now, starting from the compact expression of the Vandermonde matrix V given by (1.7) and using the exact decomposition A_q = U_q L_q, we can write

(5.1)   V = (U_q ⊗ I_k)(L_q ⊗ I_k) D .

In order to evaluate the perturbation errors involved in using this decomposition of A_q, we can consider the perturbed matrix factorization

(5.2)   A_q + ΔA_q = (U_q + ΔU_q)(L_q + ΔL_q)

and write

(5.3)   V + ΔV = (U + ΔU)(L + ΔL) D ,

where

(5.4)   L = L_q ⊗ I_k ,   ΔL = ΔL_q ⊗ I_k ,

(5.5)   U = U_q ⊗ I_k ,   ΔU = ΔU_q ⊗ I_k .

The matrix D is considered free of error.
Considering the norms t = 1, 2, ∞, it is easy to verify that

(5.6)   ||L||_t = ||L_q||_t ,  ||U||_t = ||U_q||_t ,  ||ΔL||_t = ||ΔL_q||_t ,  ||ΔU||_t = ||ΔU_q||_t ,   t = 1, 2, ∞ .
If we use the Frobenius norm instead, a norm property of the Kronecker product yields

(5.7)   ||L||_F = ||L_q||_F ||I_k||_F = √k ||L_q||_F ,   ||U||_F = ||U_q||_F ||I_k||_F = √k ||U_q||_F .

Similar relations hold for ||ΔL||_F and ||ΔU||_F. Therefore we can write the equalities

(5.8)   ||ΔL||_F / ||L||_F = ||ΔL_q||_F / ||L_q||_F ,   ||ΔU||_F / ||U||_F = ||ΔU_q||_F / ||U_q||_F .

In other words, formula (5.6) affirms that the error norm (norms t = 1, 2, ∞) of the UL decomposition of V equals the UL decomposition error norm of a Vandermonde matrix of order q = n/k. In the Frobenius norm there is a factor √k (see (5.7)); by (5.8), the relative factorization errors of a Vandermonde matrix of order n = kq are equal to those of the decomposition of a Vandermonde matrix of dimension q = n/k.
This result is interesting because the ill-conditioning of Vandermonde matrices increases with their order. In the symmetric case (k = 2, n = 2q), A_q is a matrix of order reduced by a factor 2. The same considerations hold for V^H.
Moreover, an upper bound for the condition number of A_q in the norms t = 1, 2, ∞ may be computed. From (1.7), applying the previous norm properties, we can write

(5.9)   ||A_q||_t = ||A_q ⊗ I_k||_t = ||V D^{-1}||_t ≤ ||V||_t ||D^{-1}||_t ,   t = 1, 2, ∞ .

In a similar way, from (2.1) it follows that

(5.10)   ||A_q^{-1}||_t ≤ ||V^{-1}||_t ||D||_t ,   t = 1, 2, ∞ .

Hence the following upper bound on the condition number of A_q is obtained:

(5.11)   cond_t(A_q) ≤ cond_t(D) cond_t(V) ,   t = 1, 2, ∞ .

In a similar way an upper bound on the condition number of A_q may be computed using the Frobenius norm. From (1.7), applying the properties of the Frobenius norm, we can write

(5.12)   ||A_q ⊗ I_k||_F = ||A_q||_F ||I_k||_F = ||V D^{-1}||_F ≤ ||V||_F ||D^{-1}||_F

and hence

(5.13)   ||A_q||_F ≤ (1/√k) ||V||_F ||D^{-1}||_F .

In a similar way from (2.1) it follows that

(5.14)   ||A_q^{-1}||_F ≤ (1/√k) ||V^{-1}||_F ||D||_F .

Hence the following upper bound on the condition number of A_q is obtained:

(5.15)   cond_F(A_q) ≤ (1/k) cond_F(D) cond_F(V) .
Interesting tests and results on the stability of the Björck-Pereyra algorithms are contained in [6] and [7], for the case in which the points defining the matrix are non-negative and arranged in increasing order.
As shown in Section 4, the procedures SV and SVH have an approximate computational reduction factor of k with respect to any O(n²) algorithm used for the Vandermonde system solution; the speed-up factor is 2 for SSV and SSVH. Moreover, in our experience, in addition to the advantage of lower computing time, these methods generally give more accuracy, in agreement with the expectation derived from the bounds (5.11) and (5.15). Some numerical results are included in the Appendix.
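A small NumPy check of the bound (5.15) on a random example (illustrative only; the principal-root construction of the points is our assumption):

```python
import numpy as np

rng = np.random.default_rng(1)
k, q = 4, 6
a = rng.standard_normal(q) + 1j * rng.standard_normal(q)   # distinct nonzero a_j
eps = np.exp(2j * np.pi * np.arange(k) / k)
omega = np.concatenate([aj ** (1.0 / k) * eps for aj in a])

V = np.vander(omega, k * q, increasing=True).T              # n x n, n = kq
Aq = np.vander(a, q, increasing=True).T
D = np.zeros((k * q, k * q), dtype=complex)
for j in range(q):
    D[j*k:(j+1)*k, j*k:(j+1)*k] = np.vander(omega[j*k:(j+1)*k], k, increasing=True).T

def cond_fro(M):
    return np.linalg.norm(M, 'fro') * np.linalg.norm(np.linalg.inv(M), 'fro')

print(cond_fro(Aq) <= cond_fro(D) * cond_fro(V) / k)        # expected: True
```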
6. Rectangular Vandermonde Matrices
The expression of V by means of the Kronecker product given in (1.7) can be extended to rectangular Vandermonde matrices. Starting from this decomposition, we derive a formula for a factorization of QRD type of a column-wise rectangular matrix built on a particular points distribution in the complex plane, and we obtain an algorithm for the least squares solution of an overdetermined Vandermonde system of the type

(6.1)   V x = b ,

where x and b are an n-vector and an m-vector respectively and V is the m×n complex Vandermonde matrix

(6.2)   V = {ω_j^{i-1}}_{i=1,…,m; j=1,…,n} ,

whose n = kq defining points ω_j (j = 1,…,n) are the k-th roots of q distinct complex numbers a_j ≠ 0 (j = 1,…,q) as in (1.2). We suppose that the number of rows is m = kp, with m > n. The advantage is a drastic reduction in the number of operations with respect to any known QR factorization of order O(mn) (see [3], [4]).
Finally, the error bounds for the new factorization are derived; they appear to be sharper than the classical ones, since they depend on a matrix of reduced order.
Extending a result obtained in [10] for square matrices to rectangular ones (see [11]), the Vandermonde matrix (6.2) can be expressed in the simplified block form

(6.3)   V = {a_j^{i-1} V_j}_{i=1,…,p; j=1,…,q} ,

where the V_j are the k-order Vandermonde matrices defined in (1.4). We introduce the rectangular p×q Vandermonde matrix

(6.4)   A_pq = {a_j^{i-1}}_{i=1,…,p; j=1,…,q}

and use the Kronecker product, in a similar way as in the square case, to express the matrix V by means of the compact expression

(6.5)   V = (A_pq ⊗ I_k) D ,

where D is the block diagonal matrix of order n = kq defined in (1.8).
The row-wise rectangular Vandermonde matrix V^H is the conjugate transpose of V and is given by

(6.6)   V^H = D^H (A_pq^H ⊗ I_k) .

In many applications the points defining the Vandermonde matrix are symmetric with respect to the origin of the complex plane. This special case is obtained from the previous more general one when k = 2 (m = 2p, n = 2q): if we call ±ω_j the two symmetric points, the square roots of the complex numbers a_j (j = 1,…,q), the submatrices V_j have the simple form (1.5) and the matrix V is

(6.7)   V = (A_{m/2,n/2} ⊗ I_2) D ,

where

(6.8)   D = diag(V_1, V_2, …, V_{n/2}) .

If the numbers a_j are real and positive, all the points lie on the real axis and all the matrices involved are real.
Moreover, since the a_j are distinct, A_pq has full column rank and, using some Kronecker product properties, we can express the generalized inverse of V as

(6.9)   V^+ = (V^H V)^{-1} V^H
            = [D^H (A_pq^H ⊗ I_k)(A_pq ⊗ I_k) D]^{-1} D^H (A_pq^H ⊗ I_k)
            = D^{-1} [(A_pq^H A_pq) ⊗ I_k]^{-1} D^{-H} D^H (A_pq^H ⊗ I_k)
            = D^{-1} [(A_pq^H A_pq)^{-1} ⊗ I_k] (A_pq^H ⊗ I_k)
            = D^{-1} [(A_pq^H A_pq)^{-1} A_pq^H ⊗ I_k]
            = D^{-1} (A_pq^+ ⊗ I_k) .
Alternatively, it is possible to use the singular value decomposition of the reduced order matrix A_pq,

(6.10)   A_pq = U_p Σ_pq W_q^H ,

in order to write the generalized inverse of the matrix V by means of A_pq^+ = W_q Σ_pq^+ U_p^H:

(6.11)   V^+ = D^{-1} (W_q Σ_pq^+ U_p^H ⊗ I_k) .
7. Least squares solution
We can now rewrite the overdetermined system (6.1), with m > n, as

(7.1)   (A_pq ⊗ I_k) D x = b .

Using the standard properties of the Kronecker product, the least squares solution may be obtained by means of the generalized inverse of the matrix V given by (6.9), that is,

(7.2)   x = D^{-1} (A_pq^+ ⊗ I_k) b .

Moreover, if we call B the p×k matrix obtained from the components of b row-wise, with b = vect(B), the least squares solution x of (7.1) may be written as

(7.3)   x = D^{-1} vect(A_pq^+ B) .

As the dimensions of the matrices A_pq^+ and B are small compared to those of the original matrix V, implementing this formula reduces the amount of work with respect to the explicit computation of V^+. Nevertheless, the larger condition number of A_pq^H A_pq suggests the use of a different method.
It is better to compute a QR factorization of the reduced dimension matrix A_pq with any known algorithm for Vandermonde matrices; then, by some properties of the Kronecker product, we can express the factorization of V and the least squares solution of the overdetermined system (6.1) by means of these factors. We write this factorization as

(7.4)   A_pq = Q_pq R_q ,

where the rectangular matrix Q_pq and the square upper triangular matrix R_q have dimensions p×q and q×q respectively. Moreover, the q columns of the matrix Q_pq are orthonormal, that is, Q_pq^H Q_pq = I_q. Factorizations of this type, together with the inverse of the factor R_q, can be computed with a complexity of 5mn + (7/2)n² for an m×n Vandermonde matrix. According to this, the matrix V of (6.5), using also some properties of the Kronecker product, can be written as

(7.5)   V = (Q_pq ⊗ I_k)(R_q ⊗ I_k) D .

So we have determined a factorization of V of QRD type, where the matrices Q and R,

(7.6)   Q = Q_pq ⊗ I_k ,   R = R_q ⊗ I_k ,

easily obtained from the reduced dimension matrices Q_pq and R_q, have the same properties as the reduced dimension ones, that is, Q^H Q = (Q_pq^H ⊗ I_k)(Q_pq ⊗ I_k) = I_n and R is upper triangular; D is the known block diagonal matrix (1.8).
Moreover, using (7.4), the system (7.1) becomes

(7.7)   (Q_pq R_q ⊗ I_k) D x = b .

Multiplying both sides by (R_q^{-1} Q_pq^H ⊗ I_k) and using the properties of the Kronecker product, we obtain

(7.8)   (R_q^{-1} Q_pq^H ⊗ I_k)(Q_pq R_q ⊗ I_k) D x = (R_q^{-1} Q_pq^H ⊗ I_k) b
        (R_q^{-1} Q_pq^H Q_pq R_q ⊗ I_k) D x = (R_q^{-1} Q_pq^H ⊗ I_k) b
        (R_q^{-1} I_q R_q ⊗ I_k) D x = (R_q^{-1} Q_pq^H ⊗ I_k) b
        (I_q ⊗ I_k) D x = (R_q^{-1} Q_pq^H ⊗ I_k) b ,

(7.9)   D x = (R_q^{-1} Q_pq^H ⊗ I_k) b .

Then, by the equality b = vect(B), we can write

(7.10)   D x = vect(R_q^{-1} Q_pq^H B) .

Therefore, if c = vect(C), where C is the matrix product

(7.11)   C = R_q^{-1} Q_pq^H B ,

the block diagonal system (7.9) can be written as

(7.12)   D x = c ,

where D = diag(V_1, V_2, …, V_q) and V_j = B_j E (j = 1,…,q). Hence, partitioning x and c into k-order subvectors and (7.12) into q systems of order k depending on the k-th roots of 1, they can be solved by the DFT algorithm as in (3.1.8).
In this way the least squares solution x of system (6.1) can be written in a simple compact matrix formulation by means of (7.4), (7.11) and (7.12). The implementation of the previous formulas gives rise to a method for the least squares solution of a Vandermonde system on these special points distributions, which may be summarized in three major steps:

Algorithm QRV
   1. Compute the QR factorization of the Vandermonde matrix A_pq;
   2. Compute the two matrix products, as in formula (7.11);
   3. Compute the solution of the block diagonal system (7.12), solving q systems of order k with the DFT algorithm.
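A minimal NumPy sketch of Algorithm QRV (we substitute NumPy's Householder QR for the fast Vandermonde QR of [3]; the function name and the principal-root choice for ω_j are our assumptions):

```python
import numpy as np

def least_squares_qrv(a, k, p, b):
    """Sketch of Algorithm QRV for the overdetermined system (6.1), m = k*p rows."""
    a = np.asarray(a, dtype=complex)
    eps = np.exp(2j * np.pi * np.arange(k) / k)
    omega = a ** (1.0 / k)                            # one fixed k-th root of each a_j
    Apq = np.vander(a, p, increasing=True).T          # reduced p x q Vandermonde
    Q, R = np.linalg.qr(Apq)                          # step 1 (stand-in for the fast QR)
    B = np.asarray(b, dtype=complex).reshape(p, k)    # b = vect(B), row-wise
    C = np.linalg.solve(R, Q.conj().T @ B)            # step 2: C = R^{-1} Q^H B, eq. (7.11)
    # step 3: D x = c blockwise, x_j = (1/k) E^H B_j^{-1} c_j as in (3.1.8)
    E = np.vander(eps, k, increasing=True).T
    X = (C / omega[:, None] ** np.arange(k)[None, :]) @ np.conj(E) / k
    return X.reshape(-1)
```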
The resulting computational complexity is considerably lower than that of the known algorithms, as analysed in the next section, and the computer storage is obviously reduced.
An important application is the symmetric case described at the end of Section 6, when k = 2 and the matrices V and D have the forms (6.7) and (6.8) respectively. In such a situation of symmetric points distribution, the formula (7.3) for the overdetermined system solution becomes

(7.13)   x = D^{-1} vect(A_{m/2,n/2}^+ B) ,

where B is the (m/2)×2 matrix derived from the vector b and the inverse of D is given by the block diagonal matrix

(7.14)   D^{-1} = diag(V_1^{-1}, V_2^{-1}, …, V_{n/2}^{-1}) ,

in which the 2-order matrices V_j^{-1} (j = 1,…,n/2) are known. Moreover, the compact matrix formulas (7.10) and (7.11) for the least squares solution become

(7.15)   x = D^{-1} vect(R_{n/2}^{-1} Q_{m/2,n/2}^H B) .

If the numbers a_j (j = 1,…,n/2) are real and positive, the whole algorithm can be performed using only real arithmetic.
Alternatively, it is possible to use the singular value decomposition of the reduced order matrix A_pq and the generalized inverse of V given by (6.11) to compute the least squares solution. In this case the solution vector x can be expressed as

(7.16)   x = D^{-1} (W_q Σ_pq^+ U_p^H ⊗ I_k) b

and, considering also b = vect(B), it is possible to write

(7.17)   x = D^{-1} vect(W_q Σ_pq^+ U_p^H B) .

Then another algorithm may be formulated in this way:

Algorithm SVDV
   1. Compute the SVD A_pq = U_p Σ_pq W_q^H;
   2. Compute the product P = W_q Σ_pq^+ U_p^H B;
   3. Solve the block diagonal system D x = vect(P).
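A short sketch of Algorithm SVDV under the same assumptions as Algorithm QRV above:

```python
import numpy as np

def least_squares_svdv(a, k, p, b):
    """Sketch of Algorithm SVDV: least squares via the SVD of the reduced matrix."""
    a = np.asarray(a, dtype=complex)
    Apq = np.vander(a, p, increasing=True).T
    B = np.asarray(b, dtype=complex).reshape(p, k)
    U, s, Wh = np.linalg.svd(Apq, full_matrices=False)    # A_pq = U diag(s) W^H
    P = Wh.conj().T @ ((U.conj().T @ B) / s[:, None])     # P = W_q S^+ U_p^H B, eq. (7.17)
    # Solve D x = vect(P) blockwise, exactly as in step 3 of Algorithm QRV
    omega = a ** (1.0 / k)
    eps = np.exp(2j * np.pi * np.arange(k) / k)
    E = np.vander(eps, k, increasing=True).T
    X = (P / omega[:, None] ** np.arange(k)[None, :]) @ np.conj(E) / k
    return X.reshape(-1)
```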
8. Computational analysis
The implementation of the QRD factorization (7.5), involving the QR decomposition of a matrix of order reduced by a factor k, gives rise to a speed-up factor of k² with respect to the known algorithms. If we use the algorithm described in [3], the number of operations (×, /) is 5pq + (7/2)q². It is important to observe that the computational reduction is general, for any algorithm of this order used for the QR decomposition. In the symmetric case, when k = 2, the speed-up factor is obviously 4.
Here too, for the overdetermined Vandermonde system solution we have to take into account the matrix products of step 2 of the algorithm, while the number of operations required by step 3 may be neglected. Hence, using the previously mentioned factorization method for step 1, the total cost of this method for the least squares solution is

(8.1)   C_k = (k+5)pq + ((k+7)/2) q²

in the general case, and

(8.2)   C_2 = 7pq + (9/2) q²

when the points defining the matrix V are symmetric (k = 2). Better results may be obtained if the Strassen algorithm is used for the matrix products.
In any case it is interesting to observe that the speed-up factor is

(8.3)   S_k = k² (12p + 8q) / [(2p + q)k + 10p + 7q] ,

a function increasing very quickly with k, the dimension of the submatrices V_j of the block decomposition of V. The function S_k has an asymptote of slope m_a = 2(12p + 8q)/(2p + q), bounded below and above by 12 and 16 respectively. In other words, the behaviour of the function S_k is similar to that of a line forming with the x-axis an angle α_a with 85.23° < α_a < 86.42°. Furthermore, the speed-up S_k is almost independent of the numbers p and q. On the symmetric points distribution (k = 2), the implementation of the previous least squares algorithm produces a considerable complexity reduction, and the speed-up is

(8.4)   S_2 = 4 (12m + 8n) / (14m + 9n) ,

a number between 3 and 4. Moreover, the new algorithm for the least squares solution derived from the compact matrix formulation (7.4), (7.10) and (7.11) is particularly suitable for vectorial and parallel implementation. The whole problem reduces to using known and efficient parallel algorithms for the QR factorization of A_pq and for the matrix products in (7.10). Also the solution of the q systems of step 3 may be performed simultaneously.
9. Perturbation bounds for the QRD factorization
Our aim is now to evaluate the perturbation matrices of the factors Q and R of the previous QRD factorization of V. We suppose the factor D free of error. The method proposed in (7.9) uses a computed factorization of A_pq, that is, the exact decomposition of a perturbed matrix

(9.1)   A_pq + ΔA_pq = (Q_pq + ΔQ_pq)(R_q + ΔR_q) .

Then we can derive the consequent perturbation errors of the factorization of V in this way:

(9.2)   V + ΔV = [(Q_pq + ΔQ_pq) ⊗ I_k] [(R_q + ΔR_q) ⊗ I_k] D .

If we call

(9.3)   ΔQ = ΔQ_pq ⊗ I_k ,   ΔR = ΔR_q ⊗ I_k ,

then, using (7.6) and (9.3), we can write

(9.4)   V + ΔV = (Q + ΔQ)(R + ΔR) D .
Using the norms t = 1, 2, ∞, we can write

(9.5)   ||R||_t = ||R_q||_t ,  ||ΔR||_t = ||ΔR_q||_t ,  ||Q||_t = ||Q_pq||_t ,  ||ΔQ||_t = ||ΔQ_pq||_t ,   t = 1, 2, ∞ ,

while with the Frobenius norm we obtain

(9.6)   ||R||_F = √k ||R_q||_F ,  ||ΔR||_F = √k ||ΔR_q||_F ,  ||Q||_F = √k ||Q_pq||_F ,  ||ΔQ||_F = √k ||ΔQ_pq||_F .

Bounds for the Frobenius norms of the matrices ΔQ_pq and ΔR_q, depending on the condition number of A_pq, are derived in [9]. Then we can derive the consequent perturbation error bounds for the factorization of V as follows:

(9.7)   ||ΔR||_F / ||R||_t = √k ||ΔR_q||_F / (γ_t ||R_q||_t) ≤ √2 (√k/γ_t) cond_2(A_pq) ||ΔA_pq||_F / ||A_pq||_t ,
        with γ_2 = 1 for t = 2 and γ_F = √k for t = F ,

(9.8)   ||ΔQ||_F = ||ΔQ_pq||_F ||I_k||_F ≤ √k (1 + √2) cond_2(A_pq) ||ΔA_pq||_F / ||A_pq||_2 ,

where cond_2(A_pq) = ||A_pq||_2 ||A_pq^+||_2.
Hence the proposed factorization reduces the errors: they have the same norms (t = 1, 2, ∞) as the errors of a factorization of a lower order Vandermonde matrix, and the error bounds (t = F) depend on the condition number of A_pq, a matrix of order reduced by a factor k. As a consequence, considering also that Vandermonde matrices tend to be ill conditioned with increasing order, the previous method is promising also from the point of view of its stability.
Appendix
Real case (k = 2)
In the symmetric case (k = 2), when the ω_j (j = 1,…,q) are real, the determinant of D in (2.6) is given by

(a.1)   det(D) = Π_{j=1,…,q} det(V_j) = 2^q Π_{j=1,…,q} ω_j

and then

(a.2)   det(V) = 2^q Π_{j=1,…,q} ω_j det(A_q)² .

In other words, the determinant of V may be obtained from the determinant of the halved dimension matrix A_q, with a computational cost of order q²/2 (lower order operations are not counted).
Moreover, the factor cond_F(D) appearing in (5.15) can be easily computed as

(a.3)   cond_F(D) = (q + Σ_{j=1,…,q} a_j)^{1/2} (q + Σ_{j=1,…,q} 1/a_j)^{1/2}

and it may be used in the following bound on the condition number of A_q:

(a.4)   cond_F(A_q) ≤ (1/2) cond_F(D) cond_F(V) .

Some numerical experiments have been performed, comparing the results obtained by the procedure SSV with the well-known Björck-Pereyra algorithm. Obviously the results depend on the machine precision and on the matrix dimensions. Even if they confirm the ill-conditioning of the problem, in most cases better solution accuracy is obtained with the new algorithms than with the classical ones. In Table 6.1 the performance of the procedure SSV is compared with the Björck-Pereyra algorithm for the primal system solution, when the points ω_i = i (i = q,…,1) are the first natural numbers in decreasing order, the right-hand side vector b is chosen as the vector of row sums of the matrix (so that the exact solution is x = (1, 1, …, 1)^T). Considering the error vector norm ||e|| and its ratio to the machine precision u = 0.222E-15, the following improvement is observed.
TABLE 6.1.

             Björck-Pereyra                  SSV
  n = 2q     ||e||        ||e||/u        ||e||        ||e||/u
    20       1.37 e-09    6.19 e+06      1.27 e-09    5.75 e+06
    22       1.51 e-08    6.81 e+07      1.96 e-09    8.83 e+06
    24       4.54 e-08    2.04 e+08      1.03 e-08    4.66 e+07
    26       1.47 e-06    6.65 e+09      1.31 e-06    5.90 e+09
    28       4.53 e-06    2.04 e+10      2.84 e-06    1.28 e+10
    30       1.63 e-05    7.37 e+10      4.54 e-07    2.04 e+09
Conclusions
Vandermonde matrices of dimensions m×n (m = kp, n = kq) on a particular points distribution have been analysed, including the symmetric case (k = 2) of pairs of symmetric points in the complex plane. Expressions for the determinant, the inverse (m = n) and the generalized inverse (m > n) have been derived, depending on reduced dimension Vandermonde matrices. The matrix-vector product may be computed with a speed-up factor k. Some numerical algorithms for the primal and dual system solution (m = n) have been proposed and analysed, again depending on reduced order Vandermonde matrices. Both the serial and the parallel implementations of the algorithms SV and SVH turn out to be less expensive from a computational point of view. On the other hand, the algorithms SV1 and SVH1 may be competitive on particular vectorial and parallel architectures. In the same way, the algorithm QRV for the Vandermonde least squares problem (m > n) turns out to be faster than the traditional one. The algorithm SVDV might be preferred on particular machines, and it should produce more accurate results.
REFERENCES
[1] S. BARNETT, Matrix Differential Equations and Kronecker Products, SIAM J. Appl. Math., 24 (1973), pp. 1-5.
[2] Å. BJÖRCK - V. PEREYRA, Solution of Vandermonde Systems of Equations, Math. Comp., 24 (1970), pp. 893-904.
[3] C. J. DEMEURE, Fast QR factorization of Vandermonde matrices, Linear Algebra Appl., 122/123/124 (1989), pp. 165-194.
[4] G. H. GOLUB - C. F. VAN LOAN, Matrix Computations (Third Edition), Johns Hopkins University Press, 1996.
[5] A. GRAHAM, Kronecker Products and Matrix Calculus with Applications, Ellis Horwood Limited, 1981.
[6] N. J. HIGHAM, Error Analysis of the Björck-Pereyra Algorithms for Solving Vandermonde Systems, Numer. Math., 50 (1987), pp. 613-632.
[7] N. J. HIGHAM, Stability Analysis of Algorithms for Solving Confluent Vandermonde-like Systems, SIAM J. Matrix Anal. Appl., 11, n. 1 (1990), pp. 23-41.
[8] R. A. HORN - C. R. JOHNSON, Topics in Matrix Analysis, Cambridge University Press, 1991.
[9] J.-G. SUN, Perturbation bounds for the Cholesky and QR factorizations, BIT, 31 (1991), pp. 341-352.
[10] T. TOMMASINI, A New Algorithm for Special Vandermonde Systems, Numerical Algorithms, 2 (1992), pp. 299-306.
[11] T. TOMMASINI, Complexity Reduction of Least Squares Problems Involving Special Vandermonde Matrices, Advances in Comp. Math., 6 (1996), pp. 77-86.
T. Tommasini, Dipartimento di Matematica, [email protected]
24/03/03, revised February 2004