MODULE 11
Topics: Hermitian and symmetric matrices
Setting: $A$ is an $n \times n$ real or complex matrix defined on $\mathbb{C}^n$ with the complex dot product
$$\langle x, y \rangle = \sum_{j=1}^{n} x_j \overline{y_j}.$$
Notation: $A^* = \overline{A}^T$, i.e., $(A^*)_{ij} = \overline{a_{ji}}$.
We know from Module 4 that
$$\langle Ax, y \rangle = \langle x, A^* y \rangle$$
for all $x, y \in \mathbb{C}^n$.
Definition: If $A = A^T$ then $A$ is symmetric.
If $A = A^*$ then $A$ is Hermitian.
Examples:
$$\begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix} \text{ is symmetric but not Hermitian;} \qquad \begin{pmatrix} 1 & i \\ -i & 1 \end{pmatrix} \text{ is Hermitian but not symmetric.}$$
Note that a real symmetric matrix is Hermitian.
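The two defining properties are easy to check numerically. Here is a minimal sketch (numpy is assumed; the module itself names no software) testing the two example matrices above:

```python
import numpy as np

A = np.array([[1, 1j], [1j, 1]])   # the first example matrix
B = np.array([[1, 1j], [-1j, 1]])  # the second example matrix

# Symmetric means the matrix equals its transpose; Hermitian means it
# equals its conjugate transpose A*.
print(np.array_equal(A, A.T), np.array_equal(A, A.conj().T))  # True False
print(np.array_equal(B, B.T), np.array_equal(B, B.conj().T))  # False True
```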
Theorem: If $A$ is Hermitian then $\langle Ax, x \rangle$ is real for all $x \in \mathbb{C}^n$.
Proof: $\langle Ax, x \rangle = \langle x, A^* x \rangle = \langle x, Ax \rangle = \overline{\langle Ax, x \rangle}$, so the number $\langle Ax, x \rangle$ equals its own conjugate and is therefore real.
Theorem: If $A$ is Hermitian then its eigenvalues are real.
Proof: Suppose that $Au = \lambda u$ with $u \neq 0$; then
$$\lambda \langle u, u \rangle = \langle \lambda u, u \rangle = \langle Au, u \rangle = \langle u, A^* u \rangle = \langle u, Au \rangle = \langle u, \lambda u \rangle = \bar{\lambda} \langle u, u \rangle.$$
Since $\langle u, u \rangle \neq 0$ it follows that $\lambda = \bar{\lambda}$, so $\lambda$ is real.
Theorem: If $A$ is Hermitian then eigenvectors corresponding to distinct eigenvalues are orthogonal.
Proof: Suppose that $Au = \lambda u$ and $Av = \mu v$ with $\lambda \neq \mu$. Since the eigenvalues are real,
$$\lambda \langle u, v \rangle = \langle Au, v \rangle = \langle u, Av \rangle = \langle u, \mu v \rangle = \bar{\mu} \langle u, v \rangle = \mu \langle u, v \rangle.$$
Since $\mu \neq \lambda$ it follows that $\langle u, v \rangle = 0$.
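Both theorems lend themselves to a quick numerical sanity check. The following is a minimal sketch (numpy assumed; the random Hermitian matrix is an arbitrary test case):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (M + M.conj().T) / 2               # (M + M*)/2 is always Hermitian

w, V = np.linalg.eig(A)                # generic eigensolver, complex output
print(np.allclose(w.imag, 0))          # True up to roundoff: real spectrum

_, U = np.linalg.eigh(A)               # eigh exploits the Hermitian structure
print(np.allclose(U.conj().T @ U, np.eye(4)))  # True: orthonormal eigenvectors
```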
Suppose that $A$ has $n$ distinct eigenvalues $\{\lambda_1, \ldots, \lambda_n\}$ with corresponding orthogonal eigenvectors $\{u_1, \ldots, u_n\}$. Let us also agree to scale the eigenvectors so that
$$\langle u_i, u_j \rangle = \delta_{ij},$$
where $\delta_{ij}$ is the so-called Kronecker delta
$$\delta_{ij} = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j. \end{cases}$$
We recall that the eigenvalue equation can be written in matrix form
$$AU = U\Lambda,$$
where $U = (u_1\; u_2\; \ldots\; u_n)$ and $\Lambda = \operatorname{diag}\{\lambda_1, \ldots, \lambda_n\}$. We now observe from the orthonormality of the eigenvectors that
$$U^* U = I.$$
Hence $U^{-1} = U^*$ and consequently
$$A = U \Lambda U^*.$$
In other words, $A$ can be diagonalized in a particularly simple way.
Example: Suppose
$$A = \begin{pmatrix} 1 & i \\ -i & 1 \end{pmatrix}.$$
We saw that $A$ is Hermitian; its eigenvalues are the roots of $(1 - \lambda)^2 - 1 = 0$, so that
$$\lambda_1 = 0, \quad u_1 = (1, i)/\sqrt{2}, \qquad \lambda_2 = 2, \quad u_2 = (1, -i)/\sqrt{2},$$
which shows that $\langle u_1, u_2 \rangle = 0$. Thus
$$U^* U = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -i \\ 1 & i \end{pmatrix} \cdot \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
and
$$A = U \begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix} U^*.$$
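The worked example above can be verified in a few lines; this sketch assumes numpy:

```python
import numpy as np

A = np.array([[1, 1j], [-1j, 1]])
U = np.array([[1, 1], [1j, -1j]]) / np.sqrt(2)   # columns are u1 and u2
Lam = np.diag([0.0, 2.0])

print(np.allclose(U.conj().T @ U, np.eye(2)))    # True: U*U = I
print(np.allclose(U @ Lam @ U.conj().T, A))      # True: A = U Lambda U*
```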
We saw from the example
$$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$$
that the eigenvector system for $A$ can only be written in the form
$$A \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$$
since the double eigenvalue $\lambda = 1$ has only the one eigenvector $(1, 0)^T$.
This matrix $A$ cannot be diagonalized because we do not have $n$ linearly independent eigenvectors. However, a Hermitian matrix can always be diagonalized because we can find an orthonormal eigenvector basis of $\mathbb{C}^n$ regardless of whether the eigenvalues are distinct or not.
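Numerically the defect shows up as an eigenvector matrix without full rank. A small sketch (numpy assumed):

```python
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])
w, V = np.linalg.eig(A)
print(w)                          # [1. 1.]: lambda = 1 is a double root
print(np.linalg.matrix_rank(V))   # 1: the two computed eigenvectors are
                                  # (numerically) parallel, so A is defective
```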
Theorem: If $A$ is Hermitian then $A$ has $n$ orthonormal eigenvectors $\{u_1, \ldots, u_n\}$ and
$$A = U \Lambda U^*, \quad \text{where } \Lambda = \operatorname{diag}\{\lambda_1, \ldots, \lambda_n\}.$$
Proof: If all the eigenvalues are distinct then the result follows from the foregoing discussion. Here we shall outline a proof of this result for the case where $A$ is real, so that the setting is $\mathbb{R}^n$ rather than $\mathbb{C}^n$. Define the function
$$f(y) = \langle Ay, y \rangle.$$
Analysis tells us that this function has a minimum on the set $\langle y, y \rangle = 1$. The minimizer can be found with Lagrange multipliers. We have the Lagrangian
$$L(y) = \langle Ay, y \rangle - \lambda(\langle y, y \rangle - 1)$$
and the minimizer $x_1$ is found from the necessary condition
$$\nabla L(y) = 0.$$
Writing $A_k$ for the $k$th row of $A$, we compute
$$\frac{\partial L}{\partial y_k} = \langle A_k, y \rangle + \langle Ay, e_k \rangle - 2\lambda y_k = 2(\langle A_k, y \rangle - \lambda y_k),$$
where the last equality uses the symmetry of $A$, so that we have to solve
$$\nabla L \equiv Ay - \lambda y = 0, \qquad \langle y, y \rangle = 1.$$
The solution to this problem is the eigenvector $x_1$ belonging to the smallest eigenvalue $\lambda_1$.
If we now apply the Gram-Schmidt method to the set of $n + 1$ dependent vectors $\{x_1, e_1, \ldots, e_n\}$ we end up with $n$ orthogonal vectors $\{x_1, y_2, \ldots, y_n\}$. We now minimize
$$f(y) = \langle Ay, y \rangle \quad \text{subject to } \|y\| = 1$$
over all $y \in \operatorname{span}\{y_2, \ldots, y_n\}$ and get the next eigenvalue $\lambda_2$ and eigenvector $x_2$, which then necessarily satisfies $\langle x_1, x_2 \rangle = 0$. Next we find an orthogonal basis of $\mathbb{R}^n$ of the form $\{x_1, x_2, y_3, \ldots, y_n\}$ and minimize $f(y)$ over $\operatorname{span}\{y_3, \ldots, y_n\}$ subject to the constraint that $\|y\| = 1$. We keep going and eventually find $n$ eigenvectors, all of which are mutually orthogonal. Afterwards all eigenvectors can be normalized so that we have an orthonormal eigenvector basis of $\mathbb{R}^n$, which we denote by $\{u_1, \ldots, u_n\}$.
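The successive minimizations in this proof can even be carried out numerically. The sketch below (assuming scipy and numpy; the symmetric test matrix is arbitrary) performs only the first step, recovering the smallest eigenvalue as the constrained minimum of $f$:

```python
import numpy as np
from scipy.optimize import minimize

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])   # symmetric; eigenvalues are 1, 2, 4

# Minimize f(y) = <Ay, y> subject to <y, y> = 1, as in the Lagrangian argument.
res = minimize(lambda y: y @ A @ y,
               x0=np.ones(3) / np.sqrt(3),
               method="SLSQP",
               constraints=[{"type": "eq", "fun": lambda y: y @ y - 1.0}])

print(res.fun)                    # approximately 1.0, the smallest eigenvalue
print(np.linalg.eigvalsh(A)[0])   # 1.0, for comparison
```

SLSQP is a local solver, so this only illustrates the idea; for this well-behaved example it does reach the global minimizer.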
If $A$ is real and symmetric then this choice of eigenvectors satisfies
$$U^T U = I.$$
If $A$ is complex then the orthonormal eigenvectors satisfy
$$U^* U = I.$$
In either case the eigenvalue equation
$$AU = U\Lambda$$
can be rewritten as
$$A = U \Lambda U^* \quad \text{or equivalently,} \quad U^* A U = \Lambda.$$
We say that $A$ can be diagonalized.
We shall have occasion to use this result when we talk about solving first order systems of linear ordinary differential equations.
The diagonalization theorem guarantees that for each eigenvalue one can find an eigenvector which is orthogonal to all the other eigenvectors, regardless of whether the eigenvalue is distinct or not. This means that if an eigenvalue $\mu$ is a repeated root of multiplicity $k$ (meaning that $\det(A - \lambda I) = (\lambda - \mu)^k g(\lambda)$, where $g(\mu) \neq 0$), then $\dim N(A - \mu I) = k$. We simply find an orthogonal basis of this null space. The process is, as usual, the application of Gaussian elimination to
$$(A - \mu I)x = 0$$
and finding $k$ linearly independent vectors in the null space, which can then be made orthogonal with the Gram-Schmidt process.
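For a concrete instance of this recipe (a sketch assuming numpy; the matrix is an arbitrary example with a double eigenvalue), the SVD is a convenient stand-in for Gaussian elimination plus Gram-Schmidt, since it delivers an orthonormal null-space basis directly:

```python
import numpy as np

# mu = 1 is an eigenvalue of multiplicity 2 of this real symmetric matrix.
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
mu = 1.0

# Right singular vectors belonging to zero singular values span N(A - mu*I),
# and the rows of Vt are already orthonormal.
_, s, Vt = np.linalg.svd(A - mu * np.eye(3))
null_basis = Vt[s < 1e-10]
print(null_basis.shape[0])                                # 2 = multiplicity k
print(np.allclose(null_basis @ null_basis.T, np.eye(2)))  # True: orthonormal
```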
Some consequences of this theorem: Throughout we assume that $A$ is Hermitian and that $x$ is not the zero vector.
Theorem: $\langle Ax, x \rangle > 0$ for all $x \neq 0$ if and only if all eigenvalues of $A$ are strictly positive.
Proof: Suppose that $\langle Ax, x \rangle > 0$ for all $x \neq 0$. If $\lambda$ is a non-positive eigenvalue with eigenvector $u$ then $\langle Au, u \rangle = \lambda \langle u, u \rangle \leq 0$, contrary to assumption. Hence the eigenvalues must be positive. Conversely, assume that all eigenvalues are positive. Let $x$ be arbitrary; then the existence of an orthonormal eigenvector basis $\{u_j\}$ assures that
$$x = \alpha_1 u_1 + \cdots + \alpha_n u_n.$$
If we substitute this expansion into $\langle Ax, x \rangle$ we obtain
$$\langle Ax, x \rangle = \sum_{j=1}^{n} \lambda_j \alpha_j \overline{\alpha_j} = \sum_{j=1}^{n} \lambda_j |\alpha_j|^2 > 0$$
for any non-zero $x$.
Definition: $A$ is positive definite if $\langle Ax, x \rangle > 0$ for $x \neq 0$.
$A$ is positive semi-definite if $\langle Ax, x \rangle \geq 0$ for all $x$.
$A$ is negative definite if $\langle Ax, x \rangle < 0$ for $x \neq 0$.
$A$ is negative semi-definite if $\langle Ax, x \rangle \leq 0$ for all $x$.
Note that for a positive or negative definite matrix $Ax = 0$ if and only if $x = 0$, while a semi-definite matrix may have a zero eigenvalue.
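By the theorem above, positive definiteness is equivalent to a positive spectrum, so one can test it through the eigenvalues; in floating-point practice an attempted Cholesky factorization is the standard test. A minimal sketch (numpy assumed):

```python
import numpy as np

def is_positive_definite(A):
    # np.linalg.cholesky succeeds exactly when the Hermitian matrix A
    # has strictly positive eigenvalues, i.e. <Ax, x> > 0 for x != 0.
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False

print(is_positive_definite(np.array([[2.0, 1.0], [1.0, 2.0]])))  # True  (eigenvalues 1, 3)
print(is_positive_definite(np.array([[1.0, 2.0], [2.0, 1.0]])))  # False (eigenvalues -1, 3)
```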
An application: Consider the real quadratic polynomial
$$P(x, y, z) = a_{11}x^2 + a_{12}xy + a_{13}xz + a_{22}y^2 + a_{23}yz + a_{33}z^2 + b_1 x + b_2 y + b_3 z + c.$$
We "know" from analytic geometry that $P(x, y, z) = 0$ describes an ellipsoid, paraboloid or hyperboloid in $\mathbb{R}^3$. But how do we find out? We observe that $P(x, y, z) = 0$ can be rewritten as
$$P(x, y, z) = \left\langle A \begin{pmatrix} x \\ y \\ z \end{pmatrix}, \begin{pmatrix} x \\ y \\ z \end{pmatrix} \right\rangle + \left\langle b, \begin{pmatrix} x \\ y \\ z \end{pmatrix} \right\rangle + c = 0,$$
where
$$A = \begin{pmatrix} a_{11} & a_{12}/2 & a_{13}/2 \\ a_{12}/2 & a_{22} & a_{23}/2 \\ a_{13}/2 & a_{23}/2 & a_{33} \end{pmatrix}$$
is a symmetric real matrix. Hence $A$ can be diagonalized:
$$A = U \Lambda U^T.$$
Define the new vector (new coordinates)
$$\begin{pmatrix} X \\ Y \\ Z \end{pmatrix} = U^T \begin{pmatrix} x \\ y \\ z \end{pmatrix}.$$
Then
$$P(X, Y, Z) = \left\langle \Lambda \begin{pmatrix} X \\ Y \\ Z \end{pmatrix}, \begin{pmatrix} X \\ Y \\ Z \end{pmatrix} \right\rangle + \left\langle b, U \begin{pmatrix} X \\ Y \\ Z \end{pmatrix} \right\rangle + c = \lambda_1 X^2 + \lambda_2 Y^2 + \lambda_3 Z^2 + \text{lower order terms},$$
where $\{\lambda_j\}$ are the eigenvalues of $A$. If all are positive or all are negative we have an ellipsoid, if all are non-zero but not of the same sign we have a hyperboloid, and if at least one eigenvalue is zero we have a paraboloid. The lower order terms determine where these shapes are centered but not their type.
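To make the classification concrete, here is a small sketch (numpy assumed; the polynomial coefficients are an arbitrary example) that builds $A$ from the quadratic coefficients and reads off the type from the eigenvalue signs:

```python
import numpy as np

# P(x,y,z) with a11=1, a12=2, a22=4, a33=1 and all other coefficients zero.
# Off-diagonal entries of A are aij/2.
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 4.0, 0.0],
              [0.0, 0.0, 1.0]])

lam = np.linalg.eigvalsh(A)   # eigenvalues of a symmetric matrix, ascending
tol = 1e-12                   # exact zero tests are unreliable in floating point
if np.all(lam > tol) or np.all(lam < -tol):
    print("ellipsoid")
elif np.all(np.abs(lam) > tol):
    print("hyperboloid")
else:
    print("paraboloid")       # at least one eigenvalue is (numerically) zero
```

This prints "ellipsoid", consistent with the quadratic form here being positive definite.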
Module 11 Homework
1) Suppose that $A$ has the property $A = -A^*$. In this case $A$ is said to be skew-Hermitian.
i) Show that all eigenvalues of $A$ have to be purely imaginary.
ii) Prove or disprove: The eigenvectors corresponding to distinct eigenvalues of a skew-Hermitian matrix are orthogonal with respect to the complex dot product.
iii) If $A$ is real and skew-Hermitian, what does this imply about $A^T$?
2) Let $A$ be a skew-Hermitian matrix.
i) Show that for each $x \in \mathbb{C}^n$ we have $\operatorname{Re}\langle Ax, x \rangle = 0$.
ii) Show that for any matrix $A$ we have
$$\langle Ax, x \rangle = \langle Bx, x \rangle + \langle Cx, x \rangle,$$
where $B$ is Hermitian and $C$ is skew-Hermitian.
3) Suppose that $U^* U = I$. What can you say about the eigenvalues of $U$?
4) Find a new coordinate system such that the conic section
$$x^2 - 2xy + 4y^2 = 6$$
is in standard form so that you can read off whether it is an ellipse, parabola or hyperbola.