L. Vandenberghe
EE133A (Spring 2017)
5. Orthogonal matrices
• matrices with orthonormal columns
• orthogonal matrices
• tall matrices with orthonormal columns
• complex matrices with orthonormal columns
5-1
Orthonormal vectors
a collection of real m-vectors a1, a2, . . . , an is orthonormal if
• the vectors have unit norm: ‖ai‖ = 1
• they are mutually orthogonal: aiᵀaj = 0 if i ≠ j
Example

a1 = ( 0, 0, −1 ),    a2 = (1/√2) ( 1, 1, 0 ),    a3 = (1/√2) ( 1, −1, 0 )
5-2
Matrix with orthonormal columns
A ∈ Rm×n has orthonormal columns if its Gram matrix is the identity matrix:
AᵀA = [ a1 a2 ··· an ]ᵀ [ a1 a2 ··· an ]

    = [ a1ᵀa1  a1ᵀa2  ···  a1ᵀan ]
      [ a2ᵀa1  a2ᵀa2  ···  a2ᵀan ]
      [   ⋮      ⋮     ⋱     ⋮   ]
      [ anᵀa1  anᵀa2  ···  anᵀan ]

    = [ 1  0  ···  0 ]
      [ 0  1  ···  0 ]
      [ ⋮  ⋮   ⋱   ⋮ ]
      [ 0  0  ···  1 ]
there is no standard short name for ‘matrix with orthonormal columns’
5-3
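The Gram-matrix condition above can be checked numerically. The sketch below (not part of the slides) builds a matrix with orthonormal columns from the Q factor of a QR factorization of a random tall matrix, then verifies AᵀA = I.

```python
# Sketch: checking the Gram matrix property A^T A = I numerically.
# The orthonormal-column matrix comes from a QR factorization of a
# random tall matrix (an illustration, not part of the slides).
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
A, _ = np.linalg.qr(rng.standard_normal((m, n)))  # A has orthonormal columns

gram = A.T @ A
print(np.allclose(gram, np.eye(n)))  # True: the Gram matrix is the identity
```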
Matrix-vector product
if A ∈ Rm×n has orthonormal columns, then the linear function f(x) = Ax

• preserves inner products:

  (Ax)ᵀ(Ay) = xᵀAᵀAy = xᵀy

• preserves norms:

  ‖Ax‖ = ((Ax)ᵀ(Ax))^(1/2) = (xᵀx)^(1/2) = ‖x‖

• preserves distances: ‖Ax − Ay‖ = ‖x − y‖

• preserves angles:

  ∠(Ax, Ay) = arccos( (Ax)ᵀ(Ay) / (‖Ax‖ ‖Ay‖) ) = arccos( xᵀy / (‖x‖ ‖y‖) ) = ∠(x, y)
5-4
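The four preservation properties above can be spot-checked on random data; the sketch below assumes a random orthonormal-column A from a QR factorization, which is my own illustration rather than anything from the slides.

```python
# Sketch: verifying that multiplication by a matrix with orthonormal
# columns preserves inner products, norms, and angles (random data).
import numpy as np

rng = np.random.default_rng(1)
A, _ = np.linalg.qr(rng.standard_normal((6, 3)))  # orthonormal columns
x, y = rng.standard_normal(3), rng.standard_normal(3)

assert np.isclose((A @ x) @ (A @ y), x @ y)                 # inner products
assert np.isclose(np.linalg.norm(A @ x), np.linalg.norm(x)) # norms

def angle(u, v):
    # angle between two vectors, as on this slide
    return np.arccos(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

assert np.isclose(angle(A @ x, A @ y), angle(x, y))         # angles
print("inner products, norms, and angles are preserved")
```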
Left invertibility
if A ∈ Rm×n has orthonormal columns, then
• A is left invertible with left inverse Aᵀ: by definition AᵀA = I

• A has linearly independent columns (from p. 4-24 or p. 5-2):

  Ax = 0  ⟹  AᵀAx = x = 0

• A is tall or square: m ≥ n (see page 4-13)
5-5
Outline
• matrices with orthonormal columns
• orthogonal matrices
• tall matrices with orthonormal columns
• complex matrices with orthonormal columns
Orthogonal matrix

a square real matrix with orthonormal columns is called orthogonal

Nonsingularity (from the equivalences on page 4-14): if A is orthogonal, then

• A is invertible, with inverse Aᵀ:

  AᵀA = I and A square  ⟹  AAᵀ = I

• Aᵀ is also an orthogonal matrix

• the rows of A are orthonormal (have norm one and are mutually orthogonal)

Note: if A ∈ Rm×n has orthonormal columns and m > n, then AAᵀ ≠ I
5-6
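Both points of the slide, that a square A with orthonormal columns also satisfies AAᵀ = I while a tall one does not, can be illustrated numerically. The random matrices below are my own example data.

```python
# Sketch: for a square A with orthonormal columns, A^T is a two-sided
# inverse, so the rows are orthonormal too; a tall A has A A^T != I.
import numpy as np

rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # random orthogonal matrix

assert np.allclose(Q.T @ Q, np.eye(4))  # orthonormal columns
assert np.allclose(Q @ Q.T, np.eye(4))  # orthonormal rows (square case only)

# contrast: tall matrix with orthonormal columns (m > n)
A, _ = np.linalg.qr(rng.standard_normal((4, 2)))
print(np.allclose(A @ A.T, np.eye(4)))  # False: A A^T is not the identity
```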
Permutation matrix
• let π = (π1, π2, . . . , πn) be a permutation (reordering) of (1, 2, . . . , n)
• we associate with π the n × n permutation matrix A
  Ai,πi = 1,    Aij = 0 if j ≠ πi

• Ax is a permutation of the elements of x: Ax = (xπ1, xπ2, . . . , xπn)

• A has exactly one element equal to 1 in each row and each column

Orthogonality: permutation matrices are orthogonal

• AᵀA = I because A has exactly one element equal to one in each row:

  (AᵀA)ij = Σ (k = 1 to n) Aki Akj = 1 if i = j, 0 otherwise
• AT = A−1 is the inverse permutation matrix
5-7
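The construction Ai,πi = 1 translates directly into code. This sketch (my own illustration, using 0-based indices) builds the permutation matrix for the slide's later example π = (2, 4, 1, 3) and checks that Ax permutes x and that Aᵀ undoes it.

```python
# Sketch: building the permutation matrix A with A[i, pi[i]] = 1 and
# checking that A x permutes x and that A^T is the inverse permutation.
import numpy as np

pi = np.array([1, 3, 0, 2])   # the permutation (2, 4, 1, 3), 0-indexed
n = len(pi)
A = np.zeros((n, n))
A[np.arange(n), pi] = 1       # exactly one 1 per row, in column pi[i]

x = np.array([10.0, 20.0, 30.0, 40.0])
assert np.array_equal(A @ x, x[pi])      # Ax = (x_pi1, ..., x_pin)
assert np.allclose(A.T @ A, np.eye(n))   # permutation matrices are orthogonal
assert np.array_equal(A.T @ (A @ x), x)  # A^T undoes the permutation
print(A @ x)  # [20. 40. 10. 30.]
```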
Example
• permutation on {1, 2, 3, 4}
(π1, π2, π3, π4) = (2, 4, 1, 3)
• corresponding permutation matrix and its inverse

  A = [ 0  1  0  0 ]        A⁻¹ = Aᵀ = [ 0  0  1  0 ]
      [ 0  0  0  1 ]                   [ 1  0  0  0 ]
      [ 1  0  0  0 ]                   [ 0  0  0  1 ]
      [ 0  0  1  0 ]                   [ 0  1  0  0 ]

• Aᵀ is the permutation matrix associated with the permutation

  (π̃1, π̃2, π̃3, π̃4) = (3, 1, 4, 2)
5-8
Plane rotation
Rotation in a plane

  A = [ cos θ  −sin θ ]
      [ sin θ   cos θ ]

[figure: Ax is x rotated counterclockwise over the angle θ]

Rotation in a coordinate plane in Rn: for example,

  A = [ cos θ  0  −sin θ ]
      [   0    1    0    ]
      [ sin θ  0   cos θ ]

describes a rotation in the (x1, x3) plane in R3
5-9
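The 3×3 coordinate-plane rotation above is easy to exercise numerically; the angle θ = π/2 and test vector below are my own choices for illustration.

```python
# Sketch: a rotation by 90 degrees in the (x1, x3) plane of R^3,
# following the coordinate-plane rotation pattern on this slide.
import numpy as np

theta = np.pi / 2
c, s = np.cos(theta), np.sin(theta)
A = np.array([[c,   0.0, -s ],
              [0.0, 1.0, 0.0],
              [s,   0.0, c  ]])

assert np.allclose(A.T @ A, np.eye(3))  # rotations are orthogonal
x = np.array([1.0, 5.0, 0.0])
y = A @ x                               # e1 rotates toward e3; x2 is unchanged
print(np.round(y, 12))
```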
Reflector
  A = I − 2aaᵀ

with a a unit-norm vector (‖a‖ = 1)

• a reflector matrix is symmetric and orthogonal:

  AᵀA = (I − 2aaᵀ)(I − 2aaᵀ) = I − 4aaᵀ + 4aaᵀaaᵀ = I

• Ax is the reflection of x through the hyperplane H = {u | aᵀu = 0}

[figure: x, its projection y on H, and its reflection z = Ax, with the line through a and the origin normal to H]

  y = x − (aᵀx)a = (I − aaᵀ)x
  z = y + (y − x) = (I − 2aaᵀ)x = Ax

(see page 2-44)
5-10
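The reflector's defining properties can be verified on a concrete unit vector. The vector a and test point u below are my own example data.

```python
# Sketch: a reflector A = I - 2 a a^T built from a unit vector a,
# checked to be symmetric, orthogonal, and to fix the hyperplane a^T u = 0.
import numpy as np

a = np.array([1.0, 2.0, 2.0]) / 3.0      # unit-norm vector
A = np.eye(3) - 2.0 * np.outer(a, a)

assert np.allclose(A, A.T)               # symmetric
assert np.allclose(A.T @ A, np.eye(3))   # orthogonal
assert np.allclose(A @ a, -a)            # a itself is flipped through H
u = np.array([2.0, -1.0, 0.0])           # a^T u = 0, so u lies in H
assert np.allclose(A @ u, u)             # points in the hyperplane are fixed
print("reflector checks pass")
```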
Product of orthogonal matrices
if A1, . . . , Ak are orthogonal matrices and of equal size, then the product
A = A1 A2 · · · Ak
is orthogonal:
  AᵀA = (A1A2 ··· Ak)ᵀ(A1A2 ··· Ak)
      = Akᵀ ··· A2ᵀA1ᵀ A1A2 ··· Ak
      = I
5-11
Linear equation with orthogonal matrix
linear equation with orthogonal coefficient matrix A of size n × n
Ax = b
solution is
x = A−1b = AT b
• can be computed in 2n2 flops by matrix-vector multiplication
• cost is less than order n2 if A has special properties; for example,
  permutation matrix:     0 flops
  reflector (given a):    order n flops
  plane rotation:         order 1 flops
5-12
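The solution x = Aᵀb can be compared against a generic linear solve; the random orthogonal A below is my own illustration of the slide's claim.

```python
# Sketch: solving Ax = b with an orthogonal coefficient matrix by a
# single matrix-vector product x = A^T b (no factorization needed).
import numpy as np

rng = np.random.default_rng(3)
A, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # random orthogonal A
b = rng.standard_normal(4)

x = A.T @ b   # 2n^2 flops, versus order n^3 for a generic dense solve
assert np.allclose(A @ x, b)                  # x solves Ax = b
assert np.allclose(x, np.linalg.solve(A, b))  # agrees with a generic solver
print("x = A^T b solves Ax = b")
```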
Outline
• matrices with orthonormal columns
• orthogonal matrices
• tall matrices with orthonormal columns
• complex matrices with orthonormal columns
Tall matrix with orthonormal columns
suppose A ∈ Rm×n is tall (m > n) and has orthonormal columns
• Aᵀ is a left inverse of A: AᵀA = I

• A has no right inverse; in particular, AAᵀ ≠ I
on the next pages, we give a geometric interpretation to the matrix AAT
5-13
Range
• the span of a collection of vectors is the set of all their linear combinations:
span(a1, a2, . . . , an) = {x1a1 + x2a2 + · · · + xnan | x ∈ Rn}
• the range of a matrix A ∈ Rm×n is the span of its column vectors:
range(A) = {Ax | x ∈ Rn}
Example: for

  A = [ 1   0  1 ]
      [ 1   1  2 ]
      [ 0  −1  1 ]

  range(A) = { ( x1 + x3,  x1 + x2 + 2x3,  −x2 + x3 ) | x1, x2, x3 ∈ R }
5-14
Projection on range of matrix with orthonormal columns
suppose A ∈ Rm×n has orthonormal columns; we show that the vector AAᵀb is the orthogonal projection of an m-vector b on range(A)
[figure: b and its projection Ax̂ = AAᵀb on range(A) = {Ax | x ∈ Rn}]

• x̂ = Aᵀb satisfies ‖Ax̂ − b‖ < ‖Ax − b‖ for all x ≠ x̂

• this extends the result on page 2-12 (where A = (1/‖a‖)a)
5-15
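The optimality claim, that AAᵀb beats every other point Ax of the range, can be spot-checked against random candidates; the random A and b below are my own example data.

```python
# Sketch: A A^T b is the point of range(A) closest to b; any other
# candidate Ax is at least as far away (random tall A, orthonormal columns).
import numpy as np

rng = np.random.default_rng(4)
A, _ = np.linalg.qr(rng.standard_normal((5, 2)))  # orthonormal columns
b = rng.standard_normal(5)

xhat = A.T @ b
proj = A @ xhat                                   # orthogonal projection of b
assert np.allclose(A.T @ (proj - b), 0.0)         # residual orthogonal to range(A)

for _ in range(100):                              # spot-check optimality
    x = rng.standard_normal(2)
    assert np.linalg.norm(A @ x - b) >= np.linalg.norm(proj - b) - 1e-12
print("A A^T b is the closest point in range(A)")
```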
Proof
the squared distance of b to an arbitrary point Ax in range(A) is (where x̂ = Aᵀb)

  ‖Ax − b‖² = ‖A(x − x̂) + Ax̂ − b‖²
            = ‖A(x − x̂)‖² + ‖Ax̂ − b‖² + 2(x − x̂)ᵀAᵀ(Ax̂ − b)
            = ‖A(x − x̂)‖² + ‖Ax̂ − b‖²
            = ‖x − x̂‖² + ‖Ax̂ − b‖²
            ≥ ‖Ax̂ − b‖²

with equality only if x = x̂

• line 3 follows because Aᵀ(Ax̂ − b) = AᵀAx̂ − Aᵀb = x̂ − Aᵀb = 0
• line 4 follows from AᵀA = I
5-16
Orthogonal decomposition
the vector b is decomposed as a sum b = z + y with

  z ∈ range(A),    y ⊥ range(A)

[figure: b, its projection z = AAᵀb on range(A), and the residual y = b − AAᵀb]

such a decomposition exists and is unique for every b:

  b = Ax + y,  Aᵀy = 0   ⟺   x = Aᵀb,  y = b − AAᵀb
5-17
Outline
• matrices with orthonormal columns
• orthogonal matrices
• tall matrices with orthonormal columns
• complex matrices with orthonormal columns
Gram matrix
A ∈ Cm×n has orthonormal columns if its Gram matrix is the identity matrix:
AᴴA = [ a1 a2 ··· an ]ᴴ [ a1 a2 ··· an ]

     = [ a1ᴴa1  a1ᴴa2  ···  a1ᴴan ]
       [ a2ᴴa1  a2ᴴa2  ···  a2ᴴan ]
       [   ⋮      ⋮     ⋱     ⋮   ]
       [ anᴴa1  anᴴa2  ···  anᴴan ]

     = I (the n × n identity matrix)

• columns have unit norm: ‖ai‖² = aiᴴai = 1

• columns are mutually orthogonal: aiᴴaj = 0 for i ≠ j
5-18
Unitary matrix

a square complex matrix with orthonormal columns is called unitary

Inverse

  AᴴA = I and A square  ⟹  AAᴴ = I
• a unitary matrix is nonsingular with inverse AH
• if A is unitary, then AH is unitary
5-19
Discrete Fourier transform matrix
recall the definition from page 3-37 (with ω = e^(2πj/n) and j = √−1):

  W = [ 1   1          1           ···   1              ]
      [ 1   ω^−1       ω^−2        ···   ω^−(n−1)       ]
      [ 1   ω^−2       ω^−4        ···   ω^−2(n−1)      ]
      [ ⋮   ⋮          ⋮                 ⋮              ]
      [ 1   ω^−(n−1)   ω^−2(n−1)   ···   ω^−(n−1)(n−1)  ]

the matrix (1/√n)W is unitary (proof on next page):

  (1/n) WᴴW = (1/n) WWᴴ = I

• the inverse of W is W⁻¹ = (1/n)Wᴴ

• the inverse discrete Fourier transform of an n-vector x is W⁻¹x = (1/n)Wᴴx
5-20
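The unitarity claim can be checked by building W directly from its entry formula W[k, l] = ω^(−kl); the size n = 8 and test vector below are my own choices for illustration.

```python
# Sketch: the n x n DFT matrix W with W[k, l] = omega^(-k*l); checking
# that W^H W = n I (so (1/sqrt(n)) W is unitary) and that (1/n) W^H
# inverts W, as stated on this slide.
import numpy as np

n = 8
k = np.arange(n)
omega = np.exp(2j * np.pi / n)
W = omega ** (-np.outer(k, k))          # W[k, l] = omega^(-k*l)

assert np.allclose(W.conj().T @ W, n * np.eye(n))   # W^H W = n I
x = np.arange(n, dtype=float)
assert np.allclose((W.conj().T @ (W @ x)) / n, x)   # (1/n) W^H inverts W
print("(1/sqrt(n)) W is unitary")
```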
Gram matrix of DFT matrix
we show that WᴴW = nI

• the conjugate transpose of W is

  Wᴴ = [ 1   1          1           ···   1              ]
       [ 1   ω          ω^2         ···   ω^(n−1)        ]
       [ 1   ω^2        ω^4         ···   ω^2(n−1)       ]
       [ ⋮   ⋮          ⋮                 ⋮              ]
       [ 1   ω^(n−1)    ω^2(n−1)    ···   ω^(n−1)(n−1)   ]

• the i, j element of the Gram matrix is

  (WᴴW)ij = 1 + ω^(i−j) + ω^2(i−j) + ··· + ω^(n−1)(i−j)

hence

  (WᴴW)ii = n,    (WᴴW)ij = (ω^n(i−j) − 1) / (ω^(i−j) − 1) = 0 if i ≠ j

(the last step follows from ω^n = 1)
5-21