L. Vandenberghe EE133A (Spring 2017)

5. Orthogonal matrices (5-1)

• matrices with orthonormal columns
• orthogonal matrices
• tall matrices with orthonormal columns
• complex matrices with orthonormal columns

Orthonormal vectors (5-2)

a collection of real m-vectors a_1, a_2, ..., a_n is orthonormal if

• the vectors have unit norm: ||a_i|| = 1
• they are mutually orthogonal: a_i^T a_j = 0 if i ≠ j

Example: the three 3-vectors

    [ 0 ]         [ 1 ]         [  1 ]
    [ 0 ],   1/√2 [ 1 ],   1/√2 [ -1 ]
    [-1 ]         [ 0 ]         [  0 ]

Matrix with orthonormal columns (5-3)

A ∈ R^(m×n) has orthonormal columns if its Gram matrix is the identity matrix:

    A^T A = [ a_1^T a_1   a_1^T a_2   ...   a_1^T a_n ]
            [ a_2^T a_1   a_2^T a_2   ...   a_2^T a_n ]
            [    ...         ...      ...      ...    ]
            [ a_n^T a_1   a_n^T a_2   ...   a_n^T a_n ]

          = [ 1  0  ...  0 ]
            [ 0  1  ...  0 ]
            [ .  .   .   . ]
            [ 0  0  ...  1 ]

there is no standard short name for "matrix with orthonormal columns"

Matrix-vector product (5-4)

if A ∈ R^(m×n) has orthonormal columns, then the linear function f(x) = Ax

• preserves inner products: (Ax)^T (Ay) = x^T A^T A y = x^T y
• preserves norms: ||Ax|| = ((Ax)^T (Ax))^(1/2) = (x^T x)^(1/2) = ||x||
• preserves distances: ||Ax - Ay|| = ||x - y||
• preserves angles:

    ∠(Ax, Ay) = arccos( (Ax)^T (Ay) / (||Ax|| ||Ay||) )
              = arccos( x^T y / (||x|| ||y||) )
              = ∠(x, y)

Left invertibility (5-5)

if A ∈ R^(m×n) has orthonormal columns, then

• A is left invertible with left inverse A^T: by definition A^T A = I
• A has linearly independent columns (from p. 4-24 or p. 5-2):

    Ax = 0   ⟹   A^T A x = x = 0

• A is tall or square: m ≥ n (see page 4-13)
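The Gram-matrix condition and the norm and inner-product preservation on slides 5-2 to 5-4 are easy to verify numerically; a minimal NumPy sketch (the matrix A uses the first two example vectors of slide 5-2 as columns; the test vectors x, y are arbitrary):

```python
import numpy as np

# columns: the first and second example vectors of slide 5-2
A = np.array([[0.0, 1.0 / np.sqrt(2)],
              [0.0, 1.0 / np.sqrt(2)],
              [-1.0, 0.0]])

# Gram matrix A^T A is the identity, so A^T is a left inverse of A
assert np.allclose(A.T @ A, np.eye(2))

# the map f(x) = Ax preserves inner products and norms
x, y = np.array([3.0, -4.0]), np.array([1.0, 2.0])
assert np.isclose((A @ x) @ (A @ y), x @ y)
assert np.isclose(np.linalg.norm(A @ x), np.linalg.norm(x))
```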
Outline

• matrices with orthonormal columns
• orthogonal matrices
• tall matrices with orthonormal columns
• complex matrices with orthonormal columns

Orthogonal matrix (5-6)

a square real matrix with orthonormal columns is called orthogonal

Nonsingularity (from the equivalences on page 4-14): if A is orthogonal, then

• A is invertible, with inverse A^T:

    A^T A = I and A square   ⟹   A A^T = I

• A^T is also an orthogonal matrix
• the rows of A are orthonormal (have norm one and are mutually orthogonal)

Note: if A ∈ R^(m×n) has orthonormal columns and m > n, then A A^T ≠ I

Permutation matrix (5-7)

• let π = (π_1, π_2, ..., π_n) be a permutation (reordering) of (1, 2, ..., n)
• we associate with π the n × n permutation matrix A:

    A_(i,π_i) = 1,    A_(i,j) = 0 if j ≠ π_i

• Ax is a permutation of the elements of x: Ax = (x_(π_1), x_(π_2), ..., x_(π_n))
• A has exactly one element equal to 1 in each row and each column

Orthogonality: permutation matrices are orthogonal

• A^T A = I because A has exactly one element equal to one in each row:

    (A^T A)_(i,j) = Σ_(k=1..n) A_(k,i) A_(k,j) = { 1 if i = j
                                                 { 0 otherwise

• A^T = A^(-1) is the inverse permutation matrix

Example (5-8)

• permutation on {1, 2, 3, 4}: (π_1, π_2, π_3, π_4) = (2, 4, 1, 3)
• corresponding permutation matrix and its inverse:

    A = [ 0 1 0 0 ]        A^(-1) = A^T = [ 0 0 1 0 ]
        [ 0 0 0 1 ]                       [ 1 0 0 0 ]
        [ 1 0 0 0 ]                       [ 0 0 0 1 ]
        [ 0 0 1 0 ]                       [ 0 1 0 0 ]

• A^T is the permutation matrix associated with the permutation (π̃_1, π̃_2, π̃_3, π̃_4) = (3, 1, 4, 2)

Plane rotation (5-9)

rotation in a plane (the figure shows a vector x and the rotated vector Ax, at angle θ from x):

    A = [ cos θ   -sin θ ]
        [ sin θ    cos θ ]

rotation in a coordinate plane in R^n: for example,

    A = [ cos θ   0   -sin θ ]
        [   0     1     0    ]
        [ sin θ   0    cos θ ]

describes a rotation in the (x_1, x_3) plane in R^3

Reflector (5-10)

    A = I - 2aa^T    with a a unit-norm vector (||a|| = 1)

• a reflector matrix is symmetric and orthogonal:

    A^T A = (I - 2aa^T)(I - 2aa^T) = I - 4aa^T + 4aa^T aa^T = I

• Ax is the reflection of x through the hyperplane H = {u | a^T u = 0} (the figure shows x, the line through a and the origin, the projection y of x on H, and the reflection z):

    y = x - (a^T x)a = (I - aa^T)x,    z = y + (y - x) = (I - 2aa^T)x = Ax

(see page 2-44)
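A reflector is easy to build and sanity-check; a short NumPy sketch (the vector a is an arbitrary illustrative choice, normalized to unit norm as the definition requires):

```python
import numpy as np

a = np.array([1.0, 2.0, 2.0])
a /= np.linalg.norm(a)                   # reflectors require ||a|| = 1
A = np.eye(3) - 2.0 * np.outer(a, a)     # A = I - 2 a a^T

assert np.allclose(A, A.T)               # symmetric
assert np.allclose(A.T @ A, np.eye(3))   # orthogonal
assert np.allclose(A @ a, -a)            # a itself is flipped to -a
v = np.array([2.0, -1.0, 0.0])           # a^T v = 0, so v lies in H
assert np.allclose(A @ v, v)             # points of H are unchanged
```

Note that applying the reflector as `x - 2 * (a @ x) * a` never forms A explicitly and costs order n flops, which is the "reflector (given a)" entry in the cost table on slide 5-12.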
Product of orthogonal matrices (5-11)

if A_1, ..., A_k are orthogonal matrices of equal size, then the product A = A_1 A_2 ··· A_k is orthogonal:

    A^T A = (A_1 A_2 ··· A_k)^T (A_1 A_2 ··· A_k)
          = A_k^T ··· A_2^T A_1^T A_1 A_2 ··· A_k
          = I

Linear equation with orthogonal matrix (5-12)

linear equation with orthogonal coefficient matrix A of size n × n:

    Ax = b

solution is x = A^(-1) b = A^T b

• can be computed in 2n^2 flops by matrix-vector multiplication
• cost is less than order n^2 if A has special properties; for example,

    permutation matrix:     0 flops
    reflector (given a):    order n flops
    plane rotation:         order 1 flops

Outline

• matrices with orthonormal columns
• orthogonal matrices
• tall matrices with orthonormal columns
• complex matrices with orthonormal columns

Tall matrix with orthonormal columns (5-13)

suppose A ∈ R^(m×n) is tall (m > n) and has orthonormal columns

• A^T is a left inverse of A: A^T A = I
• A has no right inverse; in particular, A A^T ≠ I

on the next pages, we give a geometric interpretation to the matrix A A^T

Range (5-14)

• the span of a collection of vectors is the set of all their linear combinations:

    span(a_1, a_2, ..., a_n) = {x_1 a_1 + x_2 a_2 + ··· + x_n a_n | x ∈ R^n}

• the range of a matrix A ∈ R^(m×n) is the span of its column vectors:

    range(A) = {Ax | x ∈ R^n}

Example

    range( [ 1   0   1 ]     { [ x_1 + x_3        ]                     }
           [ 1   1   2 ] ) = { [ x_1 + x_2 + 2x_3 ] | x_1, x_2, x_3 ∈ R }
           [ 0  -1   1 ]     { [ -x_2 + x_3       ]                     }
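The asymmetry between A^T A and A A^T for a tall matrix (slide 5-13) can be checked numerically; a sketch that builds a tall matrix with orthonormal columns from a reduced QR factorization of a random matrix (the sizes and the random seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
# reduced QR of a random 5x2 matrix gives a 5x2 A with orthonormal columns
A, _ = np.linalg.qr(rng.standard_normal((5, 2)))

assert np.allclose(A.T @ A, np.eye(2))       # A^T is a left inverse
assert not np.allclose(A @ A.T, np.eye(5))   # but A A^T != I since m > n
```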
Projection on range of matrix with orthonormal columns (5-15)

suppose A ∈ R^(m×n) has orthonormal columns; we show that the vector A A^T b is the orthogonal projection of an m-vector b on range(A) = {Ax | x ∈ R^n} (the figure shows b and its projection Ax̂ = A A^T b on the plane range(A))

• x̂ = A^T b satisfies ||Ax̂ - b|| < ||Ax - b|| for all x ≠ x̂
• this extends the result on page 2-12 (where A = (1/||a||) a)

Proof (5-16): the squared distance of b to an arbitrary point Ax in range(A) is

    ||Ax - b||^2 = ||A(x - x̂) + Ax̂ - b||^2        (where x̂ = A^T b)
                 = ||A(x - x̂)||^2 + ||Ax̂ - b||^2 + 2(x - x̂)^T A^T (Ax̂ - b)
                 = ||A(x - x̂)||^2 + ||Ax̂ - b||^2
                 = ||x - x̂||^2 + ||Ax̂ - b||^2
                 ≥ ||Ax̂ - b||^2

with equality only if x = x̂

• line 3 follows because A^T (Ax̂ - b) = x̂ - A^T b = 0
• line 4 follows from A^T A = I

Orthogonal decomposition (5-17)

the vector b is decomposed as a sum b = z + y with z ∈ range(A) and y ⊥ range(A):

    z = A A^T b,    y = b - A A^T b

such a decomposition exists and is unique for every b:

    b = Ax + y,  A^T y = 0    ⟺    x = A^T b,  y = b - A A^T b

Outline

• matrices with orthonormal columns
• orthogonal matrices
• tall matrices with orthonormal columns
• complex matrices with orthonormal columns

Gram matrix (5-18)

A ∈ C^(m×n) has orthonormal columns if its Gram matrix is the identity matrix:

    A^H A = [ a_1^H a_1   a_1^H a_2   ...   a_1^H a_n ]
            [ a_2^H a_1   a_2^H a_2   ...   a_2^H a_n ]
            [    ...         ...      ...      ...    ]
            [ a_n^H a_1   a_n^H a_2   ...   a_n^H a_n ]
          = I

• columns have unit norm: ||a_i||^2 = a_i^H a_i = 1
• columns are mutually orthogonal: a_i^H a_j = 0 for i ≠ j
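The orthogonal decomposition of slides 5-15 to 5-17 (whose complex analogue simply replaces A^T by A^H) can be verified numerically; a sketch reusing a random tall matrix with orthonormal columns (sizes, seed, and the perturbed point x_other are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
A, _ = np.linalg.qr(rng.standard_normal((5, 2)))  # orthonormal columns
b = rng.standard_normal(5)

z = A @ (A.T @ b)      # z = A A^T b, the projection of b on range(A)
y = b - z              # the component orthogonal to range(A)

assert np.allclose(z + y, b)                  # b = z + y
assert np.allclose(A.T @ y, np.zeros(2))      # y is orthogonal to range(A)

# z is the closest point of range(A): any other Ax is farther from b
x_other = A.T @ b + np.array([0.1, -0.2])
assert np.linalg.norm(A @ x_other - b) > np.linalg.norm(z - b)
```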
Unitary matrix (5-19)

a square complex matrix with orthonormal columns is called unitary

Inverse:

    A^H A = I and A square   ⟹   A A^H = I

• a unitary matrix is nonsingular, with inverse A^H
• if A is unitary, then A^H is unitary

Discrete Fourier transform matrix (5-20)

recall the definition from page 3-37 (with ω = e^(2πj/n) and j = √-1):

    W = [ 1   1           1            ...   1               ]
        [ 1   ω^(-1)      ω^(-2)       ...   ω^(-(n-1))      ]
        [ 1   ω^(-2)      ω^(-4)       ...   ω^(-2(n-1))     ]
        [ .   .           .                  .               ]
        [ 1   ω^(-(n-1))  ω^(-2(n-1))  ...   ω^(-(n-1)(n-1)) ]

the matrix (1/√n) W is unitary (proof on the next page):

    (1/n) W^H W = (1/n) W W^H = I

• inverse of W is W^(-1) = (1/n) W^H
• inverse discrete Fourier transform of an n-vector x is W^(-1) x = (1/n) W^H x

Gram matrix of DFT matrix (5-21)

we show that W^H W = nI

• the conjugate transpose of W is

    W^H = [ 1   1          1            ...   1               ]
          [ 1   ω          ω^2          ...   ω^(n-1)         ]
          [ 1   ω^2        ω^4          ...   ω^(2(n-1))      ]
          [ .   .          .                  .               ]
          [ 1   ω^(n-1)    ω^(2(n-1))   ...   ω^((n-1)(n-1))  ]

• the i, j element of the Gram matrix is

    (W^H W)_(i,j) = 1 + ω^(i-j) + ω^(2(i-j)) + ··· + ω^((n-1)(i-j))

hence

    (W^H W)_(i,i) = n,    (W^H W)_(i,j) = (ω^(n(i-j)) - 1) / (ω^(i-j) - 1) = 0   if i ≠ j

(the last step follows from ω^n = 1)
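The Gram-matrix identity W^H W = nI, and the fact that W uses the same sign convention as NumPy's forward FFT, can be checked numerically; a sketch (n = 8 and the input x are arbitrary):

```python
import numpy as np

n = 8
omega = np.exp(2j * np.pi / n)
k = np.arange(n)
W = omega ** (-np.outer(k, k))    # DFT matrix: W[k, l] = omega^(-k l)

# Gram matrix W^H W = n I, so (1/sqrt(n)) W is unitary
assert np.allclose(W.conj().T @ W, n * np.eye(n))

# W x matches numpy's forward FFT, and (1/n) W^H is the inverse DFT
x = np.arange(n, dtype=float)
assert np.allclose(W @ x, np.fft.fft(x))
assert np.allclose((W.conj().T @ (W @ x)) / n, x)
```

In practice the product W x is of course computed with the FFT in order n log n operations rather than by forming W; the matrix is built here only to exhibit the Gram-matrix identity.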