Chapter 5
Orthogonality
Outline
- Scalar Product in $\mathbb{R}^n$
- Orthogonal Subspaces
- Least Squares Problems
- Inner Product Spaces
- Orthonormal Sets
- The Gram-Schmidt Orthogonalization Process
Scalar Product in $\mathbb{R}^n$
$$x^T y = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$$
Example: For
$$x = \begin{pmatrix}3\\-2\\1\end{pmatrix}, \qquad y = \begin{pmatrix}4\\3\\2\end{pmatrix},$$
we have $x^T y = 3\cdot 4 + (-2)\cdot 3 + 1\cdot 2 = 8$.
Def: Let x and y be vectors in either $\mathbb{R}^2$ or $\mathbb{R}^3$. The distance between x and y is defined to be the number $\|x - y\|$.
Theorem 5.1.1
If x and y are two nonzero vectors in either $\mathbb{R}^2$ or $\mathbb{R}^3$ and $\theta$ is the angle between them, then
$$x^T y = \|x\|\,\|y\|\cos\theta.$$
Proof: By the law of cosines,
$$\|y - x\|^2 = \|x\|^2 + \|y\|^2 - 2\|x\|\,\|y\|\cos\theta,$$
so that (working in $\mathbb{R}^3$)
$$\|x\|\,\|y\|\cos\theta = \frac{1}{2}\left(\|x\|^2 + \|y\|^2 - \|y - x\|^2\right) = \frac{1}{2}\left(\sum_{i=1}^3 x_i^2 + \sum_{i=1}^3 y_i^2 - \sum_{i=1}^3 (y_i - x_i)^2\right) = \sum_{i=1}^3 x_i y_i = x^T y.$$
Corollary 5.1.2 (Cauchy-Schwarz Inequality)
If x and y are vectors in either $\mathbb{R}^2$ or $\mathbb{R}^3$, then
$$|x^T y| \le \|x\|\,\|y\|,$$
with equality holding if and only if one of the vectors is 0 or one vector is a multiple of the other.
Note: If $\theta$ is the angle between x and y, then
$$\cos\theta = \frac{x^T y}{\|x\|\,\|y\|}, \quad\text{and thus}\quad -1 \le \frac{x^T y}{\|x\|\,\|y\|} \le 1.$$
Def: The vectors x and y in $\mathbb{R}^2$ (or $\mathbb{R}^3$) are said to be orthogonal if $x^T y = 0$.
Examples:
A: $0 \perp x$ for every $x \in \mathbb{R}^2$.
B: $\begin{pmatrix}3\\2\end{pmatrix} \perp \begin{pmatrix}-4\\6\end{pmatrix}$ since $3\cdot(-4) + 2\cdot 6 = 0$.
Scalar and Vector Projections
The scalar projection of x onto y is
$$\alpha = \frac{x^T y}{\|y\|} = \|x\|\cos\theta.$$
The vector projection of x onto y is
$$p = \alpha u = \alpha\,\frac{1}{\|y\|}\,y = \frac{x^T y}{y^T y}\,y, \qquad\text{where } u = \frac{1}{\|y\|}\,y,$$
and $z = x - p$ is the component of x orthogonal to y.
1.4
v
Example: Find the point
1
1
y x
on the line y  x that
3
3
Q
is closest to the point
(1,4)
1
 3
Sol: Note that the vector w    is on the line y  x
3
1
Thus the desired point is
 v w 
 2.1 
  w  

 0.7 
 w w
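A minimal numerical sketch of this closest-point computation (numpy assumed; not part of the original slides):

```python
import numpy as np

v = np.array([1.0, 4.0])   # the given point (1, 4)
w = np.array([3.0, 1.0])   # direction vector of the line y = x/3

# vector projection of v onto w: p = (v.w / w.w) w
p = (v @ w) / (w @ w) * w
print(p)                   # [2.1 0.7] -- the closest point on the line
```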
Example: Find the equation of the plane passing through $(2, -1, 3)$ with normal vector $(2, 3, 4)^T$.
Sol:
$$\begin{pmatrix}2\\3\\4\end{pmatrix}\cdot\left(\begin{pmatrix}x\\y\\z\end{pmatrix} - \begin{pmatrix}2\\-1\\3\end{pmatrix}\right) = 0$$
$$\Rightarrow\; 2(x - 2) + 3(y + 1) + 4(z - 3) = 0.$$
Example: Find the distance from $P = (2, 0, 0)$ to the plane $x + 2y + 2z = 0$.
Sol: A normal vector to the plane is $n = (1, 2, 2)^T$. Since the plane passes through the origin, the distance is the absolute scalar projection of P onto n:
$$d = \frac{|P^T n|}{\|n\|} = \frac{2}{3}.$$
Application 1: Information Retrieval Revisited
Table 1. Frequency of Key Words

Key Words         M1  M2  M3  M4  M5  M6  M7  M8
determinants       0   6   3   0   1   0   1   1
eigenvalues        0   0   0   0   0   5   3   2
linear             5   4   4   5   4   0   3   3
matrices           6   5   3   3   4   4   3   2
numerical          0   0   0   0   3   0   4   3
orthogonality      0   0   0   0   4   6   0   2
spaces             0   0   5   2   3   3   0   1
systems            5   3   3   2   2   2   1   1
transformations    0   0   0   5   3   3   1   0
vector             0   4   4   3   2   2   0   3
If A is the matrix corresponding to Table 1, then the columns of the database matrix Q are formed by normalizing the columns of A:
$$q_j = \frac{1}{\|a_j\|}\,a_j, \qquad j = 1, \ldots, 8.$$
To do a search for the key words orthogonality, spaces, vector, we form a unit search vector x whose entries are all zero except for the three rows corresponding to the search words, each of which is set to $1/\sqrt{3}$.
 0.000

 0.000
 0.539

 0.647
 0.000
Q
 0.000
 0.000

 0.539
 0.000

 0.000

0.594 0.327 0.000 0.100 0.000 0.147 0.154 

0.000 0.000 0.000 0.000 0.500 0.442 0.309 
0.396 0.436 0.574 0.400 0.000 0.442 0.463 

0.495 0.327 0.344 0.400 0.400 0.442 0.309 
0.000 0.000 0.000 0.300 0.000 0.590 0.463 

0.000 0.000 0.000 0.400 0.600 0.000 0.309 
0.000 0.546 0.229 0.300 0.300 0.000 0.154 

0.297 0.327 0.229 0.400 0.200 0.147 0.154 
0.000 0.000 0.574 0.100 0.300 0.147 0.000 
0.396 0.436 0.344 0.400 0.200 0.000 0.463 
 0.000 


0.000


 0.000 


0.000


 0.000 
x 

 0.577 
 0.577 


 0.000 
 0.000 


 0.577 


y  QT x  yi  qiT x  cos i
where i is the angle between the unit vectors x and qi .
For our example,
y   0.000, 0.229, 0.567, 0.310, 0.635, 0.577, 0.000, 0.535 
T
Since $y_5 = 0.635$ is the entry of y that is closest to 1, the direction of the search vector x is closest to the direction of $q_5$, and hence Module 5 is the one that best matches our search criteria.
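The mechanics of this search are easy to sketch in code. Below is a toy example with made-up counts (hypothetical data, not Table 1, whose printed Q and y involve rounding), numpy assumed:

```python
import numpy as np

# Toy database: rows = key words, columns = modules (made-up counts).
A = np.array([[3.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [4.0, 1.0, 0.0]])

Q = A / np.linalg.norm(A, axis=0)   # q_j = a_j / ||a_j||

x = np.array([1.0, 0.0, 1.0])       # search on key words 1 and 3
x = x / np.linalg.norm(x)           # unit search vector

y = Q.T @ x                         # y_i = cos(theta_i)
print(y)                            # [0.99 0.32 0.5 ]
print("best module:", np.argmax(y) + 1)
```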
Application 2: Correlation and Covariance Matrices
Table 2. Math Scores, Fall 1996

Student    Assignments  Exams  Final
S1             198       200    196
S2             160       165    165
S3             158       158    133
S4             150       165     91
S5             175       182    151
S6             134       135    101
S7             152       136     80
Average        161       163    131
$$X = \begin{pmatrix}
 37 &  37 &  65\\
 -1 &   2 &  34\\
 -3 &  -5 &   2\\
-11 &   2 & -40\\
 14 &  19 &  20\\
-27 & -28 & -30\\
 -9 & -27 & -51
\end{pmatrix}$$
The column vectors of X represent the deviations from the
mean for each of the three sets of scores.
The three sets of translated data specified by the column
vectors of X all have mean 0 and all sum to 0.
A cosine value near 1 indicates that the two sets of scores
are highly correlated.
Scale the column vectors of X to make them unit vectors,
$$u_1 = \frac{1}{\|x_1\|}\,x_1, \qquad u_2 = \frac{1}{\|x_2\|}\,x_2, \qquad u_3 = \frac{1}{\|x_3\|}\,x_3,$$
giving
$$U = \begin{pmatrix}
 0.74 &  0.65 &  0.62\\
-0.02 &  0.03 &  0.33\\
-0.06 & -0.09 &  0.02\\
-0.22 &  0.03 & -0.38\\
 0.28 &  0.33 &  0.19\\
-0.54 & -0.49 & -0.29\\
-0.18 & -0.47 & -0.49
\end{pmatrix}.$$
If we set $C = U^T U$, then
$$C = \begin{pmatrix}1 & 0.92 & 0.83\\ 0.92 & 1 & 0.83\\ 0.83 & 0.83 & 1\end{pmatrix}.$$
The matrix C is referred to as a correlation matrix.
The three sets of scores in our example are all positively
correlated since the correlation coefficients are all positive.
A negative coefficient would indicate that two data sets were
negatively correlated.
A coefficient of 0 would indicate that they were uncorrelated.
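A minimal sketch that reproduces C from the raw scores (numpy assumed):

```python
import numpy as np

scores = np.array([
    [198, 200, 196],
    [160, 165, 165],
    [158, 158, 133],
    [150, 165,  91],
    [175, 182, 151],
    [134, 135, 101],
    [152, 136,  80],
], dtype=float)

X = scores - scores.mean(axis=0)       # deviations from the column means
U = X / np.linalg.norm(X, axis=0)      # scale each column to unit length
C = U.T @ U                            # correlation matrix
print(np.round(C, 2))                  # [[1. 0.92 0.83] [0.92 1. 0.83] [0.83 0.83 1.]]
```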
5-2 Orthogonal Subspaces
Def: Two subspaces X and Y of $\mathbb{R}^n$ are said to be orthogonal if $x^T y = 0$ for every $x \in X$ and $y \in Y$. If X and Y are orthogonal, we write $X \perp Y$.
Example: Let $X = \mathrm{span}\{e_1, e_2\} \subseteq \mathbb{R}^3$ and $Y = \mathrm{span}\{e_3\}$; then $X \perp Y$.
Def: Let Y be a subspace of $\mathbb{R}^n$. The set of all vectors in $\mathbb{R}^n$ that are orthogonal to every vector in Y will be denoted $Y^\perp$. Thus
$$Y^\perp = \{x \in \mathbb{R}^n \mid x^T y = 0 \text{ for every } y \in Y\}.$$
The set $Y^\perp$ is called the orthogonal complement of Y.
Example: Let $X = \mathrm{span}\{e_1\} \subseteq \mathbb{R}^3$; then $X^\perp = \mathrm{span}\{e_2, e_3\}$.
Remarks:
1. If X and Y are orthogonal subspaces of $\mathbb{R}^n$, then $X \cap Y = \{0\}$.
2. If Y is a subspace of $\mathbb{R}^n$, then $Y^\perp$ is also a subspace of $\mathbb{R}^n$.
Proof (1): If $x \in X \cap Y$ and $X \perp Y$, then $\|x\|^2 = x^T x = 0$ and hence $x = 0$.
Proof (2): If $x \in Y^\perp$ and $\alpha$ is a scalar, then for any $y \in Y$,
$$(\alpha x)^T y = \alpha(x^T y) = \alpha \cdot 0 = 0 \;\Rightarrow\; \alpha x \in Y^\perp.$$
If $x_1, x_2 \in Y^\perp$, then
$$(x_1 + x_2)^T y = x_1^T y + x_2^T y = 0 + 0 = 0 \text{ for each } y \in Y \;\Rightarrow\; x_1 + x_2 \in Y^\perp.$$
Therefore, $Y^\perp$ is a subspace of $\mathbb{R}^n$.
Four Fundamental Subspaces
Let $A \in \mathbb{R}^{m \times n}$, viewed as a map $A : \mathbb{R}^n \to \mathbb{R}^m$, $x \mapsto Ax$, with $A^T \in \mathbb{R}^{n \times m}$. The four fundamental subspaces are
$$N(A) = \{x \in \mathbb{R}^n \mid Ax = 0\} \subseteq \mathbb{R}^n$$
$$N(A^T) = \{x \in \mathbb{R}^m \mid A^T x = 0\} \subseteq \mathbb{R}^m$$
$$R(A) = \{b \in \mathbb{R}^m \mid b = Ax \text{ for some } x \in \mathbb{R}^n\} \subseteq \mathbb{R}^m$$
$$R(A^T) = \{b \in \mathbb{R}^n \mid b = A^T x \text{ for some } x \in \mathbb{R}^m\} \subseteq \mathbb{R}^n$$
It will be shown later that $N(A) = R(A^T)^\perp$ and $N(A^T) = R(A)^\perp$, and that
$$\mathbb{R}^n = N(A) \oplus R(A^T), \qquad \mathbb{R}^m = N(A^T) \oplus R(A).$$
Theorem 5.2.1 (Fundamental Subspace Theorem)
Let $A \in \mathbb{R}^{m \times n}$. Then $N(A) = R(A^T)^\perp$ and $N(A^T) = R(A)^\perp$.
pf: Let $x \in N(A)$. Then
$$A(i,:)\,x = 0, \quad i = 1, \ldots, m. \tag{1}$$
Let $y \in R(A^T)$, so that
$$y = \sum_{i=1}^m \alpha_i\,A^T(:, i) \quad\text{for some } \alpha_i\text{'s}. \tag{2}$$
By (1) and (2),
$$x^T y = \sum_{i=1}^m \alpha_i\,x^T A^T(:, i) = 0 \;\Rightarrow\; N(A) \perp R(A^T) \;\Rightarrow\; N(A) \subseteq R(A^T)^\perp.$$
Also, if $z \in R(A^T)^\perp$, then
$$z^T A^T(:, i) = 0, \; i = 1, \ldots, m \;\Rightarrow\; A(i,:)\,z = 0, \; i = 1, \ldots, m \;\Rightarrow\; Az = 0 \;\Rightarrow\; z \in N(A),$$
so $R(A^T)^\perp \subseteq N(A)$. Hence $N(A) = R(A^T)^\perp$. Applying the same argument to $A^T$ gives $N(A^T) = R(A)^\perp$.
 1 0
1 2

Example: Let A  
  A  

 2 0
0 0
 0 
 N  A  span  
 1 
  2 

N A   span 
 1 
 1 
R A  span  
 2 
 1 
R A  span  
 0 
 
Clearly,
 
N  A  R A
 
 
N A  R  A 

Theorem 5.2.2
If S is a subspace of $\mathbb{R}^n$, then $\dim S + \dim S^\perp = n$. Furthermore, if $\{x_1, \ldots, x_r\}$ is a basis for S and $\{x_{r+1}, \ldots, x_n\}$ is a basis for $S^\perp$, then $\{x_1, \ldots, x_r, x_{r+1}, \ldots, x_n\}$ is a basis for $\mathbb{R}^n$.
Proof: If $S = \{0\}$, then $S^\perp = \mathbb{R}^n$ and the result follows. Suppose $S \neq \{0\}$ and let $X = (x_1 \cdots x_r) \in \mathbb{R}^{n \times r}$. Then $R(X) = S$ and $\mathrm{rank}(X) = r = \mathrm{rank}(X^T)$. By Theorem 5.2.1,
$$S^\perp = R(X)^\perp = N(X^T),$$
so by Thm 3.6.4, $\dim(S^\perp) = \dim N(X^T) = n - r$.
To show that $\{x_1, \ldots, x_n\}$ is a basis for $\mathbb{R}^n$, it remains to show their independence. Let $\sum_{i=1}^n c_i x_i = 0$ and set $x = \sum_{i=1}^r c_i x_i \in S$. Then also $x = -\sum_{i=r+1}^n c_i x_i \in S^\perp$, so
$$\|x\|^2 = \left\langle \sum_{i=1}^r c_i x_i,\; -\sum_{i=r+1}^n c_i x_i \right\rangle = 0 \;\Rightarrow\; x = 0.$$
Hence $\sum_{i=1}^r c_i x_i = 0$, which gives $c_i = 0$ for $i = 1, \ldots, r$ by independence, and similarly $\sum_{i=r+1}^n c_i x_i = 0$ gives $c_i = 0$ for $i = r+1, \ldots, n$.
Def: If U and V are subspaces of a vector space W and each $w \in W$ can be written uniquely as a sum $u + v$, where $u \in U$ and $v \in V$, then we say that W is a direct sum of U and V, and we write $W = U \oplus V$.
Theorem 5.2.3: If S is a subspace of $\mathbb{R}^n$, then $\mathbb{R}^n = S \oplus S^\perp$.
pf: By Theorem 5.2.2, every $x \in \mathbb{R}^n$ can be written as $x = u + v$ with $u \in S$ and $v \in S^\perp$. To show uniqueness, suppose
$$x = u_1 + v_1 = u_2 + v_2, \qquad u_1, u_2 \in S, \; v_1, v_2 \in S^\perp.$$
Then $u_1 - u_2 = v_2 - v_1 \in S \cap S^\perp = \{0\}$, so $u_1 = u_2$ and $v_2 = v_1$.
Theorem 5.2.4: If S is a subspace of $\mathbb{R}^n$, then $(S^\perp)^\perp = S$.
pf: Let $\dim(S) = r$. By Theorem 5.2.2, $\dim(S^\perp) = n - r$ and hence $\dim((S^\perp)^\perp) = r$. If $x \in S$, then $x^T y = 0$ for every $y \in S^\perp$, so $x \in (S^\perp)^\perp$ and $S \subseteq (S^\perp)^\perp$. Since the dimensions agree, $S = (S^\perp)^\perp$.
Remark: Let $A \in \mathbb{R}^{m \times n}$, i.e., $A : \mathbb{R}^n \to \mathbb{R}^m$. Since $\mathbb{R}^n = N(A) \oplus R(A^T)$ and $\mathrm{rank}(A) = \mathrm{rank}(A^T)$,
$$n = \mathrm{nullity}(A) + \mathrm{rank}(A), \qquad m = \mathrm{nullity}(A^T) + \mathrm{rank}(A^T),$$
and the restrictions
$$A : R(A^T) \to R(A), \qquad A^T : R(A) \to R(A^T)$$
are bijections. Schematically: A maps $\mathbb{R}^n = N(A) \oplus R(A^T)$ into $\mathbb{R}^m$, sending $N(A)$ to 0 and $R(A^T)$ bijectively onto $R(A)$; likewise $A^T$ maps $\mathbb{R}^m = N(A^T) \oplus R(A)$ into $\mathbb{R}^n$, sending $N(A^T)$ to 0 and $R(A)$ bijectively onto $R(A^T)$.
Cor 5.2.5: Let $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$. Then either
(i) there exists $x \in \mathbb{R}^n$ with $Ax = b$, or
(ii) there exists $y \in \mathbb{R}^m$ with $A^T y = 0$ and $y^T b \neq 0$.
pf: $\mathbb{R}^m = R(A) \oplus N(A^T)$.
(i) If $b \in R(A)$, then there exists $x \in \mathbb{R}^n$ with $Ax = b$.
(ii) If $b \notin R(A)$, then since $\mathbb{R}^m = R(A) \oplus N(A^T)$, there exists $y \in N(A^T)$ with $y^T b \neq 0$, i.e., $A^T y = 0$ and $y^T b \neq 0$.
1 1 2


Example: Let A   0 1 1  . Find N ( A), R( A), N ( A ), R( A )
1 3 4


The basic idea is that the row space and the sol. of
Ax  b are invariant under row operations.
 1  0 
1 0 1
Sol: (i) row


  

A ~ Ar   0 1 1   R( A )  span  0  1 
(Why?)
 1  1 
0 0 0
  


 1 
(ii)
 

N
(
A
)

span
 1 
Ar x  0  x1  x3  0 & x2  x3  0
  1
 
 1  0 
1 0 1
  
row



(iii) Similarly, A ~  0 1 2   R( A)  span  0  1 
 1  2 
0 0 0
  


 1 
 

and N ( A )  span  2 
  1
 
 
(iv) Clearly, N  A  R A & N ( A )  R( A)
(Why?)
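The same subspaces can be checked in exact arithmetic; here is a sketch assuming sympy is available:

```python
from sympy import Matrix

A = Matrix([[1, 1, 2],
            [0, 1, 1],
            [1, 3, 4]])

print(A.nullspace())     # [Matrix([-1, -1, 1])]  -> basis of N(A)
print(A.T.nullspace())   # [Matrix([-1, -2, 1])]  -> basis of N(A^T)
print(A.rref()[0])       # nonzero rows span R(A^T)
print(A.T.rref()[0])     # nonzero rows span R(A)
```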
 2 0 0 3
Example: Let A  
 :   2
 0 3 0
 0 
 1  0 
(i) 3  N  A  R A  span  0   span  0  1 
 
  
 1 
 0  0 
 
  
 
and R( A)  2
(ii) The mapping A

R( A
and A R ( A

1

)
 

:
R
A
 R( A) is a bijection
)
 x1   2 x1 
  

 x2    3 x2 
0  0 
  

: R A  R( A )
1 
 y1 
2 
 y1   1 
  
y2


y
3
 2
 0 




(iv) What is the matrix representation for A R ( A ) ?
5-4 Inner Product Spaces
A tool to measure the orthogonality of two vectors in a general vector space.
Def: An inner product on a vector space V is a function $\langle\cdot,\cdot\rangle : V \times V \to \mathbb{R}$ (or $\mathbb{C}$) satisfying the following conditions:
(i) $\langle x, x \rangle \ge 0$, with equality iff $x = 0$
(ii) $\langle x, y \rangle = \langle y, x \rangle$
(iii) $\langle \alpha x + \beta y, z \rangle = \alpha\langle x, z \rangle + \beta\langle y, z \rangle$
Example:
(i) Let $x, y \in \mathbb{R}^n$ and $w_i > 0$, $i = 1, \ldots, n$. Then
$$\langle x, y \rangle = \sum_{i=1}^n w_i x_i y_i$$
is an inner product on $\mathbb{R}^n$.
(ii) Let $A, B \in \mathbb{R}^{m \times n}$. Then
$$\langle A, B \rangle = \sum_{i=1}^m \sum_{j=1}^n a_{ij} b_{ij}$$
is an inner product on $\mathbb{R}^{m \times n}$.
(iii) Let $f, g, w \in C[a, b]$ with $w(x) > 0$. Then
$$\langle f, g \rangle = \int_a^b w(x) f(x) g(x)\,dx$$
is an inner product on $C[a, b]$.
(iv) Let $p, q \in P_n$, let $w(x)$ be a positive function, and let $x_1, \ldots, x_n$ be distinct real numbers. Then
$$\langle p, q \rangle = \sum_{i=1}^n w(x_i)\, p(x_i)\, q(x_i)$$
is an inner product on $P_n$.
Def: Let , be an inner product of a
vector space V and u , v V .
we say u  v  u , v  0
The length or norm of v is given
by
v 
v, v
Theorem 5.4.1 (The Pythagorean Law)
If $u \perp v$, then $\|u + v\|^2 = \|u\|^2 + \|v\|^2$.
pf:
$$\|u + v\|^2 = \langle u + v, u + v \rangle = \langle u, u \rangle + \langle u, v \rangle + \langle v, u \rangle + \langle v, v \rangle = \|u\|^2 + \|v\|^2.$$
Example 1: Consider $C[-1, 1]$ with inner product
$$\langle f, g \rangle = \int_{-1}^{1} f(x) g(x)\,dx.$$
(i) $\langle 1, x \rangle = \int_{-1}^{1} x\,dx = 0 \;\Rightarrow\; 1 \perp x$
(ii) $\langle 1, 1 \rangle = \int_{-1}^{1} 1\,dx = 2 \;\Rightarrow\; \|1\| = \sqrt{2}$
(iii) $\langle x, x \rangle = \int_{-1}^{1} x^2\,dx = \frac{2}{3} \;\Rightarrow\; \|x\| = \sqrt{2/3}$
(iv) $\|1 + x\|^2 = \|1\|^2 + \|x\|^2 = 2 + \frac{2}{3} = \frac{8}{3}$ (Pythagorean Law), or directly
$$\|1 + x\|^2 = \langle 1 + x, 1 + x \rangle = \int_{-1}^{1} (1 + x)^2\,dx = \frac{8}{3}.$$
Example 2: Consider $C[-\pi, \pi]$ with inner product
$$\langle f, g \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) g(x)\,dx.$$
It can be shown that
(i) $\left\langle \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right\rangle = 1$
(ii) $\langle \cos nx, \sin mx \rangle = 0$
(iii) $\langle \cos mx, \cos nx \rangle = \delta_{mn}$ and $\langle \sin mx, \sin nx \rangle = \delta_{mn}$
Thus $\left\{\frac{1}{\sqrt{2}}, \cos nx, \sin nx \mid n \in \mathbb{N}\right\}$ is an orthonormal set.
Remark: $\cos x \perp \sin x$, so $\|\cos x + \sin x\|^2 = \|\cos x\|^2 + \|\sin x\|^2 = 2$.
Remark: The inner product in Example 2 plays a key role in Fourier analysis applications involving trigonometric approximation of functions.
Example 3: Let $A, B \in \mathbb{R}^{m \times n}$ with
$$\langle A, B \rangle = \sum_{i=1}^m \sum_{j=1}^n a_{ij} b_{ij} = \mathrm{trace}(AB^T) = \sum_{i=1}^m (AB^T)_{ii},$$
and let $\|A\|_F = \sqrt{\langle A, A \rangle}$ (the Frobenius norm). For
$$A = \begin{pmatrix}1 & 1\\ 1 & 2\\ 3 & 3\end{pmatrix}, \qquad B = \begin{pmatrix}-1 & 1\\ 3 & 0\\ -3 & 4\end{pmatrix},$$
we get $\langle A, B \rangle = 6 \neq 0$, so A is not orthogonal to B, while $\|A\|_F = 5$ and $\|B\|_F = 6$.
Def: Let u and $v \neq 0$ be two vectors in an inner product space V. Then the scalar projection of u onto v is defined as
$$\alpha = \frac{\langle u, v \rangle}{\|v\|} = \left\langle u, \frac{1}{\|v\|}\,v \right\rangle.$$
The vector projection of u onto v is
$$p = \alpha\,\frac{1}{\|v\|}\,v = \frac{\langle u, v \rangle}{\langle v, v \rangle}\,v.$$
Lemma: Let $v \neq 0$ and let p be the vector projection of u onto v. Then
(i) $(u - p) \perp p$
(ii) $u = p \iff u = kv$ for some scalar k
pf: (i)
$$\langle p, u - p \rangle = \langle p, u \rangle - \langle p, p \rangle = \frac{\langle u, v \rangle^2}{\langle v, v \rangle} - \frac{\langle u, v \rangle^2}{\langle v, v \rangle} = 0 \;\Rightarrow\; p \perp (u - p).$$
(ii) trivial.
Theorem 5.4.2 (Cauchy-Schwarz Inequality)
Let u and v be two vectors in an inner product space V. Then
$$|\langle u, v \rangle| \le \|u\|\,\|v\|.$$
Moreover, equality holds iff u and v are linearly dependent.
pf: If $v = 0$, then $|\langle u, v \rangle| = 0 = \|u\|\,\|v\|$. If $v \neq 0$, let p be the vector projection of u onto v. Since $(u - p) \perp p$, the Pythagorean theorem gives
$$\|u\|^2 = \|p\|^2 + \|u - p\|^2 \ge \|p\|^2 = \frac{\langle u, v \rangle^2}{\|v\|^2},$$
so $\langle u, v \rangle^2 \le \|u\|^2\,\|v\|^2$. Equality holds iff $v = 0$ or $u = p$, i.e., iff u and v are linearly dependent.
Note: From the Cauchy-Schwarz inequality for $F = \mathbb{R}$,
$$-1 \le \frac{\langle u, v \rangle}{\|u\|\,\|v\|} \le 1 \quad\text{if } u \neq 0 \text{ and } v \neq 0,$$
so there is a unique $\theta \in [0, \pi]$ with
$$\cos\theta = \frac{\langle u, v \rangle}{\|u\|\,\|v\|}.$$
Thus, we can define $\theta$ as the angle between the two nonzero vectors u and v.
Def: Let V be a vector space. A function
$$\|\cdot\| : V \to \mathbb{R}^+ \cup \{0\}, \qquad v \mapsto \|v\|,$$
is said to be a norm if it satisfies
(i) $\|v\| \ge 0$, with equality $\iff v = 0$
(ii) $\|\alpha v\| = |\alpha|\,\|v\|$ for every scalar $\alpha$
(iii) $\|v + w\| \le \|v\| + \|w\|$ (triangle inequality)
Remark: Such a vector space is called a normed linear space.
Theorem 5.4.3: If V is an inner product space, then $\|v\| = \sqrt{\langle v, v \rangle}$, $v \in V$, defines a norm on V.
pf: trivial
Def: The distance between u and v is defined as $\|u - v\|$.
Example: Let $x \in \mathbb{R}^n$. Then
(i) $\|x\|_1 = \sum_{i=1}^n |x_i|$ is a norm.
(ii) $\|x\|_\infty = \max_{1 \le i \le n} |x_i|$ is a norm.
(iii) $\|x\|_p = \left(\sum_{i=1}^n |x_i|^p\right)^{1/p}$ is a norm for any $p \ge 1$. In particular, for $p = 2$,
$$\|x\|_2 = \sqrt{\sum_{i=1}^n x_i^2} = \sqrt{\langle x, x \rangle}$$
is the Euclidean norm.
Remark: In the case of a norm that is not derived from an inner product, the Pythagorean Law will not hold.
Example: Let $x_1 = \begin{pmatrix}1\\2\end{pmatrix}$ and $x_2 = \begin{pmatrix}-4\\2\end{pmatrix}$, so $x_1^T x_2 = 0$. Then
$$\|x_1\|_2^2 + \|x_2\|_2^2 = 5 + 20 = 25 = \|x_1 + x_2\|_2^2.$$
However,
$$\|x_1\|_\infty^2 + \|x_2\|_\infty^2 = 4 + 16 = 20 \neq \|x_1 + x_2\|_\infty^2 = 16. \quad\text{(Why?)}$$
 4
 
x


5
Example: Let
 
 3
 
then
,
x 1  12
x 2 5 2
x

5


Example: Let $B = \{x \in \mathbb{R}^2 \mid \|x\| \le 1\}$. [Figure: the unit balls $B_1$, $B_2$, $B_\infty$ for the 1-, 2-, and $\infty$-norms.]
5-3 Least Squares Problems
A typical example: Given data $(x_i, y_i)$, $i = 1, \ldots, n$, find the best line $y = c_0 + c_1 x$ to fit the data, i.e., find $c_0, c_1$ that "solve"
$$\begin{pmatrix}1 & x_1\\ 1 & x_2\\ \vdots & \vdots\\ 1 & x_n\end{pmatrix}\begin{pmatrix}c_0\\ c_1\end{pmatrix} = \begin{pmatrix}y_1\\ y_2\\ \vdots\\ y_n\end{pmatrix}, \quad\text{or}\quad Ac = y,$$
in the sense that $\|Ac - y\|$ is minimized.
Geometrical meaning: the line $y = c_0 + c_1 x$ passes as close as possible to the data points $(x_1, y_1), \ldots, (x_n, y_n)$.
Least squares problems:
Given $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$, the equation $Ax = b$ may not have solutions, i.e., possibly $b \notin \mathrm{Col}(A) = R(A)$.
The objective of the least squares problem is to find $\hat{x}$ such that $\|b - A\hat{x}\|$ is minimal, i.e., to find $\hat{x}$ satisfying
$$\|b - A\hat{x}\| = \min_{x \in \mathbb{R}^n} \|b - Ax\|.$$
Preview of the results:
It will be shown that there is a unique $p \in R(A)$ with $\|b - p\| = \min_{y \in R(A)} \|y - b\|$, and that
$$b - p \in R(A)^\perp = N(A^T) \;\Rightarrow\; A^T(b - p) = 0 \;\Rightarrow\; A^T(b - A\hat{x}) = 0 \;\Rightarrow\; A^T A \hat{x} = A^T b$$
$$\Rightarrow\; \hat{x} = (A^T A)^{-1} A^T b \quad\text{if the columns of A are linearly independent.}$$
Theorem 5.3.1: Let S be a subspace of $\mathbb{R}^m$. Then
(i) For each $b \in \mathbb{R}^m$ there is a unique $p \in S$ such that $\|b - y\| > \|b - p\|$ for all $y \in S \setminus \{p\}$.
(ii) $p \in S$ satisfies $\|b - p\| = \min_{y \in S} \|y - b\| \iff b - p \in S^\perp$.
pf: (i) Since $\mathbb{R}^m = S \oplus S^\perp$, we have the unique expression $b = p + z$ where $p \in S$ and $z = b - p \in S^\perp$. If $y \in S \setminus \{p\}$, then since $p - y \in S$ and $b - p \in S^\perp$, the Pythagorean theorem gives
$$\|b - y\|^2 = \|(b - p) + (p - y)\|^2 = \|b - p\|^2 + \|p - y\|^2 > \|b - p\|^2.$$
(ii) follows directly from (i) by noting that $b - p = z \in S^\perp$.
Question: How do we find $\hat{x}$ which solves
$$\|A\hat{x} - b\| = \min_{x \in \mathbb{R}^n} \|b - Ax\|?$$
Ans.: From the previous theorem with $p = A\hat{x}$, we know that
$$b - p \in R(A)^\perp = N(A^T) \;\Rightarrow\; A^T(b - p) = 0 \;\Rightarrow\; A^T b - A^T A \hat{x} = 0.$$
Definition: $A^T A x = A^T b$ is called the normal equation.
Remark: In general, it is possible to have more than one solution to the normal equation. If $\hat{x}$ is a solution, then the general solution is of the form $\hat{x} + h$, where $h \in N(A)$.
Theorem 5.3.2: Let $A \in \mathbb{R}^{m \times n}$ and $\mathrm{rank}(A) = n$. Then the normal equation $A^T A x = A^T b$ has the unique solution
$$\hat{x} = (A^T A)^{-1} A^T b,$$
and $\hat{x}$ is the unique least squares solution to $Ax = b$.
pf: To show that $A^T A$ is nonsingular, let $A^T A x = 0$. Then $Ax \in N(A^T) \cap R(A) = \{0\}$, so $Ax = 0$ and hence $x = 0$ (since $\mathrm{rank}(A) = n$). Therefore $\hat{x} = (A^T A)^{-1} A^T b$ is the unique solution.
Note: The projection vector
$$p = A\hat{x} = A(A^T A)^{-1} A^T b$$
is the element of $R(A)$ that is closest to b in the least squares sense. Thus the matrix $P = A(A^T A)^{-1} A^T$ is called the projection matrix (it projects any vector of $\mathbb{R}^m$ onto $R(A)$).
Application 2: Spring Constants
Suppose a spring obeys Hooke's law $F = kx$, and a series of measurements (with measurement error) are taken:

F: 3, 5, 8
x: 4, 7, 11

How do we determine k?
Sol: Note that
$$\begin{pmatrix}4\\7\\11\end{pmatrix} k = \begin{pmatrix}3\\5\\8\end{pmatrix}$$
is inconsistent. The normal equation is
$$\begin{pmatrix}4 & 7 & 11\end{pmatrix}\begin{pmatrix}4\\7\\11\end{pmatrix} k = \begin{pmatrix}4 & 7 & 11\end{pmatrix}\begin{pmatrix}3\\5\\8\end{pmatrix},$$
so $186k = 135$ and $k \approx 0.726$.
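The same estimate via a library least squares call (numpy assumed):

```python
import numpy as np

x = np.array([[4.0], [7.0], [11.0]])   # one-column design matrix
F = np.array([3.0, 5.0, 8.0])

k, *_ = np.linalg.lstsq(x, F, rcond=None)
print(k)                               # [0.72580...] = 135/186
```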
Example 2: Given the data

x: 0, 3, 6
y: 1, 4, 5

find the best least squares fit by a linear function.
sol: Let the desired linear function be $y = c_0 + c_1 x$. The problem becomes finding the least squares solution of
$$\underbrace{\begin{pmatrix}1 & 0\\ 1 & 3\\ 1 & 6\end{pmatrix}}_{A}\begin{pmatrix}c_0\\ c_1\end{pmatrix} = \begin{pmatrix}1\\ 4\\ 5\end{pmatrix}.$$
Since $\mathrm{rank}(A) = 2$,
$$\begin{pmatrix}c_0\\ c_1\end{pmatrix} = (A^T A)^{-1} A^T y = \begin{pmatrix}4/3\\ 2/3\end{pmatrix}$$
is the unique solution. Thus the best linear least squares fit is
$$y = \frac{4}{3} + \frac{2}{3}x.$$
Example 3: Find the best quadratic least squares fit to the data

x: 0, 1, 2, 3
y: 3, 2, 4, 4

sol: Let the desired quadratic function be $y = c_0 + c_1 x + c_2 x^2$. The problem becomes finding the least squares solution of
$$\begin{pmatrix}1 & 0 & 0\\ 1 & 1 & 1\\ 1 & 2 & 4\\ 1 & 3 & 9\end{pmatrix}\begin{pmatrix}c_0\\ c_1\\ c_2\end{pmatrix} = \begin{pmatrix}3\\ 2\\ 4\\ 4\end{pmatrix}.$$
Since $\mathrm{rank}(A) = 3$,
$$\begin{pmatrix}c_0\\ c_1\\ c_2\end{pmatrix} = (A^T A)^{-1} A^T y = \begin{pmatrix}2.75\\ -0.25\\ 0.25\end{pmatrix}$$
is the unique solution. Thus the best quadratic least squares fit is
$$y = 2.75 - 0.25x + 0.25x^2.$$
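Both fits can be checked with a polynomial least squares routine (numpy assumed; coefficients are returned in increasing degree):

```python
from numpy.polynomial import polynomial as P

# Example 2: best line through (0,1), (3,4), (6,5)
print(P.polyfit([0, 3, 6], [1, 4, 5], deg=1))        # [1.3333 0.6667]

# Example 3: best parabola through (0,3), (1,2), (2,4), (3,4)
print(P.polyfit([0, 1, 2, 3], [3, 2, 4, 4], deg=2))  # [ 2.75 -0.25  0.25]
```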
5-5 Orthonormal Sets
Orthonormal sets:
- simplify the least squares solution (avoid computing an inverse)
- provide numerical computational stability
Def: $\{v_1, \ldots, v_n\}$ is said to be an orthogonal set in an inner product space V if $\langle v_i, v_j \rangle = 0$ for all $i \neq j$. Moreover, if $\langle v_i, v_j \rangle = \delta_{ij}$, then $\{v_1, \ldots, v_n\}$ is said to be orthonormal.
Example 2:
$$\left\{v_1 = \begin{pmatrix}1\\1\\1\end{pmatrix},\; v_2 = \begin{pmatrix}2\\1\\-3\end{pmatrix},\; v_3 = \begin{pmatrix}4\\-5\\1\end{pmatrix}\right\}$$
is an orthogonal set but not orthonormal. However,
$$\left\{u_1 = \frac{1}{\sqrt{3}}\,v_1,\; u_2 = \frac{1}{\sqrt{14}}\,v_2,\; u_3 = \frac{1}{\sqrt{42}}\,v_3\right\}$$
is orthonormal.
Theorem 5.5.1: Let $\{v_1, \ldots, v_n\}$ be an orthogonal set of nonzero vectors in an inner product space V. Then they are linearly independent.
pf: Suppose that $\sum_{i=1}^n c_i v_i = 0$. Then for each j,
$$0 = \left\langle v_j, \sum_{i=1}^n c_i v_i \right\rangle = \sum_{i=1}^n c_i \langle v_j, v_i \rangle = c_j \langle v_j, v_j \rangle,$$
so $c_j = 0$ for $j = 1, \ldots, n$, and $\{v_1, \ldots, v_n\}$ is linearly independent.
Example: $\left\{\frac{1}{\sqrt{2}}, \cos nx, \sin nx \mid n \in \mathbb{N}\right\}$ is an orthonormal set of $C[-\pi, \pi]$ with inner product
$$\langle f, g \rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x) g(x)\,dx.$$
Note: Now you know what it means when one says that $\cos x \perp \sin x$.
Theorem 5.5.2: Let $\{u_1, \ldots, u_n\}$ be an orthonormal basis for an inner product space V. If $v = \sum_{i=1}^n c_i u_i$, then $c_i = \langle u_i, v \rangle$.
pf:
$$\langle u_i, v \rangle = \left\langle u_i, \sum_{j=1}^n c_j u_j \right\rangle = \sum_{j=1}^n c_j \langle u_i, u_j \rangle = \sum_{j=1}^n c_j \delta_{ij} = c_i.$$
Cor: Let u1 un  be an orthonormal basis for
an inner product space V .
n
If u   ai ui and v 
i 1
n
bu ,
i 1
n
then u , v   ai bi .
i 1
pf:
u, v 
n
a u ,v
i 1
i i
n
  ai ui , v
i 1
Theorem 5.5.2 n

a b
i 1
i i
i i
Cor (Parseval's Formula): If $\{u_1, \ldots, u_n\}$ is an orthonormal basis for an inner product space V and $v = \sum_{i=1}^n c_i u_i$, then
$$\|v\|^2 = \sum_{i=1}^n c_i^2.$$
pf: By Corollary 5.5.3,
$$\|v\|^2 = \langle v, v \rangle = \sum_{i=1}^n c_i^2.$$
Example 4:
$$u_1 = \begin{pmatrix}1/\sqrt{2}\\ 1/\sqrt{2}\end{pmatrix} \quad\text{and}\quad u_2 = \begin{pmatrix}1/\sqrt{2}\\ -1/\sqrt{2}\end{pmatrix}$$
form an orthonormal basis for $\mathbb{R}^2$. If $x = \begin{pmatrix}x_1\\ x_2\end{pmatrix} \in \mathbb{R}^2$, then
$$\langle x, u_1 \rangle = \frac{x_1 + x_2}{\sqrt{2}}, \qquad \langle x, u_2 \rangle = \frac{x_1 - x_2}{\sqrt{2}},$$
and by Theorem 5.5.2,
$$x = \frac{x_1 + x_2}{\sqrt{2}}\,u_1 + \frac{x_1 - x_2}{\sqrt{2}}\,u_2,$$
$$\|x\|^2 = \left(\frac{x_1 + x_2}{\sqrt{2}}\right)^2 + \left(\frac{x_1 - x_2}{\sqrt{2}}\right)^2 = x_1^2 + x_2^2.$$
Example 5: Determine $\int_{-\pi}^{\pi} \sin^4 x\,dx$ without computing antiderivatives.
sol: With the inner product of Example 2,
$$\int_{-\pi}^{\pi} \sin^4 x\,dx = \pi\,\langle \sin^2 x, \sin^2 x \rangle = \pi\,\|\sin^2 x\|^2.$$
Now
$$\sin^2 x = \frac{1 - \cos 2x}{2} = \frac{1}{\sqrt{2}}\cdot\frac{1}{\sqrt{2}} - \frac{1}{2}\cos 2x,$$
and $\left\{\frac{1}{\sqrt{2}}, \cos 2x\right\}$ is an orthonormal set of $C[-\pi, \pi]$, so by Parseval's formula
$$\int_{-\pi}^{\pi} \sin^4 x\,dx = \pi\,\|\sin^2 x\|^2 = \pi\left(\left(\frac{1}{\sqrt{2}}\right)^2 + \left(\frac{1}{2}\right)^2\right) = \pi\left(\frac{1}{2} + \frac{1}{4}\right) = \frac{3\pi}{4}.$$
Def: Q nn is said to be an orthogonal
matrix if the column vectors of Q form an
n
orthonormal set in  .
Example 6:
 cos   sin  
The rotational matrix 

 sin  cos  
and the elementary reflection matrix
 cos 

  sin 
sin  
 are orthogonal matrix .
 cos  
Properties of orthogonal matrices:
If $Q \in \mathbb{R}^{n \times n}$ is orthogonal, then
(i) the column vectors of Q form an orthonormal basis for $\mathbb{R}^n$
(ii) $Q^T Q = I = QQ^T$
(iii) $Q^T = Q^{-1}$
(iv) $\langle Qx, Qy \rangle = \langle x, y \rangle$ (Q preserves inner products, hence angles)
(v) $\|Qx\|_2 = \|x\|_2$ (Q preserves norms)
Theorem 5.5.6: If the columns of $A \in \mathbb{R}^{m \times n}$ form an orthonormal set in $\mathbb{R}^m$, then $A^T A = I$ and the least squares solution to $Ax = b$ is
$$\hat{x} = (A^T A)^{-1} A^T b = A^T b.$$
This avoids computing a matrix inverse.
Theorem 5.5.7 & 5.5.8:
Let S be a subspace of an inner product space V, let $x \in V$, and let $\{x_1, x_2, \ldots, x_n\}$ be an orthonormal basis for S. If
$$p = \sum_{i=1}^n c_i x_i, \quad\text{where } c_i = \langle x, x_i \rangle \text{ for each } i,$$
then
(i) $(p - x) \perp S$
(ii) $\|y - x\| > \|p - x\|$ for each $y \neq p$ in S.
The vector p is said to be the projection of x onto S.
Cor 5.5.9: Let S be a subspace of $\mathbb{R}^m$ and $b \in \mathbb{R}^m$. If $\{u_1, \ldots, u_k\}$ is an orthonormal basis for S and $U = (u_1 \cdots u_k)$, then the projection p of b onto S is $p = UU^T b$.
pf: From Thm 5.5.8, $p = c_1 u_1 + c_2 u_2 + \cdots + c_k u_k = Uc$, where
$$c = \begin{pmatrix}c_1\\ c_2\\ \vdots\\ c_k\end{pmatrix} = \begin{pmatrix}u_1^T b\\ u_2^T b\\ \vdots\\ u_k^T b\end{pmatrix} = U^T b.$$
Therefore, $p = UU^T b$.
Note: Let the columns of $U = (u_1 \cdots u_k)$ be an orthonormal set. Then
$$p = UU^T b = (u_1 \cdots u_k)\begin{pmatrix}u_1^T b\\ \vdots\\ u_k^T b\end{pmatrix} = \sum_{i=1}^k (u_i^T b)\,u_i.$$
(i) The projection of b onto $R(U)$ is the sum of the projections of b onto each $u_i$.
(ii) The matrix $UU^T$ is called the projection matrix onto S.
Example 7: Let $S = \{(x, y, 0)^T \mid x, y \in \mathbb{R}\}$. Find the vector in S that is closest to $w = (5, 3, 4)^T$.
Sol: Clearly $\{e_1, e_2\}$ is an orthonormal basis for S. Let
$$U = \begin{pmatrix}1 & 0\\ 0 & 1\\ 0 & 0\end{pmatrix}.$$
Thus
$$p = UU^T w = \begin{pmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 0\end{pmatrix}\begin{pmatrix}5\\ 3\\ 4\end{pmatrix} = \begin{pmatrix}5\\ 3\\ 0\end{pmatrix}.$$
HW: If
$$U = \begin{pmatrix}1/\sqrt{2} & -1/\sqrt{2}\\ 1/\sqrt{2} & 1/\sqrt{2}\\ 0 & 0\end{pmatrix},$$
what is $UU^T$?
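A quick check of $p = UU^T w$ (numpy assumed):

```python
import numpy as np

U = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
w = np.array([5.0, 3.0, 4.0])

P = U @ U.T        # projection matrix onto S
print(P @ w)       # [5. 3. 0.] -- the vector in S closest to w
```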
Approximation of Functions
Example 8: Find the best least squares approximation to $e^x$ on $[0, 1]$ by a linear function. That is, find $g(x) \in P_2[0,1]$ such that
$$\|e^x - g(x)\| = \min_{p(x) \in P_2} \|e^x - p(x)\|, \quad\text{where } \|f\|^2 = \langle f, f \rangle = \int_0^1 f^2\,dx.$$
Sol:
(i) Clearly $\mathrm{span}\{1, x\} = P_2[0,1]$, but $\{1, x\}$ is not orthonormal.
(ii) Seek a function of the form $x - a$ with $(x - a) \perp 1$:
$$\langle 1, x - a \rangle = \int_0^1 (x - a)\,dx = \frac{1}{2} - a = 0 \;\Rightarrow\; a = \frac{1}{2},$$
and
$$\left\|x - \frac{1}{2}\right\|^2 = \int_0^1 \left(x - \frac{1}{2}\right)^2 dx = \frac{1}{12}.$$
So $u_1 = 1$ and $u_2 = \sqrt{12}\left(x - \frac{1}{2}\right)$ form an orthonormal set of $P_2[0,1]$.
(iii)
$$c_1 = \langle u_1, e^x \rangle = \int_0^1 e^x\,dx = e - 1, \qquad c_2 = \langle u_2, e^x \rangle = \int_0^1 u_2\,e^x\,dx = \sqrt{3}\,(3 - e).$$
Thus the projection
$$p(x) = c_1 u_1 + c_2 u_2 = (e - 1)\cdot 1 + \sqrt{3}\,(3 - e)\cdot\sqrt{12}\left(x - \frac{1}{2}\right) = (4e - 10) + 6(3 - e)x$$
is the best linear least squares approximation to $e^x$ on $[0, 1]$.
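As a check, a discrete least squares fit on a fine grid approximates this continuous projection (numpy assumed):

```python
import numpy as np

e = np.e
x = np.linspace(0.0, 1.0, 100001)

# Discrete least squares on a dense grid ~ the continuous L^2 projection.
c0, c1 = np.polynomial.polynomial.polyfit(x, np.exp(x), deg=1)
print(c0, c1)                   # ~ 0.8731, 1.6903
print(4 * e - 10, 6 * (3 - e))  # the exact coefficients found above
```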
Approximation by Trigonometric Polynomials
FACT: $\left\{\frac{1}{\sqrt{2}}, \cos nx, \sin nx \mid n \in \mathbb{N}\right\}$ forms an orthonormal set in $C[-\pi, \pi]$ with respect to the inner product
$$\langle f, g \rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x) g(x)\,dx.$$
Problem: Given a continuous 2π-periodic function $f(x)$, find a trigonometric polynomial of degree n,
$$t_n(x) = \frac{a_0}{2} + \sum_{k=1}^n (a_k \cos kx + b_k \sin kx),$$
which is a best least squares approximation to $f(x)$.
Sol: It suffices to find the projection of $f(x)$ onto the subspace
$$\mathrm{span}\left\{\frac{1}{\sqrt{2}}, \cos kx, \sin kx \mid k = 1, \ldots, n\right\}.$$
The best approximation $t_n(x)$ has coefficients
$$a_0 = \left\langle f, \frac{1}{\sqrt{2}} \right\rangle\sqrt{2} = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\,dx,$$
$$a_k = \langle f, \cos kx \rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos kx\,dx, \qquad b_k = \langle f, \sin kx \rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin kx\,dx.$$
Example: Consider C   ,   with inner product of

1
f,g 
2

(i) Check that eikx k  0, 1,
(ii) Let tn 

  f ( x) g ( x)dx

n
ikx
c
e
 k
k  n
1 
 ikx
 ck 
f
(
x
)
e
dx



2
1
 (ak  ibk )
2
Similarly, c k  ck

,  n is orthonormal
(iii)
 ck e  c k e
ikx
 ikx
 ak cos kx  bk sin kx
(iv)
 tn 
n
ikx
c
e
 k
k  n
n
a0

   ak cos kx  bk sin kx 
2 k 1
5-6 Gram-Schmidt Orthogonalization Process
Question: Given an ordinary basis $\{x_1, x_2, \ldots, x_n\}$, how do we transform it into an orthonormal basis $\{u_1, u_2, \ldots, u_n\}$?
Given $x_1, \ldots, x_n$, set
$$u_1 = \frac{1}{\|x_1\|}\,x_1.$$
Clearly $\mathrm{span}\{u_1\} = \mathrm{span}\{x_1\}$. Next, let $p_1 = \langle x_2, u_1 \rangle u_1$ and
$$u_2 = \frac{1}{\|x_2 - p_1\|}(x_2 - p_1).$$
Clearly $u_1 \perp u_2$ and $\mathrm{span}\{x_1, x_2\} = \mathrm{span}\{u_1, u_2\}$. Similarly, let $p_2 = \langle x_3, u_1 \rangle u_1 + \langle x_3, u_2 \rangle u_2$ and
$$u_3 = \frac{1}{\|x_3 - p_2\|}(x_3 - p_2).$$
Clearly $u_3 \perp u_1$, $u_3 \perp u_2$, and $\mathrm{span}\{x_1, x_2, x_3\} = \mathrm{span}\{u_1, u_2, u_3\}$. We have the next result.
Theorem 5.6.1 (The Gram-Schmidt Process)
H. (i) Let $\{x_1, \ldots, x_n\}$ be a basis for an inner product space V.
(ii) Let
$$u_1 = \frac{1}{\|x_1\|}\,x_1, \qquad u_{K+1} = \frac{1}{\|x_{K+1} - p_K\|}(x_{K+1} - p_K), \quad K = 1, \ldots, n-1,$$
where
$$p_K = \sum_{j=1}^{K} \langle x_{K+1}, u_j \rangle u_j.$$
C. $\{u_1, \ldots, u_n\}$ is an orthonormal basis.
Example: Find an orthonormal basis for $P_3$ with inner product given by
$$\langle p, q \rangle = \sum_{i=1}^3 p(x_i)\, q(x_i), \quad\text{where } x_1 = -1, \; x_2 = 0, \; x_3 = 1.$$
Sol: Start with the basis $\{1, x, x^2\}$ and apply the Gram-Schmidt process.
Theorem 5.6.2 (QR Factorization)
If A is an m×n matrix of rank n, then A can be factored into a product QR, where Q is an m×n matrix with orthonormal columns and R is an n×n matrix that is upper triangular and invertible.
Proof of QR Factorization
Let $p_1, p_2, \ldots, p_{n-1}$ be the projection vectors defined in Thm 5.6.1, and let $\{q_1, q_2, \ldots, q_n\}$ be the orthonormal basis of $R(A)$ derived from the Gram-Schmidt process. Define
$$r_{11} = \|a_1\|, \qquad r_{kk} = \|a_k - p_{k-1}\| \text{ for } k = 2, \ldots, n,$$
and $r_{ik} = q_i^T a_k$ for $i = 1, \ldots, k-1$. By the Gram-Schmidt process,
$$a_1 = r_{11} q_1, \quad a_2 = r_{12} q_1 + r_{22} q_2, \quad \ldots, \quad a_n = r_{1n} q_1 + \cdots + r_{nn} q_n.$$
If we set $Q = (q_1, q_2, \ldots, q_n)$ and define R to be the upper triangular matrix
$$R = \begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1n}\\ 0 & r_{22} & \cdots & r_{2n}\\ & & \ddots & \vdots\\ 0 & \cdots & 0 & r_{nn} \end{pmatrix},$$
then the jth column of the product QR will be
$$Q r_j = r_{1j} q_1 + r_{2j} q_2 + \cdots + r_{jj} q_j = a_j \quad\text{for } j = 1, \ldots, n.$$
Therefore, $QR = (a_1, a_2, \ldots, a_n) = A$.
Theorem 5.6.3:
If A is an m×n matrix of rank n, then the solution to the least squares problem $Ax = b$ is given by $\hat{x} = R^{-1} Q^T b$, where Q and R are the matrices obtained from Thm 5.6.2. The solution $\hat{x}$ may be obtained by using back substitution to solve $R\hat{x} = Q^T b$.
Proof of Thm 5.6.3
Let $\hat{x}$ be the solution to the least squares problem. Then
$$A^T A \hat{x} = A^T b \;\Rightarrow\; (QR)^T QR\,\hat{x} = (QR)^T b \quad\text{(QR factorization)}$$
$$\Rightarrow\; R^T (Q^T Q) R\,\hat{x} = R^T Q^T b \;\Rightarrow\; R^T R\,\hat{x} = R^T Q^T b \quad (Q^T Q = I)$$
$$\Rightarrow\; R\hat{x} = Q^T b \quad\text{or}\quad \hat{x} = R^{-1} Q^T b \quad (R \text{ is invertible}).$$
Example 3: Solve 2
2

4
2 1
 1 
  x1   
0 1    1 
x2  

4 2     1 
  x3   
0 0
 2 
b
A
By direct calculation,
 1  2  4

 5  2 1 

1  2 1 2 
A  QR  
 0 4  1

5 2 4 2 


 0 0
2

 4 2  1  




R
 1
 
Q b   1
2
 
 The solution can be obtained from
5

0
0

2
4
1
1
0
2
1

 1
2 
Q
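A verification sketch (numpy assumed; np.linalg.qr may flip column signs of Q, with R compensating):

```python
import numpy as np

A = np.array([[1., -2., -1.],
              [2.,  0.,  1.],
              [2., -4.,  2.],
              [4.,  0.,  0.]])
b = np.array([-1., 1., 1., -2.])

Q, R = np.linalg.qr(A)                       # reduced QR factorization
x_hat = np.linalg.solve(R, Q.T @ b)          # solve R x = Q^T b
print(x_hat)                                 # [-0.4  0.   1. ]

# Cross-check against a direct least squares solve:
print(np.linalg.lstsq(A, b, rcond=None)[0])  # same answer
```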