Review of linear algebra
1  Vectors and matrices
We will just touch very briefly on certain aspects of linear algebra, most of
which should be familiar. Recall that we deal with vectors, i.e. elements of
Rn , which here we will denote with bold face letters such as v, and scalars,
in other words elements of R. We could also use Cn , or Qn as needed,
with scalars respectively C or Q. The main point is that, for the scalars,
we need to be able to add, subtract, multiply and divide (except by 0).
We can then add two vectors: if v = (v1 , . . . , vn ) and w = (w1 , . . . , wn ),
then v + w = (v1 + w1 , . . . , vn + wn ). Scalar multiplication is similarly
defined: given t ∈ R and v = (v1 , . . . , vn ) ∈ Rn , tv = (tv1 , . . . , tvn ). Vector
addition is commutative and associative, there is a zero vector 0 = (0, . . . , 0),
and every vector v has an additive inverse −v = (−1)v = (−v1 , . . . , −vn ).
Scalar multiplication satisfies: for all s, t ∈ R and v ∈ Rn , s(tv) = (st)v and
1v = v. Finally, there are two analogues of the distributive law: for all s, t ∈ R
and v ∈ Rn, (s + t)v = sv + tv, and for all t ∈ R and v, w ∈ Rn, t(v + w) =
tv + tw. These are easily checked by using the usual properties of addition
and multiplication for real numbers.
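For a quick numerical check of these rules, here is a minimal sketch assuming NumPy is available (the particular vectors and scalars are arbitrary examples):

    import numpy as np

    v = np.array([1.0, 2.0, 3.0])
    w = np.array([4.0, 5.0, 6.0])
    s, t = 3.0, 2.0

    # componentwise addition and scalar multiplication
    print(v + w)        # [5. 7. 9.]
    print(t * v)        # [2. 4. 6.]

    # the two distributive laws
    print(np.allclose((s + t) * v, s * v + t * v))   # True
    print(np.allclose(t * (v + w), t * v + t * w))   # True
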
We will not discuss the standard definition of linear independence, span,
dimension or basis here. However, we will frequently use the standard basis
e1 , . . . , en , where the components of ei are 0 except for the ith component,
which is 1. Thus, any vector v = (v1, . . . , vn) can be uniquely written in
terms of the standard basis: v = ∑_{i=1}^n vi ei.
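As a small illustration (a sketch assuming NumPy; the rows of np.eye(n) are the standard basis vectors e1, . . . , en):

    import numpy as np

    v = np.array([2.0, -1.0, 5.0])
    E = np.eye(3)                         # rows are e1, e2, e3

    # v is the sum of its components times the standard basis vectors
    print(np.allclose(sum(v[i] * E[i] for i in range(3)), v))   # True
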
There is also the dot product or scalar product or inner product of two
vectors v, w ∈ Rn, which we shall call the inner product and write as ⟨v, w⟩,
although if called the dot product it is usually written as v • w. Note that
the product of two vectors is a scalar, whence the name scalar product. It
is bilinear and symmetric: for all v, w, u ∈ Rn and t ∈ R,

⟨v + w, u⟩ = ⟨v, u⟩ + ⟨w, u⟩;
⟨tv, w⟩ = t⟨v, w⟩;
⟨u, v + w⟩ = ⟨u, v⟩ + ⟨u, w⟩;
⟨v, tw⟩ = t⟨v, w⟩;
⟨v, w⟩ = ⟨w, v⟩.

Of course, the third and fourth identities are a consequence of the first two
and the symmetry condition. We can also define the length or norm of v:
‖v‖ = (⟨v, v⟩)^{1/2}.
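For a concrete check, here is a sketch assuming NumPy; np.dot computes the standard inner product ∑_i vi wi, and np.linalg.norm computes ‖v‖:

    import numpy as np

    v = np.array([1.0, 2.0, 2.0])
    w = np.array([3.0, 0.0, 4.0])

    print(np.dot(v, w))                            # <v, w> = 3 + 0 + 8 = 11.0
    print(np.linalg.norm(v))                       # ||v|| = sqrt(1 + 4 + 4) = 3.0
    print(np.isclose(np.dot(v, w), np.dot(w, v)))  # symmetry: True
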
The standard basis e1, . . . , en is orthonormal: for all i, j,

⟨ei, ej⟩ = 0 if i ≠ j, and ⟨ei, ej⟩ = 1 if i = j.

Any basis u1, . . . , un with this property, that ⟨ui, uj⟩ = 0 if i ≠ j and
⟨ui, ui⟩ = ‖ui‖² = 1, will be called an orthonormal basis.
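For instance, u1 = (1/√2)(1, 1) and u2 = (1/√2)(1, −1) form an orthonormal basis of R2. A quick check (sketch, assuming NumPy):

    import numpy as np

    u1 = np.array([1.0, 1.0]) / np.sqrt(2)
    u2 = np.array([1.0, -1.0]) / np.sqrt(2)

    print(np.isclose(np.dot(u1, u2), 0.0))        # <u1, u2> = 0
    print(np.isclose(np.linalg.norm(u1), 1.0))    # ||u1|| = 1
    print(np.isclose(np.linalg.norm(u2), 1.0))    # ||u2|| = 1
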
Our main interest will be in certain interesting sets of matrices. Recall that an
m × n matrix is a rectangular array

          ⎛ a11  a12  . . .  a1n ⎞
      A = ⎜  ⋮    ⋮    ⋱     ⋮  ⎟
          ⎝ am1  am2  . . .  amn ⎠
We will often abbreviate this as A = (aij ). The above matrix consists of
m rows and n columns. We refer to the number aij as the ij th entry. This
means that aij is the number in the ith row and j th column. In particular
a vector (x1 , . . . , xn ) is also a matrix, in this case a 1 × n matrix. We will
call such a matrix a row vector. We can also think of a vector as an n × 1
matrix, which we shall refer to as a column vector. (We will often have to
think of vectors as column vectors because of our conventions on the way
we write functions.) The set of all m × n matrices is written Mm,n (R).
Mm,n (C), Mm,n (Q), and even Mm,n (Z) are defined similarly. We can add
two matrices in Mm,n (R) and multiply a matrix by a scalar. The zero matrix
O = Om,n ∈ Mm,n (R) is the matrix all of whose entries are 0. In case m = n,
we abbreviate Mn,n (R) by Mn (R) and call such a matrix a square (n × n)
matrix. An important element of Mn (R) is the identity matrix In = I,
whose diagonal entries aii are equal to 1 and whose other entries aij, i ≠ j,
are equal to 0. It is easy to see that, for all A ∈ Mm,n (R), Im A = AIn = A.
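For example (a sketch assuming NumPy, where np.eye(n) builds In and np.zeros builds the zero matrix):

    import numpy as np

    A = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])          # a 2 x 3 matrix
    print(np.allclose(np.eye(2) @ A, A))      # I_2 A = A: True
    print(np.allclose(A @ np.eye(3), A))      # A I_3 = A: True
    print(np.zeros((2, 3)))                   # the zero matrix O_{2,3}
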
Given an m × n matrix A and an n × k matrix B, we can form the matrix
product AB, an m × k matrix whose ij th entry is given by ∑_{t=1}^n ait btj. Thus
the ij th entry is the inner product of the ith row of A with the j th column of
B. Matrix multiplication is associative and distributes over matrix addition,
where defined, but for A, B ∈ Mn (R) (the only case where AB and BA are
both defined and of the same shape), it is rarely the case that AB = BA:
matrix multiplication is not commutative.
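A small computation showing both the entry formula and the failure of commutativity (sketch, assuming NumPy; @ is matrix multiplication):

    import numpy as np

    A = np.array([[1.0, 2.0], [3.0, 4.0]])
    B = np.array([[0.0, 1.0], [1.0, 0.0]])

    # the (1,1) entry of AB is the inner product of row 1 of A with column 1 of B
    print(np.isclose((A @ B)[0, 0], np.dot(A[0, :], B[:, 0])))   # True
    print(np.allclose(A @ B, B @ A))                              # False: AB != BA
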
Recall that a linear function F : Rn → Rm is a function F such that, for
all v, w ∈ Rn and t ∈ R, F (v + w) = F (v) + F (w) and F (tv) = tF (v).
A linear function is completely specified by its values on the standard basis
vectors e1 , . . . , en . Conversely, given any set of vectors v1 , . . . , vn ∈ Rm ,
there is a unique linear function F : Rn → Rm such that F(ei) = vi for all i,
namely F(x1, . . . , xn) = ∑_i xi vi. In this case, recall that we can associate
an m × n matrix to F as follows: write the vectors vi = (a1i , . . . , ami ). Then
to F we associate the matrix

          ⎛ a11  a12  . . .  a1n ⎞
      A = ⎜  ⋮    ⋮    ⋱     ⋮  ⎟
          ⎝ am1  am2  . . .  amn ⎠
Here the columns of A are the vectors vi, written vertically, and the
linear map F(x1, . . . , xn) corresponds to the matrix product A · x, where
A · x is the m × 1 matrix (column vector) whose j th entry is ∑_{i=1}^n aji xi. In
particular A · ei = vi, written as a column vector; its j th entry is aji and it
is equal to ∑_{j=1}^m aji ej, where in the equality

A · ei = ∑_{j=1}^m aji ej

the ei on the left is a basis vector in Rn and the ej on the right is a basis
vector in Rm. Note the reversal of the indices! The case F : Rn → Rn
corresponds to square (n × n) matrices. For example the linear function
IdRn corresponds to the identity matrix In . Then we have:
Proposition 1.1. If G : Rk → Rn and F : Rn → Rm are linear maps, and
A and B are the matrices corresponding to F and G respectively, then F ◦ G
is again linear and the matrix corresponding to F ◦ G is the matrix product
A · B.
This gives another, more conceptual proof of the associativity of matrix
multiplication.
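Proposition 1.1 can be seen in a small computation (a sketch assuming NumPy; the matrices are arbitrary examples):

    import numpy as np

    B = np.array([[1.0, 0.0],
                  [2.0, 1.0],
                  [0.0, 3.0]])                 # G : R^2 -> R^3
    A = np.array([[1.0, 1.0, 0.0],
                  [0.0, 1.0, 1.0]])            # F : R^3 -> R^2
    x = np.array([5.0, -2.0])

    # applying F after G agrees with multiplying by the single matrix AB
    print(np.allclose(A @ (B @ x), (A @ B) @ x))   # True
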
Let A be an m × n matrix A = (aij). Recall that the transpose matrix ᵗA
is the n × m matrix whose (i, j)th entry is aji. For example, if A is a square
(n × n) matrix, then ᵗA is the reflection of A along the diagonal running
from upper left to lower right. Clearly, ᵗ(ᵗA) = A. A calculation shows that,
for all standard basis vectors ei ∈ Rm and ej ∈ Rn, ⟨ei, Aej⟩ = ⟨ᵗAei, ej⟩.
(Here of course the first inner product is of vectors in Rm and the second
is of vectors in Rn.) Using bilinearity, it follows that, for all v ∈ Rm and
w ∈ Rn,

⟨v, Aw⟩ = ⟨ᵗAv, w⟩.
From this (or directly from the definitions) one can prove: If A is an m × n
matrix and B is an n × k matrix, then
ᵗ(AB) = ᵗB ᵗA.
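Both identities are easy to verify numerically; a sketch assuming NumPy, where A.T is the transpose of A:

    import numpy as np

    A = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]])    # 2 x 3
    B = np.array([[1.0, 4.0], [0.0, 1.0], [2.0, 0.0]])  # 3 x 2
    v = np.array([1.0, -1.0])                            # v in R^2
    w = np.array([2.0, 0.0, 1.0])                        # w in R^3

    print(np.isclose(v @ (A @ w), (A.T @ v) @ w))        # <v, Aw> = <tA v, w>: True
    print(np.allclose((A @ B).T, B.T @ A.T))             # t(AB) = tB tA: True
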
2  Invertible matrices
We will write linear maps F : Rn → Rm as matrices A: F (v) = Av, with the
understanding that, for the right hand side, v must be viewed as a column
vector. Define the nullspace or kernel of A to be the set {v ∈ Rn : Av = 0}.
We then have the basic result:
Proposition 2.1. The linear function A : Rn → Rm is injective ⇐⇒ the
nullspace of A is {0} ⇐⇒ the columns of A are linearly independent. The
linear function A : Rn → Rm is surjective ⇐⇒ the columns of A span Rm .
Corollary 2.2. Let F : Rn → Rm be a linear function, corresponding to the
matrix A.
(i) If F is injective, then n ≤ m.
(ii) If F is surjective, then n ≥ m.
(iii) If n = m, then F is injective ⇐⇒ F is surjective ⇐⇒ F is a
bijection, and in this case the inverse function F −1 corresponds to a
matrix, denoted A−1 , with the property that
AA−1 = A−1 A = In .
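As a numerical illustration of part (iii) of the corollary (a sketch assuming NumPy; the matrix is an arbitrary invertible example):

    import numpy as np

    A = np.array([[2.0, 1.0], [1.0, 1.0]])
    A_inv = np.linalg.inv(A)

    print(np.allclose(A @ A_inv, np.eye(2)))   # A A^{-1} = I_2: True
    print(np.allclose(A_inv @ A, np.eye(2)))   # A^{-1} A = I_2: True
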
We call a matrix A ∈ Mn (R) invertible if an inverse A−1 exists. The
problem of deciding when a given n×n matrix A is invertible can be answered
by determinants. Recall that, for every n, we have a function det : Mn (R) →
R with the following properties:
1. For all A, B ∈ Mn (R), det(AB) = (det A)(det B).
2. det In = 1.
3. A is invertible ⇐⇒ det A ≠ 0. In this case,
det(A−1 ) = (det A)−1 .
4. det ᵗA = det A.
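These properties are easy to check on examples (a sketch assuming NumPy; the matrices are arbitrary invertible examples):

    import numpy as np

    A = np.array([[1.0, 2.0], [3.0, 5.0]])
    B = np.array([[0.0, 1.0], [1.0, 1.0]])

    print(np.isclose(np.linalg.det(A @ B),
                     np.linalg.det(A) * np.linalg.det(B)))                      # (1): True
    print(np.isclose(np.linalg.det(np.eye(2)), 1.0))                            # (2): True
    print(np.isclose(np.linalg.det(np.linalg.inv(A)), 1.0 / np.linalg.det(A)))  # (3): True
    print(np.isclose(np.linalg.det(A.T), np.linalg.det(A)))                     # (4): True
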
Define the general linear group GLn (R) to be the subset of Mn (R) consisting of invertible matrices. Equivalently, by (3) above,
GLn (R) = {A ∈ Mn (R) : det A ≠ 0}.
The subset GLn (R) of Mn (R) is closed under products, In ∈ GLn (R), and
if A ∈ GLn (R), then by definition A−1 exists and A−1 ∈ GLn (R); note that
A−1 is invertible and that (A−1 )−1 = A.
Define the special linear group SLn (R) via:
SLn (R) = {A ∈ Mn (R) : det A = 1}.
Clearly SLn (R) ⊆ GLn (R). By (1) above, SLn (R) is closed under multiplication and by (2) above, In ∈ SLn (R). Finally, if A ∈ SLn (R), then A is
invertible and A−1 ∈ SLn (R) by (3).
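For instance (a sketch assuming NumPy; the matrices are arbitrary examples):

    import numpy as np

    A = np.array([[2.0, 1.0], [1.0, 1.0]])   # det = 1: in SL_2(R), hence in GL_2(R)
    B = np.array([[3.0, 0.0], [0.0, 2.0]])   # det = 6: in GL_2(R) but not in SL_2(R)
    C = np.array([[1.0, 2.0], [2.0, 4.0]])   # det = 0: not invertible, so not in GL_2(R)

    print(np.linalg.det(A), np.linalg.det(B), np.linalg.det(C))   # approximately 1.0, 6.0, 0.0
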
3  Orthogonal matrices
Orthogonal matrices are invertible matrices with very special geometric
properties.
Definition 3.1. A linear function A : Rn → Rn is an isometry if, for all
v ∈ Rn, ‖Av‖ = ‖v‖. In other words, A preserves length.
Proposition 3.2. Given A ∈ Mn (R), the following conditions on A are
equivalent.
(i) A is an isometry, i.e. for all v ∈ Rn, ‖Av‖ = ‖v‖.
(ii) For all v, w ∈ Rn, ⟨Av, Aw⟩ = ⟨v, w⟩. In other words, A preserves
inner product.
(iii) The columns of A are an orthonormal basis of Rn.
(iv) A is invertible and ᵗA = A−1.
(v) The rows of A are an orthonormal basis of Rn.
Proof. (i) =⇒ (ii): This follows from the polarization identity: For all
v, w ∈ Rn,

‖v + w‖² − ‖v − w‖² = 4⟨v, w⟩.

This in turn follows from the bilinearity and symmetry of the inner product and
expansion: For example,

‖v + w‖² = ⟨v + w, v + w⟩ = ⟨v, v⟩ + 2⟨v, w⟩ + ⟨w, w⟩,

and similarly for ‖v − w‖². Then, if A is an isometry,

4⟨Av, Aw⟩ = ‖Av + Aw‖² − ‖Av − Aw‖²
          = ‖A(v + w)‖² − ‖A(v − w)‖²
          = ‖v + w‖² − ‖v − w‖² = 4⟨v, w⟩.

Hence ⟨Av, Aw⟩ = ⟨v, w⟩.
(ii) =⇒ (i): If ⟨Av, Aw⟩ = ⟨v, w⟩ for all v, w ∈ Rn, then take v = w, so
that ‖Av‖² = ⟨Av, Av⟩ = ⟨v, v⟩ = ‖v‖².
(ii) =⇒ (iii): The columns of A are equal to ui = Aei. By (ii), ⟨ui, uj⟩ =
⟨Aei, Aej⟩ = ⟨ei, ej⟩. Thus u1, . . . , un is an orthonormal basis of Rn.
(iii) ⇐⇒ (iv): The ij th entry of ᵗAA is the inner product ⟨ui, uj⟩. Hence
ᵗAA = In ⇐⇒ ⟨ui, uj⟩ is 0 if i ≠ j and 1 if i = j ⇐⇒ the columns of A
are an orthonormal basis of Rn.
(iv) ⇐⇒ (v): Similar to the above, using AᵗA instead of ᵗAA.
(iv) =⇒ (ii): If ᵗA = A−1, then for all v, w ∈ Rn,

⟨Av, Aw⟩ = ⟨v, ᵗAAw⟩ = ⟨v, A−1Aw⟩ = ⟨v, w⟩.

We see that any of the five statements in the proposition implies any other,
so they are all equivalent.
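These equivalences can be checked on a rotation matrix (a sketch assuming NumPy; the angle is an arbitrary example):

    import numpy as np

    theta = 0.7
    A = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    v = np.array([3.0, -4.0])

    print(np.isclose(np.linalg.norm(A @ v), np.linalg.norm(v)))   # (i): lengths are preserved
    print(np.allclose(A.T @ A, np.eye(2)))                        # (iii)/(iv): tA A = I_2
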
Definition 3.3. A matrix A ∈ Mn (R) satisfying any (and hence all) of the
equivalent properties above is called an orthogonal matrix. The set of all
orthogonal n × n matrices is denoted On , the orthogonal group. The set
of all orthogonal matrices with determinant 1 is denoted SOn , the special
orthogonal group.
Proposition 3.4. If A, B ∈ On , then AB ∈ On . Moreover In ∈ SOn and
hence In ∈ On . Finally, if A ∈ On , then A−1 ∈ On . Similar statements hold
with On replaced by SOn .
Proof. If A, B ∈ On , then
ᵗ(AB) = ᵗB ᵗA = B−1 A−1 = (AB)−1.

Thus AB ∈ On. Clearly In ∈ SOn. Finally, note that, in general, if A is an
n × n matrix with an inverse A−1, then ᵗ(A−1) = (ᵗA)−1, by applying the
identity ᵗ(AB) = ᵗB ᵗA to the product AA−1 = I. Thus, if A is orthogonal,

ᵗ(A−1) = (ᵗA)−1 = (A−1)−1.

It follows that A−1 ∈ On.
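The identity ᵗ(A−1) = (ᵗA)−1 used in the proof is also easy to check numerically (sketch, assuming NumPy; any invertible matrix will do):

    import numpy as np

    A = np.array([[2.0, 1.0], [1.0, 1.0]])
    print(np.allclose(np.linalg.inv(A).T, np.linalg.inv(A.T)))   # True
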
The following says that there is not a big difference between On and
SOn :
Proposition 3.5. If A ∈ On , then det A = ±1.
Proof. Using ᵗA = A−1, we see that

det A = det ᵗA = det A−1 = (det A)−1.

Thus (det A)² = 1, so that det A = ±1.
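For example (a sketch assuming NumPy): a rotation has determinant +1 and a reflection has determinant −1, and both are orthogonal:

    import numpy as np

    R = np.array([[0.0, -1.0], [1.0, 0.0]])    # rotation by 90 degrees
    S = np.array([[1.0, 0.0], [0.0, -1.0]])    # reflection across the x-axis
    print(np.linalg.det(R), np.linalg.det(S))                                  # 1.0  -1.0
    print(np.allclose(R.T @ R, np.eye(2)), np.allclose(S.T @ S, np.eye(2)))    # True True
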
We sometimes think of SOn as the set of rigid motions of Rn (fixing the
origin). More details about SO2 and O2 are in the homework.