MATH10212 • Linear Algebra • Brief lecture notes
Matrix Operations
Definition A matrix is a rectangular array of numbers called the entries,
or elements, of the matrix.
The following are all examples of matrices:

[ 1 2 ]     [ 5  −1   0 ]     [ √2 ]
[ 0 3 ],    [ π  1/2  2 ],    [ 4  ],    [ 1 1 1 1 ],
                              [ 17 ]

[ 5.1   1.2  −1  ]
[ 6.9   0    4.4 ],    [7]
[ −7.3  9    8.5 ]
The size of a matrix is a description of the numbers of rows and columns
it has. A matrix is called m × n (pronounced “m by n”) if it has m rows and
n columns. Thus, the examples above are matrices of sizes 2 × 2, 2 × 3, 3 × 1,
1 × 4, 3 × 3 and 1 × 1, respectively. A 1 × m matrix is called a row matrix,
and an n × 1 matrix is called a column matrix (or column vector).
We use double-subscript notation to refer to the entries of a matrix A.
The entry of A in row i and column j is denoted by aij . Thus, if
A = [ 3 9 −1 ]
    [ 0 5  4 ]
then a13 = −1 and a22 = 5. We can therefore compactly denote a matrix A
by [aij ] (or [aij ]m×n if it is important to specify the size of A, although the
size will usually be clear from the context).
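Index notation is easy to check numerically. As a small illustration (using NumPy, an external tool not part of these notes; note that NumPy indexes from 0 rather than 1):

```python
import numpy as np

# The matrix A from the text, with a13 = -1 and a22 = 5.
A = np.array([[3, 9, -1],
              [0, 5,  4]])

# NumPy indexes from 0, so the (i, j) entry a_ij is A[i-1, j-1].
a13 = A[0, 2]   # row 1, column 3
a22 = A[1, 1]   # row 2, column 2
print(A.shape, a13, a22)  # (2, 3) -1 5
```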
With this notation, a general m × n matrix A has the form

A = [ a11  a12  · · ·  a1n ]
    [ a21  a22  · · ·  a2n ]
    [  ..   ..          .. ]
    [ am1  am2  · · ·  amn ]
If the columns of A are the vectors ~a1, ~a2, . . . , ~an, then we may represent A as

A = [~a1 ~a2 . . . ~an]

If the rows of A are ~A1, ~A2, . . . , ~Am, then we may represent A as

    [ ~A1 ]
A = [ ~A2 ]
    [  ..  ]
    [ ~Am ]
The diagonal entries of A are a11 , a22 , a33 , . . . , and if m = n (that is, if A
has the same number of rows as columns), then A is called a square matrix.
A square matrix whose nondiagonal entries are all zero is called a diagonal
matrix. A diagonal matrix all of whose diagonal entries are the same is
called a scalar matrix. If the scalar on the diagonal is 1, the scalar matrix
is called an identity matrix.
For example, let

A = [ 2  5 0 ]     B = [ 3 1 ]     C = [ 3 0 0 ]     D = [ 1 0 0 ]
    [ −1 4 1 ],        [ 4 5 ],        [ 0 6 0 ],        [ 0 1 0 ]
                                       [ 0 0 2 ]         [ 0 0 1 ]
The diagonal entries of A are 2 and 4, but A is not square; B is a square
matrix of size 2 × 2 with diagonal entries 3 and 5; C is a diagonal matrix;
D is the 3 × 3 identity matrix. The n × n identity matrix is denoted by In (or
simply I if its size is understood).
We can view matrices as generalizations of vectors. Indeed, matrices can
and should be thought of as being made up of both row and column vectors.
(Moreover, an m × n matrix can also be viewed as a single “wrapped vector”
of length mn.) Many of the conventions and operations for vectors carry
through (in an obvious way) to matrices.
Two matrices are equal if they have the same size and if their corresponding entries are equal. Thus, if A = [aij ]m×n and B = [bij ]r×s , then
A = B if and only if m = r and n = s and aij = bij for all i and j.
Example 3.1. Consider the matrices

A = [ 2 0 ]     B = [ a b ]     C = [ 2 0 x ]
    [ 5 3 ],        [ c d ],        [ 5 3 y ]
Neither A nor B can be equal to C (no matter what the values of x and y),
since A and B are 2 × 2 matrices and C is 2 × 3. However, A = B if and only
if a = 2, b = 0, c = 5 and d = 3.
Matrix Addition and Scalar Multiplication
We define matrix addition componentwise. If A = [aij ] and B = [bij ] are
m × n matrices, their sum A + B is the m × n matrix obtained by adding the
corresponding entries:
A + B = [aij + bij ]
If A and B are not the same size, then A + B is not defined.
Example 3.3. Let

A = [ 1  4 0 ]     B = [ −3 1 −1 ]     C = [ 4 3 ]
    [ −2 6 5 ],        [ 3  0  2 ],        [ 2 1 ]

Then

A + B = [ −2 5 −1 ]
        [ 1  6  7 ]
but neither A + C nor B + C is defined.
If A is an m × n matrix and c is a scalar, then the scalar multiple cA is the
m × n matrix obtained by multiplying each entry of A by c. More formally,
we have
cA = c[aij ] = [caij ]
[In terms of vectors, we could equivalently have stipulated that each column
(or row) of cA is c times the corresponding column (or row) of A.]
Example 3.4. For matrix A in Example 3.3,

2A = [ 2  8  0  ],     (1/2)A = [ 1/2 2 0   ],     (−1)A = [ −1 −4  0 ]
     [ −4 12 10 ]               [ −1  3 5/2 ]              [ 2  −6 −5 ]
The matrix (−1)A is written as −A and called the negative of A. As with
vectors, we can use this fact to define the difference of two matrices: if A
and B are the same size, then
A − B = A + (−B)
Example 3.5. For matrices A and B in Example 3.3,

A − B = [ 1  4 0 ] − [ −3 1 −1 ] = [ 4  3 1 ]
        [ −2 6 5 ]   [ 3  0  2 ]   [ −5 6 3 ]
A matrix all of whose entries are zero is called a zero matrix and denoted
by O (or Om×n if it is important to specify its size). It should be clear that if
A is any matrix and O is the zero matrix of the same size, then
A+O =A=O+A
and
A − A = O = −A + A
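These componentwise rules are exactly what array libraries implement. A quick sketch with NumPy (an external illustration, using the matrices of Example 3.3) showing the sum, a scalar multiple, the difference, and the role of the zero matrix:

```python
import numpy as np

# A and B from Example 3.3; O is the zero matrix of the same size.
A = np.array([[ 1, 4, 0],
              [-2, 6, 5]])
B = np.array([[-3, 1, -1],
              [ 3, 0,  2]])
O = np.zeros((2, 3), dtype=int)

print(A + B)        # componentwise sum: [[-2 5 -1], [1 6 7]]
print(2 * A)        # scalar multiple
print(A - B)        # difference A + (-B): [[4 3 1], [-5 6 3]]
print((A + O == A).all(), (A - A == O).all())  # True True
```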
Matrix Multiplication
Definition If A is an m × n matrix and B is an n × r matrix, then the product
C = AB is an m × r matrix. The (i, j) entry of the product is computed as
follows:
cij = ai1 b1j + ai2 b2j + · · · + ain bnj
Remarks
• Notice that A and B need not be the same size. However, the number
of columns of A must be the same as the number of rows of B. If we
write the sizes of A, B and AB in order, we can see at a glance whether
this requirement is satisfied. Moreover, we can predict the size of the
product before doing any calculations, since the number of rows of AB
is the same as the number of rows of A, while the number of columns
of AB is the same as the number of columns of B:

A         B         =         AB
(m × n)   (n × r)             (m × r)
• The formula for the entries of the product looks like a dot product,
and indeed it is. It says that (i, j) entry of the matrix AB is the dot
product of the ith row of A and the jth column of B:

[ a11 a12 · · · a1n ]
[  ..  ..        .. ]   [ b11 · · · b1j · · · b1r ]
[ ai1 ai2 · · · ain ] · [ b21 · · · b2j · · · b2r ]
[  ..  ..        .. ]   [  ..       ..        ..  ]
[ am1 am2 · · · amn ]   [ bn1 · · · bnj · · · bnr ]
Notice that in the expression

cij = ai1 b1j + ai2 b2j + · · · + ain bnj

the “outer subscripts” on each ab term in the sum are always i and j,
whereas the “inner subscripts” always agree and increase from 1 to n.
We see this pattern clearly if we write cij using summation notation:

cij = Σ_{k=1}^{n} aik bkj
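The summation formula translates directly into a triple loop. A minimal sketch (in Python/NumPy, not from the notes; `matmul` is a hypothetical helper name) that computes cij = Σ_k aik bkj and checks it against the built-in product:

```python
import numpy as np

def matmul(A, B):
    """Entrywise product definition: c_ij = sum over k of a_ik * b_kj."""
    m, n = A.shape
    n2, r = B.shape
    assert n == n2, "columns of A must equal rows of B"
    C = np.zeros((m, r), dtype=A.dtype)
    for i in range(m):          # outer subscript i
        for j in range(r):      # outer subscript j
            for k in range(n):  # inner subscript k runs over 1..n
                C[i, j] += A[i, k] * B[k, j]
    return C

A = np.array([[1, 4, 0], [-2, 6, 5]])   # 2 x 3
B = np.array([[1, 0], [0, 1], [2, 3]])  # 3 x 2
print(matmul(A, B))                     # a 2 x 2 matrix
print((matmul(A, B) == A @ B).all())    # agrees with NumPy: True
```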
• Matrix multiplication is not commutative: AB ≠ BA in general, even
if both AB and BA are defined. E.g.,

[ 1 1 ] [ 2 0 ]   [ 2 1 ]     [ 2 2 ]   [ 2 0 ] [ 1 1 ]
[ 0 1 ]·[ 0 1 ] = [ 0 1 ]  ≠  [ 0 1 ] = [ 0 1 ]·[ 0 1 ] .
• There is no cancellation rule in general:

[ 1 1 ] [ −1  1 ]   [ 0 0 ]   [ 1 1 ] [  2 −2 ]
[ 1 1 ]·[  1 −1 ] = [ 0 0 ] = [ 1 1 ]·[ −2  2 ] ,

but

[ −1  1 ]     [  2 −2 ]
[  1 −1 ]  ≠  [ −2  2 ] .
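Both failures are easy to reproduce numerically. A short NumPy check (an external illustration, using the matrices of the examples above):

```python
import numpy as np

# Non-commutativity: AB != BA even though both products are defined.
A = np.array([[1, 1], [0, 1]])
B = np.array([[2, 0], [0, 1]])
print(A @ B)                    # [[2 1], [0 1]]
print(B @ A)                    # [[2 2], [0 1]]
print((A @ B == B @ A).all())   # False

# No cancellation: A2 @ B2 == A2 @ C2 and yet B2 != C2.
A2 = np.array([[1, 1], [1, 1]])
B2 = np.array([[-1, 1], [1, -1]])
C2 = np.array([[2, -2], [-2, 2]])
print((A2 @ B2 == A2 @ C2).all(), (B2 == C2).all())  # True False
```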
• Helpful hint on multiplication of matrices: If the left factor A is
“sparse”, and/or contains “simple” entries such as many 0s and 1s, then
it may be advantageous to compute the product AB as follows: the
ith row of AB is the linear combination of the rows of B taken with
coefficients given by the ith row of A.


E.g.,

                 [  1  2  7 ]
[ 1 0 1  0 ]     [ −3  2  5 ]     [ 3 5 9 ]
[ 1 0 0 −1 ]  ·  [  2  3  2 ]  =  [ 3 3 5 ] ;
                 [ −2 −1  2 ]

the first row of AB is the sum of the 1st and 3rd rows of B, because we have 1, 0, 1, 0
in the first row of A; the second row of AB is the 1st row of B minus the 4th
row of B, because we have 1, 0, 0, −1 in the second row of A. Actually,
this rule can always be applied, but it is particularly effective when A is
“easier” than B. (Note: “rows ↔ left” works here again.)
Similarly, if the right factor B is the simpler one, then the kth column of AB
is obtained as a linear combination of the columns of A taken with coefficients given by the kth column of B.
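The hint can be verified directly. A NumPy sketch (an external illustration, with the A and B of the example above):

```python
import numpy as np

A = np.array([[1, 0, 1,  0],
              [1, 0, 0, -1]])
B = np.array([[ 1,  2, 7],
              [-3,  2, 5],
              [ 2,  3, 2],
              [-2, -1, 2]])
AB = A @ B

# Row i of AB = linear combination of rows of B with coefficients A[i].
print(AB)                                # [[3 5 9], [3 3 5]]
print((AB[0] == B[0] + B[2]).all())      # coefficients 1, 0, 1, 0: True
print((AB[1] == B[0] - B[3]).all())      # coefficients 1, 0, 0, -1: True
```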
• The real justification of the definition of the product of matrices (which
may seem at first glance rather arbitrary, “too complicated”) is connected with so-called linear transformations, which we shall study
later. These transformations are mappings (functions) from Rn to Rm;
in coordinates they are given by matrices. If Rm −ϕ→ Rn −ψ→ Rs are
two such transformations whose matrices are A, B, then the resulting
composite transformation ~a → ψ(ϕ(~a)) is given precisely by the
product AB.
• Matrix multiplication is associative: A(BC) = (AB)C (of course,
when these products are defined, i.e. with correct sizes Am×n , Bn×s , Cs×t ).
This is a very important property, which is very far from being obvious (as the definition of the product is so complicated...). The proof of
this property is deferred until we study linear transformations later.
In fact, it will then become quite easy to prove: if A, B, C are the matrices for Rm −ϕ→ Rn −ψ→ Rs −χ→ Rt, then the matrix A(BC) = (AB)C
is simply the matrix of the composite transformation ~a → χ(ψ(ϕ(~a)))...
Theorem 3.1. Let A be an m × n matrix, ~ei a 1 × m standard unit vector,
and ~ej an n × 1 standard unit vector. Then
(a) ~ei A is the ith row of A, and
(b) A~ej is the jth column of A.
Proof We prove (b) and leave proving (a) as an exercise. If ~a1, . . . , ~an are
the columns of A, then the product A~ej can be written

A~ej = 0~a1 + 0~a2 + · · · + 1~aj + · · · + 0~an = ~aj
We could also prove (b) by direct calculation:

       [ a11 · · · a1j · · · a1n ]   [ 0 ]    [ a1j ]
A~ej = [ a21 · · · a2j · · · a2n ]   [ .. ]   [ a2j ]
       [  ..       ..        ..  ] · [ 1 ]  = [  ..  ]
       [ am1 · · · amj · · · amn ]   [ .. ]   [ amj ]
                                     [ 0 ]

since the 1 in ~ej is the jth entry.
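Theorem 3.1 can also be sketched numerically (a NumPy illustration, not from the notes; recall the 0-based indexing):

```python
import numpy as np

A = np.array([[3, 9, -1],
              [0, 5,  4]])   # 2 x 3
m, n = A.shape

e2 = np.eye(m, dtype=int)[1]      # 1 x m standard unit vector e_2
e3 = np.eye(n, dtype=int)[:, 2]   # n x 1 standard unit vector e_3

print(e2 @ A)   # second row of A: [0 5 4]
print(A @ e3)   # third column of A: [-1  4]
```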
Partitioned Matrices
It will often be convenient to regard a matrix as being composed of a number of smaller submatrices. By introducing vertical and horizontal lines
into a matrix, we can partition it into blocks. There is a natural way to
partition many matrices, particularly those arising in certain applications.
For example, consider the matrix

A = [ 1 0 0 2 −1 ]
    [ 0 1 0 1  3 ]
    [ 0 0 1 4  0 ]
    [ 0 0 0 1  7 ]
    [ 0 0 0 7  2 ]
It seems natural to partition A as

    [ 1 0 0 | 2 −1 ]
    [ 0 1 0 | 1  3 ]
A = [ 0 0 1 | 4  0 ]  =  [ I  B ]
    [-------+------]     [ O  C ]
    [ 0 0 0 | 1  7 ]
    [ 0 0 0 | 7  2 ]

where I is the 3 × 3 identity matrix, B is 3 × 2, O is the 2 × 3 zero matrix, and
C is 2 × 2. In this way, we can view A as a 2 × 2 matrix whose entries are
themselves matrices.
When matrices are being multiplied, there is often an advantage to be
gained by viewing them as partitioned matrices. Not only does this frequently reveal underlying structures, but it often speeds up computation,
especially when the matrices are large and have many blocks of zeros. It
turns out that the multiplication of partitioned matrices is just like ordinary matrix multiplication.
We begin by considering some special cases of partitioned matrices. Each
gives rise to a different way of viewing the product of two matrices.
Suppose A is an m × n matrix and B is n × r, so the product AB exists. If
we partition B in terms of its column vectors, as

B = [~b1 | ~b2 | · · · | ~br]

(here vertical bars are used simply as dividers, as in the textbook), then

AB = A[~b1 | ~b2 | · · · | ~br] = [A~b1 | A~b2 | · · · | A~br]
This result is an immediate consequence of the definition of matrix multiplication. The form on the right is called the matrix-column representation
of the product.
In fact, we saw this already in the Helpful Hint: the jth column of the product is A~bj, and clearly it is the linear combination of the columns of A with
coefficients given by ~bj. For example,

                [ −1 1 ]
[ 1 2 3 4 ]     [  0 1 ]     [ 2 10 ]
[ 2 3 4 5 ]  ·  [  1 1 ]  =  [ 2 14 ]
[ 6 7 8 9 ]     [  0 1 ]     [ 2 30 ]

(The first column of this product C = AB is the 3rd column of A minus the 1st
column of A: that is, ~c1 = −1~a1 + 1~a3, with the coefficients given by ~b1; the
second column of AB is ~c2 = 1~a1 + 1~a2 + 1~a3 + 1~a4, i.e. the sum of all columns
of A.)
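The matrix-column representation says AB is assembled column by column; a NumPy check (an external illustration) on the example above:

```python
import numpy as np

A = np.array([[1, 2, 3, 4],
              [2, 3, 4, 5],
              [6, 7, 8, 9]])
B = np.array([[-1, 1],
              [ 0, 1],
              [ 1, 1],
              [ 0, 1]])
AB = A @ B
print(AB)   # [[ 2 10], [ 2 14], [ 2 30]]

# Column j of AB is A times column j of B ...
print((AB[:, 0] == A @ B[:, 0]).all())          # True
# ... and hence a combination of columns of A: c1 = -a1 + a3.
print((AB[:, 0] == -A[:, 0] + A[:, 2]).all())   # True
```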
Suppose A is an m × n matrix and B is n × r, so the product AB exists. If
we partition A in terms of its row vectors, as

    [ ~A1 ]
    [ --- ]
    [ ~A2 ]
A = [ --- ]
    [  ..  ]
    [ --- ]
    [ ~Am ]

then

     [ ~A1 ]        [ ~A1 B ]
     [ --- ]        [ ----- ]
     [ ~A2 ]        [ ~A2 B ]
AB = [ --- ] B  =   [ ----- ]
     [  ..  ]       [   ..   ]
     [ --- ]        [ ----- ]
     [ ~Am ]        [ ~Am B ]
Once again, this result is a direct consequence of the definition of matrix
multiplication. The form on the right is called the row-matrix representation of the product.
Again, we saw this already in the Helpful Hint: the ith row of AB is ~Ai B,
and clearly it is the linear combination of the rows of B with coefficients
given by ~Ai.
Matrix Powers. When A and B are two n × n matrices, their product AB
will also be an n × n matrix. A special case occurs when A = B. It makes
sense to define A^2 = AA and, in general, to define A^k as

A^k = AA · · · A   (k times)

if k is a positive integer. Thus, A^1 = A, and it is convenient to define
A^0 = In.
If A is a square matrix and r and s are nonnegative integers, then
1. A^r A^s = A^(r+s)
2. (A^r)^s = A^(rs)
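The exponent laws can be spot-checked with NumPy's `matrix_power` (an external illustration; a spot check on one matrix, of course, is not a proof):

```python
import numpy as np
from numpy.linalg import matrix_power

A = np.array([[1, 1],
              [0, 1]])
r, s = 2, 3

lhs1 = matrix_power(A, r) @ matrix_power(A, s)
print((lhs1 == matrix_power(A, r + s)).all())   # A^r A^s = A^(r+s): True

lhs2 = matrix_power(matrix_power(A, r), s)
print((lhs2 == matrix_power(A, r * s)).all())   # (A^r)^s = A^(rs): True

print(matrix_power(A, 0))   # A^0 is the identity: [[1 0], [0 1]]
```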
Definition The transpose of an m × n matrix A is the n × m matrix AT
obtained by interchanging the rows and columns of A. That is, the ith
column of AT is the ith row of A for all i.
The transpose is sometimes used to give an alternative definition of the
dot product of two vectors in terms of matrix multiplication. If

     [ u1 ]             [ v1 ]
~u = [ u2 ]   and  ~v = [ v2 ]
     [ .. ]             [ .. ]
     [ un ]             [ vn ]

then

~u · ~v (dot product) = u1 v1 + u2 v2 + · · · + un vn

                                          [ v1 ]
                    = [ u1 u2 . . . un ] · [ v2 ]  = ~uT ~v (matrix product)
                                          [ .. ]
                                          [ vn ]
A useful alternative definition of the transpose is given componentwise:
(AT )ij = Aji for all i and j
Definition A square matrix A is symmetric if
AT = A,
that is, if A is equal to its own transpose.
A square matrix A is symmetric if and only if Aij = Aji for all i and j.
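A brief NumPy sketch of the transpose and of the dot product as a matrix product (an external illustration; for 1-D arrays the `@` operator computes exactly u1v1 + · · · + unvn):

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
print(u @ v)   # u1*v1 + u2*v2 + u3*v3 = 4 + 10 + 18 = 32

A = np.array([[3, 9, -1],
              [0, 5,  4]])
print(A.T.shape)                  # transpose of a 2 x 3 matrix is 3 x 2
print((A.T[2] == A[:, 2]).all())  # row 3 of A^T is column 3 of A: True

S = np.array([[1, 7],
              [7, 2]])
print((S.T == S).all())           # S is symmetric: True
```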
Matrix Algebra
Theorem 3.2. Algebraic Properties of Matrix Addition and Scalar
Multiplication
Let A, B and C be matrices of the same size and let c and d be scalars. Then
a. A + B = B + A Commutativity
b. (A + B) + C = A + (B + C) Associativity
c. A + O = A
d. A + (−A) = O
e. c(A + B) = cA + cB Distributivity
f. (c + d)A = cA + dA Distributivity
g. c(dA) = (cd)A
h. 1A = A
We say that matrices A1, A2, . . . , Ak of the same size are linearly independent if the only solution of the equation

c1 A1 + c2 A2 + · · · + ck Ak = O

is the trivial one: c1 = c2 = · · · = ck = 0; otherwise they are linearly
dependent. Questions like “Is matrix B equal to a linear combination of
given matrices A1, . . . , Ak?” or “Are given matrices A1, . . . , Ak linearly dependent?” can be answered by using linear systems just as for vectors
(here regarding matrices as “wrapped” vectors).
Theorem 3.3.
Properties of Matrix Multiplication
Let A, B and C be matrices (whose sizes are such that the indicated operations can be performed) and let k be a scalar. Then
a. (AB)C = A(BC) (Associativity)
b. A(B + C) = AB + AC (Left distributivity)
c. (A + B)C = AC + BC (Right distributivity)
d. k(AB) = (kA)B = A(kB)
e. Im A = A = AIn if A is m × n (Multiplicative identity)
Proof (a) Let A be an m × n matrix, B an n × s matrix, and C an s × t matrix (for the
products to be defined). First, check the sizes are the same on the left and
on the right. Left: AB is an m × s matrix; (AB)C is an m × t matrix. Right: BC is
an n × t matrix; A(BC) is an m × t matrix.
We now check that every ij-th entry is the same. Starting from the left, by
definition of the matrix product,

((AB)C)ij = Σ_{k=1}^{s} (AB)ik Ckj = Σ_{k=1}^{s} ( Σ_{l=1}^{n} Ail Blk ) Ckj =

using distributivity for numbers, expand the brackets:

= Σ_{k=1}^{s} Σ_{l=1}^{n} Ail Blk Ckj =

change the order of summation, which is OK since it is simply the sum over all
pairs k, l either way:

= Σ_{l=1}^{n} Σ_{k=1}^{s} Ail Blk Ckj =

collecting terms:

= Σ_{l=1}^{n} Ail ( Σ_{k=1}^{s} Blk Ckj ) =

now the inner sum is (BC)lj by definition of the product:

= Σ_{l=1}^{n} Ail (BC)lj = (A(BC))ij ,

the ij-th entry on the right.
There is also another way to prove the associativity (later, in connection
with linear transformations).
We also prove (b) and half of (e). The remaining properties are considered in the exercises in the textbook.

(b) To prove A(B + C) = AB + AC, we let the rows of A be denoted by ~Ai
and the columns of B and C by ~bj and ~cj. Then the jth column of B + C is
~bj + ~cj (since addition is defined componentwise), and thus

[A(B + C)]ij = ~Ai · (~bj + ~cj)
             = ~Ai · ~bj + ~Ai · ~cj
             = (AB)ij + (AC)ij
             = (AB + AC)ij
Since this is true for all i and j, we must have A(B + C) = AB + AC.
(e) To prove AIn = A, we note that the identity matrix In can be column-partitioned as

In = [~e1 | ~e2 | · · · | ~en]

where ~ei is a standard unit vector. Therefore,

AIn = [A~e1 | A~e2 | · · · | A~en] = [~a1 | ~a2 | · · · | ~an] = A
by Theorem 3.1(b).
Theorem 3.4.
Properties of the Transpose
Let A and B be matrices (whose sizes are such that the indicated operations
can be performed) and let k be a scalar. Then
a. (AT )T = A
b. (A + B)T = AT + B T
c. (kA)T = kAT
d. (AB)T = B T AT
e. (Ar )T = (AT )r for all nonnegative integers r
Theorem 3.5.
a. If A is a square matrix, then A + AT is a symmetric matrix.
b. For any matrix A, AAT and AT A are symmetric matrices.
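Theorem 3.5 is easy to test on a small example (a NumPy illustration, not a proof):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])        # any matrix, square or not

AAT = A @ A.T                     # 2 x 2
ATA = A.T @ A                     # 3 x 3
print((AAT.T == AAT).all(), (ATA.T == ATA).all())  # True True

B = np.array([[1, 2],
              [3, 4]])            # square, so B + B^T makes sense
print(((B + B.T).T == B + B.T).all())              # True
```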
The Inverse of a Matrix
Definition If A is an n × n matrix, an inverse of A is an n × n matrix A′
with the property that

AA′ = I and A′A = I

where I = In is the n × n identity matrix. If such an A′ exists, then A is
called invertible.
Examples. 1)

[ 1 2 ] [ 1 −2 ]   [ 1 0 ]   [ 1 −2 ] [ 1 2 ]
[ 0 1 ]·[ 0  1 ] = [ 0 1 ] = [ 0  1 ]·[ 0 1 ] .
2) Having an inverse helps: e.g. to solve the equation AX = B, where
A, X, B are matrices, if there is A^−1, then simply X = A^−1 B, after we multiply both parts by A^−1 on the left and use associativity:

A^−1 (AX) = A^−1 B;   (A^−1 A)X = A^−1 B;   X = A^−1 B.

E.g.,

[ 1 1 ]            [ 3 4 ]          [ 1 −1 ] [ 3 4 ]   [ −2 −2 ]
[ 0 1 ] · X2×2  =  [ 5 6 ]   ⇒  X = [ 0  1 ]·[ 5 6 ] = [  5  6 ]

is the unique solution.
3) The matrix

A = [ 1 1 ]
    [ 1 1 ]

has no inverse: for any matrix B2×2 we have equal rows in AB (by the Helpful
Hint both are simply the sum of the rows of B), so it cannot be equal to

I = [ 1 0 ]
    [ 0 1 ] ,

whose rows are not equal.
Theorem 3.6.
If A is an invertible matrix, then its inverse is unique.
From now on, when A is invertible, we will denote its (unique) inverse
by A^−1.
Matrix form of a linear system. An arbitrary linear system

a11 x1 + a12 x2 + . . . + a1n xn = b1
a21 x1 + a22 x2 + . . . + a2n xn = b2
. . .
am1 x1 + am2 x2 + . . . + amn xn = bm

can be viewed as the equality of two big column-vectors:

[ a11 x1 + a12 x2 + . . . + a1n xn ]   [ b1 ]
[ a21 x1 + a22 x2 + . . . + a2n xn ]   [ b2 ]
[ . . .                            ] = [ .. ]
[ am1 x1 + am2 x2 + . . . + amn xn ]   [ bm ]
The left column in turn is obviously the sum
  [ a11 x1 ]   [ a12 x2 ]           [ a1n xn ]
= [ a21 x1 ] + [ a22 x2 ] + · · · + [ a2n xn ]
  [  ...   ]   [  ...   ]           [  ...   ]
  [ am1 x1 ]   [ am2 x2 ]           [ amn xn ]

  [ a11 ]        [ a12 ]                [ a1n ]
= [ a21 ] · x1 + [ a22 ] · x2 + · · · + [ a2n ] · xn .
  [ ... ]        [ ... ]                [ ... ]
  [ am1 ]        [ am2 ]                [ amn ]
By our “Helpful hint” on multiplication, the left-hand column is therefore the product of two matrices:

[ a11 a12 . . . a1n ]   [ x1 ]   [ a11 x1 + a12 x2 + . . . + a1n xn ]
[ a21 a22 . . . a2n ] · [ x2 ] = [ a21 x1 + a22 x2 + . . . + a2n xn ]
[ ...  ...      ... ]   [ .. ]   [ . . .                            ]
[ am1 am2 . . . amn ]   [ xn ]   [ am1 x1 + am2 x2 + . . . + amn xn ]
We can now write the linear system in the matrix form A · ~x = ~b, where

    [ a11 a12 . . . a1n ]
A = [ a21 a22 . . . a2n ]   is the coefficient matrix of the system,
    [ ...  ...      ... ]
    [ am1 am2 . . . amn ]

     [ x1 ]
~x = [ x2 ]   is the column matrix (vector) of variables, and
     [ .. ]
     [ xn ]

     [ b1 ]
~b = [ b2 ]   is the column matrix of constant terms (right-hand sides).
     [ .. ]
     [ bm ]
Theorem 3.7. If A is an invertible n × n matrix, then the system of linear
equations given by A~x = ~b has the unique solution ~x = A^−1~b for any ~b in Rn.
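Theorem 3.7 in action (a NumPy illustration with invented numbers; `np.linalg.solve` computes the solution without forming A^−1 explicitly, but the answer agrees with A^−1~b):

```python
import numpy as np

A = np.array([[1., 1.],
              [0., 1.]])          # invertible (see Theorem 3.8: ad - bc = 1)
b = np.array([3., 5.])

x = np.linalg.solve(A, b)         # the unique solution of A x = b
print(x)                          # [-2.  5.]
print(np.allclose(A @ x, b))                   # True
print(np.allclose(x, np.linalg.inv(A) @ b))    # x = A^-1 b: True
```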
The following theorem will be superseded later by more general methods, but it is useful in small examples.
Theorem 3.8. If

A = [ a b ]
    [ c d ] ,

then A is invertible if and only if ad − bc ≠ 0, in which case

A^−1 = 1/(ad − bc) · [  d −b ]
                     [ −c  a ]
(So, if ad − bc = 0, then A is not invertible.)
The expression ad − bc is called the determinant of A (in this special 2 × 2
case), denoted det A.
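A sketch of the 2 × 2 formula (a NumPy illustration; `inv2` is a hypothetical helper implementing Theorem 3.8):

```python
import numpy as np

def inv2(A):
    """Inverse of a 2 x 2 matrix [[a, b], [c, d]] via the ad - bc formula."""
    a, b = A[0]
    c, d = A[1]
    det = a * d - b * c           # the determinant of A
    if det == 0:
        raise ValueError("ad - bc = 0: A is not invertible")
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[1., 2.],
              [3., 4.]])          # ad - bc = 1*4 - 2*3 = -2
print(inv2(A))                    # [[-2.   1. ], [ 1.5 -0.5]]
print(np.allclose(A @ inv2(A), np.eye(2)))   # A A^-1 = I: True
```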