2 Rank and Matrix Algebra (UCLA Department of Mathematics)
2.1 Rank
In our introduction to systems of linear equations we mentioned that a system can have
no solutions, a unique solution, or infinitely many solutions. In this section we’re going to
introduce an invariant¹ of matrices, and when this invariant is computed for the matrix of
coefficients of a system of linear equations, it can give us information about the number of
solutions to this system.

Rank.² The rank of a matrix A is the number of pivots in rref(A), the reduced row-echelon
form of A. We denote the rank of A by rank(A).
Example 8. Consider the matrix
\[
A = \begin{bmatrix} 4 & 3 & 2 & -1 \\ 5 & 4 & 3 & -1 \\ -2 & -2 & -1 & 2 \\ 11 & 6 & 4 & 1 \end{bmatrix}.
\]
This matrix has reduced row-echelon form
\[
\mathrm{rref}(A) = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & -3 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix},
\]
so rank(A) = 3, since rref(A) has 3 pivots.
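Since this definition is algorithmic (row-reduce, then count pivots), it can be checked in code. Below is a minimal Python sketch, not part of the original notes, that performs Gaussian elimination with exact fraction arithmetic and counts the pivots; applied to the matrix A of Example 8 it reports rank 3.

```python
from fractions import Fraction

def rank(rows):
    """Rank = number of pivots produced by Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in rows]  # exact arithmetic
    pivots, col = 0, 0
    while pivots < len(M) and col < len(M[0]):
        # Look for a row at or below position `pivots` with a nonzero
        # entry in the current column; that entry becomes a pivot.
        p = next((r for r in range(pivots, len(M)) if M[r][col] != 0), None)
        if p is None:
            col += 1
            continue
        M[pivots], M[p] = M[p], M[pivots]
        # Clear every other entry in the pivot's column.
        for r in range(len(M)):
            if r != pivots and M[r][col] != 0:
                f = M[r][col] / M[pivots][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[pivots])]
        pivots += 1
        col += 1
    return pivots

A = [[4, 3, 2, -1], [5, 4, 3, -1], [-2, -2, -1, 2], [11, 6, 4, 1]]
print(rank(A))  # 3
```

Exact `Fraction` arithmetic is used so that pivots are never missed (or invented) due to floating-point roundoff.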
Note. Here are some of the connections between the rank of a matrix and the number of
solutions to a system of linear equations. Suppose we have a system of n linear equations in
m variables, and that the n × m matrix A is the coefficient matrix of this system. Then
1. We have rank(A) ≤ n and rank(A) ≤ m, because there cannot be more pivots than
there are rows, nor more pivots than there are columns.
2. If the system of equations is inconsistent, then rank(A) < n. This is because in
row-reducing an inconsistent system we eventually produce a row of zeros augmented
by a nonzero constant. This row of zeros can’t have a pivot, so the number of pivots
is at most n − 1.

¹An invariant of a mathematical object is a property that doesn’t change when we apply certain operations to the object. In this case, the rank of a matrix is invariant under elementary row operations.
²We’ll soon give a different, much better definition of the rank of a matrix. This is a probationary definition from which we’ll soon move on.
3. If the system has exactly one solution, then rank(A) = m. If rank(A) < m, then the
system would have a free variable, meaning that if there is a solution, then there are
infinitely many solutions.
4. If the system has infinitely many solutions, then rank(A) < m, because a system with
infinitely many solutions must have a free variable.
We can combine these observations with some remarks made in the previous section to
conclude that if m = n (i.e., we have the same number of equations as variables), then the
system has a unique solution if and only if rref(A) = Iₙ, the n × n identity matrix.
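The rank observations above can be packaged into a small solution-counting routine. The following Python sketch (an illustration, with made-up 2 × 2 systems) row-reduces the augmented matrix and applies observations 2-4.

```python
from fractions import Fraction

def classify(A, b):
    """Row-reduce the augmented matrix [A | b] and report how many
    solutions the system has, using the rank observations."""
    n, m = len(A), len(A[0])
    M = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    pivots, col = 0, 0
    while pivots < n and col < m:   # pivots live in the columns of A only
        p = next((r for r in range(pivots, n) if M[r][col] != 0), None)
        if p is None:
            col += 1
            continue
        M[pivots], M[p] = M[p], M[pivots]
        for r in range(n):
            if r != pivots and M[r][col] != 0:
                f = M[r][col] / M[pivots][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[pivots])]
        pivots += 1
        col += 1
    # A zero row of A augmented by a nonzero constant means "inconsistent".
    if any(all(x == 0 for x in row[:m]) and row[m] != 0 for row in M):
        return "no solutions"
    # Consistent: unique exactly when there is no free variable.
    return "unique solution" if pivots == m else "infinitely many solutions"

print(classify([[1, 1], [1, -1]], [2, 0]))  # unique solution
print(classify([[1, 1], [2, 2]], [2, 5]))   # no solutions
print(classify([[1, 1], [2, 2]], [2, 4]))   # infinitely many solutions
```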
2.2 Matrix Algebra
So far we’ve only seen matrices used to represent systems of linear equations. They have
another (perhaps more important) use as representatives of linear transformations. For
example, matrices can be used to model the evolution of the electorate between election
cycles. Suppose that each election cycle, 90% of voters who were members of the Republican
party remain in the Republican party, while the other 10% join the Democratic party. At
the same time, suppose 5% of Democrats become Republicans, while the other 95% remain
in the Democratic party. If each party has 500,000 voters in one election cycle, the following
matrix equation computes the number of voters each party will have in the next cycle:
\[
\underbrace{\begin{bmatrix} 0.95 & 0.10 \\ 0.05 & 0.90 \end{bmatrix}}_{A} \begin{bmatrix} 500{,}000 \\ 500{,}000 \end{bmatrix} = \begin{bmatrix} 525{,}000 \\ 475{,}000 \end{bmatrix}. \tag{6}
\]
Once we learn how to interpret (6) we’ll see that in the next election there should be 525,000
Democratic voters and 475,000 Republican voters. In this equation the matrix A represents
the transformation from one election cycle to the next, and this is the idea we’d like to focus
on soon: matrices as transformations.
Before we can really consider matrices as representing transformations, we’ll need to first
make sense of expressions such as (6). To this end, we’ll make clear some vocabulary surrounding matrices and then discuss addition of matrices.
Size of a matrix. The size of a matrix is given by the number of rows and columns it
has. A matrix with m rows and n columns is said to be “m-by-n”, written m × n. We
occasionally call a matrix with only one row a row matrix and call a matrix with just
one column a column matrix; we will call either of these types of matrices vectors.
Matrices which have the same number of rows and columns are called square matrices.
Example 9. Below we have a 2 × 3 matrix, a row matrix, a column matrix, and a square
matrix, respectively:
\[
\begin{bmatrix} 2 & 7 & -3 \\ 1 & 0 & 9 \end{bmatrix}, \quad \begin{bmatrix} 0 & 9 & 1 \end{bmatrix}, \quad \begin{bmatrix} 6 \\ 2 \\ 3 \end{bmatrix}, \quad \begin{bmatrix} 0 & -1 & 1 \\ -2 & 2 & -4 \\ 4 & -8 & 8 \end{bmatrix}.
\]
We use lowercase letters with double subscripts to identify the entries of a matrix. For
example, if the square matrix above is A, then a₂₃ = −4.
Now we define addition of matrices. This operation is easy, but it is important to
remember that addition of matrices is only defined when the matrices involved have the
same size. That is, we can’t add a 2 × 3 matrix to a 5 × 4 matrix, but we can add two 2 × 3
matrices together, and we do so entry-wise. For instance,
\[
\begin{bmatrix} 2 & 7 & -3 \\ 1 & 0 & 9 \end{bmatrix} + \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} = \begin{bmatrix} 3 & 9 & 0 \\ 5 & 5 & 15 \end{bmatrix}.
\]
We can similarly define subtraction entry-wise, subject to the same restriction on size:
\[
\begin{bmatrix} 2 & 7 & -3 \\ 1 & 0 & 9 \end{bmatrix} - \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} = \begin{bmatrix} 1 & 5 & -6 \\ -3 & -5 & 3 \end{bmatrix}.
\]
Because scalar addition is commutative and we defined matrix addition entry-wise, matrix addition is commutative. That is, if A and B are matrices of the same size, then
A + B = B + A. We point this out because it will not be true for multiplication.
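Because the operation is entry-wise, commutativity can be spot-checked in a few lines of Python; this sketch reuses the two matrices from the addition example above.

```python
def add(A, B):
    """Entry-wise sum; only defined when the sizes match."""
    assert len(A) == len(B) and all(len(ra) == len(rb) for ra, rb in zip(A, B))
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[2, 7, -3], [1, 0, 9]]
B = [[1, 2, 3], [4, 5, 6]]
print(add(A, B))               # [[3, 9, 0], [5, 5, 15]]
print(add(A, B) == add(B, A))  # True: A + B = B + A
```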
After addition and subtraction, it makes sense to define matrix multiplication. We’re
going to defer most of this discussion a little longer, but we will at least define the product
Ax, where A is a matrix and x is a vector. As with addition and subtraction, multiplication
has a size condition:
Size condition for the product Ax. If A is an n × m matrix and x is a vector with k
components, then the product Ax only exists if m = k. That is, the number of columns
of A must equal the number of components of x. If this condition is met, then Ax will
be a vector with n components.
To prepare ourselves for the definition of Ax, we first define a product between vectors
with the same number of components, called the dot product.
Dot product. Suppose we have vectors
\[
v = \langle v_1, v_2, \ldots, v_n \rangle \quad \text{and} \quad w = \langle w_1, w_2, \ldots, w_n \rangle.
\]
Then the dot product of v and w is a scalar given by
\[
v \cdot w = v_1 w_1 + v_2 w_2 + \cdots + v_n w_n.
\]
Notice that w · v = v · w.
Example 10. We have
\[
\begin{bmatrix} 1 & 1 & -1 & 1 \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 3 \\ 6 \\ 1 \end{bmatrix} = 1 \cdot 2 + 1 \cdot 3 + (-1) \cdot 6 + 1 \cdot 1 = 0.
\]
Though it’s not necessary, we often write the first vector in a dot product as a row vector
and the second as a column vector.
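A dot product is one line of Python. The quick check below reuses the vectors from Example 10 and confirms that the order of the factors doesn’t matter (a sketch, not part of the notes).

```python
def dot(v, w):
    """Dot product of two vectors with the same number of components."""
    assert len(v) == len(w), "vectors must have the same length"
    return sum(vi * wi for vi, wi in zip(v, w))

print(dot([1, 1, -1, 1], [2, 3, 6, 1]))  # 0, as in Example 10
print(dot([1, 1, -1, 1], [2, 3, 6, 1]) == dot([2, 3, 6, 1], [1, 1, -1, 1]))  # True: w·v = v·w
```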
We now define the product Ax in terms of dot products.
The product Ax. Let A be an n × m matrix and let x be a vector with m components.
If the rows of A are the vectors w1, . . . , wn, each with m components, then
\[
Ax = \begin{bmatrix} w_1 \cdot x \\ \vdots \\ w_n \cdot x \end{bmatrix}.
\]
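This definition translates directly into code: each output component is the dot product of one row of A with x. A minimal Python sketch follows (the 2 × 2 example values are made up for illustration).

```python
def mat_vec(A, x):
    """Compute Ax, where each component is (row of A) · x."""
    # Size condition: the number of columns of A must equal len(x).
    if any(len(row) != len(x) for row in A):
        raise ValueError("columns of A must match components of x")
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

print(mat_vec([[1, 2], [3, 4]], [5, 6]))  # [17, 39]
```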
Example 11. Compute each of the following products, or write that the product does not
exist:
\[
\text{(a)}\ \begin{bmatrix} 1 & 0 & 8 \\ 4 & -3 & 2 \end{bmatrix} \begin{bmatrix} 0 \\ 6 \\ 3 \end{bmatrix} \qquad \text{(b)}\ \begin{bmatrix} 1 & 0 & 8 \\ 4 & -3 & 2 \end{bmatrix} \begin{bmatrix} 24 \\ -12 \end{bmatrix} \qquad \text{(c)}\ \begin{bmatrix} 0.95 & 0.10 \\ 0.05 & 0.90 \end{bmatrix} \begin{bmatrix} 525{,}000 \\ 475{,}000 \end{bmatrix}.
\]
(Solution)
(a) We have
\[
\begin{bmatrix} 1 & 0 & 8 \\ 4 & -3 & 2 \end{bmatrix} \begin{bmatrix} 0 \\ 6 \\ 3 \end{bmatrix} = \begin{bmatrix} 1 \cdot 0 + 0 \cdot 6 + 8 \cdot 3 \\ 4 \cdot 0 + (-3) \cdot 6 + 2 \cdot 3 \end{bmatrix} = \begin{bmatrix} 24 \\ -12 \end{bmatrix}.
\]
(b) Since the matrix in the product is 2 × 3 and the vector has two components (rather
than three), this product is not defined.
(c) This represents another election cycle in the two-party example at the beginning of
this section:
\[
\begin{bmatrix} 0.95 & 0.10 \\ 0.05 & 0.90 \end{bmatrix} \begin{bmatrix} 525{,}000 \\ 475{,}000 \end{bmatrix} = \begin{bmatrix} 0.95 \cdot 525{,}000 + 0.10 \cdot 475{,}000 \\ 0.05 \cdot 525{,}000 + 0.90 \cdot 475{,}000 \end{bmatrix} = \begin{bmatrix} 546{,}250 \\ 453{,}750 \end{bmatrix}.
\]
So after another cycle, there will be 546,250 Democratic voters and 453,750 Republican
voters.
♦
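The election model can also be iterated in code. This Python sketch (using exact fractions so the voter counts stay whole numbers) reproduces both cycles computed in this section.

```python
from fractions import Fraction

# Transition matrix from the two-party example; the state vector is
# [Democrats, Republicans].
A = [[Fraction("0.95"), Fraction("0.10")],
     [Fraction("0.05"), Fraction("0.90")]]

def step(v):
    """Advance one election cycle: v becomes Av."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

v = [Fraction(500_000), Fraction(500_000)]
v = step(v)
print([int(x) for x in v])  # [525000, 475000], matching equation (6)
v = step(v)
print([int(x) for x in v])  # [546250, 453750], matching part (c)
```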
One helpful property of matrix multiplication is that it plays nicely with vector and
matrix addition, as well as scalar multiplication:
\[
A(x + y) = Ax + Ay, \qquad (A + B)x = Ax + Bx, \qquad A(kx) = k(Ax),
\]
for all n × m matrices A and B, vectors x and y with m components, and scalars k.
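These identities are easy to spot-check numerically. In the Python sketch below, A is the 3 × 3 matrix used at the end of this section, while B, x, y, and k are made-up values chosen just for the check.

```python
def mat_vec(A, x):
    """Ax: each component is a row of A dotted with x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def vec_add(x, y):
    return [a + b for a, b in zip(x, y)]

A = [[1, 2, 3], [2, -1, 6], [3, 0, -4]]  # matrix used later in this section
B = [[0, 1, 0], [1, 1, 1], [2, 0, 5]]    # made-up values for the check
x, y, k = [1, 4, -2], [3, 0, 7], 5

# A(x + y) = Ax + Ay
assert mat_vec(A, vec_add(x, y)) == vec_add(mat_vec(A, x), mat_vec(A, y))
# (A + B)x = Ax + Bx
A_plus_B = [vec_add(ra, rb) for ra, rb in zip(A, B)]
assert mat_vec(A_plus_B, x) == vec_add(mat_vec(A, x), mat_vec(B, x))
# A(kx) = k(Ax)
assert mat_vec(A, [k * xi for xi in x]) == [k * c for c in mat_vec(A, x)]
print("all three identities hold")
```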
We’ll conclude with an example of arguably the most important matrix product in this
course. Consider the product
\[
\begin{bmatrix} 1 & 2 & 3 \\ 2 & -1 & 6 \\ 3 & 0 & -4 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z \\ 2x - y + 6z \\ 3x - 4z \end{bmatrix}.
\]
The entries in the resulting vector look a lot like the left side of a system of linear equations.
Indeed, requiring, say,
\[
\begin{bmatrix} 1 & 2 & 3 \\ 2 & -1 & 6 \\ 3 & 0 & -4 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 6 \\ 7 \\ -2 \end{bmatrix}
\]
is the same as insisting that (x, y, z) be a solution to the system of linear equations given by
\[
\begin{aligned}
x + 2y + 3z &= 6 \\
2x - y + 6z &= 7 \\
3x \phantom{{}+2y{}} - 4z &= -2.
\end{aligned}
\]
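Solving Ax = b without "dividing by A" is exactly what row reduction accomplishes. The following Python sketch runs Gauss-Jordan elimination on the augmented matrix [A | b] for the 3 × 3 system above; it assumes the system has a unique solution.

```python
from fractions import Fraction

def solve(A, b):
    """Gauss-Jordan elimination on the augmented matrix [A | b].
    Assumes the square system has a unique solution."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        # Choose a pivot row with a nonzero entry in this column.
        p = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[p] = M[p], M[col]
        M[col] = [x / M[col][col] for x in M[col]]  # scale pivot to 1
        # Clear the rest of the pivot's column, above and below.
        for r in range(n):
            if r != col and M[r][col] != 0:
                M[r] = [a - M[r][col] * c for a, c in zip(M[r], M[col])]
    return [row[-1] for row in M]  # last column now holds the solution

A = [[1, 2, 3], [2, -1, 6], [3, 0, -4]]
b = [6, 7, -2]
print(solve(A, b))  # x = 10/13, y = 1, z = 14/13
```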
This is an important observation. It means that we may take an arbitrary system
\[
\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2 \\
&\ \ \vdots \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &= b_m
\end{aligned}
\]
and write it succinctly as a matrix product
\[
\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}.
\]
That is, we want to solve the equation Ax = b, where A, x and b are what you’d expect
them to be. If this were an equation of numbers and A were nonzero, we’d know how to
solve for x: divide by A. We’ll apply similar ideas to this equation, though division rules for
matrices are slightly less straightforward than they are for numbers.