18.024 SPRING OF 2008
SYS. SYSTEMS OF LINEAR EQUATIONS
A set of equations of the form
\[
\left\{
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= c_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= c_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= c_m
\end{aligned}
\right.
\tag{1}
\]
is called a system of $m$ linear equations in $n$ unknowns. Here the $a_{ij}$, $i = 1, \dots, m$ and $j = 1, \dots, n$, and
$c_1, \dots, c_m$ are numbers, and $x_1, \dots, x_n$ are regarded as unknowns. The system (1) is conveniently
written in matrix notation as
\[
AX = C, \tag{2}
\]
where $A = (a_{ij})$ is an $m$ by $n$ matrix, called the coefficient matrix, and
\[
X = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
\quad\text{and}\quad
C = \begin{pmatrix} c_1 \\ \vdots \\ c_m \end{pmatrix}
\]
are, respectively, an $n$ by 1 column vector and an $m$ by 1 column vector. By a solution of (1) we mean an
$n$-vector $(x_1, \dots, x_n)$ for which all the equations in (1) are satisfied simultaneously. The solution set
of (1) consists of all such $n$-vectors; it is a subset of $V_n$, naturally.
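The definition of a solution can be checked mechanically. The following is a minimal Python sketch, assuming a matrix is stored as a list of rows; the helper names `mat_vec` and `is_solution` are chosen here for illustration only. It tests whether a given vector satisfies all equations of AX = C simultaneously, using the system x + y = 1, x − y = 0.

```python
from fractions import Fraction

def mat_vec(A, x):
    """Return the product A x for a matrix A (list of rows) and a vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def is_solution(A, x, c):
    """True when x satisfies every equation of the system A X = C."""
    return mat_vec(A, x) == list(c)

# Coefficient matrix and right-hand side of x + y = 1, x - y = 0.
A = [[1, 1], [1, -1]]
C = [1, 0]
half = Fraction(1, 2)

print(is_solution(A, [half, half], C))  # True
print(is_solution(A, [1, 0], C))        # False: the second equation fails
```

Exact fractions are used so that the comparison with C is not affected by rounding.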
The theory of systems of linear equations forms a major branch of linear algebra. Computational
algorithms for finding solutions of linear systems have important applications to real-world problems in engineering, business, and science, especially the social sciences. A system of nonlinear
equations can often be approximated by a linear system (a process called linearization), which is a useful
technique in designing a mathematical model of a relatively complex system.
The present chapter provides a crash course on solving systems of linear equations (without
proofs); a complete development of the theory would take many chapters. I refer those who are
interested in studying the proofs (and are already familiar with linear algebra) to [Mun] or [Str].
Solution set. The solution set of (1) satisfies exactly one of the following:
S1 The solution set is empty. In this case we say the system is inconsistent.
S2 The solution set consists of a single point. Then we say the solution is unique.
S3 The solution set is a $k$-dimensional plane (called a hyperplane) of $V_n$ for some $k > 0$, that
is, a $k$-dimensional subspace of $V_n$ translated by an $n$-vector. In this case we say that the
system has infinitely many solutions.
Example 1. (1) The system
\[
\left\{
\begin{aligned}
x + y &= 1 \\
x + y &= 2
\end{aligned}
\right.
\]
has no solution since the sum of two numbers cannot be both 1 and 2.
(2) The system
\[
\left\{
\begin{aligned}
x + y &= 1 \\
x - y &= 0
\end{aligned}
\right.
\]
has a unique solution $(x, y) = (1/2, 1/2)$.
(3) The system
\[
\left\{
\begin{aligned}
x + y &= 1 \\
2x + 2y &= 2
\end{aligned}
\right.
\]
has infinitely many solutions. Indeed, any two numbers whose sum is 1 give a solution. We may
express the general solution of this system as
\[
(x, y) = (0, 1) + t(1, -1),
\]
where $t$ is an arbitrary scalar. Thus, the solution set is a 1-plane in $V_2$. More precisely, it forms a
line in $V_2$ through $(0, 1)$ with direction $(1, -1)$.
Geometrically interpreted, each linear equation in two unknowns x and y determines a straight
line in the (x, y)-plane. Since a solution to a linear system must satisfy all equations of the system,
it must lie in the intersection of these lines, and therefore the solution set of a linear system of
two equations in two unknowns is either (1) the empty set, or (2) a single point, or (3) a line.
The proof of the trichotomy result uses Gauss-Jordan elimination. The following theorem states
the crucial result, which we shall use later to actually solve systems.
Theorem 2. Consider the system of linear equations AX = C, where A is an m by n matrix and C is an
m by 1 matrix. Let B be the matrix obtained by applying elementary row operations to A, and let $C'$ be the
matrix obtained by applying the same elementary row operations to C. Then the solution set of the system $BX = C'$
is the same as the solution set of AX = C.
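Theorem 2 can be illustrated on a small case. The sketch below assumes the three usual elementary row operations (exchanging two rows, multiplying a row by a nonzero scalar, adding a multiple of one row to another) and applies them to the rows [A | C] of the system x + y = 1, x − y = 0. The function names are hypothetical. The check confirms, for this one example only, that a solution stays a solution and a non-solution stays a non-solution.

```python
from fractions import Fraction

def swap_rows(M, i, j):
    """Elementary operation: exchange rows i and j."""
    M[i], M[j] = M[j], M[i]

def scale_row(M, i, k):
    """Elementary operation: multiply row i by a nonzero scalar k."""
    if k == 0:
        raise ValueError("the scalar must be nonzero")
    M[i] = [k * v for v in M[i]]

def add_multiple(M, i, j, k):
    """Elementary operation: add k times row j to row i (i != j)."""
    M[i] = [a + k * b for a, b in zip(M[i], M[j])]

def satisfies(M, x):
    """True when x solves the system whose rows [A | C] are stored in M."""
    return all(
        sum(a * xi for a, xi in zip(row[:-1], x)) == row[-1] for row in M
    )

# Rows [a1, a2 | c] of x + y = 1, x - y = 0.
M = [[Fraction(1), Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(-1), Fraction(0)]]
solution = [Fraction(1, 2), Fraction(1, 2)]
non_solution = [Fraction(1), Fraction(0)]

before = (satisfies(M, solution), satisfies(M, non_solution))
add_multiple(M, 1, 0, -1)          # row 1 <- row 1 - row 0
scale_row(M, 1, Fraction(-1, 2))   # row 1 <- (-1/2) * row 1
add_multiple(M, 0, 1, -1)          # row 0 <- row 0 - row 1
after = (satisfies(M, solution), satisfies(M, non_solution))

print(before, after)  # (True, False) (True, False)
```

After the three operations the rows read x = 1/2 and y = 1/2, so the solution can be read off directly.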
Homogeneous systems. To (1) we can associate another system
\[
AX = O
\]
obtained by replacing each $c_i$ in (1) by 0. This is called the homogeneous system corresponding to
(1). If $C \neq O$ then (1) is called inhomogeneous.
The homogeneous system always has at least one solution, namely X = O, which is called the trivial
solution. It may have others. Furthermore, the solution set of AX = O is a linear subspace of
$V_n$, which is called the null space of A. Indeed, if $X_1$ and $X_2$ are solutions of AX = O then so are
$X_1 + X_2$ and $cX_1$ for any scalar $c$. We wish to determine the dimension of this solution space and
to find a basis for it.
Definition 3. The column [row] rank of a matrix A is the maximal number of linearly independent
columns [rows] of A. The column rank and the row rank are always equal, and thus they are
simply called the rank* of A.
Theorem 4. Let A be an m by n matrix and let r be the rank of A. The solution space of the system of
linear equations AX = O is a subspace of $V_n$ of dimension $n - r$.
In particular, if the rows of A are independent then the solution space of the system AX = O
has dimension $n - m$.
A proof is in [Mun, B, Theorem 3].
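The count n − r in Theorem 4 is easy to compute for small matrices. Here is a minimal sketch that finds the rank by forward elimination in exact arithmetic; this is one common way to compute a rank and is not taken from the references. The function name `rank` and the sample matrices are assumptions made for illustration.

```python
from fractions import Fraction

def rank(A):
    """Rank of A (list of rows), counted as the number of pivots found
    during forward elimination with exact rational arithmetic."""
    M = [[Fraction(v) for v in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0  # number of pivots found so far
    for c in range(cols):
        if r == rows:
            break
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[1, 1], [2, 2]]          # coefficients of x + y = 1, 2x + 2y = 2
n = len(A[0])
print(rank(A), n - rank(A))   # 1 1: a one-dimensional solution space

B = [[1, 1, 2], [0, 1, 3]]
print(rank(B), rank(transpose(B)))  # 2 2: row rank equals column rank
```

For the matrix A above, n − r = 1 agrees with the line of solutions of the homogeneous system x + y = 0, 2x + 2y = 0.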
* An alternative definition viewing A as a linear transformation is given in [Apo, 16.3].
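Example 5. Let
\[
A = \begin{pmatrix}
0 & 1 & 4 & 1 & 2 \\
-1 & -2 & 0 & 9 & -1 \\
1 & 2 & 0 & -6 & 1 \\
2 & 5 & 4 & -10 & 4
\end{pmatrix},
\tag{3}
\]
and as such AX = O is a system of 4 equations in 5 unknowns. Now, A is reduced via Gauss-Jordan
elimination to
\[
D = \begin{pmatrix}
1 & 0 & -8 & 0 & -3 \\
0 & 1 & 4 & 0 & 2 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix}.
\tag{4}
\]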
We apply Theorem 2 in our situation to assert that the solution set of AX = O agrees with
that of DX = O. The matrix D has three nonzero rows, which are linearly independent, so its rank is 3. In view of Theorem 4, the solution set of DX = O (and also the solution set of
AX = O) has dimension 5 − 3 = 2.
We now find the general solution of AX = O (or equivalently, DX = O). Observe that in
the system DX = O, the unknowns $x_1$, $x_2$ and $x_4$ each appear in only one equation. We solve for these
unknowns in terms of the others:
\[
\begin{aligned}
x_1 &= 8x_3 + 3x_5, \\
x_2 &= -4x_3 - 2x_5, \\
x_4 &= 0.
\end{aligned}
\]
The general solution thus can be written as
\[
\begin{aligned}
X &= (8x_3 + 3x_5,\; -4x_3 - 2x_5,\; x_3,\; 0,\; x_5) \\
  &= (8x_3, -4x_3, x_3, 0, 0) + (3x_5, -2x_5, 0, 0, x_5) \\
  &= x_3 (8, -4, 1, 0, 0) + x_5 (3, -2, 0, 0, 1).
\end{aligned}
\]
The solution space is therefore spanned by the two vectors $(8, -4, 1, 0, 0)$ and $(3, -2, 0, 0, 1)$.
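The computation in Example 5 can be reproduced by machine. The following Python sketch performs a plain Gauss-Jordan elimination in exact arithmetic and then reads a basis of the solution space off the reduced matrix, one vector per unknown that is not solved for. The names `rref` and `null_space_basis` are hypothetical, and the matrix A is entered with rows (0, 1, 4, 1, 2), (−1, −2, 0, 9, −1), (1, 2, 0, −6, 1), (2, 5, 4, −10, 4).

```python
from fractions import Fraction

def rref(A):
    """Gauss-Jordan elimination. Returns the reduced matrix and the list
    of pivot column indices."""
    M = [[Fraction(v) for v in row] for row in A]
    rows, cols = len(M), len(M[0])
    pivots, r = [], 0
    for c in range(cols):
        if r == rows:
            break
        p = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        M[r], M[p] = M[p], M[r]
        pivot_value = M[r][c]
        M[r] = [v / pivot_value for v in M[r]]   # make the pivot equal to 1
        for i in range(rows):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    return M, pivots

def null_space_basis(A):
    """One basis vector of the solution space of A X = O per free unknown."""
    D, pivots = rref(A)
    n = len(A[0])
    basis = []
    for f in (j for j in range(n) if j not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)           # set this free unknown to 1
        for row, p in zip(D, pivots):
            v[p] = -row[f]           # solve the pivot unknowns in terms of it
        basis.append(v)
    return basis

A = [[0, 1, 4, 1, 2],
     [-1, -2, 0, 9, -1],
     [1, 2, 0, -6, 1],
     [2, 5, 4, -10, 4]]

D, pivots = rref(A)
basis = null_space_basis(A)
print(pivots)                            # [0, 1, 3]: x1, x2, x4 are solved for
print([[int(v) for v in b] for b in basis])
```

The reduced matrix agrees with D in (4), and the two basis vectors returned are (8, −4, 1, 0, 0) and (3, −2, 0, 0, 1).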
The procedure we followed in the above example can be followed in general. We first write
X as a vector each of whose components is a linear combination of the unknowns $x_i$ that were not solved for, and then as a linear
combination, with coefficients $x_i$, of vectors in $V_n$. There are of course $n - r$ of these unknowns $x_i$,
and hence $n - r$ of these vectors. These vectors are linearly independent. (Why?)
Solving inhomogeneous systems. We now turn to solving (1) with allowance for an inhomogeneous term $C \neq O$.
Theorem 6. (a) The solution set of (2) is given by
\[
\{ P + X_h : A X_h = O \},
\]
where P is any one solution of the inhomogeneous system AX = C (if one exists).
(b) Let r be the rank of A. If $r < m$ then for some $C \in V_m$ there is no solution of (2). If $r = m$
then (2) always has a solution.
The proof of (a) is in [Apo, Theorem 16.18] and the proof of (b) is in [Mun, B, Theorem 6].
Example 7 (continued). The system
\[
DX = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix},
\]
where D is given in (4), has no solution since the last equation of the system is
\[
0x_1 + 0x_2 + 0x_3 + 0x_4 + 0x_5 = 1.
\]
On the other hand, the system
\[
DX = \begin{pmatrix} -1 \\ 3 \\ 7 \\ 0 \end{pmatrix}
\]
does have a solution. Indeed, X is a solution exactly when
\[
\begin{aligned}
x_1 &= -1 + 8x_3 + 3x_5, \\
x_2 &= 3 - 4x_3 - 2x_5, \\
x_4 &= 7.
\end{aligned}
\]
One such solution, obtained by taking $x_3 = x_5 = 0$, is $X = (-1, 3, 0, 7, 0)$. The general solution is thus the 2-plane in $V_5$ represented by the parametric equation
\[
X = (-1, 3, 0, 7, 0) + x_3 (8, -4, 1, 0, 0) + x_5 (3, -2, 0, 0, 1).
\]
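The parametric form above is an instance of Theorem 6(a): a particular solution plus an arbitrary solution of the homogeneous system. A short Python sketch, with names chosen here for illustration, checks this for a grid of integer parameter values. It is a spot check on sample values, not a proof.

```python
def mat_vec(A, x):
    """Return the product A x for a matrix A (list of rows) and a vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# The reduced matrix D of (4) and the right-hand side of the second system.
D = [[1, 0, -8, 0, -3],
     [0, 1, 4, 0, 2],
     [0, 0, 0, 1, 0],
     [0, 0, 0, 0, 0]]
C = [-1, 3, 7, 0]

P = [-1, 3, 0, 7, 0]    # particular solution
V1 = [8, -4, 1, 0, 0]   # spanning vectors of the homogeneous solution space
V2 = [3, -2, 0, 0, 1]

def general_solution(x3, x5):
    """The point P + x3 * V1 + x5 * V2 of the 2-plane."""
    return [p + x3 * a + x5 * b for p, a, b in zip(P, V1, V2)]

# V1 and V2 solve D X = O, and every sampled point of the plane solves D X = C.
homogeneous_ok = mat_vec(D, V1) == [0, 0, 0, 0] and mat_vec(D, V2) == [0, 0, 0, 0]
plane_ok = all(
    mat_vec(D, general_solution(s, t)) == C
    for s in range(-3, 4)
    for t in range(-3, 4)
)
print(homogeneous_ok, plane_ok)  # True True
```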
Solving the system AX = C in practice involves applying elementary row operations to A and
applying the same operations to C. A convenient way to perform these is to form a new matrix
from A by adjoining C as an additional column. The matrix so obtained is often called the augmented
matrix of the system. Then one applies the elementary row operations to this matrix and deals
with A and C simultaneously. This procedure is explained in [Apo, 16.18].
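As a rough illustration of the augmented-matrix procedure, the sketch below reduces [A | C] by Gauss-Jordan elimination and then sorts the system into the three cases S1, S2, S3. The function names and the returned labels are assumptions made for this example. A pivot in the adjoined column corresponds to an equation 0 = 1, which signals inconsistency.

```python
from fractions import Fraction

def rref(A):
    """Gauss-Jordan elimination. Returns the reduced matrix and pivot columns."""
    M = [[Fraction(v) for v in row] for row in A]
    rows, cols = len(M), len(M[0])
    pivots, r = [], 0
    for c in range(cols):
        if r == rows:
            break
        p = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        pivot_value = M[r][c]
        M[r] = [v / pivot_value for v in M[r]]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    return M, pivots

def solve(A, C):
    """Classify A X = C. Returns (label, solution), where solution is the
    unique solution when there is one and None otherwise."""
    n = len(A[0])
    augmented = [list(row) + [c] for row, c in zip(A, C)]
    R, pivots = rref(augmented)
    if n in pivots:               # pivot in the adjoined column: 0 = 1
        return "inconsistent", None
    if len(pivots) == n:          # every unknown is a pivot unknown
        return "unique", [R[i][n] for i in range(n)]
    return "infinitely many", None

print(solve([[1, 1], [1, 1]], [1, 2]))    # ('inconsistent', None)
print(solve([[1, 1], [1, -1]], [1, 0])[0])  # unique
print(solve([[1, 1], [2, 2]], [1, 2]))    # ('infinitely many', None)
```

The three calls are the three systems of Example 1; in the second case the solution returned is (1/2, 1/2).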
Remarks on inverses of matrices. We must take into account that matrix multiplication is not
commutative. For example, for
\[
A = \begin{pmatrix} 1 & 1 & 2 \\ 0 & 1 & 3 \end{pmatrix},
\qquad
B = \begin{pmatrix} 0 & 0 \\ 3 & -2 \\ -1 & 1 \end{pmatrix},
\]
it is straightforward to verify that $AB = I_2$ but $BA \neq I_3$.
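The verification can be carried out by hand or with a few lines of Python. In this sketch the helper `mat_mul` is written out for illustration, and A and B are entered as the 2 by 3 matrix with rows (1, 1, 2), (0, 1, 3) and the 3 by 2 matrix with rows (0, 0), (3, −2), (−1, 1).

```python
def mat_mul(A, B):
    """Product of matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 1, 2],
     [0, 1, 3]]
B = [[0, 0],
     [3, -2],
     [-1, 1]]

AB = mat_mul(A, B)
BA = mat_mul(B, A)
print(AB)                 # [[1, 0], [0, 1]]
print(BA)                 # [[0, 0, 0], [3, 1, 0], [-1, 0, 1]]
print(AB == identity(2))  # True
print(BA == identity(3))  # False
```

So B is a right inverse of A but not a left inverse.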
Definition 8. Given $A \in M_{m \times n}$, a matrix $B \in M_{n \times m}$ is called a left [right] inverse of A if $BA = I_n$
[$AB = I_m$], where $I_n$ is the identity matrix of dimension n. A matrix $B \in M_{n \times m}$ is
called an inverse of A if $BA = I_n$ and $AB = I_m$.
Theorem 9. A matrix $A \in M_{m \times n}$ has an inverse if and only if m = n = rank of A. Moreover, the inverse
is unique.
The proof of the theorem uses the following lemma.
Lemma 10. If $A \in M_{m \times n}$ has a right inverse then m = rank of A. If $A \in M_{m \times n}$ has a left inverse then
n = rank of A.
Proof. If $AB = I_m$, then AX = C has a solution for all C. Indeed, $A(BC) = (AB)C = I_m C = C$, so X = BC is a solution.
By Theorem 6(b), this forces the number of rows of the matrix to be equal to its rank.
If $BA = I_n$, then AX = O has only the trivial solution X = O. Indeed, if AX = O then $X = I_n X = (BA)X = B(AX) = O$.
By Theorem 4, the solution space then has dimension $n - r = 0$, which forces the number of columns of the matrix to be equal to its rank.
The inverse of a matrix A is unique, if it exists, and is denoted by $A^{-1}$.
To actually determine the entries of the inverse of an invertible n × n matrix is equivalent to
solving n separate inhomogeneous linear systems. Suppose $A = (a_{ij})$ has the inverse $A^{-1} = (b_{ij})$. The
entries of A and $A^{-1}$ are related by the $n^2$ equations
\[
\sum_{k=1}^{n} a_{ik} b_{kj} = \delta_{ij},
\]
where $\delta_{ij}$ is the Kronecker delta. For each fixed choice of j, we can regard this as an inhomogeneous system of n linear equations in the n unknowns $b_{1j}, b_{2j}, \dots, b_{nj}$. The details of the method are
explained in [Apo, pp. 612].
Computing $A^{-1}$ is practically important. If A is invertible then the system AX = C has a unique
solution $X = A^{-1}C$.
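One way to organize the n systems is to solve them all at once by adjoining the identity matrix to A and applying Gauss-Jordan elimination. The sketch below does this in exact arithmetic; it is a standard technique and is not necessarily the presentation in [Apo]. The function names are chosen here for illustration, and a `ValueError` is raised when no pivot can be found, that is, when the rank is less than n.

```python
from fractions import Fraction

def inverse(A):
    """Inverse of a square matrix by Gauss-Jordan elimination on [A | I].
    Column j of the result solves the j-th of the n linear systems."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = next((i for i in range(c, n) if M[i][c] != 0), None)
        if p is None:
            raise ValueError("matrix is not invertible (rank < n)")
        M[c], M[p] = M[p], M[c]
        pivot_value = M[c][c]
        M[c] = [v / pivot_value for v in M[c]]
        for i in range(n):
            if i != c and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[c])]
    return [row[n:] for row in M]

def mat_vec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 1], [1, -1]]        # coefficients of x + y = 1, x - y = 0
A_inv = inverse(A)
X = mat_vec(A_inv, [1, 0])   # X = A^{-1} C
print(A_inv)
print(X)                     # [Fraction(1, 2), Fraction(1, 2)]
```

Here $A^{-1}$ has rows (1/2, 1/2) and (1/2, −1/2), and $X = A^{-1}C$ recovers the unique solution (1/2, 1/2) of the second system in Example 1.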
REFERENCES
[Apo] T. Apostol, Calculus, vol. I, Second edition, Wiley, 1967.
[Mun] J. Munkres, 18.024 Lecture Notes, available at the OCW.
A standard textbook on elementary linear algebra is
[Str] G. Strang, Introduction to Linear Algebra, Third edition, Wellesley, MA, 2003.
© 2008 by Vera Mikyoung Hur
E-mail address: [email protected]