Systems of Linear Equations in Fields
1. Fields
A field is a structure F = (F ; +, ·; −, ι; 0, 1) such that
(1) F is a set with at least two members
(2) +, ·, −, ι, 0, 1 are operations on F
(a) + (addition) and · (multiplication) are binary operations
(b) − (additive inversion) and ι (multiplicative inversion) are unary operations
(c) 0 (zero) and 1 (one) are nullary operations, sometimes called constants. By a harmless abuse
of notation, they are two of the elements of F .
The operations satisfy the following properties:
(1) Addition is commutative: x + y = y + x
(2) Addition is associative: (x + y) + z = x + (y + z)
(3) Zero is an additive left identity: 0 + x = x
(4) Additive left inverses are selected via the additive inversion operation: −x + x = 0
(5) Multiplication is commutative: x · y = y · x
(6) Multiplication is associative: (x · y) · z = x · (y · z)
(7) One is a left multiplicative identity: 1 · x = x
(8) Nonzero members of F have multiplicative left inverses that are selected via the multiplicative
inversion operation: ι(x) · x = 1, for x ≠ 0
(9) Multiplication is left distributive over addition: x · (y + z) = x · y + x · z
Examples 1.1. Several familiar examples are immediately available:
(1) The integers, with the usual integer arithmetic, do not form a field.
(2) Arithmetic modulo p, where p is a prime number, makes the set {0, 1, ..., p − 1} into a field. For
example, arithmetic modulo two yields a two-element field, whose only elements are zero and one.
(3) Boolean arithmetic on the two-element set {0, 1} is a field arithmetic.
(4) The usual arithmetic of the rational numbers makes the set of rationals into a field.
(5) The real numbers form a field.
(6) The complex numbers form a field.
(7) Even though arithmetic modulo four does not make the set {0, 1, 2, 3} into a field, there is a field
with exactly four elements.
(8) The set of algebraic numbers forms a field.
(9) The set of constructible numbers forms a field.
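These examples can be checked mechanically. The sketch below (the function name is our own, and the brute-force search is practical only for small n) tests whether arithmetic modulo n satisfies the field axioms; since every axiom except the existence of multiplicative inverses holds in any modular arithmetic, only the inverses need checking.

```python
# Brute-force test of the field axioms for arithmetic modulo n.
# Only multiplicative inverses need checking: the other axioms hold
# in any modular arithmetic.

def is_field_mod(n):
    """Return True if arithmetic modulo n makes {0, ..., n-1} a field."""
    if n < 2:
        return False          # a field has at least two members
    for x in range(1, n):
        # axiom (8): every nonzero x needs some y with x * y = 1 (mod n)
        if not any((x * y) % n == 1 for y in range(n)):
            return False
    return True

print(is_field_mod(5))   # True: 5 is prime (example (2))
print(is_field_mod(4))   # False: 2 has no inverse mod 4 (example (7))
```

This confirms example (7): arithmetic modulo four is not a field arithmetic, because 2 · y is always 0 or 2 modulo 4.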
To each field F is associated a nonnegative integer, called its characteristic. Specifically, if there is a
positive integer n such that nx = 0 for each element x of the field F, then F is said to have finite
characteristic, and the characteristic of F is the least such positive integer. If there is no such positive
integer, then F is said to have characteristic zero. The rational, real and complex fields are characteristic
zero fields. For any prime p, the fields with p elements are characteristic p fields. For example, boolean
arithmetic is a characteristic 2 arithmetic, because boolean arithmetic satisfies the law x + x = 0. An
important fact about characteristic of fields is the following:
Theorem 1.1. Every field either has characteristic zero or has characteristic p, where p is a prime.
Fields of prime characteristic play a significant role in cryptography.
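The characteristic of a modular arithmetic can be computed directly from the definition, by adding 1 to itself until the sum returns to 0. A small sketch (the function name is our own):

```python
# Compute the characteristic of arithmetic modulo p: the least positive
# n such that n * 1 = 0 (mod p).

def char_mod(p):
    """Characteristic of arithmetic modulo p (p >= 2)."""
    total, n = 1 % p, 1
    while total != 0:
        total = (total + 1) % p   # add one more copy of 1
        n += 1
    return n

print(char_mod(2))  # 2: boolean arithmetic satisfies x + x = 0
print(char_mod(7))  # 7
```

For a prime p this always returns p, in agreement with Theorem 1.1.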
Another important result along these lines is the following:
Theorem 1.2. Let F be a field. If F has characteristic zero, then F has a subfield that is isomorphic to the
rational number field. If F has characteristic p, where p is a prime integer, then F has a subfield with p
elements.
2. Systems of Equations
Example 2.1. Solve the following system of modulo five equations:
x1 + 3x2 = 0
x1 + x2 = 0
Solution:
We row-reduce the corresponding augmented matrix, using arithmetic modulo five:

    1 3 0   R1 + 2R2    3 0 0      2R1       1 0 0
    1 1 0  =========⇒   1 1 0  ==========⇒   0 2 0 .
               R2                 2R2 + R1
It follows that the only solution of the system is (0, 0).
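The same modular row reduction can be carried out mechanically. The following sketch (our own function, which assumes p is prime so that Fermat's little theorem `pow(a, p - 2, p)` supplies multiplicative inverses) reduces the augmented matrix of Example 2.1:

```python
# Row-reduce an augmented matrix using arithmetic modulo a prime p.
# Leading entries are scaled to 1, so the result is in reduced echelon form.

def row_reduce_mod(M, p):
    M = [[x % p for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols - 1):              # last column is the augmented part
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]    # row interchange
        inv = pow(M[r][c], p - 2, p)       # Fermat inverse (p prime)
        M[r] = [(inv * x) % p for x in M[r]]   # scaling
        for i in range(rows):
            if i != r and M[i][c]:
                f = M[i][c]                # row replacement
                M[i] = [(a - f * b) % p for a, b in zip(M[i], M[r])]
        r += 1
    return M

print(row_reduce_mod([[1, 3, 0], [1, 1, 0]], 5))  # [[1, 0, 0], [0, 1, 0]]
```

The reduced matrix says x1 = 0 and x2 = 0, the same conclusion as the hand computation.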
3. Elementary and Admissible Row Operations
Some row operations are clearly useful in solving systems of equations, and some are clearly useless (such
as multiplying a row by zero). But in general, it is not always easy to distinguish useful row operations
from useless ones. The useful ones we shall call “admissible”, and we have a formal definition to help us
determine which are which. But those that are “obviously useful” we call “elementary”.
3.1. Elementary Row Operations.
(1) A row replacement operation is a row operation that replaces a given row by the sum of itself and a
multiple of another row.
(2) A row interchange operation is a row operation that interchanges two rows.
(3) A scaling operation replaces a row by a nonzero multiple of itself.
An elementary row operation is a row operation that is either a row replacement, a row interchange, or a
scaling.
These are the basic row operations that can be used to solve systems of linear equations, but there are
other row operations that will suffice. We call a row operation admissible if it is representable as a sequence
of finitely many elementary row operations. Two matrices [A|~b] and [B|~c] are row equivalent if there is an
admissible row operation that can be used to transform the matrix [A|~b] into the matrix [B|~c].
Examples 3.1.
(1) The following is an elementary row operation:

      2  1  3             2  1  3
      3 −1  4  ======⇒    6 −2  8 .
                  2R2

Another elementary row operation is

      2  1  3             2  1  3
      6 −2  8  ======⇒    0 −5 −1 .
               R2 − 3R1

(2) The following is an admissible row operation, but is not an elementary row operation:

      2  1  3              2  1  3
      3 −1  4  ======⇒     0 −5 −1 .
               2R2 − 3R1

(3) The following matrices are row-equivalent matrices:

      2  1  3       2  1  3             2  1  3
      3 −1  4  ,    6 −2  8  ,  and     0 −5 −1 .
Theorem 3.1. If two linear systems have row-equivalent augmented matrices, then the two systems are
equivalent systems. Conversely, if two (m × n) systems of linear equations are equivalent systems, then
their augmented matrices are row-equivalent matrices.
This theorem expresses the fact that if we begin with an augmented matrix for a given system of linear
equations, and perform elementary or other admissible row operations to this matrix, and if we can then
easily tell what are the solutions of the system whose augmented matrix results, then we may solve the
original system. This is in fact what we do to solve systems of linear equations. A special type of matrix is
studied next, because such matrices represent easily solvable systems.
3.2. Echelon Forms. A matrix is in (row) echelon form if:
(1) Any nonzero row is above any zero row.
(2) Any leading nonzero entry in a row is to the right of any leading nonzero entry in any row above it.
(3) All entries in a column below a leading nonzero entry are zeros.
A matrix is in near-reduced (row) echelon form if:
(1) Any nonzero row is above any zero row.
(2) Any leading nonzero entry in a row is to the right of any leading nonzero entry in any row above it.
(3) All entries in a column below a leading nonzero entry are zeros.
(4) Each leading nonzero entry is the only nonzero entry in its column.
A matrix is in reduced (row) echelon form if:
(1) Any nonzero row is above any zero row.
(2) Any leading nonzero entry in a row is to the right of any leading nonzero entry in any row above it.
(3) All entries in a column below a leading nonzero entry are zeros.
(4) Each leading nonzero entry is the only nonzero entry in its column.
(5) Each leading nonzero entry is 1.
To solve a system, we reduce its augmented matrix either to near-reduced echelon form or to reduced
echelon form, using admissible row operations. Some authors prefer reduced echelon form, and they have
good reasons. However, we will prefer near-reduced echelon form, as it is slightly easier to obtain, and is
just as informative as reduced echelon form.
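The near-reduced echelon conditions above translate directly into a predicate. The sketch below (our own function names, operating on matrices given as lists of lists) checks conditions (1) through (4):

```python
# Test whether a matrix is in near-reduced echelon form, following the
# four conditions in the definition above.

def leading_index(row):
    """Column index of the leading nonzero entry, or None for a zero row."""
    return next((j for j, x in enumerate(row) if x != 0), None)

def is_near_reduced_echelon(M):
    leads = [leading_index(row) for row in M]
    # (1) any nonzero row is above any zero row
    seen_zero = False
    for l in leads:
        if l is None:
            seen_zero = True
        elif seen_zero:
            return False
    # (2) leading entries move strictly rightward going down
    nz = [l for l in leads if l is not None]
    if any(a >= b for a, b in zip(nz, nz[1:])):
        return False
    # (3) and (4): each leading entry is the only nonzero entry in its column
    for i, l in enumerate(leads):
        if l is not None and any(M[k][l] != 0 for k in range(len(M)) if k != i):
            return False
    return True

print(is_near_reduced_echelon([[10, 0, 14], [0, 5, 1]]))  # True
print(is_near_reduced_echelon([[2, 1, 3], [0, -5, -1]]))  # False
```

The first matrix is near-reduced but not reduced, since its leading entries 10 and 5 are not 1; that is exactly the gap between conditions (4) and (5).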
Example 3.1. Solve the system
2x1 + x2 = 3
3x1 − x2 = 4.
Solution:
We use admissible row operations to reduce the matrix

              2  1  3
    [A|~b] =
              3 −1  4

to near-reduced echelon form, as follows:

              2  1  3                2  1  3   5R1 + R2    10 0 14
    [A|~b] =          ==========⇒             =========⇒            .
              3 −1  4    2R2 − 3R1   0 −5 −1     −R2        0 5  1
Thus our original system is equivalent to the system
10x1 = 14
5x2 = 1,
so the only solution of the system is

    (x1 , x2 ) = (14/10, 1/5) = (7/5, 1/5).
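The same two-step reduction can be done with exact rational arithmetic. The sketch below (our own function, which assumes the (1,1) entry is nonzero and that the elimination leaves a nonzero (2,2) entry) reproduces the elimination step aR2 − cR1 from Example 3.1 and then back-substitutes:

```python
from fractions import Fraction

# Solve a 2x2 linear system over the rationals by the row reduction
# used in Example 3.1, with exact Fraction arithmetic.

def solve_2x2(A, b):
    # build the augmented matrix [A | b] with exact entries
    M = [[Fraction(v) for v in row] + [Fraction(rhs)]
         for row, rhs in zip(A, b)]
    a, c = M[0][0], M[1][0]
    # the admissible operation a*R2 - c*R1 (here: 2R2 - 3R1) kills x1 in row 2
    M[1] = [a * y - c * x for x, y in zip(M[0], M[1])]
    # back-substitute
    x2 = M[1][2] / M[1][1]
    x1 = (M[0][2] - M[0][1] * x2) / M[0][0]
    return x1, x2

print(solve_2x2([[2, 1], [3, -1]], [3, 4]))  # (Fraction(7, 5), Fraction(1, 5))
```

The result agrees with the hand computation: (x1, x2) = (7/5, 1/5).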
In performing row operations to solve a system of linear equations, we choose a nonzero row to use as a
pivot row, and in that row, the leading nonzero entry is called a pivot entry. The column containing this
pivot entry is called a pivot column. When we have computed a near-reduced (or reduced) echelon form of
an augmented matrix, the leading nonzero entries of the nonzero rows of the near-reduced (or reduced)
echelon form are the pivot entries of the matrix. We use the pivot entries to convert other entries in the
pivot columns into zeros, in our algorithm to find a near-reduced (or reduced) echelon form.
Example 3.2. In the matrices

              1 1 2 3
    [A|~b] =  2 1 2 2
              5 0 1 1

and

              1  1  2    3
    [B|~c] =  0 −1 −2   −4  ,
              0 −5 −9  −14

the (1,1) entry (a 1) is a pivot entry, row 1 is a pivot row and column 1 is a pivot column.
We obtain [B|~c] from [A|~b] by performing one admissible row operation, using the pivot entry in the (1,1)
position:

                 R1
              R2 − 2R1
    [A|~b]  ===========⇒  [B|~c]
              R3 − 5R1
4. Matrix Multiplication
4.1. Definition of matrix products. We define matrix products as follows:
 
y1
 y2 
 
(1) If x = x1 x2 ... xn and y =  . , then
 .. 
yn
xy =
Pn
j=1
xj yj .
6
SYSTEMS OF LINEAR EQUATIONS IN FIELDS
(2) If A is an (m × n) matrix and B is an (n × p) matrix, then (AB)ij = A(i) B(j). Thus the (i, j)-entry
of AB is the product of row i of A with column j of B.
(3) If the number of columns of A is not the same as the number of rows of B, then A and B cannot
be multiplied.
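The definition translates line by line into code. A sketch (our own function, for matrices given as lists of lists):

```python
# Matrix multiplication exactly as defined: the (i, j)-entry of AB is
# the product of row i of A with column j of B.

def mat_mul(A, B):
    n = len(A[0])
    if n != len(B):
        # case (3): the product is undefined
        raise ValueError("columns of A must equal rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(n))   # row i times column j
             for j in range(len(B[0]))]
            for i in range(len(A))]

print(mat_mul([[1, 2], [3, 4]], [[0, 1], [1, 0]]))  # [[2, 1], [4, 3]]
```

Note that multiplying by [[0, 1], [1, 0]] on the right swaps the columns; multiplying by it on the left would swap the rows, which previews the connection between matrices and row operations in the next subsection.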
4.2. Some Facts about Matrix Products.
(1) Elementary matrices represent elementary row operations.
(2) Matrix multiplication is associative.
(3) Matrix multiplication is not, in general, commutative.
(4) A matrix A is invertible if there is a matrix B such that AB = BA = I.
(5) Any elementary matrix is invertible.
(6) Any admissible row operation is represented by some invertible matrix. Conversely, any invertible
matrix represents some admissible row operation.
(7) To perform an elementary row operation on a matrix, one need only multiply on the left by the
elementary matrix that represents the given row operation.
(8) To perform an admissible row operation on a matrix, one need only multiply on the left by the
invertible matrix that represents the given row operation.
(9) Even row operations that are not admissible are represented by matrices: To perform a row
operation on a matrix, one may find first the matrix that represents the given row operation, then
multiply on the left by that matrix.
(10) The process of LU-factorization provides a way to solve systems of equations quickly when one can
find a lower triangular matrix L and an upper triangular matrix U such that LU is the coefficient
matrix of the given systems.
(11) The inverse of the inverse is the original matrix.
(12) A product of invertible matrices is invertible, and its inverse is the product of the inverses of the
factors, in the reverse order.
(13) Several nice properties are equivalent to invertibility.
(14) To compute the inverse of an invertible matrix, one need only augment the given matrix with the
corresponding identity matrix and compute the reduced row-echelon form of the result. The inverse
is then in the augmented portion of the result.
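Fact (7) is easy to see in a concrete case. The sketch below (our own `mat_mul` helper) left-multiplies a matrix by the elementary matrix for the replacement R2 := R2 − 3R1, namely the identity with −3 placed in the (2,1) position:

```python
# Fact (7) in action: left-multiplying by an elementary matrix performs
# the corresponding elementary row operation.

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[2, 1, 3],
     [3, -1, 4]]

# Elementary matrix for the row replacement R2 := R2 - 3R1:
# the 2x2 identity with -3 in the (2,1) position.
E = [[1, 0],
     [-3, 1]]

print(mat_mul(E, A))  # [[2, 1, 3], [-3, -4, -5]]
```

Row 1 is unchanged and row 2 has become R2 − 3R1, exactly as the row operation prescribes.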
4.3. A Useful Family of Admissible Row Operations. Assume that j ∈ {1, ..., m}, and that scalars
a1 , ..., am and b1 , ..., bm are such that aj + bj ≠ 0, and for k ≠ j, ak ≠ 0. Then the following describes an
admissible row operation:

    a1 R1 + b1 Rj
         ...
    aj Rj + bj Rj   =============⇒
         ...
    am Rm + bm Rj
The above family of admissible row operations is extremely useful, but note that many of them are not
elementary. Now we will see an example of an admissible row operation that is not in the family described
above, and we shall contrast it with an inadmissible row operation.
Example 4.1. On (3 × 3) matrices, the following row operation is admissible:

    R1 + R2
    R2 + R3   ========⇒
    R3 + R1

Example 4.2. On (4 × 4) matrices, the following row operation is not admissible:

    R1 + R2
    R2 + R3
    R3 + R4   ========⇒
    R4 + R1
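By fact (6), a row operation is admissible exactly when the matrix representing it is invertible, so these two examples can be settled by computing determinants. A sketch (our own recursive cofactor expansion, fine for such small matrices):

```python
# Decide admissibility of Examples 4.1 and 4.2 via fact (6):
# a row operation is admissible iff its representing matrix is invertible,
# i.e. has nonzero determinant.

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:]
                                          for row in M[1:]])
               for j in range(len(M)))

# Example 4.1: R1+R2, R2+R3, R3+R1 on 3x3 matrices
E3 = [[1, 1, 0],
      [0, 1, 1],
      [1, 0, 1]]

# Example 4.2: R1+R2, R2+R3, R3+R4, R4+R1 on 4x4 matrices
E4 = [[1, 1, 0, 0],
      [0, 1, 1, 0],
      [0, 0, 1, 1],
      [1, 0, 0, 1]]

print(det(E3))  # 2: invertible, so the operation is admissible
print(det(E4))  # 0: singular, so the operation is not admissible
```

For E4, note that R1 − R2 + R3 − R4 is the zero row, so the rows are dependent and the determinant must vanish. Note also that det E3 = 2 is nonzero over the rationals but would be zero in a field of characteristic 2, so the admissibility of Example 4.1 depends on the underlying field.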