Chapter 4: Polynomials
A polynomial is an expression of the form
p(X) = a_n X^n + a_{n−1} X^{n−1} + \cdots + a_1 X + a_0,   (4.1)
where X is a “variable symbol” and a_0, a_1, a_2, . . . , a_n are some constants, called the
coefficients of p. Using summation notation, we may write
p(X) = \sum_{k=0}^{n} a_k X^k.   (4.2)
In (4.1), when a_n ≠ 0, we say that the degree of p(X) is n and we call a_n the leading
coefficient. (When p is a nonzero constant, we say that the degree of p(X) is zero.
When p(X) is equal to the constant zero, it is customary to assign −∞ to be the degree
of p(X). We are mainly interested in nonconstant polynomials and there is no need to
worry about this convention.) By an algebraic equation we mean an equation of the form
p(X) = 0. By a root of the polynomial p(X), or a solution to the algebraic equation
p(X) = 0, we mean a (complex) number a such that p(a) = 0. The fundamental
theorem of algebra says that every nonconstant (complex) polynomial has a (complex)
root. We will give a proof of this basic fact in Chapter 13.
Given a polynomial p(X) and a number a, we divide p(X) by X − a to get
p(X) = (X − a)q(X) + r, where q is a polynomial of degree less than that of p and r is
a number called the remainder. Now suppose that a is a root of p(X): p(a) = 0. Letting
X = a in p(X) = (X − a)q(X) + r, we obtain 0 = r. Hence p(X) = (X − a)q(X). We
have shown that if a is a root of p(X), then X − a is a factor of p(X). The converse
of this statement, namely “if X − a is a factor of p(X), then a is a root of p(X)”, is
clearly true.
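The division p(X) = (X − a)q(X) + r can be carried out by synthetic division (Horner's method). The following Python sketch is only an illustration; the function name divide_by_linear is our own choice, and coefficients are listed from the leading term down:

    def divide_by_linear(coeffs, a):
        """Divide p(X) = coeffs[0]*X^n + ... + coeffs[-1] by (X - a).
        Returns (coefficients of q, remainder r)."""
        q = []
        acc = 0
        for c in coeffs:
            acc = acc * a + c       # Horner step
            q.append(acc)
        r = q.pop()                 # the last accumulated value is p(a) = r
        return q, r

    # Example: p(X) = X^3 - 6X^2 + 11X - 6 has root a = 1
    q, r = divide_by_linear([1, -6, 11, -6], 1)
    print(q, r)                     # [1, -5, 6] and 0, so p(X) = (X-1)(X^2-5X+6)

Note that the remainder returned is exactly p(a), which is the remainder theorem in computational form.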
Suppose that b is another root: p(b) = 0 and b ≠ a. Putting X = b in
p(X) = (X − a)q(X), we get 0 = (b − a)q(b). As b − a ≠ 0, we have q(b) = 0. Thus b
is also a root of q(X) and hence X − b is a factor of q(X), say q(X) = (X − b)Q(X).
Thus p(X) = (X − a)(X − b)Q(X), telling us that (X − a)(X − b) is a factor of
p(X). More generally, if a1 , a2 , . . . , am are distinct roots of a polynomial p(X), then
(X − a1 )(X − a2 ) · · · (X − am ) is a factor of p(X). One consequence of this fact is: If
p(X) is a polynomial of degree n, then p(X) cannot have more than n roots. To see
this, suppose that p(X) has more than n roots, say a1 , a2 , . . . , am with m > n. Then,
according to what we have just learned, f (X) ≡ (X − a1 )(X − a2 ) · · · (X − am ) is a factor
of p(X). This cannot happen because the degree of f (X) is m, which is greater than
the degree of p(X), which is n. Indeed, the degree of a factor of p cannot be greater
than the degree of p.
Next we study the multiplicity of a root. Suppose that a is a root of a polynomial
p(X), that is, p(a) = 0. The above discussion tells that p(X) = (X − a)q(X) for some
polynomial q(X). Now we ask the question: is a also a root of q(X)? The answer can
be found by computing q(a), the value of q(X) at a, to see if it is zero. If q(a) ≠ 0,
the answer is No and in that case we say that a is a simple root of p(X). If q(a) = 0,
the answer is Yes and in that case we say that a is a multiple root of p(X). Extract
all factors of X − a from p, say, p(X) = (X − a)^m g(X) for some polynomial g(X)
which has no factor of X − a, or equivalently g(a) ≠ 0. The positive integer m is called
the multiplicity of the root a. When m = 1, a is a simple root. When m > 1, a is a
multiple root. When we count the number of roots of a polynomial, we should count their
multiplicities as well in order to get correct answers to many questions.
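As an illustration, the multiplicity of a known root can be computed by repeatedly dividing out the factor X − a until the remainder is no longer zero. A minimal Python sketch, reusing the hypothetical divide_by_linear helper above:

    def multiplicity(coeffs, a, tol=0):
        """Count how many factors of (X - a) divide the polynomial."""
        m = 0
        while len(coeffs) > 1:
            q, r = divide_by_linear(coeffs, a)
            if abs(r) > tol:        # a is no longer a root of the quotient
                break
            coeffs, m = q, m + 1
        return m

    # p(X) = (X - 2)^3 (X + 1) = X^4 - 5X^3 + 6X^2 + 4X - 8
    print(multiplicity([1, -5, 6, 4, -8], 2))   # 3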
As we know, a real polynomial (that is, a polynomial with real coefficients) does not
necessarily have a real root. For a quadratic polynomial p(X) = aX^2 + bX + c with real
coefficients a (≠ 0), b, c, the usual recipe
x_± = \frac{−b ± \sqrt{b^2 − 4ac}}{2a}
for its roots tells us that p(X) does not have real roots if its discriminant b^2 − 4ac is
negative, and in that case its roots are two complex numbers conjugate to each other. In
general, we have the following facts concerning the roots of a real polynomial:
Facts. Suppose that p(X) is a real polynomial of degree n ≥ 1. Then: (1) if the
degree n is an odd number, then p has at least one real root; (2) if ω is a nonreal root
of p, then its complex conjugate \bar{ω} is also a root of p; (3) p can be written as a product
of irreducible factors, either of the form rX + s or aX^2 + bX + c, where r, s, a, b, c are
real numbers with r ≠ 0, a ≠ 0 and b^2 − 4ac < 0.
To show these facts, let p(X) = a_n X^n + a_{n−1} X^{n−1} + \cdots + a_1 X + a_0, where a_0, a_1, . . . , a_n
are real numbers with a_n ≠ 0. If ω is a nonreal root of p(X), then p(ω) = 0 gives
p(\bar{ω}) = a_n \bar{ω}^n + a_{n−1} \bar{ω}^{n−1} + \cdots + a_1 \bar{ω} + a_0
      = \overline{a_n ω^n + a_{n−1} ω^{n−1} + \cdots + a_1 ω + a_0}    (because the coefficients are real)
      = \overline{p(ω)} = \bar{0} = 0,
and hence \bar{ω} is also a root of p(X). This proves (2). Now both X − ω and X − \bar{ω}
are factors of p(X) and they are different (otherwise ω = \bar{ω}, contradicting the fact
that ω is nonreal). Thus (X − ω)(X − \bar{ω}) is a factor of p(X). So we have p(X) =
(X − ω)(X − \bar{ω})q(X) for some polynomial q(X) of degree n − 2. Now
(X − ω)(X − \bar{ω}) = X^2 − (ω + \bar{ω})X + ω\bar{ω} ≡ X^2 + bX + c
with b = −(ω + \bar{ω}), c = ω\bar{ω} real and
b^2 − 4c = (ω + \bar{ω})^2 − 4ω\bar{ω} = ω^2 + 2ω\bar{ω} + \bar{ω}^2 − 4ω\bar{ω} = (ω − \bar{ω})^2 < 0.
Since p and (X − ω)(X − \bar{ω}) ≡ X^2 + bX + c are real polynomials and p(X) =
(X^2 + bX + c)q(X), the factor q(X) is also a real polynomial, because q(X) can be
obtained by long division of p(X) by X^2 + bX + c. We can ask if q(X) has a
nonreal root. If the answer is yes, we get another conjugate pair of nonreal roots and
hence another factor of the form aX^2 + bX + c with real a, b, c such that b^2 − 4ac < 0.
Pulling out this factor from q(X) (and hence from p(X)), the degree of the remaining
factor drops again by 2. Continue in this manner until all nonreal roots are exhausted, so
that only real roots of p(X) remain. Now part (3) of the above theorem is clear. Part (1) is a
direct consequence of part (3). The proof is complete.
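Numerically, this factorization can be seen by computing the roots and pairing each nonreal root with its conjugate. A rough Python sketch using numpy (numerical root finding, so expect floating-point error; the function name real_factors and the tolerances are our own choices):

    import numpy as np

    def real_factors(coeffs):
        """Group the roots of a real polynomial into real linear and
        irreducible real quadratic factors (returned as coefficient lists)."""
        roots = np.roots(coeffs)
        factors, used = [], [False] * len(roots)
        for i, r in enumerate(roots):
            if used[i]:
                continue
            if abs(r.imag) < 1e-7:                 # (nearly) real root -> linear factor X - r
                factors.append([1.0, -r.real])
            else:                                  # pair r with its complex conjugate
                j = next(k for k in range(len(roots))
                         if not used[k] and k != i and abs(roots[k] - r.conjugate()) < 1e-6)
                used[j] = True
                factors.append([1.0, -2 * r.real, abs(r) ** 2])   # X^2 - 2 Re(r) X + |r|^2
            used[i] = True
        return factors

    # X^3 - 1 = (X - 1)(X^2 + X + 1)
    print(real_factors([1, 0, 0, -1]))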
Exercise 4.1. Factorize each of the following polynomials into a product of real polynomials: (a) X^4 − 1, (b) X^6 − 1, (c) X^8 − 1, (d) X^5 − 1, (e) X^12 − 1.
The derivative of p(X) given in (4.2) is defined to be
p'(X) ≡ \frac{d\,p(X)}{dX} ≡ \frac{d}{dX} \sum_{k=0}^{n} a_k X^k = \sum_{k=1}^{n} k a_k X^{k−1}.   (4.3)
The usual rules about differentiation, such as the product rule and the chain rule, are still
applicable here:
Product Rule.   \frac{d}{dX}\,[p(X)q(X)] = p'(X)q(X) + p(X)q'(X).
Chain Rule.   \frac{d}{dX}\,p(q(X)) = p'(q(X))\, q'(X).
The second derivative of p, denoted by p'' or p^{(2)}, is defined to be the derivative of
p'. In general, we can define the k-th derivative p^{(k)}(X) of a polynomial p(X)
inductively by p^{(0)}(X) = p(X) and p^{(k)}(X) = (p^{(k−1)})'(X) for k ≥ 1.
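For polynomials, differentiation is a purely formal operation on the coefficient list, so it is easy to compute. A small Python sketch (here coefficient lists are ordered from the constant term up, a convention chosen only for convenience):

    def derivative(coeffs):
        """Differentiate a_0 + a_1 X + ... + a_n X^n once."""
        return [k * a for k, a in enumerate(coeffs)][1:]

    def kth_derivative(coeffs, k):
        """Apply the derivative k times (the inductive definition above)."""
        for _ in range(k):
            coeffs = derivative(coeffs)
        return coeffs

    # p(X) = 1 + 3X + 2X^3:  p'(X) = 3 + 6X^2,  p''(X) = 12X
    print(kth_derivative([1, 3, 0, 2], 2))   # [0, 12]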
What is the role played by higher derivatives? One answer is, we can use higher derivatives to compute the Taylor series of a function at a given point. Consider a monomial
(X − a)^n, where n is a positive integer. Its first derivative is d(X − a)^n/dX = n(X − a)^{n−1}.
If n ≥ 2, its second derivative is
\frac{d^2}{dX^2} (X − a)^n = n(n − 1)(X − a)^{n−2}.
Continuing in this fashion, we see that when n ≥ k,
\frac{d^k}{dX^k} (X − a)^n = n(n − 1) \cdots (n − k + 1)(X − a)^{n−k}.
Note that, for k = n,
\frac{d^n}{dX^n} (X − a)^n = n(n − 1) \cdots 1 = n!,
which is a constant, and
\frac{d^k}{dX^k} (X − a)^n = 0   for k > n.
Let p(X) be a polynomial of degree n. Take any constant a and consider q(X) = p(X + a). Clearly
q(X) is still a polynomial of degree n, and hence we can write q(X) = \sum_{j=0}^{n} c_j X^j. Replacing
X by X − a in the last identity, we have q(X − a) = \sum_{j=0}^{n} c_j (X − a)^j. Replacing
X by X − a in q(X) = p(X + a), we obtain q(X − a) = p(X). Thus
p(X) = \sum_{j=0}^{n} c_j (X − a)^j.   (4.4)
Now we compute p^{(k)}(a), the kth derivative of p evaluated at a. From the previous computation we see that the kth derivatives of the terms (X − a)^j on the right hand side of (4.4),
evaluated at X = a, all vanish except when j = k, and in that exceptional case the kth derivative of (X − a)^k is the constant k!.
Therefore we have p^{(k)}(a) = k! c_k. Thus c_k = p^{(k)}(a)/k!. Now we can rewrite (4.4) as
p(X) = \sum_{k=0}^{n} \frac{p^{(k)}(a)}{k!} (X − a)^k,   (4.5)
which is the Taylor expansion for a polynomial p of degree n. When the degree of p(X)
is not specified, we may write (4.5) as
p(X) = \sum_{k \ge 0} \frac{p^{(k)}(a)}{k!} (X − a)^k.
The Taylor coefficients p^{(k)}(a)/k! here vanish if k is greater than the degree of p(X).
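Formula (4.5) can be checked exactly by recentering a polynomial at a. A Python sketch, again with coefficients listed from the constant term up and reusing the kth_derivative helper sketched above; the function name taylor_coefficients is our own:

    from fractions import Fraction
    from math import factorial

    def taylor_coefficients(coeffs, a):
        """Return c_k = p^(k)(a)/k!, the coefficients of p(X) in powers of (X - a)."""
        cs = []
        for k in range(len(coeffs)):
            d = kth_derivative(coeffs, k)                  # formal kth derivative
            value_at_a = sum(c * Fraction(a) ** j for j, c in enumerate(d))
            cs.append(value_at_a / factorial(k))
        return cs

    # p(X) = X^3 recentred at a = 2:  X^3 = 8 + 12(X-2) + 6(X-2)^2 + (X-2)^3
    print(taylor_coefficients([0, 0, 0, 1], 2))            # [8, 12, 6, 1] (as Fractions)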
Example 4.1. We apply (4.5) to p(X) = X^n. We have
p^{(k)}(X) = (d^k/dX^k) X^n = n(n − 1) \cdots (n − k + 1) X^{n−k} = \frac{n!}{(n − k)!} X^{n−k}
for k ≤ n. So p^{(k)}(a)/k! = \frac{n!}{(n − k)!\, k!}\, a^{n−k} ≡ \binom{n}{k} a^{n−k}. Thus the Taylor expansion of X^n is
X^n = \sum_{k=0}^{n} \binom{n}{k} a^{n−k} (X − a)^k.
Putting X = α + β and a = α, we obtain
(α + β)^n = \sum_{k=0}^{n} \binom{n}{k} α^{n−k} β^k,
which is the well-known binomial expansion.
Besides the binomial formula, the Taylor formula (4.5) for polynomials gives us the
partial fraction expansion for rational functions of the form p(X)/(X − a)^N, where the
degree n of p is less than N. Indeed, dividing both sides of (4.5) by (X − a)^N, we get
\frac{p(X)}{(X − a)^N} = \sum_{k=0}^{n} \frac{c_k}{(X − a)^{N−k}},
where c_k = p^{(k)}(a)/k!.
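In code, this expansion amounts to recentering p at a and reading off the coefficients, since c_k/(X − a)^{N−k} is just the c_k (X − a)^k term divided by (X − a)^N. A brief sketch reusing the hypothetical taylor_coefficients helper above:

    def partial_fractions_over_linear_power(coeffs, a, N):
        """Expand p(X)/(X - a)^N as a sum of c_k / (X - a)^(N - k).
        Returns a dict mapping the exponent N - k to the coefficient c_k."""
        cs = taylor_coefficients(coeffs, a)
        return {N - k: c for k, c in enumerate(cs)}

    # (2X^2 + 3) / (X - 1)^3  =  5/(X-1)^3 + 4/(X-1)^2 + 2/(X-1)
    print(partial_fractions_over_linear_power([3, 0, 2], 1, 3))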
Exercise 4.2. Find the Taylor expansion of p(X) = X^4 + 1 at X = 1. Use your answer
to find the partial fraction expansion of \frac{X^4 + 1}{(X − 1)^4}.
The material of the rest of this chapter is optional
The notion of derivatives for polynomials is suggested by analysis, even though here we
give a purely algebraic treatment. In the rest of the present chapter, we study the notion
of finite differences, which is the counterpart in finite mathematics of derivatives. The
finite difference method used to be and still is an important tool in applied mathematics.
Given a polynomial p(X), the difference ∆p(X) is defined via
∆p(X) = p(X + 1) − p(X).
For example, \Delta X^2 = (X + 1)^2 − X^2 = 2X + 1. Like higher derivatives, we have higher
differences \Delta^2 p(X), \Delta^3 p(X), etc., with
\Delta^2 p(X) = \Delta\Delta p(X) = \Delta(p(X + 1) − p(X))
        = (p(X + 2) − p(X + 1)) − (p(X + 1) − p(X))
        = p(X + 2) − 2p(X + 1) + p(X),
\Delta^3 p(X) = \Delta(p(X + 2) − 2p(X + 1) + p(X))
        = (p(X + 3) − 2p(X + 2) + p(X + 1)) − (p(X + 2) − 2p(X + 1) + p(X))
        = p(X + 3) − 3p(X + 2) + 3p(X + 1) − p(X).
Now you can guess that in general
\Delta^n p(X) = \sum_{k=0}^{n} (−1)^k \binom{n}{k} p(X + n − k).
To prove this, we use the “operator method”. We introduce operators I and S, which
operate on polynomials p(X) in the following way: Ip(X) = p(X) and Sp(X) = p(X +1).
We call I the identity operator and S the shift operator. Clearly, for all nonnegative integers
k, I^k p(X) = p(X) and S^k p(X) = p(X + k). Now
∆p(X) = p(X + 1) − p(X) = Sp(X) − Ip(X) = (S − I)p(X).
By applying the binomial theorem to (S − I)^n, we have
\Delta^n p(X) = (S − I)^n p(X) = \sum_{k=0}^{n} \binom{n}{k} S^{n−k} (−I)^k p(X) = \sum_{k=0}^{n} (−1)^k \binom{n}{k} p(X + n − k).
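The identity is easy to test numerically. A small Python sketch comparing the n-fold application of Δ with the alternating binomial sum; the sample polynomial and evaluation point are arbitrary choices for illustration:

    from math import comb

    def delta_n(p, x, n):
        """Apply the forward difference Δ to the function p, n times, at the point x."""
        if n == 0:
            return p(x)
        return delta_n(p, x + 1, n - 1) - delta_n(p, x, n - 1)

    def delta_n_binomial(p, x, n):
        """The closed form: sum_k (-1)^k C(n,k) p(x + n - k)."""
        return sum((-1) ** k * comb(n, k) * p(x + n - k) for k in range(n + 1))

    p = lambda x: x**3 - 2 * x + 7
    print(delta_n(p, 5, 3), delta_n_binomial(p, 5, 3))   # both give 6 = 3! times the leading coefficient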
Instead of the monomials X^n, for finite differences it is more convenient to work with the
“deformed monomials” X^{[1]} = X, X^{[2]} = X(X − 1), X^{[3]} = X(X − 1)(X − 2), etc. In
general,
X^{[n]} = X(X − 1) \cdots (X − n + 1).
Now let us compute \Delta X^{[n+1]}:
\Delta X^{[n+1]} = \Delta\, X(X − 1) \cdots (X − n)
          = (X + 1)X(X − 1) \cdots (X − n + 1) − X(X − 1) \cdots (X − n)
          = X(X − 1) \cdots (X − n + 1)\{(X + 1) − (X − n)\}
          = (n + 1)X(X − 1) \cdots (X − n + 1) = (n + 1)X^{[n]}.
We conclude:
\Delta X^{[n+1]} = (n + 1)X^{[n]}.
Do you see the resemblance of this to the fact that the derivative of X^{n+1} is (n + 1)X^n?
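One can also check the identity ΔX^{[n+1]} = (n + 1)X^{[n]} directly in code at sample points. A quick sketch; falling_factorial is our own helper name:

    def falling_factorial(x, n):
        """X^[n] = X(X-1)...(X-n+1) evaluated at x (equals 1 when n = 0)."""
        result = 1
        for i in range(n):
            result *= (x - i)
        return result

    # Check ΔX^[n+1] = (n+1) X^[n] at a few sample points
    n = 4
    for x in range(6):
        lhs = falling_factorial(x + 1, n + 1) - falling_factorial(x, n + 1)
        rhs = (n + 1) * falling_factorial(x, n)
        assert lhs == rhs
    print("identity verified at sample points")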
Consider the following problem: given a polynomial f (X), find a closed expression of
the sum
Sn = f (1) + f (2) + f (3) + · · · + f (n)
in terms of n. We reduce this problem to the one of solving the difference equation
∆p(X) = f (X) for the unknown polynomial p(X). Once the solution p(X) is found,
we have f (1) = p(2) − p(1), f (2) = p(3) − p(2), f (3) = p(4) − p(3), etc., and hence
Sn = p(2) − p(1) + p(3) − p(2) + p(4) − p(3) + · · · + p(n + 1) − p(n) = p(n + 1) − p(1).
For example, we know that p(X) = X^{[m+1]}/(m + 1) is a solution to the difference equation
\Delta p(X) = X^{[m]}. So we have
\sum_{k=1}^{n} k(k − 1) \cdots (k − m + 1) = p(n + 1) − p(1) = \frac{(n + 1)n \cdots (n − m + 1)}{m + 1}.
The special cases m = 1 and m = 2 give
\sum_{k=1}^{n} k = \frac{(n + 1)n}{2}, \qquad \sum_{k=1}^{n} k(k − 1) = \frac{(n + 1)n(n − 1)}{3}.
These two identities are enough for us to find the sum 1^2 + 2^2 + 3^2 + \cdots + n^2:
\sum_{k=1}^{n} k^2 = \sum_{k=1}^{n} [k(k − 1) + k] = \frac{(n + 1)n(n − 1)}{3} + \frac{(n + 1)n}{2} = \frac{(n + 1)n(2n + 1)}{6}.
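The telescoping reduction Sn = p(n + 1) − p(1) is easy to exercise in code. A sketch checking the sum-of-squares formula just derived against direct summation (pure Python, integer arithmetic):

    def sum_of_squares(n):
        """Closed form derived above: (n+1) n (2n+1) / 6."""
        return (n + 1) * n * (2 * n + 1) // 6

    for n in range(1, 20):
        assert sum_of_squares(n) == sum(k * k for k in range(1, n + 1))
    print("closed form agrees with direct summation")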
As we have remarked above, the problem of finding the sum Sn is equivalent to the
problem of solving the difference equation \Delta p(X) = f(X). When f(X) = X^{[k]}, we know
a solution, namely p(X) = X^{[k+1]}/(k + 1). Thus, if we can express f(X) as a linear
combination of the “deformed monomials” X^{[k]}, then it is easy to write down a solution to the
difference equation \Delta p(X) = f(X). There is a neat formula telling us how to do this:
f(X) = \sum_{k \ge 0} \frac{\Delta^k f(0)}{k!} X^{[k]}.   (4.6)
As you can tell, this is the finite difference version of Taylor’s formula. The proof of it is
left to you as an exercise.
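Formula (4.6) can be tried out numerically: for a polynomial the sum is finite, since Δ^k f vanishes once k exceeds the degree. A Python sketch computing the coefficients Δ^k f(0)/k! for a sample polynomial, reusing the delta_n and falling_factorial helpers sketched earlier (the integer division assumes f has integer coefficients, in which case the coefficients are integers; otherwise use Fraction):

    from math import factorial

    def newton_coefficients(f, degree):
        """Coefficients of f in the basis X^[k]: Δ^k f(0)/k! for k = 0..degree."""
        return [delta_n(f, 0, k) // factorial(k) for k in range(degree + 1)]

    f = lambda x: x**3            # X^3 = X^[1] + 3 X^[2] + X^[3]
    cs = newton_coefficients(f, 3)
    print(cs)                     # [0, 1, 3, 1]
    # sanity check: reconstruct f from the expansion at sample points
    assert all(f(x) == sum(c * falling_factorial(x, k) for k, c in enumerate(cs))
               for x in range(10))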
Let us go back to the difference equation
\Delta P(X) = f(X).   (4.7)
Using the “Taylor’s formula” (4.6) above and the identity \Delta X^{[n]} = nX^{[n−1]}, we see that
a particular solution is given by
P_0(X) = \sum_{k \ge 0} \frac{\Delta^k f(0)}{(k + 1)!} X^{[k+1]}.   (4.8)
Now if P (X) is a solution to (4.7), then Q(X) = P (X) − P0 (X) satisfies
Q(X + 1) − Q(X) = ∆ Q(X) = ∆ P (X) − ∆ P0 (X) = f (X) − f (X) = 0.
Thus Q(X + 1) = Q(X); in other words, Q(X) is periodic with period 1. But among
polynomials only constants can be periodic (the reader is asked to prove this statement).
So we have P(X) − P_0(X) = Q(X) = C for some constant C, or P(X) = P_0(X) + C. We
have proved that the general solution to (4.7) is of the form P(X) = P_0(X) + C, where
P_0(X) is given by (4.8). Notice that, if f(X) is a polynomial of degree n, then the solution
P(X) is of degree n + 1.
Now we consider the equation \Delta P(X) = nX^{n−1}. As we know from the above
discussion, the solution exists and is unique up to the addition of a constant. The question
is, what is the “right choice” of this constant? To answer this, let us consider a similar
equation, with n replaced by n + 1 on the right hand side: \Delta Q(X) = (n + 1)X^n, or
Q(X + 1) − Q(X) = (n + 1)X^n.
Again, Q(X) is uniquely determined up to the addition of a constant. By taking derivatives
on both sides of the last identity, we have Q'(X + 1) − Q'(X) = (n + 1)nX^{n−1}, or
\Delta [(n + 1)^{−1} Q'(X)] = nX^{n−1}. Thus (n + 1)^{−1} Q'(X) is a particular solution to \Delta P(X) =
nX^{n−1} (notice that the constant term in Q(X) is “killed” by differentiation and hence
(n + 1)^{−1} Q'(X) is completely determined). We write B_n(X) = (n + 1)^{−1} Q'(X), called
the Bernoulli polynomial of degree n. Notice that, from the above discussion, we have
the following two important properties of Bernoulli polynomials:
\Delta B_n(X) = nX^{n−1}, \qquad B'_{n+1}(X) = (n + 1)B_n(X).
Again, notice the resemblance of the last identity to d X^{n+1}/dX = (n + 1)X^n. The
constant term of Bn (X), namely bn := Bn (0), is called the nth Bernoulli number. It
turns out that Bernoulli polynomials can be written in terms of Bernoulli numbers:
B_n(X) = \sum_{k=0}^{n} \binom{n}{k} b_k X^{n−k}.
The first few Bernoulli numbers are given as follows:
b_0 = 1, b_1 = −1/2, b_2 = 1/6, b_3 = 0, b_4 = −1/30, b_5 = 0, b_6 = 1/42.
Using Bernoulli polynomials, we can write down a closed form for sums of powers:
1^k + 2^k + \cdots + n^k = \frac{B_{k+1}(n + 1) − B_{k+1}(1)}{k + 1}.
Bernoulli numbers are crucial in analytic number theory, in which many problems involve
estimating sums by integrals using the classical Euler–Maclaurin formula, which
contains Bernoulli numbers and Bernoulli polynomials as ingredients.
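The Bernoulli numbers can be generated from the expression B_n(X) = \sum_k \binom{n}{k} b_k X^{n−k} together with ΔB_n(X) = nX^{n−1}: evaluating the latter at X = 0 gives B_n(1) − B_n(0) = 0 for n ≥ 2, i.e. \sum_{k=0}^{n−1} \binom{n}{k} b_k = 0, which determines b_{n−1} recursively from b_0 = 1. A sketch using exact rational arithmetic (function names are our own):

    from fractions import Fraction
    from math import comb

    def bernoulli_numbers(N):
        """b_0, ..., b_N from the recurrence sum_{k=0}^{n-1} C(n,k) b_k = 0 for n >= 2."""
        b = [Fraction(1)]
        for n in range(2, N + 2):
            b.append(-sum(comb(n, k) * b[k] for k in range(n - 1)) / n)
        return b[:N + 1]

    def bernoulli_poly(m, x, b):
        """B_m(x) = sum_k C(m,k) b_k x^(m-k), given the list b of Bernoulli numbers."""
        return sum(comb(m, k) * b[k] * Fraction(x) ** (m - k) for k in range(m + 1))

    def power_sum(k, n):
        """1^k + 2^k + ... + n^k via the Bernoulli closed form above."""
        b = bernoulli_numbers(k + 1)
        return (bernoulli_poly(k + 1, n + 1, b) - bernoulli_poly(k + 1, 1, b)) / (k + 1)

    print(power_sum(2, 10))                                   # 385 = 1^2 + ... + 10^2
    assert power_sum(3, 100) == sum(j**3 for j in range(1, 101))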