Chapter 4: Polynomials

A polynomial is an expression of the form

    p(X) = a_n X^n + a_{n-1} X^{n-1} + \cdots + a_1 X + a_0,    (4.1)

where X is a "variable symbol" and a_0, a_1, a_2, ..., a_n are constants, called the coefficients of p. Using summation notation, we may write

    p(X) = \sum_{k=0}^{n} a_k X^k.    (4.2)

In (4.1), when a_n ≠ 0, we say that the degree of p(X) is n and we call a_n the leading coefficient. (When p is a nonzero constant, we say that the degree of p(X) is zero. When p(X) is equal to the constant zero, it is customary to assign −∞ as the degree of p(X). We are mainly interested in nonconstant polynomials, so there is no need to worry about this convention.)

By an algebraic equation we mean an equation of the form p(X) = 0. By a root of the polynomial p(X), or a solution to the algebraic equation p(X) = 0, we mean a (complex) number a such that p(a) = 0. The fundamental theorem of algebra says that every nonconstant (complex) polynomial has a (complex) root. We will give a proof of this basic fact in Chapter 13.

Given a polynomial p(X) and a number a, we divide p(X) by X − a to get

    p(X) = (X − a)q(X) + r,

where q is a polynomial of degree less than that of p and r is a number called the remainder. Now suppose that a is a root of p(X): p(a) = 0. Letting X = a in p(X) = (X − a)q(X) + r, we obtain 0 = r. Hence p(X) = (X − a)q(X). We have shown that if a is a root of p(X), then X − a is a factor of p(X). The converse of this statement, namely "if X − a is a factor of p(X), then a is a root of p(X)", is clearly true.

Suppose that b is another root: p(b) = 0 and b ≠ a. Putting X = b in p(X) = (X − a)q(X), we get 0 = (b − a)q(b). As b − a ≠ 0, we have q(b) = 0. Thus b is also a root of q(X) and hence X − b is a factor of q(X), say q(X) = (X − b)Q(X). Thus p(X) = (X − a)(X − b)Q(X), telling us that (X − a)(X − b) is a factor of p(X). More generally, if a_1, a_2, ...
, a_m are distinct roots of a polynomial p(X), then (X − a_1)(X − a_2) ··· (X − a_m) is a factor of p(X).

One consequence of this fact is: if p(X) is a polynomial of degree n, then p(X) cannot have more than n roots. To see this, suppose that p(X) has more than n roots, say a_1, a_2, ..., a_m with m > n. Then, according to what we have just learned, f(X) ≡ (X − a_1)(X − a_2) ··· (X − a_m) is a factor of p(X). This cannot happen, because the degree of f(X) is m, which is greater than n, the degree of p(X). Indeed, the degree of a factor of p cannot exceed the degree of p.

Next we study the multiplicity of a root. Suppose that a is a root of a polynomial p(X), that is, p(a) = 0. The above discussion tells us that p(X) = (X − a)q(X) for some polynomial q(X). Now we ask: is a also a root of q(X)? The answer can be found by computing q(a), the value of q(X) at a, to see if it is zero. If q(a) ≠ 0, the answer is no, and in that case we say that a is a simple root of p(X). If q(a) = 0, the answer is yes, and in that case we say that a is a multiple root of p(X). Extract all factors of X − a from p, say p(X) = (X − a)^m g(X) for some polynomial g(X) which has no factor of X − a, or equivalently g(a) ≠ 0. The positive integer m is called the multiplicity of the root a. When m = 1, a is a simple root. When m > 1, a is a multiple root. When we count the number of roots of a polynomial, we should count their multiplicities as well in order to get correct answers to many questions.

As we know, a real polynomial (that is, a polynomial with real coefficients) does not necessarily have a real root. For a quadratic polynomial p(X) = aX^2 + bX + c with real coefficients a (≠ 0), b, c, the usual recipe for its roots,

    x_± = \frac{−b ± \sqrt{b^2 − 4ac}}{2a},

tells us that p(X) has no real roots if its discriminant b^2 − 4ac is negative; in that case its roots are two complex numbers conjugate to each other.
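The division-and-factor arguments above are easy to experiment with. The following Python sketch (our own illustration; the helper names are not from the text) uses synthetic division: dividing p(X) by X − a returns a quotient and a remainder equal to p(a), and repeatedly extracting factors X − a yields the multiplicity of a root. Coefficient lists run from the leading coefficient down.

```python
def poly_eval(coeffs, a):
    # Horner evaluation of p(a); coeffs listed from highest degree down.
    v = 0
    for c in coeffs:
        v = v * a + c
    return v

def divide_by_linear(coeffs, a):
    # Synthetic division of p(X) by (X - a): quotient q(X) and remainder r.
    # As shown above, r = p(a), so r == 0 exactly when a is a root.
    q = []
    r = 0
    for c in coeffs:
        r = r * a + c
        q.append(r)
    return q[:-1], q[-1]

def multiplicity(coeffs, a):
    # Extract factors (X - a) while a remains a root: p(X) = (X - a)^m g(X).
    m = 0
    while len(coeffs) > 1 and poly_eval(coeffs, a) == 0:
        coeffs, _ = divide_by_linear(coeffs, a)
        m += 1
    return m

# p(X) = X^3 - 6X^2 + 11X - 6 = (X-1)(X-2)(X-3)
print(divide_by_linear([1, -6, 11, -6], 1))   # ([1, -5, 6], 0): remainder 0
# p(X) = (X-2)^2 (X-5) = X^3 - 9X^2 + 24X - 20: a = 2 is a double root
print(multiplicity([1, -9, 24, -20], 2))      # 2
```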
In general, we have the following facts concerning the roots of a real polynomial.

Facts. Suppose that p(X) is a real polynomial of degree n ≥ 1. Then:
(1) if the degree n is odd, then p has at least one real root;
(2) if ω is a nonreal root of p, then its complex conjugate ω̄ is also a root of p;
(3) p can be written as a product of irreducible factors of the form rX + s or aX^2 + bX + c, where r, s, a, b, c are real numbers with r ≠ 0, a ≠ 0 and b^2 − 4ac < 0.

To show these facts, let p(X) = a_n X^n + a_{n−1} X^{n−1} + ··· + a_1 X + a_0, where a_0, a_1, ..., a_n are real numbers with a_n ≠ 0. If ω is a nonreal root of p(X), then p(ω) = 0 gives

    p(ω̄) = a_n ω̄^n + a_{n−1} ω̄^{n−1} + ··· + a_1 ω̄ + a_0
          = \overline{a_n ω^n + a_{n−1} ω^{n−1} + ··· + a_1 ω + a_0}   (because the coefficients are real)
          = \overline{p(ω)} = \overline{0} = 0,

and hence ω̄ is also a root of p(X). This proves (2).

Now both X − ω and X − ω̄ are factors of p(X), and they are different (otherwise ω = ω̄, contradicting the fact that ω is nonreal). Thus (X − ω)(X − ω̄) is a factor of p(X), so we have p(X) = (X − ω)(X − ω̄)q(X) for some polynomial q(X) of degree n − 2. Now

    (X − ω)(X − ω̄) = X^2 − (ω + ω̄)X + ωω̄ ≡ X^2 + bX + c

with b = −(ω + ω̄) and c = ωω̄ real, and

    b^2 − 4c = (ω + ω̄)^2 − 4ωω̄ = ω^2 + 2ωω̄ + ω̄^2 − 4ωω̄ = (ω − ω̄)^2 < 0,

since ω − ω̄ is purely imaginary and nonzero. Since p and (X − ω)(X − ω̄) ≡ X^2 + bX + c are real polynomials and p(X) = (X^2 + bX + c)q(X), the factor q(X) is also a real polynomial, because q(X) can be obtained by long division of p(X) by X^2 + bX + c.

We can ask whether q(X) has a nonreal root. If the answer is yes, we get another conjugate pair of nonreal roots and hence another factor of the form aX^2 + bX + c with real a, b, c such that b^2 − 4ac < 0. Pulling out this factor from q(X) (and hence from p(X)), the degree drops again by 2. Continue in this manner until all nonreal roots are exhausted, so that only real roots of p(X) remain. Now part (3) of the above theorem is clear. Part (1) is a direct consequence of part (3). The proof is complete.
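Fact (2) and the resulting real quadratic factor can be checked numerically. The Python sketch below is our own illustration with a hypothetical example polynomial, not part of the original text:

```python
def poly_eval(coeffs, x):
    # Horner evaluation; coeffs from highest degree down; works for complex x.
    v = 0
    for c in coeffs:
        v = v * x + c
    return v

# p(X) = X^2 - 2X + 5 is real, with nonreal roots 1 ± 2i.
p = [1, -2, 5]
w = 1 + 2j
print(poly_eval(p, w))               # 0j: w is a root
print(poly_eval(p, w.conjugate()))   # 0j: so is its conjugate
# The pair gives the real factor X^2 + bX + c with b = -(w + w̄), c = w·w̄:
b = -(w + w.conjugate()).real        # -2.0
c = (w * w.conjugate()).real         #  5.0
print(b * b - 4 * c)                 # -16.0: negative discriminant, as proved
```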
Exercise 4.1. Factorize each of the following polynomials into a product of real polynomials: (a) X^4 − 1, (b) X^6 − 1, (c) X^8 − 1, (d) X^5 − 1, (e) X^12 − 1.

The derivative of p(X) given in (4.2) is defined to be

    p′(X) ≡ \frac{d}{dX} p(X) ≡ \frac{d}{dX} \sum_{k=0}^{n} a_k X^k = \sum_{k=1}^{n} k a_k X^{k−1}.    (4.3)

The usual rules of differentiation, such as the product rule and the chain rule, are still applicable here:

Product Rule.  \frac{d}{dX} p(X)q(X) = p′(X)q(X) + p(X)q′(X).

Chain Rule.  \frac{d}{dX} p(q(X)) = p′(q(X)) q′(X).

The second derivative of p, denoted by p′′ or p^{(2)}, is defined to be the derivative of p′. In general, we can define the k-th derivative p^{(k)}(X) of a polynomial p(X) inductively by p^{(0)}(X) = p(X) and p^{(k)}(X) = (p^{(k−1)})′(X) for k ≥ 1.

What is the role played by higher derivatives? One answer is that we can use higher derivatives to compute the Taylor series of a function at a given point. Consider a monomial (X − a)^n, where n is a positive integer. Its first derivative is d(X − a)^n/dX = n(X − a)^{n−1}. If n ≥ 2, its second derivative is

    \frac{d^2}{dX^2} (X − a)^n = n(n − 1)(X − a)^{n−2}.

Continuing in this fashion, we see that when n ≥ k,

    \frac{d^k}{dX^k} (X − a)^n = n(n − 1) ··· (n − k + 1)(X − a)^{n−k}.

Note that, for k = n,

    \frac{d^n}{dX^n} (X − a)^n = n(n − 1) ··· 1 = n!,

which is a constant, and

    \frac{d^k}{dX^k} (X − a)^n = 0  for k > n.

Let p(X) be a polynomial of degree n. Take any constant a and consider q(X) = p(X + a). Clearly q(X) is still a polynomial of degree n, and hence we can write q(X) = \sum_{j=0}^{n} c_j X^j. Replacing X by X − a in the last identity, we have q(X − a) = \sum_{j=0}^{n} c_j (X − a)^j. Replacing X by X − a in q(X) = p(X + a), we obtain q(X − a) = p(X). Thus

    p(X) = \sum_{j=0}^{n} c_j (X − a)^j.    (4.4)

Now we compute p^{(k)}(a), the k-th derivative of p evaluated at a. From the previous computation we see that the k-th derivatives of the terms (X − a)^j on the right-hand side of (4.4), evaluated at X = a, all vanish except when k = j, and in that exceptional case we get k!. Therefore we have p^{(k)}(a) = k! c_k. Thus c_k = p^{(k)}(a)/k!.
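The conclusion c_k = p^{(k)}(a)/k! is easy to mechanize. Here is a short Python sketch (helper names are ours) that differentiates a coefficient list repeatedly and evaluates at a, producing the coefficients of p(X) in powers of X − a:

```python
from math import factorial

def derivative(coeffs):
    # Coefficients from highest degree down: d/dX of sum a_k X^k.
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def poly_eval(coeffs, a):
    # Horner evaluation of p(a).
    v = 0
    for c in coeffs:
        v = v * a + c
    return v

def taylor_coeffs(coeffs, a):
    # Returns [c_0, ..., c_n] with p(X) = sum_j c_j (X - a)^j,
    # using c_k = p^(k)(a) / k! as derived above.
    out = []
    k = 0
    while coeffs:
        out.append(poly_eval(coeffs, a) / factorial(k))
        coeffs = derivative(coeffs)
        k += 1
    return out

# p(X) = X^3 at a = 1: X^3 = 1 + 3(X-1) + 3(X-1)^2 + (X-1)^3
print(taylor_coeffs([1, 0, 0, 0], 1))   # [1.0, 3.0, 3.0, 1.0]
```

Note how the answer reproduces the binomial coefficients, in line with Example 4.1 below.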
Now we can rewrite (4.4) as

    p(X) = \sum_{k=0}^{n} \frac{p^{(k)}(a)}{k!} (X − a)^k,    (4.5)

which is the Taylor expansion of a polynomial p of degree n. When the degree of p(X) is not specified, we may write (4.5) as

    p(X) = \sum_{k≥0} \frac{p^{(k)}(a)}{k!} (X − a)^k.

The Taylor coefficients p^{(k)}(a)/k! here vanish if k is greater than the degree of p(X).

Example 4.1. We apply (4.5) to p(X) = X^n. We have

    p^{(k)}(X) = \frac{d^k}{dX^k} X^n = n(n − 1) ··· (n − k + 1) X^{n−k} = \frac{n!}{(n − k)!} X^{n−k}

for k ≤ n. So p^{(k)}(a)/k! = \frac{n!}{(n − k)! k!} a^{n−k} ≡ \binom{n}{k} a^{n−k}. Thus the Taylor expansion of X^n is

    X^n = \sum_{k=0}^{n} \binom{n}{k} a^{n−k} (X − a)^k.

Putting X = α + β and a = α, we obtain

    (α + β)^n = \sum_{k=0}^{n} \binom{n}{k} α^{n−k} β^k,

which is the well-known binomial expansion.

Besides the binomial formula, the Taylor formula (4.5) for polynomials gives us the partial fraction expansion for rational functions of the form p(X)/(X − a)^N, where the degree n of p is less than N. Indeed, dividing both sides of (4.5) by (X − a)^N, we get

    \frac{p(X)}{(X − a)^N} = \sum_{k=0}^{n} \frac{c_k}{(X − a)^{N−k}},

where c_k = p^{(k)}(a)/k!.

Exercise 4.2. Find the Taylor expansion of p(X) = X^4 + 1 at X = 1. Use your answer to find the partial fraction expansion of \frac{X^4 + 1}{(X − 1)^4}.

(The material of the rest of this chapter is optional.)

The notion of derivatives for polynomials is suggested by analysis, even though here we give a purely algebraic treatment. In the rest of the present chapter, we study the notion of finite differences, which is the counterpart in finite mathematics of derivatives. The finite difference method used to be, and still is, an important tool in applied mathematics.

Given a polynomial p(X), the difference ∆p(X) is defined via ∆p(X) = p(X + 1) − p(X). For example, ∆X^2 = (X + 1)^2 − X^2 = 2X + 1. As with higher derivatives, we have higher differences ∆^2 p(X), ∆^3 p(X), etc.,
with

    ∆^2 p(X) = ∆(∆p(X)) = ∆(p(X + 1) − p(X))
             = (p(X + 2) − p(X + 1)) − (p(X + 1) − p(X))
             = p(X + 2) − 2p(X + 1) + p(X),

    ∆^3 p(X) = ∆(p(X + 2) − 2p(X + 1) + p(X))
             = (p(X + 3) − 2p(X + 2) + p(X + 1)) − (p(X + 2) − 2p(X + 1) + p(X))
             = p(X + 3) − 3p(X + 2) + 3p(X + 1) − p(X).

Now you can guess that in general

    ∆^n p(X) = \sum_{k=0}^{n} (−1)^k \binom{n}{k} p(X + n − k).

To prove this, we use the "operator method". We introduce operators I and S, which operate on polynomials p(X) in the following way: Ip(X) = p(X) and Sp(X) = p(X + 1). We call I the identity operator and S the shift operator. Clearly, for all nonnegative integers k, I^k p(X) = p(X) and S^k p(X) = p(X + k). Now ∆p(X) = p(X + 1) − p(X) = Sp(X) − Ip(X) = (S − I)p(X). By applying the binomial theorem to (S − I)^n, we have

    ∆^n p(X) = (S − I)^n p(X) = \sum_{k=0}^{n} \binom{n}{k} S^{n−k} (−I)^k p(X) = \sum_{k=0}^{n} (−1)^k \binom{n}{k} p(X + n − k).

Instead of the monomials X^n, for finite differences it is more convenient to work with the "deformed monomials" X^{[1]} = X, X^{[2]} = X(X − 1), X^{[3]} = X(X − 1)(X − 2), etc. In general, X^{[n]} = X(X − 1) ··· (X − n + 1). Now let us compute ∆X^{[n+1]}:

    ∆X^{[n+1]} = ∆ X(X − 1) ··· (X − n)
               = (X + 1)X(X − 1) ··· (X − n + 1) − X(X − 1) ··· (X − n)
               = X(X − 1) ··· (X − n + 1){(X + 1) − (X − n)}
               = (n + 1)X(X − 1) ··· (X − n + 1) = (n + 1)X^{[n]}.

We conclude: ∆X^{[n+1]} = (n + 1)X^{[n]}. Do you see the resemblance of this to the fact that the derivative of X^{n+1} is (n + 1)X^n?

Consider the following problem: given a polynomial f(X), find a closed expression for the sum S_n = f(1) + f(2) + f(3) + ··· + f(n) in terms of n. We reduce this problem to the one of solving the difference equation ∆p(X) = f(X) for the unknown polynomial p(X). Once the solution p(X) is found, we have f(1) = p(2) − p(1), f(2) = p(3) − p(2), f(3) = p(4) − p(3), etc., and hence

    S_n = (p(2) − p(1)) + (p(3) − p(2)) + (p(4) − p(3)) + ··· + (p(n + 1) − p(n)) = p(n + 1) − p(1).
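Both the closed form ∆^n p(X) = Σ_k (−1)^k C(n,k) p(X + n − k) and the rule ∆X^{[n+1]} = (n + 1)X^{[n]} can be verified directly. A Python sketch (function names are our own):

```python
from math import comb

def delta(p):
    # The difference operator: (∆p)(x) = p(x + 1) - p(x).
    return lambda x: p(x + 1) - p(x)

def delta_n(p, n):
    # n-fold application of ∆.
    for _ in range(n):
        p = delta(p)
    return p

def delta_n_closed(p, n, x):
    # Closed form: ∆^n p(X) = sum_k (-1)^k C(n,k) p(X + n - k).
    return sum((-1) ** k * comb(n, k) * p(x + n - k) for k in range(n + 1))

cube = lambda x: x ** 3
print(delta_n(cube, 2)(5), delta_n_closed(cube, 2, 5))   # 36 36

def falling(x, n):
    # The "deformed monomial" X^[n] = X(X-1)...(X-n+1) evaluated at x.
    out = 1
    for i in range(n):
        out *= x - i
    return out

# ∆X^[4] = 4 X^[3], checked at X = 7:
print(falling(8, 4) - falling(7, 4), 4 * falling(7, 3))   # 840 840
```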
For example, we know that p(X) = X^{[m+1]}/(m + 1) is a solution to the difference equation ∆p(X) = X^{[m]}. So we have

    \sum_{k=1}^{n} k(k − 1) ··· (k − m + 1) = p(n + 1) − p(1) = \frac{(n + 1)n ··· (n − m + 1)}{m + 1}.

The special cases m = 1 and m = 2 give

    \sum_{k=1}^{n} k = \frac{(n + 1)n}{2},    \sum_{k=1}^{n} k(k − 1) = \frac{(n + 1)n(n − 1)}{3}.

These two identities are enough for us to find the sum 1^2 + 2^2 + 3^2 + ··· + n^2:

    \sum_{k=1}^{n} k^2 = \sum_{k=1}^{n} (k(k − 1) + k) = \frac{(n + 1)n(n − 1)}{3} + \frac{(n + 1)n}{2} = \frac{n(n + 1)(2n + 1)}{6}.

As we have remarked above, the problem of finding the sum S_n is equivalent to the problem of solving the difference equation ∆p(X) = f(X). When f(X) = X^{[k]}, we know a solution, namely p(X) = X^{[k+1]}/(k + 1). Thus, if we can express f(X) as a linear combination of "deformed monomials" X^{[k]}, then it is easy to write down a solution to the difference equation ∆p(X) = f(X). There is a neat formula telling us how to do this:

    f(X) = \sum_{k≥0} \frac{∆^k f(0)}{k!} X^{[k]}.    (4.6)

As you can tell, this is the finite difference version of Taylor's formula. Its proof is left to you as an exercise.

Let us go back to the difference equation

    ∆P(X) = f(X).    (4.7)

Using the "Taylor formula" (4.6) above and the identity ∆X^{[n]} = nX^{[n−1]}, we see that a particular solution is given by

    P_0(X) = \sum_{k≥0} \frac{∆^k f(0)}{(k + 1)!} X^{[k+1]}.    (4.8)

Now if P(X) is a solution to (4.7), then Q(X) = P(X) − P_0(X) satisfies

    Q(X + 1) − Q(X) = ∆Q(X) = ∆P(X) − ∆P_0(X) = f(X) − f(X) = 0.

Thus Q(X + 1) = Q(X); in other words, Q(X) is periodic with period 1. But among polynomials only constants can be periodic (the reader is asked to prove this statement). So we have P(X) − P_0(X) = Q(X) = C for some constant C, or P(X) = P_0(X) + C. We have proved that the general solution to (4.7) is of the form P(X) = P_0(X) + C, where P_0(X) is given by (4.8). Notice that, if f(X) is a polynomial of degree n, then the solution P(X) is of degree n + 1.

Now we consider the equation ∆P(X) = nX^{n−1}.
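Formulas (4.6) and (4.8) translate into a small summation routine. The Python sketch below (helper names ours) computes the coefficients ∆^k f(0) by repeated differencing, builds the particular solution P_0, and returns S_n = P_0(n + 1) − P_0(1):

```python
from math import factorial

def falling(x, n):
    # X^[n] = X(X-1)...(X-n+1) evaluated at x.
    out = 1
    for i in range(n):
        out *= x - i
    return out

def newton_coeffs(f, deg):
    # ∆^k f(0) for k = 0..deg, so that f(X) = sum_k ∆^k f(0)/k! X^[k]  (4.6).
    vals = [f(i) for i in range(deg + 1)]
    out = []
    for _ in range(deg + 1):
        out.append(vals[0])
        vals = [vals[i + 1] - vals[i] for i in range(len(vals) - 1)]
    return out

def sum_f(f, deg, n):
    # S_n = f(1) + ... + f(n) = P0(n+1) - P0(1), with P0 as in (4.8).
    # Integer division is exact here: falling(x, k+1) is divisible by (k+1)!.
    d = newton_coeffs(f, deg)
    P0 = lambda x: sum(d[k] * falling(x, k + 1) // factorial(k + 1)
                       for k in range(deg + 1))
    return P0(n + 1) - P0(1)

print(sum_f(lambda x: x ** 2, 2, 10))   # 385 = 1^2 + 2^2 + ... + 10^2
```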
As we know from the above discussion, the solution exists and is unique up to the addition of a constant. The question is: what is the "right choice" of this constant? To answer this, let us consider a similar equation, with n on the right-hand side switched to n + 1:

    ∆Q(X) = (n + 1)X^n,  or  Q(X + 1) − Q(X) = (n + 1)X^n.

Again, Q(X) is uniquely determined up to the addition of a constant. Taking derivatives on both sides of the last identity, we have

    Q′(X + 1) − Q′(X) = (n + 1)nX^{n−1},  or  ∆((n + 1)^{−1} Q′(X)) = nX^{n−1}.

Thus (n + 1)^{−1} Q′(X) is a particular solution to ∆P(X) = nX^{n−1}; notice that the constant term in Q(X) is "killed" by differentiation, and hence (n + 1)^{−1} Q′(X) is completely determined. We write B_n(X) = (n + 1)^{−1} Q′(X), called the Bernoulli polynomial of degree n. From the above discussion, we have the following two important properties of Bernoulli polynomials:

    ∆B_n(X) = nX^{n−1},    B′_{n+1}(X) = (n + 1)B_n(X).

Again, notice the resemblance of the last identity to dX^{n+1}/dX = (n + 1)X^n.

The constant term of B_n(X), namely b_n := B_n(0), is called the n-th Bernoulli number. It turns out that Bernoulli polynomials can be written in terms of Bernoulli numbers:

    B_n(X) = \sum_{k=0}^{n} \binom{n}{k} b_k X^{n−k}.

The first few Bernoulli numbers are: b_0 = 1, b_1 = −1/2, b_2 = 1/6, b_3 = 0, b_4 = −1/30, b_5 = 0, b_6 = 1/42.

Using Bernoulli polynomials, we can write down a closed form for sums of powers:

    1^k + 2^k + ··· + n^k = \frac{B_{k+1}(n + 1) − B_{k+1}(1)}{k + 1}.

Bernoulli numbers are crucial in analytic number theory, in which many problems involve the estimation of sums by integrals using a classical recipe called the Euler–Maclaurin formula, which contains Bernoulli numbers and Bernoulli polynomials as ingredients.
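These objects can be computed exactly with rational arithmetic. The following Python sketch is our own; the recurrence it uses follows from ∆B_{m+1}(X) = (m + 1)X^m at X = 0, which gives B_{m+1}(1) = B_{m+1}(0) and hence Σ_{k=0}^{m} C(m+1, k) b_k = 0 for m ≥ 1:

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(N):
    # b_0..b_N from sum_{k=0}^{m} C(m+1, k) b_k = 0 for m >= 1, b_0 = 1.
    b = [Fraction(1)]
    for m in range(1, N + 1):
        b.append(-sum(comb(m + 1, k) * b[k] for k in range(m)) / (m + 1))
    return b

def bernoulli_poly(n, x, b):
    # B_n(x) = sum_k C(n,k) b_k x^(n-k), with b a long-enough list of b_k.
    return sum(comb(n, k) * b[k] * Fraction(x) ** (n - k) for k in range(n + 1))

def power_sum(k, n):
    # 1^k + 2^k + ... + n^k = (B_{k+1}(n+1) - B_{k+1}(1)) / (k+1).
    b = bernoulli_numbers(k + 1)
    return (bernoulli_poly(k + 1, n + 1, b) - bernoulli_poly(k + 1, 1, b)) / (k + 1)

print(*bernoulli_numbers(6))   # 1 -1/2 1/6 0 -1/30 0 1/42
print(power_sum(2, 10))        # 385
```

Using `Fraction` keeps every Bernoulli number exact, which matters since the b_k do not have terminating decimal expansions.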