Chapter 8
Complex Numbers
In this chapter, we reach the last of our important number systems, the complex
numbers. You will learn
(a) the construction of the complex numbers;
(b) their geometric representation in the Argand diagram;
(c) the Fundamental Theorem of Algebra.
The Study skill points out that examples do not make a proof. In the supplementary material we show that every quadratic equation over the complex numbers
can be solved, and discuss Euler’s famous formula $e^{i\pi} = -1$.
8.1
Complex numbers
The final extension arises because there are still equations we can’t solve, such as
$x^2 = -1$ (which has no real solution) or $x^3 = 2$ (which has only one, though for
various reasons we would like it to have three). It turns out that the first equation
is the crucial one.
8.1.1
The square root of minus one
(Thanks to Emma Kelly for this picture)
Negative numbers don’t have square roots, right? So mathematicians thought
for millennia. Then they found that they needed to use square roots of negative
numbers to solve cubic equations, even if the solutions to these equations were
ordinary real numbers. Finally they became brave enough to accept these “imaginary numbers” into the body of mathematics. With hindsight, it is very simple.
Definition 8.1.1 A complex number is a number of the form a + bi, where a and
b are real numbers, and i is a mysterious symbol which will have the property that
$i^2 = -1$. The rules for addition and multiplication are
(a + bi) + (c + di) = (a + c) + (b + d)i,
(a + bi)(c + di) = (ac − bd) + (ad + bc)i.
The rule for multiplication comes out by just expanding in the usual way and
using $i^2 = -1$:
\[ (a + bi)(c + di) = ac + adi + bci + bdi^2 = (ac - bd) + (ad + bc)i. \]
Subtraction and division (except by zero) also work for complex numbers.
You can work out the rule for subtraction. How do we divide? You can check that
the rule above gives
\[ (a + bi)(a - bi) = a^2 + b^2, \]
which is a positive number unless a = b = 0. So, to divide by a + bi, we multiply
by
\[ \frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}\, i. \]
Thus, in the complex numbers, we can add, subtract, multiply, and divide
(except by zero), and the laws we met earlier all apply here too.
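The rules above can be tried out directly. Here is a minimal sketch in Python which represents a + bi as the pair (a, b); the function names are my own, and I assume that the real and imaginary parts are integers or fractions so that the arithmetic is exact.

```python
from fractions import Fraction

# A complex number a + bi is stored as the pair (a, b).
# Assumption: a and b are integers or Fractions, so nothing is rounded.

def add(z, w):
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def mul(z, w):
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def inverse(z):
    # 1/(a + bi) = a/(a^2 + b^2) - (b/(a^2 + b^2)) i
    a, b = z
    n = a * a + b * b
    if n == 0:
        raise ZeroDivisionError("0 has no inverse")
    return (Fraction(a, n), Fraction(-b, n))

def div(z, w):
    # Dividing by w means multiplying by the inverse of w.
    return mul(z, inverse(w))
```

For instance, mul((3, 2), (1, 1)) gives (1, 5), that is, (3 + 2i)(1 + i) = 1 + 5i, and div((1, 5), (1, 1)) takes us back to 3 + 2i.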
Complex numbers are not called complex because they are complicated: a
modern advertising executive would certainly have come up with a different name!
They are called “complex” because each complex number is built of two parts,
each of which is simpler (being a real number).
Here, for example, is a proof of the distributive law. Let $z_1 = a_1 + b_1 i$, $z_2 = a_2 + b_2 i$, and $z_3 = a_3 + b_3 i$. Now
\[ \begin{aligned} z_1(z_2 + z_3) &= (a_1 + b_1 i)((a_2 + a_3) + (b_2 + b_3)i) \\ &= (a_1(a_2 + a_3) - b_1(b_2 + b_3)) + (a_1(b_2 + b_3) + b_1(a_2 + a_3))i, \end{aligned} \]
and
\[ \begin{aligned} z_1 z_2 + z_1 z_3 &= ((a_1 a_2 - b_1 b_2) + (a_1 b_2 + a_2 b_1)i) + ((a_1 a_3 - b_1 b_3) + (a_1 b_3 + a_3 b_1)i) \\ &= (a_1 a_2 - b_1 b_2 + a_1 a_3 - b_1 b_3) + (a_1 b_2 + a_2 b_1 + a_1 b_3 + a_3 b_1)i, \end{aligned} \]
and a little bit of rearranging shows that the two expressions are the same.
Definition 8.1.2 If z = a + bi is a complex number (where a and b are real), we say
that a and b are the real part and imaginary part of z respectively. The complex
number a − bi is called the complex conjugate of z, and is written as $\bar{z}$.
So the rules for addition and subtraction can be put like this:
To add or subtract complex numbers, we add or subtract their real
parts and their imaginary parts.
The rule for multiplication looks more complicated as we have written it out.
There is another representation of complex numbers which makes it look simpler.
Definition 8.1.3 Let z = a + bi be a non-zero complex number. The modulus and
argument of z are defined by
\[ |z| = \sqrt{a^2 + b^2}, \]
\[ \arg(z) = \theta \quad \text{where } \cos\theta = a/|z| \text{ and } \sin\theta = b/|z|. \]
In other words, if |z| = r and arg(z) = θ, then
\[ z = r(\cos\theta + i\sin\theta). \]
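Often, people write $\arg(z) = \tan^{-1}(b/a)$, or $\arctan(b/a)$ if you prefer; but this is
not quite correct, because the tangent alone does not determine the angle. Consider the complex number $z = 1 - \sqrt{3}\,i$. The equation $\tan\theta = (-\sqrt{3})/1$ is satisfied by
$\theta = 2\pi/3$ as well as by $\theta = 5\pi/3$, but the argument of z is $5\pi/3$, since its cosine is positive and its sine negative.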
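The same pitfall shows up when computing. The sketch below uses Python's standard math module; the function name argument is my own, and I have chosen to report the argument in the range from 0 up to 2π, which is a convention rather than part of the definition.

```python
import math

def argument(a, b):
    """Argument of a + bi, reported in the range [0, 2*pi).

    atan2 looks at the signs of both b and a, so it picks the
    correct quadrant. Assumes (a, b) is not (0, 0).
    """
    return math.atan2(b, a) % (2 * math.pi)

# z = 1 - sqrt(3) i: cosine positive, sine negative.
print(math.isclose(argument(1, -math.sqrt(3)), 5 * math.pi / 3))

# z = -1 + sqrt(3) i has the same ratio b/a, but a different argument.
print(math.isclose(argument(-1, math.sqrt(3)), 2 * math.pi / 3))
```

Both 1 − √3 i and −1 + √3 i have b/a = −√3, so any recipe which uses only the ratio b/a must give the same answer for both, and is wrong for one of them.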
Now the rules for multiplication and division are:
To multiply two complex numbers, multiply their moduli and add
their arguments. To divide two complex numbers, divide their moduli
and subtract their arguments.
Remark The definition of modulus of a complex number agrees with the definition in the last chapter of the modulus of a real number. The argument of a
positive real number is 0; that of a negative real number is π (or 180°).
8.1.2
The complex plane, or Argand diagram
The complex numbers can be represented geometrically, by points in the Euclidean plane (which is usually referred to as the Argand diagram or the complex
plane for this purpose). The complex number z = a + bi is represented as the
point with coordinates (a, b). Then |z| is the length of the line from the origin to
the point z, and arg(z) is the angle between this line and the positive x-axis. The
next diagram shows this.
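[Figure: Argand diagram showing the point z = a + bi at distance |z| = r from the origin 0, on a line making angle θ with the positive x-axis, with a = r cos θ and b = r sin θ.]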
In terms of the complex plane, we can give a geometric description of addition
and multiplication of complex numbers. The addition rule is the parallelogram
rule which you will meet again for vectors in Geometry I, next semester. It states:
Draw lines from the origin to the points representing the two complex
numbers z1 , z2 to be added. Construct the parallelogram having these
lines as two of its sides. Then the point opposite the origin represents
z1 + z2 .
This is illustrated in the next diagram.
[Figure: the parallelogram with vertices 0, z1, z2 and z1 + z2.]
Multiplication is a little bit more complicated. Let z be a complex number
with modulus r and argument θ, so that z = r(cos θ + i sin θ). Then the way to
multiply an arbitrary complex number by z is a combination of a stretch and a
rotation: first we expand the plane so that the distance of each point from the
origin is multiplied by r; then we rotate the plane through an angle θ. In the
next diagram, we are multiplying by $1 + i = \sqrt{2}(\cos(\pi/4) + i\sin(\pi/4))$; the dots
represent the stretching out by a factor of $\sqrt{2}$, and the circular arc represents the
rotation by π/4.
[Figure: the point 3 + 2i is stretched away from the origin 0 by a factor of √2 (dotted line) and then rotated through π/4 (circular arc), arriving at (3 + 2i)(1 + i) = 1 + 5i.]
Now let’s check the correctness of our rule for multiplying complex numbers.
Remember that the rule is: to multiply two complex numbers, we multiply the
moduli and add the arguments. To see that this is correct, suppose that $z_1$ and $z_2$
are two complex numbers; let their moduli be $r_1$ and $r_2$, and their arguments $\theta_1$
and $\theta_2$, respectively. Then
\[ z_1 = r_1(\cos\theta_1 + i\sin\theta_1), \qquad z_2 = r_2(\cos\theta_2 + i\sin\theta_2). \]
Then
\[ \begin{aligned} z_1 z_2 &= r_1 r_2(\cos\theta_1 + i\sin\theta_1)(\cos\theta_2 + i\sin\theta_2) \\ &= r_1 r_2((\cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2) + (\cos\theta_1\sin\theta_2 + \sin\theta_1\cos\theta_2)i) \\ &= r_1 r_2(\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2)), \end{aligned} \]
which is what we wanted to show.
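As a quick numerical illustration of the rule (not a substitute for the proof), here is a sketch using Python's built-in complex numbers and the standard cmath module, applied to the product (3 + 2i)(1 + i) from the diagram. The variable names are mine.

```python
import cmath

z1 = 3 + 2j
z2 = 1 + 1j

# cmath.polar returns the pair (modulus, argument).
r1, theta1 = cmath.polar(z1)
r2, theta2 = cmath.polar(z2)

# Multiply the moduli and add the arguments, then convert back.
product_polar = cmath.rect(r1 * r2, theta1 + theta2)

# Multiply directly using (ac - bd) + (ad + bc)i.
product_direct = z1 * z2

print(product_direct)  # (1+5j)
print(abs(product_polar - product_direct))  # tiny: rounding error only
```

The two answers agree up to floating-point rounding, since the computer only stores the moduli and arguments approximately.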
From this we can prove De Moivre’s Theorem:
Theorem 8.1.4 (De Moivre’s Theorem) For any natural number n, we have
\[ (\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta. \]
Proof The proof is by induction. Starting the induction is easy since $(\cos\theta + i\sin\theta)^0 = 1$ and cos 0 + i sin 0 = 1.
For the inductive step, suppose that the result is true for n, that is,
\[ (\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta. \]
Then
\[ \begin{aligned} (\cos\theta + i\sin\theta)^{n+1} &= (\cos\theta + i\sin\theta)^n \cdot (\cos\theta + i\sin\theta) \\ &= (\cos n\theta + i\sin n\theta)(\cos\theta + i\sin\theta) \\ &= \cos(n + 1)\theta + i\sin(n + 1)\theta, \end{aligned} \]
which is the result for n + 1. So the proof by induction is complete.
Note that, in the second line of the chain of equations, we have used the inductive hypothesis, and in the third line, we have used the rule for multiplying
complex numbers.
The argument is clear if we express it geometrically. To multiply by the complex number $(\cos\theta + i\sin\theta)^n$, we rotate n times through an angle θ, which is the
same as rotating through an angle nθ.
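If you would like to see the theorem in action for particular values, the following sketch compares the two sides numerically. The function name de_moivre_gap is my own invention. Bear in mind that a handful of numerical checks is only a sanity test; the proof is the induction above.

```python
import math

def de_moivre_gap(theta, n):
    """Distance between the two sides of De Moivre's Theorem."""
    lhs = complex(math.cos(theta), math.sin(theta)) ** n
    rhs = complex(math.cos(n * theta), math.sin(n * theta))
    return abs(lhs - rhs)

# The gap is zero up to rounding error for each n tried.
for n in range(6):
    print(n, de_moivre_gap(0.7, n) < 1e-12)
```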
Example Find all complex numbers z satisfying $z^3 = -8$.
If |z| = r and arg(z) = θ, we have $(r(\cos\theta + i\sin\theta))^3 = -8 = 8(\cos\pi + i\sin\pi)$. So $r^3 = 8$, giving r = 2, and 3θ has the same sine and cosine as π. This
means that 3θ = π, 3π, or 5π, so that θ = π/3, π, or 5π/3. (There is no need to
go further, since 3θ = 7π would give θ = 7π/3 = 2π + π/3, which has the same
sine and cosine as π/3.) Thus
\[ z = 2(\cos\theta + i\sin\theta) = 1 + i\sqrt{3},\quad -2 \quad\text{or}\quad 1 - i\sqrt{3}. \]
Check for yourself that all these numbers do satisfy $z^3 = -8$.
De Moivre’s Theorem is useful in deriving trigonometrical formulae. For example,
\[ \begin{aligned} \cos 3\theta + i\sin 3\theta &= (\cos\theta + i\sin\theta)^3 \\ &= (\cos^3\theta - 3\cos\theta\sin^2\theta) + (3\cos^2\theta\sin\theta - \sin^3\theta)i, \end{aligned} \]
so
\[ \cos 3\theta = \cos^3\theta - 3\cos\theta\sin^2\theta, \qquad \sin 3\theta = 3\cos^2\theta\sin\theta - \sin^3\theta. \]
These can be converted into the more familiar forms $\cos 3\theta = 4\cos^3\theta - 3\cos\theta$
and $\sin 3\theta = 3\sin\theta - 4\sin^3\theta$ by using the equation $\cos^2\theta + \sin^2\theta = 1$.
8.1.3
The Fundamental Theorem of Algebra
We enlarged the real numbers to the complex numbers by adding a square root
of minus one, a solution of the equation $x^2 + 1 = 0$. It turns out that we have,
incidentally, provided solutions for a huge class of equations.
Definition 8.1.5 (a) A polynomial (over some number system) is an expression
of the form
\[ a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0, \]
where $a_n, a_{n-1}, \ldots, a_1, a_0$ belong to the number system under consideration,
and the leading coefficient $a_n$ is not equal to zero. The number n is the
degree of the polynomial.
(b) A root of the above polynomial is a number r (in the appropriate number
system) such that
\[ a_n r^n + a_{n-1} r^{n-1} + \cdots + a_1 r + a_0 = 0. \]
For example, r = 1 and r = 2 are roots of the polynomial $x^3 - 7x + 6$.
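Checking whether a number is a root is just a matter of evaluating the polynomial. The sketch below does this with the nested multiplication often called Horner's rule, which is my choice of method and is not used elsewhere in these notes; the function name evaluate and the convention of listing coefficients from $a_n$ down to $a_0$ are also assumptions made for the illustration.

```python
def evaluate(coeffs, x):
    """Value of a_n x^n + ... + a_1 x + a_0 at x.

    coeffs lists the coefficients from a_n down to a_0.
    Works for integers, fractions, floats or complex numbers.
    """
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

# x^3 - 7x + 6, written with its zero x^2 coefficient.
p = [1, 0, -7, 6]

print(evaluate(p, 1))   # 0, so 1 is a root
print(evaluate(p, 2))   # 0, so 2 is a root
print(evaluate(p, 3))   # 12, so 3 is not a root
```

Since a cubic can have three roots, you might like to hunt for the third one; evaluate(p, -3) will confirm your guess.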
Note: Polynomials have a kind of double existence. On the one hand, they are
just algebraic expressions; the x in the polynomial doesn’t have any real existence,
it is just a placeholder. On the other hand, a polynomial defines a function on the
number system in question: putting a value into the black box labelled by the
polynomial simply evaluates the polynomial with x put equal to this value, as we
saw in Chapter 4.
We enlarged our number systems successively to find roots of various polynomials: $bx - a$ (to build Q), $x^2 - 2$ (to build R), $x^2 + 1$ (to build C). Maybe we have
to do more complicated constructions to find roots of other polynomials? No, we
have reached the end:
Theorem 8.1.6 (Fundamental Theorem of Algebra) Any polynomial over C of
degree at least 1 has a root in C.
The first rigorous proof of this theorem was given by Gauss. Indeed, he gave
many different proofs of the theorem: ten, according to Michael Atiyah. He
clearly felt very proud of this!
There is another account of it here on Theorem of the Day.
In the supplementary material, we will see something much weaker: at least
we can solve all quadratic equations. Here, we will see another small part of the
theorem:
Proposition 8.1.7 Every non-zero complex number has n distinct nth roots: that
is, if $a \neq 0$, the equation $z^n = a$ has n distinct solutions in C.
Proof We write a in modulus-argument form, a = r(cos θ + i sin θ).
Now there is a unique positive real number s which satisfies $s^n = r$, that is, s
is the nth root of r. (This follows from the Principle of the Supremum: s is the
supremum of the set $\{x \in \mathbb{R} : x^n \le r\}$.) Since the function $f(x) = x^n$ is strictly
increasing for positive real numbers x, there cannot be more than one solution.
Now put z = s(cos φ + i sin φ); then $z^n = s^n(\cos n\varphi + i\sin n\varphi)$, by De Moivre’s
Theorem. So z will be a solution if
\[ \cos n\varphi + i\sin n\varphi = \cos\theta + i\sin\theta. \]
Now the value φ = θ/n clearly satisfies this equation. Moreover, since both the
cosine and sine functions are periodic with period 2π, we see that adding any multiple of 2π to nφ (that is, adding any multiple of 2π/n to φ) will give a solution.
So we have solutions given by
\[ \varphi = \theta/n,\ (\theta + 2\pi)/n,\ (\theta + 4\pi)/n,\ \ldots,\ (\theta + 2(n - 1)\pi)/n. \]
Why did we stop here? The next term in this sequence would be (θ + 2nπ)/n =
(θ/n) + 2π, and the complex number z = s(cos(φ + 2π) + i sin(φ + 2π)) is identical to z = s(cos φ + i sin φ). After that the n solutions simply repeat.
So we have produced n distinct nth roots of the complex number a.
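The proof is really a recipe, and it can be followed step by step on a computer. Here is a sketch using Python's cmath module; the function name nth_roots is mine, and the answers are floating-point approximations rather than exact values.

```python
import cmath
import math

def nth_roots(a, n):
    """The n distinct nth roots of a non-zero complex number a."""
    if a == 0:
        raise ValueError("a must be non-zero")
    r, theta = cmath.polar(a)     # a = r(cos theta + i sin theta)
    s = r ** (1.0 / n)            # the positive real nth root of r
    # phi runs through theta/n, (theta + 2 pi)/n, ..., (theta + 2(n-1) pi)/n
    return [cmath.rect(s, (theta + 2 * math.pi * k) / n) for k in range(n)]

# The cube roots of -8 from the earlier example:
# approximately 1 + i sqrt(3), -2 and 1 - i sqrt(3).
for z in nth_roots(-8, 3):
    print(z, abs(z ** 3 + 8) < 1e-9)
```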
Exercise Convince yourself that the n roots that we found are the vertices of
a regular n-gon with centre at the origin. (Hint: multiplication by cos(2π/n) +
i sin(2π/n) is a rotation about the origin through an angle 2π/n.)
For example, the three cube roots of 1 (the three numbers z satisfying $z^3 = 1$)
are
• $z = 1$,
• $z = \cos(2\pi/3) + i\sin(2\pi/3) = \dfrac{-1 + \sqrt{-3}}{2}$,
• $z = \cos(4\pi/3) + i\sin(4\pi/3) = \dfrac{-1 - \sqrt{-3}}{2}$.
Check this directly as follows. We are trying to solve the equation $z^3 = 1$, that is,
$z^3 - 1 = 0$. We can write this as $(z - 1)(z^2 + z + 1) = 0$, and so the solutions are 1
and the two solutions of the quadratic equation $z^2 + z + 1 = 0$. The usual formula
for the solution of the quadratic gives us the other two displayed numbers.
Now plot these three points in the Argand diagram and show that they are the
vertices of an equilateral triangle.
8.1.4
Summary of number systems
We have talked about the number systems N, Z, Q, R, C. Each is constructed from
the one before, and each includes all the numbers in the one before and some new
numbers, added to enable us to solve certain kinds of equations. The picture looks
something like this:
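[Figure: nested regions N ⊂ Z ⊂ Q ⊂ R ⊂ C, each labelled with sample members. N contains 1, 2, 3, 4, 5, . . .; Z adds 0, −1, −2, −3, . . .; Q adds fractions such as 1/2 and 22/7; R adds √2, e, π and log10 2; C adds i, −i, 1 + 2i and −1/2 + (√3/2)i.]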
So for example, the natural number 2, the integer +2, the rational number 2/1,
the real number 2.000 . . ., and the complex number 2 + 0i should all be treated as
the same thing, even though they are actually constructed differently.
Consider, for example, the function f : R → R given as follows:
\[ f(x) = \begin{cases} x & \text{if $x$ is rational,} \\ 2x & \text{if $x$ is irrational.} \end{cases} \]
This function is represented by a black box in which the input can be any real
number x. The black box inspects the number x which is input; if x is rational
(that is, if x ∈ Q), then it passes x straight to the output, whereas if x is irrational
(that is, if x ∈ R \ Q), then x is multiplied by 2 before being passed to the output.
Exercise Is the function f injective? Is it surjective?
8.1.5
Study skills 8: Examples don’t make a proof
How would you prove the following theorem?
Theorem All odd numbers are prime.
You should know by now that the argument
• 3 is prime,
• 5 is prime,
• 7 is prime,
• and so on . . .
is not a proof. If an argument contains the words “and so on” or a row of dots,
you should immediately suspect that there is a proof by induction going on somewhere. But there is no way to prove this theorem by induction: the fact that 7 is
prime tells you nothing about whether 9 is prime.
For more examples of how to prove that all odd numbers are prime, look at
this web page.
In particular, note the confused undergraduate’s proof:
Let p be any prime number larger than 2. Then p is not divisible by
2, so p is odd.
What is the mistake here? We discussed this in an earlier study guide.
It is a very common mistake to think that a few examples make a proof (as
above). Take care!
We saw earlier the definition of Mersenne numbers: these are numbers of the
form $2^p - 1$, where p is prime. Now
• $2^2 - 1 = 3$ is prime,
• $2^3 - 1 = 7$ is prime,
• $2^5 - 1 = 31$ is prime,
• $2^7 - 1 = 127$ is prime,
• but $2^{11} - 1 = 2047 = 23 \times 89$.
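The list above is easy to reproduce and extend by machine. The sketch below uses trial division, a simple primality test which is my own choice here and is only practical for small numbers; the names is_prime and mersenne are likewise just for this illustration.

```python
def is_prime(n):
    """Trial division: fine for small n, far too slow for large ones."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

# For each prime p, record whether 2^p - 1 is prime.
mersenne = {p: is_prime(2 ** p - 1) for p in [2, 3, 5, 7, 11, 13]}
print(mersenne)
```

The pattern holds for p = 2, 3, 5, 7, fails at p = 11, and then holds again at p = 13, which shows how misleading the first few cases can be.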
If you have a good, general example of the thing that has to be proved, then
with a bit of luck you can turn it into a general proof. Indeed, this is what mathematicians did before they came up with the idea of proof in the modern sense.
Here, for example, is how they would have solved a quadratic equation in ancient
Egypt.
Example Solve the quadratic $x^2 + 16 = 10x$.
(a) Calculate the square of half the coefficient of x: $5^2 = 25$.
(b) Subtract the constant term from this: 25 − 16 = 9.
(c) Take the square root of this: $\sqrt{9} = 3$.
(d) Add and subtract this from half the coefficient of x: 5 + 3 = 8, 5 − 3 = 2.
(e) The solutions are x = 8 and x = 2. Substitute them in and you will find that
this works.
I hope you can write down a general proof that the method works.
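To see that the steps do not depend on the particular numbers 10 and 16, here is the same recipe written as a sketch in Python for an equation of the shape $x^2 + q = px$. The names solve_by_recipe, p and q are mine, and I assume the number in step (b) is not negative, so that an ordinary real square root exists. Of course, a program that works on examples is still not a proof.

```python
import math

def solve_by_recipe(p, q):
    """Solve x^2 + q = p*x by steps (a) to (e) of the example."""
    half = p / 2
    square = half ** 2                 # (a) square of half the coefficient of x
    difference = square - q            # (b) subtract the constant term
    if difference < 0:
        raise ValueError("no real square root in step (c)")
    root = math.sqrt(difference)       # (c) take the square root
    return half + root, half - root    # (d) add and subtract

print(solve_by_recipe(10, 16))  # (8.0, 2.0)
```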
8.2
Supplementary material
8.2.1
Solving quadratic equations
Proposition 8.2.1 Every quadratic equation over the complex numbers has a solution. In other words, every polynomial of degree 2 has a root.
Everyone knows the formula for the solution of the quadratic equation $ax^2 + bx + c = 0$:
\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \]
You probably learned this formula in the case where a, b, c are real numbers. Does
it work over C?
The easiest way to show that it does is simply to substitute the two values of x
into the quadratic equation and simplify. I will just take the positive sign here; the
argument for the negative sign is similar. If $x = (-b + \sqrt{b^2 - 4ac})/2a$, then
\[ \begin{aligned} ax^2 + bx + c &= \frac{b^2 - 2b\sqrt{b^2 - 4ac} + (b^2 - 4ac)}{4a} + \frac{b(-b + \sqrt{b^2 - 4ac})}{2a} + c \\ &= \frac{b^2 + b^2 - 2b^2 - 4ac + 4ac + (-2b + 2b)\sqrt{b^2 - 4ac}}{4a} \\ &= 0. \end{aligned} \]
In the second line, we have simply put everything over the common denominator
4a. The manipulations that we do in simplifying the expression use the various
laws (commutative, associative, distributive) that hold in the complex numbers
just as they do in the real numbers. So the answer is, yes, the formula is valid.
The only problem is: does the square root $\sqrt{b^2 - 4ac}$ exist?
Yes it does:
Lemma 8.2.2 Every complex number z has a square root.
There are two ways to see this. The first is to use the argument that we already used for Proposition 8.1.7, involving the modulus-argument form. We can
suppose that $z \neq 0$ (since 0 certainly has a square root), so we can write
\[ z = r(\cos\theta + i\sin\theta). \]
Then, if we put $w = \sqrt{r}(\cos\theta/2 + i\sin\theta/2)$, we see that $w^2 = z$, and we have
found our square root.
But how do we know that $\sqrt{r}$ exists? Well, r is a positive real number, and we
showed in the supplementary material for the last chapter of the notes that every
positive real number has a square root.
The second method works just with the real and imaginary parts. We are given
a complex number z = a + bi, and we want to find w = x + yi such that
\[ (x + yi)^2 = a + bi, \]
in other words,
\[ x^2 - y^2 = a, \qquad 2xy = b. \]
Now a, b, x, y are real numbers, so this is just a question about real numbers.
We can assume that $b \neq 0$. For if b = 0, we are trying to find the square root
of a real number a. If a ≥ 0, we know that this exists; and if a is negative, say
a = −c, then c > 0 and we can take $(\sqrt{c})i$ to be the square root. Now, if $b \neq 0$,
the second equation shows that $x \neq 0$, so y = b/2x.
Substituting in the first equation, we get
\[ x^2 - \left(\frac{b}{2x}\right)^2 = a, \]
so (clearing the denominators),
\[ 4x^4 - 4ax^2 - b^2 = 0. \]
Putting $u = x^2$, we have a quadratic equation for u, namely $4u^2 - 4au - b^2 = 0$,
which has the solution
\[ u = \frac{a \pm \sqrt{a^2 + b^2}}{2}. \]
Now $a^2 + b^2 > 0$, so $\sqrt{a^2 + b^2}$ exists as a real number. Moreover, $a^2 + b^2 > a^2$, so
$\sqrt{a^2 + b^2} > |a|$; so, if we choose the positive sign to describe u, that is,
\[ u = \frac{a + \sqrt{a^2 + b^2}}{2}, \]
then u > 0, and so we can find $x = \sqrt{u}$. Knowing x, we can find y = b/2x, which
is also OK since $x \neq 0$.
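The second method translates almost line by line into a program which uses only real square roots. The sketch below follows it, and then feeds the result into the quadratic formula. The function names complex_sqrt and solve_quadratic are mine, the arithmetic is floating-point (so the answers are approximate), and I have made no attempt to guard against loss of accuracy when b is tiny compared with a.

```python
import math

def complex_sqrt(z):
    """A square root of z = a + bi, using only real square roots."""
    a, b = z.real, z.imag
    if b == 0:
        if a >= 0:
            return complex(math.sqrt(a), 0.0)
        return complex(0.0, math.sqrt(-a))   # (sqrt(c)) i where a = -c
    u = (a + math.hypot(a, b)) / 2           # u = (a + sqrt(a^2 + b^2)) / 2 > 0
    x = math.sqrt(u)
    y = b / (2 * x)
    return complex(x, y)

def solve_quadratic(a, b, c):
    """Both solutions of a x^2 + b x + c = 0 over C, for a != 0."""
    if a == 0:
        raise ValueError("not a quadratic")
    root = complex_sqrt(complex(b * b - 4 * a * c))
    return (-b + root) / (2 * a), (-b - root) / (2 * a)

print(complex_sqrt(3 + 4j))      # (2+1j)
print(solve_quadratic(1, 1, 1))  # the two complex cube roots of 1
```

For example, with z = 3 + 4i we get u = (3 + 5)/2 = 4, so x = 2 and y = 4/4 = 1, and indeed $(2 + i)^2 = 3 + 4i$.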
This is a special property of the complex numbers. Look at this account of the
quadratic formula on Theorem of the Day. The problem is to make a frame for
Euler’s formula, which we will discuss next, to display in a gallery of beautiful
mathematics. You will see that quadratic equations over the real numbers (even
those arising in practice) don’t always have solutions.
8.2.2 $e^{i\pi} = -1$
The formula $e^{i\pi} = -1$, due to Euler, is one of the most famous in all of mathematics. It connects the negative unit −1, the imaginary unit i, and the two most
famous mathematical constants π and e.
So of course you would like to see a proof.
Well, I can’t prove it. After all, you have never had a definition of raising a
real number to an imaginary power; and without that, we can’t even start!
In fact the formula follows from a more general formula,
\[ e^{ix} = \cos x + i\sin x. \]
Substitute x = π, and observe that cos π = −1 and sin π = 0.
This formula can be justified on the basis of consistency. But this argument
can be very powerful.
First, it has sensible conclusions. Just before De Moivre’s Theorem in the
notes, we did a calculation to show that
\[ (\cos x_1 + i\sin x_1)(\cos x_2 + i\sin x_2) = \cos(x_1 + x_2) + i\sin(x_1 + x_2). \]
On the other hand, the laws of exponentiation would say that
\[ e^{ix_1} \cdot e^{ix_2} = e^{i(x_1 + x_2)}. \]
(Of course, we don’t know that the laws of exponentiation hold when the exponent
is a complex number; but we would like this to happen.) So at least the proposed
formula doesn’t disagree with what we know.
The second argument is stronger, in my view. One of the most important
things about complex numbers (which we won’t discuss in this module, but you
will meet if you study Complex Variables) is that we can extend ordinary calculus
to them. The rewards are very rich. Quantum mechanics (the theory which underlies all our consumer electronics, among many other things) rests on the fact
that we can do calculus over the complex numbers, and the usual rules apply. The
most famous equation in quantum mechanics is Schrödinger’s equation, which
mixes derivatives with complex numbers. Among many other applications is fluid
mechanics, where the calculus of complex numbers helps us study the flow of air
over aeroplane wings.
Enough of the commercial: how will calculus over the complex numbers help
us?
It turns out that, to make calculus work properly, we have to assume that the
power series we use to express functions over the real numbers continue to apply
over the complex numbers.
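Three very famous power series, which you definitely need to know, are
\[ e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots, \]
\[ \cos x = \sum_{m=0}^{\infty} (-1)^m \frac{x^{2m}}{(2m)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots, \]
\[ \sin x = \sum_{m=0}^{\infty} (-1)^m \frac{x^{2m+1}}{(2m+1)!} = x - \frac{x^3}{3!} + \cdots. \]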
So our assumption will be that these series continue to hold true when we let
x take a complex value. In fact, we can regard this assumption as a definition of
what $e^x$, cos x and sin x mean when x is a complex number.
Now we are going to calculate $e^{ix}$, by substituting ix for x in the series. The
powers $i^n$ cycle through the values 1, i, −1, −i as n runs through the indices
0, 1, 2, 3, . . .. So the even-numbered terms will be real and the odd-numbered terms
imaginary, and we can separate the real and imaginary parts:
\[ \begin{aligned} e^{ix} &= \left(1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots\right) + \left(x - \frac{x^3}{3!} + \cdots\right) i \\ &= \cos x + i\sin x, \end{aligned} \]
as required.
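You can watch this happen numerically by adding up the first few terms of the exponential series at a complex value. The sketch below does so; the function name exp_series and the choice of 30 terms are mine, and a partial sum only approximates the full series.

```python
import math

def exp_series(z, terms=30):
    """Partial sum 1 + z + z^2/2! + ... of the exponential series."""
    total = 0j
    term = 1 + 0j                  # the n = 0 term, z^0 / 0!
    for n in range(terms):
        total += term
        term = term * z / (n + 1)  # turn z^n / n! into z^(n+1) / (n+1)!
    return total

# Substituting z = i*pi should give a number very close to -1.
print(exp_series(1j * math.pi))

# Compare with cos x + i sin x at x = 0.5.
x = 0.5
print(abs(exp_series(1j * x) - complex(math.cos(x), math.sin(x))))
```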
8.2.3
Solving the cubic
Until the nineteenth century, the job of algebra was largely to solve equations.
The story contains some of the most exciting and romantic episodes in all of
mathematics. The method for solving quadratics has been known since ancient
times. The next advance didn’t come until the sixteenth century, when Niccolò
Fontana, known as Tartaglia (“the stammerer”), discovered a method for solving
cubic equations (equations of the third degree, of the form $ax^3 + bx^2 + cx + d = 0$).
Gerolamo Cardano persuaded Tartaglia to show him the method. According
to Tartaglia, Cardano promised not to publish it. (It was a valuable secret, since in
those days mathematicians had competitions, with large bets on the side, to solve
various equations.) When Cardano published the solution in his book Ars Magna
(“The Great Art”), a bitter dispute arose. Cardano claimed that he had seen the
result in an earlier manuscript by Scipione del Ferro, which absolved him of his
promise.
Soon afterwards, yet another Italian mathematician, Lodovico Ferrari, found
a way to solve a quartic (fourth-degree) equation. There matters stopped for quite
some time.
Before discussing what happened next, you are probably thinking “But now
we know the Fundamental Theorem of Algebra; we can solve an equation of any
degree.” Well, not quite; the theorem tells us that there is a solution (a complex
number which satisfies the equation), but doesn’t tell us how to find it.
If you look at the famous formula for solving the quadratic equation $ax^2 + bx + c = 0$, namely,
\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}, \]
you will see that it involves the arithmetic operations of addition, subtraction,
multiplication, and division, as well as taking a square root. The question arises:
for an arbitrary equation, can we find a formula for the solution which only involves the arithmetic operations and taking nth roots for (maybe) various values of
n? (The formulae of Tartaglia and Ferrari are of this form.)
It turns out that the answer is “no”, even for equations of degree 5; so the
method had been pushed as far as it could go. This was proved by the Norwegian
mathematician Niels Henrik Abel. Shortly afterwards, the French mathematician
Évariste Galois (who was killed in a duel at the age of just 20) developed a general theory which included a test for which equations could be solved by the old
method, and is regarded by many people as the foundation of modern algebra.
My goal here is much more modest than explaining all of this; I simply want
to show you how to solve a cubic equation.
Theorem 8.2.3 Any cubic equation $ax^3 + bx^2 + cx + d = 0$ (where a, b, c, d ∈ C
and $a \neq 0$) has a solution in C which can be found using arithmetic operations
and extracting square roots and cube roots.
Remark We assume that $a \neq 0$ since, if a = 0, the equation would not be a
cubic, but quadratic (or linear), and we know how to deal with these.
Proof We have to solve the equation $ax^3 + bx^2 + cx + d = 0$. We proceed in
several steps.
Step 1: We can assume that a = 1. For we can divide through by a without
changing anything, to reach the situation where the coefficient of $x^3$ is equal to 1.
Step 2: We can assume that b = 0. This is by a process called “completing
the cube”, which is very similar to completing the square for a quadratic. We have
\[ (x + \tfrac{1}{3}b)^3 = x^3 + bx^2 + \tfrac{1}{3}b^2 x + \tfrac{1}{27}b^3, \]
so that
\[ x^3 + bx^2 + cx + d = (x + \tfrac{1}{3}b)^3 + (c - \tfrac{1}{3}b^2)(x + \tfrac{1}{3}b) + (d - \tfrac{1}{3}bc + \tfrac{2}{27}b^3). \]
If we put $y = x + \tfrac{1}{3}b$, the equation becomes
\[ y^3 + c'y + d' = 0, \]
where $c' = c - \tfrac{1}{3}b^2$ and $d' = d - \tfrac{1}{3}bc + \tfrac{2}{27}b^3$. If we can solve this equation for y,
subtracting $\tfrac{1}{3}b$ gives the solution of the original equation in x.
Step 3: How to solve $x^3 + cx + d = 0$. This requires a trick.
We let $\omega = \cos(2\pi/3) + i\sin(2\pi/3) = (-1 + \sqrt{-3})/2$. Then we see that
$\omega^2 = (-1 - \sqrt{-3})/2$, and $\omega^3 = 1$. Note that ω and $\omega^2$ can be expressed in terms
of arithmetic operations and square roots.
Now you can check by multiplying it out that
\[ (x + y + z)(x + \omega y + \omega^2 z)(x + \omega^2 y + \omega z) = x^3 + y^3 + z^3 - 3xyz. \]
This remarkable equation says that, if we could find numbers y and z for which
\[ d = y^3 + z^3, \qquad c = -3yz, \]
then our equation $x^3 + cx + d = 0$ would become $x^3 - (3yz)x + (y^3 + z^3) = 0$, and
the three solutions would be given by
\[ x = -y - z, \qquad x = -\omega y - \omega^2 z, \qquad x = -\omega^2 y - \omega z. \tag{8.1} \]
Now let $u = y^3$ and $v = z^3$. We have
\[ u + v = d, \qquad uv = (yz)^3 = -\tfrac{1}{27}c^3. \]
If I have two unknown numbers u and v whose sum is s and whose product is
p, then I can write down a quadratic equation which has u and v as its solution,
namely $x^2 - sx + p = 0$. (For expanding brackets shows that
\[ (x - u)(x - v) = x^2 - (u + v)x + uv = x^2 - sx + p, \]
and so the solutions really are u and v.) So in our case we can find $u = y^3$ and
$v = z^3$ by solving a quadratic equation. Take the cube root of u to find y, then put
$z = -c/(3y)$; and then (8.1) gives us the solutions to our original equation.
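The three steps of the proof can be strung together into a short program. What follows is a sketch under my own assumptions: the function names are invented, the arithmetic is floating-point so the roots are approximate, and I have added a small case distinction so that we never divide by y = 0 (which the text does not discuss, and which can only happen when c = 0).

```python
import cmath
import math

OMEGA = complex(-0.5, math.sqrt(3) / 2)   # cos(2 pi/3) + i sin(2 pi/3)

def solve_depressed_cubic(c, d):
    """The three roots of x^3 + c x + d = 0 (Step 3)."""
    # u and v are the solutions of t^2 - d t - c^3/27 = 0.
    disc = cmath.sqrt(d * d + 4 * c ** 3 / 27)
    u = (d + disc) / 2
    if u == 0:
        u = (d - disc) / 2        # try the other solution of the quadratic
    if u == 0:
        return [0j, 0j, 0j]       # c = d = 0, so the equation is x^3 = 0
    y = u ** (1 / 3)              # any one cube root of u will do
    z = -c / (3 * y)
    # Equation (8.1)
    return [-y - z,
            -OMEGA * y - OMEGA ** 2 * z,
            -OMEGA ** 2 * y - OMEGA * z]

def solve_cubic(a, b, c, d):
    """The three roots of a x^3 + b x^2 + c x + d = 0, for a != 0."""
    b, c, d = b / a, c / a, d / a                    # Step 1
    c1 = c - b * b / 3                               # Step 2: c'
    d1 = d - b * c / 3 + 2 * b ** 3 / 27             # Step 2: d'
    return [y - b / 3 for y in solve_depressed_cubic(c1, d1)]

# x^3 - 7x + 6, whose roots 1 and 2 we met earlier (the third is -3).
print(solve_depressed_cubic(-7, 6))
```

Notice that for $x^3 - 7x + 6$ the quadratic for u and v has no real solutions, even though all three roots of the cubic are real. This is exactly the situation mentioned at the start of the chapter, where square roots of negative numbers are needed on the way to perfectly ordinary real answers.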