I will try to organize the work of this semester around several classical questions.
The first is, When is a prime p the sum of two squares? The question was raised
by Fermat who gave the correct answer in 1640 and claimed he had a proof but
never wrote it down; the first published proof, in 1747, is due to Euler. There is
a beautiful article in Wikipedia which you can reach by Googling “sum of two
squares”; it is well worth reading. Today there is a very short exceedingly clever
proof which requires practically no preparation but which also gives you no idea
of why the theorem should be true. Our approach here, however, will not be
to get to the result as speedily as possible but to bring in the many modern
algebraic ideas that lead to a conceptual understanding of why Fermat’s original
assertion is correct.
1 First Theme: Sums of Squares
When is an integer a sum of two squares, m = a^2 + b^2? First observation: if
m, n are both sums of two squares then so is mn. For going over to complex
numbers, m = a^2 + b^2 can be written as m = (a + bi)(a − bi). Similarly, if
n = c^2 + d^2 then we can write n = (c + di)(c − di). It follows that mn =
(a + bi)(c + di)(a − bi)(c − di) = [(ac − bd) + (ad + bc)i][(ac − bd) − (ad + bc)i], so
we must have mn = (ac − bd)^2 + (ad + bc)^2. An easy calculation shows that this
is correct. This suggests that we look first at the question of when a prime p
is a sum of two squares. Knowing which primes are sums of two squares won’t
completely answer the question, however. Certainly those numbers which can
be written as products of primes each of which is a sum of two squares will,
from what we have just seen, indeed be sums of two squares. It is still possible
that some other numbers are, too. In fact, any number which is itself a square
is trivially the sum of two squares, itself and zero. We will see that this really
exhausts the possibilities: to be a sum of two squares a number must be a
product of squares and of primes which are themselves a sum of two squares.
So which primes qualify?
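Before hunting for a pattern, it is reassuring to check the product identity by machine. Here is a quick Python sanity check (a throwaway sketch, nothing more):

```python
# Check the two-squares identity
#   (a^2 + b^2)(c^2 + d^2) = (ac - bd)^2 + (ad + bc)^2,
# which is just |zw|^2 = |z|^2 |w|^2 for z = a + bi, w = c + di.
from itertools import product

for a, b, c, d in product(range(-5, 6), repeat=4):
    m, n = a * a + b * b, c * c + d * d
    assert m * n == (a * c - b * d) ** 2 + (a * d + b * c) ** 2
print("identity holds on all test cases")
```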
Let’s try out some small primes: 2 = 1^2 + 1^2, 3 fails, 5 = 1^2 + 2^2, 7 fails,
11 fails, 13 = 2^2 + 3^2, 17 = 1^2 + 4^2, 19 fails. So far, except for 2, those primes p
which are congruent to 1 modulo 4, i.e., which leave a remainder of 1 when divided
by 4, are sums of two squares. This is written p ≡ 1 mod 4 or simply p ≡ 1 (4).
Those p with p ≡ 3(4) fail. And a number like 3 × 5 = 15 which has as a factor
a prime that fails, namely 3, (but not 32 ), is not a sum of two squares. In fact
we have the following
Theorem 1 1. A prime p is a sum of two squares if and only if either p = 2
or p ≡ 1 (4). 2. An integer m is a sum of two squares if and only if in the prime
factorization of m those primes p ≡ 3 (4) appear to an even power. □
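Both parts of the theorem are easy to test numerically before we prove anything. The following Python sketch brute-forces both sides of part 2 for all m up to 2000 (the function names are ad hoc, chosen just for this check):

```python
def is_sum_of_two_squares(m):
    """Brute force: is m = a^2 + b^2 for some integers a, b >= 0?"""
    a = 0
    while a * a * 2 <= m:          # may assume a <= b
        b = int((m - a * a) ** 0.5)
        if a * a + b * b == m or a * a + (b + 1) ** 2 == m:
            return True
        a += 1
    return False

def passes_criterion(m):
    """Does every prime p = 3 (mod 4) divide m to an even power?"""
    p = 2
    while p * p <= m:
        e = 0
        while m % p == 0:
            m //= p
            e += 1
        if p % 4 == 3 and e % 2:
            return False
        p += 1
    return m % 4 != 3              # leftover factor is 1 or a prime

for m in range(1, 2000):
    assert is_sum_of_two_squares(m) == passes_criterion(m)
print("Theorem 1, part 2, confirmed for m < 2000")
```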
There are classical computational proofs of this which do not involve any
higher algebra (cf. the Wikipedia article). The first few weeks of our course will
be devoted to developing enough modern algebra so that we can understand a
conceptual proof. Part of this will be understanding when prime factorizations
exist and are unique; it is implicit in the statement of the theorem that this is
the case for the ordinary integers.
Passing to the complex numbers has been useful but before going further,
let’s introduce some terminology and some notation. You have probably already
been introduced to the concept of a group G, but let me review it very briefly:
G consists of a set of elements together with a multiplication map G × G → G
which is associative; moreover (i) there is a unit element, denoted 1 or e (or 1G
if we want to emphasize that this is the unit element of the group G) with the
property that 1 · x = x · 1 = x for all x ∈ G, and (ii) for every x ∈ G there is an
inverse element x−1 such that x · x−1 = 1 = x−1 · x. This inverse is necessarily
unique (as is the unit element). We say that x and y commute if xy = yx. A
group in which all pairs of elements commute is called commutative or Abelian
(in honor of N. H. Abel). In that case, multiplication is frequently written as
addition and the ‘unit element’ is denoted by 0 and called the zero element; the
group is then usually called an additive group. The set of all permutations of an
arbitrary set S forms a group; it is non-Abelian whenever S has at least three
elements. The most important Abelian group is the additive group of integers,
here always denoted by Z. The integers, however, have more structure, there is
an associative multiplication. This leads to the following definition: a ring R is
an additive group with an associative multiplication that satisfies the distributive
laws, i.e., where x(y + z) = xy + xz and (y + z)x = yx + zx. We need both
of these statements since the multiplication need not be commutative. A good
example of such a ring is the set of all n × n matrices with real coefficients. We will
always denote the real numbers by R and this ring by Mn (R). (This notation
differs from that in the text but is more common.) Notice that Mn (R) has a
unit element for multiplication. We will generally assume without mentioning
it that the rings we deal with have a unit element but (unlike the text) not that
they are commutative.
A field is a commutative ring in which every non-zero element has a multiplicative inverse. The most important examples are the rational numbers
Q, the real numbers, R, and the complex numbers C. There do exist noncommutative rings in which every non-zero element has a multiplicative inverse.
These are called division rings, or skew fields, a term which the elder Artin
has contracted to sfield. The most important example is the quaternions,
generally denoted by H after their discoverer Hamilton. This algebra is a four-dimensional
vector space over R with basis elements 1, i, j, k and multiplication defined by
i^2 = j^2 = k^2 = −1, ij = k = −ji. It follows that jk = i = −kj, ki = j = −ik.
The general quaternion thus has the form q = a + bi + cj + dk, where a, b, c, d ∈ R.
To see that this is a division ring, define the conjugate q̄ = a − bi − cj − dk.
Then q · q̄ = q̄ · q = a^2 + b^2 + c^2 + d^2. Since a, b, c, d are real, this can’t vanish
unless all are zero, so we have q^{−1} = q̄/(q q̄). A ring which is a vector space over
a field is frequently called an algebra.
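The quaternion relations are easy to mis-state, so a small computational check is worthwhile. Here is a minimal Python model of H (an illustrative sketch; the class name Quat is ad hoc):

```python
class Quat:
    """A quaternion a + bi + cj + dk stored as the tuple (a, b, c, d)."""
    def __init__(self, a, b, c, d):
        self.v = (a, b, c, d)

    def __mul__(self, o):
        # Multiply out using i^2 = j^2 = k^2 = -1, ij = k = -ji, etc.
        a, b, c, d = self.v
        e, f, g, h = o.v
        return Quat(a*e - b*f - c*g - d*h,
                    a*f + b*e + c*h - d*g,
                    a*g - b*h + c*e + d*f,
                    a*h + b*g - c*f + d*e)

    def conj(self):
        a, b, c, d = self.v
        return Quat(a, -b, -c, -d)

i, j, k = Quat(0, 1, 0, 0), Quat(0, 0, 1, 0), Quat(0, 0, 0, 1)
assert (i * j).v == k.v and (j * i).v == (0, 0, 0, -1)   # ij = k = -ji
q = Quat(1, 2, 3, 4)
assert (q * q.conj()).v == (1 + 4 + 9 + 16, 0, 0, 0)     # q q̄ = a^2+b^2+c^2+d^2
```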
This brings us to the concept of a morphism (older name homomorphism).
Recall that a morphism between groups f : G → H is a mapping which preserves
the group multiplication, i.e., such that f (xy) = (f x)(f y). This implies that
f (1G ) = 1H and that f (x^{−1} ) = (f x)^{−1} .
Exercise 1 Prove the preceding assertions.
In the newer terminology, a morphism which is one-to-one, i.e., where x ≠ y
implies f x 6= f y, is called a monomorphism although I will frequently use
the older word as well. A morphism which is onto is now usually called an
epimorphism, but here, too, I will frequently use the older term. One which
is both is frequently called a bijection. It is easy to check that if we have
morphisms f : G → H and g : H → J, then the composite g ◦ f : G → J is again
a morphism. In elementary texts an isomorphism f : G → H is usually
defined to be a morphism which is one-to-one and onto, a bijection. This will
do for groups, rings, and all the structures you will encounter in this course, but
the ‘categorical’ definition is this: f is an isomorphism if there is a morphism
g : H → G such that g ◦ f is the identity map of G, denoted idG , and such that
f ◦ g = idH . The reason that the elementary definition works for groups and for
rings is that if f : G → H is a bijection then there is a set mapping g : H → G
sending every z ∈ H back to the unique x ∈ G with f x = z and this g is again
a morphism.
Exercise 2 Prove the preceding assertion.
The concept of a morphism extends to rings: a morphism f : R → S is a
mapping which preserves both the addition and the multiplication, i.e., such
that f (x + y) = f x + f y and f (xy) = (f x)(f y). While we certainly have
f (0) = 0 it need no longer be the case that f (1R ) = 1S . For example, let R
just be R and S = M2 (R). We can then define a morphism R → S by sending
every real number x to the 2 × 2 matrix with x in the upper left entry and
zeros elsewhere. This is certainly a morphism but
the image of the unit element 1 ∈ R is not the unit element of S. The image
of 1 is, however, an element whose square is itself; such an element is called
an idempotent. The general concept of a morphism is that it preserves all the
structure there is. Having a unit element is not part of the definition of a ring,
even though the rings we deal with here generally do have units. To be more
precise, we define a unital ring to be one with a unit element and a unital
morphism to be one preserving the unit element.
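The non-unital morphism R → M2 (R) above can be checked concretely. In the Python sketch below, 2 × 2 matrices are modeled as nested tuples (an ad hoc representation chosen just for this check):

```python
def f(x):
    """The morphism sending x to the matrix with x upper left, 0 elsewhere."""
    return ((x, 0.0), (0.0, 0.0))

def mat_add(A, B):
    return tuple(tuple(a + b for a, b in zip(ra, rb)) for ra, rb in zip(A, B))

def mat_mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

x, y = 3.0, 5.0
assert f(x + y) == mat_add(f(x), f(y))     # preserves addition
assert f(x * y) == mat_mul(f(x), f(y))     # preserves multiplication
e = f(1.0)
assert mat_mul(e, e) == e                  # f(1) is an idempotent...
assert e != ((1.0, 0.0), (0.0, 1.0))       # ...but not the unit of M2(R)
```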
If you want a simple example where a morphism which is a bijection is not
necessarily an isomorphism, consider partially ordered sets or posets, that is,
sets in which for some but not necessarily all pairs of elements x, y there is a
relation x ≺ y. This has to satisfy the axioms that x ≺ y and y ≺ z imply x ≺ z,
x ≺ x, and x ≺ y together with y ≺ x imply x = y. A morphism f : S → T
of partially ordered sets is a set map such that x ≺ y implies f x ≺ f y. Now
suppose that S carries a non-trivial partial order and that |S| is the same set
with the partial order wiped out, i.e., in which there is no pair x, y with x ≺ y
unless x = y. Now consider the identity map |S| → S carrying every element to
itself. This is trivially a morphism of partially ordered sets (since there are no
conditions to satisfy) and is trivially a bijection, but the inverse map S → |S|
is not a morphism of partially ordered sets.
The kernel K of a group morphism f : G → H is the set of all x ∈ G
such that f x = 1. It is a normal subgroup of G, i.e., a subgroup such that
xKx^{−1} = K, where xKx^{−1} is the set of all elements xkx^{−1} , k ∈ K. The kernel
of a ring morphism f : R → S is similarly the set I of all x ∈ R such that
f x = 0. This subset is an ideal of R, that is, (i) I is an additive subgroup
of R (ii) with the additional property that if x ∈ R and z ∈ I then xz and
zx are both in I. When we consider only the additive group structure of R
(forget the multiplication for the moment) we generally write R+ . Here is a
simple but fundamental example. Suppose that R is commutative and pick an
element a ∈ R. Then the set of all multiples of a, i.e. elements of the form ax
with x ∈ R, forms an ideal. Such an ideal is called principal ; it is the principal
ideal generated by a, written aR or sometimes simply as (a) when the ring is
understood. In particular, when R = Z if we pick an integer say 6, then the set
of all multiples of 6 forms an ideal. As we will see, in Z every ideal is principal.
Rings with this special property are generally called principal ideal rings. In
any ring, the entire ring and the subring reduced to the zero element alone are
ideals but we generally don’t count these; all other ideals are called proper. It
can happen that a ring has no proper ideals, in which case it is called simple.
The rings Mn (R) are simple.
Exercise 3 (i) Prove this. (ii) Prove that if I is an ideal of a ring R then
Mn (I) is an ideal of Mn (R) and (iii) that every ideal of Mn (R) has this form.
Recall that if we have a normal subgroup K of a group G then we can form
the quotient group G/K (the set of cosets xK). Every normal subgroup K
is actually the kernel of a group morphism, namely of the canonical morphism
G → G/K. Moreover, if we have a group morphism f : G → H and if the kernel
of f contains K then we can define a new morphism f¯ : G/K → H as follows:
An element of G/K is a coset xK; set f¯(xK) = f (x). The “representative” x
of the coset xK is not unique; it is just one of the elements of xK. However,
if we have one representative x (that is, if we have in hand one element x of
xK) then any other representative y must be a multiple xk of x. It follows that
f (y) = f (x)f (k) = f (x), so the map f¯ is well defined, and it is easy to see that
it is a group morphism. With this, f can be factored: it can be written as the
composite of the canonical morphism can : G → G/K with f̄ : G/K → H. The important thing now is
that we can do the same for rings. Suppose that I is an ideal of R. Since I
is an additive subgroup of R+ its cosets are written in the form x + I and we
define R/I to be the set of these cosets. This is again a ring with addition and
multiplication defined by (x+I)+(y +I) = (x+y)+I; (x+I)(y +I) = xy +I. In
our simple example of the multiples of 6 in Z, the ring Z/(6) (sometimes written
even more simply as Z/6) has exactly six elements, 0̄, 1̄, 2̄, . . . , 5̄, the cosets of 0
through 5, respectively. Addition and multiplication in Z/6 are easy but there
are some peculiarities: 2̄ · 3̄ = 0̄ so here we have a pair of elements, neither of
which is the zero element of the ring, but their product is zero; such elements
are called zero divisors. Also, (3̄)^2 = 3̄ · 3̄ = 3̄. This is another example of an
idempotent.
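A machine can enumerate these peculiarities of Z/6 directly; the following Python fragment (an illustrative sketch) finds all zero divisors and all idempotents:

```python
n = 6
elements = range(n)
zero_divisors = sorted({a for a in elements for b in elements
                        if a % n and b % n and (a * b) % n == 0})
idempotents = [a for a in elements if (a * a) % n == a]
print(zero_divisors)   # [2, 3, 4]
print(idempotents)     # [0, 1, 3, 4] -- note that 4̄ is idempotent as well
```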
The reason that Z/6 has zero divisors is that 6 is composite; 6 = 2 · 3. It
is a basic theorem (a special case of a deeper one) that if p is a prime number
then Z/p is a field. It is easy to check, for example, that Z/7 is a field: 2̄ · 4̄ =
3̄ · 5̄ = 6̄ · 6̄ = 1̄ (and, of course 1̄ is its own inverse), showing that every non-zero
element of Z/7 has a multiplicative inverse. It follows that the non-zero elements
form a group under multiplication. This group is cyclic; you can check that the
powers of 3̄ give all the non-zero elements of Z/7. This is not an accident. We
will prove the following
Theorem 2 Any finite subgroup of the multiplicative group of a
field is cyclic. □
Exercise 4 Give examples to show that this need not hold for (i) a skew field
or (ii) an infinite group.
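For fields of the form Z/p, at least, the theorem is easy to check by machine: the sketch below verifies that 3̄ generates the multiplicative group of Z/7 and hunts for a generator modulo a few other small primes (brute force, nothing clever):

```python
def generator_mod(p):
    """Smallest g whose powers exhaust all non-zero residues mod the prime p."""
    target = set(range(1, p))
    for g in range(2, p):
        if {pow(g, k, p) for k in range(1, p)} == target:
            return g

assert {pow(3, k, 7) for k in range(1, 7)} == {1, 2, 3, 4, 5, 6}
print({p: generator_mod(p) for p in (3, 5, 7, 11, 13)})
# {3: 2, 5: 2, 7: 3, 11: 2, 13: 2}
```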
To prove this we shall need some information about the structure of Abelian
groups. A group G (Abelian or not) is finitely generated if there is some finite
subset of elements g1 , . . . , gn such that every element of G can be written as
a product of these elements and their reciprocals. A group G which can be
generated by a single element g is cyclic. If G is finite, say #G = n, then
G = {1, g, g^2 , . . . , g^{n−1} }, where g^n = 1. In this case G is isomorphic (as a
group) to Z/n, the isomorphism being given by g^m → m̄. If G is infinite then
G = {. . . , g^{−2} , g^{−1} , 1, g, g^2 , . . . }, where no power of g other than the zeroth
is equal to the unit element. In this case G ≅ Z, the isomorphism being
given by g^m → m. Recall that the direct product or simply the product of
two groups, G × H consists of all ordered pairs (g, h), g ∈ G, h ∈ H with group
operation (g, h)(g′ , h′ ) = (gg′ , hh′ ). This construction extends in an obvious way
to a product of any finite number of groups. With this we have the following
fundamental theorem on finitely generated Abelian groups (a special case of a
slightly more general theorem which we will prove)
Theorem 3 Every finitely generated Abelian group is isomorphic to a unique
group of the form Z/d1 × Z/d2 × · · · × Z/dr × Z × · · · × Z where d1 |d2 | · · · |dr . □
The last condition means that d1 divides d2 , d2 divides d3 , etc. If the group
is finite, then the Z factors don’t appear. There are generally other ways to
decompose an Abelian group, but this is the shortest (least number of factors).
The di are called the principal divisors of the group. Note that if the group is
finite then the largest one, here denoted dr is the exponent of the group, i.e.,
the smallest integer e such that g^e = 1 for every g ∈ G. Notice that if a finite
group is not cyclic, i.e., if r ≥ 2, then writing dr = e, the number of elements
of the group satisfying the equation x^e = 1 is greater than e. In a field the
number of solutions to a polynomial equation can not exceed the degree of the
equation. This is why a finite subgroup of the multiplicative group of a field
must be cyclic. Sometimes, however, it is quite difficult to find a generator. For
example, we will see that if p is a prime then Z/p is a field, usually denoted
Fp . Since this field is finite and has exactly p elements, its multiplicative group
is of order p − 1 and cyclic, but the difficulty of finding a generator when p is
a large prime has sometimes been used in coding schemes. (The existence of
finite fields was discovered by E. Galois.)
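As a small illustration of the normal form, Z/2 × Z/3 looks like a product of two factors, but it is in fact cyclic of order 6 (generated by (1, 1)), so its canonical form in Theorem 3 is the single factor Z/6, with r = 1 and d1 = 6. A quick Python check:

```python
# Walk through the cyclic subgroup generated by (1, 1) in Z/2 x Z/3.
pairs = set()
g = (1, 1)
x = (0, 0)
for _ in range(6):
    pairs.add(x)
    x = ((x[0] + g[0]) % 2, (x[1] + g[1]) % 3)
assert len(pairs) == 6   # (1, 1) has order 6: the product group is cyclic
```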
The concept of direct product can also be extended to rings R and S except
that there it is frequently called the direct sum and denoted R ⊕ S. It is the
set of ordered pairs (r, s) with r ∈ R, s ∈ S with addition and multiplication
defined by (r, s) + (r0 , s0 ) = (r + r0 , s + s0 ), (r, s)(r0 , s0 ) = (rr0 , ss0 ).
Let’s get back (in a sophisticated way) to our problem of determining which
primes p (and more generally, which integers) are sums of two squares, but
first, one elementary observation: Any prime p ≡ 3(4) can not be a sum of two
squares. For observe that if n is an even integer then n^2 ≡ 0 (4), while if n is
odd, say n = 2m + 1, then n^2 = 4m^2 + 4m + 1 ≡ 1 (4). Therefore, a sum of two
squares can only be congruent to 0, 1, or 2 mod 4, hence never to 3.
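This congruence argument takes one line to confirm numerically (a trivial Python check):

```python
# Squares mod 4 are only 0 or 1, so a^2 + b^2 mod 4 is never 3.
assert {n * n % 4 for n in range(100)} == {0, 1}
assert {(a * a + b * b) % 4 for a in range(50) for b in range(50)} == {0, 1, 2}
```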
For deeper results we will look at a generalization of the concept of integer.
A special case is the set of all complex numbers of the form a + bi where a
and b are integers. These form a ring, the ring of Gaussian integers, denoted
Z[i]. Ordinary primes (henceforth called rational primes because they are the
primes in the field Q of rational numbers) may factor in Z[i]. For example,
2 = −i(1 + i)^2 , 5 = (1 + 2i)(1 − 2i). If p is a sum of two squares, p = a^2 + b^2 , a, b ∈
Z, then p = (a + bi)(a − bi) in Z[i], so being a sum of two squares implies
factorization in Z[i]. The converse is also true.
Lemma 1 A rational prime p factors in Z[i] if and only if it is a sum of two
squares, in which case it has exactly two non-trivial factors, i.e., factors other
than ±1, ±i.
Proof. Suppose that p is a prime which factors in a non-trivial way in Z[i]
with p = αβ. Taking conjugates we also have p = ᾱβ̄. Multiplying gives
p^2 = |α|^2 |β|^2 , a factorization of p^2 in Z, but the only possibility for this is
that |α|^2 = |β|^2 = p. It follows that neither α nor β can factor further since
this would give a factorization of p in Z. Since both have the same absolute
value and their product is real, one must be the conjugate of the other. So if
α = a + bi, a, b ∈ Z then β = a − bi and p = a^2 + b^2 . □
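In view of the lemma, factoring p in Z[i] amounts to finding a, b with p = a^2 + b^2, which for small p can be done by brute force (a naive Python sketch; the function name is ad hoc):

```python
def two_square_factorization(p):
    """Return (a, b) with p = a^2 + b^2, i.e. p = (a+bi)(a-bi), or None."""
    for a in range(1, int(p ** 0.5) + 1):
        b = int((p - a * a) ** 0.5)
        if a * a + b * b == p:
            return (a, b)
    return None

print({p: two_square_factorization(p) for p in (2, 3, 5, 7, 13, 17, 19)})
# {2: (1, 1), 3: None, 5: (1, 2), 7: None, 13: (2, 3), 17: (1, 4), 19: None}
```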
A complex number α which is a root of some polynomial f (x) = c_n x^n +
c_{n−1} x^{n−1} + · · · + c_1 x + c_0 with integer coefficients is called an algebraic number.
We could make this equation monic, i.e., have leading coefficient equal to 1
by dividing by c_n , but then the other coefficients would generally be rational
numbers and not integers. If α is a root of a monic polynomial with integer
coefficients then it is called an algebraic integer. If α is an algebraic number
then there is a unique monic polynomial f (x) of minimum degree which it
satisfies; that polynomial is called the minimum polynomial for α and its degree
is called the degree of α. Before considering the general case, consider the special
case where α = √d, where d is some square-free integer (i.e., not divisible by
the square of any integer other than 1) but which may be negative, a most
important case being d = −1. This is obviously algebraic and even an algebraic
integer since it satisfies the equation x^2 − d = 0. In fact, all complex numbers of
the form α = a + b√d with a, b ∈ Q are algebraic since α satisfies the equation
x^2 − 2ax + (a^2 − b^2 d) = 0. The set of all these numbers is denoted Q(√d)
and we claim that they form a field: it is clear that the sums, differences, and
products of two numbers of the form a + b√d are again numbers of the same
form. To see that the inverse is also, observe first that since d is not a square
the rational number a^2 − b^2 d can not vanish whenever either a or b is not 0.
From x^2 − 2ax + (a^2 − b^2 d) = 0 we have x(x − 2a) = −(a^2 − b^2 d), whence
x^{−1} = −(x − 2a)/(a^2 − b^2 d), so Q(√d) is indeed a field. It is also a vector space
over Q with basis {1, √d}, so it has dimension equal to 2 as a Q-space. Such
a field is usually called a quadratic field. We will see that in fact the set of all
algebraic numbers forms a field and that the set of all algebraic integers forms
a subring of this field. When is a number a + b√d, a, b ∈ Q, an (algebraic)
integer? Looking at the equation it satisfies, it is sufficient (and we will later
show, necessary) that 2a and a^2 − db^2 be integers. If a is an ordinary or rational
integer then db^2 must also be an integer, and since d is square-free it follows
that b is an integer, too. The only other possibility is that a be a “half-integer”,
i.e., of the form m + 1/2, m ∈ Z, in which case a^2 = m^2 + m + 1/4. But then
a^2 − db^2 can only be an integer if b is also a half-integer and d ≡ 1 (4). So in this
case the integers consist not just of all elements of the form m + n√d, m, n ∈ Z,
but also of the elements (m + 1/2) + (n + 1/2)√d.
Exercise 5 Prove that the integers of Q(√d) do form a ring.
Since −1 ≢ 1 (4) we see finally that the ring of Gaussian integers is precisely the
ring of algebraic integers inside the field Q(i).
We are now close to understanding why a prime p ≡ 1(4) must be a sum
of two squares. Observe that if p = a^2 + b^2 = (a + bi)(a − bi), a, b ∈ Z then
the rational prime p has factored inside the ring of Gaussian integers Z[i]. This
raises the question of whether Z[i] behaves like Z in that it has “primes” which
can not be factored and where every element can be factored “uniquely” into a
product of primes. Here is the reason for the quotes. Even in Z factorization
is not strictly unique because we could introduce factors of −1 so we make the
following definition: In a commutative ring R (with unit element) an element
u which has an inverse is called a unit. (This may not be the best terminology,
but it is the historical one.) The units form a group under multiplication,
usually denoted R× . In Z the group of units consists only of {+1, −1}; in Z[i]
it is {±1, ±i}. Elements x, y ∈ R with y = ux where u is a unit are called
associates; this is obviously an equivalence relation. In a commutative ring it
is meaningful to say that y divides x if x = yz for some z but it generally
does not follow that if x and y divide each other that they are associates, for
there may be zero divisors. A commutative ring R in which there are no zero
divisors is called an integral domain or simply a domain. (The older name was
“domain of integrity”.) In a domain, if x and y divide each other, x = yz and
y = xw, then we have x = xwz or x(1 − wz) = 0, so 1 − wz = 0. Thus w
and z are units and x and y are associates. An element which has no divisors
except itself and associates is called irreducible. When we speak of “unique
factorization” it means factorization into irreducibles which is unique up to the
order of the irreducible factors and multiplication by units (or replacement of
factors by associates). It is quite possible in a domain for an element to have
genuinely distinct factorizations into irreducibles. Consider, for example, the
ring of integers of Q(√−5); it consists of all m + n√−5 with m, n ∈ Z. In this
ring the elements 3, 7, 1 + 2√−5, 1 − 2√−5 are all irreducible.
Exercise 6 Prove this from first principles. (We will see more sophisticated
reasons later.)
Unfortunately unique factorization fails inside the ring of integers of Q(√−5),
for we have the two distinct factorizations into irreducible factors 21 = 3 · 7 =
(1 + 2√−5) · (1 − 2√−5). A domain in which we have unique factorization
into irreducibles is called a unique factorization domain, abbreviated UFD, or a
factorial domain. In these we sometimes call irreducible elements “primes”, but
bear in mind that with this definition +2 and −2 are both rational primes. It is
a basic theorem (and not too difficult) that Z[i] is a factorial domain. Obviously
in a factorial domain if an irreducible element divides a product then it must
divide one of the factors; in fact, this is a crucial property that one must prove to
show that a domain is factorial. If R is a factorial domain and π an irreducible
element of R then the quotient ring R/π is again a domain, and conversely. On
the other hand Z[√−5]/3 is not a domain since (1 + 2√−5) · (1 − 2√−5) = 3 · 7 ≡ 0
mod 3. We will prove that Z[i] is factorial, but for the moment let’s accept it.
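The norm N(m + n√−5) = m^2 + 5n^2 is multiplicative, and it makes the failure above easy to verify by machine. The sketch below is only a numerical check, not a proof (the norm argument for irreducibility is one of the "more sophisticated reasons" promised earlier):

```python
def norm(m, n):
    """Norm of m + n*sqrt(-5), i.e. |m + n*sqrt(-5)|^2 = m^2 + 5 n^2."""
    return m * m + 5 * n * n

# 21 = 3 * 7 = (1 + 2√-5)(1 - 2√-5): the norms confirm the second product.
assert norm(1, 2) == norm(1, -2) == 21

# No element of norm 3 or 7 exists, which is the key step in showing that
# 3 and 7 (of norms 9 and 49) are irreducible in Z[√-5].
small = {norm(m, n) for m in range(-10, 11) for n in range(-10, 11)}
assert 3 not in small and 7 not in small
```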
Lemma 2 A finite domain R is a field.
Proof. We must show that every non-zero element x ∈ R has an inverse.
Consider the set of all xy as y varies in R. No two of these can be identical, for
if xy = xy′ then x(y − y′ ) = 0, contradicting the assumption that R is a domain.
Since R is finite, the set of all xy with fixed x must be all of R, so there is a y
such that xy = 1. □
Exercise 7 We really don’t need the commutativity in the preceding Lemma or
even the existence of a unit element. Prove that if S is a finite set with an
associative multiplication with both the left and right cancelation property, i.e.,
xy = xy′ implies y = y′ and yx = y′ x also implies y = y′ , then S is in fact
a group (and in particular, there is a unit element for multiplication). (Hard)
Suppose only one of the two cancelation properties holds. Is S still a group?
(Give a proof or a counterexample, but don’t spend too much time on it.)
Finally (assuming some of the things we have not yet proven) we have the proof
of Theorem 1:
Proof. To prove assertion 1. we must show that if p ≡ 1(4) then p factors
in Z[i]. Suppose to the contrary that it remained irreducible. Since Z[i] is a
factorial domain it would follow that Z[i]/p is again a domain, with exactly p^2
elements, and being finite it would be a field. The equation x^4 = 1 could then
have no more than four roots in Z[i]/p. But Z[i]/p ⊃ Z/p and the latter is
a field with exactly p elements. Its multiplicative group is therefore a cyclic
group with p − 1 elements, and this is a multiple of 4, say p − 1 = 4m. If a is
any generator of this group then 1, a^m , a^{2m} and a^{3m} are four distinct elements
satisfying x^4 = 1. But (the classes of) i and −i are not amongst these and also
satisfy the equation, so there are too many roots. Therefore Z[i]/p can not be
a field, so p must factor. For 2., suppose that m is a sum of two squares, hence
of the form m = (a + bi)(a − bi) for some a, b ∈ Z and that a prime p ≡ 3(4)
divides m. Since p is still irreducible in Z[i] it must divide one of the two factors.
Suppose p^k ∥ (a + bi) (meaning that k is the precise power to which p divides
a + bi). Since p^k is real, taking conjugates we see that also p^k ∥ (a − bi), so p^{2k} ∥ m.
□
So one way to understand the “if” (hard) part of Fermat’s original assertion,
that an odd prime p is a sum of two squares if and only if p ≡ 1(4), is to say
that such a prime must factor in Z[i]. There are many details that must be
filled in; that will be our next project.
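One detail worth previewing: for p ≡ 1 (4), a square root of −1 modulo p always exists (take a^{(p−1)/4} for a generator a, exactly as in the proof above), and it is such a root that forces p to factor in Z[i]. A brute-force Python sketch (it only exhibits the root, not the descent to p = a^2 + b^2):

```python
def sqrt_minus_one(p):
    """For a prime p = 1 (mod 4), find r with r^2 = -1 (mod p)."""
    for a in range(2, p):
        r = pow(a, (p - 1) // 4, p)
        if (r * r) % p == p - 1:
            return r

for p in (5, 13, 17, 29):
    r = sqrt_minus_one(p)
    assert (r * r + 1) % p == 0   # p divides r^2 + 1 = (r + i)(r - i) in Z[i]
```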
Exercise 8 Show that if a, b are integers, not both zero, then Z[i]/(a + bi) always
has exactly a^2 + b^2 elements, and if a, b are relatively prime then there is an
isomorphism Z/(a^2 + b^2 ) → Z[i]/(a + bi) but not otherwise. What is the structure
of the additive group of Z[i]/(a + bi) when a and b are not relatively prime?
(Hint: You might want to do the relatively prime case first.)