Theory of Numbers (V63.0248)
Professor M. Hausner
Summary of lectures.
Monday, May 14.
Divisibility. The integers Z and the natural numbers N (the non-negative integers).
The well-ordered property of N: every nonempty subset of N has a least element.
Notation a|b and a ∤ b.
Elementary divisibility theorems:
a|b and b|c → a|c.
a|b and a|c → a|(xb + yc) for all integers x, y.
a|0, a|a, 1|a.
a|b and b|a → |a| = |b|.
The Division Algorithm. For a, b > 0, there are numbers q and r, called the quotient and
the remainder such that a = bq + r with 0 ≤ r < b. Proved using well ordering.
Definition of (a, b), the greatest common divisor (GCD) of a and b.
The Euclidean Algorithm to compute the GCD of two numbers. Illustration and sample
format: Find (27, 105):
Numerator   Denominator   Quotient   Remainder
105         27            3          24
27          24            1          3
24          3             8          0
So (27, 105) = 3, the last non-zero remainder. Using this algorithm, we showed
Theorem. The GCD of a and b is a linear combination of a and b: (a, b) = xa + yb. The
text shows that (a, b) is the least positive value of all linear combinations xa + yb when a
and b are not both 0.
An important consequence of this theorem, proved in class, is:
a|bc and (a, b) = 1 → a|c.
Note that the condition (a, b) = 1 states that a and b have no common divisors except the
obvious ±1. In this case, we say that a and b are relatively prime.
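The way the Euclidean algorithm yields the linear combination (a, b) = xa + yb can be sketched in Python. This is only an illustration, not the format used in class; the function name extended_gcd is an assumption made for this example, and it is run on the pair 105, 27 from the table above.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and g == x*a + y*b."""
    # Invariant: old_r == old_x*a + old_y*b  and  r == x*a + y*b.
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r                      # quotient of the current division step
        old_r, r = r, old_r - q * r         # remainder becomes the next denominator
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


g, x, y = extended_gcd(105, 27)
print(g, x, y)            # g is the last non-zero remainder
print(x * 105 + y * 27)   # equals g
```

Each pass of the loop is one row of the numerator/denominator table; the extra bookkeeping only records how each remainder is built from a and b.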
Primes.
Theorem. Every integer greater than 1 is a product of primes. We proved this using the
well ordered property of the natural numbers.
Theorem. There are infinitely many primes. We gave Euclid’s proof. Add 1 to the product
of all primes up through p and we either get a prime or a number divisible by a prime greater
than p.
***********************
Tues. May 15th
More on Divisibility.
Theorem. In any set of r consecutive numbers, one is divisible by r.
The proof is by induction on n. We show that one of n, n + 1, . . . , n + r − 1 is divisible by r.
It is true for n = 0 since r|0. Assuming it true for n = k, we prove it for n = k + 1 by
separating into 2 cases: Either r|k or r ∤ k. We leave out the details here.
An alternative proof, as in the text, that (a, b) is a linear combination of a and b. This
also works for the GCD of more than 2 numbers. This proof is an existence proof, but does
not yield an effective way of actually computing the linear combination. Let M be the
set of all linear combinations xa + yb. Assuming that a and b are not both 0, M contains
a positive element. Let D be the smallest positive element of M. Writing $D = ax_0 + by_0$,
we first show that D|a and D|b; that is, D is a common divisor of a and b. To show that
D|a, use the division algorithm to write a = Dq + r where 0 ≤ r < D. Solving for r using
$D = ax_0 + by_0$, we find $r = a - Dq = a(1 - qx_0) + b(-qy_0)$, so r is in M. Since r is less than D and
D is the least positive element of M, r can't be positive. Since 0 ≤ r, we must have r = 0, showing that D|a.
Similarly we can show D|b.
To show that D is the greatest common divisor of a and b, suppose c is a common divisor:
c|a and c|b. Then $c \mid (ax_0 + by_0)$, so c|D. This shows that c ≤ D and incidentally shows that D
is a multiple of every common divisor of a and b.
More on Primes. We first need the following
Theorem: If p is prime and p|ab, then p|a or p|b.
For the proof, suppose p does not divide a. The only common (positive) divisor of p and a
is 1. So (p, a) = 1, Since p|ab, we must have p|b. So either p|a or p|b.
Similarly, by induction we have: If $p \mid a_1 a_2 \cdots a_n$, then $p \mid a_i$ for some i.
Theorem: The Unique Factorization Theorem. Let n > 1. Then any two factorizations of n are the same, except for the order of the factors.
For a proof, suppose this is not true. Then let N be the least number for which N does not
have a unique factorization. Then we have
$N = p_1 \cdots p_r = q_1 \cdots q_s$
where the p's and q's are all primes and the two factorizations are different. Then $p_1 \mid N$ so $p_1 \mid q_1 \cdots q_s$.
So $p_1$ divides one of the q's. Rearranging the q's, we may suppose $p_1 \mid q_1$. Since the q's are primes,
we must have $p_1 = q_1$. Dividing the equation by $p_1$, we get
$N/p_1 = p_2 \cdots p_r = q_2 \cdots q_s$
Since N was the least number without a unique factorization, and $N/p_1 < N$, these latter
factorizations are the same except for order. So the original factorizations were the same, and
this is a contradiction proving the theorem.
Binomial Coefficients.
These are a double array of numbers $\binom{n}{r}$ defined for all integers r and n ≥ 0. We showed
that the following definitions are all equivalent.
I. Recursive Definition, as in the Pascal Triangle:
A: $\binom{0}{0} = 1$, $\binom{0}{r} = 0$ for $r \ne 0$.
B: $\binom{n}{r} = \binom{n-1}{r} + \binom{n-1}{r-1}$ for n > 0.
In the usual Pascal triangle, n is the row, and r is the column.
II. Formula Definition.
$\binom{n}{r} = \frac{n!}{r!(n-r)!}$ for 0 ≤ r ≤ n, else $\binom{n}{r} = 0$.
III. Binomial Theorem Formulation.
$(1 + x)^n = \sum_{r=0}^{n} \binom{n}{r} x^r$.
IV. Combinatoric Formulation.
$\binom{n}{r}$ is the number of subsets of size r of a set having n elements.
In lecture, we showed that II and IV were equivalent to I, since they satisfy the same
recursive equation as in I. We can also show III is equivalent to I, using the identity $(1+x)^n = (1+x)(1+x)^{n-1}$ and $(1+x)^0 = 1$.
Some consequences, easily proved from this.
1. $\binom{n}{r}$ is a non-negative integer, and $\binom{n}{r} > 0$ for 0 ≤ r ≤ n.
2. The product of r consecutive numbers is divisible by r!. In fact,
$\frac{n(n-1)\cdots(n-r+1)}{r!} = \binom{n}{r}$
3. $\binom{p}{r}$ is divisible by p if 0 < r < p.¹
Wed, May 16.
Table of Primes and the Sieve of Eratosthenes.
This is an ancient method useful for creating relatively small tables of primes. To form a
table of primes from 2 to n, start with 2 and eliminate all multiples of 2 from 2 times 2 on.
The next number not eliminated is 3. It is a prime, and then eliminate all multiples of 3
from 3 times 3 on. The next number not eliminated is 5. It is prime and then eliminate
all multiples of 5 from 5 times 5 on. Continue up to √n. At this point, all numbers not
eliminated from 2 to n will be primes. This method is based on the result: If n is composite,
it will have a divisor d satisfying 1 < d ≤ √n.
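The sieve as described above can be sketched in a few lines of Python. This is a minimal illustration; the function name primes_up_to is an assumption of this example, not something from the lecture or the text.

```python
def primes_up_to(n):
    """Return the list of primes from 2 to n using the Sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:                      # continue only up to the square root of n
        if is_prime[p]:
            # eliminate multiples of p, starting from p times p
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [k for k in range(2, n + 1) if is_prime[k]]


print(primes_up_to(30))
```

Starting the elimination at p times p is safe because any smaller multiple of p has a smaller prime factor and was already eliminated.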
The problem of factoring large numbers (say 128 digits!) is extremely difficult and there are
very few algorithms to simplify this process.
Congruences.
Definition. a ≡ b mod n is defined to mean n|(b − a). A few elementary results are
1. a ≡ a mod n (Reflexive property).
2. If a ≡ b mod n then b ≡ a mod n (Symmetric property).
3. If a ≡ b mod n and b ≡ c mod n then a ≡ c mod n (Transitive property.)
4. If a ≡ b mod n then a + x ≡ b + x mod n.
5. If a ≡ b mod n then ax ≡ bx mod n.
6. If a ≡ b mod n and c ≡ d mod n then a + c ≡ b + d mod n.
7. If a ≡ b mod n and c ≡ d mod n then ac ≡ bd mod n.
8. If a ≡ b mod n then $a^k \equiv b^k \bmod n$.
Definition. A complete residue system mod n is a set of n numbers $r_1, r_2, \ldots, r_n$ such
that no two are congruent mod n. That is: If $r_i \equiv r_j \bmod n$ then i = j. The standard
complete residue system mod n consists of the remainders mod n: 0, 1, . . . , n − 1. We showed that
any complete residue system is (mod n) a rearrangement of the standard complete residue
system. Further, that any integer a is congruent mod n to one and only one number in a
complete residue system.
Definition. φ(n) is defined as the number of integers between 1 and n which are relatively
prime to n. For example, φ(8) = 4 since the numbers between 1 and 8 which are relatively
prime to 8 are 1, 3, 5, 7 (4 in all). We stated, without proof, that
$\phi(n) = n \prod_{p \mid n} \frac{p-1}{p}$
For example, φ(8) = 8 × 1/2 = 4 and φ(21) = 21 × (2/3) × (6/7) = 12.
¹ The letter p is always used in these notes to designate a prime.
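The product formula for φ(n) can be checked against the definition with a short Python sketch. The names phi and phi_by_definition are assumptions made for this illustration; the first applies the factor (p − 1)/p for each prime p dividing n, the second simply counts.

```python
from math import gcd


def phi(n):
    """Euler's phi via the product over the primes p dividing n."""
    result = n
    m = n
    p = 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:      # strip every factor p from m
                m //= p
            result -= result // p  # multiply result by (p - 1)/p
        p += 1
    if m > 1:                      # one prime factor larger than sqrt may remain
        result -= result // m
    return result


def phi_by_definition(n):
    """Count the integers between 1 and n that are relatively prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


print(phi(8), phi(21))
```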
Definition. A reduced residue system mod n is a set of φ(n) numbers $r_1, r_2, \ldots, r_{\phi(n)}$, all
relatively prime to n, such that no two are congruent mod n. That is: If $r_i \equiv r_j \bmod n$
then i = j. The standard reduced residue system mod n consists of the remainders mod n which
are relatively prime to n. For example, the standard reduced residue system mod 8 is 1,
3, 5, 7. We showed that any reduced residue system is (mod n) a rearrangement of the
standard reduced residue system. Further, that any integer a which is relatively prime to n
is congruent mod n to one and only one number in a reduced residue system.
Some results on division put into congruence form.
1. If ab ≡ 0 mod n and (a, n) = 1 then b ≡ 0 mod n.
This is equivalent to: If n|ab and (n, a) = 1, then n|b.
From this we easily show
The Cancelation Law.
If ax ≡ ay mod n and (a, n) = 1 then x ≡ y mod n.
If n is prime this is simply
If ax ≡ ay mod p and a ≢ 0 mod p then x ≡ y mod p.
Note the similarity with the familiar algebraic law if congruence is replaced by equality:
If ax = ay and a ≠ 0 then x = y.
The result for primes is equivalent to
If ab ≡ 0 mod p then a ≡ 0 mod p or b ≡ 0 mod p.
Using these results, we have
Theorem: If $r_1, \ldots, r_n$ is a complete residue system mod n and (a, n) = 1, then so is
$ar_1, \ldots, ar_n$. Similarly, if $r_1, \ldots, r_{\phi(n)}$ is a reduced residue system mod n and (a, n) = 1, then
so is $ar_1, \ldots, ar_{\phi(n)}$.
The proof uses the definition and cancelation. For example, 0, 1, 2, 3 is a complete residue
system mod 4. Multiplying by 3, we have 0, 3, 6, 9 is also a complete residue system mod 4.
We can now prove Fermat's Theorem²:
Theorem: If a is not divisible by p then $a^{p-1} \equiv 1 \bmod p$.
Proof: 1, 2, . . . , p − 1 is a reduced residue system mod p. Therefore, so is a, 2a, . . . , a(p − 1).
Since this latter is a rearrangement of the former, mod p, the product of the numbers in the
former system is congruent to the product of the numbers in the latter:
$(p-1)! \equiv (p-1)!\,a^{p-1} \bmod p$
We get the result by canceling (p − 1)!, which is relatively prime to p.
² Also called Fermat's Little Theorem to distinguish it from the so-called Fermat's Last Theorem.
Working with a reduced residue system, the same method gives Euler's generalization of this
result.
Theorem: If (a, n) = 1, then
$a^{\phi(n)} \equiv 1 \bmod n$
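A numerical check of Euler's theorem (and so of Fermat's theorem when n is prime) is easy to sketch in Python. The helper names phi_brute and euler_holds are assumptions of this example; the check is an illustration, not a proof.

```python
from math import gcd


def phi_brute(n):
    """phi(n) straight from the definition: count k in 1..n with (k, n) = 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def euler_holds(n):
    """Check a**phi(n) ≡ 1 (mod n) for every a in 1..n-1 with (a, n) = 1."""
    e = phi_brute(n)
    # pow(a, e, n) is Python's built-in modular exponentiation
    return all(pow(a, e, n) == 1 for a in range(1, n) if gcd(a, n) == 1)


print(all(euler_holds(n) for n in range(2, 60)))
print(pow(2, 12, 13))   # Fermat's theorem with a = 2, p = 13
```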
Thur, May 17th
We reviewed the previous lecture.
Definition of inverse. If ab ≡ 1 mod n, we say that a and b are inverses mod n. Also, b is
called the inverse of a mod n. The relation is symmetric: a is also the inverse of b mod n.
We write $b = \bar{a} \bmod n$.
Note: a has an inverse mod n if and only if (a, n) = 1. To see this, first suppose that
(a, n) = 1. Then we have 1 = ax + ny for some x, y. Then 1 ≡ ax + ny ≡ ax mod n. So x
is the inverse of a mod n. Conversely, if a has an inverse b mod n, we have ab ≡ 1 mod n
so n|(ab − 1) and ab − 1 = nq. This shows that any divisor of a and n must also divide 1.
Thus (a, n) = 1.
When is x its own inverse mod p? The answer is x ≡ 1 or x ≡ −1 mod p.
Theorem: $x^2 \equiv 1 \bmod p$ if and only if x ≡ ±1 mod p.
Proof: If $x^2 \equiv 1 \bmod p$, then $x^2 - 1 \equiv 0 \bmod p$, or (x − 1)(x + 1) ≡ 0 mod p. Since p is prime, this implies
x ≡ ±1 mod p. These steps are all reversible.
Thus, inverses occur in distinct pairs, with the exception of 1 and −1 which are their own
inverses.
Note: This result is not true if n is not a prime. For example, $1^2 \equiv 3^2 \equiv 5^2 \equiv 7^2 \equiv 1 \bmod 8$.
I may have incorrectly used this result for non-primes in class.
This allows us to prove
Wilson’s Theorem. (p − 1)! ≡ −1 mod p.
Proof: This is true, and uninteresting, for p = 2. When p is odd, we look at the factorization
(p − 1)! = 1 · 2 · 3 · · · (p − 2)(p − 1). Except for the extreme factors 1 and p − 1 (≡ −1 mod p),
the factors pair off into numbers and their inverses mod p. Any such pair multiplies out to 1
mod p. So the entire product is congruent to 1 · (p − 1) ≡ −1 mod p.
Note: The generalization mod n, as stated in class, is wrong for the reasons stated above.
This result allows us to answer the question: For what primes p can the congruence $x^2 \equiv -1 \bmod p$ be solved?
Theorem: The congruence $x^2 \equiv -1 \bmod p$ can be solved if and only if p = 2 or p = 4n + 1
for some n, or simply p ≡ 1 mod 4.
For example $x^2 \equiv -1 \bmod 41$ can be solved, while $x^2 \equiv -1 \bmod 43$ cannot. (The former
has the solution x ≡ 9 or −9 ≡ 32 mod 41.)
Proof: p = 2 is trivial. Note that an odd prime is either of the form 4n + 1 or 4n + 3. We
separate the cases.
1. If p = 4n + 3, we show that we cannot have $x^2 \equiv -1 \bmod p$. For suppose we had a
solution. Raise both sides to the power (p − 1)/2 = 2n + 1 to get $x^{p-1} \equiv (-1)^{2n+1} \bmod p$.
But using Fermat's theorem, this gives 1 ≡ −1 mod p which is a contradiction.
2. Suppose p = 4n + 1. We use Wilson's theorem to get (4n)! ≡ −1 mod (4n + 1). This is
$(1 \cdot 2 \cdot 3 \cdots 2n)(2n+1)(2n+2) \cdots (4n-1)(4n) \equiv -1 \bmod (4n+1)$.
This is equivalent to
$(1 \cdot 2 \cdot 3 \cdots 2n)(-2n)(-(2n-1)) \cdots (-2)(-1) \equiv -1 \bmod (4n+1)$,
or simply $((2n)!)^2 \equiv -1 \bmod p$, since the 2n minus signs cancel. So a solution of $x^2 \equiv -1 \bmod p$ is $x = \left(\frac{p-1}{2}\right)!$. Perhaps this is
best illustrated numerically for p = 13: 12! ≡ −1 mod 13 by Wilson's theorem. But
$12! = 1 \cdot 2 \cdot 3 \cdots 6 \cdot 7 \cdot 8 \cdots 11 \cdot 12 \equiv 6!\,(-6)(-5) \cdots (-2)(-1) = (6!)^2 \bmod 13$
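The solution x = ((p − 1)/2)! from the proof can be computed directly. The following Python sketch assumes its input p is a prime with p ≡ 1 mod 4 (primality is not checked); the function name sqrt_of_minus_one is invented for this example.

```python
def sqrt_of_minus_one(p):
    """For a prime p ≡ 1 (mod 4), return x = ((p-1)/2)! reduced mod p."""
    if p % 4 != 1:
        raise ValueError("p must be congruent to 1 mod 4")
    x = 1
    for k in range(1, (p - 1) // 2 + 1):
        x = (x * k) % p        # reduce at every step to keep numbers small
    return x


x = sqrt_of_minus_one(13)
print(x, (x * x) % 13)         # the second value should be 12, i.e. -1 mod 13
```

For p = 13 this reduces 6! = 720 mod 13. The method is simple but slow for large p, since it multiplies (p − 1)/2 factors.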
Mon, May 21
Review of some factoring results in algebra.
$x^n - 1 = (x - 1)(1 + x + x^2 + \ldots + x^{n-1})$
This is the "geometric series" result. As a consequence, it follows that $x^n - 1$ is composite
if x > 2 and n ≥ 2. However, even x = 2 is covered here if n is composite. For example, if
n = rs, $2^n - 1 = 2^{rs} - 1 = y^s - 1$ where $y = 2^r$, and so has a factor $y - 1 = 2^r - 1$. For
example, $2^9 - 1$ has a factor $2^3 - 1 = 7$. Otherwise stated, the only possible primes of the form
$2^n - 1$ are those of the form $P = 2^p - 1$ with p prime. These are called Mersenne primes.³ The
complete list of exponents p ≤ 257 giving Mersenne primes is p = 2, 3, 5, 7, 13, 17, 19, 31, 61, 89, 107
and 127. These primes turn up in the study of perfect numbers which we shall discuss later.
A similar factoring result in algebra involves odd powers:
$x^{2n+1} + 1 = (1 + x)\sum_{k=0}^{2n} (-1)^k x^k = (1 + x)(1 - x + x^2 - \ldots + x^{2n-2} - x^{2n-1} + x^{2n})$.
A consequence of this factorization is that any number of the form $2^n + 1$, where n has an odd
factor greater than 1, is composite. For example, $2^{15} + 1$ can be written $(2^3)^5 + 1 = 8^5 + 1$, which is divisible
by $2^3 + 1 = 9$. So for a prime of the form $2^k + 1$, k can have no odd divisor greater than 1; that is, k is a power of 2.
These are the numbers $2^{2^n} + 1$. A prime number of the form $2^{2^n} + 1$ is called a Fermat
prime, after Fermat, who conjectured that all numbers of this form were prime. The first
five numbers of this form are 3, 5, 17, 257, and 65,537, which are in fact prime. The next is
4,294,967,297 which is not a prime: it has the factor 641. In fact, no Fermat prime beyond
these 5 has been discovered, and it is not known if any exist. These primes turn up in
an unexpected way in geometry. Gauss proved that if p is a Fermat prime, then there is a
geometric construction using straight edge and compass constructing a regular polygon with
p sides. In fact, a regular polygon with n sides is constructible if and only if n is the product
of distinct Fermat primes and some power of 2.
³ After Marin Mersenne (1588-1648) who (incorrectly) listed alleged primes of this sort up to p = 257.
Solving congruences. We first practice computing solutions of linear congruences of the
form ax ≡ b mod p. For example, let’s solve
31x ≡ 14 mod 83
We multiply the equation to bring the coefficient of x closer to the modulus. Here, multiplying by 3 does it. Multiply by 3 to get
93x ≡ 42 mod 83
Now reduce the coefficient of x mod 83:
10x ≡ 42 mod 83
Divide by 2:
5x ≡ 21 mod 83
Now again, multiply by 17 and reduce mod 83
85x ≡ 357 mod 83
2x ≡ 25 mod 83
Add 83 to “make 25 even”.
2x ≡ 108 mod 83
So the solution is
x ≡ 54 mod 83.
Of course, we know that if (a, n) = 1, we have ax + ny = 1 for an x, y pair derivable
from the Euclidean algorithm. This gives us ax ≡ 1 mod n. So by multiplying by b we get
abx ≡ b mod n. So y = bx solves the congruence ay ≡ b mod n. The method shown here
is simpler from a practical standpoint, at least without a computer.
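With a computer, the Euclidean-algorithm route is the natural one. The sketch below is an illustration only: the function name solve_linear_congruence is an assumption, and it requires (a, n) = 1. It finds x with ax ≡ 1 mod n and then returns bx mod n.

```python
def solve_linear_congruence(a, b, n):
    """Return the solution y (0 <= y < n) of a*y ≡ b (mod n), assuming gcd(a, n) == 1."""
    # Extended Euclidean algorithm, tracking only the coefficient of a.
    # Invariant: old_r ≡ old_x * a (mod n) and r ≡ x * a (mod n).
    old_r, r = a, n
    old_x, x = 1, 0
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
    if old_r != 1:
        raise ValueError("a and n must be relatively prime")
    # old_x is the inverse of a mod n, so b * old_x solves the congruence
    return (old_x * b) % n


y = solve_linear_congruence(31, 14, 83)
print(y, (31 * y) % 83)   # the second value should be 14
```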
The Chinese Remainder Theorem. This states that if (m, n) = 1, the simultaneous
congruences
$x \equiv r_1 \bmod m, \quad x \equiv r_2 \bmod n$
have a unique solution mod mn : x ≡ r mod mn.
We can see how to calculate the solution with the help of an example: Solve
x ≡ 7 mod 11, x ≡ 12 mod 17
Method: Rewrite the first congruence as x = 7 + 11s (s is arbitrary). Substitute in
the second to get 7 + 11s ≡ 12 mod 17, or 11s ≡ 5 mod 17. Coincidentally, this is easy,
because we can write 5 ≡ 22 mod 17. So we have 11s ≡ 22 mod 17, and canceling 11, we get
s ≡ 2 mod 17. Rewrite as s = 2 + 17t, and substitute into the original equation x = 7 + 11s
to get x = 7 + 11(2 + 17t) = 29 + 187t. So finally, we can write this as x ≡ 29 mod 187.
More generally, the Chinese Remainder Theorem states that the k congruences $x \equiv r_i \bmod n_i$,
i = 1, . . . , k, have a unique solution mod $n_1 \cdots n_k$ provided $(n_i, n_j) = 1$ for i ≠ j.
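The substitution method of the example translates directly into code. This Python sketch is an illustration under stated assumptions: the function name crt_pair is invented here, (m, n) = 1 is required, and the three-argument pow with exponent −1 (modular inverse) needs Python 3.8 or later.

```python
def crt_pair(r1, m, r2, n):
    """Solve x ≡ r1 (mod m), x ≡ r2 (mod n) for relatively prime m, n.

    Returns (x, m*n) with 0 <= x < m*n.
    """
    # Write x = r1 + m*s and substitute into the second congruence:
    # m*s ≡ r2 - r1 (mod n), so s ≡ (r2 - r1) * m^(-1) (mod n).
    s = ((r2 - r1) * pow(m, -1, n)) % n   # raises ValueError if m has no inverse mod n
    return (r1 + m * s) % (m * n), m * n


print(crt_pair(7, 11, 12, 17))   # the example above; note 11 * 17 = 187
```

Several congruences with pairwise relatively prime moduli can be handled by applying crt_pair repeatedly, combining two congruences into one at each step.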
Tues, May 22.
Proof of the Chinese Remainder Theorem. We illustrate by taking m = 4, n = 5 so
mn = 20. Take any number r from the complete residue system mod 20 and reduce it mod
4 to find $r_1$ and then mod 5 to find $r_2$. For example, take r = 14. Then $r_1$ = 14 mod 4 = 2
and $r_2$ = 14 mod 5 = 4. Then put 14 in row 2 ($r_1$) and column 4 ($r_2$). Similarly, r = 7
yields $r_1$ = 3 and $r_2$ = 2. Finally, choosing r = 16, we get $r_1$ = 0 and $r_2$ = 1. So we put 16
in row 0, column 1. These examples are indicated in the table below (rows $r_1$, columns $r_2$).
     |  0   1   2   3   4
  ---+-------------------
   0 |     16
   1 |
   2 |                 14
   3 |          7
If we do this for every r between 0 and 19 inclusive, we end up with the following table
     |  0   1   2   3   4
  ---+-------------------
   0 |  0  16  12   8   4
   1 |  5   1  17  13   9
   2 | 10   6   2  18  14
   3 | 15  11   7   3  19
In this example, the 20 numbers from 0 through 19 all fit into different locations and all
locations are filled. Note that there are 20 locations, so once we know that they all fit into
different locations, we know that all locations are filled (and conversely). We can now give the
proof of the Chinese Remainder Theorem. Suppose that (m, n) = 1. Let 0 ≤ r ≤ mn − 1. For
any such r define $r_1$ = r mod m and $r_2$ = r mod n. We now show that the map $r \mapsto (r_1, r_2)$ is one-to-one.
To prove this, suppose r and s both map onto $(r_1, r_2)$. Then by definition $r \equiv r_1 \bmod m$
and $s \equiv r_1 \bmod m$. So r ≡ s mod m. Similarly, r ≡ s mod n. Since (m, n) = 1, this implies
r ≡ s mod mn, and since both lie between 0 and mn − 1, r = s. Therefore the map $r \mapsto (r_1, r_2)$ is 1-1.
Since there are mn values of r and mn ordered pairs, the map is onto all
ordered pairs (a, b) where 0 ≤ a ≤ m − 1 and 0 ≤ b ≤ n − 1. Thus, the system x ≡ a mod m
and x ≡ b mod n has a unique solution mod mn.
Proof of the formula for φ(n). If we go through the proof of the Chinese Remainder
Theorem, using only integers r relatively prime to mn, we can compute φ(mn) in terms of
φ(m) and φ(n). Let $r_1$ = r mod m and $r_2$ = r mod n. Suppose first that (r, mn) = 1. Then
(r, m) = 1, and since $r_1 \equiv r \bmod m$ we have $(r_1, m) = 1$. Similarly, $(r_2, n) = 1$. So
$r_1$ and $r_2$ are relatively prime to m and n respectively. Conversely, if (r, mn) = d > 1, then
some prime number p divides r and mn. Suppose, for example, that p divides m. Then since
$r_1 \equiv r \bmod m$ and p|m, we have $r_1 \equiv r \equiv 0 \bmod p$, so $(r_1, m) > 1$. This shows that if $(r_1, m) = (r_2, n) = 1$,
we must have (r, mn) = 1. In terms of the table above, this shows that if we eliminate the
rows which are not relatively prime to m and the columns which are not relatively prime to
n, we are left with precisely those r which are relatively prime to mn. Thus, we have the
important result
φ(mn) = φ(m)φ(n) if (m, n) = 1
We say that φ is a multiplicative function. In general, a function f is said to be multiplicative
if f (mn) = f (m)f (n) when (m, n) = 1.
The above result is enough to show how to calculate φ(n) for any n. We first find φ(n) for
any prime power $p^a$. In the range from 1 through $p^a$, the multiples of p are the only ones
not relatively prime to $p^a$. These are $1 \cdot p, 2 \cdot p, \ldots, p^{a-1} \cdot p$. There are $p^{a-1}$ of these. Thus there
are $p^a - p^{a-1} = p^{a-1}(p - 1)$ numbers in the range from 1 through $p^a$ which are relatively prime to $p^a$.
This gives
$\phi(p^a) = p^{a-1}(p - 1) = p^a \cdot \frac{p-1}{p}$
Using multiplicativity, and factoring n into a product of prime powers, we have
$\phi(n) = n \prod_{p \mid n} \frac{p-1}{p}$
We also showed that if we set d(n) equal to the number of divisors of n, then d(n) is
multiplicative. Since $d(p^a) = a + 1$ (clearly), we find
$d\left(\prod_i p_i^{a_i}\right) = \prod_i (a_i + 1)$
Numbers Which are the Sum of Squares.
We first note that numbers of the form 4n+3 cannot be the sum of two squares. To see
this, we need only consider odd and even possibilities. Suppose we have $n = a^2 + b^2$. Then,
depending on whether a and b are even or odd, we have $a^2 \equiv 0$ or 1 mod 4, and similarly for
$b^2$. Thus $a^2 + b^2 \equiv 0$, 1, or 2 mod 4. So $n = a^2 + b^2 \equiv 3 \bmod 4$ is not possible.
Of course, the even prime 2 = 1 + 1 is the sum of two squares. We now consider primes of the
form 4n + 1.
Theorem: Any prime of the form 4n + 1 is the sum of two squares.
The proof uses the pigeonhole (or shoe-box) principle. Let p = 4n + 1. Take $K = [\sqrt{p}]$.⁴
Then $K < \sqrt{p} < K + 1$. Since p ≡ 1 mod 4, we can solve the equation $x^2 \equiv -1 \bmod p$.
Using this x, we form the numbers u + xv for all pairs (u, v) where 0 ≤ u, v ≤ K. There are
$(K + 1)^2$ pairs, and since $\sqrt{p} < K + 1$, we have $p < (K + 1)^2$, so there are more than p such
pairs. Therefore two different pairs give numbers u + xv which are congruent mod p, say
$u_1 + xv_1 \equiv u_2 + xv_2 \bmod p$. This is equivalent to a + xb ≡ 0 mod p, where $a = u_1 - u_2$ and
$b = v_1 - v_2$. Since $(u_1, v_1)$ and $(u_2, v_2)$ are different, we have $a^2 + b^2 > 0$. Now write the
congruence as a ≡ −xb mod p. Squaring, we get $a^2 \equiv x^2 b^2 \equiv -b^2 \bmod p$, or $a^2 + b^2 \equiv 0 \bmod p$.
We had $0 \le u_1 \le K < \sqrt{p}$ and $0 \le u_2 \le K < \sqrt{p}$. So $0 \le u_1 < \sqrt{p}$ and $-\sqrt{p} < -u_2 \le 0$.
Adding these inequalities we get $-\sqrt{p} < u_1 - u_2 = a < \sqrt{p}$. Similarly, $-\sqrt{p} < b < \sqrt{p}$.
Thus, $a^2 + b^2 < 2p$. But we have $0 < a^2 + b^2$ and $a^2 + b^2 \equiv 0 \bmod p$. Therefore, $a^2 + b^2 = p$.
This proves the result.
We now show that the product of two numbers which are sums of two squares is also a
sum of 2 squares. This is simple algebra:
$(a^2 + b^2)(c^2 + d^2) = (ac + bd)^2 + (ad - bc)^2$
We also note the obvious facts that 2 = 1 + 1 and $q^2 = q^2 + 0^2$ are sums of two squares. We therefore have
the following result
Theorem: Let $n = 2^a p_1^{b_1} \cdots p_r^{b_r} q_1^{2c_1} \cdots q_s^{2c_s}$ where each $p_i$ is a prime of the form 4n + 1 and each $q_j$ is
a prime of the form 4n + 3. Then n is the sum of two squares.
Wed., May 23
We now show that conversely, if n is the sum of two squares, then $n = 2^a p_1^{b_1} \cdots p_r^{b_r} q_1^{2c_1} \cdots q_s^{2c_s}$
where each $p_i$ is a prime of the form 4n + 1 and each $q_j$ is a prime of the form 4n + 3. To do this, we
need the following result.
Theorem: If $n = a^2 + b^2$, q is a prime of the form 4k + 3 and q|n, then q|a, q|b, and so $q^2 \mid n$.
Proof: We have
$a^2 + b^2 \equiv 0 \bmod q$.
We claim that q|a. For if not, a ≢ 0 mod q, and so a has an inverse mod q: $a\bar{a} \equiv 1 \bmod q$.
Multiply the congruence by $\bar{a}^2$ to get
$(a\bar{a})^2 + (\bar{a}b)^2 \equiv 0 \bmod q$, or $1 + (\bar{a}b)^2 \equiv 0 \bmod q$
Thus, the congruence $x^2 \equiv -1 \bmod q$ has the solution $x \equiv \bar{a}b \bmod q$. But this is impossible
by a previous theorem since q ≡ 3 mod 4. This contradiction shows that q|a. Similarly, q|b.
Therefore $q^2 \mid n = a^2 + b^2$. This proves the result.
⁴ We use [x] to mean the greatest integer in x.
Now suppose $n = 2^a p_1^{b_1} \cdots p_r^{b_r} q_1^{c_1} \cdots q_s^{c_s} = a^2 + b^2$ where each $p_i$ is a prime of the form 4n + 1 and
each $q_j$ is a prime of the form 4n + 3. If $q_j$ appears in this factorization, then $q_j \mid a$, $q_j \mid b$, and $q_j^2 \mid n$.
So $c_j \ge 2$ and we have $n/q_j^2 = (a/q_j)^2 + (b/q_j)^2$. Continuing this process until all the q's are
eliminated, we find that all the exponents of the q's are even. This is the result.
Pythagorean Triples. These are positive integers a, b, c satisfying
$a^2 + b^2 = c^2$.
Note that if a prime p divides any two of a, b, c it divides the third. In that case, we have
$(a/p)^2 + (b/p)^2 = (c/p)^2$. Continuing this process we arrive at positive integers a, b, c satisfying
$a^2 + b^2 = c^2$, in which any two of these are relatively prime. This is called relatively prime
in pairs. Such a triple (a, b, c) is called a primitive Pythagorean triple. We now characterize
these.
Since (a, b) = 1, both can't be even. But also, both can't be odd. For if a and b were both
odd, we would have $a^2 \equiv 1 \bmod 4$ and $b^2 \equiv 1 \bmod 4$. So $c^2 = a^2 + b^2 \equiv 2 \bmod 4$. This is
impossible because c would be even, and the square of an even number is congruent to 0 mod 4. So with no loss
in generality, we take a odd and b even; then c is odd. Since $c^2 - a^2 = b^2$, we can factor and divide by 4
to get
$\frac{c-a}{2} \cdot \frac{c+a}{2} = \left(\frac{b}{2}\right)^2$
But $\frac{c+a}{2}$ and $\frac{c-a}{2}$ are relatively prime. To see this, suppose d divides each of them. Then
d would divide their sum c and their difference a. Thus d = 1. But if the product of relatively
prime numbers is a square, each must be a square. Thus
$\frac{c+a}{2} = r^2, \quad \frac{c-a}{2} = s^2$
where r and s are relatively prime. Solving for c and a, we get $c = r^2 + s^2$ and $a = r^2 - s^2$.
Since $b^2 = c^2 - a^2$, we get $b^2 = 4r^2 s^2$, so b = 2rs. Since c and a are odd and are the sum and difference
of $r^2$ and $s^2$, we must also have r and s of different parities.⁵ Summarizing, all positive
primitive solutions of the equation $a^2 + b^2 = c^2$, with b even, are given by the two parameter
system
$a = r^2 - s^2, \quad b = 2rs, \quad c = r^2 + s^2$, where r > s > 0, (r, s) = 1, and r + s is odd.
⁵ This means that one is odd and one is even.
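The two-parameter description of primitive Pythagorean triples can be turned into a small generator. This Python sketch is an illustration; the function name primitive_triples and the bound argument limit (an upper bound on c) are assumptions of the example.

```python
from math import gcd


def primitive_triples(limit):
    """List primitive triples (a, b, c) with b even and c <= limit.

    Uses a = r^2 - s^2, b = 2rs, c = r^2 + s^2 with r > s > 0,
    (r, s) = 1 and r + s odd.
    """
    triples = []
    r = 2
    while r * r + 1 <= limit:              # smallest possible c for this r is r^2 + 1
        for s in range(1, r):
            if (r + s) % 2 == 1 and gcd(r, s) == 1:
                a, b, c = r * r - s * s, 2 * r * s, r * r + s * s
                if c <= limit:
                    triples.append((a, b, c))
        r += 1
    return triples


for a, b, c in primitive_triples(30):
    print(a, b, c, a * a + b * b == c * c)
```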
The rings $Z_n$ and the fields $Z_p$.
For fixed n > 0, we want to identify two numbers which are congruent mod n. For example,
we are accustomed to talk about even and odd numbers. Here we identify any numbers
congruent to 1 mod 2 as "odd", and similarly for "even." For any n > 0, we let $\bar{a}$ = the set
of all b ≡ a mod n. For example, if n = 2, $\bar{0}$ is the set of even numbers and $\bar{1}$ is the set of
odd numbers. For fixed n, we have
$\bar{a} = \bar{b}$ if and only if a ≡ b mod n
The effect of the overbar notation is to replace congruence mod n with equality.⁶ We define
$Z_n$ as the finite set $\{\bar{0}, \bar{1}, \ldots, \overline{n-1}\}$. In $Z_n$ we can define addition and multiplication by the
formulas:
$\bar{a} \cdot \bar{b} = \overline{ab}$ and $\bar{a} + \bar{b} = \overline{a+b}$
We can replace 0, 1, . . . , n − 1 by any complete residue system mod n. $Z_n$ is a ring. This is
a term in algebra for a system in which the usual laws of algebra hold. The exception is that there might
be non-zero elements without inverses. For example, in $Z_6$, 2, 3, and 4 have no inverses. We
can construct addition and multiplication tables in $Z_n$. The following tables give addition
and multiplication in $Z_6$. From now on we omit the overbar, and simply write $\bar{a}$ as a.
No confusion will occur if we note that we are working in $Z_n$.
Z6:
 + | 0 1 2 3 4 5
---+------------
 0 | 0 1 2 3 4 5
 1 | 1 2 3 4 5 0
 2 | 2 3 4 5 0 1
 3 | 3 4 5 0 1 2
 4 | 4 5 0 1 2 3
 5 | 5 0 1 2 3 4

 × | 0 1 2 3 4 5
---+------------
 0 | 0 0 0 0 0 0
 1 | 0 1 2 3 4 5
 2 | 0 2 4 0 2 4
 3 | 0 3 0 3 0 3
 4 | 0 4 2 0 4 2
 5 | 0 5 4 3 2 1
For primes p, the algebra in $Z_p$ is more like ordinary algebra because any element unequal
to 0 has an inverse. In such cases, the ring is called a field. Here are similar tables for $Z_5$.
Z5:
 + | 0 1 2 3 4
---+----------
 0 | 0 1 2 3 4
 1 | 1 2 3 4 0
 2 | 2 3 4 0 1
 3 | 3 4 0 1 2
 4 | 4 0 1 2 3

 × | 0 1 2 3 4
---+----------
 0 | 0 0 0 0 0
 1 | 0 1 2 3 4
 2 | 0 2 4 1 3
 3 | 0 3 1 4 2
 4 | 0 4 3 2 1
⁶ More accurately, we should write $\bar{a}_n$ instead of $\bar{a}$, but in any context, we try to make it clear what the
underlying n is.
Note the 1's in the body of the multiplication tables. These occur when the row and column
corresponding to the 1 are inverses. For $Z_p$ with p prime, a 1 appears exactly once in each row and
column other than the 0 row and column. We can see this above for $Z_5$. We see that this
fails if n is composite, as in the multiplication table for $Z_6$.
Thur., May 24
Lagrange’s Four Square Theorem.
This famous theorem states that every positive integer is the sum of four squares.
We started the proof, but left it unfinished, referring the class to the classic text by Hardy and
Wright, An Introduction to the Theory of Numbers, Oxford at the Clarendon Press. This was first published in
1938 and contains many beautiful results and methods. The Lagrange theorem is on page
302. It is accessible to anyone in the class, and uses little more than congruences.
The factorization of n! If p is a prime not more than n, we can find the highest power
a of p which divides n!. The text uses the notation $p^a \,\|\, n!$. The double divisor sign is used to
indicate that $p^a \mid n!$ but $p^{a+1} \nmid n!$. We illustrate by finding a such that $3^a \,\|\, 100!$. First pull out
all multiples of 3 from the factors of 100!:
$100! = 1 \cdot 2 \cdot 3 \cdots 6 \cdots 9 \cdots 96 \cdots 99 \cdot 100 = 3^{33} \cdot (1 \cdot 2 \cdots 32 \cdot 33) \cdot K$
where (K, 3) = 1. (We will use K generically in this way.)
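The notes stop after the first step. Repeating the same step on the remaining factorial (here 33!, then 11!, and so on) gives the full exponent; this is the standard formula usually credited to Legendre. The Python sketch below assumes that continuation; the function name exponent_in_factorial is invented for this example.

```python
def exponent_in_factorial(p, n):
    """Return a such that p**a exactly divides n! (p assumed prime)."""
    a = 0
    while n >= p:
        n //= p      # number of multiples of p among 1..n; what is left behaves like (n // p)!
        a += n       # each of those multiples contributes one factor p
    return a


print(exponent_in_factorial(3, 100))
```

For 3 and 100 the loop adds 33, then 11, then 3, then 1.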
Tues., May 29
A number is said to be perfect if the sum of its proper divisors⁷ is equal to that number.
For example, 6 is a perfect number: 6 = 1 + 2 + 3. In this section we show how to find all
even perfect numbers. To date, no odd perfect number has been found, and it is not known
if there are any. We start with a definition:
σ(n) is the sum of all divisors of n. We write $\sigma(n) = \sum_{d \mid n} d$.
Thus the condition that a number n is perfect is σ(n) = 2n. This is so because σ(n) includes
the proper divisors and also n. We now show that σ(n) is multiplicative; i.e. if (m, n) = 1
then σ(mn) = σ(m)σ(n). To see this, let $d_1, \ldots, d_j$ be the divisors of m and let $e_1, \ldots, e_k$
be the divisors of n. Then $d_r e_s$ is a divisor of mn and all divisors are of this form. Since
(m, n) = 1, a little consideration using unique factorization shows that all of the $d_r e_s$ are
distinct. Therefore
$\sigma(m)\sigma(n) = \left(\sum_r d_r\right)\left(\sum_s e_s\right) = \sum_{r,s} d_r e_s = \sigma(mn)$
This easily permits us to compute σ(n), once we know its prime factorization. We first
compute $\sigma(p^a)$:
$\sigma(p^a) = 1 + p + p^2 + \ldots + p^a = \frac{p^{a+1} - 1}{p - 1}$
⁷ A proper divisor of n is a divisor unequal to n.
Therefore, since σ is multiplicative, for any product of distinct prime powers, we have
$\sigma\left(\prod_i p_i^{a_i}\right) = \prod_i \frac{p_i^{a_i+1} - 1}{p_i - 1}$
Note in particular that $\sigma(2^a) = 2^{a+1} - 1$. Also σ(p) = p + 1 for a prime p.
We first note the following theorem: Let p be prime with $2^p - 1$ also a prime (a Mersenne
prime). Then $n = 2^{p-1}(2^p - 1)$ is a perfect number. The proof is direct. We have
$\sigma(n) = \sigma(2^{p-1})\,\sigma(2^p - 1) = (2^p - 1)(2^p - 1 + 1) = 2^p(2^p - 1) = 2n$
We now show that conversely, any even perfect number is of this kind. For suppose n is an
even perfect number. Write it as $n = 2^a R$ where a > 0 and R is odd. Then by definition,
σ(n) = 2n or $\sigma(2^a R) = 2^{a+1} R$. Thus
$\sigma(2^a)\sigma(R) = 2^{a+1} R$ or $(2^{a+1} - 1)\sigma(R) = 2^{a+1} R$.
Thus $(2^{a+1} - 1) \mid 2^{a+1} R$ and so $(2^{a+1} - 1) \mid R$ since $2^{a+1} - 1$ and $2^{a+1}$ are relatively prime. Thus,
$R = c(2^{a+1} - 1)$ and so $\sigma(R) = c\,2^{a+1}$. Now we claim that c = 1. For if not, the divisors of R
include at least the distinct numbers 1, c, and $c(2^{a+1} - 1)$. So $\sigma(R) \ge 1 + c + c(2^{a+1} - 1) = 1 + c\,2^{a+1}$. This contradicts
$\sigma(R) = c\,2^{a+1}$. Thus c = 1 and $R = 2^{a+1} - 1$, and $\sigma(R) = 2^{a+1}$. But two divisors of R are 1
and $2^{a+1} - 1$ whose sum is $2^{a+1} = \sigma(R)$. So there are no further divisors of R and R must
be prime. This implies that $R = 2^{a+1} - 1$ is a prime and from our discussion of Mersenne
primes, we know that a + 1 = p, a prime. So finally, $n = 2^{p-1}(2^p - 1)$.
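The theorem can be illustrated by generating even perfect numbers from Mersenne primes and checking σ(n) = 2n by brute force. This Python sketch is only suitable for small exponents; the names is_prime, sigma and even_perfect_numbers are assumptions of the example, and primality is tested by plain trial division.

```python
def is_prime(n):
    """Trial-division primality test (adequate for small n only)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def sigma(n):
    """Sum of all divisors of n, found in pairs d and n // d."""
    total = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
        d += 1
    return total


def even_perfect_numbers(max_p):
    """Perfect numbers 2^(p-1) * (2^p - 1) for primes p <= max_p with 2^p - 1 prime."""
    result = []
    for p in range(2, max_p + 1):
        mersenne = 2 ** p - 1
        if is_prime(p) and is_prime(mersenne):
            result.append(2 ** (p - 1) * mersenne)
    return result


for n in even_perfect_numbers(7):
    print(n, sigma(n) == 2 * n)
```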
The lecture included some discussion of continued fractions, to be repeated on the next day.
Wed. and Thur., May 30 and June 1
Continued Fractions. Any positive real number x can be written As x = [x] + f where
[x] is the greatest integer in x, and 0 ≤ f < 1. If f > 0, then we can write f = 1/g where
g > 1. This gives x = [x] + 1/g, We can continue this process with g. We illustrate with an
example:
2
1
1
11
=3+ =3+ 3 =3+
3
3
1 + 12
2
The text uses the notation $\langle 3, 1, 2 \rangle$ for this latter expression. For any rational number x the
Euclidean algorithm for the computation of the GCD of the numerator and the denominator
will give the entries for this continued fraction. For example, consider the fraction 58/21.
We use the Euclidean algorithm to compute (58, 21):
Numerator   Denominator   Quotient   Remainder
58          21            2          16
21          16            1          5
16          5             3          1
5           1             5          0
So 58/21 = 2 + 16/21 = 2 + 1/(21/16) = 2 + 1/(1 + 5/16). Continuing in this manner we
arrive at the continued fraction
\[ \frac{58}{21} = 2 + \cfrac{1}{1 + \cfrac{1}{3 + \cfrac{1}{5}}} = \langle 2, 1, 3, 5 \rangle. \]
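The computation above can be sketched in a few lines of Python. The function name `continued_fraction` is an assumption made for this illustration; the loop is just the Euclidean algorithm, recording each quotient.

```python
def continued_fraction(numerator, denominator):
    """Entries of the continued fraction of numerator/denominator.

    Runs the Euclidean algorithm and collects the quotients in order.
    Assumes denominator > 0.
    """
    quotients = []
    while denominator != 0:
        q, r = divmod(numerator, denominator)
        quotients.append(q)
        numerator, denominator = denominator, r
    return quotients


print(continued_fraction(58, 21))  # [2, 1, 3, 5]
print(continued_fraction(11, 3))   # [3, 1, 2]
```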
Note that the entries in this continued fraction are the quotients in the order they appear in
the Euclidean algorithm. We can define a continued fraction in general by induction:
\[ \langle a_0 \rangle = a_0; \qquad \langle a_0, a_1, \ldots, a_n \rangle = a_0 + \frac{1}{\langle a_1, \ldots, a_n \rangle} \quad \text{for } n > 0. \]
Note the useful identity $\langle a_0, a_1, \ldots, a_n \rangle = \langle a_0, a_1, \ldots, a_{n-2}, \langle a_{n-1}, a_n \rangle \rangle$.
We always assume that $a_i > 0$ for $i > 0$; $a_0 \le 0$ is also allowed. For a given continued fraction
$\langle a_0, a_1, \ldots, a_n \rangle$, we can compute the successive convergents as $\langle a_0 \rangle, \langle a_0, a_1 \rangle, \ldots, \langle a_0, \ldots, a_n \rangle$.
For our computed continued fraction $\langle 2, 1, 3, 5 \rangle$, these are 2, 3, 11/4, and 58/21. Numerically
these are 2, 3, 2.75, and approximately 2.762. Convergence to the actual answer is very rapid.
Recall that the convergents of $\langle a_0, a_1, \ldots, a_n \rangle$ are $\langle a_0 \rangle, \langle a_0, a_1 \rangle, \ldots$. Writing these as fractions
$p_0/q_0$, $p_1/q_1$, etc., it looks like
\[ p_n = a_n p_{n-1} + p_{n-2}, \qquad q_n = a_n q_{n-1} + q_{n-2} \tag{1} \]
for n ≥ 0. (We will prove this in what follows.) We introduce $p_{-2} = 0$, $p_{-1} = 1$ and
$q_{-2} = 1$, $q_{-1} = 0$ in order to get proper initial values for the p's and q's. We then construct
the following table:
n      −2   −1   0     1     2     ...
a_n              a_0   a_1   a_2   ...
p_n    0    1    a_0   p_1   p_2   ...
q_n    1    0    1     q_1   q_2   ...
Once the second row is filled in with the given continued fraction, we compute the third and
fourth rows recursively using the above equations. We illustrate with the above continued
fraction $\langle 2, 1, 3, 5 \rangle$:
n      −2   −1   0    1    2    3
a_n              2    1    3    5
p_n    0    1    2    3    11   58
q_n    1    0    1    1    4    21
Compare the convergents found above with the values pk /qk . We now prove that pk /qk are
the convergents.
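The table construction can be sketched as follows. The function name `convergents` is a hypothetical choice for this illustration; the initial values and the recurrence are those of Equation (1).

```python
def convergents(entries):
    """Pairs (p_n, q_n) for the continued fraction with the given entries.

    Uses p_n = a_n p_{n-1} + p_{n-2} and q_n = a_n q_{n-1} + q_{n-2},
    starting from p_{-2} = 0, p_{-1} = 1, q_{-2} = 1, q_{-1} = 0.
    """
    p_prev2, p_prev1 = 0, 1
    q_prev2, q_prev1 = 1, 0
    result = []
    for a_n in entries:
        p_n = a_n * p_prev1 + p_prev2
        q_n = a_n * q_prev1 + q_prev2
        result.append((p_n, q_n))
        p_prev2, p_prev1 = p_prev1, p_n
        q_prev2, q_prev1 = q_prev1, q_n
    return result


print(convergents([2, 1, 3, 5]))  # [(2, 1), (3, 1), (11, 4), (58, 21)]
```

The pairs agree with the last two rows of the table for n = 0, 1, 2, 3.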
Theorem: Let $a_0, a_1, \ldots, a_n, \ldots$ be a sequence with $a_i > 0$ for $i > 0$. Define the sequences
$p_n$ and $q_n$ using Equation (1). Then $\langle a_0, a_1, \ldots, a_n \rangle = p_n/q_n$.
Proof: We prove this for rational $a_i > 0$. We check that this is true for n = 0 and 1.
These are simply the equations $\langle a \rangle = a/1 = p_0/q_0$ and $\langle a, b \rangle = a + 1/b = (ab + 1)/b = p_1/q_1$.
We assume it for all values less than n. We let x > 0 and compute
$\langle a_0, a_1, \ldots, a_{n-1}, x \rangle = \langle a_0, a_1, \ldots, a_{n-2}, \langle a_{n-1}, x \rangle \rangle$, where $\langle a_{n-1}, x \rangle = (x a_{n-1} + 1)/x$. Using induction this is
\[ \Big\langle a_0, a_1, \ldots, a_{n-2}, \frac{x a_{n-1} + 1}{x} \Big\rangle = \frac{p_{n-2}\,\frac{x a_{n-1} + 1}{x} + p_{n-3}}{q_{n-2}\,\frac{x a_{n-1} + 1}{x} + q_{n-3}} = \frac{p_{n-2}(x a_{n-1} + 1) + x p_{n-3}}{q_{n-2}(x a_{n-1} + 1) + x q_{n-3}} \]
\[ = \frac{x(a_{n-1} p_{n-2} + p_{n-3}) + p_{n-2}}{x(a_{n-1} q_{n-2} + q_{n-3}) + q_{n-2}} = \frac{x p_{n-1} + p_{n-2}}{x q_{n-1} + q_{n-2}}. \]
Finally, substituting $x = a_n$, we get
\[ \langle a_0, a_1, \ldots, a_n \rangle = \frac{a_n p_{n-1} + p_{n-2}}{a_n q_{n-1} + q_{n-2}} = \frac{p_n}{q_n}. \]
This is the required result.
Theorem: For any continued fraction, $p_n q_{n-1} - q_n p_{n-1} = (-1)^{n+1}$.
Proof: By induction. It is true for n = −1, since $p_{-1} q_{-2} - q_{-1} p_{-2} = 1 \cdot 1 - 0 \cdot 0 = 1$. Assuming the truth for n, we compute:
\[ p_{n+1} q_n - q_{n+1} p_n = (a_{n+1} p_n + p_{n-1}) q_n - (a_{n+1} q_n + q_{n-1}) p_n \]
\[ = a_{n+1} p_n q_n - a_{n+1} q_n p_n - (p_n q_{n-1} - q_n p_{n-1}) = -(-1)^{n+1} = (-1)^{n+2}. \]
This is the result for n + 1, proving the theorem.
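Corollary. Writing $r_n = p_n/q_n$ for the n-th convergent, we have
\[ r_n - r_{n-1} = \frac{(-1)^{n+1}}{q_n q_{n-1}}. \]
For this is simply $\frac{p_n}{q_n} - \frac{p_{n-1}}{q_{n-1}}$. This shows that the convergents oscillate.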
Corollary. The convergents are in lowest terms.
This is so because we have a linear combination of pn and qn equal to ±1.
Corollary. For an infinite continued fraction, $\lim_{n \to \infty} (r_n - r_{n-1}) = 0$.
This follows from $r_n - r_{n-1} = \frac{(-1)^{n+1}}{q_n q_{n-1}}$. The denominators $q_n$ clearly approach ∞ as n → ∞.
Theorem: $p_n q_{n-2} - q_n p_{n-2} = (-1)^n a_n$.
The proof is a direct computation:
\[ p_n q_{n-2} - q_n p_{n-2} = (a_n p_{n-1} + p_{n-2}) q_{n-2} - (a_n q_{n-1} + q_{n-2}) p_{n-2} \]
\[ = a_n (p_{n-1} q_{n-2} - q_{n-1} p_{n-2}) = (-1)^n a_n. \]
This shows that the oscillations of the rn are not extreme. When rn moves right, it returns
to the right of where it started. Similarly, when it moves left, it returns to the left of where
it started. Thus, combining these results, we see that for an infinite continued fraction
$\langle a_0, a_1, \ldots \rangle$, the n-th convergent $r_n$ has a limit, which is called the value of this continued
fraction. The convergents oscillate about this value.
Tues., June 5
Polynomials over a field F. We have noted that a field is an algebraic system in which
the usual laws of algebra, including addition, subtraction, multiplication and division, hold.
In particular, any element $a \ne 0$ in F has an inverse $a^{-1} \in F$ satisfying $a a^{-1} = 1$. We have
noted that $Z_p$ is a field for any prime p. F[x] denotes the polynomials with coefficients in
F. A polynomial f(x) can be put into the form $a_n x^n + a_{n-1} x^{n-1} + \ldots + a_0$. If $f(x) \ne 0$, the
standard form is to take $a_n \ne 0$. In this case, n is called the degree of f (written deg(f)). If
f = 0, f is not assigned a degree.
Multiplication and division are performed on polynomials in the usual way. It is easy to
see that deg(fg) = deg(f) + deg(g). For suppose $f(x) = a_n x^n$ + lower degree terms, and
$g(x) = b_m x^m$ + lower degree terms, with $a_n, b_m \ne 0$. Then $f(x)g(x) = a_n b_m x^{m+n}$ + lower
degree terms. By the field properties, $a_n b_m \ne 0$ since $a_n \ne 0$ and $b_m \ne 0$. This proves deg(fg) =
deg(f) + deg(g). The formulas for the degree of a sum are more complicated. It is easy to
see that if deg(f) > deg(g) then deg(f + g) = deg(f). But if deg(f) = deg(g), the leading
coefficients might cancel, so all we can say is deg(f + g) ≤ deg(f), provided f + g ≠ 0. So in all
cases where f + g has a degree we have deg(f + g) ≤ max(deg(f), deg(g)).
There is a strong analogy between the polynomials in F [x] and the integers Z. This is
illustrated in what follows.
Division. We say that f(x)|g(x) if there is a polynomial h(x) such that g(x) = f(x)h(x).
For example, $(x - 1) \mid (x^3 - 1)$ since $x^3 - 1 = (x - 1)(x^2 + x + 1)$. In $Z_7[x]$, we have $(x + 4) \mid (x^2 + 5)$
because $x^2 + 5 = (x + 4)(x + 3)$. (Check this.)
For integers, we have a division algorithm, in which for any n, m with m > 0, we can
divide to get a quotient q and remainder r with 0 ≤ r < m satisfying n = mq + r.
For polynomials, we have a division algorithm, in which for any f(x), g(x) with
g(x) ≠ 0, we can divide (long division of polynomials) to get a quotient q(x) and remainder
r(x) with r = 0 or deg(r(x)) < deg(g(x)) satisfying f(x) = g(x)q(x) + r(x).
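As an illustration of the polynomial division algorithm, here is a minimal Python sketch of long division in $Z_p[x]$. The function name `poly_divmod` and the coefficient-list representation (highest degree first) are assumptions made for this example. It also assumes that p is prime and that Python 3.8 or later is used, so that `pow(c, -1, p)` returns the inverse of c mod p.

```python
def poly_divmod(f, g, p):
    """Long division of f by g in Z_p[x], for a prime p.

    Polynomials are coefficient lists, highest degree first.
    Returns (quotient, remainder) with f = g*quotient + remainder mod p.
    The remainder list may carry leading zeros.
    """
    f = [c % p for c in f]
    g = [c % p for c in g]
    while g and g[0] == 0:          # drop leading zeros of the divisor
        g = g[1:]
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    lead_inverse = pow(g[0], -1, p)  # inverse of the leading coefficient
    quotient = []
    remainder = f[:]
    while len(remainder) >= len(g):
        coeff = (remainder[0] * lead_inverse) % p
        quotient.append(coeff)
        # subtract coeff * g, aligned with the leading term of the remainder
        for i in range(len(g)):
            remainder[i] = (remainder[i] - coeff * g[i]) % p
        remainder.pop(0)             # leading term is now zero
    return quotient, remainder


# x^2 + 5 divided by x + 4 in Z_7[x]: quotient x + 3, remainder 0
print(poly_divmod([1, 0, 5], [1, 4], 7))
```

Applied to the example from the text, the quotient list is `[1, 3]`, that is, x + 3, with zero remainder.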
For integers, we have the Euclidean algorithm, in which a series of divisions with
remainders leads to finding the GCD d of the two numbers m and n. d is characterized by
two facts: 1) d|m and d|n. 2) If e|m and e|n, then e|d. (The GCD of m and n is a multiple
of any common divisor of m and n.)
For polynomials, we have the Euclidean algorithm, in which a series of divisions with
remainders leads to finding the GCD d(x) of the two polynomials f(x) and g(x). d(x) is
characterized by two facts: 1) d(x)|f(x) and d(x)|g(x). 2) If e(x)|f(x) and e(x)|g(x), then
e(x)|d(x). (The GCD of f(x) and g(x) is a multiple of any common divisor of f(x) and g(x).)
For integers, the GCD d = (a, b) is a linear combination of a and b: d = am + bn. The
coefficients m and n can be computed from the Euclidean algorithm.
For polynomials, the GCD d(x) = (f (x), g(x)) is a linear combination of f (x) and
g(x): d(x) = a(x)f (x) + b(x)g(x). a(x) and b(x) can be computed from the Euclidean
algorithm.
For the integers, if a|bc and (a, b) = 1, then a|c. This same result holds for polynomials,
with the same proof.
For the integers, if a|b and b|a, then b = ±a. For polynomials, if f(x)|g(x) and
g(x)|f(x) then g(x) = cf(x) where c is a constant (that is, an element of F) unequal to 0.
For integers, the numbers ±1 are the only integers dividing everything. These numbers are
called units. These are the integers of absolute value 1. For polynomials, the constants c ≠ 0
are the only polynomials dividing all polynomials. These are the polynomials of degree 0.
Non-zero constants are called units.
For integers, primes are the integers n greater than 1 whose only positive divisors are 1 and
n. If we broaden this definition to include negative numbers, we can say that n is a prime
if it is not a unit and its only divisors are units and a unit times n.
The corresponding polynomials are called irreducible. These are polynomials p(x)
whose only divisors are units or a unit times p(x).
We have unique factorization of polynomials. Any non-constant polynomial f(x) is the product of irreducible polynomials. The factorization is unique up to order of factors and units. For
example,
\[ x^2 - 4 = (x - 2)(x + 2) = (-1/2)(-2x + 4)(x + 2). \]
All polynomials of the first degree are irreducible. Polynomials x − a and x − b are relatively
prime if a ≠ b. This can be seen since (x − a) − (x − b) = b − a. So any common divisor divides
the non-zero constant b − a and is therefore a unit.
Some Important Polynomial Results. These results do not have clear analogies in the
integers.
Theorem: (The Factor Theorem.) If f(x) is a polynomial in F[x] and f(a) = 0 for some
a ∈ F, then (x − a) is a factor of f(x).
Proof: Divide f(x) by x − a to get a remainder: f(x) = (x − a)q(x) + r, where r ∈ F.
Substituting x = a gives 0 = f(a) = (a − a)q(a) + r = r. So f(x) = (x − a)q(x) and
(x − a)|f(x).
Generalizing, we have: If f(x) has distinct zeros $a_1, a_2, \ldots, a_k$, then $(x - a_1) \cdots (x - a_k) \mid f(x)$.
For a proof, we have $f(x) = (x - a_1) f_1(x)$. Substituting $x = a_2$, we get $0 = (a_2 - a_1) f_1(a_2)$.
Since $a_2 - a_1 \ne 0$, we have $f_1(a_2) = 0$, so $f_1(x) = (x - a_2) f_2(x)$, and $f(x) = (x - a_1)(x - a_2) f_2(x)$. Continuing in this way, or by induction, we get the result.
Corollary. A polynomial of degree n has at most n distinct roots.
Proof: Otherwise, the polynomial would be divisible by a polynomial of degree greater
than n, which is impossible.
We can use this theorem to prove Wilson's Theorem, with the help of Fermat's theorem. In
$Z_p$, we have $x^p = x$ for x = 0, 1, ..., p − 1. So the polynomial $x^p - x$ has these p elements
as zeros. Thus $x^p - x = x(x - 1)(x - 2) \cdots (x - (p - 1))$ (see footnote 8). Now divide by x and substitute
x = 0 to get $-1 = (-1)(-2) \cdots (-(p - 1))$. Taking p as odd, we get −1 = (p − 1)! in $Z_p$, which
is Wilson's theorem. The case p = 2 is trivial, because here, 1 = −1.
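Both the polynomial identity and Wilson's theorem can be checked for small primes. The sketch below is illustrative only; the function names `product_of_linear_factors` and `wilson_holds` are assumptions made for this example.

```python
from math import factorial


def product_of_linear_factors(p):
    """Coefficients of x(x-1)(x-2)...(x-(p-1)) reduced mod p.

    The list is ordered from the highest degree term to the constant term.
    """
    coeffs = [1]
    for a in range(p):
        # multiply the current polynomial by (x - a)
        shifted = coeffs + [0]                          # times x
        scaled = [0] + [(-a * c) % p for c in coeffs]   # times -a
        coeffs = [(s + t) % p for s, t in zip(shifted, scaled)]
    return coeffs


def wilson_holds(p):
    """True when (p-1)! is congruent to -1 mod p."""
    return factorial(p - 1) % p == p - 1


# For p = 5 the product is x^5 - x, that is x^5 + 4x mod 5.
print(product_of_linear_factors(5))
print([p for p in range(2, 20) if wilson_holds(p)])
```

The second line lists exactly the primes below 20, consistent with the fact that the congruence fails for composite moduli.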
Primitive Roots.
We let $Z_p^*$ be the set of non-zero elements of $Z_p$. The set $Z_p^*$ is closed under multiplication and
taking inverses. (Technically, it is a group.) $Z_p^*$ has p − 1 elements. By Fermat's theorem,
$a^{p-1} = 1$. For $a \in Z_p^*$ we define the order of a as the least positive integer k such that $a^k = 1$.
We now show:
Theorem: If the order of a is h, then h|(p − 1).
For the proof, we divide p − 1 by h and get a remainder: p − 1 = hq + r with 0 ≤ r < h.
Then
\[ 1 = a^{p-1} = a^{hq+r} = a^{hq} a^r = (a^h)^q a^r = 1^q a^r = a^r. \]
Since h was the least positive integer k with $a^k = 1$ and r < h, r cannot be positive. But r ≥ 0 as a
remainder. It follows that r = 0, so p − 1 = hq and h|(p − 1). This is the result.
Theorem: Let $a \in Z_p^*$. If h is the order of a, then the h elements $1, a, a^2, \ldots, a^{h-1}$ are all
distinct.
Proof: If $a^i = a^j$ with 0 ≤ i ≤ j ≤ h − 1, then 0 ≤ j − i < h, and $a^{j-i} = 1$. Since j − i < h,
it follows that j − i = 0 by the definition of order.
Theorem: Let $a \in Z_p^*$. If h is the order of a, then $a^m = a^n$ if and only if m ≡ n mod h.
Proof: If m ≡ n mod h, then m = n + hs. So $a^m = a^{n+hs} = a^n a^{hs} = a^n$. Conversely, suppose
$a^n = a^m$. Then $a^{n-m} = 1$. Dividing n − m by h, we get n − m = hq + r, with 0 ≤ r < h.
Then $a^{hq+r} = 1$. Using $a^h = 1$, we get $a^r = 1$. Since r < h, we must have r = 0 using the
definition of order. Thus, n − m = hq and m ≡ n mod h.
Footnote 8 (to the factorization of $x^p - x$ used for Wilson's theorem): The polynomial result also calls for a non-zero constant. We can see it is 1 by comparing the coefficient
of $x^p$ on both sides of this equation.
Theorem: Let $a \in Z_p^*$, and let h be the order of a. Then any element of order h is necessarily
a power of a.
Proof: There are h powers of a by a previous theorem. They all satisfy the equation
$x^h - 1 = 0$. But an equation of degree h has at most h solutions. So all the solutions of this
equation are powers of a. Therefore, since it satisfies this equation, every element of order h
must be a power of a.
Theorem: Let $a \in Z_p^*$, and let h be the order of a. Then there are exactly φ(h) elements of
order h. These are $a^k$ where 1 ≤ k ≤ h with (k, h) = 1.
Proof: We first show that these φ(h) elements have order h. Let $b = a^k$ where (k, h) = 1
and 1 ≤ k ≤ h. Then $b^h = a^{kh} = (a^h)^k = 1$. Now suppose that $b^s = 1$ with s > 0. This gives
$a^{ks} = 1$, and so h|ks. Since (k, h) = 1 this implies h|s. Since s > 0, we have s ≥ h. This
shows that the smallest positive power of b equal to 1 is the h-th. Thus the order of b is h.
We now show that no other power of a has order h. Let $c = a^s$, with (s, h) > 1. Let d|h and
d|s with d > 1. So s = dt and $c = a^{dt}$. Then $c^{h/d} = (a^{dt})^{h/d} = a^{ht} = 1$. So c cannot have
order h, since a power of c smaller than the h-th already yields 1. This completes
the proof.
We know that the order of any element divides p − 1. We now prove that any divisor d of
(p − 1) is the order of some element (and so of exactly φ(d) elements).
Theorem: Let d|(p − 1). Then there are φ(d) elements in $Z_p^*$ with order d.
Proof: Let d|(p − 1) and define N(d) as the number of elements of order d. By the previous theorem, there are
either 0 elements of order d or φ(d) such elements.
So N(d) ≤ φ(d). Since every element in $Z_p^*$ has some order dividing p − 1, we have $\sum_{d \mid (p-1)} N(d) = p - 1$. Thus,
\[ p - 1 = \sum_{d \mid (p-1)} N(d) \le \sum_{d \mid (p-1)} \varphi(d) = p - 1. \]
(The last equality is a previous theorem.) This shows that all of the inequalities involving
≤ must be replaced by equalities. Therefore N(d) = φ(d), and so, for any d|(p − 1) there
are exactly φ(d) elements of $Z_p^*$ having order d.
An element of $Z_p^*$ is called a primitive element (or primitive root) if it has order p − 1. This analysis shows that
there are φ(p − 1) primitive elements, and shows how to find them, once one is known. We
illustrate for p = 7. Here $2^3 = 8 = 1$, so 2 is not a primitive element. $3^2 = 2$ and $3^3 = 27 = -1$, so the order of 3 is not 1, 2, or 3. Since the order divides 6, 3 does
have order 6, and is a primitive root. We list the powers of 3 in the table below:
n      1   2   3   4   5   6
3^n    3   2   6   4   5   1
The above theory shows that the other primitive root is $3^5 = 5$.
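The order computations above can be reproduced with a short brute-force sketch. The function names `order`, `primitive_roots`, and `phi` are assumptions made for this illustration, and the code assumes p is prime and a is not divisible by p.

```python
from math import gcd


def order(a, p):
    """Least positive k with a^k = 1 in Z_p^*."""
    k, power = 1, a % p
    while power != 1:
        power = (power * a) % p
        k += 1
    return k


def primitive_roots(p):
    """All elements of Z_p^* whose order is p - 1."""
    return [a for a in range(1, p) if order(a, p) == p - 1]


def phi(n):
    """Euler's function, counted directly from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


print(order(2, 7), order(3, 7))   # 3 6
print(primitive_roots(7))         # [3, 5]

# For each divisor d of p - 1, count the elements of order d.
p = 13
for d in [d for d in range(1, p) if (p - 1) % d == 0]:
    count = sum(1 for a in range(1, p) if order(a, p) == d)
    print(d, count, phi(d))
```

For p = 13 the two counts on each line agree, as the theorem on φ(d) elements of order d predicts.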
Thur, June 7
Number Theoretic Functions. All functions are understood to be functions of a positive
integer. A function f is called multiplicative if
f (mn) = f (m)f (n) when (m, n) = 1.
We first note that if (m, n) = 1 any divisor d of mn can be written uniquely as d = d1 d2
with d1 |m and d2 |n. Conversely, if (m, n) = 1 and d1 |m and d2 |n, and d = d1 d2 then d|mn.
The proof is straightforward and is clear, using unique factorization.
Theorem: Let f(n) be multiplicative. Define $F(n) = \sum_{d \mid n} f(d)$. Then F(n) is multiplicative.
Proof: Let (m, n) = 1. Using the remark above about divisors of mn, we have
\[ F(mn) = \sum_{d \mid mn} f(d) = \sum_{d_1 \mid m,\, d_2 \mid n} f(d_1 d_2) = \sum_{d_1 \mid m,\, d_2 \mid n} f(d_1) f(d_2) = \Big(\sum_{d_1 \mid m} f(d_1)\Big)\Big(\sum_{d_2 \mid n} f(d_2)\Big) = F(m)F(n). \]
We can use this theorem to prove $\sum_{d \mid n} \varphi(d) = n$. This was previously proved using the definition of φ(d). By the theorem just proved, since φ is multiplicative, we know that $F(n) = \sum_{d \mid n} \varphi(d)$ is multiplicative.
We also know, from the definition, that $\varphi(p^a) = p^a - p^{a-1}$. Therefore,
\[ F(p^a) = \sum_{d \mid p^a} \varphi(d) = 1 + \sum_{i=1}^{a} (p^i - p^{i-1}) = p^a. \]
(The sum is a "telescoping series" and it can be proved by induction on a.) This computation
shows that $F(p^a) = p^a$. Therefore, since F is multiplicative, we have for any n, writing $p^{a_p}$ for the exact power of p dividing n,
\[ F(n) = F\Big(\prod_{p \mid n} p^{a_p}\Big) = \prod_{p \mid n} F(p^{a_p}) = \prod_{p \mid n} p^{a_p} = n. \]
The Möbius Function µ(n). This function is defined as follows.
$\mu(n) = (-1)^k$ if n is the product of k distinct primes. (In particular µ(1) = 1, with k = 0.)
Otherwise, µ(n) = 0.
Thus µ(n) = 0 when n is divisible by the square of a prime.
For example µ(5) = −1, µ(28) = 0, µ(35) = 1.
Theorem: $\sum_{d \mid n} \mu(d) = 0$ if n > 1. If n = 1, the sum is 1.
Proof: It is easy to see that µ(n) is multiplicative. For if m and n are relatively prime
with m and n as the product of r and s distinct primes, then mn is the product of r + s
distinct primes. Therefore, $\mu(m)\mu(n) = (-1)^r (-1)^s = (-1)^{r+s} = \mu(mn)$. (If m or n is divisible by the square of a prime, so is mn, and both sides are 0.) Therefore, (using
f(n) = µ(n) in the theorem on adding a function of the divisors of n) it follows that $F(n) = \sum_{d \mid n} \mu(d)$
is multiplicative. We now compute its value for prime powers. If a ≥ 1 then
\[ F(p^a) = \sum_{d \mid p^a} \mu(d) = \mu(1) + \mu(p) = 1 - 1 = 0. \]
Since F is multiplicative, it follows that F(n) = 0 for n > 1. Clearly, F(1) = 1. This proves
the theorem.
An alternate proof is as follows. If n is the product of k prime powers $p_i^{a_i}$, then the only
contribution to the sum $\sum_{d \mid n} \mu(d)$ is when d is the product of some of the distinct primes. The
possibilities are
\[ \{1,\; p_i,\; p_i p_j\ (i < j),\; \ldots,\; p_1 \cdots p_k\}. \]
The contributions to the sum from these divisors are
\[ 1 - k + \binom{k}{2} - \binom{k}{3} + \ldots + (-1)^k. \]
But this is the expansion of $(1 - 1)^k$, which is 0 when k > 0.
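The definition of µ and the theorem on its divisor sum can be checked by direct computation. This is a minimal sketch; the function names `mobius` and `mobius_divisor_sum` are assumptions made for this illustration.

```python
def mobius(n):
    """Moebius function: (-1)^k for a product of k distinct primes, else 0."""
    k = 0
    d = 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:      # d^2 divided the original number
                return 0
            k += 1
        d += 1
    if n > 1:                   # one remaining prime factor
        k += 1
    return (-1) ** k


def mobius_divisor_sum(n):
    """Sum of mobius(d) over all positive divisors d of n."""
    return sum(mobius(d) for d in range(1, n + 1) if n % d == 0)


print(mobius(5), mobius(28), mobius(35))              # -1 0 1
print([mobius_divisor_sum(n) for n in range(1, 11)])  # 1 followed by zeros
```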
Finally, we prove the famous Möbius Inversion Formula: If f(n) and F(n) are each defined
for n ≥ 1 and
\[ F(n) = \sum_{d \mid n} f(d), \quad n \ge 1, \tag{2} \]
then
\[ f(n) = \sum_{d \mid n} \mu(d) F(n/d). \tag{3} \]
For example, setting $f(n) = a_n$, $F(n) = b_n$, we are given
\[ b_1 = a_1,\quad b_2 = a_1 + a_2,\quad b_3 = a_1 + a_3,\quad b_4 = a_1 + a_2 + a_4,\ \ldots \]
We can solve these successively for $a_n$ as functions of the b's:
\[ a_1 = b_1,\quad a_2 = b_2 - b_1,\quad a_3 = b_3 - b_1,\quad a_4 = b_4 - b_2,\ \ldots \]
The inversion formula gives the answer directly. For example, $a_{20} = b_{20} - b_{10} - b_4 + b_2$.
To prove the inversion formula, we directly compute the right hand side of Equation (3). In
this case, we let D designate a divisor of n/d.
\[ \sum_{d \mid n} \mu(d) F(n/d) = \sum_{d \mid n} \mu(d) \sum_{D \mid (n/d)} f(D) = \sum_{d \mid n} \sum_{D \mid (n/d)} \mu(d) f(D). \]
Note that D|(n/d) and dD|n and d|(n/D) are all equivalent. In the above sum we combine
the coefficients of f(D). They sum to $\sum_{d \mid (n/D)} \mu(d)$. But this has been shown to be 0,
except for D = n, when it is 1. This is the coefficient of f(n). Thus, the sum is f(n), which
is the result.
Note: Rearranging such a double sum can be confusing, so we illustrate with an example
for n = 6. We have
F(1) = f(1)
F(2) = f(1) + f(2)
F(3) = f(1)        + f(3)
F(6) = f(1) + f(2) + f(3) + f(6)
We now multiply F(d) by µ(6/d). This gives
µ(6)F(1) = µ(6)f(1)
µ(3)F(2) = µ(3)f(1) + µ(3)f(2)
µ(2)F(3) = µ(2)f(1)           + µ(2)f(3)
µ(1)F(6) = µ(1)f(1) + µ(1)f(2) + µ(1)f(3) + µ(1)f(6)
Adding column by column, the coefficients of f(1), f(2), and f(3) each sum to 0, and the coefficient of f(6) is µ(1) = 1. So we get the result $f(6) = \sum_{d \mid 6} \mu(d) F(6/d)$. We used the formula
$\sum_{d \mid n} \mu(d) = 0$ if n ≠ 1,
and equals 1 if n = 1.
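The inversion formula can also be tested numerically. In the sketch below the test function f(n) = n² + 1 is an arbitrary choice, and the helper names `divisors`, `mobius`, `summatory`, and `invert` are assumptions made for this illustration.

```python
def divisors(n):
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius(n):
    """Moebius function computed by trial division."""
    k = 0
    d = 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0
            k += 1
        d += 1
    if n > 1:
        k += 1
    return (-1) ** k


def summatory(f, n):
    """F(n) = sum of f(d) over the divisors d of n, as in Equation (2)."""
    return sum(f(d) for d in divisors(n))


def invert(F, n):
    """Right hand side of Equation (3): sum of mu(d) F(n/d) over d | n."""
    return sum(mobius(d) * F(n // d) for d in divisors(n))


f = lambda n: n * n + 1           # arbitrary test function
F = lambda n: summatory(f, n)

print(all(invert(F, n) == f(n) for n in range(1, 50)))
print(invert(F, 20) == F(20) - F(10) - F(4) + F(2))
```

The second check mirrors the example $a_{20} = b_{20} - b_{10} - b_4 + b_2$ given earlier.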
Mon, June 11.
Error checking. Any positive number n can be written uniquely in base 10 in the form
\[ n = a_0 + 10 a_1 + 10^2 a_2 + \ldots + 10^k a_k, \]
where each $a_i$ is a "digit": $0 \le a_i \le 9$ and $a_k$, the leading digit, is not zero. The number
is written as $a_k \ldots a_2 a_1 a_0$. There are a few simple ways to find n mod 9, 10, and 11. Mod
10 is the simplest, since $n \equiv a_0$ mod 10. We can find n mod 9 by noting that 10 ≡ 1 mod 9.
Therefore
\[ n = a_0 + 10 a_1 + 10^2 a_2 + \ldots + 10^k a_k \equiv a_0 + a_1 + a_2 + \ldots + a_k = S(n) \pmod 9 \]
where we define S(n) as the sum of the digits of n: $S(n) = \sum a_i$. For example, if n = 345,682,
S(n) = 28, so n ≡ 28 ≡ 10 ≡ 1 mod 9, where we have applied this formula repeatedly to the
sum. When finding n mod 9, we can simply add the digits mod 9. Mod 11 is similar. Using
10 ≡ −1 mod 11 we find:
\[ n = a_0 + 10 a_1 + 10^2 a_2 + \ldots + 10^k a_k \equiv a_0 - a_1 + a_2 - \ldots + (-1)^k a_k = A(n) \pmod{11} \]
where we define A(n) as the alternating sum of the digits of n: $A(n) = \sum (-1)^i a_i$. For
example, if n = 456,821, A(n) = 1 − 2 + 8 − 6 + 5 − 4 = 14 − 12 = 2, so n ≡ 2 mod 11.
By checking a calculation mod 9, 10, or 11, many errors can be found. For example, we
immediately know that the alleged result 437 × 538 = 234,109 is wrong because we see that
the last digit of the answer should be 6, not 9. By looking at the last digit, we have
checked mod 10. This doesn't work for the alleged computation 437 × 538 = 233,106. But if
we check mod 9 we find 437 ≡ 5 and 538 ≡ 7 mod 9. Their product should be congruent to
5 × 7 = 35 ≡ 8 mod 9. But the alleged answer is congruent to 6. This inconsistency shows
that the calculation is incorrect. The calculation 437 × 538 = 231,506 passes the mod 9 and
10 tests, but it fails the 11 test: By observation 437 ≡ 8 mod 11 and 538 ≡ 10 mod 11, so
the product should be congruent to 80 or 3 mod 11. But the alleged answer is congruent
to 6 − 0 + 5 − 1 + 3 − 2 = 11 or 0 mod 11. The correct answer is 235,106. The mod 10
calculation checks the last digit. The mod 9 calculation will usually catch an incorrect digit
in the answer. The mod 11 check will catch a transposition of two adjacent unequal digits.
Realistically, if the numbers are not too large, a hand calculator is a fine direct check of the
answer. Also, the method is not fail-safe. By checking mod 9, 10, 11, you are checking
your answer mod 990. Any number congruent to the correct answer mod 990 will pass these
tests. So these techniques can be described as recreational mathematics.
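The three checks can be sketched in Python using only the digits, as in the text. The function names `digit_sum`, `alternating_sum`, `residue`, and `passes_check` are assumptions made for this illustration.

```python
def digit_sum(n):
    """S(n): the sum of the decimal digits of n."""
    return sum(int(c) for c in str(n))


def alternating_sum(n):
    """A(n): a_0 - a_1 + a_2 - ..., starting from the last digit."""
    return sum((-1) ** i * int(c) for i, c in enumerate(reversed(str(n))))


def residue(n, m):
    """n mod m for m in (9, 10, 11), computed from the digits only."""
    if m == 9:
        return digit_sum(n) % 9
    if m == 10:
        return int(str(n)[-1])
    if m == 11:
        return alternating_sum(n) % 11
    raise ValueError("only the moduli 9, 10 and 11 are supported")


def passes_check(x, y, claimed, m):
    """True when the claimed product x*y is consistent mod m."""
    return (residue(x, m) * residue(y, m)) % m == residue(claimed, m)


for claimed in (234109, 233106, 231506, 235106):
    print(claimed, [passes_check(437, 538, claimed, m) for m in (10, 9, 11)])
```

Only the correct product 235,106 passes all three checks among the four candidates discussed above.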
The Legendre symbol. In what follows, we take p and q as odd primes. A number a
is called a quadratic residue mod p if $a \not\equiv 0$ mod p and the equation $x^2 \equiv a$ mod p has a
solution. That is, $a = b^2$ for some b. a is called a quadratic non-residue mod p if $a \not\equiv 0$ mod p
and the equation $x^2 \equiv a$ mod p has no solution. The Legendre symbol $\left(\frac{a}{p}\right)$ is defined as
follows:
\[ \left(\frac{a}{p}\right) = 1 \text{ if } a \text{ is a quadratic residue mod } p \]
\[ \left(\frac{a}{p}\right) = -1 \text{ if } a \text{ is a quadratic non-residue mod } p \]
\[ \left(\frac{a}{p}\right) = 0 \text{ if } a \equiv 0 \bmod p \]
While, on the face of it, this symbol is simply a code to identify quadratic residues, it enjoys
many useful algebraic properties. Before beginning, we start with the following results.
Theorem: Any quadratic residue satisfies the equation $x^{(p-1)/2} - 1 \equiv 0$ mod p.
Proof: If a is a quadratic residue mod p, then $a = b^2$ for some $b \in Z_p^*$. Therefore,
\[ a^{(p-1)/2} = (b^2)^{(p-1)/2} = b^{p-1} = 1 \text{ in } Z_p^*. \]
This proves the result.
Theorem: There are (p − 1)/2 quadratic residues, and (p − 1)/2 quadratic non-residues.
Proof: We show that the numbers $i^2$ where 1 ≤ i ≤ (p − 1)/2, namely the numbers
$1^2, 2^2, 3^2, \ldots, ((p - 1)/2)^2$, are distinct quadratic residues mod p. To see this, suppose
$i^2 \equiv j^2$ mod p where 1 ≤ i ≤ j ≤ (p − 1)/2. Then $(j - i)(j + i) = j^2 - i^2 \equiv 0$ mod p.
But 0 ≤ j − i < j + i < p, since i, j ≤ (p − 1)/2 < p/2. Since $p \nmid (j + i)$, we must have
p|(j − i) and so i = j. Thus, the squares $i^2$ are distinct for 1 ≤ i ≤ (p − 1)/2. The numbers
i with (p + 1)/2 ≤ i ≤ p − 1 can be written as (p − i') for 1 ≤ i' ≤ (p − 1)/2, so they are the negatives
mod p of the numbers i for which 1 ≤ i ≤ (p − 1)/2. So they yield the same squares. Thus
the quadratic residues are simply the (p − 1)/2 numbers $i^2$, where 1 ≤ i ≤ (p − 1)/2. The
remaining (p − 1)/2 numbers mod p are non-residues.
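We now state and prove the algebraic properties of the Legendre symbol.
1. $\left(\frac{a}{p}\right) \equiv a^{(p-1)/2}$ mod p.
2. $\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$.
3. If a ≡ b mod p then $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right)$.
4. $\left(\frac{a^2}{p}\right) = 1$ if $a \not\equiv 0$ mod p.
5. $\left(\frac{-1}{p}\right) = (-1)^{(p-1)/2}$.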
Proofs. 1. We already know that if a is a quadratic residue, then $a^{(p-1)/2} \equiv 1$ mod p. This is
equation 1 in that case. We consider the equation $x^{p-1} - 1 = (x^{(p-1)/2} - 1)(x^{(p-1)/2} + 1) = 0$, which by Fermat's theorem is satisfied by every element of $Z_p^*$.
We have shown that the first factor has the (p − 1)/2 quadratic residues as its zeros. These
are the only zeros because a polynomial of degree n can have at most n zeros. Therefore,
the non-residues are zeros of the second factor.
Namely $a^{(p-1)/2} + 1 \equiv 0$ mod p for any non-residue. We can rewrite this as $a^{(p-1)/2} \equiv \left(\frac{a}{p}\right)$ mod p. This shows that equation 1 is true when
a is a non-residue mod p.
2. We have $\left(\frac{ab}{p}\right) \equiv (ab)^{(p-1)/2} = a^{(p-1)/2} b^{(p-1)/2} \equiv \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$ mod p. But since both sides
of equation 2 are ±1 the congruence implies equality.
3 and 4 follow from the definition. We know 5 as a congruence mod p from 1. But since
both sides are ±1, we have equality.
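Property 1 gives a direct way to compute the symbol, and the other properties can then be spot-checked. The sketch below is illustrative; the function names `legendre` and `quadratic_residues` are assumptions made for this example, and p is assumed to be an odd prime.

```python
def legendre(a, p):
    """Legendre symbol (a/p) computed from a^((p-1)/2) mod p."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def quadratic_residues(p):
    """The (p-1)/2 residues i^2 mod p for 1 <= i <= (p-1)/2, sorted."""
    return sorted({(i * i) % p for i in range(1, (p - 1) // 2 + 1)})


print(quadratic_residues(13))      # [1, 3, 4, 9, 10, 12]
print(legendre(5, 13))             # -1, since 5 is not in the list above
print(legendre(-1, 13), legendre(-1, 7))
```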
While the basic equation $\left(\frac{a}{p}\right) \equiv a^{(p-1)/2}$ mod p can be used to calculate $\left(\frac{a}{p}\right)$, the following
method does not involve computing powers of a. It is used for a proof of the quadratic
reciprocity law, proved in the next section. We illustrate it with a simple computation to
calculate $\left(\frac{5}{13}\right)$. We start by multiplying 5 by all the numbers from 1 through 6. (This is
(p − 1)/2 for p = 13.) We then reduce mod 13. We get
5 × 1 ≡ 5 mod 13
5 × 2 ≡ 10 mod 13 xx
5 × 3 ≡ 2 mod 13
5 × 4 ≡ 7 mod 13 xx
5 × 5 ≡ 12 mod 13 xx
5 × 6 ≡ 4 mod 13
An xx mark was placed after each remainder which was bigger than 6, which is (p − 1)/2 for p = 13.
In these cases replace the remainder r by r − 13. This has the effect of making it negative,
keeping the congruence, and except for the sign, putting the number in the range from 1
through 6:
5 × 1 ≡ 5 mod 13
5 × 2 ≡ −3 mod 13 xx
5 × 3 ≡ 2 mod 13
5 × 4 ≡ −6 mod 13 xx
5 × 5 ≡ −1 mod 13 xx
5 × 6 ≡ 4 mod 13
Note that except for the sign, the remainders are a rearrangement of the numbers from
1 through 6. Now multiply these congruences to get $5^6 \cdot 6! \equiv (-1)^3\, 6!$ mod 13, so $5^6 \equiv -1$ mod 13. This shows that $\left(\frac{5}{13}\right) = -1$, using the congruence 1.
We now state and prove this result in general.
Theorem: Let (a, p) = 1. For each i such that 1 ≤ i ≤ (p − 1)/2, let $u_i$ be the remainder
of ia mod p. Let n = the number of remainders $u_i > p/2$. Then $\left(\frac{a}{p}\right) = (-1)^n$.
Proof: Suppose there are n remainders $r_1, \ldots, r_n$ which are greater than p/2, and k remainders $s_1, \ldots, s_k$ which are less than p/2. Then k + n = (p − 1)/2. Then
(a) The $r_i$ are distinct, since if 1 ≤ i ≤ j ≤ (p − 1)/2, and ia ≡ ja mod p, we must
have i ≡ j and so i = j. Similarly, the $s_i$ are all distinct.
(b) The values of $p - r_i$ are all distinct, since if $p - r_i = p - r_j$, we have $r_i = r_j$ and
so i = j.
(c) Further, we cannot have $p - r_i = s_j$. For $r_i \equiv \sigma a$ and $s_j \equiv \tau a$ where 1 ≤
σ, τ < p/2. So if $p - r_i = s_j$, we have $p = r_i + s_j \equiv \sigma a + \tau a = (\sigma + \tau)a$ mod p. But
0 < σ + τ < p/2 + p/2 = p. So (σ + τ, p) = 1. Canceling, we get 0 ≡ a mod p which is a
contradiction.
Thus the (p − 1)/2 elements $(p - r_1), \ldots, (p - r_n), s_1, \ldots, s_k$ are all distinct and in the range
from 1 through (p − 1)/2. Thus, they are a rearrangement of the numbers 1, 2, ..., (p − 1)/2.
Since the remainders of ia are the $r_i$ and $s_j$, we get after multiplying
\[ a^{(p-1)/2} \cdot 1 \cdot 2 \cdots \tfrac{p-1}{2} = 1a \cdot 2a \cdots \tfrac{p-1}{2}a \equiv r_1 \cdots r_n\, s_1 \cdots s_k \pmod p \]
\[ \equiv (-1)^n (p - r_1) \cdots (p - r_n)\, s_1 \cdots s_k \pmod p \]
\[ = (-1)^n\, 1 \cdot 2 \cdots \tfrac{p-1}{2}. \]
Canceling the product $1 \cdot 2 \cdots \frac{p-1}{2}$, which is prime to p, we get $a^{(p-1)/2} \equiv (-1)^n$ mod p. In view of the basic
congruence $\left(\frac{a}{p}\right) \equiv a^{(p-1)/2}$ mod p,
this gives the result.
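The counting rule in this theorem can be sketched and compared against the power formula. The function names `symbol_by_counting` and `symbol_by_power` are assumptions made for this illustration; p is assumed to be an odd prime not dividing a.

```python
def symbol_by_counting(a, p):
    """(-1)^n, where n counts the remainders of i*a mod p exceeding p/2."""
    half = (p - 1) // 2
    n = sum(1 for i in range(1, half + 1) if (i * a) % p > half)
    return (-1) ** n


def symbol_by_power(a, p):
    """The symbol computed from a^((p-1)/2) mod p, for comparison."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


print(symbol_by_counting(5, 13))   # -1: the remainders 10, 7, 12 exceed 6
print(all(symbol_by_counting(a, 13) == symbol_by_power(a, 13)
          for a in range(1, 13)))
```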
Tues., June 12
The Quadratic Reciprocity Law. The analysis continues. Recall that $r_1, \ldots, r_n, s_1, \ldots, s_k$
is a rearrangement, mod p, of the numbers ia, 1 ≤ i ≤ (p − 1)/2. Further, the numbers
$p - r_1, \ldots, p - r_n, s_1, \ldots, s_k$ are a rearrangement of the numbers 1, 2, ..., (p − 1)/2.
If m is divided by n, leaving a remainder r, we have m = nq + r, where 0 ≤ r < n. So
m/n = q + r/n, and 0 ≤ r/n < 1. Thus, q = [m/n], and m = n[m/n] + r. Here [x] is the
greatest integer function. We now bring the quotient into play in the above analysis.
Write ia = p[ia/p] + r, for 1 ≤ i ≤ (p − 1)/2. Here r will be one of the $r_i$ or $s_j$ of the previous
theorem. We sum over all i to get
\[ a \sum_{i=1}^{(p-1)/2} i = p \sum_{i=1}^{(p-1)/2} [ia/p] + \sum_i r_i + \sum_j s_j \tag{4} \]
Since $p - r_1, \ldots, p - r_n, s_1, \ldots, s_k$ is a rearrangement of the numbers from 1 to (p − 1)/2,
$\sum_{i=1}^{(p-1)/2} i$ can be computed in two ways. It is $\sum_i (p - r_i) + \sum_j s_j$, or $np - \sum r_i + \sum s_j$. Also
we can use the high school result $\sum_{i=1}^{N} i = N(N + 1)/2$. In this case, N = (p − 1)/2, so
$N(N + 1)/2 = \frac{1}{2} \cdot \frac{p-1}{2} \cdot \frac{p+1}{2} = \frac{p^2 - 1}{8}$. Thus
\[ \frac{p^2 - 1}{8} = \sum_{i=1}^{(p-1)/2} i = np - \sum r_i + \sum s_j. \]
Subtracting this from Equation (4), we get
\[ (a - 1)(p^2 - 1)/8 = p \sum_{i=1}^{(p-1)/2} [ia/p] + 2 \sum r_i - np. \tag{5} \]
Now take this mod 2, using p ≡ 1 mod 2:
\[ (a - 1)(p^2 - 1)/8 \equiv \sum_{i=1}^{(p-1)/2} [ia/p] - n \pmod 2 \tag{6} \]
We take two cases:
Case 1. a is odd. Then a − 1 is even and we have
\[ 0 \equiv \sum_{i=1}^{(p-1)/2} [ia/p] - n \pmod 2 \]
So $n \equiv \sum_{i=1}^{(p-1)/2} [ia/p]$ mod 2, and so by the previous result
\[ \left(\frac{a}{p}\right) = (-1)^{\sum_{i=1}^{(p-1)/2} [ia/p]} \quad \text{when } a \text{ is odd.} \]
Case 2. a = 2. Here equation (6) becomes
\[ (p^2 - 1)/8 \equiv \sum_{i=1}^{(p-1)/2} [2i/p] - n \pmod 2 \]
In this case, since i < p/2, we have 2i < p, so [2i/p] = 0, and the sum vanishes. Therefore
we have $n \equiv (p^2 - 1)/8$ mod 2, and by the previous theorem,
\[ \left(\frac{2}{p}\right) = (-1)^{(p^2 - 1)/8} \]
This result can be read simply as
\[ \left(\frac{2}{p}\right) = 1 \text{ if } p \equiv \pm 1 \bmod 8, \qquad \left(\frac{2}{p}\right) = -1 \text{ if } p \equiv 4 \pm 1 \bmod 8. \]
The quadratic reciprocity law states that if p and q are different odd primes, then
\[ \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} \]
Our result, so far, gives
\[ \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\sum_{i=1}^{(p-1)/2} [iq/p] + \sum_{j=1}^{(q-1)/2} [jp/q]} \]
We shall show these are equivalent by proving these two exponents are equal.
Wed., June 13
We now show that
\[ \frac{p-1}{2} \cdot \frac{q-1}{2} = \sum_{i=1}^{(p-1)/2} [iq/p] + \sum_{j=1}^{(q-1)/2} [jp/q] \]
for distinct odd primes p and q. The combinatorial proof is as follows. We consider all
ordered couples (i, j) with 1 ≤ i ≤ (p − 1)/2 and 1 ≤ j ≤ (q − 1)/2. Since there are (p − 1)/2
choices for i and (q − 1)/2 choices for j, there is a total of $\frac{p-1}{2} \cdot \frac{q-1}{2}$ such couples. We
now count these couples in a different way:
Case 1. jp < iq. This is the condition j < iq/p. For fixed i, the j's can be
1, 2, ..., [iq/p], so there are [iq/p] possibilities for j. (All of these are allowed, since iq/p < q/2 gives [iq/p] ≤ (q − 1)/2.) So for 1 ≤ i ≤ (p − 1)/2, there is a total
of $\sum_{i=1}^{(p-1)/2} [iq/p]$ couples (i, j) for which jp < iq.
Case 2. iq < jp. The same proof shows that in this case there are $\sum_{j=1}^{(q-1)/2} [jp/q]$
couples satisfying this condition.
Finally, jp = iq is not possible, because this implies p|iq and so p|i, which is not
possible because 1 ≤ i ≤ (p − 1)/2.
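Summarizing, the full quadratic reciprocity law is: If p and q are distinct odd primes, then
\[ \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}; \qquad \left(\frac{2}{p}\right) = (-1)^{(p^2 - 1)/8}; \qquad \left(\frac{-1}{p}\right) = (-1)^{(p-1)/2} \]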
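The three statements, together with the lattice-point count used in the proof, can be spot-checked for small primes. This is a sketch under stated assumptions: the function names `legendre` and `floor_sum` and the list of test primes are choices made for this illustration.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, from a^((p-1)/2) mod p."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def floor_sum(a, p):
    """Sum of [i*a/p] for 1 <= i <= (p-1)/2."""
    return sum((i * a) // p for i in range(1, (p - 1) // 2 + 1))


primes = [3, 5, 7, 11, 13, 17, 19, 23]
for p in primes:
    # supplementary laws for 2 and -1
    assert legendre(2, p) == (-1) ** ((p * p - 1) // 8)
    assert legendre(-1, p) == (-1) ** ((p - 1) // 2)
    for q in primes:
        if p != q:
            exponent = ((p - 1) // 2) * ((q - 1) // 2)
            # the counting identity proved above
            assert floor_sum(q, p) + floor_sum(p, q) == exponent
            # the reciprocity law itself
            assert legendre(p, q) * legendre(q, p) == (-1) ** exponent
print("all checks passed")
```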
The Congruence $x^2 \equiv a$ mod $p^n$. If p is any prime and $x^2 \equiv a$ mod $p^n$ with n > 1, then
clearly $x^2 \equiv a$ mod p. We shall show that the converse is true: If p is odd, n > 1 and
$\left(\frac{a}{p}\right) = 1$, then the equation $x^2 \equiv a$ mod $p^n$ has exactly two solutions, $x \equiv \pm b$ mod $p^n$.
To see this, suppose we have a solution x ≡ b mod p for the equation $x^2 \equiv a$ mod p. We
shall show how to "lift" this solution to a solution mod $p^2$. We have $(a - b^2)/p = c$, an integer. The
solution to the mod p congruence may be written x = b + pt. Substituting into the congruence
$x^2 \equiv a$ mod $p^2$, we get $(b + pt)^2 \equiv a$ mod $p^2$, or $b^2 + 2bpt + p^2 t^2 \equiv a$ mod $p^2$. Using $(a - b^2)/p = c$,
this becomes $2bpt \equiv pc$ mod $p^2$. This is equivalent to 2bt ≡ c mod p. Since (2b, p) = 1, this linear equation has a
unique solution mod p, say t = d + ps. This gives x = b + p(d + ps), or x ≡ b + pd mod $p^2$. In
the same way, we can lift this solution to a solution mod $p^3$, and ultimately, by induction,
to $p^n$.
We illustrate with an example. Consider the congruence $x^2 \equiv 14$ mod 125. We first work
with $x^2 \equiv 14 \equiv 4$ mod 5. One solution is x ≡ 2 mod 5. Writing x = 2 + 5t, we work
with the congruence $x^2 \equiv 14$ mod 25. This gives $4 + 20t + 25t^2 \equiv 14$ mod 25. Simplifying,
20t ≡ 10 mod 25. Dividing by 5, we get 4t ≡ 2 mod 5. Solving, t ≡ 3 mod 5. Writing
t = 3 + 5s, this gives x = 2 + 5(3 + 5s) = 17 + 25s. This gives x mod 25. Substituting into the
original congruence, we get $(17 + 25s)^2 \equiv 14$ mod 125. Simplifying, 275 + 34·25s ≡ 0 mod 125.
Dividing by 25, 11 + 34s ≡ 0 mod 5, or 1 + 4s ≡ 0 mod 5. Solving, s = 1 + 5u, so x = 17 + 25(1 + 5u).
So finally, the solution is x ≡ 42 mod 125. This is lifted from x ≡ 2 mod 5. To lift from
x ≡ −2 mod 5, we would arrive at x ≡ −42 ≡ 83 mod 125.
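The lifting procedure can be sketched as a short loop. The function name `lift_square_root` is an assumption made for this illustration. The code assumes p is an odd prime not dividing a, that b² ≡ a mod p, and Python 3.8 or later so that `pow(c, -1, p)` gives the inverse mod p.

```python
def lift_square_root(a, p, b, n):
    """Lift a solution b of x^2 = a mod p to a solution mod p^n.

    At each stage x^2 = a mod p^k holds; we write the next solution as
    x + p^k * t and solve the linear congruence 2*x*t = c mod p,
    where c = (a - x^2) / p^k.
    """
    x, modulus = b % p, p
    for _ in range(n - 1):
        c = ((a - x * x) // modulus) % p      # exact division, then reduce
        t = (c * pow(2 * x, -1, p)) % p       # unique solution mod p
        x += modulus * t
        modulus *= p
    return x % modulus


print(lift_square_root(14, 5, 2, 3))   # 42
print(lift_square_root(14, 5, 3, 3))   # 83
```

The two results match the solutions 42 and 83 mod 125 obtained by hand above.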
The case p = 2 needs special consideration. For example the congruence $x^2 \equiv 3$ mod 4 has
no solution, although the congruence mod 2 has the obvious solution x ≡ 1 mod 2. (Here −1 ≡ 1
is the same solution.) Also, the congruence $x^2 \equiv 1$ mod 8 has 4 solutions, x ≡ 1, 3, 5, 7 mod 8.
We do not consider the modulus $2^n$ in these notes.
The congruence $x^2 \equiv a$ mod n can now be reduced to congruences mod prime powers. For,
write $n = n_1 \cdots n_k$, where $n_i = p_i^{a_i}$ with distinct $p_i$. Then the congruence is equivalent
to the system of congruences $x^2 \equiv a$ mod $n_i$, i = 1, ..., k. If the congruence mod $n_i$ has $c_i$
solutions, then the congruence mod n will have $c_1 \cdots c_k$ solutions. This is so, because any choice of
solutions $x \equiv b_i$ mod $n_i$ gives rise to a solution mod n by the Chinese remainder theorem.
The Gaussian Integers.
The Gaussian integers Z[i] are defined as the set of complex numbers $a + bi$, where $a$ and $b$ are
integers. This system is closed under addition, subtraction and multiplication. Division,
done by “rationalizing the denominator”, yields a complex number $a + bi$ where $a$ and $b$ are
rational numbers. We use the notation $\bar{z}$ to indicate the conjugate of $z$. Thus, $\overline{a + bi} = a - bi$.
The following results can easily be checked: $\bar{\bar{z}} = z$, and $\overline{zw} = \bar{z}\,\bar{w}$.
We define $N(z) = z\bar{z}$. In coordinate form, $N(a + bi) = (a + bi)(a - bi) = a^2 + b^2$. We have
$N(zw) = N(z)N(w)$. This equation gives a simple proof that the product of two sums of
squares is a sum of squares. For example, if $n = a^2 + b^2$ and $m = c^2 + d^2$, then $n = N(\alpha)$
and $m = N(\gamma)$, where $\alpha = a + bi$ and $\gamma = c + di$. So $nm = N(\alpha\gamma)$. Computing
$\alpha\gamma = (ac - bd) + i(ad + bc)$, this gives the identity
$$(a^2 + b^2)(c^2 + d^2) = (ac - bd)^2 + (ad + bc)^2.$$
We had this before, but here it arises naturally. We get another expression if we choose
$\gamma = c - di$.
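As a small numerical illustration of the identity, here is a Python sketch that represents a + bi as the integer pair (a, b). The helper names `gauss_mul` and `norm` and the sample values 5 and 13 are assumptions chosen for this example.

```python
def gauss_mul(z, w):
    """Product of Gaussian integers given as pairs: (a + bi)(c + di)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


def norm(z):
    """N(a + bi) = a^2 + b^2."""
    a, b = z
    return a * a + b * b


alpha, gamma = (2, 1), (3, 2)                    # norms 5 and 13
print(gauss_mul(alpha, gamma))                   # (4, 7):  16 + 49 = 65
print(gauss_mul(alpha, (gamma[0], -gamma[1])))   # (8, -1): 64 + 1 = 65
```

Using the conjugate of the second factor gives the other expression of 65 as a sum of two squares.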
Units, associates, and primes. The definition of division in Z[i] is as expected: $\alpha \mid \beta$
if and only if $\beta = \alpha\gamma$ for some $\gamma$ in Z[i]. A unit $u$ is an element which divides 1, and so
divides all elements. To find the units, suppose $u$ is a unit. Then $1 = uv$. Taking norms,
$1 = N(u)N(v)$, so $N(u) = 1$. If $u = a + bi$, this is $a^2 + b^2 = 1$. Thus $a = \pm 1$ and $b = 0$, or
$b = \pm 1$ and $a = 0$. Thus there are 4 units: $1$, $-1$, $i$, and $-i$. If $\alpha$ and $\beta$ divide each other,
that is, $\alpha \mid \beta$ and $\beta \mid \alpha$, then it is easy to show that $\beta = u\alpha$ where $u$ is a unit. In this case, we say that
$\alpha$ and $\beta$ are associates. A prime $\pi$ is a non-unit element whose only divisors are units and
associates of $\pi$. A unit $u$ is characterized by the condition $N(u) = 1$, since $N(u) = 1$ gives $u\bar{u} = 1$.
We can give a simple sufficient condition for an element of Z[i] to be a prime:
Theorem: If $N(\alpha)$ is a prime in Z, then $\alpha$ is a prime in Z[i].
Proof: Suppose $\alpha$ is not a prime. Then $\alpha = \beta\gamma$ where neither $\beta$ nor $\gamma$ is a unit, and $N(\alpha) = N(\beta)N(\gamma)$. But since
$N(\alpha)$ is a prime, $N(\beta) = 1$ or $N(\gamma) = 1$, so either $\beta$ or $\gamma$ is a unit, a contradiction. Therefore $\alpha$ is a prime.
For example, $1 + i$ and $4 - i$ are primes, because their norms are respectively 2 and 17.
Z[i] behaves like Z in a very important way. There is a division algorithm in Z[i] which
allows for a version of the Euclidean algorithm used to find the GCD of two numbers.
Theorem: (The Division Algorithm.) Let $\alpha$ and $\beta$ be elements of Z[i] with $\beta \neq 0$. Then
there exist $\gamma$ and $\rho$ in Z[i] such that $\alpha = \beta\gamma + \rho$, and $N(\rho) < N(\beta)$.
Proof: Let $\alpha/\beta = \delta$, where $\delta = c + di$, with $c$ and $d$ rational. Now find integers $m$ and $n$
closest to $c$ and $d$ respectively, so that $c = m + f_1$, $d = n + f_2$ with $|f_i| \le 1/2$ for $i = 1$ and
2. Then $N(f_1 + f_2 i) \le 1/4 + 1/4 = 1/2$. By definition we have
$$\alpha/\beta = (m + ni) + (f_1 + f_2 i)$$
so
$$\alpha = \beta\gamma + \rho$$
where $\gamma = m + ni$ and $\rho = \beta(f_1 + f_2 i)$. We have
$$N(\rho) = N(\beta)N(f_1 + f_2 i) \le N(\beta)/2 < N(\beta).$$
Finally $\rho$ is in Z[i] since it is $\alpha - \beta\gamma$.
The division algorithm allows us to use the Euclidean algorithm to find the GCD of any two
elements, and to express it as a linear combination (with coefficients in Z[i]) of those elements.
As in the theory in Z, this gives us unique factorization (up to order of primes and unit
factors).
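The division algorithm and the resulting Euclidean algorithm can be sketched in Python with a + bi stored as the integer pair (a, b). The function names are hypothetical, and the rounding rule (ties rounded up) is one possible choice of "closest integer"; the GCD is only determined up to a unit factor.

```python
def norm(z):
    return z[0] * z[0] + z[1] * z[1]


def gauss_divmod(alpha, beta):
    """Return (gamma, rho) with alpha = beta*gamma + rho and N(rho) <= N(beta)/2."""
    (a, b), (c, d) = alpha, beta
    n = c * c + d * d                      # N(beta), assumed nonzero
    # alpha/beta = alpha*conj(beta)/N(beta) = (re + im*i)/n
    re, im = a * c + b * d, b * c - a * d
    # nearest integers to re/n and im/n, in exact integer arithmetic
    m, k = (2 * re + n) // (2 * n), (2 * im + n) // (2 * n)
    # rho = alpha - beta*gamma, where beta*gamma = (c*m - d*k) + (c*k + d*m)i
    rho = (a - (c * m - d * k), b - (c * k + d * m))
    return (m, k), rho


def gauss_gcd(alpha, beta):
    """Euclidean algorithm in Z[i]; the result is a GCD up to a unit factor."""
    while beta != (0, 0):
        _, rho = gauss_divmod(alpha, beta)
        alpha, beta = beta, rho
    return alpha


print(gauss_divmod((7, 1), (5, 0)))  # ((1, 0), (2, 1))
print(gauss_gcd((7, 1), (5, 0)))     # (2, 1), i.e. 2 + i
```

The loop terminates because the norm of the remainder strictly decreases at each step.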
What are the primes in Z[i]? First suppose that an integer prime $p$ is the sum of two squares:
$p = a^2 + b^2$. Then setting $\pi = a + bi$, we have $N(\pi) = a^2 + b^2 = p$, so $\pi$ is a prime in Z[i]. So
is $\bar{\pi}$, and $p$ splits into two prime factors: $p = \pi\bar{\pi}$. Note the associates of $\pi$ are $\pi$, $i\pi$, $-\pi$, $-i\pi$,
and similarly for $\bar{\pi}$. The 8 variations here correspond to $p = (\pm a)^2 + (\pm b)^2$ and this sum in
reverse order. The only overlaps here occur for the prime 2 (see footnote 9). Here $\pi = 1 + i$,
and $\bar{\pi} = 1 - i = -i(1 + i) = -i\pi$. So the factorization of 2 is $2 = -i(1 + i)^2$.
The above analysis applies to the prime 2, and to any prime $p \equiv 1 \pmod 4$. If $p \equiv 3 \pmod 4$,
then $p$ is a prime in Z[i]. To see this, suppose $p \equiv 3 \pmod 4$, and $p$ is not a prime in Z[i].
Then $p = \alpha\beta$ with $N(\alpha) > 1$ and $N(\beta) > 1$. Taking norms, $p^2 = N(\alpha)N(\beta)$, and so $N(\alpha) = p$.
But this gives $p$ as a sum of two squares, which is not possible for $p \equiv 3 \pmod 4$, since every square is congruent to 0 or 1 mod 4. This is a contradiction.
Footnote 9: This can be seen geometrically.
Summarizing, the distinct primes of Z[i] are:
Type 1. The prime $1 + i$. Here $2 = -i(1 + i)^2$.
Type 2. Integer primes $p \equiv 3 \pmod 4$.
Type 3. Each prime $p \equiv 1 \pmod 4$ generates two distinct primes (non-associates) $\pi$ and $\bar{\pi}$
satisfying $p = \pi\bar{\pi}$.
Remark. To factor $\alpha$ in Z[i], take norms. For example, let's factor $7 + i$ into primes. We
have $N(7 + i) = 49 + 1 = 50 = 2 \cdot 5^2$. So $1 + i$ is a factor, and each of the others is either $2 + i$ or $2 - i$
up to a unit factor. We compute
$$\frac{7 + i}{1 + i} = \frac{(7 + i)(1 - i)}{2} = \frac{8 - 6i}{2} = 4 - 3i.$$
Now $(2 + i)^2 = 3 + 4i = i(4 - 3i)$, so $(4 - 3i)/(2 + i)^2 = 1/i = -i$. Putting this all together,
we get $7 + i = -i(1 + i)(2 + i)^2$.
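The "take norms" recipe can be sketched in Python as follows. This is a minimal illustration under stated assumptions: a + bi is stored as the pair (a, b), all function names are hypothetical, integers are factored by trial division, and the representation p = a² + b² is found by brute-force search, so the sketch is suitable for small inputs only.

```python
from math import isqrt


def prime_factors(n):
    """Prime factors of n >= 1 with multiplicity, by trial division."""
    out, p = [], 2
    while p * p <= n:
        while n % p == 0:
            out.append(p)
            n //= p
        p += 1
    if n > 1:
        out.append(n)
    return out


def exact_div(z, w):
    """Return z/w as a Gaussian integer pair, or None if w does not divide z."""
    n = w[0] * w[0] + w[1] * w[1]
    re = z[0] * w[0] + z[1] * w[1]       # real part of z*conj(w)
    im = z[1] * w[0] - z[0] * w[1]       # imaginary part of z*conj(w)
    if re % n or im % n:
        return None
    return (re // n, im // n)


def two_squares(p):
    """Return (a, b) with a >= b >= 1 and a*a + b*b == p (brute force)."""
    for b in range(1, isqrt(p) + 1):
        a = isqrt(p - b * b)
        if a >= b and a * a + b * b == p:
            return (a, b)
    raise ValueError("not a sum of two squares")


def gauss_factor(alpha):
    """Return (unit, primes) with alpha = unit * product(primes), alpha nonzero."""
    rest, primes = alpha, []
    for p in prime_factors(rest[0] * rest[0] + rest[1] * rest[1]):
        if p % 4 == 3:
            candidates = [(p, 0)]            # p stays prime in Z[i]
        else:
            a, b = two_squares(p)            # p = 2 or p = 1 mod 4
            candidates = [(a, b), (a, -b)]
        for pi in candidates:
            q = exact_div(rest, pi)
            if q is not None:
                rest = q
                primes.append(pi)
                break
    return rest, primes


print(gauss_factor((7, 1)))  # ((0, -1), [(1, 1), (2, 1), (2, 1)])
```

The output reads as 7 + i = −i · (1 + i) · (2 + i)², in agreement with the hand computation.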
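What is the factorization in Z[i] of an integer $n$? In Z, the prime factorization of $n$ is
$$n = 2^a p_1^{a_1} \cdots p_r^{a_r} q_1^{b_1} \cdots q_s^{b_s}$$
where $p_i \equiv 1 \pmod 4$ and $q_i \equiv 3 \pmod 4$. Each $p_i = \pi_i\bar{\pi}_i$, so the factorization of $n$ in Z[i] is:
$$n = 2^a \pi_1^{a_1} \bar{\pi}_1^{a_1} \cdots \pi_r^{a_r} \bar{\pi}_r^{a_r} q_1^{b_1} \cdots q_s^{b_s}$$
where $2 = -i(1 + i)^2$ as above.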
When is a positive integer $n$ a sum of two squares? This is true if $n = a^2 + b^2 = N(a + bi) = \alpha\bar{\alpha}$
where $\alpha = a + bi$. If we factor $\alpha$ into primes, we have (up to a unit factor) $\alpha = \pi_1 \cdots \pi_r q_1 \cdots q_s$, where each $q_i$ is a prime of Z
with $q_i \equiv 3 \pmod 4$, and $p_i = N(\pi_i)$ is 2, or a prime congruent to 1 mod 4. Therefore,
$$n = N(\alpha) = \alpha\bar{\alpha} = p_1 \cdots p_r q_1^2 \cdots q_s^2.$$
Thus,
every prime $q \equiv 3 \pmod 4$ appears in the factorization of $n$ to an even power. Conversely,
if every prime $q \equiv 3 \pmod 4$ appears in the factorization of $n$ to an even power, then the
factorization of $n$ in Z has the form $n = p_1 \cdots p_r \cdot q_1^2 \cdots q_s^2$. In Z[i], this can be factored
further into $n = \pi_1\bar{\pi}_1 \cdots \pi_r\bar{\pi}_r \cdot q_1^2 \cdots q_s^2$, where $p_i = \pi_i\bar{\pi}_i$. So, taking $\alpha = \pi_1 \cdots \pi_r \cdot q_1 \cdots q_s$,
we have $n = \alpha\bar{\alpha}$. Note that we may choose $\bar{\pi}_i$ instead of $\pi_i$ at any point in this factorization,
yielding a different $\alpha$, and so a different way of expressing $n$ as the sum of two squares.
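The criterion just derived (every prime congruent to 3 mod 4 occurs to an even power) can be compared with a direct search. The Python sketch below is an illustration only; both function names are hypothetical, and a square is allowed to be 0, as in 9 = 3² + 0².

```python
from math import isqrt


def is_sum_of_two_squares(n):
    """Criterion: every prime p = 3 mod 4 divides n (n >= 1) to an even power."""
    p = 2
    while p * p <= n:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        if p % 4 == 3 and e % 2 == 1:
            return False
        p += 1
    # what remains is 1 or a single prime occurring to the first power
    return n % 4 != 3


def brute_force(n):
    """Direct search for n = a^2 + b^2 with a, b >= 0."""
    return any(isqrt(n - a * a) ** 2 == n - a * a for a in range(isqrt(n) + 1))


print([n for n in range(1, 30) if is_sum_of_two_squares(n)])
```

On the range 1 to 500 the two functions agree, which is a sanity check on the statement rather than a proof.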
We can now answer the question: In how many distinct ways can a positive integer $n$ be
expressed as a sum of two squares? For simplicity, let us assume that $n$ is square-free. Then
$n$ cannot be expressed as a sum of two squares if $p \equiv 3 \pmod 4$ and $p \mid n$. Suppose $n = p_1 p_2$
where $p_i \equiv 1 \pmod 4$. Write $p_1 = \alpha\bar{\alpha}$ and $p_2 = \beta\bar{\beta}$. Then $n = \alpha\bar{\alpha}\beta\bar{\beta}$. So we can write
$n = N(\gamma) = \gamma\bar{\gamma}$ by choosing $\gamma = \alpha\beta$ or $\gamma = \alpha\bar{\beta}$ (see footnote 10). For example, take $n = 85 = 5 \cdot 17$. Take
$\alpha = 2 + i$ and $\beta = 4 + i$. Then
$\gamma = \alpha\beta = (2 + i)(4 + i) = 7 + 6i$. Note that $49 + 36 = 85$.
$\gamma = \alpha\bar{\beta} = (2 + i)(4 - i) = 9 + 2i$. Note that $81 + 4 = 85$.
Footnote 10: Choosing $\gamma = \bar{\alpha}\beta$ gives nothing new, as this is the conjugate of $\alpha\bar{\beta}$. It gives a different answer, but only
a variant. For example, $5 = 1^2 + (-2)^2$ is a variant of $5 = 1^2 + 2^2$. This is also the reason we omit units
in our analysis.
Similarly, if $n$ is the product of $k$ distinct primes congruent to 1 mod 4, then $n$ can be
written as the sum of two squares in $2^{k-1}$ different ways. We illustrate for $k = 3$. Take
$n = 1105 = 5 \cdot 13 \cdot 17$. Let $\alpha = 2 + i$, so $N(\alpha) = 5$. Let $\beta = 3 + 2i$, with $N(\beta) = 13$, and
$\gamma = 4 + i$ with $N(\gamma) = 17$. Computing,
$\alpha\beta\gamma = (2 + i)(3 + 2i)(4 + i) = 9 + 32i$.
$\alpha\bar{\beta}\gamma = (2 + i)(3 - 2i)(4 + i) = 33 + 4i$.
$\alpha\beta\bar{\gamma} = (2 + i)(3 + 2i)(4 - i) = 23 + 24i$.
$\alpha\bar{\beta}\bar{\gamma} = (2 + i)(3 - 2i)(4 - i) = 31 - 12i$.
Note that $1105 = 9^2 + 32^2 = 33^2 + 4^2 = 23^2 + 24^2 = 31^2 + 12^2$.
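The count $2^{k-1}$ can be checked by listing the representations directly. In the Python sketch below the function name `representations` is hypothetical, and "distinct ways" is taken to mean pairs (a, b) with 1 ≤ a ≤ b, so that sign changes and the order of the two squares are ignored. The value 32045 = 5 · 13 · 17 · 29 is an extra test case chosen for this example.

```python
from math import isqrt


def representations(n):
    """All pairs (a, b) with 1 <= a <= b and a*a + b*b == n."""
    out = []
    for a in range(1, isqrt(n // 2) + 1):
        b = isqrt(n - a * a)
        if b * b == n - a * a:
            out.append((a, b))
    return out


print(representations(85))    # [(2, 9), (6, 7)]
print(representations(1105))  # [(4, 33), (9, 32), (12, 31), (23, 24)]
```

For 85 (k = 2) there are 2 representations and for 1105 (k = 3) there are 4, as predicted.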
Thur, 6/14
We shall try to analyze the equation $z = x^2 + 2y^2$ in a way analogous to our consideration
of the equation $z = x^2 + y^2$.
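For what values of $p$ is $\left(\frac{-2}{p}\right) = 1$? We compute
$$\left(\frac{-2}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{2}{p}\right) = (-1)^{(p-1)/2}(-1)^{(p^2-1)/8}.$$
By considering the cases $p \equiv 1, 3, 5, 7 \pmod 8$, we find that $\left(\frac{-2}{p}\right) = 1$ if and only if
$p \equiv 1 \pmod 8$ or $p \equiv 3 \pmod 8$. Therefore, we can show: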
Theorem: If $p$ is an odd prime and $p = x^2 + 2y^2$, then $p \equiv 1$ or $3 \pmod 8$.
Proof: If $p = x^2 + 2y^2$, then $x^2 + 2y^2 \equiv 0 \pmod p$, or $x^2 \equiv -2y^2 \pmod p$. Since $0 < x, y < p$,
$y$ has an inverse mod $p$, and multiplying by $(y^{-1})^2$, we get $(xy^{-1})^2 \equiv -2 \pmod p$. Therefore
$\left(\frac{-2}{p}\right) = 1$ and so $p \equiv 1$ or $3 \pmod 8$. The following table, computed in class, shows the first
4 primes in each category, together with $x$, $y$ satisfying $x^2 + 2y^2 = p$.
p ≡ 1 mod 8    x    y
17             3    2
41             3    4
73             1    6
89             9    2

p ≡ 3 mod 8    x    y
3              1    1
11             3    1
19             1    3
43             5    3
Here $p = x^2 + 2y^2$, and the computation suggests that every prime congruent to 1 or 3 mod 8
can be written uniquely as a sum $x^2 + 2y^2$. This is true, but the proof is deferred. We take
it as true in the discussion below.
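Both the residue-class criterion and the table can be reproduced by a short computation. The Python sketch below is an illustration only: the function names are hypothetical, the Legendre symbol is evaluated with Euler's criterion (a swapped-in technique, not the product formula used above), and the search for x and y is brute force.

```python
from math import isqrt


def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))


def legendre_minus_two(p):
    """Legendre symbol (-2/p) for an odd prime p, via Euler's criterion."""
    return 1 if pow(-2 % p, (p - 1) // 2, p) == 1 else -1


def represent(p):
    """Return (x, y) with x*x + 2*y*y == p and y >= 1, or None if there is none."""
    for y in range(1, isqrt(p // 2) + 1):
        x = isqrt(p - 2 * y * y)
        if x * x + 2 * y * y == p:
            return (x, y)
    return None


for p in range(3, 100):
    if is_prime(p) and legendre_minus_two(p) == 1:
        print(p, p % 8, represent(p))
```

Each printed prime is congruent to 1 or 3 mod 8 and comes with a pair (x, y); for example 73 gives (1, 6) and 43 gives (5, 3).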
We now copy the results in Z[i] by constructing $\mathbf{Z}[i\sqrt{2}]$. This is motivated by the algebraic
identity $x^2 + 2y^2 = (x - i\sqrt{2}\,y)(x + i\sqrt{2}\,y)$. Set $\theta = i\sqrt{2}$, so $\theta^2 = -2$. We let Z[θ] be the
set of all numbers of the form $a + b\theta$, where $a$ and $b$ are integers. This set is closed under addition, subtraction, and
multiplication. If we allow $a$ and $b$ to be rational, the resulting set is called Q[θ]. If $z = a + b\theta$
we set $\bar{z} = a - b\theta$. Then $z\bar{z} = a^2 + 2b^2$, and we write $N(z) = z\bar{z}$. We can easily prove that
$\overline{zw} = \bar{z}\,\bar{w}$. Hence $N(zw) = zw\,\overline{zw} = zw\,\bar{z}\,\bar{w} = z\bar{z}\,w\bar{w} = N(z)N(w)$. Thus, the product of two
numbers of the form $a^2 + 2b^2$ is also of this form.
The Euclidean Algorithm in Z[θ]. Following the proof in Z[i], we have the following
division algorithm:
If $\alpha$ and $\beta$ are in Z[θ] with $\beta \neq 0$, then there exist $\gamma$ and $\rho$ in Z[θ] such that $\alpha = \beta\gamma + \rho$ with $N(\rho) <
N(\beta)$. The proof is the same as in Z[i], except here we find that $N(\rho) \le (3/4)N(\beta)$, because $N(f_1 + f_2\theta) = f_1^2 + 2f_2^2 \le 1/4 + 2/4$.
Once we have the division algorithm, we can use the Euclidean algorithm to find the GCD of
2 elements, express it as a linear combination of the elements, and prove unique factorization
into primes.
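The division step in Z[θ] can be sketched in Python in the same way as for Z[i], now storing a + bθ as the pair (a, b) and using θ² = −2. The function names are hypothetical and the rounding rule (ties rounded up) is an assumption of this sketch.

```python
def theta_norm(z):
    """N(a + b*theta) = a^2 + 2*b^2."""
    return z[0] * z[0] + 2 * z[1] * z[1]


def theta_mul(z, w):
    """(a + b*theta)(c + d*theta) = (ac - 2bd) + (ad + bc)*theta."""
    (a, b), (c, d) = z, w
    return (a * c - 2 * b * d, a * d + b * c)


def theta_divmod(alpha, beta):
    """Return (gamma, rho) with alpha = beta*gamma + rho and N(rho) <= (3/4)N(beta)."""
    (a, b), (c, d) = alpha, beta
    n = theta_norm(beta)                   # assumed nonzero
    # alpha/beta = alpha*conj(beta)/N(beta) = (re + im*theta)/n
    re, im = a * c + 2 * b * d, b * c - a * d
    # nearest integers to re/n and im/n, in exact integer arithmetic
    m, k = (2 * re + n) // (2 * n), (2 * im + n) // (2 * n)
    bg = theta_mul(beta, (m, k))
    return (m, k), (a - bg[0], b - bg[1])


print(theta_divmod((7, 3), (1, 1)))  # ((4, -1), (1, 0))
```

In the printed example, 7 + 3θ = (1 + θ)(4 − θ) + 1, and the remainder has norm 1, which is below (3/4) · 3.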
Units, Associates, and Primes in Z[θ].
As in Z[i], units are elements that divide 1, hence everything. If $u$ is a unit in Z[θ], then
$1 = uv$, and taking norms, we have $1 = N(1) = N(u)N(v)$. So $N(u) = 1$. If $u = a + b\theta$, this
gives $1 = a^2 + 2b^2$, so $a = \pm 1$ and $b = 0$. So the only units are 1 and −1, as in Z, and $u$ is
a unit if and only if $N(u) = 1$. The associates of a prime $\pi$ are $\pm\pi$. An element $\pi$ is prime
if and only if it is not a unit and its only divisors are units and associates. As in the theory of Z[i], with the
same proof, we have the sufficient condition: If $N(z)$ is a prime in Z, then $z$ is a prime in
Z[θ]. For example $3 + 4\theta$ is a prime, since $N(3 + 4\theta) = 9 + 2 \cdot 16 = 41$. Thus, primes $p$ of Z
of the form $p = a^2 + 2b^2$ split into two primes in Z[θ]: $p = (a + b\theta)(a - b\theta) = \pi\bar{\pi}$. As in the
discussion of Z[i], primes not of this form stay primes in Z[θ]. Note that 2 splits: $2 = -\theta^2$.
It is a square, up to a unit factor. Thus any prime congruent to 1 or 3 mod 8 splits into two
distinct primes $\pi$ and $\bar{\pi}$. The prime 2 splits into $-\theta \cdot \theta$. A prime congruent to 5 or 7 mod 8
remains a prime in Z[θ].
Using unique factorization in Z[θ], we can now show:
A positive integer $n$ can be written in the form $a^2 + 2b^2$ if and only if $p^e \| n$ with $e$ odd
implies $p = 2$ or $p \equiv 1$ or $3 \pmod 8$.
The proof follows the reasoning of the similar result for Z[i]. Assume $n = a^2 + 2b^2$. Then
$n = N(a + b\theta) = \alpha\bar{\alpha}$ where $\alpha = a + b\theta$. Writing $\alpha$ as a product of primes in Z[θ], the
factorization will contain primes $\pi$ such that $N(\pi)$ is 2 or a prime in Z congruent to 1 or 3
mod 8. The other primes are primes in Z congruent to 5 or 7 mod 8. This shows that in the
factorization of $n = \alpha\bar{\alpha}$, primes $q$ congruent to 5 or 7 mod 8 appear an even number of times.
The converse is also true, and the proof follows the lines of the corresponding result in Z[i].
If $n$ is square-free and is not divisible by any prime congruent to 5 or 7 mod 8, then the number of
ways $n$ can be written in the form $a^2 + 2b^2$ is, as in the result for Z[i], $2^{k-1}$, where $k \ge 1$ is the
number of odd primes in the factorization of $n$.
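The count of representations in the form a² + 2b² can be checked by direct search for small square-free n. In this Python sketch the function name `representations` and the sample values 33 = 3 · 11 and 561 = 3 · 11 · 17 are assumptions chosen for illustration; only pairs with a, b ≥ 0 are listed, so sign changes are ignored.

```python
from math import isqrt


def representations(n):
    """All pairs (a, b) with a, b >= 0 and a*a + 2*b*b == n."""
    out = []
    for b in range(0, isqrt(n // 2) + 1):
        a = isqrt(n - 2 * b * b)
        if a * a + 2 * b * b == n:
            out.append((a, b))
    return out


print(representations(33))   # [(5, 2), (1, 4)]
print(representations(561))  # [(23, 4), (19, 10), (13, 14), (7, 16)]
```

With two odd prime factors there are 2 representations, and with three there are 4.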
There remains the result: If $p \equiv 1$ or $3 \pmod 8$, then $p$ can be written as $a^2 + 2b^2$. The proof
follows the proof on page 11 on primes that are the sum of squares. However, what we end
up with are numbers $x$, $y$ such that $x^2 + 2y^2 \equiv 0 \pmod p$, with $0 < x^2 + 2y^2 < 3p$. Thus we
can say $x^2 + 2y^2 = p$ or $x^2 + 2y^2 = 2p$. In the latter case, working mod 2, it follows that $x^2$,
and hence $x$, is even. So we have $x = 2u$, and $4u^2 + 2y^2 = 2p$. Dividing by 2, we get $p = y^2 + 2u^2$.
This is the result.