CS 413, Computer and Data Security
Math for Encryption Overheads
I. Thinking Concretely about Division and Remainders. The Euclidean
Algorithm for Finding the Greatest Common Divisor.
Some Definitions:
The material in this section has some independent utility, but its main
purpose in the presentation right now is to give some concrete practice in
thinking about integer division and remainders, which will be useful
preparation for understanding sections II and III.
Definition: Prime number: Any integer greater than 1 that has only 1 and
itself as factors is prime. Historically, the number 1 has occasionally been
treated as a prime number. It is certainly true that it only has 1 and itself as
factors. It is generally not included in the definition of primes, not because
it fails in some way, but because it has so many other unique characteristics
that classifying it simply as prime does not do it justice.
Definition: Composite number: An integer greater than 1 which is not prime is
composite. In other words, an integer greater than 1 which has factors other
than 1 and itself is composite.
Definition: Greatest common divisor: This is the largest integer which is a
factor of both of two other integers. The notation is usually given as follows:
Given a and b, positive integers, gcd(a, b) = x is the largest integer that is a
factor of both a and b. Note that x <= a and x <= b.
Definition: Relatively prime: Given 2 positive integers, a and b, if the
gcd(a, b) = 1, then a and b are relatively prime. Both a and b may be
composite. If you did a prime factorization of a and b you would find that
they have no prime factors in common.
Finding the gcd:
One approach to finding the greatest common divisor of 2 positive integers:
Find the prime factorization of each. The product of the prime factors they
have in common, each taken as many times as it appears in both
factorizations, forms the greatest common factor or greatest common
divisor.
For example:
72 = 2 * 2 * 2 * 3 * 3
30 = 2 * 3 * 5
The common prime factors of the two numbers are 2 and 3, each shared once.
2 * 3 = 6 is their greatest common divisor.
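The factorization approach can be sketched in Python as follows. This is only an illustration, not part of the original overheads: the function names prime_factors and gcd_by_factoring are made up for this sketch, and trial division is assumed as the factoring method, which is only practical for small numbers.

```python
from collections import Counter


def prime_factors(n):
    """Return a Counter mapping each prime factor of n to how often it occurs."""
    factors = Counter()
    d = 2
    while d * d <= n:
        while n % d == 0:      # divide out every copy of d
            factors[d] += 1
            n //= d
        d += 1
    if n > 1:                  # whatever is left over is itself prime
        factors[n] += 1
    return factors


def gcd_by_factoring(a, b):
    """Multiply together the prime factors that a and b have in common."""
    common = prime_factors(a) & prime_factors(b)   # keeps the smaller count of each prime
    result = 1
    for prime, count in common.items():
        result *= prime ** count
    return result


print(prime_factors(72))          # 2 appears three times, 3 appears twice
print(gcd_by_factoring(72, 30))   # 6
```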
The Euclidean algorithm: This is an iterative algorithm, suitable for
implementation in a computer program, that will find the greatest common
divisor of two integers without having to find their prime factorization. Here
is an illustration of the algorithm using the values 72 and 30, followed by a
brief description of how it works:
72 = 2 * 30 + 12
a = 72, m0 = 2, b = 30, r0 = 12
30 = 2 * 12 + 6
b = 30, m1 = 2, r0 = 12, r1 = 6
12 = 2 * 6 + 0
r0 = 12, m2 = 2, r1 = 6, r2 = 0
The last non-zero remainder, r1 = 6, is the gcd(72, 30).
Description: Given 2 numbers, find the remainder when you divide the
larger by the smaller. The claim is that the gcd of the smaller and the
resulting remainder is the same as the gcd of the original pair. You repeat
the process of dividing and finding the remainder until the remainder is 0.
The last remainder before that is the gcd for the whole sequence of pairs of
numbers.
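The description above translates almost directly into a short loop. The following is a minimal sketch, assuming non-negative integer inputs; the function name gcd is chosen here for illustration.

```python
def gcd(a, b):
    """Euclidean algorithm: repeat divide-and-take-remainder until the remainder is 0."""
    while b != 0:
        # Replace the pair (a, b) by (b, remainder of a divided by b).
        a, b = b, a % b
    # The last non-zero remainder (or a itself, if b started at 0) is the gcd.
    return a


print(gcd(72, 30))   # 6
```

Note that the sketch does not need the larger value first: if a < b, the first pass through the loop simply swaps the two.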
Demonstrating that Euclid’s Algorithm Works
Givens:
Let integers a and b be given.
Let x = gcd(a, b)
Without loss of generality, assume that a > b. Then it is possible to write the
following, where m is the integer quotient and r is the remainder, with
0 <= r < b:
a = mb + r
To show:
Show that x = gcd(a, b) = gcd(b, r)
Step 1:
Since x = gcd(a, b), there must be some integers a1 and b1 such that:
a = a1x and b = b1x
Now substitute these expressions for a and b in the expression relating a and
b:
a1x = mb1x + r
Now solve the expression for r:
r = a1x – mb1x
r = x(a1 – mb1)
Conclusion: x, the gcd(a, b), is also a factor of r.
Step 2:
Given that x = gcd(a, b), x is a factor of b. From step 1, x is also a factor of
r. You want to show that gcd(b, r) = gcd(a, b) = x. You can do this by
ruling out two possible cases:
Case 1: gcd(b, r) = y < x = gcd(a, b)
Case 2: gcd(b, r) = y > x = gcd(a, b)
Case 1:
This is the simpler case.
If x = gcd(a, b), then x is a factor of b.
As shown in step 1, x is a factor of r.
If x is a factor of both b and r, the gcd(b, r) can be no less than x.
Case 2:
This case is slightly more complex.
Suppose that gcd(b, r) = y.
Express a in this way: a = mb + r
If y = gcd(b, r), by definition, y is a factor of b.
If y = gcd(b, r), then there also exist values b2 and r2 such that:
b = b2y and r = r2y
Now substitute these expressions for b and r in the expression relating a, b
and r:
a = mb2y + r2y
Factoring a = mb2y + r2y gives this expression:
a = y(mb2 + r2)
This shows that y is a factor of a.
Conclusion of Case 2:
y is a factor of both b and a.
x = gcd(a, b), so y <= x.
y = gcd(b, r), so gcd(b, r) <= x.
Overall Conclusion:
y = gcd(b, r) is not less than x = gcd(a, b).
y = gcd(b, r) is not greater than x = gcd(a, b).
Therefore, y = gcd(b, r) = x = gcd(a, b).
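A proof is not replaced by testing, but the result gcd(a, b) = gcd(b, r) is easy to spot-check numerically. The sketch below uses Python's standard math.gcd as an independent reference; the function name check_gcd_step and the choice of range are assumptions made for this illustration.

```python
import math


def check_gcd_step(limit):
    """Check gcd(a, b) == gcd(b, r) for every pair with 1 <= b < a < limit."""
    for a in range(1, limit):
        for b in range(1, a):
            m, r = divmod(a, b)                 # a = m * b + r with 0 <= r < b
            assert a == m * b + r and 0 <= r < b
            if math.gcd(a, b) != math.gcd(b, r):
                return False
    return True


print(check_gcd_step(60))
```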
General Statement of Algorithm:
The result says:
Given a, b, x = gcd(a, b), and a = mb + r, then gcd(a, b) = gcd(b, r).
a and b are the knowns. x is the unknown. Given a and b, using integer
division and modulus, it is easy to find m and r. Then the same result can be
applied to b and r. The steps would go as follows:
a = m0b + r0
gcd(a, b) = gcd(b, r0)
b = m1r0 + r1
gcd(b, r0) = gcd(r0, r1)
r0 = m2r1 + r2
gcd(r0, r1) = gcd(r1, r2)
…
The algorithm terminates when a remainder of 0 is reached.
Consider the following points:
1. All ri are integers.
2. All ri >= 0.
3. At every step, r(i+1) < ri, because a remainder is always smaller than the
divisor that produced it.
The conclusion you can reach from this is that in a finite number of steps the
algorithm will converge to the point where r equals 0. In other words, after
n + 1 steps you reach the following result:
gcd(a, b) = … = gcd(rn, 0)
And:
gcd(rn, 0) = rn
This is true because 0 is evenly divisible by anything. Anything will go into
0 zero times, with remainder 0. In other words, for any positive a, gcd(a, 0) = a.
Thus, you know that the algorithm has converged when the remainder is 0,
and the greatest common divisor was rn, the non-zero remainder that
preceded the remainder of 0.
The example using the values 72 and 30 is repeated here in order to
summarize:
72 = 2 * 30 + 12
a = 72, m0 = 2, b = 30, r0 = 12
30 = 2 * 12 + 6
b = 30, m1 = 2, r0 = 12, r1 = 6
12 = 2 * 6 + 0
r0 = 12, m2 = 2, r1 = 6, r2 = 0
The final remainder is 0 and the remainder before that was 6. According to
the algorithm, gcd(72, 30) = gcd(30, 12) = gcd(12, 6) = gcd(6, 0) = 6, which
agrees with the result obtained using prime factorizations at the beginning of
this section.
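To see the chain of divisions for other inputs, the algorithm can be written so that it records each step. This is a hedged sketch; euclid_steps is a hypothetical helper name, and each recorded tuple holds (dividend, quotient, divisor, remainder).

```python
def euclid_steps(a, b):
    """Run the Euclidean algorithm and record every division performed."""
    steps = []
    while b != 0:
        m, r = divmod(a, b)            # a = m * b + r
        steps.append((a, m, b, r))
        a, b = b, r                    # continue with the pair (b, r)
    return a, steps


g, steps = euclid_steps(72, 30)
for dividend, m, divisor, r in steps:
    print(f"{dividend} = {m} * {divisor} + {r}")
print("gcd =", g)
```

For 72 and 30 this prints the same three lines as the worked example, followed by the gcd.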
II. Algebraic Background
Algebras in general are defined in terms of one or more operators and a set
of values which the operators can be applied to. For the purposes of the
exposition below, let a single operator be represented by • and the set of
interest be S. Within an algebraic system, certain properties can be defined.
Here are the definitions of some of those properties:
Closure: Given a, b ∈ S, a • b ∈ S.
Identity: Given some arbitrary a ∈ S, there is an i ∈ S such that a • i = i • a =
a.
For the familiar operations + and * in the reals, the identities are 0 and 1,
respectively.
Inverse: For some a ∈ S, its inverse is a^-1 ∈ S such that a • a^-1 = i.
For the familiar operations + and *, inverses are readily apparent. The
additive inverse of 1 is -1, for example, and the multiplicative inverse of 7 is
1/7. Note that, depending on the set of values and the operation in question,
some values may not have inverses. 0 doesn’t have a multiplicative inverse
in the set of real numbers. If you restrict yourself to the set of integers, no
values except 1 and -1 have multiplicative inverses. If you restrict yourself
to the set of positive integers, there are no additive inverses.
The Associative Property: For a, b, c ∈ S, (a • b) • c = a • (b • c).
The Commutative Property: For a, b ∈ S, a • b = b • a.
The Distributive Property: Given two operations on the set, + and *, *
distributes over + if the following holds: For a, b, c ∈ S, a * (b + c) = (a * b)
+ (a * c).
An Algebraic Group:
This is a set S with one operation, say •, and the following 4 properties:
1. Closure under •.
2. Identity under •.
3. An inverse for all elements of the set under •.
4. Associativity under •.
Notice that commutativity is NOT one of the properties of a group. There
are groups which are commutative and it is generally easier to think of an
example of a commutative group than a non-commutative group. For
instance, consider the integers (positive, negative, and zero) under addition.
This satisfies all four of the requirements for a group and in addition the
commutative property holds. In honor of the great Norwegian
mathematician Niels Henrik Abel, who proved the general insolubility of the
quintic equation, commutative groups are usually called Abelian groups.
Algebraic Structures Lacking Some Properties:
Some of the algebraic definitions may seem somewhat odd on the surface.
Using addition and multiplication in the reals as a reference point, it may not be
clear how an element of an algebraic structure may not have an inverse, or
how an operation may not be commutative. This section looks at some
questions about inverses and commutativity. You have encountered
mathematical constructs where not all of the familiar rules of algebra in the
integers or reals apply. This section starts with some verbal discussion, and
then follows with some concrete examples.
Non-Commutativity:
Let A be an m x n matrix. Let B be an n x p matrix. Let the • represent
standard matrix multiplication. Then A • B is a well defined operation
because A has the same number of columns as B has rows. On the other
hand, if p ≠ m, B • A is not even a valid product, so A • B ≠ B • A. Even
when A and B are square matrices of the same size, so that both products
are defined, A • B and B • A are generally different. Thus, in general, matrix
multiplication is NOT commutative.
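A concrete pair of matrices makes the point about order of multiplication visible. The sketch below uses plain nested lists; the helper matmul and the two 2 x 2 matrices are illustrative choices, not taken from the overheads.

```python
def matmul(A, B):
    """Standard matrix product of A (m x n) and B (n x p), given as nested lists."""
    inner = len(B)
    assert len(A[0]) == inner, "columns of A must match rows of B"
    return [[sum(A[i][k] * B[k][j] for k in range(inner))
             for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]

AB = matmul(A, B)   # multiplying by B on the right swaps the columns of A
BA = matmul(B, A)   # multiplying by B on the left swaps the rows of A
print(AB)
print(BA)
print(AB == BA)
```

Both products are defined here because the matrices are square and the same size, yet the two results differ.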
Inequality of Left and Right Inverses:
Notice that this observation about commutativity also affects the nature of
inverses. We are accustomed to the inverses in the real numbers where this
holds: a • a^-1 = a^-1 • a = i. In a system where the commutative property
doesn’t hold, left inverses (b • a = i) and right inverses (a • b = i) have to be
considered separately, and an element might have an inverse on one side
but not the other.
Non-Existence of Inverses:
The zero matrix of any size has no multiplicative inverse. The number 0 in
the reals also has no multiplicative inverse. However, a non-zero matrix
might also have no inverse. A given non-zero matrix might also have an
inverse on one side but not the other.
Examples:
First of all, consider any zero matrix such as the following:
O = [ 0 0 ]
    [ 0 0 ]
There is no 2 x 2 matrix that it can be multiplied by to arrive at the identity:
I = [ 1 0 ]
    [ 0 1 ]
Now consider a matrix where the rows are multiples of each other, as are
the columns:
A = [ 1 2 ]
    [ 2 4 ]
An inverse would be of this form:
A^-1 = [ a b ]
       [ c d ]
For A^-1 to work as a right inverse, these equations, among others, would
have to be satisfied:
a + 2c = 1
2a + 4c = 0
This system is inconsistent (doubling the first equation gives 2a + 4c = 2),
so there is no right inverse of this matrix. It can be shown by similar means
that no left inverse exists either.
Keep in mind that a right hand inverse matrix of an m x n matrix A, say
Ar^-1, has to be n x m, and the product A • Ar^-1 would give Im, the square
identity matrix with dimensions m x m. A left hand inverse matrix of A,
Al^-1, has to be n x m, and the product Al^-1 • A would give In, the square
identity matrix with dimensions n x n.
Now consider this matrix:
A = [ 1 1 ]
    [ 1 2 ]
    [ 2 1 ]
You can verify that it has this left hand inverse (left inverses of a
non-square matrix are not unique, so this is one of many):
Al^-1 = [ 0  -1/3   2/3 ]
        [ 0   2/3  -1/3 ]
You can also verify that this is not the right hand inverse, and in fact show
that the equations for a right hand inverse are inconsistent, meaning that one
doesn’t exist.
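The one-sided behaviour can be checked with exact fractions. The sketch below is an illustration under stated assumptions: it takes A to be the 3 x 2 matrix with rows (1, 1), (1, 2), (2, 1), and L to be one candidate 2 x 3 left inverse with rows (0, -1/3, 2/3) and (0, 2/3, -1/3). The names matmul, A and L are chosen for this sketch.

```python
from fractions import Fraction as F


def matmul(X, Y):
    """Standard matrix product of nested-list matrices."""
    inner = len(Y)
    assert len(X[0]) == inner, "columns of X must match rows of Y"
    return [[sum(X[i][k] * Y[k][j] for k in range(inner))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


# Assumed example values (see the lead-in above).
A = [[1, 1],
     [1, 2],
     [2, 1]]
L = [[F(0), F(-1, 3), F(2, 3)],
     [F(0), F(2, 3), F(-1, 3)]]

LA = matmul(L, A)   # 2 x 2 product: L applied on the left of A
AL = matmul(A, L)   # 3 x 3 product: L applied on the right of A

identity2 = [[1, 0], [0, 1]]
identity3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(LA == identity2)   # L works as a left inverse
print(AL == identity3)   # but not as a right inverse
```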
These examples were presented to remind you that you have encountered
algebraic systems where the behavior is not the same as with standard
arithmetic in the real numbers. In particular, commutativity and inverses
may be an issue.
An Algebraic Ring:
This is a set S with two operations, addition (+) and multiplication (*) and
the following properties:
1. Under addition, S is an Abelian (commutative) group.
2. S is closed under multiplication.
3. Multiplication is associative.
4. Multiplication distributes over addition.
For our purposes the ring is an intermediate structure. We are more
interested in the structure that follows, which has a definition which relies on
knowing both what a group and a ring are.
An Algebraic Field:
This is a set S with two operations, addition (+) and multiplication (*) and
the following properties:
1. S is a ring.
2. With the exception of the 0 element (the additive identity), which
does not have a multiplicative inverse, S satisfies the requirements for
an Abelian group under multiplication.
You may have heard the expression “the field of real numbers”. This means
that the real numbers form an algebraic field. The properties given above
correspond to the everyday characteristics of arithmetic with real numbers
that we are used to.
The concept of an algebraic field is important because a lot of advanced
cryptography is based on it. In the next section, modular arithmetic will be
discussed, and the claim will be made that modular arithmetic leads to an
algebraic field. RSA encryption relies on finding
multiplicative inverses, and algebra such as this is needed in order to know
whether inverses exist and what the algorithms might be for finding them.
III. Modular Arithmetic and Modular Fields
Definition of Modulus:
Modular arithmetic means finding the remainder when one integer is divided
by another. The following statements are equivalent in describing the
operation of modulus for integers a, b, c, and n, with n > 0 and 0 <= b < n:
a mod n = b
a % n = b
a = c * n + b
Simple Examples that Work:
You have already encountered modular arithmetic applied to cryptography
in a very simple way. If the letters from a to z are represented by the
integers from 0 to 25, Caesar’s cipher can be expressed using modular
arithmetic:
c = (p + 3) mod 26
It is also possible to devise ciphers of this form:
c = (p * 3) mod 26
It is possible to show that this cipher works. This can be done by running
through all of the values for p from 0 to 25 and noting that each maps to a
different value of c.
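Both ciphers, and the run through all 26 values of p, fit in a few lines. This is a minimal sketch, assuming lowercase letters a to z only; the names caesar, times3, encrypt and is_one_to_one are made up for the illustration.

```python
import string

ALPHABET = string.ascii_lowercase   # 'a' is 0, 'b' is 1, ..., 'z' is 25


def caesar(p):
    return (p + 3) % 26


def times3(p):
    return (p * 3) % 26


def encrypt(text, cipher):
    """Apply a letter-number mapping to every letter of a lowercase string."""
    return "".join(ALPHABET[cipher(ALPHABET.index(ch))] for ch in text)


def is_one_to_one(cipher):
    """True if the 26 plaintext values map to 26 different ciphertext values."""
    return len({cipher(p) for p in range(26)}) == 26


print(encrypt("xyz", caesar))    # wraps around to "abc"
print(is_one_to_one(times3))
```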
A Simple Example that Doesn’t Work:
Consider this cipher:
c = (p * 2) mod 26
Observe:
The numeric representation for the plaintext letter a is 0.
The numeric representation for the plaintext letter n is 13.
The ciphertext of a = (0 * 2) mod 26 = 0.
The ciphertext of n = (13 * 2) mod 26 = 0.
This is a collision.
This pattern repeats for every pair of letters p and p + 13, one from each half
of the alphabet. The example provides an entry point for asking algebraic
questions about modular encryption schemes. Observe the following facts
about the choice of the factors in the earlier multiplicative scheme that
worked and this one that doesn’t:
26 itself is not prime.
3 and 26 are relatively prime.
2 is a factor of 26. 2 and 26 are not relatively prime.
No further explanation will be given here. However, the property of relative
primeness turns out to be the basis for the difference in the schemes.
Relative primeness also turns out to be a significant property in other more
advanced encryption schemes.
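The connection with relative primeness can be observed by brute force, without proving anything. The sketch below tries every multiplier k from 1 to 25 and compares the multipliers that give a collision-free cipher with those that are relatively prime to 26. The function name is_one_to_one is hypothetical, and math.gcd is Python's standard library gcd.

```python
import math


def is_one_to_one(k, n=26):
    """True if c = (p * k) mod n sends the n plaintext values to n different values."""
    return len({(p * k) % n for p in range(n)}) == n


working = [k for k in range(1, 26) if is_one_to_one(k)]
coprime = [k for k in range(1, 26) if math.gcd(k, 26) == 1]

print(working)
print(working == coprime)
```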
Equivalence Classes:
Modulus divides the integers into subsets, or classes. In order to begin the
discussion, here is the expression for modulus again:
a mod n = b
Given some finite n, b can take on the values 0 through n – 1.
If a can be taken from among the integers without restriction, then for each b
there is an infinite set of values a for which the modulus n is b. Each of
these sets of values of a which map to the same b is an equivalence class.
The n equivalence classes, Ci, indexed from 0 through n – 1, can be
defined in this way:
Ci = {a | a mod n = i}
Any two numbers that have the same remainder upon division by n are in
the same equivalence class. Their relationship can be shown in this way:
(x mod n) = (y mod n)
If these conditions are met, the two numbers, x and y, are simply said to be
equivalent mod n, and the following notation is used to express this:
x ≡n y
It can be stated informally that successive elements of an equivalence class
are separated by n units. More generally, the statement can be made that
any two elements are separated by a multiple of n units. This can be
expressed as follows:
x ≡n y ↔
(x – y) = kn for some integer k
That is, the difference between any two elements of a modular equivalence
class is an integer multiple of n.
In case this isn’t clear, consider the following. If x and y are in the same
equivalence class with remainder b:
x = cn + b for some c
y = dn + b for some d
Then subtracting the second equation from the first gives:
x – y = (c – d)n
x – y = kn
In other words, k = c – d, and the difference between x and y is indeed a
multiple of n.
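The two descriptions of equivalence mod n (same remainder, or difference a multiple of n) can be compared directly. This is an illustrative sketch with made-up function names. It relies on the fact that Python's % operator returns a result from 0 through n – 1 for positive n, even when the left operand is negative.

```python
def same_remainder(x, y, n):
    """x and y leave the same remainder on division by n."""
    return x % n == y % n


def differ_by_multiple(x, y, n):
    """x - y is an integer multiple of n."""
    return (x - y) % n == 0


def equivalence_class(i, n, low, high):
    """Members of Ci = {a | a mod n = i} with low <= a < high."""
    return [a for a in range(low, high) if a % n == i]


print(equivalence_class(2, 5, 0, 20))        # successive members are 5 apart
print(same_remainder(7, 17, 5), differ_by_multiple(7, 17, 5))
```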
Modular Fields:
What does any of this have to do with the presentation of the algebraic
structures in the previous section?
1. Given some integer n which is prime,
2. Given the set S = {0, 1, 2, …, n – 1},
3. Given an operation denoted “+”, defined as normal integer addition
modulus n,
4. And given an operation denoted “*”, defined as normal integer
multiplication modulus n,
5. This set of elements and these two operations form an
algebraic field known as a modular field.
Some of the previous material dwelled on the idea that there are algebraic
structures that don’t have all of the properties of addition and multiplication
in the reals.
This was intended as preparation for the converse idea, given here: There
are unfamiliar algebraic systems which do have all of the properties of
arithmetic in the reals.
Note that the first requirement for this to hold is that n be prime. The fact
that this modular structure is a field will not be proven here. What follows
below is a discussion of its characteristics in more pragmatic terms.
Modular Addition:
Suppose you choose n = 5, as your prime number. The set S = {0, 1, 2, 3,
4}. If addition is defined as addition mod 5, you can write a simple addition
table for the set:
+ | 0 1 2 3 4
--+----------
0 | 0 1 2 3 4
1 | 1 2 3 4 0
2 | 2 3 4 0 1
3 | 3 4 0 1 2
4 | 4 0 1 2 3
You can verify all of the entries in the table, but just one should illustrate the
general idea:
(3 + 4) mod 5 = (7) mod 5 = 2
It is true that in doing the arithmetic in this way, we make use of the value 7.
7 is not an element of the set, but this is not a problem because 7 is not the
final answer. Speaking algebraically, the key property illustrated by this is
closure, a property needed for the structure.
You might also ask, if this is the table for the addition operation, is it clear
that every element in S has an additive inverse? Take this for example:
(2 + 3) mod 5 = (5) mod 5 = 0
This looks a little strange because the additive inverses are not negatives of
each other. There are no negative elements in the field in the sense that the
integers or the reals have negatives. However, since the sum of 2 and 3 is
the additive identity, 0, they are additive inverses by definition.
You can confirm that 0 is the additive identity by looking at the row for 0 in
the table. 0 plus anything gives the same thing back. You can also confirm
that every element of the set has a (unique) additive inverse by observing
that each row and each column in the addition table contains one (and only
one) zero element.
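The addition table and the additive inverses can be generated rather than written out by hand. This is a minimal sketch; addition_table and additive_inverse are illustrative names, and the formula (n – a) mod n for the inverse is used so that 0 comes out as its own inverse.

```python
def addition_table(n):
    """Row a, column b holds (a + b) mod n."""
    return [[(a + b) % n for b in range(n)] for a in range(n)]


def additive_inverse(a, n):
    """The element of {0, ..., n - 1} whose sum with a is 0 mod n."""
    return (n - a) % n


table = addition_table(5)
for a, row in enumerate(table):
    print(a, "|", *row)

inverses = {a: additive_inverse(a, 5) for a in range(5)}
print(inverses)
```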
Modular Multiplication:
This is the multiplication table for the modular field with n = 5:
* | 0 1 2 3 4
--+----------
0 | 0 0 0 0 0
1 | 0 1 2 3 4
2 | 0 2 4 1 3
3 | 0 3 1 4 2
4 | 0 4 3 2 1
This example shows the derivation of an entry in the table:
(3 * 4) mod 5 = (12) mod 5 = 2
Looking at the row for 1 in the table, it is clear that 1 is the multiplicative
identity in the field. Every element times 1 gives the same element back.
You can also observe that 0 does not have a multiplicative inverse and that 1
appears in every other row and column, indicating that all other elements of
the field do have multiplicative inverses.
As a matter of fact, the table is symmetric, and every row and every column
other than the ones for 0 contains exactly one occurrence of each of the
values of S. Each of these rows and columns is a permutation of S.
This example shows multiplicative inverses:
(2 * 3) mod 5 = (6) mod 5 = 1
The product of 2 and 3 is 1, the multiplicative identity, so 2 and 3 are
multiplicative inverses by definition.
2 and 3 are both the additive and the multiplicative inverses of each other.
This seems unusual because nothing like this can happen in the integers or
the reals. However, it is an interesting coincidence, not a property. From
the point of view of algebraic properties, the modular field works just like
the reals.
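Multiplicative inverses can be read off the table mechanically by looking for the 1 in each row. The sketch below does this for n = 5; multiplication_table and inverse_pairs are hypothetical helper names.

```python
def multiplication_table(n):
    """Row a, column b holds (a * b) mod n."""
    return [[(a * b) % n for b in range(n)] for a in range(n)]


def inverse_pairs(n):
    """Map each element a to the b with (a * b) mod n == 1, where such a b exists."""
    table = multiplication_table(n)
    return {a: row.index(1) for a, row in enumerate(table) if 1 in row}


table5 = multiplication_table(5)
pairs5 = inverse_pairs(5)
symmetric = all(table5[a][b] == table5[b][a]
                for a in range(5) for b in range(5))

print(pairs5)      # 0 is missing: its row contains no 1
print(symmetric)
```

In this table 1 and 4 are each their own inverse, while 2 and 3 are inverses of each other.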
The Difference Between n Prime and n Not Prime:
The previous section showed a multiplication table with n = 5, prime. One
of the conditions for having a modular field is that n be prime. By way of
contrast, consider the modular multiplication table for n = 4, composite:
* | 0 1 2 3
--+--------
0 | 0 0 0 0
1 | 0 1 2 3
2 | 0 2 0 2
3 | 0 3 2 1
Notice that the row and column for element 2 do not contain a 1. In other
words, 2 does not have a multiplicative inverse in this structure. This fact
alone shows that you’re not dealing with a field.
Other interesting characteristics can be found in a structure like this. The
row and column for element 2 also don’t contain 3, but they contain two
occurrences each of 0 and 2.
n = 4 is composite. The elements 1 and 3 are relatively prime to 4. Their
rows in the table also look like rows in the table for n prime. They are
permutations of the elements of the set S.
2 is a factor of 4. It is not relatively prime to 4. Its row in the table is not
like a row in a table for n prime because it contains repetitions. You may
recall the example of a cipher given earlier that didn’t work:
c = (2 * p) mod 26
It didn’t work because there were collisions. If you formed the row for 2 in
the modular multiplication table for n = 26, you would get this:
0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24
Without trying to prove anything at this point, the observation can be made
that the collisions in the cipher occur where there are repetitions in the
multiplication table. Such a cipher works if the factor is relatively prime to
n, because there are no repetitions. It does not work if the factor is not
relatively prime to n, because there are repetitions.
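The rows discussed above can be generated and tested for repetitions. This sketch is an observation aid, not a proof; row and is_permutation_row are names invented for the illustration.

```python
def row(k, n):
    """The row for element k in the multiplication table mod n."""
    return [(k * p) % n for p in range(n)]


def is_permutation_row(k, n):
    """True if the row contains each of 0 .. n - 1 exactly once."""
    return sorted(row(k, n)) == list(range(n))


rows_mod_4 = {k: is_permutation_row(k, 4) for k in range(1, 4)}
row_2_mod_26 = row(2, 26)

print(row(2, 4))
print(rows_mod_4)
print(row_2_mod_26[:13] == row_2_mod_26[13:])   # second half repeats the first half
```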
Modular Field Properties:
Here is a listing of the properties of a field as they hold for a modular field
base n, where n is prime:
1. Associativity:
(a + (b + c)) mod n = ((a + b) + c) mod n
(a * (b * c)) mod n = ((a * b) * c) mod n
2. Commutativity:
(a + b) mod n = (b + a) mod n
(a * b) mod n = (b * a) mod n
3. Distributivity:
(a * (b + c)) mod n = ((a * b) + (a * c)) mod n
4. Identities:
(a + 0) mod n = (0 + a) mod n = a
(a * 1) mod n = (1 * a) mod n = a
It is not hard to show that these properties hold for a modular field. The
arithmetic operations for the modular field were defined to be like integer
operations, with modulus applied to the result. Each of these properties is
expressed as an equality. Because these properties hold for ordinary integer
arithmetic, each equality holds in the integers before the application of
modulus. Therefore, the results after applying modulus are also the same.
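For a small n, properties 1 through 4 can also be confirmed exhaustively. The sketch below defines the two modular operations as functions, so that every intermediate result is reduced, and checks each equality over all triples from S. The names add, mul and check_properties are assumptions of this illustration.

```python
from itertools import product


def add(a, b, n):
    return (a + b) % n


def mul(a, b, n):
    return (a * b) % n


def check_properties(n):
    """Exhaustively check associativity, commutativity, distributivity, identities."""
    S = range(n)
    for a, b, c in product(S, repeat=3):
        if add(a, add(b, c, n), n) != add(add(a, b, n), c, n):
            return False
        if mul(a, mul(b, c, n), n) != mul(mul(a, b, n), c, n):
            return False
        if add(a, b, n) != add(b, a, n) or mul(a, b, n) != mul(b, a, n):
            return False
        if mul(a, add(b, c, n), n) != add(mul(a, b, n), mul(a, c, n), n):
            return False
    # 0 and 1 behave as the additive and multiplicative identities.
    return all(add(a, 0, n) == a and mul(a, 1, n) == a for a in S)


print(check_properties(5))
print(check_properties(4))
```

The check also passes for composite n such as 4. These four properties do not depend on n being prime; only the existence of multiplicative inverses does.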
5. Inverses:
There exists an additive inverse, -a, such that the sum of a and -a mod
n is 0 for all a. Without quantifiers, the notation for this is:
(a + (-a)) mod n = 0
“-a” is used to signify the additive inverse. It should not be confused
with the meaning of a negative number in the reals.
There exists a multiplicative inverse, a^-1, such that the product of a
and a^-1 mod n is 1 for all a except 0. Without quantifiers, the notation
for this is:
(a * (a^-1)) mod n = 1
“a^-1” is used to signify the multiplicative inverse. It should not be
confused with the meaning of a^-1 = 1/a in the reals.
Showing that additive inverses exist is not hard. For any a,
-a = (n – a) mod n, so that 0 is its own additive inverse.
Showing that multiplicative inverses exist is not so straightforward. It
is necessary to show that for any a there exists some b in S and some
integer constant k such that:
a * b = kn + 1
No proof of the existence of multiplicative inverses for n prime will
be given here. Their existence, their properties, and an algorithm for
finding them are all of great interest for cryptography and will be
pursued in greater depth later.
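Until an efficient algorithm is introduced, the condition a * b = kn + 1 can be explored with a simple search over S. This is a brute-force sketch for small n only, with a hypothetical function name; it is not the algorithm the later material refers to.

```python
def find_inverse(a, n):
    """Search S = {0, ..., n - 1} for b with a * b = k * n + 1.

    Returns the pair (b, k), or None if no such b exists.
    """
    for b in range(n):
        k, remainder = divmod(a * b, n)   # a * b = k * n + remainder
        if remainder == 1:
            return b, k
    return None


print(find_inverse(2, 5))   # (3, 1) because 2 * 3 = 1 * 5 + 1
print(find_inverse(2, 4))   # None: 2 has no inverse mod 4
```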
Reducibility Under Addition:
Modular arithmetic also has a computational property, reducibility, which is
not a field property. This is a statement of reducibility under addition:
(a + b) mod n = ((a mod n) + (b mod n)) mod n
Loosely speaking, this means that taking the modulus of some expression
“distributes” over its parts. You can apply modulus to subparts of
expressions first, combine these results, and then find the modulus. This can
be simpler than computing unreduced results and then finding the modulus.
In order to show that this property holds, consider the following:
a mod n = ra ↔ a = cn + ra
b mod n = rb ↔ b = dn + rb
Evaluating a + b without reducing requires finding the sum of a and b first:
(a + b) mod n
= ((cn + ra) + (dn + rb)) mod n
= (n(c + d) + (ra + rb)) mod n
= (ra + rb) mod n
Evaluating a + b with reducing requires finding the sum of smaller
quantities:
(a + b) mod n
= ((a mod n) + (b mod n)) mod n
= (ra + rb) mod n
Note that if (ra + rb) < n, then the final result would simply be (ra + rb). If
(ra + rb) >= n, then the last mod operation has an effect.
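Reducibility under addition is easy to confirm by computing the sum both ways. A minimal sketch, with illustrative function names and arbitrarily chosen sample values:

```python
def add_mod_unreduced(a, b, n):
    """Add first, then take the modulus once."""
    return (a + b) % n


def add_mod_reduced(a, b, n):
    """Reduce each operand first, add the small remainders, then reduce again."""
    return ((a % n) + (b % n)) % n


a, b, n = 1234567, 7654321, 5
print(a % n, b % n)                                        # the remainders ra and rb
print(add_mod_unreduced(a, b, n), add_mod_reduced(a, b, n))
```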
Reducibility Under Multiplication:
Modular arithmetic also has reducibility under multiplication:
(a * b) mod n = ((a mod n) * (b mod n)) mod n
In order to show that this property holds, consider the following:
a mod n = ra ↔ a = cn + ra
b mod n = rb ↔ b = dn + rb
Evaluating a * b without reducing requires finding the product of a and b
first:
(a * b) mod n
= ((cn + ra) * (dn + rb)) mod n
= (cn * dn + cn * rb + ra * dn + ra * rb) mod n
= (n(c * dn + c * rb + ra * d) + ra * rb) mod n
= (ra * rb) mod n
Evaluating a * b with reducing requires finding the product of smaller
quantities:
(a * b) mod n
= ((a mod n) * (b mod n)) mod n
= (ra * rb) mod n
Note that if (ra * rb) < n, then the final result would simply be (ra * rb). If
(ra * rb) >= n, then the last mod operation has an effect.
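Reducibility under multiplication is what keeps intermediate values small when many factors are multiplied together. The sketch below applies the two-factor rule repeatedly to a list of factors; that extension, the function name product_mod, and the sample values are assumptions of this illustration.

```python
def product_mod(values, n):
    """Multiply the values together mod n, reducing after every step."""
    result = 1 % n
    for v in values:
        # The running result and the next factor are both reduced, so the
        # intermediate product never exceeds (n - 1) * (n - 1).
        result = (result * (v % n)) % n
    return result


values = [123456, 654321, 987654]
unreduced = (123456 * 654321 * 987654) % 26   # one large product, reduced once
print(product_mod(values, 26) == unreduced)
```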