Multiplication and Division Algorithms
Charlton Lu
Mathematics of the Universe – Math 89S
Professor Bray
6 December 2016
Introduction:
Today, computers can be programmed to solve extremely complex problems and process
large amounts of data—their speed allows them to perform many tasks in a fraction of the time
that it would take a human. However, their ability to solve problems quickly stems not only from
raw computing power, but also from the efficient algorithms that programmers write. For
example, the routines that computers use to multiply and divide rely on methods that far outpace
our conventional pencil-and-paper techniques. By maximizing efficiency at this most fundamental
level of computer science, programmers can drastically improve the speed of any program that
calls upon these operations. This paper discusses the algorithms computers use for multiplication
and division to optimize calculation speed.
Time Complexity:
In order to measure efficiency, we introduce the idea of time complexity. The time
complexity of an algorithm measures the time an algorithm takes to run as a function of the input
length. The running time is estimated by counting the number of elementary operations (e.g.
single-digit multiplications) required to execute the algorithm. We focus on multiplications rather
than additions and subtractions because their computing time far eclipses that of addition and
subtraction as input sizes increase. We can then produce an expression for the number of
elementary operations needed as a function of the input length. For example, an algorithm may
require n^3 + 1 elementary operations to execute, where n is the input length. The time
complexity of this algorithm is described asymptotically, in terms of how the operation count
grows as the input length goes to infinity. Thus, the time complexity of this algorithm is written
as O(n^3): constant factors and lower-order terms are dropped because they become negligible as
n grows.
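To make the notation concrete, here is the standard bound behind that statement, written in LaTeX notation (a one-line check, not an additional result from the paper):

    n^3 + 1 \le n^3 + n^3 = 2n^3 \quad \text{for all } n \ge 1,
    \qquad \text{so} \qquad n^3 + 1 \in O(n^3).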
In some cases, the number of operations an algorithm requires to solve a problem may
depend not only on the length of the input, but also on what the inputs are. For example, multiplying
two 2-digit numbers takes fewer operations if one of the numbers is a multiple of 10. On a more
complex level, Gaussian Elimination will take fewer operations in many situations such as if any
row has a 0 as the leading entry, or if one row is a scalar multiple of another. In cases where the
algorithm may take fewer steps, the time complexity of the algorithm is typically given by the
worst-case time-complexity, which is the maximum number of operations needed to run the
algorithm on an input of length n (Papadimitriou).
Multiplication algorithms
Multiplication, a fundamental operation in mathematics, is pervasive in computer
programs. Given how often multiplication is used, an algorithm that shortens multiplication time
will also significantly speed up many programs. Recognizing this fact, mathematicians and
computer scientists alike searched for a more efficient multiplication algorithm.
The classic multiplication algorithm we all learned in grade school, also known as long
multiplication, requires the multiplication of each digit of one number by each digit of the other
number. Multiplying two 2-digit numbers with digits a1 a2 and b1 b2 requires four single-digit
multiplications: a2 b2, a2 b1, a1 b2, and a1 b1. More generally, the product of two n-digit
numbers requires n^2 multiplications of single-digit numbers, which implies that long
multiplication has a time complexity of O(n^2). While the long multiplication algorithm works
very quickly for small numbers, the number of operations required increases quadratically, which
translates to a quadratically increasing computing time.
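To make the digit-by-digit procedure concrete, a minimal Python sketch of long multiplication might look as follows (the function name long_multiply and the digit-list representation are simply choices made for this illustration); the nested loops make the n^2 single-digit multiplications explicit:

    def long_multiply(x, y):
        """Schoolbook (long) multiplication of two non-negative integers.

        Every digit of x is multiplied by every digit of y, so two n-digit
        inputs require n * n single-digit multiplications: O(n^2).
        """
        xs = [int(d) for d in str(x)][::-1]   # least-significant digit first
        ys = [int(d) for d in str(y)][::-1]
        result = [0] * (len(xs) + len(ys))
        for i, a in enumerate(xs):
            carry = 0
            for j, b in enumerate(ys):
                total = result[i + j] + a * b + carry   # one single-digit multiplication
                result[i + j] = total % 10
                carry = total // 10
            result[i + len(ys)] += carry                # carry out of this row
        return int("".join(map(str, reversed(result))))

    print(long_multiply(12, 34))   # 408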
In 1960, Anatoly Karatsuba discovered an algorithm that made multiplication more
efficient. His algorithm comes from the decomposition of an n-digit number X into the sum
10^(n/2) X1 + X2, where X1 is the number formed by the first n/2 digits and X2 is the number
formed by the last n/2 digits. If we have another n-digit number Y that can be decomposed into
10^(n/2) Y1 + Y2, we can now express the product XY as the product of the two decompositions:
1. XY = (10^(n/2) X1 + X2)(10^(n/2) Y1 + Y2)
Multiplying out the two decompositions yields a new expression:
2. XY = 10^n X1 Y1 + 10^(n/2) (X1 Y2 + X2 Y1) + X2 Y2
In this expression, we have to perform four multiplications, which is no more efficient than long
multiplication. However, by implementing a clever trick, we can reduce the number of necessary
multiplications to three. If we calculate the value (X1 + X2)(Y1 + Y2), we obtain
X1 Y1 + X1 Y2 + X2 Y1 + X2 Y2. Subtracting X1 Y1 and X2 Y2, which we already must calculate to
find the first and third terms of the product, we are left with X1 Y2 + X2 Y1, the middle term of
expression 2. Thus, we can calculate all three terms of expression 2 using just three
multiplications (Karatsuba and Ofman):
X1 Y1, (X1 + X2)(Y1 + Y2), and X2 Y2.
To give a numerical example, consider the 2-digit numbers 12 and 34. Then we have:
3. X1 = 1, X2 = 2, Y1 = 3, and Y2 = 4.
Thus, we find that:
4. X1 Y1 = (1)(3) = 3, (X1 + X2)(Y1 + Y2) = (1 + 2)(3 + 4) = 21, and X2 Y2 = (2)(4) = 8.
Then, we can plug back into expression 2 (with n = 2) to yield:
5. 10^2 (3) + 10^(2/2) (21 − 3 − 8) + 8 = 300 + 100 + 8 = 408
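A minimal recursive sketch of this scheme in Python (the function name karatsuba and the use of decimal strings to count digits are choices made for this illustration, not part of the original presentation) reproduces the worked example:

    def karatsuba(x, y):
        """Karatsuba multiplication of two non-negative integers (sketch).

        Base case: single-digit operands are multiplied directly. Otherwise the
        operands are split as in expression 1 and only three smaller products are
        computed recursively: X1*Y1, (X1 + X2)*(Y1 + Y2), and X2*Y2.
        """
        if x < 10 or y < 10:                       # base case: one single-digit multiplication
            return x * y
        half = max(len(str(x)), len(str(y))) // 2
        B = 10 ** half
        x1, x2 = divmod(x, B)                      # x = x1 * 10^half + x2
        y1, y2 = divmod(y, B)
        a = karatsuba(x1, y1)                      # X1 * Y1
        c = karatsuba(x2, y2)                      # X2 * Y2
        b = karatsuba(x1 + x2, y1 + y2) - a - c    # X1*Y2 + X2*Y1, the middle term of expression 2
        return a * B * B + b * B + c

    print(karatsuba(12, 34))   # 408, matching equation 5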
The example here is known as the base case in recursion: it can be solved using addition,
subtraction, and single digit multiplication—the most fundamental operations. When larger
numbers are passed to the algorithm, more steps are required in order to reach the base case.
For example, if X and Y are 4-digit integers, then X1, X2, Y1, and Y2 will be 2-digit integers.
As a result, the multiplications X1 Y1 , (X1 + X2 )(Y1 + Y2 ), and X2 Y2 cannot be solved using
single digit operations. Another recursive call of the Karatsuba multiplication must be made in
order to solve the three smaller multiplications. The time complexity of the Karatsuba
multiplication can be calculated using this fact (Babai). If M(n) denotes the number of single-digit
multiplications needed to multiply two n-digit numbers, then each call of the Karatsuba algorithm
reduces the problem to three multiplications of numbers with half as many digits, so we write:
6. M(n) = 3M(n/2)
If we use mathematical induction (and assume n = 2^k for some integer k), we find that for every
i where i ≤ k, we have:
7. M(n) = 3^i M(n/2^i), and if i = k, then M(n) = 3^k M(n/2^k) = 3^k M(1) = 3^k
since the base case requires no further Karatsuba calls, only a single-digit multiplication, so that
M(1) = 1. To find the time complexity of Karatsuba multiplication, we substitute k = log_2 n into
equation 7 and simplify, leading to the equation:
8. M(n) = 3^(log_2 n) = n^(log_2 3) ≈ n^1.585
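Written out in LaTeX notation, the simplification is the standard change-of-base identity (assuming n = 2^k, as above):

    M(n) = 3^{k} = 3^{\log_2 n} = \left(2^{\log_2 3}\right)^{\log_2 n}
         = \left(2^{\log_2 n}\right)^{\log_2 3} = n^{\log_2 3} \approx n^{1.585}.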
Thus, the time complexity of Karatsuba multiplication, O(n^1.585), is better than that of long
multiplication, and Karatsuba multiplication takes significantly less computing time as the input
length grows large. Looking at Figure 1 below, we can see that long multiplication quickly
begins to require many more operations than Karatsuba multiplication; for an input length of just
10, long multiplication already requires more than double the operations of Karatsuba
multiplication (Knuth).
Figure 1: Long multiplication (red) and Karatsuba multiplication (blue): operations required as a function of input length.
Toom-Cook Algorithm
The Toom-Cook algorithm is a generalization of the Karatsuba multiplication. Instead of
splitting integers into two parts, the Toom-Cook algorithm allows for more complex splitting.
For example, the Toom-Cook 3 Way method involves splitting an integer into three parts. These
parts become the coefficients of a polynomial, so that the integers X and Y are split such that:
9. X(t) = X2 t^2 + X1 t + X0
10. Y(t) = Y2 t^2 + Y1 t + Y0
where t is the power of 10 corresponding to the length of each piece; for example, splitting
9-digit integers into three 3-digit pieces gives t = 10^3. Multiplying these together, we obtain a
new polynomial expression W:
11. W(t) = w4 t^4 + w3 t^3 + w2 t^2 + w1 t + w0
First, we choose strategic evaluation points t = 0, 1, −1, 2, ∞ in order to make the future
calculations easier. Because W(t) is the product of X(t) and Y(t), we can plug these points into
X(t) and Y(t) and multiply the results together, which involves five multiplications. If the length
of the integers being multiplied is greater than 3, these multiplications will require another
recursive call of the Toom-Cook algorithm, so a large multiplication can be decomposed into five
smaller ones. Each of these products will be equal to the following, respectively:
12. W(0) = w0
13. W(1) = w4 + w3 + w2 + w1 + w0
14. W(−1) = w4 − w3 + w2 − w1 + w0
15. W(2) = 16w4 + 8w3 + 4w2 + 2w1 + w0
16. W(∞) = w4
By plugging values into this polynomial, we can obtain a system of equations to solve for the
polynomial coefficients. We can now see the rationale behind choosing the points t =
0, 1, −1, 2, ∞: they yield a system of equations that is easy to solve. These equations can be
rewritten in matrix form as
17. [ W(0)  ]   [ 1   0   0   0   0 ] [ w0 ]
    [ W(1)  ]   [ 1   1   1   1   1 ] [ w1 ]
    [ W(−1) ] = [ 1  −1   1  −1   1 ] [ w2 ]
    [ W(2)  ]   [ 1   2   4   8  16 ] [ w3 ]
    [ W(∞)  ]   [ 0   0   0   0   1 ] [ w4 ]
which can be solved for w4, w3, w2, w1, w0. Plugging these back into our expression for W(t)
with t = 10^3, we find that the product is equal to 10^12 w4 + 10^9 w3 + 10^6 w2 + 10^3 w1 +
w0 (Kronenburg).
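The whole pipeline of splitting, evaluating, multiplying pointwise, interpolating, and recombining can be sketched in Python. This is an illustration only: the function name, the fixed 3-digit piece size, and the use of Python's built-in * for the five pointwise products (where a real implementation would recurse) are all choices made for this sketch.

    def toom3_multiply(x, y, piece_digits=3):
        """A sketch of Toom-Cook 3-way multiplication of non-negative integers."""
        B = 10 ** piece_digits                       # t = 10^3 for 3-digit pieces, as in the text
        # Split: x = X2*B^2 + X1*B + X0 and y = Y2*B^2 + Y1*B + Y0 (equations 9 and 10).
        X0, X1, X2 = x % B, (x // B) % B, x // B**2
        Y0, Y1, Y2 = y % B, (y // B) % B, y // B**2

        # Evaluate X(t) and Y(t) at t = 0, 1, -1, 2, infinity and multiply pointwise
        # (equations 12-16); W(infinity) is the product of the leading coefficients.
        W0   = X0 * Y0
        W1   = (X2 + X1 + X0) * (Y2 + Y1 + Y0)
        Wm1  = (X2 - X1 + X0) * (Y2 - Y1 + Y0)
        W2   = (4*X2 + 2*X1 + X0) * (4*Y2 + 2*Y1 + Y0)
        Winf = X2 * Y2

        # Interpolate: solve the system in equation 17 for w0..w4 (every division is exact).
        w0 = W0
        w4 = Winf
        w2 = (W1 + Wm1) // 2 - w0 - w4               # (W(1) + W(-1))/2 = w0 + w2 + w4
        s1 = (W1 - Wm1) // 2                         # equals w1 + w3
        s2 = (W2 - w0 - 4*w2 - 16*w4) // 2           # equals w1 + 4*w3
        w3 = (s2 - s1) // 3
        w1 = s1 - w3

        # Recombine: w4*B^4 + w3*B^3 + w2*B^2 + w1*B + w0, i.e. W(t) evaluated at t = B.
        return (((w4 * B + w3) * B + w2) * B + w1) * B + w0

    print(toom3_multiply(123456789, 987654321) == 123456789 * 987654321)   # True

For larger operands, the five pointwise products would themselves be computed by recursive Toom-Cook calls, which is what produces the recurrence discussed next.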
Toom-Cook Complexity
The Toom-Cook 3-algorithm reduces a large multiplication to 5 multiplications of smaller
numbers. Thus, we can write:
18. M(n) = 5M(n/3)
This implies that the Toom-Cook 3-algorithm has a complexity of O(n^(log_3 5)) ≈ O(n^1.465).
In general, the Toom-Cook k-algorithm has a complexity of O(n^(log_k (2k−1))). It is also
interesting to note that Karatsuba multiplication is actually a special case of the Toom-Cook
algorithm (the Toom-Cook 2-algorithm) and long multiplication is simply the Toom-Cook
1-algorithm (Knuth).
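As a quick check of the general exponent (a worked consequence of the formula above, written in LaTeX notation):

    k = 2:\ \log_2(2 \cdot 2 - 1) = \log_2 3 \approx 1.585 \text{ (Karatsuba)}, \qquad
    k = 3:\ \log_3(2 \cdot 3 - 1) = \log_3 5 \approx 1.465 \text{ (Toom-Cook 3)}, \qquad
    \lim_{k \to 1} \log_k(2k - 1) = 2 \text{ (long multiplication)}.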
Implementation:
Because the time complexities of Karatsuba and Toom-Cook multiplication are both better
than that of long multiplication, they will become faster than long multiplication as the input size
rises towards infinity. However, they are not necessarily more efficient than long multiplication
at all input sizes. There are several factors that contribute to this nuance. For instance, both
Karatsuba and Toom-Cook multiplication are slower than long multiplication at small inputs
because recursive overhead—the computing power required to recursively call a function several
times—will outweigh multiplicative efficiency when 𝑛 is small. Therefore, implementations of
the Karatsuba and Toom-Cook algorithms often switch to long multiplication when the operands
in multiplication become small. In addition, for small inputs, the increase in number of additions
and subtractions outweighs the increased efficiency in multiplication, so long multiplication will
be more efficient. Lastly, solving the matrix equation in the Toom-Cook algorithm takes
computing power, and only pays off for large inputs, when the gain in multiplicative efficiency
begins to outweigh the cost of solving the matrix equation. For even larger numbers, algorithms
such as the Schönhage–Strassen algorithm, which relies on fast Fourier transforms, become more
efficient (Knuth).
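A sketch of this switching strategy in Python (illustrative only: the function name hybrid_multiply and the 32-digit cutoff are placeholders, since real libraries tune the threshold empirically, and the plain * in the base case stands in for a schoolbook routine):

    KARATSUBA_CUTOFF = 32   # placeholder threshold, in digits; real libraries tune this value

    def hybrid_multiply(x, y):
        """Karatsuba multiplication that falls back to direct multiplication for small
        operands, so recursive overhead is only paid when the asymptotic savings matter."""
        if max(len(str(x)), len(str(y))) <= KARATSUBA_CUTOFF:
            return x * y                             # stands in for schoolbook long multiplication
        half = max(len(str(x)), len(str(y))) // 2
        B = 10 ** half
        x1, x2 = divmod(x, B)
        y1, y2 = divmod(y, B)
        a = hybrid_multiply(x1, y1)
        c = hybrid_multiply(x2, y2)
        b = hybrid_multiply(x1 + x2, y1 + y2) - a - c
        return a * B * B + b * B + c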
Division algorithm
Division can be understood as a multiplication between a dividend and the reciprocal of
the divisor. As a result, many division algorithms involve two steps: finding the reciprocal of the
divisor, and employing a multiplication algorithm to find the quotient.
To find the reciprocal of a number D, we can create an equation whose root is 1/D and then use
Newton's method to estimate that root. One equation that works is
19. f(X) = 1/X − D
Applying Newton's method, we can iterate X_(i+1) = X_i − f(X_i)/f′(X_i) to find increasingly
accurate estimates of 1/D. Each iteration of the algorithm can then be written as:
20. X_(i+1) = X_i − (1/X_i − D)/(−1/X_i^2) = X_i (2 − D X_i)
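A minimal floating-point sketch of this iteration in Python (the function name, the iteration count, and the crude starting guess are choices made for this illustration; a real implementation would pick the initial estimate from the leading digits of D and work at growing precision):

    def reciprocal_newton(D, iterations=6, x0=0.1):
        """Estimate 1/D by iterating equation 20: X_(i+1) = X_i * (2 - D * X_i).

        The starting guess x0 must satisfy 0 < D * x0 < 2 for the iteration to converge.
        Each step uses only multiplication and subtraction, never division.
        """
        x = x0
        for i in range(iterations):
            x = x * (2 - D * x)
            print(f"iteration {i + 1}: x = {x:.15f}, error D*x - 1 = {D * x - 1:+.3e}")
        return x

    reciprocal_newton(7)   # converges to 1/7 = 0.142857...

The printed error shrinks roughly quadratically from one iteration to the next, which previews the error analysis below.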
Figure 2: Newton's Method
Figure 2 gives a visual representation of Newton's Method at work. Using the derivative of the
function, we find increasingly precise estimates of the root. We can quantify the accuracy of each
estimate using error analysis: if we define the error ϵ_i to be X_i − 1/D, the difference between
our estimate and the actual value of 1/D, we can track the error after each iteration of Newton's
Method. However, using ϵ_i = D X_i − 1, which is simply the first expression multiplied by D,
will yield the same result with much cleaner math. We then calculate the error after an iteration
of Newton's method:
ϵ_(i+1) = D X_(i+1) − 1
= 2 D X_i − D^2 X_i^2 − 1
= −(D X_i − 1)^2
= −ϵ_i^2
This implies that each iteration of Newton’s method will result in roughly a doubling of the
number of correct digits in the result. After calculating the reciprocal of 𝐷 to a sufficient degree
of accuracy, we can then use one of the aforementioned multiplication algorithms to find the
quotient. The use of Newton's method to find the reciprocal of the divisor and then multiplying
the reciprocal by the dividend is known as the Newton-Raphson method. The time complexity of
finding the reciprocal is O(log(n) F(n)), where F(n) is the number of fundamental operations
needed to multiply two numbers at n-digit precision, since Newton's method needs O(log n)
iterations and each iteration is dominated by a constant number of multiplications. In fact,
because the early iterations only require low precision, the work is dominated by the final
full-precision iterations, so the reciprocal costs roughly as much as a few full-precision
multiplications. Therefore, the overall time complexity of division, finding the reciprocal and
then multiplying it by the dividend, is the same as that of the underlying multiplication
algorithm; with Toom-Cook 3 multiplication it is O(n^1.465). Division therefore has the same
asymptotic complexity as multiplication (Cook).
As with multiplication, implementations of these division algorithms will often be slower than
standard long division for small inputs due to recursive overhead. In addition, the computing
power used to calculate the reciprocal of an operand for small inputs would be more efficiently
utilized on simply carrying out the long division. Thus, like multiplication, the division algorithm
used by computers will depend on the input length. Typically, the Newton-Raphson algorithm is
only used for very large inputs (Papadimitriou).
Conclusion:
Currently, the fastest possible multiplication and division algorithms for two n-digit integers
remain an open question in computer science. And yet, an increase in multiplication and
division efficiency could lead to significantly faster computing speeds and new capabilities to
process ever-increasing amounts of data.
Works Cited
Babai, Laszlo. "Divide and Conquer: The Karatsuba–Ofman algorithm." UChicago: Laszlo
Babai. 3 Dec 2016 <http://people.cs.uchicago.edu/~laci/HANDOUTS/karatsuba.pdf>.
Bodrato, M., and A. Zanoni. "Integer and Polynomial Multiplication: Towards Optimal Toom-Cook
Matrices." Proceedings of the ISSAC 2007 Conference. New York: ACM Press, 2007.
Cook, Stephen A. On the Minimum Computation Time of Functions. Cambridge, 1966.
Karatsuba, A., and Yu. Ofman. "Multiplication of Many-Digital Numbers by Automatic
Computers." Proceedings of the USSR Academy of Sciences. Physics-Doklady, 1963. 595–596.
Knuth, Donald. The Art of Computer Programming. Addison-Wesley, 2005.
Kronenburg, M.J. Toom-Cook Multiplication: Some Theoretical and Practical Aspects. 16 Feb 2016.
Papadimitriou, Christos H. Computational Complexity. Reading: Addison-Wesley, 1994.
Smith, Mark D. "Newton-Raphson Technique." 1 Oct 1998. MIT 10.001. 6 Dec 2016
<http://web.mit.edu/10.001/Web/Course_Notes/NLAE/node6.html>.
Picture Bibliography
1. Graphed on https://www.desmos.com/
2. http://tutorial.math.lamar.edu/Classes/CalcI/NewtonsMethod_files/image001.gif