Multiplication and Division Algorithms
Charlton Lu
Mathematics of the Universe – Math 89S
Professor Bray
6 December 2016
Introduction:
Today, computers can be programmed to solve extremely complex problems and process
large amounts of data—their speed allows them to perform many tasks in a fraction of the time
that it would take a human. However, their ability to solve problems quickly stems not only from
raw computing power, but also from the efficient algorithms that programmers write. For
example, the routines that computers use to multiply and divide rely on methods that far outpace
our conventional pencil-and-paper techniques. By maximizing efficiency at this most fundamental
level of computer science, programmers can drastically improve the speed of any program that
calls upon these operations. This paper discusses the algorithms computers use for multiplication
and division to optimize calculation speed.
Time Complexity:
In order to measure efficiency, we introduce the idea of time complexity. The time
complexity of an algorithm measures the time an algorithm takes to run as a function of the input
length. The running time is estimated by counting the number of elementary operations (e.g.
single-digit multiplications) required to execute the algorithm. We focus on multiplications rather
than additions and subtractions because their computing time far eclipses that of addition and
subtraction as input sizes increase. We can then produce an expression for the number of
elementary operations needed as a function of the input length. For example, an algorithm may
require n^3 + 1 elementary operations to execute, where n is the input length. The time
complexity of this algorithm is described asymptotically, in terms of how the operation count
grows as the input length goes to infinity. Thus, the time complexity of this algorithm is written
as O(n^3): constant factors and lower-order terms are dropped because they become negligible as
n grows.
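To make the notation concrete, here is the standard bound behind that statement, written in LaTeX notation (a one-line check, not an additional result from the paper):

    n^3 + 1 \le n^3 + n^3 = 2n^3 \quad \text{for all } n \ge 1,
    \qquad \text{so} \qquad n^3 + 1 \in O(n^3).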
In some cases, the number of operations an algorithm requires to solve a problem may
depend not only on the length of the input, but also on what the inputs are. For example, multiplying
two 2-digit numbers takes fewer operations if one of the numbers is a multiple of 10. On a more
complex level, Gaussian Elimination will take fewer operations in many situations such as if any
row has a 0 as the leading entry, or if one row is a scalar multiple of another. In cases where the
algorithm may take fewer steps, the time complexity of the algorithm is typically given by the
worst-case time-complexity, which is the maximum number of operations needed to run the
algorithm on an input of length n (Papadimitriou).
Multiplication algorithms
Multiplication, a fundamental operation in mathematics, is pervasive in computer
programs. Given how often multiplication is used, an algorithm that shortens multiplication time
will also significantly speed up many programs. Recognizing this fact, mathematicians and
computer scientists alike searched for a more efficient multiplication algorithm.
The classic multiplication algorithm we all learned in grade school, also known as long
multiplication, requires the multiplication of each digit of one number by each digit of the other
number. Multiplying two 2-digit numbers with digits a1 a2 and b1 b2 requires four single-digit
multiplications: a2 b2, a2 b1, a1 b2, and a1 b1. More generally, the product of two n-digit
numbers requires n^2 multiplications of single-digit numbers, which implies that long
multiplication has a time complexity of O(n^2). While the long multiplication algorithm works
very quickly for small numbers, the number of operations required increases quadratically, which
translates to a quadratically increasing computing time.
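To make the digit-by-digit procedure concrete, a minimal Python sketch of long multiplication might look as follows (the function name long_multiply and the digit-list representation are simply choices made for this illustration); the nested loops make the n^2 single-digit multiplications explicit:

    def long_multiply(x, y):
        """Schoolbook (long) multiplication of two non-negative integers.

        Every digit of x is multiplied by every digit of y, so two n-digit
        inputs require n * n single-digit multiplications: O(n^2).
        """
        xs = [int(d) for d in str(x)][::-1]   # least-significant digit first
        ys = [int(d) for d in str(y)][::-1]
        result = [0] * (len(xs) + len(ys))
        for i, a in enumerate(xs):
            carry = 0
            for j, b in enumerate(ys):
                total = result[i + j] + a * b + carry   # one single-digit multiplication
                result[i + j] = total % 10
                carry = total // 10
            result[i + len(ys)] += carry                # carry out of this row
        return int("".join(map(str, reversed(result))))

    print(long_multiply(12, 34))   # 408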
In 1960, Anatoly Karatsuba discovered an algorithm that made multiplication more
efficient. His algorithm comes from the decomposition of an n-digit number X into the sum
10^(n/2) X1 + X2, where X1 is the number formed by the first n/2 digits and X2 is the number
formed by the last n/2 digits. If we have another n-digit number Y that can be decomposed into
10^(n/2) Y1 + Y2, we can now express the product XY as the product of the two decompositions:
1. XY = (10^(n/2) X1 + X2)(10^(n/2) Y1 + Y2)
Multiplying out the two decompositions yields a new expression:
2. XY = 10^n X1 Y1 + 10^(n/2) (X1 Y2 + X2 Y1) + X2 Y2
In this expression, we have to perform four multiplications, which is no more efficient than long
multiplication. However, by implementing a clever trick, we can reduce the number of necessary
multiplications to three. If we calculate the value (X1 + X2)(Y1 + Y2), we obtain
X1 Y1 + X1 Y2 + X2 Y1 + X2 Y2. Subtracting X1 Y1 and X2 Y2, which we already must calculate to
find the first and third terms of the product, we are left with X1 Y2 + X2 Y1, the middle term of
expression 2. Thus, we can calculate all three terms of expression 2 using just three
multiplications (Karatsuba and Ofman):
X1 Y1, (X1 + X2)(Y1 + Y2), and X2 Y2.
To give a numerical example, consider the 2-digit numbers 12 and 34. Then we have:
3. X1 = 1, X2 = 2, Y1 = 3, and Y2 = 4.
Thus, we find that:
4. X1 Y1 = (1)(3) = 3, (X1 + X2)(Y1 + Y2) = (1 + 2)(3 + 4) = 21, and X2 Y2 = (2)(4) = 8.
Then, we can plug back into expression 2 (with n = 2) to yield:
5. 10^2 (3) + 10^(2/2) (21 − 3 − 8) + 8 = 300 + 100 + 8 = 408
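A minimal recursive sketch of this scheme in Python (the function name karatsuba and the use of decimal strings to count digits are choices made for this illustration, not part of the original presentation) reproduces the worked example:

    def karatsuba(x, y):
        """Karatsuba multiplication of two non-negative integers (sketch).

        Base case: single-digit operands are multiplied directly. Otherwise the
        operands are split as in expression 1 and only three smaller products are
        computed recursively: X1*Y1, (X1 + X2)*(Y1 + Y2), and X2*Y2.
        """
        if x < 10 or y < 10:                       # base case: one single-digit multiplication
            return x * y
        half = max(len(str(x)), len(str(y))) // 2
        B = 10 ** half
        x1, x2 = divmod(x, B)                      # x = x1 * 10^half + x2
        y1, y2 = divmod(y, B)
        a = karatsuba(x1, y1)                      # X1 * Y1
        c = karatsuba(x2, y2)                      # X2 * Y2
        b = karatsuba(x1 + x2, y1 + y2) - a - c    # X1*Y2 + X2*Y1, the middle term of expression 2
        return a * B * B + b * B + c

    print(karatsuba(12, 34))   # 408, matching equation 5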
The example here is known as the base case in recursion: it can be solved using addition,
subtraction, and single digit multiplication—the most fundamental operations. When larger
numbers are passed to the algorithm, more steps are required in order to reach the base case.
For example, if X and Y are 4-digit integers, then X1, X2, Y1, and Y2 will be 2-digit integers.
As a result, the multiplications X1 Y1 , (X1 + X2 )(Y1 + Y2 ), and X2 Y2 cannot be solved using
single digit operations. Another recursive call of the Karatsuba multiplication must be made in
order to solve the three smaller multiplications. The time complexity of the Karatsuba
multiplication can be calculated using this fact (Babai). If M(n) denotes the number of single-digit
multiplications needed to multiply two n-digit numbers, then each call of the Karatsuba algorithm
reduces the problem to three multiplications of numbers with half as many digits, so we write:
6. M(n) = 3M(n/2)
If we use mathematical induction (and assume n = 2^k for some integer k), we find that for every
i where i ≤ k, we have:
7. M(n) = 3^i M(n/2^i), and if i = k, then M(n) = 3^k M(n/2^k) = 3^k M(1) = 3^k
since the base case requires no further Karatsuba calls, only a single-digit multiplication, so that
M(1) = 1. To find the time complexity of Karatsuba multiplication, we substitute k = log_2 n into
equation 7 and simplify, leading to the equation:
8. M(n) = 3^(log_2 n) = n^(log_2 3) ≈ n^1.585
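Written out in LaTeX notation, the simplification is the standard change-of-base identity (assuming n = 2^k, as above):

    M(n) = 3^{k} = 3^{\log_2 n} = \left(2^{\log_2 3}\right)^{\log_2 n}
         = \left(2^{\log_2 n}\right)^{\log_2 3} = n^{\log_2 3} \approx n^{1.585}.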
Thus, the time complexity of Karatsuba multiplication, O(n^1.585), is better than that of long
multiplication, and Karatsuba multiplication takes significantly less computing time as the input
length grows large. Looking at Figure 1 below, we can see that long multiplication quickly
begins to require many more operations than Karatsuba multiplication; for an input length of just
10, long multiplication already requires more than double the operations of Karatsuba
multiplication (Knuth).
Figure 1: Long multiplication (red) and Karatsuba multiplication (blue): operations required as a function of input length.
Toom-Cook Algorithm
The Toom-Cook algorithm is a generalization of the Karatsuba multiplication. Instead of
splitting integers into two parts, the Toom-Cook algorithm allows for more complex splitting.
For example, the Toom-Cook 3 Way method involves splitting an integer into three parts. These
parts become the coefficients of a polynomial, so that the integers X and Y are split such that:
9. X(t) = X2 t^2 + X1 t + X0
10. Y(t) = Y2 t^2 + Y1 t + Y0
where t is the power of 10 corresponding to the length of each piece; for example, splitting
9-digit integers into three 3-digit pieces gives t = 10^3. Multiplying these together, we obtain a
new polynomial expression W:
11. W(t) = w4 t^4 + w3 t^3 + w2 t^2 + w1 t + w0
First, we choose strategic evaluation points t = 0, 1, −1, 2, ∞ in order to make the future
calculations easier. Because W(t) is the product of X(t) and Y(t), we can plug these points into
X(t) and Y(t) and multiply the results together, which involves five multiplications. If the length
of the integers being multiplied is greater than 3, these multiplications will require another
recursive call of the Toom-Cook algorithm, so a large multiplication can be decomposed into five
smaller ones. Each of these products will be equal to the following, respectively:
12. W(0) = w0
13. W(1) = w4 + w3 + w2 + w1 + w0
14. W(−1) = w4 − w3 + w2 − w1 + w0
15. W(2) = 16w4 + 8w3 + 4w2 + 2w1 + w0
16. W(∞) = w4
By plugging values into this polynomial, we can obtain a system of equations to solve for the
polynomial coefficients. We can now see the rationale behind choosing the points t =
0, 1, −1, 2, ∞: they yield a system of equations that is easy to solve. These equations can be
rewritten in matrix form as
17. [ W(0)  ]   [ 1   0   0   0   0 ] [ w0 ]
    [ W(1)  ]   [ 1   1   1   1   1 ] [ w1 ]
    [ W(−1) ] = [ 1  −1   1  −1   1 ] [ w2 ]
    [ W(2)  ]   [ 1   2   4   8  16 ] [ w3 ]
    [ W(∞)  ]   [ 0   0   0   0   1 ] [ w4 ]
which can be solved for w4, w3, w2, w1, w0. Plugging these back into our expression for W(t)
with t = 10^3, we find that the product is equal to 10^12 w4 + 10^9 w3 + 10^6 w2 + 10^3 w1 +
w0 (Kronenburg).
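The whole pipeline of splitting, evaluating, multiplying pointwise, interpolating, and recombining can be sketched in Python. This is an illustration only: the function name, the fixed 3-digit piece size, and the use of Python's built-in * for the five pointwise products (where a real implementation would recurse) are all choices made for this sketch.

    def toom3_multiply(x, y, piece_digits=3):
        """A sketch of Toom-Cook 3-way multiplication of non-negative integers."""
        B = 10 ** piece_digits                       # t = 10^3 for 3-digit pieces, as in the text
        # Split: x = X2*B^2 + X1*B + X0 and y = Y2*B^2 + Y1*B + Y0 (equations 9 and 10).
        X0, X1, X2 = x % B, (x // B) % B, x // B**2
        Y0, Y1, Y2 = y % B, (y // B) % B, y // B**2

        # Evaluate X(t) and Y(t) at t = 0, 1, -1, 2, infinity and multiply pointwise
        # (equations 12-16); W(infinity) is the product of the leading coefficients.
        W0   = X0 * Y0
        W1   = (X2 + X1 + X0) * (Y2 + Y1 + Y0)
        Wm1  = (X2 - X1 + X0) * (Y2 - Y1 + Y0)
        W2   = (4*X2 + 2*X1 + X0) * (4*Y2 + 2*Y1 + Y0)
        Winf = X2 * Y2

        # Interpolate: solve the system in equation 17 for w0..w4 (every division is exact).
        w0 = W0
        w4 = Winf
        w2 = (W1 + Wm1) // 2 - w0 - w4               # (W(1) + W(-1))/2 = w0 + w2 + w4
        s1 = (W1 - Wm1) // 2                         # equals w1 + w3
        s2 = (W2 - w0 - 4*w2 - 16*w4) // 2           # equals w1 + 4*w3
        w3 = (s2 - s1) // 3
        w1 = s1 - w3

        # Recombine: w4*B^4 + w3*B^3 + w2*B^2 + w1*B + w0, i.e. W(t) evaluated at t = B.
        return (((w4 * B + w3) * B + w2) * B + w1) * B + w0

    print(toom3_multiply(123456789, 987654321) == 123456789 * 987654321)   # True

For larger operands, the five pointwise products would themselves be computed by recursive Toom-Cook calls, which is what produces the recurrence discussed next.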
Toom-Cook Complexity
The Toom-Cook 3-algorithm reduces a large multiplication to 5 multiplications of smaller
numbers. Thus, we can write:
18. M(n) = 5M(n/3)
This implies that the Toom-Cook 3-algorithm has a complexity of O(n^(log_3 5)) ≈ O(n^1.465).
In general, the Toom-Cook k-algorithm has a complexity of O(n^(log_k (2k−1))). It is also
interesting to note that Karatsuba multiplication is actually a special case of the Toom-Cook
algorithm (the Toom-Cook 2-algorithm) and long multiplication is simply the Toom-Cook
1-algorithm (Knuth).
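As a quick check of the general exponent (a worked consequence of the formula above, written in LaTeX notation):

    k = 2:\ \log_2(2 \cdot 2 - 1) = \log_2 3 \approx 1.585 \text{ (Karatsuba)}, \qquad
    k = 3:\ \log_3(2 \cdot 3 - 1) = \log_3 5 \approx 1.465 \text{ (Toom-Cook 3)}, \qquad
    \lim_{k \to 1} \log_k(2k - 1) = 2 \text{ (long multiplication)}.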
Implementation:
Because the time complexities of Karatsuba and Toom-Cook multiplication are both better
than that of long multiplication, they will become faster than long multiplication as the input size
rises towards infinity. However, they are not necessarily more efficient than long multiplication
at all input sizes. There are several factors that contribute to this nuance. For instance, both
Karatsuba and Toom-Cook multiplication are slower than long multiplication at small inputs
because recursive overhead—the computing power required to recursively call a function several
times—will outweigh multiplicative efficiency when 𝑛 is small. Therefore, implementations of
the Karatsuba and Toom-Cook algorithms often switch to long multiplication when the operands
in multiplication become small. In addition, for small inputs, the increase in number of additions
and subtractions outweighs the increased efficiency in multiplication, so long multiplication will
be more efficient. Lastly, solving the matrix equation in the Toom-Cook algorithm takes
computing power, and only pays off for large inputs, when the gain in multiplicative efficiency
begins to outweigh the cost of solving the matrix equation. For even larger numbers, algorithms
such as the Schönhage–Strassen algorithm, which relies on fast Fourier transforms, become more
efficient (Knuth).
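A sketch of this switching strategy in Python (illustrative only: the function name hybrid_multiply and the 32-digit cutoff are placeholders, since real libraries tune the threshold empirically, and the plain * in the base case stands in for a schoolbook routine):

    KARATSUBA_CUTOFF = 32   # placeholder threshold, in digits; real libraries tune this value

    def hybrid_multiply(x, y):
        """Karatsuba multiplication that falls back to direct multiplication for small
        operands, so recursive overhead is only paid when the asymptotic savings matter."""
        if max(len(str(x)), len(str(y))) <= KARATSUBA_CUTOFF:
            return x * y                             # stands in for schoolbook long multiplication
        half = max(len(str(x)), len(str(y))) // 2
        B = 10 ** half
        x1, x2 = divmod(x, B)
        y1, y2 = divmod(y, B)
        a = hybrid_multiply(x1, y1)
        c = hybrid_multiply(x2, y2)
        b = hybrid_multiply(x1 + x2, y1 + y2) - a - c
        return a * B * B + b * B + c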
Division algorithm
Division can be understood as a multiplication between a dividend and the reciprocal of
the divisor. As a result, many division algorithms involve two steps: finding the reciprocal of the
divisor, and employing a multiplication algorithm to find the quotient.
To find the reciprocal of a number D, we can create an equation whose root is 1/D and then use
Newton's method to estimate that root. One equation that works is
19. f(X) = 1/X − D
Applying Newton's method, we can iterate X_(i+1) = X_i − f(X_i)/f′(X_i) to find increasingly
accurate estimates of 1/D. Each iteration of the algorithm can then be written as:
20. X_(i+1) = X_i − (1/X_i − D)/(−1/X_i^2) = X_i (2 − D X_i)
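A minimal floating-point sketch of this iteration in Python (the function name, the iteration count, and the crude starting guess are choices made for this illustration; a real implementation would pick the initial estimate from the leading digits of D and work at growing precision):

    def reciprocal_newton(D, iterations=6, x0=0.1):
        """Estimate 1/D by iterating equation 20: X_(i+1) = X_i * (2 - D * X_i).

        The starting guess x0 must satisfy 0 < D * x0 < 2 for the iteration to converge.
        Each step uses only multiplication and subtraction, never division.
        """
        x = x0
        for i in range(iterations):
            x = x * (2 - D * x)
            print(f"iteration {i + 1}: x = {x:.15f}, error D*x - 1 = {D * x - 1:+.3e}")
        return x

    reciprocal_newton(7)   # converges to 1/7 = 0.142857...

The printed error shrinks roughly quadratically from one iteration to the next, which previews the error analysis below.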
Figure 2: Newton's Method
Figure 2 gives a visual representation of Newton's Method at work. Using the derivative of the
function, we find increasingly precise estimates of the root. We can quantify the accuracy of each
estimate using error analysis: if we define the error ϵ_i to be X_i − 1/D, the difference between
our estimate and the actual value of 1/D, we can track the error after each iteration of Newton's
Method. However, using ϵ_i = D X_i − 1, which is simply the first expression multiplied by D,
will yield the same result with much cleaner math. We then calculate the error after an iteration
of Newton's method:
ϵ_(i+1) = D X_(i+1) − 1
= 2 D X_i − D^2 X_i^2 − 1
= −(D X_i − 1)^2
= −ϵ_i^2
This implies that each iteration of Newton’s method will result in roughly a doubling of the
number of correct digits in the result. After calculating the reciprocal of 𝐷 to a sufficient degree
of accuracy, we can then use one of the aforementioned multiplication algorithms to find the
quotient. The use of Newton's method to find the reciprocal of the divisor and then multiplying
the reciprocal by the dividend is known as the Newton-Raphson method. The time complexity of
finding the reciprocal is O(log(n) F(n)), where F(n) is the number of fundamental operations
needed to multiply two numbers at n-digit precision, since Newton's method needs O(log n)
iterations and each iteration is dominated by a constant number of multiplications. In fact,
because the early iterations only require low precision, the work is dominated by the final
full-precision iterations, so the reciprocal costs roughly as much as a few full-precision
multiplications. Therefore, the overall time complexity of division, finding the reciprocal and
then multiplying it by the dividend, is the same as that of the underlying multiplication
algorithm; with Toom-Cook 3 multiplication it is O(n^1.465). Division therefore has the same
asymptotic complexity as multiplication (Cook).
As with multiplication, implementations of these division algorithms will often be slower than
standard long division for small inputs due to recursive overhead. In addition, the computing
power used to calculate the reciprocal of an operand for small inputs would be more efficiently
utilized on simply carrying out the long division. Thus, like multiplication, the division algorithm
used by computers will depend on the input length. Typically, the Newton-Raphson algorithm is
only used for very large inputs (Papadimitriou).
Conclusion:
Currently, the fastest possible multiplication and division algorithms for two n-digit integers
remain an open question in computer science. And yet, an increase in multiplication and
division efficiency could lead to significantly faster computing speeds and new capabilities to
process ever-increasing amounts of data.
Works Cited
Babai, Laszlo. "Divide and Conquer: The Karatsuba–Ofman algorithm." UChicago: Laszlo
Babai. 3 Dec 2016 <http://people.cs.uchicago.edu/~laci/HANDOUTS/karatsuba.pdf>.
Bodrato, M., and A. Zanoni. "Integer and Polynomial Multiplication: Towards Optimal Toom-Cook
Matrices." Proceedings of the ISSAC 2007 Conference. New York: ACM Press, 2007.
Cook, Stephen A. On the Minimum Computation Time of Functions. Cambridge, 1966.
Karatsuba, A., and Yu. Ofman. "Multiplication of Many-Digital Numbers by Automatic
Computers." Proceedings of the USSR Academy of Sciences. Physics-Doklady, 1963. 595–596.
Knuth, Donald. The Art of Computer Programming. Addison-Wesley, 2005.
Kronenburg, M.J. Toom-Cook Multiplication: Some Theoretical and Practical Aspects. 16 Feb 2016.
Papadimitriou, Christos H. Computational Complexity. Reading: Addison-Wesley, 1994.
Smith, Mark D. "Newton-Raphson Technique." 1 Oct 1998. MIT 10.001. 6 Dec 2016
<http://web.mit.edu/10.001/Web/Course_Notes/NLAE/node6.html>.
Picture Bibliography
1. Graphed on https://www.desmos.com/
2. http://tutorial.math.lamar.edu/Classes/CalcI/NewtonsMethod_files/image001.gif