Powers

Integer powers

Integer powers x^n with integer n are computed by a fast algorithm of "repeated squaring". The algorithm is based on the following trick: if n is even, say n=2*k, then x^n=(x^k)^2; and if n is odd, n=2*k+1, then x^n=x*(x^k)^2. Thus we can reduce the calculation of x^n to the calculation of x^k with k<=n/2, using at most two long multiplications. The function power(m,n) calculates the result of m^n for n>0, m>0, integer n and integer m. The bit shifts and the check for an odd number are very fast operations if the internal representation of big numbers uses base 2.

It is easier to implement the non-recursive version of the squaring algorithm in a slightly different form. Suppose we obtain the bits b[i] of the number n in the usual order, so that n=b[0]+2*b[1]+...+b[m]*2^m. Then we can express the power x^n as x^n=x^b[0]*(x^2)^b[1]*...*(x^(2^m))^b[m]. In other words, we evaluate x^2, x^4, ... by repeated squaring, select those x^(2^k) for which the k-th bit b[k] of the number n is nonzero, and multiply all selected powers together.

Real powers

The squaring algorithm can be used to obtain integer powers x^n in any ring: as long as n is an integer, x can be anything from a complex number to a matrix. But for a general real number n, there is no such trick and the power x^n has to be computed through the logarithm and the exponential function, x^n=Exp(n*Ln(x)). An exceptional case is when n is a rational number with a very small numerator and denominator, for example, n=2/3. In this case it is faster to take the square of the cubic root of x. (See the section on the computation of roots below.) Then the case of negative x should be handled separately. This speedup is not implemented in Yacas.

Roots

Computation of roots r=x^(1/n) is efficient when n is a small integer. The basic approach is to numerically solve the equation r^n=x.

Method 1: bisection

For integer N, the following steps are performed:
• Find the highest bit set, l2, in the number N.
• 1<<(l2/2) is definitely a bit that is set in the result. Start by setting that bit in the result, u=1<<(l2/2). It is also the highest bit set in the result.
• Now, traverse all the lower bits, one by one. For each lower bit, starting at lnext=l2/2-1, set v=1<<lnext. Now, (u+v)^2=u^2+2*u*v+v^2. If (u+v)^2<=N, then the bit set in v should also be set in the result, u; otherwise that bit should be cleared.
• Set lnext=lnext-1, and repeat until all bits are tested and lnext=0.

Method 2: Newton's iteration

An efficient method for computing the square root is found by using Newton's iteration for the equation r^2-x=0. The initial value of r can be obtained by bit counting and shifting, as in the bisection method. The iteration formula is r'=r/2+x/(2*r). However, each iteration requires a long division, which is significantly slower than a multiplication; therefore it makes sense to use a method that avoids divisions. One variant of Newton's method is to solve the equation 1/r^2=x. The solution of this equation, r=1/Sqrt(x), is the limit of the iteration r'=r+r*(1-r^2*x)/2, which does not require any divisions (but instead requires three multiplications). The final multiplication r*x completes the calculation of the square root.

Method 3: higher-order iterations

A higher-order generalization of Newton's iteration for the inverse square root 1/Sqrt(x) is: r'=r+r/2*(1-r^2*x)+3*r/8*(1-r^2*x)^2+... The more terms of the series we add, the higher is the convergence rate. This is the Taylor series for (1-y)^(-1/2) where y:=1-r^2*x. If we take the terms up to y^(n-1), the precision at the next iteration will be multiplied by n. The usual second-order iteration (our "method 2") corresponds to n=2.
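The non-recursive repeated-squaring algorithm described under "Integer powers" above can be sketched as follows; this is a minimal Python illustration, and the name int_power is ours rather than the Yacas routine power(m,n).

def int_power(x, n):
    """Compute x**n for integer n >= 0 by repeated squaring.

    Scans the bits b[k] of n from least to most significant: at step k the
    variable 'square' holds x**(2**k), which is multiplied into the result
    whenever bit b[k] is nonzero.
    """
    result = 1
    square = x
    while n > 0:
        if n & 1:          # bit b[k] is set: include the factor x**(2**k)
            result *= square
        square *= square   # x**(2**k) -> x**(2**(k+1))
        n >>= 1
    return result

# Example: 13 = 1 + 4 + 8 in binary, so 3**13 = 3 * 3**4 * 3**8
assert int_power(3, 13) == 1594323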
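Similarly, a minimal sketch of the bit-by-bit bisection of Method 1, relying on Python's arbitrary-precision integers; the name isqrt_bisection and the use of bit_length() are our illustrative choices.

def isqrt_bisection(N):
    """Integer part of Sqrt(N), found by testing one bit of the result at a time."""
    if N < 2:
        return N
    l2 = N.bit_length() - 1        # position of the highest set bit of N
    u = 1 << (l2 // 2)             # this bit is certainly set in the result
    lnext = l2 // 2 - 1
    while lnext >= 0:
        v = 1 << lnext
        # (u+v)^2 = u^2 + 2*u*v + v^2; keep the bit only if it does not overshoot N
        if (u + v) * (u + v) <= N:
            u += v
        lnext -= 1
    return u

assert isqrt_bisection(99) == 9
assert isqrt_bisection(10**6) == 1000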
Logarithm

Integer logarithm

The "integer logarithm", defined as the integer part of Ln(x)/Ln(b), where x and b are integers, is computed using a special routine IntLog(x,b) with purely integer math. When both arguments are integers and only the integer part of the logarithm is needed, the integer logarithm is much faster than evaluating the full floating-point logarithm and truncating the result. The basic algorithm consists of (integer-) dividing x by b repeatedly until x becomes 0 and counting the necessary number of divisions.

Real logarithms

There are many methods to compute the logarithm of a real number. Here we collect these methods and analyze them. The logarithm satisfies Ln(1/x)= -Ln(x). Therefore we need to consider only x>1, or alternatively, only 0<x<1.

Method 1: Taylor series

The logarithm function Ln(x) for general (real or complex) x such that Abs(x-1)<1 can be computed using the Taylor series, Ln(1+z)=z-z^2/2+z^3/3-... The series converges quite slowly unless Abs(z) is small. For real x<1, the series is monotonic, Ln(1-z)= -z-z^2/2-z^3/3-... (with z=1-x>0), and the round-off error is somewhat smaller in that case (but not very much smaller, because the Taylor series method is normally used only when x is very close to 1). If x>1, then we can compute -Ln(1/x) instead of Ln(x). However, the series converges very slowly if x is close to 0 or to 2.

Method 2: square roots + Taylor series

The method of the Taylor series allows one to compute Ln(x) efficiently when x is very close to 1, say x-1=10^(-N) (i.e. for large N). For other values of x the series converges very slowly. We can transform the argument to improve the performance of the Taylor series. One way is to take several square roots, reducing x to x^(2^(-k)) until x becomes close to 1. Then we can compute Ln(x^(2^(-k))) using the Taylor series and use the identity Ln(x)=2^k*Ln(x^(2^(-k))).

Method 3: inverse exponential

The method is to solve the equation Exp(x)-a=0 to find x=Ln(a). We can use either the quadratically convergent Newton iteration, x'=x-1+a/Exp(x), or the cubically convergent Halley iteration, x'=x-2*(Exp(x)-a)/(Exp(x)+a).

Method 4: continued fraction

There is a continued fraction representation of the logarithm: Ln(1+x)=x/(1+x/(2+x/(3+(4*x)/(4+(4*x)/(5+(9*x)/(6+...)))))). This fraction converges for all x, although the speed of convergence varies with the magnitude of x. This method does not seem to provide a computational advantage compared with the other methods.

Method 5: bisection

A simple bisection algorithm for Ln(x)/Ln(2) (the base 2 logarithm) with real x is described in [Johnson 1987]. First, we need to divide x by a certain power of 2 to reduce x to y in the interval 1<=y<2. We can use the bit count m=BitCount(x) to find an integer m such that 1/2<=x*2^(-m)<1 and take y=x*2^(1-m). Then Ln(x)/Ln(2)=Ln(y)/Ln(2)+m-1. Now we shall find the bits in the binary representation of Ln(y)/Ln(2), one by one. Given a real y such that 1<=y<2, the value Ln(y)/Ln(2) is between 0 and 1. Now, Ln(y)/Ln(2)=2^(-1)*Ln(y^2)/Ln(2). The leading bit of this value is 1 if y^2>=2 and 0 otherwise. Therefore we need to compute y'=y^2 using a long P-digit multiplication and compare it with 2. If y'>=2 we set y=y'/2, otherwise we set y=y'; then we obtain 1<=y<2 again and repeat the process to extract the next bit of Ln(y)/Ln(2).
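A sketch of the integer logarithm by repeated division, mirroring the behaviour described for IntLog(x,b); the Python name int_log is ours.

def int_log(x, b):
    """Integer part of Ln(x)/Ln(b) for integers x >= 1, b >= 2.

    Divide x by b until it reaches 0; if this takes n divisions, then
    b^(n-1) <= x < b^n, so the integer part of the logarithm is n-1.
    """
    n = 0
    while x > 0:
        x //= b
        n += 1
    return n - 1

assert int_log(999, 10) == 2
assert int_log(1000, 10) == 3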
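The bit-extraction loop of Method 5 can be illustrated in ordinary double precision; a real implementation would replace the squarings by long P-digit multiplications. The name log2_bisection is illustrative.

import math

def log2_bisection(x, num_bits=40):
    """Approximate Ln(x)/Ln(2) for real x > 0 by extracting the bits of the
    fractional part one by one, as described for the bisection method."""
    m = math.frexp(x)[1]        # integer m with 1/2 <= x*2^(-m) < 1
    y = x * 2.0 ** (1 - m)      # now 1 <= y < 2
    result = m - 1              # integer part of the base-2 logarithm
    bit = 0.5
    for _ in range(num_bits):
        y = y * y               # doubles Ln(y)/Ln(2), exposing its leading bit
        if y >= 2.0:
            result += bit       # the extracted bit is 1
            y /= 2.0
        bit /= 2.0
    return result

print(log2_bisection(10.0))     # close to Ln(10)/Ln(2) = 3.3219...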
Exponential

The exponential function satisfies Exp(-x)=1/Exp(x). Therefore we need to consider only x>0.

Method 1: Taylor series

The exponential function is computed using its Taylor series, Exp(x)=1+x+x^2/2! +... This series converges for all (complex) x, but if Abs(x) is large, or if x is negative, then the series converges slowly and/or gives a large round-off error. So one should use this Taylor series only when x is small.

Method 2: squaring + Taylor series

A speed-up trick used for large x is to divide the argument by some power of 2, apply the Taylor series to the reduced argument, and then square the result several times, i.e. Exp(x)=Exp(2^(-k)*x)^(2^k), where k is chosen large enough that the Taylor series converges quickly at the reduced argument 2^(-k)*x.

Method 3: inverse logarithm

An alternative way to compute x=Exp(a), if a fast logarithm routine is available, would be to solve the equation Ln(x)=a. Newton's method gives the iteration x'=x*(a+1-Ln(x)). The iteration converges quadratically to Exp(a) if the initial value of x satisfies 0<x<Exp(a+1).

Method 4: linear reduction + Taylor series

In this method we reduce the argument x by subtracting an integer. Suppose x>1; then take n=Floor(x) where n is an integer, so that 0<=x-n<1. Then we can compute Exp(x)=Exp(n)*Exp(x-n) by using the Taylor series on the small number x-n. The integer power e^n is found from a precomputed value of e.

Method 5: continued fraction

There is a continued fraction representation of the exponential function: Exp(-x)=1-x/(1+x/(2-x/(3+x/(2-x/(5+x/(2-...)))))). This fraction converges for all x, although the speed of convergence varies with the magnitude of x. This method does not seem to provide a computational advantage compared with the other methods.

Trigonometric functions

Trigonometric functions Sin(x), Cos(x) are computed by subtracting 2*Pi from x until it is in the range 0<x<2*Pi and then using the Taylor series. (The value of Pi is precomputed.) Tangent is computed by dividing Sin(x) by Cos(x), or from Sin(x) using the identity Tan(x)=Sin(x)/Sqrt(1-Sin(x)^2).

Method 1: Taylor series

The Taylor series for the basic trigonometric functions are Sin(x)=x-x^3/3! +x^5/5! -x^7/7! +..., Cos(x)=1-x^2/2! +x^4/4! -x^6/6! +.... These series converge for all x but are optimal for multiple-precision calculations only for small x. The convergence rate and possible optimizations are the same as those of the Taylor series for Exp(x).

Method 2: argument reduction

Basic argument reduction requires a precomputed value for Pi/2. The identities Sin(x+Pi/2)=Cos(x), Cos(x+Pi/2)= -Sin(x) can be used to reduce the argument to the range between 0 and Pi/2. Then the bisection for Cos(x) and the trisection for Sin(x) are used. For Cos(x), the bisection identity can be used more efficiently if it is written as 1-Cos(2*x)=4*(1-Cos(x))-2*(1-Cos(x))^2. If 1-Cos(x) is very small, then this decomposition allows the use of a shorter multiplication and reduces round-off error. For Sin(x), the trisection identity is Sin(3*x)=3*Sin(x)-4*Sin(x)^3.

Method 3: inverse ArcTan(x)

The function ArcTan(x) can be found from its Taylor series. Then the function can be inverted by Newton's iteration to obtain Tan(x) and from it also Sin(x), Cos(x) using the trigonometric identities. Alternatively, ArcSin(x) may be found from the Taylor series and inverted to obtain Sin(x).
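A sketch of Method 2 for the exponential in ordinary double precision: reduce the argument by 2^(-k), sum the Taylor series, then square k times. The fixed choices of k and of the number of terms are illustrative, not Yacas's.

def exp_reduced(x, k=8, terms=20):
    """Compute Exp(x) as Exp(x*2^(-k)) raised to the power 2^k; the Taylor
    series converges quickly for the reduced argument x*2^(-k)."""
    y = x / 2.0 ** k
    term, total = 1.0, 1.0
    for i in range(1, terms):     # Taylor series 1 + y + y^2/2! + ...
        term *= y / i
        total += term
    for _ in range(k):            # square k times: (e^y)^(2^k) = e^x
        total *= total
    return total

print(exp_reduced(5.0))           # close to Exp(5) = 148.413...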
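Likewise, a sketch of the basic scheme for Sin(x): reduce the argument modulo 2*Pi and sum the Taylor series. Here math.pi stands in for the precomputed value of Pi, and the fixed number of terms is illustrative.

import math

def sin_taylor(x, terms=20):
    """Sin(x) from the Taylor series x - x^3/3! + x^5/5! - ... after reducing
    the argument into the range (-2*Pi, 2*Pi)."""
    x = math.fmod(x, 2.0 * math.pi)
    term, total = x, x
    for k in range(1, terms):
        # multiply by -x^2/((2k)*(2k+1)) to go from one term to the next
        term *= -x * x / ((2 * k) * (2 * k + 1))
        total += term
    return total

print(sin_taylor(10.0))   # close to Sin(10) = -0.5440...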
Inverse trigonometric functions

Inverse trigonometric functions are computed by various methods. To compute y=ArcSin(x), Newton's method is used to invert x=Sin(y). The inverse tangent ArcTan(x) can be computed by its Taylor series, ArcTan(x)=x-x^3/3+x^5/5-..., or by the continued fraction expansion, ArcTan(x)=x/(1+x^2/(3+(2*x)^2/(5+(3*x)^2/(7+...)))). The convergence of this expansion for large Abs(x) is improved by using the identities ArcTan(x)=Pi/2*Sign(x)-ArcTan(1/x), ArcTan(x)=2*ArcTan(x/(1+Sqrt(1+x^2))). Thus, any value of x is reduced to Abs(x)<0.42. This is implemented in the standard library scripts.

By the identity ArcCos(x):=Pi/2-ArcSin(x), the inverse cosine is reduced to the inverse sine. Newton's method for ArcSin(x) consists of solving the equation Sin(y)=x for y. The implementation is similar to the calculation of Pi in PiMethod0(). For x close to 1, Newton's method for ArcSin(x) converges very slowly. The identity ArcSin(x)=Sign(x)*(Pi/2-ArcSin(Sqrt(1-x^2))) can be used in this case. Another potentially useful identity is ArcSin(x)=2*ArcSin(x/(Sqrt(2)*Sqrt(1+Sqrt(1-x^2)))).

Inverse tangent can also be related to inverse sine by ArcTan(x)=ArcSin(x/Sqrt(1+x^2)), ArcTan(1/x)=ArcSin(1/Sqrt(1+x^2)). Alternatively, the Taylor series can be used for the inverse sine: ArcSin(x)=x+1/2*x^3/3+(1*3)/(2*4)*x^5/5+(1*3*5)/(2*4*6)*x^7/7+.... An everywhere convergent continued fraction can be used for the tangent: Tan(x)=x/(1-x^2/(3-x^2/(5-x^2/(7-...)))).

Hyperbolic and inverse hyperbolic functions are reduced to exponentials and logarithms: Cosh(x)=1/2*(Exp(x)+Exp(-x)), Sinh(x)=1/2*(Exp(x)-Exp(-x)), Tanh(x)=Sinh(x)/Cosh(x), ArcCosh(x)=Ln(x+Sqrt(x^2-1)), ArcSinh(x)=Ln(x+Sqrt(x^2+1)), ArcTanh(x)=1/2*Ln((1+x)/(1-x)).
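A sketch of the inverse tangent from its Taylor series, using the halving identity ArcTan(x)=2*ArcTan(x/(1+Sqrt(1+x^2))) to bring the argument below 0.42 as described above; the name arctan_taylor and the term count are illustrative.

import math

def arctan_taylor(x, terms=30):
    """ArcTan(x) from the series x - x^3/3 + x^5/5 - ..., applied after
    repeatedly halving the argument with
    ArcTan(x) = 2*ArcTan(x/(1+Sqrt(1+x^2))) until Abs(x) < 0.42."""
    halvings = 0
    while abs(x) >= 0.42:
        x = x / (1.0 + math.sqrt(1.0 + x * x))
        halvings += 1
    term, total = x, x
    for k in range(1, terms):
        term *= -x * x                 # term is now (-1)^k * x^(2k+1)
        total += term / (2 * k + 1)
    return total * 2 ** halvings

print(arctan_taylor(5.0))              # close to ArcTan(5) = 1.3734...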
CORDIC

To determine the sine or cosine of an angle β, the y or x coordinate of a point on the unit circle corresponding to that angle needs to be found. Using CORDIC, we start with the vector v_0 = (1, 0). In the first iteration, this vector is rotated 45° counterclockwise to get the vector v_1. Successive iterations rotate the vector in the required direction by half the amount of the previous iteration, until the wanted angle has been reached.

[Figure: An illustration of the CORDIC algorithm in progress.]

More formally, every iteration calculates a rotation, which is performed by multiplying the vector v_i with the rotation matrix R_i, so that v_(i+1) = R_i*v_i. The rotation matrix is

R_i = [cos(γ_i), -sin(γ_i); sin(γ_i), cos(γ_i)].

Using the well-known identities cos(γ) = 1/sqrt(1+tan(γ)^2) and sin(γ) = tan(γ)/sqrt(1+tan(γ)^2), the rotation matrix becomes

R_i = 1/sqrt(1+tan(γ_i)^2) * [1, -tan(γ_i); tan(γ_i), 1].

The expression for the rotated vector v_(i+1) = R_i*v_i then becomes

x_(i+1) = K_i*(x_i - y_i*tan(γ_i)), y_(i+1) = K_i*(y_i + x_i*tan(γ_i)),

where x_i and y_i are the components of v_i and K_i = 1/sqrt(1+tan(γ_i)^2). Restricting the angles γ_i so that tan(γ_i) takes on the values ±2^(-i), the multiplication with the tangent can be replaced by a division by a power of two, which is efficiently done in digital computer hardware using a bit shift. The expression then becomes

x_(i+1) = K_i*(x_i - σ_i*2^(-i)*y_i), y_(i+1) = K_i*(y_i + σ_i*2^(-i)*x_i),

where K_i = 1/sqrt(1+2^(-2*i)), and σ_i can have the values −1 or 1 and is used to determine the direction of the rotation: if the remaining angle β_i is positive then σ_i is 1, otherwise it is −1. We can ignore K_i in the iterative process and apply it afterward as a single scaling factor K(n) = K_0*K_1*...*K_(n-1), which is calculated in advance and stored in a table. Additionally, it can be noted that K = lim(n->∞) K(n) ≈ 0.60725293500888 [3], to allow further reduction of the algorithm's complexity. After a sufficient number of iterations, the vector's angle will be close to the wanted angle β. For most ordinary purposes, 40 iterations (n=40) is sufficient to obtain the correct result to the 10th decimal place.

The only task left is to determine whether the rotation should be clockwise or counterclockwise at every iteration (choosing the value of σ). This is done by keeping track of how much we have rotated at every iteration and subtracting that from the wanted angle: if the remaining angle β_(i+1) is positive we rotate counterclockwise (σ=1), and if it is negative we rotate clockwise (σ=−1), in order to get closer to the wanted angle β. The values of the rotation angles γ_i = arctan(2^(-i)) must also be precomputed and stored; but for small angles arctan(2^(-i)) = 2^(-i) in fixed point representation, reducing the table size. As can be seen in the illustration above, the sine of the angle β is the y coordinate of the final vector v_n, while the x coordinate is the cosine value.

function v = cordic(beta,n)
% This function computes v = [cos(beta), sin(beta)] (beta in radians)
% using n iterations. Increasing n will increase the precision.

if beta < -pi/2 | beta > pi/2
    if beta < 0
        v = cordic(beta + pi, n);
    else
        v = cordic(beta - pi, n);
    end
    v = -v; % flip the sign for second or third quadrant
    return
end

% Initialization of tables of constants used by CORDIC
% need a table of arctangents of negative powers of two, in radians:
% angles = atan(2.^-(0:27));
angles = [ ...
    0.78539816339745   0.46364760900081   0.24497866312686   0.12435499454676 ...
    0.06241880999596   0.03123983343027   0.01562372862048   0.00781234106010 ...
    0.00390623013197   0.00195312251648   0.00097656218956   0.00048828121119 ...
    0.00024414062015   0.00012207031189   0.00006103515617   0.00003051757812 ...
    0.00001525878906   0.00000762939453   0.00000381469727   0.00000190734863 ...
    0.00000095367432   0.00000047683716   0.00000023841858   0.00000011920929 ...
    0.00000005960464   0.00000002980232   0.00000001490116   0.00000000745058 ];
% and a table of products of reciprocal lengths of vectors [1, 2^-j]:
Kvalues = [ ...
    0.70710678118655   0.63245553203368   0.61357199107790   0.60883391251775 ...
    0.60764825625617   0.60735177014130   0.60727764409353   0.60725911229889 ...
    0.60725447933256   0.60725332108988   0.60725303152913   0.60725295913894 ...
    0.60725294104140   0.60725293651701   0.60725293538591   0.60725293510314 ...
    0.60725293503245   0.60725293501477   0.60725293501035   0.60725293500925 ...
    0.60725293500897   0.60725293500890   0.60725293500889   0.60725293500888 ];
Kn = Kvalues(min(n, length(Kvalues)));

% Initialize loop variables:
v = [1;0]; % start with 2-vector cosine and sine of zero
poweroftwo = 1;
angle = angles(1);

% Iterations
for j = 0:n-1
    if beta < 0
        sigma = -1;
    else
        sigma = 1;
    end
    factor = sigma * poweroftwo;
    R = [1, -factor; factor, 1];
    v = R * v;                   % 2-by-2 matrix multiply
    beta = beta - sigma * angle; % update the remaining angle
    poweroftwo = poweroftwo / 2;
    % update the angle from table, or eventually by just dividing by two
    if j+2 > length(angles)
        angle = angle / 2;
    else
        angle = angles(j+2);
    end
end

% Adjust length of output vector to be [cos(beta), sin(beta)]:
v = v * Kn;
return
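For example, calling cordic(pi/6, 40) should return a column vector close to [cos(pi/6); sin(pi/6)] ≈ [0.8660; 0.5000].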