IEEE 754 double precision properties
2/6/2014
Computational Methods
CMSC/AMSC 460
Ramani Duraiswami,
Dept. of Computer Science
Fractions in binary
• 0.1 = ½
• 0.01 = 1/2² = ¼
• 0.11 = ½ + ¼ = ¾
• 0.00001 = 1/2⁵
• 0.1 + 0.1 = 1.0
• 0.675 = 1 × ½ + 0 × ¼ + 1 × ⅛ + 0 × 1/16 + 1 × 1/32 + …
• 0.675 (decimal) = 0.1010110011001100110011001100110011001100110011001101 … (base 2)
• Matlab dec2bin
Fractions in binary – how to
• Convert .625 to a binary representation.
• Step 1: multiply by 2. Since .625 × 2 = 1.25, the first binary digit is 1.
So .625 = .1??? . . . (base 2).
• Step 2: disregard the whole number part and multiply by 2 again. Because .25 × 2 = 0.50, the second binary digit is 0.
So .625 = .10?? . . . (base 2).
• Step 3: multiply by 2 once again. Since .50 × 2 = 1.00, the third digit is 1, so .625 = .101 (base 2).
• The fractional part is now 0, so we are done.
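The multiply-by-2 steps above are easy to mechanize. Here is a small Python sketch of the same procedure (the slides use Matlab; the function name `frac_to_binary` is ours):

```python
def frac_to_binary(x, max_digits=20):
    """Binary digits of a fraction 0 <= x < 1, using the repeated
    multiply-by-2 method: the integer part of 2*x is the next digit."""
    digits = []
    while x > 0 and len(digits) < max_digits:
        x *= 2
        d = int(x)           # next binary digit (0 or 1)
        digits.append(str(d))
        x -= d               # keep only the fractional part
    return "." + "".join(digits)

print(frac_to_binary(0.625))  # .101 -- the expansion terminates
```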
Infinite fractions
• Consider 1/10 = 0.1
1. Because .1 × 2 = 0.2, .1 (decimal) = .0??? . . . (base 2)
2. .2 × 2 = 0.4, so .1 (decimal) = .00?? . . . (base 2)
3. .4 × 2 = 0.8, so .1 (decimal) = .000?? . . . (base 2)
4. .8 × 2 = 1.6, so .1 (decimal) = .0001?? . . . (base 2)
5. .6 × 2 = 1.2, so .1 (decimal) = .00011?? . . . (base 2)
• Observe: the fractional part left after step 5 (0.2) is the same as the one entering step 2.
• So just keep repeating steps 2–5:
• .1 (decimal) = .00011001100110011 . . . (base 2)
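The same procedure, run in Python with exact rational arithmetic (so the float 0.1's own rounding error does not interfere), shows the repeating 0011 pattern; `frac_bits` is our helper, not lecture code:

```python
from fractions import Fraction

def frac_bits(x, n):
    """First n binary digits of a fraction 0 <= x < 1,
    computed exactly with rationals."""
    bits = []
    for _ in range(n):
        x *= 2
        d = int(x)       # next digit
        bits.append(str(d))
        x -= d
    return "".join(bits)

print(frac_bits(Fraction(1, 10), 20))  # 00011001100110011001
```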
Scientific Notation
• Example: −6.023 × 10⁻²³
• Parts: sign (−), normalized mantissa (6.023), base (10), and exponent (−23, with its own sign)
IEEE-754 (double precision)
• bit 0: sign (s)
• bits 1–11: exponent
• bits 12–63: mantissa (significand)
• value = (−1)ˢ × 1.f × 2^(E−1023)
• The leading 1 is understood (not stored); f is the mantissa without the leading 1; the base is 2; E is the stored (biased) exponent
IEEE – 754 standard
• Most numbers are normalized --- an implied 1 before the binary point
– ±(1+f) × 2^e
• f is called the mantissa -- 0 ≤ f < 1
– For double precision f has 52 bits
• e is called the exponent – a signed integer
– For double precision e has 11 bits
– Using two's complement we could go from −1024 to 1023
– Instead we use −1022 ≤ e ≤ 1023
– The remaining two exponent values are saved to represent other numbers (zeros and denormals, infinities and NaNs)
• The combination of e and f determines the properties of floating point numbers --- precision, distribution, range
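The field layout can be verified directly. This Python sketch (our helper `double_fields`, not part of the lecture) unpacks a double's bits with the standard `struct` module:

```python
import struct

def double_fields(x):
    """Return (sign, biased exponent E, 52 fraction bits) of a double."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    E = (bits >> 52) & 0x7FF            # 11-bit biased exponent
    f = bits & ((1 << 52) - 1)          # 52-bit fraction field
    return sign, E, f

s, E, f = double_fields(1.0)
print(s, E - 1023, f)   # 0 0 0 : 1.0 = +1.0 x 2^0
```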
Effects of floating point
• Floatgui
Effects of floating point
• eps is the distance from 1 to the next larger floating-point number
• Equivalently, roughly the smallest number which, when added to 1, gives a result different from 1
• eps = 2⁻⁵²
• In Matlab:

            Binary            Decimal
  eps       2^(-52)           2.2204e-16
  realmin   2^(-1022)         2.2251e-308
  realmax   (2-eps)*2^1023    1.7977e+308

• Some numbers cannot be exactly represented
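Python exposes the same IEEE-754 double quantities through `sys.float_info`; a quick check against the table above:

```python
import sys

eps = sys.float_info.epsilon       # 2.220446049250313e-16 = 2**-52
realmin = sys.float_info.min       # 2.2250738585072014e-308 = 2**-1022
realmax = sys.float_info.max       # 1.7976931348623157e+308 = (2 - eps) * 2**1023

print(eps, realmin, realmax)
assert 1.0 + eps > 1.0             # eps is a gap you can actually observe at 1.0
```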
Rounding vs. Chopping
• Chopping: Store x as c, where |c| < |x| and no machine
number lies between c and x.
• Rounding: Store x as r, where r is the machine number
closest to x.
• IEEE standard arithmetic uses rounding.
Experiment
• i=0; x=1; while 1+ x > 1, x=x/2, pause (.01), i=i+1; end
• i=0; x=1; while x+ x > x, x=2*x, pause (.01), i=i+1; end
• i=0; x=1; while x+ x > x, x=x/2, pause (.01), i=i+1; end
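The first experiment translates directly to Python (this is our rendering, not lecture code). It counts how many halvings it takes before 1 + x rounds back to 1:

```python
i, x = 0, 1.0
while 1.0 + x > 1.0:
    x /= 2
    i += 1

# The loop stops at x = 2**-53 = eps/2: with round-to-nearest,
# 1 + 2**-53 is exactly halfway between 1 and 1 + eps and the
# tie rounds back down to 1.
print(i, x)   # 53 1.1102230246251565e-16
```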
IEEE-754 (single precision)
• bit 0: sign (s)
• bits 1–8: exponent
• bits 9–31: mantissa (significand)
• value = (−1)ˢ × 1.f × 2^(E−127)
• The leading 1 is understood (not stored); f is the mantissa without the leading 1; the base is 2; E is the stored (biased) exponent
Error definition
• Computation should have result: c
• Actual result is: x
• abs_err = |x − c|
• rel_err = |x − c| / |c|  for c ≠ 0
• x = c(1 + δ), where |δ| = rel_err
• Usually we do not know c
– No need to solve the problem if we already knew it!
• Error analysis tries to estimate abs_err and rel_err using the computed result x and knowledge of the algorithm and data
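A small Python illustration of these definitions, using exact rationals as a stand-in for the (normally unknown) true result c; the example sum 0.1 + 0.2 is our choice:

```python
from fractions import Fraction

c = Fraction(3, 10)        # true result of 0.1 + 0.2, held exactly
x = Fraction(0.1 + 0.2)    # the double the machine actually produced

abs_err = abs(x - c)
rel_err = abs_err / abs(c)
print(float(abs_err), float(rel_err))  # both tiny: on the order of eps
```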
Good rule of thumb
• Always do your computations with variables which are
scaled to be about one
Effects of floating point errors
• Singular equations will only be nearly singular
• Severe cancellation errors can occur

x = 0.988:.0001:1.012;
y = x.^7-7*x.^6+21*x.^5-35*x.^4+35*x.^3-21*x.^2+7*x-1;
plot(x,y)
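The same cancellation can be quantified without a plot. In this Python check (our construction), the expanded polynomial is compared with the mathematically equal (x − 1)⁷ and with an exact rational evaluation:

```python
from fractions import Fraction

x = 1.012
expanded = x**7 - 7*x**6 + 21*x**5 - 35*x**4 + 35*x**3 - 21*x**2 + 7*x - 1
factored = (x - 1)**7

# Exact value of the polynomial at this particular double x
xf = Fraction(x)
exact = float((xf - 1)**7)

print(expanded, factored, exact)
# 'expanded' carries rounding error of order eps * 35 ~ 1e-14, which is
# comparable to the true value itself (~3.6e-14): massive cancellation.
```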
Errors can be magnified
• Errors can be magnified during computation.
• Let us assume numbers are known to 0.05% accuracy
• Example: 2.003 × 10⁰ and 2.000 × 10⁰
– both known to within ± .001
• Perform a subtraction. Result: 0.003 × 10⁰
• But the true answer could be as small as 2.002 − 2.001 = 0.001,
• or as large as 2.004 − 1.999 = 0.005!
• Absolute error bound of 0.002
• Relative error of up to 0.002 / 0.001 = 200%!
• Adding or subtracting causes the bounds on absolute errors to be added
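Checking the subtraction example numerically in Python (the ±0.001 bounds come from the slide; the interval endpoints are just evaluated directly):

```python
a, da = 2.003, 0.001    # a is known only to within +/- da
b, db = 2.000, 0.001

diff = a - b                      # nominally 0.003
smallest = (a - da) - (b + db)    # 2.002 - 2.001 = 0.001
largest  = (a + da) - (b - db)    # 2.004 - 1.999 = 0.005

abs_err_bound = da + db           # absolute error bounds add: 0.002
print(diff, smallest, largest, abs_err_bound / smallest)  # ratio ~ 2.0, i.e. 200%
```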
Error effect on multiplication/division
• Let x and y be true values
• Let X = x(1+r) and Y = y(1+s) be the known approximations
• Relative errors are r and s
• What is the error in multiplying the numbers?
• XY = xy(1+r)(1+s)
• Absolute error = |XY − xy| = |xy(r + s + rs)|
• Relative error = |(XY − xy)/xy| = |r + s + rs| ≤ |r| + |s| + |rs|
• If r and s are small we can ignore |rs|
• Multiplying/dividing causes relative error bounds to add
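A quick numerical check of the identity XY = xy(1+r)(1+s) in Python; the values of x, y, r, s below are made up for illustration:

```python
x, y = 3.0, 7.0           # true values
r, s = 1e-4, -2e-4        # relative errors in the approximations
X, Y = x * (1 + r), y * (1 + s)

rel = abs((X * Y - x * y) / (x * y))
predicted = abs(r + s + r * s)
print(rel, predicted)     # both about 1e-4; |rel| <= |r| + |s| + |r s|
```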