IEEE 754 double precision properties
2/6/2014
Computational Methods
CMSC/AMSC 460
Ramani Duraiswami,
Dept. of Computer Science
Fractions in binary
• 0.1 = ½
• 0.01 = 1/2² = ¼
• 0.11 = ½ + ¼ = ¾
• 0.00001 = 1/2⁵
• 0.1 + 0.1 = 1.0
• 0.675 = 1 × ½ + 0 × ¼ + 1 × ⅛ + 0 × 1/16 + 1 × 1/32 + …
• 0.675 (decimal) = 0.1010110011001100110011001100110011001100110011001101 … (base 2)
• Matlab dec2bin
Fractions in binary – how to
• Convert .625 to a binary representation.
• Step 1: multiply by 2. Since .625 × 2 = 1.25, the first binary digit is 1.
So .625 = .1??? . . . (base 2).
• Step 2: disregard the whole number part and multiply by 2 again. Because .25 × 2 = 0.50, the second binary digit is 0.
So .625 = .10?? . . . (base 2).
• Step 3: multiply by 2 once again. Since .50 × 2 = 1.00, the third digit is 1, so .625 = .101 (base 2).
• The fractional part is now 0, so we are done.
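The multiply-by-2 steps above are easy to mechanize. Here is a small Python sketch of the same procedure (the slides use Matlab; the function name `frac_to_binary` is ours):

```python
def frac_to_binary(x, max_digits=20):
    """Binary digits of a fraction 0 <= x < 1, using the repeated
    multiply-by-2 method: the integer part of 2*x is the next digit."""
    digits = []
    while x > 0 and len(digits) < max_digits:
        x *= 2
        d = int(x)           # next binary digit (0 or 1)
        digits.append(str(d))
        x -= d               # keep only the fractional part
    return "." + "".join(digits)

print(frac_to_binary(0.625))  # .101 -- the expansion terminates
```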
Infinite fractions
• Consider 1/10 = 0.1
1. Because .1 × 2 = 0.2, .1 (decimal) = .0??? . . . (base 2)
2. .2 × 2 = 0.4, so .1 (decimal) = .00?? . . . (base 2)
3. .4 × 2 = 0.8, so .1 (decimal) = .000?? . . . (base 2)
4. .8 × 2 = 1.6, so .1 (decimal) = .0001?? . . . (base 2)
5. .6 × 2 = 1.2, so .1 (decimal) = .00011?? . . . (base 2)
• Observe: the fractional part left after step 5 (0.2) is the same as the one entering step 2.
• So just keep repeating steps 2–5:
• .1 (decimal) = .00011001100110011 . . . (base 2)
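The same procedure, run in Python with exact rational arithmetic (so the float 0.1's own rounding error does not interfere), shows the repeating 0011 pattern; `frac_bits` is our helper, not lecture code:

```python
from fractions import Fraction

def frac_bits(x, n):
    """First n binary digits of a fraction 0 <= x < 1,
    computed exactly with rationals."""
    bits = []
    for _ in range(n):
        x *= 2
        d = int(x)       # next digit
        bits.append(str(d))
        x -= d
    return "".join(bits)

print(frac_bits(Fraction(1, 10), 20))  # 00011001100110011001
```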
Scientific Notation
• Example: −6.023 × 10⁻²³
• Parts: sign (−), normalized mantissa (6.023), base (10), and exponent (−23, with its own sign)
IEEE-754 (double precision)
• bit 0: sign (s)
• bits 1–11: exponent
• bits 12–63: mantissa (significand)
• value = (−1)ˢ × 1.f × 2^(E−1023)
• The leading 1 is understood (not stored); f is the mantissa without the leading 1; the base is 2; E is the stored (biased) exponent
IEEE – 754 standard
• Most numbers are normalized --- an implied 1 before the binary point
– ±(1+f) × 2^e
• f is called the mantissa -- 0 ≤ f < 1
– For double precision f has 52 bits
• e is called the exponent – a signed integer
– For double precision e has 11 bits
– Using two's complement we could go from −1024 to 1023
– Instead we use −1022 ≤ e ≤ 1023
– The remaining two exponent values are saved to represent other numbers (zeros and denormals, infinities and NaNs)
• The combination of e and f determines the properties of floating point numbers --- precision, distribution, range
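The field layout can be verified directly. This Python sketch (our helper `double_fields`, not part of the lecture) unpacks a double's bits with the standard `struct` module:

```python
import struct

def double_fields(x):
    """Return (sign, biased exponent E, 52 fraction bits) of a double."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    E = (bits >> 52) & 0x7FF            # 11-bit biased exponent
    f = bits & ((1 << 52) - 1)          # 52-bit fraction field
    return sign, E, f

s, E, f = double_fields(1.0)
print(s, E - 1023, f)   # 0 0 0 : 1.0 = +1.0 x 2^0
```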
Effects of floating point
• Floatgui
Effects of floating point
• eps is the distance from 1 to the next larger floating-point number
• Equivalently, roughly the smallest number which, when added to 1, gives a result different from 1
• eps = 2⁻⁵²
• In Matlab:

            Binary            Decimal
  eps       2^(-52)           2.2204e-16
  realmin   2^(-1022)         2.2251e-308
  realmax   (2-eps)*2^1023    1.7977e+308

• Some numbers cannot be exactly represented
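Python exposes the same IEEE-754 double quantities through `sys.float_info`; a quick check against the table above:

```python
import sys

eps = sys.float_info.epsilon       # 2.220446049250313e-16 = 2**-52
realmin = sys.float_info.min       # 2.2250738585072014e-308 = 2**-1022
realmax = sys.float_info.max       # 1.7976931348623157e+308 = (2 - eps) * 2**1023

print(eps, realmin, realmax)
assert 1.0 + eps > 1.0             # eps is a gap you can actually observe at 1.0
```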
Rounding vs. Chopping
• Chopping: Store x as c, where |c| < |x| and no machine
number lies between c and x.
• Rounding: Store x as r, where r is the machine number
closest to x.
• IEEE standard arithmetic uses rounding.
Experiment
• i=0; x=1; while 1+ x > 1, x=x/2, pause (.01), i=i+1; end
• i=0; x=1; while x+ x > x, x=2*x, pause (.01), i=i+1; end
• i=0; x=1; while x+ x > x, x=x/2, pause (.01), i=i+1; end
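The first experiment translates directly to Python (this is our rendering, not lecture code). It counts how many halvings it takes before 1 + x rounds back to 1:

```python
i, x = 0, 1.0
while 1.0 + x > 1.0:
    x /= 2
    i += 1

# The loop stops at x = 2**-53 = eps/2: with round-to-nearest,
# 1 + 2**-53 is exactly halfway between 1 and 1 + eps and the
# tie rounds back down to 1.
print(i, x)   # 53 1.1102230246251565e-16
```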
IEEE-754 (single precision)
• bit 0: sign (s)
• bits 1–8: exponent
• bits 9–31: mantissa (significand)
• value = (−1)ˢ × 1.f × 2^(E−127)
• The leading 1 is understood (not stored); f is the mantissa without the leading 1; the base is 2; E is the stored (biased) exponent
Error definition
• Computation should have result: c
• Actual result is: x
• abs_err = |x − c|
• rel_err = |x − c| / |c|  for c ≠ 0
• x = c(1 + δ), where |δ| = rel_err
• Usually we do not know c
– No need to solve the problem if we already knew it!
• Error analysis tries to estimate abs_err and rel_err using the computed result x and knowledge of the algorithm and data
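A small Python illustration of these definitions, using exact rationals as a stand-in for the (normally unknown) true result c; the example sum 0.1 + 0.2 is our choice:

```python
from fractions import Fraction

c = Fraction(3, 10)        # true result of 0.1 + 0.2, held exactly
x = Fraction(0.1 + 0.2)    # the double the machine actually produced

abs_err = abs(x - c)
rel_err = abs_err / abs(c)
print(float(abs_err), float(rel_err))  # both tiny: on the order of eps
```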
Good rule of thumb
• Always do your computations with variables which are
scaled to be about one
Effects of floating point errors
• Singular equations will only be nearly singular
• Severe cancellation errors can occur

x = 0.988:.0001:1.012;
y = x.^7-7*x.^6+21*x.^5-35*x.^4+35*x.^3-21*x.^2+7*x-1;
plot(x,y)
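The same cancellation can be quantified without a plot. In this Python check (our construction), the expanded polynomial is compared with the mathematically equal (x − 1)⁷ and with an exact rational evaluation:

```python
from fractions import Fraction

x = 1.012
expanded = x**7 - 7*x**6 + 21*x**5 - 35*x**4 + 35*x**3 - 21*x**2 + 7*x - 1
factored = (x - 1)**7

# Exact value of the polynomial at this particular double x
xf = Fraction(x)
exact = float((xf - 1)**7)

print(expanded, factored, exact)
# 'expanded' carries rounding error of order eps * 35 ~ 1e-14, which is
# comparable to the true value itself (~3.6e-14): massive cancellation.
```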
Errors can be magnified
• Errors can be magnified during computation.
• Let us assume numbers are known to 0.05% accuracy
• Example: 2.003 × 10⁰ and 2.000 × 10⁰
– both known to within ± .001
• Perform a subtraction. Result: 0.003 × 10⁰
• But the true answer could be as small as 2.002 − 2.001 = 0.001,
• or as large as 2.004 − 1.999 = 0.005!
• Absolute error bound of 0.002
• Relative error of up to 0.002 / 0.001 = 200%!
• Adding or subtracting causes the bounds on absolute errors to be added
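Checking the subtraction example numerically in Python (the ±0.001 bounds come from the slide; the interval endpoints are just evaluated directly):

```python
a, da = 2.003, 0.001    # a is known only to within +/- da
b, db = 2.000, 0.001

diff = a - b                      # nominally 0.003
smallest = (a - da) - (b + db)    # 2.002 - 2.001 = 0.001
largest  = (a + da) - (b - db)    # 2.004 - 1.999 = 0.005

abs_err_bound = da + db           # absolute error bounds add: 0.002
print(diff, smallest, largest, abs_err_bound / smallest)  # ratio ~ 2.0, i.e. 200%
```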
Error effect on multiplication/division
• Let x and y be true values
• Let X = x(1+r) and Y = y(1+s) be the known approximations
• Relative errors are r and s
• What is the error in multiplying the numbers?
• XY = xy(1+r)(1+s)
• Absolute error = |XY − xy| = |xy(r + s + rs)|
• Relative error = |(XY − xy)/xy| = |r + s + rs| ≤ |r| + |s| + |rs|
• If r and s are small we can ignore |rs|
• Multiplying/dividing causes relative error bounds to add
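A quick numerical check of the identity XY = xy(1+r)(1+s) in Python; the values of x, y, r, s below are made up for illustration:

```python
x, y = 3.0, 7.0           # true values
r, s = 1e-4, -2e-4        # relative errors in the approximations
X, Y = x * (1 + r), y * (1 + s)

rel = abs((X * Y - x * y) / (x * y))
predicted = abs(r + s + r * s)
print(rel, predicted)     # both about 1e-4; |rel| <= |r| + |s| + |r s|
```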