* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Document
Survey
Document related concepts
Transcript
FLOATING POINT NUMBERS-I Introduction H. Wiklicky (with thanks to A. Gopalan, R. Hayden and N. Dulay) [email protected] [email protected] Why do we need this: large, small and fractional numbers World population ~7, 000, 000, 000 people One light year One solar mass Electron diameter Electron mass Smallest measurable length of time 9, 130, 000, 000, 000 km 2, 000, 000, 000, 000, 000, 000, 000, 000, 000 kg 0.000, 000, 000, 000, 000, 000, 01 m 0.000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 9 kg 0.000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 1 sec Pi (to 14 decimal places) Standard rate of VAT Googol 3.14159 26535 8979... 20% 1 followed by a 100 zeros ο Large integers Example: How can we represent integers up to 30 decimal digits long? β’ Binary: 2 π = 1030 β π = log 2 (1030 ) β 100 bits (1 decimal digit β 3.32 bits) β’ BCD: 30 × 4 = 120 bits β’ ASCII: 30 × 8 = 240 bits Floating point numbers Recall scientific notation: π × 10πΈ π × 2πΈ Decimal Binary This is the basis for most floating point representation schemes β’ M is the coefficient (aka. significand, fraction or mantissa) β’ E is the exponent (aka. characteristic) β’ 10 (or for binary, 2) is the radix (aka. base) β’ No. of bits in exponent determines the range (bigness/smallness) β’ No. of bits in coefficient determines the precision (exactness) Zones of expressibility β’ Example: assume numbers are formed with a signed 3- digit coefficient and a signed 2-digit exponent Zones of expressibility: Zero Negative overflow β.999 × 10+99 Expressible βve nums Negative underflow β.001 × 10β99 0 Positive Expressible underflow +ve nums Positive overflow +.001 × 10β99 +.999 × 10+99 Real vs. floating point numbers Mathematical real Floating point number ββ β¦ + β Finite (Uncountably) infinite Finite Spacing ? Gap between numbers varies Errors ? Incorrect results are possible Range No. of values Some questions (assume signed 3-digit coefficient and a signed 2digit exponent as before): β’ What are the closest floating point numbers to . πππ × ππβππ ? What is the gap between this number and them? β’ What about . πππ × ππβππ ? Normalised floating point numbers β’ Depending on how you interpret the coefficient, floating point numbers can have multiple forms, e.g.: 0.023 × 104 = 0.230 × 103 = 2.3 × 102 = 0.0023 × 105 β’ For hardware implementations it is desirable for each number to have a unique floating point representation, a normalised form β’ Weβll normalise coefficients in the range [π, β¦ πΉ) where πΉ is the base, e.g.: 1, β¦ , 10 for decimal 1, β¦ , 2 for binary Normalised forms (base 10) Number 23.2 × 104 β4.01 × 10β3 343 000 × 100 0.000 000 098 9 × 100 Normalised form 2.32 × 105 β4.01 × 10β3 3.43 × 105 9.89 × 10β8 Binary fractions Binary 0.1 0.01 0.001 0.11 0.111 0.011 0.101 Decimal 0.5 0.25 0.125 0.75 0.875 0.375 0.625 Binary fraction to decimal fraction β’ What is the binary value 0.01101 in decimal? πβπ πβπ πβπ πβπ πβπ 0 1 1 0 1 . 1 β’ 4 β’ 1 + 8 + 8+4+1 25 1 32 = 13 32 = 0.40625 ππ ππ π π π π . 0 1 1 0 1 = 13 32 β’ What about 0.000 110 011? Binary fraction to decimal fraction β’ What is the binary value 0.01101 in decimal? πβπ πβπ πβπ πβπ πβπ 0 1 1 0 1 . 1 β’ 4 β’ 1 + 8 + 8+4+1 25 1 32 = 13 32 = 0.40625 ππ ππ π π π π . 0 1 1 0 1 = 13 32 β’ What about 0.000 110 011? β’ Answer: 32+16+2+1 29 = 51 512 = 0.099 609 375 Decimal fraction to binary fraction β’ What is the decimal value 0.6875 in binary? Decimal fraction to binary fraction β’ What is the decimal value 0.6875 in binary? 1.375 1 0.375 0.6875 = = + 2 2 2 Decimal fraction to binary fraction β’ What is the decimal value 0.6875 in binary? 1.375 1 0.375 1 0.75 0.6875 = = + = + 2 2 2 2 4 Decimal fraction to binary fraction β’ What is the decimal value 0.6875 in binary? 1.375 1 0.375 1 0.75 1 1.5 0.6875 = = + = + = + 2 2 2 2 4 2 8 Decimal fraction to binary fraction β’ What is the decimal value 0.6875 in binary? 1.375 1 0.375 1 0.75 1 1.5 0.6875 = = + = + = + 2 2 2 2 4 2 8 1 1 0.5 = + + 2 8 8 Decimal fraction to binary fraction β’ What is the decimal value 0.6875 in binary? 1.375 1 0.375 1 0.75 1 1.5 0.6875 = = + = + = + 2 2 2 2 4 2 8 1 1 0.5 1 1 1 = + + = + + 2 8 8 2 8 16 Decimal fraction to binary fraction β’ What is the decimal value 0.6875 in binary? 1.375 1 0.375 1 0.75 1 1.5 0.6875 = = + = + = + 2 2 2 2 4 2 8 1 1 0.5 1 1 1 = + + = + + 2 8 8 2 8 16 So the answer is 0.1011 Decimal fraction to binary fraction β’ What is the decimal value 0.6875 in binary? 1.375 1 0.375 1 0.75 1 1.5 0.6875 = = + = + = + 2 2 2 2 4 2 8 1 1 0.5 1 1 1 = + + = + + 2 8 8 2 8 16 So the answer is 0.1011 β’ What is the decimal value 0.1 in binary? Decimal fraction to binary fraction β’ What is the decimal value 0.6875 in binary? 1.375 1 0.375 1 0.75 1 1.5 0.6875 = = + = + = + 2 2 2 2 4 2 8 1 1 0.5 1 1 1 = + + = + + 2 8 8 2 8 16 So the answer is 0.1011 β’ What is the decimal value 0.1 in binary? 1.6 0.1 = 16 Decimal fraction to binary fraction β’ What is the decimal value 0.6875 in binary? 1.375 1 0.375 1 0.75 1 1.5 0.6875 = = + = + = + 2 2 2 2 4 2 8 1 1 0.5 1 1 1 = + + = + + 2 8 8 2 8 16 So the answer is 0.1011 β’ What is the decimal value 0.1 in binary? 1.6 1 0.6 0.1 = = + 16 16 16 Decimal fraction to binary fraction β’ What is the decimal value 0.6875 in binary? 1.375 1 0.375 1 0.75 1 1.5 0.6875 = = + = + = + 2 2 2 2 4 2 8 1 1 0.5 1 1 1 = + + = + + 2 8 8 2 8 16 So the answer is 0.1011 β’ What is the decimal value 0.1 in binary? 1.6 1 0.6 1 1.2 0.1 = = + = + 16 16 16 16 32 Decimal fraction to binary fraction β’ What is the decimal value 0.6875 in binary? 1.375 1 0.375 1 0.75 1 1.5 0.6875 = = + = + = + 2 2 2 2 4 2 8 1 1 0.5 1 1 1 = + + = + + 2 8 8 2 8 16 So the answer is 0.1011 β’ What is the decimal value 0.1 in binary? 1.6 1 0.6 1 1.2 1 1 0.2 0.1 = = + = + = + + 16 16 16 16 32 16 32 32 Decimal fraction to binary fraction β’ What is the decimal value 0.6875 in binary? 1.375 1 0.375 1 0.75 1 1.5 0.6875 = = + = + = + 2 2 2 2 4 2 8 1 1 0.5 1 1 1 = + + = + + 2 8 8 2 8 16 So the answer is 0.1011 β’ What is the decimal value 0.1 in binary? 1.6 1 0.6 1 1.2 1 1 0.2 1 1 1.6 0.1 = = + = + = + + = + + 16 16 16 16 32 16 32 32 16 32 256 β¦ Normalised forms (base 2) Number 100.01 × 21 1010.11 × 22 0.00101 × 2β2 1100101 × 2β2 Normalised form 1.0001 × 23 1.01011 × 25 1.01 × 2β5 1.100101 × 24 Floating point multiplication πΈ π1 × π2 = π1 × 10 1 × π2 × 10πΈ2 = π1 × π2 × 10πΈ1 × 10πΈ2 = π1 × π2 × 10πΈ1 +πΈ2 β’ That is, we multiply the coefficients and add the exponents β’ Example: 2.6 × 106 × 5.4 × 10β3 = 2.6 × 5.4 × 103 = 14.04 × 103 β’ We must also normalise the result, so final answer is 1.404 × 104 Truncation and rounding β’ For many computations, the result of a floating point operation is too large to store in the coefficient β’ Example (with a 2-digit coefficient): 2.3 × 101 × 2.3 × 101 = 5.29 × 102 β’ Truncation ο¨ 5.2 × 102 β’ Rounding ο¨ 5.3 × 102 (biased error) (unbiased error) Floating point addition β’ A floating point addition such as 4.5 × 103 + 6.7 × 102 is not a simple coefficient addition, unless the exponents are the same. Otherwise, we need to align them first π1 + π2 = π1 × 10πΈ1 + π2 × 10πΈ2 = π1 + π2 × 10πΈ2βπΈ1 × 10πΈ1 β’ To align, choose the number with the smaller exponent and shift its coefficient the corresponding number of digits to the right 4.5 × 103 + 6.7 × 102 = 4.5 × 103 + 0.67 × 103 = 5.17 × 103 = 5.2 × 103 (rounded) Exponent overflow and underflow β’ Exponent overflow occurs when the result is too large i.e. when the resultβs exponent > maximum exponent β’ Example: if max exponent is 99 then 1099 × 1099 = 10198 (overflow) To handle overflow, set value as infinity or raise an exception β’ Exponent underflow occurs when the result is too small i.e. when the resultβs exponent < smallest exponent β’ Example: if min exponent is -99 then 10β99 × 10β99 = 10β198 (underflow) To handle underflow, set value as zero or raise an exception Comparing floating point values β’ Because of the potential for producing inexact results, comparing floating point values should account for close results β’ If we know the desired magnitude and precision of results, we can adjust for closeness (epsilon). For example: π=π π=1 π β π < π < (π + π) 1 β 0.000005 < π < 1 + 0.000005 0.999995 < π < 1.000005 β’ A more general approach is to calculate closeness of two numbers based on the relative size of the two numbers being compared