CMPE12 – More about Numbers
Integers and Floating Point
(Rest of Textbook Chapter 2 plus more)
Review: Unsigned Integer
•  A string of 0s and 1s that represents a non-negative integer.
•  The string is $X_{n-1}, X_{n-2}, \dots, X_1, X_0$, where $X_k$ is either a 0 or a 1 and has a weight of $2^k$.
•  The represented number is the sum of the weights for each 1 in the string.
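A minimal sketch of this weighted sum in Python. The function name `unsigned_value` is made up for illustration and is not from the course material.

```python
def unsigned_value(bits: str) -> int:
    """Sum the weight 2**k for every position k that holds a 1.

    Position 0 is the rightmost character of the string.
    """
    n = len(bits)
    total = 0
    for i, ch in enumerate(bits):
        if ch not in ("0", "1"):
            raise ValueError(f"not a bit: {ch!r}")
        k = n - 1 - i              # weight index of this character
        if ch == "1":
            total += 2 ** k
    return total


print(unsigned_value("0010110"))   # 16 + 4 + 2
```

Python's built-in `int(bits, 2)` performs the same conversion; the loop is spelled out only to mirror the definition.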
CMPE12 – Fall 2011 – J. Ferguson
Signed Integers
Allow us to represent positive and negative integers.
4 important types:
•  Sign and Magnitude -- Leftmost bit is the sign and the remaining bits are the unsigned magnitude.
•  1's complement -- The additive inverse of a number is the bit-wise complement of the number.
•  2's complement -- The additive inverse of a number is the bit-wise complement of the number plus one.
•  Bias or excess notation -- a bias is subtracted from the “unsigned value” to get the represented value.
Quick Review of Signed Integers
Decimal   1's complement   2's complement   Sign-and-Magnitude
22        0010110          0010110          0010110
-22       1101001          1101010          1010110
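The three encodings of 22 and -22 on 7 bits can be reproduced with a short Python sketch. The function names are hypothetical, and no range checking is done, so the caller is assumed to pass values that fit in `n` bits.

```python
def sign_magnitude(value: int, n: int) -> str:
    """Leftmost bit is the sign; the other n-1 bits hold |value|."""
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), f"0{n - 1}b")


def ones_complement(value: int, n: int) -> str:
    """A negative value is the bit-wise complement of the positive pattern."""
    mask = (1 << n) - 1
    if value >= 0:
        return format(value, f"0{n}b")
    return format(~abs(value) & mask, f"0{n}b")


def twos_complement(value: int, n: int) -> str:
    """A negative value is the bit-wise complement of the positive pattern, plus one."""
    mask = (1 << n) - 1
    if value >= 0:
        return format(value, f"0{n}b")
    return format((~abs(value) + 1) & mask, f"0{n}b")


for v in (22, -22):
    print(v, ones_complement(v, 7), twos_complement(v, 7), sign_magnitude(v, 7))
```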
Biased notation
•  How does it work? A bias is added to the signed integer before it is stored, so that the smallest value, -bias, is represented by 000…000.
•  Advantages
–  Preserves lexical order
–  Single zero
–  Most versatile
•  Disadvantages:
–  Add and sub require one additional operation to adjust the bias
Bias 3
Rep   Value
000   -3
001   -2
010   -1
011   0
100   1
101   2
110   3
111   4
Conversion: Decimal/BiasX
•  Decimal -> BiasX
–  Add X, then convert to binary
•  BiasX -> Decimal
–  Convert binary to decimal, then subtract X
Bias 3 (D is the unsigned decimal value of Rep)
Rep   D   Value
000   0   -3
001   1   -2
010   2   -1
011   3   0
100   4   1
101   5   2
110   6   3
111   7   4
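The two conversion rules can be sketched in Python as follows. The names `to_bias` and `from_bias` are made up for this example, and the loop reproduces the 3-bit, bias-3 mapping.

```python
def to_bias(value: int, bias: int, n: int) -> str:
    """Decimal -> BiasX: add the bias, then convert to n-bit binary."""
    unsigned = value + bias
    if not 0 <= unsigned < 2 ** n:
        raise ValueError(f"{value} does not fit in {n} bits with bias {bias}")
    return format(unsigned, f"0{n}b")


def from_bias(bits: str, bias: int) -> int:
    """BiasX -> Decimal: convert binary to decimal, then subtract the bias."""
    return int(bits, 2) - bias


for v in range(-3, 5):
    rep = to_bias(v, bias=3, n=3)
    print(rep, int(rep, 2), from_bias(rep, bias=3))
```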
Addition of Bias representations?
What you want              What you get (unsigned)   How you convert
    x        (x+b)             (x+b)                     z+2b
  + y      + (y+b)           + (y+b)                    -   b
  ---      -------           -------                    -----
    z        (z+b)             (z+2b)                    z+b
So you must subtract out the additional bias when you are finished!
Subtraction?
What you want              What you get (unsigned)   How you convert
    x        (x+b)             (x+b)                     z+0
  - y      - (y+b)           - (y+b)                    +  b
  ---      -------           -------                    ----
    z        (z+b)             (z+0)                     z+b
So you must add back the bias when you are finished!
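Both bias corrections, for addition and for subtraction, can be sketched in Python. The helper names are hypothetical; the biased codes are handled as plain unsigned integers, and bias 3 is assumed to match the 3-bit table.

```python
BIAS = 3  # assumed: the same bias as the 3-bit table


def bias_add(xb: int, yb: int, bias: int = BIAS) -> int:
    """(x+b) + (y+b) = z + 2b, so subtract one bias to get z + b."""
    return xb + yb - bias


def bias_sub(xb: int, yb: int, bias: int = BIAS) -> int:
    """(x+b) - (y+b) = z + 0, so add the bias back to get z + b."""
    return xb - yb + bias


# 1 + 2 = 3: codes 100 + 101 should give 110, the code for 3.
print(format(bias_add(0b100, 0b101), "03b"))
# 2 - 4 = -2: codes 101 - 111 should give 001, the code for -2.
print(format(bias_sub(0b101, 0b111), "03b"))
```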
Biased notation mapping
Number represented:   -3    -2    -1    0     1     2     3     4
Number encoded:       000   001   010   011   100   101   110   111
Range on n bits: $-(2^{n-1}-1)$ to $2^{n-1}$ if Bias is $011\ldots11_2$
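As a quick check of the range formula, the following Python sketch computes the smallest and largest representable values directly from the all-zeros and all-ones codes. The function name `bias_range` is illustrative only.

```python
def bias_range(n: int) -> tuple:
    """Return (lowest, highest) value for n bits with bias 011...11 in binary."""
    bias = 2 ** (n - 1) - 1          # bit pattern 0 followed by n-1 ones
    lowest = 0 - bias                # code 000...0
    highest = (2 ** n - 1) - bias    # code 111...1
    return lowest, highest


print(bias_range(3))   # the 3-bit table
print(bias_range(8))   # the width used for single-precision exponents
```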
32-bit word
•  A 32-bit word can represent ~4.3 billion values
–  Integers: 0 -> ~4.3 billion
–  Signed Integers: ~ -2.15B -> 2.15B
•  Fractional numbers?
•  Very large numbers?
•  Numbers with very small magnitude?
Scientific Notation
•  Example: $6.023 \times 10^{23}$
•  Of form $A.xxx \times \text{BASE}^{\text{exponent}}$
•  In Binary: $1.xxx\ldots \times 2^{\text{exponent}}$
–  Or maybe $Y_{16}.xx\ldots_{16} \times 16^{\text{exponent}}$
•  Standard: IEEE standard for floating point arithmetic
IEEE standard for floating point
$1.xxx \times 2^{\text{exponent}}$ in a 32-bit word
The 1. and the 2 can be assumed.
xxx…xx and exponent (and sign) are all that must be specified.
Floating Point Numbers
Field      Width     Notes
S          1 bit     1 means negative
Exponent   8 bits    In Bias 127
Fraction   23 bits   xx…xx
How do we convert to Decimal?
If 00000000 < Exponent < 11111111
$N = (-1)^S \times 1.\text{Fraction} \times 2^{\text{Exponent}-127}$
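The conversion formula can be tried out with a small Python sketch. The function name `decode_float32` and the sample bit pattern are chosen for illustration. Only the normalized case is handled, and the result is cross-checked against the standard `struct` module.

```python
import struct


def decode_float32(bits: int) -> float:
    """Apply N = (-1)^S * 1.Fraction * 2^(Exponent-127) to a 32-bit pattern.

    Only the normalized case (00000000 < Exponent < 11111111) is handled.
    """
    sign = (bits >> 31) & 0x1
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0 or exponent == 0xFF:
        raise ValueError("not a normalized number")
    significand = 1 + fraction / 2 ** 23        # the assumed leading "1."
    return (-1) ** sign * significand * 2.0 ** (exponent - 127)


pattern = 0xC1B40000   # 1 10000011 01101000000000000000000
print(decode_float32(pattern))
# Cross-check against the machine's own reading of the same 32 bits.
print(struct.unpack(">f", struct.pack(">I", pattern))[0])
```

The sample pattern has S = 1, Exponent = 131 and Fraction = 01101, so it decodes to -1.01101 (binary) times 2^4, which is -22.5.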
Converting from Decimal to Float
1.  Convert to Binary (e.g. -10010.01101)
2.  Normalize (form = $1.xxxxxx \times 2^{\text{EXP}}$)
3.  Convert EXP to bias127 (add 127 to it)
4.  MSB [31] gets sign
5.  [23:30] gets EXP (bias127)
6.  [0:22] gets xxxxxxxxxxxxxxxxxxxxxxx
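The six steps above can be sketched in Python. This is an illustration under stated assumptions, not the course's reference method: `encode_float32` is a made-up name, `math.frexp` stands in for the manual normalization, extra fraction bits are truncated rather than rounded, and zero, denormalized values and overflow are rejected.

```python
import math


def encode_float32(x: float) -> int:
    """Build the 32-bit pattern for a nonzero value in the normalized range."""
    if x == 0:
        raise ValueError("zero is not a normalized number")
    sign = 1 if x < 0 else 0                      # step 4: sign bit
    mantissa, exp = math.frexp(abs(x))            # abs(x) = mantissa * 2**exp, 0.5 <= mantissa < 1
    significand = mantissa * 2                    # step 2: normalize to 1.xxxxxx
    exp -= 1
    biased = exp + 127                            # step 3: convert EXP to bias 127
    if not 0 < biased < 255:
        raise ValueError("outside the normalized single-precision range")
    fraction = int((significand - 1) * 2 ** 23)   # step 6: 23 bits after the binary point
    return (sign << 31) | (biased << 23) | fraction


bits = encode_float32(-18.40625)                  # -10010.01101 in binary
print(f"{bits >> 31:01b} {(bits >> 23) & 0xFF:08b} {bits & 0x7FFFFF:023b}")
```

For the example value from step 1, the printed fields are sign 1, exponent 10000011 (4 + 127) and fraction 00100110100000000000000.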
Convert to IEEE FP
•  56.5
•  -5.625
•  -.0004 (do to 5 binary places)
Your hard work has not gone unnoticed!
From: “The Chronicle of Higher Education”
The average full-time undergraduate student studies about … 15 hours a week, but the duration varies by major, according to this year's National Survey of Student Engagement.
Engineering majors spend the most time studying, 19 hours a week, but even among those who exceed 20 hours, nearly a quarter still often show up for class without assignments completed.
Misconceptions about floats
•  Floats are not reals. Ex. 2/3
•  Floats are not decimals. $0.1_{10} = 0.00011001100110011\ldots_2$
•  Not all integers < $2^{31}$ can be represented. $2^{24}+1 = 1000000000000000000000001_2$
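These points can be observed directly in Python. Python's own `float` is double precision, so this sketch assumes a made-up helper, `to_float32`, that uses the standard `struct` module to round a value to single precision.

```python
import struct
from fractions import Fraction


def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack(">f", struct.pack(">f", float(x)))[0]


# Floats are not decimals: the stored value is a nearby binary fraction, not 1/10.
print(Fraction(to_float32(0.1)) == Fraction(1, 10))    # False

# Not every integer below 2**31 is representable: 2**24 + 1 needs 25 significant bits.
print(to_float32(2 ** 24) == 2 ** 24)                  # True
print(to_float32(2 ** 24 + 1) == 2 ** 24 + 1)          # False
```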
More on IEEE 754 FP Standard
•  Distribution of floats on number line
•  “Denormalized” floats
•  Double precision floats
•  Arithmetic on floats
How FP numbers are distributed
•  A 32-bit number can represent at most $2^{32}$ values
•  IEEE 754 FP can represent numbers larger than $2^{127}$, so many integers between 0 and $2^{127}$ are not represented.
•  High density close to 0.
•  Low density far from 0.
Specifically: $2^{23}$ values for each value of exponent (23 bits)
•  Between 1/2048 and 1/1024 there are $2^{23}$ floats.
•  Between 1 and 2 there are $2^{23}$ floats.
•  Between $2^{30}$ and $2^{31}$ there are $2^{23}$ floats.
•  Between $2^x$ and $2^{x+1}$ for -127 < x < 128 there are $2^{23}$ floats.
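One way to check these counts is sketched below in Python. It relies on the fact that, for positive floats, consecutive bit patterns are consecutive values. The helper names are hypothetical, and the sketch only works for -126 <= x <= 126, where both end points are normalized single-precision numbers.

```python
import struct


def float32_bits(x: float) -> int:
    """Return the 32-bit pattern of x as an unsigned integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def floats_between_powers(x: int) -> int:
    """Count single-precision values v with 2**x <= v < 2**(x+1)."""
    return float32_bits(2.0 ** (x + 1)) - float32_bits(2.0 ** x)


for x in (-11, 0, 30):   # 1/2048..1/1024, 1..2, 2**30..2**31
    print(x, floats_between_powers(x) == 2 ** 23)
```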
Number Line
[Figure: number line with ticks at 0 through 16 and at $2^{40}$ and $2^{41}$; each power-of-two interval is labeled with $2^{23}$, the number of floats it contains.]
Denormalized Floating Point Numbers
Field      Width     Notes
S          1 bit     1 means negative
Exponent   8 bits    00000000 (shows as Denormalized)
Fraction   23 bits   xx…xx
How do we convert to Decimal?
$N = (-1)^S \times 0.\text{Fraction} \times 2^{-126}$
$1.00000000000000000000000 \times 2^{-126}$ is the smallest Normalized number; $0.11111111111111111111111 \times 2^{-126}$ is the largest Denormalized number.
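A short Python sketch of the denormalized formula follows, with hypothetical helper names. It decodes the largest denormalized pattern by hand and compares it with what the `struct` module reports for the same bits.

```python
import struct


def from_bits(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def decode_denormal(bits: int) -> float:
    """Apply N = (-1)^S * 0.Fraction * 2^-126 (exponent field must be 00000000)."""
    sign = (bits >> 31) & 1
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent != 0:
        raise ValueError("exponent field is not 00000000")
    return (-1) ** sign * (fraction / 2 ** 23) * 2.0 ** -126


largest_denormal = 0x007FFFFF   # 0 00000000 111...1
smallest_normal = 0x00800000    # 0 00000001 000...0
print(decode_denormal(largest_denormal) == from_bits(largest_denormal))   # True
print(from_bits(smallest_normal) == 2.0 ** -126)                          # True
print(from_bits(largest_denormal) < from_bits(smallest_normal))           # True
```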
What if Exponent is 11111111?
•  If FRAC is 0, the 32 bits represent + or – infinity.
•  If FRAC is nonzero, the 32 bits represent NaN (Not a Number)
–  Ex: 0/0
Infinity: EXP = 11111111; FRAC=0
•  Infinity avoids an exception on overflow. (overflow definition: result exceeds value that can be represented)
•  Examples of operations that return infinity: 1/0, -1/0, 3 – inf, sqrt(+inf)
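The special bit patterns can be inspected in Python, as sketched below with a made-up helper `from_bits`. Python itself raises `ZeroDivisionError` for `1/0` and `0/0` rather than returning infinity or NaN, so those two cases are not reproduced here; the arithmetic on an existing infinity does follow IEEE 754.

```python
import math
import struct


def from_bits(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


pos_inf = from_bits(0x7F800000)   # 0 11111111 000...0
neg_inf = from_bits(0xFF800000)   # 1 11111111 000...0
nan = from_bits(0x7FC00000)       # 0 11111111 100...0 (any nonzero FRAC)

print(pos_inf, neg_inf, math.isnan(nan))
print(3 - pos_inf)                     # -inf
print(math.sqrt(pos_inf))              # inf
print(math.isnan(pos_inf - pos_inf))   # inf - inf has no meaningful value: True
```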
Double Precision IEEE 754 Floating Point Numbers
Field      Width     Notes
S          1 bit     1 means negative
Exponent   11 bits   In Bias 1023
Fraction   52 bits   xx…xx
To convert to Decimal
If 00000000000 < Exponent < 11111111111
$N = (-1)^S \times 1.\text{Fraction} \times 2^{\text{Exponent}-1023}$
Double precision floating-point
Field      Width     Notes
S          1 bit     1 means negative
Exponent   11 bits   In Bias 1023
Fraction   52 bits   xx…xx
Normalized magnitudes range from about $2^{-1022}$ up to just under $2^{1024}$.
$2^{1024}$ is about $2 \times 10^{308}$
Double Precision Float
52 significant figures base 2 is approximately 16 significant figures in base 10.
Single vs. Double FP
•  Range:
–  SP: ~$2^{-126}$ to $2^{128}$, approximately $10^{-38}$ to $10^{38}$
–  DP: approximately $2 \times 10^{-308}$ to $2 \times 10^{308}$
•  Significant figures:
–  SP: 23 significant bits, $2^{23}$ = 8,388,608, about 7 significant decimal digits
–  DP: 52 significant bits, $2^{52} = 4 \times 2^{20} \times 2^{30}$, > 15 significant decimal digits
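The bit counts can be converted to decimal digits with a logarithm, and the double-precision limits can be read from Python's `sys.float_info`. This sketch assumes the usual case where Python's `float` is an IEEE 754 double.

```python
import math
import sys

# Decimal digits corresponding to the stored fraction bits: bits * log10(2).
print(23 * math.log10(2))         # single precision
print(52 * math.log10(2))         # double precision

# Properties of the platform's double-precision float.
print(sys.float_info.mant_dig)    # significand bits, counting the assumed leading 1
print(sys.float_info.dig)         # decimal digits that always survive a round trip
print(sys.float_info.max)         # largest finite value, just under 2**1024
print(sys.float_info.min)         # smallest normalized value, 2**-1022
```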
What is this single-precision floating-point number?
0 01111010 000000……………………………..000
A.  $2^{-5}$
B.  0
C.  0.0000000
D.  $1 \times 2^{01111010_2}$
E.  None of the above
What is this floating-point number?
0 00000000 01000000……………………………..000
A.  1.01
B.  $1.01 \times 2^{-127}$
C.  $2^{-129}$
D.  $2^{-128}$
E.  None of the above
Adding two “scientific notation” numbers
$5.345 \times 10^{23} + 1.236 \times 10^{25}$
1.  Make their exponent the same ($0.05345 \times 10^{25} + 1.236 \times 10^{25}$)
2.  Add the non-exponents ($1.28945 \times 10^{25}$)
3.  Normalize (already done)
Adding two floats
0 11111100 01100000……………………………..000
0 11111000 110100000…………………….…..000
1.  $1.011 \times 2^{11111100} + 1.1101 \times 2^{11111000}$
2.  Make their exponent the same ($1.011 \times 2^{11111100} + .00011101 \times 2^{11111100}$)
3.  Add nonexponents ($1.01111101 \times 2^{11111100}$)
4.  Normalize (already done)
0 11111100 0111110100………………………..000
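The worked addition can be checked in Python. The helper `from_fields` is made up for this sketch: it pads the leading fraction bits shown on the slide with zeros to 23 bits, builds the 32-bit pattern, and lets the `struct` module interpret it.

```python
import struct


def from_fields(sign: int, exponent: int, fraction_bits: str) -> float:
    """Build a single-precision value from S, the biased exponent, and leading fraction bits."""
    fraction = int(fraction_bits.ljust(23, "0"), 2)    # pad the fraction to 23 bits
    bits = (sign << 31) | (exponent << 23) | fraction
    return struct.unpack(">f", struct.pack(">I", bits))[0]


a = from_fields(0, 0b11111100, "011")          # 1.011  with biased exponent 11111100
b = from_fields(0, 0b11111000, "1101")         # 1.1101 with biased exponent 11111000
expected = from_fields(0, 0b11111100, "01111101")
print(a + b == expected)   # True
```

The sum is computed in double precision here, which is exact for these operands because so few fraction bits are involved.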
Multiplying two “scientific notation” numbers
1.  $5.3 \times 10^{23} \times 8.1 \times 10^{25}$
2.  Multiply the non-exponents and add the exponents ($42.93 \times 10^{48}$)
3.  Normalize ($4.293 \times 10^{49}$)
Multiplying two floats
0 10000011 0100000……………………………..000
0 10000001 110000000…………………….…..000
1.  $1.01 \times 2^4 \times 1.11 \times 2^2$
2.  Multiply the non-exponents and add the exponents ($10.0011 \times 2^6$)
3.  Normalize ($1.00011 \times 2^7$)
0 10000110 0001100000………………………..000
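The multiply-then-normalize steps can be sketched with integer arithmetic in Python. The function name and its argument convention are assumptions made for this example: each significand is passed as an integer holding 1.xxx with `frac_bits` bits after the binary point, so 1.01 with `frac_bits=2` is `0b101`. Signs and rounding are ignored.

```python
def multiply_fields(exp_a: int, sig_a: int, exp_b: int, sig_b: int, frac_bits: int):
    """Multiply two positive normalized numbers given as (true exponent, significand).

    Returns (exponent, product, point), where `point` is the number of product
    bits that lie after the binary point once the result is normalized.
    """
    product = sig_a * sig_b          # has 2*frac_bits bits after the point
    exponent = exp_a + exp_b         # add the exponents
    point = 2 * frac_bits
    # Normalize: the product of two values in [1, 2) lies in [1, 4).
    if product >> (point + 1):       # a bit in the 2s place means the product is >= 2
        exponent += 1                # move the binary point one place left
        point += 1
    return exponent, product, point


exponent, product, point = multiply_fields(4, 0b101, 2, 0b111, 2)
print(exponent, format(product, "b"), point)   # 7 100011 5, i.e. 1.00011 * 2^7
print(format(exponent + 127, "08b"))           # biased exponent field: 10000110
```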
Add these two floats
0 11111100 01110000……………………………..000
0 11111110 100100000…..………………….…..000
1.  Write each in normalized form
2.  Make their exponent the same
3.  Add nonexponents
4.  Normalize
Multiply these two floats
0 10000100 0100000……………………………..000
0 01111000 100000000…………………….…..000
1.  Write normal form of numbers
2.  Multiply the non-exponents and add the exponents
3.  Normalize
How is FP arithmetic done?
Software: very, very slow.
Hardware floating-point: expensive, but usually worth it.
Two measures of performance:
1.  MIPS: millions of instructions executed per second.
2.  MFLOPS: millions of floating point operations per second.