Download INTEGER REPRESENTATIONS

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Large numbers wikipedia , lookup

Infinity wikipedia , lookup

Law of large numbers wikipedia , lookup

Location arithmetic wikipedia , lookup

Approximations of π wikipedia , lookup

Rounding wikipedia , lookup

Addition wikipedia , lookup

Arithmetic wikipedia , lookup

Division by zero wikipedia , lookup

Positional notation wikipedia , lookup

Elementary mathematics wikipedia , lookup

Transcript
INTEGER REPRESENTATIONS
Unsigned Representation (B2U). Two’s Complement (B2T). Signed Magnitude (B2S). Ones’ Complement (B2O).
scheme size in bits (w) minimum value maximum value
B2U
5
0
31
B2U
8
0
255
B2U
12
0
4095
B2U
16
0
65535
B2T
5
-16
15
B2T
8
-128
127
B2T
12
-2048
2047
B2T
16
-32768
32767
B2O
5
-15
15
B2O
8
-127
127
B2O
12
-2047
2047
B2O
16
-32767
32767
B2S
5
-15
15
B2S
8
-127
127
B2S
12
-2047
2047
B2S
16
-32767
32767
Remember:
2^15 2^14 2^13 2^12 2^11 2^10 2^9 2^8 2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0
32768 16384 8192 4096 2048 1024 512 256 128 64 32
16
8
4
2
1
1. Convert 10111 base 2 to decimal for the following decoding schemes:
B2U = ________________ 23
B2O = ______________________ -8
B2T = ________________ -9
B2S = ______________________ -7
2. Convert the number 95 base 10 to the following 8-bit encoding schemes: 0101 1111 all the same
___ ___ ___ ___ ___ ___ ___ ___ B2U ___ ___ ___ ___ ___ ___ ___ ___ B2T
___ ___ ___ ___ ___ ___ ___ ___ B2S ___ ___ ___ ___ ___ ___ ___ ___ B2O
3. Convert the number 130 base 10 to the following 8-bit encoding schemes: 1000 0010 for B2U; rest=overflow
___ ___ ___ ___ ___ ___ ___ ___ B2U ___ ___ ___ ___ ___ ___ ___ ___ B2T
___ ___ ___ ___ ___ ___ ___ ___ B2S ___ ___ ___ ___ ___ ___ ___ ___ B2O
4. Convert the number -223 base 10 to the following 16-bit encoding schemes:
___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ B2U no negative #s
___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ B2T 0xFF21
___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ B2O 0xFF20
___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ B2S 0x80DF
CSE2421 Practice Problems
8|Page
SIGNED/UNSIGNED
#include <stdio.h>
% gcc -o sign2unsign sign2unsign.c
void main()
sign2unsign.c: In function ‘main’:
{
sign2unsign.c:17: warning: overflow in implicit constant conversion
int x;
for (x = -8; x < 8; x++)
printf("xd = %d xu = %u xx = %.4x\n", x, (unsigned short) x, (unsigned short) x);
printf("\n");
for (x = -8; x < 8; x++)
printf("xd = %d xu = %u xx = %.8x\n", x, x, x);
x = 65528;
printf("\n65528 in hex = %.8x\n",x);
short a = 0xFFFF + 1;
printf("\n0xFFFF + 1 = %d\n",a);
}
xd = -8 xu = 65528 xx = fff8
xd = -7 xu = 65529 xx = fff9
xd = -6 xu = 65530 xx = fffa
xd = -5 xu = 65531 xx = fffb
xd = -4 xu = 65532 xx = fffc
xd = -3 xu = 65533 xx = fffd
xd = -2 xu = 65534 xx = fffe
xd = -1 xu = 65535 xx = ffff
xd = 0 xu = 0 xx = 0000
xd = 1 xu = 1 xx = 0001
xd = 2 xu = 2 xx = 0002
xd = 3 xu = 3 xx = 0003
xd = 4 xu = 4 xx = 0004
xd = 5 xu = 5 xx = 0005
xd = 6 xu = 6 xx = 0006
xd = 7 xu = 7 xx = 0007
xd = -8 xu = 4294967288 xx = fffffff8
xd = -7 xu = 4294967289 xx = fffffff9
xd = -6 xu = 4294967290 xx = fffffffa
xd = -5 xu = 4294967291 xx = fffffffb
xd = -4 xu = 4294967292 xx = fffffffc
xd = -3 xu = 4294967293 xx = fffffffd
xd = -2 xu = 4294967294 xx = fffffffe
xd = -1 xu = 4294967295 xx = ffffffff
xd = 0 xu = 0 xx = 00000000
xd = 1 xu = 1 xx = 00000001
xd = 2 xu = 2 xx = 00000002
xd = 3 xu = 3 xx = 00000003
xd = 4 xu = 4 xx = 00000004
xd = 5 xu = 5 xx = 00000005
xd = 6 xu = 6 xx = 00000006
xd = 7 xu = 7 xx = 00000007
65528 in hex = 0000fff8
0xFFFF + 1 = 0
CSE2421 Practice Problems
9|Page
Decimal
23
+45
===
68
8-bit 2’s comp
00010111
00101101
23
-45
===
-22
00010111
11010011
100
+45
===
145
-45
-23
===
-68
01100100
00101101
11010011
11101001
Addition
0 0111111 carry
00010111
00101101
0 01000100
=68? yes
0 0010111 carry
00010111
11010011
0 11101010
=-22? yes
0 1101100 carry
01100100
00101101
0 10010001
=145? NOOOOOOOOOO
1 1000011 carry
11010011
11101001
1 10111100
=-68? yes
What is OVERFLOW… for B2T values??? When the carry-in is not equal to the carry-out (seen in green
above). That is:
P+P  N i.e. a positive value plus a positive value that results in a negative value
N+N  P i.e. a negative value plus a negative value that results in a positive value
CSE2421 Practice Problems
10 | P a g e
An example: http://pages.cs.wisc.edu/~cs354-2/cs354/karen.notes/flpt.apprec.html
Put the decimal number 64.2 into the IEEE standard single precision floating point representation.
Step 1:
Get a binary representation for 64.2. To do this, get unsigned binary representations for
the stuff to the left and right of the decimal point separately.
64
is
1000000
.2 can be gotten using the algorithm:
.2
.4
.8
.6
x
x
x
x
2
2
2
2
=
=
=
=
0.4
0.8
1.6
1.2
0  keep the whole portion here and carry the decimal down
0
1
1
.2
.4
.8
.6
x
x
x
x
2
2
2
2
=
=
=
=
0.4
0.8
1.6
1.2
0
0
1
1
now this whole pattern (0011) repeats.
so a binary representation for .2
is
.001100110011. . .
Putting the halves back together again:
64.2 is
1000000.0011001100110011. . .
Step 2:
Normalize the binary representation. (make it look like scientific notation)
1.000000 00110011. . . x 2^6
Step 3:
6 is the true exponent.
representation.
For the standard form, it needs to be in 8-bit, biased-127
6
+ 127
----133
133 in 8-bit, unsigned representation is 1000 0101
This is the bit pattern used for e in the standard form.
Step 4:
The mantissa/significand stored (f) is the stuff to the right of the radix point in the
normalized form. We need 23 bits of it.
000000 00110011001100110
Put it all together (and include the correct sign bit):
s
0
e
10000101
f
00000000110011001100110
0b 0100 0010 1000 0000 0110 0110 0110 0110
0x
4
2
8
0
6
6
6
6
CSE2421 Practice Problems
11 | P a g e
Computers represent real values in a form similar to that of scientific notation. Consider the value
1.23 x 10^4
The number has a sign (+ in this case)
The significand (1.23) is written with one non-zero digit to the left of the decimal point.
The base (radix) is 10.
The exponent (an integer value) is 4. It too must have a sign.
There are standards which define what the representation means, so that across computers there will be
consistency.
Note that this is not the only way to represent floating point numbers, it is just the IEEE standard way of doing
it. Here is what we do… The representation has three fields:
---------------------------| s |
e
|
f
|
----------------------------
S is one bit representing the sign of the number
E is an 8-bit biased integer representing the exponent
F is an unsigned integer
The decimal value represented is:
s
(-1)
E
x F x 2
where
E = e – bias  e = E + bias
F = ( f/(2^n) ) + 1 
f = (F-1)*2n
For single precision representation, n = 23 and bias = 127
For double precision representation (a 64-bit representation), n = 52 (there are 52 bits for the mantissa field)
bias = 1023 (there are 11 bits for the exponent field)
Now, what does all this mean?
 s, e, f all represent fields within a representation. Each is just a bunch of bits.
 s is just a sign bit. 0 for positive, 1 for negative. This is the sign of the number.
 e is an exponent field. The e field is a biased-127 integer representation. So, the true exponent
represented is (e - bias).
 The radix for the number is ALWAYS 2.
Note: Computers that did not use this representation, like those built before the standard, did not always
use a radix of 2. For example, some IBM machines had radix of 16.
 f is the mantissa (significand). It is in a somewhat modified form. There are 23 bits available for the
mantissa. It turns out that if floating point numbers are always stored in their normalized form, then the
leading bit (the one on the left, or MSB) is ALWAYS a 1. So, why store it at all? It gets put back into
the number (giving 24 bits of precision for the mantissa) for any calculation, but we only have to store
23 bits. This MSB is called the hidden bit.
CSE2421 Practice Problems
12 | P a g e
CSE2421 Practice Problems
13 | P a g e
To sum up, the following are the corresponding values for a given representation:
Special Values
Float Values (b = bias)
Sign
Exponent
(e)
Fraction
(f)
Value
0
00..00
00..00
+0
0
00..00
00..01
:
11..11
Positive
Denormalized Real
0.f × 2(-b+1)
0
00..01
:
11..10
XX..XX
Positive
Normalized Real
1.f × 2(e-b)
0
11..11
00..00
+Infinity
0
11..11
00..01
:
01..11
SNaN
0
11..11
10..00
:
11..11
QNaN
1
00..00
00..00
-0
1
00..00
00..01
:
11..11
Negative
Denormalized Real
-0.f × 2(-b+1)
1
00..01
:
11..10
XX..XX
Negative
Normalized Real
-1.f × 2(e-b)
1
11..11
00..00
-Infinity
1
11..11
00..01
:
01..11
SNaN
11..11
10..00
:
11.11
QNaN
1
IEEE reserves exponent field values of all 0s and all 1s to
denote special values in the floating-point scheme.
Zero
As mentioned above, zero is not directly representable in
the straight format, due to the assumption of a leading 1
(we'd need to specify a true zero mantissa to yield a value
of zero). Zero is a special value denoted with an exponent
field of zero and a fraction field of zero. Note that -0 and
+0 are distinct values, though they both compare as equal.
Denormalized
If the exponent is all 0s, but the fraction is non-zero (else
it would be interpreted as zero), then the value is a
denormalized number, which does not have an assumed
leading 1 before the binary point. Thus, this represents a
number (-1)s × 0.f × 2-126, where s is the sign bit and f is
the fraction. For double precision, denormalized numbers
are of the form (-1)s × 0.f × 2-1022. From this you can
interpret zero as a special type of denormalized number.
Infinity
The values +infinity and -infinity are denoted with an
exponent of all 1s and a fraction of all 0s. The sign bit
distinguishes between negative infinity and positive
infinity. Being able to denote infinity as a specific value
is useful because it allows operations to continue past
overflow situations. Operations with infinite values are
well defined in IEEE floating point.
Not A Number
The value NaN (Not a Number) is used to represent a
value that does not represent a real number. NaN's are
represented by a bit pattern with an exponent of all 1s and
a non-zero fraction. There are two categories of NaN:
QNaN (Quiet NaN) and SNaN (Signalling NaN).
A QNaN is a NaN with the most significant fraction bit
set. QNaN's propagate freely through most arithmetic
operations. These values pop out of an operation when
the result is not mathematically defined.
An SNaN is a NaN with the most significant fraction
bit clear. It is used to signal an exception when used
in operations. SNaN's can be handy to assign to
uninitialized variables to trap premature usage.
Semantically, QNaN's denote indeterminate
operations, while SNaN's denote invalid operations.
CSE2421 Practice Problems
14 | P a g e