Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
INTEGER REPRESENTATIONS Unsigned Representation (B2U). Two’s Complement (B2T). Signed Magnitude (B2S). Ones’ Complement (B2O). scheme size in bits (w) minimum value maximum value B2U 5 0 31 B2U 8 0 255 B2U 12 0 4095 B2U 16 0 65535 B2T 5 -16 15 B2T 8 -128 127 B2T 12 -2048 2047 B2T 16 -32768 32767 B2O 5 -15 15 B2O 8 -127 127 B2O 12 -2047 2047 B2O 16 -32767 32767 B2S 5 -15 15 B2S 8 -127 127 B2S 12 -2047 2047 B2S 16 -32767 32767 Remember: 2^15 2^14 2^13 2^12 2^11 2^10 2^9 2^8 2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0 32768 16384 8192 4096 2048 1024 512 256 128 64 32 16 8 4 2 1 1. Convert 10111 base 2 to decimal for the following decoding schemes: B2U = ________________ 23 B2O = ______________________ -8 B2T = ________________ -9 B2S = ______________________ -7 2. Convert the number 95 base 10 to the following 8-bit encoding schemes: 0101 1111 all the same ___ ___ ___ ___ ___ ___ ___ ___ B2U ___ ___ ___ ___ ___ ___ ___ ___ B2T ___ ___ ___ ___ ___ ___ ___ ___ B2S ___ ___ ___ ___ ___ ___ ___ ___ B2O 3. Convert the number 130 base 10 to the following 8-bit encoding schemes: 1000 0010 for B2U; rest=overflow ___ ___ ___ ___ ___ ___ ___ ___ B2U ___ ___ ___ ___ ___ ___ ___ ___ B2T ___ ___ ___ ___ ___ ___ ___ ___ B2S ___ ___ ___ ___ ___ ___ ___ ___ B2O 4. Convert the number -223 base 10 to the following 16-bit encoding schemes: ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ B2U no negative #s ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ B2T 0xFF21 ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ B2O 0xFF20 ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ B2S 0x80DF CSE2421 Practice Problems 8|Page SIGNED/UNSIGNED #include <stdio.h> % gcc -o sign2unsign sign2unsign.c void main() sign2unsign.c: In function ‘main’: { sign2unsign.c:17: warning: overflow in implicit constant conversion int x; for (x = -8; x < 8; x++) printf("xd = %d xu = %u xx = %.4x\n", x, (unsigned short) x, (unsigned short) x); printf("\n"); for (x = -8; x < 8; x++) printf("xd = %d xu = %u xx = %.8x\n", x, x, x); x = 65528; printf("\n65528 in hex = %.8x\n",x); short a = 0xFFFF + 1; printf("\n0xFFFF + 1 = %d\n",a); } xd = -8 xu = 65528 xx = fff8 xd = -7 xu = 65529 xx = fff9 xd = -6 xu = 65530 xx = fffa xd = -5 xu = 65531 xx = fffb xd = -4 xu = 65532 xx = fffc xd = -3 xu = 65533 xx = fffd xd = -2 xu = 65534 xx = fffe xd = -1 xu = 65535 xx = ffff xd = 0 xu = 0 xx = 0000 xd = 1 xu = 1 xx = 0001 xd = 2 xu = 2 xx = 0002 xd = 3 xu = 3 xx = 0003 xd = 4 xu = 4 xx = 0004 xd = 5 xu = 5 xx = 0005 xd = 6 xu = 6 xx = 0006 xd = 7 xu = 7 xx = 0007 xd = -8 xu = 4294967288 xx = fffffff8 xd = -7 xu = 4294967289 xx = fffffff9 xd = -6 xu = 4294967290 xx = fffffffa xd = -5 xu = 4294967291 xx = fffffffb xd = -4 xu = 4294967292 xx = fffffffc xd = -3 xu = 4294967293 xx = fffffffd xd = -2 xu = 4294967294 xx = fffffffe xd = -1 xu = 4294967295 xx = ffffffff xd = 0 xu = 0 xx = 00000000 xd = 1 xu = 1 xx = 00000001 xd = 2 xu = 2 xx = 00000002 xd = 3 xu = 3 xx = 00000003 xd = 4 xu = 4 xx = 00000004 xd = 5 xu = 5 xx = 00000005 xd = 6 xu = 6 xx = 00000006 xd = 7 xu = 7 xx = 00000007 65528 in hex = 0000fff8 0xFFFF + 1 = 0 CSE2421 Practice Problems 9|Page Decimal 23 +45 === 68 8-bit 2’s comp 00010111 00101101 23 -45 === -22 00010111 11010011 100 +45 === 145 -45 -23 === -68 01100100 00101101 11010011 11101001 Addition 0 0111111 carry 00010111 00101101 0 01000100 =68? yes 0 0010111 carry 00010111 11010011 0 11101010 =-22? yes 0 1101100 carry 01100100 00101101 0 10010001 =145? NOOOOOOOOOO 1 1000011 carry 11010011 11101001 1 10111100 =-68? yes What is OVERFLOW… for B2T values??? When the carry-in is not equal to the carry-out (seen in green above). That is: P+P N i.e. a positive value plus a positive value that results in a negative value N+N P i.e. a negative value plus a negative value that results in a positive value CSE2421 Practice Problems 10 | P a g e An example: http://pages.cs.wisc.edu/~cs354-2/cs354/karen.notes/flpt.apprec.html Put the decimal number 64.2 into the IEEE standard single precision floating point representation. Step 1: Get a binary representation for 64.2. To do this, get unsigned binary representations for the stuff to the left and right of the decimal point separately. 64 is 1000000 .2 can be gotten using the algorithm: .2 .4 .8 .6 x x x x 2 2 2 2 = = = = 0.4 0.8 1.6 1.2 0 keep the whole portion here and carry the decimal down 0 1 1 .2 .4 .8 .6 x x x x 2 2 2 2 = = = = 0.4 0.8 1.6 1.2 0 0 1 1 now this whole pattern (0011) repeats. so a binary representation for .2 is .001100110011. . . Putting the halves back together again: 64.2 is 1000000.0011001100110011. . . Step 2: Normalize the binary representation. (make it look like scientific notation) 1.000000 00110011. . . x 2^6 Step 3: 6 is the true exponent. representation. For the standard form, it needs to be in 8-bit, biased-127 6 + 127 ----133 133 in 8-bit, unsigned representation is 1000 0101 This is the bit pattern used for e in the standard form. Step 4: The mantissa/significand stored (f) is the stuff to the right of the radix point in the normalized form. We need 23 bits of it. 000000 00110011001100110 Put it all together (and include the correct sign bit): s 0 e 10000101 f 00000000110011001100110 0b 0100 0010 1000 0000 0110 0110 0110 0110 0x 4 2 8 0 6 6 6 6 CSE2421 Practice Problems 11 | P a g e Computers represent real values in a form similar to that of scientific notation. Consider the value 1.23 x 10^4 The number has a sign (+ in this case) The significand (1.23) is written with one non-zero digit to the left of the decimal point. The base (radix) is 10. The exponent (an integer value) is 4. It too must have a sign. There are standards which define what the representation means, so that across computers there will be consistency. Note that this is not the only way to represent floating point numbers, it is just the IEEE standard way of doing it. Here is what we do… The representation has three fields: ---------------------------| s | e | f | ---------------------------- S is one bit representing the sign of the number E is an 8-bit biased integer representing the exponent F is an unsigned integer The decimal value represented is: s (-1) E x F x 2 where E = e – bias e = E + bias F = ( f/(2^n) ) + 1 f = (F-1)*2n For single precision representation, n = 23 and bias = 127 For double precision representation (a 64-bit representation), n = 52 (there are 52 bits for the mantissa field) bias = 1023 (there are 11 bits for the exponent field) Now, what does all this mean? s, e, f all represent fields within a representation. Each is just a bunch of bits. s is just a sign bit. 0 for positive, 1 for negative. This is the sign of the number. e is an exponent field. The e field is a biased-127 integer representation. So, the true exponent represented is (e - bias). The radix for the number is ALWAYS 2. Note: Computers that did not use this representation, like those built before the standard, did not always use a radix of 2. For example, some IBM machines had radix of 16. f is the mantissa (significand). It is in a somewhat modified form. There are 23 bits available for the mantissa. It turns out that if floating point numbers are always stored in their normalized form, then the leading bit (the one on the left, or MSB) is ALWAYS a 1. So, why store it at all? It gets put back into the number (giving 24 bits of precision for the mantissa) for any calculation, but we only have to store 23 bits. This MSB is called the hidden bit. CSE2421 Practice Problems 12 | P a g e CSE2421 Practice Problems 13 | P a g e To sum up, the following are the corresponding values for a given representation: Special Values Float Values (b = bias) Sign Exponent (e) Fraction (f) Value 0 00..00 00..00 +0 0 00..00 00..01 : 11..11 Positive Denormalized Real 0.f × 2(-b+1) 0 00..01 : 11..10 XX..XX Positive Normalized Real 1.f × 2(e-b) 0 11..11 00..00 +Infinity 0 11..11 00..01 : 01..11 SNaN 0 11..11 10..00 : 11..11 QNaN 1 00..00 00..00 -0 1 00..00 00..01 : 11..11 Negative Denormalized Real -0.f × 2(-b+1) 1 00..01 : 11..10 XX..XX Negative Normalized Real -1.f × 2(e-b) 1 11..11 00..00 -Infinity 1 11..11 00..01 : 01..11 SNaN 11..11 10..00 : 11.11 QNaN 1 IEEE reserves exponent field values of all 0s and all 1s to denote special values in the floating-point scheme. Zero As mentioned above, zero is not directly representable in the straight format, due to the assumption of a leading 1 (we'd need to specify a true zero mantissa to yield a value of zero). Zero is a special value denoted with an exponent field of zero and a fraction field of zero. Note that -0 and +0 are distinct values, though they both compare as equal. Denormalized If the exponent is all 0s, but the fraction is non-zero (else it would be interpreted as zero), then the value is a denormalized number, which does not have an assumed leading 1 before the binary point. Thus, this represents a number (-1)s × 0.f × 2-126, where s is the sign bit and f is the fraction. For double precision, denormalized numbers are of the form (-1)s × 0.f × 2-1022. From this you can interpret zero as a special type of denormalized number. Infinity The values +infinity and -infinity are denoted with an exponent of all 1s and a fraction of all 0s. The sign bit distinguishes between negative infinity and positive infinity. Being able to denote infinity as a specific value is useful because it allows operations to continue past overflow situations. Operations with infinite values are well defined in IEEE floating point. Not A Number The value NaN (Not a Number) is used to represent a value that does not represent a real number. NaN's are represented by a bit pattern with an exponent of all 1s and a non-zero fraction. There are two categories of NaN: QNaN (Quiet NaN) and SNaN (Signalling NaN). A QNaN is a NaN with the most significant fraction bit set. QNaN's propagate freely through most arithmetic operations. These values pop out of an operation when the result is not mathematically defined. An SNaN is a NaN with the most significant fraction bit clear. It is used to signal an exception when used in operations. SNaN's can be handy to assign to uninitialized variables to trap premature usage. Semantically, QNaN's denote indeterminate operations, while SNaN's denote invalid operations. CSE2421 Practice Problems 14 | P a g e