Information Representation
(Level ISA3)
Floating point numbers
Floating point storage
• Two’s complement notation can’t be used
to represent floating point numbers
because there is no provision for the radix
point
• A form of scientific notation is used
Floating point storage
• Since the radix is 2, we use a binary point
instead of a decimal point
• The bits allocated for storing the number
are broken into 3 parts: a sign bit, an
exponent, and a mantissa (aka
significand)
Floating point storage
• The number of bits used for exponent and
significand depend on what we want to
optimize for:
– For greater range of magnitude, we would
allocate more bits to the exponent
– For greater precision, we would allocate more
bits to the significand
• For the next several slides, we’ll use a 14-bit model with a sign bit, a 5-bit exponent, and an 8-bit significand
Example
• Suppose we want to store the number 17 using
our 14-bit model:
– Using decimal scientific notation, 17 is:
17.0 x 10⁰ or
1.7 x 10¹ or
.17 x 10²
– In binary, 17 is 10001, which is:
10001.0 x 2⁰ or
1000.1 x 2¹ or
100.01 x 2² or, finally,
.10001 x 2⁵
– We will use this last value for our model:
0 00101 10001000
sign | exponent | significand
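As a quick sanity check, C’s frexp() from <math.h> splits a value into exactly this form: a fraction in [0.5, 1) times a power of 2. A minimal sketch:

#include <math.h>
#include <stdio.h>

int main(void) {
    int e;
    /* frexp() returns m with x = m * 2^e and 0.5 <= m < 1,
       the same normalized form our 14-bit model uses */
    double m = frexp(17.0, &e);
    printf("17 = %f x 2^%d\n", m, e);  /* prints: 17 = 0.531250 x 2^5 */
    /* 0.53125 = .10001 in binary, so the significand bits are 10001000 */
    return 0;
}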
Negative exponents
• With the existing model, we can only
represent numbers with positive
exponents
• Two possible solutions:
– Reserve a sign bit for the exponent (with a
resulting loss in range)
– Use a biased exponent; explanation on next
slide
Bias values
• With bias values, every integer in a given range is converted into a non-negative integer by adding an offset (the bias) to every value to be stored
• The bias value should be at or near the middle of the
range of numbers
– Since we are using 5 bits for the exponent, the range of values is
0 .. 31
– If we choose 16 as our bias value, we can consider every
number less than 16 a negative number, every number greater
than 16 a positive number, and 16 itself can represent 0
– This is called excess-16 representation, since we have to
subtract 16 from the stored value to get the actual value
– Note: exponents of all 0s or all 1s are usually reserved for
special numbers (such as 0 or infinity)
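The bias conversion itself is a one-line adjustment in each direction; a tiny C sketch (the function names are ours, for illustration):

#include <stdio.h>

int to_excess16(int exponent) { return exponent + 16; }  /* -16..15 -> 0..31 */
int from_excess16(int stored) { return stored - 16; }

int main(void) {
    printf("%d\n", to_excess16(5));    /* 21 (10101): the exponent for 17   */
    printf("%d\n", to_excess16(-1));   /* 15 (01111): the exponent for 0.25 */
    printf("%d\n", from_excess16(12)); /* -4: used in a later example       */
    return 0;
}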
Examples
Using excess-16 notation, our initial example (17₁₀) becomes:
0 10101 10001000
sign | exponent | significand
If we wanted to store 0.25₁₀ = .1₂ x 2⁻¹ we would have:
0 01111 10000000
sign | exponent | significand
Synonymous forms
• There is still one problem with this representation
scheme; there are several synonymous forms for a
particular value
• For example, all of the following could represent 17:
0 10101 10001000
0 10111 00100010
0 10110 01000100
0 11000 00010001
sign | exponent | significand
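A small C decoder (the name decode14 and the argument layout are ours) confirms that all four patterns above represent 17:

#include <stdio.h>
#include <math.h>

/* Decode the 14-bit model with an excess-16 exponent and no hidden bit:
   value = .<significand bits> x 2^(exponent - 16) */
double decode14(int sign, int exponent, int significand) {
    double frac = significand / 256.0;          /* radix point at the left */
    double v = frac * pow(2.0, exponent - 16);  /* remove the bias */
    return sign ? -v : v;
}

int main(void) {
    printf("%g\n", decode14(0, 0x15, 0x88));  /* 10101 10001000 -> 17 */
    printf("%g\n", decode14(0, 0x17, 0x22));  /* 10111 00100010 -> 17 */
    printf("%g\n", decode14(0, 0x16, 0x44));  /* 10110 01000100 -> 17 */
    printf("%g\n", decode14(0, 0x18, 0x11));  /* 11000 00010001 -> 17 */
    return 0;
}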
Normalization
• In order to avoid the potential chaos of
synonymous forms, a convention has been
established that the leftmost bit of the significand
will always be 1
• An additional advantage of this convention is
that, since we always know the 1 is supposed to
be present, we never need to actually store it;
thus we get an extra bit of precision for the
significand; your text refers to this as the hidden
bit
Example
Express 0.03125₁₀ in normalized floating-point form with excess-16 bias:
0.03125₁₀ = 0.00001₂ x 2⁰ = .1₂ x 2⁻⁴
Applying the bias, the exponent field is 16 – 4 = 12 (01100); with the hidden bit, the leading 1 of the significand is not stored:
0 01100 00000000
sign | exponent | significand
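Putting the pieces together, here is a sketch of an encoder for the 14-bit model in C (positive values only, no rounding or special-value handling, and the leading 1 stored explicitly; dropping it as a hidden bit would give the slide’s 00000000 significand for 0.03125):

#include <math.h>
#include <stdio.h>

/* Pack a positive value into 13 bits: a 5-bit excess-16 exponent,
   then an 8-bit significand with the leading 1 stored explicitly.
   The sign bit is left 0. Illustrative only. */
unsigned encode14(double x) {
    int e;
    double m = frexp(x, &e);              /* x = m * 2^e, 0.5 <= m < 1 */
    unsigned sig = (unsigned)(m * 256.0); /* top 8 significand bits    */
    return ((unsigned)(e + 16) << 8) | sig;
}

int main(void) {
    printf("%04x\n", encode14(17.0));     /* 1588: 10101 10001000 */
    printf("%04x\n", encode14(0.03125));  /* 0c80: 01100 10000000 */
    return 0;
}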
Floating-point arithmetic
• If we wanted to add decimal numbers
expressed in scientific notation, we would
change one of the numbers so that both of
them are expressed in the same power of
the base
• Example:
1.5 x 10² + 3.5 x 10³ = .15 x 10³ + 3.5 x 10³ = 3.65 x 10³
Floating-point arithmetic
Example: add the following binary numbers as represented in
normalized 14-bit format with a bias of 16:
0 10010 11001000
0 10000 10011010
Aligning the operands around the binary point, we have:
  11.00100000
+  0.10011010
-------------
  11.10111010
Renormalizing, retaining the larger exponent and truncating the low-order bits, we have:
0 10010 11101110
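The same align, add, renormalize, truncate procedure can be sketched in C, operating on patterns packed as (exponent << 8) | significand with the sign bit omitted (positive, normalized operands only; the helper name is ours):

#include <stdio.h>

unsigned add14(unsigned a, unsigned b) {
    int ea = (a >> 8) & 0x1F, eb = (b >> 8) & 0x1F;
    unsigned sa = (a & 0xFF) << 8;        /* 8 extra low-order guard bits */
    unsigned sb = (b & 0xFF) << 8;
    if (ea < eb) {                        /* make a the larger-exponent operand */
        unsigned ts = sa; sa = sb; sb = ts;
        int te = ea; ea = eb; eb = te;
    }
    sb >>= ea - eb;                       /* align the binary points  */
    unsigned sum = sa + sb;
    while (sum >= 0x10000) {              /* renormalize on carry-out */
        sum >>= 1;
        ea++;
    }
    return ((unsigned)ea << 8) | (sum >> 8); /* truncate the guard bits */
}

int main(void) {
    /* 0 10010 11001000 + 0 10000 10011010 */
    printf("%04x\n", add14(0x12C8, 0x109A)); /* 12ee: 0 10010 11101110 */
    return 0;
}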
Floating-point errors
• Any two real numbers have an infinite number of
values between them
• Computers have a finite amount of memory in
which to represent real numbers
• Modeling an infinite system using a finite system
means the model is only an approximation
– The more bits used, the more accurate the
approximation
– Always some element of error, no matter how many
bits used
Floating-point errors
• Errors can be blatant or subtle (or
unnoticed)
– Blatant errors: overflow or underflow; blatant
because they cause crashes
– Subtle errors: can lead to wildly erroneous
results, but are often hard to detect (may go
unnoticed until they cause real problems)
Example
• In our simple model, we can express normalized numbers in the range
-.11111111₂ x 2¹⁵ … +.11111111₂ x 2¹⁵
– Certain limitations are clear; we know we can’t store 2⁻¹⁹ or 2¹²⁸
– Not so obvious that we can’t accurately store 128.5:
• Its binary equivalent is 10000000.1, which is 9 bits wide
• The low-order bit would be dropped (or rounded into the next bit)
• Either way, we have introduced an error
Error propagation
• The relative error in the previous example can be found by taking the ratio of the absolute value of the error to the true value of the number:
(128.5 – 128) / 128.5 ≈ .0039 (about .39%)
• In a lengthy calculation, such errors can propagate, causing substantial loss of precision
• The next slide illustrates such a situation; it shows the first six iterations of a floating-point product loop. Eventually the error will be 100%, since the product will go to 0.
Error propagation example
(The stored operands 10000.001 and 0.11101000 are 8-bit truncations of the true values 16.24 and 0.91; the real-product column carries the exact calculation.)

Multiplier            Multiplicand           14-bit product        Real product   Error
10000.001 (16.125)    0.11101000 (0.90625)   1110.1001 (14.5625)   14.7784        1.46%
1110.1001 (14.5625)   0.11101000             1101.0011 (13.1875)   13.4483        1.94%
1101.0011 (13.1875)   0.11101000             1011.1111 (11.9375)   12.2380        2.46%
1011.1111 (11.9375)   0.11101000             1010.1101 (10.8125)   11.1366        2.91%
1010.1101 (10.8125)   0.11101000             1001.1100 (9.75)      10.1343        3.79%
1001.1100 (9.75)      0.11101000             1000.1101 (8.8125)    9.2222         4.44%
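The table can be reproduced with a short C simulation; it assumes (as the numbers imply) that the true operands are 16.24 and 0.91, stored in truncated form as 16.125 and 0.90625:

#include <stdio.h>
#include <math.h>

/* Keep only the 8 most significant bits of x, truncating the rest,
   as our 14-bit model does */
double truncate8(double x) {
    int e;
    double m = frexp(x, &e);              /* x = m * 2^e, 0.5 <= m < 1 */
    return floor(m * 256.0) / 256.0 * pow(2.0, e);
}

int main(void) {
    double stored = truncate8(16.24);     /* 10000.001  = 16.125  */
    double mcand  = truncate8(0.91);      /* 0.11101000 = 0.90625 */
    double real   = 16.24;
    for (int i = 0; i < 6; i++) {
        stored = truncate8(stored * mcand);
        real  *= 0.91;
        printf("%9.4f %9.4f  %5.2f%%\n",
               stored, real, 100.0 * (real - stored) / real);
    }
    return 0;
}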
Special values
• Zero is an example of a special floating-point value
– Can’t be normalized; no 1 in its binary version
– Usually represented as all 0s in both
exponent and significand regions
– It is common to have both positive and
negative versions of 0 in floating-point
representation
[Figure: the real number line with 0 as the only special value, for a model with a 3-bit exponent and 4-bit significand]
More special values
• Infinity: bit pattern used when a result falls in the
overflow region
– Uses exponent field of all 1s, significand of all 0s
• Not a number (NaN): used to indicate illegal
floating point operations
– Uses exponent of all 1s, any significand
• Use of these bit patterns for special values
restricts the range of numbers that can be
represented
Denormalized numbers
• There is no special value for results in the underflow region analogous to infinity in the overflow region
• Denormalized numbers exist in the
underflow regions, shortening the gap
between small positive values and 0
Denormalized numbers
• When the exponent field of a number is all
0s and the significand contains at least a
single 1, special rules apply to the
representation:
– The hidden bit is assumed to be 0 instead of 1
– The exponent is stored in excess n-1 (where
excess n is used for normalized numbers)
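These special values are easy to poke at from C, using real IEEE 754 single precision rather than our 14-bit model:

#include <stdio.h>
#include <math.h>
#include <float.h>

int main(void) {
    float pos_inf  = INFINITY;     /* exponent all 1s, significand all 0s  */
    float not_num  = NAN;          /* exponent all 1s, nonzero significand */
    float neg_zero = -0.0f;        /* sign bit set, all other bits 0       */
    float denorm   = FLT_MIN / 2;  /* below the smallest normal number     */

    printf("%f %f %f\n", pos_inf, not_num, neg_zero);
    printf("smallest normal: %e   a denormal: %e\n",
           (double)FLT_MIN, (double)denorm);
    printf("isinf: %d  isnan: %d\n", isinf(pos_inf), isnan(not_num));
    return 0;
}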
Floating point storage and C
• In the C programming language, the standard output function, printf(), prints floating-point numbers with a default precision of 6, which amounts to about 7 significant digits (counting the whole part) in either fixed or scientific notation
• We can reason backwards from this fun
fact to discover C’s default size for floating
point numbers
Floating point values and C
• To represent a number n having d significant decimal digits, we need log₂(n) ≈ d x log₂(10) bits:
– Let L₁₀ = log₁₀(n); then 10^L₁₀ = n
– So log₂(10^L₁₀) = log₂(n),
– and L₁₀ x log₂(10) = log₂(n),
– which means log₂(n) = log₁₀(n) x log₂(10)
– The last factor, log₂(10), is a constant, about 3.322
• So, the number of bits needed to represent a
decimal number with 7 significant digits is 3.322
* 7, or 23.254; 24 bits, in other words
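The digits-to-bits conversion is one multiplication; a couple of lines of C tabulate it:

#include <stdio.h>
#include <math.h>

int main(void) {
    /* bits needed for d significant decimal digits: d * log2(10) */
    for (int d = 6; d <= 16; d++)
        printf("%2d digits -> %7.3f bits\n", d, d * log2(10.0));
    /* 7 digits -> 23.254 bits, i.e. 24 bits in practice */
    return 0;
}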
The IEEE 754 floating point
standard
• Specifies a uniform standard for single- and double-precision floating-point numbers
• Single-precision: 32 bits
– 23-bit significand
– 8-bit exponent with excess-127 bias
• Double-precision: 64 bits
– 52-bit significand
– 11-bit exponent with excess 1023 bias
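In C, the fields of a single-precision value can be inspected directly by copying the float’s bits into an integer; a minimal sketch:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void) {
    float f = 0.25f;
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);       /* reinterpret without aliasing issues */

    unsigned sign = bits >> 31;
    unsigned exp  = (bits >> 23) & 0xFF;  /* stored (biased) exponent   */
    unsigned frac = bits & 0x7FFFFF;      /* fraction, hidden 1 omitted */

    printf("sign=%u exp=%u (unbiased %d) frac=0x%06x\n",
           sign, exp, (int)exp - 127, frac);
    /* 0.25 = 1.0 x 2^-2: sign=0 exp=125 (unbiased -2) frac=0x000000 */
    return 0;
}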
Binary-coded decimal
representation
• Numeric coding system used mainly in
IBM mainframe & midrange systems
• BCD encodes each digit of a decimal number as a 4-bit binary value
• In an 8-bit byte, only the lower nibble
represents the digit’s value; the upper
nibble (called the zone) is used to hold the
sign, which can be positive (1100),
negative (1101), or unsigned (1111)
BCD
• Commonly used in banking and other financial settings where:
– Most data is monetary
– There is a high level of I/O activity
• BCD is easier to convert to decimal for
printed reports
• Circuitry for arithmetic operations usually
slower than for unsigned binary
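A sketch of the zoned encoding described above in C (the helper name is ours, and real systems vary in details such as field width and padding):

#include <stdio.h>

/* Zoned decimal: one digit per byte, digit in the low nibble, zone
   (1111) in the high nibble; the sign nibble (1100 = +, 1101 = -)
   replaces the zone of the last byte. Simplified sketch. */
void to_zoned(int n, unsigned char *out, int ndigits) {
    unsigned sign = (n < 0) ? 0xD : 0xC;
    unsigned v = (unsigned)(n < 0 ? -n : n);
    for (int i = ndigits - 1; i >= 0; i--) {
        out[i] = (unsigned char)(0xF0 | (v % 10));
        v /= 10;
    }
    out[ndigits - 1] = (unsigned char)((out[ndigits - 1] & 0x0F) | (sign << 4));
}

int main(void) {
    unsigned char buf[4];
    to_zoned(-123, buf, 4);
    for (int i = 0; i < 4; i++) printf("%02X ", buf[i]);
    printf("\n");  /* F0 F1 F2 D3 */
    return 0;
}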