This document briefly introduces how computers store numeric values in binary format.
It also explains why some values (like 0.7, 0.99999, 0.00001, etc.) behave strangely in computer programs.
(For example, the comparison 0.1+0.2+0.4==0.7 evaluates to FALSE!!)
Representation of Values In Computers
Data values are stored using binary bits in computers.
1 bit can represent 2 different values: 0 represents one value and 1 represents another value.
2 bits can represent 4 different values: 00 represents a value, 01 represents another value, 10 represents..., 11...
3 bits can represent 8 different values: 000 represents a value, 001 represents another, 010..., …
Nowadays, computers use formats like 32 bits, 64 bits, etc.
Although 32 bits, 64 bits, etc. can handle a lot of different values, the number of values is not infinite.
(Question: how many different values can 32-bits represent -- Hint: 2 to the power 32.)
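The counting rule above can be checked with a short sketch. Python is used here only for illustration, and the helper names list_patterns and count_patterns are made up for this example.

```python
from itertools import product

def list_patterns(bits):
    """Return every bit pattern of the given length, e.g. '00', '01', ..."""
    return ["".join(p) for p in product("01", repeat=bits)]

def count_patterns(bits):
    """Each extra bit doubles the number of patterns, so n bits give 2**n."""
    return 2 ** bits

print(list_patterns(2))      # ['00', '01', '10', '11']
print(len(list_patterns(3))) # 8
print(count_patterns(32))    # 4294967296
```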
Integers vs floating point values
Computers handle numbers in 2 ways:
(1) For integers (e.g. 23,089):
The computer converts the value to binary (23089₁₀ => 101 1010 0011 0001₂), then stores the binary bits.
(2) For floating point values (e.g. 13.25):
The computer converts the value to binary (13.25₁₀ => 1101.01₂), then stores the binary bits.
(Explanation: 13 in binary is 1101, and 0.25 in binary is 0.01)
*The above are simplified descriptions. For the actual details, please read the last section of this document.
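The two conversions above can be reproduced with a minimal sketch. The function names are hypothetical, and the repeated-doubling method for the fractional part is one common way to do the conversion by hand, not necessarily what the hardware does.

```python
def int_to_binary(n):
    """Binary digits of a non-negative integer."""
    return bin(n)[2:]  # bin() returns e.g. '0b1101'; drop the '0b' prefix

def frac_to_binary(x, max_digits=20):
    """Binary digits of a non-negative value, found by repeated doubling.

    Stops when the fractional part reaches zero or after max_digits digits.
    """
    whole = int(x)
    frac = x - whole
    digits = []
    while frac and len(digits) < max_digits:
        frac *= 2            # shift the next binary digit in front of the point
        bit = int(frac)      # that digit is the integer part (0 or 1)
        digits.append(str(bit))
        frac -= bit
    return int_to_binary(whole) + "." + ("".join(digits) or "0")

print(int_to_binary(23089))     # 101101000110001
print(frac_to_binary(13.25))    # 1101.01
print(frac_to_binary(0.7, 16))  # 0.1011001100110011
```

The last call stops after 16 digits only because of max_digits; the doubling for 0.7 never reaches zero.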
"Problems" of floating point values
If we convert 0.7 to binary, the answer is 0.1011001100110011....₂ (i.e. 2^-1 + 2^-3 + 2^-4 + 2^-7 + ...)
That means it needs infinitely many digits in binary to represent 0.7.
As the computer stores numbers in binary format with a fixed length, it cannot store the infinite sequence of binary
digits for 0.7, so it is impossible to store the exact value of 0.7.
Therefore, 0.7 is stored only as an approximate binary number.
For example, the computer may store 0.7₁₀ as 0.1011001100110011001100110011₂.
Indeed, 0.1011001100110011001100110011₂ is equal to about 0.69999999925494₁₀, not exactly 0.7₁₀.
That means when we write "x=0.7;", the value stored in x is not exactly 0.7!!
The same problem happens to a lot of other values that cannot be converted exactly to binary.
In these cases the computer will store the closest binary number that fits.
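The approximation of 0.7 can be inspected with a small sketch. It assumes Python, whose float is the 64-bit format, so the real stored value keeps more digits than the 28-digit illustration; truncate_to_bits is a hypothetical helper that mimics the 28-digit case.

```python
from decimal import Decimal

def truncate_to_bits(x, frac_bits):
    """Keep only frac_bits binary digits after the point (for 0 <= x < 1)."""
    scaled = int(x * 2 ** frac_bits)         # drop everything past frac_bits digits
    digits = format(scaled, f"0{frac_bits}b")
    return "0." + digits, scaled / 2 ** frac_bits

bits, value = truncate_to_bits(0.7, 28)
print(bits)                    # 0.1011001100110011001100110011
print(value)                   # slightly below 0.7

# Decimal(float) shows the exact value that the 64-bit format really stores.
print(Decimal(0.7))
print(0.1 + 0.2 + 0.4 == 0.7)  # False
```

Because each of 0.1, 0.2, 0.4 and 0.7 is stored as a nearby binary value, the small errors do not cancel, and the sum lands on a different stored value than 0.7.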
Another example: to handle "x=9.9999..;", the computer uses the closest binary number (no longer than the
fixed length of binary digits):
9.99999            ==> 1001.1111111111111111011₂ (about 9.99999046325684₁₀)
9.999999           ==> 1001.1111111111111111111₂ (about 9.99999809265137₁₀)
9.9999999          ==> ...
9.9999999999999999 ==> 1010.0000000000000000000₂ (equal to 10.0₁₀)
In the last example, 9.9999999999999999 is stored as 10.0₁₀.
If you compare them as: 9.9999999999999999 == 10.0, you will get true!!
The example above assumes that the system can store at most 19 binary digits after the binary point.
Real formats keep more digits (24 significant binary digits in the 32-bit format, 53 in the 64-bit format), so the approximation is closer.
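The 19-digit example can be replayed with a sketch like the following. It is an illustration of the rounding idea only: nearest_with_frac_bits is a made-up name, and the values are passed as text and handled with exact fractions so that Python's own 64-bit rounding does not interfere.

```python
from fractions import Fraction

def nearest_with_frac_bits(text, frac_bits=19):
    """Closest binary number with frac_bits digits after the binary point."""
    scale = 2 ** frac_bits
    steps = round(Fraction(text) * scale)  # nearest whole number of 2^-19 steps
    whole, frac = divmod(steps, scale)
    return f"{whole:b}.{frac:0{frac_bits}b}", Fraction(steps, scale)

for text in ["9.99999", "9.999999", "9.9999999999999999"]:
    bits, value = nearest_with_frac_bits(text)
    print(text, "==>", bits, float(value))

# 9.99999            ==> 1001.1111111111111111011
# 9.999999           ==> 1001.1111111111111111111
# 9.9999999999999999 ==> 1010.0000000000000000000

# The 64-bit format shows the same effect for the last value:
print(9.9999999999999999 == 10.0)  # True
```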
* Integers vs floating point values (Details)
The details of how the computer handles numbers:
(1) For integers (e.g. 23089): convert to a binary number (23089₁₀ = 101101000110001₂) and store the binary bits.
In this way, 32 bits can represent about 4 thousand million (2^32 = 4,294,967,296) different integer values
(the values range from -2,147,483,648₁₀ to 2,147,483,647₁₀)
* You may simply think of it as "1 extra bit to control the sign and 31 bits for the net value"
For actual details, you may refer to Two's complement at
http://en.wikipedia.org/wiki/Signed_number_representations
(2) For floating point values (e.g. 13.25):
2.1 First, convert to a binary number (13.25₁₀ = 1101.01₂),
2.2 Then reformat it as "one point something times two to the power something":
1101.01₂ = 1.10101₂ x 2^3
(can be done by shifting the binary point, and counting how many places it is shifted)
Hence "Floating" point
2.3 Then, for the result 1.10101₂ x 2^3, store the two somethings (10101 and 3) in binary bits
i.e. store 10101 and 11 (because 3₁₀ = 11₂),
where 11 will be further shifted under a special scheme (Let's omit the details here).
In this way, 32 bits can represent values up to very large (the largest value is about 3.4 x 10^38) and
down to very small (the smallest positive normalized value is about 1.18 x 10^-38)
*Based on IEEE 754 Standard (http://en.wikipedia.org/wiki/IEEE_754)
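Both storage schemes in this section can be examined with a short sketch. It assumes Python's standard struct module for reading the 32-bit floating point bits; twos_complement_bits and float32_fields are hypothetical helper names. The "special scheme" for the exponent is shown here as subtracting a bias of 127, which is how the IEEE 754 32-bit format stores it.

```python
import struct

def twos_complement_bits(n, bits=32):
    """Bit pattern of a signed integer stored in two's complement."""
    lowest, highest = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    if not lowest <= n <= highest:
        raise OverflowError(f"{n} does not fit in {bits} bits")
    return format(n & (2 ** bits - 1), f"0{bits}b")

def float32_fields(x):
    """Split x (as a 32-bit float) into sign, exponent and fraction digits."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))  # reinterpret the 4 bytes
    sign = raw >> 31                         # 1 bit
    stored_exponent = (raw >> 23) & 0xFF     # 8 bits, stored with a bias of 127
    fraction = format(raw & 0x7FFFFF, "023b")  # 23 bits after the leading "1."
    return sign, stored_exponent - 127, fraction

print(twos_complement_bits(23089))  # 00000000000000000101101000110001
print(twos_complement_bits(-1))     # 11111111111111111111111111111111
print(float32_fields(13.25))        # (0, 3, '10101000000000000000000')

# Largest finite value and smallest positive normalized value of the format.
largest = struct.unpack(">f", bytes.fromhex("7f7fffff"))[0]
smallest_normal = struct.unpack(">f", bytes.fromhex("00800000"))[0]
print(largest, smallest_normal)
```

For 13.25 the fields match step 2.3: the fraction digits start with 10101 and the exponent is 3.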