Chapter 5
Data representation
Data Rep
1
Learning outcomes

By the end of this chapter you will be able to:
• Explain how integers are represented in computers using:
  • Unsigned, signed magnitude, excess, and two’s complement notations
• Explain how fractional numbers are represented in computers:
  • Floating point notation (IEEE 754 single format)
• Calculate the decimal value represented by a binary sequence in:
  • Unsigned, signed magnitude, excess, two’s complement, and IEEE 754 notations
• Explain how characters are represented in computers:
  • E.g. using ASCII and Unicode
• Explain how colours, images, sound and movies are represented
Additional Reading

Essential Reading

Further Reading
• Stallings (2003): Chapter 9
• Brookshear (2003): Chapter 1.4 - 1.7
• Burrell (2004): Chapter 2
• Schneider and Gersting (2004): Chapter 4.2
Number representation

• Representing whole numbers
• Representing fractional numbers
Integer Representations
• Unsigned notation
• Signed magnitude notation
• Excess notation
• Two’s complement notation.
Unsigned Representation

• Represents non-negative integers (zero and the positive integers).
• Unsigned representation of 157:

Position       7     6     5     4     3     2     1     0
Bit pattern    1     0     0     1     1     1     0     1
Contribution   2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
Value          128   64    32    16    8     4     2     1

• Adding the values where the bit is 1: 128 + 16 + 8 + 4 + 1 = 157.
• Addition is simple:
  1 0 0 1 + 0 1 0 1 = 1 1 1 0.
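The place-value calculation for unsigned notation can be sketched in a few lines of Python. The function name `unsigned_value` is only an illustrative choice, not something from the slides.

```python
def unsigned_value(bits: str) -> int:
    """Sum the place value 2^position of every bit that is set to 1."""
    total = 0
    # reversed() makes position 0 the right-most (least significant) bit
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total


print(unsigned_value("10011101"))  # 157
# The addition from the slide: 1001 + 0101 = 1110 (9 + 5 = 14)
print(format(unsigned_value("1001") + unsigned_value("0101"), "04b"))  # 1110
```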
Advantages and disadvantages of unsigned notation

Advantages:
• One representation of zero
• Simple addition

Disadvantages:
• Negative numbers cannot be represented.
• A different notation is needed to represent negative numbers.
Representation of negative numbers

• Is a representation of negative numbers possible?
• Unfortunately, you cannot just stick a negative sign in front of a binary number: only bits are stored, so the sign itself has to be encoded in bits.
• There are three methods used to represent negative numbers:
  • Signed magnitude notation
  • Excess notation
  • Two’s complement notation
Signed Magnitude Representation

• Unsigned notation cannot distinguish − from +.
• In signed magnitude, the left-most bit represents the sign of the integer:
  • 0 for positive numbers.
  • 1 for negative numbers.
• The remaining bits represent the magnitude of the number.
Example

• Suppose 10011101 is a signed magnitude representation.
• The sign bit is 1, so the number represented is negative.

Position       7     6     5     4     3     2     1     0
Bit pattern    1     0     0     1     1     1     0     1
Contribution   −                 2^4   2^3   2^2         2^0

• The magnitude is 0011101, with value 2^4 + 2^3 + 2^2 + 2^0 = 29.
• So the number represented by 10011101 is –29.
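As a rough sketch of the same rule in Python (the function names are made up for illustration), decoding splits off the sign bit and reads the rest as an unsigned magnitude; encoding does the reverse.

```python
def signed_magnitude_value(bits: str) -> int:
    """Left-most bit is the sign (1 = negative); the rest is the magnitude."""
    magnitude = int(bits[1:], 2)
    return -magnitude if bits[0] == "1" else magnitude


def to_signed_magnitude(value: int, width: int = 8) -> str:
    """Encode value as one sign bit followed by (width - 1) magnitude bits."""
    if abs(value) >= 2 ** (width - 1):
        raise ValueError("magnitude does not fit in the available bits")
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), f"0{width - 1}b")


print(signed_magnitude_value("10011101"))  # -29
print(to_signed_magnitude(37))             # 00100101
```

Note that both "00000000" and "10000000" decode to 0 with this scheme, which is the double-zero problem of signed magnitude.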
Exercise 1

1. 37_10 is 0010 0101 in signed magnitude notation. Find the signed magnitude representation of –37_10.
2. Using signed magnitude notation, find the 8-bit binary representations of the decimal values 24_10 and –24_10.
3. Find the signed magnitude representation of –63 using an 8-bit binary sequence.
Disadvantages of Signed Magnitude

• Addition and subtraction are difficult: the signs and the magnitudes have to be handled separately to carry out the required operation.
• There are two representations of 0:
  • 00000000 = +0_10
  • 10000000 = −0_10
• To test whether a number is 0, the CPU needs to check whether it is 00000000 or 10000000.
• Testing for 0 is performed very often in programs, so having two representations of 0 is inconvenient.
Signed magnitude - Summary

In signed magnitude notation:
• The most significant bit is used to represent the sign.
  • 1 represents negative numbers
  • 0 represents positive numbers.
• The unsigned value of the remaining bits represents the magnitude.

Advantages:
• Represents positive and negative numbers.

Disadvantages:
• Two representations of zero.
• Arithmetic operations are difficult.
Excess Notation

In excess notation:
• The value represented is the unsigned value with a fixed value subtracted from it.
  • For n-bit binary sequences, the fixed value subtracted is 2^(n-1).
• Most significant bit:
  • 0 for negative numbers
  • 1 for zero and positive numbers
Excess Notation with n bits

• In unsigned notation, 1000…0 represents the decimal value 2^(n-1).
• (decimal value in excess notation) = (decimal value in unsigned notation) − 2^(n-1)
• Therefore, in excess notation, 1000…0 represents 0.
Example (1) - excess to decimal

• Find the decimal number represented by 10011001 in excess notation.
• Unsigned value:
  • 10011001_2 = 2^7 + 2^4 + 2^3 + 2^0 = 128 + 16 + 8 + 1 = 153_10
• Excess value:
  • excess value = 153 − 2^7 = 153 − 128 = 25.
Example (2) - decimal to excess

• Represent the decimal value 24 in 8-bit excess notation.
• First add the fixed value 2^(8-1):
  • 24 + 2^(8-1) = 24 + 128 = 152
• Then find the unsigned representation of 152:
  • 152_10 = 10011000 (unsigned notation).
  • 24_10 = 10011000 (excess notation)
Example (3)

• Represent the decimal value -24 in 8-bit excess notation.
• First add the fixed value 2^(8-1):
  • -24 + 2^(8-1) = -24 + 128 = 104
• Then find the unsigned representation of 104:
  • 104_10 = 01101000 (unsigned notation).
  • -24_10 = 01101000 (excess notation)
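The two directions of the excess conversion can be sketched as a pair of small Python functions. The names `to_excess` and `from_excess` are illustrative; the bias 2^(n-1) is the fixed value used in the examples above.

```python
def to_excess(value: int, width: int = 8) -> str:
    """Add the fixed value 2^(width-1), then write the result as unsigned."""
    bias = 2 ** (width - 1)
    stored = value + bias
    if not 0 <= stored < 2 ** width:
        raise ValueError("value does not fit in the available bits")
    return format(stored, f"0{width}b")


def from_excess(bits: str) -> int:
    """Read the bits as unsigned, then subtract the fixed value 2^(n-1)."""
    bias = 2 ** (len(bits) - 1)
    return int(bits, 2) - bias


print(from_excess("10011001"))  # 25
print(to_excess(24))            # 10011000
print(to_excess(-24))           # 01101000
```

Because only a constant is added, a larger value always has a larger unsigned bit pattern, which is why comparison is easy in this notation.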
Example (4) (10101)

• Unsigned
  • 10101_2 = 16 + 4 + 1 = 21_10
  • The value represented in unsigned notation is 21.
• Signed magnitude
  • The sign bit is 1, so the sign is negative.
  • The magnitude is the unsigned value 0101_2 = 5_10.
  • So the value represented in signed magnitude is -5_10.
• Excess notation
  • As an unsigned binary integer, 10101_2 = 21_10.
  • Subtracting 2^(5-1) = 2^4 = 16, we get 21 - 16 = 5_10.
  • So the value represented in excess notation is 5_10.
Advantages of Excess Notation

• It can represent positive and negative integers.
• There is only one representation for 0.
• It is easy to compare two numbers: when comparing, the bits can be treated as unsigned integers.
• Excess notation is not normally used to represent integers.
• It is mainly used in floating point representation, where it encodes the exponent (see the floating point slides later).
Exercise 2

1. 10011001 is an 8-bit binary sequence.
  • Find the decimal value it represents if it is in unsigned and in signed magnitude notation.
  • Suppose this representation is excess notation; find the decimal value it represents.
2. Using 8-bit binary sequences, find the unsigned, signed magnitude and excess representations of the decimal value 11_10.
Excess notation - Summary

• In excess notation, the value represented is the unsigned value with a fixed value subtracted from it.
  • i.e. for n-bit binary sequences the value subtracted is 2^(n-1).
• Most significant bit:
  • 0 for negative numbers.
  • 1 for zero and positive numbers.
• Advantages:
  • Only one representation of zero.
  • Easy for comparison.
Two’s Complement Notation

• The most used representation for integers.
• All positive numbers begin with 0.
• All negative numbers begin with 1.
• One representation of zero,
  • i.e. 0 is represented as 0000 using a 4-bit binary sequence.
Two’s Complement Notation with 4 bits

Binary pattern   Value in 2’s complement
0 1 1 1           7
0 1 1 0           6
0 1 0 1           5
0 1 0 0           4
0 0 1 1           3
0 0 1 0           2
0 0 0 1           1
0 0 0 0           0
1 1 1 1          -1
1 1 1 0          -2
1 1 0 1          -3
1 1 0 0          -4
1 0 1 1          -5
1 0 1 0          -6
1 0 0 1          -7
1 0 0 0          -8
Properties of Two’s Complement Notation

• Positive numbers begin with 0.
• Negative numbers begin with 1.
• Only one representation of 0, i.e. 0000.
• Relationship between +n and –n (invert every bit, then add 1):
  • 0100 = +4 and 1100 = -4
  • 00010010 = +18 and 11101110 = -18
Advantages of Two’s Complement Notation

• It is easy to add two numbers:

    0 0 0 1   +1            1 0 0 0   -8
  + 0 1 0 1   +5          + 0 1 0 1   +5
  ---------               ---------
    0 1 1 0   +6            1 1 0 1   -3

• Subtraction can be easily performed.
• Multiplication is just repeated addition.
• Division is just repeated subtraction.
• Two’s complement is widely used in ALUs.
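A minimal Python sketch of why addition is easy: the bit patterns are added exactly as unsigned numbers and any carry out of the top bit is dropped. The helper names (`add_bits`, `negate`, `to_twos_complement`) are illustrative assumptions, not part of the slides.

```python
def to_twos_complement(value: int, width: int) -> str:
    """Encode value in width bits; negative values wrap around modulo 2^width."""
    if not -(2 ** (width - 1)) <= value < 2 ** (width - 1):
        raise ValueError("value does not fit in the available bits")
    return format(value % (2 ** width), f"0{width}b")


def add_bits(a: str, b: str) -> str:
    """Plain unsigned addition; a carry out of the top bit is discarded."""
    width = len(a)
    total = (int(a, 2) + int(b, 2)) % (2 ** width)
    return format(total, f"0{width}b")


def negate(bits: str) -> str:
    """Invert every bit, then add 1."""
    inverted = "".join("1" if b == "0" else "0" for b in bits)
    return add_bits(inverted, format(1, f"0{len(bits)}b"))


print(add_bits("0001", "0101"))          # 0110  (+1 + +5 = +6)
print(add_bits("1000", "0101"))          # 1101  (-8 + +5 = -3)
print(negate("00010010"))                # 11101110  (+18 -> -18)
print(add_bits("0101", negate("0011")))  # 0010  (5 - 3 = 2)
```

The last line shows subtraction performed as addition of the negated operand, so one adder circuit can serve both operations.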
Evaluating numbers in two’s complement notation

• Sign bit = 0: the number is positive. The value is determined in the usual way.
• Sign bit = 1: the number is negative. Three methods can be used:
  • Method 1: find the decimal value of the remaining (n-1) bits, then subtract 2^(n-1).
  • Method 2: evaluate the bits in the usual way, except that the sign bit contributes -2^(n-1).
  • Method 3: find the binary representation of the corresponding positive number and let V be its decimal value; -V is the required value.
Example - 10101 in Two’s Complement

• The most significant bit is 1, hence it is a negative number.
• Method 1
  • 0101 = +5, so the value is 5 – 2^(5-1) = 5 – 2^4 = 5 - 16 = -11.
• Method 2

Position       4      3     2     1     0
Bit pattern    1      0     1     0     1
Contribution   -2^4         2^2         2^0

  • -2^4 + 2^2 + 2^0 = -16 + 5 = -11
• Method 3
  • The corresponding positive number is 01011 = 8 + 2 + 1 = 11,
  • so the result is –11.
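The three methods can be written out side by side as a sketch in Python; all three should agree for any pattern whose sign bit is 1. The function names are hypothetical, and Method 3 here finds the corresponding positive number by inverting the bits and adding 1.

```python
def method_1(bits: str) -> int:
    """Value of the remaining (n-1) bits, minus 2^(n-1). Assumes sign bit = 1."""
    n = len(bits)
    return int(bits[1:], 2) - 2 ** (n - 1)


def method_2(bits: str) -> int:
    """Usual place values, but the sign bit contributes -2^(n-1)."""
    n = len(bits)
    total = -int(bits[0]) * 2 ** (n - 1)
    for position, bit in enumerate(reversed(bits[1:])):
        total += int(bit) * 2 ** position
    return total


def method_3(bits: str) -> int:
    """Find the corresponding positive number V (invert, add 1); return -V."""
    inverted = "".join("1" if b == "0" else "0" for b in bits)
    v = (int(inverted, 2) + 1) % (2 ** len(bits))
    return -v


for method in (method_1, method_2, method_3):
    print(method.__name__, method("10101"))  # each prints -11
```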
Two’s complement - summary

• In two’s complement, the most significant bit of an n-bit number has a contribution of –2^(n-1).
• One representation of zero.
• All arithmetic operations can be performed by using addition and inversion.
• The most significant bit: 0 for positive and 1 for negative.
• Three methods can be used to find the decimal value of a negative number:
  • Method 1: find the decimal value of the remaining (n-1) bits, then subtract 2^(n-1).
  • Method 2: evaluate the bits in the usual way, except that the sign bit contributes -2^(n-1).
  • Method 3: find the binary representation of the corresponding positive number and let V be its decimal value; -V is the required value.
Exercise - 10001011

Determine the decimal value
represented by 10001011 in each of
the following four systems.
1. Unsigned notation?
2. Signed magnitude notation?
3. Excess notation?
4. Two’s complement notation?
Fraction Representation

To represent fractions we need other representations:
• Fixed point representation
• Floating point representation.
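Fixed-Point Representation

Old position   7     6     5     4     3      2      1      0
New position   4     3     2     1     0     -1     -2     -3
Bit pattern    1     0     0     1     1  .   1      0      1
Contribution   2^4               2^1   2^0    2^-1          2^-3

• The radix point sits between new positions 0 and -1.
• Value: 2^4 + 2^1 + 2^0 + 2^-1 + 2^-3 = 16 + 2 + 1 + 0.5 + 0.125 = 19.625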
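A short Python sketch of the same evaluation, assuming the radix point is written explicitly as "." in the bit string (an illustrative convention; real fixed-point formats imply its position).

```python
def fixed_point_value(bits: str) -> float:
    """Evaluate a binary string with an explicit radix point, e.g. '10011.101'."""
    integer_part, _, fraction_part = bits.partition(".")
    value = 0.0
    # Bits left of the point: positions 0, 1, 2, ... counted from the point
    for position, bit in enumerate(reversed(integer_part)):
        value += int(bit) * 2 ** position
    # Bits right of the point: positions -1, -2, -3, ...
    for position, bit in enumerate(fraction_part, start=1):
        value += int(bit) * 2 ** -position
    return value


print(fixed_point_value("10011.101"))  # 19.625
```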
Limitation of Fixed-Point Representation

• To represent large numbers or very small numbers, we need very long sequences of bits.
• This is because we have to give bits to both the integer part and the fraction part.
Floating Point Representation

In decimal notation we can get around this problem using scientific notation or floating point notation.

Number               Scientific notation   Floating-point notation
1,245,000,000,000    1.245 × 10^12         0.1245 × 10^13
0.0000001245         1.245 × 10^-7         0.1245 × 10^-6
-0.0000001245        -1.245 × 10^-7        -0.1245 × 10^-6
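Floating Point

-15900000000000000 could be represented as:
• -159 × 10^14
• -15.9 × 10^15
• -1.59 × 10^16

In each form, - is the sign, the leading number (e.g. 1.59) is the mantissa, 10 is the base, and the power (e.g. 16) is the exponent.
A calculator might display 159 E14.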
Floating point format

± M × B^E

• ±: sign
• M: mantissa (or significand)
• B: base
• E: exponent
Floating Point Representation format

Sign | Exponent | Mantissa

• The exponent is biased by a fixed value b, called the bias.
• The mantissa should be normalised, e.g. if the real mantissa is of the form 1.f, then the normalised mantissa should be f, where f is a binary sequence.
IEEE 754 Single Precision

• The number occupies 32 bits.
• The first bit represents the sign of the number: 1 = negative, 0 = positive.
• The next 8 bits specify the exponent, stored in biased-127 form.
• The remaining 23 bits carry the mantissa, normalised to be between 1 and 2,
  i.e. 1 <= mantissa < 2.
Representation in IEEE 754 single precision

• Sign bit:
  • 0 for positive and
  • 1 for negative numbers
• 8-bit exponent, biased by 127
• 23-bit normalised mantissa

Sign (1 bit) | Exponent (8 bits) | Mantissa (23 bits)
Basic Conversion

Converting a decimal number to a floating point number:
1. Take the integer part of the number and generate the binary equivalent.
2. Take the fractional part and generate a binary fraction.
3. Then place the two parts together and normalise.
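The three steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are invented, `normalise` assumes the number is at least 1 (so the leading 1 is in the integer part), and 6.75 is used as a sample input.

```python
def integer_to_binary(n: int) -> str:
    """Step 1: repeated division by 2; remainders read last-to-first."""
    if n == 0:
        return "0"
    bits = ""
    while n > 0:
        n, remainder = divmod(n, 2)
        bits = str(remainder) + bits
    return bits


def fraction_to_binary(fraction: float, max_bits: int = 23) -> str:
    """Step 2: repeated multiplication by 2; integer parts read first-to-last."""
    bits = ""
    while fraction > 0 and len(bits) < max_bits:
        fraction *= 2
        bit = int(fraction)
        bits += str(bit)
        fraction -= bit
    return bits


def normalise(integer_bits: str, fraction_bits: str):
    """Step 3: move the point behind the leading 1; return (f, exponent) of 1.f x 2^exponent."""
    exponent = len(integer_bits) - 1
    return integer_bits[1:] + fraction_bits, exponent


print(integer_to_binary(6))      # 110
print(fraction_to_binary(0.75))  # 11
print(normalise("110", "11"))    # ('1011', 2), i.e. 1.1011 x 2^2
```

The `max_bits` cut-off matters for fractions such as 0.1 that have no finite binary expansion; those are truncated here rather than rounded.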

IEEE – Example 1

• Convert 6.75 to 32-bit IEEE format.
1. The mantissa. The integer part first:
   6 / 2 = 3 r 0
   3 / 2 = 1 r 1
   1 / 2 = 0 r 1
   Reading the remainders from last to first: 6 = 110_2
2. The fraction next:
   .75 × 2 = 1.5
   .5 × 2 = 1.0
   Reading the integer parts from first to last: 0.75 = 0.11_2
3. Put the two parts together: 110.11
   Now normalise: 1.1011 × 2^2
IEEE Biased 127 Exponent

• To generate a biased 127 exponent, take the value of the signed exponent and add 127.
• Example: for 2^16, the biased exponent is 16 + 127 = 143, so the value stored in the exponent field is 143 = 10001111_2.
• So it is now simply an unsigned value.
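A small Python sketch of the biasing step. The names are illustrative, and the range check assumes the usual single-precision limits for normalised numbers (true exponents from -126 to 127).

```python
BIAS = 127  # single-precision exponent bias


def biased_exponent(true_exponent: int) -> str:
    """Add the bias and write the result as an 8-bit unsigned field."""
    if not -126 <= true_exponent <= 127:
        raise ValueError("outside the range used for normalised numbers")
    return format(true_exponent + BIAS, "08b")


def true_exponent(field: str) -> int:
    """Read the 8-bit field as unsigned and subtract the bias."""
    return int(field, 2) - BIAS


print(biased_exponent(16))        # 10001111  (16 + 127 = 143)
print(true_exponent("10001111"))  # 16
```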
Possible Representations of an Exponent

Binary     Sign magnitude   2's complement   Biased 127 exponent
00000000        0                 0          -127 {reserved}
00000001        1                 1          -126
00000010        2                 2          -125
01111110      126               126            -1
01111111      127               127             0
10000000       -0              -128             1
10000001       -1              -127             2
11111110     -126                -2           127
11111111     -127                -1           128 {reserved}
Why Biased?

• The smallest exponent is 00000000.
• There is only one exponent zero: 01111111.
• The highest exponent is 11111111.
• To increase the exponent by one, simply add 1 to the present pattern.
Back to the example

• Our original example revisited: 1.1011 × 2^2
• The exponent is 2 + 127 = 129, or 10000001 in binary.
• NOTE: the normalised mantissa always has a ‘1’ before the binary point. Storing it would be a waste of storage, so it is implied but not actually stored; e.g. 1.1000 is stored as .1000.
• 6.75 in 32-bit IEEE floating point representation:

  0         10000001      10110000000000000000000
  sign(1)   exponent(8)   mantissa(23)
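The hand-derived pattern can be checked with Python's standard `struct` module, which packs a float in IEEE 754 single format. The helper name `float32_fields` is an illustrative choice.

```python
import struct


def float32_fields(x: float):
    """Pack x as a big-endian IEEE 754 single and split the 32 bits into fields."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    bits = format(raw, "032b")
    return bits[0], bits[1:9], bits[9:]  # sign, exponent, mantissa


sign, exponent, mantissa = float32_fields(6.75)
print(sign, exponent, mantissa)
# 0 10000001 10110000000000000000000
```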
Example (2)

• Which number does the following IEEE single precision pattern represent?

  1   1000 0000   0100 0000 0000 0000 0000 000

• The sign bit is 1, hence it is a negative number.
• The exponent is 1000 0000 = 128_10.
• It is biased by 127, hence the real exponent is 128 – 127 = 1.
• The mantissa field is 0100 0000 0000 0000 0000 000.
• It is normalised, hence the true mantissa is 1.01_2 = 1.25_10.
• Finally, the number represented is -1.25 × 2^1 = -2.50.
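The decoding steps can be sketched in Python. This hypothetical `decode_single` handles only normalised numbers (exponent field 1 to 254); the special exponent values 0 and 255 are deliberately rejected.

```python
def decode_single(sign: str, exponent: str, fraction: str) -> float:
    """Decode a normalised IEEE 754 single from its three bit fields."""
    e = int(exponent, 2)
    if not 1 <= e <= 254:
        raise ValueError("exponent fields 0 and 255 encode special values")
    # Restore the implied leading 1: mantissa = 1.f
    mantissa = 1 + int(fraction, 2) / 2 ** len(fraction)
    value = mantissa * 2.0 ** (e - 127)
    return -value if sign == "1" else value


print(decode_single("1", "10000000", "01000000000000000000000"))  # -2.5
print(decode_single("0", "10000001", "10110000000000000000000"))  # 6.75
```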
Single Precision Format

• The exponent is formatted using excess-127 notation, with an implied base of 2.
  • Example:
    • Exponent: 10000111
    • Representation: 135 – 127 = 8
• The stored values 0 and 255 of the exponent are used to indicate special values, so the exponent range for normalised numbers is restricted to 2^-126 to 2^127.
• The number 0.0 is defined by a mantissa of 0 together with the special exponent value 0.
• The standard also allows the values +/-∞ (represented by a mantissa of 0 and exponent 255, with the sign bit giving + or -).
• It also allows various other special conditions.
In comparison

• The smallest and largest possible 32-bit integers in two’s complement are only -2^31 and 2^31 - 1.
Numbers in 32-bit Formats

• Two’s complement integers
  • Expressible numbers: from -2^31 through 0 to 2^31 - 1
• Floating point numbers (along the number line, from left to right):
  • Negative overflow: below -(2 – 2^-23) × 2^127
  • Expressible negative numbers: from -(2 – 2^-23) × 2^127 to -2^-127
  • Negative underflow: between -2^-127 and 0
  • Zero
  • Positive underflow: between 0 and 2^-127
  • Expressible positive numbers: from 2^-127 to (2 – 2^-23) × 2^127
  • Positive overflow: above (2 – 2^-23) × 2^127

Ref: W. Stallings, Computer Organization and Architecture, Sixth Edition, Upper Saddle River, NJ: Prentice-Hall.
Positive Zero in IEEE 754

  0   00000000   00000000000000000000000
      (biased exponent)   (fraction)

• Read as a normalised number, this pattern would be +1.0 × 2^-127, smaller than any normalised positive number in the single-precision IEEE 754 standard.
• It is interpreted as positive zero.
• A true exponent less than -127 is positive underflow; the result can be regarded as zero.
Negative Zero in IEEE 754

  1   00000000   00000000000000000000000
      (biased exponent)   (fraction)

• Read as a normalised number, this pattern would be -1.0 × 2^-127, closer to zero than any normalised negative number in the single-precision IEEE 754 standard.
• It is interpreted as negative zero.
• A true exponent less than -127 is negative underflow; the result may be regarded as 0.
Positive Infinity in IEEE 754

  0   11111111   00000000000000000000000
      (biased exponent)   (fraction)

• Read as a normalised number, this pattern would be +1.0 × 2^128, larger than any normalised positive number in the single-precision IEEE 754 standard.
• It is interpreted as +∞.
• If the biased exponent is 255 (true exponent 128) and the fraction ≠ 0, the pattern does not represent a number at all. It is called “not a number” or NaN, and it is not the same as ∞.
Negative Infinity in IEEE 754

  1   11111111   00000000000000000000000
      (biased exponent)   (fraction)

• Read as a normalised number, this pattern would be -1.0 × 2^128, more negative than any normalised negative number in the single-precision IEEE 754 standard.
• It is interpreted as -∞.
• If the biased exponent is 255 (true exponent 128) and the fraction ≠ 0, the pattern does not represent a number at all. It is called “not a number” or NaN, and it is not the same as -∞.
Range of numbers

• Normalized (positive range; negative is symmetric)
  • Smallest: 0 00000001 00000000000000000000000 = +2^-126 × (1 + 0) = 2^-126
  • Largest: 0 11111110 11111111111111111111111 = +2^127 × (2 - 2^-23)
• On the number line, positive underflow lies between 0 and 2^-126, and positive overflow lies above 2^127 × (2 - 2^-23).
Representation in IEEE 754 double precision format

• It uses 64 bits:
  • 1-bit sign
  • 11-bit biased exponent
  • 52-bit mantissa

Sign (1 bit) | Exponent (11 bits) | Mantissa (52 bits)
IEEE 754 double precision (bias = 1023)

• 11-bit exponent with an excess of 1023.
• For example, if the exponent is -1:
  • We add 1023 to it: -1 + 1023 = 1022.
  • We then find the binary representation of 1022, which is 0111 1111 110.
  • The exponent field will now hold 0111 1111 110.
• This means that we represent -1 with an excess of 1023.
IEEE 754 Encoding

Single precision         Double precision         Represented object
Exponent   Fraction      Exponent   Fraction
0          0             0          0             0
0          non-zero      0          non-zero      +/- denormalized number
1~254      anything      1~2046     anything      +/- floating-point numbers
255        0             2047       0             +/- infinity
255        non-zero      2047       non-zero      NaN (Not a Number)
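The single-precision column of the table translates directly into a classification routine. The sketch below uses Python's standard `struct` module to obtain the bit pattern; the function name and the returned labels are illustrative.

```python
import struct


def classify_single(x: float) -> str:
    """Classify x by the exponent and fraction fields of its single-precision encoding."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    exponent = (raw >> 23) & 0xFF  # 8-bit biased exponent
    fraction = raw & 0x7FFFFF      # 23-bit fraction
    if exponent == 0:
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 255:
        return "infinity" if fraction == 0 else "NaN"
    return "normalized"


for sample in (0.0, -0.0, 1e-45, 6.75, float("inf"), float("nan")):
    print(sample, classify_single(sample))
```

The sign bit is ignored here because every row of the table applies to both signs.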
Floating Point Representation format (summary)

Sign | Exponent | Mantissa

• The sign bit represents the sign:
  • 0 for positive numbers
  • 1 for negative numbers
• The exponent is biased by a fixed value b, called the bias.
• The mantissa should be normalised, e.g. if the real mantissa is of the form 1.f, then the normalised mantissa should be f, where f is a binary sequence.
Character representation - ASCII

• ASCII (American Standard Code for Information Interchange)
• It is a scheme used to represent characters.
• Each character is represented using a 7-bit binary code.
• If 8 bits are used, the first bit is always set to 0.
• See (table 5.1 p56, study guide) for character representation in ASCII.
ASCII – example

Symbol   Decimal   Binary
7        55        00110111
8        56        00111000
9        57        00111001
:        58        00111010
;        59        00111011
<        60        00111100
=        61        00111101
>        62        00111110
?        63        00111111
@        64        01000000
A        65        01000001
B        66        01000010
C        67        01000011
Character strings

• How to represent character strings?
• A collection of adjacent “words” (bit-string units) can store a sequence of letters:
  'H' 'e' 'l' 'l' 'o' ' ' 'W' 'o' 'r' 'l' 'd' '\0'
• Notation: enclose strings in double quotes
  • "Hello World"
• Representation convention: null character defines end of string
  • Null is sometimes written as '\0'
  • Its binary representation is the number 0
Layered View of Representation

[Figure: layered view with the labels text string, sequence of characters, character, and bit string; between adjacent layers the upper one is marked "Information" and the lower one "Data".]
Working With A Layered View of Representation

• Represent “SI” at the two layers shown on the previous slide.
• Representation schemes:
  • Top layer - character string to character sequence: write each letter separately, enclosed in quotes. End the string with ‘\0’.
  • Bottom layer - character to bit string: represent a character using the binary equivalent according to the ASCII table provided.
Solution

• SI
• ‘S’ ‘I’ ‘\0’
• 01010011 01001001 00000000
  • The spaces are intended to help you read it; computers don’t care that all the bits run together.
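Both layers can be reproduced with a short Python sketch: the string is turned into its ASCII codes, a terminating null (0) is appended, and each code is written as 8 bits. The function name `string_to_bits` is an illustrative choice.

```python
def string_to_bits(text: str) -> str:
    """ASCII-encode text, append the null terminator, and show 8 bits per character."""
    codes = list(text.encode("ascii")) + [0]  # 0 is the '\0' end-of-string marker
    return " ".join(format(code, "08b") for code in codes)


print(string_to_bits("SI"))  # 01010011 01001001 00000000
```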
Exercise

• Use the ASCII table to write the ASCII code for the following:
  • CIS110
  • 6=2*3
Unicode - representation

• ASCII code can represent only 128 = 2^7 characters.
• It only represents the English alphabet plus some control characters.
• Unicode is designed to support the worldwide interchange of text.
• In its original 16-bit form it can represent 2^16 = 65,536 characters.
• For compatibility, the first 128 Unicode characters are the same as those of ASCII.
Colour representation

• Colours can be represented using a sequence of bits.
• 256 colours – how many bits?
  • Hint for calculating: to figure out how many bits are needed to represent a range of values, find the smallest power of 2 that is equal to or bigger than the size of the range.
  • That is, find the smallest x for which 2^x >= 256.
• 24-bit colour – how many possible colours can be represented?
  • Hint: 16 million possible colours (why 16 million?)
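The hint can be turned into a small Python sketch: keep raising the power of 2 until it is at least the number of values. The function names are illustrative only.

```python
def bits_needed(n_values: int) -> int:
    """Smallest x such that 2^x >= n_values."""
    x = 0
    while 2 ** x < n_values:
        x += 1
    return x


def colours_available(bits: int) -> int:
    """Number of distinct values that a word of the given length can hold."""
    return 2 ** bits


print(bits_needed(256))       # 8
print(colours_available(24))  # 16777216, i.e. roughly 16 million
```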
24-bits -- the True colour

• 24-bit colour is often referred to as true colour.
• Any real-life shade detected by the naked eye will be among the 16 million possible colours.
Example: 2 bits per pixel

• 4 = 2^2 choices:
  • 00 (off, off) = white
  • 01 (off, on) = light grey
  • 10 (on, off) = dark grey
  • 11 (on, on) = black
Image representation

• An image can be divided into many tiny squares, called pixels.
• Each pixel has a particular colour.
• The quality of the picture depends on two factors:
  • The density of pixels.
  • The length of the word representing colours.
• The resolution of an image is the density of pixels.
• The higher the resolution, the more information the image contains.
Bitmap Images

• Each individual pixel (picture element) in a graphic is stored as a binary number.
  • Pixel: a small area with an associated coordinate location
  • Example: each point can be represented by a 4-bit code corresponding to 1 of 16 shades of gray
Representing Sound Graphically

• X axis: time
• Y axis: pressure
• A: amplitude (volume)
• λ: wavelength (inverse of frequency; frequency = 1/λ)
Sampling

• Sampling is a method used to digitise sound waves.
• A sample is the measurement of the amplitude at a point in time.
• The quality of the sound depends on:
  • The sampling rate: the faster the better.
  • The size of the word used to represent a sample.
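A rough Python sketch of sampling, assuming a pure sine wave as the sound and a simple linear mapping of the amplitude range [-1, 1] onto integer codes. The function name, parameters and the mapping are illustrative assumptions, not a description of any particular audio format.

```python
import math


def sample_wave(frequency_hz, sampling_rate_hz, duration_s, bits_per_sample):
    """Measure a sine wave at regular intervals and store each sample as an integer code."""
    levels = 2 ** bits_per_sample  # number of distinct codes per sample
    n_samples = int(sampling_rate_hz * duration_s)
    samples = []
    for k in range(n_samples):
        t = k / sampling_rate_hz                              # time of the k-th sample
        amplitude = math.sin(2 * math.pi * frequency_hz * t)  # value in [-1, 1]
        code = round((amplitude + 1) / 2 * (levels - 1))      # map to 0 .. levels-1
        samples.append(code)
    return samples


# One second of a 1 Hz wave, 8 samples per second, 4 bits per sample
print(sample_wave(1, 8, 1, 4))
```

Raising `sampling_rate_hz` captures more points along the wave, and raising `bits_per_sample` gives finer amplitude steps; these are the two quality factors listed above.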
Digitizing Sound

[Figure: zoomed low-frequency signal. The amplitude is captured at the sample points; all variation between data points is lost.]
Summary

• Integer representation
  • Unsigned,
  • Signed,
  • Excess notation, and
  • Two’s complement.
• Fraction representation
  • Floating point (IEEE 754 format)
    • Single and double precision
• Character representation
• Colour representation
• Sound representation
Exercise

1. Represent +0.8 in the following floating-point representation:
  • 1-bit sign
  • 4-bit exponent
  • 6-bit normalised mantissa (significand).
2. Convert the value represented back to decimal.
3. Calculate the relative error of the representation.