Data Representation
January 9–14, 2013
1 / 40
Quick logistical notes
In class exercises
Bring paper and pencil (or laptop) to each lecture!
Goals:
• break up lectures, keep you engaged
• chance to work through problems in class
• ask questions!
First homework will be posted before Friday’s lecture!
2 / 40
Outline
Internal vs. external representations
Representing the natural numbers
Binary number system
Binary arithmetic
Hexadecimal and base-N number systems
Fixed-size integer representations
Representing negative numbers
Big endian vs. little endian
3 / 40
Internal vs. external representations
Internal representation
How the data is actually represented in the computer hardware
External representation
How we interpret or conceptualize the internal representation
4 / 40
Internal representations
Usually two states, which we interpret as 0 and 1
Volatile representations:
• Capacitor (DRAM)
  • charged or not
• Flip-flop circuit (SRAM)
  • one of two output signals is high
Non-volatile representations:
• Region of a magnetized surface (hard disks, tape)
  • positive or negative
• Floating gate transistor (flash)
  • change in voltage
  • one cell can represent more than two states!
  • e.g. one 16-level cell ≈ four flip-flops
5 / 40
Interacting with the internal representation
Architecture provides an interface
• can interact with the internal representation using the abstraction of the external representation
Advantages:
• Don’t have to think about internal representation
• Architecture can be implemented by different hardware
6 / 40
Organization of the internal representation
Usually can’t refer to individual bits
• Internal representation organized into groups
• Through ISA, can read/write a group by an address
Addressable groups in MIPS
• byte = 8 bits
• word = 4 bytes = 32 bits
• (also halfword = 2 bytes = 16 bits)
7 / 40
External representations
Conceptually, view data as a sequence of 0s and 1s
The same data can be interpreted in different ways:
Example: 1111 0110
• extended ASCII character: ö
• unsigned integer: 246
• signed 8-bit integer: −10
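A minimal Python sketch of the three readings of 1111 0110. It assumes ISO 8859-1 (Latin-1) as the "extended ASCII" table, which is one of several possible choices; the variable names are illustrative only.

```python
bits = 0b11110110  # the pattern 1111 0110 from the example

unsigned = bits                                # read as an unsigned integer
signed = bits - 256 if bits & 0x80 else bits   # read as 8-bit two's complement
char = bytes([bits]).decode("latin-1")         # assumption: Latin-1 code table

print(unsigned, signed, char)
```

The bits never change; only the interpretation applied to them does.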
8 / 40
Decimal number system (base 10)
How it works (positional number system):
• 10 digits, used in sequence
• each position corresponds to a power of 10
• sum of each digit multiplied by position value
Example: 2037
Position value:  ...  10^5     10^4    10^3  10^2  10^1  10^0
                 ...  100,000  10,000  1000  100   10    1
Digit:                (0)      (0)     2     0     3     7
2 · 1000 + 0 · 100 + 3 · 10 + 7 · 1
= 2000 + 0 + 30 + 7 = 2037
10 / 40
Binary number system (base 2)
Works the same way!
• 2 digits (bits, short for binary digits), used in sequence
• each position corresponds to a power of 2
• sum of each bit multiplied by position value
Example: 110101
Position value:  ...  2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
                 ...  128  64   32   16   8    4    2    1
Bit:                  (0)  (0)  1    1    0    1    0    1
1 · 32 + 1 · 16 + 0 · 8 + 1 · 4 + 0 · 2 + 1 · 1
= 32 + 16 + 0 + 4 + 0 + 1 = 53
11 / 40
Converting from binary to decimal
Very easy:
• Since binary is just 0s and 1s, no need to multiply
• Just add up the position values of the 1 bits
Example: 1011 0010
Position value:  ...  2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
                 ...  128  64   32   16   8    4    2    1
Bit:                  1    0    1    1    0    0    1    0
128 + 32 + 16 + 2 = 178
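The "add up the position values of the 1 bits" rule can be sketched in Python as follows. The function name `binary_to_decimal` is made up for illustration.

```python
def binary_to_decimal(bits: str) -> int:
    """Add up the position values (powers of 2) of the 1 bits."""
    digits = bits.replace(" ", "")  # allow grouped input such as "1011 0010"
    total = 0
    for position, bit in enumerate(reversed(digits)):
        if bit == "1":
            total += 2 ** position  # position 0 is the rightmost bit
    return total

print(binary_to_decimal("1011 0010"))
```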
12 / 40
Converting from decimal to binary
Method 1: Subtracting powers of 2
For each position p from left to right
• If 2^p ≤ n, subtract 2^p from n and write 1
• Otherwise, write 0
Example: 157
157 − 128 = 29    1 for 128’s position
 29 −  16 = 13    0 for 64, 0 for 32, 1 for 16
 13 −   8 =  5    1 for 8
  5 −   4 =  1    1 for 4
  1 −   1 =  0    0 for 2, 1 for 1
Result: 1001 1101
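Method 1 as a short Python sketch. The function name and the `width` parameter are assumptions for illustration; the code assumes 0 ≤ n < 2^width.

```python
def to_binary_by_subtraction(n: int, width: int = 8) -> str:
    """Walk the positions from left to right, subtracting powers of 2."""
    bits = []
    for p in range(width - 1, -1, -1):  # p = width-1, ..., 1, 0
        if 2 ** p <= n:
            n -= 2 ** p                 # this power of 2 fits: subtract it
            bits.append("1")
        else:
            bits.append("0")
    return "".join(bits)

print(to_binary_by_subtraction(157))
```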
13 / 40
Converting from decimal to binary
Method 2: Successive division by 2
• Divide by 2 until you reach 0, keeping track of remainders
• Write the remainders, from last to first
Example: 157
157 ÷ 2 = 78 R 1
 78 ÷ 2 = 39 R 0
 39 ÷ 2 = 19 R 1
 19 ÷ 2 =  9 R 1
  9 ÷ 2 =  4 R 1
  4 ÷ 2 =  2 R 0
  2 ÷ 2 =  1 R 0
  1 ÷ 2 =  0 R 1
Remainders from last to first: 1001 1101
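Method 2 as a short Python sketch (the function name is illustrative). It produces no leading zeros, so the result for 157 has exactly eight bits.

```python
def to_binary_by_division(n: int) -> str:
    """Divide by 2 until reaching 0; read the remainders from last to first."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)        # quotient becomes the next n
        remainders.append(str(r))
    return "".join(reversed(remainders))

print(to_binary_by_division(157))
```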
14 / 40
In class exercises
Convert from binary to decimal:
• 0010 1010
• 1001 0101
Convert from decimal to binary:
• 169, by subtracting powers of 2
• 84, by successive division by 2
15 / 40
Binary addition
Just like adding decimal numbers!
To add two binary numbers
• Pairwise add each bit, starting from the right
• 0 + 0 = 0 and 0 + 1 = 1
• On 1 + 1, carry a bit to the left
Example: 0110 + 0011
  11      (carries)
  0110
+ 0011
  1001
Example: 0011 + 0011
   11     (carries)
  0011
+ 0011
  0110
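The pairwise, right-to-left addition can be sketched on bit strings like this. `add_binary` is a hypothetical helper name; unlike a fixed-size register, it keeps a final carry as an extra leading bit.

```python
def add_binary(a: str, b: str) -> str:
    """Add two bit strings from the right, carrying a 1 to the left as needed."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # pad the shorter operand with 0s
    carry = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        result.append(str(total % 2))      # bit written in this column
        carry = total // 2                 # bit carried to the next column
    if carry:
        result.append("1")
    return "".join(reversed(result))

print(add_binary("0110", "0011"), add_binary("0011", "0011"))
```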
16 / 40
Binary multiplication
Same algorithm as decimal (only easier)
To multiply two binary numbers A and B
1. For each bit b in B:
• Multiply b × A, aligning the result with b
(since b is 0 or 1, each step yields 0 or A!)
2. Sum the results
Example: 1101 × 1101
1. Partial products:
       1101
     × 1101
       1101
      0
     1101
    1101
2. Sum (often easiest to add results two at a time):
    11111      (carries)
       1101
       0000
     110100
  + 1101000
   10101001
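A shift-and-add sketch of the same algorithm in Python. The function name is illustrative, and `<<` stands in for "aligning the result with b".

```python
def multiply_binary(a: str, b: str) -> str:
    """For each 1 bit in b, add a copy of a shifted to that bit's position."""
    a_value = int(a, 2)
    product = 0
    for shift, bit in enumerate(reversed(b)):
        if bit == "1":
            product += a_value << shift  # partial product aligned with this bit
        # a 0 bit contributes a partial product of 0, so nothing is added
    return bin(product)[2:]              # strip Python's "0b" prefix

print(multiply_binary("1101", "1101"))
```

Shifting left by `shift` positions appends that many 0s on the right, which is also why multiplying by a power of 2 is so cheap.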
17 / 40
Special case: multiplying by a power of 2
Super easy, just like multiplying by powers of 10 in decimal
To multiply a binary number by 2^p:
append p 0s on the right
Examples
• 100 × 1101 = 110100
• 1010 × 1000 = 1010000
18 / 40
Hexadecimal number system (base 16)
Very useful for representing binary data concisely!
• 16 digits: 0–9, A, B, C, D, E, F
• each position corresponds to a power of 16
• usually prefixed with 0x
Each hex digit corresponds to 4 bits
Hex  Binary    Hex  Binary    Hex  Binary    Hex  Binary
0    0000      4    0100      8    1000      C    1100
1    0001      5    0101      9    1001      D    1101
2    0010      6    0110      A    1010      E    1110
3    0011      7    0111      B    1011      F    1111
One byte = 2 hex digits
19 / 40
Converting hexadecimal ⇔ binary
Each hex digit corresponds to 4 bits
Examples
• 0xA4F7 = 1010 0100 1111 0111
• 0x0B60 = 0000 1011 0110 0000
We will be doing this a lot this quarter. :)
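The digit-by-digit mapping can be sketched in Python as below. Both function names are made up for illustration, and `binary_to_hex` assumes the number of bits is a multiple of 4.

```python
def hex_to_binary(hex_digits: str) -> str:
    """Replace each hex digit by its 4-bit group."""
    return " ".join(format(int(d, 16), "04b") for d in hex_digits)

def binary_to_hex(bits: str) -> str:
    """Replace each 4-bit group by its hex digit (length must be a multiple of 4)."""
    digits = bits.replace(" ", "")
    groups = [digits[i:i + 4] for i in range(0, len(digits), 4)]
    return "".join(format(int(g, 2), "X") for g in groups)

print(hex_to_binary("A4F7"))
print(binary_to_hex("0000 1011 0110 0000"))
```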
20 / 40
Converting hexadecimal ⇔ decimal
Two strategies:
• Convert directly
• Convert hexadecimal ⇔ binary ⇔ decimal
Example: 0xB6A4 (direct conversion)
Position value:  ...  16^4    16^3   16^2  16^1  16^0
                 ...  65,536  4,096  256   16    1
Digit:                (0)     B      6     A     4
11 · 4096 + 6 · 256 + 10 · 16 + 4 · 1
= 45056 + 1536 + 160 + 4 = 46,756
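The direct conversion is the same positional sum with powers of 16. A small Python sketch, with illustrative names:

```python
HEX_DIGITS = "0123456789ABCDEF"  # a digit's index in this string is its value

def hex_to_decimal(hex_digits: str) -> int:
    """Sum of each digit's value times its power of 16."""
    total = 0
    for position, digit in enumerate(reversed(hex_digits.upper())):
        total += HEX_DIGITS.index(digit) * 16 ** position
    return total

print(hex_to_decimal("B6A4"))
```

Python's built-in `int("B6A4", 16)` does the same job and is a handy way to check hand conversions.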
21 / 40
Representation in other bases
In general, we can represent numbers in any base
Some other significant bases:
• Base 8 (octal)
  • each octal digit is equivalent to three bits
    (000 = 0_8, 001 = 1_8, 010 = 2_8, . . . , 111 = 7_8)
  • useful in old architectures with 12, 24, 36 bit words
  • support in C and many assembly languages
    (071 = 71_8 = 57_10)
• Base 64 (0–9, A–Z, a–z, +, /)
  • each base-64 digit is equivalent to six bits
  • used in MIME to transmit binary data in plain ASCII text
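A quick Python check of both bases. Note that Python writes octal literals as `0o71`, where C writes `071`; either way the value is 7 · 8 + 1 = 57. The three example bytes are arbitrary data chosen for illustration.

```python
import base64

octal_value = 0o71                        # Python spelling of the C literal 071

three_bytes = bytes([0xA4, 0xF7, 0x0B])   # 24 bits of arbitrary example data
encoded = base64.b64encode(three_bytes)   # 24 bits / 6 bits per digit = 4 digits
decoded = base64.b64decode(encoded)       # round trip back to the original bytes

print(octal_value, len(encoded), decoded == three_bytes)
```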
22 / 40
In class exercises
Add in binary:
• 100 1100 + 1110 1111
Multiply in binary:
• 1011 × 101
Add in hexadecimal:
• 0x28 + 0x4A
23 / 40
Arbitrary vs. fixed precision
So far, we have been assuming arbitrary precision
• to represent a bigger number, just add more bits/digits!
In practice, integers have a fixed size
• commonly 32 or 64 bits
• based on register size of the architecture
This is significant for two reasons:
• risk of overflow
• representation of negative numbers
25 / 40
Representing negative numbers
Must first specify the fixed size of the integer!
• With n bits, we can represent 2^n different values
• Idea: split space so half the values represent negatives
Sign and magnitude representation
• First bit represents the sign (0 positive, 1 negative)
• Rest of bits represent the magnitude, that is |x|
Examples, assuming 4-bit integers:
• −1 = 1001
• −4 = 1100
• −7 = 1111
This is exactly the representation you’re used to in decimal!
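Sign and magnitude encoding as a minimal Python sketch. The function name and the default width of 4 bits are assumptions chosen to match the examples.

```python
def sign_magnitude(x: int, n: int = 4) -> str:
    """First bit is the sign; the remaining n-1 bits hold |x|."""
    magnitude = abs(x)
    if magnitude >= 2 ** (n - 1):
        raise ValueError(f"|{x}| does not fit in {n - 1} magnitude bits")
    sign = "1" if x < 0 else "0"
    return sign + format(magnitude, f"0{n - 1}b")

print(sign_magnitude(-1), sign_magnitude(-4), sign_magnitude(-7))
```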
26 / 40
Problems with sign and magnitude representation
This turns out to not be a very good representation . . . why?
Issue 1: Multiple zeros
• Both 0000 and 1000 represent the same value
• This is strange and requires extra effort
Issue 2: Complicated arithmetic
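Simple binary addition doesn’t work in general (✓ = works, ✗ = fails)
2 + 3 = 5        −2 + −3 = −5     2 + −3 = −1
  0 010            1 010            0 010
+ 0 011          + 1 011          + 1 011
  0 101 ✓          1 101 ✓          1 001 ✗
With mixed signs, adding the magnitudes as in the first two cases does not produce the required 1 001 (−1); the magnitudes must be subtracted and the sign determined separately.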
27 / 40
One’s complement representation
One’s complement
• start with the fixed-size binary representation of |x|
• invert every bit
Features:
• Binary addition is simple (wrap-around carry)
• Still two zeros (all 0s and all 1s)
Examples
• −2: 1. 0010  2. 1101
• −3: 1. 0011  2. 1100
• −5: 1. 0101  2. 1010
28 / 40
One’s complement addition
Overflow carries “wrap around” (added on the right)
Example: −2 + −3 = −5
 11       (carries)
  1101
+ 1100
  1001    (carry out of 1 wraps around)
+    1
  1010
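A sketch of one's complement encoding and wrap-around addition in Python. The helper names are hypothetical, integers stand in for n-bit patterns, and the default width is 4 bits as in the example.

```python
def ones_complement(x: int, n: int = 4) -> int:
    """n-bit pattern for x: |x| itself if x >= 0, otherwise |x| with every bit inverted."""
    mask = (1 << n) - 1
    return x if x >= 0 else ~(-x) & mask

def add_ones_complement(a: int, b: int, n: int = 4) -> int:
    """Add two n-bit patterns; a carry out wraps around and is added on the right."""
    mask = (1 << n) - 1
    total = a + b
    if total > mask:                 # there was a carry out of the top bit
        total = (total & mask) + 1   # drop it and add 1 at the right end
    return total & mask

result = add_ones_complement(ones_complement(-2), ones_complement(-3))
print(format(result, "04b"))
```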
29 / 40
Two’s complement representation
Two’s complement
To represent a negative number x:
1. start with the fixed-size binary representation of |x|
2. invert every bit
3. add 1 to the result
Suppose 4-bit integers:
Examples
• −1: 1. 0001  2. 1110  3. 1111
• −4: 1. 0100  2. 1011  3. 1100
• −7: 1. 0111  2. 1000  3. 1001
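The three steps, sketched in Python. The function name is illustrative, and the code assumes x fits in n bits, that is −2^(n−1) ≤ x ≤ 2^(n−1) − 1.

```python
def twos_complement(x: int, n: int = 4) -> str:
    """n-bit two's complement pattern of x, following the three steps."""
    mask = (1 << n) - 1
    if x >= 0:
        pattern = x
    else:
        magnitude = -x                    # step 1: start from |x|
        inverted = ~magnitude & mask      # step 2: invert every bit
        pattern = (inverted + 1) & mask   # step 3: add 1
    return format(pattern, f"0{n}b")

print(twos_complement(-1), twos_complement(-4), twos_complement(-7))
```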
30 / 40
Features of two’s complement representation
Range of expressible values with n bits
• max: 2^(n−1) − 1 (0 followed by all 1s)
• min: −2^(n−1) (1 followed by all 0s)
Fixes the issues with sign and magnitude:
• Only one zero! (all 0s)
• Binary arithmetic “just works” (discard carry out)
Examples (✓ = correct result):
2 + 3 = 5        −2 + −3 = −5     2 + −3 = −1
  0010             1110             0010
+ 0011           + 1101           + 1101
  0101 ✓           1011 ✓           1111 ✓
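"Just works" can be checked with a few lines of Python: add the raw bit patterns, keep only the low n bits (discarding any carry out), and read the result back as a signed value. Both helper names are made up for illustration.

```python
def add_discard_carry(a: int, b: int, n: int = 4) -> int:
    """Plain binary addition of two n-bit patterns, dropping the carry out."""
    return (a + b) & ((1 << n) - 1)

def as_signed(pattern: int, n: int = 4) -> int:
    """Interpret an n-bit pattern as a two's complement integer."""
    return pattern - (1 << n) if pattern & (1 << (n - 1)) else pattern

total = add_discard_carry(0b1110, 0b1101)   # −2 + −3
print(format(total, "04b"), as_signed(total))
```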
31 / 40
Sign extension
Change the size of an integer without changing its value
• if positive (left-most bit 0), pad left with 0s
• if negative (left-most bit 1), pad left with 1s
Works with both one’s and two’s complement representation
Example: Extending from 8-bits to 16-bits
• 1001 0110 ⇒ 1111 1111 1001 0110
• 0001 0011 ⇒ 0000 0000 0001 0011
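Sign extension on bit strings, as a minimal Python sketch. `sign_extend` is a hypothetical helper, and it assumes the new width is at least the current width.

```python
def sign_extend(bits: str, new_width: int) -> str:
    """Pad on the left with copies of the left-most (sign) bit."""
    digits = bits.replace(" ", "")
    return digits[0] * (new_width - len(digits)) + digits

print(sign_extend("1001 0110", 16))
print(sign_extend("0001 0011", 16))
```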
32 / 40
Carry out vs. overflow
Carry out: carry after most significant bit ⇒ discard, no error
Overflow: result is out of representable range ⇒ error!
Carry out ≠ overflow!
Carry out is a normal part of signed integer addition
Will get a carry out when adding:
• two negative numbers
• a negative and a positive, result is positive or zero
Just ignore it!
33 / 40
Two’s complement overflow detection
Overflow: result is out of representable range ⇒ error!
When adding . . .
• two numbers with different signs
  • overflow can never occur!
• two numbers with the same sign
  • overflow occurs if the sign changes
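The sign rule can be sketched in Python as follows. The function name and the 4-bit default are assumptions for illustration; operands are n-bit two's complement patterns.

```python
def add_with_overflow(a: int, b: int, n: int = 4) -> tuple[int, bool]:
    """Add two n-bit patterns; report overflow using the sign-change rule."""
    mask = (1 << n) - 1
    sign_bit = 1 << (n - 1)
    result = (a + b) & mask                          # carry out is discarded
    same_sign = (a & sign_bit) == (b & sign_bit)
    # Overflow only if both operands share a sign and the result's sign differs.
    overflow = same_sign and (result & sign_bit) != (a & sign_bit)
    return result, overflow

print(add_with_overflow(0b0101, 0b0100))  # 5 + 4 does not fit in 4 bits
print(add_with_overflow(0b1110, 0b1101))  # −2 + −3 fits, despite the carry out
```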
34 / 40
Trade-offs between representations of negatives
In modern architectures, two’s complement is used
• Simple arithmetic operations
• Only one zero
• Hard to read
35 / 40
Unsigned vs. signed integers
Can interpret the same n-bit data as either unsigned or signed
Unsigned integer
• Interpret as a non-negative number
• Range: 0 to 2^n − 1
Signed integer
• Interpret as two’s complement
• Range: −2^(n−1) to 2^(n−1) − 1
Only different when the leftmost bit is a 1!
36 / 40
Big endian vs. little endian
Order of the addressable components in a larger data type
• Usually, the order of bytes within a word
Big endian
Bytes ordered from most significant (left) to least (right)
• Example: 256 as a 16-bit halfword: 0x0100
Little endian
Bytes ordered from least significant (left) to most (right)
• Example: 256 as a 16-bit halfword: 0x0001
Big-endian is what we’ve been assuming so far!
37 / 40
Endian conversion
Converting from big endian to little endian
1. Separate the data into addressable components (bytes)
2. Write the components (not the bits!) in reverse order
Examples
• 0x12345678: 1. 12 34 56 78  2. 0x78563412
• 0xE5AD5CCA: 1. E5 AD 5C CA  2. 0xCA5CADE5
Same algorithm for converting from little to big!
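Byte reversal can be sketched with Python's built-in `int.to_bytes` and `int.from_bytes`. The function name is illustrative, and the code assumes bytes are the addressable components.

```python
def swap_endianness(value: int, num_bytes: int = 4) -> int:
    """Split into bytes in big-endian order, then re-read them as little endian."""
    as_bytes = value.to_bytes(num_bytes, "big")   # step 1: separate into bytes
    return int.from_bytes(as_bytes, "little")     # step 2: bytes in reverse order

print(hex(swap_endianness(0x12345678)))
print(hex(swap_endianness(0xE5AD5CCA)))
```

Applying the function twice returns the original value, which matches the note that the same algorithm converts in both directions.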
38 / 40
Which architectures are what endian?
Little-endian:
• x86, Atmel
• MIPS (MARS simulator)
Big-endian:
• Motorola 6800 and 68k
Bi-endian (configurable to be big or little):
• ARM, SPARC, PowerPC
• MIPS (specification)
http://en.wikipedia.org/wiki/Endianness
39 / 40
In class exercises
Assume 8-bit integers, addressable in 2-bit chunks
For each of the following numbers:
1. write in two’s complement binary form
2. convert to little endian
Numbers:
• -50
• -100
40 / 40