COM 249 – Computer Organization and
Assembly Language
Chapter 2 Instructions:
Language of the Computer
Based on slides from D. Patterson and
www-inst.eecs.berkeley.edu/~cs152/
Modified by S. J. Fritz Spring 2009 (1)
Introduction
• Words of a computer’s language are
called its instructions
• Its vocabulary is its instruction set.
• Goal:
– Find a language that makes it easy to build
the hardware and the compiler,
– while maximizing performance and
minimizing cost
§2.1 Introduction
Instruction Set
• Different computers have different instruction sets
– But with many aspects in common
• Early computers had very simple instruction sets
– Simplified implementation
• Many modern computers also have simple instruction sets
Instruction Set Architecture
• Early trend was to add more and more instructions
to new CPUs to do elaborate operations
– VAX architecture had an instruction to multiply
polynomials!
• RISC philosophy (Cocke at IBM, Patterson, Hennessy; 1980s): Reduced Instruction Set
Computing (RISC)
– Keep the instruction set small and simple, which
makes it easier to build fast hardware.
– Let software do complicated operations by
composing simpler ones.
The MIPS Instruction Set
• Stored program concept: instructions and data are stored
as numbers.
• MIPS Instruction Set is used as the example throughout
the book
• Stanford MIPS commercialized by MIPS Technologies
(www.mips.com)
• Large share of embedded core market
– Applications in consumer electronics, network/storage equipment,
cameras, printers, …
• Typical of many modern ISAs
– See MIPS Reference Data tear-out card, and Appendixes B and E
MIPS Architecture
• MIPS – semiconductor company that
built one of the first commercial RISC
architectures
• We will study the MIPS architecture in
some detail in this class
• Why MIPS instead of Intel 80x86?
– MIPS is simple and elegant. We don’t want to get
bogged down in gritty details.
– MIPS is widely used in embedded applications, x86 is
little used there, and there are more embedded
computers than PCs
Review: Instruction Set Design
software
instruction set
hardware
Which is easier to change?
Stored Program Computer
• Basic Principles
– Use of instructions that are
indistinguishable from numbers
– Use of alterable memory for programs
• Demands balance among number of
instructions, the number of clock cycles
needed by an instruction and the speed
of the clock.
Overview of Design Principles
1. Simplicity favors regularity
– keep all instructions a single size
– require three register operands for arithmetic
– keep register fields in same place in each
instruction
– regularity makes implementation simpler
– simplicity enables higher performance at
lower cost
2. Smaller is faster
– the reason that MIPS has 32 registers rather
than many more
Overview of Design Principles
3. Make the common case fast
– PC-relative addressing for conditional branch
– immediate addressing for constant operands
4. Good design demands good compromises
– compromise between larger addresses and
keeping instructions same length
MIPS Instructions
• Design Principle 1: Simplicity favors
regularity
• The MIPS assembly language instruction
add a, b, c
means a = b + c
• Each line represents one instruction
• Each instruction has exactly 3 operands for simplicity
• There is one operation per MIPS instruction
• Instructions are related to operations (=, +, -, *, /) in C or Java
Arithmetic Operations- Addition
• The MIPS assembly language instruction
add a, b, c
means a = b+c
• This sequence adds four variables (a = b + c + d + e)
add a, b, c # the sum of b and c is placed in a
add a, a, d # the sum of b, c, and d is now in a
add a, a, e # the sum of b, c, d, and e is now in a
• Notice that it takes 3 instructions to add four
variables
MIPS Addition and Subtraction
• Syntax of Instructions:
1 2,3,4
where:
1) operation by name
2) operand getting result (“destination”)
3) 1st operand for operation (“source1”)
4) 2nd operand for operation (“source2”)
• Syntax is rigid:
– 1 operator, 3 operands
– Why? Keep Hardware simple via regularity
MIPS Addition and Subtraction of Integers
• Addition in Assembly
– Example: add $s0,$s1,$s2 (in MIPS)
Equivalent to:
a = b + c (in C/Java)
where MIPS registers $s0,$s1,$s2 are
associated with C variables a, b, c
• Subtraction in Assembly
– Example: sub $s3,$s4,$s5 (in MIPS)
Equivalent to:
d = e - f (in C)
where MIPS registers $s3,$s4,$s5 are
associated with C variables d, e, f
Addition and Subtraction
• How would MIPS do this C/Java statement?
a = b + c + d - e;
• Break into multiple instructions
add $t0, $s1, $s2 # temp = b + c
add $t0, $t0, $s3 # temp = temp + d
sub $s0, $t0, $s4 # a = temp - e
• Notice: A single line of C or Java may break
up into several lines of MIPS. Everything after
the hash mark (#) on each line is a comment
and is ignored.
Compiling C into MIPS
• How do we do this?
• C code:
f = (g + h) - (i + j);
• Use intermediate temporary registers:
• Compiled MIPS pseudocode:
add t0, g, h # temp t0 = g + h
add t1, i, j # temp t1 = i + j
sub f, t0, t1 # f = t0 - t1
• Comments are to the right of the #
• Each line contains at most one instruction
C, Java Variables vs. Registers
• In C (and most High Level Languages) variables
are declared first and given a type
– Example:
int fahr, celsius;
char a, b, c, d, e;
• Each variable can ONLY represent a value of the
declared type (cannot mix and match int and
char variables).
• In Assembly Language, the registers have no type;
operation determines how register contents are
treated
Operands of the Computer Hardware
• Operands of arithmetic instructions must be
from a limited number of special memory
locations called registers
• Size of a MIPS register is 32 bits, called a
word (although there is a 64-bit version).
• Major difference between variables in a
programming language (unlimited) and
registers is the limited number of registers, typically 32 in MIPS.
Operands of the Computer Hardware
• Design Principle 2: Smaller is faster
– Very large number of registers may increase
clock cycle time because it takes electronic
signals longer to travel farther.
– Using more than 32 registers would require a
different instruction format.
– MIPS register convention is to use two
character names following a dollar sign:
$s0, $s1… for variables
$t0, $t1… for temporary locations
$a0, $a1…for arguments
Compiling C into MIPS Using Registers
• C code: (similar to previous example)
f = (g + h) - (i + j);
• where f, g, h, i, j are assigned to registers
$s0, $s1, $s2, $s3 and $s4 respectively:
• Compiled MIPS code:
add $t0,$s1,$s2 # register $t0 = g + h
add $t1,$s3,$s4 # register $t1 = i + j
sub $s0,$t0,$t1 # f gets $t0 - $t1
• Variables have been replaced with registers
Register Operands
• Arithmetic instructions use register
operands
• MIPS has a 32 × 32-bit register file
(32 registers, each 32 bits)
– Use for frequently accessed data
– Numbered 0 to 31
– 32-bit data called a “word”
Memory Operands
• Programming Languages have both
simple variables and complex data
structures.
• How can we handle large data
structures with just a few registers?
– Data structures are kept in memory.
• MIPS includes instructions to transfer
data between memory and registers.
– Data transfer instructions ( load, store)
Memory Operands
• Data transfer Instruction
– load copies data from memory to register
– lw - load word
• Format
opcode register, constant(register)
– constant(register) specifies the memory address
• Syntax
lw $t0, 8($s3)
– 8 is the offset; $s3 holds the base address
Memory Addressing
• Since 1980 almost every machine uses addresses to
the level of 8 bits (a byte)
• 2 questions for design of Instruction Set Architecture,
since we could read a 32-bit word as
– four loads of bytes from sequential byte addresses
– or as one load word from a single byte address:
• How do byte addresses map onto words?
• Can a word be placed on any byte boundary?
Addressing Objects: Alignment
• Since 8-bit bytes are useful, most
architectures address individual bytes.
• Address of a word matches the address
of one of the four bytes in the word
• Addresses of sequential words differ by
4 bytes
• MIPS words must start at addresses
that are multiples of 4 - called alignment
restriction
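The alignment restriction can be checked with a couple of lines of arithmetic. The sketch below is only an illustration in Python (not part of the slides); the names `WORD_SIZE`, `is_word_aligned`, and `word_addresses` are made up for this example.

```python
WORD_SIZE = 4  # bytes per MIPS word


def is_word_aligned(address):
    """Return True if a byte address is a multiple of 4, so a word may start there."""
    return address % WORD_SIZE == 0


def word_addresses(base, count):
    """Byte addresses of `count` sequential words starting at an aligned base address."""
    if not is_word_aligned(base):
        raise ValueError("base address must be a multiple of 4")
    return [base + WORD_SIZE * i for i in range(count)]


print(word_addresses(0, 4))   # sequential word addresses differ by 4
print(is_word_aligned(6))     # 6 is not a multiple of 4
```

Sequential word addresses come out 4 apart, and any address that is not a multiple of 4 is rejected as a word address.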
Memory Operands
• Arithmetic operations occur on registers
• More complex data structures (arrays and
structures) are kept in memory
• MIPS must include instructions that transfer
data between memory and registers
(called data transfer instructions)
• To access a word in memory, the instruction
must include the memory address.
Memory Operands
• Main memory used for composite data
– Arrays, structures, dynamic data
• To apply arithmetic operations
– Load values from memory into registers
– Store result from register to memory
• Memory is byte addressed
– Each address identifies an 8-bit byte
• Words are aligned in memory
– Address must be a multiple of 4
• MIPS is Big Endian
– Most-significant byte at the lowest address of a word
– c.f. Little Endian: least-significant byte at the lowest address
Addressing Objects: Endianness
• Computers are grouped into those that use:
– the address of the leftmost or “big end byte” as
the word address
– and those that use the “little end” or rightmost
byte
• MIPS is in the BIG Endian group
Addressing Objects: “Endianness” and Alignment
• Big Endian: word address is the address of the most significant byte: IBM 360/370,
Motorola 68k, MIPS, Sparc, HP PA
• Little Endian: word address is the address of the least significant byte: Intel 80x86,
DEC Vax, DEC Alpha (Windows NT)
[Figure: a 4-byte word drawn from msb to lsb. Little endian numbers the bytes 3 2 1 0, so byte 0 is at the lsb end; big endian numbers them 0 1 2 3, so byte 0 is at the msb end.]
• Alignment: require that objects fall
on an address that is a multiple of their size.
[Figure: aligned and not aligned objects]
Big Endian
• "Big Endian" means that the high-order (most significant)
byte of the number is stored in memory at the lowest
address, and the low-order (least significant) byte at the
highest address. (The “big end” comes first.)
• A LongInt would then be stored as:
Base Address+0 Byte3
Base Address+1 Byte2
Base Address+2 Byte1
Base Address+3 Byte0
[Figure: bytes A, B, C, D of a word shown in big-endian and little-endian order]
• Motorola processors (those used in Macs) and
mainframes use "Big Endian" byte order.
• http://www.cs.umass.edu/~Verts/cs32/endian.html
Little Endian
• "Little Endian" means that the low-order byte of the number
is stored in memory at the lowest address, and the high-order
byte at the highest address. (The little end comes first.)
• A 4 byte LongInt Byte3 Byte2 Byte1 Byte0 will be arranged
in memory as follows:
Base Address+0 Byte0
Base Address+1 Byte1
Base Address+2 Byte2
Base Address+3 Byte3
• Intel processors (those used in PCs) use "Little
Endian" byte order.
• http://www.cs.umass.edu/~Verts/cs32/endian.html
Big Endian and Little Endian
• To represent the value 1025 (as a 4 byte integer):
00000000 00000000 00000100 00000001
Address  Big-Endian  Little-Endian
00       00000000    00000001
01       00000000    00000100
02       00000100    00000000
03       00000001    00000000
• If UNIX were stored as 2 two-byte words, then in a Big-Endian system it
would be stored as UNIX; in a Little-Endian system, it would be stored as
NUXI. (See http://www.webopedia.com/TERM/B/big_endian.html )
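The two byte orders in the 1025 table and the UNIX/NUXI remark can be reproduced with Python's built-in `int.to_bytes` and `int.from_bytes`. This is a small sketch for illustration only; the helper names `word_bytes` and `store_as_halfwords` are made up here.

```python
def word_bytes(value, byteorder):
    """Return the 4 bytes of a 32-bit unsigned value, lowest address first."""
    return list(value.to_bytes(4, byteorder))


def store_as_halfwords(text, byteorder):
    """Store each 2-character pair as one 16-bit value, then read memory byte by byte."""
    memory = b""
    for i in range(0, len(text), 2):
        halfword = int.from_bytes(text[i:i + 2].encode("ascii"), "big")
        memory += halfword.to_bytes(2, byteorder)
    return memory.decode("ascii")


big = word_bytes(1025, "big")
little = word_bytes(1025, "little")
for address in range(4):
    print(f"{address:02d}  {big[address]:08b}  {little[address]:08b}")

print(store_as_halfwords("UNIX", "big"), store_as_halfwords("UNIX", "little"))
```

The loop prints the same three columns as the table (address, big-endian byte, little-endian byte), and the last line shows the two-byte swap that turns UNIX into NUXI.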
Big and Little Endian
• Both have advantages and disadvantages
• In "Big Endian" form, by having the high-order byte
come first, you can test whether the number is
positive or negative by looking at the byte at offset
zero.
• In "Little Endian" form, assembly language
instructions for picking up a 1, 2, 4, or longer byte
number proceed in exactly the same way for all
formats and multiple precision math routines are
correspondingly easy to write.
• http://www.cs.umass.edu/~Verts/cs32/endian.html
MIPS I Registers
• Programmable storage
– 2^32 x bytes of memory
– 32 x 32-bit General Purpose Registers, GPRs r0-r31 (R0 = 0)
– PC, lo, hi
[Figure: register file r0 (always 0) through r31, plus PC, lo, and hi, each 32 bits “wide”]
Memory Addresses and Contents
[Figure: processor connected to memory]
Address  Data
3        100
2        10
1        101
0        1
The address of the third data element is 2 and
the contents of Memory[2] is 10.
Registers vs. Memory
• Registers are faster to access than memory
• Operating on memory data requires loads
and stores
– More instructions to be executed
• Compiler must use registers for variables as
much as possible
– Only spill to memory for less frequently used
variables
– Register optimization is important!
Arrays and Data Structures
• C and Java variables map onto registers; what about
large data structures like arrays?
• One of the 5 components of a computer, the memory, contains such data structures
• But MIPS arithmetic instructions only operate on
registers, never directly on memory.
• Data transfer instructions transfer data between
registers and memory:
– Memory to register (Load)
– Register to memory (Store)
Anatomy: 5 components of any Computer
Registers are in the datapath of the
processor; if operands are in memory,
we must transfer them to the processor
to operate on them, and then transfer
back to memory when done.
[Figure: Personal Computer. Computer = Processor (Control (“brain”), Datapath with Registers), Memory, and Devices (Input, Output). Store goes to memory; Load comes from memory.]
These are “data transfer” instructions…
Data Transfer: Memory to Registers
• To transfer a word of data, we need to specify two
things:
– Register: specify this by # ($0 - $31) or
symbolic name ($s0,…, $t0, …)
– Memory address: more difficult
• Think of memory as a single one-dimensional
array, so we can address it simply by
supplying a pointer to a memory address.
• Other times, we want to be able to offset from
this pointer.
• Remember: “Load FROM memory”
Data Transfer: Memory to Registers
• To specify a memory address to copy
from, specify two things:
– A register containing a pointer to memory
– A numerical offset (in bytes)
• The desired memory address is the sum
of these two values.
• Example:
8($t0)
– specifies the memory address pointed to by
the value in $t0, plus 8 bytes
Data Transfer: Memory to Register
• Load Instruction Syntax:
1 2, 3(4)
lw $t0,12($s0)
where
1) operation name
2) register that will receive value
3) numerical offset in bytes
4) register containing pointer to memory
• MIPS Instruction Name:
– lw (meaning Load Word, so 32 bits or one word
are loaded at a time)
Data Transfer: Memory to Register
• Example: lw $t0,12($s0) (data flows from memory into register $t0)
This instruction will take the pointer in $s0, add 12
bytes to it, and then load the value from the memory
pointed to by this calculated sum into register $t0
• Notes:
– $s0 is called the base register
– 12 is called the offset
– offset is generally used in accessing elements of array
or structure: base register points to beginning of array
or structure
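The base-plus-offset calculation behind lw can be sketched in a few lines of Python. This is a toy model for illustration, not how the hardware is built: memory is a dictionary from byte address to word, and the base address 0x1000 and the stored value 42 are arbitrary assumptions.

```python
def effective_address(base, offset):
    """Address used by lw/sw: contents of the base register plus the byte offset."""
    return base + offset


def load_word(memory, base, offset):
    """Model of lw: fetch the word stored at base + offset (must be word aligned)."""
    address = effective_address(base, offset)
    if address % 4 != 0:
        raise ValueError("lw address must be a multiple of 4")
    return memory[address]


s0 = 0x1000                      # hypothetical pointer held in $s0
memory = {0x100C: 42}            # hypothetical word stored at byte address 0x100C
t0 = load_word(memory, s0, 12)   # models lw $t0,12($s0)
print(hex(effective_address(s0, 12)), t0)
```

The offset is counted in bytes, so an offset of 12 selects the word three words past the address in the base register.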
Data Transfer: Register to Memory
• Also want to store from register into memory
– Store instruction syntax is identical to Load’s
• MIPS Instruction Name:
– sw (meaning Store Word, so 32 bits or one word
are stored at a time)
• Example: sw $t0,12($s0) (data flows from register $t0 into memory)
This instruction will take the pointer in $s0, add 12
bytes to it, and then store the value from register $t0
into that memory address
• Remember: “ Store INTO memory”
Data Transfer Instructions- Load
• Load copies data from memory to a register; in MIPS, lw or load word
• Format: operation name followed by the
register to be loaded, then a constant and
register used to access memory
• Sum of the constant portion of the instruction
and the contents of the second register
forms the memory address
Data Transfer Instructions- Load
• Assume A is an array of 100 words, with a starting
or base address in $s3
• Let g, h be variables associated with $s1,$s2
• C Assignment Statement:
g = h + A[8];
• Compiling to MIPS with operand in Memory
• First transfer A[8] to a register: (use load word - lw)
lw $t0, 8($s3) # Temp register $t0 gets A[8]
add $s1,$s2,$t0 # g = h + A[8]
• The constant 8 is the offset and the register ($s3)
added to form the address is called the base register.
• This first version treats memory as word addressed; with MIPS byte
addressing the offset for A[8] is 4 x 8 = 32.
Actual MIPS Memory Addresses and Contents
[Figure: processor connected to memory]
Address  Data
12       100
8        10
4        101
0        1
Since MIPS addresses each byte, word addresses are multiples of 4;
there are 4 bytes in a word. Byte address of third word is 8.
Data Transfer Instructions- Store
• Instruction complementary to load is called
store, or store word – sw- which copies
data from a register to memory
• Format similar to load instruction: name of
operation, followed by the register to be
stored, then offset to select the element, and
finally the base register.
Data Transfer Instructions- Store
• Assume variable h is associated with register $s2
• Base address of array A is in $s3.
• C code: A[12] = h + A[8];
• MIPS code:
lw $t0, 32($s3) # temp reg $t0 gets A[8]
add $t0, $s2, $t0 # temp reg $t0 gets h + A[8]
sw $t0, 48($s3) # stores h + A[8] into A[12]
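To see the three instructions act on memory, here is a minimal Python sketch of a toy machine state. Everything beyond the three instructions is an assumption made for illustration: the base address 0x2000, the array contents A[i] = 10 * i, and h = 5. Overflow checking is left out.

```python
WORD = 4  # bytes per MIPS word


def run_store_example(regs, memory):
    """Carry out the three instructions from the slide on a toy machine state."""
    # lw $t0, 32($s3): load A[8] from byte address base + 32
    regs["$t0"] = memory[regs["$s3"] + 32]
    # add $t0, $s2, $t0: $t0 = h + A[8] (overflow is ignored in this sketch)
    regs["$t0"] = regs["$s2"] + regs["$t0"]
    # sw $t0, 48($s3): store the sum into A[12] at byte address base + 48
    memory[regs["$s3"] + 48] = regs["$t0"]


BASE = 0x2000                                            # hypothetical base address of A
memory = {BASE + WORD * i: 10 * i for i in range(100)}   # hypothetical data: A[i] = 10 * i
regs = {"$s2": 5, "$s3": BASE, "$t0": 0}                 # h = 5
run_store_example(regs, memory)
print(memory[BASE + WORD * 12])                          # A[12] now holds h + A[8]
```

With these assumed values A[8] is 80, so A[12] ends up holding 85, while A[8] itself is unchanged.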
MIPS Memory Addressing
• Most architectures address individual bytes,
therefore the address of a word matches the
address of one of the 4 bytes within the word.
• Addresses of sequential words differ by 4.
• In MIPS, words must start at addresses that are
multiples of 4. This is called the alignment restriction.
• Remember MIPS is “big endian”
• Byte addressing affects the array index.
• Offset to be added to the base register $s3 (in
previous example) must be (4 x 8) or 32.
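The index-to-offset rule is simple enough to write down as a one-line function. The Python sketch below is illustrative only; `word_offset` and `lw_text` are hypothetical helper names, not assembler features.

```python
WORD_SIZE = 4  # bytes per MIPS word


def word_offset(index):
    """Byte offset of element `index` in an array of words."""
    return WORD_SIZE * index


def lw_text(dest, index, base_reg):
    """Format the lw instruction that loads array element `index`."""
    return f"lw {dest}, {word_offset(index)}({base_reg})"


print(lw_text("$t0", 8, "$s3"))
print(word_offset(12))
```

The same multiplication gives the offset 48 used by the sw that writes A[12].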
Constants or Immediate Operands
• Design Principle 3: Make the common case FAST
• Constants occur frequently, and by including
constants in arithmetic instructions, operations are faster
than if the constants were loaded from memory:
lw $t0, AddrConstant4($s1) # $t0 = constant 4
• To add 4 to register $s3 use add immediate (addi):
addi $s3, $s3, 4 # $s3 = $s3 + 4
• Since MIPS supports negative constants, there is no
need for a subtract immediate instruction.
Pointers v. Values
• Key Concept:
• A register can hold any 32-bit value. That
value can be a (signed) int, an unsigned
int, a pointer (memory address), and so on
• If you write add $t2,$t1,$t0
then $t0 and $t1 must contain values
• If you write lw $t2,0($t0)
then $t0 must contain a pointer to memory
• Don’t mix these up!
Addressing: Byte vs. Word
• Every word in memory has an address, similar to
an index in an array
• Early computers numbered words like C numbers
elements of an array:
– Memory[0], Memory[1], Memory[2], …
Called the “address” of a word
• Computers needed to access 8-bit bytes as well as
words (4 bytes/word)
• Today machines address memory as bytes,
(i.e.,“Byte Addressed”) hence 32-bit (4 byte) word
addresses differ by 4
– Memory[0], Memory[4], Memory[8], …
Immediates
• Immediates are numerical constants.
• They appear often in code, so there are special
instructions for them.
• Add Immediate:
addi $s0,$s1,10 (in MIPS)
f = g + 10 (in C)
where MIPS registers $s0,$s1 are associated
with C or Java variables f, g
• Syntax similar to add instruction, except that the
last argument is a number instead of a register.
Immediates and Subtraction
• There is no Subtract Immediate in MIPS: Why?
• Limit types of operations that can be done to absolute
minimum
– negative constants are less frequent
– if an operation can be decomposed into a simpler
operation, don’t include it
– addi …, -X = subi …, X => no need for subi
• addi $s0,$s1,-10 (in MIPS)
f = g - 10 (in C)
where MIPS registers $s0,$s1 are associated with
C or Java variables f, g
Register Zero
• One particular immediate, the number
zero (0), appears very often in code.
• So we define register zero ($0 or $zero)
to always have the value 0; for example:
add $s0,$s1,$zero (in MIPS)
f = g (in C)
where MIPS registers $s0,$s1 are
associated with C variables f, g
• Defined in hardware, so an instruction
add $zero,$zero,$s0
will not do anything!
Summarizing...
• In MIPS Assembly Language:
– Registers replace C variables
– One Instruction (simple operation) per line
– Simpler is Better
– Smaller is Faster
• New Instructions:
add, addi, sub # arithmetic operations
lw, sw # load, store from/to memory
• New Registers:
C or Java Variables: $s0 - $s7
Temporary Variables: $t0 - $t9
Zero: $zero
Compilation with Memory
• What offset in lw to select A[5] in C/Java?
• 4x5=20 to select A[5]: byte v. word
• Compile by hand using registers:
g = h + A[5];
where g: $s1, h: $s2, $s3:base address of A
• 1st transfer from memory to register:
lw $t0,20($s3) # $t0 gets A[5]
– Add 20 to $s3 to select A[5], put into $t0
• Next add it to h and place in g
add $s1,$s2,$t0 # $s1 = h+A[5]
MIPS Instruction Encoding
Instruction  Format  op  rs   rt   rd    shamt  funct / address
add          R       0   reg  reg  reg   0      n.a.
sub          R       0   reg  reg  reg   0      n.a.
addi         I       8   reg  reg  n.a.  n.a.   constant
lw           I       35  reg  reg  n.a.  n.a.   address
sw           I       43  reg  reg  n.a.  n.a.   address
MIPS Assembler Register Convention
Name     Number  Usage           Preserved across a call?
$zero    0       the value 0     n/a
$v0-$v1  2-3     return values   no
$a0-$a3  4-7     arguments       no
$t0-$t7  8-15    temporaries     no
$s0-$s7  16-23   saved           yes
$t8-$t9  24-25   temporaries     no
$sp      29      stack pointer   yes
$ra      31      return address  yes
• “caller saved”: registers not preserved across a call
• “callee saved”: registers preserved across a call
• On Green Card in Column #2 at bottom
Notes about Memory
• Pitfall:
• Forgetting that sequential word addresses in
machines with byte addressing do not differ
by 1.
– Many an assembly language programmer has toiled over
errors made by assuming that the address of the next
word can be found by incrementing the address in a
register by 1 instead of by the word size in bytes.
– So remember that for both lw and sw, the sum
of the base address and the offset must be a
multiple of 4 (to be word aligned)
Memory Operand Example 1
• C code:
g = h + A[8];
– g in $s1, h in $s2, base address of A in $s3
• Compiled MIPS code:
– Index 8 requires offset of 32
• 4 bytes per word
lw $t0, 32($s3) # load word (32 is the offset, $s3 is the base register)
add $s1, $s2, $t0
Memory Operand Example 2
• C code:
A[12] = h + A[8];
– Variable h in $s2, base address of A in $s3
• Compiled MIPS code:
– Index 8 requires offset of 32
lw $t0, 32($s3) # load word
add $t0, $s2, $t0
sw $t0, 48($s3) # store word
§2.4 Signed and Unsigned Numbers
Unsigned Binary Integers
• Computers store numbers as binary digits (bits)
• Given an n-bit number
x = x_{n-1} 2^{n-1} + x_{n-2} 2^{n-2} + \cdots + x_1 2^1 + x_0 2^0
• Range: 0 to +2^n – 1
• Example
0000 0000 0000 0000 0000 0000 0000 1011_2
= 0 + … + 1×2^3 + 0×2^2 + 1×2^1 + 1×2^0
= 0 + … + 8 + 0 + 2 + 1 = 11_{10}
• Using 32 bits
0 to +4,294,967,295
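The positional sum for unsigned numbers can be evaluated directly. The following Python sketch is an illustration only; `unsigned_value` is a made-up name, and Python's built-in `int(bits, 2)` gives the same result.

```python
def unsigned_value(bits):
    """Evaluate a bit string (MSB first, spaces allowed) as the sum of x_i * 2^i."""
    digits = bits.replace(" ", "")
    total = 0
    for position, bit in enumerate(reversed(digits)):
        total += int(bit) * 2 ** position
    return total


print(unsigned_value("0000 0000 0000 0000 0000 0000 0000 1011"))
print(unsigned_value("1" * 32))   # largest 32-bit unsigned value
```

The first call reproduces the slide's example (11 in decimal), and 32 one-bits give the upper end of the range, 4,294,967,295.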
Binary Numbers
• The MIPS word is 32 bits so we can
represent 2^32 different values.
• Least significant bit refers to the
rightmost bit
• Most significant bit is the leftmost bit.
• Sign and magnitude uses a separate
sign bit to distinguish positive and
negative numbers. Not used because of
difficulty with arithmetic…
Two’s Complement Representation
• Makes hardware representation simple:
• Leading zero(0) means positive, leading one
(1) means negative –called the sign bit
• All negative numbers begin with a 1.
• Has one negative number, –2,147,483,648_{10},
that does not have a corresponding positive
number.
Two’s Complement Representation
• To form the negation of a binary number
– Invert all bits to form the complement
– Add one
For example, to negate binary 28:
00011100 (binary 28)
Invert the digits (0 becomes 1, 1 becomes 0):
11100011
Then add 1:
11100011 + 1 = 11100100 (binary -28)
For more information see:
http://www.cs.cornell.edu/~tomf/notes/cps104/twoscomp.html
Two’s Complement Representation
• Going in the opposite direction, from the negative number
back to the positive binary number, uses the same steps
– Invert all bits to form the complement
– Add one
For example, to negate binary -28:
11100100 (binary -28)
Invert the digits (0 becomes 1, 1 becomes 0):
00011011
Then add 1:
00011011 + 1 = 00011100 (binary 28)
• This works because the sum of a number and its bitwise
complement is all 1s, which represents –1:
x + \bar{x} = -1, so \bar{x} + 1 = -x
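The invert-and-add-one rule is easy to try out on the 8-bit example. The Python sketch below is for illustration; `negate` and its `bits` parameter are names chosen here, and the mask keeps results inside the chosen width because Python integers are unbounded.

```python
def negate(value, bits=8):
    """Two's complement negation of a bit pattern: invert all bits, then add one."""
    mask = (1 << bits) - 1
    inverted = ~value & mask          # 0 becomes 1, 1 becomes 0
    return (inverted + 1) & mask      # add one, staying within `bits` bits


pattern_28 = 0b00011100
pattern_minus_28 = negate(pattern_28)
print(f"{pattern_minus_28:08b}")            # bit pattern for -28
print(f"{negate(pattern_minus_28):08b}")    # negating again gives 28 back
```

Applying `negate` twice returns the original pattern, matching the two worked examples.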
2’s Complement Simulator
• Try it with a simulator:
• http://scholar.hw.ac.uk/site/computing/activity12.asp?outline
2s-Complement Signed Integers Example
• Given an n-bit number represented as
x = -x_{n-1} 2^{n-1} + x_{n-2} 2^{n-2} + \cdots + x_1 2^1 + x_0 2^0
• Range: –2^{n-1} to +2^{n-1} – 1
• Example
1111 1111 1111 1111 1111 1111 1111 1100_2
= –1×2^31 + 1×2^30 + … + 1×2^2 + 0×2^1 + 0×2^0
= –2,147,483,648 + 2,147,483,644 = –4_{10}
• Using 32 bits
–2,147,483,648 to +2,147,483,647
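The signed formula differs from the unsigned one only in the weight of the top bit, which a short function makes explicit. This Python sketch is illustrative; `twos_complement_value` is a hypothetical helper name.

```python
def twos_complement_value(pattern, bits=32):
    """Interpret an unsigned bit pattern as a two's complement signed integer."""
    sign = (pattern >> (bits - 1)) & 1            # x_{n-1}, the sign bit
    rest = pattern & ((1 << (bits - 1)) - 1)      # x_{n-2} ... x_0 with positive weights
    return -sign * (1 << (bits - 1)) + rest       # the sign bit has weight -2^(n-1)


print(twos_complement_value(0xFFFFFFFC))   # the slide's example pattern
print(twos_complement_value(0x80000000))   # most negative 32-bit value
print(twos_complement_value(0x7FFFFFFF))   # most positive 32-bit value
```

The three calls give –4 and the two ends of the 32-bit range.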
2s-Complement Signed Integers
• Bit 31 is sign bit
– 1 for negative numbers
– 0 for non-negative numbers
• –(–2^{n-1}) can’t be represented
• Non-negative numbers have the same
unsigned and 2s-complement representation
• Some specific numbers:
– 0: 0000 0000 … 0000
– –1: 1111 1111 … 1111
– Most-negative: 1000 0000 … 0000
– Most-positive: 0111 1111 … 1111
More Examples
• References for Two’s Complement notation
• http://www.duke.edu/~twf/cps104/twoscomp.html
• http://en.wikipedia.org/wiki/Two's_complement
• http://mathforum.org/library/drmath/sets/select/dm_twos_complement.html
• http://www.fact-index.com/t/tw/two_s_complement.html
• http://www.hal-pc.org/~clyndes/computerarithmetic/twoscomplement.html
• http://www.vb-helper.com/tutorial_twos_complement.html
• http://web.bvu.edu/faculty/traylor/CS_Help_Stuff/Two's%20Complement.htm
Sign Extension
• Representing a number using more bits
– Preserve the numeric value
• In MIPS instruction set
– addi: extend immediate value
– lb, lh: extend loaded byte/halfword
– beq, bne: extend the displacement
• Replicate the sign bit to the left
– c.f. unsigned values: extend with 0s
• Examples: 8-bit to 16-bit
– +2: 0000 0010 => 0000 0000 0000 0010
– –2: 1111 1110 => 1111 1111 1111 1110
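Sign extension amounts to copying the sign bit into every new high-order position. A minimal Python sketch, using the hypothetical name `sign_extend`, reproduces the two 8-bit to 16-bit examples:

```python
def sign_extend(value, from_bits, to_bits):
    """Widen a two's complement bit pattern by replicating its sign bit to the left."""
    low_mask = (1 << from_bits) - 1
    value &= low_mask
    if value & (1 << (from_bits - 1)):                # sign bit set: fill upper bits with 1s
        value |= ((1 << to_bits) - 1) & ~low_mask
    return value                                      # sign bit clear: upper bits stay 0


print(f"{sign_extend(0b00000010, 8, 16):016b}")   # +2
print(f"{sign_extend(0b11111110, 8, 16):016b}")   # -2
```

For an unsigned value the upper bits would simply be left as 0s, which is what happens here whenever the sign bit is clear.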
§2.5 Representing Instructions in the Computer
Representing Instructions
• Instructions are encoded in binary
– Called machine code
• MIPS instructions
– Encoded as 32-bit instruction words
– Small number of formats encoding operation code
(opcode), register numbers, …
– Regularity!
• Register numbers (important!)
– $t0 – $t7 are registers 8 – 15
– $t8 – $t9 are registers 24 – 25
– $s0 – $s7 are registers 16 – 23
R-Format Instructions
• Define “fields” of the following number of bits
each: 6 + 5 + 5 + 5 + 5 + 6 = 32
6       5   5   5   5      6
• For simplicity, each field has a name:
opcode  rs  rt  rd  shamt  funct
R-Format Instructions
• Meaning of fields:
– rs (Source Register): generally used to specify
register containing first operand
– rt (Target Register): generally used to specify
register containing second operand (note that name
is misleading)
– rd (Destination Register): generally used to specify
register which will receive result of computation
– shamt (Shift amount)
– funct ( Function) - selects specific variant of the
opcode operation - sometimes called function code
MIPS R-Format Instructions - Summary
op      rs      rt      rd      shamt   funct
6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
• MIPS fields are given names to make them
easier to remember
• Instruction fields
– op: operation code (opcode)
– rs: first source register number
– rt: second source register number
– rd: destination register number
– shamt: shift amount (00000 for now)
– funct: function code (extends opcode)
R-format Example
add $t0, $s1, $s2
Instruction format or layout:
Field    op       rs      rt      rd      shamt   funct
Width    6 bits   5 bits  5 bits  5 bits  5 bits  6 bits
Name     special  $s1     $s2     $t0     0       add
Decimal  0        17      18      8       0       32
Binary   000000   10001   10010   01000   00000   100000
MIPS instruction:
00000010001100100100000000100000_2 = 02324020_{16}
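Packing the six R-format fields into one 32-bit word is a matter of shifting each field to its position. The Python sketch below is an illustration; `encode_r` is a made-up helper, and the field values are the ones given in the table above.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack R-format fields (6 + 5 + 5 + 5 + 5 + 6 bits) into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2: op = 0, rs = $s1 = 17, rt = $s2 = 18, rd = $t0 = 8, shamt = 0, funct = 32
word = encode_r(0, 17, 18, 8, 0, 32)
print(f"{word:032b}")
print(f"{word:08x}")
```

Note the operand order: the destination $t0 is written first in assembly but sits in the rd field, the third register field of the word.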
Hexadecimal
• Base 16
– Compact representation of bit strings
– 4 bits per hex digit
0 0000    4 0100    8 1000    c 1100
1 0001    5 0101    9 1001    d 1101
2 0010    6 0110    a 1010    e 1110
3 0011    7 0111    b 1011    f 1111
• Example: eca8 6420
1110 1100 1010 1000 0110 0100 0010 0000
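Because each hex digit stands for exactly 4 bits, the conversion can be done digit by digit. A short Python sketch, with the hypothetical name `hex_to_bit_groups`:

```python
def hex_to_bit_groups(hex_digits):
    """Expand each hex digit into its 4-bit group, separated by spaces."""
    digits = hex_digits.replace(" ", "")
    return " ".join(f"{int(digit, 16):04b}" for digit in digits)


print(hex_to_bit_groups("eca8 6420"))
```

This prints the same eight 4-bit groups as the example line above.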
Why Multiple Instruction Formats?
• Design Principle 4: Good design demands good
compromises. There is a conflict between the need to keep instructions the same
length and the desire for a single format
• There is a problem using the previous (R-format) layout when an
instruction needs longer fields
– for example lw must specify two registers and a constant,
but the constant would have only 5 bits available, so it
would be limited to 2^5 = 32 values
• Solution: allow I and J formats for different
instructions, but keep all the same length = 32 bits
Overview of MIPS
• Simple instructions all 32 bits wide
• Very structured, no unnecessary baggage
• Only three instruction formats
R  op  rs  rt  rd  shamt  funct
I  op  rs  rt  16 bit address
J  op  26 bit address
• rely on compiler to achieve performance
– what are the compiler's goals?
• help compiler where we can
Additional MIPS Instruction Formats
• I-format: used for instructions with
immediates, lw and sw (since the offset
counts as an immediate), and the branches
(beq and bne),
– (but not the shift instructions; later)
• J-format: used for j and jal (jump and link)
• R-format: used for all other instructions
• It will soon become clear why the instructions
have been partitioned in this way.
MIPS I-format Instructions
op      rs      rt      constant or address
6 bits  5 bits  5 bits  16 bits
• Format for Immediate arithmetic and
load/store instructions
– rt: destination or source register number
– Constant: –2^15 to +2^15 – 1
– Address: offset added to base address in rs
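The I-format can be packed the same way as the R-format, with the 16-bit constant or offset stored in two's complement. The Python sketch below is illustrative; `encode_i` is a made-up helper. The opcodes (35 for lw, 8 for addi) and register numbers come from the tables earlier in these slides.

```python
def encode_i(op, rs, rt, immediate):
    """Pack I-format fields (6 + 5 + 5 + 16 bits) into one 32-bit word."""
    if not -(1 << 15) <= immediate <= (1 << 15) - 1:
        raise ValueError("immediate does not fit in 16 bits")
    # A negative immediate is stored as its 16-bit two's complement pattern.
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


# lw $t0, 32($s3): op = 35, rs = $s3 = 19 (base), rt = $t0 = 8 (destination), offset = 32
print(f"{encode_i(35, 19, 8, 32):08x}")
# addi $s0, $s1, -10: op = 8, rs = $s1 = 17 (source), rt = $s0 = 16 (destination)
print(f"{encode_i(8, 17, 16, -10):08x}")
```

The range check mirrors the –2^15 to +2^15 – 1 limit on the constant; anything larger does not fit in a single I-format instruction.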
Instruction Format Names and Field Descriptions
Name                 Fields                                                Comments
Field size           6 bits   5 bits   5 bits   5 bits   5 bits   6 bits   All MIPS instructions 32 bits
Bit positions        (31-26)  (25-21)  (20-16)  (15-11)  (10-6)   (5-0)
R-format (6 fields)  op       rs       rt       rd       shamt    funct    Arithmetic instruction format
I-format (4 fields)  op       rs       rt       Address/immediate          Data transfer, branch, immediate instruction format
J-format (2 fields)  op       Target address                               Jump instruction format
Instruction field notes:
The op and funct fields form the op-code. The rs field gives a source
register and rt is also normally a source register. rd is the destination
register, and shamt supplies the shift amount for logical shift operations.
Modified by S. J. Fritz Spring 2009 (83)
R-Format Example
• MIPS instruction: add $8,$9,$10
Decimal number per field representation:
op   rs   rt   rd   shamt   funct
0    9    10   8    0       32
Binary number per field representation:
000000 01001 01010 01000 00000 100000
Hex representation: 012A 4020hex
Decimal representation: 19,546,144ten
On Green Card: Format in column 1, opcodes in column 3
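The field packing in this example can be checked with a short C sketch. It assumes only the R-format field widths given earlier (6, 5, 5, 5, 5, 6 bits); the function name encode_r_format is hypothetical.

```c
#include <stdint.h>

/* Pack the six R-format fields into one 32-bit instruction word.
 * Field widths: op 6, rs 5, rt 5, rd 5, shamt 5, funct 6 bits.
 * (Illustrative helper; the name is an assumption.) */
uint32_t encode_r_format(uint32_t op, uint32_t rs, uint32_t rt,
                         uint32_t rd, uint32_t shamt, uint32_t funct)
{
    return ((op    & 0x3Fu) << 26) |
           ((rs    & 0x1Fu) << 21) |
           ((rt    & 0x1Fu) << 16) |
           ((rd    & 0x1Fu) << 11) |
           ((shamt & 0x1Fu) <<  6) |
            (funct & 0x3Fu);
}
```

For add $8,$9,$10 the call encode_r_format(0, 9, 10, 8, 0, 32) produces 0x012A4020, matching the hex representation above.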
Green Card
• green card /n./ [after the "IBM
System/360 Reference Data" card] A
summary of an assembly language,
even if the color is not green. For
example,
"I'll go get my green card so I can check
the addressing mode for that
instruction."
www.jargon.net
Image from Dave's Green Card Collection:
http://www.planetmvs.com/greencard/
J-Format Instructions
• Define “fields” of the following number of bits each; as usual, each field has a name:
opcode   target address
6 bits   26 bits
• Key Concepts
– Keep opcode field identical to R-format and I-format for consistency.
– Combine all other fields to make room for large target address.
Translating Assembly Language into Machine Language
• Suppose $t1 has the base of array A and $s2 corresponds to h in the assignment
A[300] = h + A[300]
• In MIPS (try this):
lw  $t0, 1200($t1)   # temp register $t0 gets A[300]
add $t0, $s2, $t0    # temp register $t0 gets h + A[300]
sw  $t0, 1200($t1)   # stores h + A[300] back into A[300]
These instructions can then be represented in machine language…
Translating Assembly Language into Machine Language
lw  $t0, 1200($t1)   # temp register $t0 gets A[300]
add $t0, $s2, $t0    # temp register $t0 gets h + A[300]
sw  $t0, 1200($t1)   # stores h + A[300] back into A[300]
Instruction   op   rs   rt   rd   address/shamt   funct
lw            35   9    8         1200
add           0    18   8    8    0               32
sw            43   9    8         1200
Translating MIPS Assembly Language into
Machine Language
Instruction   op   rs   rt   rd   address/shamt   funct
lw            35   9    8         1200
add           0    18   8    8    0               32
sw            43   9    8         1200
The opcode of the lw instruction is 35, the base register is 9 ($t1), and
the destination register is 8 ($t0). The offset 1200 = 300 × 4 goes in the address field.
The add instruction is specified by 0 in the op field and 32 in the
funct field.
The opcode of the sw instruction is 43, and the rest is similar to the lw instruction.
See the summary on page 101.
Translating MIPS Assembly Language into
Machine Language
Since 1200ten = 0000 0100 1011 0000two, the binary equivalent
of the previous form is:
lw    100011   01001   01000   0000 0100 1011 0000
add   000000   10010   01000   01000   00000   100000
sw    101011   01001   01000   0000 0100 1011 0000
• Notice the similarity between the first and last instructions. The only
difference is in the third bit from the left.
• This similarity simplifies hardware design…
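The lw and sw encodings can be reproduced with a small C sketch of I-format packing. The function name encode_i_format is hypothetical; the field widths (6, 5, 5, 16 bits) are the ones given for the I-format.

```c
#include <stdint.h>

/* Pack the four I-format fields: op 6, rs 5, rt 5, immediate 16 bits.
 * (Illustrative helper; the name is an assumption.) */
uint32_t encode_i_format(uint32_t op, uint32_t rs, uint32_t rt, uint32_t imm)
{
    return ((op & 0x3Fu) << 26) |
           ((rs & 0x1Fu) << 21) |
           ((rt & 0x1Fu) << 16) |
            (imm & 0xFFFFu);
}
```

With op = 35 (lw) and op = 43 (sw), both using rs = 9, rt = 8, and offset 1200, the two words differ only in bit 29, which is the third bit from the left.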
Stored Program Computers
The BIG Picture
• Instructions represented in
binary, just like data
• Instructions and data stored
in memory
• Programs can operate on
programs
– e.g., compilers, linkers, …
• Binary compatibility allows
compiled programs to work
on different computers
– Standardized ISAs
§2.6 Logical Operations
Logical Operations
• Instructions for bitwise manipulation
Operation     C    Java   MIPS
Shift left    <<   <<     sll
Shift right   >>   >>>    srl
Bitwise AND   &    &      and, andi
Bitwise OR    |    |      or, ori
Bitwise NOT   ~    ~      nor
• Useful for extracting and inserting groups of bits in a word
Shift Operations
op       rs       rt       rd       shamt    funct
6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
• shamt: how many positions to shift
• Shift left logical
– Shift left and fill with 0 bits
– sll by i bits multiplies by 2^i
– sll $t2, $s0, 4 # reg $t2 = reg $s0 << 4 bits
• Shift right logical
– Shift right and fill with 0 bits
– srl by i bits divides by 2^i (unsigned only)
– srl $t2, $s0, 4 # reg $t2 = reg $s0 >> 4 bits
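A brief C sketch of the two shifts, assuming 32-bit unsigned operands and a shift amount between 0 and 31 (the range a 5-bit shamt field can hold). The function names are illustrative only.

```c
#include <stdint.h>

/* sll: shift left, filling with 0 bits. Equals multiplying by 2^i
 * as long as no 1 bits are shifted out. Assumes 0 <= i <= 31. */
uint32_t shift_left_logical(uint32_t x, unsigned i)
{
    return x << i;
}

/* srl: shift right, filling with 0 bits. Equals unsigned division by 2^i.
 * Assumes 0 <= i <= 31. */
uint32_t shift_right_logical(uint32_t x, unsigned i)
{
    return x >> i;
}
```

For example, shifting 9 left by 4 gives 144 (9 × 16), and shifting 144 right by 4 gives 9 again.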
AND Operations
• Useful to mask bits in a word
– Select some bits, clear others to 0
– Bit-by-bit operation: 1 if both bits are 1, 0 otherwise
and $t0, $t1, $t2 # reg $t0 = reg $t1 & reg $t2
$t2 = 0000 0000 0000 0000 0000 1101 1100 0000
$t1 = 0000 0000 0000 0000 0011 1100 0000 0000
$t0 = 0000 0000 0000 0000 0000 1100 0000 0000
OR Operations
• Useful to include bits in a word
– Set some bits to 1, leave others unchanged
– Places 1 in the result if either bit is 1, 0 otherwise
or $t0, $t1, $t2 # reg $t0 = reg $t1 | reg $t2
$t2 = 0000 0000 0000 0000 0000 1101 1100 0000
$t1 = 0000 0000 0000 0000 0011 1100 0000 0000
$t0 = 0000 0000 0000 0000 0011 1101 1100 0000
NOT Operations
• Useful to invert bits in a word
– Change 0 to 1, and 1 to 0
• For consistency, MIPS has NOR, a 3-operand
instruction, instead of NOT
– a NOR b == NOT ( a OR b )
nor $t0,$t1,$zero # reg $t0 = ~(reg $t1 | $zero)
Register 0: always read as zero
$t1 = 0000 0000 0000 0000 0011 1100 0000 0000
$t0 = 1111 1111 1111 1111 1100 0011 1111 1111
The full MIPS instruction set also includes XOR
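The AND, OR, and NOR slides can be replayed in C with the same register values. This is a sketch with illustrative names (mips_and, mips_or, mips_nor, extract_field); extract_field is an added example of the "extracting groups of bits" use mentioned earlier and is not from the slides.

```c
#include <stdint.h>

uint32_t mips_and(uint32_t a, uint32_t b) { return a & b; }     /* and */
uint32_t mips_or(uint32_t a, uint32_t b)  { return a | b; }     /* or  */
uint32_t mips_nor(uint32_t a, uint32_t b) { return ~(a | b); }  /* nor */

/* Extract `width` bits starting at bit `lo` (0 = least significant)
 * by shifting right and masking. Assumes 1 <= width <= 31. */
uint32_t extract_field(uint32_t word, unsigned lo, unsigned width)
{
    return (word >> lo) & ((1u << width) - 1u);
}
```

With $t1 = 0x00003C00 and $t2 = 0x00000DC0, the three functions reproduce the bit patterns shown on the AND, OR, and NOT slides; nor with a zero second operand acts as NOT.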
§2.7 Instructions for Making Decisions
Conditional Operations
• MIPS includes two decision-making instructions, similar to an if statement combined with a “go to”
• Branch to a labeled instruction if a condition is true
– Otherwise, continue sequentially
• beq rs, rt, L1 # branch on equal
– if (rs == rt) branch to instruction labeled L1;
• bne rs, rt, L1 # branch on not equal
– if (rs != rt) branch to instruction labeled L1;
• j L1
– unconditional jump to instruction labeled L1
Compiling C/Java if into MIPS
• Compile by hand
if (i == j) f=g+h;
else f=g-h;
• Use this mapping:
f: $s0
g: $s1
h: $s2
i: $s3
j: $s4
[Flowchart: test i == j? If true, f = g + h; if false (i != j), f = g - h; both paths lead to Exit]
Compiling If Statements
• C code:
if (i==j) f = g+h;
else f = g-h;
where f, g, … in $s0, $s1, …
• Compiled MIPS code:
      bne $s3, $s4, Else   # goto Else if i ≠ j
      add $s0, $s1, $s2    # f = g + h (skipped if i ≠ j)
      j   Exit             # goto Exit
Else: sub $s0, $s1, $s2    # f = g - h
Exit: …
Assembler calculates addresses
Compiling Loop Statements
• Code for loops is similar to that for decisions
• C code:
while (save[i] == k) i += 1;
where i is in $s3, k in $s5, address of the array save is in $s6
– First load save[i] into a temporary register. To do so, we need to form
the address by multiplying the index by 4.
– Then add $t1 to the base of save in $s6.
• Compiled MIPS code:
Loop: sll  $t1, $s3, 2      # temp reg $t1 = i * 4
      add  $t1, $t1, $s6    # $t1 = address of save[i]
      lw   $t0, 0($t1)      # temp reg $t0 = save[i]
      bne  $t0, $s5, Exit   # goto Exit if save[i] ≠ k
      addi $s3, $s3, 1      # i = i + 1
      j    Loop             # goto Loop
Exit: …
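To see how the compiled loop maps back to C, here is the same while loop rewritten with explicit gotos, one statement per branch or jump. This is an illustrative sketch; the function name scan_while_equal is an assumption, and like the original loop it relies on the caller to guarantee that some element differs from k.

```c
/* while (save[i] == k) i += 1; written branch-by-branch.
 * Returns the final value of i. */
int scan_while_equal(const int save[], int i, int k)
{
loop:
    if (save[i] != k) goto done;  /* bne $t0, $s5, Exit */
    i += 1;                       /* addi $s3, $s3, 1   */
    goto loop;                    /* j Loop             */
done:
    return i;
}
```

The address arithmetic (sll, add, lw) is hidden inside the C expression save[i]; only the control flow is spelled out.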
Basic Blocks
• A basic block is a sequence of
instructions with
– No embedded branches (except at end)
– No branch targets (except at beginning)
• A compiler identifies basic
blocks for optimization
• An advanced processor
can accelerate execution
of basic blocks
More Conditional Operations
• Test for equality or inequality
• Set result to 1 if a condition is true
– Otherwise, set to 0
• slt rd, rs, rt # set on less than
– if (rs < rt) rd = 1; else rd = 0;
• slti rt, rs, constant # set on less than immediate
– if (rs < constant) rt = 1; else rt = 0;
• Use in combination with beq, bne
slt $t0, $s1, $s2   # if ($s1 < $s2)
bne $t0, $zero, L   # branch to L
Branch Instruction Design
• MIPS does not include a branch on less than
instruction.
• Why not blt, bge, etc?
• Follows von Neumann’s warning to keep the equipment simple
• Hardware for <, ≥, … slower than =, ≠
– Combining with branch involves more work per instruction, requiring a slower clock
– All instructions penalized!
• beq and bne are the common case
• This is a good design compromise: use only slt, slti, beq, bne, and $zero to build all relative conditions.
Signed vs. Unsigned
• Signed comparison: slt, slti
• Unsigned comparison: sltu, sltiu
• Example
– $s0 = 1111 1111 1111 1111 1111 1111 1111 1111
– $s1 = 0000 0000 0000 0000 0000 0000 0000 0001
– slt $t0, $s0, $s1 # signed
  –1 < +1 → $t0 = 1
– sltu $t0, $s0, $s1 # unsigned
  +4,294,967,295 > +1 → $t0 = 0
– The value in $s0 is –1 if it is read as a signed integer and
4,294,967,295 if it is read as an unsigned integer.
– Register $s1 contains 1 in either case.
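The same example can be tried in C, where the choice between slt and sltu corresponds to comparing the bits as int32_t or as uint32_t. This is a sketch with hypothetical function names; the cast to int32_t assumes the usual two's-complement behavior.

```c
#include <stdint.h>

/* slt: compare the two bit patterns as signed two's-complement values.
 * The uint32_t -> int32_t cast assumes a two's-complement platform. */
int set_less_than(uint32_t rs, uint32_t rt)
{
    return (int32_t)rs < (int32_t)rt;
}

/* sltu: compare the same bit patterns as unsigned values. */
int set_less_than_unsigned(uint32_t rs, uint32_t rt)
{
    return rs < rt;
}
```

With rs = 0xFFFFFFFF and rt = 1, the signed version returns 1 and the unsigned version returns 0, as in the example.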
Signed vs. Unsigned
• Treating signed numbers as if they were unsigned gives us a low-cost way to check if 0 ≤ x < y
• This can also be used for a bounds check for an array.
• An unsigned comparison of x < y also checks whether x is negative, as well as whether x is less
than y: a negative x looks like a very large unsigned number.
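A minimal C sketch of this bounds-check trick, assuming the array length is non-negative. The function name index_in_bounds is illustrative.

```c
#include <stdint.h>

/* One unsigned comparison replaces the pair (index >= 0 && index < length):
 * a negative index reinterpreted as unsigned becomes a very large number,
 * so it fails the test. Assumes length >= 0. */
int index_in_bounds(int32_t index, int32_t length)
{
    return (uint32_t)index < (uint32_t)length;
}
```

For a 10-element array, indices 0 through 9 pass, while both 10 and -1 are rejected by the single comparison.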
Case/Switch Statement
• Simplest implementation of switch is with a
sequence of if-then-else statements
• An alternative is a jump address table, or jump table
– Program indexes into the table and then jumps to the appropriate sequence
– Jump table is an array of addresses corresponding to the labels in the code
– Program loads the entry into a register and then jumps to the address in the register
– MIPS includes a jump register instruction (jr) for this purpose
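In C, a jump table can be approximated with an array of function pointers. The sketch below is one way to express the idea; the handler functions and the name dispatch are hypothetical, and a real compiler would typically jump to labels inside one procedure rather than call separate functions.

```c
/* Hypothetical handlers for the cases of a switch statement. */
static int case_zero(int x) { return x + 1; }
static int case_one(int x)  { return x * 2; }
static int case_two(int x)  { return x - 3; }

/* The jump table: an array of code addresses indexed by the case value. */
static int (*const jump_table[])(int) = { case_zero, case_one, case_two };

/* Roughly equivalent to switch (k) { case 0: ... case 1: ... case 2: ... }.
 * Returns x unchanged when k is outside the table (the default case). */
int dispatch(int k, int x)
{
    if (k < 0 || k >= (int)(sizeof jump_table / sizeof jump_table[0]))
        return x;
    return jump_table[k](x);   /* load the entry, then jump to it (like jr) */
}
```

The range test before indexing plays the role of the bounds check that guards the table lookup.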
“And in Conclusion…”
• Memory is byte-addressable, but lw and sw access
one word at a time.
• A pointer (used by lw and sw) is just a memory
address, so we can add to it or subtract from it
(using offset).
• A Decision allows us to decide what to execute at
run-time rather than compile-time.
• C/Java decisions are made using conditional
statements within if, while, do while, for.
• MIPS Decision making instructions are the
conditional branches: beq and bne.
• New Instructions:
lw, sw, beq, bne, j
§2.8 Supporting Procedures in Computer Hardware
Procedure Calling
• Steps required
1. Place parameters in registers
2. Transfer control to procedure
3. Acquire storage for procedure
4. Perform procedure’s operations
5. Place result in register for caller
6. Return to place of call
Register Usage
• $a0 – $a3: arguments (registers 4 – 7)
• $v0, $v1: result values (registers 2 and 3)
• $t0 – $t9: temporaries
– Can be overwritten by callee
• $s0 – $s7: saved
– Must be saved/restored by callee
• $gp: global pointer for static data (reg 28)
• $sp: stack pointer (reg 29)
• $fp: frame pointer (reg 30)
• $ra: return address (reg 31)
Procedure Call Instructions
• Procedure call: jump and link (jal)
jal ProcedureLabel
– Address of following instruction put in $ra
– Jumps to target address
• Procedure return: jump register (jr)
jr $ra
– Copies $ra to program counter
– Can also be used for computed jumps
• e.g., for case/switch statements
Procedure Call Instructions
• The “link” is the return address: a link back to the calling site that lets the
procedure return to the proper address. It is stored in $ra.
• The calling program (the caller) puts the parameter values in registers $a0-$a3 and
uses jal X to jump to procedure X (the callee).
• The callee performs the calculations and then returns to the caller using jr $ra.
• The address of the current instruction is held in the program counter (PC); jal saves
PC + 4, the address of the following instruction, in $ra.
Using More Registers
• Compilers often need more registers than are available, so they spill registers to memory
• Stack: a last-in, first-out (LIFO) data structure
• Stack pointer: adjusted by one word for each register saved or restored
• MIPS reserves a register for the stack pointer, $sp
• Stack “grows” from higher to lower addresses
• Push: places data on the stack (subtracts from the stack pointer)
• Pop: removes data from the stack (adds to the stack pointer)
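The push and pop rules can be modeled with a small simulated stack in C. This is a sketch under stated assumptions: the Stack type, its 16-word size, and the function names are invented for illustration, sp is kept as a byte offset into the simulated memory, and no overflow or underflow checking is done.

```c
#include <stdint.h>

#define STACK_WORDS 16

/* Simulated stack memory; sp is a byte offset, like the value in $sp. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    uint32_t sp;
} Stack;

/* Start with sp just past the highest word: the stack is empty. */
void stack_init(Stack *s) { s->sp = STACK_WORDS * 4; }

/* Push: subtract one word (4 bytes) from sp, then store (addi + sw). */
void push(Stack *s, uint32_t value)
{
    s->sp -= 4;
    s->mem[s->sp / 4] = value;
}

/* Pop: load the word at sp, then add one word to sp (lw + addi). */
uint32_t pop(Stack *s)
{
    uint32_t value = s->mem[s->sp / 4];
    s->sp += 4;
    return value;
}
```

After two pushes sp has dropped by 8 bytes, and the values come back in last-in, first-out order.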
Leaf Procedure Example
• C code:
int leaf_example (int g, int h, int i, int j)
{ int f;
f = (g + h) - (i + j);
return f;
}
– Arguments g, …, j in $a0, …, $a3
– f in $s0 (hence, need to save $s0 on stack)
– Result in $v0
Leaf Procedure Example
• MIPS code:
leaf_example:
addi $sp, $sp, -4     # Save $s0 on stack
sw   $s0, 0($sp)
add  $t0, $a0, $a1    # Procedure body
add  $t1, $a2, $a3
sub  $s0, $t0, $t1
add  $v0, $s0, $zero  # Result
lw   $s0, 0($sp)      # Restore $s0
addi $sp, $sp, 4
jr   $ra              # Return
Non-Leaf Procedures
• Procedures that call other procedures
• For nested call, caller needs to save on
the stack:
– Its return address
– Any arguments and temporaries needed
after the call
• Restore from the stack after the call
Non-Leaf Procedure Example
• C code:
int fact (int n)
{
if (n < 1) return 1;
else return n * fact(n - 1);
}
– Argument n in $a0
– Result in $v0
Non-Leaf Procedure Example
• MIPS code:
fact:
    addi $sp, $sp, -8     # adjust stack for 2 items
    sw   $ra, 4($sp)      # save return address
    sw   $a0, 0($sp)      # save argument
    slti $t0, $a0, 1      # test for n < 1
    beq  $t0, $zero, L1
    addi $v0, $zero, 1    # if so, result is 1
    addi $sp, $sp, 8      # pop 2 items from stack
    jr   $ra              # and return
L1: addi $a0, $a0, -1     # else decrement n
    jal  fact             # recursive call
    lw   $a0, 0($sp)      # restore original n
    lw   $ra, 4($sp)      # and return address
    addi $sp, $sp, 8      # pop 2 items from stack
    mul  $v0, $a0, $v0    # multiply to get result
    jr   $ra              # and return
Local Data on the Stack
• Local data allocated by callee
– e.g., C automatic variables
• Procedure frame (activation record)
– Used by some compilers to manage stack storage
Memory Layout
• Text: program code
• Static data: global
variables
– e.g., static variables in C,
constant arrays and strings
– $gp initialized to address
allowing ±offsets into this
segment
• Dynamic data: heap
– E.g., malloc in C, new in
Java
• Stack: automatic storage
§2.9 Communicating with People
Character Data
• Byte-encoded character sets
– ASCII: 128 characters
• 95 graphic, 33 control
– Latin-1: 256 characters
• ASCII, +96 more graphic characters
• Unicode: 32-bit character set
– Used in Java, C++ wide characters, …
– Most of the world’s alphabets, plus symbols
– UTF-8, UTF-16: variable-length encodings
Byte/Halfword Operations
• Could use bitwise operations
• MIPS byte/halfword load/store
– String processing is a common case
lb rt, offset(rs)
lh rt, offset(rs)
– Sign extend to 32 bits in rt
lbu rt, offset(rs)
lhu rt, offset(rs)
– Zero extend to 32 bits in rt
sb rt, offset(rs)
sh rt, offset(rs)
– Store just rightmost byte/halfword
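The difference between sign extension (lb) and zero extension (lbu) can be sketched in C over a simulated byte-addressed memory. The function names and the mem array are assumptions made for this example.

```c
#include <stdint.h>

/* lbu: load a byte and zero-extend it to 32 bits. */
uint32_t load_byte_unsigned(const uint8_t *mem, uint32_t addr)
{
    return mem[addr];
}

/* lb: load a byte and sign-extend it, copying bit 7 into bits 8..31. */
int32_t load_byte(const uint8_t *mem, uint32_t addr)
{
    int32_t b = mem[addr];
    return (b ^ 0x80) - 0x80;   /* portable sign extension of an 8-bit value */
}

/* sb: store just the rightmost byte of the register value. */
void store_byte(uint8_t *mem, uint32_t addr, uint32_t value)
{
    mem[addr] = (uint8_t)(value & 0xFFu);
}
```

A byte holding 0xF0 reads back as 240 with the unsigned load and as -16 with the signed load; a byte below 0x80, such as 0x41, reads the same either way.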
String Copy Example
• C code (naïve):
– Null-terminated string
void strcpy (char x[], char y[])
{ int i;
i = 0;
while ((x[i]=y[i])!='\0')
i += 1;
}
– Addresses of x, y in $a0, $a1
– i in $s0
String Copy Example
• MIPS code:
strcpy:
    addi $sp, $sp, -4      # adjust stack for 1 item
    sw   $s0, 0($sp)       # save $s0
    add  $s0, $zero, $zero # i = 0
L1: add  $t1, $s0, $a1     # addr of y[i] in $t1
    lbu  $t2, 0($t1)       # $t2 = y[i]
    add  $t3, $s0, $a0     # addr of x[i] in $t3
    sb   $t2, 0($t3)       # x[i] = y[i]
    beq  $t2, $zero, L2    # exit loop if y[i] == 0
    addi $s0, $s0, 1       # i = i + 1
    j    L1                # next iteration of loop
L2: lw   $s0, 0($sp)       # restore saved $s0
    addi $sp, $sp, 4       # pop 1 item from stack
    jr   $ra               # and return
§2.10 MIPS Addressing for 32-Bit Immediates and Addresses
32-bit Constants
• Most constants are small
– 16-bit immediate is sufficient
• For the occasional 32-bit constant
lui rt, constant
– Copies 16-bit constant to left 16 bits of rt
– Clears right 16 bits of rt to 0
lui $s0, 61          0000 0000 0011 1101 0000 0000 0000 0000
ori $s0, $s0, 2304   0000 0000 0011 1101 0000 1001 0000 0000
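The two-instruction sequence for building a 32-bit constant can be mirrored in C. This is a sketch; the function names are illustrative, and load_constant32 simply splits a value into the upper and lower 16-bit halves that lui and ori would carry.

```c
#include <stdint.h>

/* lui: place a 16-bit constant in the upper half, clear the lower half. */
uint32_t load_upper_immediate(uint32_t imm16)
{
    return (imm16 & 0xFFFFu) << 16;
}

/* ori: OR a zero-extended 16-bit constant into the register value. */
uint32_t or_immediate(uint32_t reg, uint32_t imm16)
{
    return reg | (imm16 & 0xFFFFu);
}

/* Build any 32-bit constant from its two halves, as lui followed by ori. */
uint32_t load_constant32(uint32_t value)
{
    uint32_t reg = load_upper_immediate(value >> 16);
    return or_immediate(reg, value & 0xFFFFu);
}
```

With the constants 61 and 2304 from the slide, the result is 61 × 65,536 + 2,304 = 4,000,000.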
Branch Addressing
• Branch instructions specify
– Opcode, two registers, target address
• Most branch targets are near branch
– Forward or backward
op       rs       rt       constant or address
6 bits   5 bits   5 bits   16 bits
• PC-relative addressing
– Target address = PC + offset × 4
– PC already incremented by 4 by this time
Jump Addressing
• Jump (j and jal) targets could be
anywhere in text segment
– Encode full address in instruction
op       address
6 bits   26 bits
• (Pseudo)Direct jump addressing
– Target address = PC[31…28] : (address × 4)
Target Addressing Example
• Loop code from earlier example
– Assume Loop at location 80000
                              Address   Fields
Loop: sll  $t1, $s3, 2        80000     0    0    19   9    2    0
      add  $t1, $t1, $s6      80004     0    9    22   9    0    32
      lw   $t0, 0($t1)        80008     35   9    8    0
      bne  $t0, $s5, Exit     80012     5    8    21   2
      addi $s3, $s3, 1        80016     8    19   19   1
      j    Loop               80020     2    20000
Exit: …                       80024
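The branch and jump fields in this example can be checked against the two addressing rules with a short C sketch. The function names are hypothetical; pc is the address of the branch or jump instruction itself, so the code adds 4 first to model the already-incremented PC.

```c
#include <stdint.h>

/* PC-relative addressing: target = (PC + 4) + offset * 4,
 * where offset is the signed 16-bit field of the branch. */
uint32_t branch_target(uint32_t pc, int16_t offset)
{
    return pc + 4u + (uint32_t)((int32_t)offset * 4);
}

/* Pseudodirect addressing: upper 4 bits of (PC + 4) joined with
 * the 26-bit address field multiplied by 4. */
uint32_t jump_target(uint32_t pc, uint32_t address26)
{
    return ((pc + 4u) & 0xF0000000u) | ((address26 & 0x03FFFFFFu) << 2);
}
```

The bne at 80012 with offset 2 gives 80016 + 8 = 80024 (Exit), and the j at 80020 with address field 20000 gives 20000 × 4 = 80000 (Loop).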
Branching Far Away
• If branch target is too far to encode with
16-bit offset, assembler rewrites the
code
• Example
beq $s0,$s1, L1
↓
bne $s0,$s1, L2
j L1
L2: …
Addressing Mode Summary
§2.11 Parallelism and Instructions: Synchronization
Synchronization
• Two processors sharing an area of memory
– P1 writes, then P2 reads
– Data race if P1 and P2 don’t synchronize
• Result depends on order of accesses
• Hardware support required
– Atomic read/write memory operation
– No other access to the location allowed between the read and write
• Could be a single instruction
– E.g., atomic swap of register ↔ memory
– Or an atomic pair of instructions
Synchronization in MIPS
• Load linked: ll rt, offset(rs)
• Store conditional: sc rt, offset(rs)
– Succeeds if location not changed since the ll
• Returns 1 in rt
– Fails if location is changed
• Returns 0 in rt
• Example: atomic swap (to test/set lock variable)
try: add $t0,$zero,$s4   ;copy exchange value
     ll  $t1,0($s1)      ;load linked
     sc  $t0,0($s1)      ;store conditional
     beq $t0,$zero,try   ;branch store fails
     add $s4,$zero,$t1   ;put load value in $s4
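In portable C, the same atomic swap is usually written with the C11 <stdatomic.h> library rather than with ll/sc directly. The sketch below swaps in atomic_exchange as a stand-in; how a compiler lowers it (for instance to an ll/sc retry loop on MIPS) is up to the implementation, and the function names atomic_swap and try_acquire are illustrative.

```c
#include <stdatomic.h>

/* Atomically store new_value into *lock and return the previous contents,
 * which is the net effect of the ll/sc retry loop above. */
int atomic_swap(atomic_int *lock, int new_value)
{
    return atomic_exchange(lock, new_value);
}

/* Test-and-set style lock attempt: returns 1 if the lock was free (0)
 * and is now held by the caller, 0 if someone else already held it. */
int try_acquire(atomic_int *lock)
{
    return atomic_swap(lock, 1) == 0;
}
```

Releasing the lock is then a swap (or plain atomic store) of 0 back into the lock variable.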
§2.12 Translating and Starting a Program
Translation and Startup
• Many compilers produce object modules directly
• Static linking
Assembler Pseudoinstructions
• Most assembler instructions represent machine instructions one-to-one
• Pseudoinstructions: figments of the assembler’s imagination
move $t0, $t1     → add $t0, $zero, $t1
blt $t0, $t1, L   → slt $at, $t0, $t1
                    bne $at, $zero, L
– $at (register 1): assembler temporary
Producing an Object Module
• Assembler (or compiler) translates program
into machine instructions
• Provides information for building a complete
program from the pieces
– Header: describes contents of object module
– Text segment: translated instructions
– Static data segment: data allocated for the life of
the program
– Relocation info: for contents that depend on
absolute location of loaded program
– Symbol table: global definitions and external refs
– Debug info: for associating with source code
Linking Object Modules
• Produces an executable image
1. Merges segments
2. Resolves labels (determines their addresses)
3. Patches location-dependent and external refs
• Could leave location dependencies for fixing
by a relocating loader
– But with virtual memory, no need to do this
– Program can be loaded into absolute location in
virtual memory space
Loading a Program
• Load from image file on disk into memory
1. Read header to determine segment sizes
2. Create virtual address space
3. Copy text and initialized data into memory
• Or set page table entries so they can be faulted in
4. Set up arguments on stack
5. Initialize registers (including $sp, $fp, $gp)
6. Jump to startup routine
• Copies arguments to $a0, … and calls main
• When main returns, do exit syscall
Dynamic Linking
• Only link/load library procedure when it
is called
– Requires procedure code to be relocatable
– Avoids image bloat caused by static linking
of all (transitively) referenced libraries
– Automatically picks up new library versions
Lazy Linkage
[Figure labels: indirection table; stub (loads routine ID, jumps to linker/loader); linker/loader code; dynamically mapped code]
Starting Java Applications
• Java bytecode: simple portable instruction set for the JVM
• The JVM interprets bytecodes
• Bytecodes of “hot” methods are compiled into native code for the host machine
§2.13 A C Sort Example to Put It All Together
C Sort Example
• Illustrates use of assembly instructions for a C bubble sort function
• Swap procedure (leaf)
void swap(int v[], int k)
{
int temp;
temp = v[k];
v[k] = v[k+1];
v[k+1] = temp;
}
– v in $a0, k in $a1, temp in $t0
The Procedure Swap
swap: sll $t1, $a1, 2     # $t1 = k * 4
      add $t1, $a0, $t1   # $t1 = v+(k*4) (address of v[k])
      lw  $t0, 0($t1)     # $t0 (temp) = v[k]
      lw  $t2, 4($t1)     # $t2 = v[k+1]
      sw  $t2, 0($t1)     # v[k] = $t2 (v[k+1])
      sw  $t0, 4($t1)     # v[k+1] = $t0 (temp)
      jr  $ra             # return to calling routine
The Sort Procedure in C
• Non-leaf (calls swap)
void sort (int v[], int n)
{
int i, j;
for (i = 0; i < n; i += 1) {
for (j = i - 1;
j >= 0 && v[j] > v[j + 1];
j -= 1) {
swap(v,j);
}
}
}
– v in $a0, n in $a1, i in $s0, j in $s1
The Procedure Body
# Move params
         move $s2, $a0            # save $a0 into $s2
         move $s3, $a1            # save $a1 into $s3
# Outer loop
         move $s0, $zero          # i = 0
for1tst: slt  $t0, $s0, $s3       # $t0 = 0 if $s0 ≥ $s3 (i ≥ n)
         beq  $t0, $zero, exit1   # go to exit1 if $s0 ≥ $s3 (i ≥ n)
# Inner loop
         addi $s1, $s0, -1        # j = i - 1
for2tst: slti $t0, $s1, 0         # $t0 = 1 if $s1 < 0 (j < 0)
         bne  $t0, $zero, exit2   # go to exit2 if $s1 < 0 (j < 0)
         sll  $t1, $s1, 2         # $t1 = j * 4
         add  $t2, $s2, $t1       # $t2 = v + (j * 4)
         lw   $t3, 0($t2)         # $t3 = v[j]
         lw   $t4, 4($t2)         # $t4 = v[j + 1]
         slt  $t0, $t4, $t3       # $t0 = 0 if $t4 ≥ $t3
         beq  $t0, $zero, exit2   # go to exit2 if $t4 ≥ $t3
# Pass params & call
         move $a0, $s2            # 1st param of swap is v (old $a0)
         move $a1, $s1            # 2nd param of swap is j
         jal  swap                # call swap procedure
# Inner loop
         addi $s1, $s1, -1        # j -= 1
         j    for2tst             # jump to test of inner loop
# Outer loop
exit2:   addi $s0, $s0, 1         # i += 1
         j    for1tst             # jump to test of outer loop
The Full Procedure
sort:  addi $sp, $sp, -20   # make room on stack for 5 registers
       sw   $ra, 16($sp)    # save $ra on stack
       sw   $s3, 12($sp)    # save $s3 on stack
       sw   $s2, 8($sp)     # save $s2 on stack
       sw   $s1, 4($sp)     # save $s1 on stack
       sw   $s0, 0($sp)     # save $s0 on stack
       …                    # procedure body
       …
exit1: lw   $s0, 0($sp)     # restore $s0 from stack
       lw   $s1, 4($sp)     # restore $s1 from stack
       lw   $s2, 8($sp)     # restore $s2 from stack
       lw   $s3, 12($sp)    # restore $s3 from stack
       lw   $ra, 16($sp)    # restore $ra from stack
       addi $sp, $sp, 20    # restore stack pointer
       jr   $ra             # return to calling routine
Effect of Compiler Optimization
Compiled with gcc for Pentium 4 under Linux
[Figure: four bar charts comparing optimization levels none, O1, O2, and O3: Relative Performance, Clock Cycles, Instruction count, and CPI]
Effect of Language and Algorithm
[Figure: three bar charts comparing C/none, C/O1, C/O2, C/O3, Java/int, and Java/JIT: Bubblesort Relative Performance, Quicksort Relative Performance, and Quicksort vs. Bubblesort Speedup]
Lessons Learned
• Instruction count and CPI are not good
performance indicators in isolation
• Compiler optimizations are sensitive to
the algorithm
• Java/JIT compiled code is significantly
faster than JVM interpreted
– Comparable to optimized C in some cases
• Nothing can fix a dumb algorithm!
§2.14 Arrays versus Pointers
Arrays vs. Pointers
• Array indexing involves
– Multiplying index by element size
– Adding to array base address
• Pointers correspond directly to memory addresses
– Can avoid indexing complexity
Example: Clearing an Array
• Array version:
void clear1(int array[], int size) {
int i;
for (i = 0; i < size; i += 1)
array[i] = 0;
}
       move $t0,$zero        # i = 0
loop1: sll  $t1,$t0,2        # $t1 = i * 4
       add  $t2,$a0,$t1      # $t2 = &array[i]
       sw   $zero, 0($t2)    # array[i] = 0
       addi $t0,$t0,1        # i = i + 1
       slt  $t3,$t0,$a1      # $t3 = (i < size)
       bne  $t3,$zero,loop1  # if (…) goto loop1
• Pointer version:
void clear2(int *array, int size) {
int *p;
for (p = &array[0]; p < &array[size];
p = p + 1)
*p = 0;
}
       move $t0,$a0          # p = &array[0]
       sll  $t1,$a1,2        # $t1 = size * 4
       add  $t2,$a0,$t1      # $t2 = &array[size]
loop2: sw   $zero,0($t0)     # Memory[p] = 0
       addi $t0,$t0,4        # p = p + 4
       slt  $t3,$t0,$t2      # $t3 = (p < &array[size])
       bne  $t3,$zero,loop2  # if (…) goto loop2
Comparison of Array vs. Ptr
• Multiply “strength reduced” to shift
• Array version requires shift to be inside
loop
– Part of index calculation for incremented i
– c.f. incrementing pointer
• Compiler can achieve same effect as
manual use of pointers
– Induction variable elimination
– Better to make program clearer and safer
§2.16 Real Stuff: ARM Instructions
ARM & MIPS Similarities
• ARM: the most popular embedded core
• Similar basic set of instructions to MIPS
                        ARM             MIPS
Date announced          1985            1985
Instruction size        32 bits         32 bits
Address space           32-bit flat     32-bit flat
Data alignment          Aligned         Aligned
Data addressing modes   9               3
Registers               15 × 32-bit     31 × 32-bit
Input/output            Memory mapped   Memory mapped
Compare and Branch in ARM
• Uses condition codes for result of an
arithmetic/logical instruction
– Negative, zero, carry, overflow
– Compare instructions to set condition
codes without keeping the result
• Each instruction can be conditional
– Top 4 bits of instruction word: condition
value
– Can avoid branches over single
instructions
Instruction Encoding
§2.17 Real Stuff: x86 Instructions
The Intel x86 ISA
• Evolution with backward compatibility
– 8080 (1974): 8-bit microprocessor
• Accumulator, plus 3 index-register pairs
– 8086 (1978): 16-bit extension to 8080
• Complex instruction set (CISC)
– 8087 (1980): floating-point coprocessor
• Adds FP instructions and register stack
– 80286 (1982): 24-bit addresses, MMU
• Segmented memory mapping and protection
– 80386 (1985): 32-bit extension (now IA-32)
• Additional addressing modes and operations
• Paged memory mapping as well as segments
The Intel x86 ISA
• Further evolution…
– i486 (1989): pipelined, on-chip caches and FPU
• Compatible competitors: AMD, Cyrix, …
– Pentium (1993): superscalar, 64-bit datapath
• Later versions added MMX (Multi-Media eXtension) instructions
• The infamous FDIV bug
– Pentium Pro (1995), Pentium II (1997)
• New microarchitecture (see Colwell, The Pentium Chronicles)
– Pentium III (1999)
• Added SSE (Streaming SIMD Extensions) and associated
registers
– Pentium 4 (2001)
• New microarchitecture
• Added SSE2 instructions
The Intel x86 ISA
• And further…
– AMD64 (2003): extended architecture to 64 bits
– EM64T – Extended Memory 64 Technology (2004)
• AMD64 adopted by Intel (with refinements)
• Added SSE3 instructions
– Intel Core (2006)
• Added SSE4 instructions, virtual machine support
– AMD64 (announced 2007): SSE5 instructions
• Intel declined to follow, instead…
– Advanced Vector Extension (announced 2008)
• Longer SSE registers, more instructions
• If Intel didn’t extend with compatibility, its competitors
would!
– Technical elegance ≠ market success
Basic x86 Registers
Basic x86 Addressing Modes
• Two operands per instruction
Source/dest operand   Second source operand
Register              Register
Register              Immediate
Register              Memory
Memory                Register
Memory                Immediate
• Memory addressing modes
– Address in register
– Address = Rbase + displacement
– Address = Rbase + 2^scale × Rindex (scale = 0, 1, 2, or 3)
– Address = Rbase + 2^scale × Rindex + displacement
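The most general of these addressing modes can be written as one C expression; the simpler modes fall out by passing zero for the unused parts. This is a sketch with an illustrative function name, using 32-bit wraparound arithmetic and assuming scale is 0, 1, 2, or 3.

```c
#include <stdint.h>

/* Address = Rbase + 2^scale * Rindex + displacement.
 * scale must be 0, 1, 2, or 3; pass index = 0 or displacement = 0
 * to get the simpler modes. Arithmetic wraps modulo 2^32. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           unsigned scale, int32_t displacement)
{
    return base + (index << scale) + (uint32_t)displacement;
}
```

For example, with base 1000, index 3, scale 2 (elements of 4 bytes), and displacement 8, the address is 1000 + 12 + 8 = 1020.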
x86 Instruction Encoding
• Variable length encoding
– Postfix bytes specify addressing mode
– Prefix bytes modify operation
• Operand length, repetition, locking, …
Implementing IA-32
• Complex instruction set makes
implementation difficult
– Hardware translates instructions to simpler
microoperations
• Simple instructions: 1–1
• Complex instructions: 1–many
– Microengine similar to RISC
– Market share makes this economically viable
• Comparable performance to RISC
– Compilers avoid complex instructions
§2.18 Fallacies and Pitfalls
Fallacies
• Powerful instruction ⇒ higher performance
– Fewer instructions required
– But complex instructions are hard to implement
• May slow down all instructions, including simple ones
– Compilers are good at making fast code from simple instructions
• Use assembly code for high performance
– But modern compilers are better at dealing with modern processors
– More lines of code ⇒ more errors and less productivity
Fallacies
• Backward compatibility ⇒ instruction set doesn’t change
– But they do acquire more instructions
[Figure: x86 instruction set]
Pitfalls
• Sequential words are not at sequential
addresses
– Increment by 4, not by 1!
• Keeping a pointer to an automatic
variable after procedure returns
– e.g., passing pointer back via an argument
– Pointer becomes invalid when stack
popped
§2.19 Concluding Remarks
Concluding Remarks
• Design principles
1. Simplicity favors regularity
2. Smaller is faster
3. Make the common case fast
4. Good design demands good compromises
• Layers of software/hardware
– Compiler, assembler, hardware
• MIPS: typical of RISC ISAs
– cf. x86
Concluding Remarks
• Measure MIPS instruction executions in benchmark programs
– Consider making the common case fast
– Consider compromises
Instruction class   MIPS examples                        SPEC2006 Int   SPEC2006 FP
Arithmetic          add, sub, addi                       16%            48%
Data transfer       lw, sw, lb, lbu, lh, lhu, sb, lui    35%            36%
Logical             and, or, nor, andi, ori, sll, srl    12%            4%
Cond. Branch        beq, bne, slt, slti, sltiu           34%            8%
Jump                j, jr, jal                           2%             0%