Download Slide 1

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
COM181
Computer Hardware
Lecture 4: The MIPs CPU
"Adapted from Computer Organization and Design, 4th Edition, Patterson & Hennessy, ©
2008.” This material may not be copied or distributed for commercial purposes without
express written permission of the copyright holders.
Also drawn from the work of Mary Jane Irwin ( www.cse.psu.edu/~mji )
Ian McCrum
Room 5B18, Tel: 90 366364 voice mail on 6th ring
Email: [email protected] Web site: http://www.eej.ulst.ac.uk
08/08/12
www.eej.ulster.ac.uk/~ian/modules/COM181/files
1
The MIPs CPU is described in the textbook, note how the diagram below relates to lecture 3
PCSrc
1
ID/EX
0
EX/MEM
Control
IF/ID
Add
Shift
left 2
4
PC
Instruction
Memory
Read
Address
Add
Read Addr 1
Data
Memory
Register Read
Read Addr 2Data 1
File
Write Addr
Write Data
16
Sign
Extend
MEM/WB
Branch
ALU
Read
Data 2
1
Address
Read
Data
Write Data
0
32
1
0
ALU
cntrl
EX/MEM.RegisterRd
0
1
IF/ID.RegisterRs
IF/ID.RegisterRt
Forward
Unit
25/05/12
MEM/WB.RegisterRd
2
www.eej.ulster.ac.uk/~ian/modules/COM181/fi
Often you must deal with the management of complexity, and use abstraction and partition to
reduce systems to sizes that the human brain can cope with...
When programming a MIPs CPU it is enough to maintain a “Programmer’s Model” of the CPU.
The CPU is designed as a RISC (Reduced Instruction Set Computer) machine,
this suits implementing the hardware but not necessarily suits humans
programming it!
Software tools help
A RISC machine has a fixed length instruction (32 bits in the simple MIPs)
A RISC machine has a limited number of addressing modes
A RISC machine has a limited number of operations (small instrucution set)
A RISC machine has, typically, a register bank and uses load/store instructions
(only) to access main memory.
25/05/12
www.eej.ulster.ac.uk/~ian/modules/COM181/fi
les
3
MIPs machines have three types of Instruction; R-type for arithmetic instructions
– using Registers, I-type where the number needed is available immediately and
J-type for conditional/control, jumps etc (there are also a few others...)
Example of some MIPS assembly language arithmetic statements
add $t0, $s1, $s2
sub $t0, $s1, $s2
Each arithmetic instruction performs only one operation
Each arithmetic instruction specifies exactly three operands
destination  source1 op source2
Operand order is fixed (the destination is specified first)
The operands are contained in the datapath’s register file ($t0, $s1, $s2)
The registers above have been given symbolic names, the actual numbered registers
Run from $0 to $31. We use software to convert the statements above to a 32 bit
instruction. The ASSEMBLER program can also convert symbols into numbers
25/05/12
www.eej.ulster.ac.uk/~ian/modules/COM181/fi
les
4
MIPS Register File

Operands of arithmetic instructions must be from a limited number of special
Register File
locations contained in the datapath’s register file

Thirty-two 32-bit registers


Two read ports
One write port
src1 addr
src2 addr
dst addr
write data

5
5
5
Fast
- Smaller is faster & Make the common case fast

Easy for a compiler to use
- e.g., (A*B) – (C*D) – (E*F) can do multiplies in any order

Improves code density
- Since register are named with fewer bits than a memory location

Register addresses are indicated by using $
data
25 = 32
locations
32 src2
32
Registers are

32 src1
data
32 bits
Naming Conventions for Registers
0
$zero constant 0 (Hdware)
16
1
$at reserved for assembler
...
2
$v0 expression evaluation &
23
$s7
3
$v1 function results
24
$t8 temporary (cont’d)
4
$a0 arguments
25
$t9
5
$a1
26 $k0 reserved for OS kernel
6
$a2
27
$k1
7
$a3
28
$gp pointer to global area
8
$t0 temporary: caller saves
29
$sp stack pointer
...
15
(callee can clobber)
$t7
$s0 callee saves
(caller can clobber)
30 $fp frame pointer
31
$ra return address (Hdware)
Registers vs. Memory

Arithmetic instructions operands must be in registers

only thirty-two registers are provided
Devices
Processor
Network
Control
Datapath

Memory
Input
Output
Compiler associates variables with registers
What about programs with lots of variables?
Processor – Memory Interconnections


Memory is a large, single-dimensional array
An address acts as the index into the memory
array
The word
Memory
read addr/
write addr
address of the
data
32
Processor
read data
?
locations
32
32write data
The data stored in
the memory
10
101
1
32 bits
8
4
0
232 Bytes
(4
 230
GB)
Words (1
GW)
= 4 Bytes = 1 Word
Accessing Memory

MIPS has two basic data transfer instructions for accessing
memory (assume $s3 holds 2410)
lw $t0, 4($s3)
#load word from memory
28
sw $t0, 8($s3)
#store word to
memory
32


The data transfer instruction must specify

where in memory to read from (load) or write to (store) – memory
address

where in the register file to write to (load) or read from (store) –
register destination (source)
The memory address is formed by summing the constant
portion of the instruction and the contents of the second
register
Compiling with Loads and Stores

Assuming variable b is stored in $s2 and that
the base address of array A is in $s3, what is
the MIPS assembly code for the C statement
...
...
A[3]
$s3+12
A[2]
$s3+8
A[1]
$s3+4
A[0]
$s3
A[8] = A[2] - b
lw $t0, 8($s3)
sub
$t0, $t0, $s2
sw $t0, 32($s3)
Compiling with a Variable Array Index

...
...
A[3]
$s4+12
A[2]
$s4+8
A[1]
$s4+4
A[0]
$s4
Assuming that the base address of
array A is in register $s4, and
variables b, c, and i are in $s1,
$s2, and $s3, respectively, what is
the MIPS assembly code for the C
statement
c = A[i] - b
add $t1, $s3, $s3
#array index i is in $s3
add $t1, $t1, $t1
#temp reg $t1 holds 4*i
add $t1, $t1, $s4
#addr of A[i] now in $t1
lw
$t0, 0($t1)
sub $s2, $t0, $s1
Dealing with Constants

Small constants are used quite frequently (50% of
operands in many common programs)
e.g.,


A = A + 5;
B = B + 1;
C = C - 18;
Solutions? Why not?

Put “typical constants” in memory and load them

Create hard-wired registers (like $zero) for constants
like 1, 2, 4, 10, …
How do we make this work?

How do we Make the common case fast !
Constant (or Immediate) Operands

Include constants inside arithmetic instructions


Much faster than if they have to be loaded from memory (they come
in from memory with the instruction itself)
MIPS immediate instructions
addi $s3, $s3, 4 #$s3 = $s3 + 4
There is no subi instruction, can you guess why not?
MIPS Instructions, so far
Category
Arithmetic
Data
transfer
Instr
Example
Meaning
add
add $s1, $s2, $s3
$s1 = $s2 + $s3
subtract
sub $s1, $s2, $s3
$s1 = $s2 - $s3
add
immediate
addi $s1, $s2, 4
$s1 = $s2 + 4
load word
lw
$s1, 32($s2)
$s1 = Memory($s2+32)
store word
sw $s1, 32($s2)
Memory($s2+32) = $s1
Machine Language - Arithmetic Instruction

Instructions, like registers and words of data, are also 32 bits
long

Example:
add $t0, $s1, $s2
registers have numbers $t0=$8,$s1=$17,$s2=$18

Instruction Format:
op
rs
000000 10001
rt
rd
shamt
funct
10010
01000
00000
100000
Can you guess what the field names stand for?
MIPS Instruction Fields
op
6 bits

op

rs
rs
5 bits
rt
rd
5 bits
shamt
5 bits
funct
5 bits
6 bits
= 32 bits
opcode indicating operation to be performed
address of the first register source operand
address of the second register source operand

rt

rd

shamt function code that selects the specific variant of the
the register destination address
shift amount (for shift instructions)
operation specified in the opcode field

funct
Machine Language - Load Instruction

Consider the load-word and store-word instr’s

What would the regularity principle have us do?


Introduce a new type of instruction format


But . . . Good design demands compromise
I-type for data transfer instructions (previous format was R-type for
register)
Example: lw $t0, 24($s2)
op
23hex
100011
rs
rt
18
10010
16 bit number
8
01000
24
0000000000011000
Where's the compromise?
Memory Address Location

Example:
lw $t0, 24($s2)
Memory
0xf f f f f f f f
2410 + $s2 =
$t0
0x00000002
. . . 1001 0100
+ . . . 0001 1000
. . . 1010 1100 =
0x120040ac
Note that the offset
can be positive or
negative
0x120040ac
0x12004094
$s2
data
0x0000000c
0x00000008
0x00000004
0x00000000
word address (hex)
Machine Language - Store Instruction

Example: sw $t0, 24($s2)
op
43
101011

rs
18
10010
rt
16 bit number
8
01000
24
0000000000011000
A 16-bit offset means access is limited to memory locations within a
range of +213-1 to -213 (~8,192) words (+215-1 to -215 (~32,768) bytes)
of the address in the base register $s2

2’s complement (1 sign bit + 15 magnitude bits)
Machine Language – Immediate Instructions
What instruction format is used for the addi ?
addi $s3, $s3, 4 #$s3 = $s3 + 4

Machine format:

op
8

rs
rt
19
19
16 bit immediate
I format
4
The constant is kept inside the instruction itself!


So must use the I format – Immediate format
Limits immediate values to the range +215–1 to -215
Instruction Format Encoding

Can reduce the complexity with multiple formats
by keeping them as similar as possible


First three fields are the same in R-type and I-type
Each format has a distinct set of values in the op
field
Instr
Frmt
op
rs
rt
rd
shamt
funct
address
add
R
0
reg
reg reg 0
32ten
NA
sub
R
0
reg
reg reg 0
34ten
NA
addi
I
8ten
reg
reg NA NA
NA
constant
lw
I
35ten reg
reg NA NA
NA
address
sw
I
43ten reg
reg NA NA
NA
address
Assembling Code

Remember the assembler code we compiled last lecture for the C
statement
A[8] = A[2] - b
lw $t0, 8($s3)
sub
$t0, $t0, $s2 #subtract b from A[2]
sw $t0, 32($s3)

#load A[2] into $t0
#store result in A[8]
Assemble the MIPS object code for these three instructions (decimal is
fine)
lw
35
19
8
sub
0
8
18
sw
43
19
8
8
8
0
32
34
Review: MIPS Instructions, so far
Category
Arithmetic
(R format)
Instr
Op
Code
Example
Meaning
add
0&
32
add $s1, $s2, $s3
$s1 = $s2 + $s3
subtract
0&
34
sub $s1, $s2, $s3
$s1 = $s2 - $s3
Arithmetic
(I format)
add
immediate
8
addi $s1, $s2, 4
$s1 = $s2 + 4
Data
transfer
(I format)
load word
35
lw $s1, 100($s2)
$s1 = Memory($s2+100)
store word
43
sw $s1, 100($s2)
Memory($s2+100) = $s1
MIPS Operand Addressing Modes Summary
Register addressing – operand is in a register

1. Register addressing
op
rs
rt
rd
funct
Register
word operand
(displacement) addressing – operand’s
address in memory is the sum of a register and a
16-bit constant contained within the instruction
 Base
2. Base addressing
op
rs
rt
offset
Memory
word or byte operand
base register
addressing – operand is a 16-bit
constant contained within the instruction
 Immediate
3. Immediate addressing
op
rs
rt
operand
MIPS Instruction Addressing Modes Summary
addressing – instruction’s address in
memory is the sum of the PC and a 16-bit constant
contained within the instruction
 PC-relative
4. PC-relative addressing
op
rs
rt
offset
Memory
branch destination instruction
Program Counter (PC)
addressing – instruction’s address in
memory is the 26-bit constant contained within the
instruction concatenated with the upper 4 bits of the
PC
 Pseudo-direct
5. Pseudo-direct addressing
op
Memory
jump address
||
Program Counter (PC)
jump destination instruction
Review: MIPS Instructions, so far
Category
Arithmetic
(R & I
format)
Instr
OpC
Example
Meaning
add
0 & 20
add $s1, $s2, $s3
$s1 = $s2 + $s3
subtract
0 & 22
sub $s1, $s2, $s3
$s1 = $s2 - $s3
addi $s1, $s2, 4
$s1 = $s2 + 4
add immediate
8
shift left logical
0 & 00
sll
$s1, $s2, 4
$s1 = $s2 << 4
shift right logical
0 & 02
srl
$s1, $s2, 4
$s1 = $s2 >> 4 (fill with
zeros)
shift right
arithmetic
0 & 03
sra $s1, $s2, 4
$s1 = $s2 >> 4 (fill with
sign bit)
and
0 & 24
and $s1, $s2, $s3
$s1 = $s2 & $s3
or
0 & 25
or
$s1 = $s2 | $s3
nor
0 & 27
nor $s1, $s2, $s3
$s1 = not ($s2 | $s3)
and immediate
c
and $s1, $s2, ff00
$s1 = $s2 & 0xff00
or immediate
d
or
$s1 = $s2 | 0xff00
load upper
immediate
f
lui $s1, 0xffff
$s1, $s2, $s3
$s1, $s2, ff00
$s1 = 0xffff0000
Review: MIPS Instructions, so far
Category
Data
transfer
(I format)
Cond.
branch
(I & R
format)
Uncond.
jump
Instr
OpC
Example
Meaning
load word
23
lw
$s1, 100($s2)
$s1 = Memory($s2+100)
store word
2b
sw $s1, 100($s2)
Memory($s2+100) = $s1
load byte
20
lb
$s1, 101($s2)
$s1 = Memory($s2+101)
store byte
28
sb $s1, 101($s2)
Memory($s2+101) = $s1
load half
21
lh
$s1, 101($s2)
$s1 = Memory($s2+102)
store half
29
sh $s1, 101($s2)
Memory($s2+102) = $s1
br on equal
4
beq $s1, $s2, L
if ($s1==$s2) go to L
br on not equal
5
bne $s1, $s2, L
if ($s1 !=$s2) go to L
set on less than
immediate
a
slti
$s1, $s2, 100
if ($s2<100) $s1=1;
$s1=0
set on less than
0 & 2a
slt
$s1, $s2, $s3
if ($s2<$s3) $s1=1; else
$s1=0
2
j
2500
go to 10000
jump register
0 & 08
jr
$t1
go to $t1
jump and link
3
jal
2500
go to 10000; $ra=PC+4
jump
else
Review: MIPS R3000 ISA

Instruction Categories

Load/Store

Computational

Jump and Branch

Floating Point


Registers
R0 - R31
coprocessor

Memory Management

Special
PC
HI
LO
3 Instruction Formats: all 32 bits wide
6 bits
5 bits
5 bits
OP
rs
rt
OP
rs
rt
OP
5 bits
rd
5 bits
shamt
16 bit number
26 bit jump target
6 bits
funct
R format
I format
J format
RISC Design Principles Review



Simplicity favors regularity

fixed size instructions – 32-bits

small number of instruction formats
Smaller is faster

limited instruction set

limited number of registers in register file

limited number of addressing modes
Good design demands good compromises


three instruction formats
Make the common case fast

arithmetic operands from the register file (load-store machine)

allow instructions to contain immediate operands
The Code Translation Hierarchy
C program
compiler
assembly code
Compiler
Transforms the C program into an assembly
language program


Advantages of high-level languages

many fewer lines of code

easier to understand and debug

…
Today’s optimizing compilers can produce
assembly code nearly as good as an assembly
language programming expert and often better for
large programs

smaller code size, faster execution
The Code Translation Hierarchy
C program
compiler
assembly code
assembler
object code
Assembler
Does a syntactic check of the code (i.e., did you type it in
correctly) and then transforms the symbolic assembler
code into object (machine) code

Advantages of assembler

much easier than remembering instr’s binary codes

can use labels for addresses – and let the assembler do the
arithmetic

can use pseudo-instructions


e.g., “move $t0, $t1” exists only in assembler (would be
implemented using “add $t0,$t1,$zero”)
When considering performance, you should count
instructions executed, not code size
25/05/12
www.eej.ulster.ac.uk/~ian/modules/COM181/fi
les
46
The Two Main Tasks of the Assembler
1.
2.
25/05/12
Builds a symbol table which holds the symbolic
names (labels) and their corresponding addresses

A label is local if it is used only within the file where its
defined. Labels are local by default.

A label is external (global) if it refers to code or data in
another file or if it is referenced from another file. Global
labels must be explicitly declared global (e.g., .globl
main)
Translates each assembly language statement into
object (machine) code by “assembling” the numeric
equivalents of the opcodes, register specifiers, shift
amounts, and jump/branch targets/offsets
www.eej.ulster.ac.uk/~ian/modules/COM181/fi
les
47
MIPS (spim)
Memory Allocation
Memory
Mem Map I/O
$sp
Kernel Code &
Data
fffffffc
7f f f f f fc
Stack
230
words
Dynamic data
$gp
Static data
1000 8000
1000 0000
Text
Segment
0040 0000
PC
Reserved
0000 0000
Other Tasks of the Assembler

Converts pseudo-instr’s to legal assembly code




register $at is reserved for the assembler to do this
Converts branches to far away locations into a branch
followed by a jump
Converts instructions with large immediates into a lui
followed by an ori
Converts numbers specified in decimal and hexidecimal into
their binary equivalents and characters into their ASCII
equivalents

Deals with data layout directives (e.g., .asciiz)

Expands macros (frequently used sequences of instructions)
Typical Object File Pieces

Object file header: size and position of the following pieces of the file

Text (code) segment (.text) : assembled object (machine) code

Data segment (.data) : data accompanying the code


static data - allocated throughout the program

dynamic data - grows and shrinks as needed
Relocation information: identifies instructions (data) that use (are located at)
absolute addresses – not relative to a register (including the PC)



on MIPS only j, jal, and some loads and stores (e.g.,
100($zero) ) use absolute addresses
lw $t1,
Symbol table: global labels with their addresses (if defined in this code
segment) or without (if defined external to this code segment)
Debugging information
An Example
Gbl?
yes
yes
Symbol
str
cr
main
loop
brnc
done
printf
Address
1000 0000
1000 000b
0040 0000
0040 000c
0040 001c
0040 0024
???? ????
Relocation Info
Address
Data/Instr
1000 0000
1000 000b
0040 0018
0040 0020
0040 0024
str
cr
j loop
j loop
jal printf
.data
.align 0
str:
.asciiz "The answer is "
cr: .asciiz "\n"
.text
.align 2
.globl main
.globl printf
main: ori $2, $0, 5
syscall
move
$8, $2
loop: beq $8, $9, done
blt $8, $9, brnc
sub $8, $8, $9
j loop
brnc: sub $9, $9, $8
j loop
done: jal printf
The Code Translation Hierarchy
C program
compiler
main text segment
assembly code
printf text segment
assembler
object code
library routines
linker
machine code
executable
Linker
Takes all of the independently assembled code segments and “stitches” (links) them
together

1.
Decides on memory allocation pattern for the code and data segments of each
module

2.
3.
Remember, modules were assembled in isolation so each has assumed its code’s starting
location is 0x0040 0000 and its static data starting location is 0x1000 0000
Relocates absolute addresses to reflect the new starting location of the code
segment and its data segment
Uses the symbol tables information to resolve all remaining undefined labels


Faster to recompile and reassemble a patched segment, than it is to recompile and
reassemble the entire program
branches, jumps, and data addresses to/in external modules
Linker produces an executable file
Linker Code Schematic
Executable file
Object file
main:
main:
.
.
.
jal ????
call, printf
Relocation info
Linker
C library
printf:
.
.
.
.
.
.
jal printf
printf:
.
.
.
Dseg
Dseg
Reloc Smtbl Dbg
Reloc Smtbl Dbg
Reloc
+
Txtseg
Txtseg
Hdr
Dseg
File 1
Hdr
Txtseg
Hdr
Executable
Linking Two Object Files
File 2
The Code Translation Hierarchy
C program
compiler
assembly code
assembler
object code
library routines
linker
machine code
executable
loader
memory
Loader




Loads (copies) the executable code now stored on disk into
memory at the starting address specified by the operating
system
Copies the parameters (if any) to the main routine onto the
stack
Initializes the machine registers and sets the stack pointer to
the first free location (0x7fff fffc)
Jumps to a start-up routine (at PC addr 0x0040 0000 on
xspim) that copies the parameters into the argument registers
and then calls the main routine of the program with a jal
main
Dynamically Linked Libraries


Statically linking libraries mean that the library becomes part
of the executable code

It loads the whole library even if only a small part is used (e.g.,
standard C library is 2.5 MB)

What if a new version of the library is released ?
(Lazy) dynamically linked libraries (DLL) – library routines
are not linked and loaded until a routine is called during
execution

The first time the library routine called, a dynamic linker-loader must


find the desired routine, remap it, and “link” it to the calling routine (see
book for more details)
DLLs require extra space for dynamic linking information, but do not
require the whole library to be copied or linked
25/05/12
www.eej.ulster.ac.uk/~ian/modules/COM181/fi
les
59