Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
COM181 Computer Hardware Lecture 4: The MIPs CPU "Adapted from Computer Organization and Design, 4th Edition, Patterson & Hennessy, © 2008.” This material may not be copied or distributed for commercial purposes without express written permission of the copyright holders. Also drawn from the work of Mary Jane Irwin ( www.cse.psu.edu/~mji ) Ian McCrum Room 5B18, Tel: 90 366364 voice mail on 6th ring Email: [email protected] Web site: http://www.eej.ulst.ac.uk 08/08/12 www.eej.ulster.ac.uk/~ian/modules/COM181/files 1 The MIPs CPU is described in the textbook, note how the diagram below relates to lecture 3 PCSrc 1 ID/EX 0 EX/MEM Control IF/ID Add Shift left 2 4 PC Instruction Memory Read Address Add Read Addr 1 Data Memory Register Read Read Addr 2Data 1 File Write Addr Write Data 16 Sign Extend MEM/WB Branch ALU Read Data 2 1 Address Read Data Write Data 0 32 1 0 ALU cntrl EX/MEM.RegisterRd 0 1 IF/ID.RegisterRs IF/ID.RegisterRt Forward Unit 25/05/12 MEM/WB.RegisterRd 2 www.eej.ulster.ac.uk/~ian/modules/COM181/fi Often you must deal with the management of complexity, and use abstraction and partition to reduce systems to sizes that the human brain can cope with... When programming a MIPs CPU it is enough to maintain a “Programmer’s Model” of the CPU. The CPU is designed as a RISC (Reduced Instruction Set Computer) machine, this suits implementing the hardware but not necessarily suits humans programming it! Software tools help A RISC machine has a fixed length instruction (32 bits in the simple MIPs) A RISC machine has a limited number of addressing modes A RISC machine has a limited number of operations (small instrucution set) A RISC machine has, typically, a register bank and uses load/store instructions (only) to access main memory. 25/05/12 www.eej.ulster.ac.uk/~ian/modules/COM181/fi les 3 MIPs machines have three types of Instruction; R-type for arithmetic instructions – using Registers, I-type where the number needed is available immediately and J-type for conditional/control, jumps etc (there are also a few others...) Example of some MIPS assembly language arithmetic statements add $t0, $s1, $s2 sub $t0, $s1, $s2 Each arithmetic instruction performs only one operation Each arithmetic instruction specifies exactly three operands destination source1 op source2 Operand order is fixed (the destination is specified first) The operands are contained in the datapath’s register file ($t0, $s1, $s2) The registers above have been given symbolic names, the actual numbered registers Run from $0 to $31. We use software to convert the statements above to a 32 bit instruction. The ASSEMBLER program can also convert symbols into numbers 25/05/12 www.eej.ulster.ac.uk/~ian/modules/COM181/fi les 4 MIPS Register File Operands of arithmetic instructions must be from a limited number of special Register File locations contained in the datapath’s register file Thirty-two 32-bit registers Two read ports One write port src1 addr src2 addr dst addr write data 5 5 5 Fast - Smaller is faster & Make the common case fast Easy for a compiler to use - e.g., (A*B) – (C*D) – (E*F) can do multiplies in any order Improves code density - Since register are named with fewer bits than a memory location Register addresses are indicated by using $ data 25 = 32 locations 32 src2 32 Registers are 32 src1 data 32 bits Naming Conventions for Registers 0 $zero constant 0 (Hdware) 16 1 $at reserved for assembler ... 2 $v0 expression evaluation & 23 $s7 3 $v1 function results 24 $t8 temporary (cont’d) 4 $a0 arguments 25 $t9 5 $a1 26 $k0 reserved for OS kernel 6 $a2 27 $k1 7 $a3 28 $gp pointer to global area 8 $t0 temporary: caller saves 29 $sp stack pointer ... 15 (callee can clobber) $t7 $s0 callee saves (caller can clobber) 30 $fp frame pointer 31 $ra return address (Hdware) Registers vs. Memory Arithmetic instructions operands must be in registers only thirty-two registers are provided Devices Processor Network Control Datapath Memory Input Output Compiler associates variables with registers What about programs with lots of variables? Processor – Memory Interconnections Memory is a large, single-dimensional array An address acts as the index into the memory array The word Memory read addr/ write addr address of the data 32 Processor read data ? locations 32 32write data The data stored in the memory 10 101 1 32 bits 8 4 0 232 Bytes (4 230 GB) Words (1 GW) = 4 Bytes = 1 Word Accessing Memory MIPS has two basic data transfer instructions for accessing memory (assume $s3 holds 2410) lw $t0, 4($s3) #load word from memory 28 sw $t0, 8($s3) #store word to memory 32 The data transfer instruction must specify where in memory to read from (load) or write to (store) – memory address where in the register file to write to (load) or read from (store) – register destination (source) The memory address is formed by summing the constant portion of the instruction and the contents of the second register Compiling with Loads and Stores Assuming variable b is stored in $s2 and that the base address of array A is in $s3, what is the MIPS assembly code for the C statement ... ... A[3] $s3+12 A[2] $s3+8 A[1] $s3+4 A[0] $s3 A[8] = A[2] - b lw $t0, 8($s3) sub $t0, $t0, $s2 sw $t0, 32($s3) Compiling with a Variable Array Index ... ... A[3] $s4+12 A[2] $s4+8 A[1] $s4+4 A[0] $s4 Assuming that the base address of array A is in register $s4, and variables b, c, and i are in $s1, $s2, and $s3, respectively, what is the MIPS assembly code for the C statement c = A[i] - b add $t1, $s3, $s3 #array index i is in $s3 add $t1, $t1, $t1 #temp reg $t1 holds 4*i add $t1, $t1, $s4 #addr of A[i] now in $t1 lw $t0, 0($t1) sub $s2, $t0, $s1 Dealing with Constants Small constants are used quite frequently (50% of operands in many common programs) e.g., A = A + 5; B = B + 1; C = C - 18; Solutions? Why not? Put “typical constants” in memory and load them Create hard-wired registers (like $zero) for constants like 1, 2, 4, 10, … How do we make this work? How do we Make the common case fast ! Constant (or Immediate) Operands Include constants inside arithmetic instructions Much faster than if they have to be loaded from memory (they come in from memory with the instruction itself) MIPS immediate instructions addi $s3, $s3, 4 #$s3 = $s3 + 4 There is no subi instruction, can you guess why not? MIPS Instructions, so far Category Arithmetic Data transfer Instr Example Meaning add add $s1, $s2, $s3 $s1 = $s2 + $s3 subtract sub $s1, $s2, $s3 $s1 = $s2 - $s3 add immediate addi $s1, $s2, 4 $s1 = $s2 + 4 load word lw $s1, 32($s2) $s1 = Memory($s2+32) store word sw $s1, 32($s2) Memory($s2+32) = $s1 Machine Language - Arithmetic Instruction Instructions, like registers and words of data, are also 32 bits long Example: add $t0, $s1, $s2 registers have numbers $t0=$8,$s1=$17,$s2=$18 Instruction Format: op rs 000000 10001 rt rd shamt funct 10010 01000 00000 100000 Can you guess what the field names stand for? MIPS Instruction Fields op 6 bits op rs rs 5 bits rt rd 5 bits shamt 5 bits funct 5 bits 6 bits = 32 bits opcode indicating operation to be performed address of the first register source operand address of the second register source operand rt rd shamt function code that selects the specific variant of the the register destination address shift amount (for shift instructions) operation specified in the opcode field funct Machine Language - Load Instruction Consider the load-word and store-word instr’s What would the regularity principle have us do? Introduce a new type of instruction format But . . . Good design demands compromise I-type for data transfer instructions (previous format was R-type for register) Example: lw $t0, 24($s2) op 23hex 100011 rs rt 18 10010 16 bit number 8 01000 24 0000000000011000 Where's the compromise? Memory Address Location Example: lw $t0, 24($s2) Memory 0xf f f f f f f f 2410 + $s2 = $t0 0x00000002 . . . 1001 0100 + . . . 0001 1000 . . . 1010 1100 = 0x120040ac Note that the offset can be positive or negative 0x120040ac 0x12004094 $s2 data 0x0000000c 0x00000008 0x00000004 0x00000000 word address (hex) Machine Language - Store Instruction Example: sw $t0, 24($s2) op 43 101011 rs 18 10010 rt 16 bit number 8 01000 24 0000000000011000 A 16-bit offset means access is limited to memory locations within a range of +213-1 to -213 (~8,192) words (+215-1 to -215 (~32,768) bytes) of the address in the base register $s2 2’s complement (1 sign bit + 15 magnitude bits) Machine Language – Immediate Instructions What instruction format is used for the addi ? addi $s3, $s3, 4 #$s3 = $s3 + 4 Machine format: op 8 rs rt 19 19 16 bit immediate I format 4 The constant is kept inside the instruction itself! So must use the I format – Immediate format Limits immediate values to the range +215–1 to -215 Instruction Format Encoding Can reduce the complexity with multiple formats by keeping them as similar as possible First three fields are the same in R-type and I-type Each format has a distinct set of values in the op field Instr Frmt op rs rt rd shamt funct address add R 0 reg reg reg 0 32ten NA sub R 0 reg reg reg 0 34ten NA addi I 8ten reg reg NA NA NA constant lw I 35ten reg reg NA NA NA address sw I 43ten reg reg NA NA NA address Assembling Code Remember the assembler code we compiled last lecture for the C statement A[8] = A[2] - b lw $t0, 8($s3) sub $t0, $t0, $s2 #subtract b from A[2] sw $t0, 32($s3) #load A[2] into $t0 #store result in A[8] Assemble the MIPS object code for these three instructions (decimal is fine) lw 35 19 8 sub 0 8 18 sw 43 19 8 8 8 0 32 34 Review: MIPS Instructions, so far Category Arithmetic (R format) Instr Op Code Example Meaning add 0& 32 add $s1, $s2, $s3 $s1 = $s2 + $s3 subtract 0& 34 sub $s1, $s2, $s3 $s1 = $s2 - $s3 Arithmetic (I format) add immediate 8 addi $s1, $s2, 4 $s1 = $s2 + 4 Data transfer (I format) load word 35 lw $s1, 100($s2) $s1 = Memory($s2+100) store word 43 sw $s1, 100($s2) Memory($s2+100) = $s1 MIPS Operand Addressing Modes Summary Register addressing – operand is in a register 1. Register addressing op rs rt rd funct Register word operand (displacement) addressing – operand’s address in memory is the sum of a register and a 16-bit constant contained within the instruction Base 2. Base addressing op rs rt offset Memory word or byte operand base register addressing – operand is a 16-bit constant contained within the instruction Immediate 3. Immediate addressing op rs rt operand MIPS Instruction Addressing Modes Summary addressing – instruction’s address in memory is the sum of the PC and a 16-bit constant contained within the instruction PC-relative 4. PC-relative addressing op rs rt offset Memory branch destination instruction Program Counter (PC) addressing – instruction’s address in memory is the 26-bit constant contained within the instruction concatenated with the upper 4 bits of the PC Pseudo-direct 5. Pseudo-direct addressing op Memory jump address || Program Counter (PC) jump destination instruction Review: MIPS Instructions, so far Category Arithmetic (R & I format) Instr OpC Example Meaning add 0 & 20 add $s1, $s2, $s3 $s1 = $s2 + $s3 subtract 0 & 22 sub $s1, $s2, $s3 $s1 = $s2 - $s3 addi $s1, $s2, 4 $s1 = $s2 + 4 add immediate 8 shift left logical 0 & 00 sll $s1, $s2, 4 $s1 = $s2 << 4 shift right logical 0 & 02 srl $s1, $s2, 4 $s1 = $s2 >> 4 (fill with zeros) shift right arithmetic 0 & 03 sra $s1, $s2, 4 $s1 = $s2 >> 4 (fill with sign bit) and 0 & 24 and $s1, $s2, $s3 $s1 = $s2 & $s3 or 0 & 25 or $s1 = $s2 | $s3 nor 0 & 27 nor $s1, $s2, $s3 $s1 = not ($s2 | $s3) and immediate c and $s1, $s2, ff00 $s1 = $s2 & 0xff00 or immediate d or $s1 = $s2 | 0xff00 load upper immediate f lui $s1, 0xffff $s1, $s2, $s3 $s1, $s2, ff00 $s1 = 0xffff0000 Review: MIPS Instructions, so far Category Data transfer (I format) Cond. branch (I & R format) Uncond. jump Instr OpC Example Meaning load word 23 lw $s1, 100($s2) $s1 = Memory($s2+100) store word 2b sw $s1, 100($s2) Memory($s2+100) = $s1 load byte 20 lb $s1, 101($s2) $s1 = Memory($s2+101) store byte 28 sb $s1, 101($s2) Memory($s2+101) = $s1 load half 21 lh $s1, 101($s2) $s1 = Memory($s2+102) store half 29 sh $s1, 101($s2) Memory($s2+102) = $s1 br on equal 4 beq $s1, $s2, L if ($s1==$s2) go to L br on not equal 5 bne $s1, $s2, L if ($s1 !=$s2) go to L set on less than immediate a slti $s1, $s2, 100 if ($s2<100) $s1=1; $s1=0 set on less than 0 & 2a slt $s1, $s2, $s3 if ($s2<$s3) $s1=1; else $s1=0 2 j 2500 go to 10000 jump register 0 & 08 jr $t1 go to $t1 jump and link 3 jal 2500 go to 10000; $ra=PC+4 jump else Review: MIPS R3000 ISA Instruction Categories Load/Store Computational Jump and Branch Floating Point Registers R0 - R31 coprocessor Memory Management Special PC HI LO 3 Instruction Formats: all 32 bits wide 6 bits 5 bits 5 bits OP rs rt OP rs rt OP 5 bits rd 5 bits shamt 16 bit number 26 bit jump target 6 bits funct R format I format J format RISC Design Principles Review Simplicity favors regularity fixed size instructions – 32-bits small number of instruction formats Smaller is faster limited instruction set limited number of registers in register file limited number of addressing modes Good design demands good compromises three instruction formats Make the common case fast arithmetic operands from the register file (load-store machine) allow instructions to contain immediate operands The Code Translation Hierarchy C program compiler assembly code Compiler Transforms the C program into an assembly language program Advantages of high-level languages many fewer lines of code easier to understand and debug … Today’s optimizing compilers can produce assembly code nearly as good as an assembly language programming expert and often better for large programs smaller code size, faster execution The Code Translation Hierarchy C program compiler assembly code assembler object code Assembler Does a syntactic check of the code (i.e., did you type it in correctly) and then transforms the symbolic assembler code into object (machine) code Advantages of assembler much easier than remembering instr’s binary codes can use labels for addresses – and let the assembler do the arithmetic can use pseudo-instructions e.g., “move $t0, $t1” exists only in assembler (would be implemented using “add $t0,$t1,$zero”) When considering performance, you should count instructions executed, not code size 25/05/12 www.eej.ulster.ac.uk/~ian/modules/COM181/fi les 46 The Two Main Tasks of the Assembler 1. 2. 25/05/12 Builds a symbol table which holds the symbolic names (labels) and their corresponding addresses A label is local if it is used only within the file where its defined. Labels are local by default. A label is external (global) if it refers to code or data in another file or if it is referenced from another file. Global labels must be explicitly declared global (e.g., .globl main) Translates each assembly language statement into object (machine) code by “assembling” the numeric equivalents of the opcodes, register specifiers, shift amounts, and jump/branch targets/offsets www.eej.ulster.ac.uk/~ian/modules/COM181/fi les 47 MIPS (spim) Memory Allocation Memory Mem Map I/O $sp Kernel Code & Data fffffffc 7f f f f f fc Stack 230 words Dynamic data $gp Static data 1000 8000 1000 0000 Text Segment 0040 0000 PC Reserved 0000 0000 Other Tasks of the Assembler Converts pseudo-instr’s to legal assembly code register $at is reserved for the assembler to do this Converts branches to far away locations into a branch followed by a jump Converts instructions with large immediates into a lui followed by an ori Converts numbers specified in decimal and hexidecimal into their binary equivalents and characters into their ASCII equivalents Deals with data layout directives (e.g., .asciiz) Expands macros (frequently used sequences of instructions) Typical Object File Pieces Object file header: size and position of the following pieces of the file Text (code) segment (.text) : assembled object (machine) code Data segment (.data) : data accompanying the code static data - allocated throughout the program dynamic data - grows and shrinks as needed Relocation information: identifies instructions (data) that use (are located at) absolute addresses – not relative to a register (including the PC) on MIPS only j, jal, and some loads and stores (e.g., 100($zero) ) use absolute addresses lw $t1, Symbol table: global labels with their addresses (if defined in this code segment) or without (if defined external to this code segment) Debugging information An Example Gbl? yes yes Symbol str cr main loop brnc done printf Address 1000 0000 1000 000b 0040 0000 0040 000c 0040 001c 0040 0024 ???? ???? Relocation Info Address Data/Instr 1000 0000 1000 000b 0040 0018 0040 0020 0040 0024 str cr j loop j loop jal printf .data .align 0 str: .asciiz "The answer is " cr: .asciiz "\n" .text .align 2 .globl main .globl printf main: ori $2, $0, 5 syscall move $8, $2 loop: beq $8, $9, done blt $8, $9, brnc sub $8, $8, $9 j loop brnc: sub $9, $9, $8 j loop done: jal printf The Code Translation Hierarchy C program compiler main text segment assembly code printf text segment assembler object code library routines linker machine code executable Linker Takes all of the independently assembled code segments and “stitches” (links) them together 1. Decides on memory allocation pattern for the code and data segments of each module 2. 3. Remember, modules were assembled in isolation so each has assumed its code’s starting location is 0x0040 0000 and its static data starting location is 0x1000 0000 Relocates absolute addresses to reflect the new starting location of the code segment and its data segment Uses the symbol tables information to resolve all remaining undefined labels Faster to recompile and reassemble a patched segment, than it is to recompile and reassemble the entire program branches, jumps, and data addresses to/in external modules Linker produces an executable file Linker Code Schematic Executable file Object file main: main: . . . jal ???? call, printf Relocation info Linker C library printf: . . . . . . jal printf printf: . . . Dseg Dseg Reloc Smtbl Dbg Reloc Smtbl Dbg Reloc + Txtseg Txtseg Hdr Dseg File 1 Hdr Txtseg Hdr Executable Linking Two Object Files File 2 The Code Translation Hierarchy C program compiler assembly code assembler object code library routines linker machine code executable loader memory Loader Loads (copies) the executable code now stored on disk into memory at the starting address specified by the operating system Copies the parameters (if any) to the main routine onto the stack Initializes the machine registers and sets the stack pointer to the first free location (0x7fff fffc) Jumps to a start-up routine (at PC addr 0x0040 0000 on xspim) that copies the parameters into the argument registers and then calls the main routine of the program with a jal main Dynamically Linked Libraries Statically linking libraries mean that the library becomes part of the executable code It loads the whole library even if only a small part is used (e.g., standard C library is 2.5 MB) What if a new version of the library is released ? (Lazy) dynamically linked libraries (DLL) – library routines are not linked and loaded until a routine is called during execution The first time the library routine called, a dynamic linker-loader must find the desired routine, remap it, and “link” it to the calling routine (see book for more details) DLLs require extra space for dynamic linking information, but do not require the whole library to be copied or linked 25/05/12 www.eej.ulster.ac.uk/~ian/modules/COM181/fi les 59