Chapter 2 Instructions: Language of the Machine
Instructor: Dr. Chuan-Yu Chang (張傳育), Ph.D.
E-mail: [email protected]
Tel: (05)5342601 ext. 4337

Instructions: Language of the Machine
The language of the machine is more primitive than higher-level languages (e.g., no complex control flow) and very restrictive (e.g., MIPS arithmetic instructions).
We'll be working with the MIPS instruction set architecture, which is similar to other architectures developed since the 1980s.
Almost 100 million MIPS processors were manufactured in 2002, used by NEC, Nintendo, Cisco, Silicon Graphics, Sony, ...
[Figure: chart by year, 1998 to 2002, vertical axis 0 to 1400, comparing ARM, IA-32, MIPS, Motorola 68K, PowerPC, Hitachi SH, SPARC, and Other]
Design goals: find a language that makes it easy to build the hardware and the compiler while maximizing performance and minimizing cost and design time.

MIPS arithmetic
All instructions have 3 operands.
Operand order is fixed (destination first).
$s0, $s1, ... are registers that correspond to variables in C programs.
$t0, $t1, ... are temporary registers needed to compile the program into MIPS instructions.
Example:
    C code:    A = B + C
    MIPS code: add $s0, $s1, $s2    (registers are associated with variables by the compiler)

Examples
Compiling Two C Assignment Statements into MIPS
    a = b + c;
    d = a - e;
Solution
    add a, b, c
    sub d, a, e
Compiling a Complex C Assignment into MIPS
    f = (g + h) - (i + j);
Solution
    add t0, g, h
    add t1, i, j
    sub f, t0, t1

MIPS arithmetic
Design Principle: simplicity favors regularity.
Example: the variables a, b, c, d, e, and f are assigned to the registers $s0, $s1, $s2, $s3, $s4, and $s5.
    C code:    a = b + c + d;
               e = f - a;
    MIPS code: add $t0, $s1, $s2
               add $s0, $t0, $s3
               sub $s4, $s5, $s0
The size of a register in the MIPS architecture is 32 bits.
Operands must be registers, and only 32 registers are provided.
Design Principle: smaller is faster.
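
The three-instruction sequence for a = b + c + d; e = f - a can be traced with a small register-file model. This is only a minimal Python sketch and not part of the original slides: the regs dictionary, the helper names add and sub, and the sample values for b, c, d, and f are all assumptions made for illustration.

```python
# Hypothetical register file: names map to 32-bit values.
regs = {name: 0 for name in ("$s0", "$s1", "$s2", "$s3", "$s4", "$s5", "$t0")}

MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def add(rd, rs, rt):
    """Three-operand add: destination first, then two source registers."""
    regs[rd] = (regs[rs] + regs[rt]) & MASK32


def sub(rd, rs, rt):
    """Three-operand subtract: rd = rs - rt, wrapped to 32 bits."""
    regs[rd] = (regs[rs] - regs[rt]) & MASK32


# Assumed sample values: b = 5, c = 7, d = 9, f = 100.
regs.update({"$s1": 5, "$s2": 7, "$s3": 9, "$s5": 100})

add("$t0", "$s1", "$s2")   # $t0 = b + c
add("$s0", "$t0", "$s3")   # a   = $t0 + d
sub("$s4", "$s5", "$s0")   # e   = f - a

print(regs["$s0"], regs["$s4"])
```

Because each instruction takes exactly two sources and one destination, the expression b + c + d needs the temporary $t0 to hold the intermediate sum.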
Example: Compiling a C Assignment Using Registers
    f = (g + h) - (i + j);
Solution
The variables f, g, h, i, and j are assigned to the registers $s0, $s1, $s2, $s3, and $s4.
The compiled MIPS code:
    add $t0, $s1, $s2
    add $t1, $s3, $s4
    sub $s0, $t0, $t1

Registers vs. Memory
Operands of arithmetic instructions must be registers, and only 32 registers are provided.
The compiler associates variables with registers.
What about programs with lots of variables?
The CPU can keep only a small amount of data in registers, while computer memory contains millions of data elements.
MIPS must therefore include data transfer instructions that move data between memory and registers.
[Figure: the five components of a computer: input, output, memory, and the processor (control and datapath)]

Memory Organization
How can a computer represent and access large memory?
Memory is viewed as a large, single-dimension array, with an address.
A memory address is an index into the array.
"Byte addressing" means that the index points to a byte of memory.
[Figure: memory as an array of bytes at addresses 0, 1, 2, 3, 4, 5, 6, ..., each holding 8 bits of data; a processor and memory example with data 1, 101, 10, 100 at addresses 0, 1, 2, 3]

Memory Organization (cont.)
Bytes are nice, but most data items use larger "words".
For MIPS, a word is 32 bits or 4 bytes.
[Figure: memory as an array of words at byte addresses 0, 4, 8, 12, ..., each holding 32 bits of data]
Registers hold 32 bits of data.
There are 2^32 bytes with byte addresses from 0 to 2^32 - 1.
There are 2^30 words with byte addresses 0, 4, 8, ..., 2^32 - 4.
Words are aligned.

Instructions
Load instruction lw (load word): moves data from memory to a register.
Store instruction sw (store word): transfers data from a register to memory. Store word has its destination last.
Example: Assume that A is an array of 100 words and that the compiler has associated the variables g and h with the registers $s1 and $s2 as before. The starting address of the array is in $s3.
Translate these C assignment statements:
    C code:    g = h + A[8];
    MIPS code: lw  $t0, 32($s3)     # $s3 is the base register, 32 is the offset (8 words * 4 bytes)
               add $s1, $s2, $t0

    C code:    A[8] = h + A[8];
    MIPS code: lw  $t0, 32($s3)
               add $t0, $s2, $t0
               sw  $t0, 32($s3)
Remember: arithmetic operands are registers, not memory!

Compiling Using a Variable Index
Example: g = h + A[i];
Assume A is an array of 100 elements whose base is in register $s3 and that the compiler associates the variables g, h, and i with the registers $s1, $s2, and $s4. What is the MIPS assembly code corresponding to this C segment?
Solution:
    add $t1, $s4, $s4     # Temp reg $t1 = 2 * i
    add $t1, $t1, $t1     # Temp reg $t1 = 4 * i
    add $t1, $t1, $s3     # $t1 = address of A[i] (4 * i + $s3)
    lw  $t0, 0($t1)       # Temporary reg $t0 = A[i]
    add $s1, $s2, $t0     # g = h + A[i]

Constant and Immediate Operands
Using only the instructions seen so far, a constant would have to be placed in memory when the program is loaded and then loaded into a register before use:
    lw  $t0, AddrConstant4($s1)    # $t0 = constant 4
    add $s3, $s3, $t0              # $s3 = $s3 + $t0 ($t0 == 4)
This assumes that AddrConstant4 is the memory address of the constant 4.
Add immediate (addi)
To add 4 to register $s3:
    addi $s3, $s3, 4      # $s3 = $s3 + 4

So far we've learned:
MIPS
  - loading words but addressing bytes
  - arithmetic on registers only
    Instruction            Meaning
    add $s1, $s2, $s3      $s1 = $s2 + $s3
    sub $s1, $s2, $s3      $s1 = $s2 - $s3
    lw  $s1, 100($s2)      $s1 = Memory[$s2+100]
    sw  $s1, 100($s2)      Memory[$s2+100] = $s1

Machine Language
Instructions, like registers and words of data, are also 32 bits long.
Example: add $t0, $s1, $s2
Registers have numbers: $t0 = 8, $s1 = 17, $s2 = 18 ($s0-$s7 are 16-23, $t0-$t7 are 8-15).
Instruction format: R-type
    Field    op        rs       rt       rd       shamt    funct
    Bits     000000    10001    10010    01000    00000    100000
    Width    6 bits    5 bits   5 bits   5 bits   5 bits   6 bits

Machine Language: MIPS Fields
    op:    opcode, the basic operation of the instruction
    rs:    the first source operand register
    rt:    the second source operand register
    rd:    the destination operand register
    shamt: shift amount
    funct: function, selects the specific variant of the operation in op

Machine Language
Consider the load-word and store-word instructions. What would the regularity principle have us do?
New principle: good design demands a compromise.
Introduce a new type of instruction format: I-type, for data transfer instructions (the other format was R-type, for register).
Example: lw $t0, 32($s3)    # Temporary reg $t0 gets A[8]
    Field    op        rs       rt       address
    Value    35        19       8        32
    Width    6 bits    5 bits   5 bits   16 bits
In a load, the rt field specifies the destination register.
Where's the compromise? All instructions are kept the same length, which requires different kinds of instruction formats for different kinds of instructions.

MIPS Instruction Encoding
    Instruction   Format   op   rs    rt    rd     shamt   funct   address
    add           R        0    reg   reg   reg    0       32      n.a.
    sub           R        0    reg   reg   reg    0       34      n.a.
    lw            I        35   reg   reg   n.a.   n.a.    n.a.    address
    sw            I        43   reg   reg   n.a.   n.a.    n.a.    address
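
The R-type and I-type field layouts above can be checked by packing the fields into a 32-bit word. The following is a minimal Python sketch, not part of the original slides: the function names encode_r and encode_i and the register-number constants are assumptions chosen for illustration, while the opcode, funct, and register numbers come from the slides.

```python
def encode_r(rs, rt, rd, shamt, funct, op=0):
    """Pack R-type fields: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, imm):
    """Pack I-type fields: op(6) rs(5) rt(5) address/immediate(16)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


# Register numbers as given on the slides.
T0, S1, S2, S3 = 8, 17, 18, 19

# add $t0, $s1, $s2  ->  op 0, rs = $s1, rt = $s2, rd = $t0, funct 32
add_word = encode_r(rs=S1, rt=S2, rd=T0, shamt=0, funct=32)

# lw $t0, 32($s3)    ->  op 35, rs = $s3 (base), rt = $t0 (destination)
lw_word = encode_i(op=35, rs=S3, rt=T0, imm=32)

print(f"{add_word:032b}")   # 00000010001100100100000000100000
print(f"{lw_word:#010x}")   # 0x8e680020
```

Note that the destination register sits in rd for add but in rt for lw, which is the compromise the I-type format makes to keep every instruction 32 bits long.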
Translating MIPS Assembly Language into Machine Language
Example: Assuming that $t1 has the base of the array A and that $s2 corresponds to h, the C assignment statement
    A[300] = h + A[300];
is compiled into
    lw  $t0, 1200($t1)
    add $t0, $s2, $t0
    sw  $t0, 1200($t1)
What is the MIPS machine language code for these three instructions?
Solution:
    op    rs    rt    rd    address/shamt    funct
    35    9     8           1200
    0     18    8     8     0                32
    43    9     8           1200

Stored Program Concept
Today's computers are built on two key principles:
  - Instructions are represented as numbers (bits).
  - Programs are stored in memory, to be read or written just like data.
[Figure: a processor connected to memory; memory holds data, programs, compilers, editors, etc.]
Fetch & Execute Cycle
  - Instructions are fetched and put into a special register.
  - Bits in the register "control" the subsequent actions.
  - Fetch the "next" instruction and continue.

Instructions for Making Decisions: Control
Decision-making instructions alter the control flow, i.e., change the "next" instruction to be executed.
MIPS conditional branch instructions:
  Branch if equal (beq):
      beq register1, register2, L1
  If the value of register1 equals the value of register2, go to the statement labeled L1.
  Branch if not equal (bne):
      bne register1, register2, L1
  If the value of register1 does not equal the value of register2, go to the statement labeled L1.
Example: if (i == j) h = i + j;
Assume that i, j, and h map to $s0, $s1, and $s3.
           bne $s0, $s1, Label
           add $s3, $s0, $s1
    Label: ...

Example: Compiling an If Statement into a Conditional Branch
C code segment:
        if (i == j) goto L1;
        f = g + h;
    L1: f = f - i;
Assuming that the five variables f through j correspond to the five registers $s0 through $s4, what is the compiled MIPS code?
Solution:
        beq $s3, $s4, L1
        add $s0, $s1, $s2
    L1: sub $s0, $s0, $s3

Control: if-then-else
MIPS unconditional branch instruction:
    j label
Example: if (i != j) h = i + j; else h = i - j;
          beq $s3, $s4, Lab1
          add $s2, $s3, $s4
          j   Lab2
    Lab1: sub $s2, $s3, $s4
    Lab2: ...
Example:
    if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; the i = j path computes f = g + h, the i ≠ j path (Else:) computes f = g - h, and both paths reach Exit:]
Solution:
          bne $s3, $s4, Else
          add $s0, $s1, $s2
          j   Exit
    Else: sub $s0, $s1, $s2
    Exit:

Control: Loops
Example: Here is a loop in C:
    Loop: g = g + A[i];
          i = i + j;
          if (i != h) goto Loop;
Assume that the base of A is in $s5 and that g, h, i, and j are in $s1 through $s4. What is the MIPS assembly code corresponding to this C loop?
Solution:
    Loop: add $t1, $s3, $s3     # Temp reg $t1 = 2 * i
          add $t1, $t1, $t1     # Temp reg $t1 = 4 * i
          add $t1, $t1, $s5     # $t1 = address of A[i]
          lw  $t0, 0($t1)       # Temp reg $t0 = A[i]
          add $s1, $s1, $t0     # g = g + A[i]
          add $s3, $s3, $s4     # i = i + j
          bne $s3, $s2, Loop    # if (i != h) goto Loop

Control: While Loops
Example: Here is a traditional loop in C:
    while (save[i] == k)
        i = i + j;
Assume that i, j, and k correspond to registers $s3, $s4, and $s5, and the base of the array save is in $s6. What is the MIPS assembly code corresponding to this C segment?
Solution:
    Loop: add $t1, $s3, $s3     # Temp reg $t1 = 2 * i
          add $t1, $t1, $t1     # Temp reg $t1 = 4 * i
          add $t1, $t1, $s6     # $t1 = address of save[i]
          lw  $t0, 0($t1)       # Temp reg $t0 = save[i]
          bne $t0, $s5, Exit    # go to Exit if save[i] != k
          add $s3, $s3, $s4     # i = i + j
          j   Loop              # go to Loop
    Exit:

So far:
    Instruction            Meaning
    add $s1, $s2, $s3      $s1 = $s2 + $s3
    sub $s1, $s2, $s3      $s1 = $s2 - $s3
    lw  $s1, 100($s2)      $s1 = Memory[$s2+100]
    sw  $s1, 100($s2)      Memory[$s2+100] = $s1
    bne $s4, $s5, L        Next instr. is at L if $s4 != $s5
    beq $s4, $s5, L        Next instr. is at L if $s4 == $s5
    j   Label              Next instr. is at Label
Formats:
    R    op    rs    rt    rd    shamt    funct
    I    op    rs    rt    16 bit address
    J    op    26 bit address

Control Flow
We have beq and bne; what about branch-if-less-than?
New instruction: slt (set on less than)
    slt $t0, $s1, $s2     # if $s1 < $s2 then $t0 = 1 else $t0 = 0
Note that the assembler needs a register to build a branch-if-less-than from slt, so there are policy-of-use conventions for registers.

Control: Branch on Less Than
Example: What is the code to test if register $s0 is less than register $s1 and then branch to label Less if the condition holds?
Solution:
    slt $t0, $s0, $s1
    bne $t0, $zero, Less
Register $zero always contains 0.
The pair of instructions, slt and bne, implements branch on less than.
Jump register (jr): an unconditional jump to the address specified in a register.

Control: Case/Switch Statement
The simplest way to implement switch is via a sequence of conditional tests, turning the switch statement into a chain of if-then-else statements.
Alternatively, the alternatives can be encoded as a table of addresses of alternative instruction sequences, called a jump address table. The program needs only to index into the table and then jump to the appropriate sequence.
The jump table is an array of words containing addresses that correspond to labels in the code.
MIPS includes a jump register (jr) instruction, meaning an unconditional jump to the address specified in a register.
The program loads the appropriate entry from the jump table into a register, and then it jumps to the proper address using a jump register.

Control: Case/Switch Statement
Example:
    switch (k) {
        case 0: f = i + j; break;
        case 1: f = g + h; break;
        case 2: f = g - h; break;
        case 3: f = i - j; break;
    }
Assume that variables f through k correspond to six registers $s0 through $s5 and that register $t2 contains 4. What's the MIPS code?
Solution (assume the starting address of JumpTable is in $t4):
          slt $t3, $s5, $zero     # test whether k < 0
          bne $t3, $zero, Exit
          slt $t3, $s5, $t2       # test whether k < 4
          beq $t3, $zero, Exit
          add $t1, $s5, $s5       # $t1 = 2 * k
          add $t1, $t1, $t1       # $t1 = 4 * k
          add $t1, $t1, $t4       # $t1 = address of JumpTable[k]
          lw  $t0, 0($t1)         # $t0 = JumpTable[k]
          jr  $t0                 # jump to the selected case
    L0:   add $s0, $s3, $s4       # k = 0: f = i + j
          j   Exit
    L1:   add $s0, $s1, $s2       # k = 1: f = g + h
          j   Exit
    L2:   sub $s0, $s1, $s2       # k = 2: f = g - h
          j   Exit
    L3:   sub $s0, $s3, $s4       # k = 3: f = i - j
    Exit:

Control: Case/Switch Statement
Jump address table: the table starts at the address in $t4, and each location is 4 bytes.
[Figure: jump address table holding the addresses of L0, L1, L2, L3]

Policy of Use Conventions
    Name        Register number    Usage
    $zero       0                  the constant value 0
    $v0-$v1     2-3                values for results and expression evaluation
    $a0-$a3     4-7                arguments
    $t0-$t7     8-15               temporaries
    $s0-$s7     16-23              saved
    $t8-$t9     24-25              more temporaries
    $gp         28                 global pointer
    $sp         29                 stack pointer
    $fp         30                 frame pointer
    $ra         31                 return address
Register 1 ($at) is reserved for the assembler, and registers 26-27 for the operating system.

Supporting Procedures in Computer Hardware
To execute a procedure, the program must follow these six steps:
  1. Place parameters in a place where the procedure can access them.
  2. Transfer control to the procedure.
  3. Acquire the storage resources needed for the procedure.
  4. Perform the desired task.
  5. Place the result value in a place where the calling program can access it.
  6. Return control to the point of origin.
MIPS allocates 7 registers for procedure calls:
    $a0-$a3: to pass parameters
    $v0-$v1: to return values
    $ra: to return to the point of origin
Jump-and-link instruction (jal):
    jal ProcedureAddress
The jal instruction saves PC+4 in register $ra.

What happens when a procedure is called
Before calling a procedure, the caller must:
  1. Pass the arguments to the callee procedure. The first 4 arguments are passed in registers $a0-$a3 ($4-$7). The remaining arguments are placed on the stack.
  2. Save any caller-saved registers that the caller expects to use after the call. This includes the argument registers and the temporary registers $t0-$t9. (The callee may use these registers, altering the contents.)
  3. Execute a jal to the called procedure (callee). This saves the return address in $ra.
At this point, the callee must set up its stack frame:
  1. Allocate memory on the stack by subtracting the frame size from $sp.
  2. Save any registers the caller expects to have left unchanged. These include $ra, $fp, and the registers $s0-$s7.
  3. Set the value of the frame pointer $fp by adding the stack frame size to $sp and subtracting 4.
The procedure can then execute its function. Note that the argument list on the stack belongs to the stack frame of the caller.

Returning from a procedure
When the callee returns to the caller, the following steps are required:
  1. If the procedure is a function returning a value, the value is placed in register $v0 and, if two words are required, $v1 (registers $2 and $3).
  2. All callee-saved registers are restored by popping the values from the stack, in the reverse order from which they were pushed.
  3. The stack frame is popped by adding the frame size to $sp.
  4. The callee returns control to the caller by executing jr $ra.
Note that some of the operations may not be required for every procedure call, and modern compilers would only generate the steps required for each particular procedure. For example, the lowest-level subprograms to be called ("leaf nodes") would not have to save $ra. If a programming language does not allow a subprogram to call itself (recursion), then implementing a stack frame may not be required, but a stack is still required for nested procedure calls.

Supporting Procedures in Computer Hardware (cont.)
Return jump: jr $ra
Stack (last in, first out)
    Push: placing data onto the stack
    Pop: removing data from the stack
The stack grows from higher addresses to lower addresses.
Stack pointer $sp: used to save the registers needed by the callee.
[Figure: the caller passes arguments in $a0-$a3 and executes jal X; the callee X returns results in $v0-$v1 and executes jr $ra]

Example: Compiling a procedure that does not call another procedure
What is the compiled MIPS assembly code?
    int leaf_example(int g, int h, int i, int j)
    {
        int f;
        f = (g + h) - (i + j);
        return f;
    }
Solution
The parameter variables g, h, i, and j correspond to the argument registers $a0, $a1, $a2, and $a3, and f corresponds to $s0.
The leaf_example procedure uses three registers ($t0, $t1, and $s0) whose old values it saves, so 3 * 4 = 12 bytes are reserved on the stack.
    addi $sp, $sp, -12      # make room for 3 words
    sw   $t1, 8($sp)        # save the values of $t1, $t0, $s0 on the stack
    sw   $t0, 4($sp)
    sw   $s0, 0($sp)
    add  $t0, $a0, $a1      # $t0 = g + h
    add  $t1, $a2, $a3      # $t1 = i + j
    sub  $s0, $t0, $t1      # f = (g + h) - (i + j)
    add  $v0, $s0, $zero    # return f
    lw   $s0, 0($sp)        # restore $s0, $t0, $t1 from the stack
    lw   $t0, 4($sp)
    lw   $t1, 8($sp)
    addi $sp, $sp, 12       # pop 3 words off the stack
    jr   $ra
[Figure: the stack (a) before, (b) during, and (c) after the procedure call, drawn from high address at the top to low address at the bottom; during the call it holds the contents of registers $t1, $t0, and $s0, with $sp pointing at the lowest entry]

Nested Procedures
Push all the other registers that must be preserved onto the stack.
The stack pointer $sp is adjusted to account for the number of registers placed on the stack.

Example: Compiling a recursive procedure, showing nested procedure linking
    int fact(int n)
    {
        if (n < 1) return (1);
        else return (n * fact(n - 1));
    }
Solution
Because both $a0 and the return address $ra are needed after the recursive call, two stack slots are required, and $a0 and $ra are saved on the stack first.
    fact: addi $sp, $sp, -8       # adjust stack for 2 items
          sw   $ra, 4($sp)        # save the return address
          sw   $a0, 0($sp)        # save the argument n
          slti $t0, $a0, 1        # test for n < 1
          beq  $t0, $zero, L1     # if n >= 1, go to L1
          addi $v0, $zero, 1      # return 1
          addi $sp, $sp, 8        # pop 2 items off stack
          jr   $ra
    L1:   addi $a0, $a0, -1       # n >= 1: argument gets (n - 1)
          jal  fact               # call fact with (n - 1)
          lw   $a0, 0($sp)        # return from jal: restore argument n
          lw   $ra, 4($sp)        # restore the return address
          addi $sp, $sp, 8        # pop 2 items off stack
          mul  $v0, $a0, $v0      # return n * fact(n - 1)
          jr   $ra

The MIPS memory allocation for program and data
The data segment is divided into 2 parts: the lower part for static data (with size known at compile time) and the upper part, which can grow upward, for dynamic data structures.
The stack segment varies in size during the execution of a program, as functions are called and returned from. It starts at the top of memory and grows down.
Memory map, from high addresses to low:
    $sp -> 7fff ffff hex    Stack (grows down)
                            Dynamic data (grows up)
    $gp -> 1000 8000 hex
           1000 0000 hex    Static data
    pc  -> 0040 0000 hex    Text
           0                Reserved

What is preserved across a procedure call
    Preserved                          Not preserved
    Saved registers: $s0-$s7           Temporary registers: $t0-$t9
    Stack pointer register: $sp        Argument registers: $a0-$a3
    Return address register: $ra       Return value registers: $v0-$v1
    Stack above the stack pointer      Stack below the stack pointer

Beyond Numbers
Load byte (lb): loads a byte from memory, placing it in the rightmost 8 bits of a register.
Store byte (sb): takes a byte from the rightmost 8 bits of a register and writes it to memory.
    lb $t0, 0($sp)      # read byte from source
    sb $t0, 0($sp)      # write byte to destination
There are three choices for representing a string:
  - The first position of the string is reserved to give the length of the string.
  - An accompanying variable has the length of the string.
  - The last position of the string is indicated by a character used to mark the end of the string.

Example: Compiling a string copy procedure, showing how to use C strings
    void strcpy(char x[], char y[])
    {
        int i;
        i = 0;
        while ((x[i] = y[i]) != 0)
            i = i + 1;
    }
Solution
The base addresses of arrays x and y are in $a0 and $a1; i is in $s0.
    strcpy: addi $sp, $sp, -4         # adjust stack for 1 more item
            sw   $s0, 0($sp)          # save $s0
            add  $s0, $zero, $zero    # i = 0
    L1:     add  $t1, $a1, $s0        # address of y[i] in $t1
            lb   $t2, 0($t1)          # $t2 = y[i]
            add  $t3, $a0, $s0        # address of x[i] in $t3
            sb   $t2, 0($t3)          # x[i] = y[i]
            beq  $t2, $zero, L2       # if y[i] == 0, go to L2
            addi $s0, $s0, 1          # i = i + 1
            j    L1                   # go to L1
    L2:     lw   $s0, 0($sp)          # y[i] == 0: end of string; restore old $s0
            addi $sp, $sp, 4          # pop 1 word off stack
            jr   $ra                  # return

Constants
Small constants are used quite frequently (50% of operands), e.g.,
    A = A + 5;
    B = B + 1;
    C = C - 18;
Possible solutions (and why not?):
  - Put "typical constants" in memory and load them. For example, to add the constant 4 to register $sp:
        lw  $t0, AddrConstant4
        add $sp, $sp, $t0
  - Create hard-wired registers (like $zero) for constants like one.
Example: Translating Assembly Constants into Machine Language
Solution: addi $sp, $sp, 4
    Field     op (6 bits)    rs (5 bits)    rt (5 bits)    immediate (16 bits)
    Decimal   8              29             29             4
    Binary    001000         11101          11101          0000 0000 0000 0100

Immediate Operands
Immediate version of the set on less than instruction:
    slti $t0, $s2, 10     # $t0 = 1 if $s2 < 10
Load upper immediate instruction: sets the upper 16 bits of a constant in a register.
    lui $t0, 255          # $t0 is register 8
The machine language version of lui $t0, 255:
    op        rs       rt       immediate
    001111    00000    01000    0000 0000 1111 1111
Contents of register $t0 after executing lui $t0, 255:
    0000 0000 1111 1111 0000 0000 0000 0000

How about larger constants?
We'd like to be able to load a 32-bit constant into a register.
We must use two instructions. First, the new "load upper immediate" instruction:
    lui $t0, 1010101010101010
    $t0 = 1010101010101010 0000000000000000    (lower 16 bits filled with zeros)
Then we must get the lower-order bits right:
    ori $t0, $t0, 1010101010101010
           1010101010101010 0000000000000000    ($t0 after lui)
    ori    0000000000000000 1010101010101010    (immediate)
         = 1010101010101010 1010101010101010

Example: Loading a 32-bit constant
What is the MIPS assembly code to load this 32-bit constant into register $s0?
    0000 0000 0011 1101 0000 1001 0000 0000
Solution (1):
    lui  $s0, 61            # $s0 = 0000 0000 0011 1101 0000 0000 0000 0000
    addi $s0, $s0, 2304     # $s0 = 0000 0000 0011 1101 0000 1001 0000 0000
Solution (2) (discussed in Chapter 4):
    lui $s0, 61
    ori $s0, $s0, 2304

Addresses in Branches and Jumps
Instructions:
    bne $t4, $t5, Label    Next instruction is at Label if $t4 != $t5
    beq $t4, $t5, Label    Next instruction is at Label if $t4 == $t5
    j   Label              Next instruction is at Label
Formats:
    I    op    rs    rt    16 bit address      (conditional branch)
    J    op (6 bits)       26 bit address      (unconditional branch)
Addresses are not 32 bits. How do we handle this with load and store instructions?
    Program counter = register + branch address
This is PC-relative addressing.

Showing Branch Offset in Machine Language
Assume that the loop is placed starting at location 80000.
    Loop: add $t1, $s3, $s3
          add $t1, $t1, $t1
          add $t1, $t1, $s6
          lw  $t0, 0($t1)
          bne $t0, $s5, Exit
          add $s3, $s3, $s4
          j   Loop
    Exit:
Solution:
    Address   Machine code fields
    80000     0     19    19    9     0     32
    80004     0     9     9     9     0     32
    80008     0     9     22    9     0     32
    80012     35    9     8     0
    80016     5     8     21    8 (2 as a word offset)
    80020     0     19    20    19    0     32
    80024     2     80000 (20000 as a word address)
    80028     ...

Showing Branch Offset in Machine Language (cont.)
The while loop on page 74 was compiled into this MIPS assembler code:
    Loop: sll  $t1, $s3, 2
          add  $t1, $t1, $s6
          lw   $t0, 0($t1)
          bne  $t0, $s5, Exit
          addi $s3, $s3, 1
          j    Loop
    Exit:
If we assume we place the loop starting at location 80000 in memory, what is the MIPS machine code for this loop?
    Address   Machine code fields
    80000     0     0     19    9     2     0
    80004     0     9     22    9     0     32
    80008     35    9     8     0
    80012     5     8     21    2
    80016     8     19    19    1
    80020     2     20000

Branching Far Away
What if a conditional branch needs to jump far away?
Insert an unconditional jump to the branch target, and invert the condition so that the branch decides whether to skip the jump.
Example: Given a branch on register $s0 being equal to register $s1,
    beq $s0, $s1, L1
replace it by a pair of instructions that offers a much greater branching distance.
Solution
        bne $s0, $s1, L2
        j   L1
    L2:

MIPS Addressing Mode Summary
Addressing modes:
  - Register addressing: the operand is a register.
  - Base addressing: the operand is at the memory location whose address is the sum of a register and a constant in the instruction.
  - Immediate addressing: the operand is a constant within the instruction itself.
  - PC-relative addressing: the address is the sum of the PC and a constant in the instruction.
  - Pseudo-direct addressing: the jump address is the 26 bits of the instruction concatenated with the upper bits of the PC.

MIPS addressing modes
[Figure: 1. Immediate addressing (op, rs, rt, immediate); 2. Register addressing (op, rs, rt, rd, ..., funct; the operand is in a register); 3. Base addressing (op, rs, rt, address; register + address selects a byte, halfword, or word in memory); 4. PC-relative addressing (op, rs, rt, address; PC + address selects a word in memory); 5. Pseudodirect addressing (op, address; the address joined with the PC selects a word in memory)]
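
The PC-relative and pseudo-direct address calculations can be checked against the loop placed at location 80000. This is a minimal Python sketch, not part of the original slides: the function names are hypothetical, and it assumes the usual MIPS convention that the offset is counted in words relative to PC + 4 and that a jump keeps the upper 4 bits of PC + 4.

```python
def branch_offset(pc, target):
    """Word offset stored in a beq/bne at address pc that branches to target."""
    return (target - (pc + 4)) // 4


def branch_target(pc, offset_words):
    """PC-relative addressing: target = (PC + 4) + 4 * offset."""
    return pc + 4 + 4 * offset_words


def jump_target(pc, addr26):
    """Pseudo-direct addressing: upper 4 bits of PC + 4 joined with addr26 * 4."""
    return ((pc + 4) & 0xF0000000) | (addr26 << 2)


# sll version of the loop: bne at 80012, Exit at 80024, j at 80020.
print(branch_offset(80012, 80024))   # 2
print(branch_target(80012, 2))       # 80024
print(jump_target(80020, 20000))     # 80000

# add version of the loop: bne at 80016, Exit at 80028.
print(branch_offset(80016, 80028))   # 2
```

Both versions of the loop store 2 in the bne address field because Exit is two instructions past the instruction that follows the branch.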
Example: Decoding machine code
What is the assembly language corresponding to this machine instruction?
    0000 0000 1010 1111 1000 0000 0010 0000
Solution:
    op        rs       rt       rd       shamt    funct
    000000    00101    01111    10000    00000    100000
Looking up these fields in the encoding table (Fig. 2.25) gives
    add $s0, $a1, $t7

To summarize: MIPS operands
    Name               Example                          Comments
    32 registers       $s0-$s7, $t0-$t9, $zero,         Fast locations for data. In MIPS, data must be in registers to perform
                       $a0-$a3, $v0-$v1, $gp,           arithmetic. MIPS register $zero always equals 0. Register $at is
                       $fp, $sp, $ra, $at               reserved for the assembler to handle large constants.
    2^30 memory words  Memory[0], Memory[4], ...,       Accessed only by data transfer instructions. MIPS uses byte addresses, so
                       Memory[4294967292]               sequential words differ by 4. Memory holds data structures, such as arrays,
                                                        and spilled registers, such as those saved on procedure calls.

MIPS assembly language
    Category            Instruction              Example               Meaning                                    Comments
    Arithmetic          add                      add $s1, $s2, $s3     $s1 = $s2 + $s3                            Three operands; data in registers
                        subtract                 sub $s1, $s2, $s3     $s1 = $s2 - $s3                            Three operands; data in registers
                        add immediate            addi $s1, $s2, 100    $s1 = $s2 + 100                            Used to add constants
    Data transfer       load word                lw $s1, 100($s2)      $s1 = Memory[$s2 + 100]                    Word from memory to register
                        store word               sw $s1, 100($s2)      Memory[$s2 + 100] = $s1                    Word from register to memory
                        load byte                lb $s1, 100($s2)      $s1 = Memory[$s2 + 100]                    Byte from memory to register
                        store byte               sb $s1, 100($s2)      Memory[$s2 + 100] = $s1                    Byte from register to memory
                        load upper immediate     lui $s1, 100          $s1 = 100 * 2^16                           Loads constant in upper 16 bits
    Conditional branch  branch on equal          beq $s1, $s2, 25      if ($s1 == $s2) go to PC + 4 + 100         Equal test; PC-relative branch
                        branch on not equal      bne $s1, $s2, 25      if ($s1 != $s2) go to PC + 4 + 100         Not equal test; PC-relative
                        set on less than         slt $s1, $s2, $s3     if ($s2 < $s3) $s1 = 1; else $s1 = 0       Compare less than; for beq, bne
                        set less than immediate  slti $s1, $s2, 100    if ($s2 < 100) $s1 = 1; else $s1 = 0       Compare less than constant
    Unconditional jump  jump                     j 2500                go to 10000                                Jump to target address
                        jump register            jr $ra                go to $ra                                  For switch, procedure return
                        jump and link            jal 2500              $ra = PC + 4; go to 10000                  For procedure call

MIPS instruction encoding
[Figure]

Translating and Starting a Program
A translation hierarchy:
    C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module
    Object: machine language module + Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory
To speed up the translation process, some steps are merged or skipped, for example:
  1. Some compilers produce object code directly.
  2. A linking loader combines the linker and the loader.

Translating and Starting a Program
Compiler
  The compiler transforms the C program into an assembly language program, a symbolic form of what the machine understands.
Assembler
  The assembler converts the assembly language instructions into machine language.
  Pseudoinstructions: sometimes an assembler will accept a statement that does not correspond exactly to a machine instruction. For example, it may correspond to a small set of machine instructions. These are called pseudoinstructions.
  Assemblers keep track of the labels used in branches and data transfer instructions in a symbol table.

Translating and Starting a Program
The object file for UNIX systems typically contains:
  - Object file header: describes the size and position of the other pieces of the object file.
  - Text segment: contains the machine code.
  - Static data segment: contains data allocated for the life of the program.
  - Relocation information: identifies instructions and data words that depend on absolute addresses when the program is loaded into memory.
  - Symbol table: contains the remaining labels that are not defined, such as external references.
  - Debugging information: contains a concise description of how the modules were compiled.

Translating and Starting a Program
Linker (link editor)
  - Places code and data modules symbolically in memory.
  - Determines the addresses of data and instruction labels.
  - Patches both the internal and external references.
  The linker uses the relocation information and symbol table in each object module to resolve all undefined labels.
  The linker produces an executable file that can be run on a computer.
Loader
  - Reads the executable file header to determine the size of the text and data segments.
  - Creates an address space large enough for the text and data.
  - Copies the instructions and data from the executable file into memory.
  - Initializes the machine registers and sets the stack pointer to the first free location.
  - Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program.

Dynamic Linked Libraries
Although traditional static linking of libraries is the fastest way to call library routines, it has a few disadvantages:
  - The library routines become part of the executable code.
  - It loads the whole library even if not all of the library is used when the program is run.
Dynamically linked libraries
  - The library routines are not linked and loaded until the program is run.

Starting a Java Program
Java is compiled first to instructions that are easy to interpret: the Java bytecode instruction set.
Java programs are distributed in the binary version of these bytecodes.
The Java Virtual Machine (JVM) is an interpreter that can execute Java bytecodes.
Array Version of Clear
    clear1(int array[], int size)
    {
        int i;
        for (i = 0; i < size; i = i + 1)
            array[i] = 0;
    }
Register assignment: array: $a0, size: $a1, i: $t0
Solution:
           move $t0, $zero           # i = 0
    loop1: add  $t1, $t0, $t0        # $t1 = i * 2
           add  $t1, $t1, $t1        # $t1 = i * 4 (the two adds can be replaced by sll $t1, $t0, 2)
           add  $t2, $a0, $t1        # $t2 = address of array[i]
           sw   $zero, 0($t2)        # array[i] = 0
           addi $t0, $t0, 1          # i = i + 1
           slt  $t3, $t0, $a1        # $t3 = (i < size)
           bne  $t3, $zero, loop1    # if (i < size) goto loop1

Pointer Version of Clear
    clear2(int *array, int size)
    {
        int *p;
        for (p = &array[0]; p < &array[size]; p = p + 1)
            *p = 0;
    }
Register assignment: array: $a0, size: $a1, p: $t0
Solution:
           move $t0, $a0             # p = address of array[0]
    loop2: sw   $zero, 0($t0)        # Memory[p] = 0
           addi $t0, $t0, 4          # p = p + 4
           add  $t1, $a1, $a1        # $t1 = size * 2
           add  $t1, $t1, $t1        # $t1 = size * 4 (the two adds can be replaced by sll $t1, $a1, 2)
           add  $t2, $a0, $t1        # $t2 = address of array[size]
           slt  $t3, $t0, $t2        # $t3 = (p < &array[size])
           bne  $t3, $zero, loop2    # if (p < &array[size]) goto loop2

Alternative Architectures
Design alternative:
  - provide more powerful operations
  - the goal is to reduce the number of instructions executed
  - the danger is a slower cycle time and/or a higher CPI
Sometimes referred to as "RISC vs. CISC":
  - virtually all new instruction sets since 1982 have been RISC
  - VAX: minimize code size, make assembly language easy; instructions from 1 to 54 bytes long!
We'll look at PowerPC and 80x86.

PowerPC
Indexed addressing
    example: lw $t1, $a0+$s3        # $t1 = Memory[$a0+$s3]
    What do we have to do in MIPS?
Update addressing
    update a register as part of a load (for marching through arrays)
    example: lwu $t0, 4($s3)        # $t0 = Memory[$s3+4]; $s3 = $s3+4
    What do we have to do in MIPS?
Others:
  - load multiple / store multiple
  - a special counter register: "bc Loop" decrements the counter and, if it is not 0, goes to Loop

80x86
1978: The Intel 8086 is announced (16-bit architecture)
1980: The 8087 floating point coprocessor is added
1982: The 80286 increases address space to 24 bits, plus new instructions
1985: The 80386 extends to 32 bits, new addressing modes
1989-1995: The 80486, Pentium, and Pentium Pro add a few instructions (mostly designed for higher performance)
1997: MMX is added
"This history illustrates the impact of the "golden handcuffs" of compatibility"
"adding new features as someone might add clothing to a packed bag"
"an architecture that is difficult to explain and impossible to love"

A dominant architecture: 80x86
See your textbook for a more detailed description.
Complexity:
  - instructions from 1 to 17 bytes long
  - one operand must act as both a source and destination
  - one operand can come from memory
  - complex addressing modes, e.g., "base or scaled index with 8 or 32 bit displacement"
Saving grace:
  - the most frequently used instructions are not too difficult to build
  - compilers avoid the portions of the architecture that are slow
"what the 80x86 lacks in style is made up in quantity, making it beautiful from the right perspective"

Some typical 80x86 instructions and their functions
    Instruction              Function
    JE name                  if equal(CC) {EIP = name}; EIP - 128 ≤ name < EIP + 128
    JMP name                 EIP = name
    CALL name                SP = SP - 4; M[SP] = EIP + 5; EIP = name
    MOVW EBX,[EDI + 45]      EBX = M[EDI + 45]
    PUSH ESI                 SP = SP - 4; M[SP] = ESI
    POP EDI                  EDI = M[SP]; SP = SP + 4
    ADD EAX,#6765            EAX = EAX + 6765
    TEST EDX,#42             Set condition codes (flags) with EDX & 42
    MOVSL                    M[EDI] = M[ESI]; EDI = EDI + 4; ESI = ESI + 4

Typical 80x86 instruction formats (field widths in bits)
    a. JE EIP + displacement:    JE (4) | Condition (4) | Displacement (8)
    b. CALL:                     CALL (8) | Offset (32)
    c. MOV EBX, [EDI + 45]:      MOV (6) | d (1) | w (1) | r-m postbyte (8) | Displacement (8)
    d. PUSH ESI:                 PUSH (5) | Reg (3)
    e. ADD EAX, #6765:           ADD (4) | Reg (3) | w (1) | Immediate (32)
    f. TEST EDX, #42:            TEST (7) | w (1) | Postbyte (8) | Immediate (32)

Summary
Instruction complexity is only one variable:
  - lower instruction count vs. higher CPI / lower clock rate
Design Principles:
  - simplicity favors regularity
  - smaller is faster
  - good design demands compromise
  - make the common case fast
Instruction set architecture: a very important abstraction indeed!