Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
CEN 333-316 Computer Organization and Design ISA Hesham Al-Twaijry Edited by: Mansour Al Zuair These Lectures: ISA & MIPS Operands and data types Computational operations Memory access & addressing Branches Components of an Procedure call ISA Instruction encoding Assembling and linking Alternatives Later in course: exceptions and interrupts Chapter 2 2 Assembly Language Basic job of a CPU: execute lots of instructions. Instructions are the primitive operations that the CPU may execute. Different CPUs implement different sets of instructions. The set of instructions a particular CPU implements is an Instruction Set Architecture (ISA). • Chapter 2 Examples: Intel 80x86 (Pentium 4), IBM/Motorola PowerPC (Macintosh), MIPS, Intel IA64, ... 3 Instruction Set Architectures Early trend was to add more and more instructions to new CPUs to do elaborate operations • VAX architecture had an instruction to multiply polynomials! RISC philosophy (Cocke IBM, Patterson, Hennessy, 1980s) – Reduced Instruction Set Computing • • Chapter 2 Keep the instruction set small and simple, makes it easier to build fast hardware. Let software do complicated operations by composing simpler ones. 4 MIPS Architecture MIPS – semiconductor company that built one of the first commercial RISC architectures We will study the MIPS architecture in some detail in this class Why MIPS instead of Intel 80x86? • • Chapter 2 MIPS is simple, elegant. Don’t want to get bogged down in gritty details. MIPS widely used in embedded apps, x86 little used in embedded, and more embedded computers than PCs 5 MIPS Architectural Approach Load/store or register-register instruction set • Data must be in registers to be operated on – register operations affect the entire contents of register • • • Only load/store instructions affect memory True in all RISC instruction sets True in all instruction sets designed since 1980 Emphasis on efficient implementation Simplicity: provide primitives rather than solutions Chapter 2 6 Assembly Variables: Registers Unlike HLL like C or Java, assembly cannot use variables • Assembly Operands are registers • • Why not? Keep Hardware Simple limited number of special locations built directly into the hardware operations can only be performed on these! Benefit: Since registers are directly in hardware, they are very fast (faster than 1 billionth of a second) Drawback: Since registers are in hardware, there are a predetermined number of them • Chapter 2 Solution: MIPS code must be very carefully put together to efficiently use registers 7 Assembly Variables: Registers 32 registers in MIPS • Each MIPS register is 32 bits wide • Why 32? Smaller is faster Groups of 32 bits called a word in MIPS Registers are numbered from 0 to 31 Each register can be referred to by number or name Number references: $0, $1, $2, … $30, $31 By convention, each register also has a name to make it easier to code For now: $16 - $23 $8 - $15 $s0 - $s7 $t0 - $t7 In general, use names to make your code more readable Chapter 2 8 Assembly Variables: Registers Name Register number $zero 0 $v0-$v1 2-3 $a0-$a3 4-7 $t0-$t7 8-15 $s0-$s7 16-23 $t8-$t9 24-25 $gp 28 $sp 29 $fp 30 $ra 31 Chapter 2 Usage the constant value 0 values for results and expression evaluation arguments temporaries saved more temporaries global pointer stack pointer frame pointer return address 9 C, Java variables vs. registers In C (and most High Level Languages) variables declared first and given a type • Example: int fahr, celsius; char a, b, c, d, e; Each variable can ONLY represent a value of the type it was declared as (cannot mix and match int and char variables). In Assembly Language, the registers have no type; operation determines how register contents are treated Chapter 2 10 Comments in Assembly and instructions Another way to make your code more readable: comments! Hash (#) is used for MIPS comments • anything from hash mark to end of line is a comment and will be ignored In assembly language, each statement (called an Instruction), executes exactly one of a short list of simple commands Unlike in C (and most other High Level Languages), each line of assembly code contains at most 1 instruction Instructions are related to operations (=, +, -, *, /) in C or Java Chapter 2 11 Data Types: Typical • • Bit: 0, 1 Bit string: sequence of bits of a particular length • • • • • Character: • • Digits 0-9 encoded as 0000b thru 1001b, two per byte Not supported in most newer architectures Integers: • • Supported as a byte (signed or unsigned) Decimal: • • • 8 bits is a byte 16 bits is a half-word 32 bits is a word 64 bits is a double-word 2's complement: next chapter Floating point: M x 2E • • • Chapter 2 Single precision Double precision Extended precision 12 MIPS I Storage Model 232 bytes of memory: accessible by loads/stores 31 x 32-bit GPRs (R0 = 0) 32 x 32-bit FP regs–organized as 16 pairs HI, LO: used for integer multiply/divide PC: branch and procedure call $0 $1 ° ° ° $31 0 $f0 $f1 ° ° ° $f15 PC lo hi Chapter 2 13 Computational Instructions Arithmetic/logical instructions • • • Three operand format: result + two sources Operands: registers, 16-bit immediates Signed & unsigned arithmetic operations: – Sign-extension for immediates – Trapping of overflow for signed values • Compare instructions – Signed vs. Unsigned: comparison is different Integer multiply/divide • Use HI/LO registers Floating point instructions • • • Chapter 2 Operate on floating point registers Double and single precision Typical: add, multiply, divide, subtract 14 MIPS Addition and Subtraction Syntax of Instructions: 1 2,3,4 where: 1) operation by name 2) operand getting result (“destination”) 3) 1st operand for operation (“source1”) 4) 2nd operand for operation (“source2”) Syntax is rigid: • • Chapter 2 1 operator, 3 operands Why? Keep Hardware simple via regularity 15 Addition and Subtraction of Integers Addition in Assembly Example: add $s0,$s1,$s2 (in MIPS) Equivalent to: a = b + c (in C) where MIPS registers $s0,$s1,$s2 are associated with C variables a, b, c • Subtraction in Assembly Example: sub $s3,$s4,$s5 (in MIPS) Equivalent to: d = e - f (in C) where MIPS registers $s3,$s4,$s5 are associated with C variables d, e, f • Chapter 2 16 Addition and Subtraction of Integers How do the following C statement? a = b + c + d - e; Break into multiple instructions add $t0, $s1, $s2 # temp = b + c add $t0, $t0, $s3 # temp = temp + d sub $s0, $t0, $s4 # a = temp - e Notice: A single line of C may break up into several lines of MIPS. Notice: Everything after the hash mark on each line is ignored (comments) Chapter 2 17 Addition and Subtraction of Integers How do we do this? f = (g + h) - (i + j); Use intermediate temporary register add $t0,$s1,$s2 add $t1,$s3,$s4 sub $s0,$t0,$t1 Chapter 2 # temp = g + h # temp = i + j # f=(g+h)-(i+j) 18 MIPS instructions Register Zero How do we do f = g (in C) ? add $s0,$s1, $zero (in MIPS) where MIPS registers $s0,$s1 are associated with C variables f, g Immediates: Add f = g + 10 (in C) addi $s0,$s1,10 (in MIPS) where MIPS registers $s0,$s1 are associated with C variables f, g • Chapter 2 19 MIPS instructions There is no Subtract Immediate in MIPS: Why? • • if an operation can be decomposed into a simpler operation, don’t include it addi …, -X = subi …, X => so no subi addi $s0,$s1,-10 (in MIPS) f = g - 10 (in C) where MIPS registers $s0,$s1 are associated with C variables f, g Chapter 2 20 MIPS Integer Arithmetic Instruction Example Meaning Comments add subtract add immediate add unsigned subtract unsign add imm unsign sub imm unsign set less than set less than imm set less than uns set l. t. imm. uns. add $1,$2,$3 sub $1,$2,$3 addi $1,$2,100 addu $1,$2,$3 subu $1,$2,$3 addiu $1,$2,100 subiu $1,$2,100 slt $1,$2,$3 slti $1,$2,100 sltu $1,$2,$3 sltiu $1,$2,100 $1 = $2 + $3 $1 = $2 – $3 $1 = $2 + 100 $1 = $2 + $3 $1 = $2 – $3 $1 = $2 + 100 $1 = $2 – 100 $1 = ($2 < $3) $1 = ($2 < 100) $1 = ($2 < $3) $1 = ($2 < 100) 3 operands; exception possible 3 operands; exception possible + constant; exception possible 3 operands; no exceptions 3 operands; no exceptions + constant; no exceptions – constant; no exception compare less than signed compare w. constant signed compare less than unsigned compare< constant unsigned Chapter 2 21 Multiply / Divide Start multiply, divide multiply mult $2,$3 multiply unsign multu$2,$3 divide div $2,$3 divide unsign divu $2,$3 Move result from multiply, divide Move from Hi mfhi $1 Move from Lo mflo $1 Rationale: Hi, Lo = $2 x $3 Hi, Lo = $2 x $3 Lo = $2 ÷ $3, Hi = $2 mod $3 Lo = $2 ÷ $3, Hi = $2 mod $3 64-bit signed product 64-bit unsigned product Lo = quotient Hi = remainder Unsigned quotient Unsigned remainder $1 = Hi $1 = Lo Used to get copy of Hi Used to get copy of Lo • • deal with 64-bit result simplify handling of instruction LO Registers Chapter 2 A d d HI 22 MIPS Logical Instructions Instruction Example Meaning Comment and or xor nor and immediate or immediate xor immediate shift left log shift right log shift right arith shift left log var shift right log var shift right arith load upper imm and $1,$2,$3 or $1,$2,$3 xor $1,$2,$3 nor $1,$2,$3 andi $1,$2,10 ori $1,$2,10 xori $1, $2,10 sll $1,$2,10 srl $1,$2,10 sra $1,$2,10 sllv $1,$2,$3 srlv $1,$2, $3 srav $1,$2, $3 lui $1,40 $1 = $2 & $3 $1 = $2 | $3 $1 = $2 $3 $1 = ~($2 |$3) $1 = $2 & 10 $1 = $2 | 10 $1 = $2 10 $1 = $2 << 10 $1 = $2 >> 10 $1 = $2 >> 10 $1 = $2 << $3 $1 = $2 >> $3 $1 = $2 >> $3 $1 = 40 << 16 Logical AND Logical OR Logical XOR Logical NOR Logical AND w. constant Logical OR w. constant Logical XOR w. constant Shift left by constant Shift right by constant Shift right (sign extend) Shift left by variable Shift right by variable Shift right arith. by var Places immediate into upper 16 bits Chapter 2 23 How about larger constants? We'd like to be able to load a 32 bit constant into a register Must use two instructions, new "load upper immediate" instruction lui $t0, 1010101010101010 filled with zeros 1010101010101010 0000000000000000 Then must get the lower order bits right, i.e., ori $t0, $t0, 1010101010101010 ori Chapter 2 1010101010101010 0000000000000000 0000000000000000 1010101010101010 1010101010101010 1010101010101010 24 Memory Addressing Byte addressing: • Since 1980 every machine uses addresses to level of 8-bits. Three questions: • Can individual bytes be accessed? – Yes, in almost every machine (half-words also) • How do byte addresses map into words? – Byte order – A word is accessible either as 32 bits or as 4 bytes • How can words be positioned in memory? – Alignment Chapter 2 25 Machine Language Instructions, like registers and words of data, are also 32 bits long • • Example: add $t1, $s1, $s2 registers have numbers, $t1=9, $s1=17, $s2=18 Instruction Format: 00000010001 10010 01001 00000 100000 op Chapter 2 rs rt rd shamt funct 26 Byte Ordering Two conventions–named based on Gulliver’s travels Big Endian: • • Little Endian: • • address of most significant byte = word address (x000 = Big End of word) IBM 360/370, Motorola 68k, Sparc, HP PA address of least significant byte = word address (000x = Little End of word) Intel 80x86, DEC Vax, DEC Alpha Bimodal: MIPS, PowerPC (both mostly Big Endian) 3 2 1 little endian byte 0 0 msb lsb 0 1 2 3 big endian byte 0 Chapter 2 27 Alignment Alignment: require that objects fall on address that is multiple of their size. Important performance effect. Historically: • • • Early machines (IBM 360 in 1964) require alignment Restriction removed in 1970s: too hard for programmers! RISC machines: reintroduce restriction– important to performance 0 1 Aligned Not Aligned Example: word access (also half-word and double word) Chapter 2 28 2 3 MIPS memory access All memory access through loads and stores Aligned words, halfwords, and bytes • halfwords and bytes may be sign or 0 extended Floating-point loads/stores for FP registers Single addressing mode (displacement or based): 16-bit sign-extended displacement (immediate field) • + register Registers Memory • = memory address + In addition: • displacement = 0 uses register contents as address • register = 0 uses 16-bit displacement as address • Chapter 2 29 Data to load/ location to store into MIPS Load/store Instructions Instruction Example Meaning Comments store word store half store byte store float sw sh sb sf 8($4), $3 6($4), $3 7($4), $3 4($2), $f2 Mem[$4+8]=$3 Mem[$4+6]=$3 Mem[$4+7]=$3 Mem[$2+4]=$f2 Store word Stores only lower 16 bits Stores only lowest byte Store FP word load word load halfword load half unsign load byte load byte unsign load float lw $1, 8($2) lh $1, 6($2) lhu $1, 6($2) lb $1, 5($2) lbu $1, 5($2) lf F1, 4($3) $1=Mem[8+$2] $1=Mem[6+$2] $1=Mem[6+$2] $1=Mem[5+$2] $1=Mem[5+$2] $f1=Mem[4+$3] Load word Load half; sign extend Load half; zero extend Load byte; sign extend Load byte; zero extend Load FP register Chapter 2 30 MIPS Load/store Instructions Load and store instructions Example: C code: A[12] = h + A[8]; lw $t0, 32($s3) add $t0, $s2, $t0 sw $t0, 48($s3) Remember arithmetic operands are registers, not memory! MIPS code: Can’t write: Chapter 2 add 48($s3), $s2, 32($s3) 31 Example Can we figure out the code? swap(int v[], int k); { int temp; temp = v[k] v[k] = v[k+1]; v[k+1] = temp; } swap: muli $2, $5, 4 add $2, $4, $2 lw $15, 0($2) lw $16, 4($2) sw $16, 0($2) sw $15, 4($2) jr $31 Chapter 2 32 Machine Language Instructions, like registers and words of data, are also 32 bits long • • Example: add $t1, $s1, $s2 registers have numbers, $t1=9, $s1=17, $s2=18 Instruction Format: 000000 10001 10010 01000 00000 100000 op rs rt rd shamt funct Can you guess what the field names stand for? Chapter 2 33 Machine Language Consider the load-word and store-word instructions, • • Introduce a new type of instruction format • • What would the regularity principle have us do? New principle: Good design demands a compromise I-type for data transfer instructions other format was R-type for register Example: lw $t0, 32($s2) 35 op 10010 rs 01000 rt 32 16 bit number Where's the compromise? Chapter 2 34 Stored Program Concept Instructions are bits Programs are stored in memory — to be read or written just like data Processor Memory memory for data, programs, compilers, editors, etc. Fetch & Execute Cycle • • • Chapter 2 Instructions are fetched and put into a special register Bits in the register "control" the subsequent actions Fetch the “next” instruction and continue 35 MIPS Jump/branch Instructions Two classes: • Jumps–unconditional, not pc-relative – For procedure call, unconditional control, switch statements • Branches–conditional and PC relative – For conditional control and pc-relative unconditional Instruction Jump Jump register Jump and link Chapter 2 example meaning j 10000 PC = 10000 jr $31 PC = $31 jal 10000 $31 = PC + 4; PC = 10000 comment jump to address jump to address in register Save PC next instruction jump to address 36 MIPS Compare and Branch Conditional branch is compare-and-branch • conditions: – comparison against 0: equality, sign-test – comparison of two registers: equality only – remaining set of compare-and-branch take two instructions • unconditional formulated with $0: beq $0,$0,where Instruction branch equal branch not eq branch l.t. 0 branch l.t./eq 0 branch g.t. 0 branch g.t./eq 0 Chapter 2 Example beq $1,$2,100 bne $1,$2,100 bltz $1,100 blez $1,100 bgtz $1,100 bgez $1,100 Meaning if ($1 == $2) PC=PC+4+100 if ($1 != $2) PC=PC+4+100 if ($1 < 0) PC = PC+4+100 if ($1 <= 0) PC = PC+4+100 if ($1 > 0) PC = PC+4+100 if ($1 >= 0) PC = PC+4+100 37 Compiling C if into MIPS Compile by hand (true) i == j if (i == j) i == j? (false) i != j f=g-h f=g+h f = g+h; Exit else f = g-h; Use this mapping: ° Final compiled MIPS code: • • • • • f: $s0, g: $s1, h: $s2, i: $s3, j: $s4 beq $s3, $s4, True # branch i==j sub $s0, $s1, $s2 # f=g-h(false) j Fin # go to Fin $s0,$s1,$s2 # f=g+h (true) True: add Fin: Chapter 2 38 Example: Searching an Array C code: count=0; for (index=head; index<=n; index++) if (C[index] = = target) count ++; MIPS assembly code, assuming: • • loop: next: head in $1; starting address of C in $2; target in $3; n in $4 Register allocation: count in $5, index in $6 li move bgt sll addu lw bne addui addui b $5,0 $6,$1 $6,$4,exit $7,$6,2 $7,$7,$2 $8,0($7) $8,$3,next $5,$5,1 $6,$6,1 loop ; ; ; ; ; ; ; ; ; ; set count =0 (actually addui $5,$0,0) initial index (actually addu $6,$1,$0) if index>n exit (actually sgt, bne) multiply index by 4 address of C [index] $8 = C[index] test if equal increment count increment index unconditional branch to loop exit: • Chapter 2 Simplest, but not best code! 39 Instruction Support for Functions ... sum(a,b);... /* a, b: $s0,$s1 */ } int sum(int x, int y) { return x+y; } address 1000 1004 1008 1012 1016 add add addi j ... $a0,$s0,$zero # x = a $a1,$s1,$zero # y = b $ra,$zero,1016 #$ra=1016 sum #jump to sum 2000 sum: add $v0,$a0,$a1 2004 jr Chapter 2 $ra # new instruction 40 Instruction Support for Functions •Single instruction to jump and save return address: jump and link (jal) •Before: 1008 addi $ra,$zero,1016 #$ra=1016 1012 j sum #go to sum •After: 1012 jal sum # $ra=1016,go to sum •Why have a jal? Make the common case fast: functions are very common. • Syntax for jr (jump register): jr register •Instead of providing a label to jump to, the jr instruction provides a register which contains an address to jump to. •Very useful for function calls: jal stores return address in register ($ra) jr jumps back to that address Chapter 2 41 MIPS: Software Register Convention Caller save: caller saves at call if needed after call Callee save: called procedure saves if register needed 0 $0 zero constant 0 16 1 $at reserved for assembler ... 2 $v0 expression evaluation & 23 $s7 3 $v1 function results 24 $t8 4 $a0 25 $t9 5 $a1 26 $k0 6 $a2 27 $k1 7 $a3 28 $gp Pointer to global area 8 $t0 29 $sp Stack pointer 30 $fp frame pointer 31 $ra Return Address (HW) ... 15 $t7 Chapter 2 arguments temporary: caller saves $s0 callee saves temporary (cont’d) reserved for OS kernel 42 MIPS Calling Convention Caller (the calling function) • • • Callee: (the function being called) start-up • • • Load arguments: first four in $a0–$a3, rest on stack Save caller-saved registers: $a0–$a3, $t0-$t9 if used Execute jal instruction Allocate memory in frame: $sp = $sp – frame size Save callee-saved registers $s0–$s7,$fp,$ra if used Create frame: $fp = $sp + frame size Return: • • • • Chapter 2 Place return value in $v0 Restore any callee-saved registers Pop stack: $sp = $sp+frame size Return by jr $ra Only need to do what is needed! 43 Register Conventions - saved When callee returns from executing, the caller needs to know which registers may have changed and which are guaranteed to be unchanged. Register Conventions: A set of generally accepted rules as to which registers will be unchanged after a procedure call (jal) and which may be changed. $0: No Change. Always 0. $s0-$s7: Restore if you change. Very important, that’s why they’re called saved registers. If the callee changes these in any way, it must restore the original values before returning. $sp: Restore if you change. The stack pointer must point to the same place before and after the jal call, or else the caller won’t be able to restore values from the stack. HINT -- All saved registers start with S Chapter 2 44 Register Conventions - volatile $ra: Can Change. The jal call itself will change this register. Caller needs to save on stack if nested call. $v0-$v1: Can Change. These will contain the new returned values. $a0-$a3: Can change. These are volatile argument registers. Caller needs to save if they’ll need them after the call. $t0-$t9: Can change. That’s why they’re called temporary: any procedure may change them at any time. Caller needs to save if they’ll need them afterwards. Chapter 2 45 Instruction Encoding-I 3-formats, all 32-bits in length fixed 6-bit opcode begins each instruction ALU Format (also R format): one opcode • register-register-register ALU instructions Bits 6 OP=0 5 rs first source register 5 5 rt rd second source register 5 sa 6 funct result shift function register amount code Function code: Detailed opcode: Add, Sub, or, and, ... Chapter 2 46 Instruction Encoding-II Immediate instruction format (I format): Loads/stores (including floating point) Immediate instructions (e.g. addi, lui, etc.) different opcode for each instruction • • • Bits 6 5 OP rs 5 16 rt second first source source or base or target register register Chapter 2 immediate immediate field 47 Instruction Encoding-III Jump format (J format): • • Bits used for j, jal 26-bit offset field 6 OP 26 jump target jump target address Chapter 2 48 Addressing Modes 1. Immediate addressing op rs rt Immediate 2. Register addressing op rs rt rd ... funct Registers Register 3. Base addressing op rs rt Memory Address + Register Byte Halfword Word 4. PC-relative addressing op rs rt Memory Address PC + Word 5. Pseudodirect addressing op Address PC Chapter 2 Memory Word 49 Translation Hierarchy C program Compiler Assembly language program Assembler Object: Machine language module Object: Library routine (machine language) Linker Executable: Machine language program Loader Memory Chapter 2 50 Assembling Programs Assembly: • resolve labels on instructions and data: – relative to PC for instructions – relative to some register for data – either two-pass or use backpatch • • • • expand any macros and pseudoinstructions handle any assembler directives: data layout translate instructions to binary create object file: – – – – – – Chapter 2 headers code segment (called text in Unix) Data segment Relocation information: instruction/data words to relocate Symbol table: unresolved references + visible symbols debugging information 51 Linking and Loading Linker–combine multiple object modules, resolving cross references: • • • • Search and link in any library modules Determine address for any data or code in module and fix-up the address appropriately Resolve cross references (both code and data) If all modules present yields an executable. Loader • • • • Chapter 2 Reads executable Loads code and data segments Initializes registers, stack, and arguments Jumps to program’s start-up routine to initiate execution 52 Alternative ISA Approaches Internal storage: registers, stacks, none • • • Typical operations: • • Heavily used ones are little changed since 1970 Fancy instructions in some machines, but under used Operands and addressing: where can operands be • • • • Registers: choice since 1984, all machines in use today Stacks: in 1960s-70s Only memory: not used successfully in 25 years Register-register: all since 1980 Register-memory: 360, 80x86, 680x0 Memory-memory: VAX Addressing: many different address modes Instruction formats: fixed versus variable Chapter 2 53 Operations Supported Most machines support a base set of operations like those in the MIPS ISA. Recently many architectures have added limited support (both operations and data types) for graphics and multimedia. Examples of operations included in more elaborate instruction sets: • • • • • • Chapter 2 support for arithmetic and logical instructions on all data types (bytes, half words) support for larger integer data types string instructions: copies, compares, translation subroutine call instructions support for data structures: queues, stacks support for bit strings as a data type 54 Methods of Testing Condition Condition Codes • • Processor status bits set as a side-effect of arithmetic instructions (possibly moves) or explicitly by compare or test instructions. example: add r1, r2, r3 bz label Condition Register: evaluate into register, test: • example: cmp r1, r2, r3 bgt r1, label Compare and Branch • Chapter 2 example: bgt r1, r2, label 55 Accessing and Addressing Operands All recent machines are general-purpose register architectures (mainly load/store architecture) Substantial differences in both expressiveness and complexity based on how operands are accessed. Example– VAX: • • • Any operand can reside in a register or in memory Any memory location can be addressed with any address mode Example: ADDW3–adds 2 16-bit operands, result in 3rd – Each operand can be a register, immediate, or in memory 27 combinations! – Each memory operand has a choice of 20+ addressing modes more than 20,000 different forms of the add instruction – Instruction size varies accordingly Chapter 2 56 Addressing Modes Addressing mode Example Meaning Register Add R4,R3 R4 R4+R3 Immediate Add R4,#3 R4 R4+3 Displacement Add R4,100(R1) R4 R4+Mem[100+R1] Register indirect Add R4,(R1) R4 R4+Mem[R1] Indexed / Base Add R3,(R1+R2) R3 R3+Mem[R1+R2] Direct or absolute Add R1,(1001) R1 R1+Mem[1001] Memory indirect Add R1,@(R3) R1 R1+Mem[Mem[R3]] Auto-increment Add R1,(R2)+ R4 R1+Mem[R2]; R2 R2 ¬+d Auto-decrement Add R1,–(R2) R2 R2–d; ¬ R1 R1+Mem[R2] Scaled Add R1,100(R2)[R3] R1 R1+Mem[100+R2+R3*d] Chapter 2 57 Examples of Instruction Formats Variable: … Fixed: Hybrid: • If code size is most important, use variable length. – may be dictated by instruction set. •If performance is most important, use fixed length. •Hybrid in use on 80x86 shares +/–. Chapter 2 58 Rationale for ISA Choices Metrics: • • Design cost impact: HW and SW Performance and other execution time metrics – Instruction and data bytes accessed • Static metrics (code size) Influences on ISA Effectiveness • • Program usage: importance of various alternatives Organizational techniques: – Pipelining, memory hierarchies • • • Compiler technology OS needs Basic implementation technology: – Memory vs logic; high-speed vs. operations in parallel Chapter 2 59 Compilers and ISA Ease of compilation • • • • Efficiency of code: • • Orthogonality: few special registers or special cases, all operand modes available with any data type or instruction type Completeness: support for wide range of applications Regularity: no overloading of meaning for instruction fields Streamlined: resource needs easily determined Minimize hidden work–do what’s needed Primitives rather than solutions Register Assignment is critical in • Chapter 2 Easier if lots of registers 60 Operand Size Usage Support these data sizes and types: • • 8-bit, 16-bit, 32-bit integers 32-bit and 64-bit floating point numbers Doubleword 0% 69% 74% Word Halfword Byte Int Avg. 31% 19% FP Avg. 0% 7% 0% 0% 20% 40% 60% 80% Frequency of reference by size Chapter 2 61 Addressing Mode Usage 40% Displacement Register deferred Memory indirect 0% • • • • • 16% 6% 10% 20% 30% 40% 50% 60% Three programs measured on machine with all address modes (VAX), not including registers: • gcc spice TEX 24% 6% 1% 1% 43% 11% 3% 0% 39% 17% Immediate Scaled 55% 32% Displacement: Immediate: Register deferred (indirect): Scaled: Memory indirect: Misc: 42% avg, 32% to 55% 33% avg, 17% to 43% 13% avg, 3% to 24% 7% avg, 0% to 16% 3% avg, 1% to 6% 2% avg, 0% to 3% 75% displacement & immediate 88% displacement, immediate & register indirect Chapter 2 62 Immediate Size How big are immediates? • • • 50% to 60% fit within 8 bits 75% to 80% fit within 16 bits Assuming sign extension! 60 50 40 30 20 10 0 0 4 8 12 16 20 24 28 32 Number of bits needed for immediate value Chapter 2 63 Displacement Address Size Average of 5 SPECfp and 5 SPECint programs 1% of addresses need > 16 bits 12-16 bits sufficient 30 Frequency (%) 25 20 Integer Floating Point 15 10 5 0 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Value Chapter 2 64 Top 10 80x86 Instructions Simple instructions dominate instruction frequency Rank Chapter 2 Instruction Average Percent total executed 1 load 22% 2 conditional branch 20% 3 compare 16% 4 store 12% 5 add 8% 6 and 6% 7 sub 5% 8 move register-register 4% 9 call 1% 10 return 1% Total 96% 65 Conditional Branch Distance Distance from branch in log( instructions) 35% of integer branches are –4..+3 instructions • 40% 35% 30% 25% Integer Floating Point 20% 15% 10% 5% 0% 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Number of bits needed to specify distance between target and branch Chapter 2 66 Conditional Branches PC-relative since most branches are relatively close to the current PC address At least 8 bits suggested (± 128 instructions) Compare Equal/Not Equal most important for integer programs (86%) 7% LT/GE Branch 40% Int Avg. FP Avg. 7% GT/LE 23% comparison 86% EQ/NE 37% 0% 20% 40% 60% 80% 100% Frequency of comparison types in branches Chapter 2 67 Summary: ISA Desires vs. MIPS Use general purpose registers with a load-store architecture: Yes Provide at least 16 general purpose registers plus separate floating-point registers: 31 GPR & 32 FPR Support these addressing modes: displacement (with address offset of 12 – 16 bits), immediate (size 8–16 bits), and register deferred: Yes: 16-bits for immediate, displacement; (disp=0 => register deferred) All addressing modes apply to all data transfer instructions : Yes Use fixed instruction encoding if interested in performance and use variable instruction encoding if interested in code size : Fixed Support these data sizes and types: 8-bit, 16-bit, 32-bit integers and 32-bit and 64-bit floating point numbers: Yes Support these simple instructions, since they will dominate the number of instructions executed: load, store, add, subtract, move register-register, and, shift, compare equal, compare not equal, branch (with a PC-relative address at least 8-bits long), jump, call, and return: Yes, 16b branch offsets, simple branch compares Aim for a minimalist instruction set: Yes Chapter 2 68 Stack example Int leaf_example (int g, int h, int I, int j) { int f; f = (g +h) – (I + j); return f; } Answer • • • Chapter 2 We have 4 arguments and one return value 3 registers will be used $s0, $t0, and $t1 Create a space for three registers in the stack 69 Stack example cont. Sub $sp, $sp, 12 # make room Sw $t1, 8($sp) Sw $t0,4($sp) Sw $s0, 0($sp) Add $t0, $a0, $a1 Add $t1, $a2, $a3 Sub $s0, $t0, $t1 add $v0, $s0, $zero lw $s0, 0($sp) lw $t0, 4($sp) lw $t1, 8($sp) add $sp, $sp, 12 jr $ra # restore old values High address $sp $sp Contents of register $t1 Contents of register $t0 $sp Low address Chapter 2 a. Contents of register $s0 b. c. 70