What does this code do?

    label: sub $a0, $a0, 1
           bne $a0, $zero, label

We'll finish up talking about memory.
We'll go into more detail about the ISA.
- Pseudo-instructions
- Using branches for conditionals

May 14, 2017

Pseudo-instructions

MIPS assemblers support pseudo-instructions that give the illusion of a more expressive instruction set, but are actually translated into one or more simpler, "real" instructions.

In addition to the la (load address) we saw in the last lecture, you can use the li and move pseudo-instructions:

    li   $a0, 2000        # Load immediate 2000 into $a0
    move $a1, $t0         # Copy $t0 into $a1

They are probably clearer than their corresponding MIPS instructions:

    addi $a0, $0, 2000    # Initialize $a0 to 2000
    add  $a1, $t0, $0     # Copy $t0 into $a1

We'll see lots more pseudo-instructions this semester.
- A complete list of instructions is given in Appendix A of the text.
- Unless otherwise stated, you can always use pseudo-instructions in your assignments and on exams.

Control flow in high-level languages

The instructions in a program usually execute one after another, but it's often necessary to alter the normal control flow.

Conditional statements execute only if some test expression is true.

    // Find the absolute value of *a0
    v0 = *a0;
    if (v0 < 0)
        v0 = -v0;         // This might not be executed
    v1 = v0 + v0;

Loops cause some statements to be executed many times.

    // Sum the elements of a five-element array a0
    v0 = 0;
    t0 = 0;
    while (t0 < 5) {
        v0 = v0 + a0[t0]; // These statements will
        t0++;             // be executed five times
    }

Control-flow graphs

It can be useful to draw control-flow graphs when writing loops and conditionals in assembly:

    // Find the absolute value of *a0
    v0 = *a0;
    if (v0 < 0)
        v0 = -v0;
    v1 = v0 + v0;

    // Sum the elements of a0
    v0 = 0;
    t0 = 0;
    while (t0 < 5) {
        v0 = v0 + a0[t0];
        t0++;
    }

MIPS control instructions

In section, we introduced some of MIPS's control-flow instructions:

    j                 # for unconditional jumps
    bne and beq       # for conditional branches
    slt and slti      # set if less than (w/ and w/o an immediate)

And how to implement loops.
- You went to section, right?

Today, we'll talk about
- MIPS's pseudo-branches
- if/else
- case/switch (bonus material)

Pseudo-branches

The MIPS processor only supports two branch instructions, beq and bne, but to simplify your life the assembler provides the following other branches:

    blt $t0, $t1, L1      # Branch if $t0 <  $t1
    ble $t0, $t1, L2      # Branch if $t0 <= $t1
    bgt $t0, $t1, L3      # Branch if $t0 >  $t1
    bge $t0, $t1, L4      # Branch if $t0 >= $t1

There are also immediate versions of these branches, where the second source is a constant instead of a register.

Later this semester we'll see how supporting just beq and bne simplifies the processor design.

Implementing pseudo-branches

Most pseudo-branches are implemented using slt. For example, a branch-if-less-than instruction blt $a0, $a1, Label is translated into the following.

    slt $at, $a0, $a1     # $at = 1 if $a0 < $a1
    bne $at, $0, Label    # Branch if $at != 0

The same technique supports immediate branches, which are also pseudo-instructions. For example, blti $a0, 5, Label is translated into two instructions.

    slti $at, $a0, 5      # $at = 1 if $a0 < 5
    bne  $at, $0, Label   # Branch if $a0 < 5

All of the pseudo-branches need a register to save the result of slt, even though it's not needed afterwards.
- MIPS assemblers use register $1, or $at, for temporary storage.
- You should be careful in using $at in your own programs, as it may be overwritten by assembler-generated code.

Translating an if-then statement

We can use branch instructions to translate if-then statements into MIPS assembly code.

    v0 = *a0;
    if (v0 < 0)
        v0 = -v0;
    v1 = v0 + v0;

          lw   $v0, 0($a0)
          bgei $v0, 0, skip
          sub  $v0, $0, $v0
    skip: add  $v1, $v0, $v0

Sometimes it's easier to invert the original condition.
- In this case, we changed "continue if v0 < 0" to "skip if v0 >= 0".
- This saves a few instructions in the resulting assembly code.

Control-flow Example

Let's write a program to count how many bits are set in a 32-bit word.

    int in = 0xabcdabcd;
    int count = 0;
    for (int i = 0; i < 32; i++) {
        if (in & 1)
            count++;
        in = in >> 1;
    }

Translating an if-then-else statement

If there is an else clause, it is the target of the conditional branch.
- And the then clause needs a jump over the else clause.

    // increase the magnitude of v0 by one
    if (v0 < 0)            bge  $v0, $0, E
        v0 --;             sub  $v0, $v0, 1
                           j    L
    else
        v0 ++;         E:  add  $v0, $v0, 1
    v1 = v0;           L:  move $v1, $v0

Dealing with else-if code is similar, but the target of the first branch will be another if statement.
- Drawing the control-flow graph can help you out.

Case/Switch Statement

Many high-level languages support multi-way branches, e.g.

    switch (two_bits) {
        case 0: break;
        case 1: /* fall through */
        case 2: count ++; break;
        case 3: count += 2; break;
    }

We could just translate the code to ifs, thens, and elses:

    if ((two_bits == 1) || (two_bits == 2)) {
        count ++;
    } else if (two_bits == 3) {
        count += 2;
    }

This isn't very efficient if there are many, many cases.

Alternatively, we can:
    1. Create an array of jump targets
    2. Load the entry indexed by the variable two_bits
    3. Jump to that address using the jump register, or jr, instruction

This is much easier to show than to tell.
- (see the example with the lecture notes online)

What does this C code do?

    int foo(char *s) {
        int L = 0;
        while (*s++) {
            ++L;
        }
        return L;
    }

Machine Language and Pointers

Today we'll discuss machine language, the binary representation for instructions.
- We'll see how it is designed for the common case
    • Fixed-sized (32-bit) instructions
    • Only 3 instruction formats
    • Limited-sized immediate fields

Array Indexing vs. Pointers
- Pointer arithmetic, in particular

Assembly vs. machine language

So far we've been using assembly language.
- We assign names to operations (e.g., add) and operands (e.g., $t0).
- Branches and jumps use labels instead of actual addresses.
- Assemblers support many pseudo-instructions.

Programs must eventually be translated into machine language, a binary format that can be stored in memory and decoded by the CPU.

MIPS machine language is designed to be easy to decode.
- Each MIPS instruction is the same length, 32 bits.
- There are only three different instruction formats, which are very similar to each other.
- E.g., the instruction word 10AC0003 (hex), a beq on registers $5 and $12 with offset 3, is laid out as:

    000100   00101   01100   0000 0000 0000 0011
    op       rs      rt      address

Studying MIPS machine language will also reveal some restrictions in the instruction set architecture, and how they can be overcome.

R-type format

Register-to-register arithmetic instructions use the R-type format.

    op       rs       rt       rd       shamt    func
    6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

This format includes six different fields.
- op is an operation code or opcode that selects a specific operation.
- rs and rt are the first and second source registers.
- rd is the destination register.
- shamt is only used for shift instructions.
- func is used together with op to select an arithmetic instruction.
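As a rough illustration of how the six R-type fields pack into one 32-bit word, here is a small C sketch. The helper name encode_rtype is hypothetical and not part of the lecture; the field widths come from the layout above, and the example values mentioned afterwards (op 0 and func 0x20 for add, register numbers 8, 9 and 10 for $t0, $t1 and $t2) are the standard MIPS encodings rather than something given on the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: packs the six R-type fields into a 32-bit word.
   Field widths follow the R-type layout: 6, 5, 5, 5, 5 and 6 bits. */
uint32_t encode_rtype(uint32_t op, uint32_t rs, uint32_t rt,
                      uint32_t rd, uint32_t shamt, uint32_t func)
{
    /* Each field must fit in its allotted width. */
    assert(op < 64 && func < 64);
    assert(rs < 32 && rt < 32 && rd < 32 && shamt < 32);

    /* op occupies bits 31-26, rs 25-21, rt 20-16,
       rd 15-11, shamt 10-6 and func 5-0. */
    return (op << 26) | (rs << 21) | (rt << 16) |
           (rd << 11) | (shamt << 6) | func;
}
```

Under those assumptions, encode_rtype(0, 9, 10, 8, 0, 0x20) corresponds to add $t0, $t1, $t2 and yields the word 0x012A4020.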

The inside back cover of the textbook lists opcodes and function codes for all of the MIPS instructions.

About the registers

We have to encode register names as 5-bit numbers from 00000 to 11111.
- For example, $t8 is register $24, which is represented as 11000.
- The complete mapping is given on page A-23 in the book.

The number of registers available affects the instruction length.
- Each R-type instruction references 3 registers, which requires a total of 15 bits in the instruction word.
- We can't add more registers without either making instructions longer than 32 bits, or shortening other fields like op and possibly reducing the number of available operations.

I-type format

Load, store, branch and immediate instructions all use the I-type format.

    op       rs       rt       address
    6 bits   5 bits   5 bits   16 bits

For uniformity, op, rs and rt are in the same positions as in the R-format.

The meaning of the register fields depends on the exact instruction.
- rs is a source register: an address for loads and stores, or an operand for branch and immediate arithmetic instructions.
- rt is a source register for branches and stores, but a destination register for the other I-type instructions.

The address is a 16-bit signed two's-complement value.
- It can range from -32,768 to +32,767.
- But that's not always enough!

Larger constants

Larger constants can be loaded into a register 16 bits at a time.
- The load upper immediate instruction lui loads the highest 16 bits of a register with a constant, and clears the lowest 16 bits to 0s.
- An immediate logical OR, ori, then sets the lower 16 bits.

To load the 32-bit value 0000 0000 0011 1101 0000 1001 0000 0000:

    lui $s0, 0x003D         # $s0 = 003D 0000 (in hex)
    ori $s0, $s0, 0x0900    # $s0 = 003D 0900

This illustrates the principle of making the common case fast.
- Most of the time, 16-bit constants are enough.
- It's still possible to load 32-bit constants, but at the cost of two instructions and one temporary register.

Pseudo-instructions may contain large constants. Assemblers including SPIM will translate such instructions correctly.
- Yay, SPIM!!

Branches

For branch instructions, the constant field is not an address, but an offset from the current program counter (PC) to the target address.

        beq $at, $0, L
        add $v1, $v0, $0
        add $v1, $v1, $v1
        j   Somewhere
    L:  add $v1, $v0, $v0

Offsets are counted from the instruction after the branch (PC + 4). Since the branch target L is three instructions past that point, the address field would contain 3. The whole beq instruction would be stored as:

    000100   00001   00000   0000 0000 0000 0011
    op       rs      rt      address

SPIM's encoding of branch offsets is off by one, so the code it produces would contain an address of 4. (But it has a compensating error when it executes branches.)

Larger branch constants

Empirical studies of real programs show that most branches go to targets less than 32,767 instructions away. Branches are mostly used in loops and conditionals, and programmers are taught to make code bodies short.

If you do need to branch further, you can use a jump with a branch. For example, if "Far" is very far away, then the effect of:

    beq $s0, $s1, Far
    ...

can be simulated with the following actual code.

          bne $s0, $s1, Next
          j   Far
    Next: ...

Again, the MIPS designers have taken care of the common case first.

J-type format

Finally, the jump instruction uses the J-type instruction format.

    op       address
    6 bits   26 bits

The jump instruction contains a word address, not an offset.
- Remember that each MIPS instruction is one word long, and word addresses must be divisible by four.
- So instead of saying "jump to address 4000," it's enough to just say "jump to instruction 1000."
- A 26-bit address field lets you jump to any address from 0 to 2^28.
    • your MP solutions had better be smaller than 256MB

For even longer jumps, the jump register, or jr, instruction can be used.

    jr $ra                # Jump to 32-bit address in register $ra

Representing strings

A C-style string is represented by an array of bytes.
- Elements are one-byte ASCII codes for each character.
- A 0 value marks the end of the array.

    32 space   48 0   64 @   80 P    96 `   112 p
    33 !       49 1   65 A   81 Q    97 a   113 q
    34 "       50 2   66 B   82 R    98 b   114 r
    35 #       51 3   67 C   83 S    99 c   115 s
    36 $       52 4   68 D   84 T   100 d   116 t
    37 %       53 5   69 E   85 U   101 e   117 u
    38 &       54 6   70 F   86 V   102 f   118 v
    39 '       55 7   71 G   87 W   103 g   119 w
    40 (       56 8   72 H   88 X   104 h   120 x
    41 )       57 9   73 I   89 Y   105 i   121 y
    42 *       58 :   74 J   90 Z   106 j   122 z
    43 +       59 ;   75 K   91 [   107 k   123 {
    44 ,       60 <   76 L   92 \   108 l   124 |
    45 -       61 =   77 M   93 ]   109 m   125 }
    46 .       62 >   78 N   94 ^   110 n   126 ~
    47 /       63 ?   79 O   95 _   111 o   127 del

Null-terminated Strings

For example, "Harry Potter" can be stored as a 13-byte array.

     72   97  114  114  121   32   80  111  116  116  101  114    0
      H    a    r    r    y         P    o    t    t    e    r   \0

Since strings can vary in length, we put a 0, or null, at the end of the string.
- This is called a null-terminated string.

Computing string length
- We'll look at two ways.

Array Indexing Implementation of strlen

The string pointer is passed in $a0.

    int strlen(char *string) {
        int len = 0;
        while (string[len] != 0) {
            len ++;
        }
        return len;
    }

    strlen: li   $v0, 0
    L:      add  $t0, $a0, $v0
            lb   $t0, 0($t0)
            beq  $t0, $0, E
            addi $v0, $v0, 1
            j    L
    E:      jr   $ra

Pointers & Pointer Arithmetic

Many programmers have a vague understanding of pointers.
- Looking at assembly code is useful for their comprehension.

Array-indexing version:

    int strlen(char *string) {
        int len = 0;
        while (string[len] != 0) {
            len ++;
        }
        return len;
    }

Pointer version:

    int strlen(char *string) {
        int len = 0;
        while (*string != 0) {
            string ++;
            len ++;
        }
        return len;
    }

    strlen: li   $v0, 0
    L:      lb   $t0, 0($a0)
            beq  $t0, $0, E
            addi $a0, $a0, 1
            addi $v0, $v0, 1
            j    L
    E:      jr   $ra

What is a Pointer?

A pointer is an address.
Two pointers that point to the same thing hold the same address.

Dereferencing a pointer means loading from the pointer's address.

A pointer has a type; the type tells us what kind of load to do.
- Use load byte (lb) for char *
- Use load half (lh) for short *
- Use load word (lw) for int *
- Use load single precision floating point (l.s) for float *

Pointer arithmetic is often used with pointers to arrays.
- Incrementing a pointer (i.e., ++) makes it point to the next element.
- The amount added to the pointer depends on the type of pointer.
    • pointer = pointer + sizeof(pointee's type)
      1 for char *, 4 for int *, 4 for float *, 8 for double *

What is really going on here…

    int strlen(char *string) {
        int len = 0;
        while (*string != 0) {
            string ++;
            len ++;
        }
        return len;
    }

Summary

Machine language is the binary representation of instructions:
- The format in which the machine actually executes them

MIPS machine language is designed to simplify processor implementation.
- Fixed-length instructions
- 3 instruction encodings: R-type, I-type, and J-type
- Common operations fit in 1 instruction
    • Uncommon ones (e.g., long immediates) require more than one

Pointers are just addresses!!
- "Pointees" are locations in memory

Pointer arithmetic updates the address held by the pointer.
- "string ++" points to the next element in an array
- Pointers are typed, so the address is incremented by sizeof(pointee)
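
To make the last point concrete, here is a minimal C sketch of how far an address moves when a typed pointer is incremented, together with a pointer-walking string length in the style of the strlen shown earlier. The function names int_step, double_step and count_chars are hypothetical, chosen here only for illustration (count_chars avoids clashing with the library strlen).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes between an int pointer and its successor. */
size_t int_step(void)
{
    int values[2] = {0, 0};
    int *p = values;
    int *q = p;
    q++;                              /* address grows by sizeof(int) */
    return (size_t)((char *)q - (char *)p);
}

/* Hypothetical helper: the same measurement for a double pointer. */
size_t double_step(void)
{
    double values[2] = {0.0, 0.0};
    double *p = values;
    double *q = p;
    q++;                              /* address grows by sizeof(double) */
    return (size_t)((char *)q - (char *)p);
}

/* Pointer-walking string length; each string++ adds 1 byte
   because the pointee type is char. */
int count_chars(const char *string)
{
    assert(string != NULL);
    int len = 0;
    while (*string != 0) {
        string++;
        len++;
    }
    return len;
}
```

On a typical 32-bit MIPS target the two step functions would be expected to return 4 and 8, matching the sizes listed above; the exact values depend on the compiler's sizeof(int) and sizeof(double).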