EI 209 Computer Organization, Fall 2015
Chapter 2: Instructions: Language of the Computer
Haojin Zhu (http://tdt.sjtu.edu.cn/~hjzhu/ )
Course Website: http://nsec.sjtu.edu.cn/teaching/ei209.html
[Adapted from Computer Organization and Design, 4th Edition, Patterson & Hennessy, © 2012, MK]
EI209 Chapter 2.1 Haojin Zhu, SJTU, 2016

Assignment I
- Chapter I: 1.16
- Coding Assignment I: Design a program to check whether your computer is big endian or little endian. Please submit the source code and the execution results.
- Submission due: Oct. 8

The Language a Computer Understands
- Word a computer understands: instruction
- Vocabulary of all words a computer understands: instruction set (aka instruction set architecture or ISA)
- Different computers may have different vocabularies (i.e., different ISAs)
  - iPhone (ARM) is not the same as MacBook (x86)
- Or the same vocabulary (i.e., same ISA)
  - iPhone and iPad computers have the same instruction set (ARM)

The Language a Computer Understands (cont.)
- Why not all the same? Why not all different? What might be the pros and cons?
- Single ISA (to rule them all):
  - Leverage common compilers, operating systems, etc.
  - BUT it is fairly easy to retarget these for different ISAs (e.g., Linux, gcc)
- Multiple ISAs:
  - Specialized instructions for specialized applications
  - Different tradeoffs in resources used (e.g., functionality, memory demands, complexity, power consumption, etc.)
  - Competition and innovation are good, especially in emerging environments (e.g., mobile devices)

MIPS: Instruction Set for EI209
- MIPS is a real-world ISA (see www.mips.com)
  - Standard instruction set for networking equipment
  - Was also used in the original Nintendo 64!
- Elegant example of a Reduced Instruction Set Computer (RISC) instruction set
- Invented by John Hennessy @ Stanford
  - Why not Berkeley/Sun RISC, invented by Dave Patterson? Ask him!

RISC Design Principles
- Basic RISC principle: "A simpler CPU (the hardware that interprets machine language) is a faster CPU" (CPU core)
- The focus of the RISC design is reducing the number and complexity of instructions in the ISA
- A number of the more common strategies include:
  - Fixed instruction length, generally a single word; simplifies the process of fetching instructions from memory
  - Simplified addressing modes; simplifies the process of fetching operands from memory
  - Fewer and simpler instructions in the instruction set; simplifies the process of executing instructions
  - Only load and store instructions access memory; e.g., no add memory to register, add memory to memory, etc.
  - Let the compiler do it: use a good compiler to break complex high-level language statements into a number of simple assembly language statements

Mainstream ISAs
- ARM (Advanced RISC Machine) is the most popular RISC
  - In every smartphone-like device (e.g., iPhone, iPad, iPod, ...)
- Intel 80x86 is another popular ISA and is used in MacBooks and PCs (Core i3, Core i5, Core i7, ...)
  - x86 is a Complex Instruction Set Computer (CISC)
- Roughly 20x as many ARM as 80x86 processors sold (i.e., 5 billion vs. 0.3 billion)

MIPS Green Card
[Figure: MIPS Green Card, shown over two slides]

Two Key Principles of Machine Design
1. Instructions are represented as numbers and, as such, are indistinguishable from data
2.
   Programs are stored in alterable memory (that can be read or written to), just like data
- Stored-program concept
  - Programs can be shipped as files of binary numbers (binary compatibility)
  - Computers can inherit ready-made software provided they are compatible with an existing ISA, which leads industry to align around a small number of ISAs
[Figure: one memory holding, side by side, the accounting program (machine code), the C compiler (machine code), payroll data, and the C source code of the accounting program]

MIPS-32 ISA
- Instruction categories
  - Computational
  - Load/Store
  - Jump and Branch
  - Floating Point (coprocessor)
  - Memory Management
  - Special
- Registers: R0 - R31, PC, HI, LO
- 3 instruction formats, all 32 bits wide
    R format:  op | rs | rt | rd | sa | funct
    I format:  op | rs | rt | immediate
    J format:  op | jump target

MIPS (RISC) Design Principles
- Simplicity favors regularity
  - fixed size instructions
  - small number of instruction formats
  - opcode always the first 6 bits
- Smaller is faster
  - limited instruction set
  - limited number of registers in register file
  - limited number of addressing modes
- Make the common case fast
  - arithmetic operands from the register file (load-store machine)
  - allow instructions to contain immediate operands
- Good design demands good compromises
  - three instruction formats

MIPS Arithmetic Instructions
- MIPS assembly language arithmetic statements
    add $t0, $s1, $s2
    sub $t0, $s1, $s2
- Each arithmetic instruction performs one operation
- Each specifies exactly three operands that are all contained in the datapath's register file ($t0, $s1, $s2)
    destination <- source1 op source2
- Instruction format (R format), shown for sub $t0, $s1, $s2 (add uses funct 0x20 instead):
    op = 0 | rs = 17 | rt = 18 | rd = 8 | shamt = 0 | funct = 0x22

MIPS Instruction Fields
- MIPS fields are given names to make them easier to refer to
    op | rs | rt | rd | shamt | funct
- op (6 bits): opcode that specifies the operation
- rs (5 bits): register file address of the first source operand
- rt (5 bits): register file address of the second source operand
- rd (5 bits):
  register file address of the result's destination
- shamt (5 bits): shift amount (for shift instructions)
- funct (6 bits): function code augmenting the opcode

Computer Hardware Operands
- High-level programming languages could have millions of variables
- Instruction sets have a fixed, small number of operands
  - Called registers
  - "Bricks" of computer hardware
  - Fastest way to store data in computer hardware
  - Visible to the ("assembly language") programmer
- The MIPS instruction set has 32 integer registers

Why Just 32 Registers?
- RISC design principle: smaller is faster
  - But you can be too small ...
- Hardware would likely be slower with 64, 128, or 256 registers
- 32 is enough for a compiler to translate typical C programs without running out of registers very often
  - The ARM instruction set has only 16 registers
  - May be faster, but the compiler may run out of registers too often (aka "spilling registers to memory")

Size of Registers
- The bit is the atom of computer hardware: it contains either 0 or 1
  - The true "alphabet" of computer hardware is 0, 1
  - We will eventually express MIPS instructions as combinations of 0s and 1s (in machine language)
- MIPS registers are 32 bits wide
  - MIPS calls this quantity a word
  - Some computers use 16-bit or 64-bit wide words, e.g., Intel 8086 (16-bit), MIPS64 (64-bit)

Some Examples of 64-bit Processors
- What is the first 64-bit Apple phone?
  iPhone 5s
- What is the first 64-bit Android phone?
  HTC Desire 820

MIPS Register File
- Holds thirty-two 32-bit registers
  - Two read ports and one write port
- Registers are
  - Faster than main memory
    - But register files with more locations are slower (e.g., a 64 word file could be as much as 50% slower than a 32 word file)
    - Read/write port increase impacts speed quadratically
  - Easier for a compiler to use
    - e.g., (A*B) - (C*D) - (E*F) can do multiplies in any order vs. stack
  - Can hold variables so that
    - code density improves (since registers are named with fewer bits than a memory location)
[Figure: register file with 32 locations of 32 bits each; inputs src1 addr (5 bits), src2 addr (5 bits), dst addr (5 bits), write data (32 bits), and write control; outputs src1 data (32 bits) and src2 data (32 bits)]

Aside: MIPS Register Convention
  Name         Register number   Usage                    Preserve on call?
  $zero        0                 constant 0 (hardware)    n.a.
  $at          1                 reserved for assembler   n.a.
  $v0 - $v1    2-3               returned values          no
  $a0 - $a3    4-7               arguments                no
  $t0 - $t7    8-15              temporaries              no
  $s0 - $s7    16-23             saved values             yes
  $t8 - $t9    24-25             temporaries              no
  $gp          28                global pointer           yes
  $sp          29                stack pointer            yes
  $fp          30                frame pointer            yes
  $ra          31                return addr (hardware)   yes

MIPS Memory Access Instructions
- MIPS has two basic data transfer instructions for accessing memory
    lw $t0, 4($s3)    # load word from memory
    sw $t0, 8($s3)    # store word to memory
- The data is loaded into (lw) or stored from (sw) a register in the register file, named by a 5 bit address
- The memory address, a 32 bit address, is formed by adding the contents of the base address register to the offset value
  - The offset is a 16-bit field, so access is limited to memory locations within a region of ±2^13 or 8,192 words (±2^15 or 32,768 bytes) of the address in the base register

Machine Language - Load Instruction
- Load/Store instruction format (I format):
    lw $t0, 24($s3)
    op = 35 | rs = 19 | rt = 8 | offset = 24 (decimal)
- Address calculation: 24 (decimal) + $s3 = . . . 0001 1000 + . . . 1001 0100 = . . .
  1010 1100 = 0x120040ac
[Figure: memory drawn as a column of 32-bit data words labeled by word address (hex): 0x00000000, 0x00000004, 0x00000008, 0x0000000c, ... up to 0xffffffff; $s3 holds 0x12004094 and the word at address 0x120040ac is loaded into $t0]

Byte Addresses
- Since 8-bit bytes are so useful, most architectures address individual bytes in memory
  - Alignment restriction: the memory address of a word must be on natural word boundaries (a multiple of 4 in MIPS-32)
- Big Endian: leftmost byte is the word address
  - IBM 360/370, Motorola 68k, MIPS, Sparc, HP PA
- Little Endian: rightmost byte is the word address
  - Intel 80x86, DEC Vax, DEC Alpha (Windows NT)
[Figure: byte numbering within a word, msb on the left and lsb on the right. Little endian: bytes 3 2 1 0. Big endian: bytes 0 1 2 3.]

Aside: Loading and Storing Bytes
- MIPS provides special instructions to move bytes
    lb $t0, 1($s3)    # load byte from memory
    sb $t0, 6($s3)    # store byte to memory
    Machine format for sb: 0x28 | 19 | 8 | 16 bit offset
- What 8 bits get loaded and stored?
  - load byte places the byte from memory in the rightmost 8 bits of the destination register
    - what happens to the other bits in the register?
  - store byte takes the byte from the rightmost 8 bits of a register and writes it to a byte in memory
    - what happens to the other bits in the memory word?

Example of Loading and Storing Bytes
- Given the following code sequence and memory state, what is the state of the memory after executing the code?
    add $s3, $zero, $zero
    lb  $t0, 1($s3)
    sb  $t0, 6($s3)
- Memory:
    Word address (decimal)   Data
    24                       0x00000000
    20                       0x00000000
    16                       0x00000000
    12                       0x10000010
     8                       0x01000402
     4                       0xFFFFFFFF
     0                       0x009012A0
- What value is left in $t0?
- What word is changed in memory and to what?
- What if the machine was little endian?
Answers:
- Big endian:
  - $t0 = 0xFFFFFF90 (the byte loaded is 0x90; lb sign-extends it into the upper 24 bits, whereas lbu would leave $t0 = 0x00000090)
  - mem(4) = 0xFFFF90FF
- Little endian:
  - $t0 = 0x00000012
  - mem(4) = 0xFF12FFFF

Speed of Registers vs. Memory
- Given that
  - Registers: 32 words (128 bytes)
  - Memory: billions of bytes (2 GB to 8 GB on a laptop)
- and the RISC principle is "Smaller is faster"
- How much faster are registers than memory?
  - About 100-500 times faster, in terms of the latency of one access

MIPS Immediate Instructions
- Small constants are used often in typical code
- Possible approaches?
  - put "typical constants" in memory and load them
  - create hard-wired registers (like $zero) for constants like 1
  - have special instructions that contain constants!
      addi $sp, $sp, 4     # $sp = $sp + 4
      slti $t0, $s2, 15    # $t0 = 1 if $s2 < 15
- Machine format (I format), shown for slti:
    op = 0x0A | rs = 18 | rt = 8 | immediate = 0x0F
- The constant is kept inside the instruction itself!
  - The immediate format limits values to the range +2^15 - 1 to -2^15
  - What about the upper 16 bits?

Aside: How About Larger Constants?
- We'd also like to be able to load a 32 bit constant into a register; for this we must use two instructions
- First, a new "load upper immediate" instruction
    lui $t0, 1010101010101010          # 16-bit immediate, written in binary
    Machine format: op = 15 (0x0F) | rs = 0 | rt = 8 | immediate = 1010101010101010 (binary)
- Then we must get the lower order bits right, using
    ori $t0, $t0, 1010101010101010
    $t0 after lui:   1010101010101010 0000000000000000
    ori immediate:   0000000000000000 1010101010101010
    $t0 after ori:   1010101010101010 1010101010101010
- Why can't addi be used as the second instruction for this 32 bit constant?
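
The two-instruction sequence for a 32 bit constant can be tried out with a small model. The sketch below is written in Python rather than MIPS assembly so it can be run anywhere. The helper names (`sign_extend16`, `lui`, `ori`, `addi`) are made up for this illustration and model only the register result of each instruction, assuming that ori zero-extends its 16-bit immediate while addi sign-extends it; the overflow trap of addi is not modeled.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def sign_extend16(imm):
    """Interpret a 16-bit field as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def lui(imm):
    """Model of lui: immediate goes to the upper 16 bits, lower 16 bits are 0."""
    return ((imm & 0xFFFF) << 16) & MASK32


def ori(rs, imm):
    """Model of ori: bitwise OR with the zero-extended 16-bit immediate."""
    return (rs | (imm & 0xFFFF)) & MASK32


def addi(rs, imm):
    """Model of addi: add the sign-extended 16-bit immediate (no overflow trap)."""
    return (rs + sign_extend16(imm)) & MASK32


IMM = 0b1010101010101010  # the 16-bit pattern used on the slide

upper = lui(IMM)              # $t0 after lui
with_ori = ori(upper, IMM)    # $t0 after lui + ori
with_addi = addi(upper, IMM)  # $t0 if addi were used instead of ori

print(f"lui        -> 0x{upper:08X}")
print(f"lui + ori  -> 0x{with_ori:08X}")
print(f"lui + addi -> 0x{with_addi:08X}")
```

Under these assumptions the script prints 0xAAAA0000, 0xAAAAAAAA, and 0xAAA9AAAA. Because bit 15 of the immediate is 1, the sign-extended value added by addi is negative, so it borrows from the upper half that lui just set, while ori leaves the upper 16 bits untouched.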