The Assembly Process

Basically, how does it all work?

The Assembly Process

    Assembly code → Assembler → Machine code

• A computer understands machine code (binary).
• People (and compilers) write assembly language.

CMPE12c, Gabriel Hugh Elkaim

The Assembly Process

An assembler is a program that translates each instruction to its binary machine code equivalent.

• It is a relatively simple program.
• There is a one-to-one or near one-to-one correspondence between assembly language instructions and machine language instructions.
• Assemblers do some code manipulation:
    • MAL to TAL translation
    • Label resolution
• A "macro assembler" can process simple macros like puts, or preprocessor directives.

MAL → TAL

MAL is the set of instructions accepted by the assembler.
TAL is a subset of MAL: the instructions that can be directly turned into machine code.

• There are many MAL instructions that have no single TAL equivalent.
• To determine whether an instruction is a TAL instruction or not, look in appendix C or on the MAL/TAL sheet.
• The assembler takes (non-MIPS) MAL instructions and synthesizes them into 1 or more MIPS instructions.

MAL → TAL

For example,

    mul $8, $17, $20

becomes

    mult $17, $20
    mflo $8

• MIPS has 2 registers for results from integer multiplication and division: HI and LO.
• Each is a 32-bit register.
• mult and multu place the least significant 32 bits of the result into LO, and the most significant 32 bits into HI.
• Multiplying two 32-bit numbers gives a 64-bit result:
    (2^32 - 1)(2^32 - 1) = 2^64 - 2 x 2^32 + 1

MAL → TAL

mflo, mtlo, mfhi, mthi (for example, mflo is "Move From LO" and mthi is "Move To HI")

• Data is moved into or out of register HI or LO.
• One operand is needed to tell where the data is coming from or going to.
• For division (div or divu):
    • HI gets the remainder
    • LO gets the quotient
• Why aren't these just put in $0-$31 directly?
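The way a multiply or divide result is split across HI and LO can be modelled in a few lines of Python. This is only a sketch of the unsigned variants (multu and divu); the function names and the returned (hi, lo) tuples are illustrative assumptions, not part of any MIPS tool.

```python
MASK32 = 0xFFFFFFFF  # keeps a value within 32 bits


def multu(rs, rt):
    """Model of multu: return (hi, lo), the two halves of the 64-bit product."""
    product = (rs & MASK32) * (rt & MASK32)
    return (product >> 32) & MASK32, product & MASK32


def divu(rs, rt):
    """Model of divu: return (hi, lo) = (remainder, quotient)."""
    rs &= MASK32
    rt &= MASK32
    return rs % rt, rs // rt


# Largest unsigned operands: (2^32 - 1)^2 = 2^64 - 2 * 2^32 + 1
hi, lo = multu(MASK32, MASK32)
print(hex(hi), hex(lo))

# 17 / 5: the remainder lands in HI, the quotient in LO
print(divu(17, 5))
```

A MAL mul then corresponds to reading only the lo half (mflo), and a MAL rem to reading only the hi half after a divide (mfhi).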
MAL → TAL

TAL has only base displacement addressing.

So this:

    lw $8, label

becomes:

    la $7, label
    lw $8, 0($7)

which becomes:

    lui $8, 0x(MS part of label)
    ori $8, $8, 0x(LS part of label)
    lw  $8, 0($8)

MAL → TAL

Instructions with immediate values are synthesized with other instructions.

So:

    add $sp, $sp, 4

becomes:

    addi $sp, $sp, 4

For TAL:
• add requires 3 operands in registers.
• addi requires 2 operands in registers and one operand that is an immediate.
• In MIPS assembly, immediate instructions include:
    • addi, addiu, andi, lui, ori, xori
• Why not more?

MAL → TAL

TAL implementation of I/O instructions

This:

    putc $18            # if you get to use macros

becomes:

    addi $2, $0, 11     # code for putc
    add  $4, $18, $0    # put character argument in $4
    syscall             # ask operating system to do a function

MAL → TAL

    getc $11

becomes:

    addi $2, $0, 12
    syscall
    add  $11, $0, $2

    done

becomes:

    addi $2, $0, 10
    syscall

    puts $13

becomes:

    addi $2, $0, 4
    add  $4, $0, $13
    syscall

MAL → TAL

Arithmetic instructions:

    MAL                     TAL
    move $4, $3             add  $4, $3, $0
    add  $4, $3, 15         addi $4, $3, 15     # also andi, ori, ..
    mul  $8, $9, $10        mult $9, $10        # HI-LO ← product, never overflows
                            mflo $8             # $8 ← LO, ignore HI!
    div  $8, $9, $10        div  $9, $10        # LO ← quotient, HI ← remainder
                            mflo $8
    rem  $8, $9, $10        div  $9, $10
                            mfhi $8

MAL → TAL

Branch instructions:

    MAL:  bltz, bgez, blez, bgtz, beqz, bnez, blt, bge, bgt, beq, bne
    TAL:  bltz, bgez, blez, bgtz, beq, bne

    MAL                     TAL
    beqz $4, loop           beq $4, $0, loop
    blt  $4, $5, target     slt $t0, $4, $5     # $t0 is 1 if $4 < $5
                                                # $t0 is 0 otherwise
                            bne $t0, $0, target

Assembler

The assembler will:
• Assign addresses
• Generate machine code

If necessary, the assembler will:
• Translate (synthesize) from the accepted assembly to the instructions available in the architecture
• Provide macros and other features
• Generate an image of what memory must look like for the program to be executed

Assembler

What should the assembler do when it sees a directive?
• .data
• .text
• .space, .word, .byte, .float
• main:

How is the memory image formed?

Assembler

Example data declaration:

            .data
    a1:     .word  3
    a2:     .byte  '\n'
    a3:     .space 5

    Address       Contents
    0x00001000    0x00000003
    0x00001004    0x??????0a
    0x00001008    0x????????
    0x0000100c    0x????????

• The assembler aligns data to word addresses unless told not to.
• The assembly process is very sequential.
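The sequential address assignment in the data declaration example can be sketched in Python. The function name layout and the (label, directive, argument) tuples are hypothetical, and the sketch assumes the simplification stated on the slide, that every item starts on a word boundary; a real assembler has finer-grained alignment rules.

```python
def layout(directives, base=0x1000):
    """Assign an address to each labelled data directive, in source order.

    Assumption taken from the slide: every item is aligned to a 4-byte
    word boundary unless the programmer says otherwise.
    """
    sizes = {".word": 4, ".byte": 1}
    symbols = {}
    addr = base
    for label, directive, arg in directives:
        addr = (addr + 3) & ~3  # round up to the next word boundary
        symbols[label] = addr
        # .space reserves `arg` bytes; .word and .byte have a fixed size
        addr += arg if directive == ".space" else sizes[directive]
    return symbols, addr


symbols, end = layout([
    ("a1", ".word", 3),
    ("a2", ".byte", ord("\n")),
    ("a3", ".space", 5),
])
for name, address in symbols.items():
    print(name, hex(address))
```

With the three declarations from the slide this places a1, a2 and a3 at 0x1000, 0x1004 and 0x1008, and the 5 bytes of a3 spill into the word at 0x100c, which is why the table shows four rows.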
Machine code generation

Assembly language:

    addi $8, $20, 15
    (opcode = addi, rt = $8, rs = $20, immediate = 15)

Machine code format (bit 31 on the left, bit 0 on the right):

    opcode | rs | rt | immediate

• opcode is 6 bits; addi is defined to be 001000
• rs, the source register, is 5 bits; the encoding of 20 is 10100
• rt, the target register, is 5 bits; the encoding of 8 is 01000
• immediate is 16 bits; the encoding of 15 is 0000000000001111

The 32-bit instruction for addi $8, $20, 15 is:

    001000 10100 01000 0000000000001111

or 0x2288000f.

Instruction Formats

I-Type: instructions with 16-bit immediates

    • ADDI, ORI, ANDI, LUI, …    OPC:6 | rs1:5 | rd:5     | immediate:16
    • LW, SW                     OPC:6 | rs1:5 | rs2/rd:5 | displacement:16
    • BNE                        OPC:6 | rs1:5 | rs2:5    | distance(instr):16

Instruction Formats

J-Type: instructions with a 26-bit immediate

    • J, JAL                     OPC:6 | 26 bits of jump address

R-Type: all other instructions

    • ADD, AND, OR, JR, JALR, SYSCALL, MULT, MFHI, SLT
                                 OPC:6 | rs1:5 | rs2:5 | rd:5 | ALU function:11

Assembly Example

            .data
    a1:     .word 3
    a2:     .word 16:4
    a3:     .word 5

            .text
    main:   la  $6, a2
    loop:   lw  $7, 4($6)
            mul $8, $9, $10
            b   loop
            done

"Symbol Table"

    Symbol    Address
    a1        0040 0000
    a2        0040 0004
    a3        0040 0014
    main      0080 0000
    loop      0080 0008

Assembly Example

Memory map of .data section

    Address      Contents (hex)   Contents (binary)
    0040 0000    0000 0003        0000 0000 0000 0000 0000 0000 0000 0011
    0040 0004    0000 0010        0000 0000 0000 0000 0000 0000 0001 0000
    0040 0008    0000 0010        0000 0000 0000 0000 0000 0000 0001 0000
    0040 000c    0000 0010        0000 0000 0000 0000 0000 0000 0001 0000
    0040 0010    0000 0010        0000 0000 0000 0000 0000 0000 0001 0000
    0040 0014    0000 0005        0000 0000 0000 0000 0000 0000 0000 0101

Assembly Example

Translation of MAL to TAL code

            .text
    main:   lui  $6, 0x0040         # la $6, a2
            ori  $6, $6, 0x0004
    loop:   lw   $7, 4($6)
            mult $9, $10            # mul $8, $9, $10
            mflo $8
            beq  $0, $0, loop       # b loop
            ori  $2, $0, 10         # done
            syscall

Assembly Example

Memory map of .text section

    Address      Contents (hex)   Contents (binary)
    0080 0000    3c06 0040        0011 1100 0000 0110 0000 0000 0100 0000   (lui)
    0080 0004    34c6 0004        0011 0100 1100 0110 0000 0000 0000 0100   (ori)
    0080 0008    8cc7 0004        1000 1100 1100 0111 0000 0000 0000 0100   (lw)
    0080 000c    012a 0018        0000 0001 0010 1010 0000 0000 0001 1000   (mult)
    0080 0010    0000 4012        0000 0000 0000 0000 0100 0000 0001 0010   (mflo)
    0080 0014    1000 fffc        0001 0000 0000 0000 1111 1111 1111 1100   (beq)
    0080 0018    3402 000a        0011 0100 0000 0010 0000 0000 0000 1010   (ori)
    0080 001c    0000 000c        0000 0000 0000 0000 0000 0000 0000 1100   (sys)

Assembly Example

Branch offset computation

At execution time:

    PC ← NPC + {sign-extended offset field, 00}

• PC points to the instruction after the beq when the offset is added.

At assembly time:

    Byte offset = target addr - (address of branch + 4)
                = 00800008 - (00800014 + 00000004)
                = FFFFFFF0 (-16)
    Word offset = -16 / 4 = -4, stored in the 16-bit field as FFFC

Assembly Example

4 important observations:
• The offset is stored in the instruction as a word offset.
• An offset may be negative.
• The field dedicated to the offset is 16 bits, so the range is limited.
• More simply: just count the number of instructions from the instruction following the branch to the target, and encode that as a 16-bit value.

Assembly

Jump target computation

At execution time:

    PC ← {most significant 4 bits of PC, target field, 00}

At assembly time:
• Take the 32-bit target address
• Eliminate the least significant 2 bits (since it is word aligned)
• Eliminate the most significant 4 bits
• What remains is 26 bits, and goes in the target field

Linking N' Loading

The process of building/configuring the executable, placing it in memory, and running it.
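Looking back at the assembly example, the field arithmetic for I-type encoding, branch offsets and jump targets can be checked with a short Python sketch. The function names are hypothetical; the opcodes (001000 for addi, 000100 for beq) and the addresses come from the example, where the memory map places the beq at 0080 0014 and loop at 0080 0008.

```python
def encode_i_type(opcode, rs, rt, imm):
    """Pack opcode:6 | rs:5 | rt:5 | immediate:16 into one 32-bit word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def branch_offset_field(branch_addr, target_addr):
    """16-bit word offset, measured from the instruction after the branch."""
    byte_offset = target_addr - (branch_addr + 4)
    if byte_offset % 4 != 0:
        raise ValueError("branch target is not word aligned")
    word_offset = byte_offset // 4
    if not -(1 << 15) <= word_offset < (1 << 15):
        raise ValueError("branch target out of 16-bit range")
    return word_offset & 0xFFFF  # two's complement in 16 bits


def jump_target_field(target_addr):
    """Drop the low 2 bits and the high 4 bits, leaving a 26-bit field."""
    return (target_addr >> 2) & 0x03FFFFFF


addi_word = encode_i_type(0b001000, 20, 8, 15)          # addi $8, $20, 15
offset = branch_offset_field(0x00800014, 0x00800008)    # beq back to loop
beq_word = encode_i_type(0b000100, 0, 0, offset)        # beq $0, $0, loop
print(f"{addi_word:08x} {beq_word:08x} {jump_target_field(0x00800008):07x}")
```

The range check mirrors the observation that a 16-bit field limits how far a branch can reach; a jump gets 26 bits because it has no register fields to share the word with.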
Linking and Loading

Linker
• Searches libraries
• Reads object files
• Relocates code/data
• Resolves external references
• Creates object file

Linking and Loading

Loader
• Creates address spaces for text & data
• Copies text & data into memory
• Initializes the stack and copies args
• Initializes regs (maybe)
• Initializes other things (OS)
• Jumps to the startup routine
    • And then to the address of "main:"

Linking and Loading

Object file

    Section             Description
    Header              Start/size of other parts
    Text                Machine language
    Data                Static data: size and initial values
    Relocation info     Instructions and data with absolute addresses
    Symbol table        Addresses of external labels
    Debugging info      Break points

Linking and Loading

• The data section starts at 0x0040 0000 for the MIPS processor.
• If the source code has

            .data
    a1:     .word 15
    a2:     .word -2

  then the assembler specifies the initial configuration of memory as

    Address       Contents
    0x00400000    0000 0000 0000 0000 0000 0000 0000 1111
    0x00400004    1111 1111 1111 1111 1111 1111 1111 1110

• Like the data, the code needs to be placed starting at a specific location to make it work.

Linking and Loading

Consider the case where the assembly language code is split across 2 files. Each is assembled separately.

File 1:

            .data
    a1:     .word 15
    a2:     .word -2

            .text
    main:   la  $t0, a1
            add $t1, $t0, $s3
            jal proc5
            done

File 2:

            .data
    a3:     .word 0

            .text
    proc5:  lw  $t6, a1
            sub $t2, $t0, $s4
            jr  $ra

Linking and Loading

What happens to…
• a1
• a3
• main
• proc5
• lw
• la
• jal

Linking and Loading

Problem: there are absolute addresses in the machine code.

Solutions:
1. Only allow a single source file
    • Why not?
2. Allow linking and loading to
    • Relocate pieces of data and code sections
    • Finish the machine code where symbols were left undefined
    • Basically, this makes an absolute address a relative address

Linking and Loading

The assembler will:
• Start both data and code sections at address 0, for all files.
• Keep track of the size of every data and code section.
• Keep track of all absolute addresses within the file.

Linking and Loading

Linking and loading will:
• Assign starting addresses for all data and code sections, based on their sizes.
• The blocks of data and code go at nonoverlapping locations.
• Fix all absolute addresses in the code.
• Place the linked code and data in memory at the location assigned.
• Start it up.

MIPS Example

Code levels of abstraction (from James Larus)

"C" code:

    #include <stdio.h>

    int main (int argc, char *argv[])
    {
        int I;
        int sum = 0;
        for (I=0; I<=100; I++) sum += I * I;
        printf ("The sum 0..100=%d\n", sum);
    }

Compile this HLL into a machine's assembly language with the compiler.
MIPS Example

Converted into MAL…

            .data
    str:    .asciiz "The sum 0..100=%d\n"

            .text
    main:   subu $sp, 32
            sw   $31, 20($sp)
            sw   $4, 32($sp)
            sw   $0, 24($sp)        # sum = 0
            sw   $0, 28($sp)        # I = 0
    loop:   lw   $14, 28($sp)
            mul  $15, $14, $14      # I * I
            lw   $24, 24($sp)
            addu $25, $24, $15      # sum + I * I
            sw   $25, 24($sp)
            addu $8, $14, 1         # I + 1
            sw   $8, 28($sp)
            ble  $8, 100, loop
            la   $4, str
            lw   $5, 24($sp)
            jal  printf
            move $2, $0
            lw   $31, 20($sp)
            addu $sp, 32
            jr   $31

MIPS Example

Now resolve the labels and convert to MIPS…

    addiu $sp, $sp, -32
    sw    $ra, 20($sp)
    sw    $a0, 32($sp)
    sw    $a1, 36($sp)
    sw    $0, 24($sp)
    sw    $0, 28($sp)
    lw    $t6, 28($sp)
    lw    $t8, 24($sp)
    multu $t6, $t6
    addiu $t0, $t6, 1
    slti  $at, $t0, 101
    sw    $t0, 28($sp)
    mflo  $t7
    addu  $t9, $t8, $t7
    bne   $at, $0, -9
    sw    $t9, 24($sp)
    lui   $a0, 4096
    lw    $a1, 24($sp)
    jal   1048812
    addiu $a0, $a0, 1072
    lw    $ra, 20($sp)
    addiu $sp, $sp, 32
    jr    $ra
    addu  $v0, $0, $0

Which the assembler then translates into binary machine code for instructions and data.

MIPS Example

Real MIPS machine language

    00100111101111011111111111100000
    10101111101111110000000000010100
    10101111101001000000000000100000
    10101111101001010000000000100100
    10101111101000000000000000011000
    10101111101000000000000000011100
    10001111101011100000000000011100
    10001111101110000000000000011000
    00000001110011100000000000011001
    00100101110010000000000000000001
    00101001000000010000000001100101
    10101111101010000000000000011100
    00000000000000000111100000010010
    00000011000011111100100000100001
    00010100001000001111111111110111
    10101111101110010000000000011000
    00111100000001000001000000000000
    10001111101001010000000000011000
    00001100000100000000000011101100
    00100100100001000000010000110000
    10001111101111110000000000010100
    00100111101111010000000000100000
    00000011111000000000000000001000
    00000000000000000001000000100001
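Going the other way, a word of the machine language listing can be split back into its fields to see which instruction it came from. The Python sketch below handles only the I-type layout (opcode:6, rs:5, rt:5, immediate:16); the function name is hypothetical, and the two sample words are the first word of the listing and the bne that closes the loop.

```python
def decode_i_type(word_bits):
    """Split a 32-character bit string into (opcode, rs, rt, signed immediate)."""
    word = int(word_bits, 2)
    opcode = word >> 26
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    imm = word & 0xFFFF
    if imm & 0x8000:        # sign-extend the 16-bit immediate
        imm -= 0x10000
    return opcode, rs, rt, imm


# First word of the listing: opcode 9 is addiu, register 29 is $sp
first = decode_i_type("00100111101111011111111111100000")

# The loop branch: opcode 5 is bne, register 1 is $at
branch = decode_i_type("00010100001000001111111111110111")

print(first, branch)
```

The decoded immediates, -32 and -9, match the operands of addiu $sp, $sp, -32 and bne $at, $0, -9 in the resolved listing.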