Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
嵌入式處理器架構與程式設計 王建民 中央研究院 資訊所 2008年 7月 Contents Introduction Computer Architecture ARM Architecture Development Tools GNU Development Tools ARM Instruction Set ARM Assembly Language ARM Assembly Programming GNU ARM ToolChain Interrupts and Monitor 2 Lecture 4 Development Tools Outline Compilers Assemblers Linkers and Loaders Runtime Environment 4 What is a compiler? A program translator Source language E.g., C, C++, Java, Pascal Target language E.g., assembly language for x86, MIPS, ARM 5 Historical Background Machine language first 1957: First FORTRAN compiler 18 programmer-years of effort Extremely ad hoc Today’s techniques were created in response to the difficulties of implementing early compilers 6 Phases of a Compiler Analysis (“front end”) Synthesis (“back end”) Lexical Analysis Syntax Analysis Semantic Analysis Intermediate Code Generation Intermediate Code Optimization Target Code Generation/Optimization Front & back ends share symbol table 7 Lexical Analysis Aka “scanning”, transform characters into tokens Example: TDOUBLE TIDENT TOP double f = sqrt(-1); TIDENT TLPAREN TOP TINTCONSTANT TRPAREN TSEP (“double”) (“f”) (“=“) (“sqrt”) (“(“) (“-”) (“1”) (“)”) (“;”) 8 Syntax Analysis Aka “parsing” Uses context-free grammars Structural validation Creates parse tree or derivation 9 Derivation of “sqrt(-1)” Expression -> UnaryExpression Expression -> FuncCall Expression -> TINTCONSTANT UnaryExpression -> TOP Expression FuncCall -> TIDENT TLPAREN Expression TRPAREN Expression -> FuncCall -> TIDENT TLPAREN -> TIDENT TLPAREN -> TIDENT TLPAREN -> TIDENT TLPAREN Expression TRPAREN UnaryExpression TRPAREN TOP Expression TRPAREN TOP TINTCONSTANT TRPAREN 10 Parse Tree of “sqrt(-1)” Expression FuncCall Expression UnaryExpression Expression TIDENT TLPAREN TOP TINTCONSTANT TRPAREN 11 Semantic Analysis “Does it make sense”? Checking semantic rules, such as Is variable declared? Are operand types compatible? Do function arguments match function declarations? Types 12 Intermediate Code Generation A program for an abstract machine Requirements Easy to generate from parse tree Easy to translate into target code A variety of forms Quadruple or three-address code Register transfer language 13 Intermediate Code Example Three-address code (TAC) j = 2 * i + 1; if (j >= n) j = 2 * i + 3; return a[j]; L0: t1 = 2 * i t2 = t1 + 1 j = t2 t3 = j < n if t3 goto L0 t4 = 2 * i t5 = t4 + 3 j = t5 t6 = a[j] return t6 14 Intermediate Code Optimization Inhibiting code generation of unreachable code segments Getting rid of unused variables Eliminating multiplication by 1 and addition by 0 Loop optimization Common sub-expression elimination . . ., etc. 15 Code Optimization Example Before L0: t1 = 2 * i t2 = t1 + 1 j = t2 t3 = j < n if t3 goto L0 t4 = 2 * i t5 = t4 + 3 j = t5 t6 = a[j] return t6 After t1 = 2 * i j = t1 + 1 t3 = j < n if t3 goto L0 j = t1 + 3 L0: t6 = a[j] return t6 16 Target Code Generation Example: a in %o0, i in %o1, n in %o2, j in %g2 t1 = 2 * i sll %o1, 1, %o1 j = t1 + 1 t3 = j < n if t3 goto L0 add cmp blt nop add j = t1 + 3 L0: t6 = a[j] return t6 delayed branch %o1, 1, %g2 %g2, %o2 .LL3 %o1, 3, %g2 .LL3: sll %g2, 2, %g2 retl ld [%o0+%g2], %o0 17 Pascal Example: Source Code PROGRAM STATS VAR SUM,SUMSQ,I,VALUE,MEAN,VARIANCE : INTEGER BEGIN SUM := 0; SUMSQ := 0; FOR I := 1 TO 100 DO BEGIN READ(VALUE); SUM := SUM + VALUE; SUMSQ := SUMSQ + VALUE * VALUE END; MEAN := SUM DIV 100; VARIANCE := SUMSQ DIV 100 – MEAN * MEAN; WRITE(MEAN,VARIANCE) END. 18 Pascal Example: Token Coding Token PROGRAM VAR BEGIN END END. INTEGER FOR READ WRITE TO DO Code 1 2 3 4 5 6 7 8 9 10 11 Token ; : , := + * DIV ( ) id int Code 12 13 14 15 16 17 18 19 20 21 22 23 19 Scanner Output: Token Stream I Line 1 2 3 4 5 Token type 1 22 2 22 14 22 14 22 14 22 14 22 14 22 13 6 3 22 15 Toke specifier Line ^STATS 6 ^SUM ^SUMSQ 7 ^I ^VALUE ^MEAN ^VARIANCE ^SUM 8 9 Token type 23 12 22 15 23 12 7 22 15 23 10 23 11 3 8 20 22 21 12 Toke specifier #0 ^SUMSQ #0 ^I #1 #100 ^VALUE 20 Scanner Output: Token Stream II Line 10 11 12 13 Token type 22 15 22 16 22 12 22 15 22 16 22 18 22 4 12 22 15 22 19 Toke specifier ^SUM Line ^SUM 14 ^VALUE ^SUMSQ ^SUMSQ ^VALUE ^VALUE 15 ^MEAN ^SUM 16 Token type 23 12 22 15 22 19 23 17 22 18 22 12 9 20 22 14 22 21 5 Toke specifier #100 ^VARIANCE ^SUMSQ #100 ^MEAN ^MEAN ^MEAN ^VARIANCE 21 Pascal Example: BNF Grammar 1 <prog> 2 3 4 5 6 7 8 9 10 11 <prog-name> <decl-list> <dec> <type> <id-list> <stmt-list> <stmt> <assign> <exp> <term> 12 13 14 15 16 17 <factor> <read> <write> <for> <index-exp> <body> ::= PROGRAM <prog-name> VAR <decl-list> BEGIN <stmt-list> END. ::= id ::= <dec> | <decl-list> ; <dec> ::= <id-list> : <type> ::= INTEGER ::= id | <id-list> , id ::= <stmt> | <stmt-list> ; <stmt> ::= <assign> | <read> | <write> | <for> ::= id := <exp> ::= <term> | <exp> + <term> | <exp> - <term> ::= <factor> | <term> * <factor> | <term> DIV <factor> ::= id | int | ( <exp> ) ::= READ ( <id-list> ) ::= WRITE ( <id-list> ) ::= FOR <index-exp> DO <body> ::= id := <exp> TO <exp> ::= <stmt> | BEGIN <stmt-list> END 22 Parser Output: Parse Tree I 23 Parser Output: Parse Tree II 24 Intermediate Code I Operation Op1 Op2 Result (1) := #0 SUM {SUM := 0} (2) := #0 SUMSQ {SUMSQ := 0} (3) := #1 I {FOR I := 1 TO 100} (4) JGT I (5) CALL XREAD (6) PARAM VALUE (7) + SUM (8) := i1 (9) * VALUE VALUE i2 (10) + SUMSQ i2 i3 (11) := i3 (12) + I #100 (15) {READ(VALUE)} VALUE i1 SUM {SUM := SUM + VALUE} {SUMSQ := SUMSQ + VALUE * VALUE} SUMSQ #1 i4 {end of FOR loop} 25 Intermediate Code II Operation Op1 (13) := Op2 i4 Result I (14) J (4) (15) DIV SUM #100 i5 (16) := i5 (17) DIV SUMSQ #100 i6 (18) * MEAN MEAN i7 (19) - i6 i7 i8 (20) := i8 (21) CALL XWRITE {WRITE (22) PARAM MEAN (MEAN,VARIANCE)} (23) PARAM VARIANCE VARIANCE)} MEAN {MEAN := SUM DIV 100} {VARIANCE := SUMSQ DIV 100 - MEAN * MEAN} VARIANCE 26 Assembly Code I 27 Assembly Code II 28 Compiler Issues Symbol Table Management Scoping Error Handling & Recovery Passes One-pass vs. multi-pass Most compilers are one-pass up to code optimization phase Several passes are usually required for code optimization 29 End-to-End Compilation source Lexical tokens code analysis IR gen. IR Syntax analysis AST IR optimization IR parse tree Semantic analysis Target code generation Assembly code Assembler object code Linker & executable Execution Loader code 30 Outline Compilers Assemblers Linkers and Loaders Runtime Environment 31 C versus Assembly Language C is called a “portable assembly language” Allows low level operations on bits and bytes Allows access to memory via use of pointers Integrates well with assembly language functions Advantages over assembly code Easier to read and understand source code Requires fewer lines of code for same function Doesn’t require knowledge of the hardware 32 C versus Assembly Language Good reasons for learning assembly language In time-critical sections of code, it is possible to improve performance with assembly language It is a good way to learn how a processor works In writing a new operating system or in porting an existing system to a new machine, there are sections of code which must be written in assembly language 33 Best of Both Worlds Integrating C and assembly code Convenient to let C do most of the work and integrate with assembly code where needed Make our gas routines callable from C Use C compiler conventions for function calls Preserve registers that C compiler expects saved 34 GNU vs. Intel Assembler1 In this course, we will be using the GNU assembler, referred to as “gas” Available on UNIX machines as “i386-as” The GNU assembler uses the AT&T syntax (instead of official Intel/Microsoft syntax) Text is written using Intel assembly language syntax which is not the same as GNU syntax Local references will provide gas notes for the text sections with Intel assembly language 35 GNU vs. Intel Assembler2 Overall, the follow are the key differences between the Intel and the gas syntax: The GNU operation codes have a size indicator that is not present on the Intel operation codes Intel: MOV gas: movb, movw, or movl The GNU operands are in the opposite order from the Intel operands Intel: MOV dest, source gas: movb source, dest 36 GNU vs. Intel Assembler3 The GNU register names are preceded by a % that is not present on the Intel register names GNU constants are represented differently from the Intel constant representations Intel: MOV AH, AL gas: movb %al, %ah Intel: MOV AL, 0AH gas: movb $0xa, %al Comments are indicated with # instead of ; Intel: ; comment here gas: # comment here 37 GNU vs. Intel Assembler4 You should familiarize yourself with both Intel and GNU assembly language syntax Even for GNU assembler, the syntax may not be the same on different platforms. You may need to use Intel syntax in your professional work someday 38 The Four Field Format1 The Label Field A label is a symbol followed by : Can be referred to as a representation of the address The ‘Opcode’ Field Mnemonic to specify the instruction and size Unnecessary to remember instruction code values Directives to guide the work of the assembler In GNU assembly language, directive begins with . 39 The Four Field Format2 The Operand Field(s) On which the instruction operates Zero, one, or two operands depending on the instruction The Comment Field Comment contains documentation It begins with a # anywhere and goes to the end of the line 40 Symbolic Constants Allow use of symbols for numeric values Perform same function as #define in C Format is: SYMBOL = value Example: NCASES = 8 movl $NCASES, %eax 41 Assembly Coding for a C Function General form for a C function in assembly: .globl .text _mycode _mycode: _mydata: . . . ret .data .long .end 17 42 Assembler Directives1 Defining a label for external reference (call) .globl _mycode Defining code section of program .text Defining data section of program .data End of the Assembly Language .end 43 Assembler Directives2 Defining / initializing storage locations: .long .word .byte 0x12345678 0x1234 0x12 # 32 bits # 16 bits # 8 bits Defining / initializing a string .ascii .asciz “Hello World\n\0” “Hello World\n” 44 C Function Coding Conventions Same function name as used in the calling C program except with a leading _ Use only %eax, %ecx, and %edx to avoid using registers that the C compiler expects to be preserved across a call and return Save/restore other registers on stack as needed Return value in %eax before “ret” instruction 45 Example #1: Sum Two Numbers C “driver” to execute sum2.s is called sum2c.c extern int sum2(void); int main(void) { printf(“Sum2 returned %d\n”, sum2()); return 0; } 46 Example #1: Assembly Code Assembly code for sum2.s #sum2.s -- Sum of .text .globl _sum2: movl $8, addl $3, ret two numbers _sum2 %eax %eax #number in eax 47 How to pass parameters? How would you modify the source code so that sum2(int a, int b) returns the sum of two integer parameters a and b? extern int sum2(int a,int b); int main(void) { printf(“3 + 8 = %d\n”, sum2(3,8)); return 0; } 48 Addressing Memory and I/O With gas, we’ve already seen operands with: % for registers (part of the processor itself) $ for immediate data (part of the instruction itself) Accessing memory versus I/O Memory I/O Read movb address, %al inb address, %al Write movb %al, address outb %al, address 49 Addressing Memory1 Direct addressing for memory Intel uses [ ] gas does not use any “operator” Example: .text movl %eax, 0x1234 movl 0x1234, %edx . . . 50 Addressing Memory2 Direct addressing for memory Gas allows use of a variable name for address Examples: total: .text movl %eax, total movl total, %edx . . . .data .long 0 51 Addressing Memory3 Direct addressing for memory Why can’t we write an instruction such as this? movl first, second Intel instruction set does not support instructions to move a value from memory to memory! Must always use a register as an intermediate location for the value being moved 52 Addressing Memory4 Indirect Addressing Defined as using a register as the address of the memory location to access in an instruction movl movb $0x1234, %ebx (%ebx), %al Memory %ebx 0x00001234 One byte %al 53 Addressing Memory5 Indirect Addressing May also be done with a fixed offset, e.g. 4 movl $0x1234, %ebx movb 4(%ebx), %al Memory %ebx 0x00001234 +4 %al One byte 54 Outline Compilers Assemblers Linkers and Loaders Runtime Environment 55 Linker and Loader Functions Allocation Loading Relocation Linking Execution 56 Program Relocation 問題:如果程式載入的起始位址和原始 程式的預定值不同,則執行時無法得到 預期的結果 Absolute Program Relocatable Program 解決方法 利用PC relative或base relative定址模式 提供address modification的資訊給loader,於 程式載入執行前完成修正的工作 57 real address = starting address + offset Program Linking 允許獨立的程式單元,可以個別組譯, 執行時才和其他程式單元連結 Section:可以獨立組譯、載入和重定位的程 式單元,通常用於副程式或程式的邏輯單元 External definition:定義section內的symbols 給其他section使用 External reference:宣告section內所使用的 symbols為定義在其他section的external symbols 59 Classification Loader: a system program that performs the loading function. Absolute Loader Relocating Loader Linking Loader Linker: a system program that performs the linking function. Linkage Editor Binder 64 Linkage Editors1 作用 Linkage editor執行linking和部分relocation的 工作,並將連結好的程式寫到檔案或程式庫 說明 連結好的程式稱為load module或executable image 執行時只需利用簡單的relocating loader就可 以將程式載入記憶體內 載入的動作只需一個pass即可完成 65 Linkage Editors2 比較 Linking loader於每次執行時,都要進行所有 linking的工作,並須處理automatic library search和external references;linkage editor只 須於產生load module時進行一次上述工作 Linking loader將連結好的程式直接置於記憶 體內;linkage editor將連結好的程式置於檔 案或程式庫內 Linking loader較浪費時間,適合於每次執行 都要重新組譯的狀況;linkage editor適合執 行一個程式很多次而不必重新組譯的狀況 67 Dynamic Linking1 作用 程式執行前不做任何連結的工作,直到程式 執行後,當一個副程式第一次被呼叫時才被 載入記憶體,並和程式的其他部分連結 也稱為dynamic loading或load-on-call 比較 Linkage editor在程式被載入執行前進行連結 Linking loader在程式被載入時進行連結 Dynamic linking將連結延遲到程式執行後 68 Dynamic Linking2 優點 副程式真正用到時才將其載入,可節省載入 和連結沒有被執行到的副程式所需的時間和 佔用的空間 允許使用者以交談的方式呼叫任何副程式, 而不必載入整個程式庫 系統可以隨時動態地進行reconfiguration 缺點 系統較複雜而且有較多的額外工作(overhead) 69 Dynamic Linking3 方法 將dynamic loader視為作業系統的一部分,凡 是需要動態載入的程式都必須透過作業系統 服務來呼叫 呼叫副程式時,程式發出load-and-call的作 業系統服務請求,並以副程式名稱為參數 作業系統檢查內在表格以決定副程式是否已 載入,如尚未載入則將副程式由程式庫載入 作業系統將控制權轉移給被呼叫的副程式 被呼叫的副程式完成處理後,先將控制權傳 回作業系統,再由作業系統傳回原來的程式 72 Outline Compilers Assemblers Linkers and Loaders Runtime Environment 73 Runtime Environment To understand the environment in which your final output will be running. How a program is laid out in memory: Code Data Stack Heap How function callers and callees pass info 74 Executable Layout in Memory From low memory up: Code (text segment, instructions) Static (constant) data Global data Dynamic data (heap) Runtime stack (procedure calls) Review of what’s in each section: stack heap globl static code 75 Text Segment (Executable Code) Actual machine instructions Code segment write-protected, so running code can’t overwrite itself. Arithmetic / logical / comparison / branch / jump / load / store / move / … (Debugger can overwrite it.) You’ll create the precursor for the code in this segment by emitting assembly code. Assembler will build final text. 76 Data Segment1 Data Objects Whose size is known at compile time Whose lifetime is the full run of the program (not just during a function invocation) Static data includes things that won’t change (can be write-protected): Virtual-function dispatching tables String literals used in instructions Arithmetic literals could be, but more likely incorporated into instructions. 77 Data Segment2 Global data (other than static) Variables declared global Local variables declared static (in C) Declared local to a function. Retain values even between invocations of that function (lifetime is whole run). Semantic analysis ensures that static locals are not referenced outside their function scope. 78 Dynamic Data (Heap)1 Data created by malloc or New. Heap data lives until deallocated or until program ends. (Sometimes longer than you want, if you lose track of it.) Garbage collection / reference counting are ways of automatically de-allocating dead storage in the heap. 79 Dynamic Data (Heap)2 Heap allocation starts at bottom of heap (lower addresses) and allocates upward. Requirements of alignment, specifics of allocation algorithm may cause storage to be allocated out of (address) order. *p3 p1 = new Big(); p2 = new Medium(); *p2 *p4 p3 = new Big(); p4 = new Tiny(); So (int)p2 > (int)p1 *p1 0x1000000 But (int)p4 < (int)p3 Compare pointers for equality, not < or >. 80 Runtime Stack1 Data used for function invocation: Variables declared local to functions (including main) aka “automatic” data. Except for statics (in data segment) Variables declared in anonymous blocks inside function. Arguments to function (passed by caller). Temporaries used by generated code (not representing names in source). Possibly value returned by callee to caller. 81 Runtime Stack2 Types of data that can be allocated on runtime stack: In C, all kinds of data: simple types, structs, arrays. C++: stack can hold objects declared as class type, as well as pointer type. Some languages don’t allow arrays on stack. 82 Stack Terminology1 A stack is an abstract data type. Top Base Push new value onto Top; pop value off Top. Higher elements are more recent, lower elements are older. 83 Stack Terminology2 Stack implementation can grow any direction. MIPS stack grows downward (from higher memory addresses to lower). Possible difficulty with terminology. Some people (and documents) talk about going “up” and “down” the stack. Some use the abstraction, where “up” means “more recent”, towards Top. Some (including gdb) say “up” meaning “towards older entries”, toward Base. 84 Other Resources Caches (very fast) Physical memory (fast) Virtual memory (swapping is slower) Possibly multiple levels Includes main memory + swap space on disk Registers (the fastest) 85 Storage Layout Issues Variables (local & file-scope) Functions Objects Arrays Strings 86 Arrays C uses row-major order Whole first row is stored first, then whole second row, … then whole nth row. Fortran uses column-major order Whole first column is stored first, then whole second col, … then whole kth col. Storage still a big block, but a column is contiguous instead of a row. 87 Generating Code for Array Refs In C, use size of element and range of each dimension to compute the offset of any given element: If A has m rows and n columns of 4-byte elements: &A [i] [j] is &A + 4 * (n * i + j) 88 C struct Objects Structs in C are stored in adjacent words, enough for all fields to be aligned: struct cow { char milk; // 3 slack-bytes after this char* name; // aligned on single word } Cow; ‘A’ 0x20000 “Bossy” 89 Making Function Calls Each active function call has its own unique stack frame Frame pointer Static link Return address Arguments Local variables and temporaries Who does which (caller vs callee)? How do callers and callees communicate? 90 Who does what?1 Before a function call, the calling routine: Saves any necessary registers Pushes arguments onto the stack Sets up the static link (if appropriate) Saves the return address into $ra Jumps (or branches) to the target (AKA the callee, the called function) 91 Who does what?2 During a function call, the called routine: Saves any necessary registers Sets up the new frame pointer Makes space for any locals or temporaries Does its work Sets up return value in $v0 Works only for integer or pointer values Tears down frame pointer and static link Restores any saved registers Jumps to return address (saved on stack) 92 Who does what?3 After a function call, the calling routine: Removes return address and parameters from the stack Gets return value from $v0 Restores any saved registers Continues executing 93 Parameter Passing Call by value Call by value-result Supported by Ada Call by reference Supported by C Supported by Fortran Call by name Like C preprocessor macros 94 An Example Program int dump(arg1, arg2, arg3, stop) int arg1, arg2, arg3, *stop; { int loc1 = 5, loc2 = 6, loc3 = 7; int *p; printf("Address Content\n"); for (p = stop; p >= (int*)(&p); p--) printf("%8x: %8x\n", p, *p); return 9; } int main(argc, argv, envp) int argc; char *argv[], *envp[]; { int var1 = 1, var2 = 2, var3 = 3; var3 = dump(var1, var2, var3, &envp); } 95 Sample Output Address bffff9a8: bffff9a4: bffff9a0: bffff99c: bffff998: bffff994: bffff990: bffff98c: bffff988: bffff984: bffff980: bffff97c: bffff978: bffff974: bffff970: bffff96c: bffff968: bffff964: bffff960: bffff95c: bffff958: Content bffff9ec bffff9e4 1 420158d4 bffff9b8 1 2 3 bffff998 4212a2d0 4212aa58 bffff9a8 3 2 1 80483d1 bffff998 5 6 7 bffff958 Comment envp argv argc return address (crt0) fp (crt0) <- fp (main) var1 var2 var3 stop arg3 arg2 arg1 return address (main) fp (main) <- fp (dump) loc1 loc2 loc3 p 96