CSC 2400: Computer Systems
Towards the Hardware: Machine-Level Representation of Programs

Towards the Hardware
• High-level language (Java)
  - High-level language (C) → assembly language → machine language (IA-32)

Compilation Stages

High-level language:

    count = 0;
    while (n > 1) {
        count++;
        if (n & 1)
            n = n*3 + 1;
        else
            n = n/2;
    }

Assembly language:

            mov ECX, 0
    .loop:
            cmp EDX, 1
            jle .endloop
            add ECX, 1
            mov EAX, EDX
            and EAX, 1
            je  .else
            mov EAX, EDX
            add EDX, EAX
            add EDX, EAX
            add EDX, 1
            jmp .endif
    .else:
            sar EDX, 1
    .endif:
            jmp .loop
    .endloop:

Machine language:
[Figure: the assembled program shown as rows of binary digits]

Building and Running
• To build an executable
    $ gcc program.c -o program
• Result
  - Complete executable binary file
  - Machine language
• To run:
    $ ./program

Compiling Into Assembly

C code (code.c):

    int sum(int x, int y)
    {
        int t = x+y;
        accum += t;
        return t;
    }

Generated assembly (code.s):

    sum:
            push ebp
            mov  ebp, esp
            mov  edx, DWORD PTR [ebp+12]
            mov  eax, DWORD PTR [ebp+8]
            add  eax, edx
            pop  ebp
            ret

• Obtained with the command
    gcc -masm=intel -S code.c
• Produces the file code.s

Lecture Goals
• Help you learn…
  - Relationship between C and assembly language
  - IA-32 assembly language through an example
• Why do you need to know this?
Computer Architecture Background

CPU and Memory
• Example of a dual core architecture
  [Figure: dual core CPU connected to memory]
• Memory
  [Figure: memory as a sequence of addressed bytes: Address 0: 10100111, Address 1: 11010010, Address 2: 00101001, Address 3: 11000010, ...]
  - The only large storage area that the CPU can access directly
  - Hence, any program executing must be in memory

Central Processing Unit (CPU)
[Figure: CPU (control unit, ALU, registers) exchanging addresses, data, and instructions with memory (TEXT, DATA, HEAP, STACK)]
• Control unit
  - Fetch, decode
• Arithmetic and logic unit
  - Execute low-level operations
• Registers
  - High-speed temporary storage

Central Processing Unit (CPU)
• Runs the Fetch-Decode-Execute loop
  [Figure: START → fetch next instruction → decode instruction → execute instruction → back to fetch, until HALT]
  - Fetch the next instruction from memory
    Where from? A CPU register called the Instruction Pointer (EIP) holds the address of the instruction to be fetched next
  - Decode the instruction and fetch the operands
  - Execute the instruction and store the result

Control Unit: Instruction Decoder
• Determines what operations need to take place
  - Translate the machine-language instruction
• Control what operations are done on what data
  - E.g., control what data are fed to the ALU
  - E.g., enable the ALU to do multiplication or addition
  - E.g., read from a particular address in memory
[Figure: ALU with inputs src1, src2, and operation; outputs dst and flag/carry]

IA-32 Architecture
[Figure: CPU registers (EIP, EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP) and condition codes (CF, ZF, SF, OF in EFLAGS), exchanging addresses, data, and instructions with memory (TEXT, DATA, HEAP, STACK)]
• EIP: Instruction Pointer
• ESP, EBP: Reserved for special use
• EAX: Always contains return value

Registers
• Small amount of storage on the CPU
  - Can be accessed more quickly than main memory
• Instructions move data in and out of registers
  - Loading registers from main memory
  - Storing registers to main memory
• Instructions manipulate the register contents
  - Registers essentially act as temporary variables
  - For efficient manipulation of the data
• Registers are the top of the memory hierarchy
  - Ahead of main memory, disk, tape, …

C Code vs. Assembly Code

Kinds of Instructions

    count = 0;
    while (n > 1) {
        count++;
        if (n & 1)
            n = n*3 + 1;
        else
            n = n/2;
    }

• Reading and writing data
  - count = 0
  - n
• Arithmetic and logic operations
  - Increment: count++
  - Multiply: n * 3
  - Divide: n/2
  - Logical AND: n & 1
• Checking results of comparisons
  - Is (n > 1) true or false?
  - Is (n & 1) non-zero or zero?
• Changing the flow of control
  - To the end of the while loop (if “n ≤ 1”)
  - Back to the beginning of the loop
  - To the else clause (if “n & 1” is 0)

Variables in Registers
• Registers: count is held in ECX, n is held in EDX

    mov ECX, DWORD PTR count
    mov EDX, DWORD PTR n

Immediate and Register Addressing
• count = 0 and count++

    mov ECX, 0
    add ECX, 1

  - The immediate value (0 or 1) is read directly from the instruction
  - The result is written to a register
• General syntax: op Dest, Src
  - Perform operation op on Src and Dest
  - Save result in Dest
• n & 1

    mov EAX, EDX
    and EAX, 1

  - Computing intermediate value in register EAX
• n = n*3 + 1

    mov EAX, EDX
    add EDX, EAX
    add EDX, EAX
    add EDX, 1

  - Adding n twice is cheaper than multiplication!
• n = n/2

    sar EDX, 1

  - Shifting right by 1 bit is cheaper than division!

Changing Program Flow
• Cannot simply run next instruction
  - Check result of a previous operation
  - Jump to appropriate next instruction
• Jump instructions
  - Load new address in instruction pointer
• Example jump instructions
  - Jump unconditionally (e.g., “}”)
  - Jump if zero (e.g., “n & 1”)
  - Jump if greater/less (e.g., “n > 1”)

Jump Instructions
• Jump depends on the result of the previous arithmetic instruction, as recorded in the condition codes (CF, ZF, SF, OF in EFLAGS)

    Jump   Description
    jmp    Unconditional
    je     Equal / Zero
    jne    Not Equal / Not Zero
    js     Negative
    jns    Nonnegative
    jg     Greater (signed)
    jge    Greater or Equal (signed)
    jl     Less (signed)
    jle    Less or Equal (signed)
    ja     Above (unsigned)
    jb     Below (unsigned)

Conditional and Unconditional Jumps
• Comparison: cmp compares two integers
  - Done by subtracting the second operand from the first
  - Discards the result, but sets flags as a side effect:

        cmp EDX, 1        (computes EDX - 1)
        jle .endloop      (checks whether result was 0 or negative)

• Logical operation: and also sets the flags according to its result:

        and EAX, 1        (bit-wise AND of EAX with 1)
        je  .else         (checks whether result was 0)

• Also, can do an unconditional branch with jmp

        jmp .endif
        jmp .loop

Jump and Labels: While Loop
• count is in ECX, n is in EDX

    while (n>1) {
        …
    }

    .loop:
            cmp EDX, 1        # Checking if EDX is less than or equal to 1
            jle .endloop
            …
            jmp .loop
    .endloop:

Jump and Labels: While Loop

    count=0;
    while (n>1) {
        count++;
        if (n&1)
            n = n*3+1;
        else
            n = n/2;
    }

            mov ECX, 0
    .loop:
            cmp EDX, 1
            jle .endloop
            add ECX, 1
            mov EAX, EDX
            and EAX, 1
            je  .else
            mov EAX, EDX
            add EDX, EAX
            add EDX, EAX
            add EDX, 1
            jmp .endif
    .else:
            sar EDX, 1
    .endif:
            jmp .loop
    .endloop:

Jump and Labels: If-Then-Else

    if (n&1)
        ...
    else
        ...

            mov EAX, EDX
            and EAX, 1
            je  .else
            …                 # “then” block
            jmp .endif
    .else:
            …                 # “else” block
    .endif:

Jump and Labels: If-Then-Else

    count=0;
    while(n>1) {
        count++;
        if (n&1)
            n = n*3+1;
        else
            n = n/2;
    }

            mov ECX, 0
    .loop:
            cmp EDX, 1
            jle .endloop
            add ECX, 1
            mov EAX, EDX
            and EAX, 1
            je  .else
            mov EAX, EDX      # “then” block
            add EDX, EAX
            add EDX, EAX
            add EDX, 1
            jmp .endif
    .else:
            sar EDX, 1        # “else” block
    .endif:
            jmp .loop
    .endloop:

Code More Efficient…
• The “jmp .endif” at the end of the “then” block lands on an instruction that immediately jumps to .loop
• Replace it with “jmp .loop”

Complete Example
• The full listing above, with count in ECX and n in EDX

Reading IA-32 Assembly Language
• Referring to a register:
  - EAX, eax, EBX, ebx, etc.
• Result stored in the first argument
  - E.g., “mov EAX, EDX” moves EDX into EAX
  - E.g., “add ECX, 1” increments register ECX
• Assembler directives: starting with a period (“.”)
  - E.g., “.section .text” to start the text section of memory
• Comment: pound sign (“#”)
  - E.g., “# Purpose: Convert lower to upper case”

X86 → C Example

    int Example(int x){
        ???
    }

Write comments:

            push EBP
            mov  EBP, ESP
            mov  ECX, [EBP+8]     # ECX = x
            xor  EAX, EAX         # EAX =
            xor  EDX, EDX         # EDX =
            cmp  EDX, ECX         # if
            jge  L2               #    goto L2
    L1:
            add  EAX, EDX         # EAX =
            inc  EDX              # EDX =
            cmp  EDX, ECX         # if
            jl   L1               #    goto L1
    L2:
            mov  ESP, EBP
            pop  EBP
            ret

Name the Variables
• EAX → result, EDX → i

            mov  ECX, [EBP+8]     # ECX = x
            xor  EAX, EAX         # result =
            xor  EDX, EDX         # i =
            cmp  EDX, ECX         # if
            jge  L2               #    goto L2
    L1:
            add  EAX, EDX         # result =
            inc  EDX              # i =
            cmp  EDX, ECX         # if
            jl   L1               #    goto L1
    L2:

Identify the Loop
• Start from the goto form:

        result = 0;
        i = 0;
        if (i >= x) goto L2;
    L1: result += i;
        i++;
        if (i < x) goto L1;
    L2:

• The backward jump is a do-while loop:

        result = 0;
        i = 0;
        if (i >= x) goto L2;
        do {
            result += i;
            i++;
        } while (i < x);
    L2:

• The guard before it turns the do-while into a while loop:

        result = 0;
        i = 0;
        while (i < x){
            result += i;
            i++;
        }

• Which is a for loop:

        result = 0;
        for (i = 0; i < x; i++)
            result += i;

C Code

    int Example(int x)
    {
        int result=0;
        int i;
        for (i=0; i<x; i++)
            result += i;
        return result;
    }

Exercise

    int F(int x, int y){
        ???
    }

[Labels .L2: and .L1: mark jump targets within this listing; their line positions are not preserved here]

            push EBP
            mov  EBP, ESP
            mov  ECX, [EBP+8]     # ECX = x
            mov  EDX, [EBP+12]    # EDX = y
            xor  EAX, EAX
            cmp  ECX, EDX
            jle  .L1
            dec  ECX
            inc  EDX
            inc  EAX
            cmp  ECX, EDX
            jg   .L2
            inc  EAX
            mov  ESP, EBP
            pop  EBP
            ret

C Code

    int F(int x, int y)

Instructions to Recognize

Arithmetic Instructions (1)
• Two Operand Instructions

    ADD  Dest, Src    Dest = Dest + Src
    SUB  Dest, Src    Dest = Dest - Src
    IMUL Dest, Src    Dest = Dest * Src
    SAL  Dest, Src    Dest = Dest << Src
    SAR  Dest, Src    Dest = Dest >> Src    (arithmetic)
    SHR  Dest, Src    Dest = Dest >> Src    (logical)
    XOR  Dest, Src    Dest = Dest ^ Src
    AND  Dest, Src    Dest = Dest & Src
    OR   Dest, Src    Dest = Dest | Src

Arithmetic Instructions (2)
• One Operand Instructions

    INC  Dest         Dest = Dest + 1
    DEC  Dest         Dest = Dest - 1
    NEG  Dest         Dest = -Dest
    NOT  Dest         Dest = ~Dest

Compare and Test Instructions

    CMP  Dest, Src    Compute Dest - Src without setting Dest
    TEST Dest, Src    Compute Dest & Src without setting Dest

Jump Instructions
• Jump depending on the result of the previous arithmetic instruction:

    Jump   Description
    jmp    Unconditional
    je     Equal / Zero
    jne    Not Equal / Not Zero
    js     Negative
    jns    Nonnegative
    jg     Greater (signed)
    jge    Greater or Equal (signed)
    jl     Less (signed)
    jle    Less or Equal (signed)
    ja     Above (unsigned)
    jb     Below (unsigned)

Loading and Storing Data: Memory Addressing Modes

IA-32 General Purpose Registers

    32-bit (bits 31-0)   16-bit (bits 15-0)   bits 15-8   bits 7-0
    EAX                  AX                   AH          AL
    EBX                  BX                   BH          BL
    ECX                  CX                   CH          CL
    EDX                  DX                   DH          DL
    ESI                  SI
    EDI                  DI

C Example: One-Byte Data
• Global char variable i is in AL, the lower byte of the “A” register.

    char i;
    …
    if (i > 5) {
        i++;
    } else {
        i--;
    }

            CMP AL, 5
            JLE else
            INC AL
            JMP endif
    else:
            DEC AL
    endif:

C Example: Four-Byte Data
• Global int variable i is in EAX, the full 32 bits of the “A” register.

    int i;
    …
    if (i > 5) {
        i++;
    } else {
        i--;
    }

            CMP EAX, 5
            JLE else
            INC EAX
            JMP endif
    else:
            DEC EAX
    endif:

Loading and Storing Data
• Processors have many ways to access data
  - Known as “addressing modes”
  - Two simple ways seen in previous examples
• Addressing Modes
  - Register addressing
  - Immediate addressing
  - Pointer addressing

Register Addressing
• Registers embedded in the instruction
• Examples
  - XOR EAX, EAX
  - MOV EBX, ECX
  - INC BH

Immediate Addressing
• Data embedded in the instruction
• Examples
  - ADD EAX, 3
  - MOV AX, -40

Pointer Addressing (direct)
• Memory address embedded in the instruction
  - Direct memory addressing
• Examples
  - MOV AL, [2000]
  - Read the byte from memory address 2000
  - Load the byte value in the register AL
• Note that
  - 2000 refers to the constant value 2000
  - [2000] refers to the memory location at address 2000

Pointer Addressing (indirect)
• Load or store from a previously-computed address
  - Register with the base address is in the instruction
  - Indirect memory addressing
• Examples
  - MOV AX, [BX] (register indirect)
  - CMP DL, [BX+8] (base+displacement)
  - MOV EAX, [EBX+ESI*4+8] (base+index+displacement)

Indexed Addressing Example

    int a[20];                /* global variable */
    …
    int i, sum=0;
    for (i=0; i<20; i++)
        sum += a[i];

• EAX: i
• EBX: sum
• ECX: address of a[0]

            MOV EAX, 0
            MOV EBX, 0
            MOV ECX, OFFSET FLAT:a
    sumloop:
            ADD EBX, [ECX+EAX*4]
            INC EAX
            CMP EAX, 20
            JNE sumloop

Data Access Methods: Summary
• Immediate addressing: data stored in the instruction itself
  - MOV ECX, 10
• Register addressing: data stored in a register
  - MOV ECX, EAX
• Direct addressing: address stored in instruction
  - MOV ECX, [200]
• Indirect addressing: address stored in a register
  - MOV ECX, [EAX]
  - MOV ECX, [EAX+4]
  - MOV ECX, [EAX + ESI*4 + 12]

LEA: Load Effective Address
• LEA Dest, Src
  - Src is an address mode expression
  - Set Dest to the address denoted by the expression
• Example
  - LEA EAX, [EBX+4*ESI]
  - Load into EAX the value EBX+4*ESI
• Compare to
  - MOV EAX, [EBX+4*ESI]
  - Load into EAX the value stored in memory at address EBX+4*ESI

Summary
• Hardware
  - Memory is the only large storage area the CPU can access directly
  - Executables are stored on the disk
  - Fetch-Decode-Execute cycle for running executables
• Assembly code
  - Programming the “bare metal” of the hardware
  - Loading and storing data, arithmetic and logic operations, checking results, and changing control flow
• To get more familiar with IA-32 assembly
  - Generate your own assembly-language code
      gcc -masm=intel -S code.c