x86 ISA
Compiler
Baojian Hua
[email protected]

Front End
  source code -> lexical analyzer -> tokens -> parser -> abstract syntax tree -> semantic analyzer -> IR

Code Generation
  Before discussing code generation, we must understand what we are trying to generate:
    virtual machines
    bare architecture
    …
  This course uses x86
    So you'd learn how to program at the x86 level
    There is an online manual covering every detail
      relatively old, but enough for understanding
      Linux, Windows, gcc, …

x86
  Complex Instruction Set Computer (CISC)
    Instructions can operate on memory values
      e.g., add [eax], ebx
    Complex, multi-cycle instructions
      e.g., string-copy, call
    Many ways to do the same thing
      e.g., add eax, 1; inc eax; sub eax, -1
    Instructions are variable-length (1-15 bytes)
    Registers are not orthogonal

Capsule History
  1978, 8086: first x86 microprocessor, 16-bit
  1985, 80386: 32-bit, protected mode
  1989, 80486
  1993, Pentium: MMX
  2000, Pentium 4: deeply pipelined, high frequency
  2006, Intel Core 2: low power, multi-core

x86 ISA
  Instruction Set Architecture
    another programming language (instruction set)
      encoding, decoding
      assemble, compile to …
    different implementations
      say Intel vs AMD
  Basis for OS, compilers, etc.
    hardware-software interface

x86 ISA
  What's important here?
    OS and library
      Note: assembly programs are NOT portable
    language syntax
      another CFG, read the manual
    assembler
      directives etc.
      think "compiler", read the gas manual

OS and Library
  OS simplifies the programming model
    e.g., Linux and Windows disable segmentation
      the so-called "flat" model in the manual
      so all segment-related details may be ignored when reading the manual
  OS provides protection
    e.g., Linux and Windows run user programs in ring 3
      so you cannot change the page table! etc.

OS and Library
  OS provides system calls
    hide many crazy details
    but may still be annoying
  Libraries
    another level of indirection on top of OS system calls
    In particular, we'd use the C library

Syntax
  Syntax = data + instructions

Data
  Immediate
    4, 3.14, "hello"
  Register
    general-purpose: eax, ebx, …
    segment: remember? we don't care

Data
  Memory
    different usage: global, stack, heap
    but same behavior

Data
  Memory addressing mode
    seg:[base+index*scale+disp]
    any part can be null
    complex! right?
  e.g., int a[5][10], to read a[3][2]
      mov eax, 120               # row offset in bytes: 3 rows * 10 ints * 4 bytes
      mov ebx, 2                 # column index
      mov ecx, [eax+ebx*4+a]     # a + 120 + 2*4 = address of a[3][2]
  Problems with this strategy?

Instructions
  The manual covers all instructions in detail:
    Data movement
    Arithmetic
    Control transfer
    …
  Rather than explain all of these bit by bit, I'll give an example next

Assembler
  An assembler is more than just a compiler:
    it consumes assembly syntax
    it also offers the so-called directives
  Two main branches:
    Intel syntax
      assemblers on Windows: masm, nasm, …
      the syntax of the Intel manual!
    AT&T syntax
      Linux assembler: gas
      the syntax of GCC output!
      the good news is that recent versions of gas support Intel syntax!
  This course uses as (gas) with Intel syntax
    So reading the Intel manual is relatively easy

Example
    # Sum up an array of integers       (comments start with "#"; C/C++ style is also supported)
    # compiled by GCC:
    # $ gcc test.s
            .intel_syntax noprefix      # directive: tells the assembler that we prefer the Intel syntax
            .data                       # directive: assemble the following into the data section
    a:                                  # label: the current address
            .int 1, 2, 3, 4, 5, 6, 7, 8, 9, 10   # directive: store 10 integers starting at the address "a"
            .globl main                 # directive: make "main" a global symbol
            .text                       # directive: assemble the following into the text section
    main:                               # label: another address

Example, cont'
            push ebp
            mov ebp, esp                # Intel syntax: destination first
            # convention: eax: the sum, ebx: index
            xor eax, eax
            mov ebx, eax
    L_start:
            add eax, dword ptr [ebx*4+a]
            inc ebx
            cmp ebx, 10
            jl L_start
            leave
            ret

Summary
  Assembly programming is fun and simple conceptually
    but CISC architecture is …
    and it takes a compound knowledge of OS, architecture and compiler
  Read the online manual
    Essential for code generation