Part 2: Advanced Static Analysis
  Chapter 4: A Crash Course in x86 Disassembly
  Chapter 5: IDA Pro
  Chapter 6: Recognizing C Code Constructs in Assembly

How software works
  • The gcc compiler driver pre-processes, compiles, assembles, and links to generate an executable
    » Links together object code (e.g., game.o) and static libraries (e.g., libc.a) to form the final executable
    » Links in references to dynamic libraries for code loaded at load time (e.g., libc.so.1)
    » Executable may still load additional dynamic libraries at run time
  • Pipeline: hello.c (program source) → preprocessor → hello.i (modified source) → compiler → hello.s (assembly code) → assembler → hello.o (object code) → linker → hello (executable code)

Static libraries
  • Suppose you have utility code in x.c, y.c, and z.c that all of your programs use
  • Option 1: link together the individual .o files
      gcc -o hello hello.o x.o y.o z.o
  • Option 2: create a library libmyutil.a using ar and ranlib, and link the library in statically
      libmyutil.a : x.o y.o z.o
          ar rvu libmyutil.a x.o y.o z.o
          ranlib libmyutil.a
      gcc -o hello hello.c -L. -lmyutil
  • Note: library code is copied directly into the binary

Dynamic libraries
  • Goal: avoid having multiple copies of common code on disk
  • Problem: libc
    » Linking libc statically ("gcc program.c -lc" against libc.a) creates an a.out with libc object code in it
    » Almost all programs use libc!
  • Solution: compile binaries with a reference to a library of shared objects instead of an entire copy of the library
    » Libraries are loaded at run time from the file system
    » "ldd <binary>" shows which dynamic libraries a program relies upon
    » gcc flag "-shared" and linker option "-soname" (passed as -Wl,-soname,<name>) for generating and handling dynamic shared object files

The linking process (ld)
  • Merges object files
    » Merges multiple relocatable (.o) object files into a single executable program
  • Resolves external references
    » References to symbols defined in another object file
  • Relocates symbols
    » Relocates symbols from their relative locations in the .o files to new absolute positions in the executable
    » Updates all references to these symbols to reflect their new positions
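The relocation step described above can be illustrated with a toy model. This is only a sketch, not how ld is implemented: the `ToySymbol` and `ToyObject` types and both functions are hypothetical names invented here, and the model assumes each object file contributes a single .text section that is placed directly after the previous one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one symbol defined in a relocatable object file. */
typedef struct {
    const char *name;
    uint32_t    offset;    /* position relative to the start of the object's .text */
    uint32_t    address;   /* absolute address, filled in by toy_relocate() */
} ToySymbol;

/* Hypothetical model of one .o file: its symbols and the size of its .text. */
typedef struct {
    ToySymbol *symbols;
    size_t     count;
    uint32_t   text_size;
} ToyObject;

/* Merge step: lay the objects out back to back starting at `base` and turn
   each symbol's relative offset into an absolute address.
   Returns the first address past the merged code. */
uint32_t toy_relocate(ToyObject *objs, size_t nobjs, uint32_t base)
{
    assert(objs != NULL);
    uint32_t cursor = base;
    for (size_t i = 0; i < nobjs; i++) {
        for (size_t j = 0; j < objs[i].count; j++)
            objs[i].symbols[j].address = cursor + objs[i].symbols[j].offset;
        cursor += objs[i].text_size;
    }
    return cursor;
}

/* Resolve step: find the absolute address of a symbol by name across all
   objects. Returns 0 for an undefined symbol. */
uint32_t toy_resolve(const ToyObject *objs, size_t nobjs, const char *name)
{
    for (size_t i = 0; i < nobjs; i++)
        for (size_t j = 0; j < objs[i].count; j++)
            if (strcmp(objs[i].symbols[j].name, name) == 0)
                return objs[i].symbols[j].address;
    return 0;
}
```

With two objects of sizes 0x20 and 0x18 and an assumed base of 0x08048000, a symbol at offset 0 of the second object ends up at 0x08048020, and a call site in the first object would be patched with that address.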
The linking process (cont.)
  • References occur in both code and data
    » code: a();           /* reference to symbol a */
    » data: int *xp=&x;    /* reference to symbol x */

Executables
  • Various file formats
    » Linux = Executable and Linkable Format (ELF)
    » Windows = Portable Executable (PE)

ELF
  • Standard binary format for object files in Linux
  • One unified format for
    » Relocatable object files (.o)
    » Shared object files (.so)
    » Executable object files
  • Better support for shared libraries than old a.out formats
  • More complete information for debuggers

ELF Object File Format
  • ELF header: magic number, type (.o, exec, .so), machine, byte ordering, etc.
  • Program header table: page size, virtual addresses of memory segments (sections), segment sizes, entry point
  • .text section: code
  • .data section: initialized (static) data
  • .bss section: uninitialized (static) data ("Block Started by Symbol")
  • File layout, starting at offset 0: ELF header, program header table (required for executables), .text section, .data section, .bss section, .symtab, .rel.text, .rel.data, .debug, section header table (required for relocatables)

ELF Object File Format (cont.)
  • .symtab section: symbol table
    » Procedure and static variable names
    » Section names and locations
  • .rel.text section: relocation info for the .text section
    » Addresses of instructions that will need to be modified in the executable
    » Instructions for modifying
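The ELF header fields listed above (magic number, type, machine, byte ordering) sit at fixed offsets at the start of the file, so they can be read with a few byte accesses. The following is a minimal sketch under stated assumptions: `ToyElfInfo` and `toy_parse_elf_header` are hypothetical names, the caller has already read at least the first 20 bytes of the file into a buffer, and only a handful of fields are decoded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of the first ELF header fields. */
typedef struct {
    int      is_elf;         /* 1 if the buffer starts with 0x7f 'E' 'L' 'F' */
    int      is_64bit;       /* e_ident[4]: 1 = 32-bit, 2 = 64-bit */
    int      little_endian;  /* e_ident[5]: 1 = little endian, 2 = big endian */
    uint16_t type;           /* e_type: 1 = relocatable, 2 = executable, 3 = shared object */
    uint16_t machine;        /* e_machine: 3 = Intel 80386, 62 = x86-64 */
} ToyElfInfo;

/* Read a 16-bit field using the byte order announced by the header. */
static uint16_t toy_read_u16(const uint8_t *p, int little_endian)
{
    assert(p != NULL);
    return little_endian ? (uint16_t)(p[0] | (p[1] << 8))
                         : (uint16_t)((p[0] << 8) | p[1]);
}

/* Decode the magic number, class, byte order, e_type (offset 16) and
   e_machine (offset 18). Returns is_elf == 0 if the magic does not match. */
ToyElfInfo toy_parse_elf_header(const uint8_t *buf, size_t len)
{
    ToyElfInfo info = {0, 0, 0, 0, 0};
    if (buf == NULL || len < 20)
        return info;
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return info;
    info.is_elf = 1;
    info.is_64bit = (buf[4] == 2);
    info.little_endian = (buf[5] == 1);
    info.type = toy_read_u16(buf + 16, info.little_endian);
    info.machine = toy_read_u16(buf + 18, info.little_endian);
    return info;
}
```

In practice a tool such as readelf reports the same fields; the sketch only shows why the magic number and type are enough to tell a .o, an executable, and a .so apart.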
  • .rel.data section: relocation info for the .data section
    » Addresses of pointer data that will need to be modified in the merged executable
  • .debug section: info for symbolic debugging (gcc -g)

PE (Portable Executable) file format
  • Windows file format for executables
  • Based on the COFF format
    » Magic numbers, headers, tables, directories, sections
  • Disassemblers
    » Overlay data with C structures
    » Load file as the OS loader would
    » Identify entry points (default and exported)

Example C Program
  m.c:
      #include <stdlib.h>    /* declares exit() */
      int a(void);           /* a() is defined in a.c */
      int e=7;
      int main() { int r = a(); exit(0); }
  a.c:
      extern int e;
      int *ep=&e;
      int x=15;
      int y;
      int a() { return *ep+x+y; }

Merging Relocatable Object Files into an Executable Object File
  • Relocatable object files
    » system code (.text), system data (.data)
    » m.o: main() (.text); int e = 7 (.data)
    » a.o: a() (.text); int *ep = &e, int x = 15 (.data); int y (.bss)
  • Executable object file, starting at offset 0
    » headers
    » .text: system code, main(), a(), more system code
    » .data: system data, int e = 7, int *ep = &e, int x = 15
    » .bss: uninitialized data
    » .symtab
    » .debug

Program execution
  • Operating system provides
    » Protection and resource allocation
    » Abstract view of resources (files, system calls)
    » Virtual memory: a uniform memory space abstraction for each process, giving the illusion that each process has the entire memory space

How does a program get loaded?
  • The operating system creates a new process
    » Including, among other things, a virtual memory space
    » Important: any hardware-based debugger must know the OS state in the page tables to map accesses to virtual addresses
  • System loader reads the executable file from the file system into the memory space.
How does a program get loaded? (cont.)
  • Reads executable from file system into memory space
    » Executable contains code and statically linked libraries
    » Done via DMA (direct memory access)
    » Executable in file system remains and can be executed again

Loading Executable Binaries
  • Executable object file for example program p, starting at offset 0: ELF header, program header table (required for executables), .text section, .data section, .bss section, .symtab, .rel.text, .rel.data, .debug, section header table (required for relocatables)
  • Process image
      Virtual addr    Segment
      0x080483e0      init and shared lib segments
      0x08048494      .text segment (r/o)
      0x0804a010      .data segment (initialized r/w)
      0x0804a3b0      .bss segment (uninitialized r/w)

More on relocation
  • Assembly code contains relative and absolute addresses
  • With the VM abstraction, old linkers decide layout and can supply definitive addresses
    » Windows ".com" format
    » Linker can statically bind the program to virtual addresses
  • Now, they provide hints as to where they would like to be placed
    » But this could also be done at load time (address space layout randomization)
    » Windows ".exe" format
    » Loader rewrites addresses to proper offsets
  • System needs to force position-independent code
    » Force compiler to make all jumps and branches relative to the current location or relative to a base register set at run time
    » ELF uses a Global Offset Table

Program execution
  [Figure: the CPU (EIP, registers, condition codes) sends addresses to memory and exchanges data and instructions with it; memory holds object code, program data, OS data, and the stack]
  • Programmer-visible state
    » EIP: instruction pointer, a.k.a. program counter; address of next instruction
    » Register file: heavily used program data
    » Condition codes: store status information about the most recent arithmetic operation; used for conditional branching
    » Memory: byte-addressable array; code, user data, OS data; includes the stack used to support procedures

Run-time data structures
      0xffffffff    kernel virtual memory (code, data, heap, stack); memory invisible to user code
      0xc0000000    user stack (created at runtime), top marked by %esp (stack pointer)
      0x40000000    memory mapped region for shared libraries
                    run-time heap (managed by malloc), top marked by brk
                    read/write segment (.data, .bss), loaded from the executable file
      0x08048000    read-only segment (.init, .text, .rodata), loaded from the executable file
      0             unused

Registers
  • The processor operates on data in registers (usually)
    » movl (%eax), %ecx: fetch data at the address contained in %eax and store it in register %ecx
    » movl $array, %ecx: move the address of variable array into %ecx
  • Typically, data is loaded into registers, manipulated or used, and then written back to memory
  • The IA32 architecture is "register poor"
    » Few general purpose registers
    » Source or destination operand is often a memory location

IA32 General Registers
      32-bit (31..0)   16-bit (15..0)   8-bit (15..8 / 7..0)   Role
      %eax             %ax              %ah / %al              general purpose (mostly)
      %ecx             %cx              %ch / %cl              general purpose (mostly)
      %edx             %dx              %dh / %dl              general purpose (mostly)
      %ebx             %bx              %bh / %bl              general purpose (mostly)
      %esi             %si                                     general purpose (mostly)
      %edi             %di                                     general purpose (mostly)
      %esp             %sp                                     special purpose: stack pointer
      %ebp             %bp                                     special purpose: frame pointer

Operand types
  • A typical instruction acts on 1 or more operands
    » addl %ecx, %edx adds the contents of %ecx to %edx
  • Three general types of operands
    » Immediate: like a C constant, but preceded by $ (e.g., $0x1F, $-533); encoded with 1, 2, or 4 bytes based on the instruction
    » Register: the value in one of the 8 integer registers
    » Memory: a memory address; there are many modes for addressing memory

Operand examples using mov
      Source   Destination   Instruction            C analog
      Imm      Reg           movl $0x4,%eax         temp = 0x4;
      Imm      Mem           movl $-147,(%eax)      *p = -147;
      Reg      Reg           movl %eax,%edx         temp2 = temp1;
      Reg      Mem           movl %eax,(%edx)       *p = temp;
      Mem      Reg           movl (%eax),%edx       temp = *p;
  • Memory-to-memory transfers cannot be done with a single instruction

Addressing Modes
  • Immediate and register operands have only one mode
  • Memory, on the other hand, has several
    » Absolute: specify the address of the data
    » Indirect: use a register to calculate the address
    » Base + displacement: use a register plus an absolute address to calculate the address
    » Indexed: add the contents of an index register
    » Scaled index: add the contents of an index register scaled by a constant

Summary of IA32 Operand Forms
      Type        Form               Operand value                    Name
      Immediate   $Imm               Imm                              Immediate
      Register    Ea                 R[Ea]                            Register
      Memory      Imm                M[Imm]                           Absolute
      Memory      (Ea)               M[R[Ea]]                         Indirect
      Memory      Imm(Eb)            M[Imm + R[Eb]]                   Base + displacement
      Memory      (Eb, Ei)           M[R[Eb] + R[Ei]]                 Indexed
      Memory      Imm(Eb, Ei)        M[Imm + R[Eb] + R[Ei]]           Indexed
      Memory      (, Ei, s)          M[R[Ei] * s]                     Scaled indexed
      Memory      Imm(, Ei, s)       M[Imm + R[Ei] * s]               Scaled indexed
      Memory      (Eb, Ei, s)        M[R[Eb] + R[Ei] * s]             Scaled indexed
      Memory      Imm(Eb, Ei, s)     M[Imm + R[Eb] + R[Ei] * s]       Scaled indexed

x86 instructions
  • Rules
    » Source operand can be memory, register, or constant
    » Destination can be memory or register
    » Only one of source and destination can be memory
    » Source and destination must be the same size
  • Flags in EFLAGS are set by most arithmetic and logical instructions (mov does not change them)
    » Conditional branches are handled via EFLAGS

What's the "l" for on the end?
      addl 8(%ebp),%eax
  • It stands for "long", which is 32 bits
  • It tells the size of the operand
  • Baggage from the days of 16-bit processors
  • For x86 and x86_64
    » 8 bits is a byte
    » 16 bits is a word
    » 32 bits is a double word
    » 64 bits is a quad word

IA32 Standard Data Types
      C declaration    Intel data type       GAS suffix   Size in bytes
      char             Byte                  b            1
      short            Word                  w            2
      int              Double word           l            4
      unsigned         Double word           l            4
      long int         Double word           l            4
      unsigned long    Double word           l            4
      char *           Double word           l            4
      float            Single precision      s            4
      double           Double precision      l            8
      long double      Extended precision    t            10/12

Global vs. Local variables
  • Global variables are stored in either the .data or the .bss section of the process
  • Local variables are stored on the stack

Global vs local example
  Global version:
      int x = 1;
      int y = 2;
      void a() { x = x+y; printf("Total = %d\n",x); }
      int main(){a();}
  Local version:
      void a() { int x = 1; int y = 2; x = x+y; printf("Total = %d\n",x); }
      int main() {a();}

Global vs local example: local version disassembled (x and y live at -0x8(%ebp) and -0x4(%ebp) on the stack)
      080483c4 <a>:
       80483c4:  push   %ebp
       80483c5:  mov    %esp,%ebp
       80483c7:  sub    $0x18,%esp
       80483ca:  movl   $0x1,-0x8(%ebp)
       80483d1:  movl   $0x2,-0x4(%ebp)
       80483d8:  mov    -0x4(%ebp),%eax
       80483db:  add    %eax,-0x8(%ebp)
       80483de:  mov    -0x8(%ebp),%eax
       80483e1:  mov    %eax,0x4(%esp)
       80483e5:  movl   $0x80484f0,(%esp)
       80483ec:  call   80482dc <printf@plt>
       80483f1:  leave
       80483f2:  ret

Global vs local example: global version disassembled (x and y live at fixed addresses 0x804966c and 0x8049670)
      080483c4 <a>:
       80483c4:  push   %ebp
       80483c5:  mov    %esp,%ebp
       80483c7:  sub    $0x8,%esp
       80483ca:  mov    0x804966c,%edx
       80483d0:  mov    0x8049670,%eax
       80483d5:  lea    (%edx,%eax,1),%eax
       80483d8:  mov    %eax,0x804966c
       80483dd:  mov    0x804966c,%eax
       80483e2:  mov    %eax,0x4(%esp)
       80483e6:  movl   $0x80484f0,(%esp)
       80483ed:  call   80482dc <printf@plt>
       80483f2:  leave
       80483f3:  ret

Arithmetic operations
      void f(){ int a = 0; int b = 1; a = a+11; a = a-b; a--; b++; }
      int main() { f();}

      08048394 <f>:
       8048394:  push   %ebp
       8048395:  mov    %esp,%ebp
       8048397:  sub    $0x10,%esp
       804839a:  movl   $0x0,-0x8(%ebp)
       80483a1:  movl   $0x1,-0x4(%ebp)
       80483a8:  addl   $0xb,-0x8(%ebp)
       80483ac:  mov    -0x4(%ebp),%eax
       80483af:  sub    %eax,-0x8(%ebp)
       80483b2:  subl   $0x1,-0x8(%ebp)
       80483b6:  addl   $0x1,-0x4(%ebp)
       80483ba:  leave
       80483bb:  ret

Machine Instruction Example
  • C code: add two signed integers
      int sum(int x, int y) { int t = x+y; return t; }
  • Assembly: add 2 4-byte integers
      _sum:
          pushl %ebp
          movl  %esp,%ebp
          movl  12(%ebp),%eax
          addl  8(%ebp),%eax
          movl  %ebp,%esp
          popl  %ebp
          ret
    » "Long" words in GCC parlance
    » Same instruction whether signed or unsigned
    » Operands of the addl: x is in memory at M[%ebp+8], y is in register %eax (loaded from 12(%ebp)), and t ends up in register %eax
    » Return function value in %eax
  • Object code
      0x401046: 03 45 08
    » 3-byte instruction
    » Stored at address 0x401046

Condition codes
  • The IA32 processor has a register called eflags (extended flags)
  • Each bit is a flag, or condition code
    » CF: Carry Flag
    » SF: Sign Flag
    » ZF: Zero Flag
    » OF: Overflow Flag
  • As programmers, we don't write to this register and seldom read it directly
  • Flags are set or cleared by hardware depending on the result of an instruction

Condition Codes (cont.)
  • Setting condition codes via the compare instruction: cmpl b,a
    » Computes a-b without setting the destination
    » CF set if carry out from the most significant bit; used for unsigned comparisons
    » ZF set if a == b
    » SF set if (a-b) < 0
    » OF set if two's complement overflow: (a>0 && b<0 && (a-b)<0) || (a<0 && b>0 && (a-b)>0)
  • Byte and word versions: cmpb, cmpw

Condition Codes (cont.)
  • Setting condition codes via the test instruction: testl b,a
    » Computes a&b without setting the destination
    » Sets condition codes based on the result
    » Useful to have one of the operands be a mask
    » Often used to test for zero or positive: testl %eax, %eax
    » ZF set when a&b == 0
    » SF set when a&b < 0
  • Byte and word versions: testb, testw

if statements
      void f(){
          int x = 1;
          int y = 2;
          if (x==y) { printf("x equals y.\n"); }
          else { printf("x is not equal to y.\n"); }
      }
      int main() { f();}

      080483c4 <f>:
       80483c4:  push   %ebp
       80483c5:  mov    %esp,%ebp
       80483c7:  sub    $0x18,%esp
       80483ca:  movl   $0x1,-0x8(%ebp)
       80483d1:  movl   $0x2,-0x4(%ebp)
       80483d8:  mov    -0x8(%ebp),%eax
       80483db:  cmp    -0x4(%ebp),%eax
       80483de:  jne    80483ee <f+0x2a>
       80483e0:  movl   $0x80484f0,(%esp)
       80483e7:  call   80482d8 <puts@plt>
       80483ec:  jmp    80483fa <f+0x36>
       80483ee:  movl   $0x80484fc,(%esp)
       80483f5:  call   80482d8 <puts@plt>
       80483fa:  leave
       80483fb:  ret

if statements
      int a = 1, b = 3, c;
      if (a > b) c = a; else c = b;

      00000018: C7 45 FC 01 00 00 00   mov dword ptr [ebp-4],1      ; store a = 1
      0000001F: C7 45 F8 03 00 00 00   mov dword ptr [ebp-8],3      ; store b = 3
      00000026: 8B 45 FC               mov eax,dword ptr [ebp-4]    ; move a into EAX register
      00000029: 3B 45 F8               cmp eax,dword ptr [ebp-8]    ; compare a with b (subtraction)
      0000002C: 7E 08                  jle 00000036                 ; if (a<=b) jump to 00000036
      0000002E: 8B 4D FC               mov ecx,dword ptr [ebp-4]    ; a > b: move a (1) into ECX register
      00000031: 89 4D F4               mov dword ptr [ebp-0Ch],ecx  ; move ECX into c (12 bytes down)
      00000034: EB 06                  jmp 0000003C                 ; unconditional jump to 0000003C
      00000036: 8B 55 F8               mov edx,dword ptr [ebp-8]    ; a <= b: move b (3) into EDX register
      00000039: 89 55 F4               mov dword ptr [ebp-0Ch],edx  ; move EDX into c (12 bytes down)

Loops
      int factorial_do(int x) {
          int result = 1;
          do { result *= x; x = x-1; } while (x > 1);
          return result;
      }

      factorial_do:
          pushl %ebp
          movl  %esp, %ebp
          movl  8(%ebp), %edx
          movl  $1, %eax
      .L2:
          imull %edx, %eax
          decl  %edx
          cmpl  $1, %edx
          jg    .L2
          leave
          ret

C switch statements
  • Implementation options
    » Series of conditionals: testl followed by je; good if few cases, slow if many cases
    » Jump table (example below): look up the branch target from a table; possible with a small range of integer constants
  • GCC picks the implementation based on structure
  • Example:
      switch (x) {
          case 1: case 5: code at L0
          case 2: case 3: code at L1
          default:        code at L2
      }
    Jump table at .L3, entries for x = 0..5: .L2, .L0, .L1, .L1, .L2, .L0
      1. init jump table at .L3
      2. get address at .L3+4*x
      3. jump to that address

Example
      int switch_eg(int x) {
          int result = x;
          switch (x) {
              case 100: result *= 13; break;
              case 102: result += 10; /* Fall through */
              case 103: result += 11; break;
              case 104:
              case 106: result *= result; break;
              default:  result = 0;
          }
          return result;
      }

          leal -100(%edx),%eax
          cmpl $6,%eax
          ja   .L9
          jmp  *.L10(,%eax,4)
          .p2align 4,,7
          .section .rodata
          .align 4
          .align 4
      .L10:
          .long .L4
          .long .L9
          .long .L5
          .long .L6
          .long .L8
          .long .L9
          .long .L8
          .text
          .p2align 4,,7
      .L4:
          leal (%edx,%edx,2),%eax
          leal (%edx,%eax,4),%edx
          jmp  .L3
          .p2align 4,,7
      .L5:
          addl $10,%edx
      .L6:
          addl $11,%edx
          jmp  .L3
          .p2align 4,,7
      .L8:
          imull %edx,%edx
          jmp  .L3
          .p2align 4,,7
      .L9:
          xorl %edx,%edx
      .L3:
          movl %edx,%eax
  • Key is the jump table at .L10: an array of pointers to jump locations

x86-64 conditionals
  • Modern CPUs have deep pipelines
    » Instructions are fetched far in advance of execution
    » This masks the latency of going to memory
  • Problem: what if you hit a conditional branch?
    » Must predict which branch to take!
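Returning to the switch_eg jump table above, the dispatch that the compiler emits can be mimicked in C with an array of function pointers. This is a sketch for illustration only: the handler names and `switch_eg_table` are hypothetical, and the table mirrors the seven `.long` entries at .L10, with the unsigned bounds check standing in for `leal -100(%edx),%eax; cmpl $6,%eax; ja .L9`.

```c
#include <assert.h>

typedef int (*case_fn)(int);

/* One handler per jump target in the listing. */
static int case_100(int r)     { return r * 13; }        /* .L4 */
static int case_102(int r)     { return r + 10 + 11; }   /* .L5, falls through into .L6 */
static int case_103(int r)     { return r + 11; }        /* .L6 */
static int case_104_106(int r) { return r * r; }         /* .L8 */
static int case_default(int r) { (void)r; return 0; }    /* .L9 */

/* Entries for x = 100..106, in the same order as the .long entries at .L10. */
static const case_fn jump_table[7] = {
    case_100, case_default, case_102, case_103,
    case_104_106, case_default, case_104_106
};

int switch_eg_table(int x)
{
    /* Subtracting 100 and comparing as unsigned rejects both x < 100
       (which wraps around to a huge value) and x > 106 with a single test. */
    unsigned idx = (unsigned)x - 100u;
    if (idx > 6u)
        return case_default(x);
    assert(idx < sizeof jump_table / sizeof jump_table[0]);
    return jump_table[idx](x);
}
```

When reading a disassembly, the same two clues identify a jump table: a single unsigned range check followed by an indirect jump scaled by the pointer size.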
x86-64 conditionals (cont.)
  • Branch prediction in CPUs is well studied and fairly effective
  • But it is best to avoid conditional branching altogether
  • Alternative: conditional instruction execution

Conditional Move
  • Conditional move instruction: cmovXX src, dest
    » Move value from src to dest if condition XX holds
    » No branching
    » Handled as an operation within the execution unit
    » Added with the P6 microarchitecture (Pentium Pro onward)
  • Example
      movl   8(%ebp),%edx     # Get x
      movl   12(%ebp),%eax    # rval=y
      cmpl   %edx, %eax       # rval:x
      cmovll %edx,%eax        # If <, rval=x
  • Current version of GCC won't use this instruction
    » Thinks it's compiling for a 386
  • Performance
    » 14 cycles on all data
    » More efficient than conditional branching (simple control flow)
    » But overhead: both branches are evaluated

x86-64 conditional example
      int absdiff(int x, int y) {
          int result;
          if (x > y) { result = x-y; } else { result = y-x; }
          return result;
      }

      absdiff:                   # x in %edi, y in %esi
          movl   %edi, %eax      # eax = x
          movl   %esi, %edx      # edx = y
          subl   %esi, %eax      # eax = x-y
          subl   %edi, %edx      # edx = y-x
          cmpl   %esi, %edi      # x:y
          cmovle %edx, %eax      # eax=edx if <=
          ret

IA32 Stack
  • Region of memory managed with stack discipline
  • Grows toward lower addresses
  • Register %esp indicates the lowest stack address: the address of the top element
  [Figure: stack "bottom" at the highest address, stack "top" at the lowest; the stack grows down and %esp points at the top]

IA32 Stack Pushing
  • Pushing: pushl Src
    » Fetch operand at Src
    » Decrement %esp by 4
    » Write operand at the address given by %esp
  • e.g., pushl %eax is equivalent to
      subl $4, %esp
      movl %eax,(%esp)

IA32 Stack Popping
  • Popping: popl Dest
    » Read operand at the address given by %esp
    » Increment %esp by 4
    » Write to Dest
  • e.g., popl %eax is equivalent to
      movl (%esp),%eax
      addl $4,%esp

Stack Operation Examples
                  Initially      After pushl %eax    After popl %edx
      0x108       123 (top)      123                 123 (top)
      0x104                      213 (top)           213 (stale)
      %eax        213            213                 213
      %edx        555            555                 213
      %esp        0x108          0x104               0x108

Procedure Control Flow
  • Procedure call: call label
    » Push the address of the next instruction (after the call) on the stack
    » Jump to label
  • Procedure return: ret
    » Pop the address from the stack into the eip register

Procedure Call Example
      804854e: e8 3d 06 00 00    call 8048b90 <main>
      8048553: 50                next instruction

                  Before call    After call
      0x108       123            123
      0x104                      0x8048553
      %esp        0x108          0x104
      %eip        0x804854e      0x8048b90
  • %eip is the program counter

Procedure Return Example
      8048e90: c3                ret

                  Before ret     After ret
      0x108       123            123
      0x104       0x8048553      0x8048553 (stale)
      %esp        0x104          0x108
      %eip        0x8048e90      0x8048553
  • %eip is the program counter

Procedure Control Flow
  • When procedure foo calls who: foo is the caller, who is the callee
    » Control is transferred to the callee
  • When the procedure returns
    » Control is transferred back to the caller
  • Last-called, first-return (LIFO) order
    » Naturally implemented via the stack
      foo(…) { • • • who(); • • • }                  /* calls who */
      who(…) { • • • amI(); • • • amI(); • • • }     /* calls amI twice, then returns to foo */
      amI(…) { • • • }                               /* returns to who */

Procedure calls and stack frames
  • How does the callee know where to return later?
    » Return address placed in a well-known location on the stack within a "stack frame"
  • How are arguments passed to the callee?
Procedure calls and stack frames (cont.)
    » Arguments are placed in a well-known location on the stack within a "stack frame"
  • Upon procedure invocation
    » A stack frame is created for the procedure and pushed onto the program stack
  • Upon procedure return
    » Its frame is popped off the stack
    » The caller's stack frame is recovered
  • Call chain: foo => who => amI
  [Figure: foo's, who's, and amI's stack frames stacked from the stack bottom (highest addresses) downward in the direction of stack growth]

Keeping track of stack frames
  • The stack pointer (%esp) moves around
    » Can be changed within a procedure
  • Problem: how can we consistently find our parameters?
  • The base pointer (%ebp)
    » Points to the base of our current stack frame
    » Also called the frame pointer
    » Within each function, %ebp stays constant
  • Most information on the stack is referenced relative to the base pointer
    » Base pointer setup is the programmer's job
    » Actually, usually the compiler's job

IA32/Linux Stack Frame
  • Caller stack frame
    » Arguments for this call ("argument build" of the caller)
    » Return address, pushed by the call instruction
  • Current stack frame, listed from the frame pointer (%ebp) down to the stack pointer (%esp)
    » Old frame pointer (old %ebp)
    » Saved register context
    » Local variables, if they can't be kept in registers
    » Parameters for the function about to be called ("argument build" of the callee)

swap
      int zip1 = 15213;
      int zip2 = 91125;
      void call_swap() { swap(&zip1, &zip2); }

      void swap(int *xp, int *yp) {
          int t0 = *xp;
          int t1 = *yp;
          *xp = t1;
          *yp = t0;
      }
  • Calling swap from call_swap
      call_swap:
          • • •
          pushl $zip2    # Global Var
          pushl $zip1    # Global Var
          call  swap
          • • •
  • Resulting stack (higher addresses first): &zip2, &zip1, Rtn adr (%esp points here)

swap
      swap:
          pushl %ebp              # Setup
          movl  %esp,%ebp
          pushl %ebx

          movl  12(%ebp),%ecx     # Body
          movl  8(%ebp),%edx
          movl  (%ecx),%eax
          movl  (%edx),%ebx
          movl  %eax,(%edx)
          movl  %ebx,(%ecx)

          movl  -4(%ebp),%ebx     # Finish
          movl  %ebp,%esp
          popl  %ebp
          ret

swap Setup #1 to #3
  • Entering stack: &zip2 (yp), &zip1 (xp), Rtn adr (%esp); %ebp still points into the caller's frame
  • #1 pushl %ebp: old %ebp is pushed below the return address; %esp points at it
  • #2 movl %esp,%ebp: %ebp now points at the saved old %ebp
  • #3 pushl %ebx: old %ebx is pushed; %esp points at it

Effect of swap Setup
      Offset (relative to %ebp)   Contents
      12                          yp (&zip2)
       8                          xp (&zip1)
       4                          Rtn adr
       0                          Old %ebp    (%ebp points here)
      -4                          Old %ebx    (%esp points here)
  • Body begins with
      movl 12(%ebp),%ecx    # get yp
      movl 8(%ebp),%edx     # get xp
      . . .

swap Finish #1 to #4
  • #1 movl -4(%ebp),%ebx: restores the saved %ebx; the stack itself is unchanged
  • #2 movl %ebp,%esp: %esp moves back up to the saved old %ebp
  • #3 popl %ebp: %ebp is restored to the caller's value; %esp points at the return address
  • #4 ret: pops the return address; the exiting stack holds only &zip2 and &zip1, with %esp at &zip1
  • Observation
    » Saved and restored register %ebx
    » Didn't do so for %eax, %ecx, or %edx

swap
      swap:
          pushl %ebp              # Save old %ebp of caller frame
          movl  %esp,%ebp         # Set new %ebp for callee (current) frame
          pushl %ebx              # Save state of %ebx register from caller
          movl  12(%ebp),%ecx     # Retrieve parameter yp from caller frame
          movl  8(%ebp),%edx      # Retrieve parameter xp from caller frame
          movl  (%ecx),%eax       # Perform swap
          movl  (%edx),%ebx
          movl  %eax,(%edx)
          movl  %ebx,(%ecx)
          movl  -4(%ebp),%ebx     # Restore the state of caller's %ebx register
          movl  %ebp,%esp         # Set stack pointer to bottom of callee frame (%ebp)
          popl  %ebp              # Restore %ebp to original state
          ret                     # Pop return address from stack to %eip
  • The pair "movl %ebp,%esp; popl %ebp" is equivalent to a single leave instruction

Local variables
  • Where are they in relation to %ebp?
    » Stored "above" %ebp (at lower addresses)
  • How are they preserved if the current function calls another function?
    » The compiler updates %esp beyond the local variables before issuing "call"
  • What happens to them when the current function returns?
    » They are lost (i.e., no longer valid)

Register Saving Conventions
  • When procedure foo calls who: foo is the caller, who is the callee
  • Can a register be used for temporary storage?
  • Conventions
    » "Caller save": caller saves temporary in its frame before calling
    » "Callee save": callee saves temporary in its frame before using

IA32 Register Usage
  • Integer registers
    » Two have special uses: %ebp, %esp
    » Three managed as callee-save: %ebx, %esi, %edi; old values saved on the stack prior to using
    » Three managed as caller-save: %eax, %edx, %ecx; do what you please, but expect any callee to do so as well
    » Return value in %eax

simple.c
      gcc -O2 -c simple.c

      int simple(int *xp, int y) {
          int t = *xp + y;
          *xp = t;
          return t;
      }

      _simple:
          pushl %ebp               # Setup stack frame pointer
          movl  %esp, %ebp
          movl  8(%ebp), %edx      # get xp
          movl  12(%ebp), %ecx     # get y
          movl  (%edx), %eax       # move *xp to t
          addl  %ecx, %eax         # add y to t
          movl  %eax, (%edx)       # store t at *xp
          popl  %ebp               # restore frame pointer
          ret                      # return to caller

Function pointers
  • Pointers in C can also point to code locations
  • Function pointers store and pass references to code
  • Some uses
    » Dynamic "late-binding" of functions, e.g., dynamically "set" a random number generator
    » Replace large switch statements for implementing dynamic event handlers, e.g., dynamically setting the behavior of GUI buttons
    » Emulating "virtual functions" and polymorphism from OOP
    » qsort() with a user-supplied callback function for comparison (man qsort)
    » Operating on lists of elements: multiplication, addition, min/max, etc.
  • Malware leverages this to execute its own code

Using pointers to functions
      // function prototypes
      int doEcho(char*);
      int doExit(char*);
      int doHelp(char*);
      int setPrompt(char*);

      // dispatch table section
      typedef int (*func)(char*);
      typedef struct { char* name; func function; } func_t;
      func_t func_table[] = {
          { "echo",   doEcho },
          { "exit",   doExit },
          { "quit",   doExit },
          { "help",   doHelp },
          { "prompt", setPrompt },
      };
      #define cntFuncs (sizeof(func_table) / sizeof(func_table[0]))

      // find the function and dispatch it
      for (i = 0; i < cntFuncs; i++) {
          if (strcmp(command,func_table[i].name)==0){
              done = func_table[i].function(argument);
              break;
          }
      }
      if (i == cntFuncs)
          printf("invalid command\n");

Function pointers example
      #include <sys/time.h>
      #include <stdio.h>
      void fp1(int i) { printf("Even %d\n",i); }
      void fp2(int i) { printf("Odd %d\n",i); }
      main(int argc, char **argv) {
          void (*fp)(int);
          int i = argc;
          if (argc%2) fp=fp2; else fp=fp1;
          fp(i);
      }

      mashimaro % ./funcp a
      Even 2
      mashimaro % ./funcp a b
      Odd 3
      mashimaro %

      main:
          leal  4(%esp), %ecx
          andl  $-16, %esp
          pushl -4(%ecx)
          pushl %ebp
          movl  %esp, %ebp
          pushl %ecx
          subl  $4, %esp
          movl  (%ecx), %eax
          movl  $fp2, %edx
          testb $1, %al
          jne   .L4
          movl  $fp1, %edx
      .L4:
          movl  %eax, (%esp)
          call  *%edx
          addl  $4, %esp
          popl  %ecx
          popl  %ebp
          leal  -4(%ecx), %esp
          ret

Uses in operating system
  • Interrupt descriptor table
    » Pointers to interrupt handler functions
    » IDTR points to the IDT
  • System services descriptor table
    » Pointers to system call functions
  • Import address table
    » Pointers to imported library calls
  • Malware attacks all of these

More disassembly
  • Code patterns in assembly
    » Calling conventions (fast vs. standard vs. cdecl)
    » ebp omission
    » ecx use as the C++ this pointer
    » C++ vtables (virtual function tables)
    » WinXP SP2 prologue with patching support, for detours
    » Exception handlers (FS register): linked list of functions stored in exception frames on the stack

Advanced disassembly
  • Windows examples
    » Largely the same, with small modifications
    » Size of operands (e.g., dword) is specified explicitly (not in an operator suffix)
    » Reverse ordering of operands

Disassembly example
      0000 mov ecx, 5             for(int i=0;i<5;i++)
      0003 push aHello            {
      0009 call printf                printf("Hello");
      000E loop 00000003h         }
      0014 ...

      0000 cmp ecx, 100h          if(x == 256)
      0003 jnz 001Bh              {
      0009 push aYes                  printf("Yes");
      000F call printf            }
      0015 jmp 0027h              else
      001B push aNo               {
      0021 call printf                printf("No");
      0027 ...                    }

Disassembly example
      int main(int argc, char **argv) {
          WSADATA wsa;
          SOCKET s;
          struct sockaddr_in name;
          unsigned char buf[256];

          // Initialize Winsock
          if(WSAStartup(MAKEWORD(1,1),&wsa))
              return 1;

          // Create Socket
          s = socket(AF_INET,SOCK_STREAM,0);
          if(INVALID_SOCKET == s)
              goto Error_Cleanup;

          name.sin_family = AF_INET;
          name.sin_port = htons(PORT_NUMBER);
          name.sin_addr.S_un.S_addr = htonl(INADDR_ANY);

          // Bind Socket To Local Port
          if(SOCKET_ERROR == bind(s,(struct sockaddr*)&name,sizeof(name)))
              goto Error_Cleanup;

          // Set Backlog parameters
          if(SOCKET_ERROR == listen(s,1))
              goto Error_Cleanup;

      push ebp
      mov ebp, esp
      sub esp, 2A8h
      lea eax, [ebp+0FFFFFE70h]
      push eax
      push 101h
      call 4012BEh
      test eax, eax
      jz 401028h
      mov eax, 1
      jmp 40116Fh
      push 0
      push 1
      push 2
      call 4012B8h
      mov dword ptr [ebp+0FFFFFE6Ch], eax
      cmp dword ptr [ebp+0FFFFFE6Ch], byte 0FFh
      jnz 401047h
      jmp 401165h
      mov word ptr [ebp+0FFFFFE5Ch], 2
      push 800h
      call 4012B2h
      mov word ptr [ebp+0FFFFFE5Eh], ax
      push 0
      call 4012ACh
      mov dword ptr [ebp+0FFFFFE60h], eax
      push 10h
      lea ecx, [ebp+0FFFFFE5Ch]
      push ecx
      mov edx, [ebp+0FFFFFE6Ch]
      push edx
      call 4012A6h
      cmp eax, byte 0FFh
      jnz 40108Dh
      jmp 401165h
      push 1
      mov eax, [ebp+0FFFFFE6Ch]
      push eax
      call 4012A0h
      cmp eax, byte 0FFh
      jnz 4010A5h
      jmp 401165h

Tools for disassembling
  • IDA Pro, IDA Pro Free
    – Disassembler
    – Execution graph
    – Cross-referencing
    – Searching
    – Function analysis
    – Function and variable labeling

Tools for disassembling
  • objdump
    » objdump -d <object_file>
    » Analyzes the bit pattern of a series of instructions
    » Produces an approximate rendition of the assembly code
    » Can be run on either an executable or a relocatable (.o) file
  • gdb debugger
    » gdb p: start the debugger on program p
    » disassemble sum: disassemble procedure sum
    » x/13b sum: examine the 13 bytes starting at sum

In-class exercise
  • Lab 5-1 (Steps 1-17)
    – Use IDA Pro to bring up the code of DllMain
    – Bring up Figures 5-1L, the equivalent of 5-2L, and 5-3L
    – Find the remote shell routine in which memcmp is used to compare command strings received over the network
    – Show the code for the function called if the command robotwork is invoked
    – Show IDA Pro graphs of DllMain and sub_10004E79
    – Explain what the assembly code on p. 499 does
    – Find the socket call referred to in Table 5-1L and change its integer constants to symbolic ones
    – Show the assembly on p. 500. Find the routine that calls this assembly, which shows that it is an anti-VM check.

In-class exercise
  • Lab 6-1
    – Show the imported network functions in any tool
    – Show the output of executing the binary
    – Load the binary in IDA Pro to generate Figure 6-1L
  • Lab 6-2
    – Generate Listings 6-1L and 6-2L using a tool of your choice. What calls hint at this code's function?
    – Using either Wireshark or netcat with ApateDNS, execute the malware to generate Listing 6-3L
    – In IDA Pro, show the functions called by main. What does each one do?
    – In IDA Pro, show the order in which the WinINet calls are used and explain what each one does.
    – Generate Listing 6-5L and explain what each cmp does.
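Replacing integer constants with symbolic ones, as asked for in the socket step of Lab 5-1 and visible in the "push 0 / push 1 / push 2 / call" sequence of the Winsock disassembly example, comes down to knowing a few standard values. The lookup below is a minimal sketch: the function names are hypothetical, only the constants named in the comments are covered, and it is no substitute for the symbolic-constant feature of a disassembler.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Address family: AF_INET is 2 on both Winsock and Linux. */
const char *toy_family_name(int v)
{
    return v == 2 ? "AF_INET" : "unknown";
}

/* Socket type: SOCK_STREAM is 1, SOCK_DGRAM is 2. */
const char *toy_socktype_name(int v)
{
    switch (v) {
    case 1:  return "SOCK_STREAM";
    case 2:  return "SOCK_DGRAM";
    default: return "unknown";
    }
}

/* Protocol: 0 lets the provider choose, IPPROTO_TCP is 6, IPPROTO_UDP is 17. */
const char *toy_protocol_name(int v)
{
    switch (v) {
    case 0:  return "0";
    case 6:  return "IPPROTO_TCP";
    case 17: return "IPPROTO_UDP";
    default: return "unknown";
    }
}

/* Render the three pushed arguments as a readable socket() call.
   Arguments are pushed right to left, so "push 0; push 1; push 2"
   corresponds to family = 2, type = 1, protocol = 0. */
void toy_describe_socket_call(int family, int type, int protocol,
                              char *out, size_t n)
{
    assert(out != NULL && n > 0);
    snprintf(out, n, "socket(%s, %s, %s)", toy_family_name(family),
             toy_socktype_name(type), toy_protocol_name(protocol));
}
```

Applied to the example above, the three pushes decode to socket(AF_INET, SOCK_STREAM, 0), which matches the C source shown next to the listing.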
Windows Chapter 7: Analyzing Malicious Windows Programs Types Hungarian notation word (w) = 16 bit value double word (dw) = dword = 32 bit value • Handles • Long dwSize = A type that is a 32-bit value (H) HWND = A handle to a window Pointer (LP) Callback File system functions Malware often hits file system CreateFile, ReadFile, WriteFile Memory mapping calls: CreateFileMapping, MapViewOfFile Trickiness • • • Alternate Data Streams (special file data) \Device\PhysicalMemory (accesses memory) \\.\ (accesses device) Registry functions Malware often hits registry Registry stores OS and program configuration information HKEY_LOCAL_MACHINE (HKLM) – Settings global to the machine HKEY_CURRENT_USER (HKCU) – Settings for current user Regedit tool for examining values Functions: RegOpenKeyEx, RegSetValueEx, RegGetValue (Listing 7-1) Networking APIs Berkeley sockets API socket, bind, listen, accept, connect, recv, send Listing 7-3 WinINet API InternetOpen, InternetOpenURL, InternetReadFile DLLs Dynamic link libraries Store code that is re-used amongst applications including malware Can be used to store malicious code for injection into a process Malware uses standard Windows DLLs to interact with OS Malware uses third-party DLLs (e.g. 
Firefox DLL) to avoid re-implementing functions

Processes
• Execute code outside of current process
  – CreateProcess
  – Listing 7-4
• Hijack execution of current process
  – Injecting code via debugger or DLLs
• Companion execution
  – Store executable in resource section of PE
  – Program extracts executable and writes it to disk upon execution

Threads
• Windows threads share same memory space but have separate registers and stack
• Used by malware to insert a malicious DLL into a process's address space
  – CreateThread with address of LoadLibrary as start address

Services
• Processes run in the background
• Scheduled and run by Windows service manager without user input
• OpenSCManager, CreateService, StartService
• Allows malware to maintain persistence on a machine
• Types
  – WIN32_SHARE_PROCESS = service shares a single process with other services (e.g. svchost.exe)
  – WIN32_OWN_PROCESS = independent process
  – KERNEL_DRIVER = loads code into kernel

COM
• Microsoft Component Object Model
• Interface standard that allows software components to call each other
  – OleInitialize, CoInitializeEx
  – CLSID = class identifier, IID = interface identifier
• “Navigate” function in IWebBrowser2 interface
  – Used by malware to launch browser
  – Listing 7-11
• Malware implemented as COM server
  – Browser helper objects
  – Detect COM servers via the functions they export: DllCanUnloadNow, DllGetClassObject, DllInstall, DllRegisterServer, DllUnregisterServer

Exceptions
• Allow program to handle exceptional conditions during program execution
• Windows Structured Exception Handling
  – Exception handling information stored on stack
  – Listing 7-13
  – Not all handlers respond to all exceptions
  – Thrown to caller's frame if not handled
• Used by malware to hijack execution
  – Handler address replaced by address of injected malicious code
  – Adversary then triggers exception

Kernel-mode malware
• Windows API calls (Kernel32.dll)
  – Typically call into underlying Native API (Ntdll.dll)
  – Code in Ntdll then transfers to kernel (Ntoskrnl.exe) via INT 0x2E, SYSENTER,
SYSCALL
  – Figure 7-3
• Malware often calls Ntdll directly to avoid detection via interposition of security programs between Kernel32.dll and Ntdll.dll
  – Example: Windows API (ReadFile, WriteFile) versus Native API (NtReadFile, NtWriteFile)
  – Figure 7-4

Kernel-mode malware
• Other Native API calls
  – NtQuerySystemInformation, NtQueryInformationProcess, NtQueryInformationThread, NtQueryInformationFile, NtQueryInformationKey
  – Can also carry “Zw” prefix
• NtContinue
  – Used to return from an exception
  – Location to return to is specified in the exception context, but can be modified to transfer execution in nefarious ways

Kernel-mode malware
• Legitimate programs typically do not use the Native API exclusively
• Programs that are native applications (as specified in the subsystem field of the PE header) are likely malicious

In-class exercise
Lab 7-2
• Using strings, identify the network resource being used by the malware
• What imports give away the mechanism this malware uses to launch the browser?
• Go to the code snippet shown on p. 518. Follow the references to show the values of rclsid and riid in memory.
• Debug the program and break at the call shown on p. 519. Run the call to show the browser being launched with the embedded URL

Extra

Run-time data structures

More code snippets
Registry modifications for disabling task manager and changing browser default page (key, value)
• HKEY_CURRENT_USER\Software\Policies\Microsoft\Internet Explorer\Control Panel, Homepage
• HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Policies\System, DisableRegistryTools and DisableTaskMgr
• HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Main, Start Page
• HKEY_CURRENT_USER\Software\Yahoo\pager\View\YMSGR_buzz, content url
• HKEY_CURRENT_USER\Software\Yahoo\pager\View\YMSGR_Launchcast

More code snippets
• Kills anti-virus, ZoneAlarm, firewall processes

More code snippets
• New variants
  – Download worm update files and register them as services
  – regsvr32 MSINET.OCX (Internet Transfer ActiveX Control)
  – Check for updates
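
The Native API observations from the kernel-mode malware slides (the Nt and Zw prefixes, and the point that legitimate programs rarely rely on Ntdll.dll alone) can be turned into a quick triage check. The sketch below is only a heuristic under stated assumptions: the import table is hypothetical sample data, the function names are invented for illustration, and extracting real imports from a PE file is left to whatever tool is used in the labs.

```python
# Prefixes the slides associate with Native API functions.
NATIVE_PREFIXES = ("Nt", "Zw")


def is_native_api_name(name):
    """Heuristic: Nt/Zw prefix followed by an uppercase letter (e.g. NtReadFile)."""
    return len(name) > 2 and name.startswith(NATIVE_PREFIXES) and name[2].isupper()


def native_calls(imports_by_dll):
    """Return the sorted imported names that look like Native API calls."""
    return sorted(
        name
        for funcs in imports_by_dll.values()
        for name in funcs
        if is_native_api_name(name)
    )


def uses_only_ntdll(imports_by_dll):
    """True if every DLL that contributes imports is ntdll.dll."""
    dlls = {dll.lower() for dll, funcs in imports_by_dll.items() if funcs}
    return dlls == {"ntdll.dll"}


if __name__ == "__main__":
    # HYPOTHETICAL import table: DLL name -> imported function names.
    sample = {
        "ntdll.dll": ["NtReadFile", "NtWriteFile", "ZwQuerySystemInformation"],
    }
    print("Native API imports:", native_calls(sample))
    print("Imports only from ntdll.dll:", uses_only_ntdll(sample))
```

A positive result is a reason to look closer, not a verdict: the prefix test can be fooled by renamed or dynamically resolved imports, so it complements rather than replaces checking the subsystem field in the PE header.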