Hello ASM World: A Painless and Contextual Introduction to x86 Assembly
rogueclown
DerbyCon 3.0
September 28, 2013

who?
• security consultant by vocation
• mess around with computers, code, CTFs by avocation
• frustrated when things feel like a black box

what is assembly language?
• not exactly machine language…but close
  – instructions: mnemonics for machine operations
  – normally a one-to-one correlation between ASM instruction and machine instruction
• varies by processor
  – today, we will be discussing 32-bit x86

why learn assembly language?
• some infosec disciplines require it
• curious about lower-level details of memory or interfacing with an operating system
• it’s fun and challenging!

how does assembly language work?

hello memory
• what parts of computer memory does assembly language commonly access?
• how does assembly language access those parts of computer memory?

where is this memory?
• what one “normally” thinks of as memory
  – RAM
  – virtual memory
• CPU
  – registers

computer memory layout
• heap
  – memory allocated dynamically at runtime (global variables live in the .data and .bss sections instead)
  – envision a bookshelf…that won’t let you push books together when you take one out
• stack
  – local, contextual variables
  – envision a card game discard pile
  – you will use this when coding ASM. a lot.

registers
• memory located on the CPU
• registers are awesome because they are fast.
• registers are a pain because they are tiny.
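
the discard-pile picture of the stack from the memory layout slide can be modeled in a few lines of Python. this is only a toy sketch, not how the CPU is built: the class name ToyStack and the starting esp value are made up for illustration. the one real detail it shows is that on 32-bit x86 a push moves the stack pointer down by 4 bytes and a pop moves it back up.

```python
# toy model of a 32-bit x86 stack. ToyStack and the starting esp value
# are hypothetical; only the push/pop bookkeeping mirrors the real thing.

class ToyStack:
    def __init__(self, esp=0xBFFFF000):
        self.esp = esp      # stack pointer: address of the top of the stack
        self.memory = {}    # address -> 32-bit value

    def push(self, value):
        self.esp -= 4                               # stack grows toward lower addresses
        self.memory[self.esp] = value & 0xFFFFFFFF  # keep only 32 bits

    def pop(self):
        value = self.memory[self.esp]               # read the value on top
        self.esp += 4                               # give the slot back
        return value


stack = ToyStack()
top_before = stack.esp
stack.push(10)
stack.push(11)
first = stack.pop()     # the last card put down is the first one picked up
second = stack.pop()
print(first, second, stack.esp == top_before)   # prints: 11 10 True
```

the values come back in the reverse of the order they went in, and esp ends up where it started once every push has a matching pop.
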
registers
• general purpose registers
  – alphabet soup
    • eax, ebx, ecx, edx
    • can address in parts: ax, ah, al
  – stack and base pointers
    • esp
    • ebp
  – index registers
    • esi, edi

registers
• instruction pointer
  – eip
  – records the next instruction for the program to follow
• other registers
  – eflags
  – segment registers

instructions
• mov
  – moves a value to a register
  – can either specify a value, or specify a register where a value resides
• syntax in assembly
  – Intel syntax: mov ebx, 0xfee1dead
  – AT&T syntax: mov $0xfee1dead, %ebx

instructions
• interrupt
  – int 0x80
  – int 0x3
• system calls
  – how a program interacts with the kernel of the OS

instructions
• mathematical instructions
  – add, sub, mul, div
      mov eax, 10
      cdq            ; edx is now 0
      mov ebx, 3     ; div takes a register or memory operand, not an immediate
      div ebx        ; eax is now 3, edx is now 1
  – dec, inc
    • useful for looping
      mov ecx, 3
      dec ecx        ; ecx is now 2

jumps
• jge, jg, jle, jl
  – work with a compare (cmp) instruction
• jz, jnz, js, jns
  – check zero flag or sign flag for jump

instructions
• stack operations: push and pop
      mov eax, 10
      push eax       ; 10 on top of stack
      inc eax        ; eax is now 11
      push eax       ; 11 on top of stack
      pop ebx        ; ebx is now 11
      pop ecx        ; ecx is now 10

instructions
• function access instructions
  – call
    • places the address of the next instruction on top of the stack
    • moves execution to identified function
  – ret
    • returns to the memory address on top of the stack
    • designed to work in tandem with the “call” instruction…but we’re hackers, yes?

sections of ASM code
• .data
  – constant variables initialized at compile time
• .bss
  – declaration of variables that may be set or changed during runtime
• .text
  – executable instructions

$%&#@%^ instructions: how do they work?

putting it together
• time to take a bit of C code, and reimplement it in assembly language!

where does shellcode come in?

what is shellcode?
• instructions injected into a running process
• lacks some of the luxuries of writing a stand-alone program
  – no laying out nice memory segments in a .bss or .data section
  – basically, just one big .text section

a first stab at shellcode…
• this is going to look mostly familiar, except for how data is handled.

why did it fail?
• bad characters
  – shellcode is often passed to an application as a string.
  – if a character makes a string act funny, you may not want it in your shellcode
    • 0x00, 0x0a, 0x0d, etc.
  – use an encoder, or do it yourself

try that shellcode again…

where can i learn more about assembly language?

suggested resources
• dead trees
  – “Hacking: The Art of Exploitation” by Jon Erickson
  – “Practical Malware Analysis” by Michael Sikorski and Andrew Honig
  – “Gray Hat Python” by Justin Seitz

suggested resources
• the series of tubes
  – http://ref.x86asm.net – quick and dirty opcode reference
  – http://www.nasm.us/doc – Netwide Assembler documentation
• system calls
  – Linux:
    • /usr/include/asm/unistd.h
    • man 2 $syscall
  – Windows:
    • http://msdn.microsoft.com/library/windows/desktop/hh920508%28vs.85%29 – Windows API reference

how to find me
• Twitter: @rogueclown
• email: [email protected]
• IRC: #derbycon, #misec, or #burbsec on Freenode
• or, just wave me down at the con
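
one last footnote on the bad characters slide: before reaching for an encoder, it helps to know where the bad bytes are. below is a minimal sketch of a checker. the helper name find_bad_chars and the default list are assumptions for illustration; the default only covers the three bytes named on the slide, and the real list depends on how the target application handles its input.

```python
# default bad characters from the slide: NUL, line feed, carriage return.
# the real list depends on the target application.
BAD_CHARS = b"\x00\x0a\x0d"


def find_bad_chars(shellcode, bad_chars=BAD_CHARS):
    """Return (offset, byte) pairs for every bad character in shellcode."""
    return [(i, b) for i, b in enumerate(shellcode) if b in bad_chars]


# mov eax, 1 assembles to b8 01 00 00 00: three NUL bytes
with_nulls = bytes.fromhex("b801000000")

# xor eax, eax / inc eax leaves the same value in eax
# and assembles to 31 c0 40: no NUL bytes
without_nulls = bytes.fromhex("31c040")

print(find_bad_chars(with_nulls))      # prints: [(2, 0), (3, 0), (4, 0)]
print(find_bad_chars(without_nulls))   # prints: []
```

the second byte string is the “do it yourself” route: swap an instruction that carries zero bytes in its immediate for a short sequence that builds the same value without them.
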