Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Introduction to 8086 Assembly Language Computer Organization and Assembly Language Petra University Dr. Hadi Hassan Basic Elements of Assembly Language Constants Integer Constants Examples: –10, 42d, 10001101b, 0FF3Ah, 777o Radix: b = binary, d = decimal, h = hexadecimal, and o = octal If no radix is given, the integer constant is decimal A hexadecimal beginning with a letter must have a leading 0A3h Character and String Constants Enclose character or string in single or double quotes Examples: 'A', "d", 'ABC', "ABC", '4096' Embedded quotes: "single quote ' inside", 'double quote " inside' Each ASCII character occupies a single byte Reserved Words and Identifiers • Reserved words cannot be used as identifiers › Instruction mnemonics, directives, type attributes, operators, and predefined symbols • Identifiers › 1-247 characters, including digits › not case sensitive › first character must be a letter, _, @, ?, or $ Assembly Language Statements Statements: Programs consist of statements, one per line. Each statement is either an instruction , which assembler translates to machine code , or an assembler directive , which instructs the assembler to perform some specific tasks. Such as allocating memory space for variable or creating a procedure. both instructions and directive have up to four fields. Three types of statements in assembly language Typically, one statement should appear on a line 1. Executable Instructions Generate machine code for the processor to execute at runtime Instructions tell the processor what to do 2. Assembler Directives Provide information to the assembler while translating a program Used to define data, select memory model, etc. Non-executable: directives are not part of instruction set 3. Macros Shorthand notation for a group of statements Sequence of instructions, directives, or other macros Instructions Assembly language instructions have the format: [label:] mnemonic [operands] [;comment] Instruction Label (optional) marks the address (offset) of code and data Act as place markers Used to transfer program execution to a labeled instruction • Follow identifier rules • Data label • must be unique • example: myArray (not followed by colon) • Code label • target of jump and loop instructions • example: L1: (followed by colon) Mnemonics and Operands Mnemonic Identifies the operation (e.g. MOV, ADD, SUB, JMP, CALL) Operands Specify the data required by the operation Executable instructions can have zero to three operands Operands can be registers, memory variables, or constants Instruction Examples No operands pushf ; push the content of flag register One operand inc eax ; increment register eax call Clrscr ; call procedure Clrscr jmp ; jump to instruction with label L1 L1 Two operands add bx, cx ; register bx = bx + cx sub var1, 25 ; memory variable var1 = var1 - 25 Three operands imul eax,ebx,5 ; register eax = ebx * 5 Comments Comments are very important! Explain the program's purpose When it was written, revised, and by whom Explain data used in the program Explain instruction sequences and algorithms used Application-specific explanations Single-line comments Begin with a semicolon ; and terminate at end of line Multi-line comments Begin with COMMENT directive and a chosen character End with the same chosen character Defining Data Intrinsic Data Types in MASM BYTE, SBYTE REAL4 8-bit unsigned integer IEEE single-precision float 8-bit signed integer Occupies 4 bytes WORD, SWORD REAL8 16-bit unsigned integer IEEE double-precision 16-bit signed integer Occupies 8 bytes DWORD, SDWORD REAL10 32-bit unsigned integer IEEE extended-precision 32-bit signed integer Occupies 10 bytes QWORD, TBYTE 64-bit integer 80-bit integer Intrinsic Data Types in TASM db Define Byte 8-bit unsigned integer 8-bit signed integer dw Define Words (2 Bytes) 16-bit unsigned integer 16-bit signed integer dd Define Double words ( 4 Bytes) 32-bit unsigned integer 32-bit signed integer DQ Define Quadwords 64-bit integer 80-bit integer Data Definition Statement Sets aside storage in memory for a variable May optionally assign a name (label) to the data Syntax: [name] directive initializer [, initializer] . . . val1 BYTE 10 All initializers become binary data in memory Defining BYTE and SBYTE Data Each of the following defines a single byte of storage in case of using MASM : value1 BYTE 'A' ; character constant value2 BYTE 0 ; smallest unsigned byte value3 BYTE 255 ; largest unsigned byte value4 SBYTE -128 ; smallest signed byte value5 SBYTE +127 ; largest signed byte value6 BYTE ? ; uninitialized byte • MASM does not prevent you from initializing a BYTE with a negative value, but it's considered poor style. • If you declare a SBYTE variable, the Microsoft debugger will automatically display its value in decimal with a leading sign. Defining BYTE and SBYTE Data using TASM value1 dB 'A' ; character constant value2 dB 0 ; smallest unsigned byte value3 db 255 ; largest unsigned byte value4 db -128 ; smallest signed byte value5 db +127 ; largest signed byte value6 db ? ; uninitialized byte Defining Byte Arrays Examples that use multiple initializers list1 BYTE 10,20,30,40 list2 BYTE 10,20,30,40 BYTE 50,60,70,80 BYTE 81,82,83,84 list3 BYTE ?,32,41h,00100010b list4 BYTE 0Ah,20h,'A',22h Defining Strings A string is implemented as an array of characters For convenience, it is usually enclosed in quotation marks It is often terminated with a NULL char (byte value = 0) Examples: str1 BYTE "Enter your name", 0 str2 BYTE 'Error: halting program', 0 str3 BYTE 'A','E','I','O','U' greeting BYTE "Welcome to the Encryption " BYTE "Demo Program", 0 In Tasm can be written in this way str1 db "Enter your name $" str2 db 'Error: halting program $‘ str3 db 'A','E','I','O','U' greeting db "Welcome to the Encryption " db "Demo Program $" Defining Strings – cont'd To continue a single string across multiple lines, end each line with a comma menu BYTE "Checking Account",0dh,0ah,0dh,0ah, "1. Create a new account",0dh,0ah, "2. Open an existing account",0dh,0ah, "3. Credit the account",0dh,0ah, "4. Debit the account",0dh,0ah, "5. Exit",0ah,0ah, "Choice> ",0 End-of-line character sequence: 0Dh = 13 = carriage return 0Ah = 10 = line feed Idea: Define all strings used by your program in the same area of the data segment Using the DUP Operator Use DUP to allocate space for an array or string Advantage: more compact than using a list of initializers Syntax counter DUP ( argument ) Counter and argument must be constants expressions The DUP operator may also be nested var1 BYTE 20 DUP(0) ; 20 bytes, all equal to zero var2 BYTE 20 DUP(?) ; 20 bytes, all uninitialized var3 BYTE 4 DUP("STACK") ; 20 bytes: "STACKSTACKSTACKSTACK" var4 BYTE 10,3 DUP(0),20 ; 5 bytes: 10, 0, 0, 0, 20 var5 BYTE 2 DUP(5 DUP('*'), 5 DUP('!')) ; '*****!!!!!*****!!!!!' In Tasm can be written in this way var1 db 20 DUP(0) ; 20 bytes, all equal to zero var2 db 20 DUP(?) ; 20 bytes, all uninitialized var3 db 4 DUP("STACK") var4 db 10,3 DUP(0),20 ; 20 bytes: "STACKSTACKSTACKSTACK" ; 5 bytes: 10, 0, 0, 0, 20 var5 db 2 DUP(5 DUP('*'), 5 DUP('!')) ; '*****!!!!!*****!!!!!' Byte Ordering and Endianness Processors can order bytes within a word in two ways Little Endian Byte Ordering Memory address = Address of least significant byte Examples: Intel 80x86 MSB Byte 3 Byte 2 Byte 1 LSB Byte 0 address a . . . Byte 0 32-bit Register Example: val1 DWORD 12345678h a+1 a+2 a+3 Byte 1 Byte 2 Byte 3 Memory ... Big Endian Byte Ordering Memory address = Address of most significant byte Examples: MIPS, Motorola 68k, SPARC MSB Byte 3 Byte 2 Byte 1 32-bit Register LSB Byte 0 address a . . . Byte 3 a+1 a+2 a+3 Byte 2 Byte 1 Byte 0 Memory ... Defining Symbolic Constants Symbolic Constant Just a name used in the assembly language program Processed by the assembler pure text substitution Assembler does NOT allocate memory for symbolic constants Assembler provides three directives: = directive EQU directive TEXTEQU directive Defining constants has two advantages: Improves program readability Helps in software maintenance: changes are done in one place Equal-Sign Directive Name = Expression Name is called a symbolic constant Expression is an integer constant expression Good programming style to use symbols COUNT = 500 . . . mov eax, COUNT . . . COUNT = 600 . . . mov ebx, COUNT ; NOT a variable (NO memory allocation) ; mov eax, 500 ; Processed by the assembler ; mov ebx, 600 Name can be redefined in the program EQU Directive Three Formats: Name EQU Expression Integer constant expression Name EQU Symbol Existing symbol name Name EQU <text> Any text may appear within < …> SIZE EQU 10*10 ; Integer constant expression PI EQU <3.1416> ; Real symbolic constant PressKey EQU <"Press any key to continue...",0> .DATA prompt BYTE PressKey No Redefinition: Name cannot be redefined with EQU TEXTEQU Directive TEXTEQU creates a text macro. Three Formats: Name TEXTEQU <text> assign any text to name Name TEXTEQU textmacro assign existing text macro Name TEXTEQU %constExpr constant integer expression Name can be redefined at any time (unlike EQU) ROWSIZE COUNT MOVAL ContMsg .DATA prompt .CODE MOVAL = 5 TEXTEQU TEXTEQU TEXTEQU %(ROWSIZE * 2) ; evaluates to 10 <mov al,COUNT> <"Do you wish to continue (Y/N)?"> BYTE ContMsg ; generates: mov al,10 Operand Types Three Basic Types of Operands Immediate Constant integer (8, 16, or 32 bits) Constant value is stored within the instruction Register Name of a register is specified Register number is encoded within the instruction Memory Reference to a location in memory Memory address is encoded within the instruction, or Register holds the address of a memory location Instruction Operand Notation Operand Description r8 8-bit general-purpose register: AH, AL, BH, BL, CH, CL, DH, DL r16 16-bit general-purpose register: AX, BX, CX, DX, SI, DI, SP, BP r32 32-bit general-purpose register: EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP reg Any general-purpose register sreg 16-bit segment register: CS, DS, SS, ES, FS, GS imm 8-, 16-, or 32-bit immediate value imm8 8-bit immediate byte value imm16 16-bit immediate word value imm32 32-bit immediate doubleword value r/m8 8-bit operand which can be an 8-bit general-purpose register or memory byte r/m16 16-bit operand which can be a 16-bit general-purpose register or memory word r/m32 32-bit operand which can be a 32-bit general register or memory doubleword mem 8-, 16-, or 32-bit memory operand Data Transfer Instructions Move source operand to destination mov destination, source Source and destination operands can vary mov reg, reg mov mem, reg Rules mov reg, mem • Both operands must be of same size mov mem, imm • No memory to memory moves • Destination cannot be CS, EIP, or IP mov reg, imm • No immediate to segment moves mov r/m16, sreg mov sreg, r/m16 Programs running in protected mode should not modify the segment registers Source Operands Destination Operands General register Segment registers Memory location Constant General register YES YES YES NO Segment registers YES NO YES NO Memory location YES YES NO NO Constant YES NO YES NO MOV Examples .DATA count db 100 bVal db 20 wVal dw 2 dVal dw 5 .CODE mov bl, count mov ax, wVal mov count,al mov ax, dval ; ; ; ; bl = count = 100 ax = wVal = 2 count = al = 2 ax = dval = 5 ; Assembler will not accept the following moves – why? mov mov mov mov mov ds, 45 si, bVal ip, dVal 25, bVal bVal,count ; immediate move to DS not permitted ; size mismatch ; IP cannot be the destination ; immediate value cannot be destination ; memory-to-memory move not permitted XCHG Instruction XCHG exchanges the values of two operands xchg reg, reg xchg reg, mem xchg mem, reg Source Operands Destination Operands General register Memory location General register YES YES Memory location YES NO Rules .DATA • Operands must be of the same size var1 dw 10000000h • At least one operand must be a var2 DWORD 20000000h register .CODE • No immediate operands are permitted xchg ah, al ; exchange 8-bit regs xchg ax, bx ; exchange 16-bit regs xchg eax, ebx ; exchange 32-bit regs xchg var1,ebx ; exchange mem, reg xchg var1,var2 ; error: two memory operands Example: XCHG AH, BL Before After 1A 00 00 AH AL AH 00 00 00 1A BH BL BH BL 00 AL Direct Memory Operands Variable names are references to locations in memory Direct Memory Operand: Named reference to a memory location Assembler computes address (offset) of named variable .DATA var1 db 10h .CODE Direct Memory Operand mov al, var1 ; AL = var1 = 10h mov al,[var1] ; AL = var1 = 10h Alternate Format Direct-Offset Operands Direct-Offset Operand: Constant offset is added to a named memory location to produce an effective address Assembler computes the effective address Lets you access memory locations that have no name .DATA arrayB BYTE 10h,20h,30h,40h .CODE mov al, arrayB+1 mov al,[arrayB+1] mov al, arrayB[1] ; AL = 20h ; alternative notation ; yet another notation Q: Why doesn't arrayB+1 produce 11h? Direct-Offset Operands - Examples .DATA arrayW WORD 1020h, 3040h, 5060h arrayD DWORD 1, 2, 3, 4 .CODE mov ax, arrayW+2 ; Ax=3040 mov ax, arrayW[4] ; Ax=5060 mov eax,[arrayD+4] ;EAX = 00000002h ; Will the following statements assemble? mov eax,[arrayD-3];?????????? mov ax, [arrayW+9] ;?? mov ax, [arrayD+3] ;?? mov ax, [arrayW-2] ;?? mov eax,[arrayD+16] ;?? 1020 3040 5060 1 2 3 4 20 10 40 30 60 50 01 00 00 00 02 00 00 00 03 00 00 00 04 00 00 00 +1 arrayW +2 +3 +4 +5 arrayD +1 +2 +3 +4 +5 +6 +7 +8 +9 +10 +11 +12 +13 +14 +15 Your turn. . . Write a code that rearranges the values of three word values in the following array as: 3, 1, 2. .data arrayW dw 1,2,3 • Step1: copy the first value into AX and exchange it with the value in the second position. mov ax,arrayW xchg ax,[arrayW+2] • Step 2: Exchange AX with the third array value and copy the value in AX to the first array position. xchg ax,[arrayW+4] mov arrayW,ax Addition and Subtraction INC and DEC Instructions ADD and SUB Instructions NEG Instruction Implementing Arithmetic Expressions Flags Affected by Arithmetic • • • • Zero Sign Carry Overflow • Add 1, subtract 1 from destination operand • operand may be register or memory • INC destination • Logic: destination destination + 1 • DEC destination • Logic: destination destination – 1 INC and DEC Examples .data myWord dw 1000h myDword dd 10000000h .code inc myWord dec myWord inc myDword mov inc mov inc ax,00FFh ax ax,00FFh al ; 1001h ; 1000h ; 10000001h ; AX = 0100h ; AX = 0000h Your turn... Show the value of the destination operand after each of the following instructions executes: .data myByte .code mov mov dec inc dec BYTE 0FFh, 0 al,myByte ah,[myByte+1] ah al ax ; ; ; ; ; AL AH AH AL AX = = = = = FFh 00h FFh 00h FEFF ADD and SUB Instructions • ADD destination, source • Logic: destination destination + source • SUB destination, source • Logic: destination destination – source • Same operand rules as for the MOV instruction ADD and SUB Examples .data var1 DWORD 10000h var2 DWORD 20000h .code mov eax,var1 add eax,var2 add ax,0FFFFh add eax,1 sub ax,1 ; ; ; ; ; ; ---EAX--00010000h 00030000h 0003FFFFh 00040000h 0004FFFFh NEG (negate) Instruction Reverses the sign of an operand. Operand can be a register or memory operand. NEG does this by replacing the content by its two’s complement . The syntax is NEG destination Example: NEG BX Before 0002 BX After FFFE BX NEG Examples .data valB BYTE -1 valW WORD +32767 .code mov al,valB neg al neg valW ; AL = -1 ; AL = +1 ; valW = -32767 Implementing Arithmetic Expressions HLL compilers translate mathematical expressions into assembly language. You can do it also. For example: Rval = -Xval + (Yval – Zval) Rval DWORD ? Xval DWORD 26 Yval DWORD 30 Zval DWORD 40 .code mov eax,Xval neg eax mov ebx,Yval sub ebx,Zval add eax,ebx mov Rval,eax ; EAX = -26 ; EBX = -10 ; -36 Flags Affected by Arithmetic The ALU has a number of status flags that reflect the outcome of arithmetic (and bitwise) operations based on the contents of the destination operand Essential flags: Zero flag – set when destination equals zero Sign flag – set when destination is negative Carry flag – set when unsigned value is out of range Overflow flag – set when signed value is out of range The MOV instruction never affects the flags. Concept Map CPU part of executes executes ALU conditional jumps arithmetic & bitwise operations attached to affect used by provide status flags branching logic You can use diagrams such as these to express the relationships between assembly language concepts. Zero Flag (ZF) The Zero flag is set when the result of an operation produces zero in the destination operand. mov sub mov inc inc cx,1 cx,1 ax,0FFFFh ax ax ; CX = 0, ZF = 1 ; AX = 0, ZF = 1 ; AX = 1, ZF = 0 Remember... • A flag is set when it equals 1. • A flag is clear when it equals 0. Sign Flag (SF) The Sign flag is set when the destination operand is negative. The flag is clear when the destination is positive. mov cx,0 sub cx,1 add cx,2 ; CX = -1, SF = 1 ; CX = 1, SF = 0 The sign flag is a copy of the destination's highest bit: mov al,0 sub al,1 add al,2 ; AL = 11111111b, SF = 1 ; AL = 00000001b, SF = 0 Signed and Unsigned Integers A Hardware Viewpoint All CPU instructions operate exactly the same on signed and unsigned integers The CPU cannot distinguish between signed and unsigned integers YOU, the programmer, are solely responsible for using the correct data type with each instruction Overflow and Carry Flags A Hardware Viewpoint How the ADD instruction affects OF and CF: CF = (carry out of the MSB) OF = (carry out of the MSB) XOR (carry into the MSB) How the SUB instruction affects OF and CF: negate the source and add it to the destination OF = (carry out of the MSB) XOR (carry into the MSB) CF = INVERT (carry out of the MSB) MSB = Most Significant Bit (high-order bit) XOR = eXclusive-OR operation NEG = Negate (same as SUB 0,operand ) Carry Flag (CF) The Carry flag is set when the result of an operation generates an unsigned value that is out of range (too big or too small for the destination operand). mov al,0FFh add al,1 ; CF = 1, AL = 00 ; Try to go below zero: mov al,0 sub al,1 ; CF = 1, AL = FF Your turn . . . For each of the following marked entries, show the values of the destination operand and the Sign, Zero, and Carry flags: mov add sub add mov add ax,00FFh ax,1 ax,1 al,1 bh,6Ch bh,95h mov al,2 sub al,3 ; AX= 0100h ; AX= 00FFh ; AL= 00h SF= 0 ZF= 0 CF= 0 SF= 0 ZF= 0 CF= 0 SF= 0 ZF= 1 CF= 1 ; BH= 01h SF= 0 ZF= 0 CF= 1 ; AL= FFh SF= 1 ZF= 0 CF= 1 Overflow Flag (OF) The Overflow flag is set when the signed result of an operation is invalid or out of range. ; Example 1 mov al,+127 add al,1 ; Example 2 mov al,7Fh add al,1 ; OF = 1, AL = ?? ; OF = 1, AL = 80h The two examples are identical at the binary level because 7Fh equals +127. To determine the value of the destination operand, it is often easier to calculate in hexadecimal. A Rule of Thumb • When adding two integers, remember that the Overflow flag is only set when . . . • Two positive operands are added and their sum is negative • Two negative operands are added and their sum is positive What will be the values of the Overflow flag? mov al,80h add al,92h ; OF = 1 mov al,-2 add al,+127 ; OF = 0 Your turn . . . What will be the values of the given flags after each operation? mov al,-128 neg al ; CF = 1 OF = 1 mov ax,8000h add ax,2 ; CF = 0 OF = 0 mov ax,0 sub ax,2 ; CF = 1 OF = 0 mov al,-5 sub al,+125 ; OF = 1 TITLE and .MODEL Directives TITLE line (optional) Contains a brief heading of the program and the disk file name .MODEL directive Specifies the memory configuration For our purposes, the FLAT memory model will be used Linear 32-bit address space (no segmentation) STDCALL directive tells the assembler to use … Standard conventions for names and procedure calls .686 processor directive Used before the .MODEL directive Program can use instructions of Pentium P6 architecture At least the .386 directive should be used with the FLAT model .STACK, .DATA, & .CODE Directives .STACK directive Tells the assembler to define a runtime stack for the program The size of the stack can be optionally specified by this directive The runtime stack is required for procedure calls .DATA directive Defines an area in memory for the program data The program's variables should be defined under this directive Assembler will allocate and initialize the storage of variables .CODE directive Defines the code section of a program containing instructions Assembler will place the instructions in the code area in memory INCLUDE, PROC, ENDP, and END INCLUDE directive Causes the assembler to include code from another file We will include Irvine32.inc provided by the author Kip Irvine Declares procedures implemented in the Irvine32.lib library To use this library, you should link Irvine32.lib to your programs PROC and ENDP directives Used to define procedures As a convention, we will define main as the first procedure Additional procedures can be defined after main END directive Marks the end of a program Identifies the name (main) of the program’s startup procedure Example 1: Example 2: Write a program to read a character from the keyboard and display it at beginning of the next line. TITLE PGM(1): ECHO PROGRAM .MODEL SMALL .STACK 100h .CODE MAIN PROC MOV AH,2 ; display character function MOV DL,’?’ INT 21h MOV AH,1 ; read a character from the keyboard INT 21h ; The ASCII code of the character in AL MOV AL,BL ; save it in AL MOV AH,2 MOV AH,0DH ; carriage return INT 21h ;execute MOV DL,OA ; feed line INT 21h MOV DL,BL ; display the character INT 21h MOV AH, 4CH ; return to DOS Function INT 21h MAIN ENDP END MAIN ; end of the code Note: because no variables were used , the .DATA segment was omitted Simple execution of this program ?A A Example 3: Case Conversion Program The following program combines most of this chapter material into one program , which prompts the user to input a lower case letter , and on the next line displays the letter in upper case. TITLE PGM(3):Case Conversion Program .MODEL SMALL .STACK 100h .DATA CR EQU 0DH LF EQU 0AH MSG1 DB ‘ ENTER A LOWER CASE LETTER: $’ MSG2 DB 0DH, 0AH, ‘ IN UPPER CASE IT IS: $‘ CHAR DB ? .CODE MAIN PROC MOV AX,@DATA ; initialize DS MOV DS,AX LEA DX,MSG1 ; get the offset address of MSG1 MOV AH,9 ; display function INT 21h ; display the first MSG1 MOV AH,1 ; read a character from the keyboard INT 21h ; The ASCII code of the Lower case letter in AL SUB AL,20h ; convert L/C into U/C MOV CHAR,AL ; store it into CHAR LEA DX,MSG1 ; get the offset address of MSG2 MOV AH,9 ; display function INT 21h ; display the first MSG2 MOV AH,2 MOV DL, CHAR INT21h MOV AH, 4CH ; return to DOS Function INT 21h MAIN ENDP END MAIN