CS2422 Assembly Language and System Programming
Assembly Language Fundamentals
Department of Computer Science
National Tsing Hua University
Assembly Language for Intel-Based Computers, 5th Edition
Kip Irvine
Chapter 3: Assembly Language Fundamentals
Slides prepared by the author
Revision date: June 4, 2006
(c) Pearson Education, 2006-2007. All rights reserved. You may modify and copy this slide show for your personal use, or for use in the classroom, as long as this copyright statement, the author's name, and the title are not changed.
Chapter Overview
Basic Elements of Assembly Language
Example: Adding and Subtracting Integers
Assembling, Linking, and Running Programs
Defining Data
Symbolic Constants
Real-Address Mode Programming
Starting with an Example
TITLE Add and Subtract              (AddSub.asm)
; Adds and subtracts three 32-bit integers
; (10000h + 40000h - 20000h)
INCLUDE Irvine32.inc
.code
main PROC
    mov eax,10000h      ; EAX = 10000h
    add eax,40000h      ; EAX = 50000h
    sub eax,20000h      ; EAX = 30000h
    call DumpRegs       ; display registers
    exit
main ENDP
END main
Callouts: the TITLE and comment lines form the title/header, the INCLUDE line names the include file, and everything from .code through main ENDP is the code section.
Meanings of the Code
Assembly code        Machine code     Meaning
MOV EAX, 10000h      B8 00010000      Move 10000h into EAX
ADD EAX, 40000h      05 00040000      Add 40000h to EAX
SUB EAX, 20000h      2D 00020000      Subtract 20000h from EAX
The operand is contained in the instruction itself.
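The mapping from assembly code to machine code can be modelled in a few lines. The sketch below is in Python rather than MASM, purely as an illustration; the `OPCODES` table and the `encode` helper are hypothetical names, and only the three opcodes shown on the slide are covered. It also makes one detail visible: the listing prints the 32-bit operand as a single value (00010000), while IA-32 stores it in memory least-significant byte first.

```python
import struct

# Opcode bytes taken from the machine-code column above. Each one is
# followed by a 32-bit immediate operand.
OPCODES = {"mov eax": 0xB8, "add eax": 0x05, "sub eax": 0x2D}


def encode(op, imm32):
    """Return the 5 bytes of 'op, imm32': opcode + little-endian immediate."""
    return bytes([OPCODES[op]]) + struct.pack("<I", imm32)


program = [("mov eax", 0x10000), ("add eax", 0x40000), ("sub eax", 0x20000)]
for op, imm in program:
    code = encode(op, imm)
    print(f"{op},{imm:X}h ->", " ".join(f"{b:02X}" for b in code))
```

Each instruction occupies five bytes: one opcode byte plus a four-byte operand.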
[Figure sequence: a CPU with registers EAX and EBX, an ALU, an instruction register (IR), and a program counter (PC), next to a memory that holds the machine code of MOV EAX, 10000h / ADD EAX, 40000h / SUB EAX, 20000h]
Fetched MOV EAX, 10000h: IR = B8 00010000, EAX not yet written, PC shown as 0000011
Execute MOV EAX, 10000h: EAX = 00010000
Fetched ADD EAX, 40000h: IR = 05 00040000, EAX = 00010000, PC shown as 0001000
Execute ADD EAX, 40000h: EAX = 00050000
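The fetch and execute steps pictured above can be sketched as a small loop. This is a simplified Python model, not a description of real hardware: the `run` function is a hypothetical name, it understands only the three opcodes used in the example, and it assumes every instruction is five bytes long.

```python
import struct

# Machine code of the three instructions as it sits in memory
# (immediates are stored little-endian on IA-32).
memory = bytes.fromhex("B800000100" "0500000400" "2D00000200")


def run(memory):
    """Fetch and execute each 5-byte instruction; return EAX after each step."""
    eax, pc, trace = 0, 0, []
    while pc < len(memory):
        ir = memory[pc:pc + 5]            # fetch: copy the instruction into IR
        pc += 5                           # PC now points at the next instruction
        opcode = ir[0]
        (imm,) = struct.unpack("<I", ir[1:])
        if opcode == 0xB8:                # MOV EAX, imm32
            eax = imm
        elif opcode == 0x05:              # ADD EAX, imm32
            eax = (eax + imm) & 0xFFFFFFFF
        elif opcode == 0x2D:              # SUB EAX, imm32
            eax = (eax - imm) & 0xFFFFFFFF
        else:
            raise ValueError(f"unsupported opcode {opcode:02X}")
        trace.append(eax)
    return trace


print([f"{value:08X}" for value in run(memory)])
```

The trace reproduces the EAX values from the slides: 00010000 after MOV, 00050000 after ADD, and 00030000 after SUB.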
Chapter Overview
Basic Elements of Assembly Language
  Integer constants and expressions
  Character and string constants
  Reserved words and identifiers
  Directives and instructions
  Labels
  Mnemonics and Operands
  Comments
Example: Adding and Subtracting Integers
Assembling, Linking, and Running Programs
Defining Data
Symbolic Constants
Real-Address Mode Programming
Reserved Words, Directives
(annotations on the AddSub program)
TITLE:
  Defines the program listing title
  Reserved word of a directive
Reserved words:
  Instruction mnemonics, directives, type attributes, operators, predefined symbols
  See MASM reference in Appendix A
Directives:
  Commands for the assembler
Directive vs Instruction
Directives: tell the assembler what to do
  Commands that are recognized and acted upon by the assembler, e.g. declare code and data areas, select memory model, declare procedures
  Not part of the Intel instruction set
  Different assemblers have different directives
Instructions: tell the CPU what to do
  Assembled into machine code by the assembler
  Executed at runtime by the CPU
  Members of the Intel IA-32 instruction set
  Format: label (optional), mnemonic, operands, comment
Comments
(annotations on the AddSub program)
Single-line comments
  begin with a semicolon (;)
Multi-line comments
  begin with the COMMENT directive and a programmer-chosen character, and end with the same character, e.g.
    COMMENT !
      Comment line 1
      Comment line 2
    !
Include Files
(annotations on the AddSub program)
INCLUDE directive:
  Copies necessary definitions and setup information from a text file named Irvine32.inc, located in the assembler's INCLUDE directory (see Chapter 5)
Code Segment
(annotations on the AddSub program)
.code directive:
  Marks the beginning of the code segment, where all executable statements in a program are located
Procedure Definition
(annotations on the AddSub program)
Procedure defined by:
  [label] PROC
  [label] ENDP
Label:
  Place marker: marks the address (offset) of code and data
  Assigned a numeric address by the assembler
  Follows identifier rules
  Data label: must be unique, e.g. myArray
  Code label: target of jump and loop instructions, e.g. L1:
Identifiers
(annotations on the AddSub program)
Identifiers:
  A programmer-chosen name to identify a variable, a constant, a procedure, or a code label
  1-247 characters, including digits
  Not case sensitive
  First character must be a letter, _, @, ?, or $
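The identifier rules can be written down as a small check. This is a Python sketch, not part of MASM: `is_valid_identifier` and `same_identifier` are hypothetical helpers, the set of characters allowed after the first one is an assumption, and the check ignores the requirement that an identifier must not be a reserved word.

```python
import re

# First character: letter, _, @, ?, or $ (from the slide). Allowing the same
# characters plus digits in later positions is an assumption of this sketch.
_IDENTIFIER = re.compile(r"[A-Za-z_@?$][A-Za-z0-9_@?$]*")


def is_valid_identifier(name):
    """Check the length limit (1-247) and the character rules."""
    return 1 <= len(name) <= 247 and _IDENTIFIER.fullmatch(name) is not None


def same_identifier(a, b):
    """Identifiers are not case sensitive, so compare them case-insensitively."""
    return a.lower() == b.lower()


for candidate in ["myArray", "_count", "$val1", "1stValue", "my-var"]:
    print(candidate, is_valid_identifier(candidate))
```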
Integer Constants
Optional leading + or - sign
Binary, decimal, hexadecimal, or octal digits
Common radix characters:
  h: hexadecimal
  d: decimal
  b: binary
  r: encoded real
Examples: 30d, 6Ah, 42, 1101b
A hexadecimal constant that begins with a letter needs a leading zero: 0A5h
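How the radix suffix changes the value of a constant can be sketched in Python. The function `parse_int_constant` is a hypothetical helper that handles only the h, d, and b suffixes listed above (the r suffix and octal are left out), and it treats a constant without a suffix as decimal.

```python
# Radix suffixes from the slide that this sketch supports.
RADIX = {"h": 16, "d": 10, "b": 2}


def parse_int_constant(text):
    """Convert an integer constant such as '6Ah' or '1101b' to its value."""
    s = text.strip().lower()
    sign = 1
    if s and s[0] in "+-":
        sign = -1 if s[0] == "-" else 1
        s = s[1:]
    # A constant must start with a digit; this is why 0A5h needs its zero.
    if not s or not s[0].isdigit():
        raise ValueError(f"not an integer constant: {text!r}")
    base = 10
    if s[-1] in RADIX:
        base = RADIX[s[-1]]
        s = s[:-1]
    return sign * int(s, base)


for constant in ["30d", "6Ah", "42", "1101b", "0A5h"]:
    print(constant, "=", parse_int_constant(constant))
```

Without the leading zero, A5h would be read as an identifier, which is why the sketch rejects it.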
Instructions
[label:] mnemonic operand(s) [;comment]
Instruction mnemonics:
  help to memorize
  examples: MOV, ADD, SUB, MUL, INC, DEC
Operands:
  constant (immediate value)
  constant expression
  register
  memory (data label, register)
In mov eax,10000h the first operand (eax) is the destination operand and the second (10000h) is the source operand.
Instruction Format Examples
No operands
    stc                 ; set Carry flag
    nop                 ; no operation
One operand
    inc eax             ; register
    inc myByte          ; memory
Two operands
    add ebx,ecx         ; register, register
    sub myByte,25       ; memory, constant
    add eax,36 * 25     ; register, constant-expr.
I/O
(annotations on the AddSub program)
Not easy if we program it ourselves
  Will use the library provided by the author
Two steps:
  Include the library (Irvine32.inc) in your code
  Call the subroutines
call DumpRegs:
  Calls the procedure that displays the current values of the processor registers
Remaining
(annotations on the AddSub program)
exit:
  Halts the program
  Not a MASM keyword, but a command defined in Irvine32.inc
END main:
  Marks the last line of the program to be assembled
  Identifies the name of the program's startup procedure
Example Program Output
Program output, showing registers and flags
EAX=00030000  EBX=7FFDF000  ECX=00000101  EDX=FFFFFFFF
ESI=00000000  EDI=00000000  EBP=0012FFF0  ESP=0012FFC4
EIP=00401024  EFL=00000206  CF=0  SF=0  ZF=0  OF=0
Alternative Version of AddSub
TITLE Add and Subtract              (AddSubAlt.asm)
; adds and subtracts 32-bit integers
.386
.MODEL flat,stdcall
.STACK 4096
ExitProcess PROTO, dwExitCode:DWORD
DumpRegs PROTO
.code
main PROC
    mov eax,10000h      ; EAX = 10000h
    add eax,40000h      ; EAX = 50000h
    sub eax,20000h      ; EAX = 30000h
    call DumpRegs
    INVOKE ExitProcess,0
main ENDP
END main
Explanations
.386 directive:
  Minimum processor required for this code
.MODEL directive:
  flat: generate code for a protected-mode program
  stdcall: enable calling of Windows functions
PROTO directives:
  Prototypes for procedures
  ExitProcess: Windows function to halt the process
INVOKE directive:
  Calls a procedure or function
  Here it calls ExitProcess and passes it a return code of zero
Suggested Program Template
TITLE Program Template              (Template.asm)
; Program Description:
; Author:
; Creation Date:
; Revisions:
; Date:              Modified by:
INCLUDE Irvine32.inc
.data
    ; (insert variables here)
.code
main PROC
    ; (insert executable instructions here)
    exit
main ENDP
    ; (insert additional procedures here)
END main
What's Next






Basic Elements of Assembly Language
Example: Adding and Subtracting Integers
Assembling, Linking, and Running Programs
Defining Data
Symbolic Constants
Real-Address Mode Programming
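Assemble-Link-Execute Cycle
Steps from creating a source program through executing the compiled program
Step 1: text editor creates the source file
Step 2: assembler reads the source file and produces the object file and the listing file
Step 3: linker combines the object file with the link library and produces the executable file and the map file
Step 4: OS loader loads the executable file, which runs and produces the output
http://kipirvine.com/asm/gettingStarted/index.htm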
MASM History
v6.11: independent product
v6.15: Visual C++ 6.0 Processor Pack
v7.0: Visual C++ .NET 2002
v7.1: Visual C++ .NET 2003
v8.0: Visual C++ .NET 2005
Download, Install, and Run
MASM 6.15 (with all examples of the textbook)
  Masm615.zip: download from the course web site
Unzip the archive and run setup.exe
Choose the installation directory
  Suggest using the default directory
  See index.htm in the archive for details
Go to C:\Masm615 (if installed in the default directory)
Write assembly source code
  TextPad, Notepad++, UltraEdit or …
make32 xxx (where xxx is your file name)
Suggestion
Study make32.bat and make16.bat
  To learn where the assembling stage and the linking stage are
  Think about linking with other languages (e.g. C or C++ or …)
Understand that MASM is only one of many assemblers
  Try to use NASM or TASM
Try to use a high-level compiler to generate assembly code
  gcc or Visual C++ or Turbo C or …
Listing File
Use it to see how your program is compiled
Contains
  source code
  addresses
  object code (machine language)
  segment names
  symbols (variables, procedures, and constants)
Example: addSub.lst
Listing File
Address   Content                 Source
00000000                          .code
00000000                          main PROC
00000000  B8 00010000                 mov eax,10000h
00000005  05 00040000                 add eax,40000h
0000000A  2D 00020000                 sub eax,20000h
0000000F  E8 00000000 E               call DumpRegs
                                      exit
00000014  6A 00               *       push +000000000h
00000016  E8 00000000 E       *       call ExitProcess
0000001B                          main ENDP
                                  END main
The first column is the memory address (offset) and the second is the memory content (machine code).
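The address column of the listing follows from the instruction sizes: each address is the previous address plus the length of the previous instruction. A minimal Python sketch of that bookkeeping is shown below; `assign_offsets` is a hypothetical helper, and the byte lengths are read off the content column of the listing.

```python
# (source text, machine-code length in bytes). Lengths come from the listing:
# one opcode byte plus a 4-byte operand is 5 bytes, and push 0 (6A 00) is 2.
instructions = [
    ("mov eax,10000h", 5),
    ("add eax,40000h", 5),
    ("sub eax,20000h", 5),
    ("call DumpRegs", 5),
    ("push 0", 2),
    ("call ExitProcess", 5),
]


def assign_offsets(instrs, start=0):
    """Return the offset of every instruction and the offset after the last."""
    offsets, counter = [], start
    for _, size in instrs:
        offsets.append(counter)
        counter += size
    return offsets, counter


offsets, end = assign_offsets(instructions)
for (text, _), offset in zip(instructions, offsets):
    print(f"{offset:08X}  {text}")
print(f"{end:08X}  main ENDP")
```

The printed offsets match the listing: 00000000, 00000005, 0000000A, 0000000F, 00000014, 00000016, and 0000001B at main ENDP.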
What's Next
Basic Elements of Assembly Language
Example: Adding and Subtracting Integers
Assembling, Linking, and Running Programs
Defining Data
  Intrinsic Data Types
  Data Definition Statement
  Defining BYTE, SBYTE, WORD, SWORD, DWORD, SDWORD, QWORD, TBYTE
  Defining Real Number Data
  Little Endian Order
Symbolic Constants
Real-Address Mode Programming
Intrinsic Data Types
BYTE      8-bit unsigned integer
SBYTE     8-bit signed integer
WORD      16-bit unsigned integer
SWORD     16-bit signed integer
DWORD     32-bit unsigned integer
SDWORD    32-bit signed integer
FWORD     48-bit integer (far pointer in protected mode)
QWORD     64-bit integer
TBYTE     80-bit (10-byte) integer
REAL4     32-bit (4-byte) IEEE short real
REAL8     64-bit (8-byte) IEEE long real
REAL10    80-bit (10-byte) IEEE extended real
Data Definition Statement
A data definition statement sets aside storage in memory for a variable
May optionally assign a name (label) to the data
Syntax:
  [name] directive initializer [,initializer] . . .
  Example: value1 BYTE 10
All initializers become binary data in memory
Use ? if no initialization is necessary
  Example: Var1 BYTE ?
Defining BYTE and SBYTE Data
Each of the following defines a single byte of storage:
value1 BYTE 'A'         ; character constant
value2 BYTE 0           ; smallest unsigned byte
value3 BYTE 255         ; largest unsigned byte
value4 SBYTE -128       ; smallest signed byte
value5 SBYTE +127       ; largest signed byte
value6 BYTE ?           ; uninitialized byte
• MASM does not prevent you from initializing a BYTE with a negative value, but it is considered poor style
• If you declare a SBYTE variable, the Microsoft debugger will automatically display its value in decimal with a leading sign
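The smallest and largest values quoted in these examples follow directly from the bit width of each type. The Python sketch below derives them; `int_range` and the `TYPES` table are hypothetical names introduced for illustration, and the signed ranges assume two's-complement representation, which IA-32 uses.

```python
def int_range(bits, signed):
    """Return (smallest, largest) for an integer of the given width."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


# Width and signedness of the integer types from the data type table.
TYPES = {
    "BYTE": (8, False), "SBYTE": (8, True),
    "WORD": (16, False), "SWORD": (16, True),
    "DWORD": (32, False), "SDWORD": (32, True),
}

for name, (bits, signed) in TYPES.items():
    low, high = int_range(bits, signed)
    print(f"{name:7} {low} .. {high}")
```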
Defining Byte Arrays
Examples that use multiple initializers:
list1 BYTE 10,20,30,40
list2 BYTE 10,20,30,40
      BYTE 50,60,70,80
      BYTE 81,82,83,84
list3 BYTE ?,32,41h,00100010b
list4 BYTE 0Ah,20h,'A',22h
Defining Strings
An array of characters
  Usually enclosed in quotation marks
  Will often be null-terminated
  To continue a single string across multiple lines, end each line with a comma
str1 BYTE "Enter your name",0
str2 BYTE 'Error: halting program',0
str3 BYTE 'A','E','I','O','U'
greeting BYTE "Welcome to the Encryption Demo program "
         BYTE "created by Kip Irvine.",0
menu BYTE "Checking Account",0dh,0ah,0dh,0ah,
     "1. Create a new account",0dh,0ah,
     "2. Open an existing account",0dh,0ah,
     "Choice> ",0
End-of-line sequence:
  0Dh = carriage return
  0Ah = line feed
Is str1 an array?
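To see why a string definition is just an array of bytes, the Python sketch below lays out initializers the way a BYTE statement does. `define_bytes` is a hypothetical helper, and it assumes ASCII characters only.

```python
CR, LF, NUL = 0x0D, 0x0A, 0x00


def define_bytes(*initializers):
    """Lay out initializers like a BYTE statement: a string contributes one
    byte per character, an integer contributes a single byte."""
    out = bytearray()
    for item in initializers:
        if isinstance(item, str):
            out += item.encode("ascii")
        else:
            out.append(item)
    return bytes(out)


str1 = define_bytes("Enter your name", NUL)
str3 = define_bytes("A", "E", "I", "O", "U")
line = define_bytes("Checking Account", CR, LF)

print(len(str1), str1)
print(len(str3), str3)
print(line)
```

In this model str1 is a 16-byte array: 15 characters followed by the null terminator. str3 is 5 bytes and has no terminator.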
Using the DUP Operator
Use DUP to allocate (create space for) an array or string
Syntax:
  counter DUP (argument)
Counter and argument must be constants or constant expressions
var1 BYTE 20 DUP(0)         ; 20 bytes, all equal to zero
var2 BYTE 20 DUP(?)         ; 20 bytes, uninitialized
var3 BYTE 4 DUP("STACK")    ; 20 bytes, "STACKSTACKSTACKSTACK"
var4 BYTE 10,3 DUP(0),20    ; 5 bytes
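The byte counts in the comments above can be checked with a short model of DUP. This is a Python sketch, not MASM: `dup` and `size_in_bytes` are hypothetical helpers, and `None` stands in for the uninitialized marker `?`.

```python
def dup(count, *args):
    """Model 'count DUP(args)': repeat the argument list count times."""
    return list(args) * count


def size_in_bytes(items):
    """A string takes one byte per character; anything else takes one byte."""
    return sum(len(item) if isinstance(item, str) else 1 for item in items)


var1 = dup(20, 0)                 # 20 DUP(0)
var2 = dup(20, None)              # 20 DUP(?)
var3 = dup(4, "STACK")            # 4 DUP("STACK")
var4 = [10] + dup(3, 0) + [20]    # 10,3 DUP(0),20

for name, value in [("var1", var1), ("var2", var2), ("var3", var3), ("var4", var4)]:
    print(name, size_in_bytes(value), "bytes")
```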
Defining WORD and SWORD
Define storage for 16-bit integers
  or double characters
  single value or multiple values
word1  WORD  65535          ; largest unsigned value
word2  SWORD -32768         ; smallest signed value
word3  WORD  ?              ; uninitialized, unsigned
word4  WORD  "AB"           ; double characters
myList WORD  1,2,3,4,5      ; array of words
array  WORD  5 DUP(?)       ; uninitialized array
Defining Other Types of Data
Storage definitions for 32-bit integers, quadwords, tenbyte values, and real numbers:
val1 DWORD 12345678h             ; unsigned
val2 SDWORD -2147483648          ; signed
val3 DWORD 20 DUP(?)             ; unsigned array
val4 SDWORD -3,-2,-1,0,1         ; signed array
quad1 QWORD 1234567812345678h
val5 TBYTE 1000000000123456789Ah
rVal1 REAL4 -2.1
rVal2 REAL8 3.2E-260
rVal3 REAL10 4.6E+4096
ShortArray REAL4 20 DUP(0.0)
Adding Variables to AddSub
TITLE Add and Subtract, Version 2         (AddSub2.asm)
; This program adds and subtracts 32-bit unsigned
; integers and stores the sum in a variable.
INCLUDE Irvine32.inc
.data
val1     DWORD 10000h
val2     DWORD 40000h
val3     DWORD 20000h
finalVal DWORD ?
.code
main PROC
    mov eax,val1        ; start with 10000h
    add eax,val2        ; add 40000h
    sub eax,val3        ; subtract 20000h
    mov finalVal,eax    ; store the result (30000h)
    call DumpRegs       ; display the registers
    exit
main ENDP
END main
Listing File
00000000                      .data
00000000 00010000             val1 DWORD 10000h
00000004 00040000             val2 DWORD 40000h
00000008 00020000             val3 DWORD 20000h
0000000C 00000000             finalVal DWORD ?
00000000                      .code
00000000                      main PROC
00000000 A1 00000000 R            mov eax,val1       ; start with 10000h
00000005 03 05 00000004 R         add eax,val2       ; add 40000h
0000000B 2B 05 00000008 R         sub eax,val3       ; subtract 20000h
00000011 A3 0000000C R            mov finalVal,eax   ; store result
00000016 E8 00000000 E            call DumpRegs      ; display registers
                                  exit
00000022                      main ENDP
C vs Assembly
C:
int main(void)
{
    int val1 = 0x10000;
    int val2 = 0x40000;
    int val3 = 0x20000;
    int finalVal;
    finalVal = val1 + val2 - val3;
    return 0;
}
Assembly:
.data
val1     DWORD 10000h
val2     DWORD 40000h
val3     DWORD 20000h
finalVal DWORD ?
.code
main PROC
    mov eax,val1
    add eax,val2
    sub eax,val3
    mov finalVal,eax
    call DumpRegs
    exit
main ENDP
What's Next
Basic Elements of Assembly Language
Example: Adding and Subtracting Integers
Assembling, Linking, and Running Programs
Defining Data
Symbolic Constants
  Equal-Sign Directive
  Calculating the Sizes of Arrays and Strings
  EQU Directive
  TEXTEQU Directive
Real-Address Mode Programming
Equal-Sign Directive
name = expression
  expression is a 32-bit integer (expression or constant)
  may be redefined
  name is called a symbolic constant
  Also OK to use EQU
Good programming style to use symbols
COUNT = 500
…
mov ax,COUNT        ; 500 does not fit in the 8-bit AL, so use AX
Calculating the Size of Arrays
Current location counter: $
Size of a byte array
  Subtract the address of list from the current location counter; the difference is the number of bytes
  list BYTE 10,20,30,40
  ListSize = ($ - list)
Size of a word array
  Divide the total number of bytes by 2 (the size of a word)
  list WORD 1000h,2000h,3000h,4000h
  ListSize = ($ - list) / 2
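The arithmetic behind ($ - list) can be modelled with a location counter that advances as data is defined. The Python sketch below is only an illustration; the `DataSegment` class and the label `wlist` are hypothetical.

```python
class DataSegment:
    """Tiny model of a data segment with a current location counter ($)."""

    def __init__(self):
        self.counter = 0          # plays the role of $
        self.labels = {}          # label name -> offset

    def define(self, name, values, element_size):
        """Record the label at the current offset, then advance the counter."""
        self.labels[name] = self.counter
        self.counter += len(values) * element_size

    def bytes_since(self, name):
        """Evaluate ($ - name) at the current position."""
        return self.counter - self.labels[name]


segment = DataSegment()

segment.define("list", [10, 20, 30, 40], element_size=1)            # BYTE array
list_size = segment.bytes_since("list")                             # ($ - list)

segment.define("wlist", [0x1000, 0x2000, 0x3000, 0x4000], element_size=2)  # WORD array
wlist_size = segment.bytes_since("wlist") // 2                      # ($ - wlist) / 2

print(list_size, wlist_size)
```

Both sizes come out as 4 elements. The model also shows why the size symbol has to be defined immediately after the array: once more data has been defined, ($ - list) would count those bytes as well.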
EQU Directive
Define a symbol as either an integer or text expression
Cannot be redefined
OK to use expressions in EQU:
  Matrix1 EQU 10 * 10
  Matrix2 EQU <10 * 10>
  No expression evaluation if within < >
EQU accepts text too
  PI EQU <3.1416>
  pressKey EQU <"Press any key to continue",0>
  .data
  prompt BYTE pressKey
TEXTEQU Directive
Define a symbol as either an integer or text expression
Called a text macro
Can be redefined
continueMsg TEXTEQU <"Do you wish to continue (Y/N)?">
rowSize = 5
.data
prompt1 BYTE continueMsg
count TEXTEQU %(rowSize * 2)        ; evaluates expression
setupAL TEXTEQU <mov al,count>
.code
setupAL                             ; generates: "mov al,10"
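The expansion of setupAL into "mov al,10" happens in two substitution steps, which the following Python sketch imitates. It is a deliberately simplified model: `textequ` and `expand` are hypothetical helpers, macro names are matched case-insensitively, and text inside quoted strings is not protected from substitution as it would be in a real assembler.

```python
import re

macros = {}                      # text-macro name (lowercase) -> replacement text
_NAME = re.compile(r"[A-Za-z_@?$][A-Za-z0-9_@?$]*")


def textequ(name, text):
    """Define or redefine a text macro."""
    macros[name.lower()] = text


def expand(line, max_passes=32):
    """Substitute text macros repeatedly until the line stops changing."""
    for _ in range(max_passes):  # guard against self-referencing macros
        new_line = _NAME.sub(
            lambda m: macros.get(m.group(0).lower(), m.group(0)), line
        )
        if new_line == line:
            return line
        line = new_line
    raise RecursionError("text macro expansion did not terminate")


row_size = 5
textequ("continueMsg", '"Do you wish to continue (Y/N)?"')
textequ("count", str(row_size * 2))      # %(rowSize * 2) evaluates to 10
textequ("setupAL", "mov al,count")

print(expand("prompt1 BYTE continueMsg"))
print(expand("setupAL"))
```

The first pass replaces setupAL with "mov al,count", and the second pass replaces count with 10.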
What's Next






Basic Elements of Assembly Language
Example: Adding and Subtracting Integers
Assembling, Linking, and Running Programs
Defining Data
Symbolic Constants
Real-Address Mode Programming (skipped)
Summary
Integer expression, character constant
Directive: interpreted by the assembler
Instruction: executes at runtime
Code, data, and stack segments
Source, listing, object, map, executable files
Data definition directives:
  BYTE, SBYTE, WORD, SWORD, DWORD, SDWORD, QWORD, TBYTE, REAL4, REAL8, and REAL10
  DUP operator, location counter ($)
Symbolic constant
  EQU and TEXTEQU