ECE 371
Microprocessors
Chapter 5
x86 Assembly Language 1
Herbert G. Mayer, PSU
Status 11/11/2015
For use at CCUT Fall 2015
Syllabus
Motivation
16-bit, 32-bit, 64-bit Processor
Null Program
Print Character
Print String
INT Function
Assembler Abbreviations
Macros
Procedures
Assembly and Linking
nasm Assembler
Summary
Appendix
Motivation
 Almost impossible to communicate with a
microprocessor on the binary level
 Assembler offers abstraction, relocatability, and
program reuse
 Symbolic names permit convenient definition and
reference of data and code objects
 Assembler offers high level data and control
constructs, similar to high-level languages
 Assembler programming allows high level of control
over the target machine
 And achieves highest performance, at least for short
code sections
 Intel x86 is the most widely used microprocessor for
general computing; made by Intel and AMD
 The ARM processor is the most widely used processor
for portable devices, e.g. tablets and cell phones
 We use Intel x86 here to explain the relation of µP
and assembly language; for any one µP, there may
be many assemblers, but only a single binary code
 The µP architecture defines details of the assembler
instructions; yet some assembly language detail is
independent of architecture
 E.g. the syntactic order in which operands are listed
in assembly instructions is arbitrary, but the bits
have to be assembled into their specific bit positions
of a machine instruction
 Any machine instruction has its corresponding
assembler syntax
 Different manufacturers of an assembler may have
different syntax rules for the same machine
instructions
 For example, some define the destination register to
be situated in the leftmost position of the various
defined operands; e.g. a load instruction for a
hypothetical machine could be:
ld r1, [foo]   -- load word at address foo into reg r1
 Others might reverse the order, use different
mnemonics, or name registers differently, such as:
load foo, %r1 -- load word at address foo into reg r1
 Some manufacturers refer to moving bits from
memory into a register as a load instruction (IBM);
others as a move instruction (Intel)
 Assembly Language bridges the gap between low
level binary machine instructions and higher level
interface with human programmers
 Binary instructions execute on a digital computer,
while an assembler provides a tool for expressing
programs in textual form, readable by programmers
 Assembly language is by no means high-level in the
sense of machine-independent, structured, or
object-oriented
 It is a low level, target machine specific interface; but
shields programmers from the tedium of binary code
 Users do not deal with the target machine in terms of
bits that represent binary machine instructions
 An assembler is a piece of system software that
maps an assembly source program into binary
instructions
 Thus assembly language provides an abstraction:
 It elevates the user to the level of textual language, up
from the level of binary object code
 Several different assemblers may do this in
syntactically different ways for the same target μP
 Yet the generated binary code has to be identical for
each assembler, in order to render the object code
executable on the targeted μP
 Common to many architectures is the notion (and
separation) of data space, instruction space, and
perhaps other areas of program logic
 The x86 architecture embodies so-called data
segments, code segments, and stack segments, and
allows several of these if needed
 Each segment is identified at run time by a segment
register
 Specific data or code elements are identified by
their offsets from the start of their respective
segment
 For example, the code label next: will be interpreted
by the hardware as seg: offset, where seg is the
segment register cs, and offset is the offset of next
from the start of the code segment
 Say the offset of next is 248h and the cs register
holds 2003h; the code segment then starts at address
20030h, and the resulting run-time (code) address is
20030h + 248h = 20278h
 Note the left-shift of the segment register value by 4 bits
 This is possible, and required, since all segments are
required to be aligned at modulo-16 addresses on the
Intel x86 architecture
 Thus a segment’s starting address is always a
multiple of 16, and its binary address would always
have the rightmost (low-order) 4 bits 0
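The seg:offset translation can be checked with a few lines of arithmetic. The Python sketch below is only an illustration of 16-bit real mode; the function name linear_address and the sample values (segment 2003h, offset 248h) are assumptions chosen for the example.

```python
def linear_address(segment: int, offset: int) -> int:
    """Return the 20-bit real-mode address for a 16-bit seg:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shifting left by 4 bits multiplies by 16, so every segment
    # starts at an address whose low-order 4 bits are 0.
    return ((segment << 4) + offset) & 0xFFFFF  # keep 20 address bits


if __name__ == "__main__":
    seg, off = 0x2003, 0x0248  # hypothetical cs value and label offset
    print(f"{seg:04X}:{off:04X} -> {linear_address(seg, off):05X}")
```

The mask to 20 bits models the address width of the original 8086; later processors handle the wrap-around differently.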
 This chapter introduces complete programs, written
in assembly language
 Starting with the smallest possible but complete
assembly program, we progress to more
sophisticated programs
 One example emits a single character, the next prints
a complete string onto the standard screen, followed
by conventions that allow us to communicate with the
assembler in an abbreviated way
 We also discuss macros and simple procedures with
calls and returns
16-Bit, 32-Bit, 64-bit Architecture
 The Intel x86 processor started out as a 16-bit
architecture in the late 1970s
 The first x86 product was the Intel 8086 µP
 Then the x86 architecture grew to become a 32-bit
architecture
 The first 32-bit product was the Intel 80386; its
predecessors, the 80186 and 80286, were still 16-bit
processors
 The 32-bit version was backwards compatible with the
16-bit architecture and could execute old code
 Then in the early 2000s, after AMD had produced a
64-bit version of the x86 family, very much to the
surprise of Intel, Intel productized a 64-bit version as
well, in addition to the new and different Itanium
 The AMD product name was AMD64
 Intel’s name: Intel 64
 Old 16-bit and 32-bit x86 code is compatible and
executes without issue on the new 64-bit processors
 Though not with optimal speed, as legacy object
code cannot take advantage of new instructions that
may speed up certain applications
Photo of AMD64 µP
 AMD’s 64-bit version of the old x86 architecture must
have sent shock waves through Intel, which at the
time of AMD’s release had no published plans to
release a 64-bit version of the old x86 machine
 That quickly changed, as Intel had been smart
enough to have its skunk works design the new Intel
64-bit µP in secrecy
 All 8 old registers were expanded to 64 bits, and the
names modified correspondingly, to differentiate
them from their 32-bit or 16-bit siblings
 The old names, e.g. “eax” for the 32-bit version of the
ax register, were modified to “rax”, for the 64-bit
version of the ax register; the ax register has 16 bits
 The 64-bit extension added 8 more GPRs to the
register-starved architecture; these are known as rn,
with n = 8..15
Figure: register map of the 64-bit architecture
 The above register map also shows the XMM and
MMX registers, directly usable on the new 64-bit
architecture
 The 8 MMX registers are 64 bits wide; they share
storage with the 80-bit x87 floating-point registers,
which hold extended-precision values adhering to
the IEEE industry standard
 The XMM registers were already 128 bits long; that
did not have to change in Intel 64, though their
number grew from 8 to 16
 The instruction pointer register ip simply became rip,
and the flags register became rflags
16-Bit, 32-Bit, 64-bit Usage
In assembly code below we use the following names
for the ax register, depending on 16-bit, 32-bit, or
64-bit modes:
 ax: 16 bits; also al is the low order byte register
 eax: 32 bits
 rax: 64 bits
Ditto with the other registers, for example, the bx:
 bx: 16 bits; also bh is the high order byte register
 ebx: 32 bits
 rbx: 64 bits
Etc.
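The register names above are views onto the same storage: ax is the low 16 bits of eax, which is the low 32 bits of rax, and al and ah are the two bytes of ax. The Python sketch below models this with bit masks; the function name register_views and the sample value are assumptions for illustration only.

```python
def register_views(rax: int) -> dict:
    """Return the sub-register values visible inside a 64-bit rax value."""
    rax &= 0xFFFFFFFFFFFFFFFF          # keep 64 bits
    return {
        "rax": rax,                    # all 64 bits
        "eax": rax & 0xFFFFFFFF,       # low 32 bits
        "ax": rax & 0xFFFF,            # low 16 bits
        "ah": (rax >> 8) & 0xFF,       # high byte of ax
        "al": rax & 0xFF,              # low byte of ax
    }


if __name__ == "__main__":
    for name, value in register_views(0x1122334455667788).items():
        print(f"{name:>3} = {value:X}")
```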
A Null Program
In x86 Assembly Language
Null Program
 Goal here is to craft an x86 assembly language
program that assembles, links, loads and executes
correctly, and then does nothing
 Set up segments: code, data, and stack
 Here only the code segment is defined, as the others
are empty
 Note the 'code' string to identify the code segment
 Communicate the implied seg portion of seg:offset in
the assume directive
 Define start address (actually offset) via label, here
label start:
 Labels are user-defined identifiers, each followed by
colon, in the code segment
; Source: out1.asm
; Purpose: simplest program, no data seg, no stack
code_s  segment 'code'            ; 'code' identifies segment
        assume cs:code_s          ; implied seg register cs
start:  mov  al, 0                ; termination code
        mov  ah, 4ch              ; to terminate: 4ch in ah
        int  21h                  ; call system sw for help
code_s  ends                      ; end of code segment
        end  start                ; end argument defines start
 Use services provided by the system software: here
4ch to terminate; the ‘h’ stands for ‘hexadecimal’
 Run-time services requested via INT 21h
 Service refinement specified in register ah and
possibly other registers; in ah the ‘h’ stands for ‘high’ byte
 Return code is zero, meaning: no errors occurred
 Note comments, introduced by ;
 Comments end at the end of line
 Can be different in different assemblers
 Assembler used here could be Microsoft masm or ML
Print Single Character:
We Choose ‘$’
Print Character ‘$’
 Goal is to craft an x86 assembly language program that
assembles, links, loads and executes a complete
program for the purpose of printing a single character
 Also define a data and a stack segment; they remain
unused and serve only for demonstration
 Use an assembler directive to define data, here a
single machine word, via dw:
dw 999            ; reserves 1 word, initialized to 999
 And we define an array of 100 machine words, via the
pseudo-opcode dup:
dw 100 dup( 0 )   ; defines 100 words, initialized to 0;
                  ; they remain unused in this simple program
; Source: out2.asm
; Purpose: simplest program to output a character, here ‘$’
data_s  segment                   ; unused data segment
        dw   999                  ; define a word, init 999
data_s  ends
stack_s segment                   ; unused stack segment
        dw   100 dup( 0 )         ; reserve 100 words, init 0
stack_s ends
code_s  segment 'code'            ; THE Code Segment
        assume cs:code_s, ds:data_s
start:  mov  ax, seg data_s       ; initialize ds
        mov  ds, ax               ; cannot load directly into ds
        mov  dl, '$'              ; char to print assumed in dl
        mov  ah, 2h               ; call 2h emits char in dl
        int  21h                  ; call OS routine, e.g. DOS
        mov  ax, 4c00h            ; termination code in ah + al
        int  21h                  ; terminate finally via call
code_s  ends                      ; repeat seg name at ends
        end  start                ; say: Where to start
 Again a system routine is called for help: INT 21h
 The specific argument, communicating which help
is needed, must be passed in register ah
 Value 2 in ah states character output is desired
 OS service routine 2 prints a char; it outputs the
one found in register dl; that is the ‘$’ character
 Moving 4c00h into register ax is same as 4ch into
register ah and 00h into al
 Note that one of the h qualifiers says “hex”, while
the other says “high”
 4c00h is just two byte literals concatenated
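The concatenation of the two byte literals is plain shift-and-or arithmetic. The Python sketch below shows it in both directions; the helper names join_bytes and split_word are assumptions for illustration and are not part of any assembler.

```python
def join_bytes(high: int, low: int) -> int:
    """Combine a high byte (ah) and a low byte (al) into a 16-bit word (ax)."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


def split_word(word: int) -> tuple:
    """Split a 16-bit word (ax) into its (high, low) bytes (ah, al)."""
    return (word >> 8) & 0xFF, word & 0xFF


if __name__ == "__main__":
    ax = join_bytes(0x4C, 0x00)        # terminate service, return code 0
    print(f"ax = {ax:04X}h, (ah, al) = {split_word(ax)}")
```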
Print Character String
Print String
 Goal now is to craft an x86 assembly program that
assembles, links, loads and executes a program to
print a character string
 The Data Segment defines a string of bytes, initialized
to some string literal, identified by symbol msg
 This name msg is a user-defined name for the byte
address, where the string starts
 Note the $ character to end a string literal
 Used as end criterion for system SW routine 9
 Stack segment here is solely a dummy segment:
 It holds 10 unused strings, each of length 16, solely
for demonstration purposes
; Source: out3.asm
; Purpose: simplest program to output a character string
data_s  segment
msg     db   "Hello CCUT class$"
data_s  ends
stack_s segment                   ; unused
        db   10 dup( "---S t a c k----" )
stack_s ends                      ; repeat the name
code_s  segment 'code'
        assume cs:code_s, ds:data_s
start:  mov  ax, seg data_s
        mov  ds, ax
        mov  dx, offset msg       ; System SW prints
        mov  ah, 9h               ; sys call 9h emits string
        int  21h                  ; call OS routine
        mov  ax, 4c00h            ; term code in ah + al
        int  21h                  ; term finally via call
code_s  ends                      ; label seg name at ends
        end  start                ; start here!
 System SW routine 9 emits a character string to the
standard output file; note 9 is same as 9h
 It finds the string's start address in ds:offset, with
the offset communicated in register dx
 Note the built-in assembler operator offset applied
to a data label, here label msg
 The assembler also provides the built-in seg
pseudo-function to generate the segment part of the
final address
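The behavior of routine 9 can be modeled in a few lines: start at the address given by ds and dx, and emit bytes until the $ terminator. The Python sketch below is a model only, not DOS code; the function name dos_print_string, the flat bytes object standing in for memory, and the chosen segment and offset values are all assumptions.

```python
def dos_print_string(memory: bytes, ds: int, dx: int) -> str:
    """Return the text from address ds:dx up to, not including, the '$'."""
    start = (ds << 4) + dx               # real-mode seg:offset translation
    end = memory.index(b"$", start)      # ValueError if no terminator exists
    return memory[start:end].decode("ascii")


if __name__ == "__main__":
    memory = bytearray(64)               # small stand-in for main memory
    ds, dx = 0x0001, 0x0004              # hypothetical segment and offset
    text = b"Hello CCUT class$"
    address = (ds << 4) + dx
    memory[address:address + len(text)] = text
    print(dos_print_string(bytes(memory), ds, dx))
```

The model also shows why a missing $ is a bug: the search simply continues through whatever bytes follow the string.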
INT Function
 The x86 INT instruction is not what the computer
sciences call an interrupt
 Instead it is a call to a low-level system SW routine
 Parameterized by the single-byte argument residing
in the ah register
 The actual system SW being executed as a result of
INT is dependent on the actual operating system on
which the x86 code executes
 Thus it may be different on a Linux system from a
Windows environment and from a Unix target
machine
Assembler Abbreviations
 Assembler directive .model small allows for certain
default abbreviations and assumptions
 For example data, code, stack, @data are predefined
in Microsoft assemblers, as are assume statements
 Here another string is printed, that string is “Hello”
 Note again the $ terminator; $ has a different
meaning in other tools, e.g. in nasm $ means the
current value of the location counter
 Under Microsoft assembler SW, the macro @data is
predefined by ML (or masm), same as seg data
 Note again offset function, to compute the byte
distance from the start address of the segment
; Source file: out4.asm
; Purpose: simpler program to output string
        .model small              ; assumes stack data code
        .stack 10h                ; assumes name: stack
        .data                     ; assumes name: data
hi      db   "Hello$"
        .code                     ; assumes name: code
start:  mov  ax, @data            ; @data predefined macro
        mov  ds, ax               ; now data segment reg set
        mov  dx, offset hi        ; string to be output by System SW
        mov  ah, 9h               ; System SW 9h emits string
        int  21h                  ; call System SW
        mov  ax, 4c00h            ; we want to terminate: ah + al
        int  21h                  ; terminate finally
        end  start                ; start here!
 Note again the System SW routine 9 under Microsoft
system SW, to output some string of characters,
whose start address is found in register dx
 Program using .model small abbreviation is smaller,
more compact, easier to read
 The .code ends previous segment, if any (here data)
 And starts code segment
 The .data ends previous segment, if any
 And starts the data segment
Macros
 Programmers get tired of writing segment … ends
 The .model small allows defaults and abbreviations
 Macros make program source more readable, easier
to maintain; here are the rules:
 Macros can be defined anywhere in assembler source
 The initial assembler translation process extracts all
macro definitions, stores them during assembly time,
and uses (expands) them, each time a macro name is
found in the asm source
 Macros are introduced by user defined name and the
macro keyword
 Terminated by endm keyword
; Source file: out5.asm
; Purpose: macro-ized program to output character string
start   macro                     ; no parameters
        mov  ax, @data            ; @data predefined macro
        mov  ds, ax               ; now data segment reg set
        endm                      ; end of start macro

Put_Str macro Str                 ; one formal parameter, “Str”
        mov  dx, offset Str       ; string to be output by DOS
        mov  ah, 9h               ; DOS call 9h emits string
        int  21h                  ; call system SW
        endm                      ; end of Put_Str macro

Done    macro ret_code            ; formal parameter “ret_code”
        mov  ah, 4ch              ; we want to terminate, ah = 4ch
        mov  al, ret_code         ; communicate: all is o.k.
        int  21h                  ; terminate finally via DOS
        endm                      ; end of macro body of Done
        .model small              ; allow predefined assumptions
        .stack 10h                ; assumes segment name: stack
        .data                     ; assumes segment name: data
hi      db   "Hello$"             ; terminate string with $
        .code                     ; assumes segment name: code
main:   start                     ; use of macro “start”
        Put_Str hi                ; invoke macro “Put_Str” with hi
        Done 0                    ; use of macro “Done”
        end  main                 ; start here!
 Macros specify 0 or more formal macro parameters,
which can be referenced in the macro body
 At the place of macro definition, these parameters are
named formal parameters
 Formal parameters follow the macro keyword at the
place of definition
 At the place of use (the place where they are
expanded) these are substituted by actual parameters
 When macro name is used, its body is expanded inline at that place, with all actual parameters taking the
place of the formal ones
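The substitution of actual for formal parameters is purely textual and can be imitated in a few lines. The Python sketch below is a toy model of the expansion step, not how masm is implemented; the MACROS table and the function name expand are assumptions, with macro bodies copied from the Put_Str and Done macros of out5.asm.

```python
import re

# Hypothetical macro table: name -> (formal parameters, body lines).
MACROS = {
    "Put_Str": (["Str"], ["mov dx, offset Str", "mov ah, 9h", "int 21h"]),
    "Done": (["ret_code"], ["mov ah, 4ch", "mov al, ret_code", "int 21h"]),
}


def expand(line: str) -> list:
    """Expand one source line; lines that name no macro pass through."""
    parts = line.split(None, 1)
    if not parts or parts[0] not in MACROS:
        return [line]
    formals, body = MACROS[parts[0]]
    actuals = [a.strip() for a in parts[1].split(",")] if len(parts) > 1 else []
    if len(actuals) != len(formals):
        raise ValueError(f"{parts[0]} expects {len(formals)} argument(s)")
    expanded = []
    for text in body:
        for formal, actual in zip(formals, actuals):
            # Replace whole-word occurrences of the formal by the actual.
            text = re.sub(rf"\b{re.escape(formal)}\b", lambda m, a=actual: a, text)
        expanded.append(text)
    return expanded


if __name__ == "__main__":
    for source_line in ["Put_Str hi", "Done 0"]:
        for out in expand(source_line):
            print(out)
```

Each use of a macro name produces a fresh copy of the body, which is why heavy macro use enlarges the object code while a procedure call does not.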
Assembler Procedures:
Like High-Level Language Procedures
Procedures
 Assembler procedure identified by proc and endp
 Procedures can be called and provide a syntactic
grouping mechanism to form physical modules
containing logically connected actions
 The Microsoft syntax rule for procedure names does
not allow : as used for labels
 Return instruction ret ends a procedure body and
allows return to the place of call, immediately after
the call instruction
; Source file: out6.asm
; Purpose: modular macro program to output string
start   macro                     ; no parameters
        mov  ax, @data            ; @data predefined macro
        mov  ds, ax               ; now data segment reg set
        endm                      ; end of “start” macro body

Put_Str macro Str                 ; “Str” must be data label
        ; ... rest of Put_Str, and the Done macro, not shown here

        .data                     ; assumes name: data
hi      db   "Hello$"             ; terminate string with $
        .code                     ; assumes name: code
main    proc                      ; begin of procedure body
        start                     ; invoke “start” macro
        Put_Str hi                ; invoke “Put_Str” with actual
        Done 0                    ; invoke “Done” with actual 0
        ret                       ; return
main    endp
        end  main                 ; entry point is “main”
 As in high-level language programs, procedures
are a key syntax tool to modularize
 Physical modules (procedures) encapsulate data and
actions that belong together
 Physical modules (delineated by the proc and endp
keywords) are the language tool to define such
logical modules
 Net result: programs that are easier to write, and
above all, easier to read
 A procedure example is provided in a separate
handout
Assembly and Linking of Full Programs
Assembly
 Linking is the process of binding 2 or more pieces of
software together in a way that they constitute one
running program
 Clearly the start address, where execution begins,
must be defined, by convention
 Typical tools to assemble and link include:
1. Microsoft Macro Assembler masm
2. Borland Macro Assembler tasm
3. Microsoft Macro Assembler ml
4. Microsoft Linker link
5. Borland Linker tlink
Assembly With MASM
 The older version of the Microsoft macro assembler
has the name masm
 A newer assembler from Microsoft is named ml
 This section explains the masm command briefly
 The masm command in version 5.10 and older has 4
arguments, separated from one another by commas.
These arguments are file names
 An argument is considered omitted if neither a comma
nor a file name is given for it
 The assembler prompts for each omitted one, so it is
generally better to provide them, or at least the
commas; otherwise the assembler repeatedly asks for
file names, each to be answered with a name or a
carriage return
 It is a nuisance in masm 5.10 that the last comma (the
third one to separate 4 arguments) must be followed
by another comma (or semicolon, indicating the end
of a command line)
 Else the assembler does not recognize that the
default should be used for the fourth argument
 If commas without file names are given, then default
file names are assumed
 The four file names, which are the arguments of the
masm command, are left to right:
1. assembly source program, e.g. source.asm
2. object program generated by assembler, e.g.
source.obj
3. the listing, generated by the assembler, say
source.lst; yes, in days of old, people actually created
paper listings of programs being processed
4. the cross-reference file, named source.crf
 Suffixes obj, lst, and crf are automatically generated
by the assembler, if no other names are provided
 Some complete masm commands, for the assembler
file src1.asm would be:
masm src1.asm, src.obj, src.lst, src.crf   ; no prompting
masm src1,src1,src1,src1                   ; no prompting
masm src1,src1.obj,src1,src1.crf           ; no prompting
masm src1,,,;                              ; no prompting
 In the above cases the assembler will not prompt you,
because you provided all file names
 It was smart enough to think up the suffixes (like .lst
and .obj) from the respective positions
 Some incomplete masm commands for source file
src2.asm, are shown next
 The assembler will prompt the user for the missing
ones:
masm src2.asm, src2.obj    ; asks for: list, cross ref file (xref)
masm src2,foo,src2         ; creates foo.obj, src2.lst, asks xref
masm src2,,bar.lst         ; creates src2.obj, bar.lst, asks xref
masm src2                  ; asks for object, list, cross ref file
 Borland Macro Assembler tasm 5.10
 Similar to masm, but command is tasm
Linking Assembler Programs
Linking
 The Microsoft link command also has 4 arguments,
one input file and 3 output files
 Input is the object to be linked
 The object may be a concatenation of multiple object
files, typically ending in the .obj suffix, concatenated
via the + operator. For example:
link mem0 + putdec,,,
 creates an executable mem0.exe
 The file name mem0 is derived from the first part of
the first argument; suffix .exe is assumed
 Also, the object file putdec.obj is used as input, used
to resolve external names used in mem0.obj
 The link command has 4 arguments: the 4 file names
are:
1. object files, concatenated by + with default suffix .obj
2. the linked executable with suffix .exe
3. the load map file, whose name ends in .map
4. the library
 If the input file is provided without suffix then the
suffix .obj is assumed
 If the executable file is specified without suffix, then
.exe is assumed
 Any other file and suffix is allowable too
 The file for the load map can be specified
 If none is provided then the file name nul is generated
by the linker
 If no file suffix is provided, then the .map suffix is
assumed. Similarly, for the library a file name must be
specified
 The suffix is .lib
 The commands below do not cause the linker to
prompt for additional file name inputs, because
sufficient information is assumed:
link mem0 + putdec,,,,          ; mem0.exe, no map, no library
link mem0+putdec,foo.bar,,,     ; generate executable foo.bar
link putdec+mem0,mem0.exe,,,    ; mem0.exe
 Concatenation operator + may be embedded in any
number of blanks
 Commas may be surrounded by 0 or more blanks
 The order of specifying object files is immaterial,
provided the main entry point is unambiguous
 The commands below cause the linker to prompt for
some additional information:
link mem0 + putdec       ; ask for executable, map, and library
link mem0+putdec,x.y     ; ask for map and lib
link putdec+mem0,,       ; gen putdec.exe, ask for map and lib
Main Entry Point
 Each assembly unit (.asm source file) must end in an
end directive (end statement)
 This end statement may have an argument, identifying
one of the labels or proc names of the program. Such a
label specifies the entry point, i.e. the initial value of
ip, set by the loader
 However, if an executable is composed of multiple
objects, there must be just one single entry point. All
other source modules should not specify an
argument after their end statement
 If, however, two or more object modules to be linked
into an executable do have entry points specified,
masm does not complain
 Instead, it takes the first one of the objects listed as
the first argument in the link command. And if this is
not the intended entry point, program execution will
bring surprises
nasm Assembler
Simplest possible, meaningful asm program that
outputs a character string. Assumes translation via
the nasm (Netwide Assembler) command
1. ; introduces comment, until the end of source line
2. %define macro_name value defines a macro; the value
is substituted whenever the macro name is found
3. section pseudo instruction defines one of various
data segments, or code or stack segment
4. mov is the instruction to move bits to a register or
memory destination on the left, from the source on the right
5. $ pseudo-operator means: current value of location
counter
6. int 80h instruction is an x86 instruction that uses
GPRs to determine what to do
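Item 5 is what makes a length computation such as msglen equ $ - message work: a label records the location counter at its definition, and $ is the counter at the point of use. The Python sketch below imitates that bookkeeping; the function name layout and its list-of-pairs input format are assumptions for illustration, not nasm internals.

```python
def layout(definitions):
    """Assign offsets to labelled byte strings, tracking a location counter."""
    symbols = {}
    location = 0                       # plays the role of nasm's $
    for label, data in definitions:
        symbols[label] = location      # a label takes the current location
        location += len(data)          # emitting bytes advances the counter
    return symbols, location


if __name__ == "__main__":
    symbols, here = layout([("message", b"Hello CCUT class")])
    msglen = here - symbols["message"]  # mirrors: msglen equ $ - message
    print(msglen)
```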
; Asm:  Netwide Assembler (nasm)
; Note: uses Linux system calls, not Microsoft!
; Define convenient symbolic names for Linux system calls
%define __NR_exit    1     ; symbolic names system dependent
%define __NR_write   4     ; 4 for output under Linux
%define STDOUT_FILE  1     ; 1 for standard out under Linux

section .data              ; Other section names: .rodata and .bss
                           ; have specific, and distinct, meanings
message: db "Hello CCUT class"
msglen:  equ $ - message   ; # bytes in message

section .text              ; All executable code is in the .text section
global _start              ; required to announce name "_start" for linker
_start:                    ; used by linker; similar to "main()" in C
; Display the string on stdout
mov eax, __NR_write    ; system call number for write
mov ebx, STDOUT_FILE   ; write string to stdout
mov ecx, message       ; address of string
mov edx, msglen        ; number of bytes to write
int 80h                ; call Linux

; Exit the program
mov eax, __NR_exit     ; system call number for exit
mov ebx, 0             ; exit status 0: "success"
int 80h                ; call Linux
Summary
 Comments introduced by ;
 .model pseudo instruction tells assembler: which
memory model to be used, pulls in predefined macros
 .stack is one such macro; tells assembler: We use a
stack of size 100h (i.e. 100 hexadecimal)
 Leftmost column used for optional labels
 Labels are symbolic names you can refer to in the
source
 The next column used for command or pseudo
commands; but if no label used, first may be
command
 data_s is symbolic name chosen to name a data
segment; since ‘code’ not specified here, this must be
data
 Define string literal by embedding it between pair of
double quotes, e.g. "Hello ECE class 371 at CCUT"
 The ends pseudo instruction says: end of segment;
a segment may be reopened any number of times
 The assume pseudo instruction tells the assembler
which segments the cs and ds registers are assumed
to address; it does not load the registers
 The segment ‘code’ pseudo instruction defines the
code segment
 mov is instruction to move bits to/from register,
memory or (if source) literal
 The offset operator, as in mov dx, offset message,
breaks an address into a segment/offset pair and uses
the offset
 The int 21h instruction is an x86 interrupt (really a
system call) that uses other registers to determine
what to do
 The end start pseudo instruction says: start
execution at the address of the label with the
symbolic name start
Appendix:
Some Definitions
Definitions
Address
 Identity of any one of the distinguishable memory
units, e.g. bytes or words
 On the x86 architecture a logical address is a pair
seg:offset, which is translated by the hardware into
a linear address
 The segment and the offset are 16 bits long each in
real mode
 The machine address, called a linear address, is 20
bits long for the original x86 microprocessor
 Since the 1980s Intel has produced the more famous
32-bit version of its x86 µP, and since the 2000s, the
64-bit version has become common
Assembler
 A source to object translator, reading relocatable,
abstract, machine-specific source programs,
translating them into binary object code
 After linking, the binary code is executable
Binary Object
 These are strings of bits, which, when interpreted by
the target machine, are legal machine operations plus
associated memory references
 Jointly, these bit strings represent executable
programs
Code Segment
 Subsection of an architecture’s memory which holds
executable instructions with possibly embedded,
immediate operands
 On the x86 microprocessors, the start address of the
code segment is identified by the cs register
 A complete program comprises more than code
segments
Data Segment
 Subsection of an architecture’s memory which holds
data to be manipulated
 Like any segment, a data segment is identified by a
segment register, holding its start address
 Such an address must be evenly divisible by 16 on
the x86 family processors
 Such aligned addresses are also called paragraphs
Offset
 Byte distance of a named object (addressable unit)
from the beginning of an area that encompasses the
name
Relocation, Relocatability
 Ability of digital computer information to be placed in
any location of memory
 For example, referring to data (or object code) by
offsets relative to some start address allows the code
to be placed anywhere, as long as the respective start
address is always added at execution time
Segment
 Subsection of memory
 A segment is identified by a segment register and
holds either code, data, or stack space
Stack
 Data structure holding data
 The amount of data varies over time:
 Increase of data is accomplished through an
operation called pushing, decreases via popping
 A stack segment register points to the beginning of
the stack, and the stack pointer register to the current
top
 The top varies frequently during execution
Top of Stack
 Selects the element on the stack that is immediately
accessible, AKA addressable
 That element is said to be “at the top”
 There may be other elements in the stack as well,
hidden by the top element
 Additional elements are created by pushing, and
elements are removed by popping
 If the stack is empty, and the top element is accessed,
an error occurs
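The stack definitions above map directly onto a small data structure. The Python sketch below is an abstract model, assuming a list as backing store; the class name Stack and the choice of IndexError for the empty-stack error are illustrative assumptions, and a hardware stack would instead move a stack pointer within the stack segment.

```python
class Stack:
    """Minimal stack: only the top element is directly accessible."""

    def __init__(self):
        self._items = []

    def push(self, value):
        """Add an element; it becomes the new top."""
        self._items.append(value)

    def pop(self):
        """Remove and return the top element."""
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def top(self):
        """Return the top element without removing it."""
        if not self._items:
            raise IndexError("top of empty stack")
        return self._items[-1]


if __name__ == "__main__":
    s = Stack()
    s.push(10)
    s.push(20)             # 20 now hides 10
    print(s.top())         # top element
    print(s.pop(), s.pop())
```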