Computer Architecture and System Software
Lecture 04: Floating Points & Intro to Assembly
Instructor: Rob Bergen
Applied Computer Science
University of Winnipeg
Announcements

Assignment 1 due today

Solutions to be posted next week

DOSBOX materials available online

Syllabus available online
Final Thoughts on Ints, Floats

- When do we want to use ints vs. floats?
  - Ints
    - Require less space (bits) but have a much smaller range
    - Can overflow
  - Floats
    - Have a larger range per bit, but require more space
    - Convenient for very small or very large numbers
    - Can represent fractions, infinity, and NaN
    - May produce rounding errors
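The trade-offs above can be seen in a few lines of Python. This is only an illustrative sketch: `wrap_int16` is a hypothetical helper that mimics a signed 16-bit register, and the `struct` round-trips are one way to force a value through 32-bit float storage.

```python
import struct

def wrap_int16(value):
    """Interpret the low 16 bits of value as a signed two's complement int."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

# Integer overflow: 32767 is the largest signed 16-bit value.
overflowed = wrap_int16(32767 + 1)

# Rounding error: 0.1 and 0.2 have no exact binary representation.
float_sum = 0.1 + 0.2

# Range: a 32-bit float can hold 1e38, far beyond any 32-bit int ...
big = struct.unpack("<f", struct.pack("<f", 1e38))[0]
# ... but it cannot tell 16777217 (2**24 + 1) apart from 16777216.
rounded = struct.unpack("<f", struct.pack("<f", 16777217.0))[0]

print(overflowed)
print(float_sum == 0.3)
print(big > 2**31)
print(rounded)
```

The same 32 bits buy either exact integers over a limited range or approximate values over a huge range.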
Assembly Language Background
High-Level Language


- High-level languages (HLLs) shield us from machine-level complexity
- HLL programs are
  - Easier to write
  - Less error prone
  - Readable
  - Portable
  - Efficient
High-Level Languages

- HLL relieves the programmer from
  - Writing in machine-level instructions
  - Managing memory
  - Rewriting programs to run on different machines
Compiler

- Main work is done by the compiler
  - Translate HLL program to assembly code (hardest part)
  - Assemble assembly code to machine code (object code)
  - Link object code with runtime library
  - Yields an executable
- Assembly code:
  - VERY close to machine code
  - Human-readable encoding of machine code
  - Each instruction corresponds to a machine instruction
  - Not readable by machine (needs to be assembled)
Motivation for Assembly Code

- Important:
  - Understand how computers execute code
  - Understand what optimizations the compiler performs
  - Reverse engineer a piece of software
  - Improve efficiency of a program
  - Write programs that have direct control over system hardware
    - Example: writing a device driver for a new scanner
  - If you are developing system software, you will need to know assembly language
Instruction Set


- The set of instructions (and their encodings) that a processor can execute
- We will use the IA32 (aka i386/i486/Pentium) instruction set
  - Used by the majority of computers
  - Instruction set for Intel's most commercially successful microprocessors
  - Easily accessible
  - Large number of instructions and modes
  - Gives a good view of how computers work
Program Encoding

- Recall the compilation process:
  Editor    -> hello.cpp   Source program (text)
  Compiler  -> hello.asm   Assembly program (text)
  Assembler -> hello.obj   Relocatable object program (binary)
  Linker    -> hello.exe   Executable program (binary)
- hello.asm: assembly (human readable, not executable)
- hello.obj: machine code (binary, executable once linked)
- Assembly and machine code are different encodings of the same instructions
Machine-Level (ML) Code



- Both assembly and machine code are ML code
- How does ML differ from High Level (HL)?
- HL view:
  - Objects of different types are declared and allocated memory
  - Memory is accessed by accessing the objects
  - Local variables used to store temporary values
  - CPU state is hidden from programmer
  - Object type defines what operations can be performed
  - Single statement can perform a complex task
    - e.g. if (a+b < c) someFunc(d);
Machine-Level (ML) Code

- ML view:
  - CPU state is visible and accessible to the program
  - All CPU state is stored in its register file, i.e. the memory on the CPU
  - CPU has:
    - Registers
    - ALU
    - Control Unit
    - Execution Unit
    - etc.
x86 and IA-32 Instruction Set Architectures
History



- Intel began work on its first microprocessor in 1969 (the 4-bit 4004, released in 1971)
- Work on early processors led to the development of the Intel Architecture (IA)
- First processor in the IA family was the 8086 (1978)
  - 20-bit address bus and 16-bit data bus
- First 32-bit IA processor was the 80386 (1985)
  - 32-bit address bus and 32-bit data bus
- Pentium introduced in 1993
  - Not named 80586 because numbers can't be trademarked
  - 32-bit address bus and 64-bit data bus
Source (UofW electronic resource): Chapter 3 in S. Dandamudi, Introduction to Assembly Language Programming
for Pentium and RISC Processors, Springer, 2005.
8086 / 8088 Architecture

- The 8086 (1978) and 8088 (1979) are the first CPUs in the x86 series
  - 8088 is identical to 8086 except it has an 8-bit data bus instead of a 16-bit data bus (more economical)
- Later processors in this series have similar but extended architectures
- Need to start at the beginning to understand a processor like the Pentium 4
8086 / 8088 Architecture

- Two main units
  - Bus interface unit (BIU)
  - Execution unit (EU)
- 8086
  - Data bus size: 16 bits
  - Instruction queue: 6 bytes
  - 20-bit address bus
General Purpose Registers


- Used for arithmetic, logical, and other operations
- Can be configured as:
  - Four 16-bit registers (AX, BX, CX, DX)
  - Eight 8-bit registers (AH, AL, BH, BL, CH, CL, DH, DL)
Index and Pointer Registers
- BP, SI, DI, SP
  - Can only be used as 16-bit registers
  - SI, DI used for addressing and string operations
  - BP, SP are used for maintaining the stack

Segment Registers


- 16-bit registers labelled DS, CS, SS, ES
- Used to indicate the start of different segments in memory
Flag Register


- 16-bit register containing nine 1-bit flags
- Gives status of the last instruction and the processor
Flag Register




- Overflow flag: Set if a signed overflow occurs
- Direction flag: Used to indicate direction of processing for string manipulations (0 = forward, 1 = backward)
- Interrupt enable flag: Setting this enables interrupts
- Trace (trap) flag: Used for debugging. When set, the processor executes only 1 instruction, then interrupts to call the debugger
- Sign flag: Set to the MSB of the result of an operation
- Zero flag: Set if the result is zero
- Auxiliary carry flag: Set if there was a carry from or borrow to bits 0-3 in the AL register
- Parity flag: Set if the number of 1's in the result is even
- Carry flag: Set if there was a carry from or borrow to the MSB during the last calculation
Image source: http://www.electronics.dit.ie/staff/tscarff/8086_registers/8086_registers.html
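To make the flag definitions concrete, here is a small Python model of an 8-bit ADD. It is a sketch rather than a full emulator: `add8_flags` is a hypothetical helper, and it only covers the six status flags that an addition affects.

```python
def add8_flags(a, b):
    """Add two 8-bit values and report the 8086 status flags for the result."""
    a &= 0xFF
    b &= 0xFF
    raw = a + b
    result = raw & 0xFF
    flags = {
        "CF": raw > 0xFF,                              # carry out of bit 7
        "AF": (a & 0x0F) + (b & 0x0F) > 0x0F,          # carry out of bit 3
        "ZF": result == 0,                             # result is zero
        "SF": bool(result & 0x80),                     # copy of the MSB
        "PF": bin(result).count("1") % 2 == 0,         # even number of 1 bits
        "OF": bool(~(a ^ b) & (a ^ result) & 0x80),    # signed overflow
    }
    return result, flags

for a, b in [(0x7F, 0x01), (0xFF, 0x01)]:
    result, flags = add8_flags(a, b)
    set_flags = [name for name, value in flags.items() if value]
    print(f"{a:02X}h + {b:02X}h = {result:02X}h, flags set: {set_flags}")
```

The two test cases separate signed overflow (7Fh + 01h sets OF but not CF) from unsigned carry (FFh + 01h sets CF but not OF).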
Programming Tools
DOSBox


Required for those of you that have 64-bit machines
and want to work on assignments at home
Available at: http://www.dosbox.com/
8086 Emulator for Mac
- DOSBox works on a Mac
- Other options:
  - http://i8086emu.sourceforge.net/
  - http://wiki.qemu.org/Main_Page
- Again, assignments that you hand in MUST COMPILE in DOSBOX. Solutions posted in the future will only compile for DOSBOX
Using DOSBOX


- Run DOSBOX
- Choose your working folder by 'mounting' your virtual C:\
  - Example: mount c "C:\Program Files\Assembly"
  - Directory in quotation marks is where you will be storing your source code
- Once the C:\ is mounted, type C: to switch to your directory in DOSBOX
DOS commands



- You will need to know simple DOS commands to navigate your folders in the command line
- dir /p – display contents of current directory (folders and files), one page at a time
- cd FolderName – change directory to the folder called 'FolderName' in your current directory
  - Or type the full directory path in place of the folder name (C:\Windows\etc...)
DOS commands



- cd .. – go up a level in the folder hierarchy
- Prog.exe – run program Prog.exe (must be in your current directory)
- help – displays a list of usable commands
Shortcuts


- Up arrow – previous command
- Tab – scrolls through available files/directories that begin with whatever you have started typing
  - Use to avoid wasting time typing long directory paths
Creating an Assembly file



- Create a text file
- Paste any program template you have into the file
- Change file name extension to .asm
  - You may have to turn on the option to show file extensions on your computer
  - Google how to do this if necessary
Programming Tools



- Course website has links to the websites and tools you will need (these will also be e-mailed)
- After you have created an .asm file, you'll need to assemble and link your program before you can run it
- This is done within DOSBOX from your mounted C:\
Programming Tools

- Compile, link, and run program
  - masm programName
    - Include the file extension above
    - Hit enter 4 times
  - link programName
    - DO NOT include the file extension above
    - Hit enter 4 times
  - Execute program
- To be demonstrated in class
Debugging

- Debug programName.exe
  - Type ? for command listing
- Commands that may be useful
  - dump
    - Dumps a section of memory
  - go
    - Jumps to a line of code in the program
    - Is like running until a breakpoint in a program
    - Can be used to skip over interrupt routines
Debugging
  - trace
    - Allows you to step through the code one line at a time, allowing you to see the CPU state at any point
  - unassemble
    - Displays the disassembly starting from the next instruction to be executed
- Here is a guide that has examples which I found more useful than the help tool: http://kipirvine.com/asm/debug/debug_tutorial.pdf
Notepad++

- Recommend Notepad++ as assembly editor for this course
  - Highlights keywords
  - Good organization/formatting
  - Does not require an installer
- Available at: http://notepad-plus-plus.org/
Addressing Modes
x86 Addressing Modes

- Specify the method for finding the memory address of an operand
  - e.g. using information held in registers and/or constants contained within a machine instruction
- Specifies operands in an assembly language program and is completely architecture dependent
Example


- MOV AH, 08h
  - h stands for hexadecimal
- MOV copies operand 2 into operand 1
  - Source (data value 08h) is copied into destination (register AH)
- Data 08h is provided in the instruction
  - Called immediate addressing mode
x86 Addressing Modes

- There are three basic types of operands:
  - Immediate
    - Constant integer (8, 16, or 32 bits)
    - Constant value is stored within the instruction
  - Register
    - Name of a register is specified
    - Register number is encoded within the instruction
  - Memory
    - Reference to a location in memory
    - Memory address is encoded within the instruction, or
    - Register holds the address of a memory location
x86 Addressing Modes

- The Intel 8086 has about 9 data addressing modes:
  - Immediate, register, direct, register indirect, based, indexed, based indexed, string, and port addressing
  - Definition names vary across books
- Register and immediate modes do not need to access the external bus (faster)
- The syntax of the operand field determines the addressing mode
Immediate Addressing Mode

- MOV CX, 10
- Immediate data is coded directly in the instruction's machine code
  - Note: CX contains 0Ah because there is no h after 10
- The data is put in the operand field
- The constant in the operand field may be of byte or word length
- Some assemblers need a '#' before the constant
Immediate Addressing Mode Cont.

- The constant may be in hex, decimal, binary, or text
  - Default format is decimal
- Examples
  - MOV CX, 65
  - MOV CX, 41h
  - MOV CX, 01000001b
  - MOV CX, 'A'
- Advantage: No memory reference
- Disadvantage: Size of number is restricted to size of address field
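The four MOV examples above all load the same value. The Python literals below are stand-ins for the MASM notations (Python writes hex as `0x41` and binary as `0b01000001`), so this is a check of the arithmetic only, not of assembler syntax.

```python
# MASM-style literals from the slide and their Python equivalents.
decimal_form = 65            # MOV CX, 65
hex_form = 0x41              # MOV CX, 41h
binary_form = 0b01000001     # MOV CX, 01000001b
char_form = ord("A")         # MOV CX, 'A'

values = {decimal_form, hex_form, binary_form, char_form}
print(values)                # a single element: all four are the same number

# Without the h suffix, 10 is decimal, so CX would hold 0Ah, not 10h.
print(hex(10), hex(0x10))
```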
Direct Addressing Mode


- MOV DS:[1234h], AL
  - Copies AL using offset 1234h
- The operand is stored in a MEMORY location
  - DS specifies the data segment register
  - An offset address is coded directly in the instruction
  - The offset combined with DS forms the address where the operand is located
Direct Addressing Mode Cont.



- Ordinarily, DS: is assumed unless explicitly overridden using a colon (e.g. ES:[001h])
- Advantage: Single memory reference to access data
- Disadvantage: Limited address space
Direct Addressing Mode Cont.

- Examples:
  - Assume DS=1000h, ES=2000h (so the segments start at 10000h and 20000h), FRED=4567h
  - MOV AX, [20h] ; load the contents of addresses 10020h and 10021h into AL and AH respectively
  - MOV DS:[1234h], AL ; copy AL to address 11234h
  - MOV ES:[1234h], AL ; copy AL to address 21234h
  - MOV FRED, AL ; copy AL to address 14567h
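The addresses in these examples follow from the 8086 rule that a physical address is the 16-bit segment register value shifted left by four bits (multiplied by 16) plus the 16-bit offset. The sketch below assumes the segment registers hold 1000h and 2000h, so the segments begin at 10000h and 20000h; `physical_address` is a hypothetical helper name.

```python
def physical_address(segment, offset):
    """20-bit 8086 address: segment register * 16 + 16-bit offset."""
    return ((segment << 4) + (offset & 0xFFFF)) & 0xFFFFF

DS, ES = 0x1000, 0x2000   # register values; segment bases are 10000h and 20000h
FRED = 0x4567             # offset of the variable FRED within the data segment

examples = {
    "MOV AX, [20h]": physical_address(DS, 0x20),
    "MOV DS:[1234h], AL": physical_address(DS, 0x1234),
    "MOV ES:[1234h], AL": physical_address(ES, 0x1234),
    "MOV FRED, AL": physical_address(DS, FRED),
}

for text, address in examples.items():
    print(f"{text:20} -> {address:05X}h")
```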
Register Addressing Mode




- Instruction gets its source data from a register
- Data can be 8, 16, or 32 bits in length
- Data resulting from the operation is stored in another register
- Advantages:
  - Only a small address field is needed in the instruction
  - No memory references are required
- Disadvantage: address space is very limited
Register Addressing Mode Cont.

- Examples:
  - MOV AX, BX ; copy the 16-bit content of BX to AX
  - MOV AL, BL ; copy the 8-bit content of BL to AL
  - MOV SI, DI ; copy DI into SI
  - MOV DS, AX ; copy AX into DS
- The following register-to-register transfers are not permitted:
  - MOV BL, BX ; mixed sizes
  - MOV CS, AX ; CS cannot be the destination
  - MOV ES, DS ; segment register to segment register forbidden
Register Indirect Addressing Mode
- MOV CX, [BX]
  - Copies a word from the memory location specified by BX
- Uses a register instead of a constant to specify the 16-bit offset address of the operand
- Offset address can be in any of the following registers: BP, BX, DI, SI
- The [ ] is needed to denote register indirect addressing mode
- The DS register is the default segment register (except with BP, which uses SS by default)
Register Indirect Addressing Mode



- In cases of ambiguity, assemblers need the BYTE PTR or WORD PTR directives to indicate the size of the data addressed by the memory pointer
- e.g. MOV [DI], 10h is ambiguous since the assembler does not know whether to save 10h to memory as a byte or a word
- Instead, use the following:
  - MOV BYTE PTR [DI], 10h ; save 10h to memory
  - MOV WORD PTR [DI], 10h ; save 0010h to memory
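The difference between the two forms is how many bytes of memory get overwritten. A small Python model, using a hypothetical `store` helper and a four-byte buffer pre-filled with FFh so the untouched bytes are visible, might look like this:

```python
def store(memory, address, value, size):
    """Write value into memory as `size` bytes, low byte first (little-endian)."""
    memory[address:address + size] = value.to_bytes(size, "little")

byte_case = bytearray([0xFF] * 4)
word_case = bytearray([0xFF] * 4)

store(byte_case, 0, 0x10, 1)   # MOV BYTE PTR [DI], 10h
store(word_case, 0, 0x10, 2)   # MOV WORD PTR [DI], 10h

print(byte_case.hex(" "))      # only the first byte changes
print(word_case.hex(" "))      # the first two bytes change (10h, then 00h)
```

The word store writes the low byte 10h first and the high byte 00h second, because the 8086 is little-endian.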
Register Indirect Addressing Mode


- Register indirect addressing is commonly used to access a table of data in memory
- Examples:
  - Assume BX=0222h, DS=1000h, SS=2000h (so the segments start at 10000h and 20000h), BP=0111h
  - MOV CX, [BX] ; copy a word from addresses 10222h and 10223h to CX
  - MOV [BP], DL ; copy a byte from register DL to address 20111h
- Remember: Indirect addressing using BP defaults to SS
Based Addressing Mode




- Operand is located at the address given by adding an 8- or 16-bit displacement to either BX or BP and combining the result with a segment register
- Displacement must be specified in the operand field
  - Interpreted as a signed two's complement value
- Examples
  - Assume DS=1000h, SS=2000h, BP=0222h, BX=0111h
  - MOV AX, [BP-2] ; copy the contents of 20220h and 20221h to AX
  - MOV [BX+777h], AX ; copy AL to 10888h and AH to 10889h
Indexed Addressing Mode



- Similar to based addressing except the index registers (SI or DI) must be used instead
- Operand is located at the address given by adding an 8- or 16-bit displacement to either SI or DI and combining the result with a segment register
- Examples
  - Assume DS=1000h (so the data segment starts at 10000h), SI=222h, DI=111h
  - MOV [DI-1], BL ; store the content of BL to 10110h
  - MOV BX, [SI+1000h] ; load BL with the contents of 11222h and BH with the contents of 11223h
Indexed Addressing Mode Cont.



- Based and indexed addressing are aka REGISTER RELATIVE addressing
- Other syntax may be permitted to indicate the displacement
  - e.g. the following examples are all equivalent
  - And they all result in the same assembled binary code
- Examples
  - Assume FRED is a constant defined in the assembly code
  - MOV [DI+FRED], BL
  - MOV [DI]+FRED, BL
  - MOV FRED[DI], BL
Based-Index-Relative Addressing


- The base and index registers are added to give the segment offset of where the operand is located
- The base register (either BX or BP) is added to an index register (DI or SI) as positive integers only
  - Each register lies in the range 0 to 65535
- By default, the segment address is derived from DS, except when BP is used, in which case it is derived from SS
- A signed displacement may also be included to calculate the offset
Based-Index-Relative Addressing

- Examples
  - Assume SS=1000h (so the stack segment starts at 10000h), SI=3333h, BP=2222h
  - MOV AX, [SI+BP] ; load the contents of 15555h and 15556h to AL and AH respectively
  - MOV AX, [SI+BP+1111h] ; load the contents of 16666h and 16667h to AL and AH respectively
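The based-indexed examples can be checked with a short Python sketch of the effective-address calculation. The helper names are hypothetical, and the model assumes SS holds 1000h so that the stack segment begins at 10000h. The offset wraps at 16 bits, and the default segment switches to SS whenever BP is the base register.

```python
def effective_address(regs, base=None, index=None, disp=0):
    """Return (default segment name, 16-bit offset) for [base + index + disp]."""
    offset = disp
    if base is not None:
        offset += regs[base]
    if index is not None:
        offset += regs[index]
    segment = "SS" if base == "BP" else "DS"   # BP-based accesses default to SS
    return segment, offset & 0xFFFF            # offsets wrap around at 64 KB

def physical(regs, segment, offset):
    """Combine a segment register with an offset into a 20-bit address."""
    return ((regs[segment] << 4) + offset) & 0xFFFFF

regs = {"SS": 0x1000, "SI": 0x3333, "BP": 0x2222}

# MOV AX, [SI+BP]
seg, off = effective_address(regs, base="BP", index="SI")
first = physical(regs, seg, off)

# MOV AX, [SI+BP+1111h]
seg, off = effective_address(regs, base="BP", index="SI", disp=0x1111)
second = physical(regs, seg, off)

print(f"{first:05X}h {second:05X}h")
```

The same function also covers the based and indexed modes by leaving out `index` or `base`.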
String Addressing Mode






- A string is a series of bytes or words in sequential memory locations
- String instructions do not use any of the previous addressing modes
- Strings may be up to 64 KB in length
- String addressing mode uses the SI, DI, DS, and ES registers
- String instructions assume SI points to the first byte of the string to be processed, and DI points to the first byte of the destination string
- The use of a register may be implicit in the instruction
String Addressing Mode

- Example
  - Assume DS=1000h, ES=2000h (so the segments start at 10000h and 20000h), SI=10h, DI=20h
  - MOVSB ; move string byte from 10010h to 20020h
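A minimal Python model of MOVSB shows the implicit register use: the source is DS:SI, the destination is ES:DI, and both index registers step by one afterwards (forward when the direction flag is 0). The `movsb` helper, the 1 MB `bytearray`, and the test byte are all illustrative assumptions; the register values assume DS=1000h and ES=2000h.

```python
def movsb(memory, regs, direction_flag=0):
    """Copy one byte from DS:SI to ES:DI, then step SI and DI."""
    src = ((regs["DS"] << 4) + regs["SI"]) & 0xFFFFF
    dst = ((regs["ES"] << 4) + regs["DI"]) & 0xFFFFF
    memory[dst] = memory[src]
    step = -1 if direction_flag else 1         # DF=0 forward, DF=1 backward
    regs["SI"] = (regs["SI"] + step) & 0xFFFF
    regs["DI"] = (regs["DI"] + step) & 0xFFFF

memory = bytearray(1 << 20)                    # 1 MB of 8086 address space
memory[0x10010] = ord("A")                     # hypothetical source byte
regs = {"DS": 0x1000, "ES": 0x2000, "SI": 0x10, "DI": 0x20}

movsb(memory, regs)
print(chr(memory[0x20020]), hex(regs["SI"]), hex(regs["DI"]))
```

Because SI and DI advance automatically, repeating the instruction copies a whole string without any explicit address arithmetic.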
Port Addressing Mode



- 8086 has a separate input/output space
- Up to 65536 I/O ports are available
- The I/O ports may be addressed by a byte-sized constant
  - Limited to I/O ports in the range 0 to 255
  - IN AL, 40h ; put the content of I/O port 40h into AL
  - OUT 80h, AL ; send the contents of AL to I/O port 80h
- The I/O ports may be addressed using a register
  - Full range of 65536 ports is accessible
  - IN AL, DX ; load AL with the byte from the port address given by DX
  - OUT DX, AX ; send the word in AX to the port address given by DX
Summary of Addressing Modes

- Operand needed for an instruction may be located:
  - Immediately in the operand field, e.g. MOV AX, 1234h
  - In a register (register addressing), e.g. MOV DS, AX
  - In memory at an offset specified by one of the following (disp is a constant):
    - [disp] (direct)
    - [BX], [BP], [SI], [DI] (register indirect)
    - [BX+disp], [BP+disp], [SI+disp], [DI+disp] (register relative)
    - [BX+SI+disp], [BX+DI+disp], [BP+SI+disp], [BP+DI+disp] (based indexed)
    - The segment address is in DS (by default, except when BP is used)
  - In memory locations given implicitly by string instructions
  - At input/output ports specified by a register or a constant
Instruction Set
Assembly Language Statements

- Three types of statements
  - Directives or pseudo-ops
    - Direct the assembler during the assembly process
    - Used for variable declaration and storage allocation
    - Non-executable and do not generate any machine language
    - e.g. segment, db, dw
  - Instructions
    - Instruct the processor to perform a task
    - Contain an operation code (opcode)
    - Cause the assembler to generate machine language
    - e.g. mov, add
  - Macros
    - Permit the assembly language programmer to name a group of statements and refer to the group by the macro name
    - During the assembly process, each macro is replaced by the group of statements that it represents and assembled in place
Directives: Data Allocation
[var-name] define-directive initial-value [, initial-value] …
- The define directive takes one of five basic forms:
  DB   Define Byte         ; allocates 1 byte
  DW   Define Word         ; allocates 2 bytes
  DD   Define Doubleword   ; allocates 4 bytes
  DQ   Define Quadword     ; allocates 8 bytes
  DT   Define Ten Bytes    ; allocates 10 bytes
- Examples:
  var1 DB 'y'
  var2 DB 79h
  var3 DB 11110010B
Directives: Data Allocation

- Another example:
  total DD 542803535
  - This would allocate four contiguous bytes of memory and initialize them to 542803535 (205A864Fh):
    Address:   x    x+1   x+2   x+3
    Contents:  4F   86    5A    20
Directives: Data Allocation

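- Or:
  total DB 20h,5Ah,86h,4Fh
  - This would allocate four contiguous bytes of memory holding 20h, 5Ah, 86h, 4Fh in the order listed:
    Address:   x    x+1   x+2   x+3
    Contents:  20   5A    86    4F
  - Note that the byte order is the reverse of the DD example: read back as a little-endian doubleword, these bytes form 4F865A20h, not 205A864Fh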
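The byte layouts of the DD and DB definitions can be reproduced with Python's `struct` module, where the `"<I"` format means a little-endian unsigned 32-bit value. This is a sketch for checking the memory contents, not something the assembler itself does.

```python
import struct

# total DD 542803535 -> the assembler stores the low-order byte first.
dd_bytes = struct.pack("<I", 542803535)

# total DB 20h,5Ah,86h,4Fh -> bytes are stored exactly in the order listed.
db_bytes = bytes([0x20, 0x5A, 0x86, 0x4F])

print(dd_bytes.hex(" "))                        # bytes at x, x+1, x+2, x+3
print(db_bytes.hex(" "))
print(hex(struct.unpack("<I", db_bytes)[0]))    # DB bytes read back as a dword
```

The two definitions occupy the same amount of memory, but their byte order is reversed, so they are not interchangeable when the data is later read as a doubleword.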
Directives: Data Allocation

- Range of numeric operands
  - DB: -2^7 to 2^8 – 1
  - DW: -2^15 to 2^16 – 1
  - DD: -2^31 to 2^32 – 1 or a short floating-point number (32 bits)
  - DQ: -2^63 to 2^64 – 1 or a long floating-point number
Directives: Data Allocation

- Uninitialized data and multiple initializations
  - Use the DUP operator
  - The number after DB gives the number of bytes, and DUP(X) fills those bytes with X
- Examples:
  text  DB 10 DUP('W') ; initializes 10 bytes to 'W'
  test  DB 1 DUP(?)    ; reserves 1 uninitialized byte
  test2 DB 20 DUP(?)   ; reserves 20 uninitialized bytes
Directives: Data Allocation

- Multiple definitions:
  sort   DB   'y'          ; ASCII of y = 79h
  value  DW   25159        ; 25159 = 6247h
  total  DD   542803535    ; 542803535 = 205A864Fh
- Memory allocation
  Address:   x    x+1   x+2   x+3   x+4   x+5   x+6
  Contents:  79   47    62    4F    86    5A    20
- More examples:
  message DB 'BYE',0Dh,0Ah
  OR
  message DB 'B'
          DB 'Y'
          DB 'E'
          DB 0Dh
          DB 0Ah
Directives: Defining Constants

- name EQU expression
  - Assigns the result of the expression to name.
- Example:
  NUM_OF_STUDENTS EQU 90
  …
  MOV AX, NUM_OF_STUDENTS

8086 Instructions

- Instructions can be of various types
  - Data moving instructions
  - Arithmetic – add, subtract, increment, decrement, etc.
  - Control transfer – conditional, unconditional, call subroutine
  - Logic – AND, OR, XOR, shift/rotate and test
  - String manipulation – load, store, move, compare
  - Input/Output instructions
  - Other – setting/clearing flag bits, stack operations, etc.
MOV
mov destination, source
  mov register, register
  mov register, immediate
  mov memory, immediate
  mov register, memory
  mov memory, register
- Memory to memory forbidden.
- Examples
  mov [response], bh
  mov dx, [table1]
  mov [name1+4], 'k'
Data transfer instructions
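lea register, memory_address
- Loads the offset (effective address) of the memory operand into the register, rather than the data stored there
- Example
  lea DX, mystring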
Data Transfer Instructions

- XCHG exchanges 8- or 16-bit source and destination operands
- Examples
  xchg ax, dx
  xchg [response], cl
  xchg [total], dx
- As in the mov instruction, both operands cannot be located in memory. Note that this restriction is applicable to most instructions
Data Transfer Instructions




- The XLATB instruction can be used to perform character translation
- To use this instruction, the BX register must be loaded with the starting address of the translation table and AL must contain an index value into the table
- The xlatb instruction adds the contents of AL to BX and reads the byte at the resulting address. This byte replaces the index value in the AL register.
- Since the 8-bit AL register provides the index into the translation table, the number of entries in the table is limited to 256.
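The table lookup performed by xlatb can be modelled in a few lines of Python. Everything specific here is an assumption made for illustration: the `xlatb` helper, the DS and BX values, and the 16-entry table that translates a hex digit value (0 to 15) into its ASCII character.

```python
def xlatb(memory, regs):
    """Replace AL with the byte at DS:[BX + AL], as xlatb does."""
    offset = (regs["BX"] + regs["AL"]) & 0xFFFF
    address = ((regs["DS"] << 4) + offset) & 0xFFFFF
    regs["AL"] = memory[address]

memory = bytearray(1 << 20)
regs = {"DS": 0x1000, "BX": 0x0100, "AL": 0x0B}   # hypothetical register values

# Hypothetical translation table placed at DS:BX.
table = b"0123456789ABCDEF"
base = (regs["DS"] << 4) + regs["BX"]
memory[base:base + len(table)] = table

xlatb(memory, regs)
print(chr(regs["AL"]))   # the index 0Bh has been replaced by its table entry
```

Since AL is only 8 bits wide, the index can never exceed 255, which is why the table is limited to 256 entries.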
Arithmetic Instructions

- INC and DEC used to either increment or decrement the operand by one
  inc destination
  dec destination
- Examples
  inc bx ; increment bx register by one
  dec dl ; decrement 8-bit register
Arithmetic Instructions


- ADD used to add two 8- or 16-bit operands
- As with the mov instruction, add can also take the five basic forms depending on how the two operands are specified. The semantics of the add instruction are
  add destination, source => destination = destination + source

Arithmetic Instructions

- SUB used to subtract two 8- or 16-bit operands
  sub destination, source => destination = destination - source
Arithmetic Instructions





- CMP used to compare two 8-, 16-, or 32-bit numbers
  cmp operand1, operand2 => operand1 - operand2
- The cmp instruction performs the same operation as the sub instruction except that the result of the subtraction is not saved
- Thus, cmp does not disturb the source and destination operands
- After the subtraction operation, flags are set accordingly
Arithmetic Instructions




- MUL used to perform unsigned multiplication of a number with the AL register
  mul source
- The source operand can be in a general-purpose register or in memory. Immediate operand specification is not allowed
- Example:
  mov al, 10
  mov dl, 25
  mul dl      ; AX = 00FAh (250)
Arithmetic Instructions
- MUL is used to perform unsigned multiplication of a number with the AL register
- Example:
  mov al, 2
  mov bl, 80h ; 128
  mul bl      ; AX = 0100h (256)

Arithmetic Instructions

- IMUL used to perform signed multiplication of a number with the AL register
  imul source
- The behaviour of the imul instruction is similar to that of mul
- The only difference to note is that the carry and overflow flags are set if the upper half of the result is not the sign extension of the lower half
  mov al, 2
  mov bl, 0F6h ; -10 (a hex constant must begin with a digit)
  imul bl      ; AX = FFECh (-20)
- Note, here the carry and overflow flags are cleared, because AH contains the sign extension of the AL value
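The 8-bit mul and imul examples can be replayed with a small Python model. The helper names are hypothetical; each function returns the 16-bit AX result together with a single boolean standing for the carry and overflow flags, which these two instructions always set to the same value.

```python
def to_signed8(value):
    """Interpret the low 8 bits of value as a two's complement number."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value

def mul8(al, src):
    """Unsigned AL * src -> AX; CF=OF=1 when AH is non-zero."""
    ax = (al & 0xFF) * (src & 0xFF)
    return ax, (ax >> 8) != 0

def imul8(al, src):
    """Signed AL * src -> AX; CF=OF=1 when AH is not the sign extension of AL."""
    product = to_signed8(al) * to_signed8(src)
    return product & 0xFFFF, not (-128 <= product <= 127)

for name, func, al, src in [
    ("mul", mul8, 10, 25),
    ("mul", mul8, 2, 0x80),
    ("imul", imul8, 2, 0xF6),
]:
    ax, carry = func(al, src)
    print(f"{name:4} AL={al:02X}h src={src:02X}h -> AX={ax:04X}h CF=OF={int(carry)}")
```

Note how the operand 80h is treated as 128 by mul, whereas F6h is treated as -10 by imul: the bit patterns carry no sign, the instruction chooses the interpretation.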
Arithmetic Instructions

- DIV, IDIV used to perform division of the AX register
  div source (unsigned)
  idiv source (signed)
- Division generates two result components: a quotient and a remainder
- In multiplication, by using double-length registers, overflow never occurs. In division, divide overflow is a real possibility.
- The processor generates a special software interrupt when a divide overflow occurs
Arithmetic Instructions

- Example
  mov ax, 251
  mov cl, 12
  div cl
- Leaves 20 (14h) in the al register and 11 (0Bh) in the ah register
- Therefore, ax contains 0B14h after division
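A Python sketch of the 8-bit div makes both the result packing and the overflow condition explicit. `div8` is a hypothetical helper, and the Python exceptions stand in for the divide-error interrupt that the processor raises.

```python
def div8(ax, divisor):
    """Unsigned AX / 8-bit divisor -> new AX with AL=quotient, AH=remainder."""
    divisor &= 0xFF
    if divisor == 0:
        raise ZeroDivisionError("divide error: division by zero")
    quotient, remainder = divmod(ax & 0xFFFF, divisor)
    if quotient > 0xFF:
        raise OverflowError("divide overflow: quotient does not fit in AL")
    return (remainder << 8) | quotient

result = div8(251, 12)
print(f"AX={result:04X}h AL={result & 0xFF} AH={result >> 8}")

# 1000h / 2 = 800h, which cannot fit in the 8-bit AL register.
try:
    div8(0x1000, 2)
except OverflowError as error:
    print(error)
```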
Arithmetic Instructions
Extension to larger operands:
- Both the multiplication and division instructions can be performed using 32-bit and 16-bit operands
  - Or, 64-bit and 32-bit operands on a 32-bit system
Arithmetic Instructions

- Need for sign extension
- Example: Perform signed division 8000h/400h
  - Here the divisor is larger than a byte, so the dividend must be stored in DX:AX
  - However, 8000h fits entirely within AX
  - So, AX needs to be sign extended into DX in order to maintain the sign for division
- To aid sign extension for instructions such as idiv, several instructions are provided:
  cbw ; Convert byte to word
  cwd ; Convert word to doubleword
  cdq ; Convert doubleword to quadword
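The effect of sign extension on the 8000h/400h example can be sketched in Python. The `cwd` and `idiv16` helpers are hypothetical models of the instructions: `cwd` copies the sign bit of AX into every bit of DX, and `idiv16` divides the signed 32-bit value in DX:AX by a signed 16-bit divisor, truncating toward zero.

```python
def cwd(ax):
    """Sign extend AX into DX:AX, returning (DX, AX)."""
    ax &= 0xFFFF
    return (0xFFFF if ax & 0x8000 else 0x0000), ax

def idiv16(dx, ax, divisor):
    """Signed DX:AX / 16-bit divisor -> (quotient in AX, remainder in DX)."""
    dividend = ((dx & 0xFFFF) << 16) | (ax & 0xFFFF)
    if dividend & 0x80000000:
        dividend -= 1 << 32
    divisor &= 0xFFFF
    if divisor & 0x8000:
        divisor -= 1 << 16
    quotient = abs(dividend) // abs(divisor)     # truncate toward zero
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    remainder = dividend - quotient * divisor
    if not -32768 <= quotient <= 32767:
        raise OverflowError("divide overflow: quotient does not fit in AX")
    return quotient & 0xFFFF, remainder & 0xFFFF

# With cwd: DX:AX = FFFF:8000h, i.e. -32768, so the quotient is -32.
dx, ax = cwd(0x8000)
with_extension = idiv16(dx, ax, 0x0400)

# Without cwd (DX left at 0): DX:AX = 0000:8000h, i.e. +32768.
without_extension = idiv16(0x0000, 0x8000, 0x0400)

print(f"{with_extension[0]:04X}h {without_extension[0]:04X}h")
```

Skipping the sign extension silently turns the negative dividend into a positive one, which is why cwd belongs immediately before a 16-bit idiv.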
Lab 04

- Practice integer addition/subtraction/multiplication/division
  - Know how to do this in signed (two's complement) form as well as unsigned
- Practice converting back and forth between fractional decimal and fractional binary numbers