Download Course\SS\SS Unit-2 28-2

Page 1 of 7
Unit-2 Fundamentals of Assembly Language and Assemblers
Elements of Assembly Language Programming
• An assembly language is a machine-dependent, low-level programming language which is specific to a computer system (or a family of computer systems).
• Machine dependent means that different machines can have different assembly instructions for performing various jobs/tasks. A C program written for one machine can usually be recompiled and run on another machine with little or no change, whereas an assembly program written for one machine (architecture) will not work on a machine with a different architecture. For a C program, the C compiler takes care of running the program on different machines (architectures). Hence, we say that C supports portability, or that C is a portable language.
• A low-level programming language is a programming language that provides little or no abstraction from a computer's instruction set architecture. Generally this refers to either machine code or assembly language. The word "low" refers to the small or nonexistent amount of abstraction between the language and machine language; because of this, low-level languages are sometimes described as being "close to the hardware".
Three Basic Features of Assembly Language Programming
Compared to the machine language of a computer system, an assembly language offers three features:
1. Mnemonic Operation Codes
2. Symbolic Operands
3. Data Declarations
1. Mnemonic Operation Codes
• Eliminate the need to memorize numeric operation codes.
• Enable the assembler to provide helpful diagnostics, for example, an indication of misspelt operation codes.
• E.g., MOVEM, MULT, ADD etc.
2. Symbolic Operands
• Can be associated with Instructions (e.g., Label used to Jump/Go to) or Data.
• Can be used as Operands in Statements.
• The assembler performs memory bindings to these names; the programmer need not know
any details of the memory bindings. This makes program modification easier.
• Memory binding is the activity of associating a memory address with a symbolic name.
• E.g., AREG, BREG, RESULT etc.
3. Data Declarations
• Data can be declared using various notations. For example, strings and numbers (integers, floats) can be supplied using notations such as decimal, hexadecimal, octal, etc.
• This avoids manual conversion of constants into their machine representation (i.e., binary).
• E.g. 123, 12.5, “HELLO”, ‘A’ etc.
Description of Simple Assembly Language
Format of an assembly language statement:

[LABEL] <OPCODE> <OPERAND1> [, <OPERAND2> …]
• Parts specified within [ ] are optional.
• An instruction normally has at least one operand and may have more than one. A few instructions, e.g. STOP, need no operand at all.
• The first operand is a register, which can be any one of AREG, BREG, CREG and DREG (for BC it is a condition, as in BC ANY, LOOP).
• The second operand refers to a memory word (mostly a variable name, but it can be a register) using a symbolic name. Optionally, it can include a displacement and/or an index register.
• The operand has the following syntax:
<Symbol_Name> [±Displacement] [(Index_Register_Number)]
• If used, the displacement is added to the memory address specified by Symbol_Name. It may be positive or negative. For example, if A represents (is associated with) memory address 100, then A+5 represents effective memory address 105, A-3 represents effective memory address 97, and so on.
• If an index register is used, its content is added to the memory address represented by Symbol_Name to get the effective memory address. For example, if index register 3 contains the value 10 and symbol A represents memory address 200, then A(3) represents effective memory address 210.
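The two addressing rules above can be checked with a short Python sketch. The function name `effective_address` and the use of dictionaries for the symbol table and index registers are illustrative assumptions, not part of the simple assembly language itself.

```python
def effective_address(symbol_table, index_registers, symbol,
                      displacement=0, index_register=None):
    """Return the effective address of <symbol>[+/-displacement][(index_register)]."""
    # Start from the address bound to the symbol, then apply the displacement.
    address = symbol_table[symbol] + displacement
    # If an index register is named, add its current contents.
    if index_register is not None:
        address += index_registers[index_register]
    return address


# The examples from the text: A at 100 with displacements, A at 200 with index register 3.
print(effective_address({"A": 100}, {}, "A", displacement=5))         # 105
print(effective_address({"A": 100}, {}, "A", displacement=-3))        # 97
print(effective_address({"A": 200}, {3: 10}, "A", index_register=3))  # 210
```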
Opcode            Assembly   Remarks                                           Example
(Operation Code)  Mnemonic
00                STOP       Stop execution                                    STOP
01                ADD        First operand is modified, i.e., the result of    ADD AREG, A
                             the operation is stored in (the location
                             specified by) the first operand; condition
                             code is set
02                SUB        First operand is modified; condition code is set  SUB AREG, A
03                MULT       First operand is modified; condition code is set  MULT AREG, A
04                MOVER      Move to register from memory                      MOVER AREG, A
05                MOVEM      Move to memory from register                      MOVEM AREG, A
06                COMP       Sets condition code                               COMP AREG, BREG
07                BC         Branch on condition                               BC ANY, LOOP
08                DIV        First operand is modified; condition code is set  DIV AREG, A
09                READ       First operand is not used
10                PRINT      First operand is not used
Thus, using the above instructions, “A=B+C*D” can be carried out as below.
MOVER AREG, C
MULT AREG, D
ADD AREG, B
MOVEM AREG, A
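To see why this sequence computes A=B+C*D, the four instructions can be traced with a small Python sketch. The `run` function, the dictionary model of memory and registers, and the sample values B=2, C=3, D=4 are assumptions made only for illustration; condition codes are ignored.

```python
def run(program, memory):
    """Execute a list of (mnemonic, register, symbol) triples on a toy machine."""
    registers = {"AREG": 0, "BREG": 0, "CREG": 0, "DREG": 0}
    for mnemonic, register, symbol in program:
        if mnemonic == "MOVER":      # move to register from memory
            registers[register] = memory[symbol]
        elif mnemonic == "MOVEM":    # move to memory from register
            memory[symbol] = registers[register]
        elif mnemonic == "ADD":      # result replaces the first operand
            registers[register] += memory[symbol]
        elif mnemonic == "MULT":     # result replaces the first operand
            registers[register] *= memory[symbol]
        else:
            raise ValueError(f"unsupported mnemonic: {mnemonic}")
    return memory


program = [
    ("MOVER", "AREG", "C"),   # AREG = C
    ("MULT", "AREG", "D"),    # AREG = C * D
    ("ADD", "AREG", "B"),     # AREG = B + C * D
    ("MOVEM", "AREG", "A"),   # A = AREG
]
memory = run(program, {"A": 0, "B": 2, "C": 3, "D": 4})
print(memory["A"])  # 14
```

The multiplication is done first so that the product C*D is already in AREG when B is added, which matches normal operator precedence.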
Assembly Language Statements
• An assembly program contains three types of statements:
1. Imperative statements
2. Declaration statements
3. Assembler directives
1. Imperative Statements
• An imperative statement indicates an action to be performed during the execution of the assembled program.
• Each imperative statement typically translates into one machine instruction.
2. Declaration statements
The syntax of declaration statements is as follows:
Label DS <Constant>
Label DC ‘<Value>’
• The DS (short for Declare Storage) statement reserves areas of memory and assigns names to them.
• Consider the following DS statements:
A DS 1
M DS 10
• The first statement reserves 1 word in the memory and assigns the name A to it.
• The second statement reserves 10 words in the memory and assigns the name M to the block.
• The name M is associated with the first word of the block.
• Other words in the block can be accessed through offsets from M, e.g., M+5 is the sixth word of the memory block reserved for M.
• The DC (short for Declare Constant) statement constructs memory words containing constants.
• Consider the following DC statements:
A DC ‘5’
M DC ‘85’
M+1 DC ‘75’
M+2 DC ‘80’
• Contrary to the name 'declare constant', the DC statement does not really implement constants; it merely initializes memory words to given values.
• These values are not protected by the assembler; they may be changed by moving a new value into the memory word.
• For example, the value of M can be changed by executing the instruction: MOVEM BREG, M
• A literal is an operand with the syntax =‘<value>’.
ADD AREG, 5 (the simple assembly language does not support this)
ADD AREG, =‘5’
• It differs from a constant because its location cannot be specified in the assembly program.
• This helps to ensure that its value is not changed during execution of a program.
• The value of the literal is protected because the name and address of this word are not known to the assembly language programmer.
3. Assembler Directives
• Assembler directives instruct the assembler to perform certain actions during the
assembly of a program.
START <constant>
• This directive indicates that the first word of the target program generated by the assembler should be placed in the memory word with address <constant>.
END [<operand spec>]
• This directive indicates the end of the source program. The optional <operand spec> indicates the address of the instruction where the execution of the program should begin.
• By default, execution begins with the first instruction of the assembled program.
Advantages of Assembly Language
• It is easier to change/correct an assembly language program than a machine language program, because the programmer does not have to adjust memory addresses by hand; the assembler performs the memory bindings.
• It is easier to understand and write, which saves the programmer time and effort.
• It has diagnostic capability for error detection (e.g., use of a misspelled mnemonic name).
• In situations where special hardware features of a computer are to be used, assembly language is the better choice.
• It can be used to write efficient and controlled code.
Design Specification of an assembler
• To develop the design specification of an assembler, the following four steps are followed:
1. Identify the information necessary to perform a task.
2. Design a suitable data structure to record the information.
3. Determine the processing necessary to obtain and maintain the information.
4. Determine the processing necessary to perform the task.
Synthesis phase of an assembler
• Consider the assembly statement
MOVER BREG, ONE
• We must have the following information to synthesize the machine instruction corresponding to this statement:
1. Address of the memory word with which the name ONE is associated.
2. Machine operation code corresponding to the mnemonic MOVER.
• The first item of information depends on the source program. Hence it must be made available by the analysis phase.
• The second item of information does not depend on the source program. It merely depends on the assembly language. Hence the synthesis phase can determine this information for itself.
• Based on the above discussion, we consider the use of two data structures during the synthesis phase: (1) Symbol table (2) Mnemonics table.
• Each entry of the symbol table has two primary fields: name and address.
• The symbol table is built by the analysis phase.
• An entry in the mnemonics table has two primary fields: mnemonic and opcode.
• The synthesis phase uses these tables to obtain the machine address with which a name is associated and the machine opcode corresponding to a mnemonic, respectively.
Symbol table
Name    Address
ONE     201
N       303
…       …

Mnemonic table
Name    Opcode
ADD     01
SUB     02
…       …
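A minimal Python sketch of how the synthesis phase might combine the two tables for the statement MOVER BREG, ONE is given below. The table contents come from the notes (MOVER is opcode 04 in the instruction table). The register codes (AREG=1 to DREG=4) and the "opcode register address" layout of the machine word are assumptions for illustration, since the notes do not define a machine instruction format.

```python
# Tables as described in the text; entries not shown in the notes are omitted.
SYMBOL_TABLE = {"ONE": 201, "N": 303}
MNEMONIC_TABLE = {"ADD": 1, "SUB": 2, "MOVER": 4}

# Assumed numeric codes for the four registers (hypothetical).
REGISTER_CODES = {"AREG": 1, "BREG": 2, "CREG": 3, "DREG": 4}


def synthesize(mnemonic, register, symbol):
    """Build one machine instruction as 'opcode register address' (assumed layout)."""
    opcode = MNEMONIC_TABLE[mnemonic]   # depends only on the assembly language
    address = SYMBOL_TABLE[symbol]      # depends on the source program
    return f"{opcode:02d} {REGISTER_CODES[register]} {address:03d}"


print(synthesize("MOVER", "BREG", "ONE"))  # 04 2 201
```

A lookup that fails (a `KeyError` here) corresponds to an undefined symbol or an invalid mnemonic, which a real assembler would report as a diagnostic.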
Analysis phase of an assembler
• The primary function performed by the analysis phase is the building of the symbol table.
• For this purpose it must determine the addresses of the symbolic names used in a program.
• It is possible to determine some addresses directly, e.g. the address of the first instruction in the program. Others must be inferred/calculated (e.g., the address of the current instruction is the address of the previous instruction plus the length of the previous instruction).
• Determining the addresses of all program elements is known as memory allocation. Memory allocation must therefore be completed in order to build the symbol table.
• To implement memory allocation, a data structure called the location counter (LC) is introduced.
• The location counter is always made to contain the address of the next memory word in the target program.
• It is initialized to the constant specified in the START statement.
• Whenever the analysis phase sees a label in an assembly statement, it enters the label and the contents of LC in a new entry of the symbol table. For example, the labels A and M in the following statements are entered in this way:
A DC ‘5’
M DC ‘85’
• It then finds the number of memory words required by the assembly statement and updates the LC contents. (Hence the word 'counter' in 'location counter'.)
• To update the contents of LC, the analysis phase needs to know the lengths of different instructions.
• Hence, the mnemonics table can be extended to have one more field: length.
• Thus, the mnemonics table is accessed by both the analysis phase and the synthesis phase.
• The symbol table, however, is constructed during the analysis phase and used during the synthesis phase.
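The LC processing described above can be sketched in Python as follows. The function name `build_symbol_table`, the pre-split (label, mnemonic, operand) triples, the sample program, and the length rule (DS n occupies n words, every other statement one word) are assumptions for illustration; the notes only say that lengths come from the mnemonics table.

```python
def build_symbol_table(statements):
    """Run LC processing over (label, mnemonic, operand) triples."""
    symbol_table = {}
    lc = 0
    for label, mnemonic, operand in statements:
        if mnemonic == "START":
            lc = int(operand)            # LC is initialized from START
            continue
        if mnemonic == "END":
            break
        if label is not None:
            symbol_table[label] = lc     # enter (symbol, <LC contents>)
        # Assumed lengths: DS n takes n words, everything else one word.
        lc += int(operand) if mnemonic == "DS" else 1
    return symbol_table


program = [
    (None, "START", "200"),
    (None, "MOVER", "AREG, A"),   # address 200
    (None, "ADD", "AREG, M"),     # address 201
    (None, "STOP", None),         # address 202
    ("A", "DC", "'5'"),           # address 203
    ("M", "DS", "10"),            # addresses 204..213
    ("RESULT", "DS", "1"),        # address 214
    (None, "END", None),
]
print(build_symbol_table(program))  # {'A': 203, 'M': 204, 'RESULT': 214}
```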
• The tasks performed by the analysis and synthesis phases are as follows:
Analysis phase
1. Isolate the label, mnemonic opcode and operand fields of a statement.
2. If a label is present, enter the pair (symbol, <LC contents>) in a new entry of the symbol table.
3. Check the validity of the mnemonic opcode through a look-up in the mnemonics table.
4. Perform LC processing. That is, update the value contained in LC by considering the opcode and operands of the statement.
Synthesis phase
1. Obtain the machine opcode corresponding to the mnemonic from the mnemonics table.
2. Obtain the address of a memory operand from the symbol table.
3. Synthesize a machine instruction or the machine form of a constant, as the case may be.
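Steps 1 and 3 of the analysis phase (isolating the fields and validating the mnemonic) could look like the Python sketch below. The function name `isolate_fields` and the rule "a leading token that is not a known mnemonic is a label" are illustrative assumptions; the mnemonic set is taken from the instruction table plus DS, DC, START and END.

```python
MNEMONICS = {
    "STOP", "ADD", "SUB", "MULT", "MOVER", "MOVEM", "COMP", "BC", "DIV",
    "READ", "PRINT", "DS", "DC", "START", "END",
}


def isolate_fields(line):
    """Split '[LABEL] OPCODE [OPERAND1 [, OPERAND2]]' into (label, mnemonic, operands)."""
    tokens = line.replace(",", " ").split()
    if not tokens:
        raise ValueError("empty statement")
    label = None
    # Assumption: a first token that is not a mnemonic must be a label.
    if tokens[0] not in MNEMONICS:
        label, tokens = tokens[0], tokens[1:]
    # Validity check through a look-up in the mnemonics table.
    if not tokens or tokens[0] not in MNEMONICS:
        raise ValueError(f"invalid mnemonic in statement: {line!r}")
    return label, tokens[0], tokens[1:]


print(isolate_fields("LOOP MOVER AREG, A"))  # ('LOOP', 'MOVER', ['AREG', 'A'])
print(isolate_fields("STOP"))                # (None, 'STOP', [])
```

With this rule a misspelt mnemonic such as `MOVR AREG, A` is first taken for a label and then rejected because `AREG` is not a mnemonic, which is one way to produce the diagnostic mentioned earlier. A label that happens to be spelled like a mnemonic would be misread, so a real assembler needs a stricter convention.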
Pass structure of assemblers
• A pass is a complete scan of the source program.
• An assembler can be a single-pass or a two-pass assembler.
• A two-pass assembler tackles forward references easily, whereas a single-pass assembler has to use back-patching to handle forward references.
• A forward reference is the use of a symbol before it is declared.
Two-pass translation
• In the first pass, LC processing is performed and symbols defined in the program are entered into the symbol table.
• The second pass performs synthesis of the target program. To do so, it uses the address information available in the symbol table.
• The first pass constructs an intermediate representation (IR) of the code.
• This IR consists of the symbol table and the intermediate code (IC).
Single-pass translation
• Consider the following instruction, in which the variable SUM is not yet declared:
MOVER BREG, SUM
• This statement will be synthesized partially. The instruction opcode and the register operand BREG are assembled and placed in memory, but the operand address of SUM cannot be filled in yet. Assume that the instruction is stored at memory location 101.
• A Table of Incomplete (i.e., partially synthesized) Instructions (TII) is maintained. An entry for the instruction discussed above is made in this table.
• This entry has two fields: the instruction address and the name of the unresolved symbol, e.g. <101, SUM>.
• By the time the END statement is processed, the symbol table contains the addresses of all the symbols used in the program and the TII contains information about all forward references.
• The assembler can now process each entry in the TII to complete the concerned instruction.
• For example, the entry <101, SUM> is processed by obtaining the address of SUM from the symbol table and inserting it in the operand address field of the instruction.
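The back-patching scheme above can be sketched in Python. The function name `assemble_single_pass`, the pre-split (label, mnemonic, register, operand) tuples, the register codes, and the one-word-per-instruction length are assumptions for illustration; only the TII idea and the <101, SUM> example come from the notes.

```python
OPCODES = {"ADD": 1, "MOVER": 4, "MOVEM": 5}                 # from the instruction table
REGISTERS = {"AREG": 1, "BREG": 2, "CREG": 3, "DREG": 4}     # assumed codes


def assemble_single_pass(statements, start_address):
    """Assemble in one scan, recording forward references in a TII."""
    symbol_table, tii, target = {}, [], {}
    lc = start_address
    for label, mnemonic, register, operand in statements:
        if label is not None:
            symbol_table[label] = lc
        if mnemonic == "DS":
            lc += int(operand)           # reserve storage, nothing to emit
            continue
        # Operand address is None when the symbol is not yet defined.
        address = symbol_table.get(operand)
        if address is None:
            tii.append((lc, operand))    # e.g. <101, SUM>
        target[lc] = [OPCODES[mnemonic], REGISTERS[register], address]
        lc += 1
    # END reached: complete every partially synthesized instruction.
    for instruction_address, symbol in tii:
        target[instruction_address][2] = symbol_table[symbol]
    return target, symbol_table, tii


statements = [
    (None, "MOVER", "BREG", "SUM"),   # stored at 101, SUM still unknown
    (None, "ADD", "BREG", "SUM"),     # stored at 102, SUM still unknown
    ("SUM", "DS", None, "1"),         # SUM is bound to 103
]
target, symbols, tii = assemble_single_pass(statements, 101)
print(tii)          # [(101, 'SUM'), (102, 'SUM')]
print(target[101])  # [4, 2, 103]
```

If a symbol is never defined, the final loop fails on the symbol table look-up, which is the point at which a real assembler would report an undefined symbol.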
Design of a Two-pass assembler
• Tasks performed by a two-pass assembler are as follows:
• Pass-1
1. Separate the symbol, mnemonic opcode and operand fields.
2. Build the symbol table.
3. Perform the LC processing.
4. Construct the IR.
• Pass-2
1. Synthesize the target program.
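Putting the two passes together, a minimal Python sketch of a two-pass assembler for the A=B+C*D program might look like this. The opcodes come from the instruction table in these notes; the register codes, the "opcode register address" word layout, the one-word instruction length, and the form of the intermediate code (a list of (LC, mnemonic, operands) tuples) are assumptions for illustration. The sketch handles only register-memory instructions, STOP, DS and DC.

```python
OPCODES = {"STOP": 0, "ADD": 1, "SUB": 2, "MULT": 3, "MOVER": 4, "MOVEM": 5, "DIV": 8}
REGISTERS = {"AREG": 1, "BREG": 2, "CREG": 3, "DREG": 4}   # assumed codes
DECLARATIVES = {"DS", "DC"}


def pass_one(source):
    """Separate fields, do LC processing, build the symbol table and the IC."""
    symbol_table, intermediate_code, lc = {}, [], 0
    for line in source:
        tokens = line.replace(",", " ").split()
        if tokens[0] == "START":
            lc = int(tokens[1])
            continue
        if tokens[0] == "END":
            break
        if tokens[0] not in OPCODES and tokens[0] not in DECLARATIVES:
            symbol_table[tokens[0]] = lc          # leading token is a label
            tokens = tokens[1:]
        mnemonic, operands = tokens[0], tokens[1:]
        intermediate_code.append((lc, mnemonic, operands))
        lc += int(operands[0]) if mnemonic == "DS" else 1
    return symbol_table, intermediate_code


def pass_two(symbol_table, intermediate_code):
    """Synthesize (address, machine word) pairs from the IR."""
    target = []
    for lc, mnemonic, operands in intermediate_code:
        if mnemonic == "DS":
            continue                              # storage only, no word emitted
        if mnemonic == "DC":
            value = int(operands[0].strip("'"))
            target.append((lc, f"00 0 {value:03d}"))
            continue
        register = REGISTERS[operands[0]] if len(operands) == 2 else 0
        address = symbol_table[operands[-1]] if operands else 0
        target.append((lc, f"{OPCODES[mnemonic]:02d} {register} {address:03d}"))
    return target


SOURCE = [
    "START 100",
    "MOVER AREG, C",
    "MULT AREG, D",
    "ADD AREG, B",
    "MOVEM AREG, A",
    "STOP",
    "A DS 1",
    "B DC '2'",
    "C DC '3'",
    "D DC '4'",
    "END",
]
symbols, ic = pass_one(SOURCE)
for address, word in pass_two(symbols, ic):
    print(address, word)   # first line printed: 100 04 1 107
```

Because all symbols are in the symbol table before Pass-2 starts, the forward references to A, B, C and D need no back-patching.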
Questions
1. Explain in brief the three features of an assembly language. What are the advantages of assembly language?
2. Explain in brief the simple assembly language.
3. Explain in detail the different types of statements in an assembly program.
4. Explain in brief the analysis and synthesis phases (design specification) of an assembler.
5. Explain the pass structure of assemblers.