System Programming
Chapter 1: Language Processor
• Introduction
• Language Processing Activities
• Fundamentals of Language Processing
• Fundamentals of Language Specification
• Language Processing Development Tools
Introduction
• Why a Language Processor?
– There is a difference between the manner in which a software designer describes ideas (how the software should behave) and the manner in which those ideas are implemented on the CPU.
Semantic Gap
Figure: The semantic gap separates the application domain (software designer) from the execution domain (CPU).
• Consequences of the Semantic Gap
1. Large development times
2. Large development effort
3. Poor quality of software
• These issues are tackled by software engineering through the use of programming languages (PLs).
Figure: A PL splits the semantic gap into a specification gap (application domain to PL domain) and an execution gap (PL domain to execution domain).
Language Processors
• A Language Processor (LP) is software which bridges a specification or execution gap.
• The program input to an LP is referred to as the Source Program, and its output as the Target Program.
• The languages in which they are written are called the source language and target language, respectively.
A Spectrum of Language Processors
• A language translator bridges an execution gap to the machine language of a computer system.
• A detranslator bridges the same execution gap as a language translator, but in the reverse direction.
• A preprocessor is a language processor which bridges an execution gap but is not a language translator.
• A language migrator bridges the specification gap between two PLs.
Figure: A C++ preprocessor converts a C++ program into a C program, reporting errors; a C++ translator converts a C++ program directly into a machine language program, reporting errors.
Interpreters
• An interpreter is a language processor which bridges an execution gap without generating a machine language program.
• Here the execution gap vanishes entirely.
Figure: The interpreter domain encompasses both the PL domain and the execution domain, so no separate execution gap remains between the application, PL and execution domains.
Problem Oriented Language
Figure: The PL domain lies close to the application domain, giving a small specification gap and a large execution gap.
Procedure Oriented Language
Figure: The PL domain lies farther from the application domain, giving a larger specification gap and a smaller execution gap.
Language Processing Activities
• Language processing activities are divided into those that bridge the specification gap and those that bridge the execution gap:
1. Program generation activities
2. Program execution activities
Program Generation
• A program generator is software which accepts the specification of a program to be generated and generates a program in the target PL.

Figure: A program generator accepts a program specification and produces a program in the target PL, reporting errors.
Figure: The program generator domain lies between the application domain and the PL domain, so the specification gap to be bridged by the designer is reduced.
Example
• A screen handling program, whose specification is given below:

Employee name : char : start(line=2, position=25)
                       end(line=2, position=80)
Married       : char : start(line=10, position=25)
                       end(line=10, position=27)
                       default('Yes')

Figure: Screen displayed by the screen handling program, with fields Employee Name, Address, Married (default Yes), Age and Gender.
Program Execution
• Two models of Program Execution
– Program translation
– Program interpretation
Program Translation
• The program translation model bridges the execution gap by translating a program written in a PL (the source program) into machine language (the target program).
Figure: Program translation model: the translator converts the source program into a target program in machine language (reporting errors), which then executes on its data.
Program Interpretation
• Interpreter reads the source program and
stores it in its memory.
• During interpretation it determines the
meaning of the statement.
• Instruction Execution Cycle
1. Fetch the instruction
2. Decode the instruction and determine the
operation to be performed.
3. Execute the instruction.
• The interpretation cycle consists of:
1. Fetch the statement.
2. Analyze the statement and determine its
meaning.
3. Execute the meaning of the statement.
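The fetch-analyze-execute cycle above can be sketched in a few lines of Python; the two toy statement forms handled ("x = constant" and "x = y + z") are assumptions of the sketch, not from the slides:

```python
# A minimal sketch of the interpretation cycle. Each cycle fetches a source
# statement, analyzes it, and executes its meaning directly -- no target
# program is ever generated.

def interpret(statements):
    env = {}                      # runtime values of variables
    pc = 0                        # "program counter" over source statements
    while pc < len(statements):
        stmt = statements[pc]                                # 1. fetch
        target, expr = [s.strip() for s in stmt.split("=")]  # 2. analyze
        parts = [p.strip() for p in expr.split("+")]
        value = sum(env[p] if p in env else int(p) for p in parts)
        env[target] = value                                  # 3. execute
        pc += 1
    return env

program = ["b = 2", "i = 3", "a = b + i"]
print(interpret(program))   # {'b': 2, 'i': 3, 'a': 5}
```

Note that the source program is retained as-is: re-running it simply repeats the analyze-execute work for every statement.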
Figure (a): Interpretation: the interpreter keeps the source program and its data in memory and uses its own PC to execute statements directly, reporting errors.
Figure (b): Program execution: the CPU's PC drives a machine language program kept in memory together with its data.
• Characteristics of interpretation
1. The source program is retained in source form; no target program is generated.
2. A statement is analyzed during each interpretation of it.
Fundamentals of Language Processing
• Language Processing = Analysis of SP + Synthesis of TP.
• Analysis consists of three steps:
1. Lexical rules identify the valid lexical units.
2. Syntax rules identify the valid statements.
3. Semantic rules associate meaning with valid statements.
Example: percent_profit := (profit * 100) / cost_price;
Lexical analysis identifies :=, * and / as operators, 100 as a constant, and the remaining strings as identifiers.
Syntax analysis identifies the statement as an assignment statement.
Semantic analysis determines the meaning of the statement: assign the value of (profit * 100) / cost_price to percent_profit.
• Synthesis Phase
1. Creation of data structures
2. Generation of target code
These are referred to as memory allocation and code generation, respectively.
MOVER   AREG, PROFIT
MULT    AREG, 100
DIV     AREG, COST_PRICE
MOVEM   AREG, PERCENT_PROFIT

PERCENT_PROFIT   DW   1
PROFIT           DW   1
COST_PRICE       DW   1
Performing the analysis phase and the synthesis phase in a single pass is not feasible due to:
1. Forward references
2. Memory requirements
Forward Reference
• A forward reference is a reference to an entity that appears before the entity's definition in the program.
percent_profit := (profit * 100) / cost_price;
........
........
long profit;
Language Processor Pass
• Pass I: performs analysis of the SP and notes relevant information.
• Pass II: performs synthesis of the target program.
Pass I analyzes the SP and generates an IR, which is given as input to Pass II to generate the target code.
Intermediate Representation (IR)
• An IR reflects the effect of some, but not all, analysis and synthesis tasks performed during language processing.

Figure: The front end transforms the source program into the intermediate representation (IR), and the back end transforms the IR into the target program.
• Properties
1. Ease of use.
2. Processing efficiency.
3. Memory efficiency.
Toy Compiler
• Front End
Performs lexical, syntax and semantic analysis of
SP.
Analysis involves
1. Determine validity of source stmt.
2. Determine the content of source stmt.
3. Construct a suitable representation.
Output of the Front End
1. Tables of information
2. Intermediate code (IC): a sequence of IC units, each representing the meaning of one action in the SP.
Example
i: integer;
a,b: real;
a := b+i;
Symbol Table:

No.   Symbol   Type   Length   Address
1     i        int
2     a        real
3     b        real
4     i*       real
5     temp     real
• Intermediate code
1. Convert (id, #1) to real, giving (id, #4)
2. Add (id, #4) to (id, #3), giving (id, #5)
3. Store (id, #5) in (id, #2)
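The tables and IC units above can be mirrored in a short Python sketch; the tuple layouts and operation names are illustrative, not the book's exact data structures:

```python
# Symbol table entries: (entry number, symbol, type). Entries 4 and 5 are
# temporaries created during semantic analysis of a := b + i;
symtab = [
    (1, "i",    "int"),
    (2, "a",    "real"),
    (3, "b",    "real"),
    (4, "i*",   "real"),   # i converted to real
    (5, "temp", "real"),   # result of b + i*
]

# Each IC unit names one action and the symbol table entries it touches.
ic = [
    ("convert", 1, 4),     # convert (id, #1) to real, giving (id, #4)
    ("add", 4, 3, 5),      # add (id, #4) to (id, #3), giving (id, #5)
    ("store", 5, 2),       # store (id, #5) in (id, #2)
]

for unit in ic:
    print(unit)
```

The point of the indirection is that IC units carry entry numbers, not names; all attribute information stays in the table.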
Lexical Analysis (Scanning)
• Identifies the lexical units in a source statement.
• Classifies units into different classes, e.g. identifiers, constants, reserved identifiers, etc., and enters them into different tables.
• A token contains a class code and a number in class:
Code #no, e.g. Id #10
• Example: statement a := b + i;

a    →   Id #2
:=   →   Op #5
b    →   Id #3
+    →   Op #3
i    →   Id #1
;    →   Op #10
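A minimal scanner sketch reproducing the token sequence above; the class codes come from the table, while the regular expression and lookup dictionaries are assumptions of the sketch:

```python
# Scan a statement into (class, number-in-class) tokens.
import re

OP_CODES = {":=": 5, "+": 3, ";": 10}    # operator -> number in class Op
ID_CODES = {"i": 1, "a": 2, "b": 3}      # identifier -> symbol table entry

def scan(stmt):
    tokens = []
    # ":=" must come before single-character alternatives in the pattern.
    for lexeme in re.findall(r":=|[A-Za-z_]\w*|[+;]", stmt):
        if lexeme in OP_CODES:
            tokens.append(("Op", OP_CODES[lexeme]))
        else:
            tokens.append(("Id", ID_CODES[lexeme]))
    return tokens

print(scan("a := b + i;"))
# [('Id', 2), ('Op', 5), ('Id', 3), ('Op', 3), ('Id', 1), ('Op', 10)]
```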
Syntax Analysis (Parsing)
• Determines the statement class, such as assignment statement, if statement, etc.
e.g. a, b : real; and a := b + i;
Figure: The corresponding parse trees: for the declaration, real with children a and b; for the assignment, := with left child a and right child + over b and i.
Semantic Analysis
• Determines the meaning of the SP.
• Results in the addition of information such as type, length, etc.
• Determines the meaning of a subtree in the IC and adds that information to the IC tree.
Processing of the statement a := b + i; proceeds as follows:
1. Type information is added to the IC tree.
2. The rules of assignment indicate that the expression on the RHS should be evaluated first.
3. The rules of addition indicate that i should be converted to real before the addition:
i. Convert i to real, giving i*.
ii. Add i* to b, giving temp.
iii. Store temp in a.
Figure: The IC tree for a := b + i in three stages: (a) := with children a (real) and + over b (real) and i (int); (b) the same tree with i converted to i* (real); (c) := with children a (real) and temp (real), the evaluated sum.
Figure: Front end of a toy compiler: scanning turns the source program into tokens (reporting lexical errors), parsing turns the tokens into a parse tree (reporting syntax errors), and semantic analysis turns the parse tree into IC (reporting semantic errors); all phases share the symbol table, constant table and other tables, and the overall output is the IR.
Back End
1. Memory Allocation
The memory requirement of a variable is calculated from its type, length and dimensionality.

No.   Symbol   Type   Length   Address
1     i        int             2000
2     a        real            2001
3     b        real            2002
• Code generation
1. Determine where results should be kept: in registers or memory locations.
2. Determine which instructions should be used for type conversion operations.
3. Determine which addressing modes should be used for accessing variables.

CONV_R   AREG, I
ADD_R    AREG, B
MOVEM    AREG, A
Figure: Target Code
Figure: Back end of the toy compiler: memory allocation followed by code generation processes the IR (IC together with the symbol table, constants table and other tables) into the target program.
Data Structures for Language Processing
CLASSIFICATION OF DATA STRUCTURES
• Classification criteria:
1. Nature: linear or non-linear.
2. Purpose: search or allocation.
3. Lifetime: whether used during language processing or during target program execution.
LINEAR VS NON-LINEAR

S.N.   Linear                                 Non-linear
1      Linear arrangement in memory           Non-linear arrangement in memory
2      Requires a contiguous area of memory   Can use discontiguous areas of memory
3      Size of the data structure must be     No need to predict the size
       predictable
4      Possible wastage of memory             Efficient use of memory
5      Very good search efficiency            Poor search efficiency
SEARCH DATA STRUCTURES
• Search data structures are used during language processing to maintain attribute information concerning different entities in the source program.
• The key field is the symbol field containing the name of an entity.
• Entry format: the set of fields common to all entries.
Entry format types:
1. Fixed length entries: each entry in the symbol table has fields for all attributes specified in the programming language.
2. Variable length entries: the entry occupied by a symbol has fields only for the attributes specified for symbols of its class.
3. Hybrid entries: a fixed part and a variant part, each containing a set of fields.
Entries in the symbol table of a compiler have the following fields:
 Fixed part: the symbol field and the class/tag field.
 Variant part:

Tag value (symbol class)   Variant part fields (attributes)
Variable                   Type, length, dimension information
Label                      Statement number
Procedure name             Address of parameter list, no. of parameters
Function name              Type of return value, length of return value,
                           address of parameter list, no. of parameters
Fixed and variable length entries
• An entry may be declared as a record or structure of the language in which the language processor is being implemented.
• A fixed length entry format consists of:
1. The fields in the fixed part of the entry
2. The set of fields in all variant parts of the entry
• Fields of a fixed length entry:
1. Symbol   2. Class   3. Type   4. Length   5. Dimension information
6. Parameter list address   7. No. of parameters   8. Type of return value
9. Length of return value   10. Statement no.
Variable length entries
• Consist of:
1. The fields in the fixed part of the entry, including the tag field
2. { f_i | f_i ∈ SF_vj } if tag = vj
• This leads to a compact organization in which no memory wastage occurs.
• Example (a label entry): 1. name   2. class   3. statement number
Hybrid entry format
• A hybrid entry has a fixed part together with a length field and a pointer field; the pointer leads to the variable part of the entry.
Operations on DS
1. ADD: add the entry of a symbol to the symbol table.
2. SEARCH: search for and locate the entry of a symbol.
3. DELETE: delete the entry of a symbol.
The entry for a symbol is created only once, but may be searched for a large number of times during the processing of a program. The deletion operation is not very common.
Generic search procedure
• To search for and locate the entry of symbol s in a search data structure:
• Algorithm:
1. Make a prediction concerning which entry of the search data structure symbol s may be occupying; let this be entry e.
2. Let s_e be the symbol occupying the e-th entry. Compare s with s_e; exit with success if the two match. Each comparison is called a probe.
3. Repeat steps 1 and 2 until it can be concluded that the symbol does not exist in the search data structure.
Table Organization
1. Sequential search organization
2. Binary search organization
3. Hash table
Sequential search organization
Suppose that there are n elements in the array and each is equally likely to be searched for. A successful search for the element in position k makes k comparisons, so the average number of comparisons is

(1 + 2 + ... + n) / n

It is known that 1 + 2 + ... + n = n(n + 1) / 2. Therefore, the average number of comparisons made by the sequential search in the successful case is (n + 1) / 2.
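A quick empirical check of the (n + 1) / 2 result in Python:

```python
# Count the comparisons a sequential search makes for each key, then average
# over all successful searches and compare with (n + 1) / 2.

def sequential_search(items, key):
    comparisons = 0
    for item in items:
        comparisons += 1
        if item == key:
            return comparisons
    return comparisons  # unsuccessful search: n comparisons

n = 100
items = list(range(n))
average = sum(sequential_search(items, k) for k in items) / n
print(average, (n + 1) / 2)   # 50.5 50.5
```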
Binary Search organization
– The binary search algorithm assumes that the items in the array being searched are sorted.
– The algorithm begins at the middle of the array.
– If the item for which we are searching is less than the item in the middle, we know that the item won't be in the second half of the array.
– Once again we examine the "middle" element of the remaining portion.
– The process continues, with each comparison cutting in half the portion of the array where the item might be.
Algorithm of binary search
1. start := 1; end := f;
2. While start ≤ end:
a. e := (start + end) / 2 (integer division)
b. Exit with success if s = s_e.
c. If s < s_e then end := e - 1; else start := e + 1.
3. Exit with failure.
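A direct Python transcription of the algorithm, searching a sorted symbol list for symbol s and returning its 1-based entry number (the sample symbols are illustrative):

```python
# Binary search over a sorted table of symbols.

def binary_search(table, s):
    start, end = 1, len(table)           # step 1
    while start <= end:                  # step 2
        e = (start + end) // 2           # 2a: probe the middle entry
        se = table[e - 1]
        if s == se:                      # 2b: success
            return e
        if s < se:                       # 2c: discard one half
            end = e - 1
        else:
            start = e + 1
    return None                          # step 3: failure

symbols = ["alpha", "beta", "delta", "gamma", "omega"]
print(binary_search(symbols, "delta"))   # 3
print(binary_search(symbols, "zeta"))    # None
```

Each probe halves the remaining portion, so at most about log2(f) + 1 probes are needed.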
Hash table
Description
• A hash table is a data structure that stores items and allows insertions, lookups, and deletions to be performed in expected O(1) time.
• An algorithm converts an object, typically a string, to a number. Then the number is compressed according to the size of the table and used as an index.
• There is the possibility of distinct items being mapped to the same index. This is called a collision and must be resolved.
Figure: Key → Hash Code Generator → Number → Compression → Index. For example, the key "Smith" maps to index 7 of a table with slots 0-9, where Bob Smith's record (address, phone number, e-mail) is stored.
Collision: A collision occurs when a hashing algorithm produces
an address for an insertion key and that address is already
occupied.
Collision Resolution
• There are two kinds of collision resolution:
1. Chaining makes each entry a linked list, so that when a collision occurs the new entry is added to the end of the list.
2. Open Addressing uses probing to discover an empty spot.
• With chaining, the table does not have to be resized. With open addressing, the table must be resized when the number of elements is larger than the capacity.
Collision Resolution - Chaining
• With chaining, each entry is the head of a (possibly empty)
linked list. When a new object hashes to an entry, it is added
to the end of the list.
• A particularly bad hash function could create a table with only
one non-empty list that contained all the elements. Uniform
hashing is very important with chaining.
• The load factor of a chained hash table indicates how many
objects should be found at each location, provided reasonably
uniform hashing. The load factor LF = n/c where n is the
number of objects stored and c is the capacity of the table.
• With uniform hashing, a load factor of 2 means we expect to
find no more than two objects in one slot. A load factor less
than 1 means that we are wasting space.
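A minimal chained hash table sketch in Python; the capacity, hash function and method names are assumptions of the sketch:

```python
# Each slot holds a list; colliding keys are appended to the slot's list.

class ChainedHashTable:
    def __init__(self, capacity=10):
        self.slots = [[] for _ in range(capacity)]
        self.count = 0

    def _index(self, key):
        return hash(key) % len(self.slots)   # compression by modulo

    def insert(self, key, value):
        self.slots[self._index(key)].append((key, value))
        self.count += 1

    def lookup(self, key):
        for k, v in self.slots[self._index(key)]:
            if k == key:
                return v
        return None

    def load_factor(self):
        # LF = n / c: expected objects per slot under uniform hashing
        return self.count / len(self.slots)

t = ChainedHashTable(capacity=4)
for i in range(8):
    t.insert("sym%d" % i, i)
print(t.lookup("sym5"), t.load_factor())   # 5 2.0
```

With 8 keys in 4 slots the load factor is 2: the table still works, it just probes longer lists.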
Open Addressing
• Another way to resolve collisions.
• In open addressing, each position in the array is in one of three states: EMPTY, DELETED, or OCCUPIED.
• If a position is OCCUPIED, it contains a legitimate value (key and data); otherwise, it contains no value.
• Positions are initially EMPTY. If the value in a position is deleted, the position is marked as DELETED, not EMPTY; the need for this distinction is explained below.
• The algorithm for inserting a value V into a table is:
1. Compute the position at which V ought to be stored, P = H(key(V)).
2. If position P is not OCCUPIED, store V there.
3. If position P is OCCUPIED, compute another position in the table; set P to this value and repeat steps 2 and 3.
Linear Probing
• The simplest method is called linear probing: P = (1 + P) mod TABLE_SIZE. This has the virtues that it is very fast to compute and that it probes (i.e. looks at) every position in the hash table. To see its fatal drawback, consider the following example.
• In this example, the table is of size 10, keys are 2-digit integers, and H(k) = k mod 10. Initially all the positions in the table are EMPTY.
• If we now insert 15, 17, and 8, there are no collisions: 15 goes to position 5, 17 to position 7, and 8 to position 8.
• When we insert 35, we compute P = H(35) = 35
mod 10 = 5.
• Position 5 is occupied (by 15). So we must look
elsewhere for a position in which to store 35.
Using linear probing, the position we would try
next is 5+1 mod 10 = 6. Position 6 is empty, so 35
is inserted there.
• Now suppose we insert 25. We first try
position (25 mod 10). It is occupied.
• Using linear probing we next try position 6
(5+1 mod 10).
• It too is occupied. Next we try position 7 (6+1
mod 10); it is occupied, as is the next one we
try, position 8 (7+1 mod 10). Position 9 is next
(8+1 mod 10); it is empty so 25 is inserted
there.
• Now suppose we insert 75. The first position
we try is 5. Then we will try position 6, then 7,
then 8, then 9. Finally we try position 0 (9+1
mod 10); it is empty so 75 is stored there.
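The worked example above can be replayed in a short Python sketch (table size 10, H(k) = k mod 10, linear probing):

```python
# Replay the insertions 15, 17, 8, 35, 25, 75 into a size-10 table.

TABLE_SIZE = 10
EMPTY = None

def insert(table, key):
    p = key % TABLE_SIZE               # initial position P = H(key)
    while table[p] is not EMPTY:       # linear probing on collision
        p = (p + 1) % TABLE_SIZE
    table[p] = key
    return p                           # position where the key landed

table = [EMPTY] * TABLE_SIZE
for key in (15, 17, 8, 35, 25, 75):
    insert(table, key)
print(table)   # [75, None, None, None, None, 15, 35, 17, 8, 25]
```

The final layout matches the walk-through: 35 lands at 6, 25 at 9, and 75 wraps around to 0, showing how a cluster around position 5 grows with every collision.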
• Open addressing has the advantage that the amount of memory needed is fixed. It has two disadvantages:
• The first one we've just seen: RETRIEVE is very inefficient when the key does not occur in the table.
• The other disadvantage relates to the load factor defined earlier. With open addressing, the amount you can store in the table is, of course, limited by the size of the table; and, what is worse, as the load factor gets large, all the operations degrade to linear time.
Hash Table Organization

Organization    Advantages                        Disadvantages
Chaining        Unlimited number of elements;     Overhead of multiple
                unlimited number of collisions    linked lists
Re-hashing      Fast re-hashing; fast access      Maximum number of elements
                through use of main table space   must be known; multiple
                                                  collisions may become probable
Overflow area   Fast access; collisions don't     Two parameters which govern
                use primary table space           performance need to be estimated
ALLOCATION DATA STRUCTURES
• The most basic allocation data structures are:
1. Stack
2. Heap
• When a program is loaded into memory, it is organized into three areas of memory, called segments: the text segment, stack segment, and heap segment. The text segment (sometimes also called the code segment) is where the compiled code of the program itself resides. This is the machine language representation of the program steps to be carried out, including all functions making up the program, both user-defined and system.
• The remaining two areas of system memory are where storage may be allocated by the compiler for data storage.
Program Address Space
• Any program you run has, associated with it, some memory which is divided into:
– Code Segment
– Data Segment (holds global data)
– Stack (where local variables and other temporary information are stored)
– Heap

Figure: From bottom to top: code segment, data segment, stack, heap; the stack grows upwards and the heap grows downwards.
Stack
• The stack is where memory is allocated for automatic variables within functions. A stack is a Last In First Out (LIFO) storage device, where new storage is allocated and deallocated at only one "end", called the Top of the stack.
Local Variables: Stack Allocation
• When we have a declaration of the form "int a;", a variable with identifier "a" and some memory allocated to it is created in the stack. The attributes of "a" are:
– Name: a
– Data type: int
– Scope: visible only inside the function in which it is defined; disappears once we exit the function
– Address: the address of the memory location reserved for it. Note: memory is allocated in the stack for a even before it is initialized.
– Size: typically 4 bytes
– Value: will be set once the variable is initialized
• Since the memory allocated for the variable is set at the beginning itself, we cannot use the stack in cases where the amount of memory required is not known in advance. This motivates the need for the HEAP.
Pointers
• We know what a pointer is. Let us say we have declared a pointer "int *p;". The attributes of "p" are:
– Name: p
– Data type: integer address
– Scope: local or global
– Address: an address in the data segment or stack segment
– Size: 32 bits in a 32-bit architecture
• We saw how fixed memory allocation is done in the stack; now we want to allocate dynamically. Consider the declaration "int *p;": the compiler knows that we have a pointer p that may store the starting address of a variable of type int. To point "p" to a dynamic variable, we use a statement of the form "p = new int;".
ALLOCATION DATA STRUCTURE: STACK
• Linear data structure.
• Allocation and deallocation are performed in LIFO manner.
• Only the last entry is accessible at any time.
• Two pointers are used:
Stack Base (SB): points to the first word of the stack area.
Top of Stack (TOS): points to the last entry allocated.
STACK OPERATIONS
When an entry is pushed on the stack, TOS is incremented by l (where l is the size of the entry).
When an entry is popped, i.e. deallocated, TOS is decremented by l.
EXTENDED STACK MODEL
• One extra pointer is used, known as the record base pointer.
• It points to the first word of the last record in the stack.
Heap
In general, perhaps 50 percent of the computer's total memory space might be unused at any given moment. The operating system owns and manages the unused memory, and it is collectively known as the heap. The heap is extremely important because it is available for use by applications during execution, using the C functions malloc (memory allocate) and free. The heap allows programs to allocate memory exactly when they need it during the execution of a program, rather than pre-allocating it with a specifically-sized array declaration.
HEAP
• The heap segment provides more stable storage of data for a program; memory allocated in the heap remains in existence for the duration of the program. Therefore, global variables (storage class external) and static variables are allocated on the heap. The memory allocated in the heap area, if initialized to zero at program start, remains zero until the program makes use of it. Thus, the heap area need not contain garbage.
• Non-linear data structure.
• It permits allocation and deallocation of entities in random order.
• An allocation request returns a pointer to the allocated area in the heap.
• A deallocation request must present a pointer to the area to be deallocated.
HEAP: MEMORY MANAGEMENT
• Identifying free memory areas:
1. Reference counts
2. Garbage collection
• Reuse of memory:
1. First fit technique
2. Best fit technique
First fit technique
• In the first fit technique, the free list is traversed sequentially to find the first free block whose size is larger than or equal to the amount requested.
• Once the block is found:
– if blocks are of fixed size, the block is removed from the free list and the free list is suitably updated;
– if block sizes are not fixed, the block is split into two portions: the first portion remains in the free list and the second is allocated.
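A Python sketch of first-fit allocation over a free list of [address, size] pairs (in K), following the convention of the example below where the second portion of a split block is the one allocated:

```python
# First fit: scan the free list in address order, take the first block that
# is large enough, and split it if it is larger than the request.

def first_fit(free_list, request):
    for block in free_list:
        addr, size = block
        if size >= request:
            if size == request:
                free_list.remove(block)    # exact fit: remove whole block
                return addr
            block[1] = size - request      # first portion stays free
            return addr + size - request   # second portion is allocated
    return None                            # no block is large enough

# Free areas from the example: 10K at 20k, 30K at 60k, 10K at 110k, 20K at 160k.
free_list = [[20, 10], [60, 30], [110, 10], [160, 20]]
print(first_fit(free_list, 20))   # 70 (from the 30K block starting at 60k)
print(free_list)                  # [[20, 10], [60, 10], [110, 10], [160, 20]]
```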
Status of memory (U: used, A: available), with Block1 starting at address 0k:

Block1    Free      Block2    Free      Block3    Free      Block4    Free
20K (U)   10K (A)   30K (U)   30K (A)   20K (U)   10K (A)   40K (U)   20K (A)

The free areas start at addresses 20k, 60k, 110k and 160k respectively.
• 20K block memory allocation: the first free block of size >= 20K starts at address 60k. This 30K block is split into two parts of 10K and 20K, and the second part is allocated (starting at address 70k).

Block1    Free      Block2    Free      New block   Block3    Free      Block4    Free
20K (U)   10K (A)   30K (U)   10K (A)   20K (U)     20K (U)   10K (A)   40K (U)   20K (A)
• 10K block memory allocation: the first free block of size >= 10K starts at address 20k; being an exact fit, it is allocated whole.

Block1    New block   Block2    Free      New block   Block3    Free      Block4    Free
20K (U)   10K (U)     30K (U)   10K (A)   20K (U)     20K (U)   10K (A)   40K (U)   20K (A)
Best fit allocation
• In the best fit method, obtain the smallest free block whose size is greater than or equal to the requested size.
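A matching Python sketch of best fit over the same kind of free list of [address, size] pairs (addresses and sizes in K are illustrative):

```python
# Best fit: pick the smallest free block whose size >= the request; split it
# if it is larger, allocating the second portion as in the first-fit sketch.

def best_fit(free_list, request):
    candidates = [b for b in free_list if b[1] >= request]
    if not candidates:
        return None
    block = min(candidates, key=lambda b: b[1])   # smallest adequate block
    addr, size = block
    if size == request:
        free_list.remove(block)                   # exact fit
        return addr
    block[1] = size - request                     # first portion stays free
    return addr + size - request                  # second portion is allocated

free_list = [[20, 10], [60, 30], [110, 10], [160, 20]]
print(best_fit(free_list, 20))   # 160 (the 20K block is an exact fit)
```

Compare with first fit, which would have split the 30K block at 60k; best fit keeps large blocks intact for large requests.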
Status of memory (U: used, A: available), as before:

Block1    Free      Block2    Free      Block3    Free      Block4    Free
20K (U)   10K (A)   30K (U)   30K (A)   20K (U)   10K (A)   40K (U)   20K (A)
• 20K block memory allocation: the smallest free block of size >= 20K is the 20K block starting at address 160k; being an exact fit, it is allocated whole.

Block1    Free      Block2    Free      Block3    Free      Block4    New block
20K (U)   10K (A)   30K (U)   30K (A)   20K (U)   10K (A)   40K (U)   20K (U)
• 10K block memory allocation: the smallest free block of size >= 10K starts at address 20k; being an exact fit, it is allocated whole.

Block1    New block   Block2    Free      Block3    Free      Block4    New block
20K (U)   10K (U)     30K (U)   30K (A)   20K (U)   10K (A)   40K (U)   20K (U)
• Worst fit:
This method is the opposite of the best fit method. Obtain the largest free block whose size is greater than or equal to the requested size.
• Next fit:
In this case, searching the free list starts from the node following the previously allocated node. This reduces the search time.
Fundamentals of Language Specification
• Programming language grammars:
alphabet → words/strings → sentences/statements → language
Terminal symbols, alphabet and strings
• The alphabet of L is represented by the Greek symbol Σ, e.g. Σ = {a, b, ..., z, 0, 1, ..., 9}.
• A string is a finite sequence of symbols, e.g. α = axy.
Productions
• Also called rewriting rules:
A nonterminal symbol ::= String of Ts and NTs
e.g.
<Noun Phrase> ::= <Article> <Noun>
<Article>     ::= a | an | the
<Noun>        ::= boy | apple
Grammar
• A grammar G of a language LG is a quadruple
(Σ, SNT, S, P) where
– Σ is the alphabet
– SNT is the set of NT’s
– S is the distinguished symbol
– P is the set of productions
Derivation
• Let production P1 of grammar G be of the form
P1 : A ::= α
and let β be a string such that β = γAθ. Then replacing A by α in β yields the string γαθ; we say that β directly derives γαθ.
Example
<Sentence>    ::= <Noun Phrase> <Verb Phrase>
<Noun Phrase> ::= <Article> <Noun>
<Verb Phrase> ::= <Verb> <Noun Phrase>
<Article>     ::= a | an | the
<Noun>        ::= boy | apple
<Verb>        ::= ate

<Sentence>
⇒ <Noun Phrase> <Verb Phrase>
⇒ <Article> <Noun> <Verb Phrase>
⇒ <Article> <Noun> <Verb> <Noun Phrase>
⇒ the <Noun> <Verb> <Article> <Noun>
⇒ the boy <Verb> <Article> <Noun>
⇒ the boy ate <Article> <Noun>
⇒ the boy ate an <Noun>
⇒ the boy ate an apple
Binding and Binding Times
• A program entity pei in program P has a set of attributes determined by its kind: a variable has a type, size, dimensionality, memory address, etc.; a procedure and a reserved id each have their own attribute sets.
Definition
• A binding is an association of an attribute of a program entity with a value.
• Binding time is the time at which a binding is performed.
e.g. int a;
Different Binding Times
1.
2.
3.
4.
5.
Language definition time of L
Language implementation time of L
Compilation time of P
Execution init time of proc
Execution time of proc
program bindings(input, output);
var
  i : integer;
  a, b : real;
  procedure proc(x : real; j : integer);
  var
    info : array[1..10, 1..5] of integer;
    p : ^integer;
  begin
    new(p);
  end;
begin
  proc(a, i);
end.
1. Binding of keywords with their meanings, e.g. program, procedure, begin and end.
2. The size of type integer is bound to n bytes.
3. Binding of a variable to its type.
4. Memory addresses are allocated to the variables.
5. Values are bound to memory addresses.
Types of Binding
• Static Binding:
– Binding is performed before the execution of a
program begins.
• Dynamic Binding:
– Binding is performed after the execution of a
program has begun.
Language Processor Development Tools (LPDT)
An LPDT requires two inputs:
1. Specification of a grammar of language L
2. Specification of semantic actions
Two LPDTs which are widely used:
1. LEX (lexical analyzer generator)
2. YACC (parser generator)
Each contains translation rules of the form
<string specification> { <semantic action> }
Figure: A front-end generator accepts the grammar of L and the semantic actions, and produces a front end that performs scanning, parsing and semantic analysis, turning a source program into IR.
Figure: LEX turns a lexical specification into a scanner, and YACC turns a syntax specification into a parser; together they process a source program in L into IR.
LEX
• A LEX specification consists of two components:
– Specification of strings such as identifiers, constants, etc.
– Specification of semantic actions
• The specification file has three sections:

%{
/* symbol definitions */               1st section
%}
%%
/* translation rules of the form
   <specification> { <action> } */     2nd section
%%
/* routines */                         3rd section
%{
letter [A-Za-z]
digit  [0-9]
%}
%%
begin                         {return(BEGIN);}
end                           {return(END);}
{letter}({letter}|{digit})*   {yylval = enter_id(); return(ID);}
{digit}+                      {yylval = enter_num(); return(NUM);}
%%
enter_id()
{ /* enters id in symbol table */ }
enter_num()
{ /* enters number in constants table */ }
YACC
• String specifications resemble grammar productions.
• YACC performs reductions according to the grammar.
Example
%%
E : E + T     {$$ = gencode('+', $1, $3);}
  | T         {$$ = $1;}
  ;
T : T * V     {$$ = gencode('*', $1, $3);}
  | V         {$$ = $1;}
  ;
V : id        {$$ = gendesc($1);}
  ;
%%
gencode(operator, operand_1, operand_2) { }
gendesc(symbol) { }
Thank You!!!