Compiler Construction (CS-636)
Muhammad Bilal Bashir
UIIT, Rawalpindi

Outline

1. The Symbol Table
   1. The Structure of the Symbol Table
   2. Declarations
   3. Scope Rules and Block Structure
   4. Interaction of Same Level Declarations
2. Data Types & Type Checking
3. Summary

Semantic Analysis
Lecture: 15-16

The Symbol Table

- The symbol table is the major data structure in a compiler after the syntax tree.
- In some languages the symbol table is involved during parsing and even lexical analysis, where these phases may need to add information to it or look something up in it.
- In a carefully designed language like Pascal or Ada, it is possible and reasonable to put off symbol table operations until after a complete parse, when the program being translated is known to be syntactically correct.
- The principal symbol table operations are:
  - Insert stores the information provided by name declarations.
  - Lookup retrieves the information associated with a name.
  - Delete removes the information provided by a declaration when that declaration no longer applies.
- Typically the symbol table stores data type information, information on the region of applicability (scope), and information on the eventual location in memory.

The Structure of the Symbol Table

- The symbol table in a compiler is a typical dictionary data structure.
- The efficiency of the three basic operations (insert, lookup, and delete) varies according to the organization of the data structure.
- Typical implementations of dictionary structures include linear lists, various search tree structures, and hash tables.
- Linear lists are a good basic data structure that provides an easy and direct implementation of the three basic operations.
  - Insert takes constant time; lookup and delete take time linear in the size of the list.
- Linear lists can be a good choice for implementations where compilation speed is not a major concern.
- Search tree structures are somewhat less useful for the symbol table, partly because they do not provide best case efficiency, but also because of the complexity of the delete operation.
- The hash table is often the best choice for implementing the symbol table.
  - All three basic operations can be performed in almost constant time, and it is the structure used most frequently in practice.
- A hash table is an array of entries, called buckets, indexed by an integer range, usually from 0 to the table size minus one.
- A hash function turns the search key (the identifier name) into an integer hash value in the index range, and the item corresponding to the search key is stored in the bucket at this index.
- The hash function should distribute the key indices as uniformly as possible over the index range, since hash collisions degrade the performance of the lookup and delete operations.
- An important question is how a hash table deals with collisions (often called collision resolution).
- One method allocates only enough space for a single item in each bucket and resolves collisions by inserting new items in successive buckets (this is sometimes called open addressing).
  - In this case the contents of the hash table are limited by the size of the array used for the table, and as the array fills, collisions become more and more frequent.
- The best choice for compilers is the alternative to open addressing, called separate chaining.
- In the separate chaining method, each bucket is actually a linear list.
  - Collisions are resolved by inserting the new item into the bucket list.
- One question remains: how does the hash function work?
- The hash function converts a character string into an integer in the range 0…size-1 in three steps:
  - First, each character in the string is converted into a nonnegative integer.
  - Second, these nonnegative integers are combined in some way to form a single integer.
  - Finally, the resulting integer is scaled into the range 0…size-1.

Declarations

- The behavior of the symbol table depends heavily on the properties of declarations in the language being translated.
  - These properties determine how the insert and delete operations act on the symbol table, when these operations need to be called, and what attributes are inserted into the table.
- There are four basic kinds of declarations:
  1. Constant declaration
  2. Type declaration
  3. Variable declaration
  4. Procedure/Function declaration
- It is easiest to use one symbol table to hold the names from all the different kinds of declarations.
  - This works when the programming language prohibits the use of the same name in different kinds of declarations.
- Occasionally it is easier to use a different symbol table for each kind of declaration.
  - For example, all type declarations are contained in one symbol table, all variable declarations in another, and so on.
- The attributes bound to a name by a declaration vary with the kind of declaration.
  - Constant declarations associate values with names; for this reason constant declarations are sometimes called value bindings.
  - Type declarations bind names to newly constructed types and may also create aliases for existing named types.
  - Variable declarations most often bind names to data types. Besides the data type, a variable declaration may bind further attributes implicitly, e.g. the scope of the variable.
  - Procedure/Function declarations may bind the return type and parameters as attributes.

Scope Rules and Block Structure

- Scope rules vary widely from language to language, but some rules are common to many languages.
- Here we discuss two of these rules: declaration before use and the most closely nested rule for block structure.
- Declaration before use
  - A name must be declared in the text of the program prior to any reference to the name.
  - This permits the symbol table to be built as parsing proceeds and lookups to be performed as soon as a name reference is encountered in the code.
- Block structure
  - It is a common property of programming languages.
  - A language is block structured if it permits the nesting of blocks inside other blocks, and if the scope of declarations in a block is limited to that block and other blocks contained in that block, subject to the most closely nested rule.
  - Most closely nested rule: given several different declarations for the same name, the declaration that applies to a reference is the one in the most closely nested block to the reference.
- To implement nested scopes and the most closely nested rule, the symbol table insert operation must not overwrite previous declarations.
  - The insert operation should hide the previous declaration so that the lookup operation finds only the most recent one.
  - The delete operation must not delete all declarations corresponding to a name, but only the most recent one, uncovering any previous declaration.
  - Symbol table construction can proceed by performing insert operations for all declared names on entry into a block and delete operations on exit from the block.
- Build the symbol table for the following code:

    int i, j;
    int f(int size)
    {
        char i, temp;
        …
        {
            double j;
            …
        }
        …
        {
            char * j;
            …
        }
    }

Interaction of Same Level Declarations

- One main issue that relates to scope is the interaction among declarations at the same level.
- One typical requirement in many languages is that the same name cannot be reused in declarations at the same level.
  - To check this requirement, a compiler must perform a lookup before each insert and determine by some mechanism whether any preexisting declaration with the same name is at the same level.
- Somewhat more difficult is the question of how much information the declarations in a sequence at the same level have available about each other.
- Consider the following code:

    int a = 1;
    void f(void)
    {
        int a = 2, j = a+1;
        …
    }
    // which 'a' will be used to assign a value to 'j'?

- If each declaration is added to the symbol table as it is processed, it is called sequential declaration.
- If all the declarations are processed simultaneously and added to the symbol table at once at the end of a section, it is called collateral declaration.
- For each recursive declaration of a function or procedure, the compiler must insert the name of the function or procedure as soon as it finds the declaration; otherwise the compiler may treat the recursive call as an error.
  - The error would be use before declaration.

Data Types & Type Checking

- One of the principal tasks of a compiler is the computation and maintenance of information on data types (type inference).
- The compiler uses this information to ensure that each part of the program makes sense under the type rules of the language (type checking).
- Data type information can occur in a program in several different forms.
- Theoretically, a data type is a set of values, or more precisely a set of values with certain operations on those values.

Summary

Any Questions?
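The separate-chaining hash table and the nested-scope rules described in this lecture can be sketched in C roughly as follows. This is a minimal illustration, not the lecture's own implementation: the names `Entry`, `st_insert`, `st_lookup`, `st_enter_block`, and `st_exit_block`, the table size 211, and the shift amount 4 are all assumptions chosen for the example. New entries go at the head of a bucket's list, so a lookup finds the most recent declaration first, and leaving a block removes only the entries declared in that block.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 211 /* assumed bucket count (a prime) */
#define SHIFT 4        /* assumed shift used when combining characters */

typedef struct Entry {
    const char *name;   /* identifier; the caller keeps the string alive */
    const char *type;   /* declared type, kept as text for illustration */
    int level;          /* nesting level of the declaring block */
    struct Entry *next; /* next entry in the same bucket list */
} Entry;

static Entry *table[TABLE_SIZE]; /* each bucket is a linear list */
static int current_level = 0;    /* 0 is the global level */

/* The three steps: each character becomes a nonnegative integer,
   the integers are combined, and the result is scaled by the modulus. */
static unsigned hash(const char *key) {
    unsigned h = 0;
    for (; *key != '\0'; key++)
        h = ((h << SHIFT) + (unsigned char)*key) % TABLE_SIZE;
    return h;
}

/* Returns the most recent visible declaration of name, or NULL. */
Entry *st_lookup(const char *name) {
    for (Entry *e = table[hash(name)]; e != NULL; e = e->next)
        if (strcmp(e->name, name) == 0)
            return e;
    return NULL;
}

/* Inserts at the head of the bucket list, hiding any outer declaration.
   Returns 0 if the name is already declared at the current level. */
int st_insert(const char *name, const char *type) {
    Entry *prev = st_lookup(name);
    if (prev != NULL && prev->level == current_level)
        return 0;
    Entry *e = malloc(sizeof *e);
    assert(e != NULL);
    unsigned h = hash(name);
    e->name = name;
    e->type = type;
    e->level = current_level;
    e->next = table[h];
    table[h] = e;
    return 1;
}

void st_enter_block(void) { current_level++; }

/* Deletes only the declarations of the block being left. They sit at the
   heads of the bucket lists, so outer declarations are uncovered again. */
void st_exit_block(void) {
    for (int i = 0; i < TABLE_SIZE; i++) {
        while (table[i] != NULL && table[i]->level == current_level) {
            Entry *dead = table[i];
            table[i] = dead->next;
            free(dead);
        }
    }
    current_level--;
}
```

Applied to the exercise code above, inserting `i` and `j` at the global level, then `char i` inside `f`, then `double j` inside the first inner block makes `st_lookup("j")` return the `double` entry until `st_exit_block` is called, after which the global `int j` is visible again. Scanning every bucket on block exit keeps the sketch short; a production table would more likely keep a per-scope list of entries.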
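The question posed about `int a = 2, j = a+1;` can be checked directly in C, which uses sequential declaration: the scope of the local `a` begins as soon as its declarator is complete, so the initializer of `j` already sees it. The sketch below is the lecture's fragment made runnable; returning `j` from `f` and the helper `check_sequential` are additions made only for this illustration.

```c
#include <assert.h>

int a = 1; /* global declaration */

int f(void) {
    /* Sequential declaration: the local a is inserted before j is
       processed, so a + 1 refers to the local a (value 2). */
    int a = 2, j = a + 1;
    return j;
}

/* Hypothetical helper: confirms that j was computed from the local a. */
void check_sequential(void) {
    assert(f() == 3);
}
```

Under a collateral rule, both local declarations would be evaluated before either was added to the symbol table, so `a + 1` would refer to the global `a` and `j` would be 2 instead.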
