SYLLABUS

UNIT-I: Preliminary Concepts
Reasons for studying concepts of programming languages, programming domains, language evaluation criteria, influences on language design, language categories, programming paradigms (imperative, object-oriented, functional programming, logic programming). Programming language implementation: compilation and virtual machines, programming environments.

UNIT-II: Syntax and Semantics
General problem of describing syntax and semantics, formal methods of describing syntax (BNF, EBNF for common programming language features), parse trees, ambiguous grammars, attribute grammars, denotational semantics and axiomatic semantics for common programming language features.

UNIT-III: Data types
Introduction, primitive, character, user-defined, array, associative, record, union, pointer and reference types, design and implementation uses related to these types, names, variables, concept of binding, type checking, strong typing, type compatibility, named constants, variable initialization.

UNIT-IV: Expressions and Statements
Arithmetic, relational and Boolean expressions, short-circuit evaluation, mixed-mode assignment, assignment statements, control structures (statement level, compound statements, selection, iteration, unconditional statements), guarded commands.

UNIT-V: Subprograms and Blocks
Fundamentals of subprograms, scope and lifetime of variables, static and dynamic scope, design issues of subprograms and operations, local referencing environments, parameter passing methods, overloaded subprograms, generic subprograms, parameters that are subprogram names, design issues for functions, user-defined overloaded operators, coroutines.

UNIT-VI: Abstract Data types
Abstractions and encapsulation, introduction to data abstraction, design issues, language examples, C++ parameterized ADT, object-oriented programming in Smalltalk, C++, Java, C#, Ada 95.
Concurrency: subprogram-level concurrency, semaphores, monitors, message passing, Java threads, C# threads.

UNIT-VII: Exception handling
Exceptions, exception propagation, exception handlers in Ada, C++ and Java.
Logic Programming Language: introduction and overview of logic programming, basic elements of Prolog, applications of logic programming.

UNIT-VIII: Functional Programming Languages
Introduction, fundamentals of FPL, LISP, ML, Haskell, applications of functional programming languages, and comparison of functional and imperative languages.

CSE–PPL
UNIT – II
Describing Syntax and Semantics

Introduction
• Syntax: the form or structure of the expressions, statements, and program units.
• Semantics: the meaning of the expressions, statements, and program units.
Ex: while (<Boolean_expr>) <statement>
The semantics of this statement form is that when the current value of the Boolean expression is true, the embedded statement is executed.
The form of a statement should strongly suggest what the statement is meant to accomplish.

The General Problem of Describing Syntax
• A sentence (or "statement") is a string of characters over some alphabet.
• A language is a set of sentences.
• The syntax rules of a language specify which strings of characters from the language's alphabet are in the language.
• Formal descriptions of the syntax of programming languages, for simplicity's sake, often do not include descriptions of the lowest-level syntactic units. These small units are called lexemes.
• A lexeme is the lowest-level syntactic unit of a language. Lexemes include identifiers, literals, operators, and special words (e.g., *, sum, begin).
• A program is a string of lexemes.
• A token is a category of lexemes (e.g., identifier). The token identifier has lexemes, or instances, such as sum and total.
• Pattern: a description of the form that the lexemes of a token may take. Patterns are usually defined using regular expressions.
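The relationship between lexemes, tokens and patterns described above can be sketched as a small regex-based tokenizer. This is only an illustrative sketch: the token names and regular expressions in TOKEN_PATTERNS, and the function name tokenize, are assumptions made for this example and are not taken from any particular language definition.

```python
import re

# Hypothetical token names and patterns, chosen for illustration only.
# Each entry pairs a token (a category) with the pattern its lexemes must match.
TOKEN_PATTERNS = [
    ("identifier", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("int_literal", r"[0-9]+"),
    ("equal_sign", r"="),
    ("mult_op", r"\*"),
    ("plus_op", r"\+"),
    ("semicolon", r";"),
]

# One alternation of named groups; the name of the group that matched is the token.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_PATTERNS))


def tokenize(text):
    """Return a list of (lexeme, token) pairs for the given source text."""
    pairs = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():  # whitespace separates lexemes and is skipped
            pos += 1
            continue
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"no token pattern matches {text[pos]!r} at position {pos}")
        pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs


if __name__ == "__main__":
    for lexeme, token in tokenize("index = 2 * count + 17;"):
        print(f"{lexeme:<6} {token}")
```

Each pair returned by tokenize holds one lexeme together with the token it belongs to, so sum and total would both come back as identifier.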
Ex: index = 2 * count + 17;

Lexemes    Tokens
index      identifier
=          equal_sign
2          int_literal
*          mult_op
count      identifier
+          plus_op
17         int_literal
;          semicolon

Programming language: programming languages are notations for describing computation to people and to machines. The study of programming languages, like the study of natural languages, can be divided into syntax and semantics.

Languages can be formally defined in two ways:

Language Recognizers: Suppose a language L uses an alphabet Σ of characters. To define L by recognition, we construct a machine R, called a recognition device, that is capable of reading strings of characters from Σ. R indicates whether a given input string is or is not in L, so R either accepts or rejects the string.
• The syntax analysis part of a compiler is a recognizer for the language the compiler translates.
• Syntax analyzers determine whether given programs are in the language, that is, whether they are syntactically correct.

Language Generators: A language generator is a device that generates the sentences of a language.

Formal Methods of Describing Syntax

Backus-Naur Form and Context-Free Grammars
BNF is a syntax description formalism that became the most widely used method for describing programming language (P/L) syntax.

Context-free Grammars
– Developed by Noam Chomsky in the mid-1950s, who described four classes of generative devices or grammars that define four classes of languages.
– Context-free and regular grammars are useful for describing the syntax of P/Ls.
– Tokens of P/Ls can be described by regular grammars.
– Whole P/Ls can be described by context-free grammars.
– A lexical analyzer (also called a tokenizer) reads a stream of characters, identifies the lexemes in the stream, and categorizes them into tokens.

Origin of Backus-Naur Form (1959)
– Invented by John Backus to describe ALGOL 58 syntax.
– The notation was later modified by Peter Naur for the description of ALGOL 60; this revised method of syntax description became known as Backus-Naur Form (BNF).
– BNF is equivalent to the context-free grammars used for describing syntax.

Fundamentals
– A metalanguage is a language used to describe another language (e.g., BNF).
– In BNF, abstractions are used to represent classes of syntactic structures; they act like syntactic variables (also called nonterminal symbols).
    <while_stmt> -> while ( <logic_expr> ) <stmt>
    <assign> -> <var> = <expression>
– Each of these is a rule; the first describes the structure of a while statement.
– A rule has a left-hand side (LHS), which is the abstraction being defined, and a right-hand side (RHS), which consists of some mixture of tokens, lexemes, and references to other abstractions, that is, of terminal and nonterminal symbols.
– A grammar is a finite nonempty set of rules. The abstractions are called nonterminal symbols, or simply nonterminals. The lexemes and tokens of the rules are called terminal symbols, or terminals.
– An abstraction (or nonterminal symbol) can have more than one RHS:
    <stmt> -> <single_stmt> | begin <stmt_list> end
– Multiple definitions can be written as a single rule, with the different definitions separated by the symbol |, meaning logical OR.

Describing Lists
• Syntactic lists are described using recursion.
    <ident_list> -> ident | ident, <ident_list>
• A rule is recursive if its LHS appears in its RHS.

Grammars and derivations
The sentences of the language are generated through a sequence of applications of the rules, beginning with a special nonterminal of the grammar called the start symbol.
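As a rough sketch of how a recursive rule generates sentences from a start symbol, the recursive rule for <ident_list> given above can be stored as data and expanded mechanically. The dictionary layout, the names GRAMMAR, START and generate, and the limit on the number of rule applications are all assumptions made for this illustration.

```python
# The recursive list rule from the text, stored as
# nonterminal -> list of alternative right-hand sides.
GRAMMAR = {
    "<ident_list>": [
        ["ident"],
        ["ident", ",", "<ident_list>"],
    ],
}
START = "<ident_list>"


def generate(max_steps):
    """Return the sentences reachable from START in at most max_steps rule applications."""
    sentences = []
    frontier = [[START]]  # strings of symbols that still contain a nonterminal
    for _ in range(max_steps):
        next_frontier = []
        for form in frontier:
            # Expand the leftmost nonterminal with every alternative of its rule.
            idx = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
            for rhs in GRAMMAR[form[idx]]:
                new_form = form[:idx] + rhs + form[idx + 1:]
                if any(sym in GRAMMAR for sym in new_form):
                    next_frontier.append(new_form)
                else:
                    sentences.append(" ".join(new_form))  # only terminals remain
        frontier = next_frontier
    return sentences


if __name__ == "__main__":
    for sentence in generate(3):
        print(sentence)
```

Because the rule refers to itself, each extra rule application allows one more identifier in the list, which is how a finite grammar can describe lists of any length.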
A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols).

An example grammar:
    <program> -> <stmts>
    <stmts> -> <stmt> | <stmt> ; <stmts>
    <stmt> -> <var> = <expr>
    <var> -> a | b | c | d
    <expr> -> <term> + <term> | <term> - <term>
    <term> -> <var> | const

An example derivation:
    <program> => <stmts>
              => <stmt>
              => <var> = <expr>
              => a = <expr>
              => a = <term> + <term>
              => a = <var> + <term>
              => a = b + <term>
              => a = b + const

• Every string of symbols in the derivation, including <program>, is a sentential form.
• A sentence is a sentential form that has only terminal symbols.
• A leftmost derivation is one in which the leftmost nonterminal in each sentential form is the one that is expanded. The derivation continues until the sentential form contains no nonterminals.
• A derivation may be neither leftmost nor rightmost.

Equivalently:
• Each successive string derived from the start symbol of the grammar during the derivation of a particular string is called a sentential form. The start symbol of a grammar is itself a sentential form.
• If a sentential form consists of only terminals, it is called a sentence.

Parse Trees
• Hierarchical structures of the language are called parse trees.
• A parse tree for the simple statement a = b + const:

    <program>
      <stmts>
        <stmt>
          <var>
            a
          =
          <expr>
            <term>
              <var>
                b
            +
            <term>
              const

Ambiguity
• A grammar is ambiguous iff it generates a sentential form that has two or more distinct parse trees.
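The definition of ambiguity can be checked mechanically for small inputs by counting parse trees. The sketch below assumes the expression grammar <expr> -> <expr> <op> <expr> | const with <op> -> / | - and counts how many distinct parse trees derive a given token sequence; the function name count_parse_trees and the splitting strategy are choices made for this illustration, not a general parsing algorithm.

```python
from functools import lru_cache

# Operators produced by <op> in the assumed grammar.
OPS = {"-", "/"}


def count_parse_trees(tokens):
    """Count the parse trees of tokens under <expr> -> <expr> <op> <expr> | const."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways tokens[i:j] can be derived from <expr>.
        if j - i == 1:
            return 1 if tokens[i] == "const" else 0
        total = 0
        # Try every operator position k as the root of the tree: <expr> <op> <expr>.
        for k in range(i + 1, j - 1):
            if tokens[k] in OPS:
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


if __name__ == "__main__":
    sentence = ["const", "-", "const", "/", "const"]
    print(count_parse_trees(sentence))
```

A count greater than one for any sentence means the grammar is ambiguous. For const - const / const the two trees differ in which operator sits at the root, so they group the operands differently.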
Ex: The following grammar yields two distinct parse trees for the same sentence, const - const / const:
    <expr> -> <expr> <op> <expr> | const
    <op> -> / | -

• Ex: The following grammar yields two distinct parse trees for the same sentence, A = B + A * C:
    <assign> -> <id> = <expr>
    <id> -> A | B | C
    <expr> -> <expr> + <expr>
            | <expr> * <expr>
            | ( <expr> )
            | <id>

[Figure: two distinct parse trees for A = B + A * C]

Operator Precedence
• The fact that an operator in an arithmetic expression is generated lower in the parse tree can be used to indicate that it has higher precedence than an operator produced higher up in the tree.
• In the left parse tree of the figure above, the * operator is generated lower than the + operator, so one can conclude that * has precedence over +.
• How about the tree on the right-hand side? There the + operator is generated lower, which would indicate the opposite: that + has precedence over *.

An unambiguous Grammar for Expressions
    <assign> -> <id> = <expr>
    <id> -> A | B | C
    <expr> -> <expr> + <term>
            | <term>
    <term> -> <term> * <factor>
            | <factor>
    <factor> -> ( <expr> )
              | <id>

[Figure: the unique parse tree for A = B + C * A]

Associativity of Operators
• Do parse trees for expressions with two or more adjacent occurrences of operators with equal precedence have those occurrences in proper hierarchical order?
• An example of an assignment using the previous grammar is: A = B + C + A

[Figure: parse tree for A = B + C + A]

• The figure above shows the left + operator lower than the right + operator. This is the correct order if the + operator is meant to be left associative, which is typical.

Extended BNF
Because of minor inconveniences in BNF, it has been extended in several ways. EBNF extensions do not enhance the descriptive power of BNF; they only increase its readability and writability.
• Optional parts are placed in brackets ([ ]):
    <proc_call> -> ident [ ( <expr_list> ) ]
• Alternative parts of RHSs are placed in parentheses and separated by vertical bars:
    <term> -> <term> (+ | -) const
• Repetitions (0 or more) are placed in braces ({ }):
    <ident> -> letter {letter | digit}

BNF:
    <expr> -> <expr> + <term>
            | <expr> - <term>
            | <term>
    <term> -> <term> * <factor>
            | <term> / <factor>
            | <factor>

EBNF:
    <expr> -> <term> {(+ | -) <term>}
    <term> -> <factor> {(* | /) <factor>}

Attribute Grammar
Attribute grammars are used to describe more of the structure of a programming language than can be described with a context-free grammar (CFG). An attribute grammar is an extension to a CFG.

Basic Concepts
Attribute grammars are grammars to which attributes, attribute computation functions, and predicate functions have been added.
• Attributes are associated with grammar symbols. They are similar to variables in the sense that they can have values assigned to them.
• Attribute computation functions, sometimes called semantic functions, are associated with grammar rules. They specify how attribute values are computed.
• Predicate functions, which state some of the syntax and static semantic rules of the language, are associated with grammar rules.

Static semantics: Some characteristics of the structure of programming languages are difficult to describe with BNF, and some are impossible. Consider a type compatibility rule: given one integer variable and one float variable, assigning the int to the float is allowed, but assigning the float to the int is not. This restriction can be specified in BNF, but it requires additional nonterminal symbols and rules. Problems of this kind fall under the category of language rules called static semantics.
The static semantic rules of a language often state its type constraints. Static semantics is so named because the analysis required to check these specifications can be done at compile time. Because of the problems of describing static semantics with BNF, a number of mechanisms have been devised for that task. One such mechanism is the attribute grammar, which describes both the syntax and the static semantics of programs.

Attribute Grammar Defined
An attribute grammar is a grammar with the following additional features:
• Each grammar symbol X has a set of attributes A(X). The set A(X) consists of two disjoint sets, S(X) and I(X), called the synthesized and inherited attributes.
• Synthesized attributes are used to pass semantic information up a parse tree. Inherited attributes pass semantic information down a tree.
• For a rule X0 -> X1 X2 … Xn, the synthesized attributes of X0 are computed with semantic functions of the form S(X0) = f(A(X1), A(X2), …, A(Xn)). The value of a synthesized attribute on a parse tree node therefore depends only on the values of the attributes on that node's children.
• Inherited attributes of a symbol Xj, 1 <= j <= n, are computed with a semantic function of the form I(Xj) = f(A(X0), A(X1), …, A(Xn)). The value of an inherited attribute on a parse tree node therefore depends on the attribute values of that node's parent node and of its sibling nodes.

Difference Between Synthesized and Inherited attributes

Synthesized attributes
1) Synthesized attributes are attributes whose values are computed when their symbol occurs on the left-hand side of a production.
2) For a given grammar rule X0 -> X1 X2 … Xn, synthesized attributes follow a bottom-up approach for information flow between the attributes.
3) The general form of the semantic function for a synthesized attribute of X0 is S(X0) = f(A(X1), …, A(Xn)).

Inherited attributes
1) Inherited attributes are attributes whose values are computed when their symbol occurs on the right-hand side of a production.
2) For a given grammar rule X0 -> X1 X2 … Xn, inherited attributes follow a top-down approach for information flow between the attributes.
3) The general form of the semantic function for an inherited attribute of Xj is I(Xj) = f(A(X0), A(X1), …, A(Xn)).

Intrinsic Attributes
These are the synthesized attributes of terminal (leaf) nodes whose values are determined outside the parse tree, for example the type of a variable, which can be looked up in the symbol table.
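The three kinds of attributes and a predicate function can be sketched on a tiny hand-built parse tree for an assignment of the form <var> = <var> + <var>. Everything named here is an assumption made for illustration: the attribute names actual_type and expected_type, the node classes, the symbol table contents, and the rule that int + int gives int while any float operand gives float. The compatibility predicate follows the rule stated earlier, that an int may be assigned to a float but not the reverse.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Var:
    name: str
    actual_type: Optional[str] = None  # intrinsic attribute, set from the symbol table


@dataclass
class Expr:  # stands for <expr> -> <var> + <var>
    left: Var
    right: Var
    actual_type: Optional[str] = None    # synthesized attribute (from children)
    expected_type: Optional[str] = None  # inherited attribute (from parent/sibling)


@dataclass
class Assign:  # stands for <assign> -> <var> = <expr>
    target: Var
    expr: Expr


def decorate(assign, symbol_table):
    """Compute the attributes of the tree and return the result of the predicate."""
    expr = assign.expr
    # Intrinsic attributes: determined outside the parse tree.
    for var in (assign.target, expr.left, expr.right):
        var.actual_type = symbol_table[var.name]
    # Inherited attribute: flows down from the left-hand side of the assignment.
    expr.expected_type = assign.target.actual_type
    # Synthesized attribute: flows up from the operands (assumed typing rule).
    if expr.left.actual_type == "int" and expr.right.actual_type == "int":
        expr.actual_type = "int"
    else:
        expr.actual_type = "float"
    # Predicate function: int may be assigned to float, but float may not go to int.
    return expr.actual_type == expr.expected_type or (
        expr.actual_type == "int" and expr.expected_type == "float"
    )


if __name__ == "__main__":
    table = {"A": "float", "B": "int", "C": "int"}  # hypothetical declarations
    print(decorate(Assign(Var("A"), Expr(Var("B"), Var("C"))), table))
    print(decorate(Assign(Var("B"), Expr(Var("A"), Var("C"))), table))
```

The first assignment stores an int expression into a float variable and passes the predicate; the second tries to store a float expression into an int variable and is rejected, which is the kind of static semantic rule that is awkward to express in BNF alone.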