* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Here
Survey
Document related concepts
Transcript
Languages and Compilers (SProg og Oversættere) Bent Thomsen Department of Computer Science Aalborg University With acknowledgement to John Mitchell and Elsa Gunter who’s slides this lecture is based on. 1 Sequence control • Implicit and explicit sequence control – Expressions • Precedence rules • Associativity – Statements • Sequence • Conditionals • Iterations – Subprograms – Declarative programming • Functional • Logic programming 2 Expression Evaluation • Determined by – operator evaluation order – operand evaluation order • Operators: – Most operators are either infix or prefix (some languages have postfix) – Order of evaluation determined by operator precedence and associativity 3 Example • What is the result for: 3+4*5+6 • Possible answers: – – – – 41 = ((3 + 4) * 5) + 6 47 = 3 + (4 * (5 + 6)) 29 = (3 + (4 * 5)) + 6 = 3 + ((4 * 5) + 6) 77 = (3 + 4) * (5 + 6) 4 Example Again • In most language, 3 + 4 * 5 + 6 = 29 • … but it depends on the precedence of operators 5 Operator Precedence • Operators of highest precedence evaluated first (bind more tightly). Level Operator Operation Highest ** abs not Exp, abs, negation • Precedence for operators usually given in a table, e.g.: • In APL, all infix operators have same precedence * / mod rem Lowest +- Unary +-& Binary = <= < > => Relations And or xor Boolean Precedence table for ADA 6 C precedence levels • • • • • • • • • • • • • • • • • • • • • • • Precedence Operators 17 tokens, a[k], f() .,-> 16 ++, -15* ++, -, -, sizeof !,&,* 14 typename 13 *, /, % 12 +,11 <<, >> 10 <,>,<=, >= 9 ==, != 8 & 7 6 | 5 && 4 || 3 ?: 2 =, +=, -=, *=, /=, %=, <<=, >>=, &=, =, |= 1 , Programming Language design and Implementation -4th Edition Copyright©Prentice Hall, 2000 Operator names Literals, subscripting, function call Selection Postfix increment/decrement Prefix inc/dec Unary operators, storage Logical negation, indirection Casts Multiplicative operators Additive operators Shift Relational Equality Bitwise and Bitwise xor Bitwise or Logical and Logical or Conditional Assignment Sequential evaluation 7 Associativity • • • When we have sorted precedence we need to sort associativity! What is the value of: 7–5–2 Possible answers: – In Pascal, C++, SML associate to the left 7 – 5 – 2 = (7 – 5) – 2 = 0 – In APL, associate to the right 7 – 5 – 2 = 7 – (5 – 2) = 4 8 Special Associativity • In languages with built in support for infix exponent operator, it is standard for it to associate to the right: 2 ** 3 ** 4 = 2 ** (3 ** 4) • In ADA, exponentiation in non-associative; must use parentheses 9 Operand Evaluation Order • Example: A := 5; f(x) = {A := x+x; return x}; B := A + f(A); • What is the value of B? • 10 or 15? 10 Example • If assignment returns the assigned value, what is the result of x = 5; y = (x = 3) + x; • Possible answers: 6 or 8 • Depends on language, and sometimes compiler – C allows compiler to decide – SML forces left-to-right evaluation • Note assignment in SML returns a unit value • .. but we could define a derived assignment operator in SML as fn (x,v)=>(x:=v;v) 11 Solution to Operand Evaluation Order • Disallow all side-effect – “Purely” functional languages try to do this – Miranda, Haskell – Problem: I/O, error conditions such as overflow are inherently side-effecting 12 Solution to Operand Evaluation Order • Disallow all side-effects in expressions but allow in statements – Problem: not applicable in languages with nesting of expressions and statements 13 Solution to Operand Evaluation Order • Fix order of evaluation – SML does this – left to right – Problem: makes some compiler optimizations hard to impossible • Leave it to the programmer to be sure the order doesn’t matter – Problem: error prone 14 Short-circuit Evaluation • Boolean expressions: • Example: x <> 0 andalso y/x > 1 • Problem: if andalso is ordinary operator and both arguments must be evaluated, then y/x will raise an error when x = 0 15 Boolean Expressions • Most languages allow (some version of) if…then…else, andalso, orelse not to evaluate all the arguments • if true then A else B – doesn’t evaluate B 16 Boolean Expressions • if false then A else B – doesn’t evaluate A • if b_exp then A else B – Evaluates b_exp, then applies previous rules 17 Boolen Expressions • Bexp1 andalso Bexp2 – If Bexp1 evaluates to false, doesn’t evaluate Bexp2 • Bexp1 orelse Bexp2 – If Bexp1 evaluates to true, doesn’t evaluate Bexp2 18 Short-circuit Evaluation – Other Expressions • Example: 0 * A = 0 • Do we need to evaluate A? • In general, in f(x,y,…,z) are the arguments to f evaluated before f is called and the values are passed, or are the unevaluated expressions passed as arguments to f allowing f to decide which arguments to evaluate and in which order? 19 Eager Evaluation • If language requires all arguments to be evaluated before function is called, language does eager evaluation and the arguments are passed using pass by value (also called call by value) or pass by reference 20 Lazy Evaluation • If language allows function to determine which arguments to evaluate and in which order, language does lazy evaluation and the arguments are passed using pass by name (also called call by name) 21 Lazy Evaluation • Lazy evaluation mainly done in purely functional languages • Some languages support a mix • Effect of lazy evaluation can be implemented in functional language with eager evaluation – Use thunking fn()=>exp and pass function instead of exp 22 Infix and Prefix • • • • • • • • • Infix notation: Operator appears between operands: 2+35 3+69 Implied precedence: 2 + 3 * 4 2 + (3 * 4 ), not (2 + 3 ) * 4 Prefix notation: Operator precedes operands: +235 + 2 * 3 5 (+ 2 ( * 3 5 ) ) + 2 15 17 Prefix notation is sometimes called Cambridge Polish notation – used as basis for LISP 23 Polish Postfix • Postfix notation: Operator follows operands: • 23+5 • 2 3 * 5 + (( 2 3 * 5 +) 6 5 + 11 • Called Polish postfix since few could pronounce Polish mathematician Lukasiewicz, who invented it. • An interesting, but unimportant mathematical curiosity when presented in 1920s. Only became important in 1950s when Burroughs rediscovered it for their ALGOL compiler. 24 Evaluation of postfix • 1. If argument is an operand, stack it. • 2. If argument is an n-ary operator, then the n arguments are already on the stack. Pop the n arguments from the stack and replace by the value of the operator applied to the arguments. • • • • • • • • Example: 2 3 4 + 5 * + 1. 2 - stack 2. 3 - stack 3. 4 - stack 4. + - replace 3 and 4 on stack by 7 5. 5 - stack 6. * - replace 5 and 7 on stack by 35 7. + - replace 35 and 2 on stack by 37 25 Importance of Postfix to Compilers • • Code generation same as expression evaluation. To generate code for 2 3 4 + 5 * +, do: 1. 2. 3. 4. 5. 6. 7. stack L-value of 2 stack L-value of 3 stack L-value of 4 + - generate code to take R-value of top stack element (L-value of 4) and add to R-value of next stack element (L-value of 3) and place Lvalue of result on stack stack L-value of 5 * - generate code to take R-value of top stack element (L-value of 5) and multiply to R-value of next stack element (L-value of 7) and place L-value of result on stack + - generate code to take R-value of top stack element (L-value of 35) and add to R-value of next stack element (L-value of 2) and place Lvalue of result (37) on stack 26 Control of Statement Execution • • • • Sequential Conditional Selection Looping Construct Must have all three to provide full power of a Computing Machine 27 Basic sequential operations • Skip • Assignments – Most languages treat assignment as a basic operation – Some languages have derived assignment operators such as: • += and *= in C – In SML assignment is just another (infix) function • := : ‘‘a ref * ‘‘a -> unit • I/O – Some languages treat I/O as basic operations – Others like, C, SML, Java treat I/O as functions/methods • Sequencing – C;C • Blocks – Begin …end – {…} 28 Conditional Selection • Design Considerations: – What controls the selection – What can be selected: modern languages allow any kind of program block – What is the meaning of nested selectors 29 Conditional Selection • Single-way – IF … THEN … – Controlled by boolean expression 30 Conditional Selection • Two-way – IF … THEN … ELSE – Controlled by boolean expression – IF … THEN … usually treated as degenerate form of IF … THEN … ELSE – IF…THEN together with IF..THEN…ELSE require disambiguating associativity 31 Multi-Way Conditional Selection • CASE – Typically controlled by scalar type – Each selection has own block of statements it executes – What if no selection is is given? • Language gives default behavior • Language forces total coverage, typically with programmer-defined default case 32 Multi-Way Conditional Selection • SWITCH – – – – Similar to Case One block of code for whole switch Selection specifies program point in block break used for early exit from block • ELSEIF – Equivalent to nested if…then…else… 33 Multi-Way Conditional Selection • Non-deterministic Choice – Syntax: if <boolean guard> -> <statement> [] <boolean guard> -> <statement> ... [] <boolean guard> -> <statement> fi 34 Multi-Way Conditional Selection • Non-deterministic Choice – Semantics: • Randomly choose statement whose guard is true • If none –Do nothing –Cause runtime error 35 Multi-Way Conditional Selection • Pattern Matching in SML datatype ‘a tree = LF of ‘a | ND of (‘a tree)*(‘a tree) - fun print_tree (LF x) = (print(“Leaf “);print_a(x)) | print_tree (ND(x,y)) = (print(“Node”); print_tree(x); print_tree(y)); 36 Multi-Way Conditional Selection • Search in Logic Programming – – – – – Clauses of form <head> :- <body> Select clause whose head unifies with current goal Instantiate body variables with result of unification Body becomes new sequence of goals 37 Example • APPEND in Prolog: append([a,b,c], [d,e], X) • X = [a,b,c,d,e] • Definition: • append([ ], X, X). • append( [ H | T], Y, [ H | Z]) :- append(T, Y, Z). Programming Language design and Implementation -4th Edition Copyright©Prentice Hall, 2000 38 Loops • • • • Main types: Counter-controlled iteraters (For-loops) Logical-test iterators Recursion 39 For-loops • Controlled by loop variable of scalar type with bounds and increment size • Scope of loop variable? – Extent beyond loop? – Within loop? 40 For-loops • When are loop parameters calculated? – Once at start – At beginning of each pass 41 Logic-Test Iterators • While-loops – Test performed before entry to loop • repeat…until and do…while – Test performed at end of loop – Loop always executed at least once 42 Gotos • • • • Requires notion of program point Transfers execution to given program point Basic construct in machine language Implements loops 43 Gotos • Makes programs hard to read and reason about • Hard to know how a program got to a given point • Generally thought to be a bad idea in a high level language 44 Fortran Control Structure 10 IF (X .GT. 0.000001) GO TO 20 11 X = -X IF (X .LT. 0.000001) GO TO 50 20 IF (X*Y .LT. 0.00001) GO TO 30 X = X-Y-Y 30 X = X+Y ... 50 CONTINUE X =A Y = B-A GO TO 11 … 45 Historical Debate • Dijkstra, Go To Statement Considered Harmful – Letter to Editor, C ACM, March 1968 – Now on web: http://www.acm.org/classics/oct95/ • Knuth, Structured Prog. with go to Statements – You can use goto, but do so in structured way … • Continued discussion – Welch, GOTO (Considered Harmful)n, n is Odd • General questions – Do syntactic rules force good programming style? – Can they help? 46 Control structures represented as flowcharts Programming Language design and Implementation -4th Edition Copyright©Prentice Hall, 2000 47 Programming Language design and Implementation -4th Edition Copyright©Prentice Hall, 2000 48 Spaghetti code Programming Language design and Implementation -4th Edition Copyright©Prentice Hall, 2000 49 Prime Programs • Earlier discussion on control structures seemed somewhat ad hoc. • Is there a theory to describe control structures? • Do we have the right control structures? • Roy Maddux in 1975 developed the concept of a prime program as a mechanism for answering these questions. Programming Language design and Implementation -4th Edition Copyright©Prentice Hall, 2000 50 Proper programs • A proper program is a flowchart with: – 1 entry arc – 1 exit arc – There is a path from entry arc to any node to exit arc • A prime program is a proper program which has no embedded proper subprogram of greater than 1 node. (i.e., cannot cut 2 arcs to extract a prime subprogram within it). • A composite program is a proper program that is not prime. Programming Language design and Implementation -4th Edition Copyright©Prentice Hall, 2000 51 Prime decomposition • Every proper program can be decomposed into a hierarchical set of prime subprograms. This decomposition is unique (except for special case of linear sequences of function nodes). Programming Language design and Implementation -4th Edition Copyright©Prentice Hall, 2000 52 Structured programming • Issue in 1970s: Does this limit what programs can be written? • Resolved by Structure Theorem of Böhm-Jacobini. • Here is a graph version of theorem originally developed by Harlan Mills: Programming Language design and Implementation -4th Edition Copyright©Prentice Hall, 2000 53 Structured Programming • Use of prime programs to define structured programming: • Concept first used by Dijkstra in 1968 as gotoless programming. • Called structured programming in early 1970s• Program only with if, while and sequence control structures. Programming Language design and Implementation -4th Edition Copyright©Prentice Hall, 2000 54 Advance in Computer Science • Standard constructs that structure jumps if … then … else … end while … do … end for … { … } case … • Modern style – Group code in logical blocks – Avoid explicit jumps except for function return – Cannot jump into middle of block or function body • But there may be situations when “jumping” is the right thing to do! 55 Exceptions: Structured Exit • Terminate part of computation – – – – Jump out of construct Pass data as part of jump Return to most recent site set up to handle exception Unnecessary activation records may be deallocated • May need to free heap space, other resources • Two main language constructs – Declaration to establish exception handler – Statement or expression to raise or throw exception Often used for unusual or exceptional condition, but not necessarily. 56 Exceptions • Exception: caused by unusual event – Detected by hardware – Detected in program • By compiler • By explicit code in program 57 Exceptions • Built-in only or also user defined • Can built-in exceptions be raised explicitly in code • Carry value (such as a string) or only label 58 Exceptions • Exception handling: Control of execution in presence of exception • Can be simulated by programmer explicitly testing for error conditions and specifying actions – But this is error prone and clutters programs 59 Exception Handlers • Is code separate unit from code that can raise the exception • How is an exception handler bound to an exception • What is the scope of a handles: must handler be local to code unit that raises it • After handler is finished, where does the program continue, if at all • If no handler is explicitly present, should there be an implicit default handler 60 ML Example exception Determinant; (* declare exception name *) fun invert (M) = (* function to invert matrix *) … if … then raise Determinant (* exit if Det=0 *) else … end; ... invert (myMatrix) handle Determinant => … ; Value for expression if determinant of myMatrix is 0 61 C++ Example Matrix invert(Matrix m) { if … throw Determinant; … }; try { … invert(myMatrix); … } catch (Determinant) { … // recover from error } 62 C++ vs ML Exceptions • C++ exceptions – Can throw any type – Stroustrup: “I prefer to define types with no other purpose than exception handling. This minimizes confusion about their purpose. In particular, I never use a built-in type, such as int, as an exception.” -- The C++ Programming Language, 3rd ed. • ML exceptions – Exceptions are a different kind of entity than types. – Declare exceptions before use Similar, but ML requires the recommended C++ style. 63 ML Exceptions • Declaration exception name of type gives name of exception and type of data passed when raised • Raise raise name parameters expression form to raise and exception and pass data • Handler exp1 handle pattern => exp2 evaluate first expression if exception that matches pattern is raised, then evaluate second expression instead General form allows multiple patterns. 64 Which handler is used? exception Ovflw; fun reciprocal(x) = if x<min then raise Ovflw else 1/x; (reciprocal(x) handle Ovflw=>0) / (reciprocal(y) handle Ovflw=>1); • Dynamic scoping of handlers – First call handles exception one way – Second call handles exception another – General dynamic scoping rule Jump to most recently established handler on run-time stack • Dynamic scoping is not an accident – User knows how to handler error – Author of library function does not 65 Exception for Error Condition - datatype ‘a tree = LF of ‘a | ND of (‘a tree)*(‘a tree) - exception No_Subtree; - fun lsub (LF x) = raise No_Subtree | lsub (ND(x,y)) = x; > val lsub = fn : ‘a tree -> ‘a tree – This function raises an exception when there is no reasonable value to return 66 Exception for Efficiency • Function to multiply values of tree leaves fun prod(LF x) = x | prod(ND(x,y)) = prod(x) * prod(y); • Optimize using exception fun prod(tree) = let exception Zero fun p(LF x) = if x=0 then (raise Zero) else x | p(ND(x,y)) = p(x) * p(y) in p(tree) handle Zero=>0 end; 67 Dynamic Scope of Handler scope exception X; (let fun f(y) = raise X and g(h) = h(1) handle X => 2 in g(f) handle X => 4 end) handle X => 6; handler Which handler is used? 68 Compare to static scope of variables exception X; (let fun f(y) = raise X and g(h) = h(1) handle X => 2 in g(f) handle X => 4 end) handle X => 6; val x=6; (let fun f(y) = x and g(h) = let val x=2 in h(1) in let val x=4 in g(f) end); 69 Exceptions and Resource Allocation exception X; (let val x = ref [1,2,3] in let val y = ref [4,5,6] in … raise X end end); handle X => ... • Resources may be allocated between handler and raise • May be “garbage” after exception • Examples – – – – Memory Lock on database Threads … General problem: no obvious solution 70 Continuations • General technique using higher-order functions – Allows “jump” or “exit” by function call • Used in compiler optimization – Make control flow of program explicit • General transformation to “tail recursive form” • Idea: – The continuation of an expression is “the remaining work to be done after evaluating the expression” – Continuation of e is a function applied to e 71 Example • Expression – 2*x + 3*y + 1/x + 2/y • What is continuation of 1/x? – Remaining computation after division let val before = 2*x + 3*y fun continue(d) = before + d + 2/y in continue (1/x) end 72 Other uses for continuations • Explicit control – Normal termination -- call continuation – Abnormal termination -- do something else • Compilation techniques – Call to continuation is functional form of “go to” – Continuation-passing style makes control flow explicit MacQueen: “Callcc is the closest thing to a ‘come-from’ statement I’ve ever seen.” 73 Capturing Current Continuation • Language feature – callcc : call a function with current continuation – Can be used to abort subcomputation and go on • Examples – callcc (fn k => 1); > val it = 1 : int • Current continuation is “fn x => print x” • Continuation is not used in expression. – 1 + callcc(fn k => 5 + throw k 2); > val it = 3 : int • Current continuation is “fn x => print 1+x” • Subexpression throw k 2 applies continuation to 2 74 More with callcc • Example 1 + callcc(fn k1=> … callcc(fn k2 => … if … then (throw k1 0) else (throw k2 “stuck”) )) • Intuition – Callcc lets you mark a point in program that you can return to – Throw lets you jump to that point and continue from there 75 Subprograms 1. A subprogram has a single entry point 2. The caller is suspended during execution of the called subprogram 3. Control always returns to the caller when the called subprogram’s execution terminates Functions or Procedures? • • Procedures provide user-defined statements Functions provide user-defined operators 76 Subprograms • Specification: name, signature, actions • Signature: number and types of input arguments, number and types of output results – Book calls it protocol 77 Subprograms • Actions: direct function relating input values to output values; side effects on global state and subprogram internal state • May depend on implicit arguments in form of non-local variables 78 Subprogram As Abstraction • Subprograms encapsulate local variables, specifics of algorithm applied – Once compiled, programmer cannot access these details in other programs 79 Subprogram As Abstraction • Application of subprogram does not require user to know details of input data layout (just its type) – Form of information hiding 80 Subprogram Parameters • Formal parameters: names (and types) of arguments to the subprogram used in defining the subprogram body • Actual parameters: arguments supplied for formal parameters when subprogram is called 81 Subprogram Parameters • Parameters may be used to: – Deliver a value to subprogram – in mode – Return a result from subprogram – out mode – Both – in out mode • Most languages use only in mode • Ada uses all 82 Parameter Passing (In Mode) • Pass-by-value: Subprogram make local copy of value (r-value) given to input parameter • Assignments to parameter not visible outside program • Most common in current popular languages 83 Parameter Passing (In Mode) • Pass-by-reference: Subprogram is given an access path (l-value) to the input parameter which is used directly • Assignments to parameter directly effect nonlocal variables • Same effect can be had in pass-by-value by passing a pointer 84 Parameter Passing (In Mode) • Pass-by-name: text for argument is passed to subprogram and expanded in in each place parameter is used – Roughly same as using macros • Achieves late binding 85 Pass-by-name Example integer INDEX= 1; integer array ARRAY[1:2] procedure UPDATE (PARAM); integer PARAM begin PARAM := 3; INDEX := INDEX + 1; PARAM := 5; end UPDATE(ARRAY[INDEX]); •The above code puts 3 in ARRAY[1] and 5 in ARRAY[2] 86 Subprogram Implementation • Subprogram definition gives template for its execution – May be executed many times with different arguments 87 Subprogram Implementation • Template holds code to create storage for program data, code to execute program statements, code to delete program storage when through, and storage for program constants • Layout of all but code for program statments called activation record 88 Subprogram Implementation • Code segment Activation Record Code to create activation record inst Program code Code to delete activation record inst const 1 Return point Static link Dynamic link Result data Input parameters Local parameters const n 89 Activation Records • Code segment generated once at compiled time • Activation record instances created at run time • Fresh activation record instance created for each call of subprogram – Usually put in a stack (top inserted first) 90 Activation Records • Static link – pointer to bottom of activation record instance of static parent – Used to find non-local variables • Dynamic link – pointer to top of activation record instance of the caller – Used to delete subprogram activation record instance at completion 91 Summary • Expression – Precedence and associativity – Evaluation of formal arguments • Eager, lazy or mixed • Structured Programming – – – – Basic statements Conditionals loops Go to considered harmful • Exceptions – “structured” jumps that may return a value – dynamic scoping of exception handler • Continuations – Function representing the rest of the program – Generalized form of tail recursion – Used in Lisp, ML compilation • Subprograms 92