Lambda calculus

What is λ-calculus?

“Lambda calculus is a formal system in mathematical logic and computer science for expressing computation based on function abstraction and application using variable binding and substitution” --Wikipedia

    λx. x²  ⇔  f(x) = x²

Lambda calculus in history of mathematics

• The lambda calculus was introduced by mathematician Alonzo Church in the 1930s.
• The original system was shown to be logically inconsistent in 1935, when Stephen Kleene and J. B. Rosser developed the Kleene–Rosser paradox.
• Subsequently, in 1936 Church isolated and published just the portion relevant to computation, what is now called the untyped lambda calculus.
• The λ-calculus was rediscovered as a versatile tool in computer science by people like McCarthy, Landin, and Scott in the 1960s. McCarthy came out with his list processing language Lisp in 1960. McCarthy knew of the λ-calculus, and his language closely resembles it.
• Now used as:
    • Tool for investigating computability
    • Basis of functional programming languages: Lisp, Scheme, Haskell, ML, …

Expressions in the λ-calculus

• Expressions in the λ-calculus are written in strict prefix form:
    x ∗ x, x²  →  ∗ x x
• Function and argument are simply written next to each other:
    sin(x)  →  sin x
• If a function takes more than one argument, then these are simply lined up after the function. Thus:
    x + 3  →  + x 3
    x³  →  ∗ x x x
• Brackets are employed only to enforce a special grouping:
    sin(x) + 4  →  + (sin x) 4

* The λ-calculus is a purely syntactic device; it does not make any distinctions between simple entities

Functions in the λ-calculus

• Function formation is sometimes written as an equation, f(x) = 3x, sometimes as a mapping, x ⟼ 3x.
• In the λ-calculus a special notation is available: the Greek letter λ.
• The λ signals that the variable that follows is not part of an expression but the formal parameter of the function. The dot after the formal parameter introduces the function body.
    λx. ∗ 3 x

• A function which has been written in λ-notation can itself be used in an expression:
    (λx. ∗ 3 x) 4
    λy. (λx. ∗ x y)
• Why the brackets around the function? They are there to make clear where the definition of the function ends:
    λx. ∗ 3 x 4  ⟹  3 ∗ x ∗ 4   (assuming that ∗ is interpreted as a 3-ary function)
• Although it is not strictly necessary, it will be convenient to introduce abbreviations for λ-terms:
    F ≝ λx. ∗ 3 4
• If the body of a function consists of another function, as here
    N ≝ λy. (λx. ∗ x y)
  we could also consider it as a function of two arguments.
• If we want to see it as a function of two arguments then we can leave out the brackets:
    λy. λx. ∗ x y   or   λy x. ∗ x y

The official definition

• Function formation and function application are all that there is.
• λ-terms are constructed according to the grammar:
    M ::= c | x | M M | λx. M
    • The placeholder c represents any constant, such as numbers 1, 2, 3, ... or arithmetic operators +, ∗, etc.
    • The letter x represents any of infinitely many possible variables.
• The given grammar is ambiguous; the term λx. x y could be parsed
    either as λx. (x y):
        M → λx. M
            M → M M
                M → x
                M → y
    or as (λx. x) y:
        M → M M
            M → λx. M
                M → x
            M → y

* (we use “app” to indicate use of the clause "MM" in the derivation)

β-Reduction

• There is only one rule of computation, called β-reduction; it concerns the replacement of a formal parameter by an actual one:
    (λx. ∗ 3 x) 4  →β  ∗ 3 4
    (λy. y) (λx. ∗ 3 x) 5  →β  (λx. ∗ 3 x) 5  →β  ∗ 3 5
• When no further reductions are possible, we say that the term has been reduced to normal form.
• Does every term have a normal form? No!
    Ω ≝ (λx. x x)(λx. x x)
  The term Ω always reduces to itself!

Confluence

• It may be that a λ-term offers many opportunities for reduction at the same time.
• It is necessary that the result of a computation is independent of the order of reduction.

Theorem 1 (Church-Rosser) If a term M can be reduced (in several steps) to terms N and P, then there exists a term Q to which both N and P can be reduced (in several steps).

[Figure: M reduces to both N and P, and N and P both reduce to Q.]

Intuition: Let's look at a specific case where every variable can appear at most once in a term. Base case, divided into 3 cases:

1. Two separate redexes:
    (λy. M) … (λx. N)  →β1  M′ … (λx. N)  →β2  M′ … N′
    (λy. M) … (λx. N)  →β2  (λy. M) … N′  →β1  M′ … N′
2. The redex β2 lies inside the argument N:
    (λx. M) N  →β1  M[N]  →β2  M[N′]
    (λx. M) N  →β2  (λx. M) N′  →β1  M[N′]
3. The redex β2 lies inside the body M:
    (λx. M) N  →β1  M[N]  →β2  M′[N]
    (λx. M) N  →β2  (λx. M′) N  →β1  M′[N]

And if you believed me up until now, the rest is very simple…

Corollary 2 Every λ-term has at most one normal form.

Proof. For the sake of contradiction, assume that there are two different normal forms N and P to which a certain term M reduces. By the theorem of Church and Rosser there is a term Q to which both N and P can be reduced. However, N and P are assumed to be in normal form, so they don’t allow for any further reductions. The only possible interpretation is that N = P = Q, which contradicts the assumption.

Free and bound variables

• The operator λ is a binding operator. Occurrences of a variable x that fall within the scope of an abstraction λx are said to be bound. All other variables are called free.
    λx. ∗ x y   (x is bound, y is free)
• Note: a variable is bound by its "nearest" abstraction:
    (λx. y)(λx. ∗ z x)   (the single occurrence of x in the expression is bound by the second lambda)

Formal definition:

• The set of free variables of a lambda expression M is denoted FV(M) and is defined by recursion on the structure of terms, as follows:
    • FV(x) = {x}, where x is a variable
    • FV(λx. M) = FV(M) \ {x}
    • FV(M N) = FV(M) ∪ FV(N)

α-conversion

• Alpha-conversion allows bound variable names to be changed:
    λx. x  →α  λy. y
• The only variable occurrences that are renamed are those that are bound to the same abstraction:
    λx. λx. x  →α  λy. λx. x
• Alpha-conversion is not possible if it would result in a variable getting captured by a different abstraction:
    λx. λy. x  ↛  λy. λy. y
• Example:
    (λx. ∫_0^x x·y dy) y + 1  =α  (λx. ∫_0^x x·z dz) y + 1  =β  (∫_0^y y·z dz) + 1

η-conversion

    λx. f x  =η  f

• Two functions are the same if and only if they give the same result for all arguments.
• Eta-conversion converts between λx. f x and f whenever x does not appear free in f:
    λx. (λy. y²) x  →η  λy. y²
    λx. (λy. y²) x  →β  λx. x²  →α  λy. y²

Higher-order functions

• Let us look at an example: a λ-term for squaring integers is given by
    Q ≝ λx. ∗ x x
• If we want to compute x⁸ then this can be achieved by squaring x three times:
    x⁸ = ((x²)²)²
• In λ-calculus notation, we would write for the “power-8” function:
    P8 ≝ λx. Q (Q (Q x))
• It is now a simple step to write out a λ-term which applies any function three times:
    T ≝ λf. (λx. f (f (f x)))
• Operators such as T are called higher-order because they operate on functions rather than numbers.

Iteration and recursion

• How can we generalize what we saw with higher-order functions to get the behavior of a for-loop?
• First of all, let us define some helpful new “constants”:
    1. “zero?”: its behavior is like an if-then-else clause:
           zero? 0 x y  ⟶  x
           zero? n x y  ⟶  y   (n ≠ 0)
       (In Java, we would write this as n == 0 ? x : y)
    2. “pred”: the predecessor function on natural numbers.
• Now we construct a term I (for “Iteration”) which takes as arguments a number n, a function f, and a value x, and computes the n-fold application of f to x:
    I n f x = f (f (… (f x) …))
  (If n = 0 then I 0 f x should simply return x.)
    I = λ n f x. zero? n x (I (pred n) f (f x))
• Example: n = 3, f = λx. x + 1, x = 0
    I 3 f 0  ⟹  zero? 3 0 (I 2 (λx. x + 1) 1)
    I 2 f 1  ⟹  zero? 2 1 (I 1 (λx. x + 1) 2)
    I 1 f 2  ⟹  zero? 1 2 (I 0 (λx. x + 1) 3)
    I 0 f 3  ⟹  zero? 0 3 (I (pred 0) (λx. x + 1) 4)  ⟹  3

Well-typed λ-terms

• There is nothing in the grammar which stops us from forming awful terms, such as “sin log”.
• Such terms do not make any sense at all, and any sensible programming language compiler would reject them.
• What is missing in the calculus is a notion of type. For example, the type of the sin function should be “accepts real numbers and produces real numbers”.
• A language for expressing these properties (i.e., types) is easily defined. We start with some base types such as “int” and “real”, and then form function types on top of them.
• The grammar:
    τ ::= c | τ → τ
  where c represents all the base types.
• On the basis of a type system, we can formulate restrictions on what kind of terms are valid (or well-typed). We do so by employing an inductive definition:

Definition (Well-typed λ-terms):
    Base case. For every type A and every variable x, the term x: A is well-typed and has type A.
    Function formation. For every well-typed term M of type B, every variable x, and every type A, the term λx: A. M is well-typed and has type A → B.
    Application. If M is well-typed of type A → B and N is well-typed of type A, then M N is well-typed and has type B.

• λx: A. x: A is well-typed of type A → A.
• λx: A. (λy: B. x: A) is well-typed of type A → (B → A).
• On the other hand, the term sin log is not well-typed.

Calculating simple types:

• It is quite easy to find out whether a term can be typed or not by following the steps in which the term was constructed.
• What we do is to annotate subterms with type expressions which still contain type variables A, B, C, ... and which we refine as we go along.
• Consider, for example, the term λ f x. f x:
    - We give x the type A, and f the type B.
    - f x needs to be well-typed, so we refine B to A → C.
    - According to the function formation rule, the complete term should have type (A → C) → (A → C).
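
The three rules of the definition (base case, function formation, application) can be turned into a small checker. What follows is a minimal Python sketch and not part of the original slides: the class names Base, Arrow, Var, Lam, App and the function type_of are hypothetical, and constants such as sin and log are modelled as variables looked up in an environment, which is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union


@dataclass(frozen=True)
class Base:
    """A base type such as int or real (or a type variable like A)."""
    name: str


@dataclass(frozen=True)
class Arrow:
    """A function type: arg -> res."""
    arg: "Type"
    res: "Type"


Type = Union[Base, Arrow]


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    """An annotated abstraction: λparam: param_type. body"""
    param: str
    param_type: Type
    body: "Term"


@dataclass(frozen=True)
class App:
    fn: "Term"
    arg: "Term"


Term = Union[Var, Lam, App]


def type_of(term: Term, env: Optional[Dict[str, Type]] = None) -> Type:
    """Return the type of a term, or raise TypeError if it is not well-typed."""
    env = {} if env is None else env
    if isinstance(term, Var):
        # Base case: a variable has the type recorded for it.
        if term.name not in env:
            raise TypeError(f"unbound variable {term.name}")
        return env[term.name]
    if isinstance(term, Lam):
        # Function formation: λx: A. M has type A -> B when M has type B.
        body_type = type_of(term.body, {**env, term.param: term.param_type})
        return Arrow(term.param_type, body_type)
    if isinstance(term, App):
        # Application: M N has type B when M: A -> B and N: A.
        fn_type = type_of(term.fn, env)
        arg_type = type_of(term.arg, env)
        if not isinstance(fn_type, Arrow) or fn_type.arg != arg_type:
            raise TypeError("ill-typed application")
        return fn_type.res
    raise TypeError("unknown term")


# λf: A -> C. λx: A. f x
A, C = Base("A"), Base("C")
apply_term = Lam("f", Arrow(A, C), Lam("x", A, App(Var("f"), Var("x"))))
print(type_of(apply_term) == Arrow(Arrow(A, C), Arrow(A, C)))  # True
```

Under the same assumptions, checking sin log with both constants given the type real → real raises TypeError, because the argument log is a function rather than a real number.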
* At this stage the type variables can be instantiated with something more concrete (such as “int” or “real”), but we only wanted to establish typability and so we can stop here.

    λ f x. f x : (A → C) → (A → C)

• Further refinement is required if we extend the term to (λ f x. f x)(λy. y) 3:
    • λy. y will have type D → D (D a new type variable).
    • We refine A to D, and also C to D, in order for the application to make sense.
    • The resulting type of (λ f x. f x)(λy. y) is now D → D.
    • 3 should have type “int”, so we refine D to “int”.
    • Finally, if we spell out the types in the term we get:
        (λf: int → int. λx: int. f x)(λy: int. y) 3

Theorem Every well-typed λ-term has a normal form.

The λ-calculus as a model of computation

Turing-complete
• A computational system that can compute every Turing-computable function is called Turing-complete.
• We call a calculus Turing-complete if it allows one to define all computable functions from ℕ to ℕ.
• The λ-calculus is Turing-complete!

Church encoding: Terms that are usually considered primitive in other notations (such as integers and Booleans) are mapped to higher-order functions under Church encoding.

Church numerals: a representation of the natural numbers using lambda notation.

    0 ≝ λf x. x
    1 ≝ λf x. f x
    2 ≝ λf x. f (f x)
    …
    n ≝ λf x. fⁿ x

Computation with Church numerals: for example, addition uses the identity f^(m+n) = f^m ∘ f^n:

    + ≝ λm. λn. λf. λx. m f (n f x)

But programs would be:
    • Pretty slow
    • Pretty large
    • Pretty hard to understand

From Theory to Programming Language

• Although the lambda calculus is powerful enough to express any program, this doesn't mean that you'd actually want to do so. After all, the Turing machine offers an equally powerful computational basis.

Which leads us to Functional Programming…

• Functional programming has its roots in lambda calculus: lambda calculus provides a theoretical framework for describing functions and their evaluation.
  Although it is a mathematical abstraction rather than a programming language, it forms the basis of almost all functional programming languages today.
• Many functional programming languages can be viewed as elaborations on the lambda calculus.

Functional Programming:

“Functional programming is a programming paradigm, a style of building the structure and elements of computer programs, that treats computation as the evaluation of mathematical functions and avoids state and mutable data.”

• Functional programming emphasizes functions that produce results that depend only on their inputs and not on the program state, i.e. pure mathematical functions.
• In functional code, the output value of a function depends only on the arguments that are input to the function.
• So calling a function f twice with the same value for an argument x will produce the same result f(x) both times.
--Wikipedia

Similarity to Functional Programming

Pascal:
    function f(x: int): int; begin … <statements> … end;
    corresponds to  λx. function body
ML:
    fun sq (x: int) = x * x;
    corresponds to  λx. ∗ x x
Scheme:
    λx. M  ⟹  (lambda (x) M)
    (λx y. + x y) 3 4  ⟹  ((lambda (x y) (+ x y)) 3 4)  →  7
    λx. ∗ x x  ⟹  (define square (lambda (x) (* x x)))
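
The same correspondence can be sketched in Python, whose lambda expressions are likewise modelled on λ-abstraction. The block below is an illustrative sketch and not part of the original slides: the names square, add, thrice, power8, iterate, plus and to_int are hypothetical. Because Python evaluates arguments eagerly, the recursion in I is guarded with a conditional expression rather than a zero? function, which is a deviation from the λ-term chosen for this sketch.

```python
# Q = λx. ∗ x x  (squaring)
square = lambda x: x * x

# λx. λy. + x y  (curried addition, applied as add(3)(4))
add = lambda x: lambda y: x + y

# T = λf. λx. f (f (f x))  (apply a function three times)
thrice = lambda f: lambda x: f(f(f(x)))

# P8 = λx. Q (Q (Q x))  (the "power-8" function)
power8 = thrice(square)


# I = λ n f x. zero? n x (I (pred n) f (f x))
def iterate(n, f, x):
    """Apply f to x n times; returns x unchanged when n is 0."""
    return x if n == 0 else iterate(n - 1, f, f(x))


# Church numerals 0, 1 and 2
zero = lambda f: lambda x: x
one = lambda f: lambda x: f(x)
two = lambda f: lambda x: f(f(x))

# + = λm. λn. λf. λx. m f (n f x)
plus = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))


def to_int(n):
    """Convert a Church numeral to a Python int by counting applications of f."""
    return n(lambda k: k + 1)(0)


print(add(3)(4))                       # 7
print(power8(2))                       # 256
print(iterate(3, lambda x: x + 1, 0))  # 3
print(to_int(plus(one)(two)))          # 3
```

Writing add in curried form mirrors λy. λx. ∗ x y from the slides; a two-parameter lambda x, y: x + y would correspond to the abbreviated notation λx y. + x y.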
