Lambda calculus
What is λ-calculus?
“Lambda calculus is a formal system in mathematical
logic and computer science for expressing computation based on
function abstraction and application using variable binding
and substitution”
--Wikipedia
λx. x² ⇔ f(x) = x²
Lambda calculus in history of mathematics
• The lambda calculus was introduced by the mathematician Alonzo Church in the 1930s.
• The original system was shown to be logically inconsistent in 1935, when Stephen
Kleene and J. B. Rosser developed the Kleene–Rosser paradox.
• Subsequently, in 1936, Church isolated and published just the portion relevant to
computation, which is now called the untyped lambda calculus.
• The λ-calculus was rediscovered as a versatile tool in computer science by people like
McCarthy, Landin, and Scott in the 1960s.
McCarthy came out with his list-processing language Lisp in 1960.
McCarthy knew of the λ-calculus, and his language closely resembles it.
• It is now used as
• A tool for investigating computability
• The basis of functional programming languages:
Lisp, Scheme, Haskell, ML…
Expressions in the λ-calculus
• Expressions in the λ-calculus are written in strict prefix form:
x ∗ x, x² → ∗ x x
• Function and argument are simply written next to each other:
sin(x) → sin x
• If a function takes more than one argument, then these
are simply lined up after the function.
Thus:
x + 3 → + x 3
x³ → ∗ x x x
• Brackets are employed only to enforce a special grouping:
sin(x) + 4 → + (sin x) 4
*The λ-calculus is a purely syntactic device; it does not make any distinctions between simple entities
Functions in the λ-calculus
• Function formation is sometimes written as an equation, f(x) = 3x,
and sometimes as a mapping, x ⟼ 3x.
• In the λ-calculus a special notation is available: the Greek letter λ.
• The λ signals that the variable is not part of an expression but the formal
parameter of the function.
The dot after the formal parameter introduces the function body.
λx. ∗ 3 x
• A function which has been written in λ-notation can itself be used in an
expression.
(λx. ∗ 3 x) 4
λy. (λx. ∗ x y)
• Why the brackets around the function?
They are there to make clear where the definition of the function ends.
λx. ∗ 3 x 4 ⟹ 3 ∗ x ∗ 4 (assuming that ∗ is interpreted as a 3-ary function)
• Although it is not strictly necessary, it will be convenient to introduce
abbreviations for λ-terms.
F ≝ λx. ∗ 3 4
• If the body of a function consists of another function, as here
N ≝ λy. (λx. ∗ x y)
we could also consider it as a function of two arguments.
• If we want to see it as a function of two arguments, then we can leave out the
brackets:
λy. λx. ∗ x y or λy x. ∗ x y
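As a rough illustration of the notation above, here is how the same terms might look as Python lambdas. Python is not part of the original slides; the names `triple`, `result` and `product` are chosen only for this sketch, and Python's `*` stands in for the constant ∗.

```python
# λx. ∗ 3 x written as an anonymous Python function
triple = lambda x: 3 * x

# (λx. ∗ 3 x) 4: application is simply a call
result = triple(4)

# N ≝ λy. (λx. ∗ x y): a function whose body is another function
N = lambda y: (lambda x: x * y)

# Seen as a function of two arguments, the arguments are supplied one at a time
product = N(5)(6)

print(result, product)
```

The nested form `N(5)(6)` mirrors the reading of λy. λx. ∗ x y as a two-argument function whose arguments arrive one after the other.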
The official definition
• Function formation and function application are all that there is.
• λ-terms are constructed according to the grammar:
M ::= c | x | M M | λx. M
• The placeholder c represents any constant, such as the numbers 1, 2, 3, ... or the arithmetic
operators +, ∗, etc.
• The letter x represents any of infinitely many possible variables.
• The given grammar is ambiguous; the term λx. x y could be parsed in two ways:
1. As λx. (x y): M → λx. M, M → M M (app), M → x, M → y
2. As (λx. x) y: M → M M (app), M → λx. M, M → x, M → y
*(we use “app” to indicate use of the clause "MM" in the derivation)
β-reduction
• There is only one rule of computation, called β-reduction; it concerns the
replacement of a formal parameter by an actual one.
(λx. ∗ 3 x) 4 →β ∗ 3 4
(λy. y) (λx. ∗ 3 x) 5 →β (λx. ∗ 3 x) 5 →β ∗ 3 5
• When no further reductions are possible, we say that the term has been
reduced to normal form.
• Does every term have a normal form?
No!
Ω ≝ (λx. x x)(λx. x x)
The term Ω always reduces to itself!
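The reduction rule can be sketched as a small Python program. This is only an illustrative sketch, not part of the slides: the class names `Var`, `Lam`, `App` and the functions `subst` and `step` are invented here, constants such as ∗, 3 and 4 are modelled as free variables, and fresh names are assumed not to clash with names already in the term.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


_fresh = count()  # counter used to invent new variable names


def free_vars(t):
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.param}
    return free_vars(t.fn) | free_vars(t.arg)


def subst(t, name, value):
    """Replace free occurrences of `name` in t by `value`, avoiding capture."""
    if isinstance(t, Var):
        return value if t.name == name else t
    if isinstance(t, App):
        return App(subst(t.fn, name, value), subst(t.arg, name, value))
    if t.param == name:  # `name` is rebound here, nothing to replace
        return t
    if t.param in free_vars(value):  # rename the bound variable first
        new = f"{t.param}_{next(_fresh)}"
        t = Lam(new, subst(t.body, t.param, Var(new)))
    return Lam(t.param, subst(t.body, name, value))


def step(t):
    """Perform one leftmost-outermost β-step; return None for a normal form."""
    if isinstance(t, App):
        if isinstance(t.fn, Lam):
            return subst(t.fn.body, t.fn.param, t.arg)
        r = step(t.fn)
        if r is not None:
            return App(r, t.arg)
        r = step(t.arg)
        return None if r is None else App(t.fn, r)
    if isinstance(t, Lam):
        r = step(t.body)
        return None if r is None else Lam(t.param, r)
    return None


# (λx. ∗ 3 x) 4 →β ∗ 3 4
times3 = Lam("x", App(App(Var("*"), Var("3")), Var("x")))
reduced = step(App(times3, Var("4")))

# Ω ≝ (λx. x x)(λx. x x) reduces to itself
half = Lam("x", App(Var("x"), Var("x")))
omega = App(half, half)
```

Calling `step(omega)` returns a term equal to `omega` again, so repeatedly stepping Ω never reaches a normal form, while `step(reduced)` returns `None` because ∗ 3 4 contains no further redex in this model.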
Confluence
• It may be that a λ-term offers many opportunities for reduction at the same time.
• The result of a computation must therefore be independent of the order of
reduction.
Theorem 1 (Church-Rosser) If a term M can be reduced (in several steps) to
terms N and P, then there exists a term Q to which both N and P can be reduced (in
several steps).
As a picture (diamond diagram): M at the top reduces to N and to P, and both N and P reduce to a common term Q at the bottom.
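Intuition:
Let's look at a special case where every variable can appear at most once in a term.
Here β1 and β2 denote the two reduction steps that are available, and M[N] denotes M with N substituted for x.
Base case, divided into 3 cases:
1. The two reductions act on separate parts of the term: (λy. M) … (λx. N)
   Doing β1 first gives M′ … (λx. N); doing β2 afterwards gives M′ … N′.
   Doing β2 first gives (λy. M) … N′; doing β1 afterwards gives M′ … N′.
2. (λx. M) N, where β1 is the outer reduction and β2 reduces N to N′:
   Doing β1 first gives M[N]; doing β2 afterwards gives M[N′].
   Doing β2 first gives (λx. M) N′; doing β1 afterwards gives M[N′].
3. (λx. M) N, where β1 is the outer reduction and β2 reduces M to M′:
   Doing β1 first gives M[N]; doing β2 afterwards gives M′[N].
   Doing β2 first gives (λx. M′) N; doing β1 afterwards gives M′[N].
And if you believed me up until now, the rest is very simple…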
Corollary 2 Every λ-term has at most one normal form.
Proof. For the sake of contradiction, assume that there are two different normal forms N and P to
which a certain term M reduces.
By the theorem of Church and Rosser there is a term Q to which both N and P can
be reduced. However, N and P are assumed to be in normal form, so they don’t
allow for any further reductions. The only possible interpretation is that N = P = Q, which contradicts the assumption.
Free and bound variables
• The operator λ is a binding operator. Variables that fall within the
scope of an abstraction are said to be bound.
All other variables are called free.
• λx. ∗ x y (here x is bound and y is free)
• Note: a variable is bound by its "nearest" abstraction.
• (λx. y)(λx. ∗ z x) (the single occurrence of x in the expression is bound by the second lambda)
Formal Definition:
• The set of free variables of a lambda expression M is denoted FV(M) and is defined by
recursion on the structure of the terms, as follows:
• FV(c) = ∅, where c is a constant
• FV(x) = {x}, where x is a variable
• FV(λx.M) = FV(M) \ {x}
• FV(M N) = FV(M) ∪ FV(N)
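The recursive definition of FV translates almost line by line into code. The following is a minimal sketch under assumptions of our own: the term classes `Const`, `Var`, `Lam`, `App` and the function name `fv` are invented for the illustration, and constants are taken to contribute no free variables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


def fv(term):
    """Free variables of a term, by recursion on its structure."""
    if isinstance(term, Const):
        return set()                          # assumption: constants have none
    if isinstance(term, Var):
        return {term.name}                    # FV(x) = {x}
    if isinstance(term, Lam):
        return fv(term.body) - {term.param}   # FV(λx.M) = FV(M) \ {x}
    if isinstance(term, App):
        return fv(term.fn) | fv(term.arg)     # FV(M N) = FV(M) ∪ FV(N)
    raise TypeError(f"not a λ-term: {term!r}")


# λx. ∗ x y
example1 = Lam("x", App(App(Const("*"), Var("x")), Var("y")))

# (λx. y)(λx. ∗ z x)
example2 = App(
    Lam("x", Var("y")),
    Lam("x", App(App(Const("*"), Var("z")), Var("x"))),
)
```

With these definitions, `fv(example1)` contains only `y`, and `fv(example2)` contains `y` and `z`, matching the examples in the bullets above.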
α-conversion
• α-conversion allows bound variable names to be changed:
λx. x → λy. y
• The only variable occurrences that are renamed are those that are bound to the
same abstraction:
λx. λx. x → λy. λx. x
• α-conversion is not possible if it would result in a variable getting captured by
a different abstraction:
λx. λy. x ↛ λy. λy. y
• Example (the integration variable y is renamed to z so that the argument y is not captured):
(λx. ∫₀ˣ x·y dy) y + 1 =α (λx. ∫₀ˣ x·z dz) y + 1 =β (∫₀ʸ y·z dz) + 1
η-conversion
λx. f x =η f
• Two functions are the same if and only if they give the same result for all
arguments.
• η-conversion converts between λx. f x and f whenever x does not appear
free in f.
• Example: the term λx. (λy. y²) x can be simplified in two ways, with the same result up to α-conversion:
λx. (λy. y²) x →η λy. y²
λx. (λy. y²) x →β λx. x² =α λy. y²
Higher-order functions
• Let us look at an example:
A λ-term for squaring integers is given by
Q ≝ λx. ∗ x x
• If we want to compute x⁸, then this can be achieved by squaring x three times:
x⁸ = ((x²)²)²
• In λ-calculus notation, we would write for the “power-8” function:
P8 ≝ λx. Q (Q (Q x))
• It is now a simple step to write out a λ-term which applies any function three
times:
T ≝ λf. (λx. f (f (f x)))
• Operators such as T are called higher-order because they operate on functions
rather than numbers.
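The terms Q, P8 and T can be tried out directly as Python lambdas. This is only a sketch for experimentation; Python itself is an addition to the slides, and the names are kept identical to the λ-terms above.

```python
# Q ≝ λx. ∗ x x
Q = lambda x: x * x

# P8 ≝ λx. Q (Q (Q x))
P8 = lambda x: Q(Q(Q(x)))

# T ≝ λf. (λx. f (f (f x))): takes a function, returns a function
T = lambda f: (lambda x: f(f(f(x))))

print(P8(2))      # squaring 2 three times
print(T(Q)(2))    # the same computation, expressed through T
```

Since T takes the function as an argument, `T(Q)` behaves like `P8`, and the same `T` can be reused with any other one-argument function.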
Iteration and recursion
• How can we generalize what we saw with the higher-order function T to
get the behavior of a for-loop?
• First of all, let us define some helpful new “constants”:
1. “zero?”: its behavior is like an if-then-else clause:
zero? 0 x y ⟶ x
zero? n x y ⟶ y (n ≠ 0)
(In Java, we would write this as n == 0 ? x : y)
2. “pred”: the predecessor function on natural numbers.
• Now we construct a term I (for “Iteration”) which takes as arguments a
number n, a function f, and a value x, and computes the n-fold application
of f to x:
I n f x = f (f (... (f x) ...))
(If n = 0 then I 0 f x should simply return x)
I = λ n f x. zero? n x (I (pred n) f (f x))
Example: n = 3, f = λx. x + 1, x = 0
I 3 f 0 ⟹ zero? 3 0 (I 2 f 1)
I 2 f 1 ⟹ zero? 2 1 (I 1 f 2)
I 1 f 2 ⟹ zero? 1 2 (I 0 f 3)
I 0 f 3 ⟹ zero? 0 3 (I (pred 0) f 4) ⟹ 3
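The iteration term can be mimicked in Python as a sketch. One point is an assumption forced by Python rather than something taken from the slides: Python evaluates arguments eagerly, so the two branches of `zero_p` (our stand-in name for “zero?”) are passed as zero-argument functions, and only the chosen branch is evaluated. The names `zero_p`, `pred` and `I` are illustrative.

```python
def zero_p(n, then_branch, else_branch):
    """zero? n x y, with both branches delayed so only one is evaluated."""
    return then_branch() if n == 0 else else_branch()


def pred(n):
    """Predecessor; in this sketch it is only ever called with n > 0."""
    return n - 1


def I(n, f, x):
    """I = λ n f x. zero? n x (I (pred n) f (f x))"""
    return zero_p(n, lambda: x, lambda: I(pred(n), f, f(x)))


# The example from the slide: n = 3, f = λx. x + 1, x = 0
answer = I(3, lambda x: x + 1, 0)
print(answer)
```

Without the delayed branches, the call `I(pred(n), f, f(x))` would be evaluated even when n is 0, and the recursion would never stop.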
Well-typed λ-terms
• There is nothing in the grammar which stops us from forming awful terms, such as
“sin log”.
• Such terms do not make any sense at all, and any sensible programming language
compiler would reject them.
• What is missing in the calculus is a notion of type.
For example, the type of the sin function should be “accepts real numbers
and produces real numbers”.
• A language for expressing these properties (i.e., types) is easily defined. We start with some base types such as “int” and “real”, and then form function
types on top of them.
• The grammar:
τ ::= c | τ → τ
where c represents all the base types.
• On the basis of a type system, we can formulate restrictions on what kind of
terms are valid (or well-typed).
We do so by employing an inductive definition:
Definition (Well-typed λ-terms):
Base case. For every type A and every variable x, the term x: A is well-typed
and has type A.
Function formation. For every term M of type B, every variable x, and every
type A, the term λx: A. M is well-typed and has type A → B.
Application. If M is well-typed of type A → B and N is well-typed of type A,
then M N is well-typed and has type B.
Examples:
• λx: A. x: A is well-typed of type A → A.
• λx: A. (λy: B. x: A) is well-typed of type A → (B → A).
• On the other hand, the term sin log is not well-typed.
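The three rules of the definition can be turned into a small checker. The following is a sketch under assumptions of our own: instead of annotating every variable occurrence as in x: A, the code records variable types in a dictionary `env`; the class names (`Base`, `Arrow`, `Var`, `Lam`, `App`) and the function `type_of` are invented for the illustration, and `sin` and `log` are given the type real → real as suggested by the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Base:
    name: str


@dataclass(frozen=True)
class Arrow:
    src: object
    dst: object


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    param_type: object
    body: object


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


def type_of(term, env):
    """Return the type of term, or raise TypeError if it is not well-typed."""
    if isinstance(term, Var):
        if term.name not in env:
            raise TypeError(f"no type known for {term.name}")
        return env[term.name]                       # base case
    if isinstance(term, Lam):
        inner = {**env, term.param: term.param_type}
        return Arrow(term.param_type, type_of(term.body, inner))  # function formation
    if isinstance(term, App):
        fn_type = type_of(term.fn, env)
        arg_type = type_of(term.arg, env)
        if isinstance(fn_type, Arrow) and fn_type.src == arg_type:
            return fn_type.dst                      # application
        raise TypeError("argument type does not match function type")
    raise TypeError(f"not a λ-term: {term!r}")


A, B, real = Base("A"), Base("B"), Base("real")

identity = Lam("x", A, Var("x"))            # λx: A. x
const = Lam("x", A, Lam("y", B, Var("x")))  # λx: A. (λy: B. x)
env = {"sin": Arrow(real, real), "log": Arrow(real, real)}
```

Here `type_of(identity, {})` yields A → A and `type_of(const, {})` yields A → (B → A), while `type_of(App(Var("sin"), Var("log")), env)` raises `TypeError`, because sin expects a real number and log is a function.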
Calculating simple types:
• It is quite easy to find out whether a term can be typed or not by following the
steps in which the term was constructed.
• What we do is to annotate subterms with type expressions which still contain
type variables A, B, C, ... and which we refine as we go along.
• Consider, for example, the term λ f x. f x:
• We give x the type A, and f the type B.
• f x needs to be well-typed, so we refine B to A → C.
• According to the function formation rule, the complete term
should have type (A → C) → (A → C).
*At this stage the type variables can be instantiated with something more concrete (such as
“int” or “real”) but we only wanted to establish typability and so we can stop here.
So far: λ f x. f x : (A → C) → (A → C).
• Further refinement is required if we extend the term to (λ f x. f x)(λy. y) 3:
• λy. y will have type D → D (D is a new type variable).
• We refine A to D, and also C to D, in order for the application to make sense.
• The resulting type of (λ f x. f x)(λy. y) is now D → D.
• 3 should have type “int”, so we refine D to “int”.
• Finally, if we spell out the types in the term we get:
(λf: int → int. λx: int. f x)(λy: int. y) 3
Theorem Every well-typed λ-term has a normal form.
The 𝝀-calculus as a model of computation
Turing-complete
A computational system that can compute every Turing-computable
function is called Turing-complete.
We call a calculus Turing-complete if it allows one to define all
computable functions from ℕ to ℕ.
The λ-calculus is Turing-complete!
Church encoding:
Terms that are usually considered primitive in other notations (such as integers and
Booleans) are mapped to higher-order functions under Church encoding.
Church numerals: a representation of the natural numbers using lambda notation.
0 ≝ λf x. x
1 ≝ λf x. f x
2 ≝ λf x. f (f x)
…
n ≝ λf x. fⁿ x
Computation with Church numerals:
For example, addition uses the identity f^(m+n) = f^m ∘ f^n:
+ ≝ λm. λn. λf. λx. m f (n f x)
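Church numerals and their addition can be tried out with Python lambdas. This is a sketch only: the helper `to_int`, which converts a numeral back to an ordinary integer by applying "add one" to 0, is our own addition and not part of the encoding.

```python
# Church numerals: n applies f to x exactly n times
zero = lambda f: lambda x: x
one = lambda f: lambda x: f(x)
two = lambda f: lambda x: f(f(x))

# + ≝ λm. λn. λf. λx. m f (n f x)
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))


def to_int(n):
    """Decode a Church numeral by counting how often it applies its function."""
    return n(lambda k: k + 1)(0)


three = add(one)(two)
print(to_int(three))
```

The definition of `add` follows the identity above: first n applications of f to x, then m further applications on top of that result.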
But programs written this way would be:
• Pretty slow
• Pretty large
• Pretty hard to understand
From Theory to Programming Language
• Although the lambda-calculus is powerful enough to express any program, this
doesn't mean that you'd actually want to do so.
After all, the Turing Machine offers an equally powerful computational basis.
Which leads us to functional programming…
• Functional programming has its roots in lambda calculus –
lambda calculus provides a theoretical framework for describing functions and
their evaluation. Although it is a mathematical abstraction rather than a
programming language, it forms the basis of almost all functional programming
languages today.
• Many functional programming languages can be viewed as elaborations on the
lambda calculus.
Functional Programming:
“Functional programming is a programming paradigm, a style of building the
structure and elements of computer programs, that treats computation as the
evaluation of mathematical functions and avoids state and mutable data.”
• Functional programming emphasizes functions that produce results that depend
only on their inputs and not on the program state - i.e. pure mathematical
functions.
• In functional code, the output value of a function depends only on the arguments
that are input to the function.
• So calling a function 𝑓 twice with the same value for an argument x will
produce the same result 𝑓(𝑥) both times.
--Wikipedia
Similarity to Functional Programming
Pascal:
function f(x: int): int; begin … <statements> … end;
corresponds to λ x . function body
ML:
fun sq (x: int) = x * x;
corresponds to λx. ∗ x x
Scheme:
λx. M ⟹ (lambda (x) M)
(λx y. + x y) 3 4 ⟹ ((lambda (x y) (+ x y)) 3 4), which evaluates to 7
λx. ∗ x x ⟹ (define square (lambda (x) (* x x)))