CDM
Recursive Functions
Klaus Sutner
Carnegie Mellon University
Fall 2015
Outline

1 Primitive Recursive Functions
2 Pushing Primitive Recursion
3 Register Machines versus Primitive Recursion
4 µ-Recursive Functions
5 General Recursive Functions *
Machines versus Programs
Machine models of computation (register machines, Turing machines) are
easy to describe and very natural. Turing machines in particular have the
great advantage that they “clearly” correspond to the abilities of an
(idealized) human computor; see Turing’s 1936 paper.
Alas, they don’t match up well with the way an actual mathematician or
computer scientist would define a computable function, or demonstrate
that some already known function is in fact computable. Constructing a
specific machine for a particular computation is rather tedious (plus, one
really should provide a correctness proof).
It is usually preferable to be able to write down some kind of expression
or equation that defines the function.
Primitive Recursive Functions
To avoid problems with semantics we will introduce a language that has
only one data type: N, the non-negative integers.
We will define the so-called primitive recursive functions, total functions
of the form
    f : N^n → N
that can be computed intuitively using no more than a limited type of
recursion.
We use induction to define this class of functions: we select a few simple
basic functions and build up more complicated ones by composition and a
limited type of recursion.
The Basic Functions

There are three types of basic functions.

    Constants zero:       C^n : N^n → N,  C^n(x) = 0
    Projections:          P^n_i : N^n → N,  P^n_i(x) = x_i
    Successor function:   S : N → N,  S(x) = x + 1

Notation: We write x = (x_1, . . . , x_n) for a vector of suitable length. On
occasion we abbreviate x + 1 to x^+.
This may seem like a rather too spartan set of built-in functions, but as
we will see it’s all we need.
Closure Operations: Composition

Composition
Given functions g_i : N^m → N for i = 1, . . . , n, and h : N^n → N, we
define a new function f : N^m → N by composition as follows:

    f(x) = h(g_1(x), . . . , g_n(x))

Notation: we write h ◦ (g_1, . . . , g_n), inspired by the well-known special
case m = 1:

    (h ◦ g)(x) = h(g(x)).
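For the special case m = 1, n = 2, composition is nothing more than a
one-liner. A minimal C sketch (the name compose2 and the use of function
pointers are our illustration, not part of the formal definition):

/* f = h ◦ (g1, g2): feed x to both inner functions, then combine with h */
unsigned compose2(unsigned (*h)(unsigned, unsigned),
                  unsigned (*g1)(unsigned), unsigned (*g2)(unsigned),
                  unsigned x)
{
    return h(g1(x), g2(x));
}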
Closure Operations: Primitive Recursion

Primitive recursion
Given h : N^(n+2) → N and g : N^n → N, we define a new function
f : N^(n+1) → N by

    f(0, y) = g(y)
    f(x + 1, y) = h(x, f(x, y), y)
This pattern should look very familiar.
Definition
A function is primitive recursive (p.r.) if it can be constructed from the
basic functions by applying composition and primitive recursion.
Example: Factorials
The standard definition of the factorial function uses recursion like so:

    f(0) = 1
    f(x + 1) = (x + 1) · f(x)

To write the factorial function in the form of a primitive recursion we need

    g : N^0 → N,   g() = 1
    h : N^2 → N,   h(u, v) = (u + 1) · v

g is none other than S ◦ C^0, and h is multiplication combined with the
successor function.
Done, since fct = PRec[h, g].
Example: Multiplication and Addition
To get multiplication we use another recursion, written in a slightly more
informal and easier to read manner:
mult(0, y) = 0
mult(x + 1, y) = add(mult(x, y), y)
Here we use addition, which can in turn be defined by yet another
recursion:
add(0, y) = y
add(x + 1, y) = S(add(x, y))
Since S is a basic function we have a proof that multiplication is
primitive recursive.
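As a sanity check, the whole chain of recursions can be transcribed into C,
each function recursing only on its first argument, exactly as the scheme
demands. A hedged sketch (the names S, add, mult, fact are ours, and C's
fixed-width unsigned arithmetic merely stands in for N):

#include <stdio.h>

unsigned S(unsigned x) { return x + 1; }     /* successor */

/* add(0,y) = y;  add(x+1,y) = S(add(x,y)) */
unsigned add(unsigned x, unsigned y)
{
    return x == 0 ? y : S(add(x - 1, y));
}

/* mult(0,y) = 0;  mult(x+1,y) = add(mult(x,y), y) */
unsigned mult(unsigned x, unsigned y)
{
    return x == 0 ? 0 : add(mult(x - 1, y), y);
}

/* f(0) = 1;  f(x+1) = (x+1) · f(x) */
unsigned fact(unsigned x)
{
    return x == 0 ? 1 : mult(x, fact(x - 1));
}

int main(void)
{
    printf("%u\n", fact(5));                 /* prints 120 */
    return 0;
}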
It’s Easier

Except perhaps for some rabid hacker, everyone would agree that the
chain

    S → add → mult → fct

of primitive recursions is much easier to understand and construct than
some register machine program for factorials.
Exercise
Write a register machine program for the factorial function and prove
that it is correct.
R. Dedekind
These equational definitions of basic arithmetic functions date back to
Dedekind’s 1888 booklet “Was sind und was sollen die Zahlen?” (“What
are numbers and what should they be?”)
Digression: Dedekind the Giant
Dedekind was Gauss’s last student and, together with Cantor and Frege,
the driving force behind the development of set theory. He published
little, but his sense of direction, precision and rigor was far ahead of
(most of) his contemporaries.
His student Emmy Noether used to say (exaggerating only slightly):
Es steht alles schon bei Dedekind.
Everything is already in Dedekind’s writings.
A PR Language
We defined primitive recursive functions by induction, operating in
function spaces.
But it is entirely straightforward to give a purely syntactic definition of a
class of terms whose semantics are precisely the primitive recursive
functions.
The terms are constructed from some simple built-ins (for constant zero,
projection and successor) plus constructors for composition and recursion.
A pleasant feature of these terms is the complete absence of variables.
The Language PRec

Each term in our language has a fixed arity, corresponding to the number
of arguments of the function it denotes. The built-in terms are

    C^n of arity n, for each n ≥ 0,
    P^n_i of arity n, for each n ≥ i ≥ 1,
    S of arity 1.

Compound terms are constructed as follows:

    Given n terms τ_i of arity m and one term σ of arity n,
    Comp[σ, τ_1, . . . , τ_n] is a new term of arity m.

    Given terms σ of arity n + 2 and τ of arity n, PRec[σ, τ] is a new
    term of arity n + 1.

Our language of terms is the inductive closure.
The Meaning of a Program

We have to explain what these terms mean. This is not hard: each
well-formed term τ of arity n is associated with its meaning [[τ]], a total
arithmetic function

    [[τ]] : N^n → N.

The definition is by induction on the build-up of τ, first the basic terms
and then the compound ones.
Note that this is very different from machine models: there is no general
way to do induction on the “structure of a register machine” (or on the
number of states, see the Busy Beaver Problem). In this sense, primitive
recursive functions are much more attractive.
Semantics

    [[C^n]](x) = 0          (constants zero)
    [[P^n_i]](x) = x_i      (projections)
    [[S]](x) = x + 1        (successor function)

Composition:

    [[Comp[σ, τ_1, . . . , τ_n]]](x) = [[σ]]([[τ_1]](x), . . . , [[τ_n]](x))

Primitive recursion:

    [[PRec[σ, τ]]](x, y) = [[τ]](y)                             if x = 0,
                           [[σ]](z, [[PRec[σ, τ]]](z, y), y)    if x = z + 1.
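These equations translate almost verbatim into an interpreter. Here is a
minimal C sketch; the representation of terms as tagged trees, all names,
and the absence of arity checking are our own choices, not something
fixed by the slides:

#include <stdio.h>
#include <stdlib.h>

/* terms of PRec as tagged trees */
typedef struct Term Term;
struct Term {
    enum { ZERO, PROJ, SUCC, COMP, PREC } tag;
    int i;        /* PROJ: the index i */
    int n;        /* COMP: the number of inner terms */
    Term **sub;   /* COMP: sub[0] = σ, sub[1..n] = τ_i;  PREC: sub[0] = σ, sub[1] = τ */
};

/* [[t]](x) for x = (x[0], ..., x[len-1]) */
unsigned eval(Term *t, unsigned *x, int len)
{
    switch (t->tag) {
    case ZERO: return 0;                /* [[C^n]](x) = 0 */
    case PROJ: return x[t->i - 1];      /* [[P^n_i]](x) = x_i */
    case SUCC: return x[0] + 1;         /* [[S]](x) = x + 1 */
    case COMP: {                        /* evaluate the τ_i, then feed σ */
        unsigned *y = malloc(t->n * sizeof *y);
        for (int k = 0; k < t->n; k++)
            y[k] = eval(t->sub[k + 1], x, len);
        unsigned r = eval(t->sub[0], y, t->n);
        free(y);
        return r;
    }
    case PREC: {
        if (x[0] == 0)                  /* [[PRec[σ,τ]]](0, y) = [[τ]](y) */
            return eval(t->sub[1], x + 1, len - 1);
        unsigned *w = malloc(len * sizeof *w);        /* (z, y) where x = z + 1 */
        w[0] = x[0] - 1;
        for (int k = 1; k < len; k++) w[k] = x[k];
        unsigned v = eval(t, w, len);                 /* f(z, y) */
        unsigned *u = malloc((len + 1) * sizeof *u);  /* (z, f(z,y), y) */
        u[0] = w[0];
        u[1] = v;
        for (int k = 1; k < len; k++) u[k + 1] = x[k];
        unsigned r = eval(t->sub[0], u, len + 1);     /* [[σ]](z, f(z,y), y) */
        free(w);
        free(u);
        return r;
    }
    }
    return 0;   /* unreachable */
}

int main(void)
{
    /* add = PRec[S ◦ P^3_2, P^1_1]; check that 3 + 2 = 5 */
    Term S = { SUCC };
    Term p32 = { PROJ, 2 };
    Term p11 = { PROJ, 1 };
    Term *c[] = { &S, &p32 };
    Term comp = { COMP, 0, 1, c };       /* σ = S ◦ P^3_2 */
    Term *pr[] = { &comp, &p11 };
    Term add = { PREC, 0, 0, pr };
    unsigned x[] = { 3, 2 };
    printf("%u\n", eval(&add, x, 2));    /* prints 5 */
    return 0;
}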
Example: Factorials

It follows immediately that [[τ]] is always a primitive recursive function,
and that all primitive recursive functions are obtained this way. For
example, the factorial function corresponds to the expression

    PRec[PRec[PRec[S ◦ P^3_2, P^1_1] ◦ (P^3_2, P^3_3), C^1] ◦ (S ◦ P^2_1, P^2_2), S ◦ C^0]

The innermost PRec[ , ] yields addition, the next multiplication, and the
last factorial.
Why bother with this distinction? In general, it is a good idea to separate
syntax from semantics; they live in different worlds.
Admittedly, in this case the result is not overwhelming.
Evaluation

One reason for the PR language is that it allows us to give a clear
definition of evaluation. An evaluation operator eval takes any
well-formed term τ of arity n and any input x ∈ N^n, and produces the
result of executing the corresponding p.r. function on x:

    eval(τ, x) = [[τ]](x)
Note that the term τ is just a string or a code number.
Exercise
Write a compiler that given any string τ checks whether it is a
well-formed expression denoting a primitive recursive function.
Exercise
Write an interpreter for primitive recursive functions (i.e., implement
eval).
2 Pushing Primitive Recursion
Enhancements

Apparently we lack a mechanism for definition-by-cases:

    f(x) = 3     if x < 5,
           x^2   otherwise.

We know that x ↦ 3 and x ↦ x^2 are p.r., but is f also p.r.?
And how about more complicated operations such as the GCD or the
function that enumerates prime numbers?
Definition by Cases

Definition
Let g, h : N^n → N and R ⊆ N^n. Define f = DC[g, h, R] by

    f(x) = g(x)   if x ∈ R,
           h(x)   otherwise.

We want to show that definition by cases is admissible in the sense that
when applied to primitive recursive functions/relations we obtain another
primitive recursive function.

Note that we need to express the relation R as a function; more on that
in a minute.
Sign and Inverted Sign

The first step towards implementing definition-by-cases is a bit strange,
but we will see that the next function is actually quite useful.

The sign function is defined by

    sign(x) = min(1, x)

so that sign(0) = 0 and sign(x) = 1 for all x ≥ 1. Sign is primitive
recursive: sign = PRec[S ◦ C^2, C^0].

Similarly the inverted sign function x ↦ 1 ∸ sign(x) is primitive
recursive. Here ∸ denotes truncated subtraction, sub(x, y) = x ∸ y =
max(x − y, 0), which is itself primitive recursive.
Equality and Order

Define E : N^2 → N by

    E = sign ◦ add ◦ (sub ◦ (P^2_1, P^2_2), sub ◦ (P^2_2, P^2_1))

Or, sloppy but more intelligible:

    E(x, y) = sign((x ∸ y) + (y ∸ x))

Then E(x, y) = 1 iff x = y, and 0 otherwise. Hence we can express
equality as a primitive recursive function.

Even better, we can get other order relations such as ≤, <, ≥ . . .

So, the fact that our language lacks relations is not really a problem; we
can express them as functions.
Relations

As before we can use the characteristic function of a relation R,

    char_R(x) = 1 if x ∈ R, and 0 otherwise,

to translate relations into functions.
Definition
A relation is primitive recursive if its characteristic function is primitive
recursive.
The same method works for any notion of computable function: given a
class of functions (RM-computable, p.r., polynomial time, whatever), call
a relation computable in that sense if its characteristic function is.
Closure Properties

Proposition
The primitive recursive relations are closed under intersection, union and
complement.

Proof.

    char_{R∩S} = mult ◦ (char_R, char_S)
    char_{R∪S} = sign ◦ add ◦ (char_R, char_S)
    char_{N^n−R} = sub ◦ (S ◦ C^n, char_R)

The proof is slightly different from the argument for decidable relations,
but it’s really the same idea. □
Exercise
Show that every finite set is primitive recursive.
Arithmetic and Logic

Note what is really going on here: we are using arithmetic to express
logical concepts such as disjunction.

The fact that this translation is possible, and requires very little on the
side of arithmetic, is a central reason for the algorithmic difficulty of
many arithmetic problems: logic is hard, so by implication arithmetic is
also difficult.

For example, finding solutions of Diophantine equations is hard.
Incidentally, primitive recursive functions were used extensively by K.
Gödel in his incompleteness proof.
DC is Admissible

Proposition
If g, h, R are primitive recursive, then f = DC[g, h, R] is also primitive
recursive.

Proof.

    f = add ◦ (mult ◦ (char_R, g), mult ◦ (char_{N^n−R}, h))

Less cryptically:

    f(x) = char_R(x) · g(x) + char_{N^n−R}(x) · h(x)

Since either char_R(x) = 1 and char_{N^n−R}(x) = 0, or the other way
around, we get the desired behavior. □
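The branch-free trick is easy to see in action. A small C illustration of
the Enhancements example (all names are ours; chR plays the role of
char_R):

#include <stdio.h>

unsigned g(unsigned x)   { (void)x; return 3; }   /* value on R */
unsigned h(unsigned x)   { return x * x; }        /* value off R */
unsigned chR(unsigned x) { return x < 5; }        /* characteristic function of R */

/* definition by cases without any if: exactly one product survives */
unsigned DC(unsigned x)
{
    unsigned c = chR(x);
    return c * g(x) + (1 - c) * h(x);
}

int main(void)
{
    printf("%u %u\n", DC(2), DC(7));              /* prints 3 49 */
    return 0;
}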
Bounded Sum

Proposition
Let g : N^(n+1) → N be primitive recursive, and define

    f(x, y) = Σ_{z<x} g(z, y).

Then f : N^(n+1) → N is again primitive recursive. The same holds for
products.

Proof.

    f = PRec[add ◦ (g ◦ (P^(n+2)_1, P^(n+2)_3, . . . , P^(n+2)_(n+2)), P^(n+2)_2), C^n]

Exercise
Show that f(x, y) = Σ_{z<h(x)} g(z, y) is primitive recursive when h is
primitive recursive and strictly monotonic.
Construction versus Search

Consider the nth prime function n ↦ p_n.

Trying to compute p_n directly from n is rather difficult; primes are too
irregular for that. But since it is easy to recognize a number as being
prime, we can simply search for the right number.

This is an important algorithmic technique: as long as the search extends
over some sufficiently small domain, it is (almost) as good as a direct
construction based on deeper insights.
Bounded Search

We can model search in the realm of p.r. functions as follows.

Definition (Bounded Search)
Let g : N^(n+1) → N. Then f = BS[g] : N^(n+1) → N is the function
defined by

    f(x, y) = min{ z < x | g(z, y) = 0 }   if such z exists,
              x                            otherwise.

So f(x, y) = x is just an arithmetic way of saying “the search failed.”
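In imperative terms, bounded search is a for loop with a default value. A
minimal C sketch for one parameter y (the name BS mirrors the definition;
the function pointer g is our stand-in for a p.r. function):

/* least z < x with g(z, y) = 0; the bound x itself signals failure */
unsigned BS(unsigned (*g)(unsigned, unsigned), unsigned x, unsigned y)
{
    for (unsigned z = 0; z < x; z++)
        if (g(z, y) == 0)
            return z;     /* the first witness is the minimum */
    return x;             /* "the search failed" */
}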
Closure
One can show that bounded search adds nothing to the class of p.r.
functions.
Proposition
If g is primitive recursive then so is BS[g].
This would usually be expressed as “primitive recursive functions are
closed under bounded search.”
Bounded Search II

This can be pushed a little further: the search does not have to end at x,
but can extend to a primitive recursive function of x and y:

    f(x, y) = min{ z < h(x, y) | g(z, y) = 0 }   if such z exists,
              h(x, y)                            otherwise.

Dire Warning:
We have to have a p.r. bound. Unbounded search as in

    min{ z | g(z, y) = 0 }

is not an admissible operation, not even when there is a suitable z for
each y. More on this later.
PR and Basic Number Theory

Claim
The divisibility relation div(x, y) is primitive recursive.

Note that

    div(x, y) ⇐⇒ ∃ z, 1 ≤ z ≤ y (x · z = y)

so that bounded search intuitively should suffice to obtain divisibility.

Formally, we have already seen that the characteristic function M(z, x, y)
of x · z = y is p.r. But then

    sign( Σ_{z≤y} M(z, x, y) )

is the p.r. characteristic function of div.
Primality

Claim
The primality relation is primitive recursive.

Intuitively, this is true since x is prime iff

    1 < x and ∀ z < x (div(z, x) ⇒ z = 1).

Claim
The next prime function f(x) = min{ z > x | z prime } is p.r.

This follows from the fact that bounded search again suffices:

    f(x) ≤ 2x   for x ≥ 1.

This bounding argument requires number theory (a white lie).
Enumerating Primes

Claim
The function n ↦ p_n, where p_n is the nth prime, is primitive recursive.

To see this we can iterate the “next prime” function from the last claim:

    p(0) = 2
    p(n + 1) = f(p(n))
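The three claims combine into straightforward code. A hedged C sketch
(names are ours; the loop in nextprime is bounded in principle by
f(x) ≤ 2x, even though the code does not spell that out):

#include <stdio.h>

/* div(x, y): is there a z with 1 ≤ z ≤ y and x · z = y? */
unsigned divides(unsigned x, unsigned y)
{
    for (unsigned z = 1; z <= y; z++)    /* bounded existential search */
        if (x * z == y) return 1;
    return 0;
}

/* x is prime iff 1 < x and every divisor z < x equals 1 */
unsigned prime(unsigned x)
{
    if (x < 2) return 0;
    for (unsigned z = 2; z < x; z++)     /* bounded universal check */
        if (divides(z, x)) return 0;
    return 1;
}

/* min z > x with z prime; guaranteed to stop by Bertrand's postulate */
unsigned nextprime(unsigned x)
{
    unsigned z = x + 1;
    while (!prime(z)) z++;
    return z;
}

/* p(0) = 2, p(n+1) = nextprime(p(n)) */
unsigned p(unsigned n)
{
    unsigned q = 2;
    while (n-- > 0) q = nextprime(q);
    return q;
}

int main(void)
{
    printf("%u\n", p(4));                /* 2, 3, 5, 7, 11: prints 11 */
    return 0;
}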
Yet More Logic

Arguments like the ones for basic number theory suggest another type of
closure property, with a more logical flavor.

Definition (Bounded quantifiers)

    P∀(x, y) ⇔ ∀ z < x P(z, y)
    P∃(x, y) ⇔ ∃ z < x P(z, y)

Note that P∀(0, y) = true and P∃(0, y) = false.

Informally,

    P∀(x, y) ⇐⇒ P(0, y) ∧ P(1, y) ∧ . . . ∧ P(x − 1, y)

and likewise for P∃.
Bounded Quantification

Bounded quantification is really just a special case of bounded search: for
P∃(x, y) we search for a witness z < x such that P(z, y) holds.
This generalizes to ∃ z < h(x, y) P(z, y) and ∀ z < h(x, y) P(z, y).

Proposition
Primitive recursive relations are closed under bounded quantification.

Proof.

    char_{P∀}(x, y) = Π_{z<x} char_P(z, y)

    char_{P∃}(x, y) = sign( Σ_{z<x} char_P(z, y) )

□
Exercises
Exercise
Give a proof that primitive recursive functions are closed under definition
by multiple cases.
Exercise
Show in detail that the function n 7→ pn where pn is the nth prime is
primitive recursive. How large is the p.r. expression defining the function?
Sequence Numbers

Recall the coding machinery we developed for register machines:

    ⟨.⟩ : N* → N
    len(⟨a_0, a_1, . . . , a_(n−1)⟩) = n
    dec(⟨a_0, a_1, . . . , a_(n−1)⟩, i) = a_i,   i < n

Technically, ⟨.⟩ cannot be primitive recursive since the domain N* is not
allowed in our framework.

But this is really just a technical problem: for every fixed n,
⟨x_1, . . . , x_n⟩ is a p.r. function.
Decoding is Primitive Recursive

Lemma
len, dec and Seq are all primitive recursive.

The key observation here is that there is a p.r. bounding function B such
that for every fixed length n,

    ∀ a_1, . . . , a_n < m (⟨a_1, . . . , a_n⟩ < B(n, m)).

Hence the whole discrete world of rationals, lists, matrices, trees, graphs
and so on is primitive recursive: everything can be coded as sequence
numbers and all the requisite operations are primitive recursive.
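The slides leave ⟨.⟩ abstract, but for concreteness here is one standard
choice in C, built from the Cantor pairing function; decoding is exactly
the kind of bounded search discussed above (all names are ours, and a real
sequence coding would iterate the pairing):

#include <stdio.h>

/* Cantor pairing: a primitive recursive bijection N × N → N */
unsigned pair(unsigned x, unsigned y)
{
    return (x + y) * (x + y + 1) / 2 + y;
}

/* decode by bounded search: both components are at most the code itself */
unsigned left(unsigned c)
{
    for (unsigned x = 0; x <= c; x++)
        for (unsigned y = 0; y <= c; y++)
            if (pair(x, y) == c) return x;
    return 0;   /* unreachable: pair is surjective */
}

int main(void)
{
    printf("%u\n", left(pair(7, 3)));    /* prints 7 */
    return 0;
}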
Recording History

One application of sequence numbers is course-of-value recursion. First
note that ordinary primitive recursion f = PRec[h, g] can be expressed in
terms of sequence numbers like so:

    f(x, y) = z ⇐⇒ ∃ s ∈ Seq ( len(s) = x^+ ∧ (s)_0 = g(y) ∧
                   ∀ i < x ((s)_(i^+) = h(i, (s)_i, y)) ∧ (s)_x = z )

The sequence number s records all previous values of f. Now consider
the following function associated with f:

    f̄(x, y) = ⟨f(0, y), f(1, y), . . . , f(x, y)⟩

Lemma
f is primitive recursive iff f̄ is primitive recursive.
Course of Value Recursion

Thus, it is natural to generalize the primitive recursion scheme slightly by
defining functions so that the value at x depends directly on all the
previous values:

    f(0, y) = g(y)
    f(x^+, y) = H(x, f̄(x, y), y)

Lemma
If g and H are primitive recursive, then f is also primitive recursive.
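A hedged C sketch of the generalized scheme: the array hist plays the
role of the sequence number f̄, and H may inspect the entire history. As
an example H implements the Fibonacci recurrence, which consumes two
earlier values and hence does not fit ordinary primitive recursion directly
(names and the fixed history bound are our choices):

#include <stdio.h>

/* H(x, hist) may look at all of f(0), ..., f(x-1) */
unsigned H(unsigned x, const unsigned hist[])
{
    if (x < 2) return 1;                 /* base cases, playing the role of g */
    return hist[x - 1] + hist[x - 2];    /* depends on two previous values */
}

unsigned f(unsigned x)                   /* assumes x < 64 */
{
    unsigned hist[64];                   /* stand-in for the sequence number */
    for (unsigned i = 0; i <= x; i++)
        hist[i] = H(i, hist);
    return hist[x];
}

int main(void)
{
    printf("%u\n", f(10));               /* prints 89 */
    return 0;
}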
Exercise
Prove the last two lemmata. You may safely assume that standard
sequence operations such as append are primitive recursive.
Exercises
Exercise
Prove that all the standard list operations are indeed primitive recursive.
Exercise
Explain how to implement search in binary search trees as a primitive
recursive operation.
Exercise
Come up with yet another coding function based on prime factorizations
and show that the decoding functions are primitive recursive.
3 Register Machines versus Primitive Recursion
RM vs. PR
Burning Question: How does the computational strength of register
machines compare to primitive recursive functions?
It is a labor of love to check that any p.r. function can indeed be
computed by a RM.
This comes down to building a RM compiler/interpreter for p.r.
functions. Since we can use structural induction this is not hard in
principle; we can proceed much as in the construction of the universal
RM.
Opposite Direction?
The cheap answer is to point out that some RM-computable
functions are not total, so they cannot be p.r.
True, but utterly boring. Here are the right questions:
Is there a total RM-computable function that is not primitive
recursive?
How much of a RM computation is primitive recursive?
In a Nutshell

Using the coding machinery from last time it is not hard to see that the
relation “RM M moves from configuration C to configuration C′ in t
steps” is primitive recursive.

But when we try to deal with “RM M moves from C to C′ in some
number of steps” things fall apart: there is no obvious way to find a
primitive recursive bound on the number of steps.

It is perfectly reasonable to conjecture that RM-computable is strictly
stronger than primitive recursive, but coming up with a nice example is
rather difficult.
Steps are p.r.

Proposition
Let M be a register machine. The t-step relation

    C →_M^t C′

is primitive recursive, uniformly in t and M.

Of course, this assumes a proper coding method for configurations and
register machines.

Since configurations are of the form

    (p, (x_1, . . . , x_n))

where p and the x_i are natural numbers, this is a straightforward
application of sequence numbers.
t Steps

Likewise we can encode a whole sequence of configurations

    C = C_0, C_1, . . . , C_(t−1), C_t = C′

again by a single integer. And we can check in a p.r. way that C →_M^t C′.

A crucial ingredient here is that the size of the C_i is bounded by
something like the size of C plus t, so we can bound the size of the
sequence number coding the whole computation given just the size of C
and t.

Exercise
Figure out exactly what is meant by the last comment.
Whole Computations?

Now suppose we want to push this argument further to deal with whole
computations. We would like the transitive closure

    C →_M^* C′

to be primitive recursive.

If we could bound the number of steps in the computation by some p.r.
function of C, then we could perform a brute-force search.

However, there is no reason why such a bound should exist; the number
of steps needed to get from C to C′ could be enormous.

Again, there is a huge difference between bounded and unbounded search.
Exactly How Much is Missing?

So, can we concoct a total, register computable function that fails to be
primitive recursive?

One way to accomplish this is to make sure the function grows faster
than any primitive recursive one, exploiting the inductive structure of
these functions.
Ackermann’s Function (1928)

The Ackermann function A : N × N → N is defined by

    A(0, y) = y^+
    A(x^+, 0) = A(x, 1)
    A(x^+, y^+) = A(x, A(x^+, y))

where x^+ is shorthand for x + 1.

Note the odious double recursion. On the surface, this looks more
complicated than primitive recursion, but we don’t know for a fact that it
really is more complex.
Ackermann is Computable

Proof: Here is a bit of C code that implements the Ackermann function
(assuming that we have infinite precision integers).

int acker(int x, int y)
{
    /* A(0,y) = y+1;  A(x+1,0) = A(x,1);  A(x+1,y+1) = A(x, A(x+1,y)) */
    return x ? acker(x - 1, y ? acker(x, y - 1) : 1) : y + 1;
}

All the work of organizing the nested recursion is handled by the compiler
and the execution stack. Of course, doing this on a register machine is a
bit more challenging, but it can be done.
Family Perspective

It is useful to think of Ackermann’s function as a family of unary
functions (A_x)_(x≥0) where A_x(y) = A(x, y). The critical part of the
definition then looks like so:

    A_(x^+)(y) = A_x(1)                 if y = 0,
                 A_x(A_(x^+)(y − 1))    otherwise,

so that

    A_(x^+)(y) = A_x^(y+1)(1).

So it’s all just iteration. From this it follows easily by induction that

Claim
Each level function A_x is primitive recursive.
The Bottom Hierarchy

    A(0, y) = y + 1
    A(1, y) = y + 2
    A(2, y) = 2y + 3
    A(3, y) = 2^(y+3) − 3
    A(4, y) = 2^2^···^2 − 3    (a tower of 2’s of height y + 3)

The first 4 levels of the Ackermann hierarchy are easy to understand,
though A_4 starts causing problems: the stack of 2’s in the exponentiation
has height y + 3. This is usually called super-exponentiation or tetration.
The Mystery of A(6, 6)
Alas, if we continue just a few more levels, darkness falls.
A(5, y) ≈ super-super exponentiation
A(6, y) ≈ an unspeakable horror
A(7, y) ≈ speechless
For level 5, one can get some vague understanding of iterated
super-exponentiation, but things start to get murky.
At level 6, we iterate over the already nebulous level 5 function, and
things really start to fall apart.
At level 7, Wittgenstein comes to mind: “Wovon man nicht sprechen
kann, darüber muss man schweigen.” (Whereof one cannot speak, thereof
one must be silent.)
Ackermann and Union/Find
One might think that the only purpose of the Ackermann function is to
refute the claim that computable is the same as p.r. Surprisingly, the
function pops up in the analysis of the Union/Find algorithm (with
ranking and path compression).
The running time of Union/Find differs from linear only by a minuscule
amount, which is something like the inverse of the Ackermann function.
But in general anything beyond level 3.5 of the Ackermann hierarchy is
irrelevant for practical computation.
Exercise
Read an algorithms text that analyzes the run time of the Union/Find
method.
Ackermann is Total

Note that it is not entirely clear that A(x, y) is well-defined for all x and
y; see the notes on well-orders. One possible but clumsy proof is by
induction on x, with a subinduction on y.

A more elegant way is to use induction on N × N with respect to the
lexicographic product order

    (a, b) < (c, d) ⇐⇒ (a < c) ∨ (a = c ∧ b < d),

a well-ordering of order type ω^2.
Exercise
Carry out the proof of totality.
Ackermann vs. PR

Theorem
The Ackermann function dominates every primitive recursive function f
in the sense that there is a k such that

    f(x) < A(k, max x).

Hence A is not primitive recursive.

Sketch of proof.
One can argue by induction on the buildup of f.
The atomic functions are easy to deal with.
The interesting part is to show that the property is preserved under an
application of composition and of primitive recursion. Alas, the details
are rather tedious. □
Register vs. Ackermann

Let’s come back to the gap between p.r. and RM-computable.
How much do we have to add to primitive recursion to capture the
Ackermann function?

Proposition
There is a primitive recursive relation R such that

    A(a, b) = c ⇐⇒ ∃ t R(t, a, b, c).

Think of t as a complete trace of the computation of the Ackermann
function on input a and b. For example, t could be a hash table that
stores all the values that were computed during the call to A(a, b), using
memoization.
Verifying Traces

The entries in t are determined by the following rules:

    (a, b, c) ∈ t for some c,
    (0, y, z) ∈ t implies z = y^+,
    (x^+, 0, z) ∈ t implies (x, 1, z) ∈ t,
    (x^+, y^+, z) ∈ t implies (x, u, z) ∈ t and (x^+, y, u) ∈ t for some u,
    (x, y, z) ∈ t and (x, y, z′) ∈ t implies z = z′.

Any table that satisfies these rules proves that indeed A(a, b) = c
(though it might have extraneous entries).

So we have a decision algorithm that tests whether an alleged
computation of A is in fact correct. In fact, it is not hard to show that
the decision algorithm is primitive recursive.
Unbounded Search

So to compute A we only need to add search: systematically check all
possible tables until the right one pops up (it must, since A is total). The
problem is that this search is no longer primitive recursive.

More precisely, let

    acker(t, x, y) ⇐⇒ t is a correct trace for x, y
    lookup(t) = z ⇐⇒ (x, y, z) ∈ t

Then

    A(x, y) = lookup( min{ t | acker(t, x, y) } ).

This is all primitive recursive except for the unbounded search in min.
Iterative Ackermann

The recursive definition from above is close to Ackermann’s original
approach. Here is an alternative, based more directly on iteration:

    B_1(x) = 2x
    B_(k+1)(x) = B_k^x(1)

So B_1 is doubling, B_2 exponentiation, B_3 super-exponentiation, and so
on. In general, B_k is closely related to A_(k+1).
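A hedged C transcription of the B hierarchy; the values explode so
quickly that only toy arguments stay within machine integers (names ours,
no overflow checking):

#include <stdio.h>

/* B_1(x) = 2x;  B_(k+1)(x) = x-fold iteration of B_k, starting at 1 */
unsigned long B(unsigned k, unsigned long x)
{
    if (k == 1) return 2 * x;
    unsigned long r = 1;
    for (unsigned long i = 0; i < x; i++)
        r = B(k - 1, r);
    return r;
}

int main(void)
{
    printf("%lu\n", B(2, 10));           /* B_2 is exponentiation: prints 1024 */
    return 0;
}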
Subsequence Order

Recall the subsequence ordering on words: u = u_1 u_2 . . . u_n precedes
v = v_1 v_2 . . . v_m if there exists a strictly increasing sequence
1 ≤ i_1 < i_2 < . . . < i_n ≤ m of indices such that u = v_(i_1) v_(i_2) . . . v_(i_n).
We write u ⊑ v.

Subsequence order is not total unless the alphabet has size 1.

Note that subsequence order is independent of any underlying order of
the alphabet (unlike, say, lexicographic or length-lex order).
Warmup: Antichains

An antichain in a partial order is a sequence x_0, x_1, . . . , x_n, . . . of
elements such that i < j implies that x_i and x_j are incomparable.

Example
Consider the powerset of [n] = {1, 2, . . . , n} with the standard subset
ordering. How does one construct a long antichain?
For example, x_0 = {1} is a bad idea, and x_0 = [n] is even worse.
Higman’s Lemma

Theorem (Higman 1952)
Every antichain in the subsequence order is finite.

Proof. Here is the Nash-Williams proof (1963): assume there is an
infinite antichain.

For each n, let x_n be the length-lex minimal word such that
x_0, x_1, . . . , x_n starts an infinite antichain, producing a sequence
x = (x_i).

For any sequence of words y = (y_i) and a letter a, define a⁻¹y to be the
sequence consisting of all words in y starting with letter a, and with that
a removed.

Since the alphabet is finite, there exists a letter a such that x′ = a⁻¹x is
an infinite antichain. But the first word of x′ is shorter than the word of
x it came from, contradicting the length-lex minimal choice of that
word. □
Friedman’s Self-Avoiding Words

For a finite or infinite word x, write x[i] for the block x_i x_(i+1) . . . x_(2i).
We will always assume that i ≤ |x|/2 when x is finite.

Bizarre Definition: A word is self-avoiding if for i < j the block x[i] is
not a subsequence of x[j].

The following is a consequence of Higman’s theorem.

Theorem
Every self-avoiding word is finite.

By the last theorem we can define

    α(k) = length of the longest self-avoiding word over Σ_k
How Big?

Trivially, α(1) = 3.
A little work shows that α(2) = 11, as witnessed by 01110000000.
But

    α(3) > B_7198(158386),

an incomprehensibly large number. Smelling salts, anyone?
It is truly surprising that a function with as simple a definition as α
should exhibit this kind of growth.
An Insane Hierarchy *

If you understand (small) ordinals, one can define a hierarchy (due to
Grzegorczyk and Wainer) of fast growing functions f_α : N → N like so:

    f_0(x) = x + 1
    f_(α+1)(x) = f_α^x(x)
    f_λ(x) = f_(λ[x])(x)

Here λ = lim λ[n], and the approximating sequence can be chosen in a
natural fashion as long as λ is not too large. For example, λ ≤ ε_0 works
well, where ε_0 is the first fixed point of α ↦ ω^α.

In this setting the Ackermann function is essentially just f_ω, and utterly
puny compared to f_(ω·2) or f_(ω^ω) or f_(ε_0).
4 µ-Recursive Functions
Closing the Gap

Since the Ackermann function misses being primitive recursive only by
the lack of an unbounded search operation, it is tempting to extend
primitive recursive functions a bit by adding a min operator.

From the previous discussion, primitive recursion plus a min operator is
powerful enough to produce the Ackermann function. Of course, the real
question is: what is the relationship between the extended class and
computable functions in general?

As we will see, we obtain the same class of functions.
Regular Minimization

Let us call a function g regular if

    ∀ x ∃ z (g(z, x) = 0).

Hence, if we define f from g by minimization, the resulting function will
be total:

    f(x) = min{ z | g(z, x) = 0 }

This is also called regular search, for obvious reasons. We will write
f(x) = µz (g(z, x) = 0).
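Operationally, the µ operator is a while loop with no bound. A minimal C
sketch for a single parameter x (names ours; termination hinges entirely
on g being regular):

/* µz (g(z, x) = 0): unbounded search for the least witness */
unsigned mu(unsigned (*g)(unsigned, unsigned), unsigned x)
{
    unsigned z = 0;
    while (g(z, x) != 0)
        z++;              /* loops forever if no witness exists */
    return z;
}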
µ-Recursive Functions
Definition (Kleene 1936)
The class of µ-recursive functions is defined as the closure of the basic
functions zero, successor, and projections under the operations
composition, primitive recursion and regular search.
It follows immediately that the Ackermann function is µ-recursive.
Regular Search Computability

Here is a slightly different definition: we remove primitive recursion from
the operations, but augment the initial functions by addition and
multiplication. Define the regular search computable functions to be the
least class that contains the basic functions (projections, equality,
addition and multiplication) and is closed under composition and regular
search.
Exercise (Hard)
How does this compare to µ-recursive functions?
Partial Functions
If we want to establish a close connection between machines and classes
of functions defined by closures, we have to tackle the problem of
divergent computations.
Given our framework, this is not hard: just drop the regularity condition.
We allow
    f(x) = min{ z | g(z, x) = 0 }
regardless of whether a witness z exists for all x.
This produces the class of partial µ-recursive functions.
But Beware

It is critical that g is still total. For partial g, application of the µ
operator can lead to non-computable functions.

Also, one cannot do things like

    f(x) = min{ z | g(z, x) = 0 }   if such z exists,
           0                        otherwise.

This definition-by-cases uses a test that is generally undecidable.
Church-Turing Thesis

Let f : N^n ⇀ N be a partial function. Then the following are equivalent:

    f is intuitively computable,
    f is partial µ-recursive,
    f is register machine computable,
    f is λ-computable,
    f is Herbrand-Gödel computable.
Of course, the first item has to be removed to get a proper theorem.
We’ll skip the proof. There are many other equivalent characterizations
which all add to the weight of this thesis.
Also note that Gödel was completely convinced by Turing’s analysis (and
gave him full credit, claiming none for himself and Church).
The Interesting Part

Kleene established a strong normal form result that shows that all
computable functions can be represented in a certain uniform way.

Theorem (Kleene Normal Form)
There is a p.r. relation trace and a p.r. function lookup such that register
machine M_e on input x halts with output y if, and only if,

    y = lookup( min{ t | trace(t, e, x) } ).

Here t is a complete trace of the computation of M_e on input x, much as
in the discussion of the Ackermann function. The trace predicate checks
whether t is really a correct trace for machine M_e on input x.

Note the lookup: it extracts the output of M_e from the trace t. One can
show that some device such as lookup is required.
A Practical Approach

Suppose you have some problem A ⊆ N that is probably (at least)
semidecidable.

Figure out a way to express A as follows:

    x ∈ A ⇐⇒ ∃ z R(z, x)

where R is just primitive recursive. In many cases it is not hard to find
R; it arises naturally from the definition of A.

If the search cannot be bounded, then A is just semidecidable.
If one can find a computable bound for the witness z, then A is
decidable.
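In code, this recipe is the familiar generate-and-test loop. A hedged C
sketch (R is assumed to be some primitive recursive test returning 1 on a
witness; the name semidecide is ours):

/* halts with 1 iff x ∈ A; otherwise searches forever */
int semidecide(unsigned (*R)(unsigned, unsigned), unsigned x)
{
    for (unsigned z = 0; ; z++)
        if (R(z, x)) return 1;
}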
5 General Recursive Functions *
Example: Addition

Suppose we have a constant 0 and the successor function x ↦ x^+.
Consider the following system of equations E:

    f(x, 0) = x
    f(x, y^+) = f(x, y)^+

Intuitively, E defines addition. For example,

    f(3, 2) = f(3, 1^+) = f(3, 1)^+ = f(3, 0^+)^+ = f(3, 0)^(++) = 3^(++) = 4^+ = 5.

In fact, we can derive f(3, 2) = 5 from these equations using only
standard (and very simple) rules of inference.
Equational Reasoning

More formally, we really should use numerals denoting natural numbers:
0, 1 = 0^+, 2 = 1^+, . . . In general, we have a numeral n̄ for every
natural number n.

There are two rules:

Substitution:
We can replace a free variable everywhere in an equation by a numeral.

Replacement:
A term f(a_1, . . . , a_k) can be replaced by a numeral b if we have
already proven f(a_1, . . . , a_k) = b.
3 + 2 = 5

    (1)  f(3, 0) = 3             subst. E1
    (2)  f(3, 1) = f(3, 0)^+     subst. E2
    (3)  f(3, 2) = f(3, 1)^+     subst. E2
    (4)  f(3, 1) = 3^+ = 4       repl. (1), (2)
    (5)  f(3, 2) = 4^+ = 5       repl. (4), (3)
Tedious, but completely mechanical and easy to implement in any
language with good support for pattern matching.
Herbrand-Gödel Computability

With a little more effort we can give a precise notion of derivability (or
provability) in E:

    E ⊢ f(a_1, . . . , a_k) = b

Definition
A function f : N^k ⇀ N is Herbrand-Gödel computable if there is a finite
system of equations E containing a k-ary function symbol f such that

    E ⊢ f(ā_1, . . . , ā_k) = b̄ ⇐⇒ f(a_1, . . . , a_k) ≃ b

for all a_i, b ∈ N.
Terminology
These functions are usually called general recursive functions.
The “general” stems from the fact that primitive recursive functions were
originally referred to as “recursive functions”.
To round off the confusion, many older authors mean total computable
functions when they say “recursive function”, and use “partial recursive
function” for the general case.
And there is an effort under way to get rid of all names that include
“recursive” and replace them with “computable.” We’ll mix and match.
Herbrand-Gödel versus Machines

Theorem
A function is general recursive iff it is (register machine) computable.

Sketch of proof. Say the function is f : N ⇀ N.

Given a system of equations E for f and exploiting standard coding
tricks, we can build a register machine M that on input a ∈ N will search
for a b ∈ N and a derivation in E of f(a) = b.

If they exist, the RM returns b after finitely many steps; otherwise it
diverges.
And Back
For the opposite direction we need to find a way to simulate register
machines in terms of arithmetic functions. This requires two main
ingredients:
a bit of coding machinery to express RMs, configurations and such
as natural numbers, and
a way to express search (for the time t when a computation stops)
via HG computability.
Except for the unbounded search this is all primitive recursive. But p.r.
functions can clearly be described by a system of equations.
Search

It remains to deal with unbounded search in an equational manner
(Kleene 1952).

Assume f(x) = µz (g(x, z) = 0). Introduce three new function symbols
α, β and ĝ, with equations

    α(x, y^+) = x
    β(x, 0) = x
    ĝ(x, 0) = g(x, 0)
    ĝ(x, y^+) = α(g(x, y^+), ĝ(x, y))
    f(x) = β(z, ĝ(x, z))

Ponder deeply. □