CDM Recursive Functions
Klaus Sutner, Carnegie Mellon University, Fall 2015

Outline
1 Primitive Recursive Functions
2 Pushing Primitive Recursion
3 Register Machines versus Primitive Recursion
4 µ-Recursive Functions
5 General Recursive Functions *

Machines versus Programs

Machine models of computation (register machines, Turing machines) are easy to describe and very natural. Turing machines in particular have the great advantage that they “clearly” correspond to the abilities of an (idealized) human computor; see Turing’s 1936 paper. Alas, they do not match up well with the way an actual mathematician or computer scientist would define a computable function, or would demonstrate that some already known function is in fact computable. Constructing a specific machine for a particular computation is rather tedious (plus, one really should provide a correctness proof). It is usually preferable to be able to write down some kind of expression or equation that defines the function.

Primitive Recursive Functions

To avoid problems with semantics we introduce a language that has only one data type: N, the non-negative integers. We will define the so-called primitive recursive functions: total functions of the form f : N^n → N that can be computed, intuitively, using no more than a limited type of recursion. We use induction to define this class of functions: we select a few simple basic functions and build up more complicated ones by composition and a limited type of recursion.

The Basic Functions

There are three types of basic functions:

  constant zero       C^n : N^n → N,    C^n(x) = 0
  projections         P^n_i : N^n → N,  P^n_i(x) = x_i
  successor function  S : N → N,        S(x) = x + 1

Notation: we write x = (x_1, . . . , x_n) for a vector of suitable length. On occasion we abbreviate x + 1 to x+. This may seem like a rather too spartan set of built-in functions, but as we will see it is all we need.

Closure Operations: Composition

Given functions g_i : N^m → N for i = 1, . . . , n, and h : N^n → N, we define a new function f : N^m → N by composition as follows:

  f(x) = h(g_1(x), . . . , g_n(x))

Notation: we write h ◦ (g_1, . . . , g_n), inspired by the well-known special case (h ◦ g)(x) = h(g(x)).

Closure Operations: Primitive Recursion

Given h : N^(n+2) → N and g : N^n → N, we define a new function f : N^(n+1) → N by

  f(0, y) = g(y)
  f(x + 1, y) = h(x, f(x, y), y)

This pattern should look very familiar.

Definition: A function is primitive recursive (p.r.) if it can be constructed from the basic functions by applying composition and primitive recursion.

Example: Factorials

The standard definition of the factorial function uses recursion like so:

  f(0) = 1
  f(x + 1) = (x + 1) · f(x)

To write the factorial function in the form of a primitive recursion we need

  g : N^0 → N,  g() = 1
  h : N^2 → N,  h(u, v) = (u + 1) · v

Here g is none other than S ◦ C^0, and h is multiplication combined with the successor function. Done, since . . .

Example: Multiplication and Addition

To get multiplication we use another recursion, written in a slightly more informal and easier to read manner:

  mult(0, y) = 0
  mult(x + 1, y) = add(mult(x, y), y)

Here we use addition, which can in turn be defined by yet another recursion:

  add(0, y) = y
  add(x + 1, y) = S(add(x, y))

Since S is a basic function, we have a proof that multiplication is primitive recursive.
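To see the chain S → add → mult → fact in action, here is a minimal C sketch (mine, not part of the slides; the function names are my own): each function is written strictly in the shape of the primitive recursion scheme, calling only its predecessor in the chain.

  #include <stdio.h>

  /* Successor: the only arithmetic we start from. */
  unsigned succ(unsigned x) { return x + 1; }

  /* add = PRec[h, g] with g(y) = y and h(x, r, y) = S(r). */
  unsigned add(unsigned x, unsigned y) {
      return x == 0 ? y : succ(add(x - 1, y));
  }

  /* mult = PRec[h, g] with g(y) = 0 and h(x, r, y) = add(r, y). */
  unsigned mult(unsigned x, unsigned y) {
      return x == 0 ? 0 : add(mult(x - 1, y), y);
  }

  /* fact = PRec[h, g] with g() = 1 and h(x, r) = mult(x + 1, r). */
  unsigned fact(unsigned x) {
      return x == 0 ? 1 : mult(succ(x - 1), fact(x - 1));
  }

  int main(void) {
      printf("5! = %u\n", fact(5));   /* prints 5! = 120 */
      return 0;
  }

Note how the first argument of each step function h is the predecessor x, exactly as in h(x, f(x, y), y).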
It’s Easier

Except perhaps for some rabid hacker, everyone would agree that the chain

  S → add → mult → fact

of primitive recursions is much easier to understand and construct than some register machine program for factorials.

Exercise: Write a register machine program for the factorial function and prove that it is correct.

These equational definitions of basic arithmetic functions date back to R. Dedekind’s 1888 booklet “Was sind und was sollen die Zahlen?” (“What are numbers and what should they be?”).

Digression: Dedekind the Giant

Dedekind was Gauss’s last student and, together with Cantor and Frege, the driving force behind the development of set theory. He published little, but his sense of direction, precision and rigor was far ahead of (most of) his contemporaries. His student Emmy Noether used to say (exaggerating only slightly): “Es steht alles schon bei Dedekind.” (Everything is already in Dedekind’s writings.)

A PR Language

We defined primitive recursive functions by induction, operating in function spaces. But it is entirely straightforward to give a purely syntactic definition of a class of terms whose semantics are precisely the primitive recursive functions. The terms are constructed from some simple built-ins (for constant zero, projection and successor) plus constructors for composition and recursion. A pleasant feature of these terms is the complete absence of variables.

The Language PRec

Each term in our language has a fixed arity, corresponding to the number of arguments of the function it denotes. The built-in terms are

  C_n of arity n, for each n ≥ 0;
  P^n_i of arity n, for each n ≥ i ≥ 1;
  S of arity 1.

Compound terms are constructed as follows:

  Given n terms τ_i of arity m and one term σ of arity n, Comp[σ, τ_1, . . . , τ_n] is a new term of arity m.
  Given terms σ of arity n + 2 and τ of arity n, PRec[σ, τ] is a new term of arity n + 1.

Our language of terms is the inductive closure.

The Meaning of a Program

We have to explain what these terms mean. This is not hard: each well-formed term τ of arity n is associated with its meaning [[τ]], a total arithmetic function

  [[τ]] : N^n → N.

The definition is by induction on the build-up of τ: first the basic terms, then the compound ones. Note that this is very different from machine models: there is no general way to do induction on the “structure of a register machine” (or on the number of states; see the Busy Beaver problem). In this sense, primitive recursive functions are much more attractive.

Semantics

  [[C_n]](x) = 0        constant zero
  [[P^n_i]](x) = x_i    projections
  [[S]](x) = x + 1      successor function

Composition:

  [[Comp[σ, τ_1, . . . , τ_n]]](x) = [[σ]]([[τ_1]](x), . . . , [[τ_n]](x))

Primitive recursion:

  [[PRec[σ, τ]]](x, y) = [[τ]](y)                           if x = 0,
  [[PRec[σ, τ]]](x, y) = [[σ]](z, [[PRec[σ, τ]]](z, y), y)  if x = z + 1.

Example: Factorials

It follows immediately that [[τ]] is always a primitive recursive function, and that all primitive recursive functions are obtained this way. For example, the factorial function corresponds to the expression

  PRec[PRec[PRec[S ◦ P^3_2, P^1_1] ◦ (P^3_2, P^3_3), C_1] ◦ (S ◦ P^2_1, P^2_2), S ◦ C_0]

The innermost PRec[ , ] yields addition, the next multiplication, and the last factorial.

Why bother with this distinction? In general, it is a good idea to separate syntax from semantics; they live in different worlds. Admittedly, in this case the result is not overwhelming.

Evaluation

One reason for the PR language is that it allows us to give a clear definition of evaluation. An evaluation operator eval takes any well-formed term τ of arity n and any input x ∈ N^n, and produces the result of executing the corresponding p.r. function on x:

  eval(τ, x) = [[τ]](x)

Note that the term τ is just a string, or a code number.

Exercise: Write a compiler that, given any string τ, checks whether it is a well-formed expression denoting a primitive recursive function.

Exercise: Write an interpreter for primitive recursive functions (i.e., implement eval).
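As a hint toward the interpreter exercise, here is a C sketch (my own choice of representation, not prescribed by the slides): terms are a small tagged tree, and eval follows the semantic equations above literally.

  #include <stdlib.h>
  #include <string.h>

  typedef struct Term Term;
  struct Term {
      enum { ZERO, PROJ, SUCC, COMP, PREC } tag;
      int arity;    /* arity of this term */
      int i;        /* PROJ: which argument (1-based) */
      int n;        /* COMP: number of inner terms */
      Term *sigma;  /* COMP: outer term; PREC: step term */
      Term **tau;   /* COMP: inner terms; PREC: tau[0] is the base term */
  };

  /* eval(t, x) = [[t]](x); x holds t->arity arguments. */
  unsigned eval(const Term *t, const unsigned *x) {
      switch (t->tag) {
      case ZERO: return 0;             /* [[C_n]](x) = 0     */
      case PROJ: return x[t->i - 1];   /* [[P^n_i]](x) = x_i */
      case SUCC: return x[0] + 1;      /* [[S]](x) = x + 1   */
      case COMP: {                     /* [[σ]]([[τ_1]](x), ..., [[τ_n]](x)) */
          unsigned *v = malloc(t->n * sizeof *v);
          for (int j = 0; j < t->n; j++) v[j] = eval(t->tau[j], x);
          unsigned r = eval(t->sigma, v);
          free(v);
          return r;
      }
      case PREC: {                     /* x = (x_0, y), |y| = arity - 1 */
          int n = t->arity - 1;
          if (x[0] == 0) return eval(t->tau[0], x + 1);   /* [[τ]](y) */
          unsigned *xz = malloc((n + 1) * sizeof *xz);    /* (z, y)   */
          memcpy(xz, x, (n + 1) * sizeof *xz);
          xz[0] = x[0] - 1;
          unsigned r = eval(t, xz);                       /* [[PRec]](z, y) */
          unsigned *a = malloc((n + 2) * sizeof *a);      /* (z, r, y) */
          a[0] = xz[0]; a[1] = r;
          memcpy(a + 2, x + 1, n * sizeof *a);
          unsigned res = eval(t->sigma, a);               /* [[σ]](z, r, y) */
          free(xz); free(a);
          return res;
      }
      }
      return 0;  /* unreachable for well-formed terms */
  }

A parser for the string representation and an arity checker (the compiler exercise) would sit on top of this.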
Part 2: Pushing Primitive Recursion

Enhancements

Apparently we lack a mechanism for definition by cases:

  f(x) = 3    if x < 5,
  f(x) = x^2  otherwise.

We know that x ↦ 3 and x ↦ x^2 are p.r., but is f also p.r.? And how about more complicated operations such as the GCD, or the function that enumerates the prime numbers?

Definition by Cases

Definition: Let g, h : N^n → N and R ⊆ N^n. Define f = DC[g, h, R] by

  f(x) = g(x)  if x ∈ R,
  f(x) = h(x)  otherwise.

We want to show that definition by cases is admissible, in the sense that when applied to primitive recursive functions/relations we obtain another primitive recursive function. Note that we need to express the relation R as a function; more on that in a minute.

Sign and Inverted Sign

The first step towards implementing definition by cases is a bit strange, but we will see that the next function is actually quite useful. The sign function is defined by

  sign(x) = min(1, x)

so that sign(0) = 0 and sign(x) = 1 for all x ≥ 1. Sign is primitive recursive:

  sign = PRec[S ◦ C_2, C_0]

Similarly the inverted sign function is primitive recursive:

  isign(x) = 1 ∸ sign(x)

Here ∸ denotes proper subtraction (truncated at 0), itself an easy primitive recursion; we write sub for it below.

Equality and Order

Define E : N^2 → N by

  E = isign ◦ add ◦ (sub ◦ (P^2_1, P^2_2), sub ◦ (P^2_2, P^2_1))

or, sloppy but more intelligible:

  E(x, y) = isign((x ∸ y) + (y ∸ x))

Then E(x, y) = 1 iff x = y, and 0 otherwise. Hence we can express equality as a primitive recursive function. Even better, we can get other order relations such as ≤, <, ≥, and so on. So the fact that our language lacks relations is not really a problem; we can express them as functions.

Relations

As before, we can use the characteristic function of a relation R,

  char_R(x) = 1 if x ∈ R, 0 otherwise,

to translate relations into functions.

Definition: A relation is primitive recursive if its characteristic function is primitive recursive.

The same method works for any notion of computable function: any class of functions (RM-computable, p.r., polynomial time, whatever) yields a corresponding class of relations.

Closure Properties

Proposition: The primitive recursive relations are closed under intersection, union and complement.

Proof.

  char_(R ∩ S) = mult ◦ (char_R, char_S)
  char_(R ∪ S) = sign ◦ add ◦ (char_R, char_S)
  char_(N^n − R) = sub ◦ (S ◦ C_n, char_R)

The proof is slightly different from the argument for decidable relations, but it is really the same idea. □

Exercise: Show that every finite set is primitive recursive.

Arithmetic and Logic

Note what is really going on here: we are using arithmetic to express logical concepts such as disjunction. The fact that this translation is possible, and requires very little on the side of arithmetic, is a central reason for the algorithmic difficulty of many arithmetic problems: logic is hard, so by implication arithmetic is also difficult. For example, finding solutions of Diophantine equations is hard. Incidentally, primitive recursive functions were used extensively by K. Gödel in his incompleteness proof.
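To make the logic-as-arithmetic point concrete, here is a small C sketch (mine, not from the slides; function names are my own) of the machinery above: proper subtraction, sign, equality, the Boolean connectives, and the definition-by-cases combinator whose admissibility is proved next.

  /* Truth values are 0/1; monus is proper subtraction, truncated at 0. */
  unsigned monus(unsigned x, unsigned y) { return x > y ? x - y : 0; }
  unsigned sign(unsigned x)  { return x > 0 ? 1 : 0; }
  unsigned isign(unsigned x) { return monus(1, sign(x)); }

  /* E(x, y) = isign((x - y) + (y - x)), exactly as on the slide. */
  unsigned eq(unsigned x, unsigned y)  { return isign(monus(x, y) + monus(y, x)); }
  unsigned leq(unsigned x, unsigned y) { return isign(monus(x, y)); }

  /* Connectives as arithmetic: "and" is product, "or" is sign of sum. */
  unsigned b_and(unsigned p, unsigned q) { return p * q; }
  unsigned b_or(unsigned p, unsigned q)  { return sign(p + q); }
  unsigned b_not(unsigned p)             { return monus(1, p); }

  /* DC[g, h, R](x) = char_R(x)·g(x) + char_¬R(x)·h(x). */
  unsigned dc(unsigned in_R, unsigned gx, unsigned hx) {
      return in_R * gx + b_not(in_R) * hx;
  }

Every one of these is built from +, · and ∸ alone, which is the whole point.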
DC is Admissible

Proposition: If g, h, R are primitive recursive, then f = DC[g, h, R] is also primitive recursive.

Proof.

  f = add ◦ (mult ◦ (char_R, g), mult ◦ (char_¬R, h))

Less cryptically,

  f(x) = char_R(x) · g(x) + char_¬R(x) · h(x)

Since either char_R(x) = 1 and char_¬R(x) = 0, or the other way around, we get the desired behavior. □

Bounded Sum

Proposition: Let g : N^(n+1) → N be primitive recursive, and define

  f(x, y) = Σ_(z<x) g(z, y).

Then f : N^(n+1) → N is again primitive recursive. The same holds for products.

Proof.

  f = PRec[add ◦ (g ◦ (P^(n+2)_1, P^(n+2)_3, . . . , P^(n+2)_(n+2)), P^(n+2)_2), C_n]

Exercise: Show that f(x, y) = Σ_(z<h(x)) g(z, y) is primitive recursive when h is primitive recursive and strictly monotonic.

Construction versus Search

Consider the nth prime function n ↦ p_n. Trying to compute p_n directly from n is rather difficult; primes are too irregular for that. But since it is easy to recognize a number as being prime, we can simply search for the right number. This is an important algorithmic technique: as long as the search extends over some sufficiently small domain, it is (almost) as good as a direct construction based on deeper insights.

Bounded Search

We can model search in the realm of p.r. functions as follows.

Definition (Bounded Search): Let g : N^(n+1) → N. Then f = BS[g] : N^(n+1) → N is the function defined by

  f(x, y) = min { z < x | g(z, y) = 0 }  if such a z exists,
  f(x, y) = x                            otherwise.

So f(x, y) = x is just an arithmetic way of saying “the search failed.”

Closure

One can show that bounded search adds nothing to the class of p.r. functions.

Proposition: If g is primitive recursive then so is BS[g].

This would usually be expressed as “primitive recursive functions are closed under bounded search.”

Bounded Search II

This can be pushed a little further: the search does not have to end at x, but can extend to a primitive recursive function of x and y:

  f(x, y) = min { z < h(x, y) | g(z, y) = 0 }  if such a z exists,
  f(x, y) = h(x, y)                            otherwise.

Dire Warning: we have to have a p.r. bound. Unbounded search, as in

  min { z | g(z, y) = 0 },

is not an admissible operation, not even when there is a suitable z for each y; more later.

PR and Basic Number Theory

Claim: The divisibility relation div(x, y) is primitive recursive.

Note that

  div(x, y) ⇐⇒ ∃ z, 1 ≤ z ≤ y (x · z = y)

so that bounded search intuitively should suffice to obtain divisibility. Formally, we have already seen that the characteristic function M(z, x, y) of x · z = y is p.r. But then

  sign( Σ_(z≤y) M(z, x, y) )

is the p.r. characteristic function of div.

Primality

Claim: The primality relation is primitive recursive.

Intuitively, this is true since x is prime iff 1 < x and ∀ z < x (div(z, x) ⇒ z = 1).

Claim: The next prime function f(x) = min { z > x | z prime } is p.r.

This follows from the fact that bounded search again suffices: f(x) ≤ 2x for x ≥ 1 (Bertrand’s postulate). This bounding argument requires number theory (a white lie).

Enumerating Primes

Claim: The function n ↦ p_n, where p_n is the nth prime, is primitive recursive.

To see this, we can iterate the “next prime” function f from the last claim:

  p(0) = 2
  p(n + 1) = f(p(n))
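This search pattern is easy to render in C. A sketch (mine, not from the slides): divisibility and primality via bounded quantifiers, the next prime via bounded search up to 2x, and p_n by iteration.

  #include <stdio.h>

  /* div(x, y): bounded search for a witness z with x·z = y. */
  unsigned divides(unsigned x, unsigned y) {
      for (unsigned z = 1; z <= y; z++)
          if (x * z == y) return 1;
      return 0;
  }

  /* prime(x): 1 < x and no divisor z with 1 < z < x (bounded quantifier). */
  unsigned prime(unsigned x) {
      if (x < 2) return 0;
      for (unsigned z = 2; z < x; z++)
          if (divides(z, x)) return 0;
      return 1;
  }

  /* Next prime by bounded search: a prime z with x < z <= 2x always
     exists for x >= 1 (Bertrand), so the loop bound is primitive recursive. */
  unsigned next_prime(unsigned x) {
      for (unsigned z = x + 1; z <= 2 * x; z++)
          if (prime(z)) return z;
      return 2;  /* only reached for x = 0 */
  }

  /* p(0) = 2, p(n + 1) = next_prime(p(n)). */
  unsigned nth_prime(unsigned n) {
      unsigned p = 2;
      while (n-- > 0) p = next_prime(p);
      return p;
  }

  int main(void) {
      printf("p_5 = %u\n", nth_prime(5));  /* 2, 3, 5, 7, 11, 13 -> 13 */
      return 0;
  }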
Yet More Logic

Arguments like the ones for basic number theory suggest another type of closure property, with a more logical flavor.

Definition (Bounded quantifiers):

  P_∀(x, y) ⇐⇒ ∀ z < x P(z, y)
  P_∃(x, y) ⇐⇒ ∃ z < x P(z, y)

Note that P_∀(0, y) = true and P_∃(0, y) = false. Informally,

  P_∀(x, y) ⇐⇒ P(0, y) ∧ P(1, y) ∧ . . . ∧ P(x − 1, y)

and likewise for P_∃.

Bounded Quantification

Bounded quantification is really just a special case of bounded search: for P_∃(x, y) we search for a witness z < x such that P(z, y) holds. This generalizes to ∃ z < h(x, y) P(z, y) and ∀ z < h(x, y) P(z, y).

Proposition: Primitive recursive relations are closed under bounded quantification.

Proof.

  char_(P_∀)(x, y) = Π_(z<x) char_P(z, y)
  char_(P_∃)(x, y) = sign( Σ_(z<x) char_P(z, y) )

□

Exercises

Exercise: Give a proof that primitive recursive functions are closed under definition by multiple cases.

Exercise: Show in detail that the function n ↦ p_n, where p_n is the nth prime, is primitive recursive. How large is the p.r. expression defining the function?

Sequence Numbers

Recall the coding machinery we developed for register machines:

  ⟨.⟩ : N* → N
  len(⟨a_0, a_1, . . . , a_(n−1)⟩) = n
  dec(⟨a_0, a_1, . . . , a_(n−1)⟩, i) = a_i,  i < n

Technically, ⟨.⟩ cannot be primitive recursive, since the domain N* is not allowed in our framework. But this is really just a technical problem: for every fixed n, ⟨x_1, . . . , x_n⟩ is a p.r. function.

Decoding is Primitive Recursive

Lemma: len, dec and Seq are all primitive recursive.

The key observation here is that there is a p.r. bounding function B such that for every fixed length n,

  ∀ a_1, . . . , a_n < m (⟨a_1, . . . , a_n⟩ < B(n, m)).

Hence the whole discrete world of rationals, lists, matrices, trees, graphs and so on is primitive recursive: everything can be coded as sequence numbers, and all the requisite operations are primitive recursive.

Recording History

One application of sequence numbers is course-of-value recursion. First note that ordinary primitive recursion f = PRec[h, g] can be expressed in terms of sequence numbers like so:

  f(x, y) = z ⇐⇒ ∃ s ∈ Seq ( len(s) = x+ ∧ (s)_0 = g(y) ∧ ∀ i < x ((s)_(i+) = h(i, (s)_i, y)) ∧ (s)_x = z )

The sequence number s records all previous values of f. Now consider the following function associated with f:

  f̄(x, y) = ⟨f(0, y), f(1, y), . . . , f(x, y)⟩

Lemma: f̄ is primitive recursive iff f is primitive recursive.

Course-of-Value Recursion

Thus it is natural to generalize the primitive recursion scheme slightly, by defining functions so that the value at x depends directly on all the previous values:

  f(0, y) = g(y)
  f(x+, y) = H(x, f̄(x, y), y)

Lemma: If g and H are primitive recursive, then f is also primitive recursive.

Exercise: Prove the last two lemmata. You may safely assume that standard sequence operations such as append are primitive recursive.

Exercises

Exercise: Prove that all the standard list operations are indeed primitive recursive.

Exercise: Explain how to implement search in binary search trees as a primitive recursive operation.

Exercise: Come up with yet another coding function based on prime factorizations and show that the decoding functions are primitive recursive.
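One concrete way to realize sequence numbers (my choice; the slides leave the coding open) is to iterate the classical Cantor pairing function. The C sketch below codes the empty list as 0 and append(s, a) as pair(s, a) + 1; len and dec then peel entries off the back.

  #include <stdio.h>

  /* Cantor pairing: a p.r. bijection N x N -> N. */
  unsigned long pair(unsigned long x, unsigned long y) {
      return (x + y) * (x + y + 1) / 2 + y;
  }

  /* Integer square root; a bounded search would also work, just slowly. */
  unsigned long isqrt(unsigned long n) {
      unsigned long r = 0;
      while ((r + 1) * (r + 1) <= n) r++;
      return r;
  }

  /* Inverses of pair. */
  unsigned long left(unsigned long z) {
      unsigned long w = (isqrt(8 * z + 1) - 1) / 2;
      return w - (z - w * (w + 1) / 2);
  }
  unsigned long right(unsigned long z) {
      unsigned long w = (isqrt(8 * z + 1) - 1) / 2;
      return z - w * (w + 1) / 2;
  }

  /* Sequence numbers: 0 codes <>, and app(s, a) = pair(s, a) + 1 codes
     the list s with a appended. */
  unsigned long app(unsigned long s, unsigned long a) { return pair(s, a) + 1; }
  unsigned long len(unsigned long s) { return s == 0 ? 0 : len(left(s - 1)) + 1; }
  unsigned long dec(unsigned long s, unsigned long i) {  /* assumes i < len(s) */
      return i + 1 == len(s) ? right(s - 1) : dec(left(s - 1), i);
  }

  int main(void) {
      unsigned long s = app(app(app(0, 7), 4), 9);          /* <7, 4, 9> */
      printf("len = %lu, s_1 = %lu\n", len(s), dec(s, 1));  /* 3 and 4  */
      return 0;
  }

Since left(s − 1) < s, the recursions in len and dec are themselves instances of course-of-value recursion on the code s.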
Part 3: Register Machines versus Primitive Recursion

RM vs. PR

Burning Question: How does the computational strength of register machines compare to primitive recursive functions?

It is a labor of love to check that any p.r. function can indeed be computed by a RM. This comes down to building a RM compiler/interpreter for p.r. functions. Since we can use structural induction, this is not hard in principle; we can use a similar approach as in the construction of the universal RM.

Opposite Direction?

The cheap answer is to point out that some RM-computable functions are not total, so they cannot be p.r. True, but utterly boring. Here are the right questions:

  Is there a total RM-computable function that is not primitive recursive?
  How much of a RM computation is primitive recursive?

In a Nutshell

Using the coding machinery from last time, it is not hard to see that the relation “RM M moves from configuration C to configuration C′ in t steps” is primitive recursive. But when we try to deal with “RM M moves from C to C′ in some number of steps,” things fall apart: there is no obvious way to find a primitive recursive bound on the number of steps. It is perfectly reasonable to conjecture that RM-computable is strictly stronger than primitive recursive, but coming up with a nice example is rather difficult.

Steps are p.r.

Proposition: Let M be a register machine. The t-step relation C ⊢^t_M C′ is primitive recursive, uniformly in t and M.

Of course, this assumes a proper coding method for configurations and register machines. Since configurations are of the form (p, (x_1, . . . , x_n)), where p and the x_i are natural numbers, this is a straightforward application of sequence numbers.

t Steps

Likewise we can encode a whole sequence of configurations

  C = C_0, C_1, . . . , C_(t−1), C_t = C′

again by a single integer. And we can check in a p.r. way that C ⊢^t_M C′. A crucial ingredient here is that the size of the C_i is bounded by something like the size of C plus t, so we can bound the size of the sequence number coding the whole computation given just the size of C and t.

Exercise: Figure out exactly what is meant by the last comment.

Whole Computations?

Now suppose we want to push this argument further to deal with whole computations. We would like the transitive closure C ⊢_M C′ to be primitive recursive. If we could bound the number of steps in the computation by some p.r. function of C, then we could perform a brute-force search. However, there is no reason why such a bound should exist; the number of steps needed to get from C to C′ could be enormous. Again, there is a huge difference between bounded and unbounded search.

Exactly How Much is Missing?

So, can we concoct a total, register computable function that fails to be primitive recursive? One way to accomplish this is to make sure the function grows faster than any primitive recursive one, exploiting the inductive structure of these functions.

Ackermann’s Function (1928)

The Ackermann function A : N × N → N is defined by

  A(0, y) = y+
  A(x+, 0) = A(x, 1)
  A(x+, y+) = A(x, A(x+, y))

where x+ is shorthand for x + 1. Note the odious double recursion: on the surface this looks more complicated than primitive recursion, but we don’t know for a fact that it really is more complex.

Ackermann is Computable

Proof: Here is a bit of C code that implements the Ackermann function (assuming that we have infinite precision integers).

  int acker(int x, int y) {
      return x ? acker(x - 1, y ? acker(x, y - 1) : 1) : y + 1;
  }

All the work of organizing the nested recursion is handled by the compiler and the execution stack. Of course, doing this on a register machine is a bit more challenging, but it can be done.

Family Perspective

It is useful to think of Ackermann’s function as a family of unary functions (A_x)_(x≥0), where A_x(y) = A(x, y). The critical part of the definition then looks like so:

  A_(x+)(y) = A_x(1)              if y = 0,
  A_(x+)(y) = A_x(A_(x+)(y − 1))  otherwise,

so that A_(x+)(y) = A_x^(y+1)(1). So it’s all just iteration. From this it follows easily by induction that

Claim: Each level function A_x is primitive recursive.
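The iteration view translates directly into code. Here is a C sketch (mine, not from the slides) that computes A by an explicit loop at each level, mirroring A_(x+)(y) = A_x^(y+1)(1); only tiny inputs finish in reasonable time, of course.

  #include <stdio.h>

  /* Level x + 1 is (y + 1)-fold iteration of level x, started at 1. */
  unsigned long ack(unsigned x, unsigned long y) {
      if (x == 0) return y + 1;       /* A_0 is the successor */
      unsigned long r = 1;
      for (unsigned long i = 0; i <= y; i++)
          r = ack(x - 1, r);          /* apply A_(x-1) a total of y+1 times */
      return r;
  }

  int main(void) {
      printf("A(2, 3) = %lu\n", ack(2, 3));  /* 2·3 + 3 = 9      */
      printf("A(3, 3) = %lu\n", ack(3, 3));  /* 2^(3+3) − 3 = 61 */
      return 0;
  }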
The Bottom Hierarchy

  A(0, y) = y + 1
  A(1, y) = y + 2
  A(2, y) = 2y + 3
  A(3, y) = 2^(y+3) − 3
  A(4, y) = 2^2^...^2 − 3  (a tower of 2s of height y + 3)

The first four levels of the Ackermann hierarchy are easy to understand, though A_4 starts causing problems: the stack of 2s in the exponentiation has height y + 3. This is usually called super-exponentiation, or tetration.

The Mystery of A(6, 6)

Alas, if we continue just a few more levels, darkness befalls.

  A(5, y) ≈ super-super-exponentiation
  A(6, y) ≈ an unspeakable horror
  A(7, y) ≈ speechless

For level 5, one can get some vague understanding of iterated super-exponentiation, but things start to get murky. At level 6, we iterate over the already nebulous level 5 function, and things really start to fall apart. At level 7, Wittgenstein comes to mind: “Wovon man nicht sprechen kann, darüber muss man schweigen.” (Whereof one cannot speak, thereof one must be silent.)

Ackermann and Union/Find

One might think that the only purpose of the Ackermann function is to refute the claim that computable is the same as p.r. Surprisingly, the function pops up in the analysis of the Union/Find algorithm (with ranking and path compression). The running time of Union/Find differs from linear only by a minuscule amount, something like the inverse of the Ackermann function. But in general, anything beyond level 3.5 of the Ackermann hierarchy is irrelevant for practical computation.

Exercise: Read an algorithms text that analyzes the run time of the Union/Find method.

Ackermann is Total

Note that it is not entirely clear that A(x, y) is well-defined for all x and y; see the notes on well-orders. One possible but clumsy proof is by induction on x, with a subinduction on y. A more elegant way is to use induction on N × N with respect to the lexicographic product order

  (a, b) < (c, d) ⇐⇒ a < c ∨ (a = c ∧ b < d),

a well-ordering of order type ω².

Exercise: Carry out the proof of totality.

Ackermann vs. PR

Theorem: The Ackermann function dominates every primitive recursive function f, in the sense that there is a k such that f(x) < A(k, max x). Hence A is not primitive recursive.

Sketch of proof. One can argue by induction on the build-up of f. The atomic functions are easy to deal with. The interesting part is to show that the property is preserved under an application of composition and of primitive recursion. Alas, the details are rather tedious. □

Register vs. Ackermann

Let’s come back to the gap between p.r. and RM-computable. How much do we have to add to primitive recursion to capture the Ackermann function?

Proposition: There is a primitive recursive relation R such that

  A(a, b) = c ⇐⇒ ∃ t R(t, a, b, c).

Think of t as a complete trace of the computation of the Ackermann function on inputs a and b. For example, t could be a hash table that stores all the values computed during the call to A(a, b), using memoization.

Verifying Traces

The entries in t are determined by the following rules:

  (a, b, c) ∈ t for some c;
  (0, y, z) ∈ t implies z = y+;
  (x+, 0, z) ∈ t implies (x, 1, z) ∈ t;
  (x+, y+, z) ∈ t implies (x, u, z) ∈ t and (x+, y, u) ∈ t for some u;
  (x, y, z) ∈ t and (x, y, z′) ∈ t implies z = z′.

Any table that satisfies these rules proves that indeed A(a, b) = c (though it might have extraneous entries). So we have a decision algorithm that tests whether an alleged computation of A is in fact correct. In fact, it is not hard to show that the decision algorithm is primitive recursive.
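A checker for such tables is straightforward to write. The C sketch below (my own) represents t as an array of triples and tests the rules verbatim; by the proposition, A(a, b) = c iff some table passes this test.

  #include <stdio.h>

  typedef struct { unsigned x, y, z; } Entry;

  /* Find an entry (x, y, z) for some z; report z through *z. */
  static int find(const Entry *t, int n, unsigned x, unsigned y, unsigned *z) {
      for (int i = 0; i < n; i++)
          if (t[i].x == x && t[i].y == y) { *z = t[i].z; return 1; }
      return 0;
  }

  /* Does t certify A(a, b) = c? */
  int check(const Entry *t, int n, unsigned a, unsigned b, unsigned c) {
      unsigned u, v, w;
      if (!find(t, n, a, b, &w) || w != c) return 0;      /* (a, b, c) in t */
      for (int i = 0; i < n; i++) {
          unsigned x = t[i].x, y = t[i].y, z = t[i].z;
          if (x == 0) {                     /* (0, y, z): z = y + 1 */
              if (z != y + 1) return 0;
          } else if (y == 0) {              /* (x+, 0, z): (x, 1, z) in t */
              if (!find(t, n, x - 1, 1, &u) || u != z) return 0;
          } else {                          /* (x+, y+, z): (x+, y, u), (x, u, z) */
              if (!find(t, n, x, y - 1, &u)) return 0;
              if (!find(t, n, x - 1, u, &v) || v != z) return 0;
          }
          for (int j = 0; j < n; j++)       /* functionality */
              if (t[j].x == x && t[j].y == y && t[j].z != z) return 0;
      }
      return 1;
  }

  int main(void) {
      /* A memo table certifying A(1, 2) = 4. */
      Entry t[] = { {0,1,2}, {0,2,3}, {0,3,4}, {1,0,2}, {1,1,3}, {1,2,4} };
      printf("%d\n", check(t, 6, 1, 2, 4));  /* prints 1 */
      return 0;
  }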
Unbounded Search

So to compute A we only need to add search: systematically check all possible tables until the right one pops up (it must, since A is total). The problem is that this search is no longer primitive recursive. More precisely, let

  acker(t, x, y) ⇐⇒ t is a correct trace for x, y
  lookup(t) = z ⇐⇒ (x, y, z) ∈ t

Then

  A(x, y) = lookup( min { t | acker(t, x, y) } ).

This is all primitive recursive except for the unbounded search in min.

Iterative Ackermann

The recursive definition from above is close to Ackermann’s original approach. Here is an alternative, based more directly on iteration:

  B_1(x) = 2x
  B_(k+1)(x) = B_k^x(1)

So B_1 is doubling, B_2 exponentiation, B_3 super-exponentiation, and so on. In general, B_k is closely related to A_(k+1).

Subsequence Order

Recall the subsequence ordering on words, where u = u_1 . . . u_n precedes v = v_1 v_2 . . . v_m if there exists a strictly increasing sequence 1 ≤ i_1 < i_2 < . . . < i_n ≤ m of indices such that u = v_(i_1) v_(i_2) . . . v_(i_n). We write u ⊑ v. Subsequence order is not total unless the alphabet has size 1. Note that subsequence order is independent of any underlying order of the alphabet (unlike, say, lexicographic or length-lex order).

Warmup: Antichains

An antichain in a partial order is a sequence x_0, x_1, . . . , x_n, . . . of elements such that i < j implies that x_i and x_j are incomparable.

Example: Consider the powerset of [n] = {1, 2, . . . , n} with the standard subset ordering. How does one construct a long antichain? For example, x_0 = {1} is a bad idea, and x_0 = [n] is even worse.

Higman’s Lemma

Theorem (Higman 1952): Every antichain in the subsequence order is finite.

Proof. Here is the Nash-Williams proof (1963). Assume there is an infinite antichain. For each n, let x_n be the length-lex minimal word such that x_0, x_1, . . . , x_n starts an infinite antichain, producing a sequence x = (x_i). For any sequence of words y = (y_i) and a letter a, define a⁻¹y to be the sequence consisting of all words in y starting with the letter a, and with that a removed. Since the alphabet is finite, there exists a letter a such that x′ = a⁻¹x is an infinite antichain. But then x′ contradicts the length-lex minimality of x. □

Friedman’s Self-Avoiding Words

For a finite or infinite word x, write x[i] for the block x_i, x_(i+1), . . . , x_(2i). We will always assume that i ≤ |x|/2 when x is finite.

Bizarre Definition: A word x is self-avoiding if for i < j the block x[i] is not a subsequence of x[j].

The following is a consequence of Higman’s theorem.

Theorem: Every self-avoiding word is finite.

By the last theorem we can define

  α(k) = length of the longest self-avoiding word over Σ_k.

How Big?

Trivially, α(1) = 3. A little work shows that α(2) = 11, as witnessed by 01110000000. But

  α(3) > B_7198(158386),

an incomprehensibly large number. Smelling salts, anyone? It is truly surprising that a function with as simple a definition as α should exhibit this kind of growth.

An Insane Hierarchy *

If you understand (small) ordinals, one can define a hierarchy (due to Grzegorczyk and Wainer) of fast-growing functions f_α : N → N like so:

  f_0(x) = x + 1
  f_(α+1)(x) = f_α^x(x)
  f_λ(x) = f_(λ[x])(x)

Here λ = lim λ[n], and the approximating sequence can be chosen in a natural fashion as long as λ is not too large. For example, λ ≤ ε_0 works well, where ε_0 is the first fixed point of α ↦ ω^α. In this setting the Ackermann function is essentially just f_ω, and utterly puny compared to f_(ω²) or f_(ω^ω) or f_(ε_0).
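At finite levels this hierarchy is plain code. A C sketch (mine): f_k for finite k, following f_(k+1)(x) = f_k^x(x) literally; already f_3 escapes machine integers for tiny inputs, which is rather the point.

  #include <stdio.h>

  /* f_0(x) = x + 1;  f_(k+1)(x) = f_k iterated x times, starting at x. */
  unsigned long f(unsigned k, unsigned long x) {
      if (k == 0) return x + 1;
      unsigned long r = x;
      for (unsigned long i = 0; i < x; i++)
          r = f(k - 1, r);
      return r;
  }

  int main(void) {
      printf("f_1(5) = %lu\n", f(1, 5));  /* f_1(x) = 2x       -> 10  */
      printf("f_2(5) = %lu\n", f(2, 5));  /* f_2(x) = x · 2^x  -> 160 */
      printf("f_3(2) = %lu\n", f(3, 2));  /* already 2048             */
      return 0;
  }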
Part 4: µ-Recursive Functions

Closing the Gap

Since the Ackermann function misses being primitive recursive only by the lack of an unbounded search operation, it is tempting to extend primitive recursive functions a bit by adding a min operator. From the previous discussion, primitive recursive plus min operator is powerful enough to produce the Ackermann function. Of course, the real question is: what is the relationship between the extended class and computable functions in general? As we will see, we obtain the same class of functions.

Regular Minimization

Let us call a function g regular if

  ∀ x ∃ z (g(z, x) = 0).

Hence, if we define f from g by minimization,

  f(x) = min { z | g(z, x) = 0 },

the resulting function will be total. This is also called regular search, for obvious reasons. We write f(x) = µz (g(z, x) = 0).

µ-Recursive Functions

Definition (Kleene 1936): The class of µ-recursive functions is defined as the closure of the basic functions zero, successor, and projections under the operations composition, primitive recursion, and regular search.

It follows immediately that the Ackermann function is µ-recursive.

Regular Search Computability

Here is a slightly different definition: we remove primitive recursion from the operations, but augment the initial functions by addition and multiplication. Define the regular search computable functions to be the class obtained from the basic functions projections, equality, addition and multiplication, closed under composition and regular search.

Exercise (Hard): How does this compare to µ-recursive functions?

Partial Functions

If we want to establish a close connection between machines and classes of functions defined by closures, we have to tackle the problem of divergent computations. Given our framework, this is not hard: just drop the regularity condition. We allow

  f(x) = min { z | g(z, x) = 0 }

regardless of whether a witness z exists for all x. This produces the class of partial µ-recursive functions.

But Beware

It is critical that g is still total. For partial g, application of the µ operator can lead to non-computable functions. Also, one cannot do things like

  f(x) = min { z | g(z, x) = 0 }  if such a z exists,
  f(x) = 0                        otherwise.

This definition by cases uses a test that is in general undecidable.

Church-Turing Thesis

Let f : N^n ⇀ N be a partial function. Then the following are equivalent:

  f is intuitively computable,
  f is partial µ-recursive,
  f is register machine computable,
  f is λ-computable,
  f is Herbrand-Gödel computable.

Of course, the first item has to be removed to get a proper theorem. We’ll skip the proof. There are many other equivalent characterizations, which all add to the weight of this thesis. Also note that Gödel was completely convinced by Turing’s analysis (and gave him full credit, claiming none for himself and Church).

The Interesting Part

Kleene established a strong normal form result that shows that all computable functions can be represented in a certain uniform way.

Theorem (Kleene Normal Form): There is a p.r. relation trace and a p.r. function lookup such that register machine M_e on input x halts with output y if, and only if,

  y = lookup( min { t | trace(t, e, x) } ).

Here t is a complete trace of the computation of M_e on input x, much as in the discussion of the Ackermann function. The trace predicate checks whether t is really a correct trace for machine M_e on input x. Note the lookup: it extracts the output of M_e from the trace t. One can show that some device such as lookup is required.
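The µ operator itself is a one-line loop. A C sketch (mine) that also makes the divergence issue visible: for regular total g the loop always returns, but without regularity it may simply never terminate, which is exactly the behavior of partial µ-recursive functions.

  /* mu(g, x): least z with g(z, x) = 0.  Unbounded search: there is
     deliberately no p.r. bound on z, and no termination guarantee
     unless g is regular. */
  typedef unsigned long (*Fn)(unsigned long, unsigned long);

  unsigned long mu(Fn g, unsigned long x) {
      unsigned long z = 0;
      while (g(z, x) != 0)
          z++;
      return z;
  }

  /* Example: g(z, x) = 0 iff z*z >= x is regular, so mu(g_sqrt, x)
     computes the ceiling of the square root; mu(g_sqrt, 10) = 4. */
  unsigned long g_sqrt(unsigned long z, unsigned long x) {
      return z * z >= x ? 0 : 1;
  }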
A Practical Approach

Suppose you have some problem A ⊆ N that is probably (at least) semidecidable. Figure out a way to express A as follows:

  x ∈ A ⇐⇒ ∃ z R(z, x)

where R is just primitive recursive. In many cases it is not hard to find R; it arises naturally from the definition of A. If the search cannot be bounded, then A is just semidecidable. If one can find a computable bound for the witness z, then A is decidable.

Part 5: General Recursive Functions *

Example: Addition

Suppose we have a constant 0 and the successor function x ↦ x+. Consider the following system of equations E:

  f(x, 0) = x
  f(x, y+) = f(x, y)+

Intuitively, E defines addition. For example,

  f(3, 2) = f(3, 1+) = f(3, 1)+ = f(3, 0+)+ = f(3, 0)++ = 3++ = 4+ = 5.

In fact, we can derive f(3, 2) = 5 from these equations using only standard (and very simple) rules of inference.

Equational Reasoning

More formally, we really should use numerals denoting natural numbers: 0, 1 = 0+, 2 = 1+, and so on. In general, we have a numeral n for every natural number n. There are two rules:

  Substitution: we can replace a free variable everywhere in an equation by a numeral.
  Replacement: a term f(a_1, . . . , a_k) can be replaced by a numeral b if we have already proven f(a_1, . . . , a_k) = b.

3 + 2 = 5

  (1)  f(3, 0) = 3          substitution in E1
  (2)  f(3, 1) = f(3, 0)+   substitution in E2
  (3)  f(3, 2) = f(3, 1)+   substitution in E2
  (4)  f(3, 1) = 3+ = 4     replacement of (1) in (2)
  (5)  f(3, 2) = 4+ = 5     replacement of (4) in (3)

Tedious, but completely mechanical and easy to implement in any language with good support for pattern matching.

Herbrand-Gödel Computability

With a little more effort we can give a precise notion of derivability (or provability) in E:

  E ⊢ f(a_1, . . . , a_k) = b

Definition: A function f : N^k ⇀ N is Herbrand-Gödel computable if there is a finite system of equations E with a k-ary function symbol f such that

  E ⊢ f(a_1, . . . , a_k) = b ⇐⇒ f(a_1, . . . , a_k) ≃ b

for all a_i, b ∈ N. One often speaks of general recursive functions.

Terminology

These functions are usually called general recursive functions. The “general” stems from the fact that primitive recursive functions were originally referred to as “recursive functions.” To round off the confusion, many older authors mean total computable functions when they say “recursive function,” and use “partial recursive function” for the general case. And there is an effort under way to get rid of all names that include “recursive” and replace them by “computable.” We’ll mix and match.

Herbrand-Gödel versus Machines

Theorem: A function is general recursive iff it is (register machine) computable.

Sketch of proof. Say the function is f : N ⇀ N. Given a system of equations E for f, and exploiting standard coding tricks, we can build a register machine M that on input a ∈ N will search for a b ∈ N and a derivation in E of f(a) = b. If they exist, the RM returns b after finitely many steps; otherwise it diverges.

And Back

For the opposite direction we need to find a way to simulate register machines in terms of arithmetic functions. This requires two main ingredients: a bit of coding machinery to express RMs, configurations and such as natural numbers, and a way to express search (for the time t when a computation stops) via HG computability. Except for the unbounded search, this is all primitive recursive. But p.r. functions can clearly be described by a system of equations.
Search

It remains to deal with unbounded search in an equational manner (Kleene 1952). Assume f(x) = µz (g(x, z) = 0). Introduce three new function symbols α, β and ĝ with equations

  α(x, y+) = x
  β(x, 0) = x
  ĝ(x, 0) = g(x, 0)
  ĝ(x, y+) = α(g(x, y+), ĝ(x, y))
  f(x) = β(z, ĝ(x, z))

Ponder deeply. (A hint, in case the pondering stalls: α(u, v) can only be evaluated when v is a successor numeral, so ĝ(x, y) = g(x, y) is derivable only if g(x, 0), . . . , g(x, y − 1) are all nonzero; and β(z, w) can only be evaluated when w = 0. Hence f(x) = z is derivable exactly when z is the least zero of g(x, ·).) □