BASIC FIXED POINT THEOREMS
RAYMOND M. SMULLYAN
1. INTRODUCTION
This article is self-contained. Representation systems, as defined in [1], whose definition we will review (with a slight modification), unify recursion theory, elementary formal
systems (as defined in [1]) and first-order arithmetic theories. Applicative systems, as defined in [2], generalize combinatory logic. No prior acquaintance with combinatory logic
is presupposed—indeed we introduce here some of its basic features in a somewhat new
light. Our purpose now is to unify representation systems with applicative systems, which
we do by introducing what we call basic systems, all results of which have applications to
all the areas mentioned above.
2. BASIC SYSTEMS
By a basic system B we mean a collection of the following items:
(1) A set N of entities called elements.
(2) An equivalence relation ≡ on N.
(3) A function app. that assigns to each element x and each n-tuple (x1, . . . , xn) of elements an element denoted x(x1, . . . , xn), called the result of applying x to
(x1, . . . , xn).
If desired, one can think of B as the ordered triple (N, ≡, app.).
Intended Applications. (1) By a representation system R we shall mean a collection of the
following items:
(1) An infinite sequence without repetitions: S1 , S2 , . . . , Sn , . . . of elements called sentences. For each n we call n the index of Sn .
(2) For each positive integer n, a sequence without repetitions H^n_1, H^n_2, . . . , H^n_k, . . . of
elements called n-ary predicates or predicates of degree n.
(3) To each n-ary predicate H^n and each n-tuple (a1, . . . , an) of natural numbers is assigned a sentence denoted H^n(a1, . . . , an), called the result of applying the predicate H^n to the n-tuple.¹
(4) A subset of the set of sentences whose elements we call true sentences (or in some
contexts, provable sentences). Two sentences are called equivalent if and only if
they are either both true or both not true.
Discussion. Representation systems have applications to first-order theories of natural
numbers, such as Peano Arithmetic. Consider such a theory T together with a Gödel
numbering. We associate with these a representation system as follows: The sentences
will be the closed formulas of T, and we arrange them in a sequence S1, S2, . . . , Sn, . . . according to the magnitude of their Gödel numbers. The n-ary predicates are the formulas
φ(x1, . . . , xn) with n free variables, and for each n, we arrange the n-ary predicates in a sequence according to the magnitude of their Gödel numbers. We take the result of applying
a formula φ with n free variables to an n-tuple (a1, . . . , an) of natural numbers to be the
sentence φ(ā1, . . . , ān), which is the result of substituting for each free variable vi (i ≤ n)
the numeral āi that designates the number ai.
¹ Without ambiguity we can sometimes delete superscripts and write H(a1, . . . , an) for H^n(a1, . . . , an).
We are now interested in seeing how results about basic systems apply to representation systems. For any representation system R, by its associated basic system
we mean the system in which:
(1) We take N to be the set of natural numbers.
(2) We take x ≡ y to mean that the two sentences Sx and Sy are equivalent (both true or
both not true).
(3) We take x(y1, . . . , yn) to be the index of the sentence Hx(y1, . . . , yn), where Hx is the n-ary predicate with index x.
We now see how any theorems about basic systems have applications to representation
systems.
(2) An applicative system A (sometimes called a combinatory system) as defined in [2]
consists simply of a set N and a binary function α(x, y) on elements of N. We write xy
for α(x, y), and following the convention of combinatory logic, in compound expressions,
parentheses are to be restored to the left—e.g., xyz means (xy)z (not x(yz)); xyzw means
((xy)z)w, etc.
Given an applicative system A, we define its associated basic system to be that system
in which N is the same as in A, the equivalence relation is identity, and we take
x(y1, . . . , yn) to be the element xy1 . . . yn.
Thus all results about basic systems apply both to representation systems
and to combinatory logic.
Special Functions. We now return to the general study of basic systems.
For each positive integer n, we let α^{n+1} be the function defined by the condition
α^{n+1}(x, x1, x2, . . . , xn) = x(x1, . . . , xn). We let α^1 be the identity function (α^1(x) = x). We
refer to the functions α^1, α^2, . . . , α^n, . . . as applicative functions.
For each positive integer n and each i ≤ n, we let P^n_i be the projection function:
P^n_i(x1, . . . , xn) = xi.
For each n and element y, we let C^n_y be the constant function: C^n_y(x1, . . . , xn) = y.
We now let SP be the smallest class of functions that contains all applicative, projective
and constant functions and is closed under composition. The elements of SP will be called
special functions.
The following special functions will play key roles in the study of fixed points:
(1) d(x) = x(x) (the diagonal function)
(2) t(x, y) = x(y, x)
(3) t′(x, y) = y(x, y)
(4) l(x, y) = x(y(y))
These functions are indeed special, since
(1) d(x) = α^2(α^1(x), α^1(x))
(2) t(x, y) = α^3(P^2_1(x, y), P^2_2(x, y), P^2_1(x, y))
(3) t′(x, y) = α^3(P^2_2(x, y), P^2_1(x, y), P^2_2(x, y))
(4) l(x, y) = α^2(P^2_1(x, y), α^2(P^2_2(x, y), P^2_2(x, y)))
For any function f of n + 1 arguments and any element a, by f_a we mean the function of
n arguments defined by the condition: f_a(x1, . . . , xn) = f(a, x1, . . . , xn). We note that if f is
special, so is f_a, since f_a(x1, . . . , xn) = f(C^n_a(x1, . . . , xn), P^n_1(x1, . . . , xn), . . . , P^n_n(x1, . . . , xn)).
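The decompositions above can be checked mechanically. What follows is a minimal sketch, assuming a free model in which the result x(x1, . . . , xn) is simply the tuple ("app", x, x1, . . . , xn) and ≡ is identity. The names alpha, P, C, compose and fix_first are illustrative and do not come from the article. The sketch takes l(x, y) = x(y(y)), the reading that agrees with the inner term α^2(P^2_2(x, y), P^2_2(x, y)) and with the later use l_b(x) = b(x(x)).

```python
def alpha(x, *xs):
    """alpha^{n+1}(x, x1..xn) = x(x1..xn); with no xs this is alpha^1, the identity."""
    return x if not xs else ("app", x) + xs

def P(i):
    """Projection P^n_i (1-based); usable for any n >= i."""
    return lambda *xs: xs[i - 1]

def C(y):
    """Constant function C^n_y, for any n."""
    return lambda *xs: y

def compose(f, *gs):
    """Return h with h(xs) = f(g1(xs), ..., gk(xs))."""
    return lambda *xs: f(*(g(*xs) for g in gs))

# The four special functions, built only from alpha, P and composition.
d = compose(alpha, alpha, alpha)                       # alpha^2(alpha^1(x), alpha^1(x))
t = compose(alpha, P(1), P(2), P(1))                   # x(y, x)
t_prime = compose(alpha, P(2), P(1), P(2))             # y(x, y)
l = compose(alpha, P(1), compose(alpha, P(2), P(2)))   # x(y(y))

def fix_first(f, a, n):
    """f_a(x1..xn) = f(C^n_a(xs), P^n_1(xs), ..., P^n_n(xs))."""
    return compose(f, C(a), *(P(i) for i in range(1, n + 1)))

l_a = fix_first(l, "a", 1)
print(d("x"), t("x", "y"), t_prime("x", "y"), l("x", "y"), l_a("x"), sep="\n")
```

Because every function here is obtained from applicative, projection and constant functions by composition alone, each one is special in the sense defined above.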
Admissibility. Given a sequence f1(x1, . . . , xn), . . . , fk(x1, . . . , xn) of functions of the same
number of arguments, an element b will be said to cover an element a with respect to
the sequence iff² for all x1, . . . , xn, b(x1, . . . , xn) ≡ a(f1(x1, . . . , xn), . . . , fk(x1, . . . , xn)). This
definition applies also to the case k = 1; that is, b covers a with respect to f(x1, . . . , xn) iff
b(x1, . . . , xn) ≡ a(f(x1, . . . , xn)). We shall call a sequence of functions admissible iff for
every element a there is an element b that covers a with respect to the sequence.
Finally, we will call B adequate iff every sequence of special functions is admissible.
Consider now a representation system R and its associated basic system. For any
numbers a, b, x1, . . . , xn, y1, . . . , yk, to say that a(x1, . . . , xn) ≡ b(y1, . . . , yk) holds in the associated basic system is to
say that the sentence Ha(x1, . . . , xn) (which is the sentence whose index is a(x1, . . . , xn)) is
equivalent to the sentence Hb(y1, . . . , yk) (which is the sentence whose index is b(y1, . . . , yk));
that is, both sentences are true or both are not true. Also, to say that a(x1, . . . , xn) ≡ b holds there
is to say that the sentence Ha(x1, . . . , xn) is equivalent to the sentence Sb.
A sequence of functions is said to be admissible in R iff it is admissible in the associated basic system, and
we say that R is adequate iff its associated basic system is adequate, which means that for every sequence
<f1(x1, . . . , xn), . . . , fk(x1, . . . , xn)> of special functions and any predicate Ha of degree k,
there is a predicate Hb of degree n such that for all numbers x1, . . . , xn, the sentence
Hb(x1, . . . , xn) is equivalent to the sentence Ha(f1(x1, . . . , xn), . . . , fk(x1, . . . , xn)).
For an applicative system A, we say that the sequence f1, . . . , fk of functions is admissible iff it is admissible in the associated basic system, which is to say that for every element a there is an element b such that for all elements x1, . . . , xn, the element bx1 . . . xn is identical to ay1 . . . yk, where for each i ≤ k, yi
is fi(x1, . . . , xn). We call A adequate iff every sequence of special functions is admissible.
We now turn to the study of fixed point theorems of basic systems and their application
to representation systems and applicative systems.
3. FIXED POINTS
We call b a fixed point of a iff a(b) ≡ b. We shall say that B has the fixed point property
iff every element a has a fixed point.
Theorem 1. If B is adequate, or even if its diagonal function d(x) = x(x) is admissible,
then B has the fixed point property.
Proof. Suppose the diagonal function d(x) is admissible. Then for every element a, there
is an element b such that b(x) ≡ a(d(x))—which means b(x) ≡ a(x(x)) (for all x). We take
b for x, and we see that b(b) ≡ a(b(b)), and so b(b) is a fixed point of a.
Application. For a representation system R, a sentence Sb is called a fixed point of a unary
predicate H iff Sb is equivalent to H(b). Now, Sb is a fixed point of Ha iff b is a fixed point
of a in the basic system associated with R (because a(b) ≡ b in that system says that
the sentence Ha(b) (whose index is a(b)) is equivalent to Sb (whose index is b)). It is then
immediate from Theorem 1 that if R is adequate, then it has the fixed point property: every
unary predicate has a fixed point.
More specifically, a sufficient condition for R to have the fixed point property is that the
diagonal function d(x) be admissible (d(x), which is x(x), is now the index of Hx(x)).
Moreover, if for a given unary predicate H, b is a number such that for all x, the sentence
Hb(x) is equivalent to H(d(x)), then S_{d(b)} is a fixed point of H.
The fixed point property for a representation system has particular significance: A unary
predicate H is said to represent the set of all numbers n such that H(n) is true, and a set of
numbers is called representable iff some predicate represents it. A sentence Sb is called a
Gödel sentence for a number set A iff Sb is true if and only if b ∈ A. (Such a sentence may
be thought of as asserting that its own index is in A.) Now, to say that Sb is a fixed point
of a predicate H is equivalent to saying that Sb is a Gödel sentence for the set represented
by H, and so if every unary predicate has a fixed point, then every representable set has a
Gödel sentence (and conversely). This has two important ramifications: Suppose that R is
adequate (or even just that the diagonal function is admissible) and also that the complement
of every representable set is again representable. It then follows (as observed by Alfred
Tarski [4]) that the set of indices of the true sentences cannot be representable, because
its complement would then be representable too, and there obviously cannot be a Gödel sentence for that complement. Now suppose further there is a
mathematical system which proves various sentences of R and is accurate in the sense that
it proves only true sentences, and that the set P of indices of the provable sentences is
representable. Then there must be a Gödel sentence for the complement of P, and such
a sentence must be true but not provable, and thus Gödel’s famous result applies to that
system.
² We henceforth use Halmos’ abbreviation “iff” for “if and only if”.
For an applicative or combinatory system A, an element b is said to be a fixed point of
an element a iff ab = b, which is equivalent to the condition that a(b) ≡ b holds in the
basic system associated with A; in other words, that b is a fixed point of a in that basic system. It then
follows from Theorem 1 that if A is adequate, or even if the diagonal function d(x), which
now is xx, is admissible, then every element a has a fixed point: specifically bb, where b
is any element that covers a with respect to the diagonal function.
For an applicative system A, α^{n+1} is the function α^{n+1}(x, x1, . . . , xn) = xx1 . . . xn (since
α^{n+1}(x, x1, . . . , xn) is x(x1, . . . , xn), which now is xx1 . . . xn). Then for any element y, the
function α^{n+1}_y is the function satisfying the condition α^{n+1}_y(x1, . . . , xn) = yx1 . . . xn. We
are now interested in the function α^2, and so for any elements a and x, the element α^2_a(x)
is a(x), that is, ax.
Now suppose the set N of elements of A contains the standard combinator B satisfying
the condition: Bxyz = x(yz). Then for every element a, the function α^2_a is admissible;
more specifically, for any element b, the element Bba covers b with respect to α^2_a (since
for all x, Bba(x) = Bbax = b(ax) = b(α^2_a(x))).
Next, suppose that among the elements of A is the combinator M satisfying the condition: Mx = xx. Then the function α^2_M is the diagonal function (since α^2_M(x) = Mx = xx).
Therefore, if M and B are both present, then the diagonal function α^2_M is admissible,
and every element has a fixed point. More specifically, if B and M are present, then
for any element a, the element BaM covers a with respect to α^2_M, which is the diagonal
function, and so BaM(BaM) is a fixed point of a (which can be seen directly:
BaM(BaM) = a(M(BaM)) = a(BaM(BaM)), since M(BaM) = BaM(BaM)).
The Combinator L. In [3] we introduced the combinator L defined by the condition Lxy =
x(yy). This L is quite useful in the study of fixed points and was heavily exploited in [2]. If
L is present, then for any element a, the element La covers a with respect to the diagonal
function d(x), because for all x, Lax = a(xx) = a(d(x)), and therefore La(La) is also a
fixed point of a (which can be verified directly).
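The two fixed point constructions can be verified by direct rewriting. The following is a minimal sketch, assuming terms are nested pairs (function, argument) over string atoms and that the defining conditions of B, M and L are read left to right as rewrite rules. The helpers app, spine and step are hypothetical names introduced here, not part of the article. Since a fixed point such as BaM(BaM) has no normal form, the sketch counts individual rewrite steps instead of normalizing.

```python
def app(*terms):
    """Left-associated application: app(x, y, z) stands for (xy)z."""
    result = terms[0]
    for term in terms[1:]:
        result = (result, term)
    return result

def spine(term):
    """Split a term into its head atom and the list of its arguments."""
    args = []
    while isinstance(term, tuple):
        term, arg = term
        args.append(arg)
    return term, args[::-1]

# Defining conditions as rewrite rules: name -> (arity, body).
RULES = {
    "B": (3, lambda x, y, z: app(x, app(y, z))),  # Bxyz = x(yz)
    "M": (1, lambda x: app(x, x)),                # Mx = xx
    "L": (2, lambda x, y: app(x, app(y, y))),     # Lxy = x(yy)
}

def step(term):
    """Perform one leftmost-outermost rewrite; return None if no rule applies."""
    head, args = spine(term)
    if head in RULES and len(args) >= RULES[head][0]:
        arity, body = RULES[head]
        return app(body(*args[:arity]), *args[arity:])
    for i, arg in enumerate(args):
        reduced = step(arg)
        if reduced is not None:
            return app(head, *args[:i], reduced, *args[i + 1:])
    return None

BaM = app("B", "a", "M")
La = app("L", "a")
fixed_BM = app(BaM, BaM)   # BaM(BaM)
fixed_L = app(La, La)      # La(La)

# BaM(BaM) -> a(M(BaM)) -> a(BaM(BaM)); La(La) -> a(La(La)).
print(step(step(fixed_BM)) == app("a", fixed_BM))
print(step(fixed_L) == app("a", fixed_L))
```

The same step function also confirms the covering claim for B: one rewrite takes Bbax to b(ax).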
4. CROSS POINTS
Returning to basic systems in general, we shall call an (ordered) pair (b1 , b2 ) of elements
a cross point of an ordered pair (a1 , a2 ) iff b1 ≡ a1 (b2 ) and b2 ≡ a2 (b1 ). We say that B has
the cross point property iff every pair (a1 , a2 ) has a cross point.
Theorem 2. If B is adequate, it has the cross point property. Better yet, any of the following
three conditions is sufficient for B to have the cross point property:
(a) The function t(x, y) = x(y, x) is admissible.
(b) For each element a, the function l_a(x) is admissible.³
(c) B has the fixed point property and for each element a, the function α^2_a is admissible.
Proof. (a) Suppose t(x, y) is admissible. Then for any elements a1 and a2 there are elements b1 and b2 such that for all elements x and y:
(1) b1 (x, y) ≡ a1 (x(y, x))
(2) b2 (x, y) ≡ a2 (x(y, x))
In (1) we take x to be b2 and y to be b1 , and in (2) we take x to be b1 and y to be b2 , and we
get:
(1) b1 (b2 , b1 ) ≡ a1 (b2 (b1 , b2 ))
(2) b2 (b1 , b2 ) ≡ a2 (b1 (b2 , b1 ))
Thus (c1 , c2 ) is a cross point of (a1 , a2 ), where c1 = b1 (b2 , b1 ) and c2 = b2 (b1 , b2 ).
(b) We will prove something stronger: Let us say that B has the strong cross point
property iff for any elements a and b there is an element c such that c ≡ a(b(c)). This does
imply the cross point property, because if we let d be the element b(c), then c ≡ a(d), and
not only d ≡ b(c), but d = b(c), and thus (c, d) is a cross point of (a, b).
We will now show that the hypothesis (b) of Theorem 2 implies that B in fact has the
strong cross point property.
Well, suppose that the hypothesis (b) holds. Then given elements a and b, since the
function l_b is admissible, there is an element m such that m(x) ≡ a(l_b(x)) for all x, and thus
m(x) ≡ a(b(x(x))). Taking m for x, we have m(m) ≡ a(b(m(m))), and thus c ≡ a(b(c)),
where c = m(m).
(c) Suppose that hypothesis (c) of Theorem 2 holds. We will see that B then has the
strong cross point property.
Since α^2_b is admissible for each element b, for any element a there is an element m
such that m(x) ≡ a(α^2_b(x)) for all x, and so m(x) ≡ a(b(x)). By hypothesis there is a fixed
point c of m. Thus c ≡ m(c) and since m(c) ≡ a(b(c)), then c ≡ a(b(c)).
Applications. We say that a representation system R has the cross point property iff its
associated basic system has it, which is to say that for any unary predicates Ha and Hb,
there are sentences Sc and Sd such that Sc is equivalent to Ha (d) and Sd is equivalent to
Hb (c). This condition is equivalent to the condition that for any representable sets A and
B, there are sentences Sa and Sb such that Sa is true iff b ∈ A and Sb is true iff a ∈ B. This
has a very curious ramification:
Suppose R has the cross point property and that the complement A′ of any representable
set A is again representable and that we have a mathematical system that proves various
true sentences and only true ones, and that the set P of indices of the provable sentences is
representable. The curious thing that then follows is this: The set P and its complement
P′ are both representable, hence there are sentences Sa and Sb such that Sa is true iff b ∈ P
and Sb is true iff a ∈ P′. This means that Sa is true iff Sb is provable and Sb is true iff Sa is
not provable. It then follows that of these two sentences, one of them must be true and not
provable, but there is no way of knowing which one it is! Here is why:
Suppose Sa is provable. Then it is true, and hence Sb is provable (as Sa says), hence
Sb is true, hence Sa is not provable (as Sb says) and we have a contradiction. Therefore Sa
cannot be provable. Therefore Sb must be true (since it is true iff Sa is not provable). Now
we know that Sa is not provable and Sb is true. Either Sa is true or not. If so, then Sa is true
but not provable. If not, then Sb is not provable (since Sa is not true iff Sb is not provable),
in which case Sb is true but not provable.
³ l_a(x) = a(x(x))
In summary, if Sa is true, then it is true but not provable, and if Sa is not true, then Sb is
true but not provable. There is no way to tell whether Sa is true or not.
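The case analysis can be double-checked by brute force. Below is a minimal sketch, assuming each of the two sentences is reduced to a pair of Boolean flags (true, provable); the variable names are illustrative only. It enumerates every assignment that respects accuracy (provable implies true) together with the two conditions "Sa is true iff Sb is provable" and "Sb is true iff Sa is not provable".

```python
from itertools import product

models = []
for ta, tb, pa, pb in product([False, True], repeat=4):
    accurate = (not pa or ta) and (not pb or tb)  # only true sentences are provable
    sa_says = ta == pb                            # Sa is true iff Sb is provable
    sb_says = tb == (not pa)                      # Sb is true iff Sa is not provable
    if accurate and sa_says and sb_says:
        models.append({"Sa true": ta, "Sa provable": pa,
                       "Sb true": tb, "Sb provable": pb})

for m in models:
    print(m)

# In every surviving assignment one of the two sentences is true but unprovable.
print(all((m["Sa true"] and not m["Sa provable"]) or
          (m["Sb true"] and not m["Sb provable"]) for m in models))
```

Two assignments survive, and they disagree on the truth of Sa, which matches the observation that the stated assumptions alone do not determine which of the two sentences is the true, unprovable one.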
It follows from Theorem 2 that if R is adequate, then R has the cross point property.
We say that an applicative system A has the cross point property iff its associated basic system
has the property, which means that for any elements a1 and a2, there are elements b1 and
b2 such that b1 = a1 b2 and b2 = a2 b1.
The strong cross point property for A is that for any elements a and b there is an element
c such that c = a(bc).
It follows from Theorem 2 that if the combinators B and M are present, then A has the
strong cross point property, because if B is present, then for every element a, the function
α^2_a is admissible (as we have seen), and if B and M are present, then every element has a
fixed point, and so hypothesis (c) of Theorem 2 holds.
More specifically, since a(bc) is Babc, the strong cross point property is that for
every a and b, there is some c such that c = Babc; in other words, that c is a fixed point of
Bab. We have already seen that for every x, a fixed point of x is BxM(BxM), and so a fixed
point of Bab is B(Bab)M(B(Bab)M).
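As a check on this term, the sketch below rewrites c = B(Bab)M(B(Bab)M), that is, BxM(BxM) with x = Bab, and confirms that three rewrite steps lead to a(bc). It assumes terms are nested pairs over string atoms and that the conditions Bxyz = x(yz) and Mx = xx are used as left-to-right rewrite rules; app, spine and step are hypothetical helper names, not notation from the article.

```python
def app(*terms):
    """Left-associated application: app(x, y, z) stands for (xy)z."""
    result = terms[0]
    for term in terms[1:]:
        result = (result, term)
    return result

def spine(term):
    """Split a term into its head atom and the list of its arguments."""
    args = []
    while isinstance(term, tuple):
        term, arg = term
        args.append(arg)
    return term, args[::-1]

RULES = {
    "B": (3, lambda x, y, z: app(x, app(y, z))),  # Bxyz = x(yz)
    "M": (1, lambda x: app(x, x)),                # Mx = xx
}

def step(term):
    """Perform one leftmost-outermost rewrite; return None if no rule applies."""
    head, args = spine(term)
    if head in RULES and len(args) >= RULES[head][0]:
        arity, body = RULES[head]
        return app(body(*args[:arity]), *args[arity:])
    for i, arg in enumerate(args):
        reduced = step(arg)
        if reduced is not None:
            return app(head, *args[:i], reduced, *args[i + 1:])
    return None

Bab = app("B", "a", "b")
half = app("B", Bab, "M")   # B(Bab)M
c = app(half, half)         # B(Bab)M(B(Bab)M)

# c -> Bab(M(B(Bab)M)) -> a(b(M(B(Bab)M))) -> a(b(c))
term = c
for _ in range(3):
    term = step(term)
print(term == app("a", app("b", c)))
```

Note that lowercase "a" and "b" are inert atoms standing for arbitrary elements, while uppercase "B" and "M" are the combinators.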
5. DOUBLE FIXED POINTS
We shall call a pair (b1 , b2 ) a double fixed point of a pair (a1 , a2 ) iff b1 ≡ a1 (b1 , b2 ) and
b2 ≡ a2 (b1 , b2 ). We say that B has the double fixed point property iff every pair (a1 , a2 )
has a double fixed point.
Theorem 3. If B is adequate then it has the double fixed point property.
Proof. We use the special functions t(x, y) = x(y, x) and t′(x, y) = y(x, y).
Assume B is adequate. Then the sequence (t′(x, y), t(x, y)) and the sequence (t(x, y),
t′(x, y)) are both admissible, hence for any elements a1 and a2 there are elements c1 and c2
such that for all x and y:
(1) c1 (x, y) ≡ a1 (y(x, y), x(y, x))
(2) c2 (x, y) ≡ a2 (x(y, x), y(x, y))
In (1) we take c2 for x and c1 for y, and in (2) we take c1 for x and c2 for y and we get
(1)′ c1(c2, c1) ≡ a1(c1(c2, c1), c2(c1, c2))
(2)′ c2(c1, c2) ≡ a2(c1(c2, c1), c2(c1, c2))
Thus (b1, b2) is a double fixed point of (a1, a2), where b1 = c1(c2, c1) and b2 = c2(c1, c2).
Applications. We say that a representation system R has the double fixed point property
iff its associated basic system has it, which is equivalent to the condition that for any two
binary predicates H and K, there are sentences Sa and Sb such that Sa is equivalent to
H(a, b) and Sb is equivalent to K(a, b).
A binary predicate H is said to represent a numerical binary relation R(x, y) iff for all
numbers x and y, the sentence H(x, y) is true iff R(x, y) holds. The double fixed point property for R is equivalent to the condition that for any two representable binary relations
R1 (x, y) and R2 (x, y), there are sentences Sa and Sb such that Sa is true iff R1 (a, b) and Sb
is true iff R2 (a, b).
We say an applicative system A has the double fixed point property iff its associated
basic system does, which means that for any elements a1 and a2 there are elements b1 and
b2 such that b1 = a1 b1 b2 and b2 = a2 b1 b2 . This is a famous principle in the field of
combinatory logic.
In [2] we introduced the combinators C1 and C2 satisfying the conditions C1 zxy =
z(yxy)(xyx) and C2 zxy = z(xyx)(yxy).
If these two combinators are present, then the pair <t′(x, y), t(x, y)> and the pair
<t(x, y), t′(x, y)> are both admissible in A, and it then follows from the proof of Theorem 3 that A has the double fixed point property.
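To make the construction concrete, take c1 = C1 a1 and c2 = C2 a2 in the proof of Theorem 3, so that b1 = c1 c2 c1 and b2 = c2 c1 c2. The sketch below checks that one rewrite step sends b1 to a1 b1 b2 and b2 to a2 b1 b2. It assumes terms are nested pairs over string atoms and that the conditions on C1 and C2 are used as left-to-right rewrite rules; app, spine and head_step are hypothetical helper names.

```python
def app(*terms):
    """Left-associated application: app(x, y, z) stands for (xy)z."""
    result = terms[0]
    for term in terms[1:]:
        result = (result, term)
    return result

def spine(term):
    """Split a term into its head atom and the list of its arguments."""
    args = []
    while isinstance(term, tuple):
        term, arg = term
        args.append(arg)
    return term, args[::-1]

RULES = {
    # C1 zxy = z(yxy)(xyx)
    "C1": (3, lambda z, x, y: app(z, app(y, x, y), app(x, y, x))),
    # C2 zxy = z(xyx)(yxy)
    "C2": (3, lambda z, x, y: app(z, app(x, y, x), app(y, x, y))),
}

def head_step(term):
    """Rewrite at the head of the term only; return None if no rule applies there."""
    head, args = spine(term)
    if head in RULES and len(args) >= RULES[head][0]:
        arity, body = RULES[head]
        return app(body(*args[:arity]), *args[arity:])
    return None

c1 = app("C1", "a1")
c2 = app("C2", "a2")
b1 = app(c1, c2, c1)   # c1(c2, c1), written c1 c2 c1 in A
b2 = app(c2, c1, c2)   # c2(c1, c2), written c2 c1 c2 in A

print(head_step(b1) == app("a1", b1, b2))  # b1 = a1 b1 b2
print(head_step(b2) == app("a2", b1, b2))  # b2 = a2 b1 b2
```

Here C1 a1 covers a1 with respect to the pair <t′, t> and C2 a2 covers a2 with respect to <t, t′>, which is exactly what the proof of Theorem 3 requires of c1 and c2.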
The combinators C1 and C2 are both derivable from the combinator B and the standard
combinators C and W, where Cxyz = xzy and Wxy = xyy (not to be confused with x(yy)).⁴
* * * * *
There are many other interesting multiple fixed point theorems in the literature which
can be obtained as corollaries of theorems of basic systems, and a further study of this is
planned.
An Open Problem. Let us say that a basic system B has the recursion property iff for every
element a there is an element b such that b(x) ≡ a(b, x) for all x.
If B is adequate, does it necessarily have the recursion property?
REFERENCES
[1] Raymond Smullyan, The Theory of Formal Systems, Princeton University Press, 1961.
[2] Raymond Smullyan, Diagonalization and Self-Reference, Oxford University Press, 1994.
[3] Raymond Smullyan, To Mock a Mockingbird, Alfred A. Knopf, 1985.
[4] Alfred Tarski, “Der Wahrheitsbegriff in den formalisierten Sprachen,” Studia philosophica, vol. 1, 1936.
⁴ For proof, see [2], Chapter 17.