What is Philosophy
Chapter 4 by Richard Thompson
Mathematics
(last edited on 25th April 2012)

This chapter assumes familiarity with formal logic, described in Chapter 2 of these notes.

In the Greek world the only successful field of theoretical knowledge, apart from Aristotle’s formal Logic, was Mathematics, and the most highly developed branch of Mathematics was Geometry. Chinese and Indian Mathematicians independently developed arithmetic and the rudiments of algebra. The Eastern and Western traditions appear to have been quite separate until the Arabs brought Indian arithmetic to Europe around the thirteenth century. Until then Europeans did not know the decimal system of numbering and had no efficient procedure for calculation, depending on the abacus for all but the very simplest calculations.

Greek Geometry was codified by Euclid, who expressed it in an axiomatic form around 300 BC. But the axiomatic method was known earlier; it is mentioned by Aristotle in his Metaphysics. Euclid’s original system was eventually found to be incomplete, and its reliance on the use of diagrams left room for fallacious arguments, but the various defects were repaired by David Hilbert, working in the late nineteenth and early twentieth centuries.

The axiomatic method was for many centuries treated as the paradigm of systematic theoretical knowledge, and the hope that all knowledge might eventually be cast in that form inspired Rationalism. Traditionally an axiomatic system was supposed to start with a set of propositions whose truth could be easily established; they should if possible be self-evidently true. The remaining propositions of the discipline in question were then to be deduced from the axioms.
Aristotle thought that the justification of the principles of Logic and the axioms of Geometry was one of the functions of Metaphysics - the part of Philosophy presupposed by all the individual disciplines - so an Aristotelian might offer that as an example of an intrinsically philosophical problem that can never be assigned to any more specialised field of study.

Historical Summary: Mathematics and Logic

In the late 19th and early 20th centuries mathematically minded philosophers and philosophically minded mathematicians extended logic and presented it in algebraic form. That opened the way for a rigorous re-examination of the status of Mathematics.

As early as the late seventeenth century Leibniz had dreamt of a sort of arithmetic of logic, by means of which the truth value of any proposition could be determined by some sort of calculation. He hoped that we might get to a point where all disagreements could be settled in that way. We’d now call this having a decision procedure. In propositional logic truth tables provide just such a procedure. In modern terms Leibniz thought that all propositions might be decidable: that there might be some algorithm that could be applied to absolutely any proposition and that could in every case be relied on to yield a definite answer, TRUE or FALSE.

At this point it is convenient to define algorithm. Any set of rules that can be relied on to solve any problem of a certain type in a finite number of steps is called an ‘algorithm’. For example the standard procedures for addition, subtraction and multiplication are all algorithms. In logical theory ‘decision procedure’ is equivalent to ‘algorithm’. In cookery a reliable recipe is an algorithm for producing the soup, cake, stew or whatever it is that it tells us how to cook. A computer program, if it works, embodies some sort of algorithm.
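The claim that truth tables provide a decision procedure for propositional logic can be illustrated with a short program. What follows is only a sketch in Python, not anything taken from the logical literature: the function names (is_tautology, implies) and the two sample formulae are my own choices for illustration. The procedure tries every assignment of TRUE and FALSE to the variables, so it always halts after a finite number of steps with a definite answer.

```python
from itertools import product


def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def is_tautology(formula, variables):
    """Decide, by exhausting the truth table, whether `formula` is true
    under every assignment of truth values to `variables`."""
    for values in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if not formula(assignment):
            return False  # found a row of the table where the formula fails
    return True


def modus_ponens(v):
    # ((p -> q) & p) -> q
    return implies(implies(v["p"], v["q"]) and v["p"], v["q"])


def affirming_the_consequent(v):
    # ((p -> q) & q) -> p
    return implies(implies(v["p"], v["q"]) and v["q"], v["p"])


print(is_tautology(modus_ponens, ["p", "q"]))              # True
print(is_tautology(affirming_the_consequent, ["p", "q"]))  # False
```

A formula with n variables has a table of 2^n rows, so the procedure is reliable but can be slow. That is enough for it to count as an algorithm in the sense just defined.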
Only a Rationalist could have supposed there might be such a procedure for determining the truth of any proposition, but even many Empiricists thought that the propositions of Mathematics might all be decidable even if no other propositions were.

The Mathematician Peano (1858-1932) produced a short list of axioms from which he hoped it would be possible to deduce all true propositions about the natural numbers, the members of the set N = {1, 2, 3, ...}. I list those axioms later.

Frege tried to extend logic to the point where numbers could be defined within logic, hoping to prove all Peano’s axioms and so to establish them as logical truths. Russell found a contradiction in Frege’s system, but he and A. N. Whitehead published Principia Mathematica (1910-1913), which described a system of logic that appeared to be free from contradiction and sufficiently powerful to prove Peano’s axioms.

The ideal was a completely formal system in which inference started from certain special propositions, the axioms, and proceeded by precisely defined rules, the rules of inference. Once axioms and rules of inference have been chosen no subjective judgement should be necessary. Purely mechanical tests should suffice to check whether any proposed proof meets the criteria. A formalisation with that property is said to be effective.

Effectiveness is extremely important, for in everyday argument it is often hard to be sure whether an argument is valid or not. As I have already remarked in Chapter 2, it can be especially hard to be sure that an argument is invalid, as that may require showing that there is no valid form to which the argument conforms. Although we can be completely sure an argument is invalid if it leads from a true premiss to a false conclusion, that fact is no help in the cases where we do not know whether the conclusion is true or false, and where we are interested in determining the validity of the argument because, if valid, it will show that its conclusion is true.
The desire to overcome that difficulty was one of the motives behind the construction of formal logic.

Although the Russell-Whitehead system was formal in the sense that it had a formal test for the validity of a proof, it was not decidable - that is, it did not have a decision procedure that would determine, for an arbitrary formula, whether or not it was a theorem. It was eventually proved that no system as strong as theirs could have such a decision procedure.

It appeared that the most one could hope for was that
(1) Every truth expressible in the system should have a proof, and
(2) There should be a decision procedure capable of determining the validity of any putative proof.
However in 1931 Kurt Gödel (1906-1978) proved that even this is impossible. (1) and (2) are incompatible, so no consistent system strong enough to include even basic arithmetic [1] can be both complete and effectively formalised. Since then a variety of formal systems have appeared sufficient to prove most of contemporary Mathematics, but their construction is usually treated as part of Mathematics rather than pure logic.

[1] In this context ‘arithmetic’ does not mean just the carrying out of simple calculations, but includes the whole of the theory of numbers, with algebraic identities and theorems about primes and divisibility.

I want now to describe the search for ‘foundations’ in more detail and consider what wider significance the story may have. But before I do that I must introduce set theory and equivalence relations.

Set Theory

The beginnings of the subject are usually traced to Euler, who used diagrams to illustrate inferences in Aristotelian logic (see Chapter 2 of these notes). In 1847 George Boole (1815-64) published The Mathematical Analysis of Logic, in which he said “That which renders Logic possible, is the existence in our minds of general notions, - our ability to conceive of a class and to designate its members by a common name”.
Boole developed an algebra of classes, using the arithmetical symbols for addition and multiplication but specifying new rules so that ‘+’ represented union and ‘.’ represented intersection. I shall substitute for that the notation we use today.

Boole thought classes could be defined by specifying a common property peculiar to the members of the class, and possessed by all members; thus {x: F(x)} denotes the class of all individuals of which F(x) is true, so {x: x is a fish} is the class of all fish.

Although I can’t find a reference to the possibility in Boole, an alternative way of defining a finite class is to list its members, as in
U = {1, 2, 3, 4, 5, 6}
In such a definition the order in which the elements appear is unimportant. {4, 6, 1, 5, 2, 3} is just as good a definition of U.

Membership: ‘x is a member of set A’ is symbolised by ‘x ∈ A’.

The Intersection of two classes is the class containing precisely the individuals that belong to both classes, thus
A ∩ B = {x: x ∈ A ∧ x ∈ B}
For example {1, 2, 3, 4} ∩ {2, 4, 6, 8} = {2, 4}
The numbers are arranged in numerical order for convenience, but listing them in a different order would not alter the meaning.

The Union of two classes is the set containing just those elements that belong to either of those classes or to both, thus
A ∪ B = {x: x ∈ A ∨ x ∈ B}
so that {1, 2, 3, 4} ∪ {2, 4, 6, 8} = {1, 2, 3, 4, 6, 8}
Notice that the numbers 2 and 4 appear only once in the listing of the members of the union. Think of the union of two sets as like the merger of two clubs, where anyone who used to belong to either of the original clubs becomes a member of the new club. There might be people who originally belonged to both the original clubs, but such people would not be members of the new club twice over with two votes at the AGM; they’d just be members of it and have one vote each.

The Complement of a class.
Boole refers to ‘the members of the universe’ that do not belong to a class:
A’ = {x: ¬(x ∈ A)}
(¬(x ∈ A) is usually abbreviated to x ∉ A.)

To talk of ‘things that don’t belong to a class’ is obscure unless we say what sort of things are eligible for membership. If G = the set of my garden tools, what is G’? Is it the set of other people’s garden tools, or the set of all my tools that are not garden tools, or the set of all physical objects that are not my garden tools? Are its members confined to physical objects, or does it contain the numbers, and the plays of Shakespeare, and the rules of etiquette of the Byzantine court in the tenth century? Until we’ve defined our universe of discourse G’ is undefined.

Today we usually represent the Universe class as E and the empty class, which has no members, as ∅. The complement of a set S is then interpreted as the set of members of the universe set that do not belong to S.

Equivalence relations

Equality has three important properties. It is:
(1) Reflexive: a = a
(2) Symmetric: a = b → b = a
(3) Transitive: (a = b & b = c) → a = c
Any relation with those three properties is called an equivalence relation.

If ≡ represents an equivalence relation over some set S, ≡ is said to define a partition of S. That means a subdivision into subsets S_1, ..., S_n so that every member of S belongs to one and only one subset and any two members, a and b, of the same subset satisfy a ≡ b. Those subsets are called equivalence classes.

In general a Partition of a set S is a subdivision of S into subsets with the properties that:
(1) Every member of S belongs to precisely one subset, from which it follows that
(2) Any two sets in the partition are disjoint, that is, they have no members in common.
Any satisfactory filing system or classification system must be based on a partition of the material to be classified, so that everything has its place, but only one place.
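The definitions of this section can be tried out with Python’s built-in set type. The following is a sketch under assumptions of my own: the universe E = {1, ..., 8} is chosen arbitrarily so that the complement is well defined, the relation ‘has the same parity as’ is just a convenient example of an equivalence relation, and the helper names equivalence_classes and is_partition are invented for the illustration.

```python
E = {1, 2, 3, 4, 5, 6, 7, 8}   # universe of discourse (an arbitrary choice)
A = {1, 2, 3, 4}
B = {2, 4, 6, 8}

intersection = A & B    # members of both A and B
union = A | B           # members of A or B or both, each listed once
complement_A = E - A    # members of the universe that are not in A


def equivalence_classes(S, related):
    """Group the members of S into classes of mutually related elements.
    `related` is assumed to be reflexive, symmetric and transitive."""
    classes = []
    for x in sorted(S):
        for cls in classes:
            representative = next(iter(cls))
            if related(x, representative):
                cls.add(x)
                break
        else:
            classes.append({x})   # x is related to nothing seen so far
    return classes


def is_partition(S, subsets):
    """True if the subsets are non-empty, lie inside S, and every member
    of S belongs to precisely one of them."""
    return (all(sub and sub <= S for sub in subsets)
            and all(sum(x in sub for sub in subsets) == 1 for x in S))


def same_parity(a, b):
    return a % 2 == b % 2


classes = equivalence_classes(E, same_parity)
print(intersection, union, complement_A)
print(classes, is_partition(E, classes))
print(is_partition(E, [A, B]))   # False: 2 and 4 have two places, 5 and 7 none
```

The last line shows why A and B would make a poor filing system for E: some items would have two places and others none.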
To any partition of a set S there corresponds an equivalence relation over S, for the relation that holds between two elements when they belong to the same subset is an equivalence relation.

For example, define ≡ as the relation holding between two natural numbers when they both leave the same remainder when divided by 4. In the theory of numbers two numbers so related are said to be congruent modulo 4. Thus, for example, 5 ≡ 9, since each leaves remainder 1. The relation ≡ partitions the natural numbers into four equivalence classes, one corresponding to each of the possible remainders 0, 1, 2 and 3; the equivalence classes are:
{0, 4, 8, ...}, {1, 5, 9, ...}, {2, 6, 10, ...} and {3, 7, 11, ...}

Another example of an equivalence relation is the relation of congruence over the set P of plane geometrical figures. One of its equivalence classes is the set of all circles of radius 15 cm; another is the set of all triangles with sides equal to 11 cm, 14 cm and 16 cm. There are infinitely many equivalence classes, one corresponding to each distinct description of a plane geometrical figure.

Some theorists have defined numbers as equivalence classes of one sort or another, and the most popular theories of the foundations of Mathematics base the subject on set theory, so it seemed best to clarify both ideas at the outset, although Peano, whose system I discuss first, didn’t use either sets or equivalence relations in his treatment of natural numbers.

Peano’s System

Peano’s system, confusingly often referred to as Z, is sufficient to prove all the basic rules of school algebra. Peano developed the theory of natural numbers from their use in counting. For the purposes of the present discussion, the natural numbers are the numbers {0, 1, 2, ...} and are called natural numbers because of their use in counting. Mathematical usage is not completely consistent.
For most purposes mathematicians apply the term ‘natural number’ to the set {1, 2, 3, ...}, not including zero, but those who work in the foundations of mathematics include zero in the set. During the first year of the operation of the Open University I worked as a part-time tutor for the Foundation Course in Mathematics and recall receiving several memoranda, alternately saying that zero was a natural number, and that it wasn’t. I don’t recall whether it finally ended up in or out of the set.

Peano took as primitive ideas the number zero, and the successor operation that corresponds to the transition from one number to the next in the process of counting. The successor of some number x is represented by x|. The numbers are represented in Peano’s system by the symbols 0, 0|, 0||, 0|||, and so on. Symbols such as 1, 5, 73 are treated as just abbreviations for a zero followed by the appropriate number of dashes. Multiplication is represented by ‘.’

Z comprises:
Z1 (x)(y)(x| = y| → x = y) two numbers with the same successor are equal
Z2 (x)(0 ≠ x|) zero is not the successor of any number
Z3 (x)(x + 0 = x) addition of zero makes no difference
Z4 (x)(y)(x + y| = (x + y)|)
Z5 (x)(x.0 = 0) multiplication by zero produces zero
Z6 (x)(y)(x.y| = (x.y) + x)
together with an infinite set of axioms for mathematical induction. The induction axioms are defined as all formulae of the form:
(P(0) & (x)(P(x) → P(x|))) → (x)(P(x))
That has the effect that if, for some quality P, it is possible to prove both that zero has quality P and that, if P applies to any number, it also applies to the successor of that number, then we may infer that P applies to every number.

Mathematical Induction

Note that in Z mathematical induction is not a single axiom, but an infinite set of axioms generated according to a rule.
The principle of mathematical induction is that, as the natural numbers are defined by the process of counting, a proof may follow the steps of that definition.

To prove (x)(P(x)) we first prove P(0), called the basis of the induction, and we then prove that, if P(x) is true for any value of x, it must also be true for the number one greater. That is called the induction step. The two steps of the proof together establish that P is true of any number that can be reached by starting from 0 and counting, in other words that P is true of any natural number.

For example, induction may often be used to prove formulae for sums of series. The sum of the natural numbers from 0 to n is given by:
0 + 1 + 2 + 3 + ... + n = [n(n+1)]/2
To prove that, let S_n = 0 + 1 + 2 + 3 + ... + n and let F(n) = [n(n+1)]/2.
The proof by induction is as follows:
Basis: S_0 = 0 and F(0) = [0*1]/2 = 0, so that S_0 = F(0) ..... (1)
Induction step: suppose that for some number k, S_k = F(k) ..... (2)
(note that we’ve already shown that there is at least one such k, namely 0)
Adding (k+1) to both sides of (2) gives:
S_k + (k+1) = F(k) + (k+1)
so that
S_{k+1} = [k(k+1)]/2 + (k+1) = (k+1)[k/2 + 1] = (k+1)[(k+2)/2] = [(k+1)(k+2)]/2
which is F(k+1), the expression that would be obtained by substituting (k+1) for n in the original formula. Hence the formula holds for every natural number n.

Review of Peano’s Axioms

Z1 states that if two numbers have the same successor they are equal. Z2 states that zero is not the successor of any number. It is possible to prove from the axioms that every number except zero is the successor of some number.

Addition and multiplication are introduced by recursive definitions. A recursive definition of a function f is one that defines f(0), the value of the function for zero, and also gives a rule for obtaining the value f(x|) from f(x).
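The recursive definitions of addition and multiplication in Z3 to Z6 can be run almost as they stand. The sketch below is my own illustration in Python, not part of Peano’s system: numerals are represented as text strings consisting of ‘0’ followed by strokes, and the function names succ, numeral, add and mul are invented for the purpose.

```python
ZERO = "0"


def succ(x):
    """The successor x| of a numeral x."""
    return x + "|"


def numeral(n):
    """Abbreviation: the numeral for n is a zero followed by n strokes."""
    return ZERO + "|" * n


def add(x, y):
    if y == ZERO:
        return x                    # Z3: x + 0 = x
    return succ(add(x, y[:-1]))     # Z4: x + y| = (x + y)|


def mul(x, y):
    if y == ZERO:
        return ZERO                 # Z5: x.0 = 0
    return add(mul(x, y[:-1]), x)   # Z6: x.y| = (x.y) + x


print(add(numeral(5), numeral(3)))                # 0||||||||
print(mul(numeral(2), numeral(3)) == numeral(6))  # True
```

Each call strips one stroke from the second argument, so the recursion must reach zero after finitely many steps. That is the same pattern as a proof by induction, which is why the two fit together so readily.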
Z3 defines the addition of zero and Z4 specifies a rule that reduces the addition of y| to the addition of y. The underlying idea is illustrated by:
5 + 3 = 5 + 2| = (5 + 2)| = (5 + 1|)| = ((5 + 1)|)| = (5 + 1)|| = (5 + 0|)|| = ((5 + 0)|)|| = (5 + 0)||| = 5||| = (0|||||)||| = ...<a few steps omitted>... = 0||||||||
which is what we usually abbreviate to “8”.

Z5 defines multiplication by zero, and Z6 defines multiplication by y| in terms of multiplication by y and addition.

A recursive definition lends itself particularly readily to proof by induction.

Note that Z2 precludes there being any natural numbers less than zero, so there are no negative natural numbers. However that does not prevent the system being extended to include additional objects to do the work of negative numbers, though such objects do not satisfy the axioms for natural numbers.

One way of introducing negative numbers is to define integers in terms of ordered pairs of numbers (a, b). Think of (a, b) as representing (a - b), so (5, 3) represents 2, and (4, 8) represents -4. Thus many different pairs of natural numbers will represent what we’d want to call the same integer; for instance (4, 8), (7, 11) and (0, 4) all represent -4.

Define integer equality so that (a, b) ≡ (c, d) if and only if a + d = b + c. Then ≡ is an equivalence relation that partitions ordered pairs of natural numbers into equivalence classes, one class corresponding to each integer.
Define integer addition by (a, b) PLUS (c, d) = ([a + c], [b + d])
Define integer multiplication by (a, b) TIMES (c, d) = ([ac + bd], [ad + bc])

Just defining addition and multiplication like this is not sufficient to establish integer arithmetic. We need to show also that the definitions are consistent and correspond to the operations of addition and multiplication for integers.
We have defined integers as equivalence classes of pairs of natural numbers, so we have to show that if we calculate, for example, (-4) × (-3) we get the same answer whichever number pairs we use to represent the -4 and the -3. In other words we have to show that sum and product as we have defined them are not altered when either (a, b) or (c, d) is replaced by another equivalent number pair. I shall do that in the case of addition.

We have to show that if (A, B) ≡ (a, b) and (C, D) ≡ (c, d), then (A, B) PLUS (C, D) ≡ (a, b) PLUS (c, d)
(1) Suppose (A, B) ≡ (a, b) and (C, D) ≡ (c, d)
(2) (A, B) ≡ (a, b) ↔ A + b = B + a, and (C, D) ≡ (c, d) ↔ C + d = D + c
(3) (A, B) PLUS (C, D) = ([A + C], [B + D])
(4) (a, b) PLUS (c, d) = ([a + c], [b + d])
(5) from (1) and (2): A + b = B + a, and C + d = D + c
(6) from (5): A + b + C + d = B + a + D + c
(7) from (6): (A + C) + (b + d) = (B + D) + (a + c)
(8) from (7): ([A + C], [B + D]) ≡ ([a + c], [b + d])
(9) from (3), (4) and (8): (A, B) PLUS (C, D) ≡ (a, b) PLUS (c, d)
Hence PLUS defines an operation on integers.

That proof assumes that addition is associative and commutative. Those properties are not explicitly stated in Peano’s axioms, but may be proved from them.
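The point of that proof can be checked on particular cases by a short program. The sketch below, in Python, is offered only as an illustration: the function names equivalent, plus, times and well_defined are my own, the sample pairs are arbitrary, and a check of a few cases is of course no substitute for the general proof. It tests that replacing a pair by an equivalent pair leaves the sum, and also the product, in the same equivalence class.

```python
def equivalent(p, q):
    """(a, b) is equivalent to (c, d) if and only if a + d = b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c


def plus(p, q):
    (a, b), (c, d) = p, q
    return (a + c, b + d)


def times(p, q):
    (a, b), (c, d) = p, q
    return (a * c + b * d, a * d + b * c)


def well_defined(op, reps_p, reps_q):
    """True if `op` gives equivalent results whichever representatives
    are chosen from the two lists of mutually equivalent pairs."""
    expected = op(reps_p[0], reps_q[0])
    return all(equivalent(op(p, q), expected)
               for p in reps_p for q in reps_q)


minus_four = [(4, 8), (7, 11), (0, 4)]     # three pairs representing -4
minus_three = [(0, 3), (2, 5), (10, 13)]   # three pairs representing -3

print(well_defined(plus, minus_four, minus_three))     # True
print(well_defined(times, minus_four, minus_three))    # True
print(equivalent(times((4, 8), (0, 3)), (12, 0)))      # True: (-4) x (-3) = +12
```

The natural numbers inside each pair are handled here with Python’s own arithmetic rather than with the strokes of Z, which keeps the example short.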
To show the connection between the integer operations just defined and the original operations on natural numbers, we note that:
(a, 0) PLUS (b, 0) = (a + b, 0)
and (a, 0) TIMES (b, 0) = (ab, 0)
so that every integer equal to an integer of the form (k, 0), with k greater than 0, corresponds to the positive integer +k.
Also (0, 0) corresponds to the natural number 0, since:
(0, 0) PLUS (a, b) = (0 + a, 0 + b) = (a, b)
and (0, 0) TIMES (a, b) = (0×a + 0×b, 0×b + 0×a) = (0, 0)
and (k, 0) PLUS (0, k) = (k, k) ≡ (0, 0)
so (0, k) corresponds to -k.

We have now defined integers within Z, and can go on to prove that the product of two negative numbers is positive, for
(-1) TIMES (-1) = (0, 1) TIMES (0, 1) = ([0×0 + 1×1], [0×1 + 1×0]) = ([0 + 1], [0 + 0]) = (1, 0) = +1
Thus integers defined as pairs of natural numbers turn out to correspond to our everyday notion of all the whole numbers, whether positive or negative.

There is also a number pair method for introducing rational numbers, the set of all whole numbers and fractions, so a good deal of Mathematics can be proved from Peano’s axioms, but, as Gödel proved, Z is nonetheless incomplete.

Logicism

It had long been generally believed that mathematical truths were a priori, though there was no general agreement what that amounted to. Empiricists thought Mathematical truths must be analytic, but thinkers in the Platonic tradition disagreed.

As Mathematics has a close affinity to Logic, it occurred to some mathematicians and logicians working in the late nineteenth century that Mathematics might actually be reduced to logic. The thesis that it can be so reduced is called ‘Logicism’.

Frege, although he didn’t use the concept of ‘equivalence relation’, suggested that numbers could be identified with equivalence classes of sets. In 1884 he published Die Grundlagen der Arithmetik, in which he examined a number of popular accounts of number. He was particularly concerned to rebut two common misconceptions:
(1) that a number is a property of physical objects, or collections of objects, and
(2) that arithmetical truths are inductive generalisations based on our experience.

Against (1) Frege pointed out that there is no unique number appropriate to any object or collection of objects. One pack of playing cards is also 2 colours, 4 suits, 13 denominations and 52 cards. Furthermore there is no need for objects to be physically collected together to be numbered. If a couple have seven children, they have seven children whether the children all live at home, or all in separate homes of their own.

A number, Frege argued, cannot belong to a physical object, or class of such objects, but only to a concept. If there is just one pack of cards on the table the concept ‘Pack of cards on the table’ is associated with number 1, ‘Suit of cards on the table’ with number 4 and ‘Playing card on the table’ with number 52.

Digression: Sets or Classes

In the late nineteenth and early twentieth centuries the terms ‘aggregate’, ‘class’, ‘collection’ and ‘set’ were used interchangeably. Today ‘set’ and ‘class’ have been singled out for use in special senses, but that was only after Russell had discovered an underlying problem. In discussing the early stages in the attempt to develop a foundation for Mathematics I shall follow Russell and Whitehead in using the word ‘class’.

Return to Frege and Russell

Frege made a proposal equivalent to specifying that (Natural) numbers be defined as the equivalence classes of classes defined by the equivalence relation of one-one correspondence. Two classes S and T were said to be of the same cardinality if their members could be paired off so that every member of S is paired with precisely one member of T and vice versa. (Frege attributed the idea to David Hume.) The natural numbers were then identified with the equivalence classes defined by that relation.
It remained to produce a specimen class for each equivalence class so we can say which natural number is which. Does that mean we need to assume the existence of objects of some sort: three objects to define the number three and a million objects to define a million? Frege and Russell thought not. Even if nothing existed there would still be the empty class, {} = ∅, the class with no members. Zero is the class of classes of the same cardinality as the empty class, ∅. One is the class of classes of the same cardinality as the class which has the empty class as its only member, i.e. {∅}. Two is the class of classes of the same cardinality as {∅, {∅}}, the class containing the two specimens already constructed, and for any natural number n, an example of an n-membered class is the class containing the specimens constructed for all the natural numbers less than n.

Thus an example of each natural number was to be constructed, ultimately, from the empty class, with no need to assume that anything not a class actually existed. I suspect that Frege and Russell thought that no justification was needed for assuming the existence of the empty class, since it was established by definition, and that definition does not appear to assume the existence of anything.

Russell’s Paradox

Frege’s original system was proved inconsistent by Russell, who derived a contradiction from the supposition that there might be a class having as its members just those classes that are not members of themselves.

That paradox arises from the seemingly innocuous assumption that for any predicate there is a class of particulars of which that predicate is true. I shall call that assumption extensionality, though it is more usually called unrestricted comprehension. It would imply that for any quality F there is a class {x: F(x)}, the class of objects that have quality F. {x: F(x)} is called the extension of F. A predicate that does not apply to anything has the empty class as its extension.
Russell proposed letting F(x) = x ∉ x, ‘x is not a member of itself’, so that the class
S = {x: F(x)}
is the class of all classes that are not members of themselves. Unfortunately to assert the existence of such a class is paradoxical, since if that class S belongs to itself, it is not one of the classes that do not belong to themselves, and so fails the test for membership of S, and so cannot belong to itself, while if S is not a member of itself it satisfies the test and ought to be a member of itself.

Treating it formally, S = {x: F(x)} = {x: x ∉ x}, then
S ∈ S ↔ F(S) ↔ S ∉ S
giving a contradiction. S belongs to itself if, and only if, it does not.

This shows that naive class theory, the collection of our common sense expectations of classes (insofar as we have such expectations), is inconsistent.

In Principia Mathematica Whitehead and Russell listed seven paradoxes, remarking that they were examples of an infinity of possible paradoxes. Their first example, the paradox of the liar, is of great antiquity, originating with the remark of Epimenides the Cretan that all Cretans were liars. It is usually presented in the simpler form:
P1: ‘This proposition is false’
If P1 is false, it follows that it is true, and if it is true it must, as it says, be false.

They also included Russell’s paradox of the class of all classes not members of themselves, an analogous paradox about relations, the Burali-Forti contradiction about the class of all ordinal numbers, which I discuss later in this chapter, and three paradoxes about definitions of numbers, of which I shall give just one example, known as Berry’s paradox.

Berry’s Paradox arises from the consideration of the definitions of numbers by English sentences. In general larger numbers need longer definitions than smaller numbers.
That is only roughly true, because ‘seven hundred and forty three’ is a longer phrase than ‘one million’, but as we consider progressively larger numbers even the ‘round’ numbers will need progressively longer definitions. Consider for instance 100000000000000000000000.

Let us define the number N so that
N = the smallest number that is too large to be defined in fewer than sixteen words.
Now consider the length of that definition; it contains only fifteen words, which is fewer than sixteen, contradicting the terms of the definition.

Whitehead and Russell (henceforth ‘W&R’) noted that the paradoxes had in common the property of self reference and proposed to avoid such paradoxes by introducing a Theory of Types, designed to prevent self reference.

W&R argued that the idea of something being a member of itself does not make sense. Every individual should, they suggested, be assigned to a type. The basic type contains individuals that may belong to classes, but cannot themselves have members. Let’s call those individuals of type 0. A class of type 0 individuals would then be of type 1, and so on. An object of type n may contain only objects of type n-1 and may belong only to objects of type n+1. Thus no individual could sensibly be said to belong to itself. A proposition of the form ‘S ∈ S’ is just nonsense, and cannot be either true or false because it doesn’t say anything.

I think the intuitive appeal of that argument is that we think of a class as like a bag. A bag may contain many things, even other bags, but it cannot contain itself, nor may a bag B contain another bag that contains B, or contains anything that contains anything that in turn contains B, and so on.

Not all the paradoxes can be avoided by distinguishing types of individual.
To avoid the paradox of the liar, or of the number that can’t be defined in fewer than sixteen words, Russell elaborated the theory of types by distinguishing different levels of language. ‘Grass is green’ would count as level 1, ‘It is true that grass is green’ would count as level 2, ‘It is not certain that it is true that grass is green’ would be of level 3, and so on.

So elaborated, the Theory of Types did appear to avoid all the known paradoxes, but only at the cost of considerable technical complications.

Although they said a great deal about classes, W&R thought that they were not actually committed to the existence of such entities, insofar as the existence of a class involves anything more than the existence of its members. The primary notion, they held, was not class but the property used to define a class. References to classes were to be given ‘contextual definitions’ in terms of the common properties of their members. The only meaningful statements about classes were to be those covered by such definitions. In isolation the symbol for a class had no meaning.

In accordance with the Theory of Types, only properties of a certain sort could be used to define a class. They were extensional propositional functions. A propositional function is what I have called an open sentence (see Chapter 2), something like ‘x is a fish’, which turns into a meaningful sentence when the name of some physical object is substituted for the ‘x’. The class of fishes is then represented by the formula {x: x is a fish}. (W&R actually used a different notation that is beyond the capacity of my word processor.) However {x: x is a fish} has no meaning in isolation; only when used in one of certain specified ways does it generate a proposition. For example it can be used to generate the proposition:
{x: x is a fish} ⊆ {x: x is a vertebrate}
which means the class of fishes is a subset of the class of vertebrates.
The same proposition can be expressed as ‘all fishes are vertebrates’, written as
(x)(x is a fish → x is a vertebrate)
Any finite class is easily defined in terms of identity. For instance
x ∈ {2, 4, 5} is equivalent to (x = 2) ∨ (x = 4) ∨ (x = 5)

However not every propositional function defines a class. Basic to the idea of a class is that it is completely specified when its members are specified, which is why W&R required that the defining propositional function should be extensional. By that they meant that truth should depend only on which individuals are referred to. Thus
{x: the Bishop believes x is a good man} = the class of men whom the Bishop believes to be good
does not define a class, because the Bishop might believe that the Vicar of Bray is a good man, and also believe that the burglar who broke into the episcopal palace last night is not a good man, even though, unknown to the Bishop, the burglar was the Vicar of Bray. ‘The Bishop believes x is a good man’ is not extensional, because the truth of any proposition generated from it depends not only on the identity of whatever individual is substituted for the ‘x’, but on how that individual is described.

W&R found that their theory of types made it impossible to implement this analysis of references to classes. To use classes as a basis for Mathematics they needed to be able to refer to all the classes that have a particular individual, i, as a member. On their analysis of classes, that involved referring to all the properties of any individual, but that was forbidden by the theory of types, which distinguished different orders of proposition about any individual.
All the theory of types allowed was reference to all the properties of a particular order, so one could talk of ‘all the first order properties of i’, ‘all the second order properties’, and so on, but not just ‘all the properties of i’.

W&R defined the predicable properties of any individual as the properties of order one higher than the individual. Thus green, hard, and light are all predicable properties of physical objects, since all three properties are of the first order, while physical objects are of order zero. On the other hand ‘colour predicate’ is a second order property since it qualifies colours, which are themselves first order properties, so ‘colour predicate’ is a predicable property of green and ‘green’ is a predicable property of grass, but ‘colour predicate’ is not predicable of grass. ‘Simon’s shirt contains all the colours of the rainbow’ is not a predicable statement about Simon’s shirt, since ‘colour of the rainbow’ is a second order predicate, while ‘Simon’s shirt’ is a basic individual of type zero.

W&R then introduced an Axiom of Reducibility according to which, for every extensional property of an individual, there is an extensionally equivalent predicable property; in other words there is a predicable property that applies to precisely the same individuals. Thus for any extensional property φ there is a predicable property ψ that applies in precisely the same cases, so that (∀x)(φ(x) ↔ ψ!(x)); the exclamation mark is to emphasize that ψ is predicable. W&R’s example was ‘Napoleon had all the properties that make a great general’. That is a non-predicable proposition about Napoleon as it refers directly to Napoleon’s qualities, not to Napoleon. The equivalent predicable proposition will attribute to Napoleon a list of all the qualities that actually do make a great general, so it is something like ‘Napoleon was brave, cunning, cool, calculating, meticulous in his planning....’ I’m not sure about that analysis.
Napoleon’s possession of properties such as those must be the justification for judging him to be a great general, but it does not follow that the judgement is just a recitation of the considerations that justify it.

The apparent need to introduce the Axiom of Reducibility made it far less plausible to claim that Mathematics had actually been constructed within logic, since the axiom had been introduced specifically to facilitate that construction, not to meet any requirement of logic in general. It may actually not even have been necessary. For the axiom was introduced to avoid problems created by distinguishing between different orders of property, a distinction that had been made to avoid certain of the logical paradoxes. However not all the paradoxes required that distinction. Those that did were the so called semantic paradoxes, like the liar and the least number that can’t be defined in a certain number of words. Those paradoxes arise from confusions about the meanings of words in everyday discourse, and no such paradoxes could arise in a formal system like those used in formal Logic and Mathematics. There is no way to translate into logical symbolism ‘This statement is false’ or ‘Number N cannot be defined in fewer than 16 words’, so there is no need to complicate mathematical logic to avoid the semantic paradoxes, and therefore no need to introduce the Axiom of Reducibility to ameliorate the complications. Intrinsically logical paradoxes such as Russell’s Paradox can be avoided by the simple theory of types. Quine later constructed a modified version of the W&R system using only a simple theory of types.

However most mathematicians interested in the Foundations of Mathematics have rejected Logicism in favour of Formalism, a different approach mainly due to David Hilbert (1862-1943).
Formalists have usually preferred to build Mathematics in set theory instead of in the predicate logic, so the next step is to consider the development of mathematical set theory.

Zermelo Set theory

Working independently of Russell and Whitehead, their contemporary Ernst Zermelo (1871-1953) tried to develop a set theory free from paradoxes by restricting the sets assumed to exist. Incidentally ‘set’ has come to be reserved for individuals that can both have members and themselves be members of other sets. A collection that cannot itself be a member of anything else is called a ‘class’ or often a ‘proper class’. There is no set of all sets, so ‘all sets’ is not a set but a proper class.

Zermelo assumed some domain of objects with a relation ∈ so that a ∈ b asserts that a is a member of b. His axioms were:

Zo 1: Two sets are equal if and only if they have the same members.
Zo 2: The existence of elementary sets.
2.1: There is a null set ∅ with no members.
2.2: For every element x there is a set {x} of which x is the only member.
2.3: For any two elements x, y there is a set {x, y} containing x, y and no other element.
Zo 3: Separation. If the predicate P is defined for all the members of a set S, then there is a set consisting of all the members of S for which P is true; that set is usually represented {x: P(x) & x ∈ S}.
Zo 4: Power set. For any set S there is a set consisting of all and only the subsets of S.
Zo 5: Union. For any set S there is another set, called the union of the members of S, consisting of just those elements which belong to members of S.
Zo 6: Axiom of choice.
If S is a set of which all the members are non-empty sets, then the union set of S has at least one subset which has exactly one member in common with each element of S.
Zo 7: Axiom of Infinity. There is at least one set, Z, such that one of its members is the null set, and if x belongs to Z, so does {x}.

How far are these axioms intuitive in the sense of expressing our concept of set? The difficulty is that until we study Logic or Mathematics we don’t have much intuition about sets, and any sense of intuition we feel later may just be a reflection of the climate of mathematical opinion at the time we learnt our set theory. I think people usually imagine a set either as a collection of objects, or as a list of objects. Both pictures are liable to be misleading. ‘Collection’ is misleading because there is no need to do any physical collecting, and not even any possibility of doing it in the case of abstract objects. ‘List’ doesn’t apply to infinite sets. Georg Cantor (1845-1918) said “A set [he used the word Menge] is a collection into a whole of definite distinct objects of our intuition or of our thought”. Nonetheless I think that when people find at least some of Zo 1 to Zo 7 intuitively plausible, it is because they are thinking of either collections or lists. As I’ve already remarked, I suspect that what Mathematicians and Logicians call their ‘intuitions’ about sets are really their intuitions about bags, and that there are really no intuitions specifically about sets.

An alternative is to think of set theory as a game invented by Zermelo and others, and to ask ‘Can a game be played according to these rules; are they consistent?’ For any formal system consistency can be established by producing a model (discussed later). Finite sets can be modelled by either collections or lists, so for Zo 1 to Zo 5 we can safely rely on intuition even if it is derived from those defective pictures of set theory. I’m quite unable to conjure up much of an intuition for Zo 7.
Axiom 7, the axiom of infinity, corresponds to Frege’s construction of specimens of the numbers as ∅, {∅}, {∅, {∅}}, {∅, {∅}, {{∅}}}, ... so we definitely need it, but what intuitive appeal it has seems to me no more than our desire to get the natural numbers.

Zo 3, which permits the definition of sets of elements possessing a common quality, does not permit the definition of the set of all sets not members of themselves, because the axiom permits only the definition of a new set that is a subset of some set already established; thus Zermelo’s system avoids Russell’s paradox.

Zo 6 is controversial, and Mathematicians like to go as far as they can without using it, but it is at least obviously true in the case of finite sets.

Although they are not explicitly mentioned in any of the axioms, the three simple binary operations on sets, union, intersection and difference, are provided for. Given any two sets S and T:

Union: Zo 2.3 guarantees a set {S, T} and Zo 5 entails that there is a set R containing precisely the objects that belong to either S or T. R is S ∪ T.
Intersection: In Zo 3 let P(x) be ‘x ∈ T’; then {x: P(x) & x ∈ S} = {x: x ∈ T & x ∈ S} = S ∩ T.
Difference: In Zo 3 let P(x) be ‘x ∉ T’; then {x: P(x) & x ∈ S} = {x: x ∉ T & x ∈ S} = S - T = S ∩ T’.

The existence of a set {x} for any x, together with the existence of a union for any two sets, guarantees the existence of a set containing any finite list of elements we care to specify and also, therefore, the existence of a union for any finite list of sets. However the union of an infinite list of sets is not available unless those sets are themselves members of some set, and there is no guarantee that they will be. For instance there cannot be a union of all sets, because there is no set of all sets. The Axiom of Replacement (see below) allows unions for some infinite lists of sets.
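The derivation of union, intersection and difference from the axioms can be imitated for finite sets. The following is only a sketch: Python frozensets stand in for sets, and the function names pair, big_union and separate are my own labels for Zo 2.3, Zo 5 and Zo 3, not anything from Zermelo.

```python
from itertools import chain

def pair(s, t):
    """Zo 2.3: the set {s, t} whose only members are s and t."""
    return frozenset({s, t})

def big_union(family):
    """Zo 5: the set of everything that belongs to some member of family."""
    return frozenset(chain.from_iterable(family))

def separate(s, predicate):
    """Zo 3: the subset of s consisting of the members that satisfy predicate."""
    return frozenset(x for x in s if predicate(x))

def union(s, t):
    # First form {S, T}, then take the union of its members.
    return big_union(pair(s, t))

def intersection(s, t):
    # Separation with P(x) = 'x is a member of T'.
    return separate(s, lambda x: x in t)

def difference(s, t):
    # Separation with P(x) = 'x is not a member of T'.
    return separate(s, lambda x: x not in t)

S = frozenset({1, 2, 3})
T = frozenset({3, 4})
print(sorted(union(S, T)))         # [1, 2, 3, 4]
print(sorted(intersection(S, T)))  # [3]
print(sorted(difference(S, T)))    # [1, 2]
```

Note that intersection and difference only ever carve a subset out of S, which is the restriction that blocks Russell’s paradox.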
Zermelo’s system has since been tidied up in various ways, mainly by Fraenkel, and the amended system is usually referred to as ‘ZF’.

Von Neumann later proposed an axiom of Foundation, to rule out infinitely descending chains of sets like: x₁ ∈ x, x₂ ∈ x₁, ..., xₙ₊₁ ∈ xₙ, .... The axiom of foundation states that every non-empty set S must contain an element T, such that S and T have no common element. That prohibits the identification of a one membered set with its only member, since if x = {x} we should have x ∈ x and x would have no member disjoint from x. That consequence was unacceptable to Quine, who liked to avoid postulating abstract entities unless it was absolutely necessary. If we do identify a one membered set with its only member, we can construct an infinite descending chain by setting xₙ = x for all n. In the light of that simple example the idea of an infinite descending chain no longer seems counter intuitive.

When I first encountered the idea of an infinite descending chain I found it baffling because each set in the chain appeared to have no definite membership, because at least one of its members was another set with the same frustrating characteristic. However that feeling arose because I was treating the description of the chain as a definition of its members. Were the members of the chain defined in some other way, from which their relative positions in the chain followed, there need be nothing puzzling about the relationship.

I have not been able to discover enough of the background to the subject to find why infinite descending chains were considered objectionable. Was it that some proof emerged of the existence of an unwelcome chain? Or was it just that someone realised that a chain was a theoretical possibility not inconsistent with the basic Zermelo axioms, and recoiled from the mere possibility? The fact that the axiom had to be added as an afterthought underlines the weakness of our intuitions of sets.
Furthermore if infinite downward descent is undesirable, that seems to weaken the case for the (essential to Mathematics) infinite ascent of Zo 7.

Another addition, due to Fraenkel, was the Axiom of Replacement: if S is a set, then replacing the members of S by anything in the domain produces another set. So if S is a set, and R is a function with a unique value for every member of S, there is also a set T = {t: t = R(s), s ∈ S}. The Axiom of Replacement was introduced to allow the definition of more sets. Zo 7 guarantees one infinite set, Z, and replacement applied to Z gives more. In particular the union of all the sets in an infinite list is now available. Fraenkel was particularly interested in allowing the unions of certain infinite sets, the members of which were themselves infinite sets, and some distinctly counter-intuitive consequences concerning large infinite cardinals eventually emerged.

While Peano’s axioms could be said to follow from our idea of number because they agree with our intuitions, the same cannot be said for all the axioms of Zermelo set theory, and some of the proposed extensions to that theory seem even less intuitive.

Formalism

While Frege, Whitehead and Russell were trying to construct Mathematics within Logic, Hilbert worked to reduce Mathematics to a collection of formal systems, in which mathematical Logic was augmented by specifically mathematical axioms. The background to Formalism was axiomatic geometry, which was the starting point of Hilbert’s own work.

For two millennia Euclid’s axiomatization of geometry was accepted without challenge, but doubts arose during the nineteenth century. Mathematicians were particularly concerned about the parallel postulate which guarantees that, given a straight line L and a point P not in L, there is a unique line through P parallel to L.
Attempts to prove the parallel postulate from the other axioms failed, and it was eventually shown that it could not be proved, since replacing the postulate by other axioms did not give a contradiction but instead produced alternative geometries, in which there were either no parallel lines at all, or more than one line through P parallel to L. Some of those geometries were locally indistinguishable from the Euclidean.

A familiar example of a non-Euclidean geometry is spherical geometry, treating of figures on the surface of a sphere. In spherical geometry the place of straight lines is taken by the great circles, namely the circles with centres at the centre of the sphere. On a globe the lines of longitude are great circles through the poles, and the equator is also a great circle. The shortest distance between two points, measured in the surface of the sphere, lies along the great circle joining the points, so aeroplanes often fly along great circle routes. There is usually precisely one great circle joining two points in the surface of a sphere, except when the two points are at the opposite ends of a diameter, like the North and South poles, where there are infinitely many great circles joining them. The great circles are the nearest equivalent in spherical geometry to the straight lines of Euclidean Geometry. However, any two different great circles meet in two points, so there are no parallel lines in spherical geometry, and in a spherical triangle, formed by the arcs of three great circles, the sum of the angles depends on the area, and is in all cases greater than the 180 degrees which is the sum implied by the Euclidean axioms. If N is the North Pole, and A and B are two points on the equator, A at longitude 0° and B at longitude 90° W, the spherical triangle NAB has all three of its angles right angles, so that its angle sum is 270°.
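The dependence of the angle sum on the area can be made quantitative. The sketch below relies on Girard’s theorem, which is not stated in the text above and is brought in here as an outside fact: on a sphere of radius R a triangle of area A has angle sum 180° plus A/R² radians. The Earth radius and the small triangle’s area are illustrative assumptions.

```python
import math

def excess_degrees(area, radius):
    """Amount by which the angle sum exceeds 180 degrees (Girard's theorem)."""
    return math.degrees(area / radius ** 2)

def angle_sum_degrees(area, radius):
    """Angle sum of a spherical triangle of the given area."""
    return 180.0 + excess_degrees(area, radius)

R = 6371.0  # assumed mean radius of the Earth in kilometres

# Triangle NAB (pole plus a quarter of the equator) covers one eighth of the sphere.
octant_area = 4 * math.pi * R ** 2 / 8
print(round(angle_sum_degrees(octant_area, R), 6))

# A small triangle of 5000 square metres, expressed in square kilometres.
small_area = 0.005
print(excess_degrees(small_area, R))
```

The first figure comes out at 270 degrees, agreeing with the triangle NAB; the second is a few billionths of a degree, far below anything a surveyor could measure.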
Yet if we confine ourselves to a relatively small part of the earth’s surface, such as a town, we can treat the surface as approximately a plane and apply Euclidean geometry. If we mark out a triangle on the school playing fields and measure its angles we should be unlikely to find their sum significantly different from 180°; that does not imply that the earth is flat, just that the playing fields are small enough for plane geometry to be an adequate approximation. Similarly the fact that Euclidean geometry seems adequate for everyday purposes does not imply that it is universally true, just that it is adequate to describe the part of space we inhabit sufficiently accurately for our purposes.

In principle the non-Euclidean geometries could be tested by measuring the three angles of a triangle. The various non-Euclidean geometries imply sums differing from 180 degrees by quantities that increase with the area of the triangle. No appreciable deviation from 180 degrees has been observed in the conventional geometry of space, but every physical measurement is subject to some error and the best we can do is to show that the angle sum of every triangle we have actually been able to measure differs from two right angles by less than the margin of error in the measuring procedure.

So if we regard geometry as describing the space we live in, we can’t be sure that the parallel postulate is true, and even though we are not sure of that we can still describe our experiences perfectly. That refutes Kant’s claim that Euclidean geometry is a body of synthetic a priori truths, so called because they are true in every conceivable world, for we can conceive worlds in which Euclidean geometry would be false, even though we cannot be sure that ours is such a world.

There seems to be a choice between regarding Mathematics as a body of contingent truths describing the world, or as a body of necessary truths describing abstract entities analogous to Plato’s Forms.
Logicism provided one way of avoiding both those alternatives, but not the only way. Hilbert suggested another way by proposing that Mathematics might be developed as a study of formal systems without actually specifying any meaning for them.

Formal Systems and Interpretations

One could think of a formal system as a game played with symbols according to rules allowing us to progress from some combinations of symbols to others in certain ways. Such progressions correspond to mathematical proofs. It is possible to interpret such a game by setting up a correspondence between formulae and their transformation rules on the one hand, and some set of individuals and the rules for manipulating them on the other.

As an example take the propositional logic. In chapter 2 I introduced the logical symbols by saying what each symbol meant, but it would be possible just to tell someone the rules for constructing truth tables or proofs, without giving any interpretation. That would be like programming a computer to produce truth tables and to say ‘tautology’ or ‘contradiction’ where appropriate. We should then have an uninterpreted formal system.

When we construct a formal system it is because we intend to use it to represent some body of knowledge, and we therefore have in mind particular meanings for the various symbols involved, but formalists thought it essential that the rules for operating the system make no appeal to those meanings or to the intended application of the system.

Several quite different interpretations of the same system are usually possible; sometimes there are even several different interpretations that we find useful. For instance in the case of the propositional logic, the interpretation of the logical symbols that logicians had in mind when they originally invented the system is only one interpretation among many. A purely arithmetical interpretation is possible.
Let P, Q, R, ... be numbers taking either of the values {0, 1}. Let ¬P = 1 - P, P & Q = PQ, (P V Q) = P + Q - PQ, (P → Q) = 1 + PQ - P, and (P ↔ Q) = PQ + (1 - P)(1 - Q). The tautologies are then the formulae that equal 1 for all possible combinations of values of their variables, and the contradictions are the formulae equal to 0 for all values. It is also possible to interpret the truth functional logic as an algebra of sets, with & representing intersection and V representing union.

The axioms of geometry also admit of alternative interpretations. By inventing co-ordinate Geometry Descartes showed that they can be interpreted arithmetically.

The Properties of a Formal System

A formal system usually consists of:
(1) A set of symbols.
(2) Syntactic (grammatical) rules specifying which formulae are properly constructed - such formulae are called well formed formulae (wff’s for short).
(3) A set of wff’s called the axioms. If the set of axioms is finite they may be listed, but if they are infinitely many there must be a decision procedure (algorithm) that will determine, for any arbitrary wff, whether or not it is an axiom.
(4) A set of Rules of Inference that allow one to construct theorems. All axioms are theorems, and the rules of inference allow new theorems to be derived from ones already known.

The propositional calculus is a particularly simple formal system since there is a test (constructing the truth table) to find out which formulae are theorems. Such a test is called a decision procedure. If a formal system has such a decision procedure, it is possible to dispense with (4) by making all the valid formulae into axioms. That is what we were doing when we defined tautologies in terms of truth tables. Whitehead and Russell originally gave a set of five axioms and various rules of inference from which all the tautologies could be deduced.

Consistency

We always require systems to be free from contradiction.
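The arithmetical interpretation of the propositional logic given earlier, with variables taking the values 0 and 1, can be checked mechanically. The following is a minimal sketch: the function names and the label ‘contingent’ for formulae that are neither tautologies nor contradictions are my own choices, and a formula is represented simply as a Python function of its variables.

```python
from itertools import product

# The connectives as arithmetic on the values 0 and 1.
def NOT(p): return 1 - p
def AND(p, q): return p * q
def OR(p, q): return p + q - p * q
def IMPLIES(p, q): return 1 + p * q - p
def IFF(p, q): return p * q + (1 - p) * (1 - q)

def classify(formula, n_vars):
    """Evaluate formula on all 2**n_vars rows of 0s and 1s and classify it."""
    values = {formula(*row) for row in product((0, 1), repeat=n_vars)}
    if values == {1}:
        return "tautology"
    if values == {0}:
        return "contradiction"
    return "contingent"

print(classify(lambda p: OR(p, NOT(p)), 1))                         # tautology
print(classify(lambda p: AND(p, NOT(p)), 1))                        # contradiction
print(classify(lambda p, q: IMPLIES(AND(p, IMPLIES(p, q)), q), 2))  # tautology
print(classify(lambda p, q: IMPLIES(p, q), 2))                      # contingent
```

Because the check never asks what the symbols mean, it is in effect the uninterpreted game played on numbers instead of on truth values.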
One way of showing that a system is consistent is to produce a model (interpretation) of the system to show that it can be used to represent something that is clearly consistent. I shall give a detailed specification of a model of a formal system later, but for the moment a simple example should suffice to show what is involved. The arithmetical interpretation of the propositional logic in terms of calculations with the numbers 1 and zero showed that the propositional logic was consistent. To demonstrate consistency like that the model has to be simple enough for us to see that it is consistent, which is easiest if the interpretation is finite. However Peano’s system, and all the stronger systems developed to accommodate progressively more mathematics, have no finite models; they were after all constructed to guarantee the existence of infinitely many numbers.

However, in the absence of a simple interpretation there is another way to establish consistency. In a system that includes basic logic, a contradiction implies anything at all, so that in a contradictory system every wff will be a theorem. So to show consistency, it suffices to find one wff that is not a theorem. The strategy is to find some property satisfied by all the axioms, and preserved by all the rules of inference, and then to find some wff that does not possess that property. Often a convenient property is truth under some interpretation, but any property would do.

One might suppose that to prove the consistency of a formal system S should require some even stronger system S*, so that if the consistency of S is doubtful, that of S* must be even more dubious, making the proof of little value. However a consistency proof need consider only a few properties of a formal system, so even though S* may be stronger than S in the single respect of being able to prove the consistency of S, it might in other respects be much weaker, or so the Formalists hoped.
In the early twentieth century it was hoped that consistency proofs might be finitary in the sense “that the discussion, assertion, or definition in question is kept within the bounds of thorough-going producibility of objects and thorough-going constructibility of processes, and may accordingly be carried out within the domain of concrete inspection” (Hilbert and Bernays, Grundlagen der Mathematik). Hilbert and members of his school looked for proofs such that (1) they were constructive in the sense that any entities referred to could be produced for our inspection, and (2) all processes involved were guaranteed to be completed in a finite number of steps, that number always being within some bound that is known in advance.

Initially the formalists made encouraging progress, proving the consistency of the propositional logic and the first order predicate logic. In the case of the propositional logic we need consider only two truth values for each variable, and the truth table for any formula with n variables contains precisely 2ⁿ rows. That was how I defined truth functions, but some systems introduce truth functional logic axiomatically. Such systems can be shown consistent by showing first that every axiom has a truth table with a ‘T’ in every row, and second that the rules of inference preserve that property, so that anything deduced from a tautology is itself a tautology.

A similar procedure can be used for a subset of the predicate logic. Consider formulae involving just one place predicates. Consider for example a formula that contains two such predicates, as in

(1) [(∃x)(F(x)) & (∃x)(G(x))] → (∃x)(F(x) & G(x))

I shall show this subset of predicate logic is consistent by showing that that formula is not a theorem. In assessing the truth of the various components of that formula it suffices to consider individuals of just four different kinds.
Those that are both F and G, call them FG’s, those that are F but not G, call them Fg’s, those that are not F but are G, fG’s, and those that are neither F nor G, fg’s. If there are any FG individuals, every component proposition is true so (1) reduces to T → T, which is true. If all the objects in the domain are fg, every component proposition is false, so (1) reduces to F → F which is again true. However, if the domain contains no FG’s, but does contain both Fg’s and fG’s, (∃x)(F(x)) and (∃x)(G(x)) are both true, but (∃x)(F(x) & G(x)) is false, so (1) reduces to T → F which is false, so (1) is not a theorem.

On the other hand

(2) [(∀x)(F(x)) & (∀x)(G(x))] → (∀x)(F(x) & G(x))

is valid. It could only be false if (∀x)(F(x) & G(x)) were false while (∀x)(F(x)) & (∀x)(G(x)) were true, but if (∀x)(F(x) & G(x)) were false there must be at least one object that is not FG, and hence either Fg, or fG or fg, in which case at least one of (∀x)(F(x)) or (∀x)(G(x)) must be false, making (∀x)(F(x)) & (∀x)(G(x)) false so that the whole expression would be true.

In that example I abbreviated the procedure by grouping together interpretations that should strictly speaking have been considered separately. As there were four different types of object (FG, fG, Fg, and fg), a full treatment would have required consideration of fifteen different possibilities, since each of the four types could be either present or not present in the domain, giving 2⁴ significantly different types of domain of individuals, from which we must subtract one to exclude the possibility that none of the four types is present at all, which would imply an empty domain.

With a decision procedure for theoremhood, it is easy to prove consistency - we have already done so for the one place predicate logic by showing that formula (1) is not a theorem.
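The full treatment with fifteen domains is easily mechanised. The sketch below takes formula (1) to say that if something is F and something is G then something is both, and formula (2) to say that if everything is F and everything is G then everything is both; the representation of a domain as a list of (is F, is G) pairs is my own choice.

```python
from itertools import combinations

# The four kinds of individual, recorded as (is F, is G).
KINDS = {"FG": (True, True), "Fg": (True, False),
         "fG": (False, True), "fg": (False, False)}

def domains():
    """Yield every non-empty selection of kinds: 2**4 - 1 = 15 in all."""
    names = list(KINDS)
    for size in range(1, len(names) + 1):
        for chosen in combinations(names, size):
            yield [KINDS[name] for name in chosen]

def formula_1(domain):
    """If something is F and something is G, then something is both."""
    antecedent = any(f for f, g in domain) and any(g for f, g in domain)
    consequent = any(f and g for f, g in domain)
    return (not antecedent) or consequent

def formula_2(domain):
    """If everything is F and everything is G, then everything is both."""
    antecedent = all(f for f, g in domain) and all(g for f, g in domain)
    consequent = all(f and g for f, g in domain)
    return (not antecedent) or consequent

def is_valid(formula):
    """A formula is valid if it comes out true in every type of domain."""
    return all(formula(domain) for domain in domains())

print(len(list(domains())))  # 15
print(is_valid(formula_1))   # False
print(is_valid(formula_2))   # True
```

Formula (1) fails in exactly the domains that contain Fg’s and fG’s but no FG’s, which is the case picked out by hand above.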
No such decision procedure is available for logic with two place or more place predicates, for once we introduce a predicate such as L(x,y) = x loves y there are infinitely many different types of individual. There are people who love no-one, people who love one person, people who love two people, and so on, giving an infinity of types even without considering more complicated cases like people who love three people, two of whom reciprocate their feelings, and are loved by seven other people none of whose affections they return.

However we can show the consistency of the predicate logic, even with multi-place predicates, as follows. Simplify all formulae by omitting all quantifiers and variables, and treat all the predicate letters as if they were propositional variables. It is possible to show that when so treated, every valid formula is turned into a tautology. For instance (∀x)(P(x)) reduces to P, which is not a tautology, so (∀x)(P(x)) is not a valid formula. The tautologies so obtained are not, of course, equivalent to the original formulae. Some formulae that are not valid will reduce to tautologies, so this procedure is not a method of establishing that individual formulae are valid. For instance the formula:

(1) [(∃x)(F(x)) & (∃x)(G(x))] → (∃x)(F(x) & G(x))

which I discussed above yields a tautology, even though it is not valid. The important thing is that every valid formula will be reduced to some tautology. Therefore any formula that does not reduce to any tautology is not valid. As there are some formulae that don’t reduce to tautologies, there are some formulae that can thus be shown not to be provable, so the system is consistent. In particular there cannot be two provable formulae related in the same way as F and ¬F, since no two formulae so related could both reduce to tautologies. If one reduced to a tautology T, the other would reduce to ¬T, which would be a contradiction.

Proving the consistency of arithmetic turned out to be much harder.
Consistency proofs of considerable complexity were produced for some fragments, notably for the arithmetic of natural numbers using addition but not multiplication, but the enterprise came to a halt when Gödel proved that any system that can prove its own consistency must in fact be inconsistent. To be more precise he proved that if, in a system S, it is possible to construct a formula F, such that F is true if and only if S is consistent, then F is provable in S if and only if S is inconsistent. He also showed how to construct such a formula F for arithmetic. At first glance that might not appear particularly worrying. Consistency proofs always had been performed outside the system being proved consistent. In any inconsistent theory all formulae are provable, so if an inconsistent theory can express its own consistency it will also be able to prove it. So a consistency proof constructed within a theory would not provide any reason for believing that the system actually is consistent. The discovery that many systems can only prove their own consistency if they are inconsistent was both surprising and interesting, but even without that discovery it should have been plain that, if the consistency of a theory was in any doubt, a consistency proof in that same theory would have been worthless. Oddly enough Formalists had hoped to construct consistency proofs using only very simple inferences which would be a subset of arithmetic. Hence the proofs the formalists sought would in the case of arithmetic have been capable of incorporation in the system being proved consistent and therefore of the type that should have been perceived to be useless even before Gödel rammed the point home. Thus theories such as Peano’s and stronger theories can only be proved consistent in theories in all respects stronger than themselves, which makes the formalist program seem no longer very interesting.
Algorithms and Computers

A formal system is said to be effectively axiomatisable if there is a decision procedure for deciding whether or not an arbitrary formula is an axiom, and a decision procedure for deciding whether or not any proposed inference conforms to the rules of the system. A satisfactory decision procedure must be one that can be relied upon to give a definite answer in every case after only a finite sequence of steps. Such a procedure is an algorithm.

Church’s Thesis (named after Alonzo Church (1903-1995)) says that the problems that can be solved by an algorithm are precisely the problems that can be solved by computer. No one has been able to offer a proof of Church’s Thesis, and it’s hard to know where one would start, but the thesis is thought to be highly plausible, especially as it has been shown that the set of computable functions is the same as the set of recursive functions.

Logicians usually discuss computing in terms of an idealised rather simple computer called a Turing Machine, named after Alan Turing. I’m not sure whether anyone has ever actually made a Turing machine, but such machines can be emulated on the much more complicated computers that we actually do use. It has been shown that any recursive function can be evaluated on some Turing machine.

Gödel and Incompleteness

During the early decades of the 20th century Mathematicians still hoped that the whole of Mathematics might eventually be represented by a formal system until, in 1931, Kurt Gödel proved that this was impossible. He showed that any effectively axiomatisable system capable of expressing even a substantial part of Mathematics must either be inconsistent, or else incomplete, in the sense that its notation must permit the expression of a true proposition that cannot be proved in the system.
Inconsistency is of course completely unacceptable, since, as we’ve already seen, a contradiction entails anything whatever, so that in an inconsistent system every wff would be a theorem and there would be no distinction between truth and falsehood. We therefore have to accept that any formal system for Mathematics will be incomplete.

Gödel’s strategy was to construct formulae that had a sort of double meaning. Within the formal system they were mathematical statements, asserting that a certain equation has no solutions, but considered from outside the system they could be seen as asserting their own unprovability.

Gödel’s first step was to define what has come to be called a Gödel numbering of the formal system, in which every symbol is assigned a number, and there is a rule such that, given any sequence of symbols, a number can be calculated for the sequence from the numbers of its components. No two sequences of symbols have the same Gödel number, so any Gödel number can be decoded to give the original sequence.

Gödel then defined a proof predicate P(x, y) to represent provability in the following way. If A is a series of formulae that constitute a proof of some formula B, and if the Gödel numbers of A and B are g(A) and g(B) respectively, then P[g(A), g(B)] is true if and only if A is a proof of B.

‘Formula B is provable’ can then be represented by (∃x)P(x, g(B)) and ‘Formula B is not provable’ by ¬(∃x)P(x, g(B)).

Define ‘Formula(x)’ = the formula with Gödel number x.

Define the diagonalisation of a formula A as: (∃x)[A = Formula(x) & A]

Suppose now that ‘A’ contains ‘x’ as its only free variable; the diagonalisation of A then asserts that A applies to its own Gödel number. Gödel showed that there is a recursive function n → diag(n) that calculates the Gödel number of the diagonalisation of the formula A that has n as its Gödel number.
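The idea of a Gödel numbering can be made concrete with a small sketch. The text does not say which rule is used to combine the numbers of the components, so the code below assumes one familiar choice: the k-th symbol of a sequence contributes the k-th prime raised to the power of that symbol's number. The symbol codes in SYMBOL_CODES are hypothetical, chosen only for illustration.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for short sequences)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

# Hypothetical symbol codes; any assignment of distinct positive integers would do.
SYMBOL_CODES = {"0": 1, "'": 2, "+": 3, ".": 4, "=": 5, "(": 6, ")": 7, "x": 8}
CODE_SYMBOLS = {code: symbol for symbol, code in SYMBOL_CODES.items()}

def godel_number(symbols):
    """Encode a sequence of symbols as a product of prime powers."""
    g = 1
    for p, symbol in zip(primes(), symbols):
        g *= p ** SYMBOL_CODES[symbol]
    return g

def decode(g):
    """Recover the sequence from a number produced by godel_number."""
    symbols = []
    for p in primes():
        if g == 1:
            break
        exponent = 0
        while g % p == 0:
            g //= p
            exponent += 1
        symbols.append(CODE_SYMBOLS[exponent])
    return "".join(symbols)

print(godel_number("0=0"))          # 2**1 * 3**5 * 5**1
print(decode(godel_number("0=0")))
```

Because every number has only one factorisation into primes, two different sequences cannot receive the same number under this scheme, which is the property the argument relies on.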
Now consider ¬(∃k)P(k, diag(x)), which asserts that diag(x) is not the Gödel number of a provable formula.

Let M = g(¬(∃k)P(k, diag(x))), so that M is the Gödel number of ¬(∃k)P(k, diag(x)).

So Formula(M) = ¬(∃k)P(k, diag(x)), and Formula(diag(M)) amounts to ¬(∃k)P(k, diag(M)), so that diag(M) is the Gödel number of a true formula if and only if diag(M) is not the Gödel number of a provable formula. Formula(diag(M)) is provable only if it is false, so that it is provable only in an inconsistent system. Yet if it is not provable, it is true. So any consistent system powerful enough to define diag must be incomplete.

An important consequence of Gödel’s incompleteness theorem is that we cannot define Mathematics by giving a formal system, since no effectively axiomatisable system can contain the whole of Mathematics.

Gödel’s incompleteness theorem applies to any effectively axiomatisable system at least as strong as a certain system Q, where Q is defined as the predicate logic combined with the following mathematical axioms:

Q1 (x)(y)(x’ = y’ → x = y)
Q2 (x)(0 ≠ x’)
Q3 (x)(x ≠ 0 → (∃y)(x = y’))
Q4 (x)(x + 0 = x)
Q5 (x)(y)(x + y’ = (x + y)’)
Q6 (x)(x.0 = 0)
Q7 (x)(y)(x.y’ = (x.y) + x)

The notation is the same as that used in Peano’s axioms. x’ means the successor of x; ’ is the successor operation that generates any number except zero from its predecessor. Q is weaker than Peano’s system as it does not include mathematical induction, but it does include all the other axioms of Z and also includes Q3, which is a theorem of Z but which, in the absence of the induction axioms, cannot be proved from the other axioms in Q.
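Axioms Q4 to Q7 can be read as recipes for calculating sums and products one successor at a time. The following is a minimal sketch of that reading, taking Q6 in its usual form x.0 = 0. It uses ordinary Python integers to stand in for the numerals 0, 0’, 0’’, ..., and the function names are my own, not part of Q.

```python
def successor(n):
    """The successor operation, written x' in the axioms."""
    return n + 1

def add(x, y):
    """Compute x + y using only Q4 and Q5."""
    if y == 0:
        return x                          # Q4: x + 0 = x
    return successor(add(x, y - 1))       # Q5: x + y' = (x + y)'

def mul(x, y):
    """Compute x.y using only Q6 and Q7."""
    if y == 0:
        return 0                          # Q6: x.0 = 0
    return add(mul(x, y - 1), x)          # Q7: x.y' = (x.y) + x

print(add(2, 3), mul(3, 4))
```

Each call unwinds a particular sum or product into finitely many applications of the axioms, which gives some feel for how individual calculations can be carried out in Q even though general laws, which would need induction, are out of its reach.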
All the individual arithmetical facts are provable in Q, but the commutative rules for addition and multiplication are not, so Q is a relatively weak system, but it is strong enough to define all the recursive functions and hence powerful enough to support the definitions of the functions Gödel used in his proof. Gödel’s theorem has no direct bearing on the parts of mathematics with which most people are familiar. Simple arithmetical truths, of the form a + b = c, and a x b = d can all be proved, they all follow from Peano’s axioms, and indeed from the weaker system Q. All the elementary algebraic formulae follow from Peano’s axioms. It is only algebraic propositions about unspecified numbers that are sometimes undecidable, and of those only propositions involving both addition and multiplication. There is a complete system for numbers with addition but no multiplication, and another complete system for numbers with multiplication but no addition. The Gödel sentences are in fact all true. The undecidable sentences for which Gödel provided a method of construction assert that a certain equation has no solutions. The equation is an eighth degree polynomial in eighty variables, with at least one of its coefficients some thousands of (decimal) digits long. Different formal systems will give rise to different equations, but all are of the form just specified. If such an equation actually has a solution, its solubility is provable in Q, just by exhibiting the solution and verifying by calculation that it is correct. To do that requires only the ability to perform arithmetical calculations for which even system Q suffices. Hence the proposition that an equation has no solution could be disproved if it were false, so, if undecidable, it must be true. Of course Gödel’s result does not show that there are mathematical truths that cannot be proved at all. What it shows is that, given any system S for arithmetic, there is some truth that cannot be proved in S. 
It is important to note that Gödel’s proof is constructive; that is, it shows how, working in a metalanguage, one can construct for any effectively axiomatisable formal system an undecidable but true proposition. Once we have the undecidable proposition, call it U, we can add it to our formal system as an additional axiom, giving a new, stronger system in which U is (trivially) a theorem. In that stronger system U itself would no longer be equivalent to the assertion of its own unprovability, but there would instead be some other proposition that would be undecidable in the stronger system in the same way that U was undecidable in the weaker.

Since Gödel proved the theorem that bears his name, other mathematicians have produced more undecidable sentences that, while still rather abstruse, are less remote from ordinary mathematical practice than the original Gödel sentences. There are many references to such sentences on the World Wide Web, but I have been unable to obtain sufficient details to give examples here.

Mathematical Truth

Since no effectively axiomatisable formal system can be complete, Mathematical truth cannot be identified with provability in any particular formal system, so we need an alternative definition. The alternative is truth under any standard interpretation. To explain that we must start with interpretation.

The use of the word is not entirely consistent. There is a loose sense in which it means giving some meaning to the symbols of a system. The looser sense is applicable when logicians demonstrate the mutual independence of axioms by constructing interpretations in which all but one of the axioms are true and the remaining axiom is false. In the present context ‘interpretation’ is used more strictly to mean giving the symbols an interpretation that makes all the theorems true.
In that stronger sense an Interpretation of a formal system is:

(1) a domain of individuals
(2) an assignment of one of the individuals to every individual constant, or, as it is sometimes put, the assignment of a designation for every name
(3) an assignment of a truth value (TRUE or FALSE) to every propositional constant
(4) an assignment to each function letter of a function mapping (some) members of the domain to members of the domain
(5) a characteristic function for each predicate letter, chosen so that it provides an assignment of truth conditions to every predicate letter in such a way that:
  (a) for each one place predicate letter F, the interpretation specifies for which individuals F(x) is true
  (b) for each two place predicate letter G, the interpretation specifies for which ordered pairs of individuals G(x, y) is true
  (c) and so on for three or more place predicate letters.

Thus an interpretation of (x)[F(x) → G(x, x)] → [F(a) → G(a, a)] must assign some particular individual to a, and must settle, for every individual x, whether F(x) is true or false, and for every pair of individuals x, y whether G(x, y) is true or false.

A theory is a set of sentences that includes the logical consequences of every member of the set - in other words a set of sentences closed under logical consequence.

An interpretation satisfies a sentence if that sentence is true under the interpretation.

A model of a sentence is an interpretation that makes the sentence true.

A model of a theory makes every axiom and theorem true.

P entails Q if Q is true under every interpretation that makes P true.

A formula is valid if it is true under every interpretation. A valid formula will represent a logically true proposition.
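For a small finite domain the ingredients of an interpretation can be listed exhaustively, which gives a way of trying the example formula against every interpretation over that domain. The sketch below does so for a two-element domain; the representation (dictionaries for the characteristic functions of F and G) is my own choice for illustration. Passing such a check is only suggestive, since validity requires truth under every interpretation, including those with larger or infinite domains.

```python
from itertools import product

def implies(p, q):
    """Truth table for the conditional."""
    return (not p) or q

def formula(domain, a, F, G):
    """(x)[F(x) -> G(x, x)] -> [F(a) -> G(a, a)] under one interpretation."""
    antecedent = all(implies(F[x], G[(x, x)]) for x in domain)
    consequent = implies(F[a], G[(a, a)])
    return implies(antecedent, consequent)

def interpretations(domain):
    """Every choice of designation for a and of truth conditions for F and G."""
    pairs = [(x, y) for x in domain for y in domain]
    for a in domain:
        for f_values in product([False, True], repeat=len(domain)):
            F = dict(zip(domain, f_values))
            for g_values in product([False, True], repeat=len(pairs)):
                G = dict(zip(pairs, g_values))
                yield a, F, G

domain = [0, 1]
all_cases = list(interpretations(domain))
true_everywhere = all(formula(domain, a, F, G) for a, F, G in all_cases)
print(len(all_cases), true_everywhere)
```

With two individuals there are 2 designations for a, 4 characteristic functions for F and 16 for G, so 128 interpretations in all.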
We now have a definition of logical truth, but the definition does not always suffice to determine whether a particular sentence is valid or not, though it is sometimes possible to settle the logical status of a sentence from first principles by considering possible interpretations.

To be valid a sentence must be true under all interpretations, so we can show that a sentence is not valid by finding just one interpretation that makes it false. For example (∃x)(F(x)) → (x)(F(x)) can be shown invalid using an interpretation with just two elements - call them a and b. Let F(a) be true and F(b) be false. Then (∃x)(F(x)) is true but (x)(F(x)) is false, so that the formula reduces to T → F, which is false.

However, while falsehood under just one interpretation establishes invalidity, to prove that a formula is valid requires that we show that it is true for every interpretation. Since there are infinitely many possible interpretations we can do that only in a few special cases. In simple cases it may be possible to group the infinitely many interpretations into finitely many classes, so that all the interpretations in the same class assign the same truth values to every sentence. As we’ve already seen, that can be done for the predicate logic restricted to one place predicates where, if there are n predicates, the number of significantly different interpretations is 2^(2^n) - 1.

The Standard Interpretation

That is the interpretation in which the symbols are interpreted in their ordinary mathematical senses, with ‘0’ representing zero, 0’ representing 1, + representing addition and . representing multiplication. Mathematical truth is then truth for every standard interpretation.

Non-Standard Interpretations

Because any effectively axiomatised system that contains Q is incomplete, there will be mathematical truths that cannot be proved in it.
It therefore has models in which the sentences corresponding to those unprovable truths are false. Such a model is called a non-standard model. Non-standard models contain extra elements in addition to the natural numbers. Even in a system as weak as Q, the axioms guarantee the presence of 0, 0’, 0’’, ... so something like the natural numbers will be present in any model of the axiom set, but there is nothing to prevent other elements appearing too.

Frege’s Ancestral Relation

Frege held that it was important to construct a definition of number that did not apply to anything except the natural numbers, and proposed to eliminate non-numbers by specifying that nothing is a natural number unless it can be reached in a finite number of steps, starting from zero. That is easily said in ordinary language, but not so easily formalised. Frege developed what is generally referred to as ‘Frege’s Ancestral Relation’.

Suppose we want to analyse ‘x is an ancestor of y’ in terms of ‘x is a parent of y’.

‘x is an ancestor of y’ means: ‘y has all those properties that apply to x and are always transmitted to their children by anyone who has children’.

Analogously we could in arithmetic define ‘n is a descendant of zero’ as ‘n has all the properties that apply to zero and are such that, if they apply to x, also apply to the successor of x’, i.e.

(P)([P(0) & (x)(P(x) → P(x’))] → P(n)) ....(1)

where P represents a property.

‘n is a natural number’ can then be written:

(n = 0) V (P)([P(0) & (x)(P(x) → P(x’))] → P(n))

‘n is zero or n has all the properties of zero that, if possessed by any number, are also possessed by its successor’.

Since (1) makes an assertion for arbitrary n it implies

(P)[(P(0) & (x)(P(x) → P(x’))) → (x)(P(x))] ....(2)

which is equivalent to a single axiom for mathematical induction.
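On a finite domain the quantification over all properties in Frege’s definition can be carried out by brute force, treating each subset of the domain as a property. The sketch below does that for a small, entirely hypothetical family tree. It is meant only to show the shape of the definition, since the interesting case, the natural numbers, has infinitely many properties to consider. Note that on the wording quoted above x counts among its own descendants, because x has every property that x has.

```python
from itertools import combinations

def subsets(domain):
    """Every subset of the domain, each standing for one property."""
    items = list(domain)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)

def hereditary(prop, children):
    """True if the property passes from each member to all of its children."""
    return all(child in prop for x in prop for child in children.get(x, ()))

def frege_descendants(x, domain, children):
    """Members of the domain that have every hereditary property x has."""
    result = set(domain)
    for prop in subsets(domain):
        if x in prop and hereditary(prop, children):
            result &= prop
    return result

# A hypothetical family tree: parent -> list of children.
children = {"Ann": ["Bob", "Cy"], "Bob": ["Di"], "Eve": ["Flo"]}
domain = {"Ann", "Bob", "Cy", "Di", "Eve", "Flo"}

print(sorted(frege_descendants("Ann", domain, children)))
```

Eve and Flo are excluded because there is a hereditary property, namely being one of Ann, Bob, Cy or Di, that Ann has and they lack.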
If, in Peano’s system, we replace the induction rules by (2) we have specified that all natural numbers shall be successors of zero, and ruled out any non-standard models. Hence every mathematical truth must be true in every model of the revised axiom system, which is therefore complete. That may seem like a ‘way round’ Gödel’s incompleteness theorem, but it isn’t. Gödel proved that no system adequate for arithmetic can be effectively axiomatisable, consistent and complete. As the modification of Peano’s system is complete and adequate for arithmetic it must be either not effectively axiomatisable, or inconsistent. Its axioms are satisfiable in arithmetic so it is not inconsistent, hence it is not effectively axiomatisable - there cannot be any algorithm for determining, of any arbitrary sequence of formulae, whether or not that sequence is a valid proof.

This extension of Peano’s system is said to be second order because it involves quantification over properties of numbers (the values of P) rather than just over numbers. Z on the other hand is a first order system.

This seems the right moment for a short digression on second order logic. In second order logic it is possible to define equality. a = b may be defined as (P)(P(a) ↔ P(b)), meaning that a and b name the same individual if all properties that apply to one apply to the other. That implies that if two names refer to distinct individuals there must be some way of distinguishing them through some predicate true of one but false of the other. That is what is asserted by the second order sentence:

a ≠ b → (∃P)(P(a) & ~P(b))

Yet what is allowed as a possible value of ‘P’ in the context of that sentence? If we allow P(x) = (x = a) that seems to make the assertion vacuous, yet there is no clear guide to what might be permitted.
Another strange second order sentence is one that asserts that any two individuals have some common property:

(a)(b)(∃P)(P(a) & P(b))

Would a possible value of P be the property defined by P(x) = [(x = a) V (x = b)]? I’m reminded of a type of joke popular with small children and the manufacturers of Christmas crackers. Such jokes often take the form ‘why is a so-and-so like a such-and-such’ for values of ‘so-and-so’ and ‘such-and-such’ which do not suggest any striking likeness. For instance ‘Why is an elephant like a lawnmower?’, answer ‘neither of them is a sewing machine’.

For another way of looking at the matter, let V be the set of Gödel numbers of truths. Then the incompleteness theorem shows that V cannot be defined in arithmetic - there is no calculation that will determine the members of V. On the other hand the set {V}, of which V is the only member, can be defined in arithmetic, so that we can at least say what it is we can’t define. Also, for any value of n, Vn can be defined in arithmetic, where Vn is the set of Gödel numbers of all mathematical truths that can be expressed with the use of not more than n of the logical symbols (&, V and the other connectives and quantifiers). Thus it is possible to have a system capable of proving all theorems of complexity less than some specified limit.

Categorical Systems

There is another way of approaching attempts to formalise Mathematics. If the aim is to use a formal system to define the number system, that requires a formal system which is satisfied by the numbers, but by nothing else. So we should expect to get a formal system with only one interpretation. Such a system is said to be categorical. A consequence of Gödel’s incompleteness theorem is that no first order formal system adequate to express arithmetic is categorical. However, that is more complicated than it sounds.
We need to clarify the concept of a formal system’s having only one interpretation, since there is one sense in which a system that has one interpretation must also have others. For instance any system that is satisfied by the natural numbers 0, 1, 2... and the binary operations ‘+’ and ‘x’ will also be satisfied by the numbers 0, I, II, III,... and the operations ‘ADD’ and ‘MULTIPLY’, so we have to rule out alternative interpretations that differ only in notation. The mathematicians’ way of doing that is by using the idea of an isomorphism.

Roughly speaking, two models I1 and I2 of a formal system S are isomorphic when there is a one to one correspondence both between the individuals and between the functions of I1 and I2, such that if function f1 and individual a1 ∈ I1 correspond to function f2 and individual a2 ∈ I2 then f1(a1) ∈ I1 corresponds to f2(a2) ∈ I2, and similarly for functions of more than one variable, and if a sentence S1 in I1 corresponds to a sentence S2 in I2 then S1 and S2 have the same truth value.

If all the models of a formal system are isomorphic the system is said to be categorical.

At this point a complication arises, for in that sense of the word no system with an infinite model is categorical. According to the Löwenheim-Skolem theorem any consistent theory with a denumerable model also has a non-denumerable model, and any consistent theory with a non-denumerable model has a denumerable model too, but models of different cardinality cannot be isomorphic since, by the definition of ‘different cardinality’, there is no one to one correspondence between them.

Mathematicians therefore settled for the more modest aim that all the denumerable models of a theory should be isomorphic, and yet even that modest aim cannot be achieved by any consistent first order axiomatisation of the number system.
The difficulty is that we want to say that the natural numbers are the numbers that can be obtained by starting from zero and counting, 0, 0’, 0’’ ... and those numbers only, but in first order logic we cannot express ‘those numbers only’. As we noted above, to say that requires

(P)[(P(0) & (x)(P(x) → P(x’))) → (x)(P(x))]

which involves reference to ‘all properties…’ and is therefore second order logic.

Incompleteness does not apply to real number arithmetic, only to the arithmetic of the natural numbers - the notoriously difficult Theory of Numbers, so one might say that incompleteness is a sign of the inadequacy of the natural numbers as opposed to the reals. Another way of looking at it is that it shows that the power of a mathematical theory to express problems is always greater than its power to solve them.

Infinite Sets, Cardinals and Ordinals

Perhaps the oddest part of Mathematics is that dealing with infinite numbers. I shall outline some of the problems about the infinite as a preliminary to examining the competing theories concerning the nature of Mathematics.

While Frege was working on the foundations of Mathematics, Cantor was developing a theory of infinite numbers. From time to time earlier mathematicians had considered infinite aggregates, but had been discouraged by apparent contradictions.

In the case of infinite numbers there is even a difficulty about defining equality. We learn to use ‘equality’, ‘greater’ and ‘less’ when dealing with finite numbers and so become accustomed to using two rules. (1) We can establish the number of members in each of two sets by counting each and comparing the results to show whether their numbers are equal or unequal. (2) Alternatively we can pair off members of the two sets and see if any elements are left over.
If we are comparing the number of oranges in one box with the number of lemons in another, and we pair off an orange with a lemon until we run out of one or the other, and if we find that when we run out of oranges there are still some lemons left, that shows there are more lemons than oranges. In the case of infinite sets we cannot simply count their members, because the process would never end, so we are left with pairing, but there is still a problem.

In general terms, finite sets satisfy the following:

(1) S and T have the same number of members if the members of S can be paired one-one with the members of T.
(2a) If S contains every member of T and also some additional elements, then S is larger than T; for instance {1, 2, 3, 4} is larger than {1, 2, 3}.
(2b) If every member of T can be paired one-one with the members of some proper subset of S, S is larger than T. (A is a proper subset of S if every member of A belongs to S, but some members of S do not belong to A, so the set of mice is a proper subset of the set of mammals.)

However in the case of infinite sets (1) and (2) cannot both be true.

For instance consider the set N = {1, 2, 3, ...} of natural numbers and the set E = {2, 4, 6, ...} of even numbers. The function d: n → 2n, n ∈ N maps N onto E and the inverse function h: n → n/2, n ∈ E maps E onto N, so every even number can be paired with a different natural number, and every natural number can be paired with a different even number.

1    2
2    4
3    6
4    8
5    10
6    12
7    14
8    16
...  ...

So rule (1) suggests there are just as many even numbers as numbers, while rule (2) suggests there should be fewer even numbers as they are a proper subset of the natural numbers.

Cantor proposed to reject rule (2) and specified that two sets are of the same cardinality if they satisfy (1).
All it is possible to save of rule (2) is the weaker rule that, if A is a subset of S, or can be put in one to one correspondence with a subset of S, then (cardinality of A) ≤ (cardinality of S).

The failure of rule (2) is one way of defining an infinite set. A set is infinite if its members can be placed in one-one correspondence with the members of one of its own proper subsets (not, of course, with just any one of its own proper subsets).

A surprising result is that the set of rational numbers (the set containing all the fractions and whole numbers) has the same cardinality as the set of natural numbers. That may be proved as follows.

Establishing a one to one correspondence between the members of any set S and the natural numbers is equivalent to arranging the members of S in a sequence. Once so ordered they can be numbered 0, 1, 2,... so we need to order the rational numbers.

Notice first that we cannot order the rationals in that way by arranging them in numerical order. For suppose we try, starting from zero and working upwards. What rational number comes next after zero? Not 1/2, for there are lots of positive fractions less than 1/2. For the same reason not 1/4 nor 1/100 nor 1/1000000. In general, for any positive fraction p/q, there is a smaller positive fraction, p/(2q), that is less than p/q but still greater than zero.

We must therefore look for another way of ordering the rational numbers. For simplicity I shall consider just the positive rational numbers together with zero.

Suppose first that the rational numbers are reduced to their lowest terms so that each is expressed in the form p/q, where p and q have no common factor greater than 1. The whole numbers, which are included in the rational numbers, will be represented as fractions with denominator 1, and zero will be represented by 0/1.

Arrange the rationals in order as follows. First group together all those where p + q = n for each value of n.
Arrange the groups in ascending order of the value of n, and within each group arrange the numbers in ascending order of size, so

n=1 gives just 0/1
n=2 gives 0/2 and 1/1, but we delete 0/2 since it is a more complicated equivalent of 0/1
n=3 gives 0/3 (deleted), 1/2, and 2/1
n=4 gives 0/4 (deleted), 1/3, 2/2 (deleted because equal to 1/1) and 3/1

Our final list is then:

0/1, 1/1, 1/2, 2/1, 1/3, 3/1, 1/4, 2/3, 3/2, 4/1...

so that 0/1 is mapped to zero, 1/1 to 1, 1/2 to 2, 2/1 to 3, 1/3 to 4, and so on.

Any set with the same cardinality as the natural numbers is called denumerable or countable. The members of a denumerable set can be enumerated, that is, they can be arranged in order with a first, second, third... I have just shown that the rational numbers are denumerable. So are the algebraic numbers.

The real numbers (all the numbers that can be expressed by finite or infinite decimal expansions) are not denumerable. Notice first that, as with the rational numbers, we certainly can’t count the reals by arranging them in order of size. But that on its own does not rule out there being some other way of enumerating them. Cantor proved that that cannot be done by using his famous diagonal procedure.

His strategy was proof by contradiction. He supposed there is an enumeration of at least some of the reals, and then showed that if there were it would always be possible to find a real number that does not appear in that enumeration.

For simplicity consider just the real numbers 0 < r < 1. Let each number in the enumeration be written as an infinite decimal. Let the nth real in that enumeration be Rn.

We can now construct a number D that is not included in the enumeration. We make sure of that by choosing D so that its nth decimal place differs from the nth decimal place of the nth number in the enumeration. There are many ways of doing that; the following is an example. Define D as follows.
If the digit in the nth decimal place of Rn is 7, let the digit in the nth decimal place of D be 3, otherwise let the digit in the nth decimal place of D be 7. D therefore differs from every real number in the enumeration. So if the proposed enumeration begins:

0.156714...
0.07239...
0.22576...

D begins 0.737... The choice of its first three digits ensures that it differs from each of the first three numbers in the proposed ordering of the reals.

Thus for any enumeration of a set of real numbers there is a real number that is not included, hence the reals cannot be enumerated. On the other hand the natural numbers can be mapped onto a subset of the reals by the identity mapping because the natural numbers form a subset of the reals. We therefore say that the reals are of greater cardinality than the natural numbers.

The cardinality of the natural numbers is referred to as ℵ0, where ℵ is Aleph, the Hebrew letter A. The cardinality of the reals is called c, for ‘continuum’, because the number of reals is the number of points on the real number line or continuum. It also equals the number of points in two dimensional, three dimensional, or any finitely dimensioned space. The set of subsets of the reals is of a still higher cardinality called f.

Comparability of cardinal numbers

Notation

I refer to the cardinality of set S as n(S), although that is not the standard notation. The standard notation uses symbols that I cannot generate with the fonts and styles available in my word processor, so that I would have to create them as graphic elements by using the Microsoft Equation Editor. That in turn leads to problems in formatting the text so I shall not use that notation.

If S can be mapped 1-1 onto a subset of T but not vice versa, we write n(S) < n(T), and say that T has greater cardinality than S - it has more members.
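Two constructions from the discussion above, the listing of the rationals by the value of p + q and the choice of the digits of the diagonal number D, are mechanical enough to sketch in code. The function names are my own. The diagonal function can of course only be tried on a finite list of finitely many digits, whereas Cantor’s argument concerns a complete infinite enumeration.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd

def rationals():
    """Yield 0/1, 1/1, 1/2, 2/1, 1/3, 3/1, ... grouped by p + q = n."""
    for n in count(1):
        for p in range(n):              # ascending p gives ascending size
            q = n - p
            if gcd(p, q) == 1:          # delete fractions not in lowest terms
                yield Fraction(p, q)

def diagonal_digits(rows):
    """Digits of D: 3 where the nth row has 7 in its nth place, otherwise 7."""
    return "".join("3" if row[n] == "7" else "7" for n, row in enumerate(rows))

first_ten = list(islice(rationals(), 10))
print([str(r) for r in first_ten])

# The digits after the decimal point of the three numbers in the example.
print("0." + diagonal_digits(["156714", "07239", "22576"]))
```

The test gcd(p, q) == 1 performs both deletions described in the text: it removes 0/2, 0/3, ... because gcd(0, q) = q, and it removes fractions such as 2/2 that are not in lowest terms.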
Using the Axiom of Choice it is possible to prove that for any two sets S and T, precisely one of the following is true: n(S) < n(T), n(S) = n(T), n(T) < n(S), so that sets can be simply ordered according to their cardinality. It follows that if S can be mapped 1-1 to a subset of T, either n(S) < n(T) or n(S) = n(T), so that if S can be mapped 1-1 to a subset of T and also T can be mapped 1-1 to a subset of S, n(S) = n(T). Assuming the Axiom of Choice, that provides a useful indirect way of showing that two sets have the same cardinality without actually finding a 1-1 mapping from the whole of one set to the whole of the other.

Power sets

If a finite set S has a members, the number of subsets of S is 2^a > a.

That notation is extended to infinite sets. If a set S has cardinality a, the set of subsets of S is called the power set of S and the cardinality of the power set is written 2^a.

2^a > a holds for infinite cardinals as well as for finite cardinals. The power set of S is in all cases of higher cardinality than S itself.

The argument for 2^a > a is as follows. Suppose that there were a 1-1 mapping from the power set to S. Let St denote the member of the power set that is mapped to the element t ∈ S, so that St is a subset of S.

Now define S* = {x: x ∉ Sx}, so S* is the set of elements that do not belong to the subset with which they are paired.

Note that S* always exists; if there is no x in S satisfying x ∉ Sx, S* is the empty set. However, S* cannot be mapped to any member of S, since if it were mapped to some element a, S* = Sa, and we have the contradiction a ∈ Sa ↔ a ∈ S* ↔ a ∉ Sa, so the mapping envisaged is impossible.
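For a small finite set the argument can be checked exhaustively: however the elements of S are paired with subsets, the set S* of elements lying outside their own partner is never among the subsets used. The sketch below tries every pairing for a three-element set. The names are my own, and a finite check illustrates the argument rather than replacing it, since the argument itself covers sets of any size.

```python
from itertools import combinations, product

def power_set(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(combo)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)]

def diagonal_set(pairing):
    """S* = {x : x not in S_x}, where pairing[x] is the subset paired with x."""
    return frozenset(x for x, subset in pairing.items() if x not in subset)

S = {0, 1, 2}
elements = sorted(S)
subsets = power_set(S)

# Try every way of pairing each element of S with some subset of S.
missed_every_time = all(
    diagonal_set(dict(zip(elements, choice))) not in choice
    for choice in product(subsets, repeat=len(elements))
)
print(len(subsets), missed_every_time)
```

There are 8 subsets and so 8 x 8 x 8 = 512 pairings to try, and in each of them S* turns out to be a subset that has no partner.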
There cannot be a set of all sets

Several decades before Russell discovered the paradox about the class of all classes that do not belong to themselves, Cantor discovered that a paradox arises if we try to form the set of all sets.

Call that set SS. Then SS must contain all its own subsets, so its cardinality must be at least as great as the cardinality of its own power set. However the cardinality of the power set must be greater than that of SS. So

Cardinality (SS) ≥ Cardinality (2^SS) > Cardinality (SS)

which is a contradiction, so there can be no set SS. ‘All sets’ is at best a proper class.

2^ℵ0 = c = the cardinality of the real numbers.

To set up a 1-1 correspondence between the set of sets of natural numbers and the set of real numbers we can restrict our attention to the subset of real numbers 0 ≤ x < 1, since it is easy to show that that subset has the same cardinality as the complete set of real numbers. To do so, map each real number y to x = 1/2 + y/(2(1 + |y|)); then every real number y is mapped to a different value of x with 0 < x < 1. As the subset is itself contained in the reals, the indirect method described above shows that the two sets have the same cardinality.

Write the real numbers as bicimals (binary fractions). A real number x (0 ≤ x < 1) is then mapped to the set of natural numbers Sx, where Sx = {n: the nth bicimal digit of x = 1}.

Examples:

3/4 = 0.11 B (writing B for binary), so 3/4 corresponds to the set {1, 2}
1/3 = 0.0101010... B, so 1/3 corresponds to the set {2, 4, 6, ...} = the set of all even numbers.

Although that procedure maps every real number to a set of natural numbers, and therefore shows that c ≤ 2^ℵ0, it is not sufficient to show that c = 2^ℵ0 because there are some cases where two bicimals correspond to the same real number and hence two sets of natural numbers can correspond to the same real number.

There is a one to one correspondence between the sets of natural numbers and the set of infinite sequences of 1’s and zero’s.
However, in some cases two different sequences of 1’s and zeros, when interpreted as bicimals, correspond to the same real number. That happens when a bicimal contains only a finite number of zeros, and therefore ends in an infinite sequence of 1’s. That infinite terminal sequence of 1’s is actually a convergent geometric series, and we replace it by its sum. For instance 0.0100111111… is equivalent to 0.0101. The numbers in question are rational numbers with denominator a power of 2.

An analogue in the scale of 10 would be 1/9 = 0.11111..., so 9/9 = 0.99999..., but we should actually write that as 1.

Although 0.0100111111… and 0.0101 represent the same number (5/16) they do not represent the same set of natural numbers. 0.0101 represents {2, 4} but 0.0100111111… represents the infinite set {2, 5, 6, 7, …} containing all the natural numbers except 1, 3 and 4. It is such sets, those with finite complements, that have so far not been linked to any real number.

To complete the proof that c = 2^ℵ0 we need to show that 2^ℵ0 ≤ c. That can be achieved by finding a mapping from sets of natural numbers to real numbers. As there is a one to one mapping from sets of natural numbers to sequences of 1’s and zero’s, it suffices to find a mapping from those sequences to real numbers, so that every sequence maps to a different real number.

Let the function u → bicimal(u) map sequences of 1’s and zero’s to the real numbers they represent when treated as the digits of a bicimal, so that bicimal(1101) = 0.1101 (bicimal) = 13/16.

Then to map every sequence to a different real, define u → real(u) so that:

real(u) = bicimal(u)/2 + ½ if u terminates in an infinite sequence of 1’s,
real(u) = bicimal(u)/2 otherwise.

Sequences of the first kind are mapped to numbers greater than ½ and those of the second kind to numbers less than ½, so the two kinds cannot collide. So every set of natural numbers can be mapped to a different sequence of 1’s and zero’s, and every such sequence can be mapped to a different real number, hence 2^ℵ0 ≤ c.
As we have shown earlier that $c \le 2^{\aleph_0}$, it follows that $2^{\aleph_0} = c$.

Ordinal Numbers

So far the only infinite numbers I’ve discussed have been infinite cardinals, numbers that measure the size of a set. Cantor originally studied infinite ordinals, which measure the length of an ordered sequence of elements. Ordinal numbers correspond to orderings of a special sort, called well orderings. To explain that I must discuss order relations generally. I shall use the symbol ‘<’ to represent an arbitrary order relation, though there is no assumption that the ordering is by magnitude.

A partial order relation < on S satisfies, for any x, y, z in S:
1. $x < y \rightarrow \lnot(y < x)$, the relation is anti-symmetric
2. $x < y \rightarrow x \ne y$, the relation is irreflexive
3. $x < y \;\&\; y < z \rightarrow x < z$, the relation is transitive

Examples are: people ordered by ancestry, so that ‘a < b’ means ‘a is an ancestor of b’; propositions ordered by entailment, so that ‘P < Q’ means ‘P entails Q’; sets ordered by inclusion, so that ‘S < T’ means ‘S is a proper subset of T’.

A partial order relation may be unable to compare some elements in its domain. For many pairs of people, neither is an ancestor of the other, and for many pairs of propositions neither entails the other. A simple order relation satisfies 1, 2, and 3, and also
4. $(x)(y)(x < y \lor y < x \lor x = y)$
so that any two distinct elements can be ordered.

A well ordering of a set S is defined as a simple order relation on S such that every non-empty subset has a first element. The natural numbers can be well ordered by size: 1, 2, 3, ... Ordering the integers by size is a simple ordering but not a well ordering, since the set of all negative integers, when ordered by size, is ..., -3, -2, -1, which has no first member. However, the integers can be well ordered; a simple way is 0, -1, +1, -2, +2, ...
When they are ordered like that the set of negative integers does have a first member, namely -1.

Well Orderings and Ordinal Numbers

It is well orderings that are measured by ordinal numbers. All orderings of the members of a finite set are of the same order type and all are well orderings, since any ordered finite set must have a first member. If a finite set S has n members, any ordering of S resembles the ordering of the natural numbers {1, 2, ..., n}, so there is a first member, a second member and, finally, an nth member. The orderings of finite sets are called ordinals of the first class.

Not all orderings of an infinite set are well orderings; counter examples are the orderings according to magnitude of the integers, of the rational numbers or of the real numbers. However members of an infinite set can be well ordered in many different ways, each defining an ordinal number (however see the comments below about the Well Ordering Theorem). Using the set of natural numbers as an example, the simplest well ordering is:
1, 2, 3, ...
which is said to be of order type $\omega$, the smallest transfinite ordinal. Adding an extra element at the end increases the order type, so that
2, 3, 4, ......, 1
is of order type $\omega + 1 > \omega$, while adding an additional element at the beginning, as in
x, 1, 2, 3, ...
leaves the order type unchanged, so $1 + \omega = \omega$.

The concept of adding an additional element at the end of an infinite sequence needs some explanation. Strictly speaking it is impossible to put an additional element at the end of an infinite sequence because an infinite sequence has no end. To justify
2, 3, 4, ......, 1
we define an order relation $x \prec y$ so that: $x \prec 1$ for all x not equal to 1, and, provided neither x nor y = 1, $x \prec y$ holds when $x < y$.
That order relation arranges all the natural numbers except 1 in numerical order, and puts 1 last in the ordering of any set containing 1. I use the symbol ‘$\prec$’ to represent the order relation because I want to use ‘<’ in its usual sense of ‘less than’.

Product of Ordinals

The product of two order types, written as $\alpha\beta$, is defined as the order type obtained by substituting for each element in a well ordering of type $\beta$ a well ordering of type $\alpha$, so $2 \cdot \omega = \omega$, since that just represents an infinite sequence of paired elements (1, 2), (3, 4), ..., the same as 1, 2, 3, 4, ... On the other hand $\omega \cdot 2 > \omega$, since $\omega \cdot 2$ is the ordinal number of the sequence
(1, 3, 5, ...)(2, 4, 6, ...)
in which every odd number comes before any even number.

Once again we have a definition that appears to assume we can finish listing an infinite set of numbers, and then follow on with other numbers. In this case we can define the relevant order relation as follows. Once again, to save confusion, I use ‘$\prec$’ to represent the order relation so that I can use ‘<’ in its standard mathematical sense. Then define $a \prec b$ as true when
(1) a is odd and b is even, or
(2) a and b are both even and a < b, or
(3) a and b are both odd and a < b.

It is possible to construct an ordering of type $\omega^2$ by substituting a well ordering of type $\omega$ in place of each element of another well ordering of type $\omega$. One way of doing that is represented by:
$2, 2^2, 2^3, \ldots, 3, 3^2, 3^3, \ldots, 5, 5^2, 5^3, \ldots$
in which powers of 2, arranged in numerical order, come before powers of 3, and then successively powers of all the primes.

Orderings of finite sets are called ordinals of the first class, and orderings of denumerable sets are called ordinals of the second class. Although many order types can be defined on a denumerable set it is generally supposed that we eventually reach order types that require uncountably many elements, though no one has ever been able to produce an example of such an order type.
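The unusual orderings of the integers and natural numbers described above can each be expressed as a sort key, which makes it easy to check on finite samples that they behave as claimed. This is a sketch only: the function names are invented for the illustration, and a finite sample can show the pattern of an ordering but not, of course, its transfinite order type.

```python
def integers_key(n):
    """0, -1, +1, -2, +2, ...: order by size, negative before positive."""
    return (abs(n), n > 0)


def one_last_key(n):
    """2, 3, 4, ..., 1: numerical order, except that 1 comes last."""
    return (n == 1, n)


def odds_first_key(n):
    """1, 3, 5, ..., 2, 4, 6, ...: every odd number before any even one."""
    return (n % 2 == 0, n)


def prime_power_key(n):
    """2, 4, 8, ..., 3, 9, 27, ..., 5, 25, ...; assumes n is a prime power >= 2."""
    p = next(d for d in range(2, n + 1) if n % d == 0)  # smallest prime factor
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    if n != 1:
        raise ValueError("not a power of a single prime")
    return (p, k)


print(sorted(range(-3, 4), key=integers_key))
print(sorted(range(1, 8), key=one_last_key))
print(sorted(range(1, 9), key=odds_first_key))
print(sorted([25, 9, 8, 5, 4, 3, 2, 27], key=prime_power_key))
# The first member of a set of negative integers under the first ordering:
print(min([-5, -1, -3], key=integers_key))
```

Taking `min` with the same key picks out the first member of any finite subset, which is the property a well ordering guarantees for every non-empty subset.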
Suppose $\omega_1$ is the first ordinal such that any set of order type $\omega_1$ is uncountable; then the cardinal number of such a set is called $\aleph_1$. In a similar way it is possible to define $\aleph_2$, $\aleph_3$, ...

Cardinals can be treated as a special class of ordinal, so that a cardinal is defined as an ordinal u such that there is no ordinal v < u for which n(u) = n(v). Every infinite cardinal is a limit ordinal. An ordinal u is the limit of an increasing sequence $\alpha_1, \alpha_2, \ldots, \alpha_r, \ldots$ of type $\omega$ if, for all the $\alpha_r$ and for any $\beta < u$, all but at most a finite subset of the $\alpha_r$ satisfy $\beta < \alpha_r < u$. That shows that $\omega$ is a limit ordinal, since any ordinal less than $\omega$ is finite. The cardinal in question is $\aleph_0$. An ordinal that is of the form v = u + 1, where u is another ordinal, is called a successor ordinal. Any ordinal other than 0 is either a limit ordinal or a successor ordinal, and no infinite cardinal is of the form v + 1, since adding one element to an infinite set does not change its cardinality. The converse does not hold: $\omega \cdot 2$ is a limit ordinal, but it has the same cardinality as $\omega$ and so is not a cardinal.

Ordinal Numbers as Sets

An ordinal number can be identified with the set of all lesser ordinals, thus the ordinal number 3 = {0, 1, 2}. It follows that there can be no set of all ordinals. For suppose there were such a set and denote it by n; then n defines an ordinal greater than any of its members, contradicting the assumption that it contains all ordinals. So n can only be a class, not a set. That is known as the Burali-Forti paradox, one of the logical paradoxes listed in Principia Mathematica, though it is a paradox only if we assume that there should be a set of all ordinals.

There is no Greatest Ordinal

There can be no greatest ordinal, for suppose there were such a number, and call it $\Omega$. It would be the set of all ordinals less than itself, so that forming a new set by adding $\Omega$ as an additional member would define a still greater ordinal $\Omega + 1$, contradicting the assumption that $\Omega$ is the greatest.
Nevertheless it is customary to use $\Omega$ to denote the ‘greatest ordinal’ in informal discussions to decide what new axioms might be adopted as a basis for extending the theory of the infinite.

The Well Ordering Theorem

According to the Well Ordering Theorem every set can be well ordered. The well ordering theorem, the axiom of choice and the thesis that all cardinals are comparable are mutually equivalent in ZF; the axiom of choice could be replaced by either of the others without affecting the logical power of the system. The proof of the well ordering theorem is not constructive - it does not involve showing how to obtain a well ordering of an arbitrary set - and so far as I know no set of cardinality higher than $\aleph_0$ has ever actually been well ordered, raising doubts about the well ordering theorem and therefore about the axiom of choice.

Transfinite Induction

A well ordering of a set supports a generalisation of proof by induction, known as transfinite induction. Suppose a set S is well ordered with a first element $s_1$ and an order relation < (which need not be ‘less than’). Then from:
(1) $F(s_1)$, and
(2) whenever $(x)(x < s \rightarrow F(x))$, then $F(s)$
we may infer $(x)(x \in S \rightarrow F(x))$.
For an ordering of type $\omega$, transfinite induction is equivalent to simple induction.

The Continuum Problem

It follows from the well ordering theorem that every infinite cardinal is a limit ordinal, in other words an Aleph, but the question is which cardinal is which Aleph. Since $c = 2^{\aleph_0} > \aleph_0$, either $c = \aleph_1$ or $c > \aleph_1$. The Continuum Problem is the problem of deciding which Aleph equals c. This cannot be solved in the ZF system. The Continuum Hypothesis is the hypothesis that $c = \aleph_1$. The Continuum Hypothesis can be neither proved nor disproved in ZF. That has been shown by constructing two systems, one by adding the continuum hypothesis to ZF as an additional axiom, and one by adding its contradictory as an axiom.
Each of the resulting systems is consistent, provided ZF itself is consistent.

The Generalised Continuum Hypothesis is the hypothesis that $\aleph_{n+1} = 2^{\aleph_n}$. Like the Continuum Hypothesis, this can be neither proved nor disproved in ZF. However if the axiom of choice is replaced by the generalised continuum hypothesis, the axiom of choice is then provable, so the hypothesis is in that context stronger than the axiom of choice.

Large Cardinals

Cardinals are referred to as ‘large’ if they are inaccessible. An inaccessible cardinal is one that cannot be obtained either (1) by taking the limit of any set of smaller numbers, or (2) by an equation of the form $b = 2^a$ for any a < b. $\aleph_0$ is inaccessible, because any cardinal t less than $\aleph_0$ is finite, so that $2^t$ is also finite and $< \aleph_0$, but any other inaccessible cardinal $\kappa$ would have to be very large and would satisfy $\kappa = \aleph_\kappa$ which, taken in isolation, is strongly counter-intuitive.

I gather that references to large cardinals appear in the proofs of some results in combinatorics. However it is odd to present that as a reason for accepting large cardinals. The form of the argument seems to be: ‘We need assumption A to prove theorem T’. That assumes we have good reason for expecting T to be true. In other words there must be reasons R for believing T. Yet R cannot be sufficient to prove T, else there would be no need to assume A. Mathematical proofs are often, probably almost always, preceded by informal discussions, and the apparatus of mathematical proof has been developed to replace such discussions by formal proof. However it seems odd that the wish to formalise the proof of one result should lead us to adopt an axiom about something else. Perhaps the informal arguments behind the wish to postulate large cardinals are informal discussions of such cardinals, yet I have not yet tracked down any non-formal argument that appears to require large cardinals to formalise it.
I should expect to find at least a reference to some set of large cardinality. Possibly the informal argument needing to be made respectable is equivalent to something in second order logic which is not clearly valid. My investigations into the matter are still under way.

A reason for supposing there are large cardinals is the reflection principle, that if $F(\Omega)$, for any conceivable property F, then $F(a)$ for some $a < \Omega$, where $\Omega$ is the class of ordinal numbers. The principle is justified by the argument that were F true only of $\Omega$, $\Omega$ would be an ordinal, since it could be defined as ‘the ordinal $\alpha$ for which $F(\alpha)$’. By the same argument $F(a)$ must be true of $\Omega$ ordinals $< \Omega$, for were it true of just $\beta$ ordinals with $\beta < \Omega$, it would be possible to define $\Omega$ as the $(\beta + 1)$th ordinal with property F. Since an ordinal u is a cardinal if there is no ordinal v < u for which n(u) = n(v) (in other words if u is not a successor ordinal), $\Omega$ must, it is argued, be a cardinal. Since $\Omega$ is neither a successor ordinal nor a limit of ordinals smaller than itself, it must be inaccessible, hence there must be $\Omega$ inaccessible ordinals $< \Omega$. The first of these is $\aleph_0$; let the second be $\kappa$.

The reasoning of the previous two paragraphs is neither a proof, nor the summary of a proof, but just an argument for strengthening set theory sufficiently to allow the existence of inaccessible cardinals to be proved. It seems to me that the argument is weak because $\Omega$ is not an ordinal, or perhaps it would be better to say that there is no $\Omega$, so its being false that $\Omega$ is a successor ordinal or a limit ordinal does not imply that it is an inaccessible cardinal, or any sort of cardinal. Such inferences are like arguing that, because it is false that the King of France is a Total Abstainer, he sometimes drinks alcohol.

Theories about Mathematics

The various incompleteness theorems make it hard to sustain either Logicism or Formalism.
If no formal system can be complete, it is hard to maintain that such a system provides a contextual definition of the concepts involved, so that the Logicist can’t be confident that what he’s constructed in logic really is our ordinary number system, and the formalist can’t be confident it is the number system that his formal systems have defined. As we can’t define numbers by means of a formal system, it seems difficult to say what they are. Platonism, so called because it has some affinity to Plato’s Theory of Forms, holds that Mathematics is about real abstract objects that exist independently both of our thoughts and definitions, and of the material world. Platonism has proved attractive to some workers in the field, notably to Gödel, and to Russell before the writing of the Principia. But what can it mean to assert that numbers and other mathematical entities have an independent existence outside space and time? How could we ever find out anything about such entities? If we can somehow peep into the world of numbers why do we need to prove their properties instead of using our insights, and if we can’t peep, how can we tell that our proofs aren’t all completely wrong? I discuss some of the problems arising from Existence in chapter 5. Formalism is primarily a strategy for handling formal systems, and if we try to use it as a theory of mathematical truth, it seems to be the weakest position, since one can hardly identify Mathematics with a formal system unless there is a unique formal system. The formalist also has a problem in making sense of consistency proofs. A theory is pointless unless it is consistent, yet a theory cannot be proved consistent except in a stronger theory, so with which system should a formalist identify mathematical truth - that which he normally uses when doing mathematics, or the stronger theory in which he proves the first theory consistent? 
Logicism seems still arguable, though only at the cost of including in logic the informal discussions as to what axioms we may plausibly postulate - we could call that informal logic, or possibly second order logic. Logicism has been abandoned by almost all Mathematicians and by most Philosophers, but I think it would be going too far to say that Gödel’s incompleteness theorem actually refutes Logicism. It was indeed shown that no effectively axiomatisable system can be adequate for Mathematics, but that does not show that the prospects for creating mathematical systems are any worse outside Logic than within it. Clearly no complete system can be created in first order formal logic, so we have a number of systems of variable strength depending on what axioms are adopted. However the discussion of the merits of various axioms could be regarded as an application of informal logic. Informal logic must be complete in the weak sense that, for any mathematical truth T, there will be some, more or less plausible, argument for incorporating it into our mathematics if it is not already a theorem. Second order logic, which is complete, can even be formalised. But the completeness of second order logic is not of any practical help in determining precisely which mathematical statements are true, because in second order logic there is no effective test that can be relied on to establish, in a finite number of steps, the validity of any proposed proof; but second order Logic may still have philosophical significance in defining the set of mathematical truths. I don’t, therefore, think that Logicism can be completely written off, but it seems no more than one among several possible ways of looking at Mathematics.

The troubles of Logicism and Formalism encouraged a revival of a more sceptical tradition with roots going back to Aristotle.
Intuitionism

Before Cantor published his findings it had been generally assumed that there could be no completed infinity. Aristotle had argued that if there were any infinite collection, we could consider it divided into parts. For simplicity suppose it divided into just two parts (the simplification is mine, not Aristotle’s). Then if both parts are finite so is the original collection. Hence at least one part must be infinite. There is therefore an infinite whole with an infinite part, contradicting the axiom that the whole is greater than the part. Aristotle, in common with all thinkers before Cantor, took it for granted that if there were an infinity there would be only one, so that any infinite collection would be the same size as any other. He also assumed that a proper subset of any class must have fewer members than the whole class.

Galileo had noticed the one-one correspondence between all the natural numbers on the one hand and the even numbers on the other, but considered it just evidence of the paradoxical nature of infinity. During the first half of the nineteenth century Mathematical Analysis was developed to free calculus and the theory of series and limits from all references to infinite collections or to infinitesimals.

Cantor then decided to re-examine the status of the infinite, proposing to avoid Aristotle’s argument by allowing a set to be of the same cardinality as some of its subsets. After strong initial resistance most mathematicians acquiesced, but doubts revived as it became apparent that notions of infinity were still beset by difficulties. We’ve already remarked on the odd properties attributed to large cardinal numbers, but closer scrutiny reveals difficulties at an earlier stage, as soon as we go beyond $\aleph_0$. The Well Ordering Theorem asserts that every set can be well ordered, yet no set of cardinality higher than $\aleph_0$ has ever been well ordered.
Since ordinal numbers measure well orderings it follows that no one has ever produced an example of an ordinal number requiring a set of greater cardinality than $\aleph_0$, yet $\aleph_1$ is defined as the cardinality of the smallest such ordinal, so that it seems unclear that $\aleph_1$ so defined actually has any sets to measure.

The problem arises because we have relied on non-constructive existence proofs to show that there is a well ordering of every set and that there is a set of cardinality $\aleph_1$. Those proofs provide no way of constructing the entities whose existence they purport to prove.

The rejection of non-constructive existence proofs was the rallying cry of a party who came to be known as Intuitionists. Intuitionism is often considered to have originated with Kronecker (1823-1891), whose slogan was ‘God created the natural numbers, and man created Mathematics’, but the name came from Brouwer’s (1881-1966) claim that the basis of Mathematics is arithmetic, which is in turn based on our intuitive experience of the succession of moments in time.

Intuitionists insist that all existence proofs should be constructive. They reject the Axiom of Choice, which postulates the existence of a selection set even if there is no way of specifying its members, and they deny the existence of any set of cardinality greater than $\aleph_0$, understandably since the Axiom of Choice is usually assumed in proofs that $2^{\aleph_0} > \aleph_0$.

The Intuitionists deny that Logic is prior to Mathematics. They restrict the application of ordinary logic to finite sets, but assert that the use of infinite sets in Mathematics requires a different logic. For instance in an infinite domain $P \lor Q$ may not be a theorem unless either P is a theorem, or Q is a theorem, so the ‘law of the excluded middle’, $P \lor \lnot P$, is not a theorem.
Heyting (1898-1980) constructed an alternative Intuitionist propositional calculus, with four connectives, none definable in terms of any of the others. $\lnot$, sometimes construed as ‘it is absurd’, replaces ~, and a full stop replaces &. The other connectives are $\lor$ and $\rightarrow$. The system is not truth functional, so tautologies cannot be identified by constructing truth tables; instead there is a set of eleven axioms:
(1) $P \rightarrow (P.P)$
(2) $(P.Q) \rightarrow (Q.P)$
(3) $(P \rightarrow Q) \rightarrow [(P.R) \rightarrow (Q.R)]$
(4) $[(P \rightarrow Q).(Q \rightarrow R)] \rightarrow (P \rightarrow R)$
(5) $Q \rightarrow (P \rightarrow Q)$
(6) $[P.(P \rightarrow Q)] \rightarrow Q$
(7) $P \rightarrow (P \lor Q)$
(8) $(P \lor Q) \rightarrow (Q \lor P)$
(9) $[(P \rightarrow R).(Q \rightarrow R)] \rightarrow [(P \lor Q) \rightarrow R]$
(10) $\lnot P \rightarrow (P \rightarrow Q)$
(11) $[(P \rightarrow Q).(P \rightarrow \lnot Q)] \rightarrow \lnot P$

The rules of inference are the same as those in Principia Mathematica: substitution (in any axiom or theorem, replace all instances of any individual letter by any wff) and detachment (if P and $P \rightarrow Q$ are both theorems, infer Q).

It is hard to decide quite what the various logical terms mean in Intuitionist logic. When discussing the standard propositional logic we noted that although &, $\lor$ and ~ corresponded roughly to ‘and’, ‘or’ and ‘not’ as we normally use the words, they were not precisely equivalent, and $\supset$ had only a tenuous connection to ‘if then’. However the standard logic can be elucidated by truth tables whereas the intuitionist alternative has no such grounding in truth conditions. It is particularly odd that the Intuitionist calculus has the theorem $\lnot\lnot\lnot P \rightarrow \lnot P$ but not $\lnot\lnot P \rightarrow P$, suggesting that some propositions are intrinsically negative. It seems to be permitted to infer ‘Simon is not honest’ from ‘Simon is not not not honest’ but not to infer ‘Simon is dishonest’ from ‘Simon is not not dishonest’, even though the two inferences would ordinarily be considered to say the same thing in different words. The difference between the logical terms and the roughly equivalent terms in ordinary language did not matter when we had truth tables to say precisely what the logical symbols did mean.
The Intuitionist has no such way of explaining his terms. Of course he can explain the system as a whole by saying that the various symbols must take a set of meanings that make all the axioms of Intuitionist logic true, but that is far less informative than providing an independent definition for each symbol. $\lnot\lnot P$ is not equivalent to P, and $P \lor Q$ is not provable unless either P or Q is provable, so that $P \lor \lnot P$ is not a theorem. $\rightarrow$ is supposed to approximate to ‘implies’, yet axioms (5) and (10) seem to provide for a false proposition implying anything and for anything implying a true proposition, which are the ‘paradoxical’ results that made us withhold the title of ‘entails’ from the truth functional $\supset$.

Suppose that, within the Intuitionist calculus, we define & and $\supset$ thus:
P & Q is $\lnot(\lnot P \lor \lnot Q)$ and $P \supset Q$ is $\lnot(P.\lnot Q)$.
Suppose we also treat $\supset$ so defined as material implication; then all the tautologies of the classical logic are provable in the Intuitionist system, which can therefore be viewed not as a subclass, but as a superclass of classical logic, with additional theorems.

What those additional theorems might amount to became clear when Gödel showed that Heyting’s calculus is actually equivalent to another system known as Lewis’s S4, which is one of a number of modal logics constructed by C. I. Lewis (1883-1964). A modal logic is one that includes a symbol corresponding to ‘necessary’ or ‘provable’. Writing $\Box$ for ‘necessary’, Gödel interpreted Heyting’s connectives as:
$\lnot P$ = $\Box{\sim}\Box P$ (it is provable that P is not provable)
$P.Q$ = $\Box P$ & $\Box Q$ (P and Q are both provable)
$P \lor Q$ = $\Box P \lor \Box Q$ (P is provable or Q is provable; the $\lor$ on the RHS is used in the truth functional sense)
$P \rightarrow Q$ = $\Box P \supset \Box Q$ (the $\supset$ on the RHS is used in the truth functional sense, so that Heyting’s ‘$P \rightarrow Q$’ means ‘P is provable materially implies Q is provable’)

I discuss modal logic in Chapter 5.
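Although no ordinary two-valued truth table fits the Intuitionist calculus, a three-valued table can be used to check the claims made above about which formulas are not theorems. The sketch below is an illustration under stated assumptions rather than anything from Heyting’s own presentation: it uses the values 0, ½ and 1, with ‘and’ as minimum, ‘or’ as maximum, ‘implies’ as 1 when the antecedent does not exceed the consequent and otherwise the value of the consequent, and ‘not P’ as ‘P implies 0’. Every theorem of the calculus takes the value 1 throughout such a table, so a formula that sometimes takes another value cannot be a theorem. The test works in that direction only: a formula that always takes the value 1 here need not be a theorem. The function names are invented for the example.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)
VALUES = (0, HALF, 1)  # 0 = absurd, 1 = established, 1/2 = neither


def conj(a, b):
    return min(a, b)


def disj(a, b):
    return max(a, b)


def imp(a, b):
    return 1 if a <= b else b


def neg(a):
    return imp(a, 0)


def always_one(formula, letters):
    """True if the formula takes the value 1 under every assignment."""
    return all(formula(*vs) == 1 for vs in product(VALUES, repeat=letters))


excluded_middle = lambda p: disj(p, neg(p))
double_negation = lambda p: imp(neg(neg(p)), p)
triple_negation = lambda p: imp(neg(neg(neg(p))), neg(p))
axiom_10 = lambda p, q: imp(neg(p), imp(p, q))

print(always_one(excluded_middle, 1))   # fails at p = 1/2
print(always_one(double_negation, 1))   # fails at p = 1/2
print(always_one(triple_negation, 1))
print(always_one(axiom_10, 2))
```

The first two formulas fall short of 1 when P takes the middle value, which is enough to show that neither the law of the excluded middle nor the elimination of double negation is provable in the calculus.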
Isaacson

Daniel Isaacson (probably born in the late 1940’s) defined ‘arithmetical truths’ as the subset of mathematical truths the truth of which is ‘perceivable directly on the basis of an articulation of our grasp of the fundamental nature and structure of the natural numbers, or directly from statements which themselves are arithmetical.’ He held that arithmetical truths so defined are precisely those that can be deduced from Peano’s axioms.

A putative counter example to that thesis is Goodstein’s Theorem (see the appropriately titled supplementary document), which concerns finite sequences of natural numbers and yet is not deducible from the Peano axioms. Isaacson therefore thought himself committed to arguing that Goodstein’s Theorem is not an arithmetical truth. That is prima facie most implausible, for Goodstein’s theorem concerns finite sequences of natural numbers, several of which are short enough to be written down in their entirety.

Supporters of Isaacson make some play with the fact that, except in a very few cases, Goodstein sequences are much too long for us ever to write down the complete sequence, or even to give a rough estimate of how long the sequence is. However that amounts to no more than the observation that we cannot carry out the full calculation, and the same could be said of many results that are decidable within PA - the consequences of PA must include many theorems so complicated that even the theorem itself is too long for us ever to state it, let alone carry out a proof. Isaacson is there reduced to saying that the proof of Goodstein’s theorem, using infinite ordinals, is non arithmetical, but that is a very dangerous argument for him to use.
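To make concrete the remark that some Goodstein sequences are short enough to be written down in their entirety, here is a minimal Python sketch. It assumes the usual definition: write the current term in hereditary base b (starting with b = 2), replace every b by b + 1, subtract 1, and repeat with the next base until 0 is reached. The names `bump` and `goodstein` and the `max_terms` cut-off are invented for the illustration; the cut-off is needed because for a starting value as small as 4 the sequence is already far too long to compute in full.

```python
def bump(n, b):
    """Write n in hereditary base b, then replace every b by b + 1."""
    result, exponent = 0, 0
    while n > 0:
        n, digit = divmod(n, b)
        if digit:
            # The exponent is itself rewritten in hereditary base b.
            result += digit * (b + 1) ** bump(exponent, b)
        exponent += 1
    return result


def goodstein(start, max_terms=50):
    """Terms of the Goodstein sequence from start, stopping at 0 or at max_terms."""
    terms, n, base = [start], start, 2
    while n > 0 and len(terms) < max_terms:
        n = bump(n, base) - 1
        base += 1
        terms.append(n)
    return terms


print(goodstein(2))
print(goodstein(3))
print(goodstein(4, max_terms=6))
```

The sequences starting from 2 and 3 reach 0 within a few steps, while the one starting from 4 is still growing when the cut-off stops it.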
When defended in that way Isaacson’s thesis is easily confused with a different one - that PA encapsulates all our intuitions about the number system and that for that reason its consequences must comprise the whole body of arithmetical truth. Although that was not Isaacson’s argument, it is still worth considering it to see why it is unacceptable. PA was certainly an attempt to capture our intuitions of arithmetical truth, and it could at first be plausibly presented in that role because it is closely similar to a specification that does do justice to those intuitions, namely the second order analogue of PA. Intuition is an imprecise and fallible instrument, especially when applied to Mathematics - witness the contradictions in naive set theory - so it cannot be relied upon to distinguish first order from second order logic; hence the intuitive appeal that appears to belong to the former may well really arise by confusing it with the latter.

If we allow arguments from the way theorems are proved, separation of the two theses is not at all easy. Saying that a proof is not part of arithmetic is hard to distinguish from saying that the proof does not take place within PA, when one is trying to identify arithmetic with the consequences of PA.

As I understand him, Isaacson claims that mathematical truths can be divided into two classes; let’s call them Class A and Class B. He claims that Class A is the set of arithmetical truths and that all its members are provable in PA, and that Class B is the set of truths that are not arithmetical and none of which is provable in PA.
If set A were defined as the set of truths provable in PA, Isaacson's thesis would reduce to: 'Truths provable in PA are provable in PA'. He therefore needs to distinguish A and B by indicating that truths in the two sets are somehow qualitatively different, that they make assertions of a different sort so that they can be distinguished without reference to their proofs.

Isaacson also needs to argue that Gödel sentences are not part of arithmetic. Yet Gödel sentences take the form of assertions that some diophantine equation, in a lot of unknowns and of fairly high degree, has no solution. If that is so, those sentences cannot be located outside arithmetic since, if such an equation does have solutions, that will be provable in PA, since the proof would require only the use of the standard rules for the multiplication, addition and subtraction of natural numbers. It therefore follows from Isaacson’s thesis that there are some predicates such that propositions of the form F(a) are part of arithmetic, yet the corresponding propositions of the form (x)F(x) are not.

Second, for some equations with no solutions, their not having solutions will also be provable in PA. For instance $4y^3z^4 + 12x^2yz^6 - 16x^2y^3z = 5$ can have no solutions over the integers since, for any integers x, y, z, the expression on the left hand side must be even, while the 5 on the right is odd. Thus it cannot be maintained that the unprovable sentences are of a different sort from any that are provable. Once again the criterion for being 'arithmetical' seems to be collapsing into provability in PA, making Isaacson’s thesis an empty tautology.

When I once tried to make this point in conversation it was misunderstood, so I’ll repeat it in a slightly different way.
I’m trying to say that the equations which cannot be proved (in PA) to have no solutions are of the same mathematical type as other equations, some of which do have solutions, and others of which can be proved in PA not to have any solutions. It is therefore not plausible to say that the Gödel statements are not part of arithmetic in some sense in which those other sentences, which differ from the Gödel sentences only in being provable in PA, are part of arithmetic. The distinction between the two classes of sentence is not based on the sort of thing they say, but just on how they can be proved. Yet Isaacson claimed that the difference in the way statements could be proved corresponds to a profound difference in the sort of thing they said.

I find it hard to discuss the class of statements provable in PA because in many cases it is hard to tell precisely what they are. Reference to Mathematical text books is not helpful because it is customary to use systems stronger than PA for Mathematics, and the systems mathematicians prefer are not elaborations of PA, but rather elaborations of Zermelo's system. I don’t know how to equate what Mathematicians use to something on the lines of 'PA + these additional axioms' and suspect very few people could do that, because the question would not interest most mathematicians.

Applied Mathematics

When arithmetic is applied to collections of physical objects, we seem to have propositions that are both logically true, and informative about the physical world. For instance applying 2 + 3 = 5 to the placing in a box first of two coins, and then of another three coins, to produce a final total of five coins, seems to give us an arithmetical way of predicting the consequences of a sequence of physical actions. However that is an illusion sustained by overlooking two important points.
The first is Frege’s observation that there is no unique number that represents any particular physical situation. The second point is that in deciding what are suitable objects for counting, and in choosing physical quantities to be used in measurement, we often favour types of units and physical quantities that are amenable to arithmetical manipulation. Consider the following examples:

Example (1) Adding 15 g of sugar to 20 g of water we expect to get 35 g of syrup (assuming we don’t prolong the process sufficiently for significant evaporation).

Example (2) If we add an egg at 6 °C to a saucepan containing water at 100 °C, we do not expect to get a combined temperature of 106 °C.

The scientists’ search for conservation laws is a search for quantities that can be added in situations where some operation can plausibly be represented as ‘adding’ or ‘combining together’ objects some feature of which can be measured by those quantities. Despite the ramblings of some educationists, we don’t investigate the masses of mixtures to discover the laws of arithmetic. We assume the laws of arithmetic and pick quantities that exemplify them, regarding mass and energy as therefore more fundamental than volume, temperature or colour. In fact it is only in exceptional cases that any physical process at all corresponds to a calculation.

Example (3) There was £73 in my bank account when I paid in £40.

In such a case the result is defined by the calculation 73 + 40 = 113, and no one would for a moment consider the transaction as an experiment to find the answer to the arithmetical calculation. The only experiment imaginable in those circumstances would be one that tested the efficiency of the bank’s accounting procedures. Very often the use of arithmetic corresponds to no process outside the mind.

Example (4) Alexander’s mother has three sisters and his father has two, so Alexander has 5 aunts.
(That ignores in-laws and assumes no incest in his immediate family.) In such a case we are just finding the number of elements in the union of two disjoint sets, and there is no operation outside our imaginations corresponding to the formation of the union of those sets. This point is obscured by the fact that in practice we rarely think of set unions, since the rules for adding natural numbers are implicit in the counting process, which is commonly employed to introduce children to simple arithmetic.

Discussions of practical arithmetic are sometimes further clouded by discussion of the consequences of bringing objects of various kinds into close proximity. Some animals may increase their numbers by reproduction, or reduce their numbers by eating each other; drops of liquid may merge into one. Although such questions are relevant if we want to know how many objects will be present in a certain place at a certain time, they have no bearing on the truth or otherwise of any proposition of arithmetic. It is, indeed, only by assuming the truth of arithmetic that we are able to tell whether objects placed together have reproduced or indulged in mutual destruction. If we put five more objects in a box that originally contained seven, 7 + 5 = 12 tells us how many objects were put in the box, and if the number of objects subsequently found in the box differs from 12, the discrepancy is a sign of foul play, not an indication that arithmetic is doubtful.

In many cases it would not occur to us to expect to find all the objects totalled still intact. If, having eaten two sausage rolls at a party, a guest responds to the host’s appeal to help finish up the leftovers by eating another three, we calculate the total consumption as five without entertaining any expectation that five intact sausage rolls repose somewhere in the guest’s digestive tract.
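The point made about Example (4), that the sum 3 + 2 = 5 counts the union of two disjoint sets, can be illustrated with a minimal Python sketch. The names of the aunts and the helper `aunts_count` are invented purely for illustration; nothing here is drawn from the original text beyond the numbers three, two and five.

```python
def aunts_count(maternal, paternal):
    """Count aunts by forming the union of the two sets of sisters."""
    return len(maternal | paternal)

# Hypothetical names, chosen only to make the sets concrete.
maternal = {"Ann", "Bea", "Cath"}   # the mother's three sisters
paternal = {"Dora", "Enid"}         # the father's two sisters

# The sets share no member, so the union has 3 + 2 members.
assert maternal.isdisjoint(paternal)
total = aunts_count(maternal, paternal)

# If the disjointness assumption fails, adding the sizes overcounts.
paternal_overlapping = {"Cath", "Dora"}
overlapping_total = aunts_count(maternal, paternal_overlapping)

print(total, len(maternal) + len(paternal))
print(overlapping_total, len(maternal) + len(paternal_overlapping))
```

The second pair of figures differs because one person belongs to both sets. That is why the parenthetical assumption about the family matters: it is a condition on applying the arithmetic, not a test of it.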
In a desperate attempt to claim empirical content for the laws of arithmetic, people have claimed that addition is a generalisation from the results of adding various numbers of objects of the same sort. In that argument much rests on the phrase ‘same sort’, which says either too little or too much. Any two objects are in some respects the same. For instance I have heard it said ‘you can’t add two apples to three pears, for what would be the units of the total?’ Although that makes a point, it does not follow that it is impossible to apply arithmetic to the aggregation of two apples and three pears; we just need a suitable notation. In a catering establishment where one apple and one pear each count as ‘one portion of fruit’ the answer would be ‘five portions of fruit’. On the other hand, if we try to strengthen ‘same’ it is hard to stop short of ‘identical’, so that we never have collections of things that are all the same.

There is an element of truth behind the claim that we cannot add objects of different sorts - namely Frege’s point that number is not intrinsic to any object or situation, but only comes into play when we specify a basis for numbering. Thus ‘2 + 3 = 5’ applies to the apples and pears only after we have specified that ‘portion of fruit’ is to be the basis of our counting. However, that does not justify treating the arithmetical statement as an empirical generalisation.

Summing up

Mathematics has often been cited as the ideal against which other branches of knowledge should be measured. It has been represented as a body of certain knowledge independent of the evidence of the senses. Rationalists have held that Mathematics is a body of knowledge about a world of ideas, and have cited skill in Mathematics as evidence that we have some sort of direct access to such a world.
Empiricists have often responded by saying that mathematical truths follow from the meanings of words used to express them, so that we create mathematical truth when we define our terms. That empiricist story seems plausible in the case of simple propositions about the results of individual calculations. ‘2 + 3 = 5’ could plausibly be rendered into logical notation without using any special mathematical terms at all. (See Appendix 2, where I consider the case of 2 + 2 = 4.) The same cannot be said of universal statements about all numbers, but much simple algebra can be deduced from Peano’s axioms, and those axioms can plausibly be represented as just stating (part of) what we mean by number.

However, as we progress to stronger systems, such as Zermelo’s Set Theory and the progressively stronger systems obtained by adding further axioms to that, it becomes increasingly implausible to regard those axioms as expressions of what we mean by numbers. But at the same time the successive axioms become increasingly open to doubt, the doubts becoming stronger with increasing remoteness from our intuitions about number, so that the Rationalist case for treating Mathematics as a body of unassailable truth is undermined by the same problems that weaken the Empiricist counter claim that mathematical truths are truths of logic. The Axiom of Choice cannot be represented as true a priori even in Kant’s weak sense of ‘true in any conceivable world’ because there are Intuitionists eloquently conceiving its falsehood. The Intuitionists don’t just say ‘we can imagine the Axiom of Choice to be false’; they embody their imaginings in the elaborate detail of an alternative formal system.

The status of Mathematics is much more complicated and uncertain than was supposed before the mid twentieth century.
I think the balance of argument favours the Empiricists, since those axioms that go further than expressing the meaning of our concept of number are not so much based on an intuition of truth as adopted for reasons of convenience; Mathematicians decide it is convenient to have a particular axiom. If some Mathematical propositions are not logical truths, they may reasonably be regarded as the creation of mathematicians.

All we can say with certainty is that the precise nature of Mathematical truth is highly controversial. Mathematics turns out to be a much odder subject than many had supposed, and its oddities disqualify it from being the template for any other branch of knowledge. The philosophical significance of Mathematics is not as great as has often been supposed.

Technical Appendix: Measurable Cardinals

One motive for wanting large cardinals is measure theory. Measure theory attempts to formalise such intuitions as that nearly all reals are irrational, because the reals are of higher cardinality than the rationals. Any measure function $\mu$ defined over a set $M$ satisfies:

1. $\mu(\emptyset) = 0$
2. $A \cap B = \emptyset \Rightarrow \mu(A \cup B) = \mu(A) + \mu(B)$
3. (2) is extended to denumerable collections of pairwise disjoint sets.
4. $A \subseteq B \Rightarrow \mu(A) \le \mu(B)$

Conditions 1, 2, 3 and 4 do not completely specify a measure function. Adding further axioms can give any one of a variety of different measures that are used for various purposes. Lebesgue measure is the basis for a definition of integration that allows an integral to be defined even in cases where there are denumerably many discontinuities. The Lebesgue measure of the set of reals $a \le x \le b$ is $b - a$. It can be proved that the Lebesgue measure of any finite set is zero, and so is the Lebesgue measure of any denumerable set.
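The last claim can be sketched from the properties already listed. This is only an outline, and it assumes that every singleton is measurable; the enumeration $x_1, x_2, \dots$ is notation introduced here and is not in the original text.

```latex
Let $A = \{x_1, x_2, x_3, \dots\}$ be a denumerable set of distinct reals.
Each singleton is the degenerate interval $x_n \le x \le x_n$, so
\[
  \mu(\{x_n\}) = x_n - x_n = 0 .
\]
The singletons are pairwise disjoint, so by property 3
\[
  \mu(A) \;=\; \sum_{n=1}^{\infty} \mu(\{x_n\}) \;=\; \sum_{n=1}^{\infty} 0 \;=\; 0 .
\]
For a finite set the same argument uses property 2 repeatedly in place of property 3.
```

Since the rationals are denumerable, this is the sense in which measure theory makes precise the intuition that nearly all reals are irrational.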
For instance, consider Dirichlet’s function $d: C \to \{0,1\}$ where $C = \{x : x \in \mathbb{R},\ 0 < x < 1\}$ and $d$ is defined by $d(x) = 0$ if $x$ is rational, $d(x) = 1$ otherwise. Lebesgue integration gives

$\int_0^1 d(x)\,dx = 1$

Measurable cardinals arise from the attempt to define a different sort of measure, a two valued measure over a set $M$ so that $\mu(M) = 1$, and $\mu(S)$ is either zero or 1 for any subset $S$ of $M$, and in particular $\mu(A) = 0$ for any finite or denumerable subset $A$.

To indicate how strange that suggestion is, I consider the difficulties that would be encountered if one tried to define a two valued measure over the continuum; they suggest that such a measure, if it could be defined at all, could only be defined over a set of much greater cardinality than the continuum. That does not show that the set should have inaccessible cardinality, but I think it does show the oddity of a two valued measure.

The Impossibility of a two valued measure on the Continuum

The requirement that any subset either has measure 1 or measure 0 implies that for any subset $A$, and its complement $A' = M - A$, one of $A$, $A'$ has measure 1 and the other has measure 0. It also implies that if two subsets have measure 1, their intersection also has measure 1, for their complements both have measure zero, and so, therefore, does the union of their complements, which is the complement of their intersection.

Consider the set of reals $-0.5 \le x \le +0.5$. Now consider the subsets $-0.5 \le x < 0$, $0 < x \le +0.5$ and $\{0\}$. The sum of their measures must equal 1, and $\mu(\{0\}) = 0$ because that set is finite, so of the two subsets $-0.5 \le x < 0$, $0 < x \le +0.5$ one must have measure 1 and the other measure 0. Pick the one with measure 1 and divide it into two in an analogous way.
Continuing the process we can show that if there were a two valued measure on the continuum then for any positive integer $n$, there must be a set of the form $\{x : a/2^n \le x \le (a+1)/2^n\}$ which has measure 1, and the complement of which has measure zero. That is very odd, because as $n$ tends to infinity the Lebesgue measure of the first set would tend to zero, and that of the second to 1. Of course, the two valued measure, if it could be defined, would not be the same as the Lebesgue measure, but such a blatant discordance between two measures would at least be disturbing.

I think the following may actually prove there cannot be a two valued measure on the continuum. For suppose there were.

Let $C = \{x : 0 \le x \le 1,\ x \in \mathbb{R}\}$.

Consider $L \subseteq C$ such that $x \in L \Leftrightarrow \mu(\{y : y \le x\}) = 0$. Clearly $0 \in L$, since $\mu(\{0\}) = 0$, hence $L$ is not empty. Let the least upper bound of $L$ be $L_B$. $L_B$ belongs to $L$, because $\mu(\{y : y < L_B\}) = 0$ and $\mu(\{L_B\}) = 0$, and the union of two sets of zero measure itself has zero measure.

Now define the set $U \subseteq C$ such that $x \in U \Leftrightarrow \mu(\{y : y \ge x\}) = 0$. Clearly $1 \in U$, since $\mu(\{1\}) = 0$, hence $U$ is not empty. Let the greatest lower bound of $U$ be $U_B$. $U_B$ belongs to $U$, by an argument similar to that used in the case of $L_B$ and $L$.

(I) Suppose $L_B < U_B$. Then $(\exists t)(L_B < t < U_B)$ and $\mu(\{y : y \le t\}) = 1$ and $\mu(\{y : y > t\}) = 1$, which contradicts the properties of a two valued measure. (The asymmetry in those formulae arises because I need the sets to be disjoint so that only one of them may contain $t$, but adding or removing a single element will not change the measure.)

(II) Suppose $L_B \ge U_B$. Then $C = \{x : x \ge U_B\} \cup \{x : x \le L_B\}$, but each of those sets has measure zero, contradicting the hypothesis that $C$ has measure 1.

Hence there can be no two valued measure on the continuum.

Theorems about Measurable Cardinals

Preliminary definitions: sup and regular.

If $S$ represents a set of ordinals, $\sup(S)$ = the first ordinal greater than any member of $S$.
A regular cardinal $k$ is one that cannot be expressed as $\sup(S)$ for any set $S$ containing fewer than $k$ ordinals less than $k$.

If MC = $(\exists a)$($a$ is a measurable cardinal), then

MC $\Rightarrow$ The General Continuum Hypothesis $\Rightarrow$ The Axiom of Choice

Also MC implies the existence of non-constructible sets, including non-constructible sets of natural numbers, and that there are only countably many constructible sets of natural numbers or of rational numbers.

It is possible to prove that any measurable cardinal $M$ must be inaccessible, and, assuming the Axiom of Choice, it must be strongly inaccessible, which means that

(1) $M$ is regular, and
(2) $a < M \Rightarrow 2^a < M$

Assuming the General Continuum Hypothesis, inaccessibility is equivalent to strong inaccessibility, so that in the case of a measurable cardinal, whose existence entails the General Continuum Hypothesis, the two are equivalent.

Some mathematicians doubt the General Continuum Hypothesis. Gödel himself thought that eventually reasons would be found to reject it. If that does happen, that would imply there are no measurable cardinals, and presumably no cardinals even larger than that.

Supposing some problem required the use of a set of measurable cardinality, what sort of members might that set have? The matter seems even more puzzling in the light of the Löwenheim-Skolem theorem, according to which every consistent first-order theory has a denumerable model. Thus even if a theory contains proofs of the existence of non denumerable sets, that theory can be interpreted as referring to a universe containing only denumerably many individuals. Of course denumerably many individuals can be used to construct a non denumerable collection of sets of individuals, but that does not explain the apparent existence of inaccessible cardinals which, by hypothesis, cannot be reached by repeated exponentiation of accessible cardinals.
Thus any system that appears to require some set of measurable cardinality can in fact be satisfied by a model with only denumerably many elements, and unless measurable cardinals can somehow be conjured up by clever arrangements of denumerably many elements, they can hardly be required.

Appendix 1

The use of the Axiom of Choice is usually defended pragmatically: ‘We need it to prove so and so ...’ In that case the theorems cited as unprovable without the Axiom must have greater plausibility than the Axiom itself, suggesting that it is they that should be axioms. I suspect that the motive for adding the Axiom of Choice is a wish to keep the system as simple as possible. Axiomatisation is increasingly seen as a tidying up operation, not as providing mathematics with a basis in a set of self evident truths.

Appendix 2

I think that some simple arithmetical propositions can be reduced to logic. That certainly seems possible with the sort of examples Philosophers like to use, which are particular propositions of the form a + b = c, where a, b, and c are natural numbers.
Consider ‘2 + 2 = 4’.

Analyse that as ‘If exactly two objects have property F, and exactly two objects have property G, and no object has both property F and property G, then exactly four objects either have property F or have property G’.

‘exactly two objects have property F’ is equivalent to

$(\exists x)(\exists y)[F(x) \;\&\; F(y) \;\&\; \neg(x = y) \;\&\; (\forall z)(F(z) \rightarrow \{(z = x) \lor (z = y)\})]$

‘exactly two objects have property G’ is equivalent to a similar formula with G in place of F.

‘no object has both property F and property G’ is equivalent to

$\neg(\exists x)(F(x) \;\&\; G(x))$

‘exactly four objects either have property F or have property G’ is equivalent to a formula which I shan’t write out in full, but the following fragments should give a general idea of what is required.

There are elements x, y, z, w, all distinct,

$(\exists x)(\exists y)(\exists z)(\exists w)(\neg(x = y) \;\&\; \neg(x = z) \;\&\; \neg(x = w) \dots \&\; \neg(y = z) \;\&\; \dots \neg(z = w))$

such that every one of them is either F or G,

$\dots((F(x) \lor G(x)) \;\&\; (F(y) \lor G(y)) \;\&\; (F(z) \lor G(z)) \;\&\; (F(w) \lor G(w)))$

none of them is both F and G,

$\dots \&\; \neg(\exists t)(F(t) \;\&\; G(t))$

(that actually says that nothing at all can be both F and G, but in the light of the next clause, that has the same truth value as the intended proposition),

and anything that is either F or G must be one of x, y, z, w:

$\dots \&\; (\forall t)((F(t) \lor G(t)) \rightarrow ((t = x) \lor (t = y) \dots))$

Note that nowhere have I quantified over variables representing numbers. I haven’t said that there is a number 2 or that there is a number 4. This analysis does not assert the existence of numbers, and they may, for the purposes of evaluating very simple expressions in arithmetic, be treated adjectivally. In this context ‘very simple’ includes addition, subtraction, multiplication, or division of any numbers, however large.
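The analysis of ‘2 + 2 = 4’ can be tried out mechanically on small finite domains. The Python sketch below is an illustration rather than a proof: it checks every interpretation of F and G over domains of up to five objects, whereas the logical claim concerns all domains. The function names are invented here, and `exactly_four` is applied to the combined predicate ‘F or G’ instead of spelling out each clause of the longer formula separately. The quantifiers are rendered with `any` and `all`, so the predicates themselves never count anything.

```python
from itertools import product

def exactly_two(P, domain):
    # (Ex)(Ey)[P(x) & P(y) & not(x = y) & (Az)(P(z) -> (z = x or z = y))]
    return any(
        P[x] and P[y] and x != y
        and all((not P[z]) or z == x or z == y for z in domain)
        for x in domain for y in domain
    )

def exactly_four(P, domain):
    # Four pairwise distinct witnesses satisfy P, and anything satisfying P is one of them.
    return any(
        x != y and x != z and x != w and y != z and y != w and z != w
        and P[x] and P[y] and P[z] and P[w]
        and all((not P[t]) or t in (x, y, z, w) for t in domain)
        for x in domain for y in domain for z in domain for w in domain
    )

def nothing_is_both(F, G, domain):
    # not (Ex)(F(x) & G(x))
    return not any(F[x] and G[x] for x in domain)

def two_plus_two_holds(n):
    """Check the conditional for every choice of F and G on a domain of n objects."""
    domain = range(n)
    for F in product([False, True], repeat=n):
        for G in product([False, True], repeat=n):
            if (exactly_two(F, domain) and exactly_two(G, domain)
                    and nothing_is_both(F, G, domain)):
                F_or_G = tuple(F[x] or G[x] for x in domain)
                if not exactly_four(F_or_G, domain):
                    return False   # a counterexample would land here
    return True

results = {n: two_plus_two_holds(n) for n in range(6)}
print(results)
```

On domains with fewer than four objects the antecedent can never be satisfied, so the conditional holds vacuously there. The informative cases are the domains of four and five objects.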
