
The Discovery of the Computer

Summary

David Hilbert posed a fundamental question: is it possible to decide whether any given theorem expressed in a logical system is true or false, without producing all the possible theorems of the system? This so-called “decision problem” was answered by Alan Turing, who showed that no procedure can decide this for every theorem. In doing this, he discovered the computer.

The story starts with Leibnitz, whose dream was to establish a formal system of logic which could be used to settle debates such as those found in courts of law. He developed a set of symbols and procedures which were the first step towards his dream. These are remarkably similar to the logic developed by George Boole (whom we know well from the Boolean variables in our computer programs and from the logic gates AND, OR, etc.). Much later Claude Shannon established a correspondence between Boole’s logic and electrical switching circuits, which means that electronic circuits can be used to solve problems in logic mechanically, i.e. without human thought. These circuits form the basis of the digital electronics in our contemporary computers.

Following Leibnitz, Frege established a stronger formal system of logic, which led to Hilbert posing his fundamental question. In considering Hilbert’s problem, Alan Turing developed an abstract machine designed to answer it. This “Turing Machine” captures the essence of our contemporary computer, which comprises a processor driven by a program. Yet Turing also went beyond this and showed that there is no difference between program and data; both are numbers. This was inspired by Kurt Gödel, who showed how the symbols used in logic could be represented as numbers. Processing of these symbols could then be reduced to arithmetical operations on the associated numbers.
In later sessions, we shall see how this representation makes an abstract Turing Machine equivalent to a “register” based architecture such as the RISC and CISC machines we have studied. We shall also prove that computer languages can be developed with three or fewer basic programming constructs. But that is later.

Leibnitz

Leibnitz proposed an algebra of logic. This was similar to the algebra of mathematics, which gives rules on how to work with numbers, such as addition, subtraction and multiplication. In his algebra we have the following concepts:

Definition: B ⊕ C = L
    L is the set of objects made up of those in B taken together with those in C. If B is the set of men and C is the set of women, then L is the set of men and women.

Axiom 1: B ⊕ C = C ⊕ B
    The set of men taken together with the set of women is the same as the set of women taken together with the set of men.

Axiom 2: A ⊕ A = A
    The set of fish taken together with the set of fish is just the set of fish (if A represents the set of fish).

It is interesting to note that axiom 1 looks rather like the algebraic rule x + y = y + x, for example 2 + 4 = 4 + 2. Yet axiom 2 looks like the expression x + x = x, for example 2 + 2 = 2, which is clearly not true in arithmetic, so the notations and operations are clearly not equivalent. Or could they be?

Boole

Boole’s algebra also worked with sets or classes. For example, if x represents the class of living things and y represents the class of cats, then what is x.y (“x times y”)? Boole interpreted this as the class of things in both x AND y. In this case it would be (the class of living things) AND (the class of cats) = live cats! As another example, if x is the class of men and y is the class of children, then x.y is the class of (men) AND (children) = boys!

What about the expression x.x = x? This is rather like Leibnitz’s Axiom 2. What does it mean here?
Taking the above two examples, we find first that (the class of living things) AND (the class of living things) = (the class of living things). Or (the class of men) AND (the class of men) = (the class of men). Fairly straightforward.

But then Boole made a master-stroke. What does the expression x.x = x mean in ordinary algebra, involving numbers? In other words, which numbers can we plug into this expression to make it true? A little reflection shows that there are only two possibilities, 0 and 1:

    0.0 = 0
    1.1 = 1

So he discovered (invented?) “Boolean variables”.

Now, ordinary algebra deals with addition and subtraction, so Boole had to reflect on the meaning of expressions such as x + y and x - y. This was straightforward. Let us first consider x + y. If x is the class of men and y is the class of women, then x + y = the class of men taken together with the class of women = the class of adults = z. Now subtraction is easy to understand. If z is the class of adults and x is the class of men, then it follows at once that z - x = the class of adults who are not men = the class of women.

Returning to the equation x.x = x, which has two algebraic solutions, 0 and 1, we ask how to interpret these in the language of classes. It is not difficult to see that “1” refers to every object under consideration (within the bounds of the discussion) and that “0” is the empty set containing no objects. With this interpretation, what does the following expression mean?

    1 - x

Well, if 1 means everything, then 1 - x means “everything not in x”. For example, if x is the class of men, then 1 - x is everything else: women, cats, coffee pots, … dirty socks.

Now let’s return to the expression x.x = x (where x can be 0 or 1 using numbers) and ask what this means in the language of classes.
We make the following re-arrangement:

    x.x = x
    0 = x - x.x      (subtracting x.x from both sides)
    0 = x.(1 - x)    (factoring out x)

Well, taking x = the class of men, we know that (1 - x) is the class of everything which is not a man. So the above expression means “(the class of men) AND (the class of everything which is not a man) is empty (= 0)”. In other words, “you can’t be both a man and not-a-man (e.g. a mouse)”.

This logical system can be applied to the logic studied by Aristotle, which handles a form of inference called the “syllogism”, proceeding from premises to a conclusion. The premises and conclusions were sentences of a restricted type such as “All cats are animals”. This could be written “All X are Y”. Here is a typical syllogism:

    All X are Y
    All Y are Z
    Therefore All X are Z

We say that this syllogism is “valid” since if the premises are true then the conclusion is true. For example:

    All cats are mammals
    All mammals are vertebrates
    Therefore All cats are vertebrates

Boolean logic can be used to demonstrate that this syllogism is valid, as we shall now see. But how do we “code” “All X are Y” in Boolean algebra? Remember the meaning of x.(1 - x) = 0 above: nothing belongs to both the class x and the class which is everything else. Well, if we write x.(1 - y) = 0, then this means that there is nothing which belongs to x but not to y. That means everything in x belongs to y, in other words “All X are Y”. Finally, we can unravel X.(1 - Y) = 0 by multiplying out to X - XY = 0, which gives the equivalent expression X = XY. So here is the coding of the syllogism:

    All X are Y              X = XY    (1)
    All Y are Z              Y = YZ    (2)
    Therefore All X are Z    X = XZ    (3)

Now we must prove the conclusion (3). Here are the steps:

(i) Take the first expression X = XY and replace Y with the second, Y = YZ, which gives us X = XYZ.
(ii) Group the terms like this: X = (XY)Z.
(iii) Now replace the XY in the brackets with X, using the first expression X = XY. That gives us X = XZ, i.e. “All X are Z”.

Which proves the conclusion.
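The coding of “All X are Y” and the syllogism itself can also be checked by brute force. The sketch below is only an illustration: it assumes a tiny, hypothetical universe of three objects to play the role of Boole’s “1”, reads “.” as set intersection and “1 - y” as the complement, and tries every possible choice of classes. The names UNIVERSE, all_are and the helper functions are inventions for this example. An exhaustive check over one small universe is not a proof in general; the algebraic steps (i) to (iii) are.

```python
from itertools import combinations, product

# Hypothetical small universe standing in for Boole's "1" (everything under discussion).
UNIVERSE = frozenset({"cat", "dog", "trout"})


def all_subsets(universe):
    """Return every class (subset) that can be formed from the universe."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def all_are(x, y):
    """Boole's coding of "All X are Y": x.(1 - y) = 0."""
    return x & (UNIVERSE - y) == frozenset()


def codings_agree():
    """Check that x.(1 - y) = 0 and x = x.y say the same thing for every pair of classes."""
    classes = all_subsets(UNIVERSE)
    return all(all_are(x, y) == (x == x & y) for x, y in product(classes, repeat=2))


def syllogism_holds_everywhere():
    """Whenever both premises hold, the conclusion must hold too."""
    classes = all_subsets(UNIVERSE)
    for x, y, z in product(classes, repeat=3):
        if all_are(x, y) and all_are(y, z) and not all_are(x, z):
            return False  # a counterexample would make the syllogism invalid
    return True


if __name__ == "__main__":
    print(codings_agree())
    print(syllogism_holds_everywhere())
```

Both functions return True for this universe, which agrees with the algebraic proof.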
Let’s now see how Boolean algebra can be used to check whether a more complex argument is correct. For example, consider the following argument (which may be valid or invalid; it’s our job to find out):

    If Higgins was born in Bristol, then Higgins is not a Cockney.
    Higgins is either a Cockney or an impersonator.
    Higgins is not an impersonator.
    Therefore Higgins was born in Bristol.

If we strip this paragraph apart, then we can identify the following atomic sentences:

    P = Higgins was born in Bristol.
    Q = Higgins is a Cockney.
    R = Higgins is an impersonator.

Using these atomic sentences, we can code the above paragraph as:

    If P then NOT Q
    Q OR R = true
    R = false
    Therefore P = true

Boole saw that the same algebra that worked for classes could also work for inferences of this kind. He interpreted an expression like X = 1 to mean that X is true. Likewise X = 0 means that X is false. With this understanding the results below followed:

    “X and Y” is written XY = 1              since this holds only if X and Y are both 1
    “X or Y” is written X + Y = 1            since this holds if either X or Y is 1
    “If X then Y” is written X.(1 - Y) = 0   since putting X = 1 gives (1 - Y) = 0, hence Y = 1

Using these ideas, the above “coding” of the paragraph can be written as follows, where “not Q” stands for 1 - Q:

    P.(1 - not Q) = 0    (1)
    Q + R = 1            (2)
    R = 0                (3)
    P = 1                (4)

So let’s see if this argument is valid. Using (3), putting R = 0 into (2) we find Q = 1, so not Q = 0. Putting this into (1) together with the value of P from (4) we find

    1.(1 - 0) = 0
    1.1 = 0

This is clearly incorrect: one times one is not zero. So the argument is invalid, because the premises cannot hold together with the conclusion P = 1. To repair it, let’s change the conclusion to P = 0. Putting this value of P together with Q = 1 (so not Q = 0) into (1) we now get

    0.(1 - 0) = 0
    0.1 = 0

which is now OK. So the above values for P, Q, R give us the following understanding:

    P = 0    Higgins was not born in Bristol.
    Q = 1    Higgins is a Cockney.
    R = 0    Higgins is not an impersonator.

You will have recognized our “truth tables” in the above discussion. They capture the essence of Boolean algebra, but were not used by Boole.
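The Higgins argument can also be checked mechanically by running through all eight rows of the truth table for P, Q and R. The following is a minimal sketch, not Boole’s own method: the function names are hypothetical, and it assumes that “Q + R = 1” is read as “at least one of Q, R is 1” (so the sum is capped at 1 when both are 1).

```python
from itertools import product


def premises_hold(p, q, r):
    """Evaluate the three coded premises for one row of the truth table."""
    not_q = 1 - q
    premise_1 = p * (1 - not_q) == 0   # (1) If P then NOT Q
    premise_2 = min(q + r, 1) == 1     # (2) Q OR R (assumption: the sum is capped at 1)
    premise_3 = r == 0                 # (3) NOT R
    return premise_1 and premise_2 and premise_3


def consistent_assignments():
    """All (P, Q, R) rows in which every premise is true."""
    return [(p, q, r) for p, q, r in product((0, 1), repeat=3) if premises_hold(p, q, r)]


def argument_is_valid():
    """The argument is valid only if P = 1 in every row where the premises hold."""
    return all(p == 1 for p, _q, _r in consistent_assignments())


if __name__ == "__main__":
    print(consistent_assignments())
    print(argument_is_valid())
```

Only one row satisfies all three premises, namely P = 0, Q = 1, R = 0, so the stated conclusion P = 1 does not follow.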
It is suggested that Lewis Carroll discovered truth tables in 1894.

Summary. Boole produced a logic where variables x, y, z, … have one of two values (0, 1). He also produced two operations, AND and OR. His logic was able to provide “inferences”.

Claude Shannon

Now we jump ahead in time almost 100 years to reflect on the work of Claude Shannon, who in his MSc thesis “A Symbolic Analysis of Relay and Switching Circuits” showed how Boolean algebra could be realized by simple electrical switches. Here is an image from his thesis.

[Figure from Shannon’s thesis: Figs. 1, 2 and 3]

In Fig. 1 he shows a simple switch which is open. Fig. 2 shows two switches in series, which he proves can provide the Boolean operator “OR” (X + Y), and in Fig. 3 he shows two switches in parallel, which provide the Boolean operator “AND” (X.Y).

He uses the Boolean variable in a rather strange way. An open switch is given the value X = 1 and a closed switch X = 0. We would probably do the opposite today. This is because he was thinking in terms of “hindrances”, whereas we would think in terms of “conduction”. A “hindrance” is a block to electrical flow, i.e. an “open” switch.

Let’s see how Fig. 2 will generate a truth table.

    X = 0 (closed)    Y = 0 (closed)    Total Hindrance = 0    X + Y = 0
    X = 0 (closed)    Y = 1 (open)      Total Hindrance = 1    X + Y = 1
    X = 1 (open)      Y = 0 (closed)    Total Hindrance = 1    X + Y = 1
    X = 1 (open)      Y = 1 (open)      Total Hindrance = 1    X + Y = 1

This has generated the truth table

    X    Y    X+Y
    0    0    0
    0    1    1
    1    0    1
    1    1    1

which clearly corresponds to the “OR” function, where X + Y is true if X is true OR Y is true OR both are true.

Now let’s have a look at his Fig. 3.

    X = 0 (closed)    Y = 0 (closed)    Total Hindrance = 0    X.Y = 0
    X = 0 (closed)    Y = 1 (open)      Total Hindrance = 0    X.Y = 0
    X = 1 (open)      Y = 0 (closed)    Total Hindrance = 0    X.Y = 0
    X = 1 (open)      Y = 1 (open)      Total Hindrance = 1    X.Y = 1

This has generated the following truth table

    X    Y    X.Y
    0    0    0
    0    1    0
    1    0    0
    1    1    1

which corresponds to the “AND” function, where X.Y is true only if X AND Y are both true.
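The two tables can be reproduced by simulating the circuits in terms of conduction and then converting back to Shannon’s hindrance values. This is only a sketch under the convention stated in the text (hindrance 1 = open switch, 0 = closed switch); the function names are made up for this example and do not come from Shannon’s thesis.

```python
from itertools import product


def conducts(hindrance):
    """A switch lets current through when it is closed, i.e. when its hindrance is 0."""
    return hindrance == 0


def series_hindrance(x, y):
    """Fig. 2: current flows through a series circuit only if both switches conduct."""
    current_flows = conducts(x) and conducts(y)
    return 0 if current_flows else 1


def parallel_hindrance(x, y):
    """Fig. 3: current flows through a parallel circuit if at least one switch conducts."""
    current_flows = conducts(x) or conducts(y)
    return 0 if current_flows else 1


def truth_table(circuit):
    """List (X, Y, total hindrance) for all four switch settings."""
    return [(x, y, circuit(x, y)) for x, y in product((0, 1), repeat=2)]


if __name__ == "__main__":
    for name, circuit in (("series", series_hindrance), ("parallel", parallel_hindrance)):
        print(name)
        for row in truth_table(circuit):
            print(row)
```

Written in terms of conduction, series is an AND and parallel is an OR; written in terms of hindrance the roles swap, which is why Shannon’s series circuit gives X + Y and his parallel circuit gives X.Y.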
Of course today we do not use mechanical switches in our computers; we use electronic switches called “gates”. Our AND and OR gates have exactly the truth tables written above. Shannon therefore laid the ground for a digital electronic implementation of an abstract computer (whatever that is… we shall see later). But he also emphasised the correspondence between electrical circuits and the propositional calculus we have been discussing above in the section on Boole. The workshop associated with this session is based upon Shannon’s appreciation of electrical circuits and logical inferences. There you will work with simulated logic gates to explore the logic of inference.

Hilbert

Together with Ackermann, David Hilbert published a book, “Basics of a Theory of Logic”, which developed Frege’s “first order logic”. While we have not discussed this, it extends the logic of AND, OR, NOT, IF with “there exists” and “for all”. Hilbert and Ackermann showed that mathematics could be described by this new logic, and they raised three important questions.

Completeness. For any valid inference, would it be possible to prove the conclusion given the premises? In other words, did their system of logic provide all possible correct proofs? (This was answered by Gödel a couple of years later, in the affirmative.)

Consistency. Not required for the discussion.

The Decision Problem. Could an algorithm be found which would tell you if a proposed inference is valid without constructing an entire proof? Constructing a proof may take a large number of steps. But this algorithm (if it existed) would take a finite number of simple steps. Since mathematics could be expressed in this logic, if this algorithm existed, then it could be used to answer any mathematical question.

As mentioned above, Alan Turing turned to finding such an algorithm, and through his reasoning process discovered the computer.
To understand what this decision problem is all about, we now turn to the study of two formal systems, the “MIU” system and the “pq” system, presented by Douglas Hofstadter in his book “Gödel, Escher, Bach: an Eternal Golden Braid”.

Let’s start with the MIU system. This is a formal system known as a “Post production system” after the logician Emil Post. It is all about strings formed from three letters M, I, U, such as MU, UMIM, MIMU and so on. The system gives you a starting string called an “axiom” and also “rules” telling you how to “produce” other strings. These strings you produce are called “theorems”. To recap, you start with an axiom and rules and you end up with theorems. Here are the axiom and rules:

    Axiom     MI
    Rule 1    If the last letter is I then you can add a U to the end.
    Rule 2    If you have Mx (where x is a string of any length) then you can form Mxx.
    Rule 3    You can replace any III in a string with U.
    Rule 4    You can delete any UU in a string.

Here are some theorems generated in this system.

    MI        Axiom
    MIU       Rule 1
    MIUIU     Rule 2

    MI        Axiom
    MII       Rule 2
    MIIII     Rule 2
    MIU       Rule 3

    MIIII
    MIIIIU    Rule 1
    MIUU      Rule 3
    MI        Rule 4

Clearly we can generate a huge number of theorems using these production rules. But let’s return to understanding what the “decision problem” is about. How can we find out if the three strings noted above, MU, UMIM, MIMU, are theorems of the system, i.e. can they be produced by the system? There are two possible approaches. First, we can just use the rules in various combinations until the string MU or UMIM pops out. Then we know that the string is indeed a theorem. But what if it does not pop out, say after 100 production steps? We can’t conclude that it is not a theorem, since perhaps it takes 200 production steps to generate it. Or perhaps it really is not a theorem and we will never produce it, but establishing that this way requires an infinite amount of time. Clearly this approach is useless as a way of deciding. And that’s the basis of Hilbert’s decision question.
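The first approach, applying the rules in every combination and waiting for the string to pop out, can be sketched as a breadth-first search from the axiom. The code below is an illustration only: the names successors and found_within and the step limit are assumptions made for this example, and Rule 1 is treated as optional, as the derivations above do.

```python
AXIOM = "MI"


def successors(s):
    """Every string reachable from s by one application of one rule."""
    out = set()
    if s.endswith("I"):
        out.add(s + "U")                        # Rule 1: xI -> xIU
    if s.startswith("M"):
        out.add(s + s[1:])                      # Rule 2: Mx -> Mxx
    for i in range(len(s) - 2):
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])    # Rule 3: III -> U
    for i in range(len(s) - 1):
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])          # Rule 4: delete UU
    return out


def found_within(target, max_steps):
    """Search all derivations of at most max_steps rule applications.

    True means the target is a theorem. False only means "not found yet":
    it says nothing about derivations longer than max_steps.
    """
    if target == AXIOM:
        return True
    seen = {AXIOM}
    frontier = {AXIOM}
    for _ in range(max_steps):
        next_frontier = set()
        for s in frontier:
            for t in successors(s):
                if t == target:
                    return True
                if t not in seen:
                    seen.add(t)
                    next_frontier.add(t)
        frontier = next_frontier
    return False


if __name__ == "__main__":
    print(found_within("MIUIU", 2))
    print(found_within("MU", 5))
```

The search confirms MIUIU after two steps, but for MU it can only report that nothing has turned up so far, however large the step limit is made.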
Can we find an algorithm (i.e. a process which executes in a finite amount of time) which can decide the problem for us? This process must work for all possible strings we can write down. It’s easy to see that UMIM cannot be a theorem: we decide by looking at the string and at the rules and noticing that the string starts with a U. All theorems clearly start with M. But what about MU? We have no idea what the decision procedure should be. Perhaps we can find a decision procedure, perhaps we can’t. That wasn’t Hilbert’s question. He wanted a proof that such a decision procedure exists or doesn’t exist, and that’s where Turing came in.

Let’s take another system from Hofstadter, the “pq” system. Here are the axioms and the rule (x, y and z stand for strings of hyphens):

    Axioms    xp-qx- is an axiom if x is composed of hyphens, e.g. -p-q--
    Rule 1    If xpyqz is a theorem then xpy-qz- is a theorem.

Let’s make a couple of derivations, first starting from the axiom -p-q--, i.e. x = - and y = -. Then we get

    -p-q--
    -p--q---
    -p---q----
    -p----q-----

Now let’s take x = -- and y = -. Then we get the axiom --p-q--- and the theorems are

    --p-q---
    --p--q----
    --p---q-----
    --p----q------

For this system, we do have a decision procedure. For example, you can tell that the string --p---q----- is a theorem of this system, but --p-----q--- is not a theorem. How? Well, this works by jumping outside the system and noticing that all theorems express the simple rule of arithmetic addition, where p means “plus” and q means “equals”. So --p----q------ means “two plus four is six”. That this works follows from the recipe to make axioms, which clearly produces strings which display the addition property. The production rule simply adds one onto the second operand and one onto the resulting sum. But remember, we did not need to know this “higher-level” interpretation of the strings; our formal system was able to generate them automatically.

Let us return to the “decision problem” here.
We have been able to find a decision procedure by jumping out of the formal system, using our intelligence, and recognising a system with which we are familiar, the addition of numbers. Recognition of this similarity has given meaning to the original formal system. Unfortunately this was not what Hilbert had in mind when he was looking for an algorithm to decide if any given theorem was true in the logical system developed by Frege. He wanted to know if that algorithm could be made using the grammar of the system itself, and not by jumping out of it as we have just done.
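The decision procedure found for the pq system, by reading p as “plus” and q as “equals”, can be written down in a few lines, together with a generator that works purely inside the system by applying the axiom recipe and Rule 1. This is a sketch for illustration; the names is_pq_theorem and derive and the use of a regular expression are choices made here, not part of Hofstadter’s presentation.

```python
import re

# A well-formed candidate: hyphens, then p, hyphens, then q, hyphens.
PATTERN = re.compile(r"(-+)p(-+)q(-+)")


def is_pq_theorem(s):
    """Decide from outside the system: count hyphens and check the addition."""
    match = PATTERN.fullmatch(s)
    if match is None:
        return False
    x, y, z = (len(group) for group in match.groups())
    return x + y == z


def derive(x_len, steps):
    """Work inside the system: build the axiom xp-qx-, then apply Rule 1 repeatedly."""
    x = "-" * x_len
    theorem = x + "p-q" + x + "-"
    theorems = [theorem]
    for _ in range(steps):
        left, right = theorem.split("q")
        theorem = left + "-q" + right + "-"   # Rule 1: xpyqz -> xpy-qz-
        theorems.append(theorem)
    return theorems


if __name__ == "__main__":
    print(derive(2, 3))
    print(is_pq_theorem("--p---q-----"))
    print(is_pq_theorem("--p-----q---"))
```

The decision function always finishes after one pass over the string, whereas derive can only ever confirm theorems one at a time; the two agree on every string that derive produces.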
