The Discovery of the Computer
Summary
David Hilbert posed a fundamental question: is it possible to decide whether any given statement
expressed in a logical system is a theorem of that system, without producing all the possible theorems of the system? This
so-called “decision problem” was answered by Alan Turing, who showed that no general procedure can make
this decision for every statement. In doing this, he discovered the computer.
The story starts with Leibnitz whose dream was to establish a formal system of logic which could be used to
solve debates such as found in courts of law. He developed a set of symbols and procedures which were the
first step to his dream. These are remarkably similar to the logic developed by George Boole (whom we know
well from Boolean variables in our computer programs, and the logic gates, AND, OR, etc.). Much later
Claude Shannon established a correspondence between Boole’s logic and electrical switching circuits, which
means that electronic circuits can be used to solve problems in logic mechanically, i.e. without human thought.
These circuits form the basis of all electronics present in our contemporary computers. Following Leibnitz,
Frege established a stronger formal system of logic which led to Hilbert posing his fundamental question. In
consideration of Hilbert’s problem, Alan Turing developed an abstract machine designed to answer Hilbert’s
question. This “Turing Machine” captures the essence of our contemporary computer, which comprises a
processor driven by a program. Yet Turing also went beyond this and showed that there is no difference
between program and data; both are numbers. This was inspired by Kurt Gödel, who showed how the symbols
used in logic could be represented as numbers. Processing of these symbols could be reduced to arithmetical
operations on the associated numbers. In later sessions, we shall use this representation to show that an
abstract Turing Machine is equivalent to a “register” based architecture such as the RISC and CISC machines
we have studied. We shall also show how computer languages can be developed with three or fewer basic
programming constructs. But that is later.
Leibnitz
Leibnitz proposed an algebra of logic. This was similar to the algebra of mathematics, which gave rules on
how to work with numbers, such as addition, subtraction and multiplication. In his algebra we have the
following concepts:
Definition
B C  L
Axiom 1
B C  C  B
Axiom 2
A A  A
L is a set of object
comprised by those both
in B and C.
B is the set of men and C is the set of women. So
L is the set of men and women.
The set of men taken together with the set of
women is the same as the set of women taken
together with the set of men
The set of fish taken together with the set of fish
is just the set of fish (if A represents the set of
fish).
It is interesting to note that axiom 1 looks rather like the algebra x + y = y + x, for example 2 + 4 = 4 + 2. Yet
axiom 2 looks like the mathematical expression x + x = x, for example, 2 + 2 = 2, which is clearly not true
(according to arithmetic), so clearly the notations and operations are not equivalent. Or could they be?
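As a rough illustration (not something Leibnitz wrote), the two axioms can be tried out with Python sets, assuming that “taken together” is read as set union. The example members and the function name `together` are invented for this sketch.

```python
# Hypothetical example classes; the members are made up for illustration.
men = {"Adam", "Bob"}
women = {"Carol", "Dana"}
fish = {"cod", "trout"}


def together(b, c):
    """'B taken together with C', read here as set union (an assumption)."""
    return b | c


# Axiom 1: the order of the two sets does not matter.
axiom1_holds = together(men, women) == together(women, men)

# Axiom 2: taking a set together with itself adds nothing new.
axiom2_holds = together(fish, fish) == fish

print(axiom1_holds, axiom2_holds)
```

Unlike ordinary addition, nothing is counted twice here, which is why axiom 2 can hold for sets even though 2 + 2 = 2 fails for numbers.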
Boole
Boole’s algebra also worked with sets or classes. For example, if x represents the class of living things, and y
represents the class of cats, then what is
x.y
(“x times y”)
Boole interpreted this as the class of things in both x AND y. In this case it would be (the class of living
things) AND (the class of cats) = live cats! Or another example if x is the class of men and y is the class of
children then x.y is the class of (men) AND (children) = boys!
What about the expression
x.x = x
?
This is rather like Leibnitz’s Axiom 2. What does it mean here? Taking the above two examples, we find first
that (the class of living things) AND (the class of living things) = (the class of living things). Or (the class of
men) AND (the class of men) = (the class of men). Fairly straightforward. But then Boole made a master-stroke.
What does the expression
x.x = x
mean in ordinary algebra, involving numbers?
In other words, which numbers can we plug into this expression to make it true? A little reflection suggests
that there are only two possibilities, 0 and 1:
0.0 = 0
1.1 = 1
So he discovered (invented?) “Boolean variables”. Now, ordinary algebra deals with addition and subtraction,
so Boole had to reflect on the meaning of expressions such as x + y and x – y. This was straightforward. Let
us first consider x + y. If x is the class of men and y is the class of women then
x + y = the class of men taken together with the class of women = class of adults = z.
Now subtraction is easy to understand. If z is the class of adults and x is the class of men then it follows at
once that
z – x = the class of adults which are not men = the class of women.
Returning to the solution of the equation x.x = x which had two algebraic values, 0 and 1, we ask how to
interpret these in the language of classes? It is not difficult to see that “1” refers to every object under
consideration (within the bounds of the discussion) and that “0” is the empty set containing no objects. With
this interpretation, then what does the following expression mean?
1–x
Well, if 1 means everything, then 1 – x means “everything not in x”. For example if x is the class of men, then
1 – x is everything else, from women, cats, coffee pots, … dirty socks.
Now let’s return to the expression x.x = x (where x can be 0 or 1 using numbers) and ask what does this mean
in the language of classes? We make the following re-arrangement:
x.x = x
0 = x – x.x       (subtracting x.x from both sides)
0 = x.(1 – x)     (factoring out x)
Well, taking x = the class of men, we know that (1 – x) is the class of everything which is not men. So the
above expression means “the class of things which are both men AND not men is
empty (= 0)”. In other words, “you can’t be both a man and not-a-man (e.g. a mouse)”.
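The class reading of Boole’s algebra can be sketched with Python sets. This is only an illustration under stated assumptions: the small universe of discourse, its members, and the helper names `times` and `complement` are all invented here.

```python
# Hypothetical universe of discourse: Boole's "1".
UNIVERSE = frozenset({"man_a", "man_b", "woman_a", "cat", "coffee pot"})
EMPTY = frozenset()  # Boole's "0"


def times(x, y):
    """x.y : the class of things in both x AND y."""
    return x & y


def complement(x):
    """1 - x : everything in the universe that is not in x."""
    return UNIVERSE - x


men = frozenset({"man_a", "man_b"})

# Which ordinary numbers satisfy x.x = x? Try a small range of integers.
numeric_solutions = [n for n in range(-10, 11) if n * n == n]

# For classes, x.x = x always holds ...
idempotent = times(men, men) == men

# ... and x.(1 - x) = 0: nothing is both a man and not a man.
no_contradiction = times(men, complement(men)) == EMPTY

print(numeric_solutions, idempotent, no_contradiction)
```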
This logical system can be applied to the logic studied by Aristotle, which handles a form of inference called
“syllogisms” which proceed from premises to conclusions. The premises and conclusions were sentences
of a restricted type such as “All cats are animals”. This could be written “All X are Y”. Here is a typical
syllogism:
            All X are Y
            All Y are Z
Therefore   All X are Z
We say that this syllogism is “valid” since if the premises are true then the conclusion is true. For example
            All cats are mammals
            All mammals are vertebrates
Therefore   All cats are vertebrates
Boolean logic can be used to demonstrate that this syllogism is valid as we shall now see. But how do we
“code” “All X are Y” in Boolean algebra? Remember the meaning of x.(1 – x) = 0 above, that nothing belongs
to both class x and the class which is everything else. Well, if we write x.(1 – y) = 0, then this means that
there is nothing which belongs to x but not to y. That means everything in x belongs to y, in other words “All
X are Y”. Finally, we can multiply out X.(1 – Y) = 0 to get X – XY = 0, which is the equivalent expression
X = XY. So here’s the coding of the above two premises.
            All X are Y      X = XY     (1)
            All Y are Z      Y = YZ     (2)
Therefore   All X are Z      X = XZ     (3)
Now we must prove the conclusion (3). Here are the steps: (i) take the first expression X = XY and replace Y
with the second, Y = YZ, which gives us X = XYZ; (ii) group the terms like this: X = (XY)Z; (iii) now replace the
XY in the brackets with X using the first expression X = XY. That gives us
X = XZ, i.e. All X are Z
which proves the conclusion.
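The same conclusion can be checked mechanically. The sketch below is an illustration, not Boole’s method: it tries only the numeric values 0 and 1 for X, Y and Z and looks for a case where both premises hold but the conclusion fails. The helper name `all_are` is invented.

```python
from itertools import product


def all_are(x, y):
    """'All X are Y', coded as X = X.Y over the values 0 and 1."""
    return x == x * y


# Search every 0/1 assignment for premises that hold while the conclusion fails.
counterexamples = [
    (x, y, z)
    for x, y, z in product((0, 1), repeat=3)
    if all_are(x, y) and all_are(y, z) and not all_are(x, z)
]

print(counterexamples)
```

An empty list means no counterexample exists among the 0/1 assignments, in agreement with the algebraic proof.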
Let’s now see how Boolean algebra can be used to check whether a more complex argument is correct. For
example, consider the following argument (which may be valid or invalid – it’s our job to find out)
If Higgins was born in Bristol, then Higgins is not a Cockney. Higgins is either a
Cockney or an impersonator. Higgins is not an impersonator. Therefore Higgins was
born in Bristol.
If we strip this paragraph apart, then we can identify the following atomic sentences:
P = Higgins was born in Bristol
Q = Higgins is a Cockney.
R = Higgins is an impersonator.
Using these atomic sentences, we can code the above paragraph as
If P then NOT Q
Q OR R = true
R = false
P = true.
Boole saw that the same algebra that worked for classes could also work for inferences of this kind. He
interpreted an expression like X = 1 to mean that X is true. Likewise X = 0 means that X is false. With this
understanding the results below followed:
“X and Y” is written          XY = 1            since this holds only if X and Y are both 1
“X or Y” is written           X + Y = 1         since this holds if either X or Y is 1
“If X then Y” is written      X.(1 – Y) = 0     since putting X = 1 gives (1 – Y) = 0, hence Y = 1
Using these ideas, the above “coding” of the paragraph can be written
P.(1 – not Q) = 0     (1)
Q + R = 1             (2)
R = 0                 (3)
P = 1                 (4)
So let’s see if this argument is valid. Using (3), putting R = 0 into (2) we find Q = 1, so that “not Q” = 0.
Putting this into (1) together with the value of P from (4) we find
1.(1 – 0) = 0
1.1 = 0
This is clearly incorrect: one times one is not zero. So the argument is invalid. To repair it, let’s
change the conclusion to P = 0. Putting this value of P together with Q = 1 (so that “not Q” = 0) into (1) we now get
0.(1 – 0) = 0
0.1 = 0
Which is now OK. So the above values for P, Q, R give us the following understanding
P = 0     Higgins was not born in Bristol
Q = 1     Higgins is a Cockney
R = 0     Higgins is not an impersonator
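The Higgins argument can also be checked by brute force over all eight 0/1 assignments. This is a sketch under assumptions: the function name `premises_hold` is invented, and “Q OR R” is computed with `max`, which agrees with Q + R = 1 whenever Q and R are not both 1.

```python
from itertools import product


def premises_hold(p, q, r):
    """True when premises (1) to (3) of the Higgins argument all hold."""
    not_q = 1 - q
    return (
        p * (1 - not_q) == 0      # (1) If P then NOT Q
        and max(q, r) == 1        # (2) Q OR R (assumed reading, see above)
        and r == 0                # (3) NOT R
    )


# Every assignment (P, Q, R) that makes all the premises true.
models = [
    (p, q, r)
    for p, q, r in product((0, 1), repeat=3)
    if premises_hold(p, q, r)
]

# The argument is valid only if P = 1 in every such assignment.
argument_valid = all(p == 1 for p, q, r in models)

print(models, argument_valid)
```

The only assignment satisfying the premises has P = 0, Q = 1, R = 0, so the stated conclusion P = 1 does not follow.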
You will have recognized our “truth tables” in the above discussion. They capture the essence of Boolean
algebra, but were not used by Boole. It is suggested that Lewis Carroll discovered them in 1894.
Summary. Boole produced a logic where variables x,y,z,…have one of two values (0,1). He also produced
two operations, AND and OR. His logic was able to provide “inferences”.
Claude Shannon
Now we jump ahead in time almost 100 years to reflect on the work of Claude Shannon, who in his MSc
Thesis “A Symbolic Analysis of Relay and Switching Circuits” showed how Boolean algebra could be
realized by simple electrical switches. Here’s an image from his thesis
In Fig.1 he shows a simple switch which is open. Fig.2 shows two switches in series which he proves can
provide the Boolean operator “OR” (X + Y) and in Fig.3 he shows two switches in parallel which provide the
Boolean operator “AND” (X.Y). He uses the Boolean variable in a rather strange way. An open switch is given
the value X = 1 and a closed switch X = 0. We would probably do the opposite today. This is because he was
thinking in terms of “hindrances”, whereas we would think in terms of “conduction”. A “hindrance” is a block
to electrical flow, i.e. an “open” switch. Let’s see how Fig.2 will generate a truth table.
X = 0 (closed)   Y = 0 (closed)   Total Hindrance = 0   X + Y = 0
X = 0 (closed)   Y = 1 (open)     Total Hindrance = 1   X + Y = 1
X = 1 (open)     Y = 0 (closed)   Total Hindrance = 1   X + Y = 1
X = 1 (open)     Y = 1 (open)     Total Hindrance = 1   X + Y = 1
This has generated the truth table
X   Y   X+Y
0   0   0
0   1   1
1   0   1
1   1   1
Which clearly corresponds to the “OR” function, where X + Y is true if X is true OR Y is true OR both are
true.
Now let’s have a look at his Fig.3
X = 0 (closed)   Y = 0 (closed)   Total Hindrance = 0   X.Y = 0
X = 0 (closed)   Y = 1 (open)     Total Hindrance = 0   X.Y = 0
X = 1 (open)     Y = 0 (closed)   Total Hindrance = 0   X.Y = 0
X = 1 (open)     Y = 1 (open)     Total Hindrance = 1   X.Y = 1
Which has generated the following truth table
X   Y   X.Y
0   0   0
0   1   0
1   0   0
1   1   1
which corresponds to the “AND” function, where X.Y is true only if X AND Y are both true.
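Both truth tables can be regenerated from the hindrance idea. In this minimal sketch, 1 stands for an open switch (a hindrance) and 0 for a closed one, as in the text; the function names `series` and `parallel` are invented, and `max` is used for the series case so that two open switches give a hindrance of 1 rather than 2.

```python
def series(x, y):
    """Two switches in series: the path is blocked if either switch is open."""
    return max(x, y)  # plays the role of X + Y


def parallel(x, y):
    """Two switches in parallel: the path is blocked only if both are open."""
    return x * y      # plays the role of X.Y


# One row per combination of switch states: (X, Y, series, parallel).
rows = [(x, y, series(x, y), parallel(x, y)) for x in (0, 1) for y in (0, 1)]

print("X Y X+Y X.Y")
for x, y, s, p in rows:
    print(x, y, s, "  ", p)
```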
Of course today we do not use mechanical switches in our computers; we use electronic switches called
“gates”. Our AND and OR gates have exactly the truth tables written above. Shannon therefore laid the
ground for a digital electronic implementation of an abstract computer (whatever that is… we shall see later).
But he also emphasised the correspondence between electrical circuits and the propositional calculus we have
been discussing above in the section on Boole.
The workshop associated with this session is based upon Shannon’s appreciation of electrical circuits and
logical inferences. Here you will work with simulated logic gates to explore the logic of inference.
Hilbert
Together with Ackermann, David Hilbert published a book, “Grundzüge der theoretischen Logik” (“Basics of a
Theory of Logic”), which developed Frege’s “first order logic”. While we have not discussed this, it extends
the logic of AND, OR, NOT, IF with “there exists” and “for all”. Hilbert showed that mathematics could be
described by this new logic, and they raised three important questions.
Completeness. For any valid inference, would it be possible to prove the conclusion given the premises? In
other words, did their system of logic provide all possible correct proofs? (This was answered by Gödel a
couple of years later, in the affirmative.)
Consistency. Not required for the discussion.
The Decision Problem. Could an algorithm be found which would tell you if a proposed inference is valid
without constructing an entire proof? Constructing a proof may take a large number of steps. But this
algorithm (if it existed) would take a finite number of simple steps. Since logic and maths had been shown to
be equivalent, if this algorithm existed, then it could be used to answer any mathematical question. As
mentioned above, Alan Turing turned to finding such an algorithm, and through his reasoning process
discovered the computer.
To understand what this decision problem is all about, we now turn to the study of two formal systems, the
“MIU” system and the “pq” system, developed by Douglas Hofstadter in his book “Gödel, Escher, Bach:
An Eternal Golden Braid”.
Let’s start with the MIU system. This is a formal system known as a “Post production system” after the
logician Emil Post. It is all about strings formed from three letters M, I, U, such as MU, UMIM, MIMU
and so on. The system gives you a starting string called an “axiom” and also “rules” telling you how to
“produce” other strings. These strings you produce are called “theorems”. To recap, you start with an axiom
and rules and you end up with theorems. Here are the axiom and rules
Axiom MI
Rule 1 If the last letter is I then you can add a U to the end
Rule 2 If you have Mx (where x is a string of any length) then you can form Mxx
Rule 3 You can replace any III in a string with U
Rule 4 You can delete any UU in a string.
Here are some theorems generated in this system.
MI        Axiom
MIU       Rule 1
MIUIU     Rule 2

MI        Axiom
MII       Rule 2
MIIII     Rule 2
MIU       Rule 3

MIIII     (derived above)
MIIIIU    Rule 1
MIUU      Rule 3
MI        Rule 4
Clearly we can generate a huge number of theorems using these production rules. But let’s return to
understanding what the “decision problem” is about. How can we find out if the three strings noted above,
MU, UMIM, MIMU are theorems of the system, i.e. can they be produced by the system? There are two
possible approaches. First we can just use the rules in various combinations until the string MU or UMIM
pops out. Then we know that the string is indeed a theorem. But what if it does not pop out, say after 100
production steps? We can’t conclude that it is not a theorem, since perhaps it takes 200 production steps to
generate it. Or perhaps it really is not a theorem and we will never produce it, but confirming that would
require an infinite amount of time. Clearly this approach is useless. And that’s the basis of Hilbert’s decision
question. Can we find an algorithm (i.e. a process which executes in a finite amount of time) which can decide
the problem for us? This process must work for all possible strings we can write down. It’s easy to see that
UMIM cannot be a theorem: we decide by looking at the string and at the rules and noticing that the string
starts with a U, whereas all theorems clearly start with M. But what about MU? We have no idea what the
decision procedure should be. Perhaps we can find a decision procedure, perhaps we can’t. That wasn’t
Hilbert’s question. He wanted a proof that such a decision procedure exists or doesn’t exist, and that’s where
Turing came in.
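The “just apply the rules and wait” approach can be sketched as a bounded breadth-first search. This is a minimal sketch, assuming every rule (including Rule 1) is optional at each step; the function names `successors` and `search` and the step limit are invented for illustration.

```python
def successors(s):
    """All strings obtainable from s by one application of Rules 1 to 4."""
    out = set()
    if s.endswith("I"):                  # Rule 1: add a U after a final I
        out.add(s + "U")
    if s.startswith("M"):                # Rule 2: Mx -> Mxx
        out.add("M" + s[1:] * 2)
    for i in range(len(s) - 2):          # Rule 3: replace any III with U
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])
    for i in range(len(s) - 1):          # Rule 4: delete any UU
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])
    return out


def search(target, max_steps):
    """Return the number of production steps needed to reach target from
    the axiom MI, or None if it did not appear within max_steps."""
    if target == "MI":
        return 0
    seen = {"MI"}
    frontier = {"MI"}
    for step in range(1, max_steps + 1):
        frontier = {t for s in frontier for t in successors(s)} - seen
        if target in frontier:
            return step
        seen |= frontier
    return None


print(search("MIUIU", 5))  # found after a couple of steps
print(search("MU", 5))     # None: not found within the limit
```

A result of `None` only says the string did not appear within the step limit. It does not show that the string is not a theorem, which is exactly the weakness of this approach.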
Let’s take another system from Hofstadter, the “pq” system. Here are the axiom and rule set:
Axioms xp-qx- is an axiom whenever x is composed only of hyphens, e.g. -p-q--
Rule 1 If xpyqz is a theorem then xpy-qz- is a theorem.
Let’s make a couple of derivations. First we start from the axiom -p-q--, i.e. x = -, so that in Rule 1 we
have y = - and z = --. Applying Rule 1 repeatedly we get
-p-q--
-p--q---
-p---q----
-p----q-----
Now let’s take x = --. Then we get the axiom --p-q--- and the theorems are
--p-q---
--p--q----
--p---q-----
--p----q------
For this system, we do have a decision procedure. For example, you can tell that the string --p---q----- is a
theorem of this system, but --p-----q--- is not a theorem. How? Well this works by jumping outside the system
and noticing that all theorems express the simple rule of arithmetic addition, where p means “plus” and q means
“equals”. So
--p----q------ means “two plus four is six”.
That this works follows from the recipe to make axioms, which clearly produces strings which display the
addition property. The production rule simply adds one onto the second operand and one onto the resulting
sum. But remember we did not need to know this “higher-level” interpretation of the strings, our formal
system was able to generate them automatically.
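The decision procedure for the pq system can be written down directly, together with a small generator that follows the axiom recipe and Rule 1. This is a sketch: the names `is_pq_theorem` and `derive` are invented, and the check relies on the “addition” interpretation rather than on the production rules.

```python
import re

# A well-formed candidate: hyphens, then p, hyphens, then q, hyphens.
PQ_FORM = re.compile(r"(-+)p(-+)q(-+)")


def is_pq_theorem(s):
    """Decide theoremhood by counting hyphens: x plus y must equal z."""
    m = PQ_FORM.fullmatch(s)
    if m is None:
        return False
    x, y, z = (len(group) for group in m.groups())
    return x + y == z


def derive(x, steps):
    """Start from the axiom xp-qx- (x hyphens) and apply Rule 1 `steps` times."""
    left, middle, right = "-" * x, "-", "-" * (x + 1)
    theorems = [f"{left}p{middle}q{right}"]
    for _ in range(steps):
        middle += "-"   # Rule 1 adds one hyphen after y ...
        right += "-"    # ... and one hyphen after z
        theorems.append(f"{left}p{middle}q{right}")
    return theorems


print(derive(1, 3))
print(is_pq_theorem("--p---q-----"), is_pq_theorem("--p-----q---"))
```

`is_pq_theorem` always finishes after one pass over the string, however long a derivation by Rule 1 would be.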
Let us return to the “decision problem” here. We have been able to find a decision procedure by jumping out
of the formal system, using our intelligence, and recognising a system with which we are familiar:
the addition of numbers. Recognising this similarity gave meaning to the original formal system.
Unfortunately this was not what Hilbert had in mind when he was looking for an algorithm to decide if any
given statement was a theorem of the logical system developed by Frege. He wanted to know if that algorithm could
be made using the grammar of the system itself, and not by jumping out of it as we have just done.