Interviews
Edward Feigenbaum
David Alan Grier
George Washington University
In May 2013, Edward Feigenbaum,
Kumagai Professor Emeritus at Stanford University, came to the IEEE
Computer Society offices to talk
about his career as one of the leaders
of artificial intelligence. For his work
in the field, Feigenbaum had just
received the Computer Society’s
2013 Pioneer Award. In this interview, part of which is printed here, we talked about his
career, his mentor Herbert Simon, and the development
of AI.
Feigenbaum is not only one of the founders of the
field of AI; he has also been one of its leaders for more than
50 years. He became aware of the birth of AI when he
was an undergraduate at Carnegie Tech (later, Carnegie
Mellon University). Continuing to graduate school
there, he was mentored by Herbert Simon, and he also
collaborated with Allen Newell. In 1960, he took his
first academic job at the University of California at Berkeley. There he coedited an important anthology of
early AI research, Computers and Thought (McGraw-Hill,
1963), which is often called “AI’s first book.”
In 1965, he moved to Stanford University’s new
Computer Science Department. There he began a collaboration with Nobel laureate geneticist Joshua Lederberg, chemist Carl Djerassi, and Bruce Buchanan on
work that reshaped the paradigm of AI—expert systems,
knowledge engineering methods, and the concepts of
knowledge-based systems.
From 1965 to 1968, he led Stanford's Computer Center. The early 1980s were a busy time for him. He studied
and wrote widely about Japan’s Fifth Generation Project, which attempted to marry AI with high-speed computing; helped to found the Association for the
Advancement of Artificial Intelligence (AAAI); served as
AAAI’s second president; and led and coedited the landmark four-volume Handbook of Artificial Intelligence
(Addison-Wesley).
In the 1990s, he turned to public service, serving at
the Pentagon as chief scientist of the Air Force from
1994 to 1997.
We began our discussions by talking about the time
when Herbert Simon introduced him to the ideas of a
“thinking machine.”
David Alan Grier: Let’s begin by talking about
the start of your career. How did you first get
interested in computing?
[Figure: Computers and Thought book cover, circa 1963.]
Feigenbaum: As an undergraduate senior in the fall of
1955–1956 at Carnegie Tech, I took a graduate-level
seminar called "Mathematical Models in the Social Sciences" from Herbert Simon—polymath, social scientist,
behavioral scientist, later Nobel Prize winner in economics, and cofounder of AI.
In the first seminar session after the Christmas and
New Year’s holiday break, January 1956, Herb opened
the seminar by reporting, “Over the Christmas holiday,
Allen Newell and I invented a thinking machine.” This
comment startled us six students. What did that mean?
Thinking? Machine? Those words did not seem to go
together.
What Simon was talking about was the first fully working
AI program. It was named LT, the Logic Theory program. AI scientists know this program as the first heuristic search problem-solving program. It proved theorems
(the propositional calculus) in chapter 2 of
Whitehead and Russell’s Principia Mathematica (three volumes, 1910–1913).
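
LT itself was written in IPL and worked by substitution, detachment, and chaining; none of that machinery is reproduced here. As a rough illustration only, the following Python sketch shows the general shape of the heuristic (best-first) search idea that LT pioneered, with a toy numeric problem standing in for theorem proving. All of the names and the toy problem are illustrative, not from LT.

```python
import heapq
from itertools import count

def best_first_search(start, is_goal, successors, heuristic):
    """Expand the most promising state first, as ranked by a heuristic.
    LT-style programs used such rankings to tame otherwise explosive
    search spaces."""
    tie = count()  # breaks ties so states never need to be comparable
    frontier = [(heuristic(start), next(tie), start)]
    seen = {start}
    while frontier:
        _, _, state = heapq.heappop(frontier)
        if is_goal(state):
            return state
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (heuristic(nxt), next(tie), nxt))
    return None

# Toy stand-in for a proof search: reach 37 from 1 with +1 and *2 moves,
# guided by distance to the goal.
print(best_first_search(1, lambda n: n == 37,
                        lambda n: [n + 1, n * 2] if n < 100 else [],
                        lambda n: abs(37 - n)))
```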
To allow the six of us to understand computing machines, Simon gave us copies of the
IBM 701 manual. I took that manual home
with me and read it straight through the
night. By morning, I was a born-again computer scientist, except there was no such
phrase then as “computer scientist.”
I knew two things: I was going to go into
this field involving AI and computers, and I
was going to come back to Carnegie Tech for
graduate school to work with Herb Simon
and Al Newell on these problems.
Fortunately for me, I was able to attend a
graduate student program on computers that
summer, 1956, held by IBM—their first such
summer program. IBM staff taught me programming for the IBM 650 (which Carnegie
Tech was about to acquire) and the new IBM
704.
When I returned to Carnegie, I walked
into Herb Simon’s office and said, “Okay,
here I am. What do I do?” Having cracked
open the issue of computer simulation of
human information processing, Herb was
interested in the modeling of human cognitive processes using computer language as a
modeling language rather than mathematical
language or English.
He was interested in more than just problem solving—for example, the processes of
human memory. That was the problem he
posed to me for my graduate student
research. This work later became my doctoral
thesis. I created a computer simulation model
called EPAM (Elementary Perceiver and Memorizer) that dealt with the simulation of phenomena that were well understood by the
experimental psychologists of the first half of
the 20th century. The experiments constituted a stable paradigm with many stable
results that I could use as fixed targets. EPAM
hit a lot of those fixed targets with a simple
model. The EPAM model structure has lasted
more than 50 years and has been active and
productive in psychology.
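
The interview does not walk through EPAM's internals, but the literature describes it as a "discrimination net": a tree that sorts a stimulus by testing a few of its features and stores a response "image" at the leaf, growing a new test whenever two stimuli get confused. The sketch below is a heavily simplified Python reconstruction of that idea, not Feigenbaum's original IPL code; the only feature tested is a letter position, and all names are mine.

```python
class Node:
    """One node of a toy discrimination net. Internal nodes test a single
    letter position; leaves hold a stored (stimulus, response) image."""
    def __init__(self, image=None):
        self.test_index = None  # which letter position this node examines
        self.branches = {}      # letter (or None) -> child Node
        self.image = image      # (stimulus, response) pair at a leaf

def key_at(stimulus, i):
    return stimulus[i:i + 1] or None  # None when the stimulus is too short

def retrieve(net, stimulus):
    """Sort the stimulus down the net. Similar items can be confused,
    which is the kind of error EPAM was built to model."""
    while net.test_index is not None:
        child = net.branches.get(key_at(stimulus, net.test_index))
        if child is None:
            return None
        net = child
    return net.image[1] if net.image else None

def learn(net, stimulus, response):
    """Store an association, adding a discriminating test when the leaf
    already holds a different stimulus."""
    while net.test_index is not None:
        key = key_at(stimulus, net.test_index)
        if key not in net.branches:
            net.branches[key] = Node((stimulus, response))
            return
        net = net.branches[key]
    if net.image is None or net.image[0] == stimulus:
        net.image = (stimulus, response)
        return
    old_stim, old_resp = net.image
    # Find a position where the two stimuli differ and test on it.
    i = next(i for i in range(max(len(old_stim), len(stimulus)))
             if key_at(old_stim, i) != key_at(stimulus, i))
    net.test_index, net.image = i, None
    net.branches[key_at(old_stim, i)] = Node((old_stim, old_resp))
    net.branches[key_at(stimulus, i)] = Node((stimulus, response))

net = Node()
learn(net, "DAX", "jir")
learn(net, "DAK", "pov")
print(retrieve(net, "DAX"))  # jir
print(retrieve(net, "DOX"))  # also jir: only position 2 is tested
```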
DAG: This begins a long period of
collaboration between you and Herb
Simon.
Feigenbaum: Yes, Herb and I collaborated in
detail on this EPAM model until 1965, when I
moved to Stanford. At that point, I decided to
change my focus. I moved from the part of AI
that we call computer simulation of
cognition—the "psychology side" of AI—to
the side we call artificial intelligence—the
engineering side. This part of AI aims at programming computers to be not only as smart
as people, but much smarter than people.

[Figure: Feigenbaum at Stanford's Computer Center, in 1966.]
When interviewing me for one of her
books, the writer Pamela McCorduck asked
me, “Can a machine ever be as smart as a person?” I said, “No.” She was startled at that
answer because she thought I really did
believe that a program could be far smarter
than a person. I replied that there will never
be a moment at which a machine is as smart
as a person because as soon as we know how
to make it as smart as a person we engineers
will make it smarter. So there’s no stopping
point right there; it's an unstable point in the
engineering.
DAG: Many of your first papers were
published by RAND. When did you
begin working there?
Feigenbaum: My RAND affiliation started in
the summer of 1957. During the 1956–1957
school year, RAND began working on public
versions of their list processing languages.
These Information Processing Languages, or
IPLs, preceded Lisp by three years. The IPLs
were languages that ran only on RAND's JOHNNIAC computer, a copy of the Institute for
Advanced Study computer.
Allen Newell and a few graduate students,
including me, began work on a version of IPL
that could be used on a wider basis. My job in
the summer of 1957 was to go preach those
list processing ideas (what would in Silicon
Valley language now be called “being an
evangelist"). I was evangelizing at a division
of RAND that had become its own corporation, System Development Corporation,
preaching list processing languages as the
most flexible way of writing programs for air
defense.
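
IPL's syntax looks nothing like a modern language, so the fragment below is only a rough Python illustration of the core idea being evangelized: data as linked lists of symbols that can grow, shrink, and share structure at run time, with no sizes fixed in advance. The names are illustrative, not from IPL.

```python
class Cell:
    """One cell of an IPL/Lisp-style list: a symbol plus a link."""
    def __init__(self, symbol, link=None):
        self.symbol = symbol
        self.link = link

def push(lst, symbol):
    # The structure grows at run time; nothing is pre-allocated,
    # which is what made list processing so flexible for symbolic work.
    return Cell(symbol, lst)

def symbols(lst):
    while lst is not None:
        yield lst.symbol
        lst = lst.link

plan = push(push(push(None, "IDENTIFY"), "TRACK"), "INTERCEPT")
print(list(symbols(plan)))  # ['INTERCEPT', 'TRACK', 'IDENTIFY']
```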
October–December 2013
75
Interviews
In the summer of 1958, I went back to
RAND to program the IBM 704 and 709 versions of IPL, called IPL-V. By that time Newell
had decreed that we would not write code
until we had published the manual; then
we’d write the code specifically for that manual. That’s what I did in the summer of 1958.
In the summer of 1959, I did some more of
that and also did some coding of EPAM. I was
making my final runs on EPAM because I was
going to take my doctoral thesis exam when I
got back to Carnegie in September.
DAG: So you completed your PhD,
moved to UC Berkeley in 1960, and
started your research on artificial
intelligence.
Feigenbaum: In the early part of the 1960s, I
was looking for a project that would put me
on the path toward the “ultra-intelligent
computer.” I became interested in inductive
reasoning as compared with deductive reasoning. In 1963, I described inductive reasoning as the next step in AI in the introduction
to Computers and Thought.
I was looking for an experimental
vehicle—sometimes I call it a playpen—in
which to examine ideas about inductive
thinking. In particular, I decided to look at
the inductive thinking of those people who
are professionals at doing that kind of thinking, namely scientists. That’s what they do.
They induce hypotheses from empirical data.
Fortunately, with tremendous good luck, I
met Joshua Lederberg, a Nobel Prize winner
in medicine for his breakthrough work in
molecular genetics.
Lederberg was interested in exactly the
same subject, the computer modeling of scientific thinking. He was changing interests
from mainline genetics to instrumentation
and computation. He said, let’s try it. The
problem he suggested involved the inductive
interpretation of organic molecular structure.
Our goal was to write a program that could
interpret experimental data in the Stanford
Mass Spectrometry Laboratory, headed by
our chemistry collaborator Carl Djerassi (yet
another genius).
We really had quite a team working on the
DENDRAL project: excellent PhD students
and post-docs and excellent young computer
scientists, foremost among whom was Bruce
Buchanan. By 1970, we had made tremendous progress on the problem and pretty
much formulated how to do knowledge engineering, which is the fundamental set of concepts and methods behind expert systems.
[Figure: Japanese poster announcing a Feigenbaum lecture.]
Around 1972 came our first application to
clinical medicine: the Mycin conversational
system for diagnosing blood infections (Ted
Shortliffe’s thesis). Throughout the 1970s we
challenged ourselves in many other areas. I
did a military application to interpret coastal
sonar data to infer (induce) what Soviet submarines were patrolling the West Coast and
what they were doing. That application,
HASP, was classified at the time.
We did engineering applications to civil
engineering and x-ray crystallography. In Silicon Valley style, we started several companies, beginning with IntelliGenetics and later
IntelliCorp. Then came Teknowledge and later
Design Power, specializing in applications of
expert systems to the engineering design of
boilers.
DAG: Let’s move to the Japanese Fifth
Generation Project, which was an
important part of your career. When
did you start getting involved
in Japan?
Feigenbaum: My interest in the Fifth Generation Project began in 1979 when I was teaching at the University of Tokyo for one school
term. But my interest in Japan began much
earlier, in 1970, when I met my wife, who’s
Japanese.
I was much taken with the scope of the
Japanese government’s vision of what they
wanted to achieve in artificial intelligence
in a 10-year project. When my book with
Pamela McCorduck, The Fifth Generation
(Addison-Wesley Longman, 1983), was published, it was translated immediately into Japanese, so I became a celebrity in Japan because
I was saying a lot of nice things about their
work.
DAG: One of the people who started
working with you at this point is
Pamela McCorduck. How did she come
into your life?
Feigenbaum: Very far back. When I arrived at
Berkeley I was in the School of Business
Administration. Julian Feldman and I were
both in the Management Science Research
Center. We both got grant money to pursue
our research.
One of the things you do when you get a
little grant money is to hire a secretary to
help you. In this case, we needed an especially good and literate one because we were
putting together Computers and Thought. We
hired Pamela in her junior year as an English
major at Berkeley. She worked for us until
after the book was published and then left to
work somewhere else.
When I moved to Stanford I invited her to
come to Stanford to work as my secretary
there, which she did. Later, she decided she
wanted to be a professional writer and went
off to do a master’s degree at Columbia. So
we’ve had a long collaboration, which
included The Fifth Generation.
The Fifth Generation was well received
partly because it had an interesting narrative—Japan trying very hard to catch up.
Japan was starting a new project to focus on
AI and parallel computing, both of which
were hot subjects in the United States.
The Fifth Generation actually tells two
important stories, one about ARPA (the
Advanced Research Projects Agency, later
renamed DARPA) and one about Japan. ARPA
began funding artificial intelligence research
in 1963. My first substantial support at Berkeley came from ARPA. I thought the ARPA
story was marvelous—how ARPA support
evolved and was so important to the US—
and the American public didn’t know about
that. The Japanese public basically didn’t
know that scientists in their government, at
MITI’s Electrotechnical Lab, were doing such
good thinking in computer science. So it was
interesting all the way around.
[Figure: Feigenbaum with Tom Rindfleisch, circa 1978.]
DAG: What do you think Japan’s Fifth
Generation Project accomplished, and
where did it fall short?
Feigenbaum: The success was that they trained
a generation or two of young researchers in
AI and computer design. They learned how
to do parallel computing, how to do AI, and
what it means to build an AI system. All of
that penetrated right back into industry
because the project work was being done by
engineers who came from industry: Hitachi,
Fujitsu, Toshiba, and other industrial giants.
I can tell you first where the project fell
short and then why. They worked hard on
building a high-speed Prolog or “logic programming” machine and never developed
application-based methodologies. I spoke to
them twice about this issue. The first time
was when the project began. The second was
after they had been working for five years. I
told them that they had to learn knowledge
engineering and knowledge representation.
They had to work on knowledge representation schemes and acquiring knowledge from
expert humans, tailoring all this to the needs
of specific knowledge-based applications.
They needed to be more empirical.
For the first five years they didn’t do that.
They just went along doing their engineering
at too abstract a level. What is Prolog? How
can I make the Prolog language better? How
can I make a super Prolog? How can I cast a
super Prolog into machine logic? And how
can I make it run fast (the parallel computing
work)?
By the end of year five they didn’t have
much to demo. They really started on their
demo six months before the “big demo
show” was supposed to occur. Later rather
than earlier they got the message, and they
worked hard on those demos. They did get
some good results, by the end of year 10! It
was too late by that time. They had consumed the amount of government money
that the rest of the science and engineering
community would allow them to consume
because everyone else wanted the government money for other initiatives.
DAG: Harry Huskey was one of the
early leaders of the IEEE Computer
Society. How did you and he
collaborate?
Feigenbaum: I’m so happy to talk about Harry,
who is now close to 100 years old. Harry had
become a professor in the Electrical Engineering Department at Berkeley. Remember, I was
in the Business School, where few people
were interested in computing, so I had to find my connections
elsewhere. Harry was one of them.
When J.C.R. Licklider started the Information Processing Techniques Office at ARPA,
he talked to Herb Simon and asked who he
should give research money to. Herb said Ed
Feigenbaum and Julian Feldman, among
others. Licklider asked the same question of
his contacts in the computer hardware community, and they said Harry Huskey.
Licklider wanted a computer halfway
between the small time-shared minicomputers, such as the DEC PDP-1, and the huge
time-sharing systems like Project MAC at MIT.
He wanted one built out of a next-generation,
more powerful minicomputer. He gave that
task to Huskey. Huskey gave that task to his
younger colleague David Evans and went on
sabbatical to India. So for a year I, as coprincipal investigator with Dave Evans, learned how
to design a computer system architecture!
DAG: In 1994, you received the Turing
Award. In your acceptance speech, you
tried to lay out some questions for
what computing needed to do and
make sure that the ideas behind them
were solid. How in 1994 did you view
the development of artificial
intelligence?
Feigenbaum: To be honest I haven’t read my
Turing Award speech since then, but I do
remember a couple of key things that I
wanted to say in that paper. I think that those
ideas are correct, but alas, I must admit that I
also failed to predict in any way, shape, or
form the direction in which AI really went in
the first decade of the 21st century—namely,
to statistical machine learning. That change
was not predicted in my Turing Award speech
nor in Raj Reddy’s speech on the same day.
What I did say was that a good way to
think about where AI fits into the entire spectrum of information technology and computer science is what I call the “What to How
Spectrum.” The How end of the spectrum is
about the way a computer really works. It
deals with little instructions, like Clear and
Add, Shift Left N—dozens of tiny steps, even
in the firmware. As we moved from how to
what, we got Fortran, which allowed us to
express computational needs in formulas—
algebraic language. This moved us a little bit
away from the how end of the spectrum.
Then we got business-oriented languages.
Then we got domain-specific languages, like
ICE, which was a civil engineering language.
Later, we got object-oriented languages.
Far from there, at the What end of the
spectrum sits AI. At that end, you, the user,
tell the computer what it is that you want it
to do, what your goals are, in free-flowing
natural language, with all of its ambiguities
and subtleties. The AI programs must have
the knowledge, reasoning power, and heuristics to employ to achieve these goals for you
so you don’t have to be a programmer.
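
A hedged illustration of the two ends, in Python rather than the period languages: the first function spells out the "how" almost at the machine's level of tiny steps, while the second states the "what" as a formula, roughly the move Fortran made. The far What end, stating goals in natural language, is beyond any snippet like this; both function names are mine.

```python
# "How" end: every step is explicit -- loop, accumulator, final divide.
def mean_how(xs):
    total = 0.0
    count = 0
    for x in xs:
        total += x
        count += 1
    return total / count

# Toward the "What" end: state the formula and let the language supply
# the steps, which is roughly what Fortran's algebraic notation bought us.
def mean_what(xs):
    return sum(xs) / len(xs)

assert mean_how([1, 2, 3]) == mean_what([1, 2, 3]) == 2.0
```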
Recently, we’ve seen examples of this.
These days you can pick up an iPhone and
can ask a knowledge-based AI program called
Siri to do something for you on the iPhone,
like set an alarm for a certain time tomorrow
morning. Of course you don’t need Siri to do
that. You could read the manual and figure
out the screen touches to get the alarm to
come up and how to set it and all that. But
instead you just say what it is you want, and
you say it in whatever natural languages Siri
understands.
Expert systems aren’t at the What end
because most expert systems don’t support
conversation in a free-flowing style about our
specific goals requiring expertise.
The other point I wanted to make in the
Turing Award lecture was that AI programs
behave intelligently to the extent that they
know a great deal about the domain of discourse in which they are being asked to perform. For the most part, cognition is not
based on deep reasoning; it’s based on broad
knowledge. In the case of expertise as captured in expert systems, it is both broad and
deep knowledge in a particular domain.
Here’s another way to say it: it is more
important for an AI program to know a lot
than to think deeply. This is the Knowledge
Principle in AI. With essentially one major
exception, it’s proved to be universally true.
The one major exception occurred when a
high-speed chess machine, IBM’s Deep Blue,
beat Kasparov, the world’s chess champion,
largely by brute-force searching with only a
small amount of specialized chess knowledge. Deep Blue examined approximately
200 or 300 million paths per move whereas
Kasparov examined perhaps 200 to 2,000
well-selected paths per move.
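
Deep Blue's real search was massively parallel and partly in special-purpose hardware, which a few lines cannot reproduce. This sketch only shows the shape of the trade-off being described: a brute-force game-tree search in which nearly all the strength comes from visiting many positions, and the "knowledge" is confined to one small evaluation function. All parameter names are placeholders.

```python
def minimax(position, depth, maximizing, evaluate, moves, play):
    """Brute-force game-tree search. The evaluate() call is the small
    knowledge component; everything else is raw search."""
    legal = moves(position)
    if depth == 0 or not legal:
        return evaluate(position)
    scores = [minimax(play(position, m), depth - 1, not maximizing,
                      evaluate, moves, play) for m in legal]
    return max(scores) if maximizing else min(scores)

def best_move(position, depth, evaluate, moves, play):
    # Assumes the side to move is maximizing: pick the move whose
    # subtree value is highest.
    return max(moves(position),
               key=lambda m: minimax(play(position, m), depth - 1,
                                     False, evaluate, moves, play))
```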
So we have a data point “off the curve”
that I choose not to ignore: a chess program
that doesn’t have a lot of knowledge but does
a great deal of computing. It’s a place to
examine the possibilities for intelligent
search using the extremely high-speed computers of today and tomorrow. What other
problems can we solve the way that the IBM
researchers solved the problem of playing
chess? Can we elicit creative ideas from a program by searching huge combinatorial
spaces, spaces that people are not good at
exploring?
You can’t treat creativity as an abstract
concept. A creative act is just a behavior with
special characteristics. It is a behavior that is
novel and perhaps startling to people. It may
even be “new to mankind,” never before
seen. It may be elegant or beautiful, as evaluated by people. Several such instances
have already been achieved.
For example, take the AI program we
talked about, the Logic Theory program, AI's
first program. It proved a theorem in chapter 2
of Principia Mathematica much more elegantly
than Whitehead and Russell proved it. The
exchange of letters between Simon and Russell is remarkable, as exhibited in Herb
Simon's autobiography. Russell said he was
prepared to believe that a computer could do
anything and wondered why he and Whitehead
had wasted all that time doing the work by hand. This level
of behavioral excellence (creativity?) was
done with AI’s first program that was not able
to do a lot of combinatorial search because it
was running on a 40,000 operations per second computer with 4,000 words of memory.
DAG: What do you think about one of
the successors to IBM’s Deep Blue, the
system Watson that won the game
show Jeopardy?
Feigenbaum: Watson found ways to deal with
an immense amount of “surface-level”
knowledge—the immensity of knowledge
that is available over the Internet. I have
heard the claim, and find it plausible, that
Watson had access to 1 trillion individual
“items” of knowledge.
The great thing about the software architecture of Watson is that it does not rely on
any one algorithmic or heuristic method to
make its decisions. Watson’s builders used
what I like to call a “hybrid architecture.” I
did the same thing when my team and I
designed the expert system for detecting
Soviet submarines. Marvin Minsky has an
imaginative name for hybrid architectures.
He calls them the “kitchen sink model of the
mind.” Anything you can think of that will
work, put it to work, and the Watson people
did.
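
As a sketch of the "kitchen sink" idea only (Watson's actual DeepQA pipeline is far more elaborate, with candidate generation, evidence retrieval, and a trained combination model), here is the bare combining step: many independent scorers each contribute evidence, and a weighted merge picks the answer. Every name and scorer here is hypothetical.

```python
def hybrid_answer(question, candidates, scorers, weights):
    """Combine many independent scoring methods: no single algorithm
    decides; their weighted sum does."""
    def combined(candidate):
        return sum(w * score(question, candidate)
                   for score, w in zip(scorers, weights))
    return max(candidates, key=combined)

# Hypothetical toy scorers: keyword overlap plus a mild brevity bonus.
overlap = lambda q, c: len(set(q.lower().split()) & set(c.lower().split()))
brevity = lambda q, c: 1.0 / len(c.split())
print(hybrid_answer("what program proved theorems in principia mathematica",
                    ["the logic theory program", "a list processing language"],
                    [overlap, brevity], [1.0, 0.5]))
# -> 'the logic theory program'
```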
It took them a few years to arrive at a high
level of performance, but the system
was slow. Finally David Ferrucci said, "Okay
that’s good enough. Now let’s just make Watson respond within three seconds.” Watson
engineers threw 2,800 high-speed boards at
the speed-up problem. That is, IBM said,
“We’ll build a supercomputer for Watson so
that it could press the Jeopardy button with a
good answer within three seconds.” The
developers used a beautiful engineering
method. The AI and system engineering was
very analytic, very systematic, using lots of
imaginative techniques. Great work!
DAG: Looking back over 50 years of AI
research that you’ve seen, what do you
think are the big accomplishments?
What are the things that really have
worked well?
Feigenbaum: Let me just mention a few. But
first, keep in mind that intelligence, hence
AI, is a multidimensional concept. It goes
from perceptual things like speech and
vision, on the one hand, all the way to deep
thinking—serious quality thinking, best
expertise in the world—on the other hand.
[Figure: Edward Feigenbaum and David Alan Grier at the IEEE Computer Society offices in Los Alamitos, California, May 2013.]
If you look at just the Stanford AI lab working on these things in the mid-to-late 1960s,
you’ll find the robotics work included the
first vision system, the first mobile system,
the first assembly system, and a system that
could assemble a water pump. At Stanford
and CMU you’ll find Raj Reddy’s work on
speech understanding. You’ll find the early
work on expert systems.
By now, in the second decade of this century, speech understanding is a major triumph. AI’s work on vision—what computer
vision systems are doing these days—is spectacular. The work on expert systems was very
influential in IT. There were tens of thousands, maybe hundreds of thousands of them
built. If you want to check it out, do a Google
search with the phrase “business rules” and
see the many business rule systems that have
been built. (“Business rules” is a phrase
invented by expert systems software companies to market their products.)
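
Mycin itself chained backward from diagnostic goals and weighed certainty factors, so the fragment below is not its algorithm; it is only a toy forward-chaining engine showing the if-then-rule flavor that expert systems and today's "business rules" products share. The rules and facts are invented for illustration.

```python
def forward_chain(facts, rules):
    """Fire every rule whose conditions are all satisfied, add its
    conclusion, and repeat until no rule adds anything new."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in facts and conditions <= facts:
                facts.add(conclusion)
                changed = True
    return facts

RULES = [  # invented business-style rules, purely illustrative
    ({"order_over_1000", "new_customer"}, "require_manual_review"),
    ({"require_manual_review", "high_risk_region"}, "hold_shipment"),
]
print(forward_chain({"order_over_1000", "new_customer",
                     "high_risk_region"}, RULES))
```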
IBM’s Watson, as I said, is remarkable. So
was the 2005 work of Sebastian Thrun and
his group, at Stanford (later Google), on the
Stanford self-driving car. The car’s performance was utterly, unexpectedly awesome,
winning the $2 million DARPA prize. They
built a great integrated system using both AI
and other engineering methods.
Heuristic problem solving is a major concept. That is an excellent model of how
everyone thinks. Last but not least is the surprising surge of statistically based machine
learning algorithms and systems. These have
outperformed anyone’s expectations, for reasons that no one really understands deeply.
There's a great set of doctoral theses to be
written about why this works at all. But as
engineers know, there were a lot of good
radios built before engineers figured out radio
theory.

DAG: What is the first thing that the
researchers need to focus on? What's
the big problem facing AI right now?
Feigenbaum: I’ll state it and then I’ll explain
why. I already made the case, at a private conference that was convened in the early part of
this century by DARPA, for what AI research
DARPA should be supporting. My idea was
the one the group of eminent AI researchers voted most important.
We need software for knowledge acquisition for AI programs by reading text, reading
books, reading the Web. Not by painstakingly
doing knowledge engineering as in the past.
One knowledge engineer, one expert going
over the individual cases? That’s not the way
people built their culture.
We have succeeded so well as a species
because we found a way to use language to
record our experiences and thoughts, to write
them, to pass them on to the next generation. Two of the greatest inventions of all
time were writing and printing. We move our
culture to the next generation mostly by
reading text. These days the text is in either
“atoms” or “bits.”
Why do we have to get programs to read
from text? Because in knowledge lies the
power. It’s that Knowledge Principle I talked
about before. AI systems will not become intelligent until they are widely knowledgeable.
One of the things that we don’t know how
to do well yet is to accumulate immense
amounts of what Douglas Lenat calls
“common sense knowledge,” the knowledge
of ordinary things. This is the “glue” that
helps the knowledge of specific domains to
work well and robustly. We’ll get that from
reading text, just as we will read and acquire
domain knowledge.
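
Nothing as simple as the following is what Feigenbaum has in mind; real text-reading systems need parsing, disambiguation, and the Lenat-style common-sense glue described above. But as a toy of the direction, here is a pattern-based reader that turns "X is a Y" sentences into (subject, isa, object) facts; the pattern and all names are mine.

```python
import re

# Toy "reading" step: harvest (subject, isa, object) facts from text.
ISA = re.compile(r"([A-Z][a-z]+) is an? ([a-z]+)")

def read_facts(text):
    return [(subj.lower(), "isa", obj) for subj, obj in ISA.findall(text)]

print(read_facts("Copper is a metal. Mercury is a liquid at room "
                 "temperature. Writing is an invention."))
# [('copper', 'isa', 'metal'), ('mercury', 'isa', 'liquid'),
#  ('writing', 'isa', 'invention')]
```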
DAG: Have we missed any topics here?
What did you think I was going to ask
and I haven’t?
Feigenbaum: Recently I was involved in a
study for the Air Force on future technologies. I once served as chief scientist of the Air
Force, so I get involved in long-range forecasts
of scientific things for the Air Force.
In scanning all the different science and
technology horizons that the Air Force is
looking at for the next 10, 20, and 30 years, I
noticed that the people writing those
forecasts, the current generation of scientists,
engineers, and users, were not sufficiently
extrapolating the great change that enormous
amounts of computation, coming in the
next 10 to 30 years, will make to all fields. It
was almost as if they were giving five-year
projections. They weren’t absorbing the fact
that computing changes everything, often
profoundly.
I think your wider audience needs to consider profound change. For example, the primary tool of the new physics today is
computers, not mathematics. Physics is no
longer the mathematics that gets scribbled
on big yellow pads, as Einstein did.
What is not understood is that, in the next
10 to 20 years, there will come some
really excellent AI-enabled computer-human
interfaces in which computers can do vastly
better things than they are currently doing in
the service of human work, and people can
do whatever residual work there is that people do best.
These interfaces will allow that mixture of
human-computer interaction, not just where
the machine is serving the person but where
the human and the computer are cooperating
on a task. This will have much greater consequences than most people today understand.
DAG: Thank you so much for your time.
Feigenbaum: Thank you for allowing me to
express my views.
David Alan Grier is the author of When Computers Were Human (Princeton University Press,
2007), The Company We Keep (IEEE CS, 2012),
and other books. He writes the Errant Hashtag column
for Computer and is a former editor in chief of the
Annals. Contact him at [email protected].