Expert Systems
Review of Schank’s Scripts: a script consists of a set of
slots.
Associated with each slot may be information about
the kinds of values it may contain, as well as
default values.
Scripts have causal structure: events are connected to
earlier events that make them possible, and to later
events that they enable.
Headers of scripts indicate when a script should be
activated.
Related to the concept of Frames (Minsky), which
was earlier and designed for more static structures (e.g. a
room). Scripts are more like a big verb dictionary,
Frames more like one for nouns.
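To make the slot idea concrete, here is a minimal Python sketch of a script as slots with value-type information and defaults, plus headers for activation. The structure and all names (RESTAURANT_SCRIPT, activated_by) are illustrative assumptions, not Schank's actual representation.

# Minimal sketch (not Schank's actual representation) of a script:
# slots with value-type information and defaults, plus headers that
# indicate when the script should be activated.

RESTAURANT_SCRIPT = {
    "headers": {"restaurant", "waiter", "menu", "ordered"},  # activation cues
    "slots": {
        "diner":   {"type": "person", "default": None},
        "food":    {"type": "meal",   "default": "unspecified dish"},
        "payment": {"type": "event",  "default": "diner pays bill"},
    },
    # Causal structure: each event enables the ones that follow it.
    "events": ["enter", "order", "eat", "pay", "leave"],
}

def activated_by(script, story_words):
    """A script is triggered when one of its header words appears."""
    return any(word in script["headers"] for word in story_words)

print(activated_by(RESTAURANT_SCRIPT, "John ordered a hamburger".split()))
# True -- 'ordered' is a header cue for the restaurant script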
What background knowledge do we need to
understand a story?
What information does the writer expect us to
infer?
Are we likely to have both in a predetermined
script?
How do we know when a story has stopped
following a script? (Compare: how do we
know when the person we are talking to has
changed the subject--some people never
notice!)
De Jong’s ‘sketchy script matcher’, FRUMP
At Yale around 1977 De Jong developed
a new form of SAM (Cullingford’s Script
Applier Mechanism).
- It sought only to fill initially determined
predicate values of interest to a user.
- It worked mainly on newspaper stories
about terrorism.
For example, for a road accident FRUMP wants to find
out the type of car, the object it collided with, the
location of the accident, the number of people
killed/injured, and who was at fault.
It skims a new story to identify the appropriate script,
then tries to fill in these expectations.
FRUMP was connected to the UPI wire service.
UPI story: Pisa, Italy. Officials today searched for
the black box flight recorder aboard an Italian air
force transport plane to determine why the
aircraft crashed into a mountainside killing 44
persons. They said the weather was calm and
clear, except for some ground level fog, when the
US-made Hercules C130 transport plane hit Mt
Serra moments after takeoff Thursday.
The pilot, described as one of the country’s most
experienced, did not report any trouble in a brief
radio conversation before the crash.
FRUMP summary:
44 people were killed when an airplane crashed
into a mountain in Italy today.

FRUMP does not apply a ‘full’ script (like the
restaurant script) to the air disaster; it simply fills a
small number of slots (not necessarily
ordered), such as NUMBER_DEAD,
WHERE_CRASH, WHEN_CRASH.
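A hedged sketch of the sketchy-script idea in Python: a template with only the slots of interest, filled in no particular order by skimming the text. The slot names follow the slide; the regular expressions and the skim function are invented for illustration and bear no relation to FRUMP's actual mechanism.

import re

# The air-disaster sketchy script: just the slots of interest.
AIR_DISASTER = {"NUMBER_DEAD": None, "WHERE_CRASH": None, "WHEN_CRASH": None}

def skim(story):
    """Fill whichever slots the text happens to support, in any order."""
    filled = dict(AIR_DISASTER)
    m = re.search(r"(\d+) (?:people|persons) were killed|killing (\d+)", story)
    if m:
        filled["NUMBER_DEAD"] = int(m.group(1) or m.group(2))
    m = re.search(r"crashed into a (\w+)", story)
    if m:
        filled["WHERE_CRASH"] = m.group(1)
    if "today" in story:
        filled["WHEN_CRASH"] = "today"
    return filled

print(skim("44 people were killed when an airplane "
           "crashed into a mountain in Italy today."))
# {'NUMBER_DEAD': 44, 'WHERE_CRASH': 'mountain', 'WHEN_CRASH': 'today'}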

FRUMP was never statistically evaluated.
But FRUMP was the forerunner of a 1990s
technology, Information Extraction, in which
‘templates’ of slots and fillers are filled
from web or newspaper text at high
speed and in huge volume.
This new AI technology was created by
US Government funding in the 1990s
and is highly statistical, with strong competition
between groups, universities and companies.

How do humans perform tasks?
Part of the aim of research on Scripts was to find
a way of giving a program the same knowledge
that humans use to understand a story--and
Script theory was very influential in psychology.
Similarly, in research on Expert Systems, the aim is to
capture, and apply, the knowledge that human
experts have.
And in earlier examples, e.g. GPS, the idea was to
mimic human problem-solving ability.
It makes sense to emulate humans in Artificial
Intelligence research.

One of the original motivations for AI research
was to understand human mind.

But also to get computers to do clever things,
no matter how!

Difficult to provide an account of intelligence
without reference to what humans can do.
Although our conception of intelligence has changed
and is now less human-based: e.g. perhaps a bee is
capable of intelligent behaviour.
But if we are concerned to emulate humans, we
need to find out how humans think--if we think
psychology has ways of telling us that reliably.
Ways of finding out how people work:
- Introspection (most AI experiments, like
CD/Scripts)
- Protocol analysis (activity reports--GPS)
- Psychology experiments
One problem for expert systems is that the
introspection of experts is unreliable (plumbers
can't always tell you how they do it).
Much psychology is unsurprising but
sometimes helpful--e.g. that people usually
can't remember surface wording, only content, which
is consistent with CD's claims.
Return to Expert Systems
SHRDLU, and the blocks microworld. Domain-specific knowledge (as opposed to domain-general knowledge).
Understood substantial subset of English by
representing and reasoning about a very
restricted domain.
Had knowledge of the microworld (but no real
understanding).
But program too complex to be extended to real
world.
Expert systems: also relied on depth of
knowledge of constrained domain.
But commercially exploitable. ‘Real’
applications.
SHRDLU was a dead end: the program was very
complex, and also had little to do with the real world.
There was a general realisation that programs that
performed well within the limits of microworlds
could not capture the complexity of everyday
human reasoning.
Remember that SHRDLU would have to
process AN INTERESTING BOOK by
accessing all the books it knew in its
database and all the interesting things!
Hubert Dreyfus (1972): criticism of the idea that
reasoning and intelligence could be captured
by logical rules.
Dreyfus was part of the first major reaction
against the claims of AI in the 1970s (cf. UK
Govt. Lighthill Report).
Weizenbaum (1976): pointed out that his ELIZA
‘had come close to passing the Turing Test’ (!).
Humans are too ready to attribute intelligence to
unintelligent devices. Risk of oversold
programs.
But some of this was just breast-beating for
profit (Weizenbaum's Computer Power and
Human Reason was Reader's Digest Book of
the Month!). Overselling how much one had
done even while repenting!
References for Knowledge Representation
Rich, E. and Knight, K. (1991) Artificial Intelligence.
McGraw-Hill. Chapter 4.
Cawsey, A. (1997) Essentials of Artificial
Intelligence. Prentice-Hall. (See also the web
reference on the course page.)
Russell, S. and Norvig, P. (1995) Artificial Intelligence:
A Modern Approach. Prentice-Hall. Chapter 3.
Introspective evidence of stages of learning
a skill or expertise – e.g. car driving or
chess playing
- Novice: the car driver or chess player is
consciously following rules.
- Expert: can decide what to do ‘without
thinking’, making decisions based on the
resemblance of the current situation to many
previously experienced situations.
The best chess players can usually instantly
recognise what a good move is.
The expert driver knows when slowing down is
needed without thinking about it (e.g. it
becomes difficult to drive if you consciously
reflect on gear shifting and try to decide each change).
If this intuition is correct, there is more to real
expert understanding than following rules.
BUT there are a few problems where (rule-driven) expert
systems can perform as well as experts.
And even in the absence of claims that expert
systems think like humans, they may well be
useful tools.
They probably work best when used as a consultant or
aide to a human expert or novice.
Examples are medical diagnostic systems,
optimal layout systems for space, and
scheduling algorithms. Feigenbaum’s
DENDRAL at Stanford identifies chemical
compounds.
Criticisms by Hubert Dreyfus
Dreyfus: points out ways in which AI theorists have
overclaimed about what they can do.
e.g. Feigenbaum claims that ‘DENDRAL has been
in use for many years at university and industrial
chemical labs around the world’.
But ‘..when we called several university and
industrial sites that do mass spectroscopy, we
were surprised to find that none of them use
DENDRAL..’
Dreyfus: programming attempts to capture
ordinary, or common-sense, knowledge and
reasoning ability are doomed to failure.
Such knowledge cannot be captured by programs
because it is too contextual and open-ended.
For Dreyfus, the real expert is not following rules.
Strong AI: building programs that actually think (or
striving towards this)
Weak AI 1: Applications – trying to perform tasks
that would require intelligence if performed by
humans. Some attempt to simulate human solutions.
Weak AI 2: Modelling human cognition.
Expert Systems sometimes do better than human
experts: e.g. (Buchanan, 1982) MYCIN did
better than a panel of experts in evaluating ten
selected meningitis cases.
But expert systems benefit from being applied in an
area where the computer can exploit its ability to
follow rules.
Four major problems for expert systems
- Brittleness. Cannot fall back on general
knowledge--e.g. given a mistake in entering data
for a medical expert system, such as entering that a
patient is 130 years old and weighs 40
pounds, the ES would not guess that the values were switched.
- No meta-knowledge. Expert systems do not
know their own limitations.
- Knowledge acquisition. Still a bottleneck in new
domains.
- Validation. Difficult to know what to compare an ES
to (unless compared to human experts
diagnosing real-world problems).
Domain-specific knowledge versus domain-independent knowledge
Expert systems: good at domain-specific
knowledge, bad at domain-independent knowledge.
PUFF knows nothing about medical complaints
except conditions of the lung (i.e. its knowledge is
very specific), and may not even know
whether the lungs are above or below the knees
(an example of common knowledge about
human anatomy).
Does that matter?
Would we care if it diagnosed us efficiently?
Why are we obsessed with the whole human
being?
Is an ES like an idiot savant: a person of very low
general ability who is nevertheless able to perform very well
in one limited domain, e.g. calculating the day of the week on
which particular dates fall?
From Lenat and Guha (1990) (in Rich and Knight,
1991, Artificial Intelligence)
System: How old is the patient?
Human: (looking at his 1957 Chevrolet) 33
System: Are there any spots on the patient's body?
Human: (noticing rust spots) Yes.
System: What colour are the spots?
Human: Reddish-brown.
System: The patient has measles (probability 0.9).
More like ‘automated reference manuals’
(Copeland, 1993).
Advantages of Expert Systems
Human experts can lose expertise.
Ease of transfer of artificial expertise.
No effect of emotion in artificial expertise.
Expert systems are a low-cost alternative:
expensive to develop but cheap to operate.
Limitations:
Lack of creativity, not adaptive, lack of sensory
experience, narrow focus, and no
commonsense knowledge (or meta-knowledge).
Lack of wider understanding
Winograd (SHRDLU’s programmer):
‘..There is a danger inherent in the label ‘expert
system’. When we talk of a human expert we
connote someone whose depth of
understanding serves not only to solve
specific well-formulated problems, but also to
put them into a larger context. We distinguish
between experts and idiot savants. Calling a
program an expert is misleading….’
Can lead to inappropriate expectations
But may be useful if users can be educated
about proper expectations (are people getting
used to limited machines?)
See the following two summaries (from Hayes-Roth, 1983).
Summaries of pulmonary function diagnosis of a
particular patient: one by a human expert,
the other by an expert system (PUFF).
Conclusions: the low diffusing capacity, in
combination with obstruction and a high total
lung capacity, is consistent with a diagnosis of
emphysema. Although bronchodilators were
only slightly useful in this one case,
prolonged use may prove beneficial to the
patient.
PULMONARY FUNCTION DIAGNOSIS:
MODERATELY SEVERE OBSTRUCTIVE
AIRWAYS DISEASE. EMPHYSEMATOUS
TYPE.
Conclusions: Overinflation, fixed airway
obstruction and low diffusing capacity would
all indicate moderately severe obstructive
airway disease of the emphysematous type.
Although there is no response to
bronchodilators on this occasion, more
prolonged use may prove to be more helpful.
PULMONARY FUNCTION DIAGNOSIS: OBSTRUCTIVE AIRWAYS DISEASE,
MODERATELY SEVERE
EMPHYSEMATOUS TYPE.
No totally automatic ways of constructing expert
knowledge bases, but there are programs
which interact with domain experts to extract
expert knowledge efficiently.
e.g. finding holes in the knowledge and prompting
the expert to fill them,
and/or checking for consistency in the knowledge.
An alternative to interviewing the expert: look at
sample problems and solutions, and infer
rules automatically.
e.g. a bank's problem of deciding whether to
approve a loan. Instead of interviewing loan
officers, look at past loans, and try to generate
rules that will maximise the number of good
loans in the future.
Expert system shells are also marketed, e.g.
EMYCIN (Empty MYCIN), which consists of the shell
of an expert system without the domain-specific
knowledge.
A new knowledge domain can be entered, making
use of the same rule mechanisms.
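A toy sketch of the shell idea: the engine stays fixed and a new domain is just a new rule set. The Shell class and both rule sets are invented for illustration; EMYCIN itself was far richer (certainty factors, consultation and explanation programs).

class Shell:
    """Toy expert-system shell: generic machinery, no domain knowledge."""

    def __init__(self):
        self.rules = []                      # knowledge base starts empty

    def load(self, rules):                   # knowledge-base editor, in miniature
        self.rules = rules

    def infer(self, facts):                  # generic inference engine
        facts = set(facts)
        changed = True
        while changed:                       # fire rules until nothing new
            changed = False
            for premises, conclusion in self.rules:
                if set(premises) <= facts and conclusion not in facts:
                    facts.add(conclusion)
                    changed = True
        return facts

shell = Shell()
shell.load([({"fever", "rash"}, "suspect-measles")])       # medical domain
print(shell.infer({"fever", "rash"}))

shell.load([({"wet", "not-raining"}, "burst-pipe")])       # plumbing domain
print(shell.infer({"wet", "not-raining"}))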
Evaluate expert systems: good idea or not?
How important is it to have systems that are
commercially viable, and made use of in the
real world?
Would you be happy to rely on a medical Expert
System instead of a doctor?
Advantages
Disadvantages
Expert systems rely on domain-specific knowledge,
and also on heuristics operating on that knowledge.
Knowledge base: need to find a way of
representing knowledge. MYCIN: production
rules.
Also need to draw appropriate inferences: the
inference engine.
Need to work out what knowledge is
appropriate, and to get it into the knowledge base.
Knowledge engineering
Based on protocol analysis (GPS pioneered
this): human subjects are encouraged to think
aloud as they solve problems, and the protocols are
later analysed to reveal the concepts and
procedures employed.
Protocol analysis was used alongside the Logic Theorist
by Newell and Simon.
Interaction between expert system builder,
knowledge engineer, and human experts in
some problem area.
Some computational psychologists (e.g.
Schvaneveldt) used networks to represent
knowledge elicited as associations of
concepts.
Automated Knowledge Acquisition and
Evaluation
Alternative to time-consuming and expensive
knowledge engineering.
Evaluation depends entirely on the task for which
the ES is designed.
If they function as assistants (like DENDRAL),
we require only that they do not miss any
solutions with respect to a given set of
constraints, and that they take a reasonable length of
time.
If like MYCIN they generate whole solutions, we
need evaluation against human experts (or
rival expert systems).
Evaluation of expert systems.
Comparison to experts: need to follow proper
experimental procedures, i.e. so that raters do not
know which solutions are the human's and which are the
computer's.
DENDRAL: used as an expert's assistant, rather
than a stand-alone expert. Its heuristic search
technique is constrained by the knowledge of the
human expert.
‘…supports hundreds of international users
every day, assisting in structure elucidation
problems for such things as antibiotics and
impurities in manufactured chemicals..’
(Jackson, 1990)
MYCIN: performance compares favourably with
human experts, but it was never used in hospitals.
Suggested reasons (Jackson, 1990):
- Its knowledge base is incomplete, since it
does not cover anything like the full spectrum
of infectious diseases.
- Running it would have required more
computing power than hospitals could afford.
- The interface was not good.
- Trade union protectionism by US doctors?
MYCIN. (Shortliffe and Buchanan, Stanford).
Expert system which attempts to recommend
appropriate therapies for patients with
bacterial infections.
Four-part decision process:
- Deciding if the patient has a significant
infection
- Determining the possible organisms involved
- Selecting a set of drugs that might be
appropriate
- Choosing the most appropriate drug or
combination of drugs.
MYCIN has five components:
- A knowledge base
- A dynamic patient database
- A consultation program
- An explanation program
- A knowledge acquisition program, for
adding or changing rules.
Once MYCIN finds the identities of the disease-causing
organisms, it tries to select a therapy
to treat the disease.
IF the identity of the organism is pseudomonas
THEN therapy should be selected from
among the following drugs:
- Colistin (.98)
- Polymyxin (.96)
- Gentamicin (.96)
- Carbenicillin (.65)
- Sulfisoxazole (.64)
(the decimal numbers show the probability of arresting
the growth of pseudomonas).
Expert systems typically use production
rules (IF-THEN rules),
e.g. a MYCIN rule:
IF:
- the stain of the organism is gram-positive, and
- the morphology of the organism is coccus, and
- the growth conformation of the organism is clumps,
THEN there is suggestive evidence (0.7) that the
identity of the organism is staphylococcus.
MYCIN contains more than 500 such rules.
The complex interaction of these rules gives a high level of
performance
- at the level of human specialists in blood
infections (and much better than GPs)
(Shortliffe, 1976).
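A sketch of how such a rule might be encoded, assuming MYCIN's published certainty-factor scheme in which a conclusion's certainty is the rule's certainty factor times the minimum certainty of its premises (with a 0.2 threshold on premises). The dictionaries and function names below are illustrative assumptions, not MYCIN's LISP.

RULE = {
    "if": [("stain", "gram-positive"),
           ("morphology", "coccus"),
           ("conformation", "clumps")],
    "then": ("identity", "staphylococcus"),
    "cf": 0.7,                               # the rule's certainty factor
}

def apply_rule(rule, evidence):
    """evidence maps (attribute, value) conditions to certainty factors."""
    premise_cfs = [evidence.get(cond, 0.0) for cond in rule["if"]]
    if all(cf > 0.2 for cf in premise_cfs):  # premises must pass the threshold
        return rule["then"], round(rule["cf"] * min(premise_cfs), 2)
    return None

evidence = {("stain", "gram-positive"): 1.0,
            ("morphology", "coccus"): 0.9,
            ("conformation", "clumps"): 0.8}
print(apply_rule(RULE, evidence))
# (('identity', 'staphylococcus'), 0.56)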
The UK NHS is said to be shifting to ‘evidence-based
medicine’ and is VERY short of
experts, so be optimistic!
Diagnostic knowledge (the knowledge base) is
represented as a set of rules:
IF
- the site of the culture is blood, and
- the stain of the organism is gram-negative, and
- the morphology of the organism is rod, and
- the patient has been seriously burned
THEN there is evidence (0.4) that the identity of
the organism is pseudomonas.
MYCIN control structure
It has a top-level goal:
IF (1) there is an organism which requires
therapy, and (2) consideration has been given
to any other organisms requiring therapy
THEN compile a list of possible therapies, and
determine the best one in this list.
These rules are used to reason backward to the
clinical data (backward chaining).
Possible bacteria causing infection are
considered in turn.
MYCIN attempts to prove whether they are
involved.
Another actual expert system
The DENDRAL project began at Stanford University
(USA) in 1965, led by Feigenbaum and Lederberg.
Aim: to determine the molecular structure of an
unknown organic compound.
It analysed data from a mass spectrometer.
A mass spectrometer bombards a chemical
sample with a beam of electrons, causing the
compound to fragment and its components to
be rearranged.
But a complex molecule can fragment in different
ways; we can only make predictions about which
bonds will break.
DENDRAL has the data from the mass spectrogram (i.e. after
the bonds have broken), and has to work out
what the original compound was.
Although there are constraints (i.e. the chemical
formula of the compound has been identified, along with
the presence/absence of certain substructural
features), there are still many possibilities.
The DENDRAL planner can assist in the decision about
which constraints to impose.
DENDRAL could figure out (on the basis of vast
amounts of data from mass spectrographs)
which organic compound was being
analysed.
It gathered the relevant data, formulated
hypotheses about the compound's molecular
structure, and tested the hypotheses by way of
further predictions.
The output was a list of possible molecular
compounds, ranked in order of decreasing
plausibility.
- Required constraints – based on conclusions
already drawn.
- Forbidden constraints – rule out possibilities
that do not fit the data, or whose resultant
structures are chemically unstable.
BUT: it does not emulate the ways in which humans
would actually solve such problems.
DENDRAL (in the 1960s) marked the beginning of the divide
between simulating human behaviour
and trying to arrive at intelligence by any
means available.
Problems with this divide:
- The best way to achieve intelligent behaviour
may be to emulate human intelligence.
- The most interesting aspect of AI is the light it
throws on understanding the human mind.
- Yet… expert systems do work!
Examples of domains for Expert Systems:
- Engineering
  - Design
  - Fault finding
  - Manufacturing planning
  - Scheduling
- Scientific analysis
- Medical diagnosis
- Financial analysis
Expert System Shell
[Diagram: the User interacts through a User Interface; behind it sit an Explanation system, an Inference engine, and a Knowledge base editor, which operate over Case-specific data and the Knowledge base.]
The knowledge base contains a representation of
domain-specific knowledge.
The inference engine performs the reasoning.
The two are kept separate.
Normal method for representing knowledge in
an expert system:
IF-THEN rules.
Often rules do not have certain conclusions:
dealing with uncertainty.
Main approaches to knowledge representation
in AI:
- Logic
- Frames and semantic networks
- IF-THEN rules within a rule-based system
General characteristics
Expert system: a program designed to replicate the
decision-making process of a human expert.
Basic idea: experts have a great deal of
knowledge, and this knowledge can be
provided in some formal manner to a
program. An expert system therefore requires:
- A knowledge base, entered by a knowledge
engineer. ‘Knowledge engineering’
involves interviewing and observing experts,
and converting their words and actions into a
knowledge base.
- Reasoning mechanisms to apply the knowledge
to problems.
- A mechanism for explaining its decisions.
Example: rules for diagnosing a household
emergency.
Rule 1: IF coughing THEN ADD smoky
Rule 2: IF wet AND NOT raining THEN ADD burstpipe
Rule 3: IF NOT coughing AND alarm-rings THEN ADD burglar
Rule 4: IF smoky AND hot THEN ADD fire
Space of hypotheses for the Home Emergency
expert system: fire, burstpipe, burglar.
Try to prove fire (R4).
Try to prove smoky and hot (R1 for smoky).
Are you coughing? – No (smoky fails, therefore
fire fails).
Try to prove burstpipe (R2).
Are you getting wet? – No (burstpipe fails).
Burglar? NOT coughing (already known) and alarm-rings.
Is there an alarm ringing? – Yes.
I conclude there has been a burglary.
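The four rules, the askable questions, and the hypothesis space above might be encoded as plain data, a sketch under the assumption that 'NOT x' premises are checked against recorded answers. The backward chainer sketched after the algorithm below consumes exactly these definitions.

# Household-emergency knowledge base as data (names are illustrative).
HOME_RULES = [
    (["coughing"], "smoky"),                       # Rule 1
    (["wet", "NOT raining"], "burstpipe"),         # Rule 2
    (["NOT coughing", "alarm-rings"], "burglar"),  # Rule 3
    (["smoky", "hot"], "fire"),                    # Rule 4
]
ASKABLE = {"coughing", "wet", "raining", "alarm-rings", "hot"}
HYPOTHESES = ["fire", "burstpipe", "burglar"]      # tried in turn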
Often some explanation facility is included, e.g.:
SYSTEM: Are you coughing?
USER: No
SYSTEM: Are you getting wet?
USER: No
SYSTEM: Is there an alarm ringing?
USER: Why?
SYSTEM: I’m trying to prove there has been a
burglary. It has been established that it is not
the case that you are coughing. Therefore, if
there is an alarm ringing then we can
conclude that there is a burglary. Is there an
alarm ringing?
USER: Yes
SYSTEM: I conclude that there has been a
burglary.
USER: How?
SYSTEM: This follows from Rule 3: If NOT
coughing and alarm-rings THEN ADD burglar.
You told me that it is not the case that you are
coughing. You told me that there is an alarm
ringing.
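A minimal sketch of how such WHY and HOW answers can be generated from a rule trace: WHY reports the rule currently being pursued and what has been established so far; HOW reports the fired rule that produced a conclusion. Function names and message wording are invented for illustration.

def explain_why(rule, established):
    """WHY: report the goal being pursued and what supports it so far."""
    premises, conclusion = rule
    return ("I'm trying to prove: %s. Established so far: %s. "
            "It remains to confirm: %s."
            % (conclusion,
               ", ".join(established) or "nothing",
               ", ".join(p for p in premises if p not in established)))

def explain_how(conclusion, fired_rules):
    """HOW: report the rule whose firing produced the conclusion."""
    for premises, concl in fired_rules:
        if concl == conclusion:
            return ("%s follows from: IF %s THEN ADD %s."
                    % (conclusion, " AND ".join(premises), concl))
    return "%s was given as a fact." % conclusion

rule3 = (["NOT coughing", "alarm-rings"], "burglar")
print(explain_why(rule3, ["NOT coughing"]))
print(explain_how("burglar", [rule3]))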
Use of backward chaining as a problem-solving strategy.
Algorithm: To prove G
- If G is in current facts it is proved.
- Otherwise, if G is askable then ask user,
record their answer as a new current fact,
and succeed or fail according to their
response.
- Otherwise, find a rule which can be used to
conclude G and try to prove each of that
rule’s preconditions.
- Otherwise fail G.
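A direct Python sketch of this algorithm, run over the HOME_RULES, ASKABLE and HYPOTHESES definitions given earlier. User interaction is stubbed with canned answers so the example is self-contained; a real consultation would ask via input().

ANSWERS = {"coughing": False, "wet": False, "alarm-rings": True}  # stubbed user

def ask(question):
    # A real system would do: input("Is it true that %s? " % question)
    return ANSWERS.get(question, False)

def prove(goal, facts, rules, askable):
    if goal.startswith("NOT "):              # negated goal: succeed if the
        return not prove(goal[4:], facts, rules, askable)  # atom fails
    if goal in facts:                        # already established
        return facts[goal]
    if goal in askable:                      # ask, and record the answer
        facts[goal] = ask(goal)
        return facts[goal]
    for premises, conclusion in rules:       # find a rule concluding the goal
        if conclusion == goal and all(
                prove(p, facts, rules, askable) for p in premises):
            facts[goal] = True
            return True
    return False                             # otherwise, fail the goal

facts = {}
for hypothesis in HYPOTHESES:                # fire, burstpipe, burglar in turn
    if prove(hypothesis, facts, HOME_RULES, ASKABLE):
        print("I conclude:", hypothesis)
        break
# Asks about coughing (no), wet (no), alarm-rings (yes) -> concludes burglar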
Fire scenario of rules and facts
R1: IF smoky AND hot THEN ADD fire
R2: IF alarm-beeps THEN ADD smoky
R3: IF alarm-beeps THEN ADD ear-plugs
R4: IF fire THEN ADD switch-on-sprinklers
R5: IF smoky THEN ADD poor-visibility
F1: alarm-beeps
F2: hot
Proving switch-on-sprinklers
Try to prove G1: switch-on-sprinklers.
Matches Rule 4: try to prove G2: fire.
Matches Rule 1: try to prove G3: smoky and G4:
hot.
G3 matches R2.
New goals G5: alarm-beeps, G4: hot.
Goals satisfied (by F1 and F2):
THEREFORE the sprinklers are switched on.
Backward chaining again:
If you know what the conclusion might be,
backward chaining may be better,
e.g. start with a goal to prove, like switch-on-sprinklers.
To prove goal G:
- If G is in the initial facts it is proven.
- Otherwise, find a rule which can be used to
conclude G, and try to prove its preconditions.
- Otherwise, fail G.
Forward Chaining
Facts are held in working memory.
- Find all the rules whose preconditions are
satisfied.
- Select one (using conflict resolution
strategies---see below).
- Perform the actions in its conclusion, possibly
modifying working memory.
Revised simple example:
Rule 1: IF hot AND smoky THEN ADD fire
Rule 2: IF alarm-beeps THEN ADD smoky
Rule 3: IF fire THEN ADD switch-on-sprinklers
Fact 1: alarm-beeps
Fact 2: hot
Check which rules have conditions that hold
(R2). Add a new fact to working memory:
Fact 3: smoky.
Check again (R1). Add a new fact: Fact 4:
fire.
Check again (R3). Sprinklers on!
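A minimal forward-chaining sketch of this example: working memory holds the facts, and each cycle fires the first applicable rule whose conclusion is new (the simplest possible conflict-resolution strategy; better ones are discussed next). Names are illustrative.

RULES = [
    (["hot", "smoky"], "fire"),              # Rule 1
    (["alarm-beeps"], "smoky"),              # Rule 2
    (["fire"], "switch-on-sprinklers"),      # Rule 3
]

memory = {"alarm-beeps", "hot"}              # Facts 1 and 2

while True:
    # Find all rules whose preconditions hold and that add something new.
    applicable = [(p, c) for p, c in RULES
                  if set(p) <= memory and c not in memory]
    if not applicable:
        break
    premises, conclusion = applicable[0]     # naive conflict resolution
    memory.add(conclusion)
    print("Fired: IF %s THEN ADD %s" % (" AND ".join(premises), conclusion))
# Fired: IF alarm-beeps THEN ADD smoky
# Fired: IF hot AND smoky THEN ADD fire
# Fired: IF fire THEN ADD switch-on-sprinklers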
What happens if more than one rule has its
conditions satisfied?
Rule 1: IF hot AND smoky THEN ADD fire
Rule 2: IF alarm-beeps THEN ADD smoky
Rule 3: IF fire THEN ADD switch-on-sprinklers
Rule 4: IF hot AND dry THEN ADD switch-on-humidifier
Rule 5: IF fire THEN DELETE dry
Fact 1: alarm-beeps
Fact 2: dry
Fact 3: hot
In the first cycle, two rules apply: Rule 2 and Rule 4.
If Rule 4 is chosen, the humidifier is switched on.
If Rule 2 is chosen, then Rules 1, 3 and 5 apply,
and the humidifier is never switched on.
Therefore, forward-chaining systems need
conflict resolution strategies.
For example, we could prefer rules involving
facts recently added to memory. Then if
Rule 2 fires, the next rule chosen is Rule 1, as smoky
was recently added.
Or we could prioritise the rules, giving Rule 4 a lower
priority.
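Both strategies just mentioned can be sketched as selection functions over the set of applicable rules; the recency version assumes a recorded history of the order in which facts were added. All names below are illustrative.

def by_recency(applicable, history):
    """Prefer the rule whose most recently added premise is newest."""
    return max(applicable,
               key=lambda rule: max(history.index(p) for p in rule[0]))

def by_priority(applicable, priority):
    """Prefer the rule whose conclusion carries the highest priority."""
    return max(applicable, key=lambda rule: priority[rule[1]])

history = ["alarm-beeps", "dry", "hot", "smoky"]    # order facts were added
applicable = [(["hot", "smoky"], "fire"),           # Rule 1
              (["hot", "dry"], "switch-on-humidifier")]  # Rule 4
print(by_recency(applicable, history))              # picks Rule 1: smoky is newest
print(by_priority(applicable,                       # Rule 4 given lower priority
                  {"fire": 2, "switch-on-humidifier": 1}))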
Inference by pattern matching:
Increases flexibility and allows more complex
facts,
e.g. temperature(kitchen, hot) instead of hot.
We could have Rule 6:
IF temperature(Room, hot) AND
environment(Room, smoky)
THEN ADD fire-in(Room)
where Room is a variable.
Fact 6: temperature(kitchen, hot)
Fact 7: environment(kitchen, smoky)
Therefore fire-in(kitchen) is added to memory.
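A small sketch of how such a variable-containing rule can be matched: facts are tuples, and the Room variable must bind to the same value across both conditions. The tuple encoding and the match function are illustrative assumptions.

FACTS = {("temperature", "kitchen", "hot"),
         ("environment", "kitchen", "smoky")}

def match_rule6(facts):
    """IF temperature(Room, hot) AND environment(Room, smoky)
       THEN ADD fire-in(Room), for every consistent binding of Room."""
    new_facts = set()
    for predicate, room, value in facts:
        if predicate == "temperature" and value == "hot":
            # Room is now bound; test the second condition with that binding.
            if ("environment", room, "smoky") in facts:
                new_facts.add(("fire-in", room))
    return new_facts

print(match_rule6(FACTS))   # {('fire-in', 'kitchen')}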
Forward versus backward chaining:
the choice depends on how many possible hypotheses there are to
consider.
If few, then backward chaining (e.g. MYCIN).
If many, then forward chaining (e.g. XCON).
Backward chaining is related to abduction,
the basic form of scientific explanation
(i.e. find some assumption from which the observed fact
follows).
Necessary ES components: IF-THEN rules
+ facts + interpreter.
Two types of interpreter: forward chaining and
backward chaining.
Forward chaining: Start with some facts, and
use rules to draw new conclusions.
Backward chaining: Start with hypothesis (goal)
to prove, and look for rules to prove that
hypothesis.
Forward chaining: data-driven (alias bottom-up)
Backward chaining: goal-driven (alias top-down)