* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Expert Systems - Department of Computer Science
Clinical decision support system wikipedia , lookup
Soar (cognitive architecture) wikipedia , lookup
Hubert Dreyfus's views on artificial intelligence wikipedia , lookup
Embodied cognitive science wikipedia , lookup
Intelligence explosion wikipedia , lookup
Computer Go wikipedia , lookup
Ecological interface design wikipedia , lookup
Philosophy of artificial intelligence wikipedia , lookup
Human–computer interaction wikipedia , lookup
Ethics of artificial intelligence wikipedia , lookup
Existential risk from artificial general intelligence wikipedia , lookup
Personal knowledge base wikipedia , lookup
Review of Schank’s Scripts: consist of a set of slots. Associated with each slot may be information about the kinds of values it may contain, as well as default values. Scripts have causal structure – events connected to earlier events that make them possible, and later events they enable. Headers of scripts indicate when a script should be activated Related to the concept of Frames (Minsky) which was earlier and for more static structures (e.g. a room). Scripts more like a big verb dictionary, Frames more like one for nouns. What background knowledge do we need to understand a story? What information does the writer expect us to infer? Are we likely to have both in a predetermined script? How do when know when a story has stopped following a script? (Compare: how do we know when the person we are talking to has changed the subject--some people never notice!) De Jong’s ‘sketchy script matcher’ FRUMP At Yale around 1977 DeJong developed a new form of SAM (Lehnert’s Script Applier Mechanism) It sought only to fill initially determined predicate values of interest to a user It worked mainly on newspaper stories about terrorism. For example, FRUMP wants to find out type of car, object it collided with, location of accident, number of people killed/injured, who was at fault. Skims new story to identify appropriate script. Then tries to answer expectations. Connected to UPI wire service. UPI Story.Pisa, Italy. Officials today searched for the black box flight recorder aboard an Italian air force transport plane to determine why the aircraft crashed into a mountainside killing 44 persons. They said the weather was calm and clear, except for some ground level fog, when the US-made Hercules C130 transport plane hit Mt Serra moments after takeoff Thursday. The pilot described as one of the country’s most experienced, did not report any trouble in a brief radio conversation before the crash. FRUMP summary: 44 people were killed when an airplane crashed into a mountain in Italy today. FRUMP is not like a ‘full’ restaurant script (for air disaster) but it simply fills a small number of slots (not necessarily ordered) like NUMBER_DEAD, WHERE_CRASH, WHEN_ CRASH. FRUMP was never statistically evaluated But FRUMP was the forerunner of a 1990’s technology Information Extraction, where ‘templates’ of slots and fillers are filled from web or newspaper text at high speed and huge volume) This new AI technology was created by US Government funding in the 1990s and is highly statistical and competitive between groups/universities/companies. How do humans perform tasks? Part of the aim of research on Script as was to find a way of giving a program the same knowledge that humans use to understand a story--and Script theory was very influential in Psychology. Similarly, in research on Expert Systems, aim is to capture, and apply, the knowledge that human experts have. And in earlier examples, e.g. GPS, idea was to mimic human problem solving ability. It makes sense to emulate humans in Artificial Intelligence research. One of the original motivations for AI research was to understand human mind. But also to get computers to do clever things, no matter how! Difficult to provide an account of intelligence without reference to what humans can do. Although our changed conception of intelligence now is less human-based e.g. perhaps a bee is capable of intelligent behaviour. But if we are concerned to emulate humans, we need to find out how humans think, if we think psychology has ways of telling us that reliably Ways of finding out how people work…….. Introspection (most AI experiments, like CD/Sripts) Protocol analysis (Activity reports--GPS) Psychology experiments One problem for expert systems is that the introspection of experts is unreliable (plumbers cant always tell you how they do it). Much psychology is unsurprising but sometimes helpful--e.g. that people usually cant remember surface words only content-which is consistent with CD’s claims. Return to Expert Systems SHRDLU, and blocks microworld. Domainspecific knowledge (as opposed to domaingeneral knowledge). Understood substantial subset of English by representing and reasoning about a very restricted domain. Had knowledge of microworld, (but no real understanding). But program too complex to be extended to real world. Expert systems: also relied on depth of knowledge of constrained domain. But commercially exploitable. ‘Real’ applications. SHRDLU Dead end: program very complex, also little to do with real world. General realisation that programs that performed well within limits of microworlds, could not capture complexity of everyday human reasoning. Remember that SHRDLU would have to process AN INTERESTING BOOK by accessing all the books it knew in its database and all the interesting things! Hubert Dreyfus (1972): criticism of idea that reasoning and intelligence could be captured by logical rules. Dreyfus was part of the first major reaction against the claims of AI in the 1970s (cf. UK Govt. Lighthill Report). Weizenbaum (1976): pointing out that his ELIZA ‘had come close to passing Turing Test.(!) Humans too ready to attribute intelligence to unintelligent devices. Risk of oversold programs. But some of this was just breast beating for profit (Weizenbaum’s Computer Power and Human Reason was Reader’s Digest Book of the Month!). Overselling how much one had done even while repenting! References for Knowledge Representation Rich and Knight (1991) Artificial Intelligence, McGraw-Hill, Inc. Chapter 4. Cawsey, A. (1997) Essentials of Artificial Intelligence, Prentice-Hall. (see also web reference on course page) Russell and Norvig (1995) Artificial Intelligence: A modern approach. Chapter 3. Introspective evidence of stages of learning a skill or expertise – e.g. car driving or chess playing Novice. Car driver or chess player is consciously following rules. Expert: can decide what to do ‘ without thinking’ – making decisions about what to do based on resemblance of current situation to many previously experience situations. best chess players can usually instantly recognise what is a good move. expert driver knows when slowing down is needed without thinking about it. (e.g. becomes difficult to drive if you consciously reflect about gear shifting and try to decide If this intuition is correct, there is more to real expert understanding than following rules. BUT a few problems where (rule driven) expert systems can perform as well as experts. And even in the absence of claims that expert systems think like humans, these may well be a useful tools. Probably work best when used as consultant or aide to human expert or novice. Examples are medical diagnostic systems, optimal layout systems for space, and scheduling algorithms. Feigenbaum’s DENDRAL at Stanford predicts chemical compounds. Criticisms by Hubert Dreyfus Dreyfus: points out ways in which AI theorists have overclaimed about what they can do. e.g. Feigenbaum claims that ‘DENDRAL has been in use for many years at university and industrial chemical labs around the world’. But ‘..when we called several university and industrial sites that do mass spectroscopy, we were surprised to find that none of them use DENDRAL..’ Dreyfus: Programming attempts to capture ordinary, or common sense knowledge and reasoning ability are doomed to failure. Such knowledge cannot be captured by programs because it is too contextual and open-ended. For Dreyfus, the real expert is not following rules Strong AI: building programs that actually think (or striving towards this) Weak AI 1: Applications – trying to perform tasks that would require intelligence if performed by humans. Some attempt to simulate human solutions Weak AI 2: Modelling human cognition Expert Systems sometimes do better than human experts. e.g. Buchanan, 1982, MYCIN did better than panel of experts in evaluating ten selected meningitis cases. But expert systems benefit from being applied in an area where computer can exploit an ability to follow rules. Four major problems for expert systems Brittleness. Cannot fall back on general knowledge – e.g. if mistake in entering data for medical expert system, entering that patient is 130 years old, and weighs 40 pounds. ES would not guess values switched. No Meta-knowledge. Expert systems do not know their own limitations. Knowledge acquisition. Still bottleneck in new domains. Validation. Difficult to know what to compare it to (unless compared to human experts diagnosing real world problems). Domain-specific knowledge versus domainindependent knowledge Expert systems: good at domain-specific knowledge, bad at domain-independent. PUFF knows nothing about medical complaints except conditions of the lung (i.e. knowledge very specific), and may not even know whether lungs are above or below knees (example of common knowledge about human anatomy). Does that matter? Would we care if it diagnosed us efficiently? Why are we obsessed with being a human whole? Is an ES like an Idiot savant: person who is basically retarded, but able to perform very well in one limited domain. e.g. calculating day on which particular dates fall. From Lenat and Guha (1990) (in Rich and Knight, 1991, Artificial Intelligence) System: How old is the patient? Human: (looking at his 1957 chevrolet) 33 System: Are there any spots on the patients body? Human: (noticing rust spots) Yes. System: What colour are the spots? Human: Reddish-brown. System: The patient has measles (probability 0.9) More like ‘automated reference manuals’ (Copeland, 1993). Advantages of Expert Systems Human experts can lose expertise. Ease of transfer of artificial expertise. No effect of emotion in artificial expertise. Expert systems are a low cost alternative – expensive to develop but cheap to operate. Limitations: Lack of creativity, not adaptive, lack sensory experience, narrow focus, and no commonsense knowledge (or metaknowledge). Lack of wider understanding Winograd (Shrdlu’s programmer) ‘..There is a danger inherent in the label ‘expert system’. When we talk of a human expert we connote someone whose depth of understanding serves not only to solve specific well-formulated problems, but also to put them into a larger context. We distinguish between experts and idiot savants. Calling a program an expert is misleading….’ Can lead to inappropriate expectations But may be useful if users can be educated about proper expectations (are people getting used to limited machines?) See following two paragraphs (from HayesRoth, 1983) Summaries of pulmonary function diagnosis of particular patient. One by human expert, other by expert system (PUFF). Conclusions: the low diffusing capacity, in combination with obstruction and a high total lung capacity is consistent with a diagnosis of emphysema. Although bronchodilators were only slightly useful in this one case, prolonged use may prove beneficial to the patient. PULMONARY FUNCTION DIAGNOSIS: MODERATELY SEVERE OBSTRUCTIVE AIRWAYS DISEASE. EMPHYSEMATOUS TYPE. Conclusions: Overinflation, fixed airway obstruction and low diffusing capacity would all indicate moderately severe obstructive airway disease of the emphysematous type. Although there is no response to bronchodilators on this occasion, more prolonged use may prove to me more helpful. PULMONARY FUNCTION DIAGNOSIS: OBSTRUCTIVE AIRWAYS DISEASE, MODERATELY SEVERE EMPHYSEMATOUS TYPE. No totally automatic ways of constructing expert knowledge bases, but there are programs which interact with domain experts to extract expert knowledge efficiently. e.g. finding holes in knowledge and prompting expert to fill them. AND/OR checking for consistency in knowledge OR Alternative to interviewing expert: looking at sample problem and solutions, and inferring its own rules. e.g. bank’s problem of deciding whether to approve a loan. Instead of interviewing loan oficers, look at past loans, and try to generate loans that will maximise number of good loans in the future. Expert system Shells also marketed. e.g. EMYCIN (Empty Mycin) Consists of the shell of an expert system, without domain specific knowledge. New knowledge domain can be entered, and make use of same rule mechanisms. Evaluate expert systems: good idea or not? How important is it to have systems that are commercially viable, and made use of in the real world? Would you be happy to rely on a medical Expert System instead of a doctor? Advantages Disadvantages Reliance of expert systems on domain specific knowledge Also on heuristics operating on the knowledge Knowledge-base: need to find a way of representing knowledge. MYCIN: production rules. Also need to draw appropriate inferences – inference-engine. Need to work out what knowledge is appropriate, and to get it into the knowledgebase. Knowledge engineering Based on protocol analysis (GPs pioneered this) : human subjects encouraged to think aloud as they solved problems. Protocols later analysed to reveal concepts and procedures employed. Protocol analysis used alongside Logic Theorist by Newell and Simon. Interaction between expert system builder, knowledge engineer, and human experts in some problem area. Some computational psychologists (e.g. Schvaneveldt) used networks to represent knowledge elicited as associations of concepts. Automated Knowledge Acquisition and Evaluation Alternative to time-consuming and expensive knowledge engineering. Evaluation depends entirely on task for which ES are designed. If they function as assistants (like DENDRAL) we need only that they do not miss any solutions with respect to given set of constraints, and take a reasonable length of time. If like MYCIN they generate whole solutions, we need evaluation against human experts (or rival expert systems). Evaluation of expert systems. Comparison to experts: need to follow experimental procedures, i.e. so raters don’t know which are human and which are computer’s solutions. DENDRAL: used as expert’s assistant, rather than stand alone expert. Heuristic search technique constrained by knowledge of human expert. ‘…supports hundreds of international users every day, assisting in structure elucidation problems for such things as antibiotics and impurities in manufactured chemicals..’ (Jackson, 1990) . MYCIN: performance compares favourably with human experts. But never used in hospitals Suggested reasons (Jackson, 1990) Its knowledge base is incomplete since it does not cover anything like the full spectrum of infectious diseases. Running it would have required more computing power than hospitals could afford. Interface not good. Trade union protectionism by US doctors? MYCIN. (Shortliffe and Buchanan, Stanford). Expert system which attempts to recommend appropriate therapies for patients with bacterial infections. Four part decision process: Deciding if the patient has a significant infection Determining the possible organisms involved Selected a set of drugs that might be appropriate Choosing the most appropriate drug or combination of drugs. MYCIN has five components. A knowledge base A dynamic patient database A consultation program An explanation program A knowledge acquisition program, for adding or changing rules. Once MYCIN finds the identities of the diseasecausing organisms, it tries to select therapy to treat disease. IF the identity of the organism is pseudomonas THEN therapy should be selected from among the following drugs: Colistin (.98) Polymyxin (.96) Gentamicin (.96) Carbenicillin (.65) Sulfisoxazole (.64) (decimal numbers show prob. of arresting growth of pseudomonas). Expert systems typically use production rules: (IF – THEN rules) e.g. MYCIN rule If: The stain of the organism is gram-positive, and The morphology of the organism is coccus, and The growth conformation of the organism is clumps, then there is suggestive evidence (0.7) that the identity of the organism is staphylococcus. MYCIN contains more than 500 such rules. Complex interactions of rules gives high level of performance. - at level of human specialists in blood infections (and much better than GPs) (Shortliffe, 1976). The UK NHS is said to be shifting to ‘evidence based medicine’ and is VERY short of experts, so be optimistic! Diagnostic knowledge (knowledge-based) is represented as a set of rules IF The site of the culture is blood, and The stain of the organism is gram net, and The morphology of the organism is rod, and The patient has been seriously burned THEN there is evidence (0.4) that the identity of the organism is pseudomonas. MYCIN control structure. Has top level goal IF (1) there is an organism which requires therapy, and (2) consideration has been given to any other organisms requiring therapy THEN compile a list of possible therapies, and determine the best one in this list. These rules used to reason backward to the clinical data (backward chaining). Possible bacteria causing infection are considered in turn. MYCIN attempts to prove whether they are involved. Another actual expert system DENDRAL project, began at Stanford University (USA) in 1965. Feigenbaum and Lederberg. Aim: to determine the molecular structure of an unknown organic compound. Analysed data from mass spectrometer. Mass spectrometer – bombards chemical sample with beam of electrons, causing compound to fragment, and components to be rearranged. But complex molecule can fragment in different ways; can only make predictions about which bonds will break. Has data from mass spectogram (i.e. after bonds have broken), and has to work out what the original compound was. Although there are constraints (i.e. has identified chemical formula of compound, and presence/absence of certain substructural features) still many possibilities. DENDRAL planner can assist in decision about which constraints to impose. DENDRAL could figure out (on basis of vast amount of data from mass spectographs) which organic compound was being analysed. Performance relevant data, formulated hypotheses about compound’s molecular structure, and tested hypotheses by way of further predictions. Output was list of possible molecular compounds ranked in terms of decreasing plausibility. Required constraints – based on conclusions already drawn. Forbidden constraints – rules out possibilities that don’t fit the data, or because resultant structures are chemically unstable. BUT: does not emulate ways in which humans would actually solve problems. DENDRAL (in 1960s) – beginning of divide between simulation of human behaviour, and trying to arrive at intelligence by any means available. Problems: Best way to achieve intelligent behaviour may be to emulate human intelligence. Most interesting aspect of AI is the light it throws on understanding the human mind. Yet…expert systems do work! Examples of domains for Expert Systems: Engineering - Design - Fault finding - Manufacturing planning - Scheduling Scientific analysis Medical diagnosis Financial analysis Expert System Shell User User Interface Explanation system Inference engine Knowledge base editor Case specific data Knowledge base Knowledge-base, contains representation of domain-specific knowledge. Inference engine – performs reasoning. Two kept separate. Normal method for representing knowledge in an expert system: IF-THEN rules. Often rules do not have certain conclusions: dealing with uncertainty. Main approaches to knowledge representation in AI. Logic Frames and semantic networks If-then rules within a rule-based system General characteristics Expert system: program designed to replicate decision making process of human expert. Basic idea: experts have a great deal of knowledge, and this knowledge could be provided in some formal manner to a program. Requires knowledge base. Knowledge base entered by knowledge engineer – ‘knowledge engineering’, involves interviewing and observing experts, and converting their words and action into a knowledge base Reasoning mechanisms to apply knowledge to problems Mechanism for explaining their decisions. Example: rules for diagnosing household emergency. Rule 1: If coughing THEN add smoky Rule 2: If wet and NOT raining THEN add burstpipe Rule 3: If NOT coughing AND alarm-rings THEN ADD burglar Rule 4: If smoky AND hot THEN ADD fire Space of hypotheses for Home Emergency expert system: fire, burstpipe, burglar. Try to prove fire (R4) Try to prove smoky and hot (R1 for smoky) Are you coughing? – no (smoky fails, therefore fire fails) Try to prove burst pipe (R2) Are you getting wet? No (burst pipe fails) Burglar? Not coughing (known) and alarm rings. Is there an alarm ringing? Yes I conclude there’s been a burglary. Often some explanation facility included. e.g.: SYSTEM: Are you coughing? USER: No SYSTEM: Are you getting wet? USER: No SYSTEM: Is there an alarm ringing? USER: Why? SYSTEM: I’m trying to prove there has been a burglary. It has been established that it is not the case that you are coughing. Therefore, if there is an alarm ringing then we can conclude that there is a burglary. Is there an alarm ringing? USER: Yes SYSTEM: I conclude that there has been a burglary. USER: How? SYSTEM: This follows from Rule 3: If NOT coughing and alarm-rings THEN ADD burglar. You told me that it is not the case that you are coughing. You told me that there is an alarm ringing. Use of backward chaining as problem solving strategy. Algorithm: To prove G - If G is in current facts it is proved. - Otherwise, if G is askable then ask user, record their answer as a new current fact, and succeed or fail according to their response. - Otherwise, find a rule which can be used to conclude G and try to prove each of that rule’s preconditions. - Otherwise fail G. Fire scenario of rules and facts R1: IF smoky AND hot THEN ADD fire R2: IF alarm-beeps THEN ADD smoky R3: IF alarm-beeps THEN ADD ear-plugs R4: IF fire THEN ADD switch-on-sprinklers R5: IF smoky THEN ADD poor-visibility F1: alarm-beeps F2: hot Proving Switch-on-sprinklers Try to prove G1 switch-on-sprinklers Matches Rule 4: try to prove G2 fire Matches Rule 1: try to prove G3 smoky and G4 hot G3 matches R2. New goals G5: alarm beeps, G4: hot. Goals satisfied (by F1 and F2): THEREFORE sprinkler switched on. Backward chaining again: If you know what the conclusion might be: backward chaining may be better. e.g. start with goal to prove, like switch-onsprinkler. To prove goal G: If G is in initial facts it is proven Otherwise find a rule which ca be used to conclude G, and try to prove its preconditions. Otherwise, fail G. Forward Chaining Facts held in working memory Find all the rules which have preconditions satisfied Select one (using conflict resolution strategies---see below) Perform actions in conclusion, maybe modifying working memory. Revised simple example: Rule 1: IF hot AND smoky THEN ADD fire Rule 2: IF alarm-beeps THEN ADD smoky Rule 3: IF fire THEN ADD switch-on-sprinklers Fact 1: alarm-beeps Fact 2: hot Check to see rules whose conditions hold (=R2) Add new fact to working memory: Fact 3: smoky. Check again (=R1) Add new fact. Fact 4: Fire. Check again (=R3) Sprinklers on! What happens if more than one rule has its conditions satisfied? Rule 1: IF hot AND smoky THEN ADD fire Rule 2: IF alarm-beeps then add smoky Rule 3: IF fire THEN ADD switch-on-sprinklers Rule 4: IF hot AND dry THEN switch on humidifier Rule 5: IF fire THEN delete dry. Fact 1: alarm-beeps Fact 2: dry Fact 3: hot In first cycle, 2 rules apply: Rule 2 and Rule 4. If Rule 4 chosen, humidifier switched on. If Rule 2 chosen, then Rules 1, 3 and 5 apply, and humidifier never switched on. Therefore, Forward chaining systems need conflict resolution strategies. For example – we could prefer rules involving facts recently added to memory. Therefore, if Rule 2 fires, next rule is Rule 1 as smoky recently added. Or could prioritise rules. Give Rule 4 a lower priority. Inference by pattern matching: Increases flexibility and allows more complex facts: e.g. Temperature (kitchen, hot) instead of hot Could have Rule 6: If Temperature (room, hot) AND Environment (room, smoky), Then ADD Fire-in (Room). Fact 6: Temperature (kitchen, hot) Fact 7: Environment (kitchen, smoky) Therefore Fire-in (Kitchen) added to memory. Forward versus backward chaining: depends on how many possible hypotheses to consider. If few, then backward chaining (e.g. MYCIN). If many, then forward chaining (e. XCON). Backward chaining also known as abduction, the basic form of scientific explanation (I.e. find some assumption that proves this fact true). Necessary ES components: IF-THEN rules, + facts, + interpreter Two types of interpreter: forward chaining and backward chaining. Forward chaining: Start with some facts, and use rules to draw new conclusions. Backward chaining: Start with hypothesis (goal) to prove, and look for rules to prove that hypothesis. Forward chaining: data-driven (alias bottom-up) Backward chaining: goal-driven (alias top-down)