AI : UNIT 5 PAGE : 1
Department of Computer Science
Study material
Class : III B.Sc (Computer Science)
Subject : Artificial Intelligence
Semester : VI
Year : 2002-2003
Portion : V - Unit
Staff : I.Gobi

Syllabus: Representing knowledge using rules: Procedural Vs Declarative knowledge - Logic programming - Forward Vs Backward reasoning - Matching - Control knowledge. Brief explanation of expert systems.

1. Procedural knowledge Vs Declarative Knowledge

A declarative representation is one in which the knowledge is specified but the use to which that knowledge is to be put is not given. To use a declarative representation, we must therefore add a program that specifies what is to be done with the knowledge and how. For example, a set of logical assertions can be combined with a resolution theorem prover to give a complete program for solving problems. In some cases, however, the logical assertions can be viewed as a program rather than as data to a program. The implication statements then define the legitimate reasoning paths, and the atomic assertions provide the starting points of those paths. These reasoning paths define the possible execution paths, much as "if then else" does in traditional programming. Viewed this way, logical assertions are a procedural representation of knowledge.

A procedural representation is one in which the control information that is necessary to use the knowledge is embedded in the knowledge itself. The real difference between the declarative and procedural views of knowledge lies in where the control information resides.

For example, consider the following knowledge base:

    Man(Marcus)
    Man(Caesar)
    Person(Cleopatra)
    ∀x: Man(x) → Person(x)

Now try to answer the question

    ?Person(y)

The knowledge base justifies any of the following answers:

    y = Marcus
    y = Caesar
    y = Cleopatra

More than one value satisfies the predicate. If only one value is needed, the answer to the question depends on the order in which the assertions are examined during the search for a response.
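The dependence on order can be made concrete with a small sketch. It is only an illustration under stated assumptions: the tuple encoding of assertions, the function name `answers`, and the reordered variant `kb_rule_first` (the same assertions with the rule moved ahead of Person(Cleopatra)) are hypothetical and not part of the notes. Assertions are tried in the order listed and the search is depth-first.

```python
# Each assertion is ("fact", predicate, constant) or
# ("rule", head_predicate, body_predicate), the latter standing for
# "for all x: body(x) -> head(x)". This encoding is an assumption.

def answers(kb, goal):
    """Yield every constant c for which goal(c) is provable,
    trying assertions in listed order, depth-first."""
    for kind, head, arg in kb:
        if head != goal:
            continue
        if kind == "fact":
            yield arg
        else:
            # A rule: its body predicate becomes the new subgoal.
            yield from answers(kb, arg)

kb_fact_first = [
    ("fact", "man", "marcus"),
    ("fact", "man", "caesar"),
    ("fact", "person", "cleopatra"),
    ("rule", "person", "man"),
]

# Same assertions, but the rule now precedes Person(Cleopatra).
kb_rule_first = [
    ("fact", "man", "marcus"),
    ("fact", "man", "caesar"),
    ("rule", "person", "man"),
    ("fact", "person", "cleopatra"),
]

# Declaratively the two knowledge bases justify the same set of answers.
print(sorted(answers(kb_fact_first, "person")))  # ['caesar', 'cleopatra', 'marcus']

# Procedurally, the first answer found depends on the order.
print(next(answers(kb_fact_first, "person")))    # cleopatra
print(next(answers(kb_rule_first, "person")))    # marcus
```

The sketch handles only one-argument predicates and one-premise rules, which is all this knowledge base needs.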
If the assertions are declarative, they do not themselves say anything about how they will be examined. In a procedural representation, they do. The order of assertions is very important in a procedural knowledge base, so we have to specify a control strategy: the assertions are examined in the order in which they appear in the program, and the search proceeds depth-first (once a path is selected, it is followed immediately). Under this strategy the answer to the question above is y = Cleopatra.

To see the difference between the procedural and declarative representations clearly, consider the same assertions in a different order:

    Man(Marcus)
    Man(Caesar)
    ∀x: Man(x) → Person(x)
    Person(Cleopatra)

Viewed declaratively, the two knowledge bases are the same. Viewed procedurally, and using the same control model, we now get the answer Marcus, because the first statement that can achieve the Person goal is the inference rule ∀x: Man(x) → Person(x). The rule sets up a subgoal to find a man. The statements are again examined from the beginning, and Marcus is found to satisfy the subgoal and thus the goal.

2. LOGIC PROGRAMMING

Logic programming is a programming paradigm in which logical assertions are viewed as programs. There are several logic programming systems; PROLOG is one of them. A PROLOG program consists of several logical assertions, each of which is a Horn clause, i.e. a clause with at most one positive literal.

Ex: P, ¬P ∨ Q, P → Q

The facts are represented as Horn clauses for 2 reasons:
1. Because of the uniform representation, a simple and efficient interpreter can be written.
2. The logic of Horn clause systems is decidable.

PROLOG programs are actually sets of Horn clauses that have been transformed as follows (this transformation accounts for the first 2 differences between the logic and PROLOG representations listed later in this section):
1. If the Horn clause contains no negative literals, leave it as it is.
2. Otherwise, rewrite the Horn clause as an implication, combining all of the negative literals into the antecedent of the implication and the single positive literal into the consequent.

This procedure causes a clause that originally consisted of a disjunction of literals (one of them positive) to be transformed into a single implication whose antecedent is a conjunction. In a clause, all variables are implicitly universally quantified. But when we apply this transformation, any variables that occurred in negative literals, and so now occur in the antecedent, become existentially quantified, while the variables in the consequent are still universally quantified. For example, the PROLOG clause

    P(x) :- Q(x, y)

is equivalent to the logical expression

    ∀x: ∃y: Q(x, y) → P(x)

The difference between the logic and PROLOG representations is that the PROLOG interpreter has a fixed control strategy, so the assertions in a PROLOG program define a particular search path to an answer to any question. Logical assertions, in contrast, define only the set of answers; they say nothing about how to choose among those answers if there is more than one.

Consider the following example.

A. Logical representation

    ∀x: pet(x) ∧ small(x) → apartmentpet(x)
    ∀x: cat(x) ∨ dog(x) → pet(x)
    ∀x: poodle(x) → dog(x) ∧ small(x)
    poodle(fluffy)

B. PROLOG representation

    apartmentpet(X) :- pet(X), small(X).
    pet(X) :- cat(X).
    pet(X) :- dog(X).
    dog(X) :- poodle(X).
    small(X) :- poodle(X).
    poodle(fluffy).

Both representations contain 2 types of statements:
Facts (no variables) are statements about specific objects.
Rules (with variables) are statements about classes of objects.

Difference between logic and PROLOG representation

1. In the logic representation, the variables are explicitly quantified. In the PROLOG representation, the quantification is provided implicitly by the way the variables are interpreted. In PROLOG, constants and variables are distinguished by the way they are written: variables begin with an uppercase letter, and constants begin with a lowercase letter or a number.
2. In logic, the ∧ and ∨ operators are used. In PROLOG, ∧ is written as ',' and ∨ is not written explicitly. Instead, a disjunction is represented as a list of alternative statements, any one of which may provide the basis for a conclusion.
3. In logic, p → q means p implies q. In PROLOG it is written "backward" as q :- p. This form is natural because the interpreter always works backward from a goal.

The basic PROLOG control strategy is as follows:
Step 1. Start with a problem statement, which is viewed as a goal to be proved.
Step 2. Look for assertions that can prove the goal. Consider facts, which prove the goal directly, and also consider any rule whose head matches the goal.
Step 3. To decide whether a particular fact or rule can be applied to the current problem, invoke a standard unification procedure.
Step 4. Reason backward from the goal until a path is found that terminates with assertions in the program. At each choice point, consider the options in the order in which they appear in the program. If a goal has more than one conjunctive part, prove the parts in the order in which they appear.

Ex: To find a value of X that satisfies the predicate apartmentpet(X), the goal is

    ?- apartmentpet(X).

The interpreter looks for a fact with the predicate apartmentpet or a rule with that predicate as its head. (Usually, in a PROLOG program, the facts containing a given predicate come before the rules for that predicate, so the facts are used if appropriate and the rules are used only if no suitable fact is available.) There are no facts with the predicate apartmentpet, so the one rule for it must be used. The rule succeeds if both clauses on its right-hand side can be satisfied, so the interpreter tries to prove each of them in the order in which they appear.

There are no facts with the predicate pet, but this time there are two rules, rather than one, with that predicate at the head. Only one of them needs to succeed, and they are tried in the order in which they are given. The first one fails, because there are no assertions about the predicate cat in the program. The second one eventually succeeds, using the rule about dogs and poodles and the fact poodle(fluffy), and X is bound to fluffy.

Now the second clause, small(X), of the initial rule must be checked. Since X is already bound to fluffy, the goal small(fluffy) must be proved, and this is done by reasoning backward to the assertion poodle(fluffy). The result of the program is apartmentpet(fluffy).
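The control strategy traced above can be sketched as a miniature interpreter. This is a minimal illustration, not how a real PROLOG system is implemented: the term encoding (tuples for compound terms, capitalised strings for variables) and the names `unify`, `solve`, `ask`, and `PROGRAM` are assumptions made for the example, and like most PROLOG systems by default it omits the occurs check.

```python
import itertools

def is_var(t):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    # Follow variable bindings until an unbound variable or a non-variable.
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(a, b, subst):
    """Return an extended substitution unifying a and b, or None."""
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None

_fresh = itertools.count()

def rename(t, n):
    # Give each use of a clause its own copy of the clause's variables.
    if is_var(t):
        return f"{t}_{n}"
    if isinstance(t, tuple):
        return tuple(rename(x, n) for x in t)
    return t

def solve(goals, subst, program):
    """Depth-first, left-to-right proof search with backtracking."""
    if not goals:
        yield subst
        return
    first, rest = goals[0], goals[1:]
    for head, body in program:          # clauses in program order
        n = next(_fresh)
        new_subst = unify(first, rename(head, n), subst)
        if new_subst is not None:
            yield from solve([rename(b, n) for b in body] + rest, new_subst, program)

def resolve(t, subst):
    t = walk(t, subst)
    if isinstance(t, tuple):
        return tuple(resolve(x, subst) for x in t)
    return t

def ask(query, program):
    return [resolve(query, s) for s in solve([query], {}, program)]

# The apartment-pet program: (head, [body goals]); facts have an empty body.
PROGRAM = [
    (("apartmentpet", "X"), [("pet", "X"), ("small", "X")]),
    (("pet", "X"), [("cat", "X")]),
    (("pet", "X"), [("dog", "X")]),
    (("dog", "X"), [("poodle", "X")]),
    (("small", "X"), [("poodle", "X")]),
    (("poodle", "fluffy"), []),
]

print(ask(("apartmentpet", "Who"), PROGRAM))  # [('apartmentpet', 'fluffy')]
```

The search tries the cat rule for pet first, fails because nothing in the program proves cat, backtracks to the dog rule, and then proves small for the same binding, which mirrors the trace given in the text.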
Logical negation (¬) cannot be represented explicitly in PROLOG, so it is not possible to encode directly a logical assertion such as ∀x: dog(x) → ¬cat(x). Instead, negation is represented implicitly by the lack of an assertion. This leads to the problem-solving strategy called negation as failure.

Ex: Suppose the goal is ?- cat(fluffy). The program returns false, because it is unable to prove that fluffy is a cat. But the program returns the same answer when given the goal ?- cat(mittens), even though it knows nothing about mittens. What is being relied on is the closed world assumption, which states that all relevant true assertions are available in the knowledge base or can be derived from the assertions that are. Assertions that are not present can therefore be assumed to be false. This assumption causes a serious problem when the knowledge base is incomplete.

A great advantage of logic programming is that the programmer needs only to specify rules and facts, since a search engine is built directly into the language. The disadvantage is that the search control is fixed. Although it is possible to write PROLOG programs that use search strategies other than depth-first with backtracking, it is difficult to do so.

3. FORWARD Vs BACKWARD REASONING

Search is nothing but looking for a goal in a search tree. The two types of search are:
i) Forward: start from the start state.
ii) Backward: start from the goal state.

A production system views forward and backward reasoning as symmetric processes. Consider the game of the 8-puzzle. Sample rules are:

    Square 1 empty and Square 2 contains tile n → Square 2 empty and Square 1 contains tile n
    Square 1 empty and Square 4 contains tile n → Square 4 empty and Square 1 contains tile n

We can solve the problem in 2 ways.

1. Reasoning forward from the initial state:
Step 1. Begin building a tree of move sequences by starting with the initial configuration at the root of the tree.
Step 2. Generate the next level of the tree by finding all rules whose left-hand sides match the root node. Use their right-hand sides to create the new configurations.
Step 3. Generate each further level by taking the nodes of the previous level and applying to them all rules whose left-hand sides match.

2. Reasoning backward from the goal state:
Step 1. Begin building a tree of move sequences by starting with the goal configuration at the root of the tree.
Step 2. Generate the next level of the tree by finding all rules whose right-hand sides match the root node. Use their left-hand sides to create the new configurations.
Step 3. Generate each further level by taking the nodes of the previous level and applying to them all rules whose right-hand sides match.

The same rules can be used in both cases. In forward reasoning, the left-hand sides of the rules are matched against the current state and the right-hand sides are used to generate new states. In backward reasoning, the right-hand sides of the rules are matched against the current state and the left-hand sides are used to generate new states.

Four factors influence the choice of reasoning direction:
1. Are there more possible start states or goal states? We move from the smaller set of states to the larger.
2. In which direction is the branching factor greater? We proceed in the direction with the lower branching factor.
3. Will the program be asked to justify its reasoning process to a user? If so, select the direction that corresponds more closely to the way the user thinks.
4. What kind of event is going to trigger a problem-solving episode?

Note: If the trigger is the arrival of a new fact, forward reasoning makes sense. If it is a query that needs a response, backward reasoning is more natural.

Example 1: It is easier to drive from an unfamiliar place to home than from home to an unfamiliar place.
If home is the starting place and the unfamiliar place is the goal, we should therefore reason backward from the unfamiliar place to home.

Example 2: Consider the problem of symbolic integration. The problem space is a set of formulas, some of which contain integral expressions. The START state is the given formula, which contains some integrals. The GOAL state is an equivalent expression of the formula without any integrals. Here we start from the formula with integrals and proceed to an integral-free expression, rather than starting from an integral-free expression.

Example 3: The third factor concerns whether the program must be able to justify its reasoning. For example, doctors are usually unwilling to accept advice from a diagnostic program that cannot explain its reasoning.

3.1 BI-DIRECTIONAL SEARCH

Instead of searching either forward or backward, we can search in both directions simultaneously. That is, start forward from the starting state and backward from the goal state at the same time, and continue until the two paths meet. This strategy is called bi-directional search. The accompanying figure shows why bi-directional search can be ineffective: the two searches may pass each other, resulting in more work.

The form of the rules determines whether the same rules can be applied for both forward and backward reasoning. If both the left-hand side and the right-hand side of a rule contain pure assertions, the rule can be reversed, and so the same rule can be applied for both types of reasoning. If the right-hand side of the rule contains an arbitrary procedure, the rule cannot be reversed. In that case the commitment to a direction of reasoning must be made when the rule is written.

4. MATCHING

A clever search involves choosing, from a set of rules, the ones that lead to a solution. To select a rule we need to compare the current state of the system with the preconditions of the rule. The following are some approaches to matching.

4.1. Indexing

The simplest way to select rules is to compare each rule's preconditions to the current state and extract all the rules that match. There are two problems with this:
1. To solve very interesting problems we need a large number of rules. Scanning through all of them at each step of the search would be hopelessly inefficient.
2. It is not always immediately obvious whether a rule's preconditions are satisfied by a particular state.

To deal with the first problem, instead of scanning through the rules, use the current state as an index into the rules and select the matching ones immediately. For example, in chess we can assign an index to each board position: treat the board description as a large number and use a hashing function that turns that number into an index into the rules. All the rules that describe a given board position are stored under the same key.

It is better to write rules in a more general form, but indexing is then no longer so simple. There is often a trade-off between the ease of writing rules and the simplicity of the matching process. In PROLOG and many theorem-proving systems, rules are indexed by the predicates they contain, so the rules that could apply can be accessed quickly. In the chess example, rules can be indexed by pieces and their positions. Although this approach has some limitations, it is very important to the efficient operation of rule-based systems.

4.2. Matching with Variables

Sometimes it is a nontrivial problem to examine a particular rule and a given problem state and determine whether the preconditions of the rule are satisfied. Problems arise when the preconditions are not stated as exact descriptions of particular situations. Discovering whether there is a match between a particular situation and the preconditions of a given rule may then itself require a significant search process. Consider the following facts and rules about families.
Facts:
    SON(Mary, John)
    SON(John, Bill)
    SON(Bill, Tom)
    SON(Bill, Joe)
    DAUGHTER(John, Sue)
    DAUGHTER(Sue, Judy)

Rules:
    (1) SON(x, y) ∧ SON(y, z) → GRANDSON(x, z)
    (2) DAUGHTER(x, y) ∧ SON(y, z) → GRANDSON(x, z)
    (3) SON(x, y) ∧ DAUGHTER(y, z) → GRANDDAUGHTER(x, z)

Suppose we want to answer the question "Who is John's grandson?" We have to apply rules 1 and 2, which involve the predicate GRANDSON. To see whether the preconditions of rule 1 are satisfied, we must find a single substitution for y such that both SON(John, y) and SON(y, z) are true for some value of z. That is, we must find a y that satisfies both predicates. We can do this either by checking all of John's sons and seeing whether any of them has a son, or by checking everyone who has a son and seeing whether any of them is John's son. In this case the search is smallest if we start from John's sons.

This example illustrates two issues:

(1) It is important to record not only that a match was found between a pattern and a state description, but also the bindings that were made during the match, because some of those bindings may be used in the action part of the rule. In the example above, the matcher must pass this information to the rule applier. The preconditions of rule (1) are satisfied with x = John, y = Bill and z = Tom; the rule applier uses the bindings of x and z to John and Tom and concludes GRANDSON(John, Tom) from GRANDSON(x, z).

(2) In nontrivial matching, a single rule may match the current problem state in more than one way, leading to several alternative right-side actions. In the example above, rule (1) can match either with z bound to Tom or with z bound to Joe. Either binding is sufficient to lead to a solution.

A question such as "Who is John's great-grandson?" is more difficult to answer.

4.3. Complex and Approximate Matching

A more complex matching process is needed when the preconditions of a rule specify properties that are not stated explicitly in the description of the current state.
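As an aside on section 4.2, the grandson query can be sketched in code to show both issues raised there: the matcher returns bindings for the rule applier, and one rule can match in more than one way. The encoding is an assumption made for illustration (tuples for facts, strings beginning with "?" for variables), as are the names `match`, `match_all`, and `apply_rules`.

```python
FACTS = [
    ("son", "mary", "john"),
    ("son", "john", "bill"),
    ("son", "bill", "tom"),
    ("son", "bill", "joe"),
    ("daughter", "john", "sue"),
    ("daughter", "sue", "judy"),
]

# (preconditions, conclusion) pairs for rules 1 to 3 of section 4.2.
RULES = [
    ([("son", "?x", "?y"), ("son", "?y", "?z")], ("grandson", "?x", "?z")),
    ([("daughter", "?x", "?y"), ("son", "?y", "?z")], ("grandson", "?x", "?z")),
    ([("son", "?x", "?y"), ("daughter", "?y", "?z")], ("granddaughter", "?x", "?z")),
]

def is_var(term):
    return term.startswith("?")

def match(pattern, fact, bindings):
    """Match one precondition against one fact.
    Return the extended bindings, or None if they are inconsistent."""
    if len(pattern) != len(fact):
        return None
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            # Bind the variable, or check it agrees with an earlier binding.
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new

def match_all(preconditions, bindings):
    """Yield every set of bindings satisfying all preconditions in order."""
    if not preconditions:
        yield bindings
        return
    first, rest = preconditions[0], preconditions[1:]
    for fact in FACTS:
        extended = match(first, fact, bindings)
        if extended is not None:
            yield from match_all(rest, extended)

def apply_rules(predicate, bindings):
    """Rule applier: instantiate each matching rule's conclusion."""
    conclusions = []
    for preconditions, conclusion in RULES:
        if conclusion[0] != predicate:
            continue
        for b in match_all(preconditions, bindings):
            conclusions.append(tuple(b.get(t, t) for t in conclusion))
    return conclusions

# "Who is John's grandson?": start with x already bound to John.
print(apply_rules("grandson", {"?x": "john"}))
# [('grandson', 'john', 'tom'), ('grandson', 'john', 'joe')]
```

Because x is bound before matching begins, the first precondition only succeeds on John's sons, which corresponds to the cheaper of the two search orders described in section 4.2.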
In this case, a separate set of rules must be used to describe how some properties can be inferred from others.

An even more complex matching process is required to decide which rules apply when their preconditions only approximately match the current situation. For example, a speech-understanding program must contain rules that map from a description of a physical waveform to phones. There is much variability in the physical signal as a result of background noise, differences in the way individuals speak, and so on, so one can find only an approximate match between the rules, which describe an ideal sound, and the input, which describes a non-ideal world.

Approximate matching has a cost: when we increase the tolerance allowed in the match, we also increase the number of rules that will match, which increases the size of the search. Approximate matching is nevertheless superior to exact matching in situations such as speech understanding, where exact matching may often result in no rules being matched at all, so that the search process comes to a halt.

4.4. Conflict Resolution

The result of the matching process is a list of rules whose antecedents have matched the current state description, along with whatever variable bindings were generated by the matching process. The search method decides the order in which the rules are applied. But sometimes it is useful to incorporate some of that decision making into the matching process. This phase of the matching process is called conflict resolution.

There are 3 basic approaches to the problem of conflict resolution in a production system:
1. Assign a preference based on the rule that matched.
2. Assign a preference based on the objects that matched.
3. Assign a preference based on the action that the matched rule would perform.

Preferences based on rules

There are 2 common ways of assigning a preference based on the rules themselves:
(i) Priority is given to the rules in the order in which they appear. Example: PROLOG.
(ii) Priority is given to special-case rules over rules that are more general. For example, in the water jug problem, rules 11 and 12 are special cases of rules 9 and 5. Such specific rules capture the kind of knowledge that expert problem solvers use. The matcher has to reject rules that are more general than other rules that also match.

There are 2 ways to decide that one rule is more general than another:
1. If the set of preconditions of one rule contains all the preconditions of another (plus some others), then the second rule is more general than the first.
2. If the preconditions of one rule are the same as those of another, except that in the first case variables are specified where in the second there are constants, then the first rule is more general than the second.

Preferences based on objects

Here rules are ordered by the importance of the objects that are matched. There are a variety of ways this can happen. Consider ELIZA, a language-understanding program, which matches patterns against the user's sentence in order to find a rule to apply. The patterns look for combinations of important keywords. The input sentence may contain several keywords that ELIZA knows. ELIZA treats some keywords as more significant than others, and the pattern matcher returns the match involving the highest-priority keyword.

Another form of priority matching can occur as a function of the position of the matchable objects in the current state description. For example, in a model of human short-term memory (STM), rules can be matched against the current contents of STM and used to generate actions, such as producing output to the environment or storing something in long-term memory. A reasonable matcher first tries to match against the objects that have most recently entered STM and compares against older elements only if the newer ones do not match.

Preferences based on states

Suppose there are several rules waiting to fire.
One way of selecting among them is to fire all of them temporarily and examine the results of each. Then, using a heuristic function that can evaluate the resulting states, compare the merits of the results, select the best one, and throw away the others. This approach is identical to the best-first search procedure. The drawback of this approach is that LISP-coded search control is powerful but difficult to modify.

5. CONTROL KNOWLEDGE

Knowledge about which paths are most likely to lead quickly to a goal state is often called search control knowledge. It can take many forms:
1. Knowledge about which states are preferable to others.
2. Knowledge about which rule to apply in a given situation.
3. Knowledge about the order in which to pursue subgoals.
4. Knowledge about useful sequences of rules to apply.

The first type of knowledge can be represented with heuristic functions. There are many ways of representing the other types of control knowledge. For example, the rules can be labeled and partitioned. Another method is to assign cost and probability-of-success measures to rules. The problem solver can then use a probabilistic function to choose a cost-effective alternative at each point in the search.

Search control knowledge is knowledge about knowledge. For this reason it is called meta-knowledge. A number of AI systems represent their control knowledge with rules. Examples are SOAR and PRODIGY.

SOAR

SOAR is a general architecture for building intelligent systems. It is based on a set of specific, cognitively motivated hypotheses about the structure of human problem solving. These hypotheses are derived from what we know about human short-term memory, practice effects, etc. In SOAR:
1. Long-term memory is stored as a set of productions (or rules).
2. Short-term memory is a buffer that is affected by perceptions and serves as a storage area for the facts deduced by the rules in long-term memory.
3. All problem-solving activity takes place as state space traversal.
4. All intermediate and final results of problem solving are remembered.

PRODIGY

PRODIGY is a general-purpose problem-solving system that incorporates several different learning mechanisms. A good deal of the learning in PRODIGY is directed at automatically constructing a set of control rules to improve search in a particular domain. PRODIGY can acquire control knowledge in a number of ways:
1. Through hand coding by programmers.
2. Through a static analysis of the domain's operators.
3. Through looking at traces of its own problem-solving behaviour.

PRODIGY learns control rules from experience, but unlike SOAR it also learns from failures.
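The idea from section 5 that control knowledge can itself be written as rules may be easier to see in a small sketch. It is purely illustrative and does not reproduce the actual mechanisms of SOAR or PRODIGY: the operator names, the state layout, and the functions `reject_reversal`, `prefer_low_cost`, and `choose_operator` are all hypothetical. Each control rule takes the current state and the candidate operators and returns a narrowed or reordered list.

```python
def reject_reversal(state, candidates):
    """Control rule: drop the operator that would undo the previous one."""
    undo = {"fill": "empty", "empty": "fill"}
    banned = undo.get(state["last_operator"])
    return [c for c in candidates if c != banned]

def prefer_low_cost(state, candidates):
    """Control rule: order the remaining operators by their assigned cost."""
    return sorted(candidates, key=lambda c: state["cost"][c])

def choose_operator(state, candidates, control_rules):
    """Apply each control rule in turn, then take the first survivor.
    A rule that would eliminate every candidate is ignored."""
    for rule in control_rules:
        narrowed = rule(state, candidates)
        if narrowed:
            candidates = narrowed
    return candidates[0]

state = {
    "last_operator": "fill",
    "cost": {"fill": 3, "empty": 1, "pour": 2},
}
chosen = choose_operator(
    state,
    ["fill", "empty", "pour"],
    [reject_reversal, prefer_low_cost],
)
print(chosen)  # pour
```

Here "empty" is the cheapest operator but is rejected because it would undo the previous "fill", so the cheapest remaining operator is chosen. Keeping such rules separate from the operators themselves means they can be added, removed, or learned without rewriting the search procedure.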