DEALING WITH UNCERTAINTY (1)
WEEK 5, CHAPTER 3

Introduction
• The world is not a well-defined place.
• There is uncertainty in the facts we know:
  – What's the temperature? Imprecise measures
  – Is X a good president? Imprecise definitions
  – Where are the road pits? Imprecise knowledge
• There is uncertainty in our inferences:
  – If I have red scars, an itchy rash, and was gardening all weekend, I probably have poison ivy.
• People make successful decisions all the time anyhow.

Sources of Uncertainty
• Uncertain data
  – missing, unreliable, ambiguous, imprecisely represented, inconsistent, subjective, derived from defaults, noisy, ...
• Uncertain knowledge
  – multiple causes lead to multiple effects
  – incomplete knowledge of causality in the domain
  – probabilistic/stochastic effects
• Uncertain knowledge representation
  – restricted model of the real system
  – limited expressiveness of the representation mechanism
• Uncertain inference process
  – derived result is formally correct, but wrong in the real world
  – new conclusions are not well founded (e.g., inductive reasoning)
  – incomplete, default reasoning methods

Reasoning Under Uncertainty
• So how do we reason under uncertainty and with inexact knowledge?
  – heuristics
    • ways to mimic the heuristic knowledge processing used by experts (limit the search for a solution)
  – empirical associations
    • experiential reasoning based on limited observations
    • verifiable or provable by means of observation or experiment
    • guided by practical experience and not theory, as in medicine
  – probabilities
    • objective (frequency counting)
    • subjective (human experience)

Decision Making with Uncertainty
• Rational behavior:
  – For each possible action, identify the possible outcomes.
  – Compute the probability of each outcome.
  – Compute the utility of each outcome.
  – Compute the probability-weighted (expected) utility over the possible outcomes of each action.
  – Select the action with the highest expected utility (principle of Maximum Expected Utility).

Some Relevant Factors
• expressiveness
  – Can concepts used by humans be represented adequately?
  – Can the confidence of experts in their decisions be expressed?
• comprehensibility
  – representation of uncertainty
  – utilization in reasoning methods
• correctness
  – probabilities
  – relevance ranking
  – long inference chains
• computational complexity
  – feasibility of calculations for practical purposes
• reproducibility
  – Do the observations deliver the same results when repeated?

Basic Probability
• Probability theory enables us to make rational decisions.
• Which mode of transportation is safer: car or plane?
  – What is the probability of an accident?

Basic Probability Theory
• An experiment has a set of potential outcomes, e.g., throwing a die.
• The sample space of an experiment is the set of all possible outcomes, e.g., {1, 2, 3, 4, 5, 6}.
• An event is a subset of the sample space, e.g.:
  – {2}
  – {3, 6}
  – even = {2, 4, 6}
  – odd = {1, 3, 5}

Probability as Relative Frequency
• An event has a probability.
• Consider a long sequence of experiments. If we compare the number of times a particular event occurs in that sequence to the total number of experiments, we get a ratio.
• This ratio is one way of estimating the probability of the event:
  – P(E) = (# of times E occurred) / (total # of trials)
• Example: 100 attempts are made to swim a length in 30 seconds, and the swimmer succeeds on 20 occasions; the probability that the swimmer can complete the length in 30 seconds is therefore:
  – 20/100 = 0.2, and the probability of failure is 1 - 0.2 = 0.8
• The experiments, the sample space and the events must be defined clearly for probability to be meaningful.
  – What is the probability of an accident?
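To make the relative-frequency idea concrete, here is a minimal Python sketch (not part of the original slides; the event and the trial count are arbitrary illustrative choices) that estimates the probability of rolling an even number by counting how often the event occurs in repeated simulated throws.

```python
import random

def estimate_probability(event, trials=100_000):
    """Estimate P(event) as (# of times event occurred) / (total # of trials)."""
    hits = 0
    for _ in range(trials):
        outcome = random.randint(1, 6)  # one throw of a fair six-sided die
        if event(outcome):
            hits += 1
    return hits / trials

# Event "even" = {2, 4, 6}; the theoretical probability is 3/6 = 0.5.
p_even = estimate_probability(lambda x: x % 2 == 0)
print(f"estimated P(even) = {p_even:.3f}  (theoretical 0.5)")
# As the number of trials grows, the estimate approaches the theoretical
# value, which is the law of large numbers discussed next.
```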
Theoretical Probability
• Principle of Indifference: alternatives are to be judged equally probable if we have no reason to expect or prefer one over another.
• Each outcome in the sample space is assigned equal probability.
• Example: throwing a die
  – P({1}) = P({2}) = ... = P({6}) = 1/6

Law of Large Numbers
• As the number of experiments increases, the relative frequency of an event more closely approximates the theoretical probability of the event
  – if the theoretical assumptions hold.
• Buffon's Needle for computing π:
  – Draw parallel lines 1 inch apart on a plane.
  – Throw a 1-inch needle onto the plane.
  – P(needle crossing a line) = 2/π, so π ≈ 2 × (number of throws) / (number of crossings).

Large Numbers Reveal Untruth in Assumptions
• Results of 1,000,000 throws of a die. Why are the fractions not all close to 1/6?

  Number:   1     2     3     4     5     6
  Fraction: .155  .159  .164  .169  .174  .179

Axioms of Probability Theory
• Suppose P(.) is a probability function; then
  1. For any event E, 0 ≤ P(E) ≤ 1. How?
  2. P(S) = 1, where S is the sample space.
  3. For any two mutually exclusive events E1 and E2, P(E1 ∪ E2) = P(E1) + P(E2).
• Any function that satisfies the above three axioms is a probability function.

Joint Probability
• Let A, B be two events; the joint probability of both A and B being true is denoted by P(A, B).
• Example: P(spade) is the probability of the top card being a spade. P(king) is the probability of the top card being a king. P(spade, king) is the probability of the top card being both a spade and a king, i.e., the king of spades.
• P(king, spade) = P(spade, king)?

Properties of Probability
1. P(¬E) = 1 - P(E)
2. If E1 and E2 are logically equivalent, then P(E1) = P(E2).
   – E1: Not all philosophers are more than six feet tall.
   – E2: Some philosopher is not more than six feet tall.
   – Then P(E1) = P(E2).
3. P(E1, E2) ≤ P(E1).

Conditional Probability
• The probability of an event may change after knowing another event. The probability of A given B is denoted by P(A|B).
• Example:
  – P(W = space): the probability that a randomly selected word from an English text is 'space'
  – P(W = space | W' = outer): the probability of 'space' if the previous word is 'outer'

Example
• A: the top card of a deck of poker cards is the king of spades.
  – P(A) = 1/52
• However, if we know B: the top card is a king, then the probability of A given that B is true is
  – P(A|B) = 1/4

How to Compute P(A|B)? Business Students
• Of 100 students completing a course, 20 were business majors. Ten students received an A in the course, and three of these were business majors.
• Suppose A is the event that a randomly selected student got an A in the course, and B is the event that a randomly selected student is a business major.
  – What is the probability of A?
  – What is the probability of A after knowing B is true?

             B    not B
  A          3      7
  total     20     80
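As a quick check of the business-students numbers, the sketch below (not part of the original slides; the variable names are mine) computes P(A) and P(A|B) directly from the counts in the table, using the ratio formula P(A|B) = P(A, B) / P(B) introduced later in this section.

```python
# Counts from the business-students example (100 students in total).
total_students = 100
business_majors = 20          # event B
a_students = 10               # event A
a_and_business = 3            # A and B

p_a = a_students / total_students            # P(A)    = 10/100 = 0.10
p_b = business_majors / total_students       # P(B)    = 20/100 = 0.20
p_a_and_b = a_and_business / total_students  # P(A, B) =  3/100 = 0.03

# Conditional probability: P(A|B) = P(A, B) / P(B) = 3/20 = 0.15
p_a_given_b = p_a_and_b / p_b

print(f"P(A)   = {p_a:.2f}")
print(f"P(A|B) = {p_a_given_b:.2f}")
```

Knowing that the student is a business major raises the probability of an A from 0.10 to 0.15, which is exactly the sense in which evidence changes a probability.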
Probabilistic Reasoning
• Evidence: what we know about a situation.
• Hypothesis: what we want to conclude.
• Compute P(Hypothesis | Evidence).

Credit Card Authorization
• E is the data about the applicant's age, job, education, income, credit history, etc.
• H is the hypothesis that the credit card will provide a positive return.
• The decision of whether to issue the credit card to the applicant is based on the probability P(H|E).

Medical Diagnosis
• E is a set of symptoms, such as coughing, sneezing, headache, ...
• H is a disorder, e.g., common cold, SARS, flu.
• The diagnosis problem is to find an H (disorder) such that P(H|E) is maximum.

Basics of Probability Theory
• Mathematical approach for processing uncertain information
  – sample space X = {x1, x2, ..., xn}
    • collection of all possible events
    • can be discrete or continuous
  – probability number P(xi): likelihood that event xi occurs
    • non-negative value in [0, 1]
    • total probability of the sample space is 1
    • for mutually exclusive events, the probability that at least one of them occurs is the sum of their individual probabilities
• experimental probability: based on the frequency of events
• subjective probability: based on expert assessment

Compound Probabilities
• Describe independent events, which do not affect each other in any way.
• Joint probability of two independent events A and B:
  – P(A ∩ B) = P(A) * P(B)
• Union probability of two independent events A and B:
  – P(A ∪ B) = P(A) + P(B) - P(A ∩ B) = P(A) + P(B) - P(A) * P(B)

Probability Theory
• Random variables have a domain.
• Atomic event: a complete specification of a state.
• Prior probability: degree of belief without any other evidence.
• Joint probability: matrix of combined probabilities of a set of variables.
• Example variables: Alarm, Burglary, Earthquake
  – Boolean (like these), discrete, or continuous
  – Atomic event: Alarm = True ∧ Burglary = True ∧ Earthquake = False
• P(Burglary) = 0.1
• P(Alarm, Burglary) =

                 alarm   ¬alarm
  burglary        .09     .01
  ¬burglary       .10     .80

Independence
• When two sets of propositions do not affect each other's probabilities, we call them independent, and we can easily compute their joint and conditional probabilities:
  – Independent(A, B) iff P(A ∩ B) = P(A) P(B), i.e., P(A | B) = P(A)
• For example, {moon-phase, light-level} might be independent of {burglary, alarm, earthquake}.
  – Then again, it might not: burglars might be more likely to burglarize houses when there's a new moon (and hence little light).
  – But if we know the light level, the moon phase doesn't affect whether we are burglarized.
  – Once we're burglarized, the light level doesn't affect whether the alarm goes off.
• We need a more complex notion of independence, and methods for reasoning about these kinds of relationships.

Conditional Independence
• Absolute independence: A and B are independent if P(A ∩ B) = P(A) P(B); equivalently, P(A) = P(A | B) and P(B) = P(B | A).
• A and B are conditionally independent given C if
  – P(A ∩ B | C) = P(A | C) P(B | C)
• This lets us decompose the joint distribution:
  – P(A ∩ B ∩ C) = P(A | C) P(B | C) P(C)
• Moon-Phase and Burglary are conditionally independent given Light-Level.
• Conditional independence is weaker than absolute independence, but still useful in decomposing the full joint probability distribution.

Conditional Probabilities
• Describe dependent events, which affect each other in some way.
• Conditional probability of event A given that event B has already occurred:
  – P(A|B) = P(A ∩ B) / P(B)
• Two worked sketches using the tables and decomposition above follow the Q&A slide.

Q&A
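To tie the burglary/alarm joint table to the conditional-probability formula, here is a small illustrative Python sketch (not from the slides; the dictionary layout is my own choice) that stores the joint distribution P(Alarm, Burglary) and derives the marginals and the conditional P(Burglary | Alarm).

```python
# Joint distribution P(Alarm, Burglary) from the table above.
# Keys are (alarm, burglary) truth-value pairs; the values sum to 1.0.
joint = {
    (True,  True):  0.09,   # alarm and burglary
    (False, True):  0.01,   # no alarm, burglary
    (True,  False): 0.10,   # alarm, no burglary
    (False, False): 0.80,   # no alarm, no burglary
}

# Marginal: P(Burglary) = 0.09 + 0.01 = 0.10 (matches the slide).
p_burglary = sum(p for (alarm, burglary), p in joint.items() if burglary)

# Marginal: P(Alarm) = 0.09 + 0.10 = 0.19.
p_alarm = sum(p for (alarm, burglary), p in joint.items() if alarm)

# Conditional: P(Burglary | Alarm) = P(Alarm, Burglary) / P(Alarm).
p_burglary_given_alarm = joint[(True, True)] / p_alarm

print(f"P(Burglary)         = {p_burglary:.2f}")
print(f"P(Alarm)            = {p_alarm:.2f}")
print(f"P(Burglary | Alarm) = {p_burglary_given_alarm:.3f}")  # about 0.474

# Independence check: if Alarm and Burglary were independent,
# P(Alarm, Burglary) would equal P(Alarm) * P(Burglary) = 0.019;
# the table gives 0.09, so the two variables are clearly dependent.
```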
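The decomposition P(A, B, C) = P(A | C) P(B | C) P(C) can also be checked numerically. The sketch below uses made-up numbers for the Moon-Phase / Light-Level / Burglary example (all values are illustrative assumptions, not data from the slides) and rebuilds the full joint distribution from the three smaller factors under the conditional-independence assumption.

```python
# Hypothetical numbers for Moon-Phase (M), Burglary (B), Light-Level (L).
# Assumption: M and B are conditionally independent given L, so
#   P(M, B, L) = P(M | L) * P(B | L) * P(L)

p_light = {"dark": 0.3, "bright": 0.7}                     # P(L)
p_new_moon_given_light = {"dark": 0.9, "bright": 0.1}      # P(M = new  | L)
p_burglary_given_light = {"dark": 0.02, "bright": 0.005}   # P(B = true | L)

def joint(new_moon: bool, burglary: bool, light: str) -> float:
    """Full joint entry rebuilt from the three small factors."""
    pm = p_new_moon_given_light[light] if new_moon else 1 - p_new_moon_given_light[light]
    pb = p_burglary_given_light[light] if burglary else 1 - p_burglary_given_light[light]
    return pm * pb * p_light[light]

# The eight joint entries still sum to 1, as the axioms require.
total = sum(joint(m, b, l) for m in (True, False) for b in (True, False) for l in p_light)
print(f"sum of joint          = {total:.6f}")

# Example query answered from the factors: P(M = new, B = true),
# obtained by summing the joint over the light levels.
p_new_and_burglary = sum(joint(True, True, l) for l in p_light)
print(f"P(new moon, burglary) = {p_new_and_burglary:.5f}")
```

The point of the decomposition is that three small tables (here six numbers) are enough to reconstruct the full joint distribution, which is what makes conditional independence useful in practice.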