CS 416 Artificial Intelligence
Lecture 14: Uncertainty (Chapter 13)

An apology to Red Sox fans
• The only team ever in baseball to take a 3-0 series to a game seven
• I was playing the probabilities…

Shortcomings of first-order logic
Consider dental diagnosis
• A diagnostic rule such as Toothache ⇒ Cavity is wrong: not all patients with toothaches have cavities; there are other causes of toothaches
• Expanding it (Toothache ⇒ Cavity ∨ GumProblem ∨ Abscess ∨ …) requires an unlimited number of toothache causes
• Alternatively, create a causal rule: Cavity ⇒ Toothache
  – Again, not all cavities cause pain; the rule must be expanded with qualifications
• Both diagnostic and causal rules require countless qualifications, and it is difficult to be exhaustive
  – Too much work
  – We don't know all the qualifications
  – Even correctly qualified rules may not be useful if data is missing when the rules are applied in real time

As an alternative to exhaustive logic: probability theory
• Serves as a hedge against our laziness and ignorance

Degrees of belief
"I believe the glass is full with 50% probability"
• Note this does not indicate the statement is half true
  – We are not talking about a glass that is half full
  – "The glass is full" is the only statement being considered
• The statement indicates I believe with probability 0.5 that it is true; it makes no claims about any other beliefs I have regarding the glass
  – Fuzzy logic, by contrast, handles partial truths

Decision theory
What is rational behavior in the context of probability?
• Pick the answer that satisfies the goals with the highest probability of actually working?
  – Sometimes more risk is acceptable
• We must have a utility function that measures the many factors related to an agent's happiness with an outcome
• An agent is rational if and only if it chooses the action that yields the highest expected utility, averaged over all the possible outcomes of the action

Building probability notation
• Propositions: like propositional logic,
The things we believe
• Atomic events: a complete specification of the state of the world
• Prior probability: the probability something is true in the absence of other data
• Conditional probability: the probability something is true given that something else is known

Propositions
Like propositional logic
• Random variables refer to parts of the world with unknown status
• Random variables have a well-defined domain
  – Boolean
  – Discrete (countable)
  – Continuous

Atomic events
A complete specification of the world
• All variables in the world are assigned values
• Only one atomic event can be true
• The set of all atomic events is exhaustive - at least one must be true
• Any atomic event entails the truth or falsehood of every proposition

Prior probability
The degree of belief in the absence of other information
• P(Weather)
  – P(Weather = sunny) = 0.7
  – P(Weather = rainy) = 0.2
  – P(Weather = cloudy) = 0.08
  – P(Weather = snowy) = 0.02
• P(Weather) = <0.7, 0.2, 0.08, 0.02>
  – The probability distribution for the random variable Weather

Prior probability - Discrete
Joint probability distribution
• P(Weather, NaturalDisaster) = an n × m table of probabilities
  – n = instances of weather
  – m = instances of natural disasters
Full joint probability distribution
• Probabilities for all variables are established
What about continuous variables, where a table won't suffice?
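A minimal sketch of these tables in code. The Weather numbers are the ones on the slide; the NaturalDisaster prior (and the independence used to fill the joint table) are invented here purely for illustration.

```python
# Sketch: a prior distribution and a joint distribution as tables.
# Weather numbers come from the slide; the NaturalDisaster split
# (95% none / 5% earthquake) is a made-up prior for illustration.

weather = {"sunny": 0.7, "rainy": 0.2, "cloudy": 0.08, "snowy": 0.02}
assert abs(sum(weather.values()) - 1.0) < 1e-9  # a distribution sums to 1

# A full joint distribution P(Weather, NaturalDisaster) is an n x m table.
# Here we fill it by assuming the two variables are independent.
disaster = {"none": 0.95, "earthquake": 0.05}   # hypothetical prior
joint = {(w, d): pw * pd
         for w, pw in weather.items()
         for d, pd in disaster.items()}

# Each of the n*m cells is one atomic event; the cells also sum to 1.
assert len(joint) == 8
assert abs(sum(joint.values()) - 1.0) < 1e-9
```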
Prior probability - Continuous
Probability density functions (PDFs)
• P(X = x) = Uniform[18, 26](x)
  – The probability density of tomorrow's temperature being 20.5 degrees Celsius is Uniform[18, 26](20.5) = 1 / (26 − 18) = 0.125

Conditional probability
The probability of a, given that all we know is b
• P(a | b) = P(a ∧ b) / P(b)
Written as an unconditional probability (the product rule):
• P(a ∧ b) = P(a | b) P(b)

Axioms of probability
• All probabilities are between 0 and 1: 0 ≤ P(a) ≤ 1
• Necessarily true propositions have probability 1: P(true) = 1
• Necessarily false propositions have probability 0: P(false) = 0
• The probability of a disjunction is: P(a ∨ b) = P(a) + P(b) − P(a ∧ b)

Using the axioms of probability
The probability of a proposition is equal to the sum of the probabilities of the atomic events in which it holds:
• P(a) = Σ P(e_i), summed over the atomic events e_i that entail a

An example (the dental domain: Toothache, Catch, Cavity)
Marginalization - summing a variable out of a joint distribution:
• P(Y) = Σ_z P(Y, z)
Conditioning - the same idea, rewritten with the product rule:
• P(Y) = Σ_z P(Y | z) P(z)

Conditional probabilities
• P(cavity | toothache) = P(cavity ∧ toothache) / P(toothache) = (0.108 + 0.012) / 0.2 = 0.6
• P(¬cavity | toothache) = P(¬cavity ∧ toothache) / P(toothache) = (0.016 + 0.064) / 0.2 = 0.4

Normalization
The two previous calculations had the same denominator, P(toothache) = 0.2, which can be folded into a normalization constant α
• P(Cavity | toothache) = α P(Cavity, toothache)
  – = α [P(Cavity, toothache, catch) + P(Cavity, toothache, ¬catch)]
  – = α [<0.108, 0.016> + <0.012, 0.064>] = α <0.12, 0.08> = <0.6, 0.4>
Generalized (X = Cavity, e = toothache, y = catch):
• P(X | e) = α P(X, e) = α Σ_y P(X, e, y)
• P(X, e, y) is a subset of the full joint distribution

Using the full joint distribution
It does not scale well
• With n Boolean variables:
  – Table size is O(2^n)
  – Processing time is O(2^n)

Independence
Independence of variables in a domain can dramatically reduce the amount of information necessary to specify the full joint distribution
• Adding Weather (four states) to the eight-cell dental table requires creating four versions of it (one for each weather state): 8 × 4 = 32 cells
• P(toothache, catch, cavity, Weather = cloudy) = P(Weather = cloudy | toothache, catch, cavity) × P(toothache, catch, cavity)
Because the weather and dentistry are independent:
• P(Weather = cloudy | toothache, catch, cavity) = P(Weather = cloudy)
• P(toothache, catch, cavity, Weather = cloudy) = P(Weather = cloudy) × P(toothache, catch, cavity)
  – a 4-cell table and an 8-cell table replace the 32-cell table

Bayes' Rule
• P(b | a) = P(a | b) P(b) / P(a)
Useful when you know three of the terms and need the fourth

Example: meningitis
• The doctor knows
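The normalization step above can be checked in code. The four joint-probability entries are the ones quoted on the slide; summing out the hidden variable Catch and normalizing recovers <0.6, 0.4> without ever computing P(toothache) explicitly.

```python
# Normalization sketch: recover P(Cavity | toothache) from entries of
# the full joint distribution. The four joint probabilities below are
# the ones quoted on the slide (all assume Toothache = true).
joint = {
    ("cavity", "catch"):       0.108,  # P(cavity, toothache, catch)
    ("cavity", "no_catch"):    0.012,  # P(cavity, toothache, ~catch)
    ("no_cavity", "catch"):    0.016,  # P(~cavity, toothache, catch)
    ("no_cavity", "no_catch"): 0.064,  # P(~cavity, toothache, ~catch)
}

# Sum out the hidden variable Catch.
p_cav    = joint[("cavity", "catch")] + joint[("cavity", "no_catch")]        # 0.12
p_no_cav = joint[("no_cavity", "catch")] + joint[("no_cavity", "no_catch")]  # 0.08

# Normalize: alpha = 1 / (0.12 + 0.08) = 5.
alpha = 1.0 / (p_cav + p_no_cav)
p_cavity_given_toothache    = alpha * p_cav     # ~ 0.6
p_no_cavity_given_toothache = alpha * p_no_cav  # ~ 0.4
```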
meningitis causes stiff necks 50% of the time: P(s | m) = 0.5
• The doctor knows two unconditional facts
  – The probability of having meningitis: P(m) = 1/50,000
  – The probability of having a stiff neck: P(s) = 1/20
• The probability of having meningitis given a stiff neck:
  – P(m | s) = P(s | m) P(m) / P(s) = (0.5 × 1/50,000) / (1/20) = 0.0002 = 1/5,000

Power of Bayes' rule
Why not just collect more diagnostic evidence?
• Statistically sample patients to learn P(m | s) = 1/5,000 directly
• But if P(m) changes (say, due to an outbreak), the Bayes' rule computation adjusts automatically, while the sampled P(m | s) is rigid

Conditional independence
Consider the infeasibility of full joint distributions
• We must know P(toothache ∧ catch) for all values of Cavity
Simplify using independence
• Toothache and Catch are not independent
• Toothache and Catch are independent given the presence or absence of a cavity:
  – P(toothache ∧ catch | Cavity) = P(toothache | Cavity) P(catch | Cavity)
• If you know you have a cavity, there is no reason to believe the toothache and the dentist's pick catching are related

Conditional independence
In general, when a single cause influences multiple effects, all of which are conditionally independent given the cause:
• P(Cause, Effect_1, …, Effect_n) = P(Cause) Π_i P(Effect_i | Cause)

Naïve Bayes
Even when the "effect" variables are not conditionally independent, this model is sometimes used anyway
• Sometimes called a Bayesian classifier
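The meningitis numbers above can be checked with a few lines of code, which also illustrates the "power of Bayes' rule" point: changing the prior P(m) updates the diagnosis automatically, with no re-sampling of P(m | s).

```python
# Bayes' rule for the meningitis example on the slides:
#   P(m | s) = P(s | m) * P(m) / P(s)

def bayes(p_effect_given_cause, p_cause, p_effect):
    """Return P(cause | effect) via Bayes' rule."""
    return p_effect_given_cause * p_cause / p_effect

p_s_given_m = 0.5       # meningitis causes a stiff neck 50% of the time
p_m = 1 / 50_000        # prior probability of meningitis
p_s = 1 / 20            # prior probability of a stiff neck

p_m_given_s = bayes(p_s_given_m, p_m, p_s)   # ~ 0.0002, i.e. 1 in 5,000

# The power of the rule: suppose an outbreak raises P(m) tenfold.
# The posterior updates automatically; a directly sampled P(m | s)
# would be stale until patients were re-surveyed.
p_m_given_s_outbreak = bayes(p_s_given_m, 10 * p_m, p_s)   # ~ 0.002
```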