TDT70: Uncertainty in Artificial Intelligence
Chapters 1 and 2

Fundamentals of probability theory
The sample space is the set of possible outcomes of an experiment. A subset of a sample space is called an event. In general, we say that an event A is true for an experiment if the outcome of the experiment is an element of A. To measure our degree of uncertainty about an experiment, we assign a probability P(A) to each event A in the sample space.

Conditional probabilities
Conditional probability: P(A | B) = P(A, B) / P(B)
The fundamental rule: P(A, B) = P(A | B) · P(B)
Bayes' rule: P(B | A) = P(A | B) · P(B) / P(A)
The events A and B are independent if: P(A | B) = P(A)

Probability calculus for variables
A variable is defined as a collection of sample spaces. A variable can be considered an experiment, and for each outcome of the experiment the variable has a corresponding state. For example, if D is a variable representing the outcome of rolling a die, then its state space is sp(D) = (1, 2, 3, 4, 5, 6).
For a variable A with states a1,...,an, we express our uncertainty about its state through a probability distribution P(A) over these states: P(A) = (x1,...,xn), where xi is the probability of A being in state ai.

Joint probability tables and marginalization
From a joint probability table P(A, B), the probability distribution P(A) can be calculated by considering the outcomes of B that can occur together with each state ai of A and summing over them: P(ai) = Σj P(ai, bj).

Joint probability table example:
        b1     b2     b3
a1    0.16   0.12   0.12
a2    0.24   0.28   0.08

Causal networks
A causal network is a directed graph where the variables represent propositions. A variable represents a set of possible states of affairs; it is in exactly one of its states, but which one may be unknown.

Causal networks, example
Variables: Fuel?, Fuel Meter Standing, Clean Spark Plugs, Start?
• A way of structuring a situation for reasoning under uncertainty is to construct a graph representing causal relations between events.
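The marginalization and conditioning rules above can be sketched in Python, using the joint probability table P(A, B) from the example (the dictionary layout is just one convenient encoding, not part of the original material):

```python
# The joint probability table P(A, B) from the example, keyed by (state of A, state of B).
joint = {
    ("a1", "b1"): 0.16, ("a1", "b2"): 0.12, ("a1", "b3"): 0.12,
    ("a2", "b1"): 0.24, ("a2", "b2"): 0.28, ("a2", "b3"): 0.08,
}

# Marginalization: P(ai) = sum over the states of B that can occur with ai.
p_a = {}
for (a, b), p in joint.items():
    p_a[a] = p_a.get(a, 0.0) + p
print({a: round(p, 2) for a, p in p_a.items()})  # {'a1': 0.4, 'a2': 0.6}

# Conditional probability: P(A | B=b1) = P(A, b1) / P(b1).
p_b1 = sum(p for (a, b), p in joint.items() if b == "b1")
p_a_given_b1 = {a: joint[(a, "b1")] / p_b1 for a in ("a1", "a2")}
print({a: round(p, 2) for a, p in p_a_given_b1.items()})
```

Marginalizing the example table gives P(A) = (0.4, 0.6), as each row sums over all outcomes of B.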
D-separation
Definition: Two distinct variables A and B in a causal network are d-separated if, for all paths between A and B, there is an intermediate variable V (distinct from A and B) such that either
• the connection is serial or diverging and V is instantiated, or
• the connection is converging, and neither V nor any of V's descendants have received evidence.
If A and B are not d-separated, we call them d-connected.

D-separation, cont.
Serial connection (A → B → C): Evidence may be transmitted through a serial connection unless the state of the intermediate variable is known. When the state of a variable is known, we say that the variable is instantiated.
Example: Rainfall → Water level → Flooding.

D-separation, cont.
Diverging connection (B ← A → C, ..., B ← A → E): Evidence may be transmitted through a diverging connection unless the common parent A is instantiated; B, C, ..., E are d-separated given A.
Example: Hair length ← Sex → Stature.

D-separation, cont.
Converging connection (B → A ← C, ..., E → A): Evidence may be transmitted through a converging connection only if either the variable in the connection or one of its descendants has received evidence.
Example: Not enough fuel → Car won't start ← Dirty spark plugs.

D-separation, cont.
The Markov blanket of a variable A is the set consisting of the parents of A, the children of A, and the variables sharing a child with A. It has the property that when every variable in the blanket is instantiated, A is d-separated from the rest of the network.

Bayesian networks
A Bayesian network consists of the following:
• A set of variables and a set of directed edges between variables.
• Each variable has a finite set of mutually exclusive states.
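The Markov blanket definition above (parents, children, and co-parents) translates directly into code. A minimal sketch, assuming the graph is stored as a child-to-parents mapping and using the car-start network from the slides, with edge directions Fuel? → Fuel Meter Standing and Fuel?, Clean Spark Plugs → Start? assumed for illustration:

```python
# Graph stored as {variable: [its parents]}; the car-start example network,
# with edge directions assumed for illustration.
parents = {
    "Fuel?": [],
    "Clean Spark Plugs": [],
    "Fuel Meter Standing": ["Fuel?"],
    "Start?": ["Fuel?", "Clean Spark Plugs"],
}

def markov_blanket(a, parents):
    """Parents of a, children of a, and variables sharing a child with a."""
    children = [v for v, ps in parents.items() if a in ps]
    co_parents = {p for c in children for p in parents[c]} - {a}
    return set(parents[a]) | set(children) | co_parents

print(sorted(markov_blanket("Fuel?", parents)))
# ['Clean Spark Plugs', 'Fuel Meter Standing', 'Start?']
```

Instantiating these three variables d-separates Fuel? from everything else in the network; here that is the whole remaining network, which makes the example small but easy to check by hand.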
• The variables together with the directed edges form an acyclic directed graph.
• To each variable A with parents B1,...,Bn, a conditional probability table P(A | B1,...,Bn) is attached.

Bayesian networks, cont.
The model's d-separation properties should correspond to our perception of the world's conditional independence properties. If A and B are d-separated given evidence e, then the probability calculus used for Bayesian networks must yield P(A | e) = P(A | B, e).

The general chain rule
Let U = {A1,...,An} be a set of variables. Then for any probability distribution P(U) we have
P(U) = P(An | A1,...,An-1) · P(An-1 | A1,...,An-2) · ... · P(A2 | A1) · P(A1)

The chain rule for Bayesian networks
Let BN be a Bayesian network over U = {A1,...,An}. Then BN specifies a unique joint probability distribution P(U) given by the product of all conditional probability tables specified in BN:
P(U) = ∏i P(Ai | pa(Ai))
where pa(Ai) are the parents of Ai in BN, and P(U) reflects the properties of BN.

Inserting evidence
Let A be a variable with n states. A finding on A is an n-dimensional table of zeros and ones, e.g. (0,0,0,1,0,0,1,0). Semantically, a finding is a statement that certain states of A are impossible.
Let BN be a Bayesian network over the universe U, and let e1,...,em be findings. Then
P(U, e) = ∏i P(Ai | pa(Ai)) · ∏j ej
And for A ∈ U we have
P(A | e) = P(A, e) / P(e), where P(A, e) = Σ_{U \ {A}} P(U, e) and P(e) = Σ_A P(A, e).

Questions?
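The chain rule and evidence insertion can be sketched end to end on a tiny two-variable network Fuel? → Start?. The CPT numbers below are hypothetical, chosen only for illustration; the finding is the table (0, 1) over Start?'s states, i.e. the state Start? = yes is impossible:

```python
# Hypothetical CPTs for the two-variable network Fuel? -> Start?.
p_fuel = {"yes": 0.98, "no": 0.02}                       # P(Fuel?)
p_start = {                                               # P(Start? | Fuel?), keyed (start, fuel)
    ("yes", "yes"): 0.95, ("no", "yes"): 0.05,
    ("yes", "no"): 0.0,  ("no", "no"): 1.0,
}

# Chain rule: P(Fuel?, Start?) = P(Start? | Fuel?) * P(Fuel?).
joint = {(f, s): p_start[(s, f)] * p_fuel[f]
         for f in p_fuel for s in ("yes", "no")}

# Finding e on Start?: the table (0, 1), stating Start? = yes is impossible.
e = {"yes": 0.0, "no": 1.0}
joint_e = {(f, s): p * e[s] for (f, s), p in joint.items()}   # P(U, e)

# P(e) and the conditional P(Fuel? | e) = P(Fuel?, e) / P(e).
p_e = sum(joint_e.values())
p_fuel_given_e = {f: sum(p for (f2, s), p in joint_e.items() if f2 == f) / p_e
                  for f in p_fuel}
print({f: round(p, 2) for f, p in p_fuel_given_e.items()})
```

Observing that the car will not start raises P(Fuel? = no) from its prior 0.02 to roughly 0.29 under these assumed numbers, which is exactly the P(A | e) = P(A, e) / P(e) computation from the slide.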