TDT70: Uncertainty in Artificial
Intelligence
Chapter 1 and 2
Fundamentals of probability theory
The sample space is the set of possible outcomes of an
experiment.
A subset of a sample space is called an event.
In general, we say that an event A is true for an experiment if
the outcome of the experiment is an element of A.
To measure our degree of uncertainty about an
experiment, we assign a probability P(A) to each event A.
Conditional probabilities
Conditional probability: P(A|B) = P(A,B) / P(B)
The fundamental rule: P(A,B) = P(A|B) P(B)
Bayes’ rule: P(B|A) = P(A|B) P(B) / P(A)
The events A and B are independent if: P(A,B) = P(A) P(B), or equivalently P(A|B) = P(A)
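A quick numerical sketch of these identities; the probabilities below are made up for illustration:

```python
# Illustrative joint and marginal probabilities for two events A and B (made-up numbers).
p_a_and_b = 0.06   # P(A, B)
p_b = 0.30         # P(B)
p_a = 0.40         # P(A)

# Conditional probability: P(A|B) = P(A, B) / P(B)
p_a_given_b = p_a_and_b / p_b            # ≈ 0.2

# The fundamental rule recovers the joint: P(A, B) = P(A|B) P(B)
assert abs(p_a_given_b * p_b - p_a_and_b) < 1e-12

# Bayes' rule: P(B|A) = P(A|B) P(B) / P(A)
p_b_given_a = p_a_given_b * p_b / p_a    # ≈ 0.15

# Independence would require P(A, B) = P(A) P(B); here 0.06 != 0.4 * 0.3, so A and B are dependent.
print(p_a_given_b, p_b_given_a)
```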
Probability calculus for variables
A variable is defined by a sample space: the set of its possible states.
A variable can be considered an experiment, and for each
outcome of the experiment the variable has a corresponding
state.
For example, if D is a variable representing the outcome of rolling a
die, then its state space would be sp(D) = {1, 2, 3, 4, 5, 6}
For a variable A with states a1, ..., an, we express our
uncertainty about its state through a probability
distribution P(A) over these states:
P(A) = (x1, ..., xn), where xi is the probability of A being in state
ai, each xi ≥ 0, and x1 + ... + xn = 1.
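A minimal sketch of such a distribution for the die variable D above, stored as a mapping from states to probabilities:

```python
# Probability distribution P(D) over the states of the die variable D.
# The fair-die numbers are illustrative; any non-negative values summing to 1 work.
P_D = {1: 1/6, 2: 1/6, 3: 1/6, 4: 1/6, 5: 1/6, 6: 1/6}

assert all(x >= 0 for x in P_D.values())       # each x_i is non-negative
assert abs(sum(P_D.values()) - 1.0) < 1e-9     # the x_i sum to 1
```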
Joint probability tables and marginalization
From a joint probability table P(A,B), the probability
distribution P(A) can be calculated by considering, for each state ai
of A, the outcomes of B that can occur together with ai and summing:
P(ai) = Σj P(ai, bj)
Joint probability table example, P(A,B):

        b1     b2     b3
  a1    0.16   0.12   0.12
  a2    0.24   0.28   0.08
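A minimal sketch of marginalization over the table above (the numbers are taken from the example; the dictionary layout is just one convenient encoding):

```python
# Joint probability table P(A, B) from the example above.
P_AB = {
    ("a1", "b1"): 0.16, ("a1", "b2"): 0.12, ("a1", "b3"): 0.12,
    ("a2", "b1"): 0.24, ("a2", "b2"): 0.28, ("a2", "b3"): 0.08,
}

# Marginalize out B: P(a_i) = sum_j P(a_i, b_j)
P_A = {}
for (a, b), p in P_AB.items():
    P_A[a] = P_A.get(a, 0.0) + p

print(P_A)  # ≈ {'a1': 0.4, 'a2': 0.6}
```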
Causal Networks
A causal network is a directed graph in which the nodes are variables
and the directed links represent causal relations between them.
A variable represents a set of possible states of affairs.
A variable is in exactly one of its states; which one may be unknown
Causal networks, example
[Figure: the car-start network with variables Fuel?, Fuel Meter Standing, Clean Spark Plugs, and Start?]
A way of structuring a situation for reasoning under uncertainty is to construct a graph representing causal relations between events.
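As a sketch, such a graph can be written down as an adjacency mapping from each variable to its direct effects. The edge set below is an assumption recovered from the lost figure (Fuel? influencing both Fuel Meter Standing and Start?, and Clean Spark Plugs influencing Start?):

```python
# Causal network for the car-start example, as parent -> children adjacency.
# The edge set is an assumption reconstructed from the (lost) figure.
causal_net = {
    "Fuel?": ["Fuel Meter Standing", "Start?"],
    "Clean Spark Plugs": ["Start?"],
    "Fuel Meter Standing": [],
    "Start?": [],
}

# Parents of a variable can be read off by inverting the adjacency.
def parents(node, net):
    return [p for p, children in net.items() if node in children]

print(parents("Start?", causal_net))  # ['Fuel?', 'Clean Spark Plugs']
```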
D-separation
Definition: Two distinct variables A and B in a causal network
are d-separated if for all paths between A and B, there is an
intermediate variable V (distinct from A and B) such that either
the connection is serial or diverging and V is instantiated, or
the connection is converging, and neither V nor any of V’s
descendants have received evidence.
If A and B are not d-separated, we call them d-connected
D-separation, cont.
[Figure: serial connection A → B → C]
Evidence may be transmitted through a serial connection
unless the state of the variable in the connection is
known.
When the state of a variable is known, we say that the variable
is instantiated.
[Example: Rainfall → Water level → Flooding]
D-separation, cont.
[Figure: diverging connection, A with children B, C, ..., E]
Evidence may be transmitted through a diverging
connection unless the variable in the connection is instantiated.
B, C, ..., E are d-separated given A
[Example: Sex as a common cause of Hair length and Stature]
D-separation, cont.
[Figure: converging connection, parents B, C, ..., E with common child A]
Evidence may be transmitted through a converging
connection only if either the variable in the connection or
one of its descendants has received evidence.
[Example: Not enough fuel and Dirty spark plugs are both causes of Car won’t start]
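The three transmission rules can be condensed into a small sketch; the function name and argument encoding below are illustrative only, not taken from the source:

```python
# Does the intermediate variable V block evidence flow along a path?
# connection: "serial", "diverging", or "converging"
# v_has_evidence: V itself is instantiated (its state is known)
# descendant_has_evidence: some descendant of V has received evidence
def blocks(connection, v_has_evidence, descendant_has_evidence=False):
    if connection in ("serial", "diverging"):
        # Serial and diverging connections are blocked exactly when V is instantiated.
        return v_has_evidence
    # Converging connections are blocked unless V or one of its descendants has evidence.
    return not (v_has_evidence or descendant_has_evidence)

# Car-start example: Not enough fuel -> Car won't start <- Dirty spark plugs
print(blocks("converging", v_has_evidence=False))  # True: the two causes are independent a priori
print(blocks("converging", v_has_evidence=True))   # False: observing the common effect connects its causes
```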
D-separation (again)
Definition: Two distinct variables A and B in a causal network
are d-separated if for all paths between A and B, there is an
intermediate variable V (distinct from A and B) such that either
the connection is serial or diverging and V is instantiated, or
the connection is converging, and neither V nor any of V’s
descendants have received evidence.
If A and B are not d-separated, we call them d-connected
D-separation, cont.
The Markov blanket of a variable A is the set consisting of the
parents of A, the children of A, and the variables sharing a
child with A.
The Markov blanket has the property that when all of its variables are
instantiated, A is d-separated from the rest of the network.
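As a sketch, the Markov blanket can be read off a parent mapping; the five-variable network below is hypothetical:

```python
# Network given as child -> list of parents (a hypothetical five-variable example).
parents = {
    "A": ["B"],
    "B": [],
    "C": ["A", "D"],
    "D": [],
    "E": ["C"],
}

def markov_blanket(x, parents):
    children = [c for c, ps in parents.items() if x in ps]
    spouses = {p for c in children for p in parents[c] if p != x}
    return set(parents[x]) | set(children) | spouses

print(markov_blanket("A", parents))  # {'B', 'C', 'D'} (order may vary); E is outside the blanket
```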
Bayesian Networks
A Bayesian network consists of the following:
A set of variables and a set of directed edges between variables.
Each variable has a finite set of mutually exclusive states.
The variables together with the directed edges form an acyclic
directed graph.
To each variable A with parents B1,...,Bn, a conditional probability
table P(A|B1,...,Bn ) is attached.
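A minimal sketch of such a specification for a hypothetical two-variable network with an edge B → A (variable names, states, and numbers are made up):

```python
# Bayesian network over U = {A, B} with the single edge B -> A.
# Prior distribution for the parentless variable B.
P_B = {"b0": 0.7, "b1": 0.3}

# Conditional probability table P(A | B): one distribution over A per state of B.
P_A_given_B = {
    "b0": {"a0": 0.9, "a1": 0.1},
    "b1": {"a0": 0.4, "a1": 0.6},
}

# Each column of a CPT must itself be a probability distribution.
assert abs(sum(P_B.values()) - 1.0) < 1e-9
for dist in P_A_given_B.values():
    assert abs(sum(dist.values()) - 1.0) < 1e-9
```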
Bayesian Networks, cont.
The model’s d-separation properties should correspond
to our perception of the world’s conditional
independence properties.
If A and B are d-separated given evidence e, then the
probability calculus used for Bayesian networks must yield
P(A|e) = P(A|B,e)
The general chain rule
Let U = {A1, ..., An} be a set of variables. Then for any
probability distribution P(U) we have
P(U) = P(An | A1, ..., An-1) ⋅ P(An-1 | A1, ..., An-2) ⋅ ... ⋅ P(A2 | A1) ⋅ P(A1)
For example, with n = 3 this reads P(A1, A2, A3) = P(A3 | A1, A2) ⋅ P(A2 | A1) ⋅ P(A1).
The chain rule for Bayesian networks
Let BN be a Bayesian network over U = {A1,...,An}. Then
BN specifies a unique joint probability distribution P(U)
given by the product of all conditional probability tables
specified in BN:
P(U) = ∏i P(Ai | pa(Ai))
where pa(Ai) are the parents of Ai in BN, and P(U) reflects the properties of BN.
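Continuing the hypothetical B → A network from before, a sketch of the chain rule in code: the joint table is simply the product of the two conditional probability tables (numbers repeated so the snippet is self-contained):

```python
# CPTs for the hypothetical network B -> A (same made-up numbers as above).
P_B = {"b0": 0.7, "b1": 0.3}
P_A_given_B = {"b0": {"a0": 0.9, "a1": 0.1},
               "b1": {"a0": 0.4, "a1": 0.6}}

# Chain rule for Bayesian networks: P(A, B) = P(A | B) * P(B)
P_AB = {(a, b): P_A_given_B[b][a] * P_B[b]
        for b in P_B for a in P_A_given_B[b]}

print(P_AB)                                   # joint distribution over U = {A, B}
print(abs(sum(P_AB.values()) - 1.0) < 1e-9)   # True: the joint sums to 1
```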
Inserting evidence
Let A be a variable with n states. A finding on A is an n-dimensional table of zeros and ones.
E.g. (0,0,0,1,0,0,1,0)
Semantically, a finding is a statement that certain states of A
are impossible.
Let BN be a Bayesian network over the universe U, and
let e1, ..., em be findings. Then
P(U, e) = ∏i P(Ai | pa(Ai)) ⋅ ∏j ej
And for A ∈ U we have
P(A | e) = Σ_{U \ {A}} P(U, e) / P(e), where P(e) = Σ_U P(U, e).
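A sketch of evidence insertion on the same hypothetical joint from the chain-rule example: the finding zeroes out impossible states, and conditioning is summation followed by normalization (all numbers are the made-up ones from before):

```python
# Joint over U = {A, B} from the chain-rule sketch above (made-up numbers).
P_AB = {("a0", "b0"): 0.63, ("a1", "b0"): 0.07,
        ("a0", "b1"): 0.12, ("a1", "b1"): 0.18}

# Finding on B: state b0 is impossible, i.e. the finding table over (b0, b1) is (0, 1).
finding = {"b0": 0.0, "b1": 1.0}

# Multiply the joint by the finding to obtain P(U, e).
P_U_e = {(a, b): p * finding[b] for (a, b), p in P_AB.items()}

# P(e) is the sum over all of U; P(A | e) is the normalized marginal of P(U, e).
P_e = sum(P_U_e.values())
P_A_given_e = {}
for (a, b), p in P_U_e.items():
    P_A_given_e[a] = P_A_given_e.get(a, 0.0) + p / P_e

print(P_A_given_e)  # ≈ {'a0': 0.4, 'a1': 0.6}
```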
Questions?