Working with Probability Tables
Recall Discrete Probability
• The sample space is made up of discrete outcomes
• Discrete: Nominal (categories)
  Experiment: Did he do it?
• Discrete: Ordinal (orderable somehow)
  Experiment: How many arsons will there be in my neighbourhood this year?
Recall Discrete Probability
• Each outcome is associated with a probability:
Table: Pr(TO)
TO:     G      I
       57%    43%
That is, Pr(G) = 57% and Pr(I) = 43%.
Recall Discrete Probability
• Each outcome is associated with a probability:
• We can have one or more discrete random variates too:
Pr(Time,Amount), in percent (drawn on the slide as a 3D histogram):
                Time:
Amount:    morning   noon   night
low           28       3       0
medium         3      18      13
high           1      10      24
Recall Discrete Probability
• Each outcome is associated with a probability:
• Here is what a table with 3 random variates could look like:
Pr(A,B,C)
A:    B:    C:   c1     c2     c3     c4
a1    b1        5.2    7.5    2.4    0.9
a1    b2        0.3    2.9    8.0    0.2
a1    b3        6.0    1.1    8.7    5.7
a2    b1        3.2    1.2    1.3    6.9
a2    b2        2.1    0.4    4.7    7.3
a2    b3        5.1    5.9    8.0    5.0
• We can’t really make a visual histogram anymore though…
Recall Discrete Probability
• Each outcome is associated with a probability:
• Here is what a table with 3 random variates could look like (or):
This is how R likes a multidimensional table: one 2D table of A by B for each level of C.
Pr(A,B,C)
C = c1
         B:   b1     b2     b3
A:  a1       5.2    0.3    6.0
    a2       3.2    2.1    5.1
C = c2
         B:   b1     b2     b3
A:  a1       7.5    2.9    1.1
    a2       1.2    0.4    5.9
C = c3
         B:   b1     b2     b3
A:  a1       2.4    8.0    8.7
    a2       1.3    4.7    8.0
C = c4
         B:   b1     b2     b3
A:  a1       0.9    0.2    5.7
    a2       6.9    7.3    5.0
• We can’t really make a visual histogram anymore though…
Recall Discrete Probability
• Each outcome is associated with a probability:
• Here is what a table with 3 random variates could look like (or):
A     B     C     Pr(A,B,C)
a1    b1    c1    5.2
a1    b1    c2    7.5
a1    b1    c3    2.4
a1    b1    c4    0.9
a1    b2    c1    0.3
a1    b2    c2    2.9
a1    b2    c3    8.0
a1    b2    c4    0.2
a1    b3    c1    6.0
a1    b3    c2    1.1
a1    b3    c3    8.7
a1    b3    c4    5.7
a2    b1    c1    3.2
a2    b1    c2    1.2
a2    b1    c3    1.3
a2    b1    c4    6.9
a2    b2    c1    2.1
a2    b2    c2    0.4
a2    b2    c3    4.7
a2    b2    c4    7.3
a2    b3    c1    5.1
a2    b3    c2    5.9
a2    b3    c3    8.0
a2    b3    c4    5.0
• We can’t really make a visual histogram anymore though…
I find this format most general, easiest to read and least ambiguous
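The three layouts above hold the same numbers, and base R can move between them. The following is a minimal sketch, assuming only base R (no gRbase); the object names `vals`, `cba`, `abc` and `long` are made up for illustration.

```r
# Pr(A,B,C) in percent, listed row by row from the long-format table
# (A changes slowest, C changes fastest).
vals <- c(5.2, 7.5, 2.4, 0.9,  0.3, 2.9, 8.0, 0.2,  6.0, 1.1, 8.7, 5.7,
          3.2, 1.2, 1.3, 6.9,  2.1, 0.4, 4.7, 7.3,  5.1, 5.9, 8.0, 5.0)

# array() fills the first index fastest, so build it as C x B x A ...
cba <- array(vals, dim = c(4, 3, 2),
             dimnames = list(C = paste0("c", 1:4),
                             B = paste0("b", 1:3),
                             A = paste0("a", 1:2)))
# ... and then reorder the dimensions to A x B x C.
abc <- aperm(cba, c(3, 2, 1))

abc[, , "c1"]     # one 2D slice (A by B), the way R prints a 3D array
long <- as.data.frame(as.table(abc), responseName = "Pr")
head(long)        # long format: one row per (A, B, C) combination
sum(abc)          # Check: the percentages should total 100
```

Note that `as.data.frame()` lists the rows with A changing fastest, so the row order differs from the long table above even though the content is the same.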
Recall Discrete Probability
• Each outcome is associated with a probability:
• Conditional probability can be represented as tables too:
Pr(Amount|Time), in percent (each Time column sums to about 100):
                Time:
Amount:    morning   noon   night
low          87.0     9.1     0.8
medium        8.7    59.1    34.4
high          4.3    31.8    64.9
Pr(Amount|Time) = Pr(Time,Amount)/Pr(Time)
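The division Pr(Time,Amount)/Pr(Time) can be tried out directly on the joint table. This is a minimal sketch in base R, assuming the joint percentages from the earlier Pr(Time,Amount) slide; the names `joint`, `pr.time` and `amount.given.time` are illustrative only.

```r
# Joint table Pr(Time,Amount) in percent: rows = Amount, columns = Time
joint <- rbind(low    = c(28,  3,  0),
               medium = c( 3, 18, 13),
               high   = c( 1, 10, 24))
colnames(joint) <- c("morning", "noon", "night")

pr.time <- colSums(joint)                            # Pr(Time), in percent
amount.given.time <- sweep(joint, 2, pr.time, "/")   # divide each column by its total

round(amount.given.time, 3)
colSums(amount.given.time)                           # Check: each column sums to 1
```

Small differences from the conditional percentages shown on the slide are possible; for example, this calculation gives 28/32 = 0.875 for low in the morning.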
Representation of Probability Tables in R
• A 1D “prior” table (one RV):
Pr(A)
A:   yes   0.37
     no    0.63
We’ll use the gRbase and gRain packages a lot when working with Bayesian networks.
R code (parray comes from gRbase):
library(gRbase)
A <- parray("A",
            levels = list(c("yes", "no")),
            values = c(0.37, 0.63))
Representation of Probability Tables in R
• A 2D table, either a joint PMF or conditional PMF:
Pr(Amount,Time)
                Time:
Amount:    morning   noon   night
low          0.28    0.03    0
medium       0.03    0.18    0.13
high         0.01    0.1     0.24
# Amount is the row variable, Time is the column variable
A.and.T <- parray(c("Amount", "Time"),
  list(
    c("low", "medium", "high"),       # Amount levels
    c("morning", "noon", "night")     # Time levels
  ),
  # You can enter any values and choose to normalize them.
  values = rbind(
    c(28,  3,  0),
    c( 3, 18, 13),
    c( 1, 10, 24)
  ),
  normalize = "all"
)
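To see what normalizing the entered values amounts to, here is a base-R sketch. It assumes that normalizing over all cells simply divides every entry by the grand total, so that the table sums to 1; the names `raw` and `joint` are made up for illustration and no gRbase call is used.

```r
# Unnormalized entries: rows = Amount, columns = Time (same numbers as above)
raw <- rbind(c(28,  3,  0),
             c( 3, 18, 13),
             c( 1, 10, 24))
dimnames(raw) <- list(Amount = c("low", "medium", "high"),
                      Time   = c("morning", "noon", "night"))

joint <- raw / sum(raw)   # divide every cell by the grand total
joint
sum(joint)                # Check: a joint PMF sums to 1
```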
Representation of Probability Tables in R
• A 2D table, either a joint PMF or conditional PMF:
Pr(Amount | Time)
                Time:
Amount:    morning   noon    night
low          0.87    0.091   0.008
medium       0.09    0.591   0.344
high         0.04    0.319   0.649
A.given.T <- parray(c("Amount", "Time"),
  list(
    c("low", "medium", "high"),       # Amount levels
    c("morning", "noon", "night")     # Time levels
  ),
  values = rbind(c(0.87, 0.091, 0.008),
                 c(0.09, 0.591, 0.344),
                 c(0.04, 0.319, 0.649))
)
• NOTE: You, the user, have to remember whether it is joint or conditional.
Representation of Probability Tables in R
• Higher nD (n = 3 or more) tables are arrays of 2D tables.
• These are really hard to enter by hand, so let’s just look at one for now:
Pr(Transfer | Artist, Location)
A 3D table
Representation of Probability Tables in R
• Higher nD (n = 3 or more) tables are arrays of 2D tables.
• Same table as on the previous slide, just in an easier-to-read format:
Pr(Transfer | Artist, Location)
Working with Probability Tables
• So it’s clear that we can represent lots of discrete probability distributions (they are probability mass functions, PMFs, actually) as tables.
• How can we “combine” them while respecting the laws of probability?
• The gRbase library makes it easy!
• Add tables = tableAdd()
• Multiply tables = tableMult()
• Divide tables = tableDiv()
• Marginalize down to the selected RV(s) = tableMargin()
Example with Probability Tables
Consider two variables, A and B, that can co-occur. A has two
states and B has three states. The probabilities of A and B can
be listed in a table:
Pr(A,B)   b1     b2     b3
a1        0.14   0.16   0.12
a2        0.24   0.26   0.08
Compute: Pr(B)
Compute: Pr(A|B)
Compute: Pr(A)
Compute: Pr(B|A)
Example with Probability Tables
Compute: Pr(B) (margin is B)
Compute: Pr(A|B) = Pr(A,B)/Pr(B) (divide)
Compute: Pr(A) (marginalize)
Compute: Pr(B|A) = Pr(A|B) Pr(B)/Pr(A) (multiply and divide)
library(gRbase)

# Input: Pr(A,B)
A.and.B <- parray(c("A", "B"),
                  levels = c(2, 3),
                  values = rbind(c(0.14, 0.16, 0.12),
                                 c(0.24, 0.26, 0.08)))
A.and.B                                   # Output
sum(A.and.B)                              # Check

B <- tableMargin(A.and.B, margin = "B")   # Pr(B)
B                                         # Output
sum(B)                                    # Check

A.given.B <- tableDiv(A.and.B, B)         # Pr(A|B)
t(A.given.B)                              # Output
colSums(t(A.given.B))                     # Check

A <- tableMargin(A.and.B, margin = "A")   # Pr(A)
A                                         # Output
sum(A)                                    # Check

B.given.A <- tableDiv(tableMult(A.given.B, B), A)   # Pr(B|A)
t(B.given.A)                              # Output
colSums(t(B.given.A))                     # Check
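As a cross-check on the table arithmetic, the same four quantities can be worked out with plain matrices. This is a sketch in base R only, not the gRbase route used above; the level labels a1, a2, b1, b2, b3 and the object names are chosen here for illustration.

```r
# Pr(A,B): rows = A, columns = B (same values as in the R input above)
joint <- rbind(a1 = c(0.14, 0.16, 0.12),
               a2 = c(0.24, 0.26, 0.08))
colnames(joint) <- c("b1", "b2", "b3")

pr.B <- colSums(joint)                     # Pr(B): sum over A
pr.A <- rowSums(joint)                     # Pr(A): sum over B

A.given.B <- sweep(joint, 2, pr.B, "/")    # Pr(A|B) = Pr(A,B) / Pr(B)
B.given.A <- sweep(joint, 1, pr.A, "/")    # Pr(B|A) = Pr(A,B) / Pr(A)

# Bayes' rule route: Pr(B|A) = Pr(A|B) Pr(B) / Pr(A)
via.bayes <- sweep(sweep(A.given.B, 2, pr.B, "*"), 1, pr.A, "/")

colSums(A.given.B)                         # Check: each column sums to 1
rowSums(B.given.A)                         # Check: each row sums to 1
all.equal(B.given.A, via.bayes)            # Check: both routes agree
```

Here `sweep()` divides (or multiplies) each column or row by the matching marginal, which is the matrix version of dividing one table by another.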
The Law of Total Probability
• Suppose a sample space Ω can be partitioned into a set of disjoint events Bi such that
Ω = B1 ∪ B2 ∪ … ∪ Bn, with Bi ∩ Bj = ∅ for i ≠ j
[Figure: Ω partitioned into B1, B2, B3 and B4, with an event A overlapping them]
• The probability of an arbitrary event A in Ω can be written as:
Pr(A) = Σi Pr(A | Bi) Pr(Bi)     (Law of total probability)
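A quick numerical illustration, using the Pr(A,B) values from the R input of the earlier example: the three states of B partition the sample space, and the event of interest is A = a1. This is only a sketch in base R; the names `pr.B`, `pr.a1.given.B` and `pr.a1` are made up here.

```r
# Partition {b1, b2, b3} and event A = a1, from the earlier Pr(A,B) example
pr.B          <- c(b1 = 0.38, b2 = 0.42, b3 = 0.20)                  # Pr(Bi)
pr.a1.given.B <- c(b1 = 0.14/0.38, b2 = 0.16/0.42, b3 = 0.12/0.20)   # Pr(a1 | Bi)

sum(pr.B)                             # Check: the partition's probabilities sum to 1
pr.a1 <- sum(pr.a1.given.B * pr.B)    # law of total probability
pr.a1                                 # matches the a1 row total, 0.14 + 0.16 + 0.12
```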
Example: A medical test
• Professor Shenkin LOVES hamburgers. But he’s also a
hypochondriac. He thinks he is infected with “Mad Cow Disease”
(MCD), so he gets himself tested (T).
• The true positive rate of the test is: Pr(T+ | MCD+) = 0.7
• The false positive rate of the test is: Pr(T+ | MCD-) = 0.1
• The background prevalence of MCD in the yummy cow population is:
Pr(MCD+) = 0.02
What is the most likely joint outcome?
What is the probability that Prof. Shenkin tests positive for MCD, Pr(T+)?
Suppose Professor Shenkin tests positive for MCD. What is the probability
that he truly has MCD, Pr(MCD+ | T+)?
Use probability table arithmetic in R/gRbase
library(gRbase)

# Pr(MCD) table:
MCD <- parray("MCD",
              levels = list(c("MCD+", "MCD-")),
              values = c(0.02, 0.98))
MCD
sum(MCD)

# Pr(T | MCD) table:
T.given.MCD <- parray(c("T", "MCD"),
                      list(
                        c("T+", "T-"),
                        c("MCD+", "MCD-")
                      ),
                      values = rbind(c(0.7, 0.1),
                                     c(0.3, 0.9)))
T.given.MCD
colSums(T.given.MCD)

# Pr(T,MCD) = Pr(T | MCD) Pr(MCD)
T.and.MCD <- tableMult(T.given.MCD, MCD)
T.and.MCD
sum(T.and.MCD)

# Pr(T)
TT <- tableMargin(T.and.MCD, margin = "T")
TT
sum(TT)

# Pr(MCD | T) = Pr(T | MCD) Pr(MCD)/Pr(T)
MCD.givenT <- tableDiv(tableMult(T.given.MCD, MCD), TT)
t(MCD.givenT)
colSums(t(MCD.givenT))
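The same three questions can also be answered with plain matrices, which makes each step of the arithmetic visible. The following is a base-R sketch rather than the gRbase route above; the names `prior`, `lik`, `joint`, `pr.T` and `posterior` are illustrative.

```r
prior <- c("MCD+" = 0.02, "MCD-" = 0.98)     # Pr(MCD)
lik <- rbind("T+" = c(0.7, 0.1),             # Pr(T | MCD): rows = T,
             "T-" = c(0.3, 0.9))             # columns = MCD
colnames(lik) <- names(prior)

joint <- sweep(lik, 2, prior, "*")           # Pr(T,MCD) = Pr(T | MCD) Pr(MCD)
joint
which(joint == max(joint), arr.ind = TRUE)   # most likely joint outcome

pr.T <- rowSums(joint)                       # Pr(T), by the law of total probability
pr.T["T+"]

posterior <- sweep(joint, 1, pr.T, "/")      # Pr(MCD | T): each row sums to 1
posterior["T+", "MCD+"]
```

With these inputs the most likely joint outcome is (T-, MCD-) at 0.882, Pr(T+) = 0.014 + 0.098 = 0.112, and Pr(MCD+ | T+) = 0.014/0.112 = 0.125.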