Basics on Probability
Jingrui He
09/11/2007

Coin Flips

- You flip a coin
  - Heads with probability 0.5
- You flip 100 coins
  - How many heads would you expect?

Coin Flips cont.

- You flip a coin
  - Heads with probability p
  - Binary random variable
  - Bernoulli trial with success probability p
- You flip k coins
  - How many heads would you expect?
  - Number of heads X: discrete random variable
  - Binomial distribution with parameters k and p

Discrete Random Variables

- Random variables (RVs) which may take on only a countable number of distinct values
  - E.g. the total number of heads X you get if you flip 100 coins
- X is a RV with arity k if it can take on exactly one value out of {x_1, ..., x_k}
  - E.g. the possible values that X can take on are 0, 1, 2, ..., 100

Probability of Discrete RV

- Probability mass function (pmf): P(X = x_i)
- Easy facts about pmf
  - ∑_i P(X = x_i) = 1
  - P(X = x_i ∩ X = x_j) = 0 if i ≠ j
  - P(X = x_i ∪ X = x_j) = P(X = x_i) + P(X = x_j) if i ≠ j
  - P(X = x_1 ∪ X = x_2 ∪ ... ∪ X = x_k) = 1
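
Not part of the original slides: a minimal Python sketch that checks these pmf facts for one concrete choice, the number of heads in k = 3 fair coin flips (k and p are illustrative values).

```python
from math import comb

# Illustrative pmf: number of heads X in k = 3 fair coin flips, X ~ Bin(3, 0.5)
k, p = 3, 0.5
pmf = {i: comb(k, i) * p**i * (1 - p)**(k - i) for i in range(k + 1)}

# Fact: the pmf sums to 1 over all possible values of X
assert abs(sum(pmf.values()) - 1.0) < 1e-12

# Fact: {X = i} and {X = j} are disjoint for i != j, so probabilities of unions add up
print(pmf[0] + pmf[1])  # P(X = 0 or X = 1) = 0.125 + 0.375 = 0.5
```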

Common Distributions

- Uniform: X ~ U{1, ..., N}
  - X takes values 1, 2, ..., N
  - P(X = i) = 1/N
  - E.g. picking balls of different colors from a box
- Binomial: X ~ Bin(n, p)
  - X takes values 0, 1, ..., n
  - P(X = i) = (n choose i) p^i (1 − p)^(n − i)
  - E.g. coin flips
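
As an illustrative check (not from the slides), the sketch below compares the Bin(n, p) pmf with an empirical estimate from simulated coin flips; n, p, and the number of trials are arbitrary choices.

```python
import random
from math import comb

n, p, trials = 10, 0.5, 100_000  # illustrative values, not from the slides

# Exact binomial pmf: P(X = i) = (n choose i) p^i (1 - p)^(n - i)
exact = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]

# Empirical estimate: count heads in repeated batches of n coin flips
counts = [0] * (n + 1)
for _ in range(trials):
    heads = sum(random.random() < p for _ in range(n))
    counts[heads] += 1

for i in range(n + 1):
    print(f"P(X = {i}): exact {exact[i]:.4f}, simulated {counts[i] / trials:.4f}")
```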

Coin Flips of Two Persons

- Your friend and you both flip coins
  - Heads with probability 0.5
  - You flip 50 times; your friend flips 100 times
- How many heads will both of you get?

Joint Distribution

- Given two discrete RVs X and Y, their joint distribution is the distribution of X and Y together
  - E.g. P(You get 21 heads AND your friend gets 70 heads)
- ∑_x ∑_y P(X = x ∩ Y = y) = 1
  - E.g. ∑_{i=0}^{50} ∑_{j=0}^{100} P(You get i heads AND your friend gets j heads) = 1
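
A minimal sketch of this example, assuming (as in the slides' setup) that the two head counts are independent binomials; it builds the joint pmf and checks that it sums to 1.

```python
from math import comb

def binom_pmf(n, p, i):
    """P(X = i) for X ~ Bin(n, p)."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

# Joint pmf of your heads (50 fair flips) and your friend's heads (100 fair flips),
# assuming the two counts are independent
joint = {(i, j): binom_pmf(50, 0.5, i) * binom_pmf(100, 0.5, j)
         for i in range(51) for j in range(101)}

print(sum(joint.values()))  # the joint pmf sums to 1 (up to floating-point error)
```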

Conditional Probability

- P(X = x | Y = y) is the probability of X = x, given the occurrence of Y = y
  - E.g. you get 0 heads, given that your friend gets 61 heads
- P(X = x | Y = y) = P(X = x ∩ Y = y) / P(Y = y)
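
Not from the slides: a sketch applying the definition to the example above, using the same hypothetical joint pmf of the two flippers' head counts.

```python
from math import comb

def binom_pmf(n, p, i):
    return comb(n, i) * p**i * (1 - p)**(n - i)

# Same joint pmf as in the previous sketch: your heads X (50 flips), friend's heads Y (100 flips)
joint = {(i, j): binom_pmf(50, 0.5, i) * binom_pmf(100, 0.5, j)
         for i in range(51) for j in range(101)}

# P(X = 0 | Y = 61) = P(X = 0 and Y = 61) / P(Y = 61)
p_y61 = sum(prob for (i, j), prob in joint.items() if j == 61)
print(joint[(0, 61)] / p_y61)  # equals P(X = 0) here, because X and Y are independent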

Law of Total Probability

- Given two discrete RVs X and Y, which take values in {x_1, ..., x_m} and {y_1, ..., y_n}, we have
  - P(X = x_i) = ∑_j P(X = x_i ∩ Y = y_j)
             = ∑_j P(X = x_i | Y = y_j) P(Y = y_j)
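
A minimal sketch (not from the slides) checking the identity on a small made-up joint table; the numbers are chosen only so the joint pmf sums to 1.

```python
# A small, made-up joint pmf P(X = x, Y = y) with x in {0, 1} and y in {0, 1, 2}
joint = {(0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
         (1, 0): 0.15, (1, 1): 0.25, (1, 2): 0.20}

# Marginal of Y: P(Y = y) = sum_x P(X = x, Y = y)
p_y = {y: joint[(0, y)] + joint[(1, y)] for y in (0, 1, 2)}

# Law of total probability: P(X = 0) = sum_j P(X = 0 | Y = y_j) P(Y = y_j)
p_x0 = sum((joint[(0, y)] / p_y[y]) * p_y[y] for y in (0, 1, 2))

print(p_x0)                                   # 0.4
print(sum(joint[(0, y)] for y in (0, 1, 2)))  # 0.4 as well, by marginalization
```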

Marginalization

- P(X = x_i) = ∑_j P(X = x_i ∩ Y = y_j) = ∑_j P(X = x_i | Y = y_j) P(Y = y_j)
  - P(X = x_i): marginal probability
  - P(X = x_i ∩ Y = y_j): joint probability
  - P(X = x_i | Y = y_j): conditional probability
  - P(Y = y_j): marginal probability

Bayes Rule

- X and Y are discrete RVs:
  P(X = x_i | Y = y_j) = P(X = x_i ∩ Y = y_j) / P(Y = y_j)
                       = P(Y = y_j | X = x_i) P(X = x_i) / ∑_k P(Y = y_j | X = x_k) P(X = x_k)
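
A minimal numerical sketch of the rule (not from the slides); the prior and likelihood values are made up purely for illustration.

```python
# Made-up prior over a binary X and likelihood P(Y = 1 | X = x), for illustration only
prior = {0: 0.7, 1: 0.3}        # P(X = x)
likelihood = {0: 0.2, 1: 0.9}   # P(Y = 1 | X = x)

# Bayes rule: P(X = x | Y = 1) = P(Y = 1 | X = x) P(X = x) / sum_k P(Y = 1 | X = k) P(X = k)
evidence = sum(likelihood[k] * prior[k] for k in prior)   # P(Y = 1)
posterior = {x: likelihood[x] * prior[x] / evidence for x in prior}

print(posterior)  # {0: ~0.341, 1: ~0.659}; the posterior sums to 1
```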

Independent RVs

- Intuition: X and Y are independent means that X = x makes it neither more nor less probable that Y = y
- Definition: X and Y are independent iff
  P(X = x ∩ Y = y) = P(X = x) P(Y = y)

More on Independence

- P(X = x ∩ Y = y) = P(X = x) P(Y = y)
  - P(X = x | Y = y) = P(X = x)
  - P(Y = y | X = x) = P(Y = y)
- E.g. no matter how many heads you get, your friend will not be affected, and vice versa

Conditionally Independent RVs

- Intuition: X and Y are conditionally independent given Z means that once Z is known, the value of X does not add any additional information about Y
- Definition: X and Y are conditionally independent given Z iff
  P(X = x ∩ Y = y | Z = z) = P(X = x | Z = z) P(Y = y | Z = z)

More on Conditional Independence

- P(X = x ∩ Y = y | Z = z) = P(X = x | Z = z) P(Y = y | Z = z)
  - P(X = x | Y = y, Z = z) = P(X = x | Z = z)
  - P(Y = y | X = x, Z = z) = P(Y = y | Z = z)

Monty Hall Problem

- You're given the choice of three doors: behind one door is a car; behind the others, goats.
- You pick a door, say No. 1.
- The host, who knows what's behind the doors, opens another door, say No. 3, which has a goat.
- Do you want to pick door No. 2 instead?

[Diagram: if your door hides the car, the host reveals Goat A or Goat B; if your door hides Goat A, the host must reveal Goat B; if your door hides Goat B, the host must reveal Goat A]

Monty Hall Problem: Bayes Rule

- C_i: the car is behind door i, i = 1, 2, 3
  - P(C_i) = 1/3
- H_ij: the host opens door j after you pick door i
  - P(H_ij | C_k) = 0 if i = j (the host never opens the door you picked)
  - P(H_ij | C_k) = 0 if j = k (the host never opens the door with the car)
  - P(H_ij | C_k) = 1/2 if i = k (the car is behind your door; the host can open either of the other two)
  - P(H_ij | C_k) = 1 if i ≠ k and j ≠ k (the host must open the only remaining goat door)

Monty Hall Problem: Bayes Rule cont.

- WLOG, let i = 1, j = 3
- P(C_1 | H_13) = P(H_13 | C_1) P(C_1) / P(H_13)
- P(H_13 | C_1) P(C_1) = (1/2) × (1/3) = 1/6

Monty Hall Problem: Bayes Rule cont.

- P(H_13) = P(H_13, C_1) + P(H_13, C_2) + P(H_13, C_3)
          = P(H_13 | C_1) P(C_1) + P(H_13 | C_2) P(C_2)   (the C_3 term is 0, since P(H_13 | C_3) = 0)
          = 1/6 + 1 × 1/3 = 1/2
- P(C_1 | H_13) = (1/6) / (1/2) = 1/3

Monty Hall Problem: Bayes Rule cont.

- P(C_1 | H_13) = (1/6) / (1/2) = 1/3
- P(C_2 | H_13) = 1 − 1/3 = 2/3 > P(C_1 | H_13)
- You should switch!
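
Not part of the original slides: a Monte Carlo sketch of the game that estimates the win rate for staying vs. switching; the estimates should come out near 1/3 and 2/3, matching the Bayes-rule calculation above.

```python
import random

def play(switch: bool) -> bool:
    """One round of the game; returns True if the player ends up with the car."""
    doors = [1, 2, 3]
    car = random.choice(doors)
    pick = random.choice(doors)
    # The host opens a door that is neither the player's pick nor the car door
    host = random.choice([d for d in doors if d != pick and d != car])
    if switch:
        pick = next(d for d in doors if d != pick and d != host)
    return pick == car

trials = 100_000
print("stay:  ", sum(play(False) for _ in range(trials)) / trials)  # ~1/3
print("switch:", sum(play(True) for _ in range(trials)) / trials)   # ~2/3
```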

Continuous Random Variables

- What if X is continuous?
- Probability density function (pdf) instead of probability mass function (pmf)
- A pdf is any function f(x) that describes the probability density in terms of the input variable x

PDF

- Properties of pdf
  - f(x) ≥ 0, ∀x
  - ∫_{−∞}^{+∞} f(x) dx = 1
  - f(x) ≤ 1 ??? (not necessarily: a density may exceed 1, as long as it integrates to 1)
- Actual probability can be obtained by taking the integral of the pdf
  - E.g. the probability of X being between 0 and 1 is P(0 ≤ X ≤ 1) = ∫_0^1 f(x) dx
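
A small numerical sketch (not from the slides) using an illustrative density that exceeds 1, f(x) = 2 on [0, 0.5] and 0 elsewhere, to show that probabilities still come from integrals of the pdf.

```python
def f(x: float) -> float:
    """A valid pdf whose values exceed 1: uniform density 2 on [0, 0.5], 0 elsewhere."""
    return 2.0 if 0.0 <= x <= 0.5 else 0.0

def integrate(func, a, b, n=100_000):
    """Midpoint-rule numerical integration of func over [a, b]."""
    h = (b - a) / n
    return sum(func(a + (i + 0.5) * h) for i in range(n)) * h

print(integrate(f, -1.0, 1.0))   # total probability ~1.0, even though f(x) = 2 > 1
print(integrate(f, 0.0, 0.25))   # P(0 <= X <= 0.25) ~0.5
```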

Cumulative Distribution Function

- F_X(v) = P(X ≤ v)
- Discrete RVs
  - F_X(v) = ∑_{v_i ≤ v} P(X = v_i)
- Continuous RVs
  - F_X(v) = ∫_{−∞}^{v} f(x) dx
  - (d/dx) F_X(x) = f(x)
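
A minimal sketch of the discrete case (not from the slides): building the CDF of a Bin(n, p) variable by summing its pmf; n and p are illustrative.

```python
from math import comb

n, p = 10, 0.5  # illustrative binomial parameters
pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]

def cdf(v: int) -> float:
    """F_X(v) = P(X <= v), obtained by summing the pmf over values <= v."""
    if v < 0:
        return 0.0
    return sum(pmf[: min(v, n) + 1])

print(cdf(5))   # P(X <= 5) ~0.623 for Bin(10, 0.5)
print(cdf(10))  # 1.0: the cdf reaches 1 at the largest possible value
```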

Common Distributions

- Normal: X ~ N(μ, σ²)
  - f(x) = 1 / (√(2π) σ) · exp(−(x − μ)² / (2σ²)), x ∈ ℝ
  - E.g. the height of the entire population

[Figure: the standard normal pdf f(x) plotted for x from −5 to 5; the peak value is about 0.4]
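
Not from the slides: a small sketch that evaluates the N(μ, σ²) density above and checks numerically that it integrates to about 1 (μ = 0, σ = 1 chosen for illustration).

```python
from math import exp, pi, sqrt

def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of N(mu, sigma^2) at x."""
    return exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sqrt(2 * pi) * sigma)

print(normal_pdf(0.0))  # peak of the standard normal, ~0.3989, matching the figure above

# Midpoint-rule integral over [-5, 5]: very close to 1 (the tails beyond are negligible)
n, a, b = 100_000, -5.0, 5.0
h = (b - a) / n
print(sum(normal_pdf(a + (i + 0.5) * h) for i in range(n)) * h)
```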

Common Distributions cont.

- Beta: X ~ Beta(α, β)
  - f(x; α, β) = x^(α−1) (1 − x)^(β−1) / B(α, β), x ∈ [0, 1]
  - α = β = 1: uniform distribution between 0 and 1
  - E.g. the conjugate prior for the parameter p in the Binomial distribution

[Figure: an example Beta pdf f(x) plotted for x from 0 to 1, with density values up to about 1.6]
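
A minimal sketch (not from the slides) evaluating the Beta density via B(α, β) = Γ(α)Γ(β)/Γ(α+β), and confirming that Beta(1, 1) is the uniform density on (0, 1); the evaluation points are arbitrary.

```python
from math import gamma

def beta_pdf(x: float, a: float, b: float) -> float:
    """Density of Beta(a, b), using B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    if not 0.0 < x < 1.0:
        return 0.0
    B = gamma(a) * gamma(b) / gamma(a + b)
    return x ** (a - 1) * (1 - x) ** (b - 1) / B

print(beta_pdf(0.3, 1.0, 1.0))  # 1.0: Beta(1, 1) is the uniform density on (0, 1)
print(beta_pdf(0.5, 2.0, 2.0))  # 1.5: an example non-uniform Beta density
```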

Joint Distribution

- Given two continuous RVs X and Y, the joint pdf can be written as f_{X,Y}(x, y)
  - ∫_x ∫_y f_{X,Y}(x, y) dx dy = 1

Multivariate Normal

- Generalization to higher dimensions of the one-dimensional normal
- f_X(x_1, ..., x_d) = 1 / ((2π)^(d/2) |Σ|^(1/2)) · exp(−(1/2) (x − μ)^T Σ^(−1) (x − μ))
  - μ: mean vector; Σ: covariance matrix
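
Not from the slides: a numpy sketch evaluating this density at a point, for an illustrative 2-D mean and covariance matrix.

```python
import numpy as np

def mvn_pdf(x: np.ndarray, mu: np.ndarray, cov: np.ndarray) -> float:
    """Density of a d-dimensional normal N(mu, cov) evaluated at x."""
    d = len(mu)
    diff = x - mu
    norm_const = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(cov))
    quad = diff @ np.linalg.inv(cov) @ diff
    return float(np.exp(-0.5 * quad) / norm_const)

# Illustrative 2-D example: zero mean, unit variances, correlation 0.5
mu = np.zeros(2)
cov = np.array([[1.0, 0.5],
                [0.5, 1.0]])
print(mvn_pdf(np.zeros(2), mu, cov))  # density at the mean, ~0.1838
```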

Moments

- Mean (Expectation): μ = E[X]
  - Discrete RVs: E[X] = ∑_{v_i} v_i P(X = v_i)
  - Continuous RVs: E[X] = ∫ x f(x) dx
- Variance: V(X) = E[(X − μ)²]
  - Discrete RVs: V(X) = ∑_{v_i} (v_i − μ)² P(X = v_i)
  - Continuous RVs: V(X) = ∫ (x − μ)² f(x) dx
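
A minimal sketch (not from the slides) computing the mean and variance of a discrete RV directly from its pmf, using an illustrative Bin(n, p) so the results can be checked against the closed forms np and np(1 − p) given later.

```python
from math import comb

# Illustrative discrete RV: X ~ Bin(n, p); its pmf gives the mean and variance directly
n, p = 10, 0.3
pmf = {i: comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)}

mean = sum(v * prob for v, prob in pmf.items())               # E[X] = sum_v v P(X = v)
var = sum((v - mean) ** 2 * prob for v, prob in pmf.items())  # V(X) = E[(X - mu)^2]

print(mean, n * p)           # both 3.0
print(var, n * p * (1 - p))  # both 2.1 (up to floating-point error)
```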

Properties of Moments

- Mean
  - E[X + Y] = E[X] + E[Y]
  - E[aX] = a E[X]
  - If X and Y are independent, E[XY] = E[X] E[Y]
- Variance
  - V(aX + b) = a² V(X)
  - If X and Y are independent, V(X + Y) = V(X) + V(Y)

Moments of Common Distributions

- Uniform X ~ U{1, ..., N}
  - Mean (1 + N)/2; variance (N² − 1)/12
- Binomial X ~ Bin(n, p)
  - Mean np; variance np(1 − p)
- Normal X ~ N(μ, σ²)
  - Mean μ; variance σ²
- Beta X ~ Beta(α, β)
  - Mean α/(α + β); variance αβ / ((α + β)²(α + β + 1))

Probability of Events

- X denotes an event that could possibly happen
- P(X) denotes the likelihood that X happens, or X = true
  - E.g. X = "you will fail in this course"
  - What's the probability that you will fail in this course?
- Ω denotes the entire event set
  - Ω = {X, ¬X}

The Axioms of Probabilities

- 0 ≤ P(X) ≤ 1
- P(Ω) = 1
- P(X_1 ∪ X_2 ∪ ...) = ∑_i P(X_i), where the X_i are disjoint events
- Useful rules
  - P(X_1 ∪ X_2) = P(X_1) + P(X_2) − P(X_1 ∩ X_2)
  - P(¬X) = 1 − P(X)

Interpreting the Axioms

[Figure: Venn diagram of two events X_1 and X_2]
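
Not part of the original slides: a small simulation sketch checking the inclusion-exclusion rule above on two made-up events defined over a fair die roll.

```python
import random

# Two made-up events over a fair die roll: X1 = "even roll", X2 = "roll >= 4"
trials = 100_000
count_1 = count_2 = count_and = count_or = 0

for _ in range(trials):
    roll = random.randint(1, 6)
    x1 = roll % 2 == 0
    x2 = roll >= 4
    count_1 += x1
    count_2 += x2
    count_and += x1 and x2
    count_or += x1 or x2

# P(X1 or X2) should match P(X1) + P(X2) - P(X1 and X2); both ~2/3 here
print(count_or / trials)
print((count_1 + count_2 - count_and) / trials)
```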