Lecture 9
5.3 Discrete Probability
5.3 Bayes’ Theorem
We have seen that the following holds:
$$P(E \mid F) = \frac{P(E \cap F)}{P(F)} \;\Rightarrow\; P(E \mid F)\,P(F) = P(E \cap F)$$
$$P(F \mid E) = \frac{P(E \cap F)}{P(E)} \;\Rightarrow\; P(F \mid E)\,P(E) = P(E \cap F)$$
$$P(E \mid F) = \frac{P(F \mid E)\,P(E)}{P(F)} \qquad\qquad P(F \mid E) = \frac{P(E \mid F)\,P(F)}{P(E)}$$
We can write one conditional probability in terms of the other: Bayes’ Theorem
5.3
Example:
What is the probability that a family with 2 kids has two boys, given that they
have at least one boy? (all possibilities are equally likely).
S: all possibilities: {BB, GB, BG, GG}.
E: family has two boys: {BB}.
F: family has at least one boy: {BB, GB, BG}.
E ∩ F = {BB}
(Venn diagram: F contains BB, BG, GB; E = {BB} lies inside F; GG lies outside F.)
P(E|F) = P(E ∩ F) / P(F) = (1/4) / (3/4) = 1/3
Now we compute P(F|E): what is the probability that
a family with two boys has at least one boy?
P(F|E) = P(E|F) P(F) / P(E) = (1/3)(3/4)/(1/4) = 1
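A quick way to sanity-check these conditional probabilities is to enumerate the four equally likely families directly. The sketch below is illustrative Python, not part of the original slides; the names `sample_space`, `E`, `F`, and `prob` are chosen here for readability.

```python
from fractions import Fraction

# Sample space of two-child families, all equally likely.
sample_space = ["BB", "BG", "GB", "GG"]
E = {s for s in sample_space if s == "BB"}   # two boys
F = {s for s in sample_space if "B" in s}    # at least one boy

def prob(event):
    return Fraction(len(event), len(sample_space))

# P(E|F) = P(E and F) / P(F)
p_E_given_F = prob(E & F) / prob(F)
# Bayes' theorem: P(F|E) = P(E|F) P(F) / P(E)
p_F_given_E = p_E_given_F * prob(F) / prob(E)

print(p_E_given_F)   # 1/3
print(p_F_given_E)   # 1
```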
5.3 Expected Values
The definition of the expected value of a random variable is:
$$E(X) = \sum_{s \in S} X(s)\,p(s)$$
This is equivalent to:
$$E(X) = \sum_{r \in X(S)} r\,P(X = r)$$
Example: What is the expected number of heads if we toss a fair coin
n times?
We know that the distribution for this experiment is the binomial distribution:
$$P(k, n; p) = \frac{n!}{k!(n-k)!}\,p^k (1-p)^{n-k}$$
5.3
Therefore we need to compute:
$$E(X) = \sum_{k=0}^{n} k\,P(X = k) = \sum_{k=0}^{n} k\,\frac{n!}{k!(n-k)!}\,p^k (1-p)^{n-k} = np$$
5.3
Expectations are linear:
Theorem: E(X1+X2) = E(X1) + E(X2)
E(aX + b) = aE(X) + b
Examples:
1) Expected value for the sum of the values when a pair of dice is rolled:
X1 = value of first die, X2 value of second die:
E(X1+X2) = E(X1) + E(X2) = 2 * (1+2+3+4+5+6)/6 = 7.
2) Expected number of heads when a fair coin is tossed n times
(see example previous slide)
Xi is the outcome of coin toss i. Each toss has probability p of coming up heads.
By linearity: E(X1+...+Xn) = E(X1) + ... + E(Xn) = np.
5.3
More examples:
A coat-check attendant mixes the labels up randomly.
When someone collects their coat, the attendant hands back a coat picked at random from the
remaining coats. What is the expected number of correctly returned coats?
There are n coats checked in.
Xi = 1 if coat i is correctly returned, and 0 if wrongly returned.
Since the labels are randomly permuted, E(Xi) = 1/n
E(X1+...+Xn) = n · (1/n) = 1 (independent of the number of checked-in coats)
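A simulation makes this surprising constant-expectation result concrete. The sketch below is illustrative Python (not from the slides); n and the trial count are arbitrary choices, and the estimate should come out close to 1.

```python
import random

def correctly_returned(n):
    """Return coats in a random order; count owners who get their own coat back."""
    returned = list(range(n))
    random.shuffle(returned)
    return sum(1 for owner, coat in enumerate(returned) if owner == coat)

n, trials = 50, 100_000
avg = sum(correctly_returned(n) for _ in range(trials)) / trials
print(avg)   # close to 1, regardless of n
```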
5.3 Geometric distribution
Q: What is the distribution of the waiting time until a tail comes up, when we toss a coin
that comes up tails with probability p?
A: Possible outcomes: T, HT, HHT, HHHT, HHHHT, .... (infinitely many possibilities)
P(T) = p, P(HT) = (1-p) p, P(HHT) = (1-p)^2 p, ....
P( X  k )  (1  p)k 1 p
geometric distribution
(matlab)
X(s) = number of tosses before
success.
Normalization:
$$\sum_{k=1}^{\infty} P(X = k) = \sum_{k=1}^{\infty} (1-p)^{k-1}\,p = 1$$
5.3 Geometric Distr.
Here is how you can compute the expected value of the waiting time:
$$E(X) = \sum_{k=1}^{\infty} k\,(1-p)^{k-1}\,p = -p\,\frac{d}{dp}\sum_{k=1}^{\infty} (1-p)^{k} = -p\,\frac{d}{dp}\left(\frac{1-p}{p}\right) = -p\,\frac{d}{dp}\left(\frac{1}{p} - 1\right) = -p\left(-\frac{1}{p^{2}}\right) = \frac{1}{p}$$
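A quick simulation of the waiting time confirms the 1/p result. This is an illustrative Python sketch; p and the number of trials are arbitrary choices.

```python
import random

def waiting_time(p):
    """Toss until the first success (tail); return the number of tosses."""
    k = 1
    while random.random() >= p:
        k += 1
    return k

p, trials = 0.25, 100_000
avg = sum(waiting_time(p) for _ in range(trials)) / trials
print(avg, 1 / p)   # both close to 4
```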
5.3 Independence
Definition: Two random variables X(s) and Y(s) on a sample space S
are independent if the following holds:
$$\forall r_1, r_2: \quad P\big(X(s) = r_1 \wedge Y(s) = r_2\big) = P\big(X(s) = r_1\big)\,P\big(Y(s) = r_2\big)$$
Examples
1) A pair of dice is rolled. X1 is the value of the first die, X2 the value of the second die.
Are these independent?
P(X1=r1) = 1/6
P(X2=r2)=1/6
P(X1=r1 AND X2=r2)=1/36 = P(X1=r1) P(X2=r2): YES independent.
2) Are X1 and X=X1+X2 independent?
P(X=12) =1/36
P(X1=1)=1/6
P(X=12 AND X1=1) = 0, which is not the product P(X=12) P(X1=1) = 1/216: NOT independent.
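Both claims can be verified by enumerating all 36 equally likely outcomes. The following Python sketch is illustrative; it checks the product rule for X1, X2 and shows it fails for X1 and X = X1 + X2.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))   # 36 equally likely rolls

def prob(event):
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

# X1 and X2: P(X1=1 and X2=2) equals P(X1=1) P(X2=2)
print(prob(lambda o: o[0] == 1 and o[1] == 2)
      == prob(lambda o: o[0] == 1) * prob(lambda o: o[1] == 2))        # True

# X1 and X = X1 + X2: P(X=12 and X1=1) = 0, but the product is not 0
print(prob(lambda o: sum(o) == 12 and o[0] == 1))                      # 0
print(prob(lambda o: sum(o) == 12) * prob(lambda o: o[0] == 1))        # 1/216
```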
5.3 Independence
Theorem: If two random variables X and Y are independent
over a sample space S then: E(XY)=E(X) E(Y). (proof, read book)
Note 1: The converse is not true: two random variables do not have to be
independent for E(XY) = E(X)E(Y) to hold.
Note 2: If two random variables are not independent, E(XY) does not have to
equal E(X)E(Y), although it might still happen.
Example: X counts number of heads when a coin is tossed twice:
P(X=0) =1/4 (TT)
P(X=1)=1/2 (HT,TH)
P(X=2) =1/4 (HH).
E(X) = 0 × 1/4 + 1 × 1/2 + 2 × 1/4 = 1.
Y counts the number of tails: E(Y) = 1 as well (by symmetry, switching the roles of H and T).
However,
P(XY=0) = 1/2 (HH, TT)
P(XY=1) = 1/2 (HT, TH)
E(XY) = 0 × 1/2 + 1 × 1/2 = 1/2, which is not equal to E(X)E(Y) = 1.
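Enumerating the four equally likely toss sequences reproduces these numbers. The sketch below is illustrative Python; `expect`, `X`, and `Y` are names chosen here, not from the slides.

```python
from fractions import Fraction
from itertools import product

tosses = list(product("HT", repeat=2))   # HH, HT, TH, TT, equally likely

def expect(f):
    return sum(Fraction(1, len(tosses)) * f(t) for t in tosses)

X = lambda t: t.count("H")   # number of heads
Y = lambda t: t.count("T")   # number of tails

print(expect(X), expect(Y))              # 1, 1
print(expect(lambda t: X(t) * Y(t)))     # 1/2, which differs from E(X) E(Y) = 1
```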
5.3 Variance
The average of a random variable tells us nothing about the spread of a probability
distribution. (matlab demo)
Thus we introduce the variance of a probability distribution:
Definition: The variance of a random variable X over a sample space S is given by:
variance
$$V(X) = \sum_{s \in S} \big(X(s) - E(X)\big)^2\,p(s) = E\big((X - E(X))^2\big)$$
$$= E(X^2) - 2\,E\big(X\,E(X)\big) + E\big(E(X)^2\big) = E(X^2) - 2\,E(X)^2 + E(X)^2 = E(X^2) - E(X)^2$$
standard deviation
$$\sigma(X) = \sqrt{V(X)}$$
(this is the width of the distribution)
5.3 Variance
Theorem: For independent random variables the variances add: (proof in book)
E ( X  Y )  E ( X )  E (Y ) (always true)
V ( X  Y )  V ( X )  V (Y )
( X , Y independent )
Example:
1) We toss 2 coins, Xi(H)=1, Xi(T)=0.
What is the STD of X=X1+X2?
X1 and X2 are independent.
V(X1+X2)=V(X1)+V(X2)=2V(X1)
E(X1)=1/2
V(X1) = (0 − 1/2)^2 × 1/2 + (1 − 1/2)^2 × 1/2 = 1/4
V(X) = 1/2
STD(X)=sqrt(1/2).
5.3 Variance
What is the variance of the number of successes when n independent
Bernoulli trials, each with probability p of success, are performed?
V(X) = V(X1+...+Xn)=nV(X1)
V(X1) = (0 − p)^2 × (1 − p) + (1 − p)^2 × p = p^2(1 − p) + p(1 − p)^2 = p(1 − p)
V(X)=np(1-p)
(matlab demo)
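The matlab demo mentioned above can be mimicked with a short simulation. This illustrative Python sketch (n, p, and the trial count are arbitrary choices) estimates the variance of the number of successes and compares it with np(1-p).

```python
import random

n, p, trials = 20, 0.3, 100_000

def successes():
    """One run of n Bernoulli trials; count the successes."""
    return sum(1 for _ in range(n) if random.random() < p)

samples = [successes() for _ in range(trials)]
mean = sum(samples) / trials
var = sum((x - mean) ** 2 for x in samples) / trials
print(var, n * p * (1 - p))   # both close to 4.2
```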