Download lecture 08 slides

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Probability and Estimation
Probability and Statistics
• Thought experiment
– Know the population
– Will sample n members at random
• What can we say about probability of sample statistics?
– p(M = x) for some value x
– p(M1 > M2)
– etc.
• All hypothetical
– Based on assumptions about population
– Worked out mathematically
– Not based on any real data
• Logic behind all inferential statistics
– IF the population has a certain distribution, what will the
probabilities be?
Marbles
• Half red, half blue. p(red) = 1/2
?
• 2/3 red, 1/3 blue. p(red) = 2/3
?
• Probability is just a bag of marbles
– Population = bag
– Sample = drawing from bag
• Sample four marbles
– p(2 reds, 2 blues) = 6/16
? = .375
Probability Distributions
• Probability distribution
– Distribution defined only by probabilities (not frequencies)
– Can think of as infinite population
• Discrete vs. continuous probability distributions
– Discrete: like histogram, but p(x) instead of f(x)
• Probability is height of each bar
– Continuous: density function
0.40
• Random variable
Probability
• Probability is area under the curve
0.30
– Variable defined by probability distribution
0.20
– Can take on one of many values
– Each value (or range of values) 0.10
has a probability
0.00
0
1
2
3
Number of reds
4
Expected Value
• Expected value
Mean of a probability distribution
E ( R)   x  p( R  x)
   x  p( x)
x
x
Probability
p(R = x)
0.40
x
0
1
2
3
4
0.30
0.20
0.10
0.00
0
1
2
3
R (number of reds)
4
p(x)
.0625
.25
.375
.25
.0625
xp(x)
0
0.25
0.75
0.75
0.25
 = 2.00
Properties of Expected Value
• Multiplying by a fixed number
– I give you $3 per red: E(3R) = 3E(R)
?
– E(cR) = cE(R)
• Adding a fixed number
– I give you $1 per red, plus a bonus $1: E(R + 1) = E(R)
? +1
– E(R + c) = E(R) + c
• Sum of two random variables
– E(R + G) = E(R)
? + E(G)
• Relationship to mean
– Expected value of any random variable equals its mean
E ( R)   x  p ( R  x)
x
 mean( R)
Sampling from a Population
• Sampling a single item (n = 1)
– What is E(X)?
x
æ
ö
E ç å X ÷ = E ( X[1] + ...+ X[n])
è sample ø
= E(X[1]) + ...+ E(X[n])
• Sampling many items
æ
ö
– What is E ç å X ÷ ?
è sample ø
= m + m + ...+ m = n × m
æ
ö
æ å X ö Eç å X ÷
è sample ø n × m
sample
ç
÷
E
=
=
=m
ç n ÷
n
n
è
ø
0.40
– What is E(mean(X))?
0.30
 E(M) = 
Probability
p(X = x)
E(X) = å x × p(x) = m
0.20
• Unbiased estimators
0.10
– Statistic a is unbiased estimator of parameter a if E(a) = a
– Sample mean is 0.00
unbiased estimator of population mean
0
1
2
x
3
4
Biased Estimators: The Example of Variance
• Population variance
– (X-)2 is unbiased estimator of s2
s2 =
å ( X - m) 2
pop
N
æ å (X -m)22 ö
å (X -m)
• Many observations:
2
2
ç
÷
sample
E
=
s
=
mean
(
X
m)
n
– Problem: We don't know 
ç n ÷
è
ø
å (X - M) 2
= E ( X - m) 2
(
sample
n
(
?
sample

•
å ( X - M )2
sample
)
å (X - M) 2
• Sample mean always shifted toward sample
(X – )2
is biased estimate of
n
s2
– Not a good definition for sample variance
M
)
n
<
å (X - m) 2
sample
n
)22ö
æ åå(X(X-M
-M )
sample
E çç n ÷÷ < s 2
è n ø
Sample Variance
• Goal: Define sample variance to be unbiased estimator
of population variance
– E(s2) = s2
• Problem: Obvious answer is biased
)22ö
æ åå(X(X-M
-M )
– E çsample n . ÷ < s 2
ç n ÷
è
ø
– M is always closer to X than  is
• Solution:
– s. 2
=
å (X -M )2
sample
n-1
– Unbiased: E(s2) = s2
• Sample standard deviation
s=
å (X -M )2
sample
n-1
Take-away Points
• Random variables and probability distributions
• Expected value
– Formula: E ( R)   x  p( R  x)
x
– Relationship to mean
– Adding and multiplying
• Samples and sample statistics as random variables
• Biased and unbiased estimators
– M is an unbiased estimator of : E(M) = 
å (X -M )2
– sample
is a biased estimator of s2
n
• Sample variance å (X -M )2
– Formula:
s2 =
sample
n-1
– Unbiased estimator of population variance
• Sample standard deviation:
s=
å (X -M )2
sample
n-1
Related documents