Math 141
Lecture 8: Estimation
Albyn Jones
Library 304
[email protected]
www.people.reed.edu/~jones/courses/141
Last Time
Expected value of a sum of independent RV’s:
$$E(X_1 + X_2 + \cdots + X_n) = E(X_1) + E(X_2) + \cdots + E(X_n)$$
Variance of a sum of independent RV’s:
$$\mathrm{Var}(X_1 + X_2 + \cdots + X_n) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n)$$
$X \sim \mathrm{Binomial}(n, p)$:
$$E(X) = np$$
$$\mathrm{Var}(X) = \sigma_X^2 = npq = np(1 - p)$$
$$\mathrm{SD}(X) = \sigma_X = \sqrt{npq}$$
Aside on Randomization
Random Sample
For a finite population, every subset of n members of the
population is equally likely to be selected. If the population
is large, and n small relative to the population, it is
approximately the same as sampling with replacement,
yielding at least approximate independence.
Representative Sample
A Useless Idea! See the discussion in Zetterberg (2004).
The phrase quota sampling refers to the attempt to get
‘representative samples’.
The Point: randomization ensures independence, and
protects against selection bias.
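A small R sketch of the claim that, for a large population and a small sample, sampling without replacement is nearly the same as sampling with replacement. The population size N and sample size n are made-up values chosen only for illustration.

```r
# Hypothetical sizes, chosen only for illustration
N <- 1e6   # population size
n <- 100   # sample size

set.seed(141)
srs <- sample(N, n)                   # simple random sample (without replacement)
wr  <- sample(N, n, replace = TRUE)   # sample with replacement

# Probability that n draws WITH replacement never pick the same member twice
p_no_repeat <- prod(1 - (0:(n - 1)) / N)
p_no_repeat
```

When this probability is close to 1, the two sampling schemes almost always produce the same kind of sample, which is the sense in which independence holds approximately.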
Terminology
Definition: IID
(Mutually) Independent and Identically Distributed random
variables:
Independence: knowing the value of one RV gives no
information about the value of any other.
Identically Distributed: all random variables are drawn from
the same population or distribution. Thus they all have the
same expected value and variance: the population mean
and population variance.
Sums of IID Random Variables: I
Let $X_1, X_2, \ldots, X_n$ be a sample of $n$ IID RV’s from a
population with mean $\mu$ and standard deviation $\sigma$.
Let $S_n$ be their sum:
$$S_n = \sum_{i=1}^{n} X_i = X_1 + X_2 + \cdots + X_n$$
What are $E(S_n)$ and $\mathrm{SD}(S_n)$?
Sums of IID Random Variables: II
$X_1, X_2, \ldots, X_n$ are $n$ IID RV’s with mean $\mu$ and standard
deviation $\sigma$.
The expected value of a sum is the sum of the expected
values:
$$E(S_n) = \sum_{i=1}^{n} E(X_i) = E(X_1) + \cdots + E(X_n) = n\mu$$
The variance of a sum of independent RV’s is the sum of
their variances:
$$\mathrm{Var}(S_n) = \sum_{i=1}^{n} \mathrm{Var}(X_i) = \mathrm{Var}(X_1) + \cdots + \mathrm{Var}(X_n) = n\sigma^2$$
$$\mathrm{SD}(S_n) = \sqrt{\mathrm{Var}(S_n)} = \sqrt{n\sigma^2} = \sigma\sqrt{n}$$
Note: the fact that $\mathrm{SD}(S_n) \propto \sqrt{n}$, rather than $n$, is
tremendously important, as we shall see.
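The square-root growth of the SD of a sum can be checked by simulation. This is only a sketch: the normal population, its mean 5 and SD 2, and the number of replications are assumptions made for illustration.

```r
set.seed(141)
sigma <- 2                  # assumed population SD
ns    <- c(25, 100, 400)    # sample sizes, each 4 times the previous one
reps  <- 5000               # simulated sums per sample size (arbitrary)

# For each n, simulate many sums S_n and record their empirical SD
sd_sum <- sapply(ns, function(n) {
  draws <- matrix(rnorm(n * reps, mean = 5, sd = sigma), nrow = n)
  sd(colSums(draws))        # each column is one sample of size n
})

cbind(n = ns, simulated = sd_sum, theory = sigma * sqrt(ns))
```

Quadrupling n should roughly double the simulated SD rather than quadruple it.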
Statistics!
Definition: Statistic
Any function of the data!
Let $X_1, X_2, \ldots, X_n$ be our data. Examples:
The sample mean
$$\bar{X} = \frac{\sum X_i}{n}$$
The sample variance (why $n - 1$ instead of $n$?)
$$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$$
The sample median: for odd $n$, the middle observation,
which has rank $(n + 1)/2$. For even $n$, most use the
average of the two middle observations, with ranks $n/2$
and $n/2 + 1$.
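The three statistics can be computed directly from their definitions and compared with R's built-in functions. The data vector is made up for illustration. The second half is a rough simulation hinting at the answer to "why n - 1?", assuming a normal population with variance 4.

```r
x <- c(2.1, 3.4, 1.9, 5.0, 4.2, 3.3)   # made-up data, n = 6 (even)
n <- length(x)

xbar <- sum(x) / n                      # sample mean
s2   <- sum((x - xbar)^2) / (n - 1)     # sample variance, divisor n - 1

# Sample median from the ranks of the sorted data
xs  <- sort(x)
med <- if (n %% 2 == 1) xs[(n + 1) / 2] else (xs[n / 2] + xs[n / 2 + 1]) / 2

c(mean = xbar, var = s2, median = med)  # compare with mean(x), var(x), median(x)

# Why n - 1? Average both divisors over many samples of size 5
# drawn from an assumed Normal(0, 2^2) population (true variance 4).
set.seed(141)
sims <- replicate(20000, {
  y <- rnorm(5, mean = 0, sd = 2)
  c(div_n_minus_1 = var(y), div_n = var(y) * (5 - 1) / 5)
})
rowMeans(sims)   # dividing by n tends to underestimate the variance
```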
Sample Median Illustrated
[Figure: "Sample Medians". Two rows of points, labelled Odd and Even, plotted against Data (horizontal axis from 0 to 7), with the median of each data set marked.]
R Code for last graph
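# Made-up data sets: one with even n, one with odd n
Xeven <- 1:6
Xodd  <- seq(1.5, 5.5, by = 1)

# Even-n data along the line y = 1
plot(Xeven, rep(1, length(Xeven)), xlim = c(0, 7), ylim = c(0, 3),
     xlab = "Data", ylab = " ", pch = 19,
     col = "blue", yaxt = "n")
points(median(Xeven), 1, pch = 9, cex = 1.5)   # mark the median

# Odd-n data along the line y = 2
points(Xodd, rep(2, length(Xodd)), pch = 19, col = "red")
points(median(Xodd), 2, pch = 9, cex = 1.5)    # mark the median

# Add axis labels and title
axis(2, at = c(1, 2), labels = c("Even", "Odd"))
title("Sample Medians")

# Save a copy of the current plot for posterity
dev.copy(pdf, "Median.pdf")
dev.off()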
Averages of IID Random Variables
Suppose that $X_1, X_2, \ldots, X_n$ are $n$ IID RV’s sampled from a
population with mean $\mu$ and standard deviation $\sigma$. Let $\bar{X}$ be the
sample mean.
Recall that
$$E(aX + bY) = aE(X) + bE(Y)$$
Since
$$\bar{X} = \frac{\sum X_i}{n} = \frac{S_n}{n}$$
and $E(S_n) = n\mu$ we have
$$E(\bar{X}) = \frac{1}{n} E(S_n) = \frac{1}{n}\, n\mu = \mu$$
This fact is commonly taken to be a Good Feature: the
expected value of the sample mean is the population
mean!
Variance of an Average of IID RV’s
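Recall that $\mathrm{Var}(bX) = b^2\,\mathrm{Var}(X)$.
We know the variances add:
$$\mathrm{Var}(S_n) = \sum_{i=1}^{n} \mathrm{Var}(X_i) = \mathrm{Var}(X_1) + \cdots + \mathrm{Var}(X_n) = n\sigma^2$$
Therefore
$$\mathrm{Var}(\bar{X}) = \mathrm{Var}\!\left(\frac{S_n}{n}\right) = \frac{1}{n^2}\,\mathrm{Var}(S_n) = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}$$
and hence
$$\mathrm{SD}(\bar{X}) = \sqrt{\mathrm{Var}(\bar{X})} = \frac{\sigma}{\sqrt{n}}$$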
Terminology!
Definition: Standard Error
The standard deviation of an estimate (like $\bar{X}$) is called the
standard error, primarily to distinguish it from the standard
deviation of the population from which the data were sampled.
Suppose $X_1, X_2, \ldots, X_n$ are our data, and we estimate
$\mu = E(X_i)$ by $\bar{X}$. If $\mathrm{SD}(X_i) = \sigma$, then
$$\mathrm{SE}(\bar{X}) = \frac{\sigma}{\sqrt{n}}$$
Example: Coin tossing
Let $X$ be the number of Heads in $n$ independent tosses of a fair
coin. $X$ is the sum of $n$ Bernoulli(1/2) trials $Y_i$, with $E(Y_i) = 1/2$
and $\sigma = \sqrt{pq} = 1/2$.
Here $\bar{X} = X/n = \hat{p}$ is the sample proportion, and by the last
result:
$$E(\hat{p}) = \frac{1}{2}, \qquad \mathrm{SE}(\hat{p}) = \frac{1/2}{\sqrt{n}}$$
If $n = 100$, $\mathrm{SE}(\hat{p}) = \frac{1}{20} = .05$.
If $n = 400$, $\mathrm{SE}(\hat{p}) = \frac{1}{40} = .025$.
More data helps! Note: to get twice the precision, we need four
times the sample size.
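The two standard errors in the coin-tossing example can be checked by simulation. This is a sketch; the number of simulated experiments (reps) is an arbitrary choice.

```r
set.seed(141)
reps <- 10000                      # simulated experiments per n (arbitrary)
ns   <- c(100, 400)

se_sim <- sapply(ns, function(n) {
  p_hat <- rbinom(reps, size = n, prob = 0.5) / n   # sample proportions
  sd(p_hat)                                         # empirical standard error
})

rbind(simulated = se_sim, theory = 0.5 / sqrt(ns))
```

Going from n = 100 to n = 400 should roughly halve the simulated standard error.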
Sample Means are useful!
The fact that
$$E(\bar{X}) = \mu = E(X)$$
and
$$\mathrm{SE}(\bar{X}) = \sigma/\sqrt{n}$$
means that $\bar{X}$ is a useful estimator of $\mu$. It is right ‘on the
average’, and the larger the sample size, the smaller the SD. In
other words, more data gives us a better guess (smaller error)!
The Law of Large Numbers
Suppose that $X_1, X_2, \ldots, X_n$ are $n$ IID RV’s sampled from a
population with mean $\mu$ and standard deviation $\sigma$. Then
$$\bar{X} \xrightarrow{P} \mu$$
In other words, for large samples, with very high probability,
$$\bar{X} \approx \mu$$
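The Law of Large Numbers can be watched in action by tracking the sample mean as observations accumulate. A minimal sketch, assuming a normal population with mean 3 and SD 2 (values picked only for illustration):

```r
set.seed(141)
mu    <- 3                    # assumed population mean
sigma <- 2                    # assumed population SD
x <- rnorm(10000, mean = mu, sd = sigma)

# Sample mean after 1, 2, ..., 10000 observations
running_mean <- cumsum(x) / seq_along(x)

# Snapshots of the running mean at increasing sample sizes
running_mean[c(10, 100, 1000, 10000)]
```

The later snapshots should typically sit closer to mu than the early ones, although any single run can wander.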
Criteria for Estimation
What makes a good estimator?
Let $\hat{\theta}_n$ be an estimator of the parameter $\theta$ based on $n$
observations.
Unbiasedness: it has the right expected value:
$$E(\hat{\theta}_n) = \theta$$
Consistency: it gets close to the population value as the
sample size $n$ gets larger:
$$\hat{\theta}_n \xrightarrow{P} \theta$$
Small Mean Squared Error: we prefer estimators with
smaller MSE:
$$\mathrm{MSE}(\hat{\theta}_n) = E\big[(\hat{\theta}_n - \theta)^2\big]$$
the mean squared deviation from the target. If $E(\hat{\theta}_n) = \theta$,
the MSE is just the variance.
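The last remark is a special case of the standard bias-variance decomposition of the MSE. A short derivation, writing $b = E(\hat{\theta}_n) - \theta$ for the bias (the symbol $b$ is introduced here for convenience and is not part of the lecture's notation):

```latex
\begin{align*}
\mathrm{MSE}(\hat{\theta}_n)
  &= E\big[(\hat{\theta}_n - \theta)^2\big] \\
  &= E\big[(\hat{\theta}_n - E(\hat{\theta}_n) + b)^2\big] \\
  &= E\big[(\hat{\theta}_n - E(\hat{\theta}_n))^2\big]
     + 2b\,E\big[\hat{\theta}_n - E(\hat{\theta}_n)\big] + b^2 \\
  &= \mathrm{Var}(\hat{\theta}_n) + b^2 .
\end{align*}
```

The cross term vanishes because $E\big[\hat{\theta}_n - E(\hat{\theta}_n)\big] = 0$, so when $b = 0$ only the variance remains.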
Examples
Estimating the population mean
Suppose that $X_1, X_2, \ldots, X_n$ are $n$ IID RV’s sampled from a
population with mean $\mu$ and standard deviation $\sigma$. Are the
following estimators unbiased and/or consistent?
$\bar{X} = \dfrac{\sum X_i}{n}$
$X_1$
$\mathrm{median}(X_1, X_2, \ldots, X_n)$
$\dfrac{1 + \sum X_i}{n}$
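A simulation can suggest answers for the sample mean, the first observation, and the sample median before working them out on paper. This is a sketch under an assumed Normal(10, 3^2) population. For a symmetric population like this one the median targets the same value as the mean, which need not hold for skewed populations.

```r
set.seed(141)
mu    <- 10      # assumed population mean
sigma <- 3       # assumed population SD
reps  <- 5000    # simulated samples per sample size (arbitrary)

summarise_estimators <- function(n) {
  est <- replicate(reps, {
    x <- rnorm(n, mean = mu, sd = sigma)
    c(mean = mean(x), first = x[1], median = median(x))
  })
  # Row "average" speaks to bias; row "sd" speaks to consistency
  rbind(average = rowMeans(est), sd = apply(est, 1, sd))
}

small <- summarise_estimators(10)
large <- summarise_estimators(1000)
small
large
```

An estimator whose average stays near mu looks unbiased. One whose SD does not shrink as n grows, like the first observation alone, cannot be consistent.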
Estimating a Proportion
Suppose that $X_1, X_2, \ldots, X_n$ are $n$ IID Bernoulli($p$) RV’s. Let $\hat{p}$
be the sample proportion (aka $\bar{X}$). Are the following estimators
unbiased and/or consistent?
The sample proportion:
$$\hat{p}$$
The first trial:
$$X_1$$
The plus 4 estimator:
$$\hat{p}^{*} = \frac{2 + \sum X_i}{n + 4}$$
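Estimating a Proportion: $\hat{p}^{*} = (X + 2)/(n + 4)$
$X \sim \mathrm{Binomial}(n, p)$, so $E(X) = np$ and $\mathrm{Var}(X) = npq$.
First, observe that
$$\hat{p}^{*} = \frac{X + 2}{n + 4} = \frac{X}{n + 4} + \frac{2}{n + 4}$$
Thus
$$E(\hat{p}^{*}) = E\!\left(\frac{X + 2}{n + 4}\right) = E\!\left(\frac{X}{n + 4}\right) + \frac{2}{n + 4} = \frac{np}{n + 4} + \frac{2}{n + 4}$$
That is not equal to $p$ unless $p = 1/2$, so $\hat{p}^{*}$ is biased.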
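The expected value of the plus 4 estimator (X + 2)/(n + 4) can also be computed exactly by summing over the binomial distribution, which gives a numerical check on the algebra. The sample size n = 10 and the values of p are arbitrary illustrative choices.

```r
n <- 10
p <- c(0.1, 0.3, 0.5, 0.9)   # a few illustrative values of p
x <- 0:n                     # possible values of X ~ Binomial(n, p)

# Exact E[(X + 2)/(n + 4)]: sum of value times probability over all x
expected_plus4 <- sapply(p, function(pp) {
  sum((x + 2) / (n + 4) * dbinom(x, size = n, prob = pp))
})

cbind(p,
      expected = expected_plus4,
      formula  = (n * p + 2) / (n + 4),
      bias     = expected_plus4 - p)
```

The bias column shows the estimator being pulled toward 1/2: positive for p below 1/2, negative above, and zero at p = 1/2.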
On the Other Hand
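$\dfrac{2}{n + 4} \to 0$ as $n$ gets large, and
$$\frac{np}{n + 4} = p\,\frac{n}{n + 4} \to p$$
What happens to the variance?
$$\mathrm{Var}(\hat{p}^{*}) = \mathrm{Var}\!\left(\frac{X + 2}{n + 4}\right) = \mathrm{Var}\!\left(\frac{X}{n + 4}\right) = \frac{npq}{(n + 4)^2}$$
and
$$\frac{pq}{n + 4} \cdot \frac{n}{n + 4} \to 0$$
Thus $\hat{p}^{*} \xrightarrow{P} p$, and we have a consistent estimator. Since
$\mathrm{Var}(\hat{p}^{*}) < \mathrm{Var}(\hat{p})$, one is biased, the other has larger
variance. We have an interesting question: which one is
better?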
Compare MSE’s!
[Figure: "Mean Squared Error for n = 10". MSE (vertical axis, 0.000 to 0.025) plotted against p (horizontal axis, 0.0 to 1.0), with one curve for the MLE and one for the Plus4 estimator.]
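The two MSE curves for n = 10 can be computed in closed form from the variance and bias formulas for the sample proportion X/n and the plus 4 estimator (X + 2)/(n + 4). This is a sketch of one way to get the numbers, not necessarily the code behind the original figure. It uses the standard fact that MSE equals variance plus squared bias.

```r
n <- 10
p <- seq(0, 1, by = 0.01)
q <- 1 - p

# MLE p-hat = X/n is unbiased, so its MSE is just its variance
mse_mle <- p * q / n

# Plus 4 estimator (X + 2)/(n + 4): MSE = variance + bias^2
var_plus4  <- n * p * q / (n + 4)^2
bias_plus4 <- (n * p + 2) / (n + 4) - p
mse_plus4  <- var_plus4 + bias_plus4^2

# Range of p (on this grid) where the plus 4 estimator has the smaller MSE
range(p[mse_plus4 < mse_mle])
```

Calling `matplot(p, cbind(mse_mle, mse_plus4), type = "l")` afterwards would draw both curves. For n = 10 the plus 4 estimator wins over a broad middle range of p and loses near 0 and 1, where its bias dominates.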
Summary
Criteria for Estimators:
Unbiasedness, consistency, small mean squared error: we
want to get it right if we have enough data, and we want as
much precision as possible with the data we have.
Sample Means
With IID data, $\bar{X}$ is unbiased, and the SE is proportional to
$1/\sqrt{n}$:
$$\mathrm{SE}(\bar{X}) = \frac{\sigma_X}{\sqrt{n}}$$