Mathematical statistics
Week 2/b: Finite sample properties of estimators
Péter Elek and Ádám Reiff
26th September, 2013
1. Unbiasedness, relative efficiency and MSE-criterion
Unbiasedness.
• Consider an estimator $\hat{\theta}$ that is used to estimate a population parameter $\theta$.
– $\hat{\theta}$ is a random variable and its distribution depends on the true value of $\theta$.
• $\hat{\theta}$ is said to be an unbiased estimator of $\theta$ if $E(\hat{\theta}) = \theta$ for all possible values of $\theta$.
• If $E(\hat{\theta}) \neq \theta$ for at least one value of $\theta$, then $\hat{\theta}$ is a biased estimator of $\theta$.
• The bias of $\hat{\theta}$ is $\mathrm{Bias}_{\hat{\theta}}(\theta) = E(\hat{\theta}) - \theta$.
– The bias is also a function of the true parameter value $\theta$.
Example.
• Example (from previous lecture): the (true) sampling distributions of the sample mean x̄ and the sample median m are:

v           0     1     2     3      4     5     6     8     9     12
Pr(x̄ = v)  1/27  3/27  3/27  1/27   3/27  6/27  3/27  3/27  3/27  1/27
Pr(m = v)   7/27  0     0     13/27  0     0     0     0     0     7/27
• Which is an unbiased estimator of the population mean?
Relative efficiency.
• Consider $\hat{\theta}_1$ and $\hat{\theta}_2$, which are two unbiased estimators of a population parameter $\theta$.
• The estimator $\hat{\theta}_1$ is said to be relatively more efficient than $\hat{\theta}_2$ (to estimate $\theta$) if $Var(\hat{\theta}_1) \le Var(\hat{\theta}_2)$ for all $\theta$, with strict inequality for at least one value of $\theta$.
– i.e. the variance of its sampling distribution is smaller.
Efficiency (best unbiased estimator).
• The unbiased estimator $\hat{\theta}$ is said to be efficient (or the best unbiased estimator) if it has the smallest sampling variance among all unbiased estimators.
– That is, for any other unbiased estimator $\hat{\theta}_2$, $Var(\hat{\theta}) \le Var(\hat{\theta}_2)$ for all possible values of $\theta$.
– Note: efficiency is sometimes defined in a different way.
• Since the sampling variances may depend on the true value of $\theta$ (and hence functions of $\theta$ are compared in the definition), a best unbiased estimator does not always exist.
Mean squared error (MSE) criterion.
• Comparing variances is useful only for unbiased estimators.
• To compare more general estimators, we can use the mean squared error.
• $MSE(\hat{\theta}) = E\left[(\hat{\theta} - \theta)^2\right]$.
• MSE criterion: we choose the estimator that has a smaller MSE.
• Proposition: $MSE(\hat{\theta}) = Var(\hat{\theta}) + \mathrm{Bias}^2(\hat{\theta})$.
Proof of $MSE(\hat{\theta}) = Var(\hat{\theta}) + \mathrm{Bias}^2(\hat{\theta})$.
• We have:
$$MSE(\hat{\theta}) = E\left[(\hat{\theta} - \theta)^2\right] = E\left\{\left[(\hat{\theta} - E(\hat{\theta})) - (\theta - E(\hat{\theta}))\right]^2\right\}$$
$$= E\left[(\hat{\theta} - E(\hat{\theta}))^2\right] - 2E\left[\hat{\theta} - E(\hat{\theta})\right]\left(\theta - E(\hat{\theta})\right) + \left(\theta - E(\hat{\theta})\right)^2.$$
• Here the first term is $Var(\hat{\theta})$, the second term is zero (since $E[\hat{\theta} - E(\hat{\theta})] = 0$ and $\theta - E(\hat{\theta})$ is a constant), and the third term is not a random variable, so it is $(\theta - E(\hat{\theta}))^2 = \mathrm{Bias}^2(\hat{\theta})$.
• Therefore $MSE(\hat{\theta}) = Var(\hat{\theta}) + \mathrm{Bias}^2(\hat{\theta})$.
Example (cont.).
• Which of the previous two sample statistics (sample mean and sample median) has smaller MSE at the particular true parameter value?
– "True parameter value" now means: the three possible outcomes occur
with equal probability.
• $Var(\bar{x}) = \frac{1}{27}(0-5)^2 + \frac{3}{27}(1-5)^2 + \ldots + \frac{1}{27}(12-5)^2 = 8.6667$, and $MSE(\bar{x}) = Var(\bar{x})$ because of unbiasedness.
• $Var(m) = \frac{7}{27}(0-4.5556)^2 + \frac{13}{27}(3-4.5556)^2 + \frac{7}{27}(12-4.5556)^2 = 20.9136$, and $MSE(m) > Var(m)$ because of the bias: $MSE(m) = 20.9136 + (4.5556 - 5)^2 = 21.1111$.
• Hence $\bar{x}$ is better than $m$ in terms of the MSE-criterion. (And also the former is unbiased, while the latter is not.)
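This MSE comparison can be reproduced by brute force. Below is a minimal Python sketch (not part of the original notes) that assumes, as the table above implies, an i.i.d. sample of size 3 drawn with replacement from a population taking the values 0, 3 and 12 with equal probability:

```python
from itertools import product
from statistics import mean, median

values = [0, 3, 12]        # assumed population from the previous lecture
true_mean = mean(values)   # = 5

# All 27 equally likely samples of size 3 (sampling with replacement)
samples = list(product(values, repeat=3))

def mse(statistic):
    # Average squared deviation of the statistic from the true mean
    return mean((statistic(s) - true_mean) ** 2 for s in samples)

print(mse(mean))    # ~8.6667  (= Var(xbar), since xbar is unbiased)
print(mse(median))  # ~21.1111 (= Var(m) + Bias^2 = 20.9136 + 0.1975)
```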
2. Finite sample properties of sample mean and sample variance
2.1. Properties of the sample mean
Properties of the sample mean of an i.i.d. sample.
• Suppose $X_1, X_2, \ldots, X_n$ is an i.i.d. sample from a distribution with unknown population mean (expected value) $\mu$ and variance $\sigma^2$.
• Then the sample mean is unbiased for µ.
– $E(\bar{X}) = E\left(\frac{\sum_{i=1}^n X_i}{n}\right) = \frac{1}{n} E\left(\sum_{i=1}^n X_i\right) = \frac{1}{n} \sum_{i=1}^n E(X_i) = \frac{1}{n} \sum_{i=1}^n \mu = \frac{1}{n}(n\mu) = \mu$.
• Its standard deviation is proportional to $1/\sqrt{n}$.
– $Var(\bar{X}) = Var\left(\frac{\sum_{i=1}^n X_i}{n}\right) = \frac{1}{n^2} Var\left(\sum_{i=1}^n X_i\right) = \frac{1}{n^2} \sum_{i=1}^n Var(X_i) = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}$ (using independence in the third step).
– $sd(\bar{X}) = \sigma/\sqrt{n}$
Best linear unbiased estimator (BLUE).
• Best linear unbiased estimator (BLUE): an unbiased estimator is BLUE if it has the smallest variance among all unbiased estimators that are linear combinations of the sample elements.
• The sample mean is not always the best unbiased estimator for µ. (There are
"weird" counter-examples.)
• But it is the best linear unbiased estimator for µ in the case of an i.i.d.
sample.
Proof of the BLUE-property of the sample mean in an i.i.d. sample.
• Let $\hat{\theta}$ be an arbitrary unbiased linear estimator: $\hat{\theta} = \sum_{i=1}^n a_i X_i$, with $E(\hat{\theta}) = \mu$.
• Then $E(\hat{\theta}) = E\left(\sum_{i=1}^n a_i X_i\right) = \sum_{i=1}^n a_i E(X_i) = \mu \sum_{i=1}^n a_i$, so by unbiasedness we have $\sum_{i=1}^n a_i = 1$.
• Also, $Var(\hat{\theta}) = Var\left(\sum_{i=1}^n a_i X_i\right) = \sum_{i=1}^n a_i^2 Var(X_i) = \sigma^2 \sum_{i=1}^n a_i^2$ by independence.
• Hence, since $Var(\bar{X}) = \frac{\sigma^2}{n}$, we only have to prove that $\sum_{i=1}^n a_i^2 \ge \frac{1}{n}$ if $\sum_{i=1}^n a_i = 1$.
• But this is true because of the inequality between the quadratic and arithmetic means: $\sqrt{\frac{\sum_{i=1}^n a_i^2}{n}} \ge \frac{\sum_{i=1}^n a_i}{n} = \frac{1}{n}$, so $\sum_{i=1}^n a_i^2 \ge \frac{1}{n}$. So $\bar{X}$ is indeed the Best Linear Unbiased Estimator of $\mu$.
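As a small numerical illustration (a sketch, not from the notes; the unequal weights below are arbitrary examples), the variance $\sigma^2 \sum a_i^2$ of a linear unbiased estimator is smallest for the equal weights $a_i = 1/n$ of the sample mean:

```python
# Variance of a linear unbiased estimator sum(a_i * X_i) in an i.i.d. sample
# is sigma^2 * sum(a_i^2); unbiasedness forces the weights to sum to 1.
sigma2, n = 4.0, 5

def est_variance(weights):
    assert abs(sum(weights) - 1.0) < 1e-12  # unbiasedness constraint
    return sigma2 * sum(a * a for a in weights)

print(est_variance([1 / n] * n))                # sample mean: sigma^2/n = 0.8
print(est_variance([0.4, 0.3, 0.1, 0.1, 0.1]))  # unequal weights: 1.12 > 0.8
```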
Properties of the sample mean in a normal sample.
• Suppose $X_1, X_2, \ldots, X_n$ is an i.i.d. sample from a $N(\mu, \sigma^2)$ distribution with unknown parameters.
• Then the sample mean is not only BLUE but it is the best unbiased estimator
for µ.
– i.e. it is the best among all (even nonlinear) estimators. We do not prove
this.
• Moreover, as a linear combination of normal random variables is also normally distributed, $\bar{X}$ is itself normally distributed: $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.
Exercise 1.
• A filling machine is set to pour 500 g of cereal into a box container. Denote the actual weight of cereal filled into the container by $X$, and assume that $X \sim N(500, 20^2)$.
• A random sample of $n = 25$ boxes is selected (i.e. $x_1, \ldots, x_{25}$ is drawn), and the plant manager stops the process if $\bar{x} > 510$ or $\bar{x} < 490$.
• What is the probability of stopping?
Solution.
• Since $X \sim N(500, 20^2)$ and $n = 25$, $\bar{X} \sim N\left(500, \frac{20^2}{25} = 16\right)$, and $\frac{\bar{X} - 500}{20/\sqrt{25}} \sim N(0, 1)$.
• Therefore
$$\Pr(\text{stop}) = 1 - \Pr(\text{no stop}) = 1 - \Pr(490 < \bar{X} < 510) = 1 - \Pr\left(\frac{490 - 500}{20/\sqrt{25}} < \frac{\bar{X} - 500}{20/\sqrt{25}} < \frac{510 - 500}{20/\sqrt{25}}\right)$$
$$= 1 - \Pr\left(-2.5 < \frac{\bar{X} - 500}{20/\sqrt{25}} < 2.5\right) = 1 - \left[\Phi(2.5) - \Phi(-2.5)\right] = 2 - 2\Phi(2.5) = 0.0124.$$
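The same probability can be checked numerically; a short sketch using scipy.stats (an assumed dependency, not part of the course material):

```python
from scipy.stats import norm

# sd(Xbar) = 20 / sqrt(25) = 4, so the stopping thresholds are 2.5 sd away
z = (510 - 500) / 4           # = 2.5
p_stop = 2 - 2 * norm.cdf(z)  # = 2 * (1 - Phi(2.5))
print(round(p_stop, 4))       # 0.0124
```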
2.2. Properties of the sample variance
"Ideal" sampling variance.
• The "ideal" sample variance is obtained when
we treat the expected value µ
P
n
2
i=1 (Xi −µ)
as known in the variance formula: s2ideal =
n
.
– "Ideal" because we have the generally unknown µ instead of X in the
expression.
• In an i.i.d. sample, s2ideal is an unbiased estimator of σ 2 .
– Proof: E s2ideal =
Pn
i=1
E [(Xi −µ)2 ]
n
=
nσ 2
n
= σ2
ns2
• Moreover, in a normally distributed sample, σideal
follows a chi-squared
2
distribution with n degrees of freedom.
2
2 Pn
ns2
Xi −µ
– Proof: s2ideal = σn
. Hence σideal
is the sum of squa2
i=1
σ
res of n independent standard normal variables.
Exercise 2.
• Assume that the size of the output of some production process is distributed normally: $X \sim N(10, 0.1^2)$ (i.e. the true variance of the size is known).
• We draw a sample of $n = 25$ observations from a large number of outputs.
• What is the probability that the ideal sample variance will exceed 0.014?
Solution.
• $\Pr(s^2_{ideal} > 0.014) = \Pr\left(\frac{n s^2_{ideal}}{\sigma^2} > \frac{25 \cdot 0.014}{0.01}\right) = \Pr\left(\frac{n s^2_{ideal}}{\sigma^2} > 35\right) = 1 - \Pr\left(\frac{n s^2_{ideal}}{\sigma^2} < 35\right) \approx 0.1$
• from the table of the chi-squared distribution with 25 degrees of freedom.
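Instead of the printed chi-squared table, the tail probability can be computed directly; a sketch assuming scipy.stats is available:

```python
from scipy.stats import chi2

# P(n * s2_ideal / sigma^2 > 35), where the statistic is chi-squared with 25 df
p = chi2.sf(35, df=25)  # survival function = 1 - CDF
print(round(p, 3))      # ~0.09, consistent with the ~0.1 read from a coarse table
```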
"Actual" uncorrected sample variance.
• "Actual" uncorrected sample variance is: s2 =
Pn
2
i=1
(Xi −X )
n
– Actual because µ is replaced by x
• E(s2 ) =
n−1 2
n σ ,
hence it is a biased estimator of σ 2 .
• Proof:
– $s^2 = \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n} = \frac{\sum_{i=1}^n \left[(X_i - \mu) - (\bar{X} - \mu)\right]^2}{n} = \ldots = \frac{\sum_{i=1}^n (X_i - \mu)^2}{n} - \left(\bar{X} - \mu\right)^2$.
– In the expected value we have $E\left[(X_i - \mu)^2\right] = Var(X_i) = \sigma^2$, and also we have $E\left[(\bar{X} - \mu)^2\right] = E\left[(\bar{X} - E(\bar{X}))^2\right] = Var(\bar{X}) = \frac{\sigma^2}{n}$.
– Therefore $E(s^2) = \sigma^2 - \frac{\sigma^2}{n} = \frac{n-1}{n}\sigma^2$, so $s^2$ is a biased estimator of $\sigma^2$.
Corrected sample variance.
• $s^{*2} = \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n-1} = \frac{n}{n-1} \cdot \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n}$ is unbiased for $\sigma^2$, and it is called the corrected sample variance.
• Moreover, in a normally distributed sample, $s^{*2} = \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n-1} = \ldots = \frac{n}{n-1}\left[\frac{\sum_{i=1}^n (X_i - \mu)^2}{n} - (\bar{X} - \mu)^2\right]$, so $\frac{(n-1)s^{*2}}{\sigma^2} = \sum_{i=1}^n \left(\frac{X_i - \mu}{\sigma}\right)^2 - \left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\right)^2$ is distributed as $\chi^2_{n-1}$.
– We "lose" one degree of freedom because $\mu$ was replaced by $\bar{X}$.
– The correct proof is based on induction (not covered in class).
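A small Monte Carlo sketch (illustrative only; numpy is an assumed dependency) of the bias of $s^2$ and the unbiasedness of $s^{*2}$ in a normal sample:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 10.0, 2.0, 5, 200_000

samples = rng.normal(mu, sigma, size=(reps, n))
s2 = samples.var(axis=1, ddof=0)       # uncorrected: divides by n
s2_star = samples.var(axis=1, ddof=1)  # corrected: divides by n - 1

print(s2.mean())       # ~3.2 = (n-1)/n * sigma^2  (biased downward)
print(s2_star.mean())  # ~4.0 = sigma^2            (unbiased)
```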
Further properties in case of a normal sample.
• Since $(n-1)s^{*2} = ns^2$, $\frac{ns^2}{\sigma^2} = \frac{(n-1)s^{*2}}{\sigma^2}$ also follows a $\chi^2_{n-1}$ distribution.
• Further, $Var\left(\frac{(n-1)s^{*2}}{\sigma^2}\right) = 2(n-1)$ (why?).
• Therefore $Var(s^{*2}) = \frac{\sigma^4}{(n-1)^2} Var\left(\frac{(n-1)s^{*2}}{\sigma^2}\right) = \frac{\sigma^4}{(n-1)^2} \cdot 2(n-1) = \frac{2\sigma^4}{n-1}$.
• By similar arguments it is easy to see that $Var(s^2) = \frac{2(n-1)\sigma^4}{n^2}$.
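These two variance formulas can be verified in the same Monte Carlo setup (again a sketch assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, n, reps = 2.0, 5, 500_000
samples = rng.normal(0.0, sigma, size=(reps, n))

print(samples.var(axis=1, ddof=1).var())  # ~8.00 = 2*sigma^4/(n-1)
print(samples.var(axis=1, ddof=0).var())  # ~5.12 = 2*(n-1)*sigma^4/n^2
```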
"Normalized" sample mean in a normal sample.
• "Normalized" sample mean:
X−µ
√
s∗ / n
• Suppose X1 , X2 , . . . , Xn is an i.i.d. sample from a N (µ, σ 2 ) distribution
with unknown parameters.
• We know that
–
–
X−µ
√
σ/ n
q
∼ N (0, 1) and
(n−1)s∗2 /σ 2
n−1
=
s∗
σ
q
∼
χ2n−1
n−1 .
• Further, these two random variables are independent.
– We do not prove this.
• So
X−µ
√
σ/ n
s∗ /σ
=
X−µ
√
s∗ / n
follows a tn−1 -distribution.
Exercise 3.
• Assume that the test-scores at the CEU math entry test are distributed normally.
• From a random sample of 25 students we find that $\bar{x} = 70$ and $s^{*2} = 400$.
• Find the "probability" that the expected value of the results lies in the $[60, 80]$ interval!
Solution.
• We know that $\frac{\bar{X} - \mu}{s^*/\sqrt{n}} \sim t_{24}$,
• so $\Pr(60 < \mu < 80) = \Pr(-80 < -\mu < -60) = \Pr\left(\frac{\bar{X} - 80}{20/\sqrt{25}} < \frac{\bar{X} - \mu}{s^*/\sqrt{n}} < \frac{\bar{X} - 60}{20/\sqrt{25}}\right) = \Pr\left(-2.5 < \frac{\bar{X} - \mu}{s^*/\sqrt{n}} < 2.5\right) = 0.98$,
• where the last result is taken from the table of the t-distribution with 24 degrees of freedom.
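The t-table lookup can also be reproduced directly; a sketch assuming scipy.stats:

```python
from scipy.stats import t

# P(-2.5 < T < 2.5) for T ~ t with 24 degrees of freedom
p = t.cdf(2.5, df=24) - t.cdf(-2.5, df=24)
print(round(p, 2))   # 0.98
```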
Material.
• Further exercises are on a separate sheet (with solutions).
• Wooldridge Appendix C.1-C.2
• Casella-Berger 5.1-5.3 (except for Theorem 5.2.11), 7.1, 7.3.1-7.3.2.
– only to the extent covered in the course