ANCILLARY STATISTICS AND BASU’S THEOREM
1. Introduction
Let X = (X1 , . . . , Xn ) be a random sample from fθ , where θ ∈ Θ.
We say that a statistic V is ancillary if the distribution of V does not
depend on θ; that is, there exists a probability measure ν on the range
of V such that Pθ (V ∈ A) = ν(A) for all θ ∈ Θ. Thus, we might think
that ancillary statistics tell us nothing about the parameter θ and are
in some sense the opposite of sufficient statistics. This is not completely
true: as you will see later in the course, ancillary statistics can still
be helpful in parameter estimation. Nevertheless, the following
theorem makes precise our intuition.
Theorem 1 (Basu’s theorem). Let X = (X1 , . . . , Xn ) be a random
sample from fθ , where θ ∈ Θ. If V is an ancillary statistic, and T
is complete and sufficient for θ, then T and V are independent with
respect to Pθ for all θ ∈ Θ.
Proof. Let ν be such that Pθ (V ∈ A) = ν(A) for all θ ∈ Θ. Fix A, a subset
of the range of V , and set Y := 1[V ∈ A]. Consider φ(T ) = E(Y |T ).
We argued, in the proof of the Rao–Blackwell theorem, that the function φ does
not depend on θ. Consider the function
u(t) = φ(t) − ν(A),
which also does not depend on θ. We have that Eθ u(T ) = Eθ Eθ (Y |T ) −
ν(A) = Eθ Y − ν(A) = ν(A) − ν(A) = 0, from which we deduce from the completeness
of T that u = 0 and φ(T ) = ν(A), Pθ -almost surely, for all θ ∈ Θ. This
is enough to obtain the independence of T and V .
Consider B, a subset of the range of T . Let Z = 1[T ∈ B]. We have
the following calculation:
Pθ (T ∈ B, V ∈ A) = Eθ (ZY )
= Eθ Eθ (ZY |T )    (property of conditional expectations)
= Eθ (ZEθ (Y |T ))    (taking out what is known)
= Eθ (Zν(A))    (consequence of completeness)
= ν(A)Pθ (T ∈ B)
= Pθ (T ∈ B)Pθ (V ∈ A).
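As a numerical sanity check (not part of the notes), Basu's theorem can be illustrated by simulation. The sketch below assumes X1 , . . . , Xn i.i.d. Unif(0, θ), takes T = max(X1 , . . . , Xn ) (a standard example of a complete sufficient statistic for θ) and the scale-invariant, hence ancillary, V = (X1 + · · · + Xn )/X1 , and verifies one factorization identity implied by independence:

```python
import numpy as np

rng = np.random.default_rng(3)
theta, n, reps = 2.0, 5, 200000

# reps independent samples of size n from Unif(0, theta)
x = rng.uniform(0, theta, size=(reps, n))
t = x.max(axis=1)              # complete sufficient statistic for theta
v = x.sum(axis=1) / x[:, 0]    # scale-invariant, hence ancillary

# Under independence, P(T <= t0, V <= v0) = P(T <= t0) P(V <= v0);
# with t0, v0 the empirical medians, the right side is about 0.25.
a = t <= np.median(t)
b = v <= np.median(v)
joint = np.mean(a & b)
print(joint)
```

This checks only one event-pair factorization, a necessary consequence of independence, not a full proof; still, the empirical joint probability lands close to 0.25.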
2. Finding ancillary statistics
Let X = (X1 , . . . , Xn ). We say that X is a random sample from
a location model, if there exist i.i.d. random variables W1 , . . . , Wn
such that X has the same distribution as (W1 + θ, . . . , Wn + θ) for some
unknown θ ∈ Θ. Here it is important to note that the distribution of
W1 is fixed and does not depend on θ.
Exercise 2. Let X = (X1 , . . . , Xn ) be a random sample, where X1 ∼
N (µ, 1), where µ ∈ R is unknown. Show that this is a random sample
from a location model.
Solution. Let Z1 , . . . , Zn be i.i.d. standard normal random variables.
Clearly, Zi + µ ∼ N (µ, 1) and X has the same distribution as (Z1 +
µ, . . . , Zn + µ).
We say that the statistic V = v(X) is location-invariant if
v(x + d) = v(x1 + d, . . . , xn + d) = v(x)
for all x and all d ∈ Θ.
Lemma 3. Location-invariant statistics for location models are ancillary.
Proof. We have that
v(X) =d v(W + θ) = v(W ),
for some W = (W1 , . . . , Wn ), where the Wi are i.i.d. and do not depend
on θ. Since W and v do not depend on θ, we are done. Here, by
v(X) =d v(W + θ), we mean that they are equal in distribution; that
is, they have the same distribution.
Exercise 4. Show that sample variance for a random sample from the
normal distribution with unknown mean µ and known variance σ 2 = 1
is an ancillary statistic.
Solution. Let x ∈ Rn . Recall that
s2 (x) := (1/(n − 1)) Σi (xi − x̄)2 ,
where the sum is over i = 1, . . . , n. Let x + d = (x1 + d, . . . , xn + d).
Observe that the sample mean of x + d is x̄ + d, so that
s2 (x + d) = (1/(n − 1)) Σi (xi + d − (x̄ + d))2 = s2 (x).
Thus S 2 is a location-invariant statistic, and hence ancillary.
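The ancillarity of S 2 can also be seen by simulation. The sketch below (an illustration, not from the notes) draws many samples of size n from N (µ, 1) under two different values of µ and compares the resulting draws of the sample variance; both empirical distributions should agree, with mean σ 2 = 1:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 10, 20000

def sample_var_draws(mu):
    # rows: independent samples of size n from N(mu, 1)
    x = rng.normal(mu, 1.0, size=(reps, n))
    return x.var(axis=1, ddof=1)  # sample variance S^2

s2_a = sample_var_draws(0.0)   # mu = 0
s2_b = sample_var_draws(5.0)   # mu = 5
# Both collections of draws come from the same distribution,
# whose mean is sigma^2 = 1, regardless of mu.
print(s2_a.mean(), s2_b.mean())
```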
Let X = (X1 , . . . , Xn ). We say that X is a random sample from
a scale model, if there exist i.i.d. random variables W1 , . . . , Wn such
that X has the same distribution as (θW1 , . . . , θWn ) for some unknown
θ ∈ Θ.
Exercise 5. Let X = (X1 , . . . , Xn ) be a random sample, where X1 ∼
Unif(0, θ), where θ > 0. Show that this is a random sample from a
scale model.
Solution. This follows immediately from the fact that if W1 ∼ Unif(0, 1),
then θW1 ∼ Unif(0, θ), for θ > 0.
We say that the statistic V = v(X) is scale-invariant if
v(cx) = v(cx1 , . . . , cxn ) = v(x)
for all x and all c ∈ Θ.
Exercise 6. Let X = (X1 , . . . , Xn ) be a random sample, where X1 ∼
Unif(0, θ), where θ > 0. Show that the statistic given by
V = (X1 + · · · + Xn )/X1
is scale-invariant.
Solution. The c’s cancel: v(cx) = (cx1 + · · · + cxn )/(cx1 ) = (x1 +
· · · + xn )/x1 = v(x).
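For a concrete numerical check of this cancellation (illustrative only, with an arbitrary test vector):

```python
import numpy as np

x = np.array([0.3, 1.7, 0.9, 2.4])

def v(x):
    # V = (X1 + ... + Xn) / X1
    return x.sum() / x[0]

# Rescaling every coordinate by c > 0 leaves V unchanged: the c's cancel.
for c in (0.5, 2.0, 10.0):
    assert np.isclose(v(c * x), v(x))
print(v(x))
```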
Lemma 7. Scale-invariant statistics for scale models are ancillary.
Proof. We have that
v(X) =d v(θW ) = v(W ),
for some W = (W1 , . . . , Wn ), where the Wi are i.i.d. and do not depend
on θ. Since W and v do not depend on θ, we are done.
3. Applying Basu’s theorem
Let X = (X1 , . . . , Xn ) be a random sample from the normal distribution. An important quantity that you may have encountered in
elementary statistics courses is given by
T = (X̄ − µ)/(S/√n).
Of course, if we replaced S with σ, the (true) standard deviation,
then T ∼ N (0, 1). However, in practice we may not know σ, and we may
have to settle for the consistent point estimator S. Here, T is almost a
standard normal random variable, and the distribution of this random
variable is so important that it has a name: Student’s t-distribution
with n − 1 degrees of freedom. The following proposition is useful in
characterizing this distribution and computing its density.
Proposition 8. Let X = (X1 , . . . , Xn ) be a random sample from the
normal distribution. The sample mean X̄ is independent of the sample
variance S 2 .
Proof. Consider the mean µ ∈ R to be the unknown parameter, and
σ 2 to be known. In this setup, we already know that the sample mean
is a sufficient statistic; we also argued earlier that X̄ is complete. One
can also show that this normal family is of regular exponential class,
so our usual theory applies.
We can also view this normal family as a location model, where S 2 is
a location-invariant statistic, and thus ancillary. By Basu’s theorem,
we obtain that X̄ and S 2 are independent.
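A quick empirical illustration of this independence (not part of the notes): under normal sampling, the correlation between X̄ and S 2 — a necessary consequence of independence, though not sufficient for it — should be near zero, for any choice of µ and σ:

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 5, 50000

# reps independent samples of size n from N(2, 3^2); the particular
# mu and sigma are arbitrary choices for the demonstration.
x = rng.normal(2.0, 3.0, size=(reps, n))
xbar = x.mean(axis=1)          # sample mean
s2 = x.var(axis=1, ddof=1)     # sample variance
corr = np.corrcoef(xbar, s2)[0, 1]
print(corr)
```

For non-normal samples this correlation is generally nonzero, which is one way to see that the normality assumption in Proposition 8 matters.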
Exercise 9 (Computing the distribution of Student’s t-distribution).
Let X = (X1 , . . . , Xn ) be a random sample from the normal distribution
with mean µ and variance σ 2 . We want to compute the distribution of
T := (X̄ − µ)/(S/√n).
(a) Let Z ∼ N (0, 1). Recall that we said that Z 2 ∼ χ2 (1). In general,
for a positive integer r, we say that a random variable X ∼
Γ(α = r/2, β = 2) has a chi-squared distribution with parameter
r (which is referred to as the degrees of freedom). Show that these
two definitions are consistent; that is, Z 2 ∼ Γ(1/2, 2).
(b) Show that U := (n − 1)S 2 /σ 2 ∼ χ2 (n − 1). Hint: you might want to make
use of the algebra that we used to directly prove that the sample
mean is sufficient. You will also use Proposition 8.
(c) Let V := (√n/σ)(X̄ − µ) ∼ N (0, 1). Some algebra gives that
T = V /√(U/(n − 1))
has a distribution that does not depend on σ or µ; it only depends
on n.
(d) You know the density for U and V , and they are independent, so
you can easily write down the joint density.
(e) From the joint density, you can obtain (via a change of variables and
computing the Jacobian) the joint density of (S, T ), where S =
√U and T = V √((n − 1)/U ), from which you can get the density of T .
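As a numerical companion to this exercise (illustrative, not from the notes), one can simulate T under two different (µ, σ) settings and observe that its distribution does not depend on them; for instance, the variance of the t-distribution with ν = n − 1 degrees of freedom is ν/(ν − 2) = (n − 1)/(n − 3) for n > 3, a standard fact used here only as a check:

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 10, 100000

def t_draws(mu, sigma):
    # reps independent samples of size n from N(mu, sigma^2)
    x = rng.normal(mu, sigma, size=(reps, n))
    xbar = x.mean(axis=1)
    s = x.std(axis=1, ddof=1)          # sample standard deviation S
    return (xbar - mu) / (s / np.sqrt(n))

t1 = t_draws(0.0, 1.0)    # standard normal data
t2 = t_draws(-3.0, 7.0)   # very different mu and sigma
# Both should match the t(n-1) distribution, variance (n-1)/(n-3).
print(t1.var(), t2.var(), (n - 1) / (n - 3))
```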
end of midterm 2 coverage