ANCILLARY STATISTICS AND BASU'S THEOREM

1. Introduction

Let $X = (X_1, \dots, X_n)$ be a random sample from $f_\theta$, where $\theta \in \Theta$. We say that a statistic $V$ is ancillary if the distribution of $V$ does not depend on $\theta$; that is, there exists a probability measure $\nu$ on the range of $V$ such that $P_\theta(V \in A) = \nu(A)$ for all $\theta \in \Theta$. Thus, we might think that ancillary statistics tell us nothing about the parameter $\theta$ and are in some sense the opposite of sufficient statistics. This is not completely true: as you will see later in the course, ancillary statistics can still be helpful in parameter estimation. However, we have the following theorem, which makes our intuition precise.

Theorem 1 (Basu's theorem). Let $X = (X_1, \dots, X_n)$ be a random sample from $f_\theta$, where $\theta \in \Theta$. If $V$ is an ancillary statistic, and $T$ is complete and sufficient for $\theta$, then $T$ and $V$ are independent with respect to $P_\theta$ for all $\theta \in \Theta$.

Proof. Let $\nu$ be such that $P_\theta(V \in A) = \nu(A)$ for all $\theta \in \Theta$. Fix $A$, a subset of the range of $V$, and set $Y := \mathbf{1}[V \in A]$. Consider $\varphi(T) = E(Y \mid T)$. We argued, in the proof of the Rao-Blackwell theorem, that the function $\varphi$ does not depend on $\theta$. Consider the function $u(t) = \varphi(t) - \nu(A)$, which also does not depend on $\theta$. Since $E_\theta Y = P_\theta(V \in A) = \nu(A)$ by ancillarity, we have that
\[
E_\theta u(T) = E_\theta E(Y \mid T) - \nu(A) = E_\theta Y - \nu(A) = \nu(A) - \nu(A) = 0,
\]
from which we deduce, by the completeness of $T$, that $u = 0$ and $\varphi(T) = \nu(A)$, $P_\theta$-almost surely, for all $\theta \in \Theta$.

This is enough to obtain the independence of $T$ and $V$. Consider $B$, a subset of the range of $T$. Let $Z = \mathbf{1}[T \in B]$. We have the following calculation:
\begin{align*}
P_\theta(T \in B, V \in A) &= E_\theta(ZY) \\
&= E_\theta E_\theta(ZY \mid T) && \text{property of conditional expectations} \\
&= E_\theta\big(Z \, E_\theta(Y \mid T)\big) && \text{taking out what is known} \\
&= E_\theta(Z \nu(A)) && \text{consequence of completeness} \\
&= \nu(A) P_\theta(T \in B) \\
&= P_\theta(T \in B) P_\theta(V \in A).
\end{align*}

2. Finding ancillary statistics

Let $X = (X_1, \dots, X_n)$. We say that $X$ is a random sample from a location model if there exist i.i.d. random variables $W_1, \dots, W_n$ such that $X$ has the same distribution as $(W_1 + \theta, \dots, W_n + \theta)$ for some unknown $\theta \in \Theta$. Here it is important to note that the distribution of $W_1$ is fixed and does not depend on $\theta$.

Exercise 2. Let $X = (X_1, \dots, X_n)$ be a random sample, where $X_1 \sim N(\mu, 1)$ and $\mu \in \mathbb{R}$ is unknown. Show that this is a random sample from a location model.

Solution. Let $Z_1, \dots, Z_n$ be i.i.d. standard normal random variables. Clearly, $Z_i + \mu \sim N(\mu, 1)$ and $X$ has the same distribution as $(Z_1 + \mu, \dots, Z_n + \mu)$.

We say that the statistic $V = v(X)$ is location-invariant if
\[
v(x + d) = v(x_1 + d, \dots, x_n + d) = v(x)
\]
for all $x$ and all $d \in \Theta$.

Lemma 3. Location-invariant statistics for location models are ancillary.

Proof. We have that
\[
v(X) \overset{d}{=} v(W + \theta) = v(W),
\]
for some $W = (W_1, \dots, W_n)$, where the $W_i$ are i.i.d. and do not depend on $\theta$. Since $W$ and $v$ do not depend on $\theta$, we are done. Here, by $v(X) \overset{d}{=} v(W + \theta)$ we mean that they are equal in distribution, that is, they have the same distribution.

Exercise 4. Show that the sample variance for a random sample from the normal distribution with unknown mean $\mu$ and known variance $\sigma^2 = 1$ is an ancillary statistic.

Solution. Let $x \in \mathbb{R}^n$. Recall that
\[
s^2(x) := \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2.
\]
Let $x + d = (x_1 + d, \dots, x_n + d)$. Observe that $\overline{x + d} = \bar{x} + d$, so that
\[
s^2(x + d) = \frac{1}{n-1} \sum_{i=1}^n \big(x_i + d - (\bar{x} + d)\big)^2 = s^2(x).
\]
Thus $S^2$ is a location-invariant statistic, and hence ancillary.

Let $X = (X_1, \dots, X_n)$. We say that $X$ is a random sample from a scale model if there exist i.i.d. random variables $W_1, \dots, W_n$ such that $X$ has the same distribution as $(\theta W_1, \dots, \theta W_n)$ for some unknown $\theta \in \Theta$.

Exercise 5. Let $X = (X_1, \dots, X_n)$ be a random sample, where $X_1 \sim \mathrm{Unif}(0, \theta)$, where $\theta > 0$. Show that this is a random sample from a scale model.

Solution. This follows immediately from the fact that if $W_1 \sim \mathrm{Unif}(0, 1)$, then $\theta W_1 \sim \mathrm{Unif}(0, \theta)$, for $\theta > 0$.

We say that the statistic $V = v(X)$ is scale-invariant if
\[
v(cx) = v(cx_1, \dots, cx_n) = v(x)
\]
for all $x$ and all $c \in \Theta$.

Exercise 6. Let $X = (X_1, \dots, X_n)$ be a random sample, where $X_1 \sim \mathrm{Unif}(0, \theta)$, where $\theta > 0$. Show that the statistic given by
\[
V = \frac{X_1 + \cdots + X_n}{X_1}
\]
is scale-invariant.

Solution. The $c$'s cancel: $v(cx) = \frac{cx_1 + \cdots + cx_n}{cx_1} = \frac{x_1 + \cdots + x_n}{x_1} = v(x)$.

Lemma 7. Scale-invariant statistics for scale models are ancillary.

Proof. We have that
\[
v(X) \overset{d}{=} v(\theta W) = v(W),
\]
for some $W = (W_1, \dots, W_n)$, where the $W_i$ are i.i.d. and do not depend on $\theta$. Since $W$ and $v$ do not depend on $\theta$, we are done.
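Invariance arguments like those in Lemmas 3 and 7 are easy to sanity-check by simulation. The following sketch is not part of the original notes; it is a minimal illustration assuming NumPy is available. It draws samples at two different parameter values and compares empirical quartiles of the location-invariant statistic $S^2$ and of the scale-invariant statistic $V = (X_1 + \cdots + X_n)/X_1$; up to Monte Carlo error, the quartiles should not depend on the parameter.

```python
# Simulation sketch (not from the notes; assumes NumPy): Lemmas 3 and 7 predict
# that the statistics below have the same distribution for every parameter value.
import numpy as np

rng = np.random.default_rng(0)
n, reps = 10, 100_000

# Location model: X_i ~ N(mu, 1). The sample variance S^2 is location-invariant,
# so its quartiles should not depend on mu (up to Monte Carlo error).
for mu in (0.0, 5.0):
    s2 = rng.normal(mu, 1.0, size=(reps, n)).var(axis=1, ddof=1)
    print(f"mu = {mu}: quartiles of S^2 =", np.percentile(s2, [25, 50, 75]))

# Scale model: X_i ~ Unif(0, theta). V = (X_1 + ... + X_n)/X_1 is scale-invariant,
# so its quartiles should not depend on theta.
for theta in (1.0, 10.0):
    x = rng.uniform(0.0, theta, size=(reps, n))
    v = x.sum(axis=1) / x[:, 0]
    print(f"theta = {theta}: quartiles of V =", np.percentile(v, [25, 50, 75]))
```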
3. Applying Basu's theorem

Let $X = (X_1, \dots, X_n)$ be a random sample from the normal distribution. An important quantity that you may have encountered in elementary statistics courses is
\[
T = \frac{\bar{X} - \mu}{S/\sqrt{n}}.
\]
Of course, if we replaced $S$ with $\sigma$, the (true) standard deviation, then $T \sim N(0, 1)$. However, in practice we may not know $\sigma$, so we may have to settle for the consistent point estimator $S$. Here $T$ is almost a standard normal random variable, and its distribution is so important that it has a name: Student's t-distribution with $n - 1$ degrees of freedom. The following proposition is useful in characterizing this distribution and computing its density.

Proposition 8. Let $X = (X_1, \dots, X_n)$ be a random sample from the normal distribution. The sample mean $\bar{X}$ is independent of the sample variance $S^2$.

Proof. Consider the mean $\mu \in \mathbb{R}$ to be the unknown parameter, and $\sigma^2$ to be known. In this setup, we already know that the sample mean is a sufficient statistic; we also argued earlier that $\bar{X}$ is complete. (One can also show that this normal family is of regular exponential class, so our usual theory applies.) We can also view this normal family as a location model, where $S^2$ is a location-invariant statistic, and thus ancillary. By Basu's theorem, we obtain that $\bar{X}$ and $S^2$ are independent. (A simulation sketch checking this independence numerically is given at the end of the section.)

Exercise 9 (Computing the density of Student's t-distribution). Let $X = (X_1, \dots, X_n)$ be a random sample from the normal distribution with mean $\mu$ and variance $\sigma^2$. We want to compute the distribution of
\[
T := \frac{\bar{X} - \mu}{S/\sqrt{n}}.
\]

(a) Let $Z \sim N(0, 1)$. Recall that we said that $Z^2 \sim \chi^2(1)$. In general, for a positive integer $r$, we say that a random variable $X \sim \Gamma(\alpha = r/2, \beta = 2)$ has a chi-squared distribution with parameter $r$ (referred to as the degrees of freedom). Show that these two definitions are consistent; that is, $Z^2 \sim \Gamma(1/2, 2)$.

(b) Show that $U := \frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n - 1)$. Hint: you might want to make use of the algebra that we used to directly prove that the sample mean is sufficient. You will also use Proposition 8.

(c) Let $V := \frac{\sqrt{n}}{\sigma}(\bar{X} - \mu) \sim N(0, 1)$. Some algebra gives that
\[
T = \frac{V}{\sqrt{U/(n-1)}}
\]
has a distribution that does not depend on $\sigma$ or $\mu$; it depends only on $n$.

(d) You know the densities of $U$ and $V$, and they are independent, so you can easily write down their joint density.

(e) From the joint density, you can obtain (via a change of variables and computing the Jacobian) the joint density of $(S, T)$, where $S = U$ and $T = V\sqrt{(n-1)/U}$, from which you can get the density of $T$.
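The independence asserted in Proposition 8 lends itself to a quick numerical sanity check. The sketch below is not from the original notes; it is a minimal illustration assuming NumPy. Zero correlation alone would not establish independence, so it also compares the conditional quartiles of $S^2$ given a small or large value of $\bar{X}$, which should agree up to Monte Carlo error.

```python
# Sanity check of Proposition 8 (an illustration, not a proof; assumes NumPy):
# for normal samples, Xbar and S^2 should be independent.
import numpy as np

rng = np.random.default_rng(1)
n, reps = 10, 200_000
x = rng.normal(0.0, 1.0, size=(reps, n))
xbar = x.mean(axis=1)
s2 = x.var(axis=1, ddof=1)

# Independence implies zero correlation...
print("corr(Xbar, S^2) =", np.corrcoef(xbar, s2)[0, 1])

# ...and that the law of S^2 is unchanged by conditioning on Xbar.
small = xbar < np.median(xbar)
print("S^2 quartiles | Xbar small:", np.percentile(s2[small], [25, 50, 75]))
print("S^2 quartiles | Xbar large:", np.percentile(s2[~small], [25, 50, 75]))
```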
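Finally, the claim of Exercise 9, namely that $T$ has a fixed distribution depending only on $n$, can be checked against a library implementation. The sketch below is not from the original notes; it assumes NumPy and SciPy are available and compares empirical quantiles of $T$ with the quantiles of `scipy.stats.t` with $n - 1$ degrees of freedom.

```python
# Numerical check of Exercise 9 (not from the notes; assumes NumPy and SciPy):
# T = (Xbar - mu) / (S / sqrt(n)) should follow Student's t with n - 1 degrees
# of freedom, whatever the values of mu and sigma.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, reps, mu, sigma = 5, 200_000, 3.0, 2.0

x = rng.normal(mu, sigma, size=(reps, n))
t_stat = (x.mean(axis=1) - mu) / (x.std(axis=1, ddof=1) / np.sqrt(n))

qs = [0.05, 0.25, 0.50, 0.75, 0.95]
print("empirical quantiles:", np.percentile(t_stat, [100 * q for q in qs]))
print("t(n-1) quantiles:   ", stats.t.ppf(qs, df=n - 1))
```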
end of midterm 2 coverage