PSTAT 120B Probability and Statistics - Week 7
Fang-I Chu
University of California, Santa Barbara
May 15, 2013

Announcement
Office hour: Tuesday 11:00AM-12:00PM.
Please make use of office hours or email if you have questions about a homework problem.
Put a circle around your name on the roster if you bring your two blue books and hand them to me (after section).

Topics for review
Sufficiency
Exercise 9.37
Exercise 9.48
Exercise 9.65

Sufficiency
Let $Y_1, Y_2, \ldots, Y_n$ denote a random sample from a probability distribution with unknown parameter $\theta$. Then the statistic $U = g(Y_1, Y_2, \ldots, Y_n)$ is said to be sufficient for $\theta$ if the conditional distribution of $Y_1, Y_2, \ldots, Y_n$, given $U$, does not depend on $\theta$.

Theorem 9.4. Let $U$ be a statistic based on the random sample $Y_1, Y_2, \ldots, Y_n$. Then $U$ is a sufficient statistic for the estimation of a parameter $\theta$ if and only if the likelihood $L(\theta) = L(y_1, y_2, \ldots, y_n \mid \theta)$ can be factored into two nonnegative functions,
$$L(y_1, y_2, \ldots, y_n \mid \theta) = g(u, \theta) \times h(y_1, y_2, \ldots, y_n),$$
where $g(u, \theta)$ is a function only of $u$ and $\theta$, and $h(y_1, y_2, \ldots, y_n)$ is not a function of $\theta$.

Exercise 9.37
Let $X_1, X_2, \ldots, X_n$ denote $n$ independent and identically distributed Bernoulli random variables such that $P(X_i = 1) = p$ and $P(X_i = 0) = 1 - p$ for each $i = 1, 2, \ldots, n$. Show that $\sum_{i=1}^{n} X_i$ is sufficient for $p$ by using the factorization criterion given in Theorem 9.4.

Proof:
1. Information: the $X_i$ are $n$ i.i.d. random variables from a Bernoulli distribution with parameter $p$; pmf for $X_i$: $f(x \mid p) = p^{x}(1-p)^{1-x}$.
2. Goal: show that $\sum_{i=1}^{n} X_i$ is sufficient for $p$.
3. Bridge: the likelihood function is $L(p \mid \mathbf{x}) = p^{\sum_{i=1}^{n} x_i}(1-p)^{n - \sum_{i=1}^{n} x_i}$. Apply Theorem 9.4 with $g(u, p) = p^{u}(1-p)^{n-u}$, $u = \sum_{i=1}^{n} x_i$, and $h(\mathbf{x}) = 1$.
4. Fine tune: $U = \sum_{i=1}^{n} X_i$ is sufficient for $p$.

Exercise 9.48
Refer to Exercise 9.44. If $\beta$ is known,
(a) show that the Pareto distribution is in the exponential family;
(b) what is a sufficient statistic for $\alpha$?
(c) argue that there is no contradiction between your answer to this exercise and the answer you found in Exercise 9.44.

(a) Proof:
1. Information: the $Y_i$ are i.i.d. random variables from a Pareto distribution with parameter $\alpha$ and known $\beta$; pdf for $Y_i$: $f(y \mid \alpha, \beta) = \alpha \beta^{\alpha} y^{-(\alpha+1)} I(y \ge \beta)$.
Definition of exponential family: a family of pdfs is called an exponential family if it can be expressed as $f(y \mid \theta) = h(y)\, c(\theta) \exp\!\big(\sum_{i=1}^{k} w_i(\theta)\, t_i(y)\big)$.
2. Goal: show that the Pareto distribution is in the exponential family.
3. Bridge: rewrite the density function as $f(y \mid \alpha) = \alpha \beta^{\alpha} \exp\!\big(-(\alpha+1)\ln y\big)\, I(y \ge \beta)$. The likelihood function is $L(\alpha \mid \mathbf{y}) = \alpha^{n} \beta^{n\alpha} \exp\!\big(-(\alpha+1) \sum_{i=1}^{n} \ln y_i\big)$; note $\beta$ is known.
4. Fine tune: $h(\mathbf{y}) = 1$, $c(\alpha) = \alpha^{n} \beta^{n\alpha}$, $w(\alpha) = -(\alpha+1)$, and $t(\mathbf{y}) = \sum_{i=1}^{n} \ln y_i$. The Pareto distribution belongs to the exponential family!
Note: from the above we have also obtained that $\sum_{i=1}^{n} \ln Y_i$ is sufficient for $\alpha$.
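As a quick numerical sketch of the factorization idea in part (a) (not part of the exercise; the data values, the known $\beta = 2$, and the grid of $\alpha$ values below are made up for illustration): with $\beta$ known, the Pareto log-likelihood $n\ln\alpha + n\alpha\ln\beta - (\alpha+1)\sum_i \ln y_i$ depends on the data only through $\sum_i \ln y_i$, so two samples of the same size with equal $\sum_i \ln y_i$ give identical likelihood curves in $\alpha$.

```python
import numpy as np

# Hypothetical illustration: two samples with the same value of the
# sufficient statistic sum(log y) yield the same Pareto likelihood in alpha.
beta = 2.0                           # assumed known
y1 = np.array([2.5, 3.0, 4.0, 6.0])  # made-up data, all >= beta
y2 = np.array([2.0, 3.0, 5.0, 6.0])  # different data, same product (= 180)

def pareto_loglik(alpha, y, beta):
    # log L(alpha | y) = n log(alpha) + n*alpha*log(beta) - (alpha + 1) * sum(log y)
    n = len(y)
    return n * np.log(alpha) + n * alpha * np.log(beta) - (alpha + 1) * np.log(y).sum()

alphas = np.linspace(0.5, 5.0, 10)
print(np.log(y1).sum(), np.log(y2).sum())            # equal sufficient statistics
print(np.allclose(pareto_loglik(alphas, y1, beta),
                  pareto_loglik(alphas, y2, beta)))  # True: identical curves
```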
(b) Proof:
1. Information: the $Y_i$ are i.i.d. random variables from a Pareto distribution with parameter $\alpha$ and known $\beta$; pdf for $Y_i$: $f(y \mid \alpha, \beta) = \alpha \beta^{\alpha} y^{-(\alpha+1)} I(y \ge \beta)$.
2. Goal: find a sufficient statistic for $\alpha$.
3. Bridge: the likelihood function is $L(\alpha, \beta) = \alpha^{n} \beta^{n\alpha} \big(\prod_{i=1}^{n} y_i\big)^{-(\alpha+1)} \prod_{i=1}^{n} I(y_i \ge \beta)$; note $\beta$ is known. Apply Theorem 9.4 with $g(u, \alpha) = \alpha^{n} \beta^{n\alpha} u^{-(\alpha+1)}$, $u = \prod_{i=1}^{n} y_i$, and $h(\mathbf{y}) = \prod_{i=1}^{n} I(y_i \ge \beta)$, which does not involve $\alpha$ since $\beta$ is known.
4. Fine tune: $U = \prod_{i=1}^{n} Y_i$ is sufficient for $\alpha$.

(c) Proof:
1. Information: from part (a), we obtained the density in exponential form and the sufficient statistic $\sum_{i=1}^{n} \ln Y_i$; from part (b) (the same as Exercise 9.44), we have the sufficient statistic $\prod_{i=1}^{n} Y_i$ for $\alpha$.
2. Goal: argue that there is no contradiction between (a) and (b).
3. Bridge: $\sum_{i=1}^{n} \ln Y_i = \ln \prod_{i=1}^{n} Y_i$ (why? check it!).
4. Fine tune: we have got our proof!
Note: a one-to-one function of a sufficient statistic is sufficient as well.

Exercise 9.65
In this exercise, we illustrate the direct use of the Rao-Blackwell theorem. Let $Y_1, \ldots, Y_n$ be independent Bernoulli random variables with $p(y_i \mid p) = p^{y_i}(1-p)^{1-y_i}$, $y_i = 0, 1$. That is, $P(Y_i = 1) = p$ and $P(Y_i = 0) = 1 - p$. Find the MVUE of $p(1-p)$, which is a term in the variance of $Y_i$ and of $W = \sum_{i=1}^{n} Y_i$, by the following steps.

(a) Let
$$T = \begin{cases} 1, & \text{if } Y_1 = 1 \text{ and } Y_2 = 0, \\ 0, & \text{elsewhere.} \end{cases}$$
Show that $E(T) = p(1-p)$.

Proof:
1. Information: $Y_1, \ldots, Y_n$ are a random sample from a Bernoulli distribution with parameter $p$; pmf for $Y_i$: $p(y_i \mid p) = p^{y_i}(1-p)^{1-y_i}$, $y_i = 0, 1$.
2. Goal: show that $E(T) = p(1-p)$.
3. Bridge: $E(T) = P(T = 1) = P(Y_1 = 1, Y_2 = 0) = P(Y_1 = 1)\,P(Y_2 = 0) = p(1-p)$.
4. Fine tune: we got our proof!

(b) Show that
$$P(T = 1 \mid W = w) = \frac{w(n-w)}{n(n-1)}.$$

Proof:
1. Information: as in part (a).
2. Goal: show that $P(T = 1 \mid W = w) = \frac{w(n-w)}{n(n-1)}$.
3. Bridge: $W \sim \text{binomial}(n, p)$, and
$$P(T = 1 \mid W = w) = \frac{P(Y_1 = 1, Y_2 = 0, W = w)}{P(W = w)} = \frac{P(Y_1 = 1, Y_2 = 0, \sum_{i=3}^{n} Y_i = w - 1)}{P(W = w)} = \frac{P(Y_1 = 1)\,P(Y_2 = 0)\,P(\sum_{i=3}^{n} Y_i = w - 1)}{P(W = w)} = \frac{p(1-p)\binom{n-2}{w-1} p^{w-1}(1-p)^{n-2-(w-1)}}{\binom{n}{w} p^{w}(1-p)^{n-w}} \ (\text{why?}) = \frac{w(n-w)}{n(n-1)}.$$
4. Fine tune: we got our proof!
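A small Monte Carlo check of the formula in (b) (illustrative only; the choices $n = 10$, $w = 4$, the two values of $p$, the simulation size, and the helper name cond_prob_T_given_W are all arbitrary): among simulated Bernoulli samples with $W = w$, the fraction having $Y_1 = 1$ and $Y_2 = 0$ should be close to $w(n-w)/(n(n-1))$ and, because $W$ is sufficient, should not depend on $p$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, w, reps = 10, 4, 200_000   # arbitrary illustrative choices

def cond_prob_T_given_W(p):
    # Estimate P(Y1 = 1, Y2 = 0 | W = w) by simulation.
    y = rng.binomial(1, p, size=(reps, n))
    keep = y.sum(axis=1) == w                   # condition on W = w
    t = (y[keep, 0] == 1) & (y[keep, 1] == 0)   # the event {T = 1}
    return t.mean()

print("exact  :", w * (n - w) / (n * (n - 1)))  # w(n-w)/(n(n-1)) = 0.2666...
print("p = 0.3:", cond_prob_T_given_W(0.3))     # close to exact
print("p = 0.7:", cond_prob_T_given_W(0.7))     # close to exact, same value
```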
(c) Show that
$$E(T \mid W) = \frac{n}{n-1}\,\frac{W}{n}\left(1 - \frac{W}{n}\right) = \frac{n}{n-1}\,\bar{Y}(1 - \bar{Y}),$$
and hence that $\frac{n\bar{Y}(1 - \bar{Y})}{n-1}$ is the MVUE of $p(1-p)$.

Proof:
1. Information: from (a), $T$ is an unbiased estimator of $p(1-p)$; from (b), $P(T = 1 \mid W = w) = \frac{w(n-w)}{n(n-1)}$; $W$ is sufficient for $p$.
2. Goal: establish the display above and conclude that $\frac{n\bar{Y}(1 - \bar{Y})}{n-1}$ is the MVUE of $p(1-p)$.
3. Bridge: $W$ is sufficient for $p$, and hence for $p(1-p)$, and
$$E(T \mid W) = P(T = 1 \mid W) = \frac{W}{n}\cdot\frac{n - W}{n-1} = \frac{n}{n-1}\,\frac{W}{n}\left(1 - \frac{W}{n}\right) = \frac{n\bar{Y}(1 - \bar{Y})}{n-1}.$$
$E(T \mid W)$ is unbiased for $p(1-p)$ (why?), and it is a function of the sufficient statistic $W$ for $p(1-p)$.
4. Fine tune: we have got our proof!

Remark
1. To find sufficient statistics:
Make sure you know how to obtain the likelihood function; review the definition of the likelihood function from lecture.
For most sufficiency problems, the first step is usually to write out the likelihood function and then apply the factorization criterion.
It is legal to have $h(\mathbf{y}) = 1$.
2. Regarding the MVUE:
If $X$ is an MVUE of an unknown parameter $\theta$, then $E(X) = \theta$; i.e., an MVUE is an unbiased estimator.
$X$ must be a function of a sufficient statistic for $\theta$ (by Lehmann-Scheffé).
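As a closing sketch of the MVUE remark (again only an illustration; $n = 10$, $p = 0.3$, the seed, and the number of replications are arbitrary): averaging the estimator $n\bar{Y}(1-\bar{Y})/(n-1)$ from Exercise 9.65(c) over many simulated samples should reproduce $p(1-p)$, while the naive plug-in $\bar{Y}(1-\bar{Y})$ comes out low by the factor $(n-1)/n$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, reps = 10, 0.3, 200_000   # arbitrary illustrative choices

y = rng.binomial(1, p, size=(reps, n))
ybar = y.mean(axis=1)

mvue = n * ybar * (1 - ybar) / (n - 1)   # MVUE of p(1 - p) from Exercise 9.65(c)
plugin = ybar * (1 - ybar)               # naive plug-in estimator, biased low

print("target p(1-p) :", p * (1 - p))    # 0.21
print("mean of MVUE  :", mvue.mean())    # approximately 0.21 (unbiased)
print("mean of plugin:", plugin.mean())  # approximately 0.21 * (n-1)/n = 0.189
```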