Math 141 Lecture 8: Estimation
Albyn Jones
Library 304, [email protected], www.people.reed.edu/∼jones/courses/141

Last Time

Expected value of a sum of independent RVs:
$$E(X_1 + X_2 + \cdots + X_n) = E(X_1) + E(X_2) + \cdots + E(X_n)$$

Variance of a sum of independent RVs:
$$\mathrm{Var}(X_1 + X_2 + \cdots + X_n) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n)$$

$X \sim \mathrm{Binomial}(n, p)$:
$$E(X) = np, \qquad \mathrm{Var}(X) = \sigma_X^2 = npq = np(1 - p), \qquad \mathrm{SD}(X) = \sigma_X = \sqrt{npq}$$

Aside on Randomization

Random Sample: For a finite population, every subset of n members of the population is equally likely to be selected. If the population is large, and n is small relative to the population, this is approximately the same as sampling with replacement, yielding at least approximate independence.

Representative Sample: A Useless Idea! See the discussion in Zetterberg (2004). The phrase quota sampling refers to the attempt to get 'representative samples'.

The Point: randomization ensures independence, and protects against selection bias.

Terminology

Definition: IID. (Mutually) Independent and Identically Distributed random variables:
- Independence: knowing the value of one RV gives no information about the value of any other.
- Identically Distributed: all random variables are drawn from the same population or distribution. Thus they all have the same expected value and variance: the population mean and population variance.

Sums of IID Random Variables: I

Let $X_1, X_2, \ldots, X_n$ be a sample of n IID RVs from a population with mean $\mu$ and standard deviation $\sigma$. Let $S_n$ be their sum:
$$S_n = \sum_{i=1}^{n} X_i = X_1 + X_2 + \cdots + X_n$$
What are $E(S_n)$ and $\mathrm{SD}(S_n)$?

Sums of IID Random Variables: II

$X_1, X_2, \ldots, X_n$ are n IID RVs with mean $\mu$ and standard deviation $\sigma$.

The expected value of a sum is the sum of the expected values:
$$E(S_n) = \sum_{i=1}^{n} E(X_i) = E(X_1) + \cdots + E(X_n) = n\mu$$

The variance of a sum of independent RVs is the sum of their variances:
$$\mathrm{Var}(S_n) = \sum_{i=1}^{n} \mathrm{Var}(X_i) = \mathrm{Var}(X_1) + \cdots + \mathrm{Var}(X_n) = n\sigma^2$$
$$\mathrm{SD}(S_n) = \sqrt{\mathrm{Var}(S_n)} = \sqrt{n\sigma^2} = \sigma\sqrt{n}$$

Note: the fact that $\mathrm{SD}(S_n) \propto \sqrt{n}$, rather than $n$, is tremendously important, as we shall see.
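
The results $E(S_n) = n\mu$ and $\mathrm{SD}(S_n) = \sigma\sqrt{n}$ can be checked numerically. The following is a minimal simulation sketch, not part of the original slides: the fair six-sided die, the seed, the sample sizes, and the helper name `sim_sum` are all illustrative assumptions.

```r
# Simulation sketch: sums of n IID rolls of a fair six-sided die.
# The die, the seed, and the number of replications are illustrative choices.
set.seed(141)
mu    <- mean(1:6)                    # population mean of one roll
sigma <- sqrt(mean((1:6 - mu)^2))     # population SD of one roll

# Hypothetical helper: returns `reps` simulated values of S_n.
sim_sum <- function(n, reps = 10000) {
  # each column of the matrix is one sample of size n
  rolls <- matrix(sample(1:6, n * reps, replace = TRUE), nrow = n)
  colSums(rolls)
}

for (n in c(25, 100, 400)) {
  S <- sim_sum(n)
  cat(sprintf("n = %3d  mean(S) = %8.2f  n*mu = %7.1f  sd(S) = %6.2f  sigma*sqrt(n) = %6.2f\n",
              n, mean(S), n * mu, sd(S), sigma * sqrt(n)))
}
```

If the theory holds, each fourfold increase in n should roughly double the simulated SD of the sum while the mean grows fourfold.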

Statistics!

Definition: Statistic. Any function of the data!

Let $X_1, X_2, \ldots, X_n$ be our data. Examples:
- The sample mean: $\bar{X} = \frac{\sum X_i}{n}$
- The sample variance (why $(n-1)$ instead of $n$?): $s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$
- The sample median: For odd n, the middle observation, which has rank $(n + 1)/2$. For even n, most use the average of the two middle observations, with ranks $n/2$ and $n/2 + 1$.

Sample Median Illustrated

[Figure: "Sample Medians". Dot plots of an even-sized dataset (1 through 6) and an odd-sized dataset (1.5 through 5.5) along a "Data" axis running from 0 to 7, with the median of each marked at 3.5.]

R Code for last graph

    # make up datasets
    Xodd  <- seq(1.5, 5.5, 1)    # odd n (5 points)
    Xeven <- 1:6                 # even n (6 points)
    plot(Xeven, rep(1, 6), xlim = c(0, 7), ylim = c(0, 3),
         xlab = "Data", ylab = " ", pch = 19, col = "blue", yaxt = "n")
    points(3.5, 1, pch = 9, cex = 1.5)               # plot median
    points(Xodd, rep(2, 5), pch = 19, col = "red")   # odd n
    points(3.5, 2, pch = 9, cex = 1.5)               # plot median
    # add labels and title
    axis(2, at = c(1, 2), labels = c("Even", "Odd"))
    title("Sample Medians")
    dev.copy(pdf, "Median.pdf")  # save for posterity
    dev.off()
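
The sample variance definition asks why the divisor is $(n-1)$ instead of $n$. One way to explore that question is by simulation. The sketch below is not from the slides: the Normal population with variance 4, the sample size of 5, and all variable names are assumptions chosen for illustration. It compares the long-run average of the two candidate divisors.

```r
# Simulation sketch: average value of the variance estimate under both divisors.
# Population, sample size, seed, and replication count are illustrative choices.
set.seed(141)
sigma2 <- 4        # true variance of the simulated Normal(0, sd = 2) population
n      <- 5
reps   <- 20000

# each column is one sample of size n
samples <- matrix(rnorm(n * reps, mean = 0, sd = 2), nrow = n)
xbar    <- colMeans(samples)
# sum of squared deviations from each sample's own mean
ss      <- colSums(sweep(samples, 2, xbar)^2)

s2_n1 <- ss / (n - 1)   # divisor n - 1 (this is what R's var() uses)
s2_n  <- ss / n         # divisor n

cat("true variance:            ", sigma2, "\n")
cat("average with divisor n-1: ", mean(s2_n1), "\n")
cat("average with divisor n:   ", mean(s2_n), "\n")
```

Under these assumptions the divisor-n version should average noticeably below the true variance (by a factor of about $(n-1)/n$), while the $(n-1)$ version should average close to it.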

Averages of IID Random Variables

Suppose that $X_1, X_2, \ldots, X_n$ are n IID RVs sampled from a population with mean $\mu$ and standard deviation $\sigma$. Let $\bar{X}$ be the sample mean.

Recall that
$$E(aX + bY) = aE(X) + bE(Y)$$

Since
$$\bar{X} = \frac{\sum X_i}{n} = \frac{S_n}{n}$$
and $E(S_n) = n\mu$, we have
$$E(\bar{X}) = \frac{1}{n} E(S_n) = \frac{1}{n}\, n\mu = \mu$$

This fact is commonly taken to be a Good Feature: the expected value of the sample mean is the population mean!

Variance of an Average of IID RVs

Recall that $\mathrm{Var}(bX) = b^2\, \mathrm{Var}(X)$.

We know the variances add:
$$\mathrm{Var}(S_n) = \sum_{i=1}^{n} \mathrm{Var}(X_i) = \mathrm{Var}(X_1) + \cdots + \mathrm{Var}(X_n) = n\sigma^2$$

Therefore
$$\mathrm{Var}(\bar{X}) = \mathrm{Var}\!\left(\frac{S_n}{n}\right) = \frac{1}{n^2}\, \mathrm{Var}(S_n) = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}$$
and hence
$$\mathrm{SD}(\bar{X}) = \sqrt{\mathrm{Var}(\bar{X})} = \frac{\sigma}{\sqrt{n}}$$

Terminology!

Definition: Standard Error. The standard deviation of an estimate (like $\bar{X}$) is called the standard error, primarily to distinguish it from the standard deviation of the population from which the data were sampled.

Suppose $X_1, X_2, \ldots, X_n$ are our data, and we estimate $\mu = E(X)$ by $\bar{X}$. If $\mathrm{SD}(X_i) = \sigma$, then
$$\mathrm{SE}(\bar{X}) = \frac{\sigma}{\sqrt{n}}$$

Example: Coin tossing

Let X be the number of Heads in n independent tosses of a fair coin. X is the sum of n Bernoulli(1/2) trials $Y_i$, with $E(Y_i) = 1/2$ and $\sigma = \sqrt{pq} = 1/2$.
Here $\bar{Y} = X/n = \hat{p}$ is the sample proportion (the sample mean of the $Y_i$), and by the last result:
$$E(\hat{p}) = \frac{1}{2}, \qquad \mathrm{SE}(\hat{p}) = \frac{1/2}{\sqrt{n}}$$
If n = 100, $\mathrm{SE}(\hat{p}) = 1/20 = .05$. If n = 400, $\mathrm{SE}(\hat{p}) = 1/40 = .025$. More data helps!

Note: to get twice the precision, we need four times the sample size.

Sample Means are useful!

The fact that $E(\bar{X}) = \mu = E(X)$ and $\mathrm{SE}(\bar{X}) = \sigma/\sqrt{n}$ means that $\bar{X}$ is a useful estimator of $\mu$. It is right 'on the average', and the larger the sample size, the smaller its SD (the standard error). In other words, more data gives us a better guess (smaller error)!

The Law of Large Numbers

Suppose that $X_1, X_2, \ldots, X_n$ are n IID RVs sampled from a population with mean $\mu$ and standard deviation $\sigma$. Then
$$\bar{X} \to_P \mu$$
In other words, for large samples, with very high probability, $\bar{X} \approx \mu$.

Criteria for Estimation: what makes a good estimator

Let $\hat{\theta}_n$ be an estimator of the parameter $\theta$ based on n observations.
- Unbiasedness: It has the right expected value: $E(\hat{\theta}_n) = \theta$.
- Consistency: It gets close to the population value as the sample size n gets larger: $\hat{\theta}_n \to_P \theta$.
- Small Mean Squared Error: We prefer estimators with smaller MSE, the mean squared deviation from the target: $\mathrm{MSE}(\hat{\theta}_n) = E\big[(\hat{\theta}_n - \theta)^2\big]$. If $E(\hat{\theta}_n) = \theta$, the MSE is just the variance.
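
The coin-tossing standard errors (.05 for n = 100, .025 for n = 400) and the estimation criteria can be illustrated together with a short simulation. This is a sketch under assumptions that are not in the slides: the seed, the replication count, and the helper name `summarize_phat` are hypothetical.

```r
# Simulation sketch: bias, standard error, and MSE of the sample proportion
# for a fair coin. Seed, replication count, and helper name are illustrative.
set.seed(141)
p    <- 0.5
reps <- 20000

# Hypothetical helper: simulate `reps` sample proportions from n tosses each
summarize_phat <- function(n) {
  phat <- rbinom(reps, size = n, prob = p) / n
  c(n    = n,
    bias = mean(phat) - p,        # should be near 0 (unbiasedness)
    se   = sd(phat),              # compare with (1/2) / sqrt(n)
    mse  = mean((phat - p)^2))    # near se^2 when the bias is near 0
}

res <- rbind(summarize_phat(100), summarize_phat(400))
print(round(res, 4))
```

Quadrupling n from 100 to 400 should roughly halve the simulated standard error, which is the "four times the sample size for twice the precision" rule in action.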

Examples: Estimating the population mean

Suppose that $X_1, X_2, \ldots, X_n$ are n IID RVs sampled from a population with mean $\mu$ and standard deviation $\sigma$. Are the following estimators unbiased and/or consistent?
- $\bar{X} = \frac{\sum X_i}{n}$
- $X_1$
- $\mathrm{median}(X_1, X_2, \ldots, X_n)$
- $1 + \frac{\sum X_i}{n}$

Estimating a Proportion

Suppose that $X_1, X_2, \ldots, X_n$ are n IID Bernoulli(p) RVs.
Let $\hat{p}$ be the sample proportion (aka $\bar{X}$). Are the following estimators unbiased and/or consistent?
- The sample proportion: $\hat{p}$
- The first trial: $X_1$
- The plus 4 estimator: $\hat{p}^{\star} = \frac{2 + \sum X_i}{n + 4}$

Estimating a Proportion: $\hat{p}^{\star} = \frac{X + 2}{n + 4}$

$X \sim \mathrm{Binomial}(n, p)$, so $E(X) = np$ and $\mathrm{Var}(X) = npq$.

First, observe that
$$\hat{p}^{\star} = \frac{X + 2}{n + 4} = \frac{X}{n + 4} + \frac{2}{n + 4}$$
Thus
$$E(\hat{p}^{\star}) = E\!\left(\frac{X + 2}{n + 4}\right) = E\!\left(\frac{X}{n + 4}\right) + \frac{2}{n + 4} = \frac{np}{n + 4} + \frac{2}{n + 4}$$
That is not equal to p in general (it equals p only when p = 1/2), so $\hat{p}^{\star}$ is biased.

On the Other Hand

$\frac{2}{n + 4} \to 0$ as n gets large, and
$$\frac{np}{n + 4} = p\, \frac{n}{n + 4} \to p$$
What happens to the variance?
$$\mathrm{Var}(\hat{p}^{\star}) = \mathrm{Var}\!\left(\frac{X + 2}{n + 4}\right) = \mathrm{Var}\!\left(\frac{X}{n + 4}\right) = \frac{npq}{(n + 4)^2}$$
and
$$\frac{pq}{n + 4} \cdot \frac{n}{n + 4} \to 0$$
Thus $\hat{p}^{\star} \to_P p$, and we have a consistent estimator.

Since $\mathrm{Var}(\hat{p}^{\star}) < \mathrm{Var}(\hat{p})$, one estimator is biased and the other has larger variance. We have an interesting question: which one is better?

Compare MSEs!

[Figure: "Mean Squared Error for n = 10". MSE (vertical axis, 0.000 to 0.025) plotted against p (horizontal axis, 0.0 to 1.0) for the MLE and Plus4 estimators.]

Summary

Criteria for Estimators: Unbiasedness, consistency, small mean squared error: we want to get it right if we have enough data, and we want as much precision as possible with the data we have.

Sample Means: With IID data, $\bar{X}$ is unbiased, and the SE is proportional to $1/\sqrt{n}$:
$$\mathrm{SE}(\bar{X}) = \frac{\sigma_X}{\sqrt{n}}$$
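
The MSE comparison for n = 10 can be approximately reproduced from the formulas derived above. This is a sketch, not the code used for the original figure: it relies on the standard identity MSE = variance + bias², which the slides do not state explicitly, and the variable names and plotting choices are assumptions.

```r
# Sketch: exact MSE curves for the MLE (sample proportion) and the plus-4
# estimator when n = 10. Uses MSE = variance + bias^2 (standard identity,
# assumed here; not stated on the slides). Plot styling is illustrative.
n <- 10
p <- seq(0, 1, by = 0.01)
q <- 1 - p

# MLE: unbiased, so its MSE is just its variance pq/n
mse_mle <- p * q / n

# Plus 4: E = (np + 2)/(n + 4), Var = npq/(n + 4)^2
bias_plus4 <- (n * p + 2) / (n + 4) - p
var_plus4  <- n * p * q / (n + 4)^2
mse_plus4  <- var_plus4 + bias_plus4^2

plot(p, mse_mle, type = "l", lty = 1, ylim = c(0, 0.025),
     xlab = "p", ylab = "MSE", main = "Mean Squared Error for n = 10")
lines(p, mse_plus4, lty = 2)
legend("bottom", legend = c("MLE", "Plus4"), lty = c(1, 2))
```

Under these formulas the plus-4 estimator has the smaller MSE for p near 1/2, while the MLE has the smaller MSE for p near 0 or 1, so neither estimator dominates the other.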