Distributions of sampling statistics
Chapter 6

Sample mean & sample variance

Sample vs. population
A population is a large collection of items that have measurable values associated with the experimental study. A proper sampling technique is adopted to select items, called a sample, from the large collection in order to draw conclusions about the population: from selecting the small part to inferring about the whole.

Definition of sample
If $X_1, X_2, X_3, \ldots, X_n$ are i.i.d. random variables with the distribution $F$, then they constitute a sample from the distribution $F$. The population distribution $F$ is usually not specified completely. Sometimes it is supposed that $F$ is specified up to a set of unknown parameters, which gives rise to the parametric inference problem. A statistic is a random variable whose value is determined by the sample data and which is used to make inferences about the unknown parameters.

The sample mean
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
$$E[\bar{X}] = E\left[\frac{X_1 + X_2 + \cdots + X_n}{n}\right] = \frac{1}{n}\left(E[X_1] + E[X_2] + \cdots + E[X_n]\right) = \mu$$
$$\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{X_1 + X_2 + \cdots + X_n}{n}\right) = \frac{1}{n^2}\left(n\sigma^2\right) = \frac{\sigma^2}{n}$$
c.f. population mean & variance: $\mu$, $\sigma^2$.
If the sample size $n$ increases, then the variance of the sample mean $\bar{X}$ decreases. See fig. 6.1.

The central limit theorem
Let $X_1, X_2, \ldots, X_n$ be a sequence of i.i.d. random variables each having mean $\mu$ and variance $\sigma^2$.
• The sum of a large number of independent random variables has a distribution that is approximately normal:
$X_1 + X_2 + \cdots + X_n$ is approximately normal with mean $n\mu$ and variance $n\sigma^2$, so
$$\frac{X_1 + X_2 + \cdots + X_n - n\mu}{\sigma\sqrt{n}}$$
is approximately a standard normal random variable.
See example 6.3b and p.206, the binomial trials.

Approximate distribution of the sample mean
$$\bar{X} = \sum_{i=1}^{n} X_i / n, \qquad \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim Z(0,1) \text{ (approximately)}$$
• See examples 6.3d, 6.3e.

How large a sample is needed?
If the underlying population distribution is normal, then the sample mean $\bar{X}$ will also be normal regardless of the sample size. A general rule of thumb is that one can be confident of the normal approximation whenever the sample size $n$ is at least 30.
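The results for the sample mean and the central limit theorem can be checked by simulation. The following is a minimal sketch, not part of the chapter: it assumes an Exponential(1) population (so $\mu = 1$ and $\sigma^2 = 1$), and the function names `simulate_sample_means` and `summarize` are illustrative. It estimates $E[\bar{X}]$ and $\mathrm{Var}(\bar{X})$ and reports the fraction of standardized sample means falling within $\pm 1.96$, which should be close to 0.95 if the normal approximation is reasonable.

```python
import random
import statistics


def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size n from Exponential(1); return their sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


def summarize(n, reps=20000, seed=0):
    """Return (mean of sample means, variance of sample means, share of |Z| <= 1.96)."""
    mu, sigma = 1.0, 1.0  # assumed population: Exponential(1) has mean 1, sd 1
    means = simulate_sample_means(n, reps, seed)
    # Standardize each sample mean: Z = (Xbar - mu) / (sigma / sqrt(n))
    z = [(m - mu) / (sigma / n ** 0.5) for m in means]
    within = sum(abs(v) <= 1.96 for v in z) / reps
    return statistics.fmean(means), statistics.pvariance(means), within


if __name__ == "__main__":
    for n in (5, 30, 100):
        mean_of_means, var_of_means, within = summarize(n)
        # Compare var_of_means with the theoretical value sigma^2 / n = 1 / n
        print(n, round(mean_of_means, 3), round(var_of_means, 4), 1 / n, round(within, 3))
```

For each sample size, the printed variance of the sample means can be compared with $\sigma^2/n$; the skewed population makes the coverage figure drift further from 0.95 for small $n$ than for $n \ge 30$.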
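The sample variance
Sample variance:
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$$
Sample standard deviation:
$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}}$$
Expanding the sum of squares:
$$(n-1)S^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} X_i^2 - n\bar{X}^2$$
Taking expectations:
$$(n-1)E[S^2] = E\left[\sum_{i=1}^{n} X_i^2\right] - nE[\bar{X}^2] = n\{\mathrm{Var}(X_i) + E[X_i]^2\} - n\{\mathrm{Var}(\bar{X}) + E[\bar{X}]^2\}$$
$$= n\sigma^2 + n\mu^2 - n(\sigma^2/n) - n\mu^2 = (n-1)\sigma^2$$
$$E[S^2] = \sigma^2$$
so $S^2$ is an unbiased estimator of $\sigma^2$.

Joint distribution of $\bar{X}$ and $S^2$
$$\sum_{i=1}^{n} (X_i - \mu)^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 + n(\bar{X} - \mu)^2$$
Dividing by $\sigma^2$:
$$\sum_{i=1}^{n} \left(\frac{X_i - \mu}{\sigma}\right)^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{\sigma^2} + \frac{n(\bar{X} - \mu)^2}{\sigma^2}$$
For a sample from a normal population:
• The left-hand side has a chi-square distribution with $n$ degrees of freedom.
• The term $n(\bar{X} - \mu)^2/\sigma^2$ has a chi-square distribution with 1 degree of freedom.
• The term $\sum_{i=1}^{n} (X_i - \bar{X})^2/\sigma^2 = (n-1)S^2/\sigma^2$ has a chi-square distribution with $n-1$ degrees of freedom.
In this case $\bar{X}$ and $S^2$ are independent.

Implications
If $X_1, X_2, X_3, \ldots, X_n$ is a sample from a normal population having mean $\mu$ and variance $\sigma^2$, then
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right), \qquad \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim Z(0,1)$$
$$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1)$$
$$\frac{\sqrt{n}\,(\bar{X} - \mu)}{S} = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$$

Sampling from a finite population
A binomial random variable $X$ with parameters $n$ and $p$ has
$$E[X] = np, \qquad \sigma^2 = np(1-p)$$
If $\bar{X}$ is the proportion of the sample that has a special characteristic, equal to $X/n$, then
$$E[\bar{X}] = E[X]/n = p, \qquad S/\sqrt{n} = \sqrt{p(1-p)/n}$$
where $S = \sqrt{p(1-p)}$ here denotes the standard deviation of a single trial.
By approximation:
$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{\bar{X} - p}{\sqrt{p(1-p)/n}} \sim Z(0,1)$$

Homework #5
Problems 8, 10, 15, 23, 28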
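Returning to the sample variance: the claim $E[S^2] = \sigma^2$ can be checked numerically. The following is a minimal sketch under stated assumptions, not part of the chapter: the population is taken to be normal with hypothetical parameters `mu=0.0` and `sigma=2.0`, and the function names are illustrative. It averages $S^2$ (divisor $n-1$) over many samples and, for comparison, the same sum of squares divided by $n$, whose expectation is $(n-1)\sigma^2/n$.

```python
import random


def sample_variance(xs):
    """S^2 = sum((x - xbar)^2) / (n - 1), the sample variance."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)


def average_variance_estimates(n, reps, mu=0.0, sigma=2.0, seed=1):
    """Average S^2 and the divisor-n estimator over `reps` normal samples of size n."""
    rng = random.Random(seed)
    total_s2 = 0.0
    total_biased = 0.0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        s2 = sample_variance(sample)
        total_s2 += s2
        total_biased += s2 * (n - 1) / n  # same sum of squares, divided by n
    return total_s2 / reps, total_biased / reps


if __name__ == "__main__":
    n, sigma = 5, 2.0
    avg_s2, avg_biased = average_variance_estimates(n, reps=20000, sigma=sigma)
    # Theoretical values: sigma^2 for S^2, (n - 1) * sigma^2 / n for the divisor-n version
    print(avg_s2, sigma ** 2)
    print(avg_biased, (n - 1) * sigma ** 2 / n)
```

A small $n$ is used on purpose, since the gap between the two divisors is largest there.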