SAMPLING DISTRIBUTION

Introduction
• In real life, calculating parameters of populations is usually impossible because populations are very large.
• Rather than investigating the whole population, we take a sample, calculate a statistic related to the parameter of interest, and make an inference.
• The sampling distribution of the statistic is the tool that tells us how close the statistic is to the parameter.

Sampling Distribution of the Mean
• An example
– A die is thrown infinitely many times. Let X represent the number of spots showing on any throw.
– The probability distribution of X is

x      1    2    3    4    5    6
p(x)   1/6  1/6  1/6  1/6  1/6  1/6

$E(X) = 1(1/6) + 2(1/6) + 3(1/6) + \dots = 3.5$
$V(X) = (1-3.5)^2(1/6) + (2-3.5)^2(1/6) + \dots = 2.92$

Throwing a die twice – sample mean
• Suppose we want to estimate $\mu$ from the mean $\bar{X}$ of a sample of size n = 2.
• What is the distribution of $\bar{X}$?

Sample  Outcome  Mean    Sample  Outcome  Mean    Sample  Outcome  Mean
1       1,1      1.0     13      3,1      2.0     25      5,1      3.0
2       1,2      1.5     14      3,2      2.5     26      5,2      3.5
3       1,3      2.0     15      3,3      3.0     27      5,3      4.0
4       1,4      2.5     16      3,4      3.5     28      5,4      4.5
5       1,5      3.0     17      3,5      4.0     29      5,5      5.0
6       1,6      3.5     18      3,6      4.5     30      5,6      5.5
7       2,1      1.5     19      4,1      2.5     31      6,1      3.5
8       2,2      2.0     20      4,2      3.0     32      6,2      4.0
9       2,3      2.5     21      4,3      3.5     33      6,3      4.5
10      2,4      3.0     22      4,4      4.0     34      6,4      5.0
11      2,5      3.5     23      4,5      4.5     35      6,5      5.5
12      2,6      4.0     24      4,6      5.0     36      6,6      6.0

The distribution of $\bar{X}$ when n = 2
Note: $\mu_{\bar{x}} = \mu_x$ and $\sigma^2_{\bar{x}} = \sigma^2_x / 2$.

$E(\bar{X}) = 1.0(1/36) + 1.5(2/36) + \dots = 3.5$
$V(\bar{X}) = (1.0-3.5)^2(1/36) + (1.5-3.5)^2(2/36) + \dots$
$= 1.46$

[Figure: histogram of the sampling distribution of $\bar{X}$ for n = 2. Horizontal axis: $\bar{x}$ from 1.0 to 6.0 in steps of 0.5. Vertical axis: probability, from 1/36 to 6/36.]

Sampling Distribution of the Mean

n = 5:   $\mu_{\bar{x}} = 3.5$,  $\sigma^2_{\bar{x}} = .5833 \; (= \sigma^2_x / 5)$
n = 10:  $\mu_{\bar{x}} = 3.5$,  $\sigma^2_{\bar{x}} = .2917 \; (= \sigma^2_x / 10)$
n = 25:  $\mu_{\bar{x}} = 3.5$,  $\sigma^2_{\bar{x}} = .1167 \; (= \sigma^2_x / 25)$

Notice that $\sigma^2_{\bar{x}}$ is smaller than $\sigma^2_x$. The larger the sample size, the smaller $\sigma^2_{\bar{x}}$. Therefore, $\bar{x}$ tends to fall closer to $\mu$ as the sample size increases.

SAMPLING DISTRIBUTION
• Let $X_1, X_2, \dots, X_n$ be a r.s. of size n from a population and let $T(x_1, x_2, \dots, x_n)$ be a real-valued (or vector-valued) function whose domain includes the sample space of $(X_1, X_2, \dots, X_n)$. Then the r.v. (or random vector) $Y = T(X_1, X_2, \dots, X_n)$ is called a statistic. The probability distribution of a statistic Y is called the sampling distribution of Y.

SAMPLING DISTRIBUTION
• The sample mean is the arithmetic average of the values in a r.s.:
$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n} = \frac{1}{n} \sum_{i=1}^{n} X_i$
• The sample variance is the statistic defined by
$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$
• The sample standard deviation is the statistic defined by $S = \sqrt{S^2}$.

SAMPLING FROM THE NORMAL DISTRIBUTION
Properties of the Sample Mean and Sample Variance
• Let $X_1, X_2, \dots, X_n$ be a r.s. of size n from a $N(\mu, \sigma^2)$ distribution. Then,
a) $\bar{X}$ and $S^2$ are independent r.v.s.
b) $\bar{X} \sim N(\mu, \sigma^2/n)$
c) $\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$

SAMPLING FROM THE NORMAL DISTRIBUTION
• Let $X_1, X_2, \dots, X_n$ be a r.s. of size n from a $N(\mu, \sigma^2)$ distribution. Then,
$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$
• Most of the time $\sigma$ is unknown, so we use
$\frac{\bar{X} - \mu}{S/\sqrt{n}}$

SAMPLING FROM THE NORMAL DISTRIBUTION
In statistical inference, Student's t distribution is very important: under the normal-sampling assumptions above, $\frac{\bar{X} - \mu}{S/\sqrt{n}}$ has a t distribution with n − 1 degrees of freedom.

SAMPLING FROM THE NORMAL DISTRIBUTION
• Let $X_1, X_2, \dots, X_n$ be a r.s. of size n from a $N(\mu_X, \sigma_X^2)$ distribution and let $Y_1, Y_2, \dots, Y_m$ be a r.s. of size m from an independent $N(\mu_Y, \sigma_Y^2)$ distribution.
• If we are interested in comparing the variability of the populations, one quantity of interest would be the ratio $\sigma_X^2 / \sigma_Y^2$, whose sample counterpart is $S_X^2 / S_Y^2$.

SAMPLING FROM THE NORMAL DISTRIBUTION
• The F distribution allows us to compare these quantities by giving the distribution of
$\frac{S_X^2 / S_Y^2}{\sigma_X^2 / \sigma_Y^2} = \frac{S_X^2 / \sigma_X^2}{S_Y^2 / \sigma_Y^2} \sim F_{n-1,\,m-1}$
• If $X \sim F_{p,q}$, then $1/X \sim F_{q,p}$.
• If $X \sim t_q$, then $X^2 \sim F_{1,q}$.

CENTRAL LIMIT THEOREM
If a random sample is drawn from any population with finite mean and variance, the sampling distribution of the sample mean is approximately normal for a sufficiently large sample size. The larger the sample size, the more closely the sampling distribution of $\bar{X}$ will resemble a normal distribution.

[Diagram: random variable X (population distribution) → random sample $(X_1, X_2, X_3, \dots, X_n)$ → sample mean $\bar{X}$ (sample mean distribution), as $n \to \infty$.]

Sampling Distribution of the Sample Mean
$\mu_{\bar{X}} = \mu$
$\sigma^2_{\bar{X}} = \frac{\sigma^2}{n}$  or  $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
If X is normal, $\bar{X}$ is normal. If X is non-normal, $\bar{X}$ is approximately normally distributed for sample size greater than or equal to 30.
$\bar{X} \sim N(\mu, \sigma^2/n)$,  $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$

EXAMPLE 1
• The amount of soda pop in each bottle is normally distributed with a mean of 32.2 ounces and a standard deviation of 0.3 ounces.
– Find the probability that a bottle bought by a customer will contain more than 32 ounces.
– Solution
• The random variable X is the amount of soda in a bottle.
$P(X > 32) = P\left(\frac{X - \mu}{\sigma} > \frac{32 - 32.2}{.3}\right) = P(Z > -.67) = 0.7486$

[Figure: normal curve centered at $\mu$ = 32.2, with the area 0.7486 to the right of x = 32 shaded.]

EXAMPLE 1 (contd.)
• Find the probability that a carton of four bottles will have a mean of more than 32 ounces of soda per bottle.
• Solution
– Define the random variable as the mean amount of soda per bottle.
$P(\bar{X} > 32) = P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} > \frac{32 - 32.2}{.3/\sqrt{4}}\right) = P(Z > -1.33) = 0.9082$

[Figure: two normal curves, one for X centered at $\mu$ = 32.2 (area 0.7486 to the right of x = 32) and one for $\bar{X}$ centered at $\mu_{\bar{x}}$ = 32.2 (area 0.9082 to the right of $\bar{x}$ = 32).]

Sampling Distribution of a Proportion
• The parameter of interest for nominal data is the proportion of times a particular outcome (success) occurs.
• To estimate the population proportion p we use the sample proportion.
The estimate of p is $\hat{p} = X/n$, where X is the number of successes.

Sampling Distribution of a Proportion
• Since X is binomial, probabilities about $\hat{p}$ can be calculated from the binomial distribution.
• Yet, for inference about $\hat{p}$ we prefer to use the normal approximation to the binomial whenever this approximation is appropriate.

Approximate Sampling Distribution of a Sample Proportion
• From the laws of expected value and variance, it can be shown that $E(\hat{p}) = p$ and $V(\hat{p}) = p(1-p)/n$.
• If both $np \ge 5$ and $n(1-p) \ge 5$, then
$Z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$
is approximately standard normally distributed.

EXAMPLE
– A state representative received 52% of the votes in the last election.
– One year later the representative wanted to study his popularity.
– If his popularity has not changed, what is the probability that more than half of a sample of 300 voters would vote for him?

EXAMPLE (contd.)
Solution
• The number of respondents who prefer the representative is binomial with n = 300 and p = .52. Thus, np = 300(.52) = 156 and n(1-p) = 300(1-.52) = 144 (both greater than 5).
$P(\hat{p} > .50) = P\left(\frac{\hat{p} - p}{\sqrt{p(1-p)/n}} > \frac{.50 - .52}{\sqrt{(.52)(1-.52)/300}}\right) = P(Z > -.69) = .7549$
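
The calculation in the election example can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the function names (`normal_cdf`, `prob_proportion_exceeds`, `exact_binomial_tail`) are hypothetical helpers introduced here for illustration, and only the standard library is assumed (`math.comb` requires Python 3.8 or later). It evaluates the normal approximation to $P(\hat{p} > .50)$ and, for comparison, the exact binomial probability $P(X > 150)$ mentioned earlier as the alternative.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_proportion_exceeds(threshold, p, n):
    """Normal approximation to P(p_hat > threshold).

    Uses Z = (p_hat - p) / sqrt(p(1-p)/n), as on the slides.
    Hypothetical helper name; no continuity correction is applied.
    """
    standard_error = math.sqrt(p * (1.0 - p) / n)
    z = (threshold - p) / standard_error
    return 1.0 - normal_cdf(z)


def exact_binomial_tail(k, p, n):
    """Exact P(X > k) for X ~ Binomial(n, p), summed term by term."""
    return sum(
        math.comb(n, x) * p**x * (1.0 - p) ** (n - x)
        for x in range(k + 1, n + 1)
    )


# Values from the example: n = 300 voters, p = .52, "more than half".
n, p = 300, 0.52
assert n * p >= 5 and n * (1 - p) >= 5  # rule of thumb from the slides

approx = prob_proportion_exceeds(0.50, p, n)
exact = exact_binomial_tail(150, p, n)  # more than 150 of 300 voters

print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```

The normal approximation comes out at roughly 0.756 because the sketch does not round z; the slides' .7549 corresponds to rounding z to −.69 before the table lookup. The exact binomial value differs somewhat because the approximation as stated uses no continuity correction.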