Chapter 9: Sampling Distributions

Introduction
In this chapter we study some relationships between population and sample characteristics. Generally, we are interested in population parameters such as the mean return, the variability of demand, or the proportion of defectives in a production line.

Such parameters are usually unknown. Therefore, we draw samples from the population and use them to make inferences about the parameters. This is done by constructing sample statistics that are closely related to the population parameters.

Samples are random, so a sample statistic is a random variable. As such, it has a sampling distribution. Sampling distributions for various statistics are studied in this chapter.

9.1 Sampling Distribution of the Mean

Example 1
A die is thrown infinitely many times. Let X represent the number of spots showing on any throw. The probability distribution of X is

x      1    2    3    4    5    6
p(x)   1/6  1/6  1/6  1/6  1/6  1/6

\[ E(X) = 1(1/6) + 2(1/6) + 3(1/6) + \dots + 6(1/6) = 3.5 \]
\[ V(X) = (1 - 3.5)^2 (1/6) + (2 - 3.5)^2 (1/6) + \dots + (6 - 3.5)^2 (1/6) = 2.92 \]

Throwing a die twice: sample mean
Suppose we want to estimate μ from the mean x̄ of a sample of size n = 2. What is the distribution of x̄?
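The question above can be explored by brute force. The following is a minimal sketch, not part of the original slides, that enumerates the 36 equally likely ordered pairs of throws and tabulates the sample mean of each. The variable names are illustrative, and exact fractions are used to avoid rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes of two throws of a fair die.
pairs = list(product(range(1, 7), repeat=2))

# Count how often each value of the sample mean occurs.
counts = Counter(Fraction(a + b, 2) for a, b in pairs)

# Sampling distribution of the sample mean for n = 2.
dist = {m: Fraction(c, len(pairs)) for m, c in sorted(counts.items())}

# Mean and variance of the sample mean, computed from its distribution.
mean_xbar = sum(m * p for m, p in dist.items())
var_xbar = sum((m - mean_xbar) ** 2 * p for m, p in dist.items())

for m in dist:
    print(f"x-bar = {float(m):.1f}   relative frequency = {counts[m]}/36")
print("mean of x-bar:", float(mean_xbar))
print("variance of x-bar:", float(var_xbar))
```

The mean of x̄ should equal the population mean of 3.5, and the variance of x̄ should come out at half the population variance of about 2.92.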
All the possible pairs of values for the two throws, and the mean of each pair, are:

Sample  Throws  Mean    Sample  Throws  Mean    Sample  Throws  Mean
 1      1,1     1.0     13      3,1     2.0     25      5,1     3.0
 2      1,2     1.5     14      3,2     2.5     26      5,2     3.5
 3      1,3     2.0     15      3,3     3.0     27      5,3     4.0
 4      1,4     2.5     16      3,4     3.5     28      5,4     4.5
 5      1,5     3.0     17      3,5     4.0     29      5,5     5.0
 6      1,6     3.5     18      3,6     4.5     30      5,6     5.5
 7      2,1     1.5     19      4,1     2.5     31      6,1     3.5
 8      2,2     2.0     20      4,2     3.0     32      6,2     4.0
 9      2,3     2.5     21      4,3     3.5     33      6,3     4.5
10      2,4     3.0     22      4,4     4.0     34      6,4     5.0
11      2,5     3.5     23      4,5     4.5     35      6,5     5.5
12      2,6     4.0     24      4,6     5.0     36      6,6     6.0

The distribution of x̄ when n = 2
Notice there are 36 possible pairs of values: 1,1 1,2 ... 1,6; 2,1 2,2 ... 2,6; ...; 6,1 6,2 ... 6,6. Calculating the relative frequency of each value of x̄, we have the following results:

x̄     Frequency  Relative frequency
1.0   1          1/36
1.5   2          2/36
2.0   3          3/36
2.5   4          4/36
3.0   5          5/36
3.5   6          6/36
4.0   5          5/36
4.5   4          4/36
5.0   3          3/36
5.5   2          2/36
6.0   1          1/36

For example, x̄ = 1 arises only from (1+1)/2; x̄ = 1.5 arises from (1+2)/2 and (2+1)/2; and x̄ = 2 arises from (1+3)/2, (2+2)/2 and (3+1)/2.

The Relationship between the sample size and the sampling distribution of the sample mean

\[ n = 5:\quad \mu_{\bar{x}} = 3.5,\quad \sigma^2_{\bar{x}} = .5833 \; (= \sigma^2_x / 5) \]
\[ n = 10:\quad \mu_{\bar{x}} = 3.5,\quad \sigma^2_{\bar{x}} = .2917 \; (= \sigma^2_x / 10) \]
\[ n = 25:\quad \mu_{\bar{x}} = 3.5,\quad \sigma^2_{\bar{x}} = .1167 \; (= \sigma^2_x / 25) \]

As the sample size changes, the mean of the sample mean does not change. As the sample size increases, the variance of the sample mean decreases. Also note the relationship between the sample size and the variance of the sample mean: in each case it equals the population variance divided by n. We'll formalize this relationship soon.

The Sample Variance
Demonstration: why is the variance of the sample mean smaller than the population variance? Let us take samples of two observations.

[Figure: a population plotted on an axis from 1 to 3, with the means of samples of two observations (1.5, 2, 2.5) marked.]

Compare the range of the population to the range of the sample mean.

The Central Limit Theorem
If a random sample is drawn from any population, the sampling distribution of the sample mean is:
normal if the parent population is normal;
approximately normal if the parent population is not normal, provided the sample size is sufficiently large.
The larger the sample size, the more closely the sampling distribution of x̄ will resemble a normal distribution.

The Parameters of the Sampling Distribution of x̄
The mean of x̄ is equal to the mean of the parent population:
\[ \mu_{\bar{x}} = \mu_x \]
The variance of x̄ is equal to the parent population variance divided by n:
\[ \sigma^2_{\bar{x}} = \frac{\sigma^2_x}{n} \]

The Sampling Distribution of x̄: Example

Example 2
The amount of soda pop in each bottle is normally distributed with a mean of 32.2 ounces and a standard deviation of .3 ounces. Find the probability that a bottle bought by a customer will contain more than 32 ounces.

Solution
The random variable X is the amount of soda in a bottle, with μ = 32.2 and σ = .3.
\[ P(X > 32) = P\left( \frac{X - \mu}{\sigma} > \frac{32 - 32.2}{.3} \right) = P(Z > -.67) = 0.7486 \]

Find the probability that a carton of four bottles will have a mean of more than 32 ounces of soda per bottle.

Solution
Define the random variable x̄ as the mean amount of soda per bottle, with μ_x̄ = 32.2 and σ_x̄ = .3/√4.
\[ P(\bar{x} > 32) = P\left( \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} > \frac{32 - 32.2}{.3/\sqrt{4}} \right) = P(Z > -1.33) = 0.9082 \]

Example 3
The average weekly income of B.B.A. graduates one year after graduation is $600. Suppose the distribution of weekly income has a standard deviation of $100. What is the probability that 35 randomly selected graduates have an average weekly income of less than $550?
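The probabilities in Example 2, and the one asked for in Example 3, can be checked numerically. The following is a minimal sketch, not part of the original slides, that uses `statistics.NormalDist` from the Python standard library; the helper names `prob_mean_above` and `prob_mean_below` are hypothetical. It assumes the sample mean is normal, or approximately so, with standard deviation σ/√n.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def prob_mean_above(threshold, mu, sigma, n=1):
    """P(sample mean > threshold) for samples of size n, assuming normality."""
    z = (threshold - mu) / (sigma / sqrt(n))  # standardize with sigma / sqrt(n)
    return 1 - Z.cdf(z)


def prob_mean_below(threshold, mu, sigma, n=1):
    """P(sample mean < threshold) for samples of size n, assuming normality."""
    return 1 - prob_mean_above(threshold, mu, sigma, n)


# Example 2: a single bottle, then the mean of a carton of four bottles.
single_bottle = prob_mean_above(32, mu=32.2, sigma=0.3)
carton_of_four = prob_mean_above(32, mu=32.2, sigma=0.3, n=4)

# Example 3: mean weekly income of 35 graduates below $550.
graduates = prob_mean_below(550, mu=600, sigma=100, n=35)

print(round(single_bottle, 4), round(carton_of_four, 4), round(graduates, 4))
```

Because the code does not round z to two decimals before looking up the probability, its results may differ from table-based values in the third or fourth decimal place.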
Solution
\[ P(\bar{x} < 550) = P\left( \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} < \frac{550 - 600}{100/\sqrt{35}} \right) = P(Z < -2.96) = 0.0015 \]

Example 3 (continued)
If a random sample of 35 graduates actually had an average weekly income of $550, what would you conclude about the validity of the claim that the average weekly income is $600?

Solution
With μ = 600, the probability of observing a sample mean as low as 550 is very small (0.0015). The claim that the mean weekly income is $600 is probably unjustified. It is more reasonable to assume that μ is smaller than $600, because then a sample mean of $550 becomes more probable.

9.2 Sampling Distribution of a Sample Proportion (p̂)
The parameter of interest for qualitative (nominal) data is the proportion of times a particular outcome (success) occurs in a given population. This is the motivation for studying the distribution of the sample proportion.

Let X be the number of times an event of interest takes place in a sample of size n. We call such an event a success, as in the definition used for the binomial experiment. The sample proportion is the number of successes divided by the sample size:
\[ \hat{p} = \frac{X}{n} \]

Since X is binomial, probabilities for p̂ can be calculated from the binomial distribution. Yet for inference about p̂ we prefer to use the normal approximation to the binomial.

Approximate Sampling Distribution of a Sample Proportion
From the laws of expected value and variance, it can be shown that
\[ \mu_{\hat{p}} = p \quad \text{and} \quad \sigma^2_{\hat{p}} = \frac{p(1-p)}{n} \]
Z is calculated by:
\[ Z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \]
If both np > 5 and n(1 − p) > 5, then Z is approximately standard normal.

Example 5
A state representative received 52% of the votes in the last election. One year later the representative wanted to study his popularity. If his popularity has not changed, what is the probability that more than half of a sample of 300 voters would vote for him?
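The question in Example 5 can be explored numerically. The following is a minimal sketch, not part of the original slides, that applies the normal approximation described above and, for comparison, sums the exact binomial probabilities. It assumes "more than half" of 300 voters means at least 151 successes; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 300, 0.52

# Rule of thumb from the slides for using the normal approximation.
assert n * p > 5 and n * (1 - p) > 5

# Normal approximation: p-hat is roughly normal with mean p
# and variance p(1 - p) / n.
se = sqrt(p * (1 - p) / n)
z = (0.50 - p) / se
approx = 1 - NormalDist().cdf(z)

# Exact binomial probability of at least 151 successes out of 300.
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(151, n + 1))

print(f"standard error = {se:.4f}, z = {z:.2f}")
print(f"normal approximation = {approx:.4f}, exact binomial = {exact:.4f}")
```

The exact binomial value need not match the approximation closely, since this sketch applies no continuity correction.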
Example 5: Solution
The number of respondents who prefer the representative is binomial with n = 300 and p = .52. Thus np = 300(.52) = 156 > 5 and n(1 − p) = 300(1 − .52) = 144 > 5, so the normal approximation can be applied:
\[ P(\hat{p} > .50) = P\left( \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} > \frac{.50 - .52}{.0288} \right) = P(Z > -.69) = .7549 \]

Using Sampling Distributions for Inference
Sampling distributions can be used to make inferences about population parameters. For example, consider an inference about the population mean. Generally, we compare the actual sample mean with a hypothesized value of the unknown population mean, and make an informed decision about how likely this hypothesis is.

Let us guess the value of μ and build a symmetrical interval around it, large enough to make it very likely that the sample mean falls inside it. If the sample mean falls outside the interval (which is very unlikely when the guess is correct), we tend to believe that μ differs from the value we guessed. The sampling distribution of the sample mean helps in performing the calculations.

[Figure: sampling distribution of x̄ centered at μ, with a large probability that x̄ falls inside [μ − D, μ + D].]

Suppose .95 is considered a sufficiently large probability that the sample mean falls inside the interval. Let us build a symmetrical interval around μ. Using the notation μ − D and μ + D we have:
\[ P(\mu - D \le \bar{x} \le \mu + D) = .95 \]

Performing the usual standardization, we find that the interval covering 95% of the distribution of the sample mean is:
\[ \mu - 1.96 \frac{\sigma}{\sqrt{n}} \le \bar{x} \le \mu + 1.96 \frac{\sigma}{\sqrt{n}} \]

Now let us apply this interval to Example 3, where μ = 600, σ = 100 and n = 35:
\[ P\left( \mu - 1.96 \frac{\sigma}{\sqrt{n}} \le \bar{x} \le \mu + 1.96 \frac{\sigma}{\sqrt{n}} \right) = .95 \]
\[ P\left( 600 - 1.96 \frac{100}{\sqrt{35}} \le \bar{x} \le 600 + 1.96 \frac{100}{\sqrt{35}} \right) = .95 \]
which reduces to
\[ P(566.9 \le \bar{x} \le 633.1) = .95 \]

Conclusion
There is a 95% chance that the sample mean falls within the interval [566.9, 633.1] if the population mean is 600.
Since the sample mean was 550, the population mean is probably not 600.

Optional: Sampling Distribution of the Difference Between Two Means
The difference between two means can become a parameter of interest when a comparison between two populations is studied. To make an inference about μ1 − μ2 we observe the distribution of x̄1 − x̄2.

9.3 Normal Distribution of the Difference Between Two Sample Means
The distribution of x̄1 − x̄2 is normal if the two samples are independent and the parent populations are normally distributed. If the two populations are not both normally distributed, but the sample sizes are 30 or more, the distribution of x̄1 − x̄2 is approximately normal.

Applying the laws of expected value and variance we have:
\[ \mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2 \]
\[ \sigma^2_{\bar{x}_1 - \bar{x}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \]
We can define:
\[ Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]

Example 6
The mean starting salaries of MBA students from two universities (WLU and UWO) are $62,000 (standard deviation $14,500) and $60,000 (standard deviation $18,300), respectively. What is the probability that the sample mean for WLU students will exceed the sample mean for UWO students? (n_WLU = 50; n_UWO = 60)

Example 6: Solution
We need to determine P(x̄1 − x̄2 > 0), where population 1 is WLU and population 2 is UWO.
\[ \mu_1 - \mu_2 = 62{,}000 - 60{,}000 = 2{,}000 \]
\[ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{14{,}500^2}{50} + \frac{18{,}300^2}{60}} = 3{,}128 \]
\[ P(\bar{x}_1 - \bar{x}_2 > 0) = P\left( \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} > \frac{0 - 2000}{3128} \right) = P(Z > -.64) = .5 + .2389 = .7389 \]
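The arithmetic in Example 6 can be reproduced with a few lines of code. The following is a minimal sketch, not part of the original slides, that assumes the difference of the two sample means is approximately normal, as stated above. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Population 1 is WLU, population 2 is UWO (values from Example 6).
mu1, sigma1, n1 = 62_000, 14_500, 50
mu2, sigma2, n2 = 60_000, 18_300, 60

# Standard deviation of the difference between the two sample means.
se_diff = sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# Standardize the value 0 and take the upper-tail probability.
z = (0 - (mu1 - mu2)) / se_diff
prob = 1 - NormalDist().cdf(z)

print(f"standard error = {se_diff:.0f}, z = {z:.2f}, P(x1-bar > x2-bar) = {prob:.4f}")
```

As with the earlier examples, the result may differ slightly from the table-based value because z is not rounded to two decimals.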