Sampling Distributions

In general, you do not calculate values for parameters: because the population size N is too large, taking a census for this purpose is not practical. Instead, you estimate parameters from a sample using sample statistics. However, the value of a sample statistic differs from sample to sample. That is, if you repeatedly draw new samples (say, of the same size) from the same population and calculate the same sample statistic, you will get a set of different numerical values. This leads to the concept of a sampling distribution.

Sampling Distribution: the distribution of a sample statistic calculated from samples of n measurements drawn randomly from a population.

The sample is chosen randomly, so the individuals in it are there by chance. A sample statistic is calculated from those individuals, so the number you calculate is obtained by chance as well. The sample statistic is therefore a random variable, it has a distribution, and that distribution is called the sampling distribution.

Sampling distributions are very complicated to "list out" because there are so many possible samples that could be selected (especially when N is large). Instead, we derive or approximate the sampling distribution in other ways: probability theory, statistical mathematical theory, ... See Example 6.1 for a simple example of a sampling distribution obtained by listing out all the possible values ("outcomes") of the sample mean $\bar{X}$.

Question: Find the sampling distribution of the sample mean of the number of red M&M's in a random sample of 4 packages.

Sampling Distribution of the Sample Mean

Suppose a random sample of n observations is drawn from any population with mean $\mu$ and standard deviation $\sigma$, both unknown.
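The setup just described, drawing a random sample of n observations and computing the sample mean, can be repeated many times on a computer to see a sampling distribution empirically. The following is a minimal sketch, not part of the original notes. It assumes a hypothetical Exponential population with rate 1 (so the population mean and standard deviation are both 1), and the function name `simulate_sample_means` and the choices n = 30 and 20,000 repetitions are illustrative only.

```python
import random
import statistics
from math import sqrt


def simulate_sample_means(n, reps, rate=1.0, seed=0):
    """Draw `reps` random samples of size `n` from an Exponential(rate)
    population (an assumed, illustrative population) and return the
    sample mean of each one."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.expovariate(rate) for _ in range(n))
        for _ in range(reps)
    ]


n, reps, rate = 30, 20_000, 1.0
mu = sigma = 1.0 / rate  # Exponential(rate): mean and sd are both 1/rate

means = simulate_sample_means(n, reps, rate)

# The simulated sample means approximate the sampling distribution of the mean.
print(f"mean of sample means: {statistics.fmean(means):.3f}  (population mean {mu:.3f})")
print(f"sd of sample means:   {statistics.stdev(means):.3f}  (sigma/sqrt(n) = {sigma / sqrt(n):.3f})")
```

Each entry of `means` is one value of the sample statistic. A histogram of the whole list would be an approximate picture of its sampling distribution.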
If you decide to use $\bar{X}$ as a point estimator of $\mu$, there are two properties of the sampling distribution of $\bar{X}$ that hold regardless of the population from which the sample was drawn. Let $\mu_{\bar{X}}$ = mean of the sampling distribution of $\bar{X}$, and $\sigma_{\bar{X}}$ = standard deviation of the sampling distribution of $\bar{X}$. Then, irrespective of the population sampled,

1. $\mu_{\bar{X}} = \mu$
2. $\sigma_{\bar{X}} = \sigma / \sqrt{n}$

$\sigma_{\bar{X}}$ is often referred to as the standard error of the mean. These two properties hold whatever sample size is selected. Further, if the sample was selected from a Normal distribution, then the sampling distribution of $\bar{X}$ is exactly $N(\mu_{\bar{X}}, \sigma_{\bar{X}}^2)$.

What if the population distribution is not Normal? Can we still say that the sampling distribution of $\bar{X}$ is Normal? Approximately, yes, using the Central Limit Theorem (CLT). It says that the sampling distribution of $\bar{X}$ will be approximately Normal for sufficiently large n (say, $n \ge 30$), even when the sample is drawn from a non-Normal population, as long as that population is not heavily skewed. Note carefully that $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \sigma / \sqrt{n}$ hold whether or not the population distribution is Normal. The Normal approximation to the sampling distribution of $\bar{X}$ given by the CLT improves as the sample size n gets larger. See Figure 6.10 for a visual illustration of this.

Once we can say that the sampling distribution of $\bar{X}$ is Normal, we can use that fact to make statements about the probability of the sample mean falling in a specified range of values.

Example: A random sample of n = 36 is drawn from a Normal population with mean $\mu = 10$ and standard deviation $\sigma = 12$.

What are the mean and standard deviation of the sampling distribution of $\bar{X}$?

$\mu_{\bar{X}} = \mu = 10$
$\sigma_{\bar{X}} = \sigma / \sqrt{n} = 12 / \sqrt{36} = 2$

What is the shape of the sampling distribution of $\bar{X}$? Does your answer depend on sample size?

- The sampling distribution of $\bar{X}$ is Normal.
- No, since the population is Normally distributed.

Find $\Pr(\bar{X} > 11)$.
$$\Pr(\bar{X} > 11) = \Pr\left(\frac{\bar{X} - 10}{2} > \frac{11 - 10}{2}\right) = \Pr(Z > 0.50) = 0.5 - 0.1915 = 0.3085$$

Read Examples 6.7 and 6.8.

Example 6.8

A manufacturer of automobile batteries claims that the distribution of the lifetimes of its best battery has a mean of 54 months and a standard deviation of 6 months. Suppose a consumer group decides to check this claim by purchasing a sample of 50 batteries and subjecting them to tests that determine battery life.

Assuming that the manufacturer's claim is true, describe the sampling distribution of the mean lifetime of a sample of 50 batteries.

Using the CLT, the required distribution is approximately Normal. Furthermore, the mean of this distribution is $\mu_{\bar{X}} = \mu = 54$ months, and the standard deviation is

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{6}{\sqrt{50}} \approx 0.85 \text{ months}$$

What is the probability that the consumer group's sample has a mean life of 52 or fewer months?

The probability to be calculated is $\Pr(\bar{X} \le 52)$. Since $\bar{X}$ has an approximately Normal distribution with mean 54 and standard deviation 0.85, we can calculate this probability as follows:

$$\Pr(\bar{X} \le 52) = \Pr\left(\frac{\bar{X} - 54}{0.85} \le \frac{52 - 54}{0.85}\right) = \Pr(Z \le -2.35) = \Pr(Z > 2.35) = 0.5 - 0.4906 = 0.0094$$

Thus, the probability that the consumer group will observe a sample mean life of 52 months or less is 0.0094. If the 50 tested batteries do have a mean life of 52 or fewer months, the consumer group will have strong evidence that the manufacturer's claim is untrue, because the chance of that happening when the claim is true is very small.
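The two probability calculations in this section can also be checked numerically instead of with a Normal table. The sketch below is an illustration rather than part of the original notes. It uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later), and the helper name `sample_mean_distribution` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def sample_mean_distribution(mu, sigma, n):
    """Normal model for the sample mean: mean mu, standard error sigma/sqrt(n)."""
    return NormalDist(mu, sigma / sqrt(n))


# Normal-population example: mu = 10, sigma = 12, n = 36
xbar_a = sample_mean_distribution(10, 12, 36)
p_above_11 = 1 - xbar_a.cdf(11)  # Pr(sample mean > 11)

# Example 6.8 (batteries): mu = 54, sigma = 6, n = 50
xbar_b = sample_mean_distribution(54, 6, 50)
p_at_most_52 = xbar_b.cdf(52)  # Pr(sample mean <= 52)

print(f"standard errors: {xbar_a.stdev:.2f}, {xbar_b.stdev:.2f}")
print(f"Pr(mean > 11)  = {p_above_11:.4f}")
print(f"Pr(mean <= 52) = {p_at_most_52:.4f}")
```

The second probability comes out slightly below the table-based 0.0094, because the hand calculation rounds the standard error to 0.85 and z to 2.35 before the table lookup. The conclusion is unchanged.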