Section 8.1 ~ Sampling Distributions
Introduction to Probability and Statistics
Ms. Young

Sec. 8.1 Objective
After this section you will understand the fundamental ideas of sampling distributions and how the distribution of sample means and the distribution of sample proportions are formed. You will also learn the notation used to represent sample means and proportions.

Sec. 8.1 Reporting Statistics
How does someone come up with these statistics?
• The mean daily protein consumption by Americans is 67 grams.
• Nationwide, the mean hospital stay after delivery of a baby decreased from 3.2 days in 1980 to the current mean of 2.0 days.
• Thirty percent of high school girls in this country believe they would be happier being married than not married.
• About 5% of all American children live with a grandparent.
A sample is drawn from the population, the sample statistic is found, and an inference is made about the entire population based on what was found in the sample.
What is the difference between the first two statements and the last two statements?
• The first two statements give estimates of the mean of a quantity.
• The last two statements say something about a proportion of the population.

Sec. 8.1 Notation Review
n: sample size
μ (mu): population mean
x̄ (x-bar): sample mean
σ (sigma): population standard deviation
s: sample standard deviation
z: z-score
r: correlation coefficient
∑x (capital sigma): sum of x values
P(x): probability of x

Sec. 8.1 Sample Means: The Basic Idea
A distribution of sample means shows, typically as a histogram, the means of ALL possible samples of a particular size taken from a population.
Ex. ~ refer to supplemental activity
Concluding questions for the activity:
• What do you notice about the mean of the distribution of sample means in comparison to the population mean (242.4)? It is the same as the population mean.
• What do you notice about the histogram? As the sample size increases, the distribution narrows and clusters around the population mean.

Sec. 8.1 Sample Means Cont’d…
When you work with ALL possible samples of a given size from a population, the mean of the distribution of sample means is always the population mean.
Typically, the population is too large to calculate the means of all possible samples, so we calculate the mean of one sample, x̄, to estimate the population mean, μ. When all you have is the mean of a sample, that is your best estimate of the population mean.
When you are working with very large populations, as the sample size increases the distribution of sample means looks more and more like a normal distribution and clusters more tightly around the population mean. This allows us to make inferences about a population.

Sec. 8.1 Sample Means Cont’d…
In working with samples from a large population, you cannot expect the estimate of the population mean to be perfect. This is known as sampling error: the error that is introduced by working with a sample instead of the whole population.
The more samples that you gather, the better your estimate will be, but if you can only gather one sample, that is your best estimate.

Example of sampling error:
The following values are results from a survey of 400 students who were asked how many hours they spend per week using a search engine on the Internet. These 400 students are treated as the population.
n = 400
μ = 3.88
σ = 2.40

Suppose the following values were randomly selected to form a sample of 32 students:
Sample 1
1.1 3.8 1.7 7.8 5.7 2.1 6.8 6.5
1.2 4.9 2.7 0.3 3.0 2.6 0.9 6.5
1.4 2.4 5.2 7.1 2.5 2.2 5.5 7.8
5.1 3.1 3.4 5.0 4.7 6.8 7.0 6.5
The mean of this sample is x̄ = 4.17. We say that x̄ is a sample statistic because it comes from a sample of the entire population.

Now suppose a different sample of 32 students was selected from the 400:
Sample 2
1.8 0.4 4.0 5.2 5.7 6.5 0.5 3.9
3.1 2.4 1.2 5.8 0.8 5.4 2.9 6.2
5.7 7.2 0.8 7.2 0.9 6.6 5.1 4.0
5.7 3.2 7.9 2.5 3.6 3.1 5.0 3.1
For this sample, x̄ = 3.98.
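As a quick check of the two sample means, the following minimal Python sketch recomputes x̄ for each sample and its difference from μ = 3.88. The variable names are illustrative choices and are not part of the original slides.

```python
# Recompute the two sample means and their sampling errors.
# Names such as sample_1 and POPULATION_MEAN are illustrative assumptions.
from statistics import mean

POPULATION_MEAN = 3.88  # mu for all 400 students, as given in the slides

sample_1 = [
    1.1, 3.8, 1.7, 7.8, 5.7, 2.1, 6.8, 6.5,
    1.2, 4.9, 2.7, 0.3, 3.0, 2.6, 0.9, 6.5,
    1.4, 2.4, 5.2, 7.1, 2.5, 2.2, 5.5, 7.8,
    5.1, 3.1, 3.4, 5.0, 4.7, 6.8, 7.0, 6.5,
]
sample_2 = [
    1.8, 0.4, 4.0, 5.2, 5.7, 6.5, 0.5, 3.9,
    3.1, 2.4, 1.2, 5.8, 0.8, 5.4, 2.9, 6.2,
    5.7, 7.2, 0.8, 7.2, 0.9, 6.6, 5.1, 4.0,
    5.7, 3.2, 7.9, 2.5, 3.6, 3.1, 5.0, 3.1,
]

for label, sample in (("Sample 1", sample_1), ("Sample 2", sample_2)):
    x_bar = mean(sample)             # sample statistic
    error = x_bar - POPULATION_MEAN  # sampling error of this one sample
    print(f"{label}: n = {len(sample)}, x-bar = {x_bar:.2f}, "
          f"sampling error = {error:+.2f}")
```

Both samples contain 32 values, and the rounded means match the 4.17 and 3.98 quoted in the slides.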
Now you have two sample means that do not agree with each other (4.17 and 3.98, respectively), and neither one agrees with the true population mean (3.88). This is an example of sampling error.

In summary, when including all possible samples of size n, the characteristics of the distribution of sample means are as follows:
• The distribution of sample means is approximately a normal distribution.
• The mean of the distribution of sample means is the mean of the population.
• The standard deviation of the distribution of sample means depends on the population standard deviation and the sample size: it is σ/√n.

Sec. 8.1 Example 1 - Sampling Farms
Texas has roughly 225,000 farms, more than any other state in the United States. The actual mean farm size is μ = 582 acres and the standard deviation is σ = 150 acres. For random samples of n = 100 farms, find the mean and standard deviation of the distribution of sample means. What is the probability of selecting a random sample of 100 farms with a mean greater than 600 acres?
Solution:
The distribution of sample means is approximately normal, and its mean is the same as the mean of the entire population, which is 582 acres. The standard deviation of the sampling distribution is
σ/√n = 150/√100 = 150/10 = 15

Sec. 8.1 Example 1 - Sampling Farms
Solution: (cont.)
A sample mean of 600 acres therefore has a standard score of
z = (sample mean - pop. mean) / standard deviation = (600 - 582) / 15 = 1.2
According to the z-score table, this standard score is in the 88th percentile, so the probability of selecting a sample with a mean less than 600 acres is about 0.88. Thus, the probability of selecting a sample with a mean greater than 600 acres is about 1 - 0.88 = 0.12.

Sec. 8.1 Sample Proportions
Much of what you have learned about distributions of sample means carries over to distributions of sample proportions.
Suppose that instead of asking how many hours per week students spend using search engines, we took those same 400 students and asked them a simple Yes or No question: “Do you own a car?” (refer to the raw data on P.341)
If you counted carefully, you would find that 240 of the 400 responses are Y’s, so the exact proportion, or population proportion, is p = 240/400 = 0.6. This is a population parameter.

Sec. 8.1 Sample Proportions Cont’d…
When you take a sample to estimate the population proportion, you follow the same process as you do when taking a sample to estimate the population mean.
The sample proportion is represented with p̂ (read as p-hat).

Sec. 8.1 Example 2 ~ Analyzing a Sample Proportion
Consider the distribution of sample proportions shown on P.341. Assume that its population proportion is p = 0.6 and its standard deviation is 0.1. Suppose you randomly select the following sample of 32 responses:
YYNYYYYNYYYYYYNYYNYYYNYYNYYNYNYY
Compute the sample proportion, p̂, of Y’s in this sample. How far does it lie from the population proportion? What is the probability of selecting another sample with a proportion greater than the one you selected?
Solution:
The proportion of Y responses in this sample is
p̂ = 24/32 = 0.75

Sec. 8.1 Example 2 ~ Analyzing a Sample Proportion
Solution: (cont.)
Using a population proportion of 0.6 and a standard deviation of 0.1, we find that the sample statistic, p̂ = 0.75, has a standard score of
z = (sample proportion - pop. proportion) / standard deviation = (0.75 - 0.6) / 0.1 = 1.5
The sample proportion is 1.5 standard deviations above the mean of the distribution. Using the z-score chart, we see that a standard score of 1.5 corresponds to the 93rd percentile. The probability of selecting another sample with a proportion less than the one we selected is about 0.93.
Thus, the probability of selecting another sample with a proportion greater than the one we selected is about 1 - 0.93 = 0.07.
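
The table lookups in Examples 1 and 2 can be cross-checked numerically. The sketch below is one way to do it, assuming (as the slides do) that both sampling distributions are normal. The helper names normal_cdf and upper_tail are hypothetical, and the normal CDF is computed from math.erf instead of being read from a z-score table.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative probability, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def upper_tail(observed, center, sd):
    """Return (z, P(statistic > observed)) for a normal sampling distribution."""
    z = (observed - center) / sd
    return z, 1.0 - normal_cdf(z)


# Example 1: farms, mu = 582, sigma = 150, n = 100
sd_of_means = 150 / math.sqrt(100)  # sigma / sqrt(n) = 15
z_farm, p_farm = upper_tail(600, 582, sd_of_means)

# Example 2: car ownership, p = 0.6, standard deviation 0.1 (both given)
responses = "YYNYYYYNYYYYYYNYYNYYYNYYNYYNYNYY"
p_hat = responses.count("Y") / len(responses)  # 24 / 32
z_prop, p_prop = upper_tail(p_hat, 0.6, 0.1)

print(f"Example 1: z = {z_farm:.2f}, P(mean > 600) = {p_farm:.3f}")
print(f"Example 2: p-hat = {p_hat:.2f}, z = {z_prop:.2f}, "
      f"P(proportion > p-hat) = {p_prop:.3f}")
```

The computed tail probabilities are roughly 0.115 and 0.067, which agree with the slides' values of 0.12 and 0.07 once the percentiles are rounded to two digits.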