Doing Statistics for Business: Data, Inference, and Decision Making
Marilyn K. Pelosi, Theresa M. Sandifer
Chapter 7: Sampling Distributions & Confidence Intervals

Chapter 7 Objectives
- Motivation for Point Estimators
- Common Point Estimators
- Desirable Properties of Point Estimators
- Distribution of the Sample Mean: Large Sample or Known σ
- The Central Limit Theorem: A More Detailed Look
- Drawing Inferences by Using the Central Limit Theorem
- Large Sample Confidence Intervals for the Mean
- Distribution of the Sample Mean: Small Sample and Unknown σ
- Small Sample Confidence Intervals for the Mean
- Confidence Intervals for Qualitative Data
- Sample Size Calculations

Figure 7.1 Relationship between probability and inferential statistics
[Diagram labels: Probability, Population, Sample, Inferential Statistics]

A Point Estimate is a single number calculated from sample data. It is used to estimate a parameter of the population.
A Point Estimator is the formula or rule that is used to calculate the point estimate for a particular set of data.

TRY IT NOW!
Sales of CDs: Comparing Point Estimators Calculated From Samples Selected From Different Populations
Use a software package such as Excel or Minitab to simulate picking a sample of size n = 10 from two different populations, or use the samples given below.

Store 1: 95 99 95 99 101 102 102 102 92 99
Store 2: 97 97 103 103 101 98 98 106 100 96

TRY IT NOW!
Sales of CDs: Comparing Point Estimators Calculated From Samples Selected From Different Populations (cont'd)
Suppose the first variable is daily sales of a new CD at Store 1. The second variable is daily sales of the new CD at Store 2. Select a sample of size n = 10 days from both stores. Assume that the daily sales at both stores are normally distributed with a mean of 100 and a standard deviation of 3.
Find X̄₁, X̄₂, and X̄₁ − X̄₂. What do you notice about the difference in sample means even though the population means are the same?
Find s₁² (the Store 1 standard deviation squared), s₂², and s₁²/s₂². What do you notice about the ratio of the two sample variances?

An Unbiased Estimator yields an estimate that is fair. It neither systematically overestimates the parameter nor systematically underestimates the parameter.

TRY IT NOW!
The Diaper Company: Comparing the Variability of Two Point Estimators
Weights in grams for the next 5 hourly samples taken at the diaper company are shown below:

Hour 6:   54.89  54.32  54.14  54.11  55.21
Hour 7:   55.06  55.72  55.18  54.05  55.40
Hour 8:   55.45  54.91  55.78  53.60  53.87
Hour 9:   55.23  54.40  55.37  55.97  55.09
Hour 10:  55.75  55.78  55.69  55.86  55.70

For each sample, calculate the sample mean and the sample median.
Find the average of the sample means and the average of the sample medians.
Find the standard deviation of the sample means and the standard deviation of the sample medians.
Which point estimator has less variability?

The probability distribution of a point estimator or a sample statistic is called a Sampling Distribution.
The Standard Error is the standard deviation of the sampling distribution of a point estimator. It measures how much the point estimator or sample statistic varies from sample to sample.

Central Limit Theorem (CLT)
In random sampling from a population with mean μ and standard deviation σ, when n is large enough, the distribution of X̄ is approximately normal with a mean equal to μ and a standard deviation equal to σ/√n.

Figure 7.2 The Diaper Company: Histogram of Individual Diaper Weights
[Histogram; x-axis: Weight (grams), y-axis: Frequency]

Figure 7.3 The Diaper Company: Histogram of 52 Sample Means
[Histogram of Average Diaper Weight; x-axis: Average Weight (grams), y-axis: Frequency]

Figure 7.4 Graphs of two populations with the same mean but different standard deviations

Figure 7.5 Dotplots of Sample Means
[Two dotplots on a common axis from 23.50 to 26.00; panels: St Dev = 1 and St Dev = 5]

TRY IT NOW!
The Central Limit Theorem: Exploring the Third Point
Use a software package such as Excel or Minitab to simulate picking 10 samples, each of size n = 35, from two different populations:
Population 1: Monthly sales of a leading on-line bookstore, normally distributed with a mean of μ = 25 ($1,000) and a standard deviation of σ = 1 ($1,000)
Population 2: Monthly sales of a leading on-line bookstore, normally distributed with a mean of μ = 25 ($1,000) and a standard deviation of σ = 3 ($1,000)

TRY IT NOW!
The Central Limit Theorem: Exploring the Third Point (cont'd)
For each sample, calculate a sample mean, X̄.
Find the average and standard deviation of the 10 X̄'s from the population 1 samples.
Find the average and standard deviation of the 10 X̄'s from the population 2 samples.

TRY IT NOW!
The Central Limit Theorem: Impact of Sample Size on Standard Error
Using the random number table in Appendix A and the 350 values shown in the previous Try It Now! exercise in your textbook, select 10 samples of size 5 from population 1. Review Section 2.4 if you need a refresher on how to use the random number table.
For each sample, calculate a sample mean, X̄.
Find the average and standard deviation of the 10 X̄'s.
Compare these values to the corresponding values that you found in the previous Try It Now! for population 1. In that case the sample size was n = 35.

Figure 7.6A Sampling Distribution of X̄
[Normal curve with regions marked 68%, 95%, and 99.7%]

Discovery Exercise 7.1
The Central Limit Theorem in Action
Part I. Draw a picture of a normal distribution with a mean of 80 and a standard deviation of 5. This is the population we will sample from.
Part II. Generate and examine 100 random samples. For this exercise you will need to generate 100 samples, each consisting of 30 values selected from a normal distribution with a mean of 80 and a standard deviation of 5.
Part III. Create a distribution of X̄ for samples of size n = 30.

Figure 7.6B Sampling Distribution for X̄ when μ = 55.00
Figure 7.6C Sampling Distribution for X̄ when μ = 54.50
[Each normal curve has regions marked 68%, 95%, and 99.7%]

TRY IT NOW!
Cost of Books: Comparing the Sample Mean to the Claimed Population Mean
A university states that the average student spends $225 per semester on books. Based on your own experience, you feel that this is an underestimate of the true expenditure.
You ask 30 of your friends how much they spent on textbooks this past semester and you obtain the following data:

214 233 234 236 239 241 241 244 245 247
248 248 248 249 250 253 254 254 258 260
262 262 263 265 269 274 276 277 277 281

Based on these data, do you have reason to tell the university that its statement is inaccurate?

A Confidence Interval or an Interval Estimate is a range of values with an associated probability or Confidence Level, 1 − α. The probability quantifies the chance that you have an interval that contains the true population parameter.

Figure 7.7 Normal Distribution with 0.05 in the tails

TRY IT NOW!
The Bottle-Filling Problem
A sample of 36 bottles had a sample mean of x̄ = 32.10 oz. The population standard deviation, σ, was assumed to be 0.1 oz. Find a 95% confidence interval for μ. How wide is the interval?
Now find a 98% confidence interval for μ. Which interval is wider?

Figure 7.8 Comparison of Confidence Intervals and μ

Discovery Exercise 7.2
Exploring Confidence Intervals for μ
From a population of college students across the United States, a sample was selected to find out how many hours per week a typical student spends playing sports.
Part I. A random sample of 2500 students was selected. The sample mean, x̄, was found to be 12.5 hours. The population standard deviation, σ, is known to be 1.05 hours. Given this information, find:
(a) a 90% confidence interval for μ.
(b) a 92% confidence interval for μ.
(c) a 94% confidence interval for μ.
(d) a 96% confidence interval for μ.
(e) a 98% confidence interval for μ.
(f) Discuss what happens to the size of the interval as the level of confidence increases.

Part II. A random sample of 2500 students was selected. The sample mean, x̄, was found to be 10.5 hours. The population standard deviation, σ, is known to be 1.05 hours. Given this information, find:
(a) a 90% confidence interval for μ.
(b) a 92% confidence interval for μ.
(c) a 94% confidence interval for μ.
(d) a 96% confidence interval for μ.
(e) a 98% confidence interval for μ.
Compare the intervals found in Part I with those found in Part II. Discuss what happened to the confidence intervals due to the change in the value of the sample mean, x̄.

Part III. A random sample of 2500 students was selected. The sample mean, x̄, was found to be 12.5 hours. Suppose that the population standard deviation, σ, is actually 2.05 hours. Given this information, find:
(a) a 90% confidence interval for μ.
(b) a 92% confidence interval for μ.
(c) a 94% confidence interval for μ.
(d) a 96% confidence interval for μ.
(e) a 98% confidence interval for μ.
Compare the intervals found in Part I with those found in Part III. Discuss what happened to the confidence intervals due to the change in the value of the population standard deviation, σ.

Part IV. A random sample of 2000 students was selected. The sample mean, x̄, was found to be 12.5 hours. The population standard deviation, σ, is known to be 1.05 hours.
Given this information, find:
(a) a 90% confidence interval for μ.
(b) a 92% confidence interval for μ.
(c) a 94% confidence interval for μ.
(d) a 96% confidence interval for μ.
(e) a 98% confidence interval for μ.
Compare the intervals found in Part I with those found in Part IV. Discuss what happened to the confidence intervals due to the change in the value of the sample size, n.

Figure 7.9 t-distribution with 5 Degrees of Freedom
Figure 7.10 t-distribution with 25 Degrees of Freedom
Figure 7.11 t-distribution with 1 and 50 Degrees of Freedom

Figure 7.12 A portion of the t table

Degrees of            Upper Tail Areas
Freedom     0.25    0.10    0.05    0.025   0.01    0.005
20          0.6870  1.3253  1.7247  2.0860  2.5280  2.8453
21          0.6864  1.3232  1.7207  2.0796  2.5176  2.8314
22          0.6858  1.3212  1.7171  2.0739  2.5083  2.8188
23          0.6853  1.3195  1.7139  2.0687  2.4999  2.8073
24          0.6848  1.3178  1.7109  2.0639  2.4922  2.7970
25          0.6844  1.3163  1.7081  2.0595  2.4851  2.7874
26          0.6840  1.3150  1.7056  2.0555  2.4786  2.7787

TRY IT NOW!
Retirement Years: Confidence Interval for π
A survey shows that a growing number of Americans are willing to make sacrifices to become home owners despite increasing job and financial worries. The Federal National Mortgage Association surveyed 1857 Americans and found that 67% would put off retirement for 10 years to own a home. Find a 90% confidence interval for the proportion of all Americans who would put off retirement for 10 years to own a home.

TRY IT NOW!
Bottle Filling: Finding the Sample Size
How many bottles does the bottle manufacturer need to sample to be 98% confident that the error is at most 0.002 oz?
Remember that the population standard deviation is 0.1 oz.

TRY IT NOW!
Retirement Years: Sample Size Calculation for π
How many Americans must be sampled to determine the percentage who would put off retirement for 10 years to own a home? The estimate should not differ from the actual population proportion by more than 3%, with a confidence of 90%.

Finding Confidence Intervals Using KaddStat
Instructions for a small sample confidence interval for the mean (all others are done similarly):
From the Kadd menu select Confidence Intervals > One Sample > Population Mean using t. The dialog box opens.

Finding Confidence Intervals Using Excel (cont'd)
1. First, indicate the level of confidence as a percent.
2. Select User Input if you already have the summary statistics, or Input Range if you have raw data.
3. Indicate how the sampling was done.
4. Indicate where you want the output to appear.
5. Click OK.

Output for Confidence Interval for Small Sample using t

Chapter 7 Summary
In this chapter you have learned:
The basics of estimating population parameters, in particular how to estimate the average of a numeric characteristic of a population, μ, and the proportion of a population that has a certain characteristic, π. The estimates are calculated from a sample selected from the population.
Each sample yields a slightly different estimate of the population parameter. Thus, estimators are themselves random variables. When the random variable is an estimator, the distribution is called a sampling distribution. The sampling distribution has a mean and a standard deviation, called the standard error.
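
The idea that an estimator is a random variable with its own standard error can be checked by simulation. The following is a minimal sketch, not the textbook's Excel or Minitab procedure: it reuses the setup of the Central Limit Theorem discovery exercise (100 samples of 30 values from a normal population with mean 80 and standard deviation 5) and compares the spread of the simulated sample means with σ/√n. The function name `simulate_sample_means` and the fixed seed are assumptions made for illustration.

```python
import math
import random
import statistics


def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples samples of size n from Normal(mu, sigma).

    Returns the list of sample means, one per sample.
    The seed is fixed only so that repeated runs give the same numbers.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


# Population and sampling plan taken from the discovery exercise.
mu, sigma, n = 80.0, 5.0, 30
means = simulate_sample_means(mu, sigma, n, num_samples=100)

# Standard error predicted by the Central Limit Theorem: sigma / sqrt(n).
theoretical_se = sigma / math.sqrt(n)

print("average of the sample means:", round(statistics.fmean(means), 3))
print("std. dev. of the sample means:", round(statistics.stdev(means), 3))
print("sigma / sqrt(n):", round(theoretical_se, 3))
```

With only 100 samples the simulated standard deviation will not match σ/√n exactly; increasing `num_samples` should bring the two closer.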
Chapter 7 Summary (cont'd)
In this chapter you have learned:
How to use the sampling distribution of X̄ to calculate probabilities and make inferences about μ and π.
How to create confidence intervals for μ and for π.
How to calculate the required sample size to achieve a certain level of precision with a specified confidence.
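
The interval and sample-size calculations summarized above can be sketched in a few lines. The slides do not reproduce the formulas, so this block assumes the standard large-sample forms: x̄ ± z·σ/√n for a mean with known σ, p ± z·√(p(1 − p)/n) for a proportion, n = (z·σ/e)² for the sample size needed for a mean, and n = z²·p(1 − p)/e² for a proportion. All function names are hypothetical, and the default planning value p = 0.5 is a conservative assumption, not something stated in the source.

```python
import math
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value z with P(-z < Z < z) = confidence."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def mean_interval(x_bar, sigma, n, confidence):
    """Large-sample (known sigma) confidence interval for the mean."""
    margin = z_critical(confidence) * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


def proportion_interval(p_hat, n, confidence):
    """Large-sample confidence interval for a population proportion."""
    margin = z_critical(confidence) * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def sample_size_for_mean(sigma, max_error, confidence):
    """Smallest n whose margin of error for the mean is at most max_error."""
    return math.ceil((z_critical(confidence) * sigma / max_error) ** 2)


def sample_size_for_proportion(max_error, confidence, p_guess=0.5):
    """Smallest n for a proportion; p_guess = 0.5 is the conservative choice."""
    z = z_critical(confidence)
    return math.ceil(z ** 2 * p_guess * (1.0 - p_guess) / max_error ** 2)


# Bottle-filling setup from the slides: n = 36, x-bar = 32.10 oz, sigma = 0.1 oz.
low, high = mean_interval(32.10, 0.1, 36, 0.95)
print("95% interval for the mean:", round(low, 4), "to", round(high, 4))

# Retirement survey from the slides: 1857 respondents, 67% sample proportion.
p_low, p_high = proportion_interval(0.67, 1857, 0.90)
print("90% interval for the proportion:", round(p_low, 4), "to", round(p_high, 4))
```

Small-sample intervals for the mean follow the same pattern but replace the z value with a t critical value, such as one read from the portion of the t table in Figure 7.12.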