
Chapter 12 Inference About One Population

12.1 Introduction
• In this chapter we use the approach developed earlier to describe a population:
– Identify the parameter to be estimated or tested.
– Specify the parameter’s estimator and its sampling distribution.
– Construct a confidence interval estimator or perform a hypothesis test.
• We shall develop techniques to estimate and test three population parameters:
– Population mean $\mu$
– Population variance $\sigma^2$
– Population proportion $p$

12.2 Inference About a Population Mean When the Population Standard Deviation Is Unknown
• Recall that when $\sigma$ is known, we use the following statistic to estimate and test a population mean:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
• When $\sigma$ is unknown, we use its point estimator $s$, and the z-statistic is replaced by the t-statistic.

The t-Statistic
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
• When the sampled population is normally distributed, the t-statistic is Student t distributed.
• The t distribution is mound-shaped and symmetrical around zero.
• The “degrees of freedom” (a function of the sample size) determine how spread out the distribution is compared to the normal distribution.
[Figure: two Student t density curves centered at 0, one with d.f. = $\nu_1$ and one with d.f. = $\nu_2$, where $\nu_1 < \nu_2$]

Testing $\mu$ when $\sigma$ is unknown
• Example 12.1 – Productivity of newly hired trainees
– In order to determine the number of workers required to meet demand, the productivity of newly hired trainees is studied.
– It is believed that trainees can process and distribute more than 450 packages per hour within one week of hiring.
– Can we conclude that this belief is correct, based on productivity observations of 50 trainees (see file Xm12-01)?
• Solution
– The problem objective is to describe the population of the number of packages processed in one hour.
– The data are interval.
– The hypotheses are:
$H_0: \mu = 450$
$H_1: \mu > 450$
– The test statistic is
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}, \qquad \text{d.f.} = n - 1 = 49$$
• Solution continued (solving by hand)
– The rejection region is $t > t_{\alpha, n-1}$, where $t_{\alpha, n-1} = t_{.05, 49} \approx t_{.05, 50} = 1.676$.
– From the data we have $\sum x_i = 23{,}019$ and $\sum x_i^2 = 10{,}671{,}357$, thus
$$\bar{x} = \frac{23{,}019}{50} = 460.38$$
$$s^2 = \frac{\sum x_i^2 - \left(\sum x_i\right)^2 / n}{n - 1} = 1507.55, \qquad s = \sqrt{1507.55} = 38.83$$
– The test statistic is
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{460.38 - 450}{38.83 / \sqrt{50}} = 1.89$$
– Since 1.89 > 1.676, we reject the null hypothesis in favor of the alternative.
– There is sufficient evidence to infer that the mean productivity of trainees one week after being hired is greater than 450 packages at the .05 significance level.
• Solution continued (computer output)

t-Test: Mean
                        Packages
Mean                    460.38
Standard Deviation      38.83
Hypothesized Mean       450
df                      49
t Stat                  1.89
P(T<=t) one-tail        0.0323
t Critical one-tail     1.6766
P(T<=t) two-tail        0.0646
t Critical two-tail     2.0096

– Since the one-tail p-value .0323 < .05, we reject the null hypothesis in favor of the alternative, which is the same conclusion as the hand calculation.

Estimating $\mu$ when $\sigma$ is unknown
• Confidence interval estimator of $\mu$ when $\sigma$ is unknown:
$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}, \qquad \text{d.f.} = n - 1$$
• Example 12.2
– An investor is trying to estimate the return on investment in companies that won quality awards last year.
– A random sample of 83 such companies is selected, and the return the investor would have earned by investing in them is calculated.
– Construct a 95% confidence interval for the mean return.
• Solution (solving by hand)
– The problem objective is to describe the population of annual returns from buying shares of quality award-winners.
– The data are interval.
– From file Xm12-02 we determine
$$\bar{x} = 15.02, \qquad s^2 = 68.98, \qquad s = \sqrt{68.98} = 8.31$$
– With $t_{.025, 82} \approx t_{.025, 80} = 1.990$, the 95% confidence interval is
$$\bar{x} \pm t_{\alpha/2, n-1} \frac{s}{\sqrt{n}} \approx 15.02 \pm 1.990 \, \frac{8.31}{\sqrt{83}} = 15.02 \pm 1.82 = [13.20, 16.84]$$
• Solution (computer output)

t-Estimate: Mean
                        Returns
Mean                    15.02
Standard Deviation      8.31
LCL                     13.20
UCL                     16.83

Checking the required conditions
• We need to check that the population is normally distributed, or at least not extremely nonnormal.
• There are statistical methods to test for normality (one to be introduced later in the book).
• From the sample histograms we see…
[Figure: histogram for Xm12-01 (horizontal axis: Packages) and histogram for Xm12-02 (horizontal axis: Returns)]

12.3 Inference About a Population Variance
• Sometimes we are interested in making inferences about the variability of processes.
• Examples:
– The consistency of a production process for quality control purposes.
– Investors use variance as a measure of risk.
• To draw inferences about variability, the parameter of interest is $\sigma^2$.
• The sample variance $s^2$ is an unbiased, consistent, and efficient point estimator for $\sigma^2$.
• The statistic $(n-1)s^2/\sigma^2$ has a distribution called chi-squared if the population is normally distributed:
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}, \qquad \text{d.f.} = n - 1$$
[Figure: chi-squared density curves for d.f. = 5 and d.f. = 10]

Testing and Estimating a Population Variance
• From the probability statement
$$P\left(\chi^2_{1-\alpha/2} < \chi^2 < \chi^2_{\alpha/2}\right) = 1 - \alpha$$
we have, by substituting $\chi^2 = (n-1)s^2/\sigma^2$,
$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$$

Testing the Population Variance
• Example 12.3 (operations management application)
– A container-filling machine is believed to fill 1-liter containers so consistently that the variance of the fills will be less than 1 cc (.001 liter).
– To test this belief, a random sample of 25 1-liter fills was taken and the results recorded (Xm12-03).
– Do these data support the belief that the variance is less than 1 cc at the 5% significance level?
• Solution
– The problem objective is to describe the population of 1-liter fills from a filling machine.
– The data are interval, and we are interested in the variability of the fills.
– The complete test is:
$H_0: \sigma^2 = 1$
$H_1: \sigma^2 < 1$
– The test statistic is $\chi^2 = \dfrac{(n-1)s^2}{\sigma^2}$.
– The rejection region is $\chi^2 < \chi^2_{1-\alpha, n-1}$.
• Solving by hand
– Note that $(n-1)s^2 = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \left(\sum x_i\right)^2 / n$.
– From the sample (Xm12-03) we can calculate $\sum x_i = 24{,}996.4$ and $\sum x_i^2 = 24{,}992{,}821.3$.
– Then $(n-1)s^2 = 24{,}992{,}821.3 - (24{,}996.4)^2 / 25 = 20.78$, so
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{20.78}{1^2} = 20.78, \qquad \chi^2_{1-\alpha, n-1} = \chi^2_{.95, 24} = 13.8484$$
– Since 20.78 > 13.8484, the test statistic does not fall in the rejection region, so we do not reject the null hypothesis.
– There is insufficient evidence to infer that the variance is less than 1.
[Figure: chi-squared density with $\alpha = .05$ and $1 - \alpha = .95$; the rejection region is $\chi^2 < \chi^2_{.95, 24} = 13.8484$, and the test statistic 20.78 lies outside it: do not reject the null hypothesis]

Estimating the Population Variance
• Example 12.4
– Estimate the variance of fills in Example 12.3 with 99% confidence.
• Solution
– We have $(n-1)s^2 = 20.78$. From the chi-squared table we have
$$\chi^2_{\alpha/2, n-1} = \chi^2_{.005, 24} = 45.5585, \qquad \chi^2_{1-\alpha/2, n-1} = \chi^2_{.995, 24} = 9.88623$$
– The confidence interval estimate is
$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$$
$$\frac{20.78}{45.5585} < \sigma^2 < \frac{20.78}{9.88623}$$
$$.46 < \sigma^2 < 2.10$$

12.4 Inference About a Population Proportion
• When the population consists of nominal data, the only inference we can make is about the proportion of occurrence of a certain value.
• The parameter $p$ was used before to calculate these probabilities under the binomial distribution.
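The hand calculations for the variance in Section 12.3 (Examples 12.3 and 12.4) can be checked with a short script. This is only a sketch: the function and variable names are illustrative and not part of the lecture, and the chi-squared critical values are copied from the table values quoted in the examples rather than computed.

```python
# Sketch: chi-squared test and interval for a variance (Examples 12.3 and 12.4).
# Names are hypothetical; critical values are the table values quoted in the slides.

def sum_of_squared_deviations(sum_x, sum_x2, n):
    """Shortcut formula: (n - 1) * s^2 = sum(x^2) - (sum(x))^2 / n."""
    return sum_x2 - sum_x ** 2 / n

n = 25
ss = sum_of_squared_deviations(24996.4, 24992821.3, n)  # (n - 1) * s^2

# Example 12.3: H0: sigma^2 = 1 versus H1: sigma^2 < 1 at alpha = .05
sigma2_0 = 1.0
chi2_stat = ss / sigma2_0
chi2_crit = 13.8484                 # chi-squared_{.95, 24} from the table
reject_h0 = chi2_stat < chi2_crit   # left-tail rejection region

# Example 12.4: 99% confidence interval for sigma^2
lcl = ss / 45.5585                  # divide by chi-squared_{.005, 24}
ucl = ss / 9.88623                  # divide by chi-squared_{.995, 24}

print(f"(n-1)s^2 = {ss:.2f}, chi2 = {chi2_stat:.2f}, reject H0: {reject_h0}")
print(f"99% CI for sigma^2: ({lcl:.2f}, {ucl:.2f})")
```

Taking the critical values as inputs keeps the sketch within the Python standard library; a statistics package could supply them instead of the table.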
• Statistic and sampling distribution
– The statistic used when making inferences about $p$ is
$$\hat{p} = \frac{x}{n}$$
where $x$ is the number of successes and $n$ is the sample size.
– Under certain conditions [$np > 5$ and $n(1-p) > 5$], $\hat{p}$ is approximately normally distributed, with mean $\mu = p$ and variance $\sigma^2 = p(1-p)/n$.

Testing and Estimating the Proportion
• Test statistic for $p$:
$$z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$$
where $np > 5$ and $n(1-p) > 5$.
• Interval estimator for $p$ ($1 - \alpha$ confidence level):
$$\hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n}$$
provided $n\hat{p} > 5$ and $n(1-\hat{p}) > 5$.

Testing the Proportion
• Example 12.5 (additional example: predicting the winner on election day)
– Voters are asked by a certain network to participate in an exit poll in order to predict the winner on election day.
– Based on the data presented in Xm12-05 (where 1 = Democrat and 2 = Republican), can the network conclude that the Republican candidate will win the state college vote?
• Solution
– The problem objective is to describe the population of votes in the state.
– The data are nominal.
– The parameter to be tested is $p$.
– Success is defined as “Vote Republican”.
– The hypotheses are:
$H_0: p = .5$
$H_1: p > .5$ (more than 50% vote Republican)
• Solving by hand
– The rejection region is $z > z_{\alpha} = z_{.05} = 1.645$.
– From the file we count 407 successes. The number of voters participating is 765.
– The sample proportion is $\hat{p} = 407/765 = .532$.
– The value of the test statistic is
$$z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} = \frac{.532 - .5}{\sqrt{.5(1-.5)/765}} = 1.77$$
– The p-value is $P(Z > 1.77) = .0382$.
• Computer output

z-Test: Proportion
Sample Proportion           0.532
Observations                765
Hypothesized Proportion     0.5
z Stat                      1.77
P(Z<=z) one-tail            0.0382
z Critical one-tail         1.6449
P(Z<=z) two-tail            0.0764
z Critical two-tail         1.96

– There is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis. At the 5% significance level we can conclude that more than 50% voted Republican.
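The z-test of Example 12.5 can be reproduced with the Python standard library. The sketch below is illustrative only: the function name `proportion_z_test` is an assumption, not something defined in the lecture, and it covers just the right-tail alternative $H_1: p > p_0$ used in the example.

```python
# Sketch: one-sided z-test for a proportion (Example 12.5).
# The function name is hypothetical; only the right-tail alternative is handled.
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """Return (p_hat, z, right-tail p-value) for H0: p = p0 versus H1: p > p0."""
    # Normal approximation condition stated in the slides
    if n * p0 <= 5 or n * (1 - p0) <= 5:
        raise ValueError("normal approximation requires np > 5 and n(1 - p) > 5")
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 1 - NormalDist().cdf(z)
    return p_hat, z, p_value

p_hat, z, p_value = proportion_z_test(407, 765, 0.5)
print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, p-value = {p_value:.4f}")
```

Because the script carries the unrounded sample proportion through the calculation, its p-value corresponds to the computer output rather than to a table lookup at z = 1.77.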
Estimating the Proportion
• Nielsen Ratings
– In a survey of 2000 TV viewers at 11:40 p.m. on a certain night, 226 indicated they watched “The Tonight Show”.
– Estimate the number of TVs tuned to the Tonight Show on a typical night, if there are 100 million potential television sets. Use a 95% confidence level.
• Solution
– The sample proportion is $\hat{p} = 226/2000 = .113$, so
$$\hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n} = .113 \pm 1.96 \sqrt{.113(1 - .113)/2000} = .113 \pm .014$$
• Solution (computer output)

z-Estimate: Proportion
                        Viewers
Sample Proportion       0.113
Observations            2000
LCL                     0.099
UCL                     0.127

– A confidence interval estimate of the number of viewers who watched the Tonight Show:
LCL = .099(100 million) = 9.9 million
UCL = .127(100 million) = 12.7 million

Selecting the Sample Size to Estimate the Proportion
• Recall: the confidence interval for the proportion is
$$\hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n}$$
• Thus, to estimate the proportion to within $W$, we can write
$$W = z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n}$$
• The required sample size is
$$n = \left( \frac{z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})}}{W} \right)^2$$
• Example
– Suppose we want to estimate the proportion of customers who prefer our company’s brand to within .03 with 95% confidence.
– Find the sample size.
• Solution
– $W = .03$ and $1 - \alpha = .95$, therefore $\alpha/2 = .025$, so $z_{.025} = 1.96$ and
$$n = \left( \frac{1.96 \sqrt{\hat{p}(1-\hat{p})}}{.03} \right)^2$$
– Since the sample has not yet been taken, the sample proportion is still unknown. We proceed using either one of the following two methods.
• Method 1:
– There is no knowledge about the value of $\hat{p}$.
• Let $\hat{p} = .5$. This results in the largest possible $n$ needed for a $1 - \alpha$ confidence interval of the form $\hat{p} \pm .03$.
• If the sample proportion does not equal .5, the actual $W$ will be narrower than .03 with the $n$ obtained by the formula below.
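• Method 2:
– There is some idea about the value of $\hat{p}$.
• Use the value of $\hat{p}$ to calculate the sample size.
• The resulting sample sizes are:
– Method 1, with $\hat{p} = .5$:
$$n = \left( \frac{1.96 \sqrt{.5(1 - .5)}}{.03} \right)^2 = 1{,}068$$
– Method 2, with $\hat{p} = .2$:
$$n = \left( \frac{1.96 \sqrt{.2(1 - .2)}}{.03} \right)^2 = 683$$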
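Both sample-size calculations can be checked with a few lines of Python. This is a minimal sketch: the function name and its parameters are illustrative assumptions, the critical value is computed with `statistics.NormalDist` instead of being read from a table, and the result is rounded up because a sample size must be a whole number.

```python
# Sketch: sample size needed to estimate a proportion to within a bound W.
# The function name and parameter names are hypothetical.
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_for_proportion(width, confidence, p_guess=0.5):
    """Smallest n with z_{alpha/2} * sqrt(p(1 - p) / n) <= width.

    p_guess = 0.5 is the conservative choice (Method 1);
    pass a prior estimate of the proportion for Method 2.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    return ceil((z * sqrt(p_guess * (1 - p_guess)) / width) ** 2)

print(sample_size_for_proportion(0.03, 0.95))               # Method 1
print(sample_size_for_proportion(0.03, 0.95, p_guess=0.2))  # Method 2
```

Since $p(1-p)$ is largest at $p = .5$, the default guess never gives a smaller sample than any other value of `p_guess`.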
