Download Set 8: Inference for the Mean of a Population

Business Statistics for Managerial Decision Inference for the Population Mean Test for a Population Mean  There are four steps in carrying out a significance test:     State the hypothesis. Calculate the test statistic. Find the p-value. State your conclusion by comparing the p-value to the significance level . Example: Blood pressures of executives  The medical director of a large company is concerned about the effects of stress on the company’s younger executives. According to the National Center for health Statistics, the mean systolic blood pressure for males 35 to 44 years of age is 128 and the standard deviation in this population is 15. The medical director examines the records of 72 executives in this age group and finds that their mean systolic blood pressure is X  129.93. Is this evidence that the mean blood pressure for all the company’s young male executives is higher than the national average? Use  = 5%. Example: Blood pressures of executives  Hypotheses: H0:  = 128 Ha:  > 128  Test statistic: z  x   0 129.93  128   1.09  n 15 72 P-value: P  p( z  1.09)  1  .8621  .1379 Example: Blood pressures of executives  Conclusion:  About 14% of the time, a SRS of size 72 from the general male population would have a mean blood pressure as high as that of executive sample. The observed X  129.93 is not significantly higher than the national average. Example: A company-wide health promotion campaign  The company medical director institutes a health promotion campaign to encourage employees to exercise more and eat a healthier diet. One measure of the effectiveness of such a program is a drop in blood pressure. Choose a random sample of 50 employees, and compare their blood pressures from annual physical examination given before the campaign and again a year later. The mean change in blood pressure for these n = 50 employees is X  6 . We take the population standard deviation to be  = 20. Use  = 5%. Example: A company-wide health promotion campaign  Hypotheses: H0:  =0 Ha:  < 0  Test statistic: z  x  0 60   2.12  n 20 50 P-value: P  P( z  2.12)  0.0170 Example: A company-wide health promotion campaign  Conclusion:  A mean change in blood pressure of –6 or better would occur only 17 times in 1000 samples if the campaign had no effect on the blood pressures of the employees. This is convincing evidence that the mean blood pressure in the population of all employees has decreased. Example:Testing Pharmaceutical products  The Deely Laboratory analyzes pharmaceutical products to verify the concentration of active ingredients. Such chemical analyses are not perfectly precise. Repeated measurements on the same specimen will give slightly different results. The results of repeated measurements follow a Normal distribution quite closely, the analysis procedure has no bias, so that the mean  of the population of all measurements is the true concentration in the specimen. The standard deviation of this distribution is a property of the analytical procedure and is known to be  = 0.0068 gram per liter. The laboratory analyzes each specimen three times and reports the mean results. Example:Testing Pharmaceutical products  A client sends a specimen for which the concentration of active ingredient is supposed to be 0.86%. Deely’s three analyses give concentrations 0.8403  0.8363 0.8447 Is there significant evidence at the 1% level that the true concentration is not 0.86%? Example:Testing Pharmaceutical products  Hypotheses: H0:  = 0.86 Ha:   0.86  Test Statistic: The mean of the three analyses is X  .8404 . The one sample z test statistic is therefore z x  0 .8404  .86   4.99  n .0068 3 Example:Testing Pharmaceutical products   We do not need to find the exact P-value to assess significance at the  = 0.01 level. Look in the table A under tail area 0.005 because the alternative is two-sided. The z-values that are significant at the 1% level are z > 2.575 and z < -2.575. Our observed z = -4.99 is significant P-value versus fixed      In our example , we concluded that the test statistic z = -4.99 is significant at the 1% level. The observed z is far beyond the critical value for  = 0.01, and the evidence against H0 is far stronger than 1% significance suggests. The P-value P = .0000006 (from a statistical software) gives a better sense of how strong the evidence is. The P-value is the smallest level  at which the data are significant. Inference for the Mean of a Population      Both confidence intervals and tests of significance for the mean  of a Normal population are based on the sample mean X , which estimates the unknown . The sampling distribution of X depends on . There is no difficulty when  is known. When  is unknown, we must estimate it. The sample standard deviation s is used to estimate the population standard deviation . The t-distribution   Suppose we have a simple random sample of size n from a Normally distributed population with mean  and standard deviation . The standardized sample mean, or one-sample z statistic x z  0  n has the standard Normal distribution N(0, 1). When we substitute the standard deviation of the mean (standard error) s /n for the /n, the statistic does not have a Normal distribution. The t-distribution   It has a distribution called t-distribution. The t-distribution  Suppose that a SRS of size n is drawn from a N(, ) population. Then the one sample t statistic t x s n has the t-distribution with n-1 degrees of freedom.  There is a different t distribution for each sample size.  A particular t distribution is specified by giving the degrees of freedom. The t-distribution     We use t(k) to stand for t distribution with k degrees of freedom. The density curves of the tdistributions are symmetric about 0 and are bell shaped. The spread of t distribution is a bit greater than that of standard Normal distribution. As degrees of freedom k increase, t(k) density curve approaches the N(0, 1) curve. The one –Sample t Confidence Interval  Suppose that an SRS of size n is drawn from a population having unknown mean . A level C confidence interval for  is  s x t* n Where t* is the value for the t (n-1) density curve with area C between –t* and t*. The margin of error is t*  s n This interval is exact when the population distribution is Normal and is approximately correct for large n in other cases. Example: Estimating the level of Vitamin C  The following data are the amount of vitamin C, measured in milligram per 100 grams (mg/100g) of the corn soy blend (dry basis), for a random sample of size 8 from a production run: 26 31 23 22 11 22 14 31 We want to find a 95% confidence interval for , the mean vitamin C content of the corn soy blend (CSB) produced during this run. Example: Estimating the level of Vitamin C  The sample mean X  22.50 and the standard deviation s = 7.19 with degrees of freedom n-1 = 8-1 =7. The standard error of X is SEx   s s   2.54 n 8 From table we find t* = 2.365. The 95% confidence interval is s 7.19  22.5  2.365  22.5  (2.365)( 2.54) n 8  22.5  6  (16.5,28.5) x t* The one-sample t test:Summary Example: Is the Vitamin C level correct?  The specifications for the CSB state that the mixture should contain 2 pounds of vitamin premix fro every 2000 pounds of product. These specifications are designed to produce a mean () vitamin C content in the final product of 40mg/100 g. We test the null hypothesis that the mean vitamin C content of the production run in the previous example conforms to these specifications. Use  = 5%. Example: Is the Vitamin C level correct?  Hypotheses: H0:  = 40 Ha:   40  Test statistic: t  x   0 22.5  40   6.88 s n 7.2 8 P-value: P  2 P(t  6.88)  Because the degrees of freedom are n-1 =7, this t statistic has t(7) distribution. Example: Is the Vitamin C level correct?  From the largest entry in the df =7 line of the table we see that P(t  5.408)  .0005   We conclude that the Pvalue is less than 20.0005, or P < .001. We reject H0 and conclude that the vitamin C content for this run is below the specification. Matched Pairs t procedures    Comparative studies are usually preferred to single-sample investigations because of the protection they offer against confounding. In a matched pairs study, subjects are matched in pairs and the outcomes are compared within each matched pair. One situation calling for matched pairs is beforeand-after observations on the same subjects. Matched Pairs t procedures    A matched pair analysis is needed when there are two measurements or observations on each individual and we want to examine the change from the first to the second. For each individual, subtract the “before” measure from the “after” measure. Analyze the difference using the one-sample confidence interval and significance testing procedures. Example: The effect of language instruction  A company contracts with a language institute to provide individualized instruction in foreign languages for its executives who will be posted overseas. Is the instruction effective? Last year 20 executives studied French. All had some knowledge of French, so they were given the Modern Language Association’s listening test of understanding of spoken French before the instruction began. After several weeks of immersion in French, the executives took the listening test again. The following table gives the pretest and posttest scores. Example: The effect of language instruction Example: The effect of language instruction  To analyze these data     Subtract pretest score from the posttest score. These differences appear in the “gain” column in previous table. These 20 differences form a single sample. To assess whether the institute significantly improved the executives’ comprehension of spoken French we test H0 :  = 0 Ha:  > 0 Example: The effect of language instruction    Here  is the mean improvement that would be achieved if the entire population of executives received similar instruction. The null hypothesis says that no improvement occurs, and the alternative hypothesis says that posttest scores are higher on the average. The 20 differences have X  2.5 and s  2.893 Example: The effect of language instruction  The one sample t statistic is therefore t  x 0 2.5  0   3.86 s n 2.893 20 P-value is found from the t(19) distribution. P  P(t  3.86)  T-table shows that 3.86 lies between the upper .001 and .0005 critical values of the t(19) distribution. The P-value therefore lies between these values. Example: The effect of language instruction  Conclusion:   The improvement in scores was significant. We have strong evidence that the posttest scores are systematically higher. A statistically significant but very small improvement in language ability would not justify the expense of the individualized instruction. A confidence interval allows us to estimate the amount of improvement. Example: The effect of language instruction  Find a 90% confidence interval for the mean improvement in the entire population.   The critical value t* = 1.729 from t-table for 90% confidence. The confidence interval is: s 2.893  2.5  1.729 n 20  2.5  1.12  (1.38, 3.62) x t*

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download Set 8: Inference for the Mean of a Population