Week 5
Dr. Jenne Meyer

Article review

5-Step Hypothesis Testing Procedure
Step 1: Set up the null and alternative hypotheses.
Step 2: Pick the level of significance (value of $\alpha$) and find the rejection region.
Step 3: Calculate the test statistic.
Step 4: Decide whether or not to reject the null hypothesis.
Step 5: Interpret the statistical decision in terms of the stated problem.

Two-tail test: $H_0: \mu = 750$, $H_A: \mu \neq 750$
Lower-tail test: $H_0: \mu \geq 700$, $H_A: \mu < 700$
Upper-tail test: $H_0: \mu \leq 800$, $H_A: \mu > 800$

The Rejection Region is the range of values of the test statistic that will lead you to reject the null hypothesis.
One-tailed: area $\alpha$ in one tail
Two-tailed: area $\alpha/2$ in each tail

For a large sample (at least 30 units): $z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
For a small sample: $t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$

Apply hypothesis testing to different populations and samples in business research situations
Test of single population, small sample size
Test of a single proportion
Test of two populations, large sample size
Test of two populations, small sample size
Test for difference in two population proportions
The 5-Step Hypothesis Testing Procedure is the same for all these processes.

For a small sample:
Small sample and unknown $\sigma$
Calculations are identical to those for z, with s in place of $\sigma$
Becomes nearly identical to z for n > 30
Uses degrees of freedom: df = n - 1
$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$

Review t-table
Common Values of $\alpha$ and df and the Corresponding t-Values

Two-tail area    Degrees of Freedom (df)
($\alpha/2$ in each tail)
         20      21      22      23      24      25      26      1000000
0.25     1.185   1.183   1.182   1.180   1.179   1.178   1.177   1.150
0.10     1.725   1.721   1.717   1.714   1.711   1.708   1.706   1.645
0.05     2.086   2.080   2.074   2.069   2.064   2.060   2.056   1.960
0.01     2.845   2.831   2.819   2.807   2.797   2.787   2.779   2.576
0.005    3.153   3.135   3.119   3.104   3.091   3.078   3.067   2.807

Example
A State Highway Patrol periodically samples vehicle speeds at various locations on a particular roadway. The sample of vehicle speeds is used to test the hypothesis $H_0: \mu \leq 65$ mph. The locations where $H_0$ is rejected (average speed exceeds 65 mph) are the best for radar traps.
At Location X, a sample of 16 vehicles shows a mean speed of 68.2 mph with a standard deviation of 3.8 mph. Use a level of significance of $\alpha = .05$ to test the hypothesis.

Example, cont.
$H_0: \mu \leq 65$ mph
$H_A: \mu > 65$ mph
$\alpha = .05$, d.f. = 16 - 1 = 15, $t_\alpha = 1.753$
Rejection Region: $t > 1.753$
n = 16, $\bar{x}$ = 68.2 mph, s = 3.8 mph
$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{68.2 - 65}{3.8 / \sqrt{16}} = 3.37$
Since 3.37 > 1.753, we reject $H_0$.
Conclusion: At the .05 significance level, there is sufficient evidence that the mean speed of vehicles at Location X is greater than 65 mph. Location X is a good candidate for a radar trap.

Test of a single proportion
Use if concerned with a proportion of the population, p, that has a particular characteristic
Can be used with nominal data
Use the same 5-Step Hypothesis Testing Procedure
Test statistic: $Z = \frac{\hat{p} - p}{\sqrt{p(1 - p) / n}}$

Example
For a Christmas and New Year's week, the National Safety Council estimated that 500 people would be killed and 25,000 injured on the nation's roads. The NSC claimed that 50% of the accidents would be caused by drunk driving. A sample of 120 accidents showed that 67 were caused by drunk driving. Use these data to test the NSC's claim with $\alpha = 0.05$.

Example, cont.
$H_0: p = .5$
$H_A: p \neq .5$
$\alpha = .05$
Rejection Region: $Z < -1.96$ or $Z > 1.96$
$\hat{p} = 67/120$, n = 120
$Z = \frac{(67/120) - .5}{\sqrt{.5(1 - .5) / 120}} = 1.278$
Since -1.96 < 1.278 < 1.96, we do not reject $H_0$.
Conclusion: There is insufficient evidence to suggest that the population proportion of accidents caused by drunk driving is different from 50%.

Example, cont. (two-proportion test, unmarried vs. married workers)
Rejection Region: $Z > 1.645$
$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\frac{p_c(1 - p_c)}{n_1} + \frac{p_c(1 - p_c)}{n_2}}} = \frac{.1167 - .0880}{\sqrt{\frac{.1036(1 - .1036)}{300} + \frac{.1036(1 - .1036)}{250}}} = 1.099$
Since 1.099 < 1.645, we do not reject $H_0$.
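The single-sample calculations in the Highway Patrol and drunk-driving examples above can be checked with a short Python sketch. The function names `one_sample_t` and `one_proportion_z` are illustrative, not from the slides, and the critical values 1.753 and 1.96 are simply copied from the examples rather than computed.

```python
import math


def one_sample_t(xbar, mu0, s, n):
    """t statistic for a single mean with unknown sigma (df = n - 1)."""
    return (xbar - mu0) / (s / math.sqrt(n))


def one_proportion_z(successes, n, p0):
    """z statistic for a single proportion against the hypothesized value p0."""
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


# Highway Patrol example: upper-tail test, critical t (df = 15) taken from the text.
t_stat = one_sample_t(xbar=68.2, mu0=65, s=3.8, n=16)
reject_speed = t_stat > 1.753

# Drunk-driving example: two-tail test, critical z = +/-1.96 taken from the text.
z_stat = one_proportion_z(successes=67, n=120, p0=0.5)
reject_claim = abs(z_stat) > 1.96

print(round(t_stat, 2), reject_speed)
print(round(z_stat, 3), reject_claim)
```

Only the standard library is used, so critical values have to be read from a table; a statistics package could supply them instead.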
Conclusion: There is insufficient evidence to suggest that the proportion of unmarried workers missing more than 5 days of work is greater than the proportion of married ones.

Often we are interested in comparing two different, independent populations
[Figure 13.1: Two Populations and Two Samples (Population 1 with Sample 1, Population 2 with Sample 2)]

When comparing two different, independent populations, the Null Hypothesis takes on one of the forms:
$H_0: \mu_1 - \mu_2 = 0$
$H_0: \mu_1 \leq \mu_2$
$H_0: \mu_1 = \mu_2$

When comparing two different, independent populations with large n, the test statistic looks like
$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
If the population standard deviations are unknown, use $s_1$ and $s_2$ instead of the $\sigma$'s.

Example
A study was conducted to compare the mean years of service for those retiring in 1979 with those retiring last year at Acme Manufacturing Co. At the .01 significance level, can we conclude that the workers retiring last year gave more service, based on the following sample data?
Note: Let pop #1 = "last year"

                             Population #1    Population #2
                             "Last Year"      1979
Sample Mean                  30.4             25.6
Sample Standard Deviation    3.6              2.9
Sample Size                  45               40

Example, cont.
$H_0: \mu_{LY} \leq \mu_{1979}$
$H_A: \mu_{LY} > \mu_{1979}$
$\alpha = .01$
Rejection Region: $Z > 2.326$

Example, cont.
$z = \frac{30.4 - 25.6}{\sqrt{\frac{3.6^2}{45} + \frac{2.9^2}{40}}} = 6.80$
Since 6.80 > 2.326, we reject $H_0$.
Conclusion: There is sufficient evidence at the .01 significance level to suggest that the mean years of service of those retiring last year is greater than the mean years of service of those retiring in 1979.
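As a check on the Acme example, the large-sample two-population statistic can be sketched in Python. The function name `two_sample_z` is an illustrative choice, and the sample standard deviations stand in for the unknown population values, as the slides allow. `statistics.NormalDist` (standard library, Python 3.8+) supplies the upper-tail critical value.

```python
import math
from statistics import NormalDist


def two_sample_z(x1, s1, n1, x2, s2, n2):
    """z statistic for the difference of two means with large, independent samples."""
    standard_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (x1 - x2) / standard_error


# Population #1 = "last year", population #2 = 1979 (values from the example).
z_stat = two_sample_z(x1=30.4, s1=3.6, n1=45, x2=25.6, s2=2.9, n2=40)

# Upper-tail critical value for alpha = .01.
z_crit = NormalDist().inv_cdf(1 - 0.01)
reject = z_stat > z_crit

print(round(z_stat, 2), round(z_crit, 3), reject)
```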
When comparing two different, independent populations (with unknown variances that are assumed equal) with small n, the test statistic looks like
$t = \frac{(\bar{X}_1 - \bar{X}_2) - D}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$, with df = $n_1 + n_2 - 2$,
where D is the hypothesized difference between the population means and
$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$ (Pooled Sample Variance)

Example
To determine whether there is a difference in the time involved in using two versions of software, the new version of the software is compared to the original. Samples are taken from two independent groups using the software (data below). At the .01 significance level, is there a difference in the mean amount of time required to use the two versions of software?
Version 1: 6 8 6 9 10
Version 2: 5 9 8 7 7 6

Example, cont.
$H_0: \mu_1 - \mu_2 = 0$
$H_A: \mu_1 - \mu_2 \neq 0$
$\alpha = .01$
Because we have a two-tailed test, there is $\alpha/2 = .005$ in each tail.
df = $n_1 + n_2 - 2$ = 5 + 6 - 2 = 9
$s_1^2$ (variance 1) = 3.2, $s_2^2$ (variance 2) = 2.0
$\bar{X}_1$ (mean 1) = 7.8, $\bar{X}_2$ (mean 2) = 7.0
From the t-table, the critical cutoff for a two-tail test with $\alpha/2 = .005$ and df = 9 is 3.25.

Example, cont.
$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} = \frac{(4)(3.2) + (5)(2.0)}{5 + 6 - 2} = 2.53$
$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2 (1/n_1 + 1/n_2)}} = \frac{7.8 - 7.0}{\sqrt{2.53 (1/5 + 1/6)}} = \frac{.80}{.963} = .83$
Since .83 < 3.25, we do not reject $H_0$.
Conclusion: There is insufficient evidence to suggest that there is a difference between the mean time to use the two versions of software.

When comparing two different population proportions, the Null Hypothesis takes on the form:
$H_0: p_1 - p_2 = 0$
$H_0: p_1 = p_2$
The test statistic looks like:
$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\frac{p_c(1 - p_c)}{n_1} + \frac{p_c(1 - p_c)}{n_2}}}$
where $p_c = \frac{\text{Total number of successes}}{\text{Total number in samples}} = \frac{X_1 + X_2}{n_1 + n_2}$ (the weighted mean of the two sample proportions)

Are unmarried workers more likely to be absent from work than married workers? A sample of 250 married workers showed 22 missed more than 5 days last year, while a sample of 300 unmarried workers showed 35 missed more than five days. Use a .05 significance level.
Note: let pop #1 = unmarried workers.

Example, cont.
$H_0: p_u = p_m$
$H_A: p_u > p_m$
$\alpha = .05$
Rejection Region: $Z > 1.645$
$p_u$ = Unmarried Workers = $X_1 / n_1$ = 35/300 = .1167
$p_m$ = Married Workers = $X_2 / n_2$ = 22/250 = .0880
$p_c = \frac{35 + 22}{300 + 250} = .1036$

Chapter 9: problems 10, 15, 17, 18, 20, 40
Chapter 10: problems 12, 14 (two sample), 32, 33 (proportions)
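The two small-sample and two-proportion examples (software versions and worker absences) can be reproduced with the following Python sketch. The function names `pooled_t` and `two_proportion_z` are illustrative, the hypothesized difference D is assumed to be 0 as in the software example, and the critical values 3.25 and 1.645 are copied from the slides rather than computed.

```python
import math
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Pooled-variance t statistic (equal variances assumed, D = 0) and its df."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # statistics.variance is the sample variance (divides by n - 1).
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    t = (mean(sample1) - mean(sample2)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, df


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for the difference of two proportions using the pooled p_c."""
    pc = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pc * (1 - pc) / n1 + pc * (1 - pc) / n2)
    return (x1 / n1 - x2 / n2) / standard_error


# Software example: two-tail test, critical t = 3.25 for df = 9 (from the slides).
t_stat, df = pooled_t([6, 8, 6, 9, 10], [5, 9, 8, 7, 7, 6])
reject_software = abs(t_stat) > 3.25

# Absence example: pop #1 = unmarried, upper-tail test, critical z = 1.645.
z_stat = two_proportion_z(x1=35, n1=300, x2=22, n2=250)
reject_absence = z_stat > 1.645

print(round(t_stat, 2), df, reject_software)
print(round(z_stat, 2), reject_absence)
```

Because the code keeps full precision instead of the rounded proportions .1167, .0880 and .1036, the z statistic can differ from the slide's 1.099 in the third decimal place; the decision is unchanged.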