IS 310 Business Statistics
CSU Long Beach

Chapter 12: Tests of Goodness of Fit and Independence
  Goodness of Fit Test: A Multinomial Population
  Test of Independence
  Goodness of Fit Test: Poisson and Normal Distributions

Hypothesis (Goodness of Fit) Test for Proportions of a Multinomial Population
1. Set up the null and alternative hypotheses.
2. Select a random sample and record the observed frequency, \(f_i\), for each of the \(k\) categories.
3. Assuming \(H_0\) is true, compute the expected frequency, \(e_i\), in each category by multiplying the category probability by the sample size.
4. Compute the value of the test statistic:

\[ \chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i} \]

   where:
     \(f_i\) = observed frequency for category \(i\)
     \(e_i\) = expected frequency for category \(i\)
     \(k\) = number of categories
   Note: The test statistic has a chi-square distribution with \(k - 1\) degrees of freedom, provided that the expected frequencies are 5 or more for all categories.
5. Rejection rule:
   p-value approach: Reject \(H_0\) if p-value \(< \alpha\)
   Critical value approach: Reject \(H_0\) if \(\chi^2 > \chi^2_\alpha\)
   where \(\alpha\) is the significance level and there are \(k - 1\) degrees of freedom.

Multinomial Distribution Goodness of Fit Test
Example: Finger Lakes Homes (A)
Finger Lakes Homes manufactures four models of prefabricated homes: a two-story colonial, a log cabin, a split-level, and an A-frame. To help in production planning, management would like to determine if previous customer purchases indicate that there is a preference in the style selected.

The number of homes sold of each model for 100 sales over the past two years is shown below.

  Model     Colonial   Log   Split-Level   A-Frame
  # Sold       30       20       35          15

Hypotheses
  \(H_0\): \(p_C = p_L = p_S = p_A = .25\)
  \(H_a\): The population proportions are not \(p_C = .25\), \(p_L = .25\), \(p_S = .25\), and \(p_A = .25\)
  where:
    \(p_C\) = population proportion that purchase a colonial
    \(p_L\) = population proportion that purchase a log cabin
    \(p_S\) = population proportion that purchase a split-level
    \(p_A\) = population proportion that purchase an A-frame

Rejection Rule
  With \(\alpha = .05\) and \(k - 1 = 4 - 1 = 3\) degrees of freedom, reject \(H_0\) if p-value < .05 or \(\chi^2 > 7.815\).
  [Figure: chi-square distribution; "Do Not Reject \(H_0\)" region to the left of 7.815, "Reject \(H_0\)" region to the right.]

Expected Frequencies
  \(e_1 = .25(100) = 25\)    \(e_2 = .25(100) = 25\)
  \(e_3 = .25(100) = 25\)    \(e_4 = .25(100) = 25\)

Test Statistic

\[ \chi^2 = \frac{(30 - 25)^2}{25} + \frac{(20 - 25)^2}{25} + \frac{(35 - 25)^2}{25} + \frac{(15 - 25)^2}{25} = 1 + 1 + 4 + 4 = 10 \]

Conclusion Using the p-Value Approach

  Area in Upper Tail           .10     .05     .025     .01      .005
  \(\chi^2\) Value (df = 3)    6.251   7.815   9.348   11.345   12.838

  Because \(\chi^2 = 10\) is between 9.348 and 11.345, the area in the upper tail of the distribution is between .025 and .01. The p-value \(< \alpha\), so we can reject the null hypothesis.
  Note: A precise p-value can be found using Minitab or Excel.

Conclusion Using the Critical Value Approach
  \(\chi^2 = 10 > 7.815\)
  We reject, at the .05 level of significance, the assumption that there is no home style preference.

Test of Independence: Contingency Tables
1. Set up the null and alternative hypotheses.
2. Select a random sample and record the observed frequency, \(f_{ij}\), for each cell of the contingency table.
3. Compute the expected frequency, \(e_{ij}\), for each cell:

\[ e_{ij} = \frac{(\text{Row } i \text{ Total})(\text{Column } j \text{ Total})}{\text{Sample Size}} \]

4. Compute the test statistic:

\[ \chi^2 = \sum_{i} \sum_{j} \frac{(f_{ij} - e_{ij})^2}{e_{ij}} \]

5. Determine the rejection rule:
   Reject \(H_0\) if p-value \(< \alpha\) or \(\chi^2 > \chi^2_\alpha\),
   where \(\alpha\) is the significance level and, with \(n\) rows and \(m\) columns, there are \((n - 1)(m - 1)\) degrees of freedom.

Contingency Table (Independence) Test
Example: Finger Lakes Homes (B)
Each home sold by Finger Lakes Homes can be classified according to price and to style. Finger Lakes' manager would like to determine if the price of the home and the style of the home are independent variables.

The number of homes sold for each model and price for the past two years is shown below. For convenience, the price of the home is listed as either $99,000 or less or more than $99,000.

  Price        Colonial   Log   Split-Level   A-Frame
  ≤ $99,000       18        6       19           12
  > $99,000       12       14       16            3

Hypotheses
  \(H_0\): Price of the home is independent of the style of the home that is purchased
  \(H_a\): Price of the home is not independent of the style of the home that is purchased

Expected Frequencies
  Each expected frequency is computed from the row and column totals of the observed table:

  Price     Colonial   Log   Split-Level   A-Frame   Total
  ≤ $99K       18        6       19           12       55
  > $99K       12       14       16            3       45
  Total        30       20       35           15      100

  For example, \(e_{11} = (55)(30)/100 = 16.5\).

Rejection Rule
  With \(\alpha = .05\) and \((2 - 1)(4 - 1) = 3\) d.f., \(\chi^2_{.05} = 7.815\).
  Reject \(H_0\) if p-value < .05 or \(\chi^2 > 7.815\).

Test Statistic

\[ \chi^2 = \frac{(18 - 16.5)^2}{16.5} + \frac{(6 - 11)^2}{11} + \cdots + \frac{(3 - 6.75)^2}{6.75} = .1364 + 2.2727 + \cdots + 2.0833 = 9.149 \]

Conclusion Using the p-Value Approach

  Area in Upper Tail           .10     .05     .025     .01      .005
  \(\chi^2\) Value (df = 3)    6.251   7.815   9.348   11.345   12.838

  Because \(\chi^2 = 9.149\) is between 7.815 and 9.348, the area in the upper tail of the distribution is between .05 and .025. The p-value \(< \alpha\), so we can reject the null hypothesis.
  Note: A precise p-value can be found using Minitab or Excel.

Conclusion Using the Critical Value Approach
  \(\chi^2 = 9.149 > 7.815\)
  We reject, at the .05 level of significance, the assumption that the price of the home is independent of the style of home that is purchased.

Goodness of Fit Test: Poisson Distribution
1. Set up the null and alternative hypotheses.
   \(H_0\): Population has a Poisson probability distribution
   \(H_a\): Population does not have a Poisson distribution
2. Select a random sample and
   a. Record the observed frequency \(f_i\) for each value of the Poisson random variable.
   b. Compute the mean number of occurrences \(\mu\).
3. Compute the expected frequency of occurrences \(e_i\) for each value of the Poisson random variable.
4. Compute the value of the test statistic:

\[ \chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i} \]

   where:
     \(f_i\) = observed frequency for category \(i\)
     \(e_i\) = expected frequency for category \(i\)
     \(k\) = number of categories
5.
Rejection rule:
   p-value approach: Reject \(H_0\) if p-value \(< \alpha\)
   Critical value approach: Reject \(H_0\) if \(\chi^2 > \chi^2_\alpha\)
   where \(\alpha\) is the significance level and there are \(k - 2\) degrees of freedom.

Goodness of Fit Test: Poisson Distribution
Example: Troy Parking Garage
In studying the need for an additional entrance to a city parking garage, a consultant has recommended an analysis approach that is applicable only in situations where the number of cars entering during a specified time period follows a Poisson distribution.

A random sample of 100 one-minute time intervals resulted in the customer arrivals listed below. A statistical test must be conducted to see if the assumption of a Poisson distribution is reasonable.

  # Arrivals   0   1   2   3    4    5    6    7   8   9   10   11   12
  Frequency    0   1   4   10   14   20   12   12  9   8   6    3    1

Hypotheses
  \(H_0\): Number of cars entering the garage during a one-minute interval is Poisson distributed
  \(H_a\): Number of cars entering the garage during a one-minute interval is not Poisson distributed

Estimate of Poisson Probability Function
  Total Arrivals = 0(0) + 1(1) + 2(4) + . . . + 12(1) = 600
  Total Time Periods = 100
  Estimate of \(\mu\) = 600/100 = 6
  Hence,

\[ f(x) = \frac{6^x e^{-6}}{x!} \]
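
The estimate above can be sketched in a few lines of Python. This is only an illustration under the assumption that the frequency table is stored as a list indexed by the number of arrivals; the names `observed`, `mu_hat`, and `poisson_pmf` are made up for this sketch and do not come from the slides.

```python
import math

# Observed frequencies from the Troy Parking Garage sample.
# List index = number of arrivals in a one-minute interval (0 through 12).
observed = [0, 1, 4, 10, 14, 20, 12, 12, 9, 8, 6, 3, 1]

n = sum(observed)  # total number of one-minute time periods
total_arrivals = sum(x * f for x, f in enumerate(observed))
mu_hat = total_arrivals / n  # sample mean, used as the estimate of mu


def poisson_pmf(x: int, mu: float) -> float:
    """Poisson probability f(x) = mu^x * e^(-mu) / x!."""
    return mu ** x * math.exp(-mu) / math.factorial(x)


print("time periods:", n)
print("total arrivals:", total_arrivals)
print("estimate of mu:", mu_hat)
for x in range(len(observed)):
    # Probability of x arrivals and the matching expected frequency n * f(x)
    print(x, round(poisson_pmf(x, mu_hat), 4), round(n * poisson_pmf(x, mu_hat), 2))
```

Multiplying each probability by the sample size gives the expected frequency for that number of arrivals; the last row of such a table would normally be an open-ended "12 or more" category so that the probabilities sum to 1.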

Goodness of Fit Test: Poisson Distribution
Expected Frequencies

  x       f(x)     n f(x)
  0      .0025       .25
  1      .0149      1.49
  2      .0446      4.46
  3      .0892      8.92
  4      .1339     13.39
  5      .1606     16.06
  6      .1606     16.06
  7      .1377     13.77
  8      .1033     10.33
  9      .0688      6.88
  10     .0413      4.13
  11     .0225      2.25
  12+    .0201      2.01
  Total 1.0000    100.00

Observed and Expected Frequencies

  i              f_i     e_i    f_i - e_i
  0 or 1 or 2     5      6.20    -1.20
  3              10      8.92     1.08
  4              14     13.39     0.61
  5              20     16.06     3.94
  6              12     16.06    -4.06
  7              12     13.77    -1.77
  8               9     10.33    -1.33
  9               8      6.88     1.12
  10 or more     10      8.39     1.61

Rejection Rule
  With \(\alpha = .05\) and \(k - p - 1 = 9 - 1 - 1 = 7\) d.f. (where \(k\) = number of categories and \(p\) = number of population parameters estimated), \(\chi^2_{.05} = 14.067\).
  Reject \(H_0\) if p-value < .05 or \(\chi^2 > 14.067\).

Test Statistic

\[ \chi^2 = \frac{(-1.20)^2}{6.20} + \frac{(1.08)^2}{8.92} + \cdots + \frac{(1.61)^2}{8.39} = 3.268 \]

Conclusion Using the p-Value Approach

  Area in Upper Tail           .90     .10      .05      .025     .01
  \(\chi^2\) Value (df = 7)    2.833   12.017   14.067   16.013   18.475

  Because \(\chi^2 = 3.268\) is between 2.833 and 12.017 in the Chi-Square Distribution Table, the area in the upper tail of the distribution is between .90 and .10. The p-value \(> \alpha\), so we cannot reject the null hypothesis. There is no reason to doubt the assumption of a Poisson distribution.
  Note: A precise p-value can be found using Minitab or Excel.

Goodness of Fit Test: Normal Distribution
1. Set up the null and alternative hypotheses.
2. Select a random sample and
   a. Compute the mean and standard deviation.
   b. Define intervals of values so that the expected frequency is at least 5 for each interval.
   c. For each interval, record the observed frequencies.
3. Compute the expected frequency, \(e_i\), for each interval.
4.
Compute the value of the test statistic:

\[ \chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i} \]

5. Reject \(H_0\) if \(\chi^2 > \chi^2_\alpha\) (where \(\alpha\) is the significance level and there are \(k - 3\) degrees of freedom).

Normal Distribution Goodness of Fit Test
Example: IQ Computers
IQ Computers (one better than HP?) manufactures and sells a general-purpose microcomputer. As part of a study to evaluate sales personnel, management wants to determine, at a .05 significance level, if the annual sales volume (number of units sold by a salesperson) follows a normal probability distribution.

A simple random sample of 30 of the salespeople was taken, and their numbers of units sold are listed below.

  33   43   44   45   52   52   56   58   63    64
  64   65   66   68   70   72   73   73   74    75
  83   84   85   86   91   92   94   98   102   105

  (mean = 71, standard deviation = 18.54)

Hypotheses
  \(H_0\): The population of number of units sold has a normal distribution with mean 71 and standard deviation 18.54.
  \(H_a\): The population of number of units sold does not have a normal distribution with mean 71 and standard deviation 18.54.

Interval Definition
  To satisfy the requirement of an expected frequency of at least 5 in each interval, we will divide the normal distribution into 30/5 = 6 equal-probability intervals.
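
As a rough sketch of the interval definition above, the boundaries of the six equal-probability intervals can be computed from the inverse normal CDF. The example uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative and not from the slides. Because the inverse CDF is not rounded to two-decimal z values, the boundaries it prints may differ slightly in the second decimal place from a hand calculation with a z table.

```python
from statistics import NormalDist

# Sample statistics reported for the IQ Computers example
mean, sd = 71.0, 18.54
n = 30            # sample size
min_expected = 5  # smallest expected frequency allowed per interval

k = n // min_expected  # number of equal-probability intervals: 30/5 = 6
expected = n / k       # expected frequency in every interval

dist = NormalDist(mu=mean, sigma=sd)

# The k - 1 interior boundaries sit at cumulative probabilities 1/k, 2/k, ...
cut_points = [dist.inv_cdf(i / k) for i in range(1, k)]

print("intervals:", k, "expected frequency per interval:", expected)
print("area per interval:", round(1 / k, 4))
print("boundaries:", [round(c, 2) for c in cut_points])
```

Choosing equal-probability intervals keeps every expected frequency at exactly n/k, so the "at least 5" requirement holds by construction.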

Normal Distribution Goodness of Fit Test
Interval Definition
  Areas = 1.00/6 = .1667
  Interval boundaries:
    71 - .97(18.54) = 53.02
    71 - .43(18.54) = 63.03
    71
    71 + .43(18.54) = 78.97
    71 + .97(18.54) = 88.98

Observed and Expected Frequencies

  i                   f_i   e_i   f_i - e_i
  Less than 53.02      6     5       1
  53.02 to 63.03       3     5      -2
  63.03 to 71.00       6     5       1
  71.00 to 78.97       5     5       0
  78.97 to 88.98       4     5      -1
  More than 88.98      6     5       1
  Total               30    30

Rejection Rule
  With \(\alpha = .05\) and \(k - p - 1 = 6 - 2 - 1 = 3\) d.f. (where \(k\) = number of categories and \(p\) = number of population parameters estimated), \(\chi^2_{.05} = 7.815\).
  Reject \(H_0\) if p-value < .05 or \(\chi^2 > 7.815\).

Test Statistic

\[ \chi^2 = \frac{(1)^2}{5} + \frac{(-2)^2}{5} + \frac{(1)^2}{5} + \frac{(0)^2}{5} + \frac{(-1)^2}{5} + \frac{(1)^2}{5} = 1.600 \]

Conclusion Using the p-Value Approach

  Area in Upper Tail           .90    .10     .05     .025    .01
  \(\chi^2\) Value (df = 3)    .584   6.251   7.815   9.348   11.345

  Because \(\chi^2 = 1.600\) is between .584 and 6.251 in the Chi-Square Distribution Table, the area in the upper tail of the distribution is between .90 and .10. The p-value \(> \alpha\), so we cannot reject the null hypothesis. There is little evidence to support rejecting the assumption that the population is normally distributed with \(\mu = 71\) and \(\sigma = 18.54\).
  Note: A precise p-value can be found using Minitab or Excel.

End of Chapter 12