Download 2 Goodness of Fit Test

χ2 "chi-square" probability distribution 0 • • • • • • • • µ χ2 "chi-square" probability distribution χ2 continuous probability distribution shape is skewed to the right variable values on horizontal axis are ≥ 0 area under the curve represents probability extends out to infinity along positive horizontal axis shape depends on "degrees of freedom" (d.f.)  when d.f. = 2 only, curve is an exponential distribution  for higher degrees of freedom, the peak shifts to the right  when d.f. > 90, curve begins to look like normal distribution (even then χ2 is still always somewhat skewed right) mean is located a little to the right of the peak Mean and Standard Deviation are determined by the degrees of freedom mean = d.f. standard deviation = 2 (d.f .) TI –83, 84: χ2 cdf (lower bound, upper bound, degrees of freedom) We will be using the right tail for hypothesis tests for Goodness of Fit or for Independence TI –83, 84: χ2 cdf (lower bound, 10^99, degrees of freedom) χ2 "chi-square" probability distribution * * * * * * * * * ** * * * * * * * * ** * * * * * * * * ** * * * * * * * * ** * * * * * * * χ2 "chi-square" probability distributionis used for χ2 with 6 degrees of freedom χ2 with 10 degrees of freedom PRACTICE: Sketch the shape yourself. Make sure it is skewed to the right. Finding probabilities for the χ2 "chi-square" probability distribution Page 1 Test of Goodness of Fit: Hypothesis: Population fits an assumed distribution (theory) Sample data is collected from a population Hypothesis test is performed to see if the sample data supports the theory that the population fits this assumed distribution, or not. Test of Independence: Hypothesis: Two qualitative variables are independent of each other Sample data is collected from a population Hypothesis test is performed to see if the sample data supports the theory that these two variables are independent or not Hypothesis Test (and confidence intervals) for an unknown population standard deviation. How we use the Chi-square distribution to make a decision inn the Test of Independence and the Test of Goodness of Fit: χ2 Goodness of Fit Test Writing the Hypothesis Test for a Goodness Of Fit Test We compare our observed data to what the data would be expected to look like if the null hypothesis is true. Null hypothesis states a probability distribution that we think the population follows Test Statistic: Measure of the differences between Observed Data and Expected Data • If the differences are large, the Observed and Expected Data are different, so the observed data contradicts the null hypothesis If the differences are small, the Observed and Expected Data are similar so the observed data does NOT contradict the null hypothesis The test statistic measures differences between observed and expected data The test statistic in this test follows a χ2 "chi-square" probability distribution. CASE 1: Observed and Expected are very different Differences between observed and expected are large Test Statistic is large Test Statistic is far out in the right tail of the χ2 probability distribution We look at sample data to decide if the population follows that distribution or not. Null Hypothesis: Ho: The data fit the expected (hypothesized) distribution (Describe the expected hypothesized distribution either in a sentence, by using its appropriate name, by showing a formula or referring to a nearby visible table or list) Alternate Hypothesis: Ha: The data do not fit the expected (hypothesized) distribution • Decide upon α • Collect observed data. Calculate expected data. • Find the test statistic and p-value and draw the graph Test statistic = Area to the right of the test statistic is small Differences between observed and expected are small Test Statistic is small Test Statistic is NOT far out in the right tail of the χ2 probability distribution Area to the right of the test statistic is large ∑ all cells (Obs − E) 2 E in table • • CASE 2: Observed and Expected are similar Write the hypotheses Find the p-value χ2 distribution , right tailed test df = # of CELLS – 1 χ2cdf ( test statistic, 10^99, df) Draw the graph ; shade the p-value; label axis, test statistic & p-value Compare p-value to the significance level α and make a decision. Write a conclusion in the context of the problem. Calculator Instructions for test statistic STAT EDIT: Enter Observed Data frequencies in list L1 Enter Expected Theoretical frequencies in list L2 (Do NOT put totals in either list) Arrow up to the very top (title line) in list L3 and input the instruction (L1 −L2)2 / L2 STAT CALC: 1 variable stats L3. ΣX is the total of all items in list L3 ΣX = ∑ (Obs. − E ) 2 E is the test statistic The test statistic is a number measuring the size of the differences between the distributions Page 2 χ2 Goodness of Fit Test Hypothesis Test for Goodness of Fit Example 1 Draw, shade and label the graph Facebook and Twitter data are based on data in the Report on Social Network Demographics in 2012 based on data from Google's Doubleclick Ad Planner http://royal.pingdom.com/2012/08/21/report-social-network-demographics-in-2012/) The table below shows the age distribution of Facebook users Age Distribution of Facebook users in 2012 : Example: 9% of all Facebook users are age 18-24 Age Range 0-17 18-24 25-34 35-44 45-54 Percent 7% 9% 19% 20% 33% 55-64 9% 65+ 3% Suppose a random sample of 175 Twitter users has the age distribution below: Age Distributon for a sample of 175 Twitter Users: Example: 25 people in the sample of 175 Twitter users are age 18-24 Age Range 0-17 18-24 25-34 35-44 45-54 55-64 65+ Number 14 25 39 41 40 11 5 We want to know if the age distribution of Twitter users fits the same age distribution as Facebook Users. Degrees of freedom = number of CELLS – 1 = ____ – 1 = _____ Probability Distribution to use to calculate the p-value: _____ with df = _____ p-value = P(χ2 > ____) = (χ2cdf (________, ________, ______) =_______ TI –83, 84:χ2 cdf (lower bound, 10^99, degrees of freedom) Decision: ______________________________________________ Reason for Decision: Null Hypothesis Ho: _____________________________________________ _________________________________________________________________ Alternate Hypothesis HA: ________________________________________ _________________________________________________________________ α = __________ What would the sample data look like if it exactly followed the expected distribution (null hypothesis)? L1 Observed Data L2 Expected Data L3 (Observed – Expected)2 Expected 0-17 18-24 25-34 35-44 At a 5% level of significance: The sample data provide sufficient evidence to conclude that age distribution of Twitter users is different from that of Facebook users. The age distribution of Twitter users does not fit the same age distribution as Facebook users. Calculator Instructions for finding the test statistic: STAT EDIT: Enter Observed Data frequencies in list L1 Enter Expected Theoretical frequencies in list L2 (Do NOT put totals in either list) Arrow up to the very top (title line) in list L3 and input (L1 −L2)2 / L2 ; then press Enter STAT CALC: 45-54 55-64 1 variable stats L3. ΣX is the total of all items in list L3 ΣX = 65+ Total = Page 3 Conclusion: Total = Test Statistic = Sum = ______ (Obs. − E ) 2 ∑ E is the test statistic The test statistic is a number that measures the size of the differences between the distributions χ2 Goodness of Fit Test Draw, shade and label the graph Hypothesis Test for Goodness of Fit Example 2 At Cathy's Campus Coffee Cart, Cathy has a theory about what her customers order. She thinks that the distribution of her customers orders is Hypothesized Distribution Order large small caffeinated caffeinated coffee coffee Percent 30% 20% decaf coffee tea iced drinks juice 10% 10% 20% 10% Cathy wants to determine whether this is really the true distribution of orders for all her customers. Cathy tests her theory by selecting a sample of 200 customers and finds that the orders are: Observed Data from sample Order Order large caffeinated coffee Number 70 33 small caffeinated coffee 35 Degrees of freedom = number of CELLS – 1 = ____ – 1 = _____ Probability Distribution to use to calculate the p-value: _____ with df = _____ p-value = P(χ2 > ____) = (χ2cdf (________, ________, ______) =_______ TI –83, 84:χ2 cdf (lower bound, 10^99, degrees of freedom) decaf coffee tea iced drinks Decision: ______________________________________________ 10 37 15 Reason for Decision: Null Hypothesis Ho: ______________________________________________ _______________________________________________________________________ _______________________________________________________________________ Alternate Hypothesis HA: ________________________________________ _______________________________________________________________________ _______________________________________________________________________ α = __________ What would the sample data look like if it exactly followed the expected distribution (null hypothesis)? L1 Observed Data L2 Expected Data L3 (Observed – Expected)2 Expected large coffee small ff decaf tea iced drinks juice Total = Page 4 Conclusion: At a 5% level of significance, the sample data provide evidence to conclude that the true distribution of purchases of all Cathy's customers does NOT fit the expected distribution of 30% large caffeinated coffee; 20% small caffeinated coffee; 10% decaf coffee; 10% tea; 20% iced drinks; 10% juice Calculator Instructions for finding the test statistic: STAT EDIT: Enter Observed Data frequencies in list L1 Enter Expected Theoretical frequencies in list L2 (Do NOT put totals in either list) Arrow up to the very top (title line) in list L3 and input (L1 −L2)2 / L2 ; then press Enter STAT CALC: 1 variable stats L3. ΣX is the total of all items in list L3 ΣX = (Obs. − E ) 2 ∑ E is the test statistic The test statistic is a number that measures the size of the differences between the distributions Total = Test Statistic = Sum = ______ χ2 Goodness of Fit Test Draw, shade and label the graph Hypothesis Test for Goodness of Fit Example 3 Is the random number generator in the TI-84 calculator uniformly distributed when simulating rolling a 6-sided die? I used my TI-84 calculator to generate a list of 500 random integers between 1 and 6; this simulates rolling a 6 sided die 500 times. (That is the largest sample my calculator memory would hold on the day I did this). If the random number generator is not biased, we would expect the outcomes to be uniformly distributed over the integers 1,2,3,4,5,6 with some small amount of variation due to randomness. In the chapter for Central Limit Theorem we had each student in the class use their calculators to roll a die 100 times. Based on my feelings (not knowledge) about the numbers that students get, I have suspected that the random number generator in the TI calculator might be biased and might not be uniformly distributed. Degrees of freedom = number of CELLS – 1 = ____ – 1 = _____ Probability Distribution to use to calculate the p-value: _____ with df = _____ p-value = P(χ2 > ____) = (χ2cdf (________, ________, ______) =_______ TI –83, 84:χ2 cdf (lower bound, 10^99, degrees of freedom) Here is the sample my calculator to generated to simulate rolls of the die: Number Showing on Die Number of Rolls 1 2 3 4 5 6 78 84 89 81 100 68 Decision: ______________________________________________ Reason for Decision: Based on the sample data, are random numbers simulating dice rolls on the TI-84 calculator uniformly distributed? Null Hypothesis Ho: ______________________________________________ Conclusion: _________________________________________________________________ At a 5% level of significance, ______________________________ Alternate Hypothesis HA: ________________________________________ _____________________________________________________ _________________________________________________________________ _____________________________________________________ α = __________ _____________________________________________________ What would the sample data look like if it exactly followed the expected distribution (null hypothesis)? _____________________________________________________ Outcome of Rolling Die L1 Observed Data L2 Expected Data L3 (Observed – Expected)2 Expected 1 2 3 4 5 6 STAT CALC: 1 variable stats L3. ΣX is the total of all items in list L3 (Obs. − E ) 2 ΣX = ∑ E Total = Page 5 Calculator Instructions for finding the test statistic: STAT EDIT: Enter Observed Data frequencies in list L1 Enter Expected Theoretical frequencies in list L2 (Do NOT put totals in either list) Arrow up to the very top (title line) in list L3 and input (L1 −L2)2 / L2 ; then press Enter Total = Test Statistic = Sum = ______ is the test statistic The test statistic is a number that measures the size of the differences between the distributions χ2 Goodness of Fit Test Draw, shade and label the graph Hypothesis Test for Goodness of Fit Example 4 Are births uniformly distributed by day of the week? The Health Commissioner in River County wants to know whether births in the county are uniformly distributed by day of week. A sample of 434 randomly selected births at hospitals in the county yield the following data. Degrees of freedom = number of CELLS – 1 = ____ – 1 = _____ Day of the Week Number of Births in sample Sun. 41 Mon. 64 Tues. Wed. 71 72 Thurs. Fri. 71 69 Sat. 46 Based on information in the National Vital Statistics Report January 7, 2009 http://www.cdc.gov/nchs/data/nvsr/nvsr57/nvsr57_07.pdf Probability Distribution to use to calculate the p-value: _____ with df = _____ p-value = P(χ2 > ____) = (χ2cdf (________, ________, ______) =_______ TI –83, 84:χ2 cdf (lower bound, 10^99, degrees of freedom) Null Hypothesis Ho: ______________________________________________ Decision: ______________________________________________ _________________________________________________________________ Reason for Decision: Alternate Hypothesis HA: ________________________________________ _________________________________________________________________ α = __________ At a 5% level of significance, ______________________________ What would the sample data look like if it exactly followed the expected distribution (null hypothesis)? Day of Week L1 Observed Data Conclusion: L2 Expected Data L3 (Observed – Expected)2 Expected _____________________________________________________ _____________________________________________________ _____________________________________________________ _____________________________________________________ Calculator Instructions for finding the test statistic: STAT EDIT: Enter Observed Data frequencies in list L1 Enter Expected Theoretical frequencies in list L2 (Do NOT put totals in either list) Arrow up to the very top (title line) in list L3 and input (L1 −L2)2 / L2 ; then press Enter STAT CALC: Total = Total = Test Statistic = Sum = ______ 1 variable stats L3. ΣX is the total of all items in list L3 ΣX = (Obs. − E ) 2 ∑ E is the test statistic The test statistic is a number that measures the size of the differences between the distributions Page 6 χ2 Hypothesis Test for Independence χ2 Hypothesis Test for Independence Re-Exploring Independence in a Contingency Table: EXAMPLE 5: Lina is a statistics student doing a project to compare ice cream flavor preferences at 3 ice cream stores in different cities. She wants to determine if customer preferences are related to store location or if they are independent.. She will select a sample of customers, and categorize each customer by store location and flavor preference. Assume that there are only 3 flavors A. Fremont(F) Gilroy(G) Hayward (H) Chocolate (C) 25 25 50 Vanilla (V) 15 15 30 Mint Chip (M) 10 10 20 Total 50 50 100 Are flavor preference and store location INDEPENDENT? Total 100 60 40 200 We use a hypothesis Test for Independence to help us decide whether two qualitative variables are independent or not. Null hypothesis assumes that the variables ARE independent. Alternate hypothesis challenges us to show that the variables are NOT independent. We calculate what the data is expected to look like if the variables are independent as in Ho. We compare the observed data to the expected data Is the "observed data" sufficiently different from "expected" to decide that Ho is not true. OR is the "observed data" reasonably close to "expected" Writing the Hypothesis Test for Independence • Write the hypotheses B: Fremont (F) Gilroy(G) Hayward (H) Chocolate (C) 20 10 70 Vanilla (V) 20 10 20 Mint Chip (M) 10 30 10 Total 50 50 100 Are flavor preference and store location INDEPENDENT? Total 100 50 50 200 Null Hypothesis: The variables (DESCRIBE THEM) are independent Alternate Hypothesis: The variables (DESCRIBE THEM) are NOT independent • Decide upon significance level α • Collect observed data. Calculate expected data. • Find the test statistic and p-value: (Obs − E) 2 Test statistic ∑ E all cells C: Fremont (F) Gilroy(G) Hayward (H) Chocolate (C) 27 24 49 Vanilla (V) 13 15 32 Mint Chip (M) 10 11 19 Total 50 50 100 Are flavor preference and store location INDEPENDENT? How far can our data vary from the independent data in part A and still be considered "similar"? How far must our data vary from the independent data in part A in order to be considered "different"? Page 7 in table Total 100 60 40 200 Find the p-value χ2 distribution , right tailed test df = (# of Rows – 1) (# of Columns -1) χ2cdf ( test statistic, 10^99, df) Draw the graph ; shade the p-value; label axis, test statistic & p-value • Compare p-value to the significance level α and make a decision. • Write a conclusion in the context of the problem. χ2 Hypothesis Test for Independence (in a 2-way contingency table) Measure of the difference for each cell inside the table EXAMPLE 6: Lina is a statistics student doing a project to compare ice cream flavor preferences at 3 ice cream stores in different cities. She wants to determine if customer preferences are related to store location or if they are independent.. She will select a sample of customers, and categorize each customer by store location and flavor preference. Are flavor preference and store location independent? Data Set D:Lina's actual data: OBSERVED DATA Fremont (F) Chocolate (C) Vanilla (V) Mint Chip (M) Total 22 10 18 50 Gilroy( G) 30 12 8 50 Hayward (H) 48 38 14 100 Total 100 60 40 200 (observed data value − expected data value) 2 (Obs − E) 2 = expected data value E Add them all up to get: Test Statistic = (Obs − E) 2 ∑ E all cells in table Test Statistic = (22−25) /25 + (30−25) /25 + (48−50)2/50 2 2 + (10−15)2/15 + (12−15)2/15 + (38−30)2/30 + (18−10)2/10 + (8−10)2/10 + (14−20)2/20 Test Statistic = 14.44 Draw, shade and label the graph Null Hypothesis: Ho: __________________________________________________________ ____________________________________________________________ Alternate Hypothesis: Degrees of freedom = (number of rows – 1)(number of columns – 1) Ha: __________________________________________________________ ____________________________________________________________ Compare OBSERVED Data to how the data is EXPECTED to look IF INDEPENDENT Calculating the EXPECTED DATA (assumes independence) EXPECTED DATA Fremont (F) Gilroy( G) Hayward (H) Chocolate (C) df = (___ – 1)(____ – 1) = (___)(____) = ____ Probability Distribution to use to calculate p-value: _____ with df = _____ p-value = P(χ2 > _____) = (χ2cdf (_________, _________, _____) = ____ TI –83, 84:χ2 cdf (lower bound, 10^99, degrees of freedom) Total 100 Decision: Reason for Decision: Vanilla (V) 60 Mint Chip (M) 40 Conclusion: At a 5% level of significance, the data show sufficient evidence to conclude that customer ice cream flavor preference and store location are NOT independent. 200 Flavor preference depends on store location. Total 50 50 100 How close is actual "observed data (Obs)" to "expected data (E)"? Page 8 EXAMPLE 7: χ2 Hypothesis Test for Independence A professor believes that giving extra time on an exam won’t affect the grades. He divided a random sample of 300 students into two groups and gave the same test. The first group had one hour to complete the exam; the second group had unlimited time. At a 5% level of significance, is there sample evidence to show time on exam and grade are related? OBSERVED DATA 1 hour Unlimited time TOTAL Null Hypothesis: Grade Earned on Exam C A B 65 23 42 85 17 48 150 40 90 F 12 8 20 TOTAL 142 158 300 Ho: Grade and Time are Independent Notes: The test was done by putting the observed data into a matrix and using the χ2 Test. The calculator calculated the expected data and stored it in another matrix. The calculator found the test statistic as a measure of the differences between the observed and expected data. The calculator used the Chi-Square distribution and the test statistic to calculate the p-value using the χ2cdf p-value = P(χ2 > 3.92) = χ2cdf (3.92, 10^99, 3) = 0.2697 TI –83, 84:χ2 cdf (lower bound, 10^99, degrees of freedom) Alternate Hypothesis: Ha: Grade and Time are Dependent Use the χ2 Test on your calculator to find the following: Find the output matrix with the EXPECTED DATA and fill it in Test Statistic = 3.92 Degrees of freedom = (number of rows – 1)(number of columns – 1) Degrees of freedom = (2 – 1)(4 – 1) = (1)(3) = 3 Probability Distribution used to calculate p-value: χ2 with df = 3 p-value = 0.2597 Draw, shade and label the graph Decision: Do NOT Reject Ho Reason for Decision: Pvalue > α 0.2697 > 0.05 Conclusion: At a 5% level of significance the data do not show evidence to show that time allowed for an exam and grade earned on the exam are dependent. We believe that time allowed for an exam and grade earned on the exam are independent. Page 9 OBSERVED DATA 1 hour Unlimited time TOTAL Grade Earned on Exam C A B 71 18.93 42.6 79 21.07 47.4 150 40 90 F 9.47 10.53 20 TOTAL 142 158 300 SEE NEXT PAGE FOR CALCULATOR INSTRUCTIONS FOR THE χ2 Test AND FOR FINDING THE EXPECTED DATA FORMULAS AND CALCULATOR INSTRUCTIONS TEST OF INDEPENDENCE Row Total  Formula for Expected Data in Cell =   (Column Total ) Total for Table   Type of Test Homogeneity Variables One variable, across multiple populations How close is actual "observed data (Obs)" to "expected data (E)"? Independence One population categorized according to two variables Measure differences for each cell inside the table (observed data value − expected data value) 2 (Obs − E) 2 = expected data value E (Obs − E) 2 Add them all up to get: Test Statistic = ∑ E all cells in table Calculator Instructions for Test of Independence: Entering the data: 2nd MATRIX EDIT; select matrix [A] and press enter. Enter matrix size: number of rows x number of columns, pressing enter after each is entered NOTE: Look only at data when figuring out size; do NOT include row or column totals in the matrix. Matrix on calculator will change shape on the screen to match the stated size. Enter observed data into matrix A, pressing enter after each number is entered. Performing the hypothesis test to find the p-value and test statistic STAT TESTS χ2 Test Observed: 2nd Matrix 1: [A] Enter (This is the input to the test) Expected: 2nd Matrix 2: [B] Enter (This is output the calculator creates when doing the test) OUTPUT SCREEN will show p-value and test statistic and degrees of freedon ON HOME SCREEN: To see matrices containing observed and expected data Observed: 2nd Matrix 1: [A] Enter Expected: 2nd Matrix 2: [B] Enter You may need to scroll to the right to see part of the matrix if it is too wide to fit on y our calculator screen Page 10 What is Tested? Is the variable similar across all populations? Does the the variable follows the same unknown distribution for all populations. Are the two variables independent or not independent? The mathematics and procedures of the Test of Homogeneity and the Test of Independence are the same. The sampling method is what distinguishes between which type of test it is. Example: Selecting a sample of male students and a sample of female students and testing to see whether the age distributions for males and females are the same or different is a test of _________________________________ Selecting a random sample of students, categorizing each student by age and gender, and then testing to see whether the age distribution is related to gender or not is a test of _________________________________ Example: We select a random sample of hospital births. We categorize each birth by the day of week on which it occurred and by whether it was a natural (vaginal) birth or a surgical birth (cesarean section). We test to see whether the variables type of birth and day or week are independent. This is a test of _____________________________________________ We select a random sample of natural births and a random sample of surgical births at a hospital. We test to see whether the pattern of births by day of the week are the same for the populations of natural births and surgical births. This is a test of _____________________________________________ Hypotheses: Test of Independence: Ho: The variables are independent Ha: The variables are not independent Hypotheses: Test of Homogeniety Ho: The distributions of the two populations are the same. Ha: The distributions of the two populations are not the same. χ2 Hypothesis Test for Independence (in a 2-way contingency table) Draw, shade and label the graph EXAMPLE 8: A random sample of 1343 US adults age 25 and over were surveyed and asked about their educational status and whether they were employed or unemployed. Are employment status and educational status independent? SOURCE: Based on information in the Current Population Survey Feb 2013 http://www.bls.gov/webapps/legacy/cpsatab4.htm http://data.bls.gov/pdq/SurveyOutputServlet Not a high High OBSERVED school School DATA Graduate Graduate Employed 100 333 Unemployed 13 29 TOTAL 113 362 p-value = P(χ2 > _____) = (χ2cdf (_________, _________, _____) = ____ Some College or Associate Degree 348 25 373 Bachelors Degree or Higher 476 19 495 TI –83, 84:χ2 cdf (lower bound, 10^99, degrees of freedom) TOTAL 1257 86 1343 Decision: Reason for Decision: Conclusion: Null Hypothesis: Ho: __________________________________________________________ ____________________________________________________________ Alternate Hypothesis: Find the output matrix with the EXPECTED DATA and fill it in Ha: __________________________________________________________ ____________________________________________________________ Use the χ2 Test on your calculator to find the following: Test Statistic = _____________ Probability Distribution used to calculate p-value: _____ with df = _____ Degrees of freedom = (number of rows – 1)(number of columns – 1) EXPECTED DATA Employed Unemployed TOTAL Not a high school Graduate High School Gradutate Some College or Associate Degree Bachelors Degree or Higher 113 362 373 495 By hand, show the calculation that the calculator has performed to find the expected number of a) unemployed people with bachelor's degree or higher. p-value = _______________ b) employed people with some college or associate degree Page 11 TOTAL 1257 86 1343 χ2 Hypothesis Test for Independence (in a 2-way contingency table) Draw, shade and label the graph EXAMPLE 9: A store conducts a survey of a random sample of customers. The survey asks about income level and whether the customer shops online or in the store. Conduct a hypothesis test to determine whether the income level of customers and whether customers shop in the store or online are independent. OBSERVED DATA Income < $50,000 $50,000-$99,999 $100,000 or higher TOTAL Shop in the store only 40 53 31 124 Shop online only 22 35 26 83 Shop both in store and online 55 94 44 193 p-value = P(χ2 > _____) = (χ2cdf (_________, _________, _____) = ____ TI –83, 84:χ2 cdf (lower bound, 10^99, degrees of freedom) TOTAL 117 182 101 400 Decision: Reason for Decision: Conclusion: Null Hypothesis: Ho: __________________________________________________________ ____________________________________________________________ Find the output matrix with the EXPECTED DATA and fill it in Alternate Hypothesis: Ha: __________________________________________________________ ____________________________________________________________ Use the χ2 Test on your calculator to find the following: Test Statistic = _____________ Probability Distribution used to calculate p-value: _____ with df = _____ Degrees of freedom = (number of rows – 1)(number of columns – 1) EXPECTED DATA Income < $50,000 $50,000-$99,999 $100,000 or higher TOTAL Shop in the store only 124 Shop online only Shop both in store and online 83 193 By hand, show the calculation that the calculator has performed to find the expected number of a) people with income of <$50,000 who shop in the store only. p-value = _______________ b) people with income of $100,000 + who shop online only Page 12 TOTAL 117 182 101 400

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download 2 Goodness of Fit Test