SOLUTIONS ACTIVITY SET 14

Activity 14.1
For each of the following, does the situation or research question involve independent samples or paired data?

a. Twenty-five people have their cholesterol measured before eating a Big Mac and again after eating a Big Mac. On average, does eating a Big Mac increase cholesterol?
Paired data: the measurements will be taken twice on the same 25 subjects.

b. What is the difference in average ages at which teachers and plumbers retire?
Two independent samples.

c. What is the difference in average salaries for high school graduates and college graduates?
Two independent samples.

d. In fifty married couples, the husband and wife each separately take the same test of marital satisfaction. Is there a difference, on average, between the scores of husbands and wives?
Paired data: spousal data are often analyzed as paired data.

Activity 14.2
In the Datasets folder, click the link for the “GSS Dataset” to open Minitab with the data in place. The data are from the 2002 General Social Survey, a federally funded national survey done every other year by the University of Chicago. The variable marital indicates whether the respondent is presently married or not. We’ll compare the mean amount of television watching per typical day (tvhours is the variable) for those who are married versus those who are not.

a. In words, write a null hypothesis for this situation.
We’re comparing two means (television watching for married people versus unmarried people).
Null: there is no difference in mean television watching between married people and unmarried people.

b. Using statistical notation for means, write null and alternative hypotheses for this problem.
H0: μ1 – μ2 = 0, or equivalently H0: μ1 = μ2
Ha: μ1 – μ2 ≠ 0, or equivalently Ha: μ1 ≠ μ2

c. Recall from the lecture notes that when doing a two-sample t-test, one consideration is whether the two standard deviations (or variances) are equal.
To check, go to Stat > Basic Statistics > Display Descriptive Statistics and enter tvhours in the Variables window and marital in the By Variables window.

i. What are the two standard deviations?

Variable  marital         N    N*  Mean    SE Mean  StDev   Minimum  Q1
tvhours   1_NotMarried    496  1   3.276   0.121    2.693   0.000    2.000
          2_Married       387  1   2.6072  0.0927   1.8241  0.0000   1.0000

The standard deviations are 2.693 (not married) and 1.8241 (married).

ii. Is the larger standard deviation more than twice the smaller standard deviation?
No. 2.693 is not more than twice 1.8241 (2 × 1.8241 = 3.6482).

iii. If your answer to part ii is “Yes”, then we will use the unpooled method for calculating the standard error. If your answer was “No”, then we can use the pooled method. Which method should we use?
Pooled.

d. The two-sample t-test is used to compare means when the data are from two independent samples (as they are here). Use Stat > Basic Statistics > 2-sample t. At the top of the dialog box, enter tvhours in the “Samples” box and marital in the “Subscripts” box. If your answer to part iii above is to use pooled, then click the box for “Assume Equal Variances”. Read the output to find the values of the t-statistic and the p-value.
t = 4.19
p-value = 0.000

e. State a conclusion about the hypotheses and about the “real world” situation.
Reject the null hypothesis. Conclude that the population mean TV hours differs for the two groups. It looks like the mean is higher for unmarried people.

f. The formula for the pooled t-statistic is

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} $$

Give values for each of the elements in the formula.
$\bar{x}_1 = 3.28$, $\bar{x}_2 = 2.61$, $s_p = 2.3524$, $n_1 = 496$, $n_2 = 387$

g. The output includes a 95% confidence interval for the difference between means. Write a sentence that interprets this interval in terms of how much difference there is in mean television watching for the two groups.
We are 95% confident that the difference in means for the two groups is somewhere between 0.36 and 0.98 hours per day (with not married having the higher mean).

h. Refer again to the 95% confidence interval of the previous part.
Explain why it is evidence that makes it reasonable to conclude that the population means differ.
The interval does not include 0, so we can reject “no difference” as a possibility.

Activity 14.3
In a national survey of 12th graders, 254 of 1356 boys said they never or rarely wear a seatbelt when driving. Among 1168 girls, 97 said they never or rarely wear a seatbelt when driving.

a. Let p1 = population proportion that never or rarely wears a seatbelt for boys and p2 = the corresponding proportion for girls. Write null and alternative hypotheses about p1 and p2.
H0: p1 – p2 = 0, or equivalently H0: p1 = p2
Ha: p1 – p2 ≠ 0, or equivalently Ha: p1 ≠ p2
It’s also okay to use the one-sided alternative Ha: p1 – p2 > 0, implying we think girls are less likely than boys to never or rarely wear a seatbelt.

b. Start Minitab (you can use the Start menu to do this). Use Stat > Basic Statistics > 2 Proportions. Click on Summarized data. Use the boys as the first sample and the girls as the second sample. “Number of trials” means sample size and “Number of events” means the number rarely or never wearing a seatbelt. Use the output to give values for the following:
For boys, sample proportion = p̂1 = .187
For girls, sample proportion = p̂2 = .083
The difference between the sample proportions is p̂1 – p̂2 = .104
Value of z-statistic = 7.83
p-value = 0.000

c. Explain whether we can say there is a difference between the population proportions in this situation.
We can conclude that there is a difference (the p-value of 0.000 is less than 0.05).

Activity 14.4
DEFINITIONS
A type 1 error occurs if we incorrectly pick the alternative hypothesis (pick it when the null is really the truth).
A type 2 error occurs if we incorrectly pick the null hypothesis (pick it when the alternative is really the truth).

a. Refer back to part a of Activity 14.2. Explain what the type 1 and type 2 errors are in this situation. Do this in terms of the "real world" situation.
Type 1 = deciding that the mean number of hours of television watched differs between married and unmarried people in the population when in reality this difference does not exist.
Type 2 = deciding that the mean number of hours of television watched does not differ between married and unmarried people in the population when in reality this difference does exist.

b. Refer back to part a of Activity 14.3. Explain what the type 1 and type 2 errors are in this situation. Do this in terms of the "real world" situation.
Type 1 = deciding that the proportion of girls who never or rarely wear a seatbelt differs from the proportion of boys who never or rarely wear a seatbelt when in reality this difference does not exist.
Type 2 = deciding that the proportion of girls who never or rarely wear a seatbelt does not differ from the proportion of boys who never or rarely wear a seatbelt when in reality this difference does exist.
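
For readers who want to check the Minitab output by hand, the sketch below recomputes the pooled t-statistic from Activity 14.2 and the two-proportion z-statistic from Activity 14.3 using only the summary statistics reported above. It is an illustration, not the worksheet's own method. The function names `pooled_t` and `two_proportion_z` are hypothetical. The z-statistic uses the unpooled standard error, an assumption chosen because it reproduces the reported z = 7.83. The confidence interval uses the normal critical value 1.96 as an approximation to the t critical value with 881 degrees of freedom.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (s_p, standard error, t) for the pooled two-sample t-test."""
    df = n1 + n2 - 2
    # Pooled standard deviation: weighted average of the two sample variances.
    sp = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    t = (mean1 - mean2) / se
    return sp, se, t


def two_proportion_z(x1, n1, x2, n2):
    """Return (p1_hat, p2_hat, z, two-sided p-value) for H0: p1 = p2.

    Assumption: the standard error is unpooled (separate sample proportions).
    """
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1, p2, z, p_value


if __name__ == "__main__":
    # Activity 14.2: tvhours for not married (group 1) vs. married (group 2).
    sp, se, t = pooled_t(3.276, 2.693, 496, 2.6072, 1.8241, 387)
    diff = 3.276 - 2.6072
    # Approximate 95% CI; 1.96 stands in for the t critical value (df = 881).
    ci = (diff - 1.96 * se, diff + 1.96 * se)
    print(f"s_p = {sp:.4f}, t = {t:.2f}, CI = ({ci[0]:.2f}, {ci[1]:.2f})")

    # Activity 14.3: boys (sample 1) vs. girls (sample 2).
    p1, p2, z, p_value = two_proportion_z(254, 1356, 97, 1168)
    print(f"p1 = {p1:.3f}, p2 = {p2:.3f}, z = {z:.2f}, p-value = {p_value:.3g}")
```

Because the inputs are the rounded values from the output, the results should agree with Minitab only to about two or three decimal places.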