Stats 7 Homework 6: Due Wed. Mar. 2 by 5:00pm

You may either hand-write or type your homework assignments. If you hand-write your homework, it must be well-organized, clean, and legible; if we cannot read your handwriting, you may lose points. Each problem is worth one point unless otherwise specified, for a total of 10 points. A select number of problems will be graded based on correctness; the rest will be graded based on completion.

1. Researchers studied a random sample of North Carolina high school students who participated in interscholastic athletics to learn about the risk of lower-extremity injuries (anywhere between hip and toe) for interscholastic athletes (Yang et al., 2005). Of 999 participants in girls' soccer, 74 experienced lower-extremity injuries. Of 1667 participants in boys' soccer, 153 experienced lower-extremity injuries.
   (a) What is the population in this study?
   (b) What is the sample in this study?
   (c) Write null and alternative hypotheses for a chi-square test of these data.
   (d) For these data, the value of the chi-square statistic is 2.51, and the p-value for the chi-square test is 0.113. Based on these results, state a conclusion about the two variables in this situation and explain how you came to this conclusion.
   (e) Use the general two-way table inference simulation applet from discussion activity #2 (http://www.rossmanchance.com/applets/ChiSqShuffle.html) to calculate an empirical p-value for these data. (Click the box next to “2x2” to enter the sample data; Groups A and B are girls and boys; “Success” is experiencing lower-extremity injuries.) Print and turn in a screenshot of your applet result. Does the simulated p-value differ from the p-value in part (d)? If so, how?
   (f) For each sex separately, calculate the percent of participants who had a lower-extremity injury. Explain how the difference between these percentages is consistent with the conclusion you stated in part (d).

2. A large Internet provider conducted a survey of its customers. One question that it asked was how many e-mail messages the respondent had received the previous day. The mean number was 13.2.
   (a) What variable did the study measure on each customer? Is this variable quantitative or categorical?
   (b) What is the population of interest for this study?
   (c) What is the population parameter in this study? Define the parameter in words and give the appropriate symbol for it.
   (d) What is the value of the sample estimate (statistic)? What is the appropriate symbol for this sample estimate?

3. Consider a situation in which a random sample of 1000 U.S. adults is surveyed and each individual is asked whether or not they believe Obama should appoint the next Supreme Court justice. Researchers would like to test the hypothesis that the majority (more than half) of U.S. adults believe Obama should appoint the next Supreme Court justice. If a new random sample of 1000 adults is taken from the same population, explain whether each of the following would change:
   (a) The population proportion, p.
   (b) The sample proportion, p̂.
   (c) The mean of p̂.
   (d) The standard deviation of p̂.
   (e) The standard error of p̂ (used in calculating a confidence interval for p).
   (f) The null standard error of p̂ (used in calculating the test statistic).

4. Vehicle speeds at a certain highway location are believed to follow a normal distribution with mean µ = 60 mph and standard deviation σ = 6 mph. The speeds for a randomly selected sample of n = 23 vehicles will be recorded.
   (a) Give numerical values for the mean and standard deviation of the sampling distribution of possible sample means for randomly selected samples of n = 23 from the population of vehicle speeds.
   (b) Does the sampling distribution of the possible sample means have an approximate normal distribution? Explain.
   (c) Use the Empirical Rule to find values that fill in the blanks in the following sentence: For a random sample of n = 23 vehicles, there is about a 95% chance that the mean vehicle speed in the sample will be between ____ and ____ mph.
   (d) Sample speeds for a random sample of 23 vehicles are measured at this location, and the sample mean is 66 mph. Given the answer to part (c), explain whether this result is consistent with the belief that the mean speed at this location is µ = 60 mph.

5. Small planes cannot fly well if the payload (people, luggage, and fuel) weighs too much. Suppose that an airline runs a commuter flight that holds 40 people. The airline knows that the weights of passenger plus luggage for typical customers on this flight are approximately normal with a mean of 210 pounds and a standard deviation of 25 pounds.
   (a) Draw a picture of the distribution of the weights of passenger plus luggage. Clearly label the x-axis and specify the mean and scale on the x-axis.
   (b) Describe the sampling distribution (mean, standard deviation, shape) of the mean weight of passenger plus luggage for a random sample of 40 customers.
   (c) Superimpose the sampling distribution from part (b) on your picture from part (a) (i.e., draw the two distributions on the same graph). Label it clearly and remember that the total area under each curve must equal one.
   (d) Assume that customers on any particular flight are similar to a random sample. If the total weight of passengers and their luggage should not exceed 8800 pounds, what is the probability that a sold-out flight (40 passengers and their luggage) will exceed the weight limit? (Hint: Rewrite the desired limit as an average per passenger.)

6. Two researchers are testing the null hypothesis that a population proportion p is equal to 0.30, and the alternative hypothesis that p ≠ 0.30. Both take samples of 100 observations. Researcher A finds a sample proportion of 0.29, and Researcher B finds a sample proportion of 0.34. For which researcher will the p-value of the test be smaller? Explain without actually doing any computations.

7. A multiple choice test consists of 15 questions with four choices each. The teacher wants to test the hypothesis that a student is just guessing versus the hypothesis that the probability of a correct answer on each question is higher than it would be if the student were guessing.
   (a) Specify the parameter of interest, both in words and using the appropriate symbol.
   (b) Write the null and alternative hypotheses in terms of the parameter from part (a).
   (c) If a student chooses the correct answer on eight of the 15 questions, can you use a normal approximation to calculate the p-value? Why or why not?
   (d) Calculate the exact p-value using the appropriate binomial distribution.
   (e) What conclusion can be made about whether someone who got eight correct answers was guessing? Explain. Indicate the level of significance that you used in determining your conclusion.

8. A study was done to determine whether there is a relationship between snoring and the risk of heart disease (Norton and Dunn, 1985). Among 1105 snorers in the study, 85 had heart disease, while only 24 of the 1379 nonsnorers had heart disease.
   (a) Is this an observational study or a randomized experiment? Explain how you know.
   (b) For the snorers population, calculate a 90% confidence interval for the proportion who have heart disease. Write a sentence interpreting this interval.
   (c) For the nonsnorers population, calculate a 90% confidence interval for the proportion who have heart disease. Write a sentence interpreting this interval.
   (d) Based on your intervals, can we infer that the population proportions with heart disease differ for nonsnorers and snorers? Explain.

9. Consider this quote: “In a recent survey, 61 out of 100 consumers reported that they preferred plastic bags instead of paper bags for their groceries. If there is no difference in the proportions who prefer each type in the population, the chance of such extreme results in a sample of this size is about .03. Because .03 is less than .05, we can conclude that there is a statistically significant difference in preference.” Give a numerical value for each of the following.
   (a) The p-value.
   (b) The level of significance, α.
   (c) The sample proportion.
   (d) The sample size.
   (e) The null value.

10. A Gallup poll released on October 13, 2000 (Chambers, 2000) found that 47% of the 1052 U.S. adults surveyed classified themselves as “very happy” when given the choices of “very happy,” “fairly happy,” or “not too happy.” Suppose that a journalist who is a pessimist took advantage of this poll to write the headline “Poll finds that U.S. adults who are very happy are in the minority.” If p = the proportion of all U.S. adults who were very happy in 2000, go through the five steps of hypothesis testing and determine if the headline is justified. Use level of significance α = 0.05. Be sure to comment on the headline in your conclusion.
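
Note on Problem 1(d): the chi-square statistic and p-value quoted there can be checked from the counts given in Problem 1. The sketch below is only an illustration, not part of the assignment. It assumes the usual Pearson chi-square statistic for a 2x2 table with no continuity correction and 1 degree of freedom; the function name chi_square_2x2 is made up for this example.

```python
import math

# Counts from Problem 1: [injured, not injured] for each group.
observed = [
    [74, 999 - 74],      # girls' soccer
    [153, 1667 - 153],   # boys' soccer
]

def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) and its p-value.

    Assumes a 2x2 table, so the test has 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if injury status were unrelated to group.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected

    # With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

stat, p_value = chi_square_2x2(observed)
print(f"chi-square = {stat:.2f}, p-value = {p_value:.3f}")
```

Run as written, the printed values should agree with the 2.51 and 0.113 stated in part (d). The simulated p-value from the applet in part (e) will typically be close to this, but not identical, because it is based on random shuffles.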