Stats 7 Homework 6: Due Wed. Mar. 2 by 5:00pm
You may either hand-write or type your homework assignments. If you hand-write your homework, it must be well-organized, clean, and legible; if we cannot read your handwriting, you may lose
points. Each problem is worth one point unless otherwise specified, for a total of 10 points. A select
number of problems will be graded based on correctness; the rest will be graded based on completion.
1. Researchers studied a random sample of North Carolina high school students who participated
in interscholastic athletics to learn about the risk of lower-extremity injuries (anywhere between
hip and toe) for interscholastic athletes (Yang et al., 2005). Of 999 participants in girls' soccer,
74 experienced lower-extremity injuries. Of 1667 participants in boys' soccer, 153 experienced
lower-extremity injuries.
(a) What is the population in this study?
(b) What is the sample in this study?
(c) Write null and alternative hypotheses for a chi-squared test of these data.
(d) For these data, the value of the chi-square statistic is 2.51, and the p-value for the chi-square test is 0.113. Based on these results, state a conclusion about the two variables in
this situation and explain how you came to this conclusion.
(e) Use the general two-way table inference simulation applet from discussion activity #2
(http://www.rossmanchance.com/applets/ChiSqShuffle.html) to calculate an empirical
p-value for these data. (Click the box next to “2x2” to enter the sample data; Groups A
and B are girls and boys; “Success” is experiencing lower-extremity injuries.) Print and
turn in a screen shot of your applet result. Does the simulated p-value differ from the
p-value in part (d)? If so, how?
(f) For each sex separately, calculate the percent of participants who had a lower-extremity
injury. Explain how the difference between these percentages is consistent with the conclusion you stated in part (d).
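The statistic and p-value quoted in part (d) can be checked with a short script. This is a minimal sketch, assuming the usual Pearson chi-square statistic without a continuity correction; the function names are illustrative and not part of the assignment.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of row and column variables.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat

def chi_square_p_value_df1(stat):
    """Upper-tail probability for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

# Rows: girls, boys. Columns: injured, not injured.
stat = chi_square_2x2(74, 999 - 74, 153, 1667 - 153)
p_value = chi_square_p_value_df1(stat)
print(round(stat, 2), round(p_value, 3))
```

The printed values should agree with the 2.51 and 0.113 given in part (d), up to rounding.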
2. A large Internet provider conducted a survey of its customers. One question that it asked was
how many e-mail messages the respondent had received the previous day. The mean number
was 13.2.
(a) What variable did the study measure on each customer? Is this variable quantitative or
categorical?
(b) What is the population of interest for this study?
(c) What is the population parameter in this study? Define the parameter in words and
give the appropriate symbol for it.
(d) What is the value of the sample estimate (statistic)? What is the appropriate symbol for
this sample estimate?
3. Consider a situation in which a random sample of 1000 U.S. adults is surveyed and each
individual is asked whether or not they believe Obama should appoint the next Supreme
Court justice. Researchers would like to test the hypothesis that the majority (more than
half) of U.S. adults believe Obama should appoint the next Supreme Court justice. If a new
random sample of 1000 adults is taken from the same population, explain whether each of the
following would change:
(a) The population proportion, p.
(b) The sample proportion, p̂.
(c) The mean of p̂.
(d) The standard deviation of p̂.
(e) The standard error of p̂ (used in calculating a confidence interval for p).
(f) The null standard error of p̂ (used in calculating the test statistic).
4. Vehicle speeds at a certain highway location are believed to follow a normal distribution with
mean µ = 60 mph and standard deviation σ = 6 mph. The speeds for a randomly selected
sample of n = 23 vehicles will be recorded.
(a) Give numerical values for the mean and standard deviation of the sampling distribution
of possible sample means for randomly selected samples of n = 23 from the population of
vehicle speeds.
(b) Does the sampling distribution of the possible sample means have an approximate normal
distribution? Explain.
(c) Use the Empirical Rule to find values that fill in the blanks in the following sentence: For
a random sample of n = 23 vehicles, there is about a 95% chance that the mean vehicle
speed in the sample will be between ______ and ______ mph.
(d) Sample speeds for a random sample of 23 vehicles are measured at this location, and the
sample mean is 66 mph. Given the answer to part (c), explain whether this result is
consistent with the belief that the mean speed at this location is µ = 60 mph.
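The sampling distribution in this problem can be explored numerically. The sketch below, offered as an illustration rather than a required method, compares the theoretical standard deviation of the sample mean, σ/√n, with a simulation of repeated samples of n = 23 speeds. The function names, the number of repetitions, and the seed are assumptions made for the example.

```python
import math
import random
import statistics

def sample_mean_distribution(mu, sigma, n):
    """Theoretical mean and standard deviation of the sample mean."""
    return mu, sigma / math.sqrt(n)

def simulate_sample_means(mu, sigma, n, reps=10000, seed=0):
    """Draw `reps` samples of size n from Normal(mu, sigma); return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    ]

mean_xbar, sd_xbar = sample_mean_distribution(60, 6, 23)
means = simulate_sample_means(60, 6, 23)
sim_mean = statistics.fmean(means)
sim_sd = statistics.stdev(means)
print(mean_xbar, round(sd_xbar, 3), round(sim_mean, 2), round(sim_sd, 3))
```

The simulated mean and standard deviation should land close to the theoretical values, which is the behavior parts (a) through (c) rely on.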
5. Small planes cannot fly well if the payload (people, luggage, and fuel) weighs too much.
Suppose that an airline runs a commuter flight that holds 40 people. The airline knows that
the weight of passenger plus luggage for typical customers on this flight is approximately
normal with a mean of 210 pounds and a standard deviation of 25 pounds.
(a) Draw a picture of the distribution of the weights of passenger plus luggage. Clearly label
the x-axis and specify the mean and scale on the x-axis.
(b) Describe the sampling distribution (mean, standard deviation, shape) of the mean weight
of passenger plus luggage for a random sample of 40 customers.
(c) Superimpose the sampling distribution from part (b) on your picture from part (a) (i.e.,
draw the two distributions on the same graph). Label it clearly and remember that the
total area under each curve must equal one.
(d) Assume that customers on any particular flight are similar to a random sample. If the
total weight of passengers and their luggage should not exceed 8800 pounds, what is the
probability that a sold-out flight (40 passengers and their luggage) will exceed the weight
limit? (Hint: Rewrite the desired limit as an average per passenger.)
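The hint in part (d) can be followed in code. This is one possible sketch, assuming the 40 passengers behave like a random sample so that the mean weight is normal with standard deviation 25/√40; the variable names are illustrative only.

```python
import math
from statistics import NormalDist

mu, sigma, n = 210, 25, 40
weight_limit = 8800

# Rewrite the total-weight limit as an average per passenger.
per_passenger_limit = weight_limit / n

# Sampling distribution of the mean weight for n passengers.
sampling_dist = NormalDist(mu=mu, sigma=sigma / math.sqrt(n))

# Probability that the mean weight exceeds the per-passenger limit.
prob_over_limit = 1 - sampling_dist.cdf(per_passenger_limit)
print(per_passenger_limit, round(prob_over_limit, 4))
```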
6. Two researchers are testing the null hypothesis that a population proportion p is equal to
0.30, and the alternative hypothesis that p ≠ 0.30. Both take samples of 100 observations.
Researcher A finds a sample proportion of 0.29, and Researcher B finds a sample proportion
of 0.34. For which researcher will the p-value of the test be smaller? Explain without actually
doing any computations.
7. A multiple choice test consists of 15 questions with four choices each. The teacher wants to
test the hypothesis that a student is just guessing versus the hypothesis that the probability
of a correct answer on each question is higher than it would be if the student were guessing.
(a) Specify the parameter of interest, both in words and using the appropriate symbol.
(b) Write the null and alternative hypotheses in terms of the parameter from part (a).
(c) If a student chooses the correct answer on eight of the 15 questions, can you use a normal
approximation to calculate the p-value? Why or why not?
(d) Calculate the exact p-value using the appropriate binomial distribution.
(e) What conclusion can be made about whether someone who got eight correct answers was
guessing? Explain. Indicate the level of significance that you used in determining your
conclusion.
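An exact binomial tail probability, as asked for in part (d), can be computed directly from the binomial formula. The sketch below assumes the null value is p = 0.25 (one correct choice out of four) and a one-sided alternative, so the p-value is P(X ≥ 8) for X ~ Binomial(15, 0.25). The function name is hypothetical.

```python
import math

def binomial_upper_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(
        math.comb(n, x) * p**x * (1 - p) ** (n - x)
        for x in range(k, n + 1)
    )

# Assumed null value under guessing: 1 correct choice out of 4.
exact_p_value = binomial_upper_tail(15, 0.25, 8)
print(round(exact_p_value, 4))
```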
8. A study was done to determine whether there is a relationship between snoring and the risk
of heart disease (Norton and Dunn, 1985). Among 1105 snorers in the study, 85 had heart
disease, while only 24 of the 1379 nonsnorers had heart disease.
(a) Is this an observational study or a randomized experiment? Explain how you know.
(b) For the snorers population, calculate a 90% confidence interval for the proportion who
have heart disease. Write a sentence interpreting this interval.
(c) For the nonsnorers population, calculate a 90% confidence interval for the proportion
who have heart disease. Write a sentence interpreting this interval.
(d) Based on your intervals, can we infer that the population proportions with heart disease
differ for nonsnorers and snorers? Explain.
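The intervals in parts (b) and (c) can be sketched as follows, assuming the standard large-sample interval p̂ ± z* · sqrt(p̂(1 − p̂)/n) with z* taken from the standard normal distribution. The function name is illustrative, and other interval methods would give slightly different endpoints.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.90):
    """Large-sample (Wald) confidence interval for a population proportion."""
    p_hat = successes / n
    # Multiplier z* leaving (1 - confidence)/2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z_star * se, p_hat + z_star * se

snorers_ci = proportion_ci(85, 1105)
nonsnorers_ci = proportion_ci(24, 1379)
print([round(x, 4) for x in snorers_ci], [round(x, 4) for x in nonsnorers_ci])
```

Comparing whether the two intervals overlap is the informal check that part (d) asks about.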
9. Consider this quote: “In a recent survey, 61 out of 100 consumers reported that they preferred
plastic bags instead of paper bags for their groceries. If there is no difference in the proportions
who prefer each type in the population, the chance of such extreme results in a sample of this
size is about .03. Because .03 is less than .05, we can conclude that there is a statistically
significant difference in preference.” Give a numerical value for each of the following.
(a) The p-value.
(b) The level of significance, α.
(c) The sample proportion.
(d) The sample size.
(e) The null value.
10. A Gallup poll released on October 13, 2000 (Chambers, 2000) found that 47% of the 1052
U.S. adults surveyed classified themselves as “very happy” when given the choices of “very
happy,” “fairly happy,” or “not too happy.” Suppose that a journalist who is a pessimist took
advantage of this poll to write the headline “Poll finds that U.S. adults who are very happy
are in the minority.” If p = the proportion of all U.S. adults who were very happy in 2000,
go through the five steps of hypothesis testing and determine if the headline is justified. Use
level of significance α = 0.05. Be sure to comment on the headline in your conclusion.
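The computational step of this test can be sketched in code. The example assumes the hypotheses are H0: p = 0.5 versus Ha: p < 0.5 (the headline claims a minority) and uses the null standard error sqrt(p0(1 − p0)/n). This setup and the function name are assumptions for illustration, and the remaining steps of the five-step procedure still need to be written out in words.

```python
import math
from statistics import NormalDist

def one_proportion_z_test_lower(p_hat, n, p0):
    """z statistic and lower-tail p-value for H0: p = p0 vs Ha: p < p0."""
    null_se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / null_se
    return z, NormalDist().cdf(z)

z, p_value = one_proportion_z_test_lower(p_hat=0.47, n=1052, p0=0.5)
alpha = 0.05
print(round(z, 2), round(p_value, 4), p_value < alpha)
```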