AP Statistics Final Review:
Do these problems on a separate sheet of paper. Show all work!
1. Explain the concept of resistance. Include an example comparing measures of center and an example
comparing measures of spread.
2. An experiment was conducted using 100 volunteers to investigate two different weight loss programs
(program A and program B). Researchers recorded each patient’s initial weight and gender and then randomly
assigned each subject to program A or program B. At the end of the study each subject was weighed again and
the change in weight was recorded (final – initial).
a) Describe the W’s. For each variable, record whether it is categorical or quantitative.
b) The boxplots below show the change in weights for the subjects in each treatment. Compare these
distributions.
[Figure: boxplots of Change in Weight (Final – Initial) for Program A and Program B, axis from -40 to 100]
c) There is an outlier in the Program A’s distribution. Explain how this outlier was identified.
d) If a person had a positive change, it means that he or she actually gained weight. Which program had a
higher proportion of subjects who gained weight?
3. In November 2008, CDO had a mock election between John McCain and Barack Obama. The results shown
in the table below are categorized by grade level and presidential preference.
Grade   Obama   McCain   Total
9       233     155      388
10      211     188      399
11      119     134      253
12      128     118      246
Total   691     595      1286
a) Make a graph to display the relationship between grade level and presidential preference.
b) Based on your graph, are grade level and presidential preference independent for the students who
participated in the mock election?
4. Explain how you would decide when to use a histogram and when to use a bar chart.
5. A random sample of CDO students was asked how many hours of sleep they got the previous night. Here
are the results: 6, 6.75, 7.25, 7.5, 7.5, 7.5, 8, 9, 10.5
a) Calculate the mean and standard deviation
b) Interpret the standard deviation
c) In the context of this problem, explain the difference between x̄ and μ.
d) Calculate and interpret the z-score for the student with 6 hours of sleep.
e) If the times were converted to minutes, how would this student’s z-score change?
6. The following histogram shows the time it took to complete an exam for a class of 30 students.
a) Describe the shape of the distribution.
b) Make an ogive (cumulative relative frequency plot) for this data.
c) Explain how the characteristics of the ogive correspond to the shape of the histogram.
d) Use your ogive to estimate the interquartile range of this data.
7. The following data shows the time (in minutes) it took for students to complete a Sudoku puzzle.
6, 6, 6, 7, 7, 8, 8, 10, 10, 10, 11, 11, 13, 15, 19, 25, 30
a) Make a histogram of this data.
b) Without calculating, which is higher, mean or median? Explain.
c) In what circumstances would you want to make a relative frequency histogram?
8. The following summary statistics describe the distribution of test scores on a recent test.
n    x̄       s      Min   Q1   Med    Q3     Max
56   13.44   3.67   6     11   13.5   16.5   20
a) What are the range and interquartile range of the scores?
b) To scale the scores, the teacher multiplies each score by 3 and adds 10. Find the new values of the
mean, standard deviation, median, and interquartile range.
c) If Sally’s raw score was in the 39th percentile, explain to her what this means.
d) After the test scores have been scaled, what percentile will Sally be in?
e) Suppose the teacher wanted the mean of the scores to be 80 with a standard deviation of 15. What
transformations should he apply?
9. The following stemplot shows the average number of text messages sent each day by a sample of 20 students
during the last month.
Average Number of Text Messages Sent
0 | 1 1 2 5 7
1 | 3 6 9
2 | 3 9
3 | 2 4 4 6 7 9 9
4 | 3 9
5 | 0
Key: 1|3 = 13 messages/day
a) Make a boxplot of this data.
b) Discuss the advantages and disadvantages of using stemplots vs. boxplots.
10. Suppose that the distance a certain golfer can hit a golf ball is approximately normally distributed with a
mean of 250 yards and a standard deviation of 15 yards.
a) Sketch this distribution.
b) What proportion of his shots will go less than 231 yards?
c) What proportion will go at least 300 yards?
d) What proportion will go between 240 and 260 yards?
e) What is the 75th percentile for this distribution?
f) What distance will be exceeded 90% of the time by this golfer?
11. Suppose that the number of calculators owned by high school students follows the following probability
distribution.
Number of Calculators   0     1     2
Probability             ___   .68   .21
a) What is the probability that a person does not own a calculator?
b) Calculate and interpret the expected value of this distribution.
c) Calculate and interpret the standard deviation of this distribution.
12. Suppose that the number of text messages sent each day by a certain student is approximately normally
distributed with a mean of 52. Also, on about 20% of the days she sends more than 75 texts. What is the
standard deviation of the number of texts?
13. Suppose that the volume of soda in a can has a mean of 12.1 ounces with a standard deviation of 0.2
ounces. The cans are labeled 12 ounces. Suppose we were interested in the distribution of difference between
the actual volumes and the advertised volume. What would be the mean and standard deviation of this
distribution?
14. What information about the shape of the distribution can you get from the normal quantile plot shown below?
15. Suppose that a casino offers the following game: You are dealt two cards from a deck and if they are both
aces you win $50, if they are both hearts you win $10 and otherwise you win nothing. Find the probability
distribution (probability model) for your winnings in this game.
16. A certain potato chip company produces two different kinds of potato chips: “Smooth” and “Ridges”.
Bags of “Smooth” potato chips have weights that are approximately normally distributed with a mean of 230
grams and a standard deviation of 10 grams.
a) If you were to randomly select 2 bags of “Smooth,” what is the probability that the total weight is over
450 grams?
b) If you were to select 100 bags of “Smooth”, what is the probability that the total weight will be more
than 23,100 grams?
c) Bags of “Ridges” have weights that are approximately normally distributed with a mean of 240 grams
and a standard deviation of 15 grams (the ridges make them more variable). If you randomly selected a
bag of each type, what is the probability that the bag of “Ridges” is heavier than the bag of “Smooth”?
17. A random sample of 90 portable music players was selected and each was charged and then played until it
ran out of power. The time each device lasted was recorded and plotted on the dotplot below. The mean of the
distribution is 299.8 minutes with a standard deviation of 9.1 minutes.
[Figure: dotplot of Time (minutes), axis from 280 to 320]
a) If the distribution was normal, what percentage of times should be within 1 SD of the mean?
b) What percentage of times were actually within 1 standard deviation of the mean?
c) Even though the answers to parts (a) and (b) were not the same, it is possible that the population of times
from music players is approximately normally distributed and that the difference between (a) and (b)
was just due to sampling variability. To investigate, 100 samples of size 90 were selected from a normal
population with a mean of 299.8 and a standard deviation of 9.1 and the percentage that were within 1
SD of the mean was recorded. Use the results from the simulation below to discuss if it is possible that
the population is approximately normally distributed.
[Figure: dotplot of Simulated Percentage within 1 SD of the Mean, axis from 55 to 85]
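The simulation described in part (c) could be reproduced along the following lines. This is only a sketch: it assumes "within 1 SD of the mean" is judged using each sample's own mean and standard deviation, and the function names and the seed are illustrative choices, not part of the original problem.

```python
import random
import statistics


def percent_within_one_sd(sample):
    """Percentage of values within one sample SD of the sample mean (assumed definition)."""
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)
    inside = sum(1 for value in sample if abs(value - mean) <= sd)
    return 100 * inside / len(sample)


def simulate_percentages(trials=100, n=90, mu=299.8, sigma=9.1, seed=1):
    """Draw `trials` samples of size `n` from a normal population and record each percentage."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    results = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        results.append(percent_within_one_sd(sample))
    return results


percentages = simulate_percentages()
print("smallest simulated percentage:", round(min(percentages), 1))
print("largest simulated percentage:", round(max(percentages), 1))
```

Comparing the percentage found in part (b) with the spread of these simulated percentages shows whether the observed value is unusual for samples drawn from a normal population.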
Problems 18-32: Is cardiovascular fitness (as measured by time to exhaustion running on a treadmill) related to
an athlete’s performance in a 20-km cross-country ski race? The following data shows x = treadmill run
time to exhaustion (in minutes) and y = 20-km ski time (in minutes).
Person   Treadmill Time   Ski Time
1        7.7              71
2        8.4              71.4
3        8.7              65
4        9                68.7
5        9.6              64.4
6        9.6              69.4
7        10               63
8        10.2             64.6
9        10.4             66.9
10       11               62.6
11       11.7             61.7
18. Why should treadmill time be the explanatory variable? Explain.
19. Draw a scatterplot and discuss the noticeable features. Is one variable completely dependent on the other?
20. Calculate the least squares line and graph it on the scatterplot.
21. Interpret the slope in the context of the problem.
22. Interpret the x- and y-intercepts in the context of the problem. Are these reasonable values in this context?
23. Find the value of the correlation coefficient.
24. If the time was measured in seconds, how would this value change?
25. If r is high, can we conclude that a change in treadmill time causes the ski time to change? Explain.
26. Calculate and interpret the residual for the first point in the data set.
27. Sketch the residual plot. What does it tell you?
28. Calculate and interpret the values of r² and s in the context of the problem.
29. If you were to use number of hours instead of number of minutes for the ski time, how would the values of
r² and s change?
30. Predict the ski time for a runner who can last 8 minutes on the treadmill.
31. Suppose the observation (13, 59) was added to the data set. What effect will this have on the LSRL and the
values of r, r², and s?
32. Instead of (13, 59), suppose the new point was (13, 65). What effect will this have on the LSRL and the
values of r, r², and s?
33. In the “Ask Marilyn” column of Parade magazine (11-24-94) this question appeared. “Suppose a person
was having two surgeries performed at the same time. If the chances of success for surgery A are 85% and
the chances of success for surgery B are 90%, what are the chances that both would fail?” Write an answer
to this question for Marilyn.
34. Suppose that you draw one card from a deck.
a. If A = the card is a heart and B = the card is a club, are events A and B disjoint? Independent?
b. If A = the card is a heart and B = the card is a 7, are events A and B disjoint? Independent?
35. Joseph Lister (1827-1912) was one of the first to believe in Pasteur’s germ theory of infection. He
experimented with using carbolic acid to disinfect operating rooms during amputations. When carbolic acid
was used, 6/40 died. When it wasn’t used, 16/35 patients died. Let C = the event that carbolic acid was
used and D = the event that the person died.
a. Express the given information in symbolic form (in terms of C and D) and in a two-way table.
b. Find the probability that a randomly selected patient:
i. died even though carbolic acid was used
ii. who died had received carbolic acid
iii. was given carbolic acid and died
iv. was given carbolic acid or died
c. Are events C and D disjoint? Independent? Explain.
36. In a recent sales period, 50% of automobiles sold were manufactured by American companies, 40% were
manufactured by Asian companies, and 10% were manufactured by European companies. Twenty percent
of the American autos were SUV’s, 15% of the Asian autos were SUV’s, and 10% of the European autos
were SUV’s.
a. Express the given information in a tree diagram.
b. If you randomly selected one auto sold during this time period, find the probability that:
i. it was an American SUV
ii. it was an SUV
iii. it was an American auto or an SUV
iv. it was an American auto given that it was an SUV
v. it wasn’t American given that it wasn’t an SUV
37. Last year I bought a string of 25 Christmas lights. Unfortunately, my string of lights stopped working after
only 50 hours of use. The manufacturer says that the entire string stops working if two or more bulbs burn
out but claims that only 3% of bulbs burn out within 50 hours. Did I get a defective string of lights or could
this have occurred by random chance? Design and conduct a simulation (do 10 trials) to estimate the
probability that a string of 25 bulbs will go out within 50 hours (assuming the manufacturer’s claims are
correct).
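The problem asks for a hand-run simulation of 10 trials; the same design can be sketched in Python as a way to check the approach. The sketch assumes bulbs burn out independently of one another, each with the manufacturer's claimed 3% chance, and that a string fails when two or more bulbs burn out. The function names and seed are illustrative.

```python
import random


def string_fails(rng, bulbs=25, p_burn_out=0.03):
    """Simulate one string: it fails if two or more bulbs burn out within 50 hours."""
    burned = sum(1 for _ in range(bulbs) if rng.random() < p_burn_out)
    return burned >= 2


def estimate_failure_probability(trials, seed=1):
    """Proportion of simulated strings that fail."""
    rng = random.Random(seed)
    failures = sum(string_fails(rng) for _ in range(trials))
    return failures / trials


def exact_failure_probability(bulbs=25, p=0.03):
    """Binomial calculation: 1 - P(0 burn out) - P(exactly 1 burns out)."""
    p_none = (1 - p) ** bulbs
    p_one = bulbs * p * (1 - p) ** (bulbs - 1)
    return 1 - p_none - p_one


print("10 trials:", estimate_failure_probability(10))
print("100,000 trials:", estimate_failure_probability(100_000))
print("binomial calculation:", round(exact_failure_probability(), 4))
```

With only 10 trials the estimate varies a lot from run to run, which is worth noting when interpreting a hand simulation.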
38. You are at a bus stop in New York City waiting for any one of three downtown busses. Two are on
schedules that separate busses’ arrivals by 10 minutes and the third is on a schedule that separates busses’
arrivals by 8 minutes. Assuming that you arrive at a random point in the cycles of all three busses and that
their arrivals are all independent of one another, estimate the probability it takes more than 5 minutes for a
bus to arrive. For simplicity, use time in whole minutes (e.g. 1 minute, 2 minutes, …).
39. A basketball player is considered to be a “second-half player.” That is, many people believe that he shoots
better in the second half of a game. As evidence, people point to a recent game where he shot 20 percentage
points better in the second half of the game (where he made 60% of his shots) compared to the first half of
the game (where he only made 40% of his shots). Is this convincing evidence that he is a better shooter in
the second half, or could this have occurred by random chance? Assuming that he shoots equally well in
both halves (50% in both halves) and takes 10 shots in each half, design and conduct a simulation (do 10
runs) to estimate the probability that he shoots at least 20 percentage points better in the second half of a game.
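A possible computer version of this simulation is sketched below. It assumes every shot is independent with a 50% chance of going in, and it treats "20 percentage points better" as making at least 2 more shots out of 10 in the second half. The function names and seed are illustrative.

```python
import random


def second_half_edge(rng, shots=10, p_make=0.5):
    """Simulate one game and return (second-half makes) - (first-half makes)."""
    first = sum(rng.random() < p_make for _ in range(shots))
    second = sum(rng.random() < p_make for _ in range(shots))
    return second - first


def estimate_probability(runs, min_edge=2, seed=1):
    """Proportion of simulated games with a second-half edge of at least `min_edge` makes."""
    rng = random.Random(seed)
    hits = sum(second_half_edge(rng) >= min_edge for _ in range(runs))
    return hits / runs


print("10 runs:", estimate_probability(10))
print("100,000 runs:", estimate_probability(100_000))
```

The 10-run version mirrors what the problem asks for; the larger run shows where the estimate settles when many more games are simulated.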
40. The article “Study Provides New Data on the Extent of Gambling by College Athletes” (The Chronicle of
Higher Education, Jan. 22, 1999) reported that “72 percent of college football and basketball players had bet
money at least once since entering college.” This conclusion was based on a study in which “Copies of the
survey were mailed to 3000 athletes at 182 Division I institutions, 25 percent of whom responded.” (From
Statistics and Data Analysis 1e, Peck, Olsen and Devore, problem 2.22)
(a) What is the population of interest in this study?
(b) What is the parameter of interest in this study?
(c) What types of bias might have influenced the results of this study? Explain.
41. Suppose that the cafeteria staff at a high school wants to do a survey of students who eat lunch in the
cafeteria.
(a) Carefully explain how they can get a simple random sample of students.
(b) Carefully explain how they can get a stratified random sample of students. Explain your choice of
stratification variable.
(c) Carefully explain how they can get a cluster random sample of students. What is one benefit and
one possible drawback of your proposed method?
(d) Carefully explain how they can get a systematic random sample of students.
(e) Which method would you choose? Explain.
42. The Arizona Daily Star (10-29-2008) reported on a study of 1000 students at the University of Michigan
that investigated how to prevent catching the common cold. The students were randomly assigned to three
different cold prevention methods for 6 weeks. Some wore masks, some wore masks and used hand
sanitizer, and others did nothing. The two groups who used masks reported 10-50 percent fewer cold
symptoms than the ones who didn’t wear a mask.
(a) Explain how you know this is an experiment.
(b) What are the factors in this study? What are the treatments?
(c) What is the purpose of randomly assigning the treatments?
(d) Was blinding used in this study?
(e) What role might the placebo effect have in this study?
(f) Suppose that the reduction in symptoms was determined to be statistically significant. Explain what
statistically significant means in this context.
43. When there are combat troops requiring supplies who are difficult to reach by land, airdrops of supplies
may be conducted. Small crates with supplies are pushed out of an airplane, and attached parachutes
automatically open and slow the descent. One problem with airdrops is that since the supplies are unguided,
they may be lost or land in unreachable locations.
Suppose a new modestly-priced and reusable device is developed that uses radio communication with
the ground and motorized parachute guides to help steer supply crates to a target. Army researchers want to
assess how much closer crates will land to their intended target, on average, if they employ the new device.
They will conduct a lengthy experiment in which the explanatory variable is whether a crate has a guidance
device or not; and in which the response variable is the distance to the target that a crate lands on the
ground. (Problem scenario from Floyd Bullard)
(a) In addition to the explanatory variable, identify three other variables that may affect the response
variable.
(b) Pick one of the three variables you named in part (a)—or name a new one—and clearly describe
circumstances under which that variable would be a confounding variable.
(c) You have just described circumstances under which one extraneous variable might be a confounding
variable. Now explain why that confounding is highly undesirable in this study, and suggest a way
to eliminate the confounding nature of the extraneous variable that you chose.
(d) The weather conditions on different days would not be exactly the same and could therefore create
unwanted variability in the response variable. One way to deal with that problem is to design a
blocked study in which the blocks are the days of the airdrops. Describe how such a study could be
conducted if the researchers wanted to perform a total of 30 drops of each type of crate over a period
of 10 days. Be sure to describe clearly how you would determine which drops to make when.
(e) Suppose this study had in fact been conducted with a single prototype guidance device used over and
over again, rather than a different device in each crate.
(i) Describe one statistical advantage for using the same device over and over.
(ii) Describe one statistical disadvantage for using the same device over and over.
(f) Due to budget constraints, Army researchers have decided to use the new device in all the drops and
not use the traditional method in any of the drops. They will compare the average distance to the
target for drops using the new device to 450 meters, which is the average distance to the target for
the last 100 actual drops using the traditional method. Suppose that the average distance with the
new device is significantly lower than 450 meters. Should we conclude the new device works
better? Explain why or why not.
44. Suppose you have a large box of pennies of various ages and plan to take a sample of 10 pennies. Explain
how you can estimate the probability that the range of ages is greater than 15 years.
45. Approximately 13% of people are left-handed. In a random sample of 20 people, what is the probability that
3 or more left-handed people will be selected?
46. Suppose that a random sample of 100 Arizona residents was asked about their annual income. If x = annual
income, briefly explain the difference between x̄, μ, and μ_x̄.
47. Suppose that 32% of US adults watched the Super Bowl in 2009. If you took a random sample of 1000 US
adults, what is the probability that between 30% and 35% of the members of the sample watched the Super
Bowl?
48. Suppose that 15% of cereal boxes contain a prize. What is the probability it takes 7 boxes to get 1 prize?
49. Suppose that in the United States the true mean number of tacos consumed in a year is 18 tacos per person
with a standard deviation of 25 tacos per person. Each person in a random sample of size 50 from the
United States was asked how many tacos they consumed in the last year.
a. What is the probability that the mean number of tacos per person in the sample is at most 20?
b. What can you do to increase this probability?
To estimate the sampling distribution of the sample standard deviation, 100 random samples of size 50
were taken from a population with a mean of 18 and a standard deviation of 25. This estimated
distribution is shown below.
[Figure: dotplot of Sample Standard Deviation, axis from 10 to 45]
c. Briefly explain the meaning of the phrase “sampling distribution of the sample standard deviation.”
d. Briefly explain what the dot at 45 represents.
e. Estimate the probability that the sample standard deviation will be at most 20 tacos.
50. Suppose that you wanted to create an 84% confidence interval for a proportion. What critical value would
you use?
51. When calculating a confidence interval for a mean, how do you know which type of critical value (z or t) to
use?
52. Why do statisticians prefer interval estimates to point estimates?
53. Define the term “standard error.”
54. Explain how you can determine if a statistic is unbiased.
55. Suppose that a newspaper randomly selected 400 voters in a particular city and asked them about the
mayor’s job performance. In the sample, 135 approved of his performance.
a. Calculate a 95% confidence interval for the true proportion of voters in this city who approve of the
mayor’s performance.
b. Interpret the 95% confidence level.
c. A member of the mayor’s staff suggests that the majority of voters approve of his performance. Is
this plausible?
d. How many more voters should be surveyed to reduce the margin of error to .03?
56. One way to evaluate microwave popcorn brands is to estimate the average number of unpopped kernels per
bag. Suppose a random sample of 7 bags of Pop Secret was selected, popped, and the number of unpopped
kernels was counted in each bag. Based on the data below, calculate a 90% confidence interval for the true
mean number of unpopped kernels.
18
21
21
22
25
26
28
57. A Fox News poll was conducted among a nationwide random sample of 900 registered voters, interviewed
by telephone February 27-28, 2007. According to the poll, “34% approve of the job George W. Bush is
doing as president.” A similar poll was conducted among a nationwide random sample of 900 registered
voters, interviewed by telephone March 18-19, 2008. According to this poll, “30% approve of the job
George W. Bush is doing as president.”
a) Use a 90% confidence interval to estimate the true change in the proportion of registered voters who
approve of the job George W. Bush is doing as president from February 2007 to March 2008.
b) In your calculations for part (a), what was the standard error? What was the margin of error?
c) Based on your interval, can you conclude that the proportion of registered voters who approve of the
job George W. Bush is doing as president has decreased in the time between the polls?
58. To investigate the effect of background noise on learning, a psychologist recruited 20 volunteers from a local
high school and randomly divided them into two treatment groups. Each group was asked to read several selections
from a literature book and answer questions to measure their comprehension. While the first group was taking the
test, the psychologist was in the room having a conversation on her cell phone. However, the second group was
allowed to take the test in complete silence. The number of correct responses for each student is recorded in the
table below.
Noise   Silence
10      12
14      11
12      18
7       12
8       9
13      15
8       10
10      17
12      14
17      10
a) Can you conclude that the presence of background noise has a negative effect on reading comprehension? Use a
test with α = .05.
b) Describe the power of this test in the context of this problem.
c) If you used a 2 sample procedure to analyze this study, describe how you would alter the design so a
matched-pairs procedure would be appropriate. If you used a matched pairs procedure to analyze this study,
describe how you would alter the design so a 2 sample procedure would be appropriate.
59. An article in Nutrition Action (March 2009) summarizes an experiment on the relationship between
sleeping and snacking. They kept 11 sedentary men and women in a sleep lab for 4 weeks. When awake,
participants had unlimited access to meals, snacks, and entertainment. They were allowed to sleep 8.5 hours
each night for a 2 week period and then only 5.5 hours per night for a 2 week period. On the days they were
allowed to sleep less, they consumed an average of 220 more calories per day.
a) Should this be analyzed with a 2 sample t-test or a matched pairs t-test? Explain.
b) What are the conditions for the test you chose? Are they met?
c) State the hypotheses the researchers were interested in testing. Make sure to define any parameters
you use in the hypotheses.
d) Suppose that the p-value for this test was .001. Interpret this value in the context of the question.
e) What conclusion should the researchers make?
f) If the conclusion that the researchers made was incorrect, which type of error, Type I or Type II did
they commit? Explain.
g) If the researchers were to make a confidence interval using this data, would it include 0? Explain.
h) If you chose a 2-sample t-test in part (a), explain how you could redesign the experiment so a
matched pairs t-test would be appropriate. If you chose a matched pairs t-test in part (a), explain
how you could redesign the experiment so a 2 sample t-test would be appropriate.
60. A company that sells concrete blocks is receiving complaints from customers who say the weights of the
blocks vary too much. So, the company is considering a new, more expensive machine and decides to test it
out to see if the blocks it produces will have less variability than the blocks produced using the current
machine. To do this they will randomly select 10 blocks produced from each of the machines and compare
the difference of their standard deviations.
a) State the hypotheses the company is interested in testing. Make sure to define any parameters used
in your hypotheses.
b) Compute the difference of the sample standard deviations using the data below. Does this difference
give evidence that the new machine is more consistent?
Weights (ounces)
Old 312 320 324 316 335 325 321 320 320 320
New 321 318 317 313 325 319 319 320 320 319
c) To see if the difference is statistically significant, they do 100 trials of a simulation. In each trial of
the simulation, they randomly shuffle all 20 weights, randomly assign 10 to the new machine,
the remaining 10 weights to the old machine, and then compute the difference of sample standard
deviations. Use the simulated differences shown in the dotplot below to estimate the p-value.
[Figure: dotplot of Simulated Differences of Sample Standard Deviations, axis from -4 to 4]
d) Based on your estimated p-value in part (c), what conclusion should the company make?
e) If the company makes an error in part (d), which type of error did it make? Explain.
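The shuffling procedure in part (c) can be sketched in Python as follows. The sketch assumes the difference is computed as (old SD) minus (new SD), so that large positive values support the new machine being more consistent; the problem does not state the direction. The function names and seed are illustrative, and a fresh run will not reproduce the worksheet's dotplot exactly.

```python
import random
import statistics

# Weights (ounces) from the table in part (b)
old = [312, 320, 324, 316, 335, 325, 321, 320, 320, 320]
new = [321, 318, 317, 313, 325, 319, 319, 320, 320, 319]


def sd_difference(old_weights, new_weights):
    """Difference of sample standard deviations, old minus new (assumed direction)."""
    return statistics.stdev(old_weights) - statistics.stdev(new_weights)


def simulate_differences(trials=100, seed=1):
    """Shuffle all 20 weights, deal 10 to 'new' and 10 to 'old', record the SD difference."""
    rng = random.Random(seed)
    pooled = old + new
    differences = []
    for _ in range(trials):
        shuffled = pooled[:]
        rng.shuffle(shuffled)
        differences.append(sd_difference(shuffled[10:], shuffled[:10]))
    return differences


observed = sd_difference(old, new)
differences = simulate_differences()
# Estimated p-value: share of shuffled differences at least as large as the observed one
p_value = sum(d >= observed for d in differences) / len(differences)
print("observed difference:", round(observed, 2))
print("estimated p-value:", p_value)
```

Because the shuffling treats the machine labels as interchangeable, the simulated differences show what to expect if the two machines were equally consistent.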
61. (from Workshop Statistics) The leading digit of a number is simply its first (left-most) digit. A
mathematician named Benford conjectured that with many real datasets, 1 is the most common leading
digit, followed by 2, 3, and so on. In fact, Benford developed a formula that he used to predict the
proportion of data values that have a certain leading digit. His formula says that the proportion having a
leading digit of i is log₁₀(1 + 1/i). This works out to the following:
Leading Digit   1      2      3      4      5      6      7      8      9      Total
Probability     .301   .176   .125   .097   .079   .067   .058   .051   .046   1.000
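As a quick check of the table, the probabilities can be recomputed from Benford's formula. The short Python sketch below does so; the function name is an illustrative choice.

```python
import math


def benford_probability(digit):
    """Benford's predicted proportion of values with the given leading digit (1 through 9)."""
    return math.log10(1 + 1 / digit)


benford_table = {digit: round(benford_probability(digit), 3) for digit in range(1, 10)}
total = sum(benford_probability(digit) for digit in range(1, 10))

for digit, probability in benford_table.items():
    print(digit, probability)
print("total:", round(total, 3))
```

The nine probabilities sum to 1 because the product of the terms (1 + 1/i) for i = 1 to 9 is 10, and log₁₀(10) = 1.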
Let’s investigate how well this Benford model fits a sample of real data from the 2000 Census. A random
sample of 358 US residents was selected and their total personal income was recorded. The leading digit for
each income was recorded and summarized in the table below. For example, an income of $13,000 has a
leading digit of 1 and a salary of $2.3 million has a leading digit of 2.
Leading Digit       1     2    3    4    5    6    7    8    9    Total
Number of Incomes   104   60   64   37   23   21   19   16   14   358
a) Display these data with a bar graph, using sample proportion (rather than observed count) for the
vertical axis. Comment on whether the bar graph appears to be consistent with the probabilities from
Benford’s model.
b) Use a test to determine if the distribution of personal incomes in the US is consistent with Benford’s
model.
c) Explain how this question suggests a way to check tax returns for fraud.
62. In their first semester project Kala Stepter and Liz Wong investigated whether the wording of a question can
create response bias. To do this, they created three different survey questions and randomly selected
students to answer one of the three questions (also chosen at random). The three questions were:
A: “Should minors be allowed to consume alcohol?”
B: “Each year approximately 5000 people under the age of 21 die as a result of underage drinking.
Should minors be allowed to consume alcohol?”
C: “You are legally considered an adult when you are 18 gaining voting rights and other privileges.
Should minors be allowed to consume alcohol?”
The results are in the table below:
A B C
Yes 11 6 16
No 9 15 7
a) Does the data provide convincing evidence that the wording of a question creates response bias?
b) Interpret the p-value for this study.
63. Suppose that you stumbled on the table below and had no idea how the data was collected. Describe a
sampling procedure that would lead to a test of independence and then describe a sampling procedure which
would lead to a test of homogeneity of proportions.
                        Undergraduate Students   Graduate Students   Total
Live on Campus          103                      12                  115
Do Not Live on Campus   228                      119                 347
Total                   331                      131                 462
64. To help determine the price of his Honda CR-V, a statistics teacher took a random sample of 20 recent CR-V
sales and recorded the age of the car (in years) and the selling price (in thousands of dollars). Use the computer
output below to answer the questions that follow.
Predictor   Coef       SE Coef   T        P
Constant    21.8405    0.3871    56.42    0.000
Age         -2.17426   0.09912   -21.94   0.000
S = 0.811030   R-Sq = 96.4%   R-Sq(adj) = 96.2%
[Figure: Scatterplot of Price vs Age]
[Figure: Residuals Versus Age (response is Price)]
[Figure: Normal Probability Plot of the Residuals (response is Price)]
a) Interpret each number in the table above, except for R-Sq(adj).
b) Are the conditions for inference satisfied? Justify.
c) Is there a significant association between age and price?
d) Construct and interpret a 95% confidence interval for the average decrease in price for each
additional year the CR-V ages.
e) Construct a 95% confidence interval for the average price of a new CR-V. Calculations only.
f) According to a certain website, CR-V’s lose about $2000 in value each year. Is this data consistent
with this claim?
Choosing the Correct Testing Procedure: For each of the following scenarios, identify the inference
procedure you would use and state the hypotheses for the appropriate test (define variables when appropriate).
1. A teacher wants to know if the method of instruction affects how well students learn. Using two classes of
the same level of statistics, she teaches one class using lecture only and the other class using lecture and
group work. She measures the level of learning by giving both classes the same test.
2. A student of political science wished to determine whether there is a relationship between the gender of a
student and their political affiliation.
3. A student wishes to test if SUV drivers in his state are more likely to be male than female. He randomly
selects 50 students from a list of registered SUV drivers and records their gender.
4. Your friend in Portland claims that many drivers who pass her while she awaits the school bus are talking
on a cell phone. You think it’s a worse problem in your hometown.
5. In your psychology class, your group (5 students) wants to investigate the relative intelligence of mice. You
decide to perform an experiment on mice, using mazes. Each of you has one male and one female mouse at
home (for a total of 10 mice), and you each build a different maze. Each of you will allow each mouse one
trial and record the time to reach the cheese at the end of the maze.
6. Xylitol is a food sweetener that may also have antibacterial properties. In an experiment conducted in
Finland, 1 group of children regularly chewed gum with Xylitol, 1 group regularly took Xylitol lozenges,
and a third group regularly chewed gum that did not contain Xylitol. The experiment lasted 3 months and
researchers noted whether each child had an ear infection during that period.
7. Is there a relationship between the number of years a teacher has worked and their annual salary?
8. Researchers have noted that sleep deprivation leads to car accidents and other mistakes, often due to
inattention or slower reaction time. In order to examine the level of sleep deprivation in high school
students, a researcher performs the following study. At 10 a.m. on a particular school day, students in two
classes play a computer game that is actually recording the time it takes them to negotiate a mental obstacle
course. At 2 p.m. that day, one of the classes is given 30 minutes in a silent, dark room with comfortable
furniture, and the students are allowed to sleep. The other class has regular classes. At 3 p.m., both classes
play the computer game again. The researcher records the differences in the times it takes each student to
complete the game.
9. Suppose that 25% of all Hondas produced last year were white, 25% silver, 20% black, 15% blue, 10%
green, and 5% other. To see if they should change the distribution of colors for cars produced next year,
Honda takes a random sample of potential car buyers and asks what color they prefer the most.
10. Suppose that the 2000 Census showed that the mean household income in the US was $51,000. A random
sample of Californians was taken to see if Californians make more money than the rest of the country.
11. Which brand of AAA batteries last longer, Duracell or Eveready?
12. According to a recent survey, a typical teenager has 38 contacts stored in his/her cell phone. Is this true at
your school?
13. Do the majority of students at your school have a MySpace or Facebook page?
14. Is there a relationship between the age of a car and the number of miles it has been driven?
15. Is there a relationship between the type of music a student prefers and the student’s favorite academic
subject?
16. Is one gender more likely to own an iPod?
17. Do students spend at least 1 minute brushing their teeth, on average?
18. Are the colors uniformly distributed in Froot Loops cereal?
19. Which brand of razor gives a closer shave? To answer this question, researchers recruited 25 men to shave
one side of their face with Razor A and the other side of their face with Razor B. After 12 hours the length
of the men’s whiskers was measured.
20. To see what factors influence heart attacks, subjects were recruited for an experiment and randomly
assigned to one of three treatment groups: low fat diet, exercise, and both.