Preliminaries
Introduction to Statistical Investigations
Have you ever heard statements like these?
 “Don’t get your child vaccinated. I vaccinated my child and now he is autistic.”
 “I will never start jogging because my friend’s dad jogged his whole life but he died at
age 46 of a heart attack.”
 “Teenagers shouldn’t be allowed to drive. Just last year there was a terrible accident at
our high school.”
The people making these statements are using anecdotal evidence (personal observations or
striking examples) to support broad conclusions. The first person concludes that vaccinations
cause autism, based solely on her own child. The second concludes that running is too risky
and could cause heart attacks, based entirely on the experience of one acquaintance. The third
person also judges risk based on a single striking incident.
Scientific conclusions cannot be based on anecdotal evidence. Science requires evidence from
data. Statistics is the science of producing useful data to address a research question,
analyzing the resulting data, and drawing appropriate conclusions from the data.
For example, suppose you are running for a student government office and have two different
campaign slogans in mind. You’re curious about whether your fellow students would react more
positively to one slogan than the other. Would you ask only for your roommate’s opinion, or
several of your friends? Or could you conduct a more systematic study? What might that look
like? The study of Statistics will help you see how to design and carry out such a study, and you
will see how Statistics can also help to answer many important research questions from a wide
variety of fields of application.
Example P.1: Organ Donations
Organ donations save lives. But recruiting organ donors is difficult,
even though surveys show that about 85% of Americans approve of
organ donations in principle and many states offer a simple organ
donor registration process when people apply for a driver’s license.
However, only about 38% of licensed drivers in the United States are registered to be organ
donors. Some people prefer not to make an active decision about organ donation because the
topic can be unpleasant to think about. But perhaps phrasing the question differently could affect people’s willingness to become donors.
Johnson and Goldstein (2003) recruited 161 participants for a study, published in the journal
Science, to address this question of organ donor recruitment. The participants were asked to
imagine they have moved to a new state and are applying for a driver’s license. As part of this
application, the participants were to decide whether or not to become an organ donor. What
differed was the default option that the participants were presented:
 Some of the participants were forced to make a choice of becoming a donor or not, without being given a default option (the “neutral” group).
 Other participants were told that the default option was not to be a donor but that they could choose to become a donor if they wished (the “opt-in” group).
 The remaining participants were told that the default option was to be a donor but that they could choose not to become a donor if they wished (the “opt-out” group).
© Fall 2013, Tintle et al.; to be published by Wiley and Sons, not to be modified without permission
What did the researchers find? Those given the “opt-in” strategy were much less likely to agree
to become donors. Consequently, policy makers have argued that we should employ an “opt-out” strategy instead. Individuals can still choose not to donate, but would have to more actively
do so rather than accept the default. Based on their results, Johnson and Goldstein stated that
their data “suggest changes in defaults could increase donations in the United States of
additional thousands of donors a year.” In fact, as of 2010, 24 European countries had some form of the opt-out system (which some call “presumed consent”), with Spain, Austria, and Belgium yielding high donor rates.
Why were Johnson and Goldstein able to make such a strong recommendation? Because
rather than relying on their own opinions or on anecdotal evidence, they conducted a carefully
planned study of the issue using sound principles of science and statistics. Similar to the
scientific method, we now identify six steps of a statistical investigation:
Six Steps of a Statistical Investigation
Step 1: Ask a research question that can be addressed by collecting data. These
questions often involve comparing groups, asking whether something affects something
else, or assessing people’s opinions.
Step 2: Design a study and collect data. This involves selecting the people or objects
to be studied and deciding how to gather relevant data on them.
Step 3: Explore the data, looking for patterns related to your research question as well
as unexpected outcomes that might point to additional questions to pursue.
Step 4: Draw inferences beyond the data by determining whether any findings in your
data reflect a genuine tendency and estimating the size of that tendency.
Step 5: Formulate conclusions that consider the scope of the inference made in Step
4. To what underlying process or larger group can these conclusions be generalized? Is
a cause-and-effect conclusion warranted?
Step 6: Look back and ahead to point out limitations of the study and suggest new
studies that could be performed to build on the findings of the study.
Let’s see how the organ donation study followed these steps.
Step 1: Ask a research question. The general question here is whether a method can be
found to increase the likelihood that a person agrees to become an organ donor. This question
was then sharpened into a more focused one: Does the default option presented to driver’s
license applicants influence the likelihood of someone becoming an organ donor?
Step 2: Design a study and collect data. The researchers decided to recruit various
participants and ask them to pretend to apply for a new driver’s license. The participants did not
know in advance that different options were given for the donor question, or even that this issue
was the main focus of the study. These researchers recruited participants for their study through
various general interest bulletin boards on the internet. They offered an incentive of $4.00 for
completing an online survey. After the results were collected, the researchers removed data
arising from multiple responses from the same IP address, surveys completed in less than five
seconds, and respondents whose residential address could not be verified.
Step 3: Explore the data. The results of this study were:
 44 of the 56 participants in the neutral group agreed to become organ donors,
 23 of 55 participants in the opt-in group agreed to become organ donors, and
 41 of 50 participants in the opt-out group agreed to become organ donors.
The proportions who agreed to become organ donors are 44/56 ≈ .786 (or 78.6%) for the
neutral group, 23/55 ≈ .418 (or 41.8%) for the opt-in group, and 41/50 = .820 (or 82.0%) for the
opt-out group. The Science article displayed a graph of these data similar to Figure P.1.
Figure P.1: Percentages for Organ Donation Study
These results indicate that the neutral version of the question, forcing participants to make a
choice between becoming an organ donor or not, and the opt-out option, for which the default is
to be an organ donor, produced a higher percentage who agreed to become donors than the
opt-in version for which the default is not to be a donor.
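The proportions reported in Step 3 can be reproduced directly from the counts; here is a minimal Python sketch (the group names are just labels, not identifiers from the study):

```python
# Counts reported in Step 3 of the organ donation study
groups = {"neutral": (44, 56), "opt-in": (23, 55), "opt-out": (41, 50)}

for name, (agreed, total) in groups.items():
    proportion = agreed / total
    print(f"{name}: {agreed}/{total} = {proportion:.3f} ({proportion:.1%})")
```

Running this prints 0.786 (78.6%), 0.418 (41.8%), and 0.820 (82.0%), matching the values above.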
Step 4: Draw inferences beyond the data. Using methods that you will learn in this course,
the researchers analyzed whether the observed differences between the groups were large enough to indicate that the default option had a genuine effect, and then estimated the size of
that effect. In particular, this study reported strong evidence that the neutral and opt-out
versions do lead to a higher chance of agreeing to become a donor, as compared to the opt-in
version currently used in many states. In fact, they could be quite confident that the neutral
version increases the chances that a person agrees to become a donor by between 20 and 54
percentage points, a difference large enough to save thousands of lives per year in the United
States.
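The “between 20 and 54 percentage points” statement can be roughly reproduced with a standard two-proportion confidence interval. This is only a sketch using the common Wald interval; the published analysis may have used a different method:

```python
import math

# Sample proportions for the neutral and opt-in groups
p_neutral = 44 / 56   # about 0.786
p_optin = 23 / 55     # about 0.418
diff = p_neutral - p_optin

# Standard error of the difference of two independent proportions
se = math.sqrt(p_neutral * (1 - p_neutral) / 56 + p_optin * (1 - p_optin) / 55)

# 95% confidence interval using z* = 1.96
lo, hi = diff - 1.96 * se, diff + 1.96 * se
print(f"95% CI for the difference: ({lo:.3f}, {hi:.3f})")
# roughly (0.199, 0.537): about 20 to 54 percentage points
```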
Step 5: Formulate conclusions. Based on the analysis of the data and the design of the study,
it is reasonable for these researchers to conclude that the neutral version causes an increase in
the proportion who agree to become donors. But because the participants in the study were
volunteers recruited from internet bulletin boards, generalizing conclusions beyond these
participants is only legitimate if they are representative of a larger group of people.
Step 6: Look back and ahead. The organ donation study provides strong evidence that the
neutral or opt-out wording could be helpful for improving organ donation proportions. One
limitation of the study is that participants were asked to imagine how they would respond, which
might not mirror how people would actually respond in such a situation. A new study might look
at people’s actual responses to questions about organ donation or could monitor donor rates for
states that adopt a new policy. Researchers could also examine whether presenting educational
material on organ donation might increase people’s willingness to donate. Another improvement
would be to include participants from wider demographic groups than these volunteers.
Part of looking back also considers how an individual study relates to similar studies that have
been conducted previously. Johnson and Goldstein compare their study to two others: one by
Gimbel et al. (2003) that found similar results with European countries and one by Caplan
(1994) that did not find large differences in the proportion agreeing to donate between the three
default options.
Figure P.2 displays the six steps of a statistical investigation that we have identified:
Figure P.2: Six Steps of a Statistical Investigation
[Flowchart:]
1. Ask a research question (Research Hypothesis)
2. Design a study and collect data
3. Explore the data
4. Draw inferences (Logic of Inference: Significance, Estimation)
5. Formulate conclusions (Scope of Inference: Generalization, Causation)
6. Look back and ahead
Four Pillars of Statistical Inference
Notice from Figure P.2 that Step 4 can be considered as the logic of statistical inference and
Step 5 as the scope of statistical inference. Furthermore, each of these two steps involves two
components. The following questions comprise the four pillars of statistical inference:
1. Significance: How strong is the evidence of an effect?
You will learn how to provide a measure of the strength of the evidence provided
by the data that the neutral and opt-out versions increase the chance of agreeing
to become an organ donor, as compared to the opt-in version.
2. Estimation: What is the size of the effect?
You will learn how to estimate how much higher (if any) the chance someone
agrees to donate organs when asked with the neutral version is compared to the
other versions.
3. Generalization: How broadly do the conclusions apply?
You will learn to consider what larger group of individuals you believe these conclusions can be applied to.
4. Causation: Can we say what caused the observed difference?
You will learn whether we can legitimately conclude that the version of the
question was the cause of the observed differences in the proportion who agreed
to become organ donors.
These four concepts are so important that they should be addressed in virtually all statistical
studies. Chapters 1-4 of this book will be devoted to introducing and exploring these four pillars
of inference. To begin our study of the six steps of statistical investigation, we now introduce
some basic terminology that will be used throughout the text.
Basic Terminology
 Data can be thought of as the values measured or categories recorded on individual entities of interest.
 These individual entities on which data are recorded are called observational units.
 The recorded characteristics of the observational units are the variables of interest.
o Some variables are quantitative, taking numerical values on which ordinary arithmetic operations make sense.
o Other variables are categorical, taking category designations.
 The distribution of a variable describes the pattern of value/category outcomes.
In the organ donation study, the observational units are the participants in the study. The two
variables recorded on these participants are the version of the question that the participant
received, and whether or not the participant agreed to become an organ donor. Both of these
are categorical variables. The graph in Figure P.1 displays the distributions of the donation
variable for each default option category.
The observational units in a study are not always people. For example, you might take the
Reese’s Pieces candies in a small bag as your observational units, on which you could record
variables such as the color (a categorical variable) and weight (a quantitative variable) of each
individual candy. Or you might take all of the Major League Baseball games being played this
week as your observational units, on which you could record data on variables such as the total
number of runs scored, whether the home team wins the game, and the attendance at the
game.
Think about it: For each of the three variables just mentioned (about Major League Baseball
games), identify the type of variable: categorical or quantitative.
The total number of runs scored and attendance at the game are quantitative variables.
Whether or not the home team won the game is a categorical variable.
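These distinctions can be made concrete in code. In the sketch below, each dictionary is one observational unit (a game) and each key is a variable; the values are invented purely for illustration, not real game data:

```python
# Each dict is one observational unit: a Major League Baseball game.
# The values here are made up for illustration only.
games = [
    {"total_runs": 9, "home_team_won": True, "attendance": 31250},
    {"total_runs": 4, "home_team_won": False, "attendance": 18775},
    {"total_runs": 11, "home_team_won": True, "attendance": 42108},
]

# Quantitative variables support arithmetic, such as averaging
average_runs = sum(g["total_runs"] for g in games) / len(games)

# Categorical variables are summarized by counting category membership
home_wins = sum(1 for g in games if g["home_team_won"])

print(f"average total runs: {average_runs:.1f}, home-team wins: {home_wins}")
```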
Think about it: Identify the observational units and variable for a recent study (Ackerman,
Griskevicius, and Li, 2011) that investigated this research question: Among heterosexual
couples in a committed romantic relationship, are men more likely than women to say “I love
you” first?
The observational units in this study are the heterosexual couples, and the variable is whether
the man or the woman was the first to say “I love you.”
Coming Next
Next you will explore two situations where observing and summarizing data are helpful in
making decisions. In Example P.2, you will encounter data arising from a natural “data-generating” process that repeats the same “random event” many, many times, which allows us
to see a pattern (distribution) in the resulting data. Then in Exploration P.3, you will examine
data from a purely random process (like rolling dice), to see how to use that information to make
better decisions. In subsequent chapters, you will analyze both data-generating processes and
random processes, often with the goal of seeing how well a random process models what you
find in data.
Example P.2: Old Faithful
Millions of people from around the world flock to Yellowstone Park in order
to watch eruptions of Old Faithful geyser. But, just how faithful is this
geyser? How predictable is it? How long does a person usually have to wait
between eruptions?
Suppose the park ranger gives you a prediction for the next eruption time,
and then that eruption occurs five minutes after that predicted time. Would
you conclude that predictions by the Park Service are not very accurate?
We hope not, because that would be using anecdotal evidence. To
investigate these questions about the reliability of Old Faithful, it is much
better to collect data.
(A live webcam of Old Faithful and surrounding geysers is available at:
http://www.nps.gov/yell/photosmultimedia/yellowstonelive.htm.)
Researchers collected data on 222 eruptions of Old Faithful taken over a number of days in
August 1978 and August 1979. Figure P.3 contains a graph (called a dotplot) displaying the
times until the next eruption (in minutes) for these 222 eruptions. Each dot on the dotplot
represents a single eruption.
Figure P.3: Times between eruptions of Old Faithful geyser
[Dotplot; horizontal axis: time until next eruption (min), from 40 to 100]
Think about it: What are the observational units and variable in this study? Is the variable
quantitative or categorical?
The observational units are the 222 geyser eruptions, and the variable is the time until the next
eruption, which is a quantitative variable. The dotplot displays the distribution of this variable,
which means the values taken by the variable and how many eruptions have those values. The
dotplot helps us see the patterns in times until the next eruption.
The most obvious point to be seen from this graph is that even Old Faithful is not perfectly
predictable! The time until the next eruption varies from eruption to eruption. In fact, variability
is the most fundamental property in studying Statistics.
We can view these times until the next eruption as observations from a process, an endless
series of potential observations from which our data constitute a small “snapshot.” Our
assumption is that these observations give us a representative view of the long-run behavior of
the process. Although we don’t know in advance how long it will take for the next eruption, in
part because there are many factors that determine when that will be (e.g., temperature,
season, pressure), and in part because of unavoidable, natural variation, we may be able to see
a predictable pattern overall if we record enough inter-eruption times. Statistics helps us to
describe, measure, and often explain the pattern of variation in these measurements.
Looking more closely at the dotplot, we can notice several things about the distribution of the
time until the next eruption:
 The shortest time until the next eruption was 42 minutes, and the longest time was 95
minutes.
 There appear to be two clusters of times, one cluster between roughly 42 and 63
minutes, another between about 66 and 95 minutes.
 The lower cluster of inter-eruption times is centered at approximately 55 minutes,
whereas the upper cluster is centered at approximately 80 minutes. Overall, the
distribution of times until the next eruption is centered at approximately 75 minutes.
 In the lower cluster, times until next eruption range from 42 to 63 minutes, with most of
the times between 50-60 minutes. In the upper cluster, times range from 66 to 95
minutes, with most between 75-85 minutes.
What are some possible explanations for the variation in the times? One thought is that some of
the variability in times until next eruption might be explained by considering the duration length
of the previous eruption. It seems to make sense that after a particularly long eruption, Old
Faithful might need more time to build enough pressure to produce another eruption. Similarly,
after a shorter eruption, Old Faithful might be ready to erupt again without having to wait very
long. Fortunately, the researchers recorded a second variable about each eruption: the duration
of the eruption, which is another quantitative variable. For simplicity we can categorize each
eruption’s duration as short (less than 3.5 minutes) or long (3.5 minutes or longer), a categorical
variable. Figure P.4 displays dotplots of the distribution of time until next eruption for short and
long eruptions separately.
Figure P.4: Times between eruptions of Old Faithful geyser, separated by duration of previous eruption (less than 3.5 minutes or at least 3.5 minutes)
[Two dotplots labeled “short” and “long” by eruption type; horizontal axis: time until next eruption (min), from 40 to 100]
We can make several observations about the distributions of times until next eruption,
comparing eruptions with short and long durations, from this dotplot:
 The shapes of each individual distribution no longer reveal the two distinct clusters (bimodality) that were apparent in the original distribution before separating by duration length. Each of these distributions seems to have a single peak.
 The centers of these two distributions are quite different: After a short eruption, a typical
time until the next eruption is between 50 and 60 minutes. In contrast, after a long
eruption, a typical time until the next eruption is between 75 and 85 minutes.
 The variability in the times until the next eruption is much smaller for each individual
distribution (times tend to fall closer to the mean within each duration type) than the
variability for the overall distribution, as we have been able to take into account one
source of variability in the data. But of course the times still vary, partly due to other
factors that we have not yet accounted for and partly due to natural variability inherent
in all random processes.
One way to measure the center of a distribution is with the average, also called the mean. One
way to measure variability is with the standard deviation, which is roughly the average
distance between a data value in the distribution and the mean of the distribution. (See the
Appendix for details about calculating standard deviation, which will also be explored further in
Chapter 3.) These values for the time until next eruption, both for the overall distribution and for
short and long eruptions separately, are given in Table P.1.
Table P.1: Means and Standard Deviations of Inter-Eruption Times

                      Mean   Standard deviation
Overall               71.0   12.8
After short duration  56.3    8.5
After long duration   78.7    6.3
Notice that the standard deviations (SD) of time until next eruption are indeed smaller for the
separate groups than for the overall dataset, as suggested by examining the variability in the
dotplots.
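The comparison summarized in Table P.1 can be sketched with Python's statistics module. The inter-eruption times below are hypothetical stand-ins chosen for illustration, not the actual 222 recorded values:

```python
import statistics

# Hypothetical inter-eruption times (minutes), for illustration only
after_short = [48, 50, 53, 54, 55, 56, 57, 58, 60, 62]
after_long = [74, 76, 77, 78, 79, 80, 81, 82, 83, 85]
overall = after_short + after_long

for label, times in [("overall", overall),
                     ("after short", after_short),
                     ("after long", after_long)]:
    mean = statistics.mean(times)
    sd = statistics.stdev(times)  # sample standard deviation
    print(f"{label:12s} mean = {mean:.1f}, SD = {sd:.1f}")

# As in Table P.1, each group's SD is smaller than the overall SD,
# because grouping by duration accounts for one source of variability.
```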
Figure P.5: Times between eruptions of Old Faithful geyser, separated by duration of previous eruption, with mean and standard deviation shown
[Two dotplots by eruption type; short: Mean = 56.3, SD = 8.5; long: Mean = 78.7, SD = 6.3; horizontal axis: time until next eruption (min), from 40 to 100]
So, what do you learn from this analysis? First, you can better predict when Old Faithful will
erupt if you know how long the previous eruption lasted. Second, with that information in hand,
Old Faithful is rather reliable, because it often erupts within six to nine minutes of the time you
would predict based on the duration of the previous eruption. So if the park ranger’s prediction
was only off by five minutes, that’s pretty good.
Basic Terminology
From this example you should have learned that a graph such as a dotplot can display the
distribution of a quantitative variable. Some aspects to look for in that distribution are:
 Shape: Is the distribution symmetric? Mound-shaped? Are there several peaks or
clusters?
 Center: Where is the distribution centered? What is a typical value?
 Variability: How spread out are the data? Are most within a certain range of values?
 Unusual observations: Are there outliers that deviate markedly from the overall pattern
of the other data values? If so, identify them to see if you can explain why those
observations are so different. Are there other unusual features in the distribution?
You have also begun to think about ways to measure the center and variability in a distribution.
In particular, the standard deviation is a tool we will use quite often as a measure of variability.
At this point, we want you to be comfortable visually comparing the variability among
distributions and anticipating which variables you might expect to have more variability than
others.
Think about it: Suppose that Mary records the ages of people entering a McDonald’s fast-food
restaurant near the interstate today, while Colleen records the ages of people entering a snack
bar on a college campus. Who would you expect to have the larger standard deviation of these
ages: Mary (McDonald’s) or Colleen (campus snack bar)? Explain briefly.
The customers at McDonald’s are likely to include people of all ages, from young children to
elderly people. But customers at the campus snack bar are most likely to be college-aged
students, with some older people who work on campus and perhaps a few younger people.
Therefore the ages of the customers at McDonald’s will vary more than the ages of those at the
campus snack bar. Mary is therefore more likely to have a larger standard deviation of ages
than Colleen.
Exploration P.3: Cars or Goats
A popular television game show (Let’s Make a
Deal from the 1960s and 1970s) featured a
new car hidden behind one of three doors,
selected at random. Behind the other two
doors were less appealing prizes (e.g., goats!).
When a contestant played the game, he or she
was asked to pick one of the three doors. If the
contestant picked the correct door, he or she
won the car!
1. Suppose you are a contestant on this show. Intuitively, what do you think is the
probability that you win the car (i.e., that the door you pick has the car hidden behind it)?
2. Give a one-sentence description of what you think probability means in this context.
Assuming there is no set pattern to where the game show puts the car initially, this game is an
example of a random process: Although the outcome for an individual game is not known in
advance, we expect to see a very predictable pattern in the results if you play this game many,
many times. This pattern is called a probability distribution, similar to a data distribution as
you examined with Old Faithful inter-eruption times. We are interested in features such as how
common certain outcomes are – e.g., are you more likely to win this game (select the door with
the car) or lose this game?
To investigate what we mean by the term probability, we ask you to play the game many times.
As the game show no longer exists, we will simulate (artificially re-create) playing the game,
keeping track of how often you win the car.
3. Use three playing cards with identical backs, but two of the card faces should match and
one should differ. The different card represents the car. Work with a partner (playing the
role of game show host), who will shuffle the three cards and then randomly arrange
them face down. You pick a card and then reveal whether you have won the car or
selected a goat. Play this game a total of 15 times, keeping track of whether you win the
car (C) or a goat (G) each time:
Game #                  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Outcome (car or goat)
4. In what proportion of these 15 games did you win the car? Is this close to what you
expected? Explain.
These fifteen “trials” or “repetitions” mimic the behavior of the game show’s random process,
where you are introducing randomness into the process by shuffling the cards between
games. To get a sense of the long-run behavior of this random process, we want to observe
many, many more trials. Because it is not realistic to ask you to perform thousands of
repetitions with your partner, we will turn to technology to continue to generate a large
number of outcomes from this random process.
5. Suppose that you were to play this game 1000 times. In what proportion of those games
would you expect to win the car? Explain.
6. Use the website http://www.grand-illusions.com/simulator/montysim.htm to simulate
playing this game 10 times. Be sure to use the “keep” choice on the left side, and
change the Run times to 10. Click on the “Start” button. Record the proportion of wins in
these 10 games. Then simulate another 10 games, and record the overall proportion of
wins at this point. Keep doing this in multiples of 10 games until you reach 100 games
played. Record the overall proportions of wins after each additional multiple of 10 games
in the table below.
Number of games      10   20   30   40   50   60   70   80   90   100
Proportion of wins
7. What do you notice about how the proportion of wins changes as you play more games?
Does this proportion appear to be approaching some common value?
8. Now change the Run times to 100 and click on “Start.” Repeat this until you
reach a total of 1000 games played. Calculate the proportion of wins by dividing the
number of wins by 1000. Is this close to what you expected in #5?
You should see that the proportion of wins generally gets closer and closer to 1/3 (or .3333) as
you play more and more games. This is what it means to say that the probability of winning is
1/3: If you play the game repeatedly under the same conditions, then after a very large number
of games, your proportion of wins should be very close to 1/3. Figure P.6 displays a graph
showing how the proportion of wins changed over time for one simulation of 1000 games.
Notice that the proportion of wins bounces around a lot at first but then gradually settles down
and approaches a long-run value of 1/3.
Figure P.6: Proportion of wins as more and more games are played
[Line plot; vertical axis: proportion of wins, from 0 to 1.0; horizontal axis: number of games, from 0 to 1000. The proportion fluctuates at first and then settles near 1/3.]
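The long-run behavior shown in Figure P.6 can be reproduced with a short Python simulation; the number of games and the random seed below are arbitrary choices:

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

def play_keep():
    """One game: the car is behind a random door and you keep your first pick."""
    car = random.randrange(3)
    pick = random.randrange(3)
    return pick == car

n_games = 100_000
wins = sum(play_keep() for _ in range(n_games))
print(f"proportion of wins: {wins / n_games:.3f}")  # settles close to 1/3
```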
Now consider a fun twist that the game show host adds to this game: Before revealing what’s
behind your door, the host will first reveal what’s behind a different door that the host knows to
be a goat. Then the host asks whether you (the contestant) prefer to stay with (keep) the door
you picked originally or switch (change) to the remaining door.
9. Prediction: Do you think the probability of winning is different between the “stay” (keep)
and “switch” (change) strategies?
Whether the “stay” or “switch” strategy is better is a famous mathematical question known as
the Monty Hall Problem, named for the host of the game show. Many people, including some
renowned mathematicians, got the solution wrong when this problem became popular through
the “Ask Marilyn” column in Parade magazine in 1990. We can approach this question as a
statistical one that you can investigate by collecting data.
10. Investigate the probability of winning with the “switch” strategy by playing with three
cards for 15 games. This time your partner (playing the role of game show host) should
randomly arrange the three cards in his/her hand so that your partner knows where the
car is but you do not. You pick a card. Then your partner reveals a card that is known to
be a goat and is not the card you chose. Play with the “switch” strategy for a total of 15
games, keeping track of the outcome each time:
Repetition:             1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Outcome (car or goat): __  __  __  __  __  __  __  __  __  __  __  __  __  __  __
11. In what proportion of these 15 games did you win the car? Is this more or less than (or
the same as) when you stayed with the original door? (question #4)
12. To investigate what would happen in the longer run, return to
http://www.grand-illusions.com/simulator/montysim.htm. Notice that you can change your
strategy from “keep” your original choice to “change” your original choice. Clear any
previous work, then simulate playing 1000 games with each strategy, and record the
number of times you win and lose with each:
                   “Stay” strategy    “Switch” strategy
Wins (cars)        _______            _______
Losses (goats)     _______            _______
Total              1000               1000
13. Do you believe that the simulation has been run for enough repetitions to declare one
strategy as superior? Which strategy is better? Explain how you can tell.
14. Based on the 1000 simulated repetitions of playing this game, what is your estimate for
the probability of winning the game with the “switch” strategy?
15. How could you use simulation to obtain a better estimate of this probability?
16. The probability of winning with the “switch” strategy can be shown mathematically to be
2/3. (One way to see this is to recognize that with the “switch” strategy, you only lose
when you had picked the correct door in the first place.) Explain what it means to say
that the probability of winning equals 2/3.
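The 2/3 claim in question 16 can also be checked by simulation. Below is a minimal Python sketch of the full game with a host reveal; it is our own code (not the textbook’s applet), and the function names are invented.

```python
import random

def play(strategy: str) -> bool:
    """Play one Monty Hall game and return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = random.choice(doors)
    pick = random.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = random.choice([d for d in doors if d != pick and d != car])
    if strategy == "switch":
        # Switch to the one remaining unopened door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def estimate(strategy: str, n_games: int = 10000) -> float:
    """Estimate the probability of winning with the given strategy."""
    return sum(play(strategy) for _ in range(n_games)) / n_games

print("stay  :", estimate("stay"))    # long-run proportion near 1/3
print("switch:", estimate("switch"))  # long-run proportion near 2/3
```

Notice that “switch” loses only when the original pick was the car, which is exactly the reasoning given in question 16.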
Extension
17. Suppose that you watch the game show over many years and find that door #1 hides the
car 50% of the time, door #2 has the car 40% of the time, and door #3 has the car 10%
of the time. What then is your optimal strategy? In other words, which door should you
pick initially, and then should you stay or switch? What is your probability of winning with
the optimal strategy? Explain.
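If you would like to explore question 17 by simulation before reasoning it out, the sketch below (our own hypothetical code; the 50%/40%/10% weights come straight from the question) estimates the winning probability for every combination of first door and strategy. For the “switch” strategy the host’s choice among goat doors does not matter, since switching loses exactly when the first pick was the car.

```python
import random

DOORS = [0, 1, 2]           # door #1, door #2, door #3
WEIGHTS = [0.5, 0.4, 0.1]   # long-run frequencies given in question 17

def play_weighted(first_pick: int, strategy: str) -> bool:
    """Play one game in which the car's location follows unequal frequencies."""
    car = random.choices(DOORS, weights=WEIGHTS)[0]
    pick = first_pick
    # Host opens a goat door other than the contestant's pick.
    opened = random.choice([d for d in DOORS if d != pick and d != car])
    if strategy == "switch":
        pick = next(d for d in DOORS if d != pick and d != opened)
    return pick == car

for first_pick in DOORS:
    for strategy in ("stay", "switch"):
        wins = sum(play_weighted(first_pick, strategy) for _ in range(20000))
        print(f"door #{first_pick + 1}, {strategy}: {wins / 20000:.3f}")
```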
Basic Terminology
Through this investigation you should have learned:
 A random process is one that can be repeated a very large number of times (in
principle, forever) under identical conditions with the following property:
o Outcomes for any one instance cannot be known in advance, but the proportion
of times that particular outcomes occur in the long run can be predicted well in
advance.
 The probability of an outcome refers to the long-run proportion of times that the
outcome would occur if the random process were repeated a very large number of times.
 Simulation (artificially re-creating a random process) can be used to estimate a
probability.
o Simulations can be conducted with both tactile methods (e.g., cards) and with
computers.
o Using a larger number of repetitions in a simulation generally produces a better
estimate of the probability.
 Simulation can be used for making good decisions involving random processes.
o A “good” decision (in this context) means you can accurately predict which
strategy would result in a larger probability of winning. This tells you which
strategy to use if you do find yourself on this game show, but of course does not
guarantee you will win!
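The point about the number of repetitions can be seen directly: repeat the same simulation several times at each of several sizes and compare how much the estimates vary. This short sketch (our own, using the basic keep-your-door game from earlier in the section) does exactly that.

```python
import random

def estimate_stay(n_games: int) -> float:
    """Estimate P(win) for the basic game by simulating n_games games."""
    wins = sum(random.randrange(3) == random.randrange(3) for _ in range(n_games))
    return wins / n_games

for n_games in (100, 1000, 10000):
    estimates = [estimate_stay(n_games) for _ in range(5)]
    spread = max(estimates) - min(estimates)
    print(f"n = {n_games:5d}: five estimates fall within {spread:.3f} of each other")
```

With more games per simulation, the five estimates cluster more tightly around the true probability of 1/3.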
Preliminaries Summary
This concludes your study of the preliminary but important ideas necessary to begin studying
Statistics. We hope you have learned that:
 Collecting data from carefully designed studies is more dependable than relying on
anecdotes for answering questions and making decisions.
 Statistical investigations, which can address interesting and important research
questions from a wide variety of fields of application, follow the six steps illustrated in
Figure P.2.
 Some data arise from processes that include a mix of systematic elements and natural
variation.
 All data display variability. Distributions of quantitative data can be analyzed with
dotplots, where we look for shape, center, variability, and unusual observations.
 Standard deviation is a widely used tool for quantifying variability in data.
 Random processes that arise from chance mechanisms display predictable long-run
patterns of outcomes. Probability is the language of random processes.
 Beginning in Chapter 1, we will use a random process to model a data-generating
process in order to assess whether the data process appears to behave similarly to the
random process.
 Using data to draw conclusions and make decisions requires careful planning in
collecting and analyzing the data, paying particular attention to issues of variability and
randomness.
Preliminaries Glossary
anecdotal evidence
Personal experience or striking example ....................................................................................... P-1
categorical variable
A variable whose outcomes are category designations............................................................... P-5
center
A middle or typical value of a quantitative variable....................................................................... P-9
distribution
The pattern of outcomes of a variable ....................................................................................P-5, P-7
dotplot
A graph with one dot representing the variable outcome for each observational unit ............ P-6
observational units
The individual entities on which data are recorded ...................................................................... P-5
probability distribution
The pattern of long-run outcomes from a random process .......................................................P-10
probability
The long-run proportion of times an outcome from a random process occurs.......................P-11
process
An endless series of potential observations .................................................................................. P-7
quantitative variable
A variable taking numerical values on which ordinary arithmetic operations make sense..... P-5
random process
A repeatable process with unknown individual outcomes but a long-run pattern ..................P-10
shape
A characteristic of the distribution of a quantitative variable ....................................................... P-9
simulation
Artificial recreation of a random process ......................................................................................P-10
six steps of a statistical investigation ............................................................................................ P-2
standard deviation
A measure of variability of a quantitative variable ........................................................................ P-8
Statistics
A discipline that guides researchers in collecting, exploring and drawing conclusions from
data ...................................................................................................................................................... P-1
variability
The spread in observations for a quantitative variable ................................................................ P-7
variables
Recorded characteristics of the observational units ..................................................................... P-5
© Fall 2013, Tintle et al.; to be published by Wiley and Sons, not to be modified without permission