Methods for a Single Categorical Variable - Binomial Distribution
We’ve been using simulation techniques to create a picture of what outcomes seem reasonable based on the “no
preference,” “no knowledge,” or “no difference” scenarios. However, instead of simulating each scenario a finite
number of times and ____________________ the p-value, we will focus on the situation in which we simulate the
experiment an ____________________ number of times. This will provide us with the ___________________
probabilities of interest. Also, this will fix the “problem” of different people getting different answers to the research
question using the same data (since each simulation is slightly different).
Example Revisited: Helper vs. Hinderer
The following graphic shows what the distribution of “no preference” would look like if we simulated the scenario an
infinite number of times, and plotted the number of infants who chose the Helper Toy each time. This is known as
the ____________________ distribution.
Using the Binomial Distribution to Make Decisions
To answer research questions that involve a single categorical variable, statisticians do not always turn to
simulations that draw random samples over and over again. Instead, we use the Binomial distribution,
which shows us how often each possible outcome would occur if we repeated the previous simulations an
infinite number of times.
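The link between repeated simulation and the Binomial distribution can be checked numerically. The Python sketch below is illustrative only and is not part of the course's JMP workflow. It assumes 16 trials with a success probability of 0.50 (the Helper vs. Hinderer setup under "no preference"), and the names simulate_counts and binomial_pmf are hypothetical helpers introduced here.

```python
import random
from math import comb

def simulate_counts(n, p, repetitions, seed=0):
    """Run `repetitions` simulated experiments of n trials each and return
    the relative frequency of every possible number of successes (0..n)."""
    rng = random.Random(seed)
    tallies = [0] * (n + 1)
    for _ in range(repetitions):
        successes = sum(rng.random() < p for _ in range(n))
        tallies[successes] += 1
    return [t / repetitions for t in tallies]

def binomial_pmf(x, n, p):
    """Exact probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 16, 0.5  # assumed values: 16 infants, no preference
exact = [binomial_pmf(x, n, p) for x in range(n + 1)]

for reps in (100, 10_000):
    approx = simulate_counts(n, p, reps)
    worst_gap = max(abs(a - e) for a, e in zip(approx, exact))
    print(f"{reps:>6} repetitions: largest gap from exact = {worst_gap:.4f}")
```

With more repetitions the simulated frequencies should generally sit closer to the exact values, which is the sense in which the Binomial distribution describes "infinitely many" simulations.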
Binomial Distribution – when can we use it?
The Binomial distribution can be used whenever the following conditions are satisfied:
1. There is a fixed number of trials, _____.
2. There are only _____ possible outcomes for each trial – a “success” and a “failure.”
3. The probability of ____________________ (p) remains the same for __________ trial.
4. The trials are ____________________.
Question:
1. Determine whether the conditions for the Binomial Distribution have been satisfied for the Helper vs.
Hinderer example. Make sure to give your answers IN CONTEXT of the scenario.

Fixed number of trials?

2 possible outcomes?

P(success) remains the same?

Trials are independent?
The formula for calculating probabilities using the binomial distribution is given below:
$$P(X = x) = \frac{n!}{x!\,(n - x)!}\, p^{x} (1 - p)^{n - x}$$
where p = P(Success on each trial).
However, we are not going to calculate the probabilities by hand, but will use JMP instead. There is a file on the
course website called Binomial_Probabilities.jmp which will be used to calculate binomial probabilities.
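For readers who want to see the formula evaluated outside JMP, here is a minimal Python sketch. It is not the course's method; the function name binomial_probability and the toy values (4 trials, p = 0.5) are assumptions chosen only so the arithmetic is easy to check by hand.

```python
from math import factorial

def binomial_probability(x, n, p):
    """P(X = x) = n! / (x! (n - x)!) * p^x * (1 - p)^(n - x)."""
    ways = factorial(n) // (factorial(x) * factorial(n - x))  # number of orderings
    return ways * p**x * (1 - p)**(n - x)

# Toy check: exactly 2 successes in 4 trials with p = 0.5.
# There are 6 orderings, each with probability 0.5**4 = 0.0625.
print(binomial_probability(2, 4, 0.5))  # 0.375
```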
The following two pieces of information are needed to calculate probabilities using the JMP file:
 The probability of success: P(success) = p
p = Probability infants choose the helper toy (assuming no toy preference) = __________
 The sample size (number of trials): n
n = number of infants in the study = __________
To change the number of trials in JMP, right-click on the n (number of trials) column and select Formula:
Then, change the value of n (number of trials) to 16 (since there were 16 infants in the study) as follows:
Click OK. Next, change the probability of success by right-clicking on the p (probability of success) column and
selecting Formula. Change the value to p = 0.50 (since there is a 50% chance of choosing the helper toy, assuming
infants really have no preference.)
Click OK and JMP should return the following output:
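As a cross-check on the JMP output, the sketch below builds a comparable table in Python for n = 16 and p = 0.50. It assumes the three JMP columns correspond to P(X = x), P(X ≤ x), and P(X ≥ x), which is what their names suggest; the function name binomial_table is hypothetical, and JMP's rounding may differ.

```python
from math import comb

def binomial_table(n, p):
    """Return rows (x, P(X = x), P(X <= x), P(X >= x)) for x = 0..n."""
    individual = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
    rows = []
    running = 0.0
    for x, prob in enumerate(individual):
        running += prob                   # cumulative: x or fewer
        x_or_more = sum(individual[x:])   # upper tail: x or more
        rows.append((x, prob, running, x_or_more))
    return rows

print(f"{'x':>2} {'individual':>11} {'cumulative':>11} {'x or more':>11}")
for x, ind, cum, upper in binomial_table(16, 0.5):
    print(f"{x:>2} {ind:>11.5f} {cum:>11.5f} {upper:>11.5f}")
```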
Questions:
2. What do the “Individual Binomial Probabilities” represent?
3. What is the probability of seeing exactly 8 infants choose the helper toy, assuming they have no preference?
4. What do the “Cumulative Binomial Probabilities” represent?
5. What is the probability of seeing 3 infants or fewer choose the helper toy, assuming they have no
preference?
6. What do the “Prob of x or more” probabilities represent?
7. What is the probability of seeing 14 or more infants choose the helper toy, assuming they have no
preference?
8. In Question 7, the p-value for this scenario was computed. Using this p-value, make a conclusion in context
regarding the research question of interest for this scenario.
Example Revisited: Bone Density
Recall that this example involves evaluating a new test for detecting low bone density in postmenopausal women to
determine whether it is better at early detection of the disease. A random sample of 248 postmenopausal women was
given the new test and treated accordingly, and 82 were diagnosed with low bone density.
Research Question – Is there evidence that the new test reduces the percentage of postmenopausal women
diagnosed with low bone density?
Question:
9. Determine whether the conditions for the Binomial Distribution have been satisfied for the Bone Density
example. Make sure to give your answers IN CONTEXT of the scenario.

Fixed number of trials?

2 possible outcomes?

P(success) remains the same?

Trials are independent?
The following two pieces of information are needed to calculate probabilities using the JMP file:
 The probability of success: P(success) = p
p = Current proportion of postmenopausal women with LBD (no change) = __________
 The sample size (number of trials): n
n = number of postmenopausal women in the study = __________
To change the number of trials in JMP, right-click on the n (number of trials) column and select Formula, and change
the value to 248 as follows:
Click OK. Next, change the probability of success by right-clicking on the p (probability of success) column and
selecting Formula, and change the value to 0.40 as shown below:
Click Apply and then OK. JMP should then return the following output:
Questions:
10. Based on the Binomial probabilities, what is the probability of observing 82 or fewer postmenopausal women
diagnosed with low bone density if the new test really makes no difference? That is, find the p-value.
11. Based on the p-value from Question 10, provide the conclusion in context for the research question of
interest.
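The "82 or fewer" probability is a lower-tail sum of individual binomial probabilities. The Python sketch below shows one way to compute it under the values entered in JMP (n = 248, p = 0.40). The function name prob_at_most is hypothetical, and the result is meant only as a check against the JMP table.

```python
from math import comb

def prob_at_most(x, n, p):
    """P(X <= x): add the individual probabilities for 0, 1, ..., x."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

# Values taken from the Bone Density example.
n, p_null, observed = 248, 0.40, 82
p_value = prob_at_most(observed, n, p_null)
print(f"P(X <= {observed}) = {p_value:.4f}")
```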
Example Revisited: Effectiveness of an Experimental Drug
Suppose a commonly prescribed drug for relieving nervous tension is believed to be only 70% effective. A new drug
was formulated and administered to a random sample of 20 adults who were suffering from nervous tension. Of
those 20 adults, 18 experienced relief when taking the new drug.
Research Question – Is the new experimental drug more effective than the old drug? That is, is the
experimental drug more than 70% effective?
Questions:
12. Determine whether the conditions for the Binomial Distribution have been satisfied for the Experimental
Drug example. Make sure to give your answers IN CONTEXT of the scenario.
13. Carry out a formal hypothesis test to answer the research question of interest.
Is the new experimental drug more effective than the old drug?
That is, is the experimental drug more than 70% effective?
Step 0:
H0:
Step 1:
Ha:
Observed number of successes = __________
Step 2:
What values are considered more extreme in this scenario? Bigger or Smaller
Therefore, the p-value = P(X _____ 18) = _____________
Step 3:
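The p-value step can be sketched in Python once the direction of "more extreme" is chosen. This is an illustrative sketch, not the course's JMP procedure: binomial_p_value and its direction argument are hypothetical, and the example assumes that "more than 70% effective" makes bigger counts more extreme.

```python
from math import comb

def binomial_p_value(observed, n, p_null, direction):
    """Tail probability under the null value p_null.
    direction="bigger"  -> P(X >= observed)
    direction="smaller" -> P(X <= observed)"""
    if direction == "bigger":
        xs = range(observed, n + 1)
    elif direction == "smaller":
        xs = range(0, observed + 1)
    else:
        raise ValueError("direction must be 'bigger' or 'smaller'")
    return sum(comb(n, x) * p_null**x * (1 - p_null)**(n - x) for x in xs)

# Experimental Drug example: 18 of 20 adults experienced relief, null value 0.70.
p_value = binomial_p_value(18, 20, 0.70, "bigger")
print(f"P(X >= 18) = {p_value:.4f}")
```

A small p-value means that a count this large would be unusual if the new drug were only 70% effective.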
Example Revisited: Obesity in America
In 2000 it was reported that 60% of Americans were categorized as overweight or obese. According to recent studies
it appears that even more Americans are now categorized as overweight or obese. A random sample of 125
Americans was taken and 83 of them were categorized as overweight or obese.
Research Question – Is there evidence that the obesity rate of Americans has increased since 2000?
14. Determine whether the conditions for the Binomial Distribution have been satisfied for the Obesity in
America example. Make sure to give your answers IN CONTEXT of the scenario.
15. Carry out a formal hypothesis test to answer the research question of interest.
Is there evidence that the obesity rate of Americans has increased since 2000?
Step 0:
H0:
Step 1:
Ha:
Observed number of successes = __________
Step 2:
What values are considered more extreme in this scenario? Bigger or Smaller
Therefore, the p-value = P(X _____ 83) = _____________
Step 3:
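An "x or more" probability can also be obtained from a cumulative probability through the complement rule, P(X ≥ x) = 1 − P(X ≤ x − 1). The Python sketch below illustrates this with the Obesity in America values (n = 125, null value 0.60, 83 observed). It assumes bigger counts are the more extreme ones, and the function name cumulative is hypothetical.

```python
from math import comb

def cumulative(x, n, p):
    """P(X <= x)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p_null, observed = 125, 0.60, 83

# Complement rule: "83 or more" is everything except "82 or fewer".
p_value = 1 - cumulative(observed - 1, n, p_null)

# Direct upper-tail sum, for comparison.
direct = sum(comb(n, k) * p_null**k * (1 - p_null)**(n - k)
             for k in range(observed, n + 1))

print(f"1 - P(X <= 82) = {p_value:.4f}")
print(f"direct sum     = {direct:.4f}")
```

Both routes should agree up to floating-point rounding.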
Example: Twins
In 2001 a national vital statistics report indicated that about 3% of all births produce twins. Data from a large city
hospital found that only 2 sets of twins were born to 200 teenage girls. Does this suggest that teenage mothers are
less likely to have twins?
Research Question – Are teenage girls less likely to have twins?
16. Identify the population of interest in this study.
17. Identify the sample in this study.
18. Identify the variable of interest in this study.
19. Determine whether the conditions for the Binomial Distribution have been satisfied for the Twins example.
Make sure to give your answers IN CONTEXT of the scenario.
20. Carry out a formal hypothesis test to answer the research question of interest.
Are teenage girls less likely to have twins?
Step 0:
H0:
Step 1:
Ha:
Observed number of successes = __________
Step 2:
What values are considered more extreme in this scenario? Bigger or Smaller
Therefore, the p-value = P(X _____ 2) = _____________
Step 3:
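When the observed count is small, the lower-tail p-value is a sum of only a few individual probabilities, so each term can be listed. The Python sketch below does this for the Twins example, assuming n = 200 births, a null value of 0.03, and that smaller counts are the more extreme ones. The variable names are illustrative.

```python
from math import comb

n, p_null = 200, 0.03

# Individual probabilities for 0, 1, and 2 sets of twins.
terms = {x: comb(n, x) * p_null**x * (1 - p_null)**(n - x) for x in range(3)}
for x, prob in terms.items():
    print(f"P(X = {x}) = {prob:.4f}")

# "2 or fewer" is the sum of those three terms.
p_value = sum(terms.values())
print(f"P(X <= 2) = {p_value:.4f}")
```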