Methods for a Single Categorical Variable - Binomial Distribution

We’ve been using simulation techniques to create a picture of what outcomes seem reasonable based on the “no preference,” “no knowledge,” or “no difference” scenarios. However, instead of simulating each scenario a finite number of times and ____________________ the p-value, we will focus on the situation in which we simulate the experiment an ____________________ number of times. This will provide us with the ___________________ probabilities of interest. It will also fix the “problem” of different people getting different answers to the research question from the same data (since each simulation is slightly different).

Example Revisited: Helper vs. Hinderer

The following graphic shows what the distribution of “no preference” would look like if we simulated the scenario an infinite number of times and plotted the number of infants who chose the Helper toy each time. This is known as the ____________________ distribution.

[Figure: distribution of the number of infants choosing the Helper toy under “no preference”]

Using the Binomial Distribution to Make Decisions

To answer research questions that involve a single categorical variable, statisticians do not always turn to simulations that take random samples over and over again. Instead, we use the Binomial distribution, which shows how often each possible outcome would occur if we repeated the previous simulations an infinite number of times.

Binomial Distribution – when can we use it?

The Binomial distribution can be used whenever the following conditions are satisfied:

1. There is a fixed number of trials, _____.
2. There are only _____ possible outcomes for each trial – a “success” and a “failure.”
3. The probability of ____________________ (p) remains the same for __________ trial.
4. The trials are ____________________.

Question:

1. Determine whether the conditions for the Binomial Distribution have been satisfied for the Helper vs. Hinderer example.
Make sure to give your answers IN CONTEXT of the scenario.

Fixed number of trials?
2 possible outcomes?
P(success) remains the same?
Trials are independent?

The formula for calculating probabilities using the binomial distribution is given below:

$$P(X = x) = \frac{n!}{x!\,(n - x)!}\, p^{x} (1 - p)^{n - x}$$

where p = P(success on each trial), n is the number of trials, and x is the number of successes.

However, we are not going to calculate the probabilities by hand; we will use JMP instead. There is a file on the course website called Binomial_Probabilities.jmp, which will be used to calculate binomial probabilities. The following two pieces of information are needed to calculate probabilities using the JMP file:

The probability of success: P(success) = p
p = Probability infants choose the helper toy (assuming no toy preference) = __________

The sample size (number of trials): n
n = number of infants in the study = __________

To change the number of trials in JMP, right-click on the n (number of trials) column and select Formula. Then change the value of n (number of trials) to 16 (since there were 16 infants in the study) and click OK.

Next, change the probability of success by right-clicking on the p (probability of success) column and selecting Formula. Change the value to p = 0.50 (since there is a 50% chance of choosing the helper toy, assuming infants really have no preference). Click OK and JMP should return the following output:

[JMP output table]

Questions:

2. What do the “Individual Binomial Probabilities” represent?
3. What is the probability of seeing exactly 8 infants choose the helper toy, assuming they have no preference?
4. What do the “Cumulative Binomial Probabilities” represent?
5. What is the probability of seeing 3 infants or fewer choose the helper toy, assuming they have no preference?
6. What do the “Prob of x or more” probabilities represent?
7. What is the probability of seeing 14 or more infants choose the helper toy, assuming they have no preference?
8. In Question 7, the p-value for this scenario was computed.
Using this p-value, make a conclusion in context regarding the research question of interest for this scenario.

Example Revisited: Bone Density

Recall that this example involves evaluating a new test to detect low bone density in postmenopausal women and determining whether it is better at early detection of the disease. A random sample of 248 postmenopausal women was given the new test and treated accordingly, and 82 were diagnosed with low bone density.

Research Question – Is there evidence that the new test reduces the percentage of postmenopausal women diagnosed with low bone density?

Question:

9. Determine whether the conditions for the Binomial Distribution have been satisfied for the Bone Density example. Make sure to give your answers IN CONTEXT of the scenario.

Fixed number of trials?
2 possible outcomes?
P(success) remains the same?
Trials are independent?

The following two pieces of information are needed to calculate probabilities using the JMP file:

The probability of success: P(success) = p
p = Current proportion of postmenopausal women with low bone density (no change) = __________

The sample size (number of trials): n
n = number of postmenopausal women in the study = __________

To change the number of trials in JMP, right-click on the n (number of trials) column, select Formula, change the value to 248, and click OK.

Next, change the probability of success by right-clicking on the p (probability of success) column and selecting Formula, and change the value to 0.40. Click Apply and then OK. JMP should then return the following output:

[JMP output table]

Questions:

10. Based on the Binomial probabilities, what is the probability of observing 82 or fewer of the 248 postmenopausal women diagnosed with low bone density if the new test really makes no difference? That is, find the p-value.
11. Based on the p-value from Question 10, provide the conclusion in context for the research question of interest.
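For readers without JMP, the three kinds of probabilities named in the questions above (individual, cumulative, and "x or more") can be sketched in Python directly from the binomial formula. This is only an illustrative sketch: the function names `binomial_pmf` and `binomial_table` are made up for this example and are not part of the course's JMP file, and the layout of the real JMP output may differ.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for n trials with success probability p on each trial."""
    # n! / (x! (n - x)!) * p^x * (1 - p)^(n - x)
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_table(n: int, p: float) -> list[tuple[int, float, float, float]]:
    """Rows of (x, P(X = x), P(X <= x), P(X >= x)) for x = 0, 1, ..., n."""
    pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
    rows = []
    cumulative = 0.0
    for x, individual in enumerate(pmf):
        cumulative += individual      # "cumulative" probability: x or fewer
        x_or_more = sum(pmf[x:])      # upper tail: x or more
        rows.append((x, individual, cumulative, x_or_more))
    return rows


# Helper vs. Hinderer setup from the text: n = 16 infants, p = 0.50.
for x, individual, cumulative, x_or_more in binomial_table(16, 0.50):
    print(f"{x:2d}  {individual:.4f}  {cumulative:.4f}  {x_or_more:.4f}")
```

Calling `binomial_table(248, 0.40)` in the same way gives the corresponding rows for the Bone Density setup; the row with x = 82 holds the lower-tail probability asked for in Question 10.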
Example Revisited: Effectiveness of an Experimental Drug

Suppose a commonly prescribed drug for relieving nervous tension is believed to be only 70% effective. A new drug was formulated and administered to a random sample of 20 adults who were suffering from nervous tension. Of those 20 adults, 18 experienced relief when taking the new drug.

Research Question – Is the new experimental drug more effective than the old drug? That is, is the experimental drug more than 70% effective?

Questions:

12. Determine whether the conditions for the Binomial Distribution have been satisfied for the Experimental Drug example. Make sure to give your answers IN CONTEXT of the scenario.
13. Carry out a formal hypothesis test to answer the research question of interest. Is the new experimental drug more effective than the old drug? That is, is the experimental drug more than 70% effective?

Step 0: H0:
Step 1: Ha:
Observed number of successes = __________
Step 2: What values are considered more extreme in this scenario? Bigger or Smaller
Therefore, the p-value = P(X _____ 18) = _____________
Step 3:

Example Revisited: Obesity in America

In 2000 it was reported that 60% of Americans were categorized as overweight or obese. According to recent studies, it appears that even more Americans are now categorized as overweight or obese. A random sample of 125 Americans was taken, and 83 of them were categorized as overweight or obese.

Research Question – Is there evidence that the obesity rate of Americans has increased since 2000?

14. Determine whether the conditions for the Binomial Distribution have been satisfied for the Obesity in America example. Make sure to give your answers IN CONTEXT of the scenario.
15. Carry out a formal hypothesis test to answer the research question of interest. Is there evidence that the obesity rate of Americans has increased since 2000?
Step 0: H0:
Step 1: Ha:
Observed number of successes = __________
Step 2: What values are considered more extreme in this scenario? Bigger or Smaller
Therefore, the p-value = P(X _____ 83) = _____________
Step 3:

Example: Twins

In 2001, a national vital statistics report indicated that about 3% of all births produce twins. Data from a large city hospital found that only 2 sets of twins were born to 200 teenage girls. Does this suggest that teenage mothers are less likely to have twins?

Research Question – Are teenage girls less likely to have twins?

16. Identify the population of interest in this study.
17. Identify the sample in this study.
18. Identify the variable of interest in this study.
19. Determine whether the conditions for the Binomial Distribution have been satisfied for the Twins example. Make sure to give your answers IN CONTEXT of the scenario.
20. Carry out a formal hypothesis test to answer the research question of interest. Are teenage girls less likely to have twins?

Step 0: H0:
Step 1: Ha:
Observed number of successes = __________
Step 2: What values are considered more extreme in this scenario? Bigger or Smaller
Therefore, the p-value = P(X _____ 2) = _____________
Step 3:
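The p-value step in the hypothesis tests above can be checked outside JMP with a short Python sketch. It assumes a one-sided test in which "more extreme" means either bigger or smaller than the observed count, as in Step 2. The function name `binomial_p_value` and its `direction` argument are hypothetical names chosen for this illustration, not something from the course materials.

```python
from math import comb


def binomial_p_value(observed: int, n: int, p0: float, direction: str) -> float:
    """One-sided binomial p-value, assuming the null value p0 is the true P(success).

    direction="smaller" returns P(X <= observed);
    direction="bigger" returns P(X >= observed).
    """
    if direction == "smaller":
        extreme_values = range(0, observed + 1)
    elif direction == "bigger":
        extreme_values = range(observed, n + 1)
    else:
        raise ValueError('direction must be "smaller" or "bigger"')
    # Add up the individual binomial probabilities of every outcome
    # at least as extreme as the one observed.
    return sum(
        comb(n, x) * p0**x * (1 - p0) ** (n - x) for x in extreme_values
    )


# Twins example: 2 sets of twins in 200 births, null value 3%,
# with smaller counts treated as more extreme.
print(binomial_p_value(2, 200, 0.03, "smaller"))
```

The same call pattern covers the other examples, for instance `binomial_p_value(18, 20, 0.70, "bigger")` for the experimental drug and `binomial_p_value(83, 125, 0.60, "bigger")` for the obesity data. The conclusion in Step 3 still has to be written in context by the reader.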