AP Statistics
Section 11.1A: Basics of Significance Tests

Is the card RED or BLACK?

Each student in the class will be asked whether the next card in a well-shuffled deck of cards is red or black. One point of extra credit (EC) will be given to each student who chooses correctly.

Before we begin, what proportion of the cards in the deck do you expect to be black? ___.5___

All of you will be expected to keep track of how many cards are red and how many are black.

RED:
BLACK:

What was our sample proportion of black cards? _______

Has your opinion on what proportion of the cards in the deck are black changed? Let’s use StatKey to explore this situation further.

Confidence intervals are one of the two most common types of statistical inference. Use a confidence interval when your goal is to ___estimate a population parameter___. The second type of inference, called significance tests, has a different goal: to assess the evidence provided by data about some claim concerning a population.

Example 1: I claim that I make 80% of my free throws. To test my claim, you ask me to shoot 20 free throws. I make only 8 out of 20. Assume p = .8, and find the probability of making exactly 8 of the 20 free throws. Also, find the probability of making 8 or fewer free throws.

binompdf(20, .8, 8) = .000087
binomcdf(20, .8, 8) = .0001

“Aha!” you say. “Someone who makes 80% of his free throws would almost never make only 8 out of 20. So I don’t believe your claim.”

Your reasoning is based on asking what would happen if my claim were true and we repeated the sample of 20 free throws many times. I would almost never make as few as 8. This outcome is so unlikely that it gives strong evidence that my claim is not true.

Significance tests use elaborate vocabulary, but the basic idea is simple: getting an outcome that would rarely happen if a claim were true is strong evidence that the claim is not true.
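The two free-throw probabilities in Example 1 can also be checked away from the calculator. The following is a minimal sketch in Python using only the standard library; the helper names `binom_pmf` and `binom_cdf` are illustrative choices, not part of the handout, and the shots are assumed to be independent with p = .8 on every attempt.

```python
from math import comb


def binom_pmf(n, p, k):
    """P(X = k) when X counts successes in n independent trials with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(n, p, k):
    """P(X <= k): add up the chances of 0, 1, ..., k successes."""
    return sum(binom_pmf(n, p, i) for i in range(k + 1))


# Example 1: 20 free throws, claimed success rate of 80%, only 8 made.
exactly_8 = binom_pmf(20, 0.8, 8)
at_most_8 = binom_cdf(20, 0.8, 8)

print(f"P(X = 8)  = {exactly_8:.6f}")
print(f"P(X <= 8) = {at_most_8:.6f}")
```

Rounded, these agree with the calculator values of .000087 and .0001. Both are tiny, which is the whole point: a true 80% shooter would almost never do this badly.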
A significance test is a formal procedure for comparing observed data with a hypothesis whose truth we want to assess. The hypothesis is a statement about a population parameter, such as the population mean ___μ___ or population proportion ___p___. The results of a test are expressed in terms of a probability that measures ___how well the data and the hypothesis agree___.

The reasoning behind statistical tests, like that of confidence intervals, is based on asking what would happen if we repeated the sampling or experiment many times. We will begin with the unrealistic assumption that we know σ, the population standard deviation.

Example 2: Vehicle accidents can result in serious injuries to drivers and passengers. In the case of life-threatening injuries, victims generally need medical attention within 8 minutes of the crash. Several cities have begun to monitor paramedic response times. In one such city, the mean response time (RT) to all such accidents involving life-threatening injuries last year was μ = 6.7 minutes, with σ = 2 minutes. The city manager shares this information with emergency personnel and encourages them to “do better” next year. At the end of the following year, the city manager selects a simple random sample of 400 calls involving life-threatening injuries and examines the response times. For this sample, the mean response time was x̄ = 6.48 minutes. Do these data provide good evidence that response times have decreased since last year?

Remember, sample results may vary! Maybe the mean RT for the SRS is simply a result of ___sampling variability___.

We want to use the same reasoning here as we did in the previous example. We make a claim and ask if the data give ___evidence___ against it. We would like to conclude that the mean RT ___decreased___, so the claim we test is that RTs ___have not decreased___.

If we assume the RTs for calls involving life-threatening injury have not decreased, the mean RT for the population of all such calls would still be ___μ = 6.7___ (assume ___σ = 2___ too).

Consider the sampling distribution of x̄ from 400 calls:

Shape: approximately Normal (CLT)
Mean: μ_x̄ = μ = 6.7
Standard deviation: σ_x̄ = σ/√n = 2/√400 = 0.1, as long as N ≥ 10n

Find the probability of x̄ ≤ 6.48 minutes.

normalcdf(-1000, 6.48, 6.7, .1) = .014

An observed value this small would rarely occur by chance if the true μ were 6.7 minutes. This observed value is good evidence that the true μ is, in fact, less than 6.7 minutes. Thus we can conclude the average response time decreased this year.

In Example 2, we asked whether the accident RT data are likely if, in fact, there is no decrease in paramedics’ RTs. Because the reasoning of significance tests looks for evidence against a claim, we start with the claim we seek evidence against, such as “no decrease in response time.” This claim is our ___null hypothesis___ (___H0___). This is the statement being tested in a significance test. The significance test is designed to assess the strength of the evidence against the null hypothesis. Usually the null hypothesis is a statement of “no change” or “no difference” from historical values. The null hypothesis can be thought of as the “status quo” hypothesis.

The claim about the population that we are trying to find evidence for is the alternative hypothesis (___Ha___).

In Example 2, the null hypothesis says “no decrease in the mean RT of 6.7 min.”:

H0: ___μ = 6.7___

while the alternative hypothesis says “there is a decrease in the mean RT of 6.7 min.”:

Ha: ___μ < 6.7___

where μ is the mean response time to all calls involving life-threatening injuries in the city this year. In this instance the alternative hypothesis is one-sided because we are interested only in deviations from the null hypothesis in one direction.

Hypotheses always refer to some population, not to a particular outcome.
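The Example 2 calculation (standard deviation of the sample mean, then the probability of a sample mean of 6.48 minutes or less when the true mean is still 6.7) can be traced step by step. This is a rough sketch in Python using `statistics.NormalDist` from the standard library (Python 3.8 or later); the variable names are illustrative, and it relies on the same assumptions as the notes: σ is known and the sampling distribution of the sample mean is approximately Normal.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 6.7    # mean RT under the "no decrease" claim (minutes)
sigma = 2.0   # population standard deviation, assumed known (minutes)
n = 400       # number of calls in the simple random sample
x_bar = 6.48  # observed sample mean RT (minutes)

# Standard deviation of the sampling distribution of the sample mean.
std_error = sigma / sqrt(n)

# How many standard deviations the observed mean lies below 6.7.
z = (x_bar - mu_0) / std_error

# Probability of a sample mean this small or smaller if the true mean is 6.7.
prob = NormalDist(mu=mu_0, sigma=std_error).cdf(x_bar)

print(f"standard deviation of x-bar = {std_error:.2f}")
print(f"z = {z:.2f}")
print(f"P(x-bar <= {x_bar}) = {prob:.4f}")
```

The probability comes out near .014, in line with the normalcdf result: the observed mean sits about 2.2 standard deviations below 6.7, so it would be unusual if response times had not decreased.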
Be sure to state H0 and Ha in terms of a population parameter.

Example 3: Does the job satisfaction of assembly workers differ when their work is machine-paced rather than self-paced? One study chose 18 subjects at random from a group of people who assembled electronic devices. Half of the subjects were assigned at random to each of two groups. Both groups did similar assembly work, but one work setup allowed workers to pace themselves, and the other featured an assembly line that moved at fixed time intervals so that the workers were paced by the machine. After two weeks, all subjects took the Job Diagnosis Survey (JDS), a test of job satisfaction. Then they switched work setups and took the JDS again after two more weeks.

This experiment uses a ___matched-pairs___ design. The response variable is the ___difference in JDS scores___, self-paced minus machine-paced.

The parameter of interest is the mean μ of the differences in JDS scores in the population of all assembly workers. The null hypothesis says that there ___is no___ difference in the scores:

H0: μ = 0

The authors of the study simply wanted to know if the two work conditions have different levels of job satisfaction. They did not specify the direction of difference. The alternative hypothesis is therefore two-sided; that is, either ___μ < 0___ or ___μ > 0___. For simplicity, we write this as

Ha: ___μ ≠ 0___

The alternative hypothesis should express the hopes or suspicions we have before we see the data. It is cheating to first look at the data and then frame the alternative hypothesis to fit what the data show. If you do not have a specific direction firmly in mind in advance, use a two-sided alternative.