Instruction: Hypothesis Testing

Suppose the space shuttle carries sundry packets, each containing 25 seeds of a certain plant, into orbit. These packets are stored in the space station for three months and then taken back to earth by the shuttle. On earth, botanists open each packet and plant each set of 25 seeds in a fertile garden of its own. After a predetermined duration, the heights of the resulting plants are recorded. The heights in millimeters below correspond to one randomly chosen garden. The botanists send this data set to a statistician for study.

5.8 6.5 6.6 6.0 5.7
5.9 6.8 5.7 5.8 5.9
5.1 5.7 5.8 5.7 5.6
5.9 5.2 5.6 4.8 5.7
6.4 5.0 5.9 5.9 6.0

The mean of this sample is 5.8 millimeters. Botanists know that under widely variable conditions on earth, the standard deviation, σ, for the plant height is 1.0 mm. In a preliminary investigation, botanists assume σ = 1.0 for the plants from space-borne seeds. In addition, on earth the mean height, µ, is 5.0 mm. If the sample above is representative of plants grown from space-borne seeds, it seems plants from space-borne seeds grow taller on average than plants from normal seeds, but the statistician must ask, "Is this point estimate really significant?" In this lecture, we demonstrate a mathematical procedure that leads to an answer to this question.

Before continuing with the specific procedure, we recall some facts about normally distributed data and state some assumptions. First, recall the facts in the table below. If data is normally distributed, then approximately

68.26% of observations are within 1σ of the mean; therefore, 15.87% of observations fall more than 1σ above (or below) the mean.
95.44% of observations are within 2σ of the mean; therefore, 2.28% of observations fall more than 2σ above (or below) the mean.
99.74% of observations are within 3σ of the mean; therefore, 0.13% of observations fall more than 3σ above (or below) the mean.

Second, we make some assumptions for a population with mean µ, standard deviation σ, and samples of size n.

1. The $\bar{X}$-distribution has a mean equal to the population mean: $\mu_{\bar{X}} = \mu$.*
2. For the standard deviation of $\bar{X}$, we assume $\sigma_{\bar{X}} = \sigma / \sqrt{n}$.*
3. If n is "large enough," then $\bar{X}$ is approximately normally distributed.

* The symbol $\bar{X}$ here indicates a distribution, i.e., there are lots of $\bar{X}$'s. The mean of this distribution, $\mu_{\bar{X}}$, is the same as the mean of the population, that is, of the X-distribution.

Now we can make a simple application of the normal distribution. We have a particular value, 5.8, of the distribution $\bar{X}$. We assume that $\mu_{\bar{X}} = 5$. How likely is it that, although $\mu_{\bar{X}} = 5$, our particular sample of plants from space-borne seeds has a mean as large as 5.8?

Let the null hypothesis be $H_0: \mu_{\bar{X}} = 5$. (We call this the null hypothesis because it states that the difference between 5 and $\mu_{\bar{X}}$ is zero, i.e., the difference is null.) Let the alternative hypothesis be $H_A: \mu_{\bar{X}} > 5$.

Now we assign some level of significance, denoted α. It is good statistical practice to select the level of significance before any samples have been taken. The level of significance equals the probability that we will reject the null hypothesis when it is in fact true. For a one-sided alternative such as $H_A$ above, it corresponds to the area under the right tail of a standard normal curve. See graph.

[Figure: normal curve with the right-tail area α shaded; the horizontal axis is marked at µ and at the critical value µ + zσ.]

Using the level of significance, we reason as follows:

• If our particular sample mean falls within the tail region of area α, we reject $H_0$.
• If our particular sample mean falls in the area to the left of that region, we do not reject $H_0$.

For our problem, we let α = 0.0013. For convenience, we choose α so that it corresponds with the area to the right of $\mu + 3\sigma_{\bar{X}}$, which makes $\mu + 3\sigma_{\bar{X}}$ the critical value for our level of significance. This means we will reject the null hypothesis only in the case that the sample mean falls more than three standard deviations of the $\bar{X}$-distribution above the population mean. Naturally, we expect some sample means to be larger than the population mean.
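The correspondence between the tail areas quoted earlier and the choice α = 0.0013 can be checked numerically. The following is a minimal sketch, assuming only the Python standard library; the helper name `upper_tail_area` is an illustrative choice and does not come from the lecture. It uses the identity that the area to the right of z under the standard normal curve equals one half of the complementary error function evaluated at z divided by the square root of 2.

```python
import math


def upper_tail_area(z):
    """Area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Right-tail areas beyond 1, 2, and 3 standard deviations above the mean.
for z in (1, 2, 3):
    print(f"area to the right of mu + {z} sigma: {upper_tail_area(z):.4f}")

# Prints:
# area to the right of mu + 1 sigma: 0.1587
# area to the right of mu + 2 sigma: 0.0228
# area to the right of mu + 3 sigma: 0.0013
```

The last value is the level of significance used in this lecture, so a critical value three standard deviations above the mean and α = 0.0013 describe the same rejection region.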
By assigning α to be 0.0013, we are saying that the difference between the sample mean and the population mean is statistically significant only if the sample mean is so large that a value at least that large would occur with probability 0.13% if the sample came from the normal population.

Now, using our second assumption, we find the standard deviation of the $\bar{X}$-distribution, that is, the standard error of the sample mean.

$\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{1}{\sqrt{25}} = \dfrac{1}{5} = 0.2$

Since we assume $\bar{X}$ to be normally distributed, we can calculate the Z-score of our sample mean.

$Z_{5.8} = \dfrac{5.8 - 5}{0.2} = \dfrac{0.8}{0.2} = 4$

Therefore, we have a sample mean that is four standard deviations above the mean, an unlikely occurrence. See the graph.

[Figure: normal curve with the horizontal axis marked at µ, µ + σ, µ + 2σ, µ + 3σ, and µ + 4σ.]

Accordingly, we reject $H_0$. The data are not consistent with the claim that the plants from the space-borne seeds come from a population of plants where the mean height equals five millimeters. Instead, we suspect that the plants from space-borne seeds come from a population whose mean height is greater than five millimeters. Using our results, botanists might carry out further studies with seeds in space, looking for a possible way to produce larger plants.

The example above is called a one-tail test because the alternative hypothesis placed the emphasis on one tail of the standardized normal distribution. Sometimes the alternative hypothesis simply asserts that $\mu_{\bar{X}} \neq \mu$, in which case a two-tail test is necessary. The level of significance then corresponds to the combined areas under the curve at the two ends of the standardized normal distribution, which means that the null hypothesis is rejected only if the sample mean's Z-score falls above the upper critical value or below the lower critical value, each associated with a tail area of $\alpha / 2$. Of course, hypothesis testing can also be performed for other parameters, such as the population proportion.

Assignment 8

Problems

#1 From lots of practice, a golfer knows that his average drive is 250 yards with a standard deviation of twenty-five yards. Recently, the golfer switched to a new type of driver.
Using the new driver with one hundred shots, the golfer found that his average drive was 261 yards. Perform a hypothesis test to see if the new driver has changed the average drive of the golfer using a 1% level of significance.

#2 Roy's weekly commute to work has averaged 7.5 hrs with a standard deviation of σ = 0.5 hrs. Recently, however, Roy has changed his route to take advantage of a new toll road. After forty weeks commuting along the new route, Roy calculates a new mean weekly commute of 7.2 hrs. Test whether or not the new route has lessened the mean weekly commute for Roy using a 5% level of significance.

#3 An aerospace company manufactures large rocket engines. To meet NASA specifications, the rockets should use an average of 5,500 pounds of rocket fuel during the first fifteen seconds of operation. The company claims their rockets meet specifications. To test this claim, an inspector test-fired six randomly selected engines. The average fuel consumption over the first fifteen seconds of operation was 5,690 pounds of fuel with a standard deviation of 250 pounds. Test whether or not the aerospace company's claim is justified using a two-tailed hypothesis test with a 0.05 level of significance.

#4 A botanist has produced a new variety of hybrid wheat that is better able to withstand drought. The botanist knows that for the parent plants the probability of seed germination is 80%. The probability of seed germination for the hybrid variety is unknown, but the botanist claims that it is 80%. To test this claim, 400 seeds from the hybrid plant are planted and 312 germinate. Does the hybrid plant have the same probability of germination as the parent plant?

Bonus: A large company has noticed high absenteeism. According to its records, the company knows that its employees are absent on average 8.5 times per quarter with a standard deviation of 2.5 absences.
In order to test ways to reduce absenteeism, the company restructured the bonus schedule for 49 employees and noted that these employees were absent on average only 7.8 times per quarter. Test whether or not the new bonus schedule lowers absenteeism using a one-tailed hypothesis test with α = 0.05.
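
As a recap of the procedure demonstrated in the lecture, the one-tail test on the space-borne seed data can be sketched in Python. This is a minimal sketch under the lecture's assumptions (σ = 1.0 is treated as known, and the critical value is three standard deviations of the sampling distribution above the hypothesized mean). The function name `one_tail_z_test` and the variable names are illustrative choices, not part of the lecture.

```python
import math

# Heights (mm) of the 25 plants from the randomly chosen garden.
heights = [
    5.8, 6.5, 6.6, 6.0, 5.7,
    5.9, 6.8, 5.7, 5.8, 5.9,
    5.1, 5.7, 5.8, 5.7, 5.6,
    5.9, 5.2, 5.6, 4.8, 5.7,
    6.4, 5.0, 5.9, 5.9, 6.0,
]


def one_tail_z_test(sample, mu_0, sigma, z_critical):
    """Right-tailed z-test with known population standard deviation.

    Returns the sample mean, the standard error, the Z-score, and
    whether the null hypothesis (mean equals mu_0) is rejected.
    """
    n = len(sample)
    sample_mean = sum(sample) / n
    standard_error = sigma / math.sqrt(n)         # sigma / sqrt(n)
    z = (sample_mean - mu_0) / standard_error     # Z-score of the sample mean
    return sample_mean, standard_error, z, z > z_critical


# Assumed values from the lecture: mu = 5.0 mm, sigma = 1.0 mm, and a
# critical value of 3 (alpha of roughly 0.0013).
mean, se, z, reject = one_tail_z_test(heights, mu_0=5.0, sigma=1.0, z_critical=3.0)
print(f"sample mean = {mean:.2f}, standard error = {se:.2f}, Z = {z:.2f}")
print("reject H0" if reject else "do not reject H0")

# Prints:
# sample mean = 5.80, standard error = 0.20, Z = 4.00
# reject H0
```

The same function can be reused with other data by changing `mu_0`, `sigma`, and `z_critical`; a left-tailed or two-tailed test would need a different comparison in the last step.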