Methods for a Single Numeric Variable – Hypothesis Testing

We've already discussed how to summarize a single numeric variable both numerically and graphically. In this set of notes we will look at inferential procedures (hypothesis testing and confidence intervals) for a single numeric variable.

Example: Consider a study in which the weight of insulin-dependent diabetics is being investigated. The variable of interest is the percent of their ideal body weight. For example, a value of 120% implies that an individual weighs 20% more than their ideal body weight, and a value of 95% implies an individual weighs 5% less than their ideal body weight. The data below can be found in the file diabetics.jmp on the course website.

107 100 116 104 119 125 101 88 99
114 121 114 114 95 152 124 120 117

We can use JMP to summarize the data as follows:

Questions:
1. What is the mean percent ideal weight of the observed data? The standard deviation?
2. If another sample of n = 18 patients was obtained, would these new individuals have a mean exactly the same as the mean from this sample? Why or why not?
3. Given your answer to the previous question, do you think it is appropriate to use only this sample mean to make inferences about the true ideal body weights in the greater population of insulin-dependent diabetics? Explain.

Sampling Distribution of the Sample Mean

The sample mean is a random quantity. That is, it changes from _______________ to ______________. Therefore, the sample mean HAS a distribution. The distribution will tell us two things:
1. What values the sample mean can ________________.
2. How __________ the sample mean will assume these values.

The distribution is referred to as the sampling distribution of the sample mean. An understanding of this sampling distribution allows us to make decisions about a population mean for a single numeric variable. When we make decisions about a population mean, we will use both of the following:
1. The sample __________ (from the observed data).
2. The sampling ____________________ of the sample mean.

Exploring the Sampling Distribution of the Sample Mean

Before we discuss the procedure for hypothesis testing, let's consider the next activity to gain a better understanding of how this particular sampling distribution works.

Example: Simulation study – suppose we set up a hypothetical population of 500 insulin-dependent diabetics. This population was purposefully created so that the mean percent ideal body weight is 100%.

Questions:
4. Looking at the histogram, how would you describe the shape of the distribution of percent ideal body weight values?
5. In this simulation, what is the value of the true population mean, µ?

Note that in reality, the true population mean is usually an _______________ quantity which we are trying to estimate. If it were impossible or not feasible to collect data on the entire population, we would take a sample from the population in order to estimate the average percent ideal body weight.

Let's see what happens when we take various samples of size 5 from this population. The "population" can be found in the file diabetic_population.jmp on the course website. To take a random sample of size 5, we need to perform the following steps in JMP.
1. Choose Tables > Subset and the following dialogue box should appear.
2. You'll then want to choose the Random – sample size: option and enter 5 in the box. You may also want to name the output table Sample 1.
3. Click OK and a new data table should pop up with only 5 observations in it. This is your random sample of size 5 from the "population" of insulin-dependent diabetic patients.
4. Next, find the mean of your observations and record your sample mean below. Then repeat this procedure 2 more times so you have 3 sample means total.

Sample | 1 | 2 | 3
Sample Mean | | |

Next, repeat this procedure using a random sample of size 10.

Sample | 1 | 2 | 3
Sample Mean | | |

Questions:
6. Consider the means calculated from your random samples of size 5. We will use the entire class's data to construct a histogram of the sample means. Sketch the plot below. Based on the histogram, what can you say about the shape of the distribution?
7. How does the amount of variability in this sampling distribution compare to the amount of variability in the original population? Why does this happen?
8. Next, consider the means calculated from your random samples of size 10. Again, we will use the class data to construct a histogram of the sample means. What can you say about the shape of this distribution?
9. How does the amount of variability in this sampling distribution compare to the amount of dispersion in the sampling distribution for a random sample of size 5? Why does this happen?

Next, let's see what happens when we take many more samples than we did in class. The following output shows the results from one thousand random samples of size 5 and one thousand random samples of size 10.

Questions:
10. How does the shape of the sampling distribution for the mean change as sample size increases?
11. How does the amount of variability of the sampling distribution for the mean change as the sample size increases?
12. How does the center of the sampling distribution for the mean compare to the true population mean?
13. Suppose a sample of size 10 was taken for this study. Does a sample mean of x̄ = 112.78 seem likely to occur by chance if the true mean is really 100? What does this say about the research question?
14. What do you expect the sampling distribution of the sample mean to look like with a sample size of n = 18? Recall, this is the sample size used in the actual study.

Characteristics of the Sampling Distribution for the Sample Mean

To characterize the sampling distribution of the sample mean, we need to describe its center, shape, and amount of variability (or dispersion).
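The sampling activity above (many random samples of size 5 and size 10 from a population of 500) can also be mimicked outside JMP. The following is a minimal Python sketch, not part of the course materials. The population here is a hypothetical stand-in for diabetic_population.jmp: its right-skewed shape and its spread are assumptions, and only the population size (500) and mean (100) come from the notes.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical stand-in for diabetic_population.jmp (an assumption):
# 500 right-skewed values, shifted so the population mean is 100.
raw = [random.gammavariate(4.0, 5.0) for _ in range(500)]
shift = 100 - statistics.mean(raw)
population = [value + shift for value in raw]


def sample_means(pop, n, reps=1000):
    """Draw `reps` random samples of size n (without replacement)
    and return the list of their sample means."""
    return [statistics.mean(random.sample(pop, n)) for _ in range(reps)]


means_n5 = sample_means(population, 5)
means_n10 = sample_means(population, 10)

pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)
center_n5 = statistics.mean(means_n5)
center_n10 = statistics.mean(means_n10)
spread_n5 = statistics.stdev(means_n5)
spread_n10 = statistics.stdev(means_n10)

print(f"population: mean = {pop_mean:.2f}, SD = {pop_sd:.2f}")
print(f"n = 5 : mean of sample means = {center_n5:.2f}, SD = {spread_n5:.2f}")
print(f"n = 10: mean of sample means = {center_n10:.2f}, SD = {spread_n10:.2f}")
```

In a run of this sketch, both sets of sample means should be centered near 100, and the means from samples of size 10 should vary less than those from samples of size 5, which in turn vary less than the individual population values.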
The ____________ (or center) of the sampling distribution for the sample mean is the ________ as the mean from the original distribution.

Comment: We often use µ to denote the mean of the original distribution. Therefore, the mean of the sampling distribution for the sample mean is also µ.

The standard deviation (which measures the variability or dispersion) of the sampling distribution for the sample mean ______________ as the sample size gets larger. Specifically, if we let σ denote the standard deviation of the ____________________ distribution, then the standard deviation of the sample mean is given by:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

This quantity is referred to as the standard error of the sample mean.

Central Limit Theorem for the Sample Mean

Consider a random sample of _____ observations from _____ population with mean _____ and standard deviation _____. Then, when n is sufficiently large, the sampling distribution of the sample mean (x̄) will be an approximately normal distribution with a mean of µ and a standard deviation of $\sigma/\sqrt{n}$.

Note: This approximation gets better as the sample size (n) increases.

Question:
15. How large does n have to be?

If the original population is normally distributed, then the sampling distribution of the sample mean will also be normally distributed REGARDLESS of the sample size.

For most populations, a sample size of n ≥ 30 or 40 will be sufficiently large to say the sampling distribution of the sample mean is approximately normally distributed.

The more skewed the distribution, the larger the sample size must be before the normal approximation fits the sampling distribution for the sample mean well. If the distribution is very skewed, the sample size may have to be MUCH larger than 30!

Example: Recall the study where the insulin-dependent diabetics are being investigated. Some summaries from JMP are shown below.

Research Question – Do the data provide evidence that the mean percent ideal body weight for insulin-dependent diabetics differs from 100?
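As a quick check of the summary values used in this example, the sample mean, sample standard deviation, and standard error can be recomputed from the 18 observations listed at the start of these notes. The following is a minimal Python sketch (the course itself uses JMP, so the variable names here are illustrative). It assumes the sample standard deviation s is used in place of the unknown σ in the standard error formula.

```python
import math
import statistics

# Percent ideal body weight for the 18 patients listed in the notes.
weights = [107, 100, 116, 104, 119, 125, 101, 88, 99,
           114, 121, 114, 114, 95, 152, 124, 120, 117]

n = len(weights)
x_bar = statistics.mean(weights)   # sample mean
s = statistics.stdev(weights)      # sample SD (divides by n - 1)
std_error = s / math.sqrt(n)       # assumption: s stands in for sigma

print(f"n = {n}")                           # n = 18
print(f"sample mean = {x_bar:.2f}")         # sample mean = 112.78
print(f"sample SD = {s:.2f}")               # sample SD = 14.42
print(f"standard error = {std_error:.2f}")  # standard error = 3.40
```

The sample mean of 112.78 agrees with the value quoted in these notes.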
As mentioned earlier, we know the sample mean is a random variable. For our sample, x̄ = 112.78. This would probably not be the case if we took another sample. Our goal is to use what we just learned about the sampling distribution of the sample mean, in addition to our sample mean (x̄ = 112.78), to decide whether we have evidence that the true POPULATION mean percent of ideal body weight (µ) differs from 100.

Consider these characteristics of the sampling distribution for the sample mean for this example:

Center: The sampling distribution for the sample mean will be centered at a mean of µ = 100 (assuming the true mean percent ideal body weight is 100).

Shape: Based on the Central Limit Theorem, the sampling distribution for the sample mean will be approximately normal if:
o The sample size is sufficiently large OR
o The distribution of the original population is normally distributed

At this point we have established that the sampling distribution of the sample mean will approximately follow a normal, bell-shaped curve centered at µ = 100.

Variability: The standard error of the sampling distribution for the sample mean is $\sigma/\sqrt{n}$.

Our next step is to determine where our sample mean falls on this sampling distribution. We can then use this information to find the p-value for the test. To determine whether or not the distance between µ (the hypothesized mean) and x̄ (the mean from our observed data) is larger than what we would expect under repeated sampling, we can consider using the z-score for a sample mean:

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

The z-score comes from what is called the standard normal distribution.

Let's look at the formal hypothesis test.

Step 0: Define the research question.
Do the data provide evidence that the mean percent ideal body weight for insulin-dependent diabetics differs from 100?

Step 1: Determine the appropriate null and alternative hypotheses.
H0: The mean percent ideal body weight of insulin-dependent diabetics is 100.
Ha: The mean percent ideal body weight of insulin-dependent diabetics differs from 100.

Equivalently, we could state the hypotheses as follows:
H0: µ = 100
Ha: µ ≠ 100

Step 2: Check the assumptions behind the test and calculate the test statistic.
For this hypothesis test, we must check that one of the two assumptions has been met:
o The sample is sufficiently large OR
o It is reasonable to assume the distribution of the population is normally distributed

Since n = 18, which is NOT sufficiently large, we can graphically determine whether the distribution of the population is normally distributed.

Histogram: A histogram of the data with a smooth curve (which represents the underlying distribution of the population from which the data came) and the normal distribution can be used to assess normality. If the red and green lines are roughly the same, then it can be concluded that the population the data came from is normally distributed.

Normal Quantile Plot: This is another plot which can be used to assess normality. In this plot you're looking for the points to lie on or very close to the y = x line. To get this plot, click on the little red arrow next to the variable name and choose Normal Quantile Plot.

Question:
16. Looking at the histogram and the quantile plot, is it reasonable to assume the population the data came from is normally distributed?

We discussed using the z-score to determine if our observed sample mean is unusual to observe by chance alone. However, do you see a problem with this formula, $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$? What is σ?

In practice, we have to use the sample standard deviation, s, which is our best guess for the population standard deviation. That is, we'll have to use $\frac{\bar{x} - \mu}{s/\sqrt{n}}$, which no longer follows the standard normal distribution.

Once we have to use s to estimate σ, the new statistic comes from the t-distribution. This distribution is also _________________________, __________________, and centered at _____ (just like the standard normal distribution).
The difference is that the t-distribution is more variable than the standard normal distribution. The amount of variability in the t-distribution depends on the sample size n. Therefore, this distribution is indexed by its degrees of freedom (df). For inference regarding a single mean, df = n – 1.

Consider the following t-distributions.

Questions:
17. Calculate the test statistic for the diabetic example.

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

18. What does the numerator of the test statistic tell us?

Step 3: Find the p-value.
As we've already seen, the p-value is the probability (assuming H0 is true) of observing results at least as extreme as those in the observed data.

Lower-Tailed Test (Ha contains <)
Upper-Tailed Test (Ha contains >)
Two-Tailed Test (Ha contains ≠)

We can use JMP to find the p-value for us. Click on the red drop-down arrow and choose Test Mean. You should then get a dialogue box that looks like the one given below. You'll want to enter the hypothesized value (in this case 100) in the top box as shown below. Click OK and you should get the following JMP output.

p-value = ________________

Step 4: Report the conclusion in context of the research question.

Example: A physician has noticed that a large number of adults tend to have a body temperature less than 98.6°F. Therefore, the physician decides to conduct a study to examine the true average body temperature in adults. A random sample of 75 patients was taken and each had their body temperature taken. The data can be found in the file Temp.jmp on the course website.

Research Question – Is there evidence that the average body temperature of adults is less than the conjectured 98.6°F?

Step 0: Define the research question.
Is there evidence that the average body temperature of adults is less than the conjectured 98.6°F?

Step 1: Determine the appropriate null and alternative hypotheses.

Step 2: Check the assumptions behind the test and calculate the test statistic.

Step 3: Find the p-value.
Step 4: Report the conclusion in context of the research question.

Example: The State Environmental Protection Agency (SEPA) is responsible for monitoring the air pollution level for a large western metropolis. The air pollution level is considered to be acceptable (or safe) if the mean pollution level is at or below a reading of 100 mg of pollution per cubic yard of air. Air pollution levels substantially above 100 mg/yd³ are considered to be dangerous. To monitor air pollution levels, the SEPA will take a pollution reading 10 times a day. If the evidence from this sample suggests that the air pollution levels are unacceptable, then the SEPA must declare an air pollution emergency and impose emergency measures to reduce pollution levels in the air. Suppose the readings for one day are given in the following table (and can also be found on the course website in the file pollution.jmp).

Pollution Level (mg/yd³): 98.6 100.2 101.1 109.4 99.4 110.5 95.6 108.9 112.9 110.5

Consider the following summary statistics and graphics for this example:

Research Question – Is there evidence that the pollution levels are unacceptable?

Step 0: Define the research question.
Is there evidence that the pollution levels are unacceptable?

Step 1: Determine the appropriate null and alternative hypotheses.

Step 2: Check the assumptions behind the test and calculate the test statistic.
o n = 10 < 30
o Data is not normally distributed

It appears neither assumption has been satisfied in this example. Therefore, a t-test is not appropriate to carry out the analysis. If the data is not normally distributed but is symmetric, we can conduct what is called the Wilcoxon Signed Rank Test. To carry out this test, click on the red drop-down arrow and choose Test Mean as done before. However, this time check the box for the Wilcoxon Signed Rank Test in the dialogue box as shown below. Click OK and you'll get the following output from JMP.

Step 3: Find the p-value.
Step 4: Report the conclusion in context of the research question.

Types of Errors Encountered in Hypothesis Testing

After examining evidence from a sample, we will make one of two decisions when carrying out a hypothesis test:

Evidence for RQ (Reject H0): This indicates that we have enough evidence to conclude the _____________________ (research question) is true.

No Evidence for RQ (Do Not Reject H0): This indicates we do not have enough evidence to ____________ the null hypothesis (i.e. no evidence for the RQ).

The two possible outcomes:

Ideally, these outcomes would occur:
Reject H0
Do not Reject H0

However, we know that hypothesis tests are not error proof! The following table summarizes all possible scenarios when carrying out a hypothesis test (the probability of each occurring is listed in parentheses).

 | Null is true | Alternative is true
Reject H0 | Type I Error (α) | No Error (1 – β)
Do not Reject H0 | No Error (1 – α) | Type II Error (β)

You'll see from the above table that two types of errors exist:

Type I: This error occurs when we falsely reject the null hypothesis. That is, we _______ the null hypothesis when it is true. The probability of committing this error is α.
Note: We can control the Type I Error rate by our selection of α prior to conducting the experiment.

Type II: This error occurs when we fail to reject the null hypothesis when a particular alternative scenario is true. The probability of committing this error is β.

There is a relationship between α, β, and n (sample size):
o We have __________________ control over _____.
o A decrease in _____ results in an increase in _____.
o An increase in the sample size will decrease both _____ and _____.

Example: The MedAssist Pharmaceutical Company makes a pill intended for children susceptible to seizures. The pill is supposed to contain 20 mg of Phenobarbital. A random sample of pills is selected and tested to see that the average amount of the drug is correct.

H0:
Ha:

Questions:
19. Describe a Type I Error in context.
20. Give one consequence/implication of making a Type I Error.
21. Describe a Type II Error in context.
22. Give one consequence/implication of making a Type II Error.

Example: The State Environmental Protection Agency (SEPA) is responsible for monitoring the air pollution level for a large western metropolis. The air pollution level is considered to be acceptable (or safe) if the mean pollution level is at or below a reading of 100 mg of pollution per cubic yard of air. Air pollution levels substantially above 100 mg/yd³ are considered to be dangerous. To monitor air pollution levels, the SEPA will take a pollution reading 10 times a day. If the evidence from this sample suggests that the air pollution levels are unacceptable, then the SEPA must declare an air pollution emergency and impose emergency measures to reduce pollution levels in the air.

H0:
Ha:

Questions:
23. Describe a Type I Error in context.
24. Give one consequence/implication of making a Type I Error.
25. Describe a Type II Error in context.
26. Give one consequence/implication of making a Type II Error.

Confidence Interval for a Single Population Mean

In the hypothesis testing set of notes we found evidence that the mean percent ideal body weight of insulin-dependent diabetics differs from 100. Our next question is obvious: How much does it differ? To answer this question, we must first construct a confidence interval.

Confidence Intervals

This procedure does not require any hypotheses concerning our population parameter of interest, in this case µ. We will use both sample data, in particular the observed _______________, and the appropriate sampling distribution to obtain a range of likely values for the population mean.

A confidence interval allows us to estimate the population parameter of interest (recall a hypothesis test will NOT allow us to do this). Therefore, when available, a confidence interval should accompany a hypothesis test.
Because the confidence interval does not require any hypothesized value for the population parameter, we can't center the sampling distribution about the "true" or hypothesized population mean. However, the confidence interval will still incorporate both the data collected in our sample and what we know about sample-to-sample variation. Consider the following example.

Example: Our goal is to construct a 95% confidence interval for the mean percent ideal body weight of insulin-dependent diabetics. To do this, we will center our sampling distribution at the observed mean. Then, we will find the lower and upper endpoints that separate the middle 95% of the distribution from the rest (since we are constructing a 95% confidence interval).

JMP automatically provides the endpoints of the 95% confidence interval:

Questions:
27. Interpret this interval. What does this interval tell us about the true percent ideal body weight of insulin-dependent diabetics?
28. Does this interval agree with what you learned from the hypothesis test? Explain.
29. What additional information is gained by using a confidence interval over a simple hypothesis test? Explain.

Margin of Error

The margin of error is defined as the difference between the center of the confidence interval and either endpoint. For this problem we have:

Upper Endpoint – Center of Interval = 119.951 – 112.778 = 7.173
Center of Interval – Lower Endpoint = 112.778 – 105.605 = 7.173

So, the margin of error for this problem is ± 7.173.

Question: Can you identify at least two ways to make this margin of error smaller?
1. Confidence level
2. Sample size

Changing the Confidence Level in JMP

Recall the 95% confidence interval for the mean:

To change the level of confidence, click on the red drop-down arrow and choose Confidence Interval and choose 0.90. You should get the following output:

Question:
30. Did the margin of error change as you thought it would?
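The margin-of-error arithmetic above can be reproduced from the raw data. The following is a minimal Python sketch, assuming the usual t-based interval x̄ ± t*·s/√n with df = n – 1 = 17. The critical values t* are hard-coded from a standard t-table rather than computed, and the function name is hypothetical.

```python
import math
import statistics

# Percent ideal body weight for the 18 patients listed in the notes.
weights = [107, 100, 116, 104, 119, 125, 101, 88, 99,
           114, 121, 114, 114, 95, 152, 124, 120, 117]

# Two-sided t critical values for df = 17, copied from a t-table
# (hard-coded assumption; this sketch does not compute them).
T_CRIT_DF17 = {0.95: 2.1098, 0.90: 1.7396}


def t_interval(data, level):
    """Return (lower, upper, margin) for a t-based confidence interval.
    Only valid for samples of size 18, given the hard-coded table."""
    n = len(data)
    center = statistics.mean(data)
    std_error = statistics.stdev(data) / math.sqrt(n)
    margin = T_CRIT_DF17[level] * std_error
    return center - margin, center + margin, margin


lower95, upper95, margin95 = t_interval(weights, 0.95)
lower90, upper90, margin90 = t_interval(weights, 0.90)

print(f"95% CI: {lower95:.3f} to {upper95:.3f} (margin {margin95:.3f})")
print(f"90% CI: {lower90:.3f} to {upper90:.3f} (margin {margin90:.3f})")
```

Under these assumptions the 95% interval should agree, up to rounding of the table value, with the endpoints 105.605 and 119.951 reported above, and the 90% interval comes out narrower.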
More on the Interpretation of Confidence Intervals

Consider the 95% confidence interval from the diabetic example.

Correct interpretation: We're 95% confident the true mean percent ideal body weight of insulin-dependent diabetics is between 105.6% and 119.95%.

Incorrect interpretation: The probability that the true mean percent of ideal body weight of insulin-dependent diabetics is between 105.6% and 119.95% is 95%.

The 95% refers to the process of constructing the confidence interval. This means that if we were to take 100 samples of size 18, constructing a confidence interval each time, we would expect about 95 of them to capture the true population mean. Consider the following example:

Example: Our goal is to take samples from a population in order to estimate the true population mean. Shown below are 10 random samples of size n = 5. Construct a confidence interval for each of the samples.

Sample ID | Data from Sample | Mean | Std Dev | 90% Confidence Interval
1 | 12.49983 11.4342 8.210933 7.373925 8.776002 | 9.65898 | 2.19771 | 7.56 ≤ µ ≤ 11.75
2 | 5.655407 8.903349 12.98215 10.22548 6.172528 | 8.78778 | 3.01349 | 5.91 ≤ µ ≤ 11.66
3 | 8.181802 12.08606 6.176875 5.556382 5.822172 | 7.56466 | 2.73035 | 4.96 ≤ µ ≤ 10.17
4 | 13.19405 5.122735 2.469639 7.373925 6.401793 | 6.91243 | 3.96465 | 3.13 ≤ µ ≤ 10.69
5 | 9.293009 10.52984 7.260893 10.50763 7.431728 | 9.00462 | 1.59555 | 7.48 ≤ µ ≤ 10.53
6 | 9.303573 2.354969 8.811873 17.06401 10.45554 | 9.59799 | 5.23552 | 4.6 ≤ µ ≤ 14.6
7 | 10.91127 8.023941 8.432168 14.17466 8.603912 | 10.02919 | 2.57711 | 7.58 ≤ µ ≤ 12.48
8 | 11.53353 5.782364 11.44628 10.61424 -1.68752 | 7.53778 | 5.67659 | 2.13 ≤ µ ≤ 12.95
9 | 8.197059 6.193274 9.114461 6.290799 9.661013 | 7.89132 | 1.59424 | 6.37 ≤ µ ≤ 9.41
10 | 6.53196 12.08221 6.81856 13.46314 9.183324 | 9.61584 | 3.09866 | 6.67 ≤ µ ≤ 12.57

A graphical representation of the confidence intervals is given below.

Questions:
31. Why are some of the 90% confidence intervals wider than others?
32. In truth, these 10 random samples were generated from a population with a mean of 10. How many of the confidence intervals contain the true mean? What does it mean to say that we are 90% confident?

Example: Recall the Student Data Survey completed at the beginning of the semester. Using JMP, construct a 99% confidence interval for the true average number of hours spent on Facebook in a day. The data can be found in the file facebook.jmp on the course website.

Interval:

Interpretation:
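The idea behind Question 32, that "90% confident" describes the long-run behavior of the interval-building process, can be illustrated with a small simulation. The following is a minimal Python sketch under stated assumptions: the population is taken to be normal with mean 10 (the mean given in Question 32) and a standard deviation of 3, which is a guess since the notes do not state it. The t critical value for df = 4 is hard-coded from a t-table, and the function names are hypothetical.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

TRUE_MEAN = 10.0     # population mean stated in Question 32
ASSUMED_SD = 3.0     # assumption: the notes do not give the population SD
N = 5                # sample size used in the table of 10 samples
T_CRIT_DF4 = 2.1318  # t-table value for a 90% interval with df = 4


def interval_90(sample):
    """Return the 90% t-based confidence interval (lower, upper)
    for a sample of size 5."""
    center = statistics.mean(sample)
    margin = T_CRIT_DF4 * statistics.stdev(sample) / math.sqrt(len(sample))
    return center - margin, center + margin


def coverage(reps=2000):
    """Fraction of simulated 90% intervals that contain TRUE_MEAN."""
    hits = 0
    for _ in range(reps):
        sample = [random.gauss(TRUE_MEAN, ASSUMED_SD) for _ in range(N)]
        lower, upper = interval_90(sample)
        if lower <= TRUE_MEAN <= upper:
            hits += 1
    return hits / reps


# Sample 1 from the table, as a check on the interval formula.
sample_1 = [12.49983, 11.4342, 8.210933, 7.373925, 8.776002]
lo_1, hi_1 = interval_90(sample_1)
print(f"Sample 1: {lo_1:.2f} <= mu <= {hi_1:.2f}")  # 7.56 <= mu <= 11.75

cover = coverage()
print(f"Share of intervals capturing the true mean: {cover:.3f}")
```

Any single interval either contains the true mean or it does not. Over many repetitions, the share of intervals that capture it should settle near 0.90.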