Outline and Review of Chapter 9 (and 11)

Introduction to the t test

Hypothesis testing with the t statistic is based on hypothesis testing with the z statistic. When using z, you use the population parameters, the mean (μ) and the standard deviation (σ), to determine the probability of obtaining a particular score (X) or sample mean (M). There will be some times, however, when you do not know the population parameters, and other times when you want to compare your sample mean to something other than the population mean. In these cases, you will need to use a t-test. The t-test is the most common statistic for comparing differences between two means. Like z, t is an inferential statistic, or a statistic that is used to make an inference or a decision. In our case, that decision is usually whether or not to reject the null hypothesis.

Basic t-test Theory

We previously discussed the differences between the equations for the population standard deviation (σ) and the sample standard deviation (s):

Population: σ = √(SS / N)
Sample: s = √(SS / (n − 1)) = √(SS / df)

The equations differ because the sample standard deviation does not actually describe the sample (confused?). The sample standard deviation is a formula designed to estimate the population standard deviation. It is a guess, but it's better than nothing, so if you don't know the population standard deviation, the t-test allows you to conduct a hypothesis test using the sample standard deviation in place of the population standard deviation. Look at the similarities between the equation for z and the equation for t:

z = (M − μ) / σM        t = (M − μ) / sM

The standard error estimated from the sample (sM) is calculated from the sample standard deviation in the same way the standard error of the population (σM) is calculated from the population standard deviation:

σM = σ / √n        sM = s / √n

The sample standard deviation is an estimate of the population standard deviation, and the accuracy of this estimate improves as sample size increases.
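The estimates above are easy to verify numerically. Here is a small stdlib-only Python sketch (the five scores are made up purely for illustration) showing that s = √(SS / (n − 1)) matches the library's sample standard deviation and that the estimated standard error is s / √n:

```python
import math
import statistics

# A hypothetical sample of n = 5 scores (for illustration only).
sample = [4.0, 7.0, 5.0, 6.0, 8.0]
n = len(sample)

mean = statistics.fmean(sample)              # M
ss = sum((x - mean) ** 2 for x in sample)    # SS, the sum of squared deviations

# Sample standard deviation: s = sqrt(SS / (n - 1)) = sqrt(SS / df)
s = math.sqrt(ss / (n - 1))
assert math.isclose(s, statistics.stdev(sample))  # same n - 1 formula

# Estimated standard error of the mean: sM = s / sqrt(n)
s_m = s / math.sqrt(n)
print(mean, s, s_m)
```

Note that `statistics.stdev` uses the n − 1 denominator, while `statistics.pstdev` uses N; that is exactly the population-versus-sample distinction drawn above.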
As a result, for the t-test, the critical value that you need to reach in order to reject the null hypothesis will be quite large if your sample size is small, but it will decrease as sample size increases. This means that, unlike the z-test, the t-test has different critical regions for different sample sizes (see the table at the end of this chapter). Although the critical value of z for a two-tailed test with α = 0.05 will always be ±1.96, the critical value of t is ±4.303 if n = 3 (df = 2) and ±2.045 if n = 30 (df = 29). The critical values for t get closer and closer to the critical values for z as sample size increases. This is because, as sample size increases, s becomes a better estimate of σ. In summary, t is used just like z with two important exceptions: (1) for the t-test, the sample standard deviation is used in place of the population standard deviation, and (2) the critical values for t depend on the sample size; small samples have larger critical values than large samples.

One-Sample t-tests

You can use a one-sample t-test to compare a sample mean to any single value, whether it is a real population mean or any other hypothetical value that interests you. For example, imagine that your bank informs you that they will charge a fee each time your average yearly checking account balance drops below $100. Although you have always made sure your account was greater than zero, you never really paid much attention to whether or not your account balance was greater than $100. You know what your monthly balances have been, and you know that the average of last year's twelve monthly balances was greater than $100 (M = 106.25), but it would be nice to know whether this number is significantly greater than $100 so you don't slip below this average next year. You don't expect your financial situation to change in the upcoming year, so you decide to test last year's twelve monthly balances to see if the mean balance for those months is significantly greater than $100.
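To see this convergence concretely, here is a short Python sketch using a few two-tailed α = 0.05 critical values copied from the t table at the end of this chapter; the values shrink toward z's ±1.96 as df grows:

```python
# Two-tailed critical values of t at alpha = 0.05, copied from the
# t table at the end of this chapter, keyed by df = n - 1.
T_CRIT_05_TWO_TAILED = {2: 4.303, 10: 2.228, 30: 2.042, 120: 1.980}
Z_CRIT_05_TWO_TAILED = 1.960

# As df grows, t's critical value shrinks toward z's, because s
# becomes a better estimate of sigma.
values = [T_CRIT_05_TWO_TAILED[df] for df in sorted(T_CRIT_05_TWO_TAILED)]
print(values)

assert values == sorted(values, reverse=True)          # strictly decreasing
assert all(v > Z_CRIT_05_TWO_TAILED for v in values)   # always above 1.96
```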
This will give you a good idea of your odds of slipping below an average balance of $100 next year. Note that $100 does not represent a true population mean, but it is an important value to which you would like to compare your average monthly balance. For simplicity, we will still refer to this as a hypothetical population mean and represent it with μ. Here are your twelve monthly balances from last year, along with the appropriate statistics:

Month       Balance ($)
January     107
February    93
March       105
April       97
May         100
June        113
July        105
August      106
September   112
October     119
November    99
December    119

M = 106.25, s = 8.31, sM = 2.40, n = 12

Your hypothesis test is based on the question, "Was last year's average monthly balance significantly greater than $100?"

H0: M ≤ μ    or    H0: 106.25 ≤ 100
H1: M > μ    or    H1: 106.25 > 100

Note that we are conducting a one-tailed test. If we set α = 0.05, the critical value for t with n − 1 degrees of freedom (df = 11) will be +1.796. Make note of this: t(11)crit = 1.796.

Figure 9.1. The t distribution showing the critical region (or zone of rejection) in red for values greater than +1.796.

We have everything we need to calculate t using the formula:

t = (M − μ) / sM = (106 − 100) / 2.40 = 2.5

Since +2.5 is well within the zone of rejection, you conclude that last year's average monthly balance was significantly greater than $100 and, provided nothing dramatic happens to your financial situation, you will probably remain safely above $100 next year. If you wanted to report these results formally, it would look something like this:

Last year's mean monthly balance (M = 106.25) was greater than the hypothesized mean monthly balance (μ = 100), and this difference was statistically significant (t(11) = 2.5, p < 0.05, one-tailed).

Notice that the degrees of freedom (df) are reported in parentheses after t and that you clearly state that your probability of making a Type I error (p) is less than your declared value of α (0.05), which means that you should reject the null hypothesis.
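As a check on the arithmetic, here is a stdlib-only Python sketch of this one-sample test (a library routine such as scipy.stats.ttest_1samp with alternative='greater' would test the same hypothesis). One caveat: the chapter rounds M to 106 before dividing, which gives t = 2.5; without that rounding, t comes out near 2.60. The decision is the same either way, since both exceed the critical value of 1.796.

```python
import math
import statistics

# Last year's twelve monthly balances, from the table above.
balances = [107, 93, 105, 97, 100, 113, 105, 106, 112, 119, 99, 119]
mu = 100                                  # hypothetical population mean
n = len(balances)

m = statistics.fmean(balances)            # M = 106.25
s = statistics.stdev(balances)            # s, with n - 1 in the denominator
s_m = s / math.sqrt(n)                    # estimated standard error of the mean

t = (m - mu) / s_m                        # about 2.60 unrounded; the chapter
                                          # rounds M to 106 and reports t = 2.5
t_crit = 1.796                            # one-tailed, alpha = 0.05, df = 11

print(f"t({n - 1}) = {t:.2f}, critical value = {t_crit}")
assert t > t_crit  # reject H0: the mean balance is significantly above $100
```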
Since many researchers are now using statistical software that can report the exact probability of making a Type I error, you may see results presented with an exact value of p. In our case, p = 0.013, so if you know this, you should tell your reader: (t(11) = 2.5, p = 0.013, one-tailed). In these cases, it is up to you to understand that p is less than the declared value of alpha and that the null hypothesis should be rejected. It is also important to realize that statistical software will probably report p to only three decimal places. This means that you could get a result that states p = 0.000. Understand that your probability of making a Type I error is never zero; the software has simply rounded a very low p value to three decimal places. When reporting this result, you do not know the exact value of this small p, but you do know that it is less than 0.001, so the appropriate thing to do is to state that p < 0.001.

Effect Size and t-tests

Effect size, as measured by Cohen's d, is calculated just as it was when you were using z. This time, however, instead of using the population standard deviation, you must use the sample standard deviation. Since the sample standard deviation (s) is an estimate of the population standard deviation (σ), the value is an estimate of Cohen's d:

estimated d = (M − μ) / s

Just as before, consider values of d near 0.2 to be small, values near 0.5 to show a medium effect, and values at and above 0.8 to be large.

Related-Samples t-tests

A simple modification of the one-sample t-test opens the door to another powerful and common version of the t-test: the related-samples t-test, or paired-samples t-test. One of the simplest and most common experimental strategies is the within-subjects design, in which you measure people or subjects under a treatment condition and under a control condition.
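The chapter does not report an effect size for the bank-balance example, but the formula above applies directly to it. This short sketch (my own computation on the chapter's data, not a result stated in the text) estimates d:

```python
import statistics

# Estimated Cohen's d for the bank-balance example: d = (M - mu) / s
balances = [107, 93, 105, 97, 100, 113, 105, 106, 112, 119, 99, 119]
mu = 100

m = statistics.fmean(balances)     # M = 106.25
s = statistics.stdev(balances)     # about 8.31

d = (m - mu) / s
print(f"estimated d = {d:.2f}")    # about 0.75, medium-to-large by the
                                   # conventions above
```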
For example, if you study maze learning in rats and typically test your animals in an illuminated room, you might like to see if rats can complete the maze equally well in the dark. In order to conduct this experiment in an unbiased fashion, you take ten naïve rats (who have never been in your maze) and time them under each of the two conditions. To do this properly, you should use a counterbalanced design and randomly determine which condition to run first for each rat. This is a repeated-measures design because you measured each subject twice. It is also a within-subjects design because you are comparing treatment and control measurements that come from the same group of subjects. Your results appear below:

Rat    Illuminated (seconds)    Dark (seconds)
1      128                      19
2      138                      290
3      125                      120
4      133                      130
5      43                       114
6      73                       137
7      60                       218
8      70                       151
9      11                       226
10     42                       141
       M = 82.3                 M = 154.6
       SD = 45.43               SD = 74.38

There is quite a lot to consider here. You have two sets of times, each with its own mean and standard deviation. So far, we do not have the tools to handle more than one sample mean (although we will get to that in later chapters). However, there is another way of looking at these data. Each rat has two scores. These are paired scores, and we can turn each pair into a single score by taking the difference between them. If we subtract the time in the illuminated condition from the time in the dark condition, we can create another column of difference scores. These scores show us how much each rat's time changed as it moved from the illuminated condition to the dark condition or vice versa. We can only do this because each rat was tested under each condition. (If the ten rats in the illuminated condition were not the same rats used in the dark condition, it would be pointless to calculate difference scores, since no single score in one condition would have any special relationship to any single score in the other condition.
We will discuss how to run t-tests for independent groups in the following chapter.) Note that some values are positive and others are negative; it is essential that you make this distinction.

Rat    Illuminated (seconds)    Dark (seconds)    Difference Score
1      128                      19                -109
2      138                      290               152
3      125                      120               -5
4      133                      130               -3
5      43                       114               71
6      73                       137               64
7      60                       218               158
8      70                       151               81
9      11                       226               215
10     42                       141               99
       M = 82.3                 M = 154.6         MD = 72.3
       SD = 45.43               SD = 74.38        SD = 93.95

These difference scores contain all the information that we need. They tell us how much change occurs when we go from the illuminated condition to the dark condition. Once we have calculated the differences, we can forget about the original data and focus on the difference scores. From there, we can calculate the mean difference (MD), the standard deviation of the difference scores (s), and the standard error of the mean difference (sMD). The data are simplified as follows:

Rat    Difference Score
1      -109
2      152
3      -5
4      -3
5      71
6      64
7      158
8      81
9      215
10     99

MD = 72.3, s = 93.95, sMD = 29.71, n = 10

We will be testing the hypothesis that a difference exists. Therefore, our null hypothesis will state that the mean difference equals zero, and the alternative hypothesis will state that it does not.

H0: MD = μD    or    H0: 72.3 = 0
H1: MD ≠ μD    or    H1: 72.3 ≠ 0

Calculating a Related-Samples t

The formula for t does not change for a related-samples test, but we do need to think a bit about the questions we are asking. We know the mean of the difference scores (MD) and we know how to calculate the standard error of the difference scores (sMD), but what is our hypothetical value for the mean of the difference scores? If we want to know whether a change has occurred between the illuminated and dark conditions, then we are asking whether the mean difference is different from zero.
This means that the hypothetical mean difference will always be zero if your null hypothesis predicts that the mean difference will be zero. (Note: there may be times when your null hypothesis does not predict a change of zero. If that is the case, plug that value in for μD.) Therefore, the equation for a repeated-measures t-test is:

t = (MD − μD) / sMD

and, if your null hypothesis predicts no change, μD = 0 and the equation simplifies to:

t = MD / sMD = 72.3 / 29.71 = 2.43

Cohen's d is estimated as:

estimated d = MD / s = 72.3 / 93.95 = 0.77

Your sample size is the number of difference scores, so, for this example, n = 10 and df = 9. As always, the critical values for t are selected based on df and on whether you are conducting a two-tailed test (predicting only that a change will occur, in either direction) or a one-tailed test (specifically predicting either a positive or a negative change). In our case, the critical value for t with 9 degrees of freedom and α = 0.05 is tcrit = 2.262. This is how you would report the results:

We found that, compared to the illuminated condition, rats took significantly more time to solve the maze in the dark, t(9) = 2.43, p < 0.05, two-tailed, d = 0.77.

Summary

In these examples, we used t-tests to compare means and to make an informed statement about whether a sample mean (M) was significantly different from a hypothetical mean (μ) or whether the mean difference (MD) from a set of paired scores was significantly different from zero. In the next section, we will modify the t-test formula one more time so we can compare unpaired scores. This t-test is commonly used in between-groups studies, in which the individual scores from one group have no special relationship with the scores in the other.

Values of t corresponding to the proportion of the distribution in one tail or in two tails combined. Numbers in the left column are degrees of freedom (n − 1).
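The whole related-samples calculation can be reproduced from the raw times in a few lines of stdlib Python (a library routine such as scipy.stats.ttest_rel would report the same t):

```python
import math
import statistics

# Times (seconds) for each rat under the two conditions, from the table above.
illuminated = [128, 138, 125, 133, 43, 73, 60, 70, 11, 42]
dark        = [19, 290, 120, 130, 114, 137, 218, 151, 226, 141]

# Difference scores: dark minus illuminated, one per rat.
diffs = [d - i for d, i in zip(dark, illuminated)]
n = len(diffs)

m_d = statistics.fmean(diffs)            # MD = 72.3
s = statistics.stdev(diffs)              # about 93.95
s_md = s / math.sqrt(n)                  # about 29.71

t = m_d / s_md                           # null hypothesis: mu_D = 0
d_est = m_d / s                          # estimated Cohen's d

print(f"t({n - 1}) = {t:.2f}, d = {d_est:.2f}")
```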
                 Proportion in One Tail
 df     0.10    0.05    0.025    0.01    0.005
                 Proportion in Two Tails Combined
        0.20    0.10    0.05     0.02    0.01
  1    3.078   6.314  12.710   31.820   63.660
  2    1.886   2.920   4.303    6.965    9.925
  3    1.638   2.353   3.182    4.541    5.841
  4    1.533   2.132   2.776    3.747    4.604
  5    1.476   2.015   2.571    3.365    4.032
  6    1.440   1.943   2.447    3.143    3.707
  7    1.415   1.895   2.365    2.998    3.499
  8    1.397   1.860   2.306    2.896    3.355
  9    1.383   1.833   2.262    2.821    3.250
 10    1.372   1.812   2.228    2.764    3.169
 11    1.363   1.796   2.201    2.718    3.106
 12    1.356   1.782   2.179    2.681    3.055
 13    1.350   1.771   2.160    2.650    3.012
 14    1.345   1.761   2.145    2.624    2.977
 15    1.341   1.753   2.131    2.602    2.947
 16    1.337   1.746   2.120    2.583    2.921
 17    1.333   1.740   2.110    2.567    2.898
 18    1.330   1.734   2.101    2.552    2.878
 19    1.328   1.729   2.093    2.539    2.861
 20    1.325   1.725   2.086    2.528    2.845
 21    1.323   1.721   2.080    2.518    2.831
 22    1.321   1.717   2.074    2.508    2.819
 23    1.319   1.714   2.069    2.500    2.807
 24    1.318   1.711   2.064    2.492    2.797
 25    1.316   1.708   2.060    2.485    2.787
 26    1.315   1.706   2.056    2.479    2.779
 27    1.314   1.703   2.052    2.473    2.771
 28    1.313   1.701   2.048    2.467    2.763
 29    1.311   1.699   2.045    2.462    2.756
 30    1.310   1.697   2.042    2.457    2.750
 40    1.303   1.684   2.021    2.423    2.704
 50    1.299   1.676   2.009    2.403    2.678
 60    1.296   1.671   2.000    2.390    2.660
 80    1.292   1.664   1.990    2.374    2.639
100    1.290   1.660   1.984    2.364    2.626
120    1.289   1.658   1.980    2.358    2.617
  ∞    1.282   1.645   1.960    2.326    2.576
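For quick reuse, a few of this table's α = 0.05 entries can be wrapped in a small lookup helper. This is only a sketch: the dictionary holds just the df values used in this chapter's examples, not the full table.

```python
# Critical values of t at alpha = 0.05, copied from the rows of the
# t table above for the df values used in this chapter's examples.
T_TABLE_05 = {
    # df: (one-tailed tcrit, two-tailed tcrit)
    9:  (1.833, 2.262),
    11: (1.796, 2.201),
    29: (1.699, 2.045),
}

def t_critical(df, tails=2):
    """Return the alpha = 0.05 critical value of t for the given df."""
    one_tailed, two_tailed = T_TABLE_05[df]
    return one_tailed if tails == 1 else two_tailed

# The two worked examples in this chapter:
print(t_critical(11, tails=1))   # one-sample bank example, df = 11
print(t_critical(9, tails=2))    # related-samples rat example, df = 9
```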