Introduction to the t-statistic
Introduction to Statistics, Chapter 9
Mar 5-10, 2009
Classes #15-16

The Problem with Z-scores
Z-scores have a shortcoming as an inferential statistic: the computation of the standard error requires knowing the population standard deviation ($\sigma$).
In reality, we rarely know the value of $\sigma$.
When we don't know $\sigma$, we can't compute the standard error.
Therefore, we use the t statistic, rather than the Z-score, for hypothesis testing.

How to test?
We still need to know about the standard deviation of the population.
To figure this out, use the formula learned back in chapter 4 (here $\bar{X}$ is the sample mean, written $M$ elsewhere in these slides):
$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
Use this to calculate the estimated standard error ($s_M$).
We calculate the t statistic similarly to how we calculated the z statistic, but now we use the estimated standard error of the mean (rather than the standard error of the mean).

The t Statistic
Use the sample variance ($s^2$) to estimate the population variance:
$s^2 = \frac{SS}{df} = \frac{SS}{n - 1}$
Use the variance $s^2$ in the formula to get the estimated standard error, which provides an estimate of the standard distance between $M$ and $\mu$ when $\sigma$ is unknown:
estimated standard error $= s_M = \frac{s}{\sqrt{n}} = \sqrt{\frac{s^2}{n}}$

The t Statistic
Finally, we replace the standard error in the z-score formula with the estimated standard error to get the t statistic formula:
$t = \frac{M - \mu}{s_M}$

Illustration
In chapter 8, for Z-scores, we used the standard error based on the population SD:
$\sigma_M = \frac{\sigma}{\sqrt{n}}$
Plugging it into the z-score formula:
$Z = \frac{M - \mu}{\sigma_M}$
Now we use an estimate of the standard error based on the sample SD:
$s_M = \frac{s}{\sqrt{n}}$
Plugging it into the t statistic formula:
$t = \frac{M - \mu}{s_M}$

Another change…
Up until this chapter we have been using formulas that used the standard deviation as part of the standard error formula.
Now we shift our focus to the formula based on variance.
On page 234, the book gives the reasoning for this.
Main reason: the sample variance ($s^2$) provides an accurate and unbiased estimate of the population variance ($\sigma^2$).

One more change…
Although the definitional formula for the sum of squares is the most direct way to compute SS, it has its problems.
When the mean is not a whole number, the deviations contain decimals, so the calculations become more difficult and lead to rounding error.
Therefore, from now on we will use the SS computational formula, $SS = \sum X^2 - \frac{(\sum X)^2}{n}$, which can be found on page 93 in the textbook.
"From now on" means all future tests and the final exam.

Sample Variance
Therefore, we will use the sample variance rather than the sample standard deviation to compute $s_M$.
The sample standard deviation is a descriptive statistic rather than an inferential statistic.
The sample variance provides the most accurate way to estimate the standard error.
We are now using the variance-based formula in these equations. Why?
The purpose is inferential rather than descriptive: we are drawing inferences about the population.
estimated standard error $= s_M = \frac{s}{\sqrt{n}} = \sqrt{\frac{s^2}{n}}$

t statistic
Definition: used to test hypotheses about an unknown population mean $\mu$ when the value of $\sigma$ is unknown.
The formula for the t statistic has the same structure as the z-score formula, except that the t statistic uses the estimated standard error in the denominator.
$t = \frac{M - \mu}{s_M}$

Z-score vs. T-Score
The Z-distribution stays the same, regardless of sample size.
The T-distribution changes, depending on how many pieces of information you have: the degrees of freedom (here, df = n - 1).
Everything else stays the same:
Have an alpha level.
Have one-tailed and two-tailed tests.
Determine the boundaries of the critical region.
Determine whether the t statistic falls in the critical region.
If it does, reject the null and know that p < alpha.

Degrees of Freedom
How well does $s$ approximate $\sigma$?
It depends on the size of the sample. The larger the sample, the better the approximation.
Degrees of Freedom (df) = n - 1
Measures the number of scores that are free to vary when computing SS for sample data.
The value of df also describes how well a t statistic estimates a normal curve.

Degrees of Freedom
Degrees of Freedom = df = n - 1
As df (sample size) gets larger, 3 things result:
1) $s^2$ (sample variance) better represents $\sigma^2$ (population variance).
2) t better approximates z.
3) In general, the sample better represents the population.

The t-distribution
T-distribution: the set of all possible t statistics obtained by selecting all possible samples of size n from a given population.
How well the t distribution approximates a normal distribution is determined by the df.
In general, the greater n (and df), the more normal the t distribution becomes.
The t distribution is more variable and flatter than the normal z-score distribution. Why is this the case?
Both the mean and the standard error can vary in the t-distribution; only the mean varies in the z-distribution.

[Figure: Distributions of the t statistic]

The Versatility of the t test
You do not need to know $\sigma$ when testing with t.
The t test permits hypothesis testing in situations in which $\sigma$ is unknown.
All you really need to compute t is a sensible null hypothesis and a sample drawn from the unknown population.

Hypothesis Testing with t (two tails)
Same four steps, with a few differences:
Now estimating the standard error, so compute t rather than z.
Consult the t-distribution table rather than the Unit Normal Table to find the critical value for t (this will involve the calculation of the df).

Hypothesis Testing w/ t-statistic
Instead of the Unit Normal Table, we now have the t-table (p. 531-532).
Similar in form to the Unit Normal Table.
Pay attention to the df column!!
Let's think about this table for a minute. Looking at the two-tail, p = 0.05 column:
What is the value at 10 df?
What is the value at 20 df?
What is the value at 30 df?
What is the value at 120 df?

[Table: A portion of the t-distribution table]

Hypothesis Testing with t (two tails)
Step 1: State the hypotheses.
Step 2: Set $\alpha$ and locate the critical region. You will need to calculate the df to do this, and use the t distribution table.
Step 3: Graph (shade) the critical region.
Step 4: Collect sample data and compute t. This will involve 3 calculations, given SS, n, $\mu$, and M:
a) the sample variance ($s^2$)
b) the estimated standard error ($s_M$)
c) the t statistic

Hypothesis Testing with t (two tails)
Step 5: Go back to the graph and see if $t_{CALC}$ falls in the critical region.
Step 6: Make a decision. Compare the t computed in Step 4 ($t_{CALC}$) with the $t_{CRIT}$ found in the t table:
If $t_{CALC} > t_{CRIT}$ (ignoring signs): reject $H_0$.
If $t_{CALC} < t_{CRIT}$ (ignoring signs): fail to reject $H_0$.

One-Tailed Hypothesis Testing with t
Same as with z; only steps 1 and 2 change.
Step 1: Now use directional hypotheses.
$H_0$: $\mu$ = ? and $H_1$: $\mu$ < ? (predicts decrease)
OR
$H_0$: $\mu$ = ? and $H_1$: $\mu$ > ? (predicts increase)
Step 2: Now the critical region is located in only one tail of the distribution (the sign of $t_{CRIT}$ represents the direction of the predicted effect). You will have to use a different column on the t distribution table.

Example 1
Do eye-spot patterns affect behavior?
If eye-spots do affect behavior, birds should spend more or less time in the chamber with eye-spots painted on the walls.
Sample of n = 16 birds, allowed to wander between the 2 chambers for 60 minutes.
If eye-spots do not affect behavior, we'd expect them to spend about 30 minutes in each chamber.
We're told the sample mean M = 39 and SS = 540.

Example 1
Step 1: State the hypotheses.
$H_0$: $\mu_{\text{plain side}} = 30$ min.
$H_1$: $\mu_{\text{plain side}} \neq 30$ min.
Two-tailed, alpha = 0.05
Step 2: Locate the critical region, based on df. What are the df here? What is the critical value?
Step 3: Shade in the critical region.

Example 1
Step 4: Calculate the t statistic.
First calculate the sample variance:
$s^2 = \frac{SS}{n - 1} = \frac{540}{15} = 36$
Next use the sample variance ($s^2$) to calculate the estimated standard error:
$s_M = \sqrt{\frac{s^2}{n}} = \sqrt{\frac{36}{16}} = \sqrt{2.25} = 1.50$
Finally, compute the t statistic:
$t = \frac{M - \mu}{s_M} = \frac{39 - 30}{1.50} = \frac{9}{1.50} = 6.00$

Example 1
Step 5: Make a decision.
t-calculated = 6.00
t-critical = ±2.131
We observe that our t-value is in the region of rejection.
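The arithmetic in Example 1 can be re-traced with a short script. This is only a minimal sketch: the function name `one_sample_t` and the variable names are made up for illustration, and the critical value 2.131 is simply copied from the example rather than looked up by the code.

```python
import math

def one_sample_t(sample_mean, mu, ss, n):
    """Return (sample variance, estimated standard error, t) for one sample."""
    variance = ss / (n - 1)               # s^2 = SS / df, with df = n - 1
    std_error = math.sqrt(variance / n)   # s_M = sqrt(s^2 / n)
    t = (sample_mean - mu) / std_error    # t = (M - mu) / s_M
    return variance, std_error, t

# Example 1 values from the slides: M = 39, mu = 30, SS = 540, n = 16.
variance, std_error, t_calc = one_sample_t(39, 30, 540, 16)

# Two-tailed critical value for df = 15 at alpha = .05, as quoted in the example.
T_CRIT = 2.131
reject_null = abs(t_calc) > T_CRIT        # compare while ignoring the sign

print(variance, std_error, t_calc, reject_null)  # 36.0 1.5 6.0 True
```

The comparison uses `abs(t_calc)` because a two-tailed test rejects in either direction.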
We conclude that eye-spots have an effect on predatory behavior.

Example 2
A teacher was trying to see whether a new teaching method would increase the Test of English as a Foreign Language (TOEFL) scores of students.
She received a report which included a partial list of previous scores on the exam.
Unfortunately, most of the records were burned in a fire that occurred in the school's Records Department.
From the available data, students taught by the old methods had $\mu = 580$.
She tested her method in a class of 20 students and got a mean of 595 and a variance of 225.
Is this increase statistically significant at the 0.05 level in a two-tailed test?

Example 2
Step 1:
Step 2:
Step 3:
Step 4:
Step 5:

Example 3
A researcher believes that children in poverty-stricken regions are undernourished and underweight.
Past studies show that the weight of 6-year-olds is normally distributed with a mean of $\mu = 20.9$ kg.
However, the exact mean and standard deviation of the population are not available.
The researcher collects a sample of 9 children, with a sample mean of 17.3 kg and s = 2.51 kg.
Using a one-tailed test and a 0.01 level of significance, determine if this sample is significantly different from what would be expected for the population of 6-year-olds.

Example 3
Step 1:
Step 2:
Step 3:
Step 4:
Step 5:

Example 4
A researcher has developed a new formula (Sunblock Extra) that she claims will help protect against the harmful rays of the sun.
In a recent promotion for the new formula, she is quoted as saying she is sure her new formula is better than the old one (Sunscreen).
Her prediction: the "improved" Sunblock Extra will score higher than the previous Sunscreen score of 12.
She decides to use the .05 significance level to test for differences.
The Sunblock Extra scores for participants in her study are listed below.
In notation form:
H0:
HA:
Determine if there is a significant difference between the new product and the old one (make your decision and interpret).

  X     X^2
 12     144
 13     169
  6      36
 11     121
 12     144
  8      64
 11     121
  7      49
 10     100
 16     256
 10     100
  7      49
 14     196
 15     225
 16     256
Sum:
168    2030

Example 4
Step 2:
Step 3:
Step 4:
Step 5:

Example 5
Scientists believe that the "Monstro Motors" new model will get the highest gas mileage of any car on their lot.
Although not much data is available on the older cars, from a review of previous models they estimate that the best of the rest of their cars achieved 67 m.p.g.
They are using an alpha level of $\alpha = .01$.
H0:
HA:
Determine if there is a significant difference between the MPG of the new car and the best old model on their lot (make your decision and interpret).

   X     X^2
  65    4225
  76    5776
  69    4761
  71    5041
  74    5476
  78    6084
  77    5929
  68    4624
  72    5184
  75    5625
  74    5476
  64    4096
  69    4761
  63    3969
  82    6724
Sum:
1077   77751

Example 5
Step 2:
Step 3:
Step 4:
Step 5:

Steps
Step 2?
Step 3?
Step 4?

Effect Size
Effect size is a measure of the strength of the relationship between two variables.
In scientific experiments, it is often useful to know not only whether an experiment has a statistically significant effect, but also the size of any observed effects.
In practical situations, effect sizes are helpful for making decisions.

Effect Size
The concept of effect size appears in everyday language.
For example, a weight loss program may boast that it leads to an average weight loss of 30 pounds. In this case, 30 pounds is an indicator of the claimed effect size.
Another example is a tutoring program that claims it raises school performance by one letter grade. This grade increase is the claimed effect size of the program.

Effect Size
An effect size is best explained through an example: if you had no previous contact with humans, and one day visited England, how long would it take you to realize that, on average, men are taller than women there?
The answer relates to the effect size of the difference in average height between men and women.
The larger the effect size, the easier it is to see that men are taller.
If the height difference were small, it would require knowing the heights of many men and women to notice that (on average) men are taller than women.

Effect Size
Cohen's d: an effect size measure representing the standardized difference between two means.

Effect Size
Cohen's $d = \frac{\text{mean difference}}{\text{sample standard deviation}} = \frac{M - \mu}{s}$

Example 4:
$d = \frac{M - \mu}{s} = \frac{11.2 - 12}{3.25} = \frac{-0.8}{3.25} \approx -0.25$
Small effect (small to medium)

Example 5:
$d = \frac{M - \mu}{s} = \frac{71.8 - 67}{5.49} = \frac{4.8}{5.49} \approx 0.87$
Large effect

Credits
http://myweb.liu.edu/~nfrye/psy53/ch9.ppt#9
http://homepages.wmich.edu/~malavosi/Chapt9PPT_S_05.ppt#2
http://faculty.plattsburgh.edu/alan.marks/Stat%20206/Introduction%20to%20the%20t%20Statistic.ppt#4
http://home.autotutor.org/hiteh/Stats%20S04/Statistics04onesamplettest1.ppt#7
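Returning to the Cohen's d slides: the two effect sizes can be recomputed from the raw scores listed for Examples 4 and 5. The sketch below is illustrative only. The helper name `cohens_d` is an assumption, and it takes s from the SS computational formula, $SS = \sum X^2 - (\sum X)^2 / n$, with df = n - 1.

```python
import math

def cohens_d(scores, mu):
    """One-sample Cohen's d: (M - mu) / s, where s is the sample SD."""
    n = len(scores)
    sum_x = sum(scores)
    sum_x2 = sum(x * x for x in scores)
    ss = sum_x2 - sum_x ** 2 / n      # SS = sum(X^2) - (sum X)^2 / n
    s = math.sqrt(ss / (n - 1))       # s = sqrt(SS / df)
    mean = sum_x / n                  # M
    return (mean - mu) / s

# Raw scores copied from the Example 4 and Example 5 tables.
example4 = [12, 13, 6, 11, 12, 8, 11, 7, 10, 16, 10, 7, 14, 15, 16]
example5 = [65, 76, 69, 71, 74, 78, 77, 68, 72, 75, 74, 64, 69, 63, 82]

d4 = cohens_d(example4, 12)   # comparison value: old Sunscreen score of 12
d5 = cohens_d(example5, 67)   # comparison value: best older model, 67 m.p.g.

print(round(d4, 3), round(d5, 3))
```

With unrounded intermediate values this gives d of roughly -0.246 for Example 4 and roughly 0.874 for Example 5, in line with the small and large effects reported on the slides.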