UNC-Wilmington
Department of Economics and Finance
ECN 377
Dr. Chris Dumas
Testing Hypotheses about Means using t-Tests and the t-Table
(with a guest appearance by Confidence Intervals)
t-Tests are used to conduct hypothesis tests involving means when you don’t know the σ of the population (which
is the usual situation). Conducting a t-Test is similar to conducting a Z-test, but t-Tests are used much more often
in practice (because we usually don’t know the σ of the population).
We use the t-distribution, or t-Table of values to help us conduct t-Tests. Use of the t-Table will be
demonstrated in lecture. The t-distribution is also named “Student’s distribution,” but it should rightfully be
named “Gosset’s distribution.” Related to this is the fact that, if it were not for beer, the t-distribution may not
have been discovered, and statistics would not be nearly as useful. For this reason, “Beer saved statistics!” Read
more here: https://en.wikipedia.org/wiki/William_Sealy_Gosset
The t-test gives different results from the Z-test when sample size n is small; but, when sample size is large, the t-test and the Z-test give approximately the same results.
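To see this convergence numerically, here is a minimal sketch (in Python, using the scipy library; neither is part of this handout) that compares the two-sided 5% t critical value with the corresponding Z critical value of 1.96 as the degrees of freedom grow:

    # Illustrative only: t critical values approach the Z critical value (1.96)
    # for a two-sided test at alpha = 0.05 as the degrees of freedom increase.
    from scipy import stats

    for df in (5, 10, 30, 100, 1000):
        t_crit = stats.t.ppf(0.975, df)    # two-sided 5% critical value from the t-distribution
        print(f"d.f. = {df:5d}   t critical = {t_crit:.3f}   (Z critical = 1.96)")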
There are three different types of t-Tests about means. (The examples later in this handout will show you how to
conduct each type of t-Test.)

A One-Sample t-Test is used to test whether the mean from one sample of data is different from
a given number.

An Independent Samples t-Test is used to test whether the mean from one sample of data is
different from the mean from a second sample of data where the individuals in the first sample
are DIFFERENT FROM the individuals in the second sample. For example, if you measure
the opinion of 100 people on some issue and also measure the opinion of a different set of 100
people in some other location, say, on the same issue, and then you want to test whether the
opinion of the first set of people differs from the opinion of the second set of people (to determine
whether opinions are different in the different locations), you would conduct an Independent
Samples t-Test.

A Dependent (or "Paired") Samples t-Test is used to test whether the mean from one sample of
data is different from the mean from a second sample of data where the individuals measured in
the first sample are THE SAME individuals measured in the second sample. For example, if
you measure the opinion of 100 people on some issue and then go back later to the same 100
people and measure their opinion again on the same issue, and then you want to test whether the
opinion of these 100 people changed, you would conduct a Dependent (or "Paired") Samples t-Test.
Examples of When to Use Independent Samples t-Tests
For example, X might be per capita income, and the first sample might be observations on X for east
coast cities, and the second sample might be observations on X for west coast cities, and you want to
know whether the mean value of X for east coast cities is different from the mean value of X for west
coast cities. As another example, X might be per capita income, and the first sample might be
observations on X for years in the 1960s, the second sample might be observations on X for years in the
1980s, and you want to know whether the mean value of X in the 1960s is different from the mean value
of X in the 1980s.
Examples of When to Use Dependent/Paired Samples t-Tests
For example, suppose you have one group of students, and for each student you have two test grades, and
for all the students in the group together you want to test for a difference between the mean grade on test
1 and the mean grade on test 2. As another example, you have a group of firms for which you measure
cost per unit of production before an efficiency-improvement program is implemented and after the
program is implemented, and you want to test whether mean cost per unit before the program is different
from mean cost per unit after the program.
One-Sided versus Two-Sided t-Tests
The three types of t-tests described above can be conducted as either "One-Sided" tests or "Two-Sided" tests,
depending on how your hypotheses are set up . . .
A One-Sided t-Test is appropriate when the hypotheses you want to test are in either of the two forms
shown below. In a One-Sided t-test, the H1 hypothesis has either a less than "<" symbol or a greater than
">" symbol.
either this way:
H0: μ1 = μ0
H1: μ1 < μ0
or this way:
H0: μ1 = μ0
H1: μ1 > μ0
A Two-Sided t-Test is appropriate when the hypotheses that you want to test are of the form shown
below. In a Two-Sided t-test, the H1 hypothesis has a not-equal-to "≠" symbol.
this way:
H0: μ1 = μ0
H1: μ1 ≠ μ0
Three Different Ways of Conducting a t-Test
Conducting a t-Test is similar to conducting a Z-test. Any of the t-Tests (one-sample, independent samples, or
dependent samples, on either a one-sided or a two-sided basis) can be conducted in any of the three ways
described below. The three ways of conducting a t-Test described below all give the same answer for the test, so
it doesn't matter which of the three ways you use, but you will see all three methods used in the "real world," so
you need to be familiar with all three.
1. Comparing a "ttest" number (calculated from your data, using the formula below) against a "tcritical"
number (from a statistical table).
ttest = (𝑋̅ – μ) / s.e.,   where s.e. = s / √n
If the ttest number is farther from zero than tcritical (that is, if |ttest| > tcritical),
then conclude: Reject H0 and Accept H1.
2. Comparing a "p-value" against either (a) an "α-value," for a one-sided test, or
(b) an "(α/2)-value," for a two-sided test.
• One-sided tests: Compare the p-value against an α-value. If the p-value is less than the α-value, then conclude: Reject H0 and Accept H1.
• Two-sided tests: Compare the p-value against an (α/2)-value. If the p-value is less than the (α/2)-value, then conclude: Reject H0 and Accept H1.
3. Constructing a "Confidence Interval" and comparing it against either (a) a given null hypothesis
number, or (b) another Confidence Interval. See the Methodology for Constructing Confidence Intervals section below for how to construct Confidence Intervals.
Methodology for Conducting One-Sample t-Tests
The methodology used to conduct t-Tests by comparing a "ttest" number from a formula against a "tcritical"
number from a table, or by comparing a p-value against an α-value, is very similar to the methodologies
used for Z-tests. Use the ttest number instead of the Ztest number. (See the examples further below.)
ttest = (𝑋̅ – μ) / s.e.,   where s.e. = s / √n
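As a concrete illustration, here is a minimal sketch of a One-Sample t-Test in Python (using scipy; the data values and the hypothesized mean mu0 below are hypothetical, not taken from this handout). It carries out both the ttest-versus-tcritical comparison and the p-value-versus-α comparison for a one-sided test:

    # Sketch of a One-Sample t-Test (the data and mu0 below are hypothetical).
    # H0: mu = mu0   versus   H1: mu > mu0   (a one-sided test)
    import math
    from scipy import stats

    x = [52, 48, 55, 60, 51, 49, 58, 53, 50, 54]   # hypothetical sample data
    mu0 = 50.0                                     # the given number in H0
    alpha = 0.05

    n = len(x)
    xbar = sum(x) / n                                            # sample mean
    s = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))   # sample std. dev.
    se = s / math.sqrt(n)                                        # standard error s.e. = s / sqrt(n)
    t_test = (xbar - mu0) / se                                   # ttest = (Xbar - mu0) / s.e.

    t_crit = stats.t.ppf(1 - alpha, df=n - 1)   # one-sided critical value from the t-distribution
    p_value = stats.t.sf(t_test, df=n - 1)      # one-sided (upper-tail) p-value

    print(f"ttest = {t_test:.3f},  tcrit = {t_crit:.3f},  p-value = {p_value:.4f}")
    if t_test > t_crit:                         # equivalently: p_value < alpha
        print("Reject H0 and accept H1")
    else:
        print("Accept H0 and reject H1")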
Methodology for Constructing Confidence Intervals for One-Sample t-Tests
Confidence Intervals provide an alternative way of conducting hypothesis tests (instead of comparing ttest to
tcritical, or comparing the p-value with the α-value). The true population value will lie between the Confidence
Interval numbers a percentage of the time equal to the Confidence Level you have chosen for the test.
Step 1. Choose your Confidence Level for the test. (Typically, 95%, or 0.95)
Step 2. Calculate the Significance Level (α) of the test using the usual formula:
Significance Level (α) = 1 – Confidence Level
So, if the Confidence Level is the typical 95%, then Significance Level (α) = 5% (0.05)
Step 3. Using the t-table, find tα/2 for your d.f. and the Significance Level (α) you have chosen.
So, suppose you have chosen Confidence Level = 0.95, then α = 0.05, and α/2 = 0.025, and
suppose your d.f. is very large, so you can use the ∞ row of the t-table.
Using the t-table, we find that tα/2 = 1.96 for d.f. = ∞.
Step 4. Construct the Confidence Interval numbers based on tα/2 from the t-table and the Xbar and s.e. that
you calculated from your sample data:
Confidence Interval = Xbar +/- [ tα/2∙(s.e.)]
Notice that because of the "+/-" in the formula above, the Confidence Interval is really two
numbers:
Upper Confidence Interval Number = Xbar + [ tα/2∙(s.e.)]
Lower Confidence Interval Number = Xbar - [ tα/2∙(s.e.)]
Step 5. Interpret the Confidence Interval
Suppose, for example, that you calculate the following Confidence Interval numbers:
Upper Confidence Interval Number = 63
Lower Confidence Interval Number = 57
and also suppose that your Confidence Level is 95%
We would write this as:
95% Confidence Interval = (57,63)
Interpretation: If you take many random samples, and calculate the Confidence Interval for each
sample, then 95% of the Confidence Intervals will contain the true population mean.
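Here is a minimal sketch of Steps 1 through 5 in Python (using scipy; the sample size, Xbar, and s.e. values below are hypothetical):

    # Sketch of Steps 1-5 above (n, Xbar, and s.e. below are hypothetical).
    from scipy import stats

    conf_level = 0.95                     # Step 1: choose the Confidence Level
    alpha = 1 - conf_level                # Step 2: Significance Level = 1 - Confidence Level
    n = 25                                # hypothetical sample size
    xbar, se = 60.0, 1.53                 # hypothetical sample mean and standard error

    t_half_alpha = stats.t.ppf(1 - alpha / 2, df=n - 1)   # Step 3: t_(alpha/2) for d.f. = n - 1
    upper = xbar + t_half_alpha * se      # Step 4: Upper Confidence Interval Number
    lower = xbar - t_half_alpha * se      #         Lower Confidence Interval Number

    print(f"{conf_level:.0%} Confidence Interval = ({lower:.1f}, {upper:.1f})")   # Step 5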
Methodology for Conducting Independent Samples t-Tests
n1 = number of individuals in first sample
𝑋̅1 = sample mean of variable X for first sample
s1 = sample standard deviation of variable X for first sample
s.e.1 = s1 / √n1 = standard error of variable X for first sample
n2 = number of individuals in second sample
𝑋̅2 = sample mean of variable X for second sample
s2 = sample standard deviation of variable X for second sample
s.e.2 = s2 / √n2 = standard error of variable X for second sample
d = difference in true means = (μ2 – μ1) (we don't know what this is; we hypothesize about it)
D = difference in sample means = (𝑋̅2 - 𝑋̅1)
Calculate the s.e. for the difference in means:
s.e.d = √(s.e.1² + s.e.2²)
H0: d = (μ2 – μ1) = 0
H1: d = (μ2 – μ1) > 0 ===> this is a one-sided test
[Figure: the t-distribution (probability on the vertical axis, t-value on the horizontal axis), with the p-value shown as the area in the tail beyond ttest.]
First, find the ttest value for the difference in means:
ttest = (D – d) / s.e.d = [(𝑋̅2 – 𝑋̅1) – 0] / s.e.d
Next, choose a level for α and find tcrit in the t-table using d.f. = n1 + n2 -2. Finally, compare ttest and tcrit. If ttest is
farther from zero than tcrit, then reject H0 and accept H1. Otherwise, accept H0 and reject H1.
Alternatively, you can find the p-value of ttest and compare the p-value with the level of α. If the p-value is less
than α, then reject H0 and accept H1. Otherwise, accept H0 and reject H1.
Note: Actually, the formula¹ for degrees of freedom in this situation (testing for differences in means between two independent samples with unequal standard deviations) is:
d.f. = (s.e.1² + s.e.2²)² / [ s.e.1⁴ / (n1 – 1) + s.e.2⁴ / (n2 – 1) ]
but we are going to approximate it with d.f. = n1 + n2 -2 (which is the d.f. when the standard deviations are equal).
¹ Satterthwaite, F.W. 1946. An Approximate Distribution of Estimates of Variance Components. Biometrics Bulletin 2:110-114.
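Here is a minimal sketch of an Independent Samples t-Test in Python (using scipy; the two samples below are hypothetical). It computes ttest, the tcritical comparison, the p-value, and both the n1 + n2 – 2 approximation and the Satterthwaite d.f. from footnote 1:

    # Sketch of an Independent Samples t-Test (the two samples below are hypothetical).
    # H0: d = (mu2 - mu1) = 0   versus   H1: d = (mu2 - mu1) > 0   (a one-sided test)
    import math
    from scipy import stats

    x1 = [12.1, 11.4, 13.0, 12.6, 11.9, 12.3, 12.8, 11.7]   # hypothetical first sample
    x2 = [13.2, 12.9, 13.8, 14.1, 12.7, 13.5, 13.9, 13.0]   # hypothetical second sample
    alpha = 0.05

    def n_mean_sd_se(x):
        """Return n, sample mean, sample std. dev., and standard error for a sample."""
        n = len(x)
        xbar = sum(x) / n
        s = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))
        return n, xbar, s, s / math.sqrt(n)

    n1, xbar1, s1, se1 = n_mean_sd_se(x1)
    n2, xbar2, s2, se2 = n_mean_sd_se(x2)

    se_d = math.sqrt(se1 ** 2 + se2 ** 2)            # s.e. of the difference in means
    t_test = ((xbar2 - xbar1) - 0) / se_d            # ttest = (D - d) / s.e.d, with d = 0 under H0

    df_approx = n1 + n2 - 2                          # the handout's d.f. approximation
    df_satt = (se1 ** 2 + se2 ** 2) ** 2 / (se1 ** 4 / (n1 - 1) + se2 ** 4 / (n2 - 1))  # footnote 1

    t_crit = stats.t.ppf(1 - alpha, df=df_approx)    # one-sided critical value
    p_value = stats.t.sf(t_test, df=df_approx)       # one-sided p-value

    print(f"ttest = {t_test:.3f},  tcrit = {t_crit:.3f},  p-value = {p_value:.4f}")
    print(f"approximate d.f. = {df_approx},  Satterthwaite d.f. = {df_satt:.1f}")
    print("Reject H0 and accept H1" if t_test > t_crit else "Accept H0 and reject H1")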
Methodology for Conducting Dependent Samples t-Tests
Suppose that there are two things, X1 and X2, that we could measure for each individual i in a population. That is, for each
individual i, we have X1i and X2i.
Definition: “d” is the true mean difference between X1 and X2 across all individuals in the population. Importantly, “d” might be zero.
H0: d = 0
H1: d > 0
The methodology will be demonstrated using an example with 10 individuals. (More than 10 individuals are allowed; we're just using 10 in this example.) Thus, n = 10.
First, find the difference "Di" = X2i – X1i for each individual i in the sample; then find the mean "D̄" of these differences for all individuals in the sample, and find the variance sD², standard deviation sD, and standard error s.e.D of the differences. Basic idea of the test: use the sample mean difference D̄ to test hypotheses about "d", the true population mean difference.
Second, find:
ttest = (D̄ – d) / s.e.D = (1.2 – 0) / 0.512 = +2.34
Third, assuming alpha = 0.05, and using df = n – 1 = 9, use the t-table to find:
tcrit = +1.833.
Finally, if ttest is farther from zero than tcrit, then reject H0 and accept H1. Otherwise, accept H0 and reject H1.
So, in this example, Reject H0, and conclude that d > 0.
Individual i    First obs. on X (X1)    Second obs. on X (X2)    Difference Di = X2 – X1    Squared deviation (Di – D̄)²
     1                   5                        7                         2                        0.64
     2                   4                        6                         2                        0.64
     3                   6                        5                        -1                        4.84
     4                   7                        6                        -1                        4.84
     5                   5                        7                         2                        0.64
     6                   3                        6                         3                        3.24
     7                   6                        5                        -1                        4.84
     8                   4                        7                         3                        3.24
     9                   5                        6                         1                        0.04
    10                   6                        8                         2                        0.64

n = 10 = number of individuals;  d.f. = n – 1 = 9
Sample mean of differences:       D̄ = ΣDi / n = +1.2
Sample variance of differences:   sD² = Σ(Di – D̄)² / (n – 1) = 2.622
Sample std. dev. of differences:  sD = √(sD²) = 1.619
Sample s.e. of differences:       s.e.D = sD / √n = 0.512
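Here is a minimal sketch of this Dependent (Paired) Samples t-Test in Python (using scipy), with the X1 and X2 values taken from the table above; it reproduces (to rounding) D̄ = 1.2, s.e.D = 0.512, ttest = +2.34, and tcrit = +1.833:

    # Sketch of the Dependent (Paired) Samples t-Test, using the X1 and X2 values from the table.
    # H0: d = 0   versus   H1: d > 0   (a one-sided test)
    import math
    from scipy import stats

    x1 = [5, 4, 6, 7, 5, 3, 6, 4, 5, 6]    # first observation on X for each individual
    x2 = [7, 6, 5, 6, 7, 6, 5, 7, 6, 8]    # second observation on X for each individual
    alpha = 0.05

    d_i = [b - a for a, b in zip(x1, x2)]                           # Di = X2 - X1 for each individual
    n = len(d_i)                                                    # n = 10
    dbar = sum(d_i) / n                                             # sample mean of differences = +1.2
    s_d = math.sqrt(sum((di - dbar) ** 2 for di in d_i) / (n - 1))  # sample std. dev. = 1.619
    se_d = s_d / math.sqrt(n)                                       # sample s.e. = 0.512

    t_test = (dbar - 0) / se_d                   # = +2.34
    t_crit = stats.t.ppf(1 - alpha, df=n - 1)    # = +1.833 for d.f. = 9
    p_value = stats.t.sf(t_test, df=n - 1)       # one-sided p-value

    print(f"Dbar = {dbar:.3f},  sD = {s_d:.3f},  s.e.D = {se_d:.3f}")
    print(f"ttest = {t_test:.2f},  tcrit = {t_crit:.3f},  p-value = {p_value:.4f}")
    print("Reject H0 and accept H1" if t_test > t_crit else "Accept H0 and reject H1")

For comparison, scipy.stats.ttest_rel(x2, x1) returns the same ttest statistic (its p-value is two-sided by default).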
Appendix on Confidence Intervals
(Note: You won’t be held responsible for the material in this Appendix in ECN377, but I’m
putting it here for those who just can’t get enough . . . you know who you are)
In the Methodology for Constructing Confidence Intervals section earlier in this handout, we showed how to use
a Confidence Interval to test a hypothesis about the value of a mean from a single sample. That was using a
Confidence Interval to conduct a One-Sample test. But, actually, you can also use Confidence Intervals to conduct
Independent-Samples and Dependent-Samples tests. Here’s the scoop on how to use Confidence Intervals to
conduct all three types of tests:
One-sample tests: Compare the Confidence Interval from the sample with the given null
hypothesis value (μ0). (This is what we did in the example earlier in the handout.) If the
Confidence Interval contains the given number, then the mean of the sample is not significantly
different from the given number. If the Confidence Interval does not contain the given number,
then the mean of the sample is significantly different from the given number.
Independent-samples tests: Construct two Confidence Intervals, one for each sample, and check
whether the two Confidence Intervals overlap. If the Confidence Intervals overlap, then the
means of the two samples are not significantly different. If the intervals do not overlap, then the
means of the two samples are significantly different.
Dependent-samples tests: Compare the Confidence Interval for the mean difference with the
given null hypothesis value (μ0). If the Confidence Interval contains the given number, then the
mean difference is not significantly different from the given number. If the Confidence Interval
does not contain the given number, then the mean difference is significantly different from the
given number.
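Here is a minimal sketch of the three Confidence-Interval-based tests just described, in Python (the interval endpoints and the null-hypothesis values below are hypothetical, chosen only to illustrate the checks; the one-sample interval reuses the (57, 63) example from earlier in the handout):

    # Sketch of the three Confidence-Interval-based tests (interval endpoints are hypothetical).

    def contains(ci, value):
        """Does the Confidence Interval (lower, upper) contain the given number?"""
        lower, upper = ci
        return lower <= value <= upper

    def overlap(ci_a, ci_b):
        """Do two Confidence Intervals overlap?"""
        return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

    # One-sample test: compare the CI with the given null-hypothesis number mu0 (hypothetical here).
    ci_sample, mu0 = (57.0, 63.0), 50.0
    print("One-sample: significantly different from mu0?", not contains(ci_sample, mu0))

    # Independent-samples test: check whether the two samples' CIs overlap.
    # (Non-overlapping CIs imply a significant difference; overlapping CIs can still go
    # with a significant t-test, so this check is a conservative screen.)
    ci_1, ci_2 = (57.0, 63.0), (64.0, 70.0)
    print("Independent samples: significantly different?", not overlap(ci_1, ci_2))

    # Dependent-samples test: does the CI for the mean difference contain the null value (0 here)?
    ci_diff = (0.2, 2.2)
    print("Dependent samples: mean difference significantly different from 0?", not contains(ci_diff, 0.0))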