Lecture 8: Hypothesis Testing (Chapter 7.1–7.2, 7.4) and the Distribution of Estimators (Chapter 5.1–5.2, Chapter 6.4)
Copyright © 2006 Pearson Addison-Wesley. All rights reserved.

Agenda for Today
• Hypothesis Testing (Chapter 7.1)
• Distribution of Estimators (Chapter 5.2)
• Estimating $\sigma^2$ (Chapter 5.1, Chapter 6.4)
• t-tests (Chapter 7.2)
• P-values (Chapter 7.2)
• Power (Chapter 7.2)

What Sorts of Hypotheses to Test?
• To test a hypothesis, we first need to specify our "null hypothesis" precisely, in terms of the parameters of our regression model. We refer to this null hypothesis as $H_0$.
• We also need to specify our "alternative hypothesis," $H_a$, in terms of our regression parameters.

What Sorts of Hypotheses to Test? (cont.)
• Claim: the marginal propensity to consume is greater than 0.70:
$C_i = \beta_0 + \beta_1 \text{Income}_i + \varepsilon_i$
• Conduct a one-sided test of the null hypothesis $H_0: \beta_1 \geq 0.70$ against the alternative $H_a: \beta_1 < 0.70$.

What Sorts of Hypotheses to Test? (cont.)
• Claim: the marginal propensity to consume equals the average propensity to consume:
$C_i = \beta_0 + \beta_1 \text{Income}_i + \varepsilon_i$, with $\beta_0 = 0$
• Conduct a two-sided test of $H_0: \beta_0 = 0$ against the alternative $H_a: \beta_0 \neq 0$.

What Sorts of Hypotheses to Test? (cont.)
• The CAPM model from finance says that
E(excess return on portfolio k) = $\beta$ · (excess return on market portfolio)
• Regress (excess return on portfolio k) = $\beta_0 + \beta_1$ · (excess return on market portfolio) + $\varepsilon$ for a particular mutual fund, using data over time. Test $H_0: \beta_0 \geq 0$.
• If $\beta_0 > 0$, the fund performs better than expected, said early analysts. If $\beta_0 < 0$, the fund performs less well than expected.

What Sorts of Hypotheses to Test? (cont.)
• $H_0: \beta_0 \geq 0$
• $H_a: \beta_0 < 0$
• What if we run our regression and find $\hat\beta_0 = -0.012$? Can we reject the null hypothesis? What if $\hat\beta_0 = +0.012$?

Hypothesis Testing: Errors
• In our CAPM example, we are testing $H_0: \beta_0 \geq 0$ against the alternative $H_a: \beta_0 < 0$.
• We can make two kinds of mistakes.
– Type I Error: we reject the null hypothesis when the null hypothesis is true.
– Type II Error: we fail to reject the null hypothesis when the null hypothesis is false.

Hypothesis Testing: Errors (cont.)
• Type I Error: reject the null hypothesis when it is true.
• Type II Error: fail to reject the null hypothesis when it is false.
• We need a rule for deciding when to reject a null hypothesis. A rule with a lower probability of Type I error must have a higher probability of Type II error.

Hypothesis Testing: Errors (cont.)
• In practice, we build rules to have a low probability of a Type I error. Null hypotheses are "innocent until proven guilty beyond a reasonable doubt."
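To make the two error types concrete, here is a minimal simulation sketch (not from the slides; the sample size, effect size, and seed are arbitrary choices). It estimates how often a one-sided 5% test rejects when the null is true (the Type I error rate) and how often it fails to reject when the null is false (the Type II error rate):

```python
# Illustrative sketch only: Type I and Type II error rates for a one-sided
# 5% test of H0: mu >= 0 against Ha: mu < 0, with known sigma = 1.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps, alpha = 50, 5_000, 0.05
z_crit = stats.norm.ppf(alpha)              # about -1.64; reject when Z < z_crit

def rejection_rate(true_mu):
    count = 0
    for _ in range(reps):
        sample = rng.normal(true_mu, 1.0, n)
        z = sample.mean() / (1.0 / np.sqrt(n))
        count += (z < z_crit)
    return count / reps

print("Type I error rate  (null true,  mu =  0.0):", rejection_rate(0.0))
print("Type II error rate (null false, mu = -0.3):", 1 - rejection_rate(-0.3))
```

Making the rejection rule stricter (a smaller alpha) lowers the first number but raises the second, which is the trade-off described above.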
Hypothesis Testing: Errors (cont.)
• Type I Error: reject the null hypothesis when it is true.
• Type II Error: fail to reject the null hypothesis when it is false.
• We do NOT ask whether the null hypothesis is more likely than the alternative hypothesis.
• We DO ask whether we can build a compelling case to reject the null hypothesis.

Hypothesis Testing
• What constitutes a compelling case to reject the null hypothesis?
• If the null hypothesis were true, would we be extremely surprised to see the data that we see?

Hypothesis Testing: Errors (cont.)
• In our CAPM example, $H_0: \beta_0 \geq 0$ and $H_a: \beta_0 < 0$.
• What if we run our regression and find $\hat\beta_0 = -0.012$?
• Could a reasonable jury reject the null hypothesis if the estimate is "just a little lower" than 0?
• Can we use our data to amass overwhelming evidence that this null hypothesis is false?

Hypothesis Testing: Errors (cont.)
• Note: if we "fail to reject" the null, it does NOT mean we can "accept" the null hypothesis.
• "Failing to reject" means the case against the null still leaves "reasonable doubt."
• The null hypothesis could still be fairly unlikely, just not overwhelmingly unlikely.

Hypothesis Testing: Strategy
• Our strategy: look for a contradiction.
• Assume the null hypothesis is true.
• Calculate the probability that we would see our data, assuming the null hypothesis is true.
• Reject the null hypothesis if this probability is just too darn low.

How Should We Proceed?
1. Ask how our estimates of $\beta_0$ and $\beta_1$ are distributed if the null hypothesis is true.
2. Determine a test statistic.
3. Settle on a critical region: reject the null hypothesis if the probability of seeing our data is too low.

How Should We Proceed? (cont.)
• The key tool we need is the probability of seeing our data if the null hypothesis is true.
• We need to know the distribution of our estimators.

Distribution of a Linear Estimator (from Chapter 5.2)
• If the true value of $\beta_0$ is 0, what is the probability that we observe data with estimated intercept $\hat\beta_0$?

Distribution of a Linear Estimator (cont.)
• Perhaps the most common hypothesis test is $H_0: \beta = 0$ against $H_a: \beta \neq 0$.
• This hypothesis tests whether a variable has any effect on $Y$.
• We will begin by calculating the variance of our estimator for the coefficient on $X_1$.

Hypothesis Testing
• Add to the Gauss–Markov Assumptions: the disturbances are normally distributed,
$\varepsilon_i \sim N(0, \sigma^2)$, so $Y_i \sim N(\beta_0 + \beta_1 X_{1i} + \dots + \beta_k X_{ki},\ \sigma^2)$.

DGP Assumptions
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + \varepsilon_i$
$E(\varepsilon_i) = 0$
$\mathrm{Var}(\varepsilon_i) = \sigma^2$
$\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for $i \neq j$
Each explanator is fixed across samples.
$\varepsilon_i \sim N(0, \sigma^2)$
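A minimal Monte Carlo sketch (illustration only; the parameter values, sample size, and explanator below are made up) of why this DGP matters: with a fixed explanator and normal disturbances, the OLS slope estimate comes out normally distributed around the true $\beta_1$ across repeated samples.

```python
# Illustrative sketch only: simulate many samples from the DGP with a fixed
# explanator and normal errors, then inspect the sampling distribution of
# the OLS slope estimate.  All parameter values are hypothetical.
import numpy as np

rng = np.random.default_rng(1)
n, reps = 30, 10_000
beta0, beta1, sigma = 1.0, 0.7, 2.0
X = np.linspace(0, 10, n)                    # explanator fixed across samples
x = X - X.mean()

slopes = np.empty(reps)
for r in range(reps):
    Y = beta0 + beta1 * X + rng.normal(0.0, sigma, n)
    slopes[r] = np.sum(x * (Y - Y.mean())) / np.sum(x**2)   # OLS slope estimate

print("mean of the slope estimates:", slopes.mean())        # close to beta1 = 0.7
print("share within 1.96 sd of the mean:",                  # close to 0.95 if normal
      np.mean(np.abs(slopes - slopes.mean()) < 1.96 * slopes.std()))
```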
How Should We Proceed?
• Ask how our guesses of $\beta_0$ and $\beta_1$ are distributed.
• Since the $Y_i$ are distributed normally, all linear estimators are, too:
$\hat\beta_0 \sim N(\beta_0,\ ?)$, $\hat\beta_1 \sim N(\beta_1,\ ?)$
• What are the variances of $\hat\beta_0$ and $\hat\beta_1$?

What Is the Variance of $\hat\beta_1$?
• For a linear estimator $\sum w_i Y_i$ with uncorrelated $Y_i$:
$\mathrm{Var}\!\left(\sum w_i Y_i\right) = \sum w_i^2 \mathrm{Var}(Y_i) = \sigma^2 \sum w_i^2$
• Let $x_i = X_i - \bar X$ and $y_i = Y_i - \bar Y$.

What Is the Variance of $\hat\beta_1$? (cont.)
$\hat\beta_1 = \dfrac{\sum x_i y_i}{\sum x_i^2}$, so $w_i = \dfrac{x_i}{\sum x_i^2}$
$\mathrm{Var}(\hat\beta_1) = \sigma^2 \sum w_i^2 = \sigma^2 \dfrac{\sum x_i^2}{\left(\sum x_i^2\right)^2} = \dfrac{\sigma^2}{\sum x_i^2}$, where $x_i = X_i - \bar X$

What Is the Variance of $\hat\beta_1$? (cont.)
• Thus $\hat\beta_1 \sim N\!\left(\beta_1,\ \dfrac{\sigma^2}{\sum x_i^2}\right)$.
• Suppose we have as our null hypothesis $H_0: \beta_1 = \beta_1^*$. Under the null hypothesis,
$\hat\beta_1 \sim N\!\left(\beta_1^*,\ \dfrac{\sigma^2}{\sum x_i^2}\right)$

Distribution of a Linear Estimator
• We have a formula for the distribution of our estimator. However, this formula is not in a very convenient form. We would really like a distribution whose probabilities we can look up in a common table.

Test Statistics
• A "test statistic" is a statistic that is:
1. readily calculated from the data, and
2. distributed according to a known distribution (under the null hypothesis).
• Using a test statistic, we can compute the probability of observing the data given the null hypothesis.

Test Statistics (cont.)
• Under the null, $\hat\beta_1 \sim N\!\left(\beta_1^*,\ \dfrac{\sigma^2}{\sum x_i^2}\right)$. If we subtract the mean and divide by the standard error, we transform a normal distribution into a standard normal distribution:
$Z = \dfrac{\hat\beta_1 - \beta_1^*}{\sqrt{\sigma^2 / \sum x_i^2}} \sim N(0, 1)$
• We can easily look up the probability that we observe a given value of $Z$ in a standard normal table.

Test Statistics (cont.)
• Suppose we want to test $H_0: \beta_1 \geq 0.70$ against the alternative $H_a: \beta_1 < 0.70$. We replace $\beta_1^*$ with 0.70 and calculate $Z$.
• If $Z < -1.64$, then there is less than a 5% chance we would observe these data if $\beta_1$ really were greater than (or equal to) 0.70.

Estimating $\sigma^2$ (from Chapter 5.1, Chapter 6.4)
• One problem: we cannot compute
$\dfrac{\hat\beta_1 - \beta_1^*}{\sqrt{\sigma^2 / \sum x_i^2}}$
because we cannot observe $\sigma^2$.
• Solution: estimate $\sigma^2$.

Estimating $\sigma^2$ (cont.)
• We need to estimate the variance of the error terms,
$\varepsilon_i = Y_i - \beta_0 - \beta_1 X_{1i} - \dots - \beta_k X_{ki}$
• Problem: we do not observe $\varepsilon_i$ directly.
• Another problem: we do not know $\beta_0, \dots, \beta_k$, so we cannot calculate $\varepsilon_i$ either.
• We can proxy for the error terms using the residuals:
$e_i = Y_i - \hat\beta_0 - \hat\beta_1 X_{1i} - \dots - \hat\beta_k X_{ki} = Y_i - \hat Y_i$

Estimating $\sigma^2$ (cont.)
• Once we have an estimate of the error term, we can calculate an estimate of the variance of the error term. We need to make a "degrees of freedom" correction:
$s^2 = \dfrac{1}{n - k - 1} \sum e_i^2$
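As a concrete illustration of the formulas above (the data are made up, and this is only a sketch), the residuals, the degrees-of-freedom-corrected $s^2$, and the estimated variance of $\hat\beta_1$ can be computed directly:

```python
# Illustrative sketch only: residuals, s^2 with the degrees-of-freedom
# correction, and the estimated variance of beta1-hat.  Data are hypothetical.
import numpy as np

X = np.array([2.0, 4.0, 5.0, 7.0, 8.0, 11.0])      # hypothetical income data
Y = np.array([2.1, 3.5, 4.4, 5.6, 6.2, 8.3])        # hypothetical consumption data
n, k = len(Y), 1                                     # one explanator

x = X - X.mean()
beta1_hat = np.sum(x * (Y - Y.mean())) / np.sum(x**2)
beta0_hat = Y.mean() - beta1_hat * X.mean()
residuals = Y - beta0_hat - beta1_hat * X            # e_i = Y_i - Y_i-hat

s2 = np.sum(residuals**2) / (n - k - 1)              # degrees-of-freedom correction
var_beta1_hat = s2 / np.sum(x**2)                    # estimated Var(beta1-hat)
print("s^2:", s2, "  estimated Var(beta1-hat):", var_beta1_hat)
```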
Estimating $\sigma^2$ (cont.)
• Recall $\mathrm{Var}(\hat\beta_1) = \dfrac{\sigma^2}{\sum x_i^2}$, so $\hat\beta_1 \sim N\!\left(\beta_1,\ \dfrac{\sigma^2}{\sum x_i^2}\right)$.
• Suppose we have as our null hypothesis $H_0: \beta_1 = \beta_1^*$. Under the null hypothesis,
$\hat\beta_1 \sim N\!\left(\beta_1^*,\ \dfrac{\sigma^2}{\sum x_i^2}\right)$

Estimating $\sigma^2$ (cont.)
• Plug in our estimate for $\sigma^2$:
$\hat\beta_1 \sim N\!\left(\beta_1^*,\ \dfrac{s^2}{\sum x_i^2}\right) = N\!\left(\beta_1^*,\ \dfrac{\sum e_i^2}{(n-k-1)\sum x_i^2}\right)$

Standard Error (from Chapter 5.2)
• Remember, the standard deviation of the distribution of our estimator is called the "standard error."
• The smaller the standard error, the more closely your estimates will tend to fall to the mean of the distribution.
• If your estimator is unbiased, a low standard error implies that your estimate is probably "close" to the true parameter value.

t-statistic (from Chapter 7.2)
$\hat t = \dfrac{\hat\beta - \beta^*}{\sqrt{s^2 / \sum x_i^2}} \sim t_{n-k-1}$
• Because we need to estimate the standard error, the t-statistic is NOT distributed as a standard normal. Instead, it follows the t-distribution.
• The t-distribution depends on $n-k-1$. For large $n-k-1$, the t-distribution closely resembles the standard normal.

t-statistic (cont.)
$\hat t = \dfrac{\hat\beta - \beta^*}{s.e.(\hat\beta)} \sim t_{n-k-1}$
• Under the null hypothesis $H_0: \beta_1 = \beta_1^*$, $\hat t \sim t_{n-2}$ (one explanator, so $k = 1$).
• In our earlier example, we could:
1. Replace $\beta_1^*$ with 0.70.
2. Compare $\hat t$ to the "critical value" for which the $t_{n-2}$ distribution has 0.05 of its probability mass lying to the left.
3. There is less than a 5% chance of observing the data under the null if $\hat t$ is below that critical value.

Figure 7.1: Critical Regions for Two-Tailed and One-Tailed t-Statistics with a 0.05 Significance Level and 10 Degrees of Freedom

Significance Level
• We can now calculate the probability of observing the data IF the null hypothesis is true.
• We choose the maximum chance we are willing to risk of accidentally committing a Type I error (rejecting a null hypothesis when it is true).
• This chance is called the "significance level."
• The significance level gives operational meaning to how compelling a case we need to build.

Significance Level (cont.)
• The significance level is the chance of committing a Type I error.
• By historical convention, we usually reject a null hypothesis if we have less than a 5% chance of observing the data under the null hypothesis.
• 5% is the conventional significance level. Analysts also often look at the 1% and 10% levels.
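Continuing the made-up numbers from the earlier sketch, here is a hedged illustration of forming the t-statistic for $H_0: \beta_1 = 0.70$ and looking up the one-sided 5% critical value from the t distribution with $n - k - 1$ degrees of freedom:

```python
# Illustrative sketch only: the t statistic for H0: beta1 = 0.70 and its
# one-sided 5% critical value.  Data are the same hypothetical values as above.
import numpy as np
from scipy import stats

X = np.array([2.0, 4.0, 5.0, 7.0, 8.0, 11.0])
Y = np.array([2.1, 3.5, 4.4, 5.6, 6.2, 8.3])
n, k = len(Y), 1

x = X - X.mean()
beta1_hat = np.sum(x * (Y - Y.mean())) / np.sum(x**2)
beta0_hat = Y.mean() - beta1_hat * X.mean()
s2 = np.sum((Y - beta0_hat - beta1_hat * X)**2) / (n - k - 1)
se_beta1 = np.sqrt(s2 / np.sum(x**2))                 # standard error of beta1-hat

t_hat = (beta1_hat - 0.70) / se_beta1                  # test statistic
t_crit = stats.t.ppf(0.05, df=n - k - 1)               # 5% of mass to the left
print("t-hat:", t_hat, "  critical value:", t_crit, "  reject H0?", t_hat < t_crit)
```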
Critical Region
• We know the distribution of our test statistic under the null hypothesis.
• We can calculate the values of the test statistic for which we would reject the null hypothesis (i.e., values that we would have less than a 5% chance of observing under the null hypothesis).
• These values are called the "critical region."

Figure 7.1: Critical Regions for Two-Tailed and One-Tailed t-Statistics with a 0.05 Significance Level and 10 Degrees of Freedom

Critical Region (cont.)
• Regression packages routinely report estimated coefficients, their estimated standard errors, and the t-statistics associated with the null hypothesis that an individual coefficient equals zero.
• Some programs also report a "p-value" for each estimated coefficient.
• This reported p-value is the smallest significance level for a two-sided test at which one would reject the null that the coefficient is zero.

One-Sided, Two-Sided Tests
• t-tests come in two flavors: one-sided and two-sided.
• Two-sided tests are much more common:
– $H_0: \beta = \beta^*$
– $H_a: \beta \neq \beta^*$
• One-sided tests look at only one side:
– $H_0: \beta \geq \beta^*$
– $H_a: \beta < \beta^*$

One-Sided, Two-Sided Tests (cont.)
• The procedure for one-sided and two-sided tests is very similar. For either test, you construct the same t-statistic:
$\hat t = \dfrac{\hat\beta - \beta^*}{s.e.(\hat\beta)}$
• Once you have your t-statistic, you need to choose a "critical value." The critical value is the boundary point of the critical region. You reject the null hypothesis if your t-statistic lies beyond the critical value, inside the critical region.
• The choice of critical value depends on the type of test you are running.

Critical Value for a 1-Sided Test
• For a one-sided test, you need a critical value such that $\alpha$ of the distribution of the estimator is greater than (or less than) the critical value. $\alpha$ is our significance level (for example, 5%).
• In our CAPM example, we want to test $H_0: \beta_0 \geq 0$ against $H_a: \beta_0 < 0$.
• We need a critical value $t^*$ such that $\alpha$ of the distribution of our estimator lies below $t^*$.

Figure 7.1: Critical Regions for Two-Tailed and One-Tailed t-Statistics with a 0.05 Significance Level and 10 Degrees of Freedom

Critical Value for a 1-Sided Test (cont.)
• For a 5% significance level and a large sample size, $t^* = -1.64$.
• We reject the null hypothesis if $\hat t < -1.64$.

Critical Value for a 2-Sided Test
• For a two-sided test, we need to spread our critical region over both tails.
• We need a critical value $t^*$ such that
– $\alpha/2$ of the distribution is to the right of $t^*$, and
– $\alpha/2$ of the distribution is to the left of $-t^*$.
• Summing both tails, $\alpha$ of the distribution lies beyond either $t^*$ or $-t^*$.

Figure 7.1: Critical Regions for Two-Tailed and One-Tailed t-Statistics with a 0.05 Significance Level and 10 Degrees of Freedom

Critical Value for a 2-Sided Test (cont.)
• For a large sample size, the critical value for a two-sided test at the 5% level is 1.96.
• You reject the null hypothesis if $\hat t > 1.96$ or $\hat t < -1.96$.
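A short sketch of how these critical values, and the corresponding p-values, can be looked up with scipy; the degrees of freedom and the observed t value here are hypothetical:

```python
# Illustrative sketch only: one- and two-sided critical values and p-values
# from the t distribution.  Degrees of freedom and t-hat are hypothetical.
from scipy import stats

alpha, df = 0.05, 120        # with large df these approach -1.64 and +/-1.96
t_hat = -2.10                # hypothetical observed t statistic

t_star_lower = stats.t.ppf(alpha, df)           # one-sided (lower-tail) critical value
t_star_two   = stats.t.ppf(1 - alpha / 2, df)   # two-sided critical value

print("one-sided: reject if t-hat <", t_star_lower, "->", t_hat < t_star_lower)
print("two-sided: reject if |t-hat| >", t_star_two, "->", abs(t_hat) > t_star_two)

# p-values: the smallest significance level at which H0 would be rejected
print("one-sided p-value:", stats.t.cdf(t_hat, df))
print("two-sided p-value:", 2 * stats.t.sf(abs(t_hat), df))
```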
P-values
• The p-value is the smallest significance level at which you could reject the null hypothesis.
• The smaller the p-value, the stricter the significance level at which you can reject the null hypothesis.

P-values (cont.)
• Many statistics packages automatically report the p-value for a two-sided test of the null hypothesis that a coefficient is 0.
• If $p < 0.05$, then you could reject the null that $\beta = 0$ at a significance level of 0.05.
• The coefficient "is significant at the 95% confidence level."

Statistical Significance
• A coefficient is "statistically significant at the 95% confidence level" if we could reject the null that $\beta = 0$ at the 5% significance level.
• In economics, the word "significant" means "statistically significant" unless otherwise qualified.

Performing Tests
• How do we compute these test statistics using our software?

Power
• Type I Error: reject a null hypothesis when it is true.
• Type II Error: fail to reject a null hypothesis when it is false.
• We have devised a procedure based on choosing the probability of a Type I error. What about Type II errors?

Power (cont.)
• The probability that our hypothesis test rejects a null hypothesis when it is false is called the power of the test.
• (1 − Power) is the probability of a Type II error.

Power (cont.)
• If a test has a low probability of rejecting the null hypothesis when that hypothesis is false, we say that the test is "weak" or has "low power."
• The higher the standard error of our estimator, the weaker the test.
• More efficient estimators allow for more powerful tests.

Power (cont.)
• Power depends on the particular $H_a$ you are considering. The closer $H_a$ is to $H_0$, the harder it is to reject the null hypothesis.

Figure 7.2: Distribution of $\hat\beta_s$ for $\beta_s = -2$, 0, and a Little Less Than 0
Figure 7.3: Power Curves for Two-Tailed Tests of $H_0: \beta_s = 0$
Figure SA.12: The Distribution of the t-Statistic Given the Null Hypothesis Is False and the True Coefficient Is +5
Figure SA.13: The t-Statistic's Power When the Sample Size Grows
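To illustrate the idea behind the power curves, here is a minimal Monte Carlo sketch (all values are hypothetical) that estimates the power of a two-sided 5% t-test of $H_0: \beta_1 = 0$ for several true values of $\beta_1$; the rejection rate rises as the truth moves away from the null.

```python
# Illustrative sketch only: Monte Carlo estimate of the power of a two-sided
# 5% t test of H0: beta1 = 0 for several true slopes.  Values are hypothetical.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, reps, sigma = 30, 2_000, 1.0
X = np.linspace(0, 10, n)
x = X - X.mean()
t_crit = stats.t.ppf(0.975, df=n - 2)          # two-sided 5% critical value

def estimated_power(true_beta1):
    rejections = 0
    for _ in range(reps):
        Y = 0.5 + true_beta1 * X + rng.normal(0.0, sigma, n)
        b1 = np.sum(x * (Y - Y.mean())) / np.sum(x**2)
        b0 = Y.mean() - b1 * X.mean()
        s2 = np.sum((Y - b0 - b1 * X)**2) / (n - 2)
        se = np.sqrt(s2 / np.sum(x**2))
        rejections += abs(b1 / se) > t_crit     # test of H0: beta1 = 0
    return rejections / reps

# At true beta1 = 0 the rejection rate is the size (about 0.05); it grows with beta1.
for b in (0.0, 0.02, 0.05, 0.10):
    print(f"true beta1 = {b:.2f}   estimated rejection rate = {estimated_power(b):.3f}")
```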
Review
• To test a null hypothesis, we:
– assume the null hypothesis is true;
– calculate a test statistic, assuming the null hypothesis is true;
– reject the null hypothesis if we would be very unlikely to observe the test statistic under the null hypothesis.

Six Steps to Hypothesis Testing
1. State the null and alternative hypotheses.
2. Choose a test statistic (so far, we have learned the t-test).
3. Choose a significance level, the probability of a Type I error (typically 5%).
4. Find the critical region for the test (for a two-sided t-test at the 5% level in large samples, the critical value is $t^* = 1.96$).
5. Calculate the test statistic:
$\hat t = \dfrac{\hat\beta - \beta^*}{s.e.(\hat\beta)}$
6. Reject the null hypothesis if the test statistic falls within the critical region: is $\hat t > t^*$ or $\hat t < -t^*$?
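To tie the six steps together, here is an end-to-end sketch using simulated data and statsmodels. The data, coefficient values, and the null value of 0.70 are all made up for illustration; the point is only the sequence of steps.

```python
# Illustrative sketch only: the six steps on simulated data with statsmodels.
import numpy as np
from scipy import stats
import statsmodels.api as sm

rng = np.random.default_rng(3)

# Hypothetical consumption-function data (simulated, not real).
income = rng.uniform(20, 100, 200)
consumption = 5.0 + 0.75 * income + rng.normal(0.0, 4.0, 200)

# Steps 1-3: H0: beta1 = 0.70 vs Ha: beta1 != 0.70; t-test; 5% significance level.
X = sm.add_constant(income)
results = sm.OLS(consumption, X).fit()

# Step 4: two-sided 5% critical value from the t distribution with n - k - 1 df.
t_crit = stats.t.ppf(0.975, df=results.df_resid)

# Step 5: calculate the test statistic.
beta1_hat, se_beta1 = results.params[1], results.bse[1]
t_hat = (beta1_hat - 0.70) / se_beta1

# Step 6: reject if the statistic lands in the critical region.
print(f"t-hat = {t_hat:.2f}, critical value = {t_crit:.2f}, "
      f"reject H0? {abs(t_hat) > t_crit}")

# The package also reports, for each coefficient, the t statistic and two-sided
# p-value for the default null H0: beta = 0.
print(results.summary())
```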