Homework #3 is due tomorrow by 5pm. Homework #4 is due Friday, March 7th, by 5pm. Homework #5 is due Friday, March 14th, by 5pm.

Political Science 15
Lecture 13: Hypothesis Testing, Interpretation of Hypothesis Tests

Hypothesis Testing
If the null hypothesis is true, our sample statistic will come from a normal distribution centered on the hypothesized value. If it is "too far" away (in the furthest 5% of the distribution from the center), we reject the null hypothesis as likely to be false.

Calculating a Test Statistic
How do we know whether our sample statistic falls inside or outside the critical values for our hypothesis test? We must calculate a test statistic: in this case, the number of standard deviations our sample statistic lies from the value stated in the null hypothesis. If we know the standard deviation of the sampling distribution, we can calculate a z score:

$z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$

where $\bar{x}$ is the sample statistic, $\mu_0$ is the value under the null hypothesis, $\sigma$ is the standard deviation of the variable, and $n$ is the sample size.

Estimating Population Variance
Using z scores to test our hypotheses relies on the assumption that we know the standard deviation of the variable we are testing. In reality we will not know this. We instead estimate the standard deviation with the square root of $S^2$:

$S^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

We then use a slightly different distribution to account for the fact that we had to estimate $\sigma$. This is the t distribution.

The t Distribution
The t distribution is similar to the normal, but more spread out to account for the additional uncertainty that comes from estimating $\sigma$. This makes our critical values slightly larger. We say the t distribution has $n - 1$ degrees of freedom, where $n$ is the sample size. More degrees of freedom mean more information was used to determine our distribution. As sample size increases, the t distribution approaches the normal.

The t Distribution
[Figure: the t distribution; k = degrees of freedom]

Hypothesis Test with a t Example #1
We hypothesize that the mean level of education in the US is 14 years. $H_0\colon \mu = 14$. $H_A\colon \mu \neq 14$. We calculate the mean level of education in our sample. That mean comes out to 15. We estimate S in our sample to be 2. We have 30 observations.
Our test statistic is a t score: $t = (15 - 14)/(2/\sqrt{30}) = 2.74$. With a level of significance of 5%, our critical values in a t distribution with 29 d.f. are $\pm 2.045$. Our test statistic falls outside this range. Thus, we reject the null hypothesis.

Hypothesis Test with a t Example #2
We hypothesize that IMF loans cause more political instability. $H_0$: regression slope $= 0$. $H_A$: regression slope $\neq 0$. We calculate a regression line. The slope coefficient on IMF loans is 2. We calculate S = 6. Our test statistic is a t score: $t = (2 - 0)/(6/\sqrt{150}) = 4.08$. With a level of significance of 5%, our critical values in a t distribution with 149 d.f. are $\pm 1.98$. Our test statistic falls outside this range. Thus, we reject the null hypothesis. IMF loans do seem to have a positive effect on political instability.

Hypothesis Testing with $H_0 = 0$
If we reject $H_0 = 0$, we say that variable is statistically significant; that is, we can reject the null hypothesis that it has no effect. This is not the same as substantively significant. Something can be statistically significant but have a tiny effect on the dependent variable. Most statistical programs (including SPSS) will automatically perform a t test on each coefficient in the regression, using 0 as the null hypothesis.

Hypothesis Testing in Practice
We've now seen the basics of hypothesis testing: setting up null and alternative hypotheses, calculating a test statistic, and determining whether to reject or fail to reject the null hypothesis based on this test statistic. Now we will see how hypothesis tests are actually used in the social sciences. We will focus mostly on regression.

Hypothesis Testing in Regression
In most cases we are testing whether a relationship is positive or negative, so we test the coefficients in a regression with $H_0 = 0$. Most statistical programs (including SPSS) will automatically perform a t test on each coefficient in the regression, using 0 as the null hypothesis.
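The two worked t tests earlier in the lecture (mean education, then the IMF slope) can be checked numerically. The sketch below is only an illustration and is not part of the lecture: the function names are invented for this example, and the t distribution is integrated numerically with the Python standard library instead of being read from a table or from SPSS.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry around 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_critical(df, alpha=0.05):
    """Two-sided critical value c with P(|T| > c) = alpha, by bisection."""
    lo, hi = 0.0, 50.0
    target = 1 - alpha / 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_statistic(estimate, null_value, s, n):
    """(estimate - null value) divided by the estimated standard error s / sqrt(n)."""
    return (estimate - null_value) / (s / math.sqrt(n))


# (label, estimate, null value, S, n) taken from the two worked examples
examples = [("Example #1", 15, 14, 2, 30), ("Example #2", 2, 0, 6, 150)]
for label, estimate, null_value, s, n in examples:
    t = t_statistic(estimate, null_value, s, n)
    crit = t_critical(n - 1)  # the lecture uses n - 1 degrees of freedom
    verdict = "reject H0" if abs(t) > crit else "fail to reject H0"
    print(f"{label}: t = {t:.2f}, critical values = +/-{crit:.3f}, {verdict}")
```

Both examples should come out as rejections, matching the slides; the critical values are computed rather than looked up, so they may differ from table values in the last printed digit.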
If we reject $H_0 = 0$ for the coefficient on a variable, we say that variable is statistically significant; that is, we can reject the hypothesis that it has no effect.

Standard Errors
The standard error of a sample statistic is just our estimate of the standard deviation of the sampling distribution of that statistic. For regression coefficients it is calculated as: [formula missing from the source]

Standard Errors in Regression
The standard error on a regression coefficient will grow smaller both as sample size increases and as the variance of that coefficient's variable increases.
[Figure: same n, but the s.e. for the squares will be smaller]

Hypothesis Test in a Regression Example #1
We hypothesize that IMF loans cause more political instability. $H_0$: regression slope $= 0$. $H_A$: regression slope $\neq 0$. We calculate a regression line. The slope coefficient on IMF loans is 2, with a standard error of 1. Our test statistic is a t score. It is known as a t-ratio since it boils down to just the coefficient over the standard error: $t = (2 - 0)/1 = 2/1 = 2$. With a level of significance of 5%, our critical values in a t distribution with 149 d.f. (N = 150) are $\pm 1.98$. Our test statistic falls outside this range. Thus, we reject the null hypothesis. IMF loans do seem to have a positive effect on political instability.

Hypothesis Test in a Regression Example #2
We hypothesize that IMF loans cause more political instability. $H_0$: regression slope $= 0$. $H_A$: regression slope $\neq 0$. We calculate a regression line. The slope coefficient on IMF loans is 2, with a standard error of 3. Our t-ratio is $2/3 = 0.67$. With a level of significance of 5%, our critical values in a t distribution with 149 d.f. (N = 150) are $\pm 1.98$. Our test statistic falls within this range. Thus, we fail to reject the null hypothesis. We cannot rule out the possibility that IMF loans have no effect on political instability.

p values
Statistical software and journal articles will often report a p value for a sample statistic.
The p value tells you the probability, if the null hypothesis were true, of observing a sample statistic at least as far from the null hypothesis as the one you actually observed. Hypothesis testing can be done by comparing the p value to the level of significance you want for your test. A p value of less than 0.05 usually means you reject the null hypothesis at the 5% level.

Graphical example of a p value

Example of hypothesis testing in SPSS
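As a software-independent check of the two regression examples above (this is not the SPSS output the slide refers to), the two-sided p values for the t-ratios 2/1 and 2/3 can be computed directly. This is a minimal sketch under stated assumptions: `two_sided_p_value` is a hypothetical helper written for this illustration, it integrates the t density numerically with the standard library, and it uses the 149 degrees of freedom quoted in the lecture.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_ratio, df, steps=2000):
    """P(|T| >= |t_ratio|) when the null hypothesis is true and T ~ t(df)."""
    a = abs(t_ratio)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # P(-a <= T <= a), using symmetry
    return 1 - central_area


# (label, slope coefficient, standard error) from the regression examples
for label, coef, se in [("Example #1", 2, 1), ("Example #2", 2, 3)]:
    t_ratio = coef / se
    p = two_sided_p_value(t_ratio, df=149)
    decision = "reject H0" if p < 0.05 else "fail to reject H0"
    print(f"{label}: t-ratio = {t_ratio:.2f}, p = {p:.3f}, {decision}")
```

Comparing each p value to 0.05 gives the same decisions as comparing the t-ratio to the critical values: the first example falls just below 0.05 and the second falls far above it.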