10 November 2003

6.2 Tests of Significance
6.3 Use and Abuse of Tests
6.4 Power and Inference as a Decision

Tests of Significance
• The reasoning of significance tests
• Stating hypotheses
• Test statistics
• p values
• Statistical significance
• Tests for a population mean
• Two-sided tests versus confidence intervals
• p values versus fixed alpha

Reasoning of significance tests
1 Make a statement (the null hypothesis) about some unknown population parameter.
2 Assuming the null hypothesis is true, what is the probability of obtaining data such as yours?
3 If the probability of the data is small, then reject the null hypothesis.

Example 6.6
• Tim believes that his “true weight” is 187 pounds.
• Let’s assume that if Tim weighed himself over and over, the weights would have an approximately normal distribution with σ = 3.
• Tim weighs himself once a week for four weeks. The average of these four measurements is 190.5.
• Are the data consistent with Tim’s belief, or is Tim fooling himself?

Example 6.6
Null hypothesis: mu = 187
$$P(\bar{x} \ge 190.5) = P\left(\frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \ge \frac{190.5 - 187}{3/\sqrt{4}}\right) = P(z \ge 2.333) \approx .01$$
We reject the null hypothesis because, if it is true, there is only about a 1% chance of obtaining the data we have.

Stating hypotheses
Null hypothesis
• About the population, not the sample
• H0 or NH
• “Nothing interesting is happening”
Alternative hypothesis
• Ha
• What a researcher thinks is happening
• May be one- or two-sided

Test statistics
The test statistic, such as the sample mean, is the information we use to make the decision to reject or keep the null hypothesis.
Usually, the null hypothesis tells us how the test statistic would be distributed if the null hypothesis is true, and if we drew lots and lots of samples at random from the population.

p values
If the null hypothesis is true, what is the probability that we would see data such as ours?
This probability, loosely written P(data|H0), is called the p value. More precisely, it is the probability, computed assuming H0 is true, of data at least as extreme as ours.
If our sample mean is very different from what the null hypothesis says the population mean is, then the p value will be small (because our data will be unusual, or surprising).

Statistical significance
When you do a hypothesis test, you must decide how small the p value must be to lead you to reject the null hypothesis.
It is very common that researchers reject H0 if the p value is less than .05. Sometimes values of .01 or .10 are used.
This arbitrary threshold is called the alpha level.

Tests for a population mean
Example 6.12
Null hypothesis: mu = 450
Alternative hypothesis: mu > 450
(Assume population is approximately normal with standard deviation of 100.)
We have a sample of 500 students whose average score is 461.
$$P(\bar{x} \ge 461) = P\left(\frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \ge \frac{461 - 450}{100/\sqrt{500}}\right) = P(z \ge 2.46) \approx .0069$$
We reject H0 (because if it is true, then our sample mean is unusually large).

Example 6.12
(Figure: histogram of means of samples of size 500 if mu = 450)
We reject the null hypothesis because sample means of 461 or larger have a very small probability. (We expect such large means less than 1% of the time.)

Two-sided significance tests and confidence intervals
A two-sided significance test which uses the .05 alpha level corresponds to a 95% confidence interval.
That is, if the hypothesized population mean is outside of the 95% confidence interval, then the p value for the hypothesis test will be less than .05.
The same holds for a 90% CI and α = .10, etc.

p values versus fixed alpha
• In many journal articles you will see statements such as “the null hypothesis was rejected at the .05 level of significance.”
• It’s more informative to report the p value.
  For example, “the null hypothesis was rejected (p = .032).”

Use & Abuse of Tests
• Choosing a level of significance
• What statistical significance doesn’t mean
• Don’t ignore lack of significance
• Statistical inference is not valid for all sets of data
• Beware of searching for significance

Power and Inference
• Power
• Increasing the power
• Inference as decision
• Two types of error
• Error probabilities
• The common practice of testing hypotheses

Two types of error
                     the null hypothesis is actually
the test says        true                  false
“reject NH”          Type I Error          correct decision
“keep NH”            correct decision      Type II Error

Power
When you do a certain hypothesis test, the probability that the test will reject the null hypothesis when the null hypothesis is in fact false is called the power of that test.
Power is a function of
• The alpha level
• What μ really is
• The size of the sample
• The standard deviation of the population

Increasing the power
• Increase the alpha level (from .05 to .10, for example)
• Try to make μ really different from the null-hypothesis value
• Increase the size of the sample
• Try to reduce the standard deviation of the population

Inference as decision
SKIP THIS SECTION

Error probabilities
When the null hypothesis is true: P(Type I Error) = alpha
When the null hypothesis is false: P(Type II Error) = beta, so power = 1 − beta

The common practice of testing hypotheses
SKIP THIS SECTION

Homework
6.2: 32, 37, 41, 44
6.3: 72, 78
6.4: 84, 85, 88
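
Checking the examples by computer
The z calculations in Examples 6.6 and 6.12, and the idea of power, can be checked with a few lines of Python. This is only a sketch under the slides' assumptions (normal population, known standard deviation, one-sided alternative mu > mu0). The function names are made up for illustration, and the "true" mean of 460 in the power calculation is a hypothetical value that does not come from the slides.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_statistic(sample_mean, mu0, sigma, n):
    """Standardize the sample mean, assuming the null hypothesis mu = mu0."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


def upper_tail_p_value(sample_mean, mu0, sigma, n):
    """One-sided p value for the alternative mu > mu0 (sigma known)."""
    z = z_statistic(sample_mean, mu0, sigma, n)
    return 1.0 - STD_NORMAL.cdf(z)


def power_upper_tail(mu0, mu_true, sigma, n, alpha):
    """Chance of rejecting H0: mu = mu0 (vs. mu > mu0) if mu is really mu_true."""
    # Smallest sample mean that leads to rejection at the given alpha level.
    cutoff = mu0 + STD_NORMAL.inv_cdf(1.0 - alpha) * sigma / sqrt(n)
    # Sampling distribution of the sample mean when mu = mu_true.
    sampling_dist = NormalDist(mu_true, sigma / sqrt(n))
    return 1.0 - sampling_dist.cdf(cutoff)


# Example 6.6: Tim's weight (mu0 = 187, sigma = 3, n = 4, sample mean 190.5)
print("Example 6.6 z:", z_statistic(190.5, 187, 3, 4))
print("Example 6.6 p:", upper_tail_p_value(190.5, 187, 3, 4))

# Example 6.12: test scores (mu0 = 450, sigma = 100, n = 500, sample mean 461)
print("Example 6.12 z:", z_statistic(461, 450, 100, 500))
print("Example 6.12 p:", upper_tail_p_value(461, 450, 100, 500))

# Power for a HYPOTHETICAL true mean of 460 (illustrative value only)
print("Power, n = 500:", power_upper_tail(450, 460, 100, 500, 0.05))
print("Power, n = 1000:", power_upper_tail(450, 460, 100, 1000, 0.05))
```

The printed p values should agree with the slides (about .01 and about .0069), and the last two lines illustrate that a larger sample gives more power when everything else is held fixed.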