* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download 1.017 Class 10: Common Distributions
Survey
Document related concepts
Transcript
1.017/1.010 Class 16 Testing Hypotheses about a Single Population Formulating Hypothesis Testing Problems Hypotheses about a random variable x are often formulated in terms of its distributional properties. Example, if property is a: Null hypothesis H0: a = a0 Objective of hypothesis testing is to decide whether or not to reject this hypothesis. Decision is based on estimator â of a: Reject H0: If observed estimate â lies in rejection region Ra0 ( aˆ Ra 0 ) Do not reject H0: Otherwise ( aˆ Ra 0 ) Select rejection region to obtain desired error properties: Test Result Do not reject H0 aˆ Ra 0 Reject H0 aˆ Ra 0 H0 true P(H0|H0) =1 - P(~H0|H0) = (Type I Error) H0 false P(H0|~H0) = (Type II Error) P(~H0|~H0) = 1- True situation Type I error probability is called the test significance level. Deriving Hypothesis Rejection Regions for Large Sample Tests Hypothesis test is often based on a standardized statistic that depends on unknown true property and its estimate. Basic concepts are the same as used to derive confidence intervals (see Class 14). An example is the z statistic: z (aˆ , a) aˆ - a SD[aˆ ] If the estimate is unbiased E[z] = 0 and Var[z] = 1. 1 Define a rejection region Rz0 in terms of z as: R z 0 : z (aˆ , a0 ) z L z (aˆ , a0 ) zU As rejection region grows Type I error increases and Type II error decreases (test is more likely to reject hypothesis). As rejection region shrinks Type I error decreases and Type II error increases (test is less likely to reject hypothesis) Usual practice is to select rejection region to insure that Type I error probability is equal to a specified value For a two-sided test require that Type I error probability is distributed equally between intervals below zL (probability = /2) and above zU (probability = /2). These probabilities are: P[ z (aˆ , a) z L | H0] P[ z (aˆ , a0 ) z L ] Fz ( z L ) 2 P[ z (aˆ , a) zU | H0] P[ z (aˆ , a0 ) zU ] 1 Fz ( zU ) z L Fz-1 2 2 zU Fz-1 1 2 For large samples z (aˆ , a0 ) has a unit normal distribution. Use the MATLAB function norminv to evaluate Fz-1 If the definition of z is applied a two-sided rejection region Ra0 can also be written directly in terms of the estimate â : Ra 0 : aˆ a L a0 Fz-1 SD[aˆ ] 2 aˆ aU a0 Fz-1 1 SD[aˆ ] 2 p Values p value is largest significance level resulting in acceptance of H0. For a symmetric two-sided rejection region and a large sample: 2 aˆ a0 p / 2 1 Fz SD(aˆ ) aˆ a0 p / 2 Fz SD(aˆ ) aˆ a aˆ a0 0 For large samples use the MATLAB function normcdf to compute p from â and SD[ â ]. Special Case -- Sample mean Consider hypothesis about value of population mean a = E[x]: H0: a = E[x] = a0 Base test on sample mean estimator mx. Obtain SD[mx] from sample standard deviation: SD[m x ] SD[ x ] N sx N Example: Testing whether mean is significantly different from zero Suppose a0 = 0, sx = 3, N = 9, mx = 1.2 and = .05: 0.05 3 Ra 0 : m x a L 0 Fz-1 1.96 2 9 0.05 3 m x aU 0 Fz-1 1 1.96 2 9 In this case hypothesis is not rejected since mx = 1.2 does not lie in Ra0. The two-sided p-value is (see plot): m a0 1 p / 2 Fz x Fz s / N x p = 0.22 3 1.2 0 Fz 1.2 .89 3 / 9 1-p/2 z Copyright 2003 Massachusetts Institute of Technology Last modified Oct. 8, 2003 4