PHP 2510
Hypothesis testing: One sample

We have discussed methods of point and interval estimation for a parameter of interest. Researchers often have preconceived ideas about what the parameter might be and wish to test whether the data conform to these ideas.

PHP 2510 – October 29, 2009

Example: Testing a hypothesis about a mean

Suppose that the average birthweight of full-term, live-born infants is 120 oz. A researcher hypothesizes that mothers with low socioeconomic status (SES) deliver babies whose birthweights are lower than “normal”. Two hypotheses are considered:

• The average birthweight of these newborns is 120 oz.
• The average birthweight of these newborns is lower than 120 oz.

Procedure for hypothesis testing

1. State your hypothesis about a parameter of interest. This is usually constructed in terms of a null hypothesis and an alternative hypothesis.
   Example: Null hypothesis: the mean is 120 oz. Alternative hypothesis: the mean is lower than 120 oz.
2. Collect data and compute an estimate of the parameter.
3. Draw a conclusion based on whether the estimate is close to the null hypothesis value.

The primary methodological issue is: how do we define ‘close’? That comes later.

Back to the birthweight example

Let $\mu$ denote the average birthweight of those newborns.

Null hypothesis: $\mu = 120$ oz
Alternative hypothesis: $\mu < 120$ oz

Conducting the test

After collecting the data and testing the hypothesis, we conclude by accepting or rejecting the null hypothesis. In this case, if we accept that $\mu = 120$, we say that those newborns have normal birthweight. If we reject, we conclude that those newborns have lower birthweight, and the associated risks should be investigated further.

Four possible outcomes can occur:

(1) We accept the null hypothesis when the null hypothesis is in fact true.
(2) We reject the null hypothesis when the null hypothesis is in fact true.
In this case, we say that we make a Type I error.
(3) We accept the null hypothesis when the alternative hypothesis is in fact true. In this case, we say that we make a Type II error.
(4) We reject the null hypothesis when the alternative hypothesis is in fact true.

Testing hypothesis with data

Consider the birthweight example. Collect data on 100 such newborns, and find the following:

$$\bar{X} = \hat{\mu} = 115, \qquad S = \hat{\sigma} = 24$$

To test the hypothesis, figure out how far $\bar{X}$ is from the null hypothesis value.

• If it is far away, reject the null
• If it is close, do not reject

What do we mean by ‘close’?

1. Testing hypotheses with a confidence interval

We can draw some conclusions by forming a confidence interval. A 95% interval is

$$\bar{X} \pm 1.96 \times \hat{\sigma}/\sqrt{n}$$

$$115 \pm 1.96 \times 24/10 \;\Rightarrow\; (110.3,\ 119.7)$$

Use the interval to draw a conclusion about the null hypothesis that $\mu = 120$.

2. Testing hypotheses with a test statistic

The usual approach:

1. State a null hypothesis about a parameter of interest (e.g. $\mu = 120$ oz).
2. Decide on a statistic that will estimate $\mu$ (like $\bar{X}$).
3. Characterize the random variation of your statistic under the assumption that the null hypothesis is true (e.g. if the true value of $\mu$ is 120, what is the distribution of $\bar{X}$?). This is the crucial step.
4. Collect data and compute a value for the statistic.
5. Compare the observed value of your statistic to its distribution under the null hypothesis.
   (a) If the observed value is consistent with the distribution of the statistic under the hypothesis, accept the hypothesis.
   (b) If not, reject the hypothesis.

Example: Test hypothesis about birthweight

Before we collect data, we need to characterize the distribution of $\bar{X}$ if the null hypothesis is true. The null hypothesis mean is usually denoted by $\mu_0$.
Using the central limit theorem, we know that for any sample mean, approximately,

$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$

If the null hypothesis is true, then $\mu = \mu_0 = 120$, and we expect that

$$\bar{X} \sim N\left(120, \frac{\sigma^2}{n}\right)$$

Equivalently, we expect that

$$\frac{\bar{X} - 120}{\sigma/\sqrt{n}} \sim N(0, 1)$$

Accept or reject?

Once I observe $\bar{X}$, I need to decide whether to accept or reject the hypothesis. Remember that the hypothesis is either true or false! I will make the decision based on $\bar{X}$.

The question I will ask is, ‘If the true mean is 120, what is the probability of observing $\bar{X}$, or something farther from 120 than $\bar{X}$?’ If the probability is low, then I reject the null. The threshold for ‘low’ can vary, depending on the setting.

Carrying out the test

The probability of interest is calculated using the null distribution of $\bar{X}$; that is, its distribution when the null hypothesis is true. The standardized distance from $\bar{X}$ to $\mu_0$ is

$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{115 - 120}{24/\sqrt{100}} = -2.08$$

The associated probability is $P(Z < -2.08) = 0.02$. This is a one-sided p-value because the only alternatives we consider are in the downward direction (i.e., lower birthweight than the nationwide average).

The two-sided p-value is $P(Z < -2.08) + P(Z > 2.08) = 0.02 + 0.02 = 0.04$.

[Figure: Histogram of Sample Means under mu = 120, n = 100. Horizontal axis: sample mean, from 110 to 130; vertical axis: Frequency.]

Types of Hypotheses

Simple vs simple

$H_0: \mu = \mu_0$ vs $H_1: \mu = \mu_1$.

e.g.
$H_0: \mu = 3$ vs $H_1: \mu = 5$.
$H_0: \mu = 3$ vs $H_1: \mu = 1$.
$H_0$: average birthweight is $\mu = 120$ vs $H_1$: average birthweight is $\mu = 110$.

Simple vs composite

$H_0: \mu = \mu_0$ vs $H_1: \mu > \mu_0$.
$H_0: \mu = \mu_0$ vs $H_1: \mu < \mu_0$.
$H_0: \mu = \mu_0$ vs $H_1: \mu \neq \mu_0$.

e.g.
$H_0: \mu = 5$ vs $H_1: \mu < 5$.
$H_0$: average birthweight is $\mu = 120$ vs $H_1$: average birthweight is $\mu < 120$.

We commonly see this type of hypothesis.
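The birthweight calculation under ‘Carrying out the test’ can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name `one_sample_z` is hypothetical, and it assumes, as the slides do, that the sample SD of 24 can stand in for $\sigma$ because $n = 100$ is large.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(xbar, mu0, sigma, n):
    """Return (z, lower-tail p-value, two-sided p-value) for a one-sample z test.

    Hypothetical helper for illustration; sigma is treated as known.
    """
    z = (xbar - mu0) / (sigma / sqrt(n))      # standardized distance from mu0
    std_normal = NormalDist()                  # N(0, 1)
    p_lower = std_normal.cdf(z)                # P(Z < z), for H1: mu < mu0
    p_two_sided = 2 * std_normal.cdf(-abs(z))  # both tails, for H1: mu != mu0
    return z, p_lower, p_two_sided


# Birthweight example: xbar = 115, mu0 = 120, sigma-hat = 24, n = 100
z, p_lower, p_two_sided = one_sample_z(xbar=115, mu0=120, sigma=24, n=100)
print(f"z = {z:.2f}")
print(f"one-sided p-value = {p_lower:.3f}")
print(f"two-sided p-value = {p_two_sided:.3f}")
```

The unrounded p-values come out slightly below the slide values of .02 and .04, which were rounded to two decimals.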
Type I error rate

$$\alpha = \text{Type I error rate} = P(\text{reject null hypothesis} \mid \text{null is true})$$

A Type I error can be made only if the null hypothesis is true. We can set $\alpha$ to be 0.01, 0.05, 0.10, ... The most commonly used value is 0.05.

We have control over $\alpha$. Why? Recall that we assume $H_0$ is true before drawing a conclusion.

One-tailed test

A one-tailed test is one that locates the rejection region in only one tail of the sampling distribution of the test statistic.

To detect $H_1: \mu > \mu_0$, place the rejection region in the upper tail of the distribution of $\bar{X}$ under $H_0$.

To detect $H_1: \mu < \mu_0$, place the rejection region in the lower tail of the distribution of $\bar{X}$ under $H_0$.

Consider the case $H_0: \mu = \mu_0$ versus $H_1: \mu < \mu_0$. $H_0$ will be rejected for small $\bar{X}$. But how small? Suppose that $H_0$ is rejected for all values of $\bar{X} < c$.

$$\begin{aligned}
\alpha &= P(\text{reject null} \mid \text{null is true}) \\
&= P\left(\bar{X} < c \mid \bar{X} \sim N(\mu_0, \sigma^2/n)\right) \\
&= P\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < \frac{c - \mu_0}{\sigma/\sqrt{n}} \;\middle|\; \bar{X} \sim N(\mu_0, \sigma^2/n)\right) \\
&= P\left(Z < \frac{c - \mu_0}{\sigma/\sqrt{n}}\right)
\end{aligned}$$

Let $z_\alpha$ satisfy $P(Z > z_\alpha) = \alpha$. Then

$$\frac{c - \mu_0}{\sigma/\sqrt{n}} = -z_\alpha \quad\Rightarrow\quad c = \mu_0 - z_\alpha\, \sigma/\sqrt{n}$$

Two-tailed test

A two-tailed test is one that locates the rejection regions in both tails of the sampling distribution of the test statistic.

To detect $H_1: \mu \neq \mu_0$, place the rejection region in both the upper and lower tails of the distribution of $\bar{X}$ under $H_0$.

To test $H_0: \mu = \mu_0$ versus $H_1: \mu \neq \mu_0$, $H_0$ will be rejected for small $\bar{X} < c_1$ or large $\bar{X} > c_2$. In this case

$$\begin{aligned}
\alpha &= P(\text{reject null} \mid \text{null is true}) \\
&= P\left(\bar{X} < c_1 \text{ or } \bar{X} > c_2 \mid \bar{X} \sim N(\mu_0, \sigma^2/n)\right)
\end{aligned}$$

Usually, $\alpha$ is split equally between the two tails:

$$\begin{aligned}
\alpha/2 &= P\left(\bar{X} < c_1 \mid \bar{X} \sim N(\mu_0, \sigma^2/n)\right) \\
&= P\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < \frac{c_1 - \mu_0}{\sigma/\sqrt{n}} \;\middle|\; \bar{X} \sim N(\mu_0, \sigma^2/n)\right) \\
&= P\left(Z < \frac{c_1 - \mu_0}{\sigma/\sqrt{n}}\right)
\end{aligned}$$

Let $z_{\alpha/2}$ satisfy $P(Z > z_{\alpha/2}) = \alpha/2$. Then

$$c_1 = \mu_0 - z_{\alpha/2}\, \sigma/\sqrt{n}$$

Similarly,

$$c_2 = \mu_0 + z_{\alpha/2}\, \sigma/\sqrt{n}$$

p-value

Alternatively, we can draw conclusions using a p-value.
The p-value can be thought of as the probability, under the null hypothesis, of a result as extreme as or more extreme than the one actually observed.

Example: A certain brand of cigarettes is advertised by the manufacturer as having a mean nicotine content of 15 mg/cigarette. A sample of 200 cigarettes is tested by a lab and found to have an average of 16.2 mg of nicotine with SD = 3.6. Using a 0.01 level of significance, can we conclude that the actual mean nicotine content of this brand is greater than 15 mg?

Type II error rate

$$\beta = \text{Type II error rate} = P(\text{accept null hypothesis} \mid \text{alternative is true})$$

In general, we do not have control over the Type II error rate $\beta$. Why? So, we say “we fail to reject the null hypothesis” instead of “we accept the null”.

When we test “simple vs simple” hypotheses, we can determine the Type II error rate.

Simple vs simple hypotheses

Consider $H_0: \mu = \mu_0$ vs $H_1: \mu = \mu_1$.
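Two of the items above lend themselves to a quick numerical check: the cigarette example, and the claim that the Type II error rate can be determined for simple vs simple hypotheses. The sketch below is an illustration under stated assumptions, not the slides' own solution. For the cigarette data it treats the sample SD of 3.6 as $\sigma$ (reasonable for $n = 200$) and uses an upper-tailed test. For $\beta$ it reuses the earlier birthweight pair $H_0: \mu = 120$ vs $H_1: \mu = 110$ with $\sigma = 24$, $n = 100$, and assumes a one-sided $\alpha = 0.05$, a level the slides do not fix for that example. All variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()  # N(0, 1)

# --- Cigarette example: H0: mu = 15 vs H1: mu > 15, alpha = 0.01 ---
# Assumption: the sample SD (3.6) is used in place of sigma.
n, xbar, mu0, sd, alpha = 200, 16.2, 15.0, 3.6, 0.01
se = sd / sqrt(n)                          # standard error of the mean
z_cig = (xbar - mu0) / se                  # observed test statistic
z_alpha = std_normal.inv_cdf(1 - alpha)    # satisfies P(Z > z_alpha) = alpha
cutoff = mu0 + z_alpha * se                # reject H0 when xbar > cutoff
p_upper = std_normal.cdf(-z_cig)           # P(Z > z_cig), by symmetry
reject = z_cig > z_alpha

print(f"z = {z_cig:.2f}, critical value = {z_alpha:.2f}, cutoff = {cutoff:.2f}")
print(f"upper-tail p-value = {p_upper:.2e}, reject H0: {reject}")

# --- Type II error rate: H0: mu = 120 vs H1: mu = 110 ---
# Assumptions: sigma = 24, n = 100, one-sided alpha = 0.05.
mu0_b, mu1_b, sigma_b, n_b, alpha_b = 120.0, 110.0, 24.0, 100, 0.05
se_b = sigma_b / sqrt(n_b)
c = mu0_b - std_normal.inv_cdf(1 - alpha_b) * se_b  # reject H0 when xbar < c
# beta = P(fail to reject | mu = mu1) = P(xbar >= c) when xbar ~ N(mu1, se_b^2)
beta = 1 - NormalDist(mu1_b, se_b).cdf(c)

print(f"c = {c:.2f}, beta = {beta:.4f}")
```

Under these assumptions the cigarette statistic is about 4.71, well beyond the 0.01-level critical value of about 2.33, so the null would be rejected. In the birthweight pair, $\beta$ comes out at roughly 0.006, small because the two hypothesized means are more than four standard errors apart.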