Hypothesis Testing (Testing with Two Samples-III)
QSCI 381 – Lecture 32 (Larson and Farber, Sects 8.3 – 8.4)

Independent and Dependent Samples

Two samples are independent if the sample selected from one population is not related to the sample selected from the second population. The two samples are dependent if each member of one sample corresponds to a member of the other sample. Dependent samples are also called paired or matched samples.

Examples

Which are independent and which are dependent samples?
* 25 fish in each of two ponds are weighed.
* Weights of 25 fish in a pond on two successive days.
* Weights and lengths of 30 fish.
* Heights of 25 males and 25 females.

t-test for the Difference Between Means-I (Conditions)

A t-test can be used to test the difference of two population means when a sample is randomly selected from each population, provided that:
* The samples are randomly selected.
* The samples are dependent (paired).
* Both populations are normally distributed.

t-test for the Difference Between Means-II

The approaches in lectures 30 and 31 only apply to independent samples. Dependent data are analyzed by considering the difference for each pair:

$$d_i = x_{1,i} - x_{2,i}$$

The test statistic is the mean difference:

$$\bar{d} = \frac{1}{n} \sum_i d_i$$

t-test for the Difference Between Means-III

The test statistic is:

$$\bar{d} = \frac{1}{n} \sum_i d_i$$

and the standardized test statistic is:

$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}, \qquad s_d = \sqrt{\frac{n \left( \sum_i d_i^2 \right) - \left( \sum_i d_i \right)^2}{n(n-1)}}$$

$\mu_d$ is the hypothesized mean of the differences of the paired data in the population, and d.f. = $n - 1$.

Example-I

You are evaluating a program that aims to recover degraded streams. The data available are “environmental scores” before and after the recovery program. Prior to the start of the recovery program, the contractors claimed that the “environmental score” would increase by an average of more than 5 points. Evaluate the claim at the 5% level of significance.

Example-II

1. $H_0: \mu_d \le 5$; $H_a: \mu_d > 5$.
2. The level of significance is 0.05, d.f. = 15 - 1 = 14, and we have a right-tailed test.
The rejection region is therefore $t > 1.76$ (1.761 to three decimals).
3. The standard deviation of the differences, $s_d$, is given by:

$$s_d = \sqrt{\frac{n \left( \sum_i d_i^2 \right) - \left( \sum_i d_i \right)^2}{n(n-1)}} = 3.622$$

4. The standardized test statistic is:

$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{5.82 - 5}{3.622 / \sqrt{15}} = 0.877$$

5. We fail to reject the null hypothesis, because 0.877 does not exceed the critical value.

Note that $\bar{d} > 5$, yet we still fail to reject the null hypothesis. Why?

Constructing a c-confidence Interval

To construct a confidence interval for $\mu_d$, use the following inequality:

$$\bar{d} - t_c \frac{s_d}{\sqrt{n}} < \mu_d < \bar{d} + t_c \frac{s_d}{\sqrt{n}}$$

Construct a 90% confidence interval for $\mu_d$ for Example I. With 14 d.f., $t_c = 1.761$:

$$5.82 - 1.761 \, \frac{3.622}{\sqrt{15}} < \mu_d < 5.82 + 1.761 \, \frac{3.622}{\sqrt{15}}$$

$$4.173 < \mu_d < 7.467$$

Two sample z-test for the difference between proportions-I

We can test the difference between two population proportions $p_1$ and $p_2$ based on samples from each population. We can use the z-test if the following conditions are true:
* The samples are randomly selected.
* The samples are independent.
* The sample sizes are large enough to use a normal sampling distribution, i.e.:

$$n_1 p_1 \ge 5; \quad n_1 q_1 \ge 5; \quad n_2 p_2 \ge 5; \quad n_2 q_2 \ge 5$$

Two sample z-test for the difference between proportions-II

The sampling distribution for $\hat{p}_1 - \hat{p}_2$, the difference between the sample proportions, is a normal distribution with mean difference:

$$\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2$$

and standard error:

$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$

The standard error can be approximated by:

$$\sigma_{\hat{p}_1 - \hat{p}_2} \approx \sqrt{\bar{p}\,\bar{q} \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}, \qquad \bar{p} = \frac{x_1 + x_2}{n_1 + n_2}, \quad \bar{q} = 1 - \bar{p}$$

Two sample z-test for the difference between proportions-III

1. State $H_0$ and $H_a$.
2. Identify the level of significance $\alpha$ and find the critical value(s) and rejection region(s).
3. Find the weighted estimate of $p_1$ and $p_2$:

$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2}$$

4. Calculate the standardized test statistic:

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\,\bar{q} \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$$

5. Make a decision to reject or fail to reject the null hypothesis.

Example-I

One expectation of creating a marine reserve is that the fraction of “large” fish should increase. 100 fish are sampled from each of two areas (one a reserve and another actively fished).
Test whether the fractions of “large” fish in the reserve and in the fished area differ at the 1% level of significance.

                            Reserve      Non-reserve
Sample size                 n1 = 100     n2 = 100
Number of “large” fish      x1 = 17      x2 = 6

Example-II

$H_0: p_1 = p_2$; $H_a: p_1 \ne p_2$.

$\alpha = 0.01$; the rejection region is $|z| > 2.576$.

The weighted proportion estimate is:

$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{17 + 6}{200} = 0.115$$

The standardized test statistic is:

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\,\bar{q} \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{(0.17 - 0.06) - 0}{\sqrt{0.115 \times 0.885 \times (1/100 + 1/100)}} = 2.438$$

Since $|z| = 2.438 < 2.576$, we fail to reject the null hypothesis at the 1% level of significance.

Constructing a c-confidence Interval

To construct a confidence interval for $p_1 - p_2$, use the following inequality:

$$(\hat{p}_1 - \hat{p}_2) - z_c \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}} < p_1 - p_2 < (\hat{p}_1 - \hat{p}_2) + z_c \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$$

Construct a 95% confidence interval for $p_1 - p_2$ for Example I.

$$0.11 - 1.96 \sqrt{\frac{0.17 \times 0.83}{100} + \frac{0.06 \times 0.94}{100}} < p_1 - p_2 < 0.11 + 1.96 \sqrt{\frac{0.17 \times 0.83}{100} + \frac{0.06 \times 0.94}{100}}$$

$$0.023 < p_1 - p_2 < 0.197$$

Review

Are the samples independent?
  No: use the t-test for dependent samples.
  Yes: are both samples large?
    Yes: use the z-test for large independent samples.
    No: are both populations normal?
      No: cannot use any of the tests.
      Yes: are both population standard deviations known?
        Yes: use the z-test.
        No: use a t-test for small independent samples.
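
The arithmetic in the two worked examples of this lecture (the paired t-test on the stream scores and the two-proportion z-test on the reserve data) can be checked with a short script. What follows is only a minimal sketch: the function names are illustrative and do not come from the lecture, the inputs are the summary values quoted in the examples, and the critical values (1.761, 2.576, 1.96) are assumed to come from table lookups rather than being computed.

```python
import math


def paired_t_statistic(d_bar, s_d, n, mu_d):
    """Standardized test statistic t = (d_bar - mu_d) / (s_d / sqrt(n))."""
    return (d_bar - mu_d) / (s_d / math.sqrt(n))


def paired_confidence_interval(d_bar, s_d, n, t_c):
    """c-confidence interval for mu_d: d_bar +/- t_c * s_d / sqrt(n)."""
    margin = t_c * s_d / math.sqrt(n)
    return d_bar - margin, d_bar + margin


def two_proportion_z(x1, n1, x2, n2, hypothesized_diff=0.0):
    """z statistic using the weighted (pooled) estimate p_bar."""
    p_hat1, p_hat2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)
    q_bar = 1.0 - p_bar
    standard_error = math.sqrt(p_bar * q_bar * (1.0 / n1 + 1.0 / n2))
    return (p_hat1 - p_hat2 - hypothesized_diff) / standard_error


def two_proportion_confidence_interval(x1, n1, x2, n2, z_c):
    """c-confidence interval for p1 - p2 using the unpooled standard error."""
    p_hat1, p_hat2 = x1 / n1, x2 / n2
    standard_error = math.sqrt(
        p_hat1 * (1.0 - p_hat1) / n1 + p_hat2 * (1.0 - p_hat2) / n2
    )
    diff = p_hat1 - p_hat2
    return diff - z_c * standard_error, diff + z_c * standard_error


if __name__ == "__main__":
    # Stream-recovery example: summary values quoted in the lecture.
    t = paired_t_statistic(d_bar=5.82, s_d=3.622, n=15, mu_d=5.0)
    t_critical = 1.761  # assumed table value: right tail, alpha = 0.05, 14 d.f.
    print(f"t = {t:.3f}, reject H0: {t > t_critical}")
    print("90% CI for mu_d:", paired_confidence_interval(5.82, 3.622, 15, 1.761))

    # Marine-reserve example: 17 of 100 versus 6 of 100 "large" fish.
    z = two_proportion_z(x1=17, n1=100, x2=6, n2=100)
    z_critical = 2.576  # assumed table value: two-tailed, alpha = 0.01
    print(f"z = {z:.3f}, reject H0: {abs(z) > z_critical}")
    print("95% CI for p1 - p2:",
          two_proportion_confidence_interval(17, 100, 6, 100, z_c=1.96))
```

The test statistic uses the pooled estimate because the null hypothesis assumes a common proportion, whereas the confidence interval uses the separate sample proportions, matching the two different standard-error formulas in the lecture.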