Hypothesis Testing (Testing with Two Samples-III)
QSCI 381 – Lecture 32 (Larson and Farber, Sects 8.3 – 8.4)

Independent and Dependent Samples

- Two samples are independent if the sample selected from one population is not related to the sample selected from the second population.
- The two samples are dependent if each member of one sample corresponds to a member of the other sample. Dependent samples are also called paired or matched samples.

Examples

Which are independent and dependent samples?

- 25 fish in each of two ponds are weighed.
- Weights of 25 fish in a pond on two successive days.
- Weights and lengths of 30 fish.
- Heights of 25 males and 25 females.

t-test for the Difference Between Means-I (Conditions)

A t-test can be used to test the difference of two population means when a sample is randomly selected from each population. The conditions are:

- The samples must be randomly selected.
- The samples must be dependent (paired).
- Both populations must be normally distributed.

t-test for the Difference Between Means-II

- The approaches in lectures 30 and 31 only apply to independent samples.
- Dependent data are analyzed by considering the difference for each pair:

$$d_i = x_{1,i} - x_{2,i}$$

- The test statistic is the mean difference:

$$\bar{d} = \frac{1}{n} \sum_i d_i$$

t-test for the Difference Between Means-III

The test statistic is:

$$\bar{d} = \frac{1}{n} \sum_i d_i$$

and the standardized test statistic is:

$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}, \qquad s_d = \sqrt{\frac{n\left(\sum d_i^2\right) - \left(\sum d_i\right)^2}{n(n-1)}}$$

$\mu_d$ is the hypothesized mean of the differences of the paired data in the population. d.f. = n - 1.

Example-I

You are evaluating a program that aims to recover degraded streams. The data available are “environmental scores” before and after the recovery program. Prior to the start of the recovery program, the contractors claimed that the “environmental score” would increase by an average of more than 5 points. Evaluate the claim at the 5% level of significance.

Example-II

1. $H_0: \mu_d \le 5$; $H_a: \mu_d > 5$.
2. The level of significance is 0.05, the d.f. = 15 - 1 = 14, and we have a right-tailed test. The rejection region is therefore t > 1.76.
3. The standard deviation of the differences, $s_d$, is given by:

$$s_d = \sqrt{\frac{n\left(\sum d_i^2\right) - \left(\sum d_i\right)^2}{n(n-1)}} = 3.622$$

4. The standardized test statistic is:

$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{5.82 - 5}{3.622 / \sqrt{15}} = 0.877$$

5. We fail to reject the null hypothesis.

Note that $\bar{d} > 5$, but we still fail to reject the null hypothesis. Why?

Constructing a c-confidence Interval

To construct a confidence interval for $\mu_d$, use the following inequality:

$$\bar{d} - t_c \frac{s_d}{\sqrt{n}} < \mu_d < \bar{d} + t_c \frac{s_d}{\sqrt{n}}$$

Construct a 90% confidence interval for $\mu_d$ for Example I.

$$5.82 - 1.76 \cdot \frac{3.622}{\sqrt{15}} < \mu_d < 5.82 + 1.76 \cdot \frac{3.622}{\sqrt{15}}$$

$$4.173 < \mu_d < 7.467$$

Two sample z-test for the difference between proportions-I

We can test the difference between two population proportions $p_1$ and $p_2$ based on samples from each population. We can use the z-test if the following conditions are true:

- The samples are randomly selected.
- The samples are independent.
- The sample sizes are large enough to use a normal sampling distribution assumption, i.e.:

$$n_1 p_1 \ge 5; \quad n_1 q_1 \ge 5; \quad n_2 p_2 \ge 5; \quad n_2 q_2 \ge 5$$

Two sample z-test for the difference between proportions-II

The sampling distribution for $\hat{p}_1 - \hat{p}_2$, the difference between the sample proportions, is a normal distribution with mean difference:

$$\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2$$

and standard error:

$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$

The standard error can be approximated by:

$$\sigma_{\hat{p}_1 - \hat{p}_2} \approx \sqrt{\bar{p}\,\bar{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}, \qquad \bar{p} = \frac{x_1 + x_2}{n_1 + n_2}, \quad \bar{q} = 1 - \bar{p}$$

Two sample z-test for the difference between proportions-III

1. State $H_0$ and $H_a$.
2. Identify $\alpha$ and find the critical value(s) and rejection region(s).
3. Find the weighted estimate of $p_1$ and $p_2$:

$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2}$$

4. Calculate the standardized test statistic:

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\,\bar{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

5. Make a decision to reject or fail to reject the null hypothesis.
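The two standardized test statistics covered so far, the paired t and the two-proportion z, can be sketched in a few lines of Python. This is only an illustrative sketch working from summary values: the function names `paired_t` and `two_proportion_z` are assumptions made here, not part of the lecture, and the critical value is copied from the slides rather than computed. The counts passed to `two_proportion_z` are hypothetical and serve only to exercise the function.

```python
import math


def paired_t(d_bar, s_d, n, mu_d=0.0):
    """Standardized statistic for paired differences; compare with t on n - 1 d.f."""
    return (d_bar - mu_d) / (s_d / math.sqrt(n))


def two_proportion_z(x1, n1, x2, n2, null_diff=0.0):
    """Standardized statistic for p1 - p2 using the weighted estimate p_bar."""
    p_hat1, p_hat2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # weighted estimate of p1 and p2
    q_bar = 1.0 - p_bar
    std_err = math.sqrt(p_bar * q_bar * (1.0 / n1 + 1.0 / n2))
    return (p_hat1 - p_hat2 - null_diff) / std_err


# Stream-recovery example: d_bar = 5.82, s_d = 3.622, n = 15, H0: mu_d <= 5.
t_stat = paired_t(5.82, 3.622, 15, mu_d=5.0)
t_crit = 1.76  # right-tailed critical value quoted in the lecture (alpha = 0.05, 14 d.f.)
print(f"t = {t_stat:.3f}, reject H0: {t_stat > t_crit}")  # t = 0.877, reject H0: False

# Hypothetical counts (not from the lecture), just to exercise the function.
z_stat = two_proportion_z(x1=30, n1=120, x2=18, n2=110)
print(f"z = {z_stat:.3f}")
```

Working from summary statistics keeps the sketch independent of the raw paired scores. With the raw differences available, `d_bar` and `s_d` could be computed first with `statistics.mean` and `statistics.stdev`.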
Example-I

One expectation of creating a marine reserve is that the fraction of “large” fish should increase. 100 fish are sampled from each of two areas (one a reserve and another actively fished). Test whether the fraction of “large” fish in the reserve and the fished area differ at the 1% level of significance.

Reserve        Non-reserve
n1 = 100       n2 = 100
x1 = 17        x2 = 6

Example-II

- $H_0: p_1 = p_2$; $H_a: p_1 \ne p_2$.
- $\alpha = 0.01$; rejection region $|z| > 2.576$.
- The weighted proportion estimate is:

$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{17 + 6}{200} = 0.115$$

- The standardized test statistic:

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\,\bar{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(0.17 - 0.06) - 0}{\sqrt{0.115 \times 0.885 \times (1/100 + 1/100)}} = 2.438$$

- We fail to reject the null hypothesis at the 1% level of significance.

Constructing a c-confidence Interval

To construct a confidence interval for $p_1 - p_2$, use the following inequality:

$$(\hat{p}_1 - \hat{p}_2) - z_c \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}} < p_1 - p_2 < (\hat{p}_1 - \hat{p}_2) + z_c \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$$

Construct a 95% confidence interval for $p_1 - p_2$ for Example I.

$$0.11 - 1.96 \sqrt{\frac{0.17 \times 0.83}{100} + \frac{0.06 \times 0.94}{100}} < p_1 - p_2 < 0.11 + 1.96 \sqrt{\frac{0.17 \times 0.83}{100} + \frac{0.06 \times 0.94}{100}}$$

$$0.023 < p_1 - p_2 < 0.197$$

Review

- Are the samples independent?
  - No: Use t-test for dependent samples.
  - Yes: Are both samples large?
    - Yes: Use z-test for large independent samples.
    - No: Are both populations normal?
      - No: Cannot use any of the tests.
      - Yes: Are both population standard deviations known?
        - Yes: Use z-test.
        - No: Use a t-test for small independent samples.
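Two pieces of this last part lend themselves to a short Python sketch: the confidence interval for the difference in proportions and the review decision chart. The function names `two_proportion_ci` and `choose_test` and the boolean argument names are assumptions introduced here for illustration. The chart is encoded as read from the review slide, and the value 1.96 is the 95% critical value quoted in the lecture.

```python
import math


def two_proportion_ci(x1, n1, x2, n2, z_c):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    margin = z_c * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - margin, (p1 - p2) + margin


def choose_test(independent, both_large=False, both_normal=False, sigmas_known=False):
    """Walk the review chart and return the name of the suggested test."""
    if not independent:
        return "t-test for dependent samples"
    if both_large:
        return "z-test for large independent samples"
    if not both_normal:
        return "cannot use any of the tests"
    if sigmas_known:
        return "z-test"
    return "t-test for small independent samples"


# Reserve example: 17 of 100 large fish in the reserve, 6 of 100 in the fished area.
low, high = two_proportion_ci(17, 100, 6, 100, z_c=1.96)
print(f"{low:.3f} < p1 - p2 < {high:.3f}")  # 0.023 < p1 - p2 < 0.197

# Paired before/after scores are dependent samples.
print(choose_test(independent=False))  # t-test for dependent samples
```

The interval uses the separate sample proportions in its standard error, whereas the test statistic uses the weighted estimate, because the pooled form is only justified under the null hypothesis that the two proportions are equal.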
