Chapter 17: Two-Sample Problems
BPS - 3rd Ed. Chapter 17

Two-Sample Problems

The goal of inference is to compare the responses to two treatments or to compare the characteristics of two populations. We have a separate sample from each treatment or each population.
– Each sample is separate. The units are not matched, and the samples can be of differing sizes.

Case Study: Exercise and Pulse Rates

A study is performed to compare the mean resting pulse rate of adult subjects who regularly exercise to the mean resting pulse rate of those who do not regularly exercise.

                 n    mean   std. dev.
Exercisers      29     66      8.6
Nonexercisers   31     75      9.0

This is an example of when to use the two-sample t procedures.

Conditions for Comparing Two Means

We have two independent SRSs, from two distinct populations
– that is, one sample has no influence on the other; matching violates independence
– we measure the same variable for both samples.
Both populations are Normally distributed
– the means and standard deviations of the populations are unknown
– in practice, it is enough that the distributions have similar shapes and that the data have no strong outliers.

Two-Sample t Procedures

In order to perform inference on the difference of two means (μ1 − μ2), we'll need the standard deviation of the observed difference x̄1 − x̄2:

$$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

Problem: We don't know the population standard deviations σ1 and σ2.
Solution: Estimate them with the sample standard deviations s1 and s2. The result is called the standard error, or estimated standard deviation, of the difference in the sample means.

$$SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Two-Sample t Confidence Interval

Draw an SRS of size n1 from a Normal population with unknown mean μ1, and draw an independent SRS of size n2 from another Normal population with unknown mean μ2.
A confidence interval for μ1 − μ2 is:

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

– here t* is the critical value for confidence level C for the t(k) density curve. The degrees of freedom k are equal to the smaller of n1 − 1 and n2 − 1.

Case Study: Exercise and Pulse Rates

Find a 95% confidence interval for the difference in population means (nonexercisers minus exercisers).

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (75 - 66) \pm 2.048 \sqrt{\frac{(9.0)^2}{31} + \frac{(8.6)^2}{29}} = 9 \pm 4.65$$

That is, 4.35 to 13.65.

"We are 95% confident that the difference in mean resting pulse rates (nonexercisers minus exercisers) is between 4.35 and 13.65 beats per minute."

Two-Sample t Significance Tests

Draw an SRS of size n1 from a Normal population with unknown mean μ1, and draw an independent SRS of size n2 from another Normal population with unknown mean μ2. To test the hypothesis H0: μ1 = μ2, the test statistic is:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

The second form follows because μ1 − μ2 = 0 under H0. Use P-values for the t(k) density curve. The degrees of freedom k are equal to the smaller of n1 − 1 and n2 − 1.

P-value for Testing Two Means

Ha: μ1 > μ2
– P-value is the probability of getting a value as large or larger than the observed test statistic (t) value.
Ha: μ1 < μ2
– P-value is the probability of getting a value as small or smaller than the observed test statistic (t) value.
Ha: μ1 ≠ μ2
– P-value is two times the probability of getting a value as large or larger than the absolute value of the observed test statistic (t) value.

Case Study: Exercise and Pulse Rates

Is the mean resting pulse rate of adult subjects who regularly exercise different from the mean resting pulse rate of those who do not regularly exercise?

Null: The mean resting pulse rate of adult subjects who regularly exercise is the same as the mean resting pulse rate of those who do not regularly exercise.
[H0: μ1 = μ2]
Alt: The mean resting pulse rate of adult subjects who regularly exercise is different from the mean resting pulse rate of those who do not regularly exercise.
[Ha: μ1 ≠ μ2]
Degrees of freedom k = 28 (smaller of 31 − 1 and 29 − 1).

Case Study

1. Hypotheses: H0: μ1 = μ2, Ha: μ1 ≠ μ2
2. Test Statistic:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{75 - 66}{\sqrt{\frac{(9.0)^2}{31} + \frac{(8.6)^2}{29}}} = 3.961$$

3. P-value:
– P-value = 2P(T > 3.961) = 0.000207 (using a computer)
– P-value is smaller than 2(0.0005) = 0.0010 since t = 3.961 is greater than t* = 3.674 (upper tail area = 0.0005) (Table C)
4. Conclusion: Since the P-value is smaller than α = 0.001, there is very strong evidence that the mean resting pulse rates are different for the two populations (nonexercisers and exercisers).

Robustness of t Procedures

The two-sample t procedures are more robust than the one-sample t methods, particularly when the distributions are not symmetric.
When the two populations have similar distribution shapes, the probability values from the t table are quite accurate, even when the sample sizes are as small as n1 = n2 = 5.
When the two populations have different distribution shapes, larger samples are needed.
In planning a two-sample study, it is best to choose equal sample sizes. In this case, the probability values are most accurate.

Using the t Procedures

Except in the case of small samples, the assumption that each sample is an independent SRS from the population of interest is more important than the assumption that the two population distributions are Normal.
– Small sample sizes (n1 + n2 < 15): Use t procedures if each data set appears close to Normal (symmetric, single peak, no outliers). If a data set is skewed or if outliers are present, do not use t.
– Medium sample sizes (n1 + n2 ≥ 15): The t procedures can be used except in the presence of outliers or strong skewness in a data set.
– Large samples: The t procedures can be used even for clearly skewed distributions when the sample sizes are large, roughly n1 + n2 ≥ 40.

Details of t Degrees of Freedom

Using degrees of freedom k as the smaller of n1 − 1 and n2 − 1 is only a rough approximation to the actual degrees of freedom for the two-sample t procedures.
A better approximation that is used by software uses a function of the sample sizes and sample standard deviations to compute degrees of freedom df.
Use of df in the two-sample t procedures gives more accurate results than when using k.

Case Study: Exercise and Pulse Rates

Compute the degrees of freedom df used by software to analyze these data using two-sample t procedures.

$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2} = \frac{\left(\frac{9.0^2}{31} + \frac{8.6^2}{29}\right)^2}{\frac{1}{30}\left(\frac{9.0^2}{31}\right)^2 + \frac{1}{28}\left(\frac{8.6^2}{29}\right)^2} = 57.97$$

This is the degrees of freedom value used by software when computing critical values and P-values.

Comparing Two Population Standard Deviations

Suppose we have independent SRSs from two Normal populations with unknown means and standard deviations: a sample of size n1 from N(μ1, σ1) and a sample of size n2 from N(μ2, σ2).
To test the hypothesis of equal spread,
H0: σ1 = σ2
Ha: σ1 ≠ σ2,
use the ratio of sample variances, called the F statistic:

$$F = \frac{s_1^2}{s_2^2}$$

Since s1 and s2 each have associated degrees of freedom (n1 − 1 and n2 − 1, respectively), F has two types of degrees of freedom:
– Numerator degrees of freedom: j = n1 − 1
– Denominator degrees of freedom: k = n2 − 1

Characteristics of F Distributions

– The F distributions are right-skewed.
– F(j, k) is the notation for the F distribution with j numerator degrees of freedom and k denominator degrees of freedom.
– The area under the density curve equals 1.
– Since the F statistic takes on only positive values, the F distribution has no probability below 0.
– The peak of the F density curve is near 1. If σ1 = σ2, we would expect s1 and s2 to be close in size, yielding an F statistic near 1
– values of F far from 1 provide evidence against the null hypothesis of equal standard deviations.

[Figure: Density curve for the F(9, 10) distribution.]

Using Table D: F Distribution Critical Values

Table D gives upper tail critical points of the F distribution for upper tail areas p = 0.100, 0.050, 0.025, 0.010, and 0.001. For example, the critical points for the F(15, 20) distribution are:

p     0.100   0.050   0.025   0.010   0.001
F*    1.84    2.20    2.57    3.09    4.56

Since the F distributions are not symmetric about 0 (as are the standard Normal and t distributions), and Table D only gives upper tail areas, we shall eliminate the need for lower-tail critical values by always computing F as the larger s² divided by the smaller s².

F Test Statistics

If you do not use statistical software, arrange the two-sided F test as follows:
– Compute F as the larger s² divided by the smaller s², so that F is at least 1.
– Compare F with the Table D critical values for F(j, k), then double the upper tail area to obtain the two-sided P-value.

Case Study: Reading Ability

Two methods of teaching children to read are being investigated to see if one provides more consistent improvement in reading ability across students. Method 1 is implemented in one randomly chosen class of 16 students, while another class of 16 students is randomly chosen to represent Method 2. The standard deviations of the improvement in reading ability scores (as measured by a standardized test) are s1 = 3.905 and s2 = 3.186. Test to see if the standard deviations differ.

Degrees of freedom: j = 15 and k = 15.

p     0.100   0.050   0.025   0.010   0.001
F*    1.97    2.40    2.86    3.52    5.54

Case Study

1. Hypotheses: H0: σ1 = σ2, Ha: σ1 ≠ σ2
2. Test Statistic:

$$F = \frac{s_1^2}{s_2^2} = \frac{(3.905)^2}{(3.186)^2} = \frac{15.249}{10.151} = 1.502$$

3. P-value:
– P-value = 2P(F > 1.502) = 0.4398 (using a computer)
– P-value is larger than 2(0.100) = 0.200 since F = 1.502 is smaller than F* = 1.97 (upper tail area = 0.100) (Table D)
4. Conclusion: Since the P-value is larger than α = 0.200, there is not strong evidence that the standard deviations differ for the two teaching methods; thus, the data do not show that either method gives more consistent improvement.
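
The arithmetic in the two case studies can be checked with a short Python sketch. This is only an illustration under stated assumptions: the function names (two_sample_t, conservative_df, software_df, f_statistic) are hypothetical and not part of the text, only the standard library is used, and the critical values t* = 2.048 and F* = 1.97 are copied from the slides rather than computed.

```python
import math


def two_sample_t(mean1, s1, n1, mean2, s2, n2):
    """Return (difference, standard error, t statistic) for H0: mu1 = mu2."""
    diff = mean1 - mean2
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return diff, se, diff / se


def conservative_df(n1, n2):
    """Degrees of freedom k: the smaller of n1 - 1 and n2 - 1."""
    return min(n1, n2) - 1


def software_df(s1, n1, s2, n2):
    """Degrees of freedom df computed from sample sizes and sample SDs."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


def f_statistic(s_a, s_b):
    """F statistic: larger sample variance divided by the smaller one."""
    return max(s_a, s_b) ** 2 / min(s_a, s_b) ** 2


# Pulse rates: sample 1 = nonexercisers, sample 2 = exercisers.
diff, se, t = two_sample_t(75, 9.0, 31, 66, 8.6, 29)
k = conservative_df(31, 29)
t_star = 2.048  # t(28) critical value for 95% confidence, taken from the slides
ci = (diff - t_star * se, diff + t_star * se)
df = software_df(9.0, 31, 8.6, 29)

# Reading ability: compare F with the Table D value for upper tail area 0.100.
F = f_statistic(3.905, 3.186)
f_star = 1.97  # F(15, 15) critical value, taken from the slides
significant_at_0_20 = F > f_star  # two-sided test at level 2(0.100) = 0.200

print(f"t = {t:.3f}, k = {k}, df = {df:.2f}")
print(f"95% CI: {ci[0]:.2f} to {ci[1]:.2f}")
print(f"F = {F:.3f}, significant at 0.20: {significant_at_0_20}")
```

The printed values should agree with the slides up to rounding. Exact P-values such as 0.000207 and 0.4398 require the t and F distribution functions from statistical software, which this sketch deliberately leaves out.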