Lecture 19, Nov 15.
Math/Stat 352 Lecture 19
Sections 6.5 and 6.7: Large Sample Tests for the difference of two means and Small Sample Tests for the difference of two means: independent samples

Independent and dependent samples

Two samples are independent if the sample values selected from one population are not related to, paired with, or matched with the sample values selected from the other population. Examples: weights of students at different universities, test results of students in different towns, yields on different fields, etc.

Two samples are dependent (or consist of matched pairs) if the members of one sample can be used to determine the members of the other sample. Examples: test results for students before and after a study session, weights of a group of people before and after a weight loss program, predicted and true maximum temperatures for several days in a given month in Reno, etc.

Large Sample Tests for the Difference Between Two Means: Independent samples

Now we are interested in determining whether or not the means of two populations are equal. The data consist of two samples, one from each population. The samples are independent.

IDEA FOR THE TEST: We compute the difference of the sample means. If the difference is far from 0, we conclude that the population means are different. If the difference is close to 0, we conclude that the population means might be the same; in other words, there is not enough evidence to conclude that they differ significantly.

COMPARING MEANS: INDEPENDENT LARGE SAMPLES

FRAMEWORK:
1st sample: x1, x2, …, xm from a population with mean μx;
2nd sample: y1, y2, …, yn from a population with mean μy.

GOAL: Determine whether μx - μy is equal to, greater than, or less than 0 or, more generally, a given value ∆.

Test Ho: μx - μy = ∆ (or ≥ ∆, or ≤ ∆) vs Ha: μx - μy ≠ ∆ (or < ∆, or > ∆, respectively).

∆ is the hypothesized difference between the two means. Examples: Ho: μx - μy = 0, here ∆ = 0; or Ho: μx - μy ≥ 2, here ∆ = 2.

Some example questions:
Are the mean heights of students at UNR and UNLV the same?
Is the difference between the mean heights of Americans and Chinese larger than 2 inches?

COMPARING MEANS: INDEPENDENT LARGE SAMPLES or σx², σy² known

CASE 1: Large samples, or σx², σy² known.

STEP 1. Ho: μx - μy = ∆ (or ≥ ∆, or ≤ ∆) vs Ha: μx - μy ≠ ∆ (or < ∆, or > ∆, respectively), significance level α.

STEP 2. Test statistic:

$$ z = \frac{(\bar{x} - \bar{y}) - \Delta}{\sqrt{\sigma_x^2/n_x + \sigma_y^2/n_y}} $$

where n_x = m and n_y = n are the two sample sizes. Under Ho, the test statistic has a standard normal distribution.

STEP 3. Critical value: zα for a one-sided test, zα/2 for a two-sided test.

STEP 4. Decision: the critical (rejection) region depends on Ha.
Ha: μx - μy ≠ ∆: reject Ho if |z| > zα/2;
Ha: μx - μy > ∆: reject Ho if z > zα;
Ha: μx - μy < ∆: reject Ho if z < -zα.

STEP 5. Answer the question in the problem.

Note: If the samples are large and σx and σy are not known, substitute sx and sy in the formula for the test statistic. Also use this test for small samples when the populations are normal and σx², σy² are known.

COMPARING MEANS: INDEPENDENT SMALL SAMPLES

CASE 2: σx and σy not known, but assumed equal.

STEP 2. Test statistic:

$$ t = \frac{(\bar{x} - \bar{y}) - \Delta}{s_p \sqrt{\dfrac{1}{n_x} + \dfrac{1}{n_y}}} $$

where s_p² is a pooled estimate of the common variance:

$$ s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2} $$

Under Ho, the test statistic has a t distribution with df = n_x + n_y - 2.

STEP 3. Critical value: tα for a one-sided test, tα/2 for a two-sided test.

STEP 4. Decision: the critical (rejection) region depends on Ha.
Ha: μx - μy ≠ ∆: reject Ho if |t| > tα/2;
Ha: μx - μy > ∆: reject Ho if t > tα;
Ha: μx - μy < ∆: reject Ho if t < -tα.

COMPARING MEANS: INDEPENDENT SMALL SAMPLES

CASE 3: σx and σy not known, and may not be assumed equal.

STEP 2. Test statistic:

$$ t = \frac{(\bar{x} - \bar{y}) - \Delta}{\sqrt{\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}}} $$

Under Ho, the test statistic has approximately a t distribution whose degrees of freedom may be approximated by

$$ \nu = \frac{\left(\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}\right)^2}{\dfrac{(s_x^2/n_x)^2}{n_x - 1} + \dfrac{(s_y^2/n_y)^2}{n_y - 1}} $$

STEP 3. Critical value: tα for a one-sided test, tα/2 for a two-sided test.

STEP 4. Decision: the critical (rejection) region depends on Ha.
Ha: μx - μy ≠ ∆: reject Ho if |t| > tα/2;
Ha: μx - μy > ∆: reject Ho if t > tα;
Ha: μx - μy < ∆: reject Ho if t < -tα.

EXAMPLE

A medication for blood pressure was administered to a group of 13 randomly selected patients with elevated blood pressure, while a group of 15 was given a placebo. At the end of 3 months, the following data were obtained on their systolic blood pressure.

Control group, x: n_x = 15, sample mean = 180, s = 50
Treated group, y: n_y = 13, sample mean = 150, s = 30

Test whether the treatment has been effective. Assume the variances are the same in both groups and use α = 0.01.

Soln. Let μx = mean blood pressure for the control group; μy = mean blood pressure for the treatment group. Then n_x = 15, x̄ = 180, sx = 50, n_y = 13, ȳ = 150, sy = 30. Equality of the variances (standard deviations) is assumed: σx = σy, so this is CASE 2.

EXAMPLE contd.

STEP 1. Ho: μx = μy (medicine not effective) vs Ha: μx > μy (medicine effective).

STEP 2. Pooled variance:

$$ s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2} = \frac{(15 - 1)50^2 + (13 - 1)30^2}{15 + 13 - 2} = 1761.54 $$

Pooled standard deviation: s_p = \sqrt{1761.54} = 41.97.

Test statistic:

$$ t = \frac{\bar{x} - \bar{y}}{s_p \sqrt{\dfrac{1}{n_x} + \dfrac{1}{n_y}}} = \frac{180 - 150}{41.97 \sqrt{\dfrac{1}{15} + \dfrac{1}{13}}} = 1.8863 $$

STEP 3. Critical value = t0.01 = 2.479, df = 26.

STEP 4. t = 1.8863 is not greater than 2.479, so do not reject Ho.

STEP 5. There is not enough evidence to conclude that the medicine is effective.

Example

Sample statistics are shown for the distances of the home runs hit in record-setting seasons by Mark McGwire and Barry Bonds. Use a 0.05 significance level to test the claim that the distances come from populations with different means.

        McGwire   Bonds
n       70        73
x̄       418.5     403.7
s       45.5      30.6

Soln. Let μx = mean distance for McGwire; μy = mean distance for Bonds.
CASE 3: σx and σy are not known and cannot be assumed equal.

EXAMPLE contd.

STEP 1. Ho: μx = μy (same mean distances) vs Ha: μx ≠ μy (different mean distances).

STEP 2. Test statistic:

$$ t = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}}} = \frac{418.5 - 403.7}{\sqrt{\dfrac{45.5^2}{70} + \dfrac{30.6^2}{73}}} = 2.273 $$

STEP 3. Critical value = t0.025 = 1.980, df = 120.

STEP 4. t = 2.273 > 1.980, so reject Ho.

STEP 5.
There is enough evidence to conclude that the mean distances of the home runs for the two players are different.

NOTE: Since the sample sizes are large, we could use the z-test. Then zα/2 = z0.025 = 1.96, and the conclusion is the same.

Blood pressure EXAMPLE contd.

Construct a 95% CI for the difference in the mean blood pressures of the two groups (μx - μy).

Soln. We already know n_x = 15, x̄ = 180, sx = 50, n_y = 13, ȳ = 150, sy = 30, s_p = 41.97.
CASE 2. For a 95% CI, α = 0.05, so α/2 = 0.025 and t0.025 with df = 26 is 2.056.
The 95% CI is:

$$ (180 - 150) \pm (2.056)(41.97)\sqrt{\frac{1}{13} + \frac{1}{15}} = (-2.7,\ 62.7) $$

NOTE: The interval contains zero. Intuitively, this is consistent with our decision: there is not enough evidence of a difference in mean blood pressure between the medicine and the placebo groups.

Example

An article compares properties of welds made using carbon dioxide as a shielding gas with those of welds made using a mixture of argon and carbon dioxide. One property studied was the diameter of inclusions, which are particles embedded in the weld. A sample of 544 inclusions in welds made using argon shielding averaged 0.37 µm in diameter, with a standard deviation of 0.25 µm. A sample of 581 inclusions in welds made using carbon dioxide shielding averaged 0.40 µm in diameter, with a standard deviation of 0.26 µm. Can you conclude that the mean diameters of inclusions differ between the two shielding gases?

MINITAB: Two-Sample T-Test and CI (output for the Web design example that follows; Sample 1 = conventional design, Sample 2 = new structured design)

Sample   N    Mean    StDev   SE Mean
1        10   32.30   8.56    2.7
2        10   44.1    10.1    3.2

Difference = mu (1) - mu (2)
Estimate for difference: -11.80
95% upper bound for difference: -4.52
T-Test of difference = 0 (vs <): T-Value = -2.82  P-Value = 0.006  DF = 17

Example

Good website design can make Web navigation easier. An article presents a comparison of item recognition between two designs. A sample of 10 users using a conventional Web design averaged 32.3 items identified, with a standard deviation of 8.56. A sample of 10 users using a new structured Web design averaged 44.1 items identified, with a standard deviation of 10.09.
Can we conclude that the mean number of items identified is greater with the new structured design?

MINITAB: Two-Sample T-Test and CI (output for the weld inclusion example; Sample 1 = argon shielding, Sample 2 = carbon dioxide shielding)

Sample   N     Mean    StDev   SE Mean
1        544   0.370   0.250   0.011
2        581   0.400   0.260   0.011

Difference = mu (1) - mu (2)
Estimate for difference: -0.0300
95% CI for difference: (-0.0598, -0.0002)
T-Test of difference = 0 (vs not =): T-Value = -1.97  P-Value = 0.049  DF = 1122

Example

Two methods have been developed to determine the nickel content of steel. In a sample of five replications of the first method, X, on a certain kind of steel, the average measurement (in percent) was 3.16, with a standard deviation of 0.042. The average of seven replications of the second method, Y, was 3.24, and the standard deviation was 0.048. Assume that it is known that the population variances are nearly equal. Can we conclude that there is a difference in the mean measurements between the two methods?

MINITAB: Two-Sample T-Test and CI

Sample   N   Mean     StDev    SE Mean
1        5   3.1600   0.0420   0.019
2        7   3.2400   0.0480   0.018

Difference = mu (1) - mu (2)
Estimate for difference: -0.0800
95% CI for difference: (-0.1396, -0.0204)
T-Test of difference = 0 (vs not =): T-Value = -2.99  P-Value = 0.014  DF = 10
Both use Pooled StDev = 0.0457
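
The test statistics for Cases 1 to 3 can be sketched in a few lines of Python, using only the standard library. This is a minimal illustration, not part of the lecture: the function names (`z_statistic`, `pooled_t`, `welch_t`) and their argument names are made up for this sketch, and critical values are still assumed to come from a z or t table as in the worked examples.

```python
import math


def z_statistic(xbar, ybar, var_x, var_y, n_x, n_y, delta=0.0):
    """Case 1: large samples or known variances.

    var_x and var_y are the population variances, or the squared sample
    standard deviations substituted for them when the samples are large.
    """
    return ((xbar - ybar) - delta) / math.sqrt(var_x / n_x + var_y / n_y)


def pooled_t(xbar, ybar, s_x, s_y, n_x, n_y, delta=0.0):
    """Case 2: unknown but equal variances. Returns (t, df)."""
    df = n_x + n_y - 2
    # Pooled estimate of the common variance.
    sp2 = ((n_x - 1) * s_x**2 + (n_y - 1) * s_y**2) / df
    t = ((xbar - ybar) - delta) / math.sqrt(sp2 * (1 / n_x + 1 / n_y))
    return t, df


def welch_t(xbar, ybar, s_x, s_y, n_x, n_y, delta=0.0):
    """Case 3: unknown variances, not assumed equal. Returns (t, df)."""
    vx = s_x**2 / n_x
    vy = s_y**2 / n_y
    t = ((xbar - ybar) - delta) / math.sqrt(vx + vy)
    # Approximate degrees of freedom (not rounded here).
    df = (vx + vy) ** 2 / (vx**2 / (n_x - 1) + vy**2 / (n_y - 1))
    return t, df


# Summary statistics taken from the examples in this lecture.
t_bp, df_bp = pooled_t(180, 150, 50, 30, 15, 13)         # blood pressure
t_hr, df_hr = welch_t(418.5, 403.7, 45.5, 30.6, 70, 73)  # home runs
t_ni, df_ni = pooled_t(3.16, 3.24, 0.042, 0.048, 5, 7)   # nickel content
z_weld = z_statistic(0.37, 0.40, 0.25**2, 0.26**2, 544, 581)  # welds

print(f"blood pressure: t = {t_bp:.4f}, df = {df_bp}")
print(f"home runs:      t = {t_hr:.3f}, df = {df_hr:.1f}")
print(f"nickel:         t = {t_ni:.2f}, df = {df_ni}")
print(f"welds:          z = {z_weld:.2f}")
```

Each statistic is then compared with the critical value from Step 3 of the matching case. If SciPy happens to be available, `scipy.stats.t.ppf` can replace the table lookup, but that is an optional extra and not something the lecture relies on.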