Math 4030 – 9b
Comparing Two Means
• Dependent and independent samples
• Comparing two means

Choice of test depends on
• Independent samples design or matched pairs design (Sec. 8.1)
• Size of samples
• Equal variances
• Normality

Choosing the test
Independent samples
• Large samples (≥ 30)?
  • Y: Z test (Sec. 8.2)
  • N: Normality?
    • N: Nonparametric tests (Ch. 14)
    • Y: Equal variance?
      • Y: t test with df = $n_1 + n_2 - 2$, using the pooled estimator for the common variance (Sec. 8.3)
      • N: t test with estimated degrees of freedom (Sec. 8.3)
Matched pairs sample
• One sample of differences (Sec. 8.4)

Data format for independent samples
Population 1 (may or may not be normally distributed), with mean $\mu_1$ (to be estimated and compared) and variance $\sigma_1^2$ (may or may not be known).
Population 2 (may or may not be normally distributed), with mean $\mu_2$ (to be estimated and compared) and variance $\sigma_2^2$ (may or may not be known).
Sample of size $n_1$: $X_1, X_2, \ldots, X_{n_1}$, with sample mean $\bar{X}$ and sample variance $S_1^2$.
Sample of size $n_2$: $Y_1, Y_2, \ldots, Y_{n_2}$, with sample mean $\bar{Y}$ and sample variance $S_2^2$.

Sampling distribution of $\bar{X} - \bar{Y}$
\[ E(\bar{X} - \bar{Y}) = E(\bar{X}) - E(\bar{Y}) = \mu_1 - \mu_2 \]
\[ \mathrm{Var}(\bar{X} - \bar{Y}) = \mathrm{Var}(\bar{X}) + \mathrm{Var}(\bar{Y}) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \approx \frac{S_1^2}{n_1} + \frac{S_2^2}{n_2} \]
Distribution? Does the CLT still apply?

Case 1: both samples are large ($n_1 \ge 30$, $n_2 \ge 30$) (Sec. 8.2)
\[ \bar{X} - \bar{Y} \sim N\!\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right) \]
or
\[ Z = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \approx \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \sim N(0, 1) \]

Example 1: It is believed that the resistance of a certain electric wire can be reduced by 0.05 ohm by alloying. (Assume the standard deviation of the resistance of any wire is 0.035 ohm.) A sample of 32 standard wires and a sample of 32 alloyed wires are taken.
Question 1: Find the probability that the average resistance of the 32 standard wires is at least 0.03 ohm higher than that of the 32 alloyed wires.

Confidence interval for $\mu_1 - \mu_2$:
\[ \bar{x} - \bar{y} \pm z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]
Test statistic for $H_0\colon \mu_1 - \mu_2 = \delta_0$:
\[ Z = \frac{\bar{X} - \bar{Y} - \delta_0}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \]

Example 1: It is believed that the resistance of a certain electric wire can be reduced by alloying.
To verify this, a sample of 32 standard wires gives a sample mean of 0.136 ohm and a sample sd of 0.034 ohm, and a sample of 32 alloyed wires gives a sample mean of 0.083 ohm and a sample sd of 0.036 ohm.
Question 2: Construct a 95% confidence interval for the mean resistance reduction due to alloying.

Example 1: It is claimed that the resistance of a certain electric wire can be reduced by more than 0.05 ohm by alloying. To verify this, a sample of 32 standard wires gives a sample mean of 0.136 ohm and a sample sd of 0.004 ohm, and a sample of 32 alloyed wires gives a sample mean of 0.083 ohm and a sample sd of 0.005 ohm.
Question 3: Can we support the claim at the $\alpha = 0.05$ level?

Case 2.1: Small sample(s), normal populations with known equal variance $\sigma^2$ (Sec. 8.3)
\[ \bar{X} - \bar{Y} \sim N\!\left(\mu_1 - \mu_2,\ \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\right) \]
or
\[ Z = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sigma \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim N(0, 1) \]

Example 1': It is claimed that the resistance of a certain electric wire can be reduced by more than 0.05 ohm by alloying. To verify this, a sample of 15 standard wires gives a sample mean of 0.136 ohm, and a sample of 15 alloyed wires gives a sample mean of 0.083 ohm. (Assume that the resistance is normally distributed with standard deviation 0.0049 ohm for either type of wire.)
Question: Can we support the claim at the $\alpha = 0.05$ level?

Case 2.2: Small sample(s), normal populations with unknown equal variance (Sec. 8.3)
\[ S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2} \]
\[ t = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_{n_1 + n_2 - 2} \]

Example 1'': It is claimed that the resistance of a certain electric wire can be reduced by more than 0.05 ohm by alloying. To verify this, a sample of 15 standard wires gives a sample mean of 0.136 ohm and a sample sd of 0.0049 ohm, and a sample of 15 alloyed wires gives a sample mean of 0.083 ohm and a sample sd of 0.0052 ohm. (Assume that the resistance is normally distributed with the same variance for both types of wire.)
Question: Can we support the claim at the $\alpha = 0.05$ level?

Case 2.3: Small sample(s), normal populations with unequal variances (Sec.
8.3)
\[ t' = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \]
has a t distribution with estimated degrees of freedom
\[ df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}} \]

Matched Pairs Samples (Sec. 8.4)
Only one population and one sample of size $n$, but two measurements on each unit:
$X_1, X_2, \ldots, X_n$
$Y_1, Y_2, \ldots, Y_n$
Since we are interested in the differences, this is really a one-sample problem:
$D_1, D_2, \ldots, D_n$, where $D_i = X_i - Y_i$.

Sampling distribution of $\bar{D}$
\[ E(\bar{D}) = E(\bar{X} - \bar{Y}) = E(\bar{X}) - E(\bar{Y}) = \mu_1 - \mu_2 = \mu_D \]
Because the two measurements in a pair are dependent, in general
\[ \mathrm{Var}(\bar{D}) = \frac{\sigma_D^2}{n} \ne \mathrm{Var}(\bar{X}) + \mathrm{Var}(\bar{Y}), \]
and $\sigma_D^2$ is estimated from the differences by
\[ S_D^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (D_i - \bar{D})^2 . \]
Testing the hypothesis $\mu_D = 0$ vs. checking whether the confidence interval contains 0.

Example 2: It is claimed that the resistance of a certain electric wire can be reduced by more than 0.05 ohm by alloying. To verify this, a sample of 15 wires is tested before alloying and again after alloying; the mean reduction is 0.063 ohm and the sd of the reductions is 0.025 ohm. (Assume that the resistance is normally distributed.)
Question: Can we support the claim at the $\alpha = 0.05$ level?

Use R to compare two means
t.test(X, Y, …) can be used to compare the means of two samples, where X and Y are vectors of data values sampled from the two populations. Other parameters:
• Format of the alternative hypothesis (alternative)
• Assumed mean difference in the null hypothesis (mu)
• Dependent or independent samples (paired)
• Equal or unequal variances (var.equal)
• Level of significance, given as a confidence level (conf.level)

Example 3 (textbook data: 8-11.TXT). What conclusion does the output support?
Calif: 59 68 44 71 63 46 69 54 48
Org: 50 36 62 52 70 41
(var.equal = TRUE is needed to reproduce the pooled output shown, with df = 9 + 6 − 2 = 13; by default t.test runs the Welch test with estimated degrees of freedom.)

> Calif = c(59, 68, 44, 71, 63, 46, 69, 54, 48)
> Org = c(50, 36, 62, 52, 70, 41)
> t.test(Calif, Org, alternative = "two.sided", mu = 0, var.equal = TRUE, conf.level = 0.95)

        Two Sample t-test

data:  Calif and Org
t = 1.0302, df = 13, p-value = 0.3217
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.764863 19.098196
sample estimates:
mean of x mean of y
 58.00000  51.83333

Example 4 (textbook data: 8-16.TXT)
wghtI:  11.23 14.36 8.33 10.5 23.42 9.15 13.47 6.47 12.4 19.38
wghtII: 11.27 14.41 8.35 10.52 23.41 9.17 13.52 6.46 12.45 19.35

> X = read.table(file.choose(), header = TRUE)
> t.test(X$wghtI, X$wghtII, alternative = "less", mu = 0, paired = TRUE, conf.level = 0.95)

        Paired t-test

data:  X$wghtI and X$wghtII
t = -2.2056, df = 9, p-value = 0.02742
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
         -Inf -0.003377979
sample estimates:
mean of the differences
                  -0.02

Conclusion?
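
t.test() needs the raw data vectors, but the wire examples (1', 1'', 2) give only summary statistics. The following is a minimal sketch of how the pooled statistic of Case 2.2 and the estimated-df statistic of Case 2.3 could be computed from means, standard deviations, and sample sizes. The helper name two_sample_t_summary and its argument names are hypothetical (not part of base R); only sqrt() and pt() are standard R functions.

```r
# Hypothetical helper (not in base R): two-sample t statistic from summary
# statistics, for H0: mu1 - mu2 = delta0 against H1: mu1 - mu2 > delta0.
two_sample_t_summary <- function(m1, s1, n1, m2, s2, n2,
                                 delta0 = 0, var.equal = FALSE) {
  v1 <- s1^2 / n1  # estimated variance of the first sample mean
  v2 <- s2^2 / n2  # estimated variance of the second sample mean
  if (var.equal) {
    # Case 2.2: pooled variance estimator, df = n1 + n2 - 2
    df  <- n1 + n2 - 2
    sp2 <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df
    se  <- sqrt(sp2 * (1 / n1 + 1 / n2))
  } else {
    # Case 2.3: separate variances, estimated degrees of freedom
    se <- sqrt(v1 + v2)
    df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  }
  t_stat <- (m1 - m2 - delta0) / se
  # Upper-tail p-value, matching the one-sided alternative "greater"
  list(t = t_stat, df = df, p_greater = pt(t_stat, df, lower.tail = FALSE))
}

# Summary statistics from Example 1'' (15 standard and 15 alloyed wires)
res <- two_sample_t_summary(0.136, 0.0049, 15, 0.083, 0.0052, 15,
                            delta0 = 0.05, var.equal = TRUE)
print(res)
```

With the Example 1'' numbers, the statistic comes out near 1.63 on 28 degrees of freedom, which is below the critical value $t_{0.05, 28} \approx 1.701$, so under these assumptions the claim would not be supported at the 0.05 level. Setting var.equal = FALSE switches the same call to the Case 2.3 statistic.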