Review of Basic Statistical Concepts

MGT 2120: Chapter 10
Statistical Inference about Means and Proportions with Two Populations

Two population means

Notations:
$\mu_1$ = Mean of population 1
$\sigma_1$ = Standard deviation of population 1
$\mu_2$ = Mean of population 2
$\sigma_2$ = Standard deviation of population 2
Sample 1 = Sample taken from population 1
$n_1$ = Sample size of sample 1
$\bar{x}_1$ = Mean of sample 1
$s_1$ = Standard deviation of sample 1
Sample 2 = Sample taken from population 2
$n_2$ = Sample size of sample 2
$\bar{x}_2$ = Mean of sample 2
$s_2$ = Standard deviation of sample 2

§10.1 Inferences about the difference between two population means: $\sigma_1$ and $\sigma_2$ are known

Parameter of interest = $\mu_1 - \mu_2$
Point estimate of $\mu_1 - \mu_2$ = $\bar{x}_1 - \bar{x}_2$
Expected value of $\bar{x}_1 - \bar{x}_2$ = $E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2$
Standard error of $\bar{x}_1 - \bar{x}_2$ = $\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

If $n_1$ and $n_2$ are both large (i.e. > 30), then $\bar{x}_1 - \bar{x}_2$ will follow an approximately normal distribution.

Confidence interval for $\mu_1 - \mu_2$:
Use formula 10.4, page 410: $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

Hypothesis testing:
H0: $\mu_1 - \mu_2 \ge D_0$, Ha: $\mu_1 - \mu_2 < D_0$ (one-tail, left test)
H0: $\mu_1 - \mu_2 \le D_0$, Ha: $\mu_1 - \mu_2 > D_0$ (one-tail, right test)
H0: $\mu_1 - \mu_2 = D_0$, Ha: $\mu_1 - \mu_2 \ne D_0$ (two-tail test)

Test statistic: $z_{calc} = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$

$\sigma_1$ and $\sigma_2$ are rarely, if ever, known, so we will not work out an example.

§10.2 Inferences about the difference between two population means: $\sigma_1$ and $\sigma_2$ are unknown

Case 1: $\sigma_1 \ne \sigma_2$

Parameter of interest = $\mu_1 - \mu_2$
Point estimate of $\mu_1 - \mu_2$ = $\bar{x}_1 - \bar{x}_2$
Expected value of $\bar{x}_1 - \bar{x}_2$ = $E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2$
Standard error of $\bar{x}_1 - \bar{x}_2$ = $\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$, with df given by formula 10.7, page 416

If $n_1$ and $n_2$ are both large (i.e. > 30), then $\bar{x}_1 - \bar{x}_2$ will follow an approximately normal distribution.

Confidence interval for $\mu_1 - \mu_2$:
Use formula 10.6, page 416: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
Note: df is given by formula 10.7, page 416; Data Analysis will provide this number for us.
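The Case 1 quantities above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample summaries are made up, formula 10.7 is assumed to be the usual Welch-Satterthwaite approximation for df, and the critical value t_crit is passed in by hand (for example from a t table or Excel) rather than computed.

```python
import math

def unequal_variance_summary(xbar1, s1, n1, xbar2, s2, n2):
    """Return (point estimate, standard error, df) for mu1 - mu2 in Case 1."""
    v1 = s1 ** 2 / n1  # s1^2 / n1
    v2 = s2 ** 2 / n2  # s2^2 / n2
    point = xbar1 - xbar2
    se = math.sqrt(v1 + v2)
    # Assumption: formula 10.7 is the Welch-Satterthwaite approximation.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return point, se, df

def interval(point, se, t_crit):
    """Confidence interval: point estimate +/- t_crit * standard error."""
    margin = t_crit * se
    return point - margin, point + margin

# Illustrative (made-up) sample summaries, not taken from the course text.
point, se, df = unequal_variance_summary(82.0, 10.0, 25, 78.0, 10.0, 25)

# t_crit is a hand-supplied placeholder: roughly t(0.025) for df near 48.
# Look up the exact value for the computed df before relying on the interval.
low, high = interval(point, se, t_crit=2.01)
print(point, se, df, (low, high))
```

The df returned here is usually not a whole number; tables and spreadsheet functions typically need it rounded down to an integer.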
Hypothesis testing:
H0: $\mu_1 - \mu_2 \ge D_0$, Ha: $\mu_1 - \mu_2 < D_0$ (one-tail, left test)
H0: $\mu_1 - \mu_2 \le D_0$, Ha: $\mu_1 - \mu_2 > D_0$ (one-tail, right test)
H0: $\mu_1 - \mu_2 = D_0$, Ha: $\mu_1 - \mu_2 \ne D_0$ (two-tail test)

Test statistic: $t_{calc} = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

p-value:
T.DIST.RT(ABS(tcalc), df) for both one-tail tests (valid when tcalc falls on the side indicated by Ha)
T.DIST.2T(ABS(tcalc), df) for the two-tail test
See formula 10.7, page 416, for df.
We will use the Excel Data Analysis command to determine the p-value.

Case 2: $\sigma_1 = \sigma_2 = \sigma$

Parameter of interest = $\mu_1 - \mu_2$
Point estimate of $\mu_1 - \mu_2$ = $\bar{x}_1 - \bar{x}_2$
Expected value of $\bar{x}_1 - \bar{x}_2$ = $E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2$
Pooled variance estimate: $s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
Then the standard error of $\bar{x}_1 - \bar{x}_2$ = $s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$, with df = $n_1 + n_2 - 2$

If $n_1$ and $n_2$ are both large (i.e. > 30), then $\bar{x}_1 - \bar{x}_2$ will follow an approximately normal distribution.

Confidence interval for $\mu_1 - \mu_2$:
Use the formula: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\; s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$

Hypothesis testing:
H0: $\mu_1 - \mu_2 \ge D_0$, Ha: $\mu_1 - \mu_2 < D_0$ (one-tail, left test)
H0: $\mu_1 - \mu_2 \le D_0$, Ha: $\mu_1 - \mu_2 > D_0$ (one-tail, right test)
H0: $\mu_1 - \mu_2 = D_0$, Ha: $\mu_1 - \mu_2 \ne D_0$ (two-tail test)

Test statistic: $t_{calc} = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$, with df = $n_1 + n_2 - 2$

p-value:
T.DIST.RT(ABS(tcalc), df) for both one-tail tests (valid when tcalc falls on the side indicated by Ha)
T.DIST.2T(ABS(tcalc), df) for the two-tail test
df = $n_1 + n_2 - 2$
We will use the Excel Data Analysis command to determine the p-value.
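To make the two test statistics in this section concrete, here is a minimal Python sketch computed from sample summaries. The function names and the numbers are hypothetical; the p-value step is left to Excel, as in the notes.

```python
import math

def pooled_t(xbar1, s1, n1, xbar2, s2, n2, d0=0.0):
    """Case 2 (sigma1 = sigma2): return (t_calc, df, pooled variance)."""
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    se = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    t_calc = (xbar1 - xbar2 - d0) / se
    return t_calc, n1 + n2 - 2, sp2

def unequal_variance_t(xbar1, s1, n1, xbar2, s2, n2, d0=0.0):
    """Case 1 (sigma1 != sigma2): return t_calc; df comes from formula 10.7."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (xbar1 - xbar2 - d0) / se

# Illustrative (made-up) sample summaries.
t_pooled, df_pooled, sp2 = pooled_t(82.0, 10.0, 25, 78.0, 12.0, 30)
t_unequal = unequal_variance_t(82.0, 10.0, 25, 78.0, 12.0, 30)
print(t_pooled, df_pooled, sp2, t_unequal)
```

When the two sample sizes and sample standard deviations are equal, both functions return the same statistic; they diverge as the sample variances or sizes differ.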
§10.3 Inferences about the difference between two population means: Matched Samples

Sample 1 = Observations of a sample prior to the event
Sample 2 = Observations from the same subjects as sample 1, taken after the event
$n$ = Sample size of samples 1 and 2
$x_{1i}$ = ith observation from sample 1
$x_{2i}$ = ith observation from sample 2

Define:
Sample difference = $d_i = x_{1i} - x_{2i}$
$\mu_d$ = Mean of the population of differences
$\sigma_d$ = Standard deviation of the population of differences
$\bar{d}$ = Mean of the sample differences = $\frac{\sum d_i}{n}$
$s_d$ = Standard deviation of the sample differences = $\sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}}$

Confidence interval for $\mu_d$: $\bar{d} \pm t_{\alpha/2} \frac{s_d}{\sqrt{n}}$, with df = $n - 1$

Hypothesis testing:
H0: $\mu_d \ge 0$, Ha: $\mu_d < 0$ (one-tail, left test)
H0: $\mu_d \le 0$, Ha: $\mu_d > 0$ (one-tail, right test)
H0: $\mu_d = 0$, Ha: $\mu_d \ne 0$ (two-tail test)

Test statistic: $t_{calc} = \frac{\bar{d} - \mu_0}{s_d / \sqrt{n}}$, where $\mu_0$ is the hypothesized value of $\mu_d$ (0 in the hypotheses above)

p-value:
T.DIST.RT(ABS(tcalc), df) for both one-tail tests (valid when tcalc falls on the side indicated by Ha)
T.DIST.2T(ABS(tcalc), df) for the two-tail test
df = $n - 1$
We will use the Excel Data Analysis command to determine the p-value.
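The matched-sample calculation above can be sketched as follows. The data and function names are made up for illustration. The helper t_right_tail is not an Excel function; it is a rough stand-in for what T.DIST.RT returns, obtained here by numerically integrating the t density with Simpson's rule, and it assumes a non-negative argument.

```python
import math
import statistics

def paired_t(sample1, sample2, mu0=0.0):
    """Return (d_bar, s_d, t_calc, df) for matched samples."""
    d = [x1 - x2 for x1, x2 in zip(sample1, sample2)]  # d_i = x_1i - x_2i
    n = len(d)
    d_bar = statistics.mean(d)
    s_d = statistics.stdev(d)  # sample standard deviation, n - 1 denominator
    t_calc = (d_bar - mu0) / (s_d / math.sqrt(n))
    return d_bar, s_d, t_calc, n - 1

def t_right_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 (steps must be even)."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    # Area from 0 to t is total * h / 3; the right tail is what remains of 0.5.
    return 0.5 - total * h / 3

# Illustrative (made-up) before/after measurements on the same four subjects.
before = [10.0, 12.0, 9.0, 11.0]
after = [8.0, 11.0, 9.0, 8.0]
d_bar, s_d, t_calc, df = paired_t(before, after)
p_one_tail = t_right_tail(abs(t_calc), df)
p_two_tail = 2 * p_one_tail
print(d_bar, s_d, t_calc, df, p_one_tail, p_two_tail)
```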
§10.4 Inferences about the difference between two population proportions

Notations:
$p_1$ = Proportion of "success" in population 1
$p_2$ = Proportion of "success" in population 2
Sample 1 = Sample taken from population 1
Sample 2 = Sample taken from population 2
$n_1$ = Sample size of sample 1
$n_2$ = Sample size of sample 2
$\bar{p}_1$ = Proportion of "success" in sample 1
$\bar{p}_2$ = Proportion of "success" in sample 2

Parameter of interest = $p_1 - p_2$
Point estimate of $p_1 - p_2$ = $\bar{p}_1 - \bar{p}_2$
Expected value of $\bar{p}_1 - \bar{p}_2$ = $E(\bar{p}_1 - \bar{p}_2) = p_1 - p_2$
Standard error of $\bar{p}_1 - \bar{p}_2$ = $\sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$
Estimated standard error of $\bar{p}_1 - \bar{p}_2$ = $s_{\bar{p}_1 - \bar{p}_2} = \sqrt{\frac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \frac{\bar{p}_2(1 - \bar{p}_2)}{n_2}}$

$\bar{p}_1 - \bar{p}_2$ will follow an approximately normal distribution if all four of the following conditions are true:
$n_1 p_1 \ge 5$; $n_1(1 - p_1) \ge 5$; $n_2 p_2 \ge 5$; $n_2(1 - p_2) \ge 5$

Confidence interval for $p_1 - p_2$:
Use formula 10.13, page 430: $(\bar{p}_1 - \bar{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \frac{\bar{p}_2(1 - \bar{p}_2)}{n_2}}$

Hypothesis testing:
H0: $p_1 - p_2 \ge 0$, Ha: $p_1 - p_2 < 0$ (one-tail, left test)
H0: $p_1 - p_2 \le 0$, Ha: $p_1 - p_2 > 0$ (one-tail, right test)
H0: $p_1 - p_2 = 0$, Ha: $p_1 - p_2 \ne 0$ (two-tail test)

All three H0 include $p_1 - p_2 = 0$, i.e. $p_1 = p_2 = p$, so a pooled estimate $\bar{p}$ of $p$ can be found using the following formula.
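Pooled estimate: $\bar{p} = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2}$
Estimated standard error of $\bar{p}_1 - \bar{p}_2$ = $s_{\bar{p}_1 - \bar{p}_2} = \sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$

Test statistic: $z_{calc} = \frac{\bar{p}_1 - \bar{p}_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$

p-value:
1 - NORM.S.DIST(ABS(zcalc), 1) for both one-tail tests (valid when zcalc falls on the side indicated by Ha)
2*(1 - NORM.S.DIST(ABS(zcalc), 1)) for the two-tail test

Summary of formulas

Values of $\sigma_1$ and $\sigma_2$ are known
Confidence interval: $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
Hypothesis testing: $z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
Comments: $\bar{x}_1 - \bar{x}_2$ = Point estimate for $\mu_1 - \mu_2$; $\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$ = Standard error of $\bar{x}_1 - \bar{x}_2$

Values of $\sigma_1$ and $\sigma_2$ are unknown
Confidence interval: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
Hypothesis testing: $t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
Comments: $df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$; df from results of the Data Analysis command

Values of $\sigma_1$ and $\sigma_2$ are unknown, but $\sigma_1 = \sigma_2 = \sigma$
Confidence interval: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\; s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
Hypothesis testing: $t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
Comments: $s_p$ = Pooled estimate of the common $\sigma$ if $\sigma_1 = \sigma_2 = \sigma$, where $s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$, with df = $n_1 + n_2 - 2$

Matched sample
Confidence interval: $\bar{d} \pm t_{\alpha/2} \frac{s_d}{\sqrt{n}}$
Hypothesis testing: $t = \frac{\bar{d} - \mu_0}{s_d / \sqrt{n}}$
Comments: $\bar{d}$ = Mean of paired differences, and $s_d$ = Standard deviation of paired differences; df = $n - 1$

Two population proportions
Confidence interval: $(\bar{p}_1 - \bar{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \frac{\bar{p}_2(1 - \bar{p}_2)}{n_2}}$
Hypothesis testing: $z = \frac{\bar{p}_1 - \bar{p}_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, where $\bar{p} = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2}$
Comments: $\bar{p}_1 - \bar{p}_2$ = Point estimate for $p_1 - p_2$; $\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$ = Standard error of $\bar{p}_1 - \bar{p}_2$ for hypothesis testing, where we assume $p_1 = p_2 = p$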
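As a rough companion to the two-proportion formulas, the sketch below computes the confidence interval (unpooled standard error) and the test statistic with its two-tail p-value (pooled standard error) using only Python's standard library. The function names, the success counts x1 and x2, and the data are assumptions made for illustration; `NormalDist().cdf` plays the role of the cumulative standard normal function used for the p-value.

```python
import math
from statistics import NormalDist

def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    diff = p1 - p2
    return diff - z * se, diff + z * se

def two_proportion_z(x1, n1, x2, n2):
    """Return (z_calc, two-tail p-value) for H0: p1 - p2 = 0."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (n1 * p1 + n2 * p2) / (n1 + n2)  # pooled estimate of p
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z_calc = (p1 - p2) / se
    p_two_tail = 2 * (1 - NormalDist().cdf(abs(z_calc)))
    return z_calc, p_two_tail

# Illustrative (made-up) counts of "successes" in each sample.
low, high = two_proportion_ci(60, 100, 40, 100)
z_calc, p_value = two_proportion_z(60, 100, 40, 100)
print((low, high), z_calc, p_value)
```

The interval and the test deliberately use different standard errors: the pooled version is appropriate only under the null hypothesis that the two proportions are equal.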
