SDC 3 Appendix: Background information and formulas for the p-values and confidence interval for the problem of comparing two means.

Brief summary: The purpose of this technical appendix is to present the detailed formulas used to compute the p-values and the confidence interval for the problem of comparing two means. The derivation of the formulas relating $p_b$ and $p_h$ to p-values and the derivation of the confidence interval are also presented.

The model we assume is that $Y_{11}, \ldots, Y_{1n_1}$ are independent and normally distributed with mean $\mu_1$ and variance $\sigma_1^2$ and, independently, $Y_{21}, \ldots, Y_{2n_2}$ are independent and normally distributed with mean $\mu_2$ and variance $\sigma_2^2$. The problem is to make inferences about the difference in means $\mu_2 - \mu_1$.

The sample mean of sample $i = 1, 2$ is $\bar{Y}_i = n_i^{-1} \sum_{j=1}^{n_i} Y_{ij}$ (presented in L62 and L93), and the sample variance of sample $i = 1, 2$ is $s_i^2 = (n_i - 1)^{-1} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_i)^2$ (the sample standard deviation $s_i$ is presented in L63 and L94). The standard error is

\[ \mathrm{SE}(\bar{Y}_2 - \bar{Y}_1) = \sqrt{s_1^2 / n_1 + s_2^2 / n_2} \]

(presented in L64 and L95) and the approximate degrees of freedom is

\[ \nu = \frac{\mathrm{SE}(\bar{Y}_2 - \bar{Y}_1)^4}{(s_1^2 / n_1)^2 / (n_1 - 1) + (s_2^2 / n_2)^2 / (n_2 - 1)} \]

(presented in L65 and L96). Under the assumed model, the sampling distribution of $\{\bar{Y}_2 - \bar{Y}_1 - (\mu_2 - \mu_1)\} / \mathrm{SE}(\bar{Y}_2 - \bar{Y}_1)$ is approximated by the Student t distribution with $\nu$ degrees of freedom.

In the equal variance case ($\sigma_1^2 = \sigma_2^2$), the sampling distribution of $\{\bar{Y}_2 - \bar{Y}_1 - (\mu_2 - \mu_1)\} / \mathrm{SE}(\bar{Y}_2 - \bar{Y}_1)$, where

\[ \mathrm{SE}(\bar{Y}_2 - \bar{Y}_1) = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} \, \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}, \]

is exactly the Student t distribution with $\nu = n_1 + n_2 - 2$ degrees of freedom.

Let $G_\nu$ be the distribution function of the Student t distribution with $\nu$ degrees of freedom and let $T_\nu$ denote a random variable with this distribution. The Student t distribution is symmetric, so $T_\nu$ and $-T_\nu$ have the same distribution.
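The standard error and degrees of freedom described above can be computed directly from two samples. The following is a minimal Python sketch, assuming the observations are held in plain lists; the function names `welch_se_df` and `pooled_se_df` and the example data are illustrative and are not part of the appendix or of the spreadsheet it refers to.

```python
import math
from statistics import variance  # sample variance with the (n - 1) divisor


def welch_se_df(y1, y2):
    """Unequal-variance standard error and approximate degrees of freedom."""
    n1, n2 = len(y1), len(y2)
    v1 = variance(y1) / n1  # s_1^2 / n_1
    v2 = variance(y2) / n2  # s_2^2 / n_2
    se = math.sqrt(v1 + v2)
    df = se**4 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return se, df


def pooled_se_df(y1, y2):
    """Equal-variance (pooled) standard error and exact degrees of freedom."""
    n1, n2 = len(y1), len(y2)
    pooled_var = ((n1 - 1) * variance(y1) + (n2 - 1) * variance(y2)) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2)
    return se, n1 + n2 - 2


# Made-up data, for illustration only.
y1 = [1.0, 2.0, 3.0, 4.0]
y2 = [2.0, 4.0, 6.0, 8.0]
se_w, df_w = welch_se_df(y1, y2)
se_p, df_p = pooled_se_df(y1, y2)
```

With equal sample sizes, as in this example, the two standard errors coincide and only the degrees of freedom differ.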
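The one-sided p-value for testing the null hypothesis that $\mu_2 - \mu_1 = -\delta$ against the alternative that $\mu_2 - \mu_1 > -\delta$ is

\[
\begin{aligned}
\Pr\{T_\nu \ge (\bar{Y}_2 - \bar{Y}_1 + \delta) / \mathrm{SE}(\bar{Y}_2 - \bar{Y}_1) \mid \mu_2 - \mu_1 = -\delta\}
&= \Pr\{-T_\nu \le -(\bar{Y}_2 - \bar{Y}_1 + \delta) / \mathrm{SE}(\bar{Y}_2 - \bar{Y}_1)\} \\
&= \Pr\{T_\nu \le (-\delta - (\bar{Y}_2 - \bar{Y}_1)) / \mathrm{SE}(\bar{Y}_2 - \bar{Y}_1) \mid \mu_2 - \mu_1 = -\delta\} \\
&= G_\nu\{(-\delta - (\bar{Y}_2 - \bar{Y}_1)) / \mathrm{SE}(\bar{Y}_2 - \bar{Y}_1)\} \\
&= p_h
\end{aligned}
\]

given by [2]. Thus $p_h$ can be interpreted as the one-sided p-value for testing the null hypothesis that $\mu_2 - \mu_1 = -\delta$ against the alternative that $\mu_2 - \mu_1 > -\delta$ (i.e. the probability under the null hypothesis $\mu_2 - \mu_1 = -\delta$ that we observe a t-ratio greater than or equal to the observed value of the t-ratio $(\bar{Y}_2 - \bar{Y}_1 + \delta) / \mathrm{SE}(\bar{Y}_2 - \bar{Y}_1)$).

Similarly, $p_b$ can be interpreted as the one-sided p-value for testing the null hypothesis that $\mu_2 - \mu_1 = \delta$ against the alternative that $\mu_2 - \mu_1 < \delta$. The derivation is

\[
\begin{aligned}
\Pr\{T_\nu \le (\bar{Y}_2 - \bar{Y}_1 - \delta) / \mathrm{SE}(\bar{Y}_2 - \bar{Y}_1) \mid \mu_2 - \mu_1 = \delta\}
&= \Pr\{-T_\nu \ge -(\bar{Y}_2 - \bar{Y}_1 - \delta) / \mathrm{SE}(\bar{Y}_2 - \bar{Y}_1)\} \\
&= \Pr\{T_\nu \ge (\delta - (\bar{Y}_2 - \bar{Y}_1)) / \mathrm{SE}(\bar{Y}_2 - \bar{Y}_1) \mid \mu_2 - \mu_1 = \delta\} \\
&= 1 - G_\nu\{(\delta - (\bar{Y}_2 - \bar{Y}_1)) / \mathrm{SE}(\bar{Y}_2 - \bar{Y}_1)\} \\
&= p_b .
\end{aligned}
\]

The confidence interval [1] for $\mu_2 - \mu_1$ is obtained as

\[
\begin{aligned}
1 - \alpha
&= \Pr\{-t \le (\bar{Y}_2 - \bar{Y}_1 - (\mu_2 - \mu_1)) / \mathrm{SE}(\bar{Y}_2 - \bar{Y}_1) \le t\} \\
&= \Pr\{\bar{Y}_2 - \bar{Y}_1 - t \, \mathrm{SE}(\bar{Y}_2 - \bar{Y}_1) \le \mu_2 - \mu_1 \le \bar{Y}_2 - \bar{Y}_1 + t \, \mathrm{SE}(\bar{Y}_2 - \bar{Y}_1)\} .
\end{aligned}
\]

The critical value is $t = G_\nu^{-1}(1 - \alpha / 2)$, where $G_\nu^{-1}$ is the inverse of the distribution function of the Student t distribution.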
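The one-sided tail probabilities and the confidence interval described above can be evaluated numerically. The following is a minimal Python sketch, not the appendix's own implementation. The function names (`t_cdf`, `t_ppf`, `one_sided_p_values`, `confidence_interval`), the parameter `null_value` (the hypothesised value of the difference in means) and the example inputs are all illustrative assumptions. To keep the sketch dependency-free, the Student t distribution function is obtained by Simpson's rule and its inverse by bisection; in practice a library routine such as `scipy.stats.t.cdf` and `scipy.stats.t.ppf` would normally be preferred.

```python
import math


def t_cdf(x, df, steps=2000):
    """Distribution function of the Student t distribution, by Simpson's rule.

    The substitution u = sqrt(df) * tan(theta) turns the integral of the
    t density from 0 to x into a bounded integral of cos(theta)**(df - 1).
    `steps` must be even; df is assumed to be at least 1.
    """
    theta = math.atan(x / math.sqrt(df))
    const = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    const /= math.sqrt(math.pi)
    h = theta / steps
    total = 1.0 + math.cos(theta) ** (df - 1)  # integrand at both endpoints
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * math.cos(k * h) ** (df - 1)
    return 0.5 + const * total * h / 3


def t_ppf(p, df):
    """Inverse distribution function for 0.5 <= p < 1, by bisection."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:  # widen the bracket until it contains the quantile
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def one_sided_p_values(diff, se, df, null_value):
    """Return (Pr(T >= t_obs), Pr(T <= t_obs)) for the observed t-ratio."""
    t_obs = (diff - null_value) / se
    p_greater = t_cdf(-t_obs, df)       # Pr(T >= t_obs) = G(-t_obs) by symmetry
    p_less = 1.0 - t_cdf(-t_obs, df)    # Pr(T <= t_obs) = 1 - G(-t_obs)
    return p_greater, p_less


def confidence_interval(diff, se, df, alpha=0.05):
    """Two-sided 100 * (1 - alpha)% interval for the difference in means."""
    t_crit = t_ppf(1.0 - alpha / 2.0, df)
    return diff - t_crit * se, diff + t_crit * se


# Made-up inputs, for illustration only.
p_greater, p_less = one_sided_p_values(diff=1.2, se=0.5, df=10.0, null_value=0.2)
lower, upper = confidence_interval(diff=1.2, se=0.5, df=10.0, alpha=0.10)
```

Here `diff` stands for the observed difference in sample means and `se` for its standard error; the upper-tail probability corresponds to a "greater than" alternative and the lower-tail probability to a "less than" alternative.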