Inference About a Population Mean, $\mu$

We want to test hypotheses or obtain confidence interval estimates about the value of a population mean, $\mu$, using only the data obtained from a random sample from the population. If $X_1, X_2, \ldots, X_n$ are a random sample from a distribution with mean $\mu$ and standard deviation $\sigma$, then either:

1) If the population distribution is normal, then $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ has a standard normal distribution, or
2) If we do not know the population distribution, then $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ has an approximate standard normal distribution for large $n$.

In both situations, we have a second unknown parameter, $\sigma$, involved in the random variable. We want to be able to do inference about the single parameter $\mu$ using sample data.

Defn: Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal distribution with (unknown) mean $\mu$ and (unknown) standard deviation $\sigma < +\infty$. Then the random variable

$$\frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has a t distribution with degrees of freedom $n-1$. The p.d.f. of the t-distribution is given by

$$f(t) = \frac{\Gamma\left(\frac{k+1}{2}\right)}{\sqrt{\pi k}\,\Gamma\left(\frac{k}{2}\right)} \left(1 + \frac{t^2}{k}\right)^{-\frac{k+1}{2}}, \quad -\infty < t < +\infty.$$

Here $k$ is the degrees of freedom of the distribution.

Properties of the t-distribution:
1) It is bell-shaped, centered at 0.
2) Its standard deviation is larger than 1.
3) For larger degrees of freedom, the standard deviation is closer to 1.
4) The limiting distribution as the degrees of freedom goes to $+\infty$ is the standard normal distribution.

Some percentiles of the t-distribution for various values of d.f. are given in Table II in Appendix A, p. 436.

Confidence interval for $\mu$: We choose our confidence level to be $1 - \alpha$. Then we can write the statement

$$P\left(-t_{n-1,\alpha/2} \le \frac{\bar{X} - \mu}{S/\sqrt{n}} \le t_{n-1,\alpha/2}\right) = 1 - \alpha.$$

Rearranging, we obtain

$$P\left(\bar{X} - t_{n-1,\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{n-1,\alpha/2}\frac{S}{\sqrt{n}}\right) = 1 - \alpha.$$

Then $\bar{X} \pm t_{n-1,\alpha/2}\frac{S}{\sqrt{n}}$ is a $(1-\alpha)100\%$ confidence interval for $\mu$.

Example: p. 171, Exercise 4-38 d)

Sample Size for a Specified Margin of Error: As part of our experimental design, we want to specify the margin of error, $E$, that is acceptable for our estimate of $\mu$, and choose a sample size to ensure that we achieve this margin of error.
We let

$$E = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}.$$

Solving for $n$, we obtain

$$n = \frac{z_{\alpha/2}^2\,\sigma^2}{E^2}.$$

Now, we know $E$ and $\alpha$, but we need to find a usable value for $\sigma^2$ before we can find the sample size. Since we don't know the value of the sample variance until we collect the data, we have to go to another source for a usable value of $\sigma^2$. Often, we do a literature search for previously published research on the same topic and use the sample variance reported there. The above equation then gives us a sample size for achieving the desired margin of error with the desired level of confidence.

Example: p. 171, Exercise 4-37 e)

Testing Hypotheses Concerning a Population Mean, $\mu$: We want to test hypotheses of the following possible forms:
1) $H_0: \mu = \mu_0$ vs. $H_a: \mu \ne \mu_0$
2) $H_0: \mu \ge \mu_0$ vs. $H_a: \mu < \mu_0$
3) $H_0: \mu \le \mu_0$ vs. $H_a: \mu > \mu_0$

The test statistic to be used is

$$T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}.$$

Under the null hypothesis, this statistic has a t distribution with d.f. = $n-1$ when the population is normal (by the definition above); for large $n$, the Central Limit Theorem says that this holds approximately for other populations as well. For the three types of alternative hypotheses, the rejection regions are:
1) $H_a: \mu \ne \mu_0$. Reject $H_0$ if $|T| > t_{n-1,\alpha/2}$.
2) $H_a: \mu < \mu_0$. Reject $H_0$ if $T < -t_{n-1,\alpha}$.
3) $H_a: \mu > \mu_0$. Reject $H_0$ if $T > t_{n-1,\alpha}$.

Example: p. 171, Exercise 4-38

Sample Size for a Two-Sided Hypothesis Test for a Mean: As part of our experimental design, we want to find the sample size that will allow us to achieve a specified level of power for detecting an effect of a given size. We decide on our significance level, $\alpha$. We decide on the effect size, $\delta$, that we wish to be able to detect with probability $1 - \beta$. Suppose that the null-hypothesized value of $\mu$ is $\mu_0$, and that the actual value is $\mu = \mu_0 + \delta$. Then the test statistic $T$ has a noncentral t-distribution with d.f. = $n-1$ and noncentrality parameter $\delta\sqrt{n}/\sigma$. The probability of making a Type II error is then given by

$$\beta = P\left(-t_{n-1,\alpha/2} \le T \le t_{n-1,\alpha/2} \mid \delta \ne 0\right).$$

In this case, since $T$ has a noncentral t distribution, we cannot find this probability using Table II.
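With software, the Type II error probability above can also be evaluated numerically. The following is a minimal sketch, assuming SciPy is available; the function names `type2_error` and `smallest_n` and the numeric inputs are hypothetical choices for illustration, not part of the course material. It computes $\beta$ from the noncentral t distribution with d.f. = $n-1$ and noncentrality parameter $\delta\sqrt{n}/\sigma$, then searches for the smallest $n$ that meets a target $\beta$.

```python
import math
from scipy import stats


def type2_error(n, delta, sigma, alpha=0.05):
    """Beta for the two-sided one-sample t test when the true mean is mu0 + delta."""
    df = n - 1
    t_crit = stats.t.ppf(1 - alpha / 2, df)  # critical value t_{n-1, alpha/2}
    # By symmetry of the two-sided test, beta depends only on |delta|.
    nc = abs(delta) * math.sqrt(n) / sigma   # noncentrality parameter
    # P(-t_crit <= T <= t_crit) when T is noncentral t with (df, nc).
    return stats.nct.cdf(t_crit, df, nc) - stats.nct.cdf(-t_crit, df, nc)


def smallest_n(delta, sigma, alpha=0.05, target_beta=0.10, n_max=1000):
    """Smallest n (up to n_max) whose Type II error is at most target_beta."""
    for n in range(2, n_max + 1):
        if type2_error(n, delta, sigma, alpha) <= target_beta:
            return n
    return None  # no sample size up to n_max achieves the target


if __name__ == "__main__":
    # Illustrative values only: detect a shift of one standard deviation.
    n = smallest_n(delta=1.0, sigma=1.0, alpha=0.05, target_beta=0.10)
    print(n, type2_error(n, 1.0, 1.0))
```

A guessed value of $\sigma$ is still needed here, just as in the margin-of-error calculation, so the result is only as good as that planning value.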
Because Table II does not cover the noncentral t distribution, we look instead at Charts Va, Vb, Vc, and Vd in Appendix A, which give the Operating Characteristic Curves (OC's). Charts Va and Vb give the OC's for two-sided tests. The other charts give OC's for one-sided tests. The abscissas in these charts are given by $d = |\delta|/\sigma$.

Example: p. 172, Exercise 4-41 c)
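The earlier procedures in these notes (the t confidence interval, the sample size for a specified margin of error, and the t test with its three rejection regions) can be sketched numerically as follows. This is a hedged illustration, assuming SciPy is available; the function names and the data values are made up for the example and do not come from the textbook exercises.

```python
import math
from statistics import mean, stdev
from scipy import stats


def t_confidence_interval(data, alpha=0.05):
    """(1 - alpha)100% interval: xbar +/- t_{n-1, alpha/2} * S / sqrt(n)."""
    n = len(data)
    xbar, s = mean(data), stdev(data)  # stdev uses the n - 1 divisor
    half_width = stats.t.ppf(1 - alpha / 2, n - 1) * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


def sample_size_for_margin(sigma, margin, alpha=0.05):
    """n = (z_{alpha/2} * sigma / E)^2, rounded up to an integer."""
    z = stats.norm.ppf(1 - alpha / 2)
    return math.ceil((z * sigma / margin) ** 2)


def t_test(data, mu0, alpha=0.05, alternative="two-sided"):
    """Return (T, reject) using the rejection regions listed in the notes."""
    n = len(data)
    df = n - 1
    t_stat = (mean(data) - mu0) / (stdev(data) / math.sqrt(n))
    if alternative == "two-sided":   # Ha: mu != mu0
        reject = abs(t_stat) > stats.t.ppf(1 - alpha / 2, df)
    elif alternative == "less":      # Ha: mu < mu0
        reject = t_stat < -stats.t.ppf(1 - alpha, df)
    elif alternative == "greater":   # Ha: mu > mu0
        reject = t_stat > stats.t.ppf(1 - alpha, df)
    else:
        raise ValueError("alternative must be 'two-sided', 'less', or 'greater'")
    return t_stat, bool(reject)


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    sample = [9.8, 10.2, 10.4, 9.9, 10.1, 10.3, 9.7, 10.0]
    print(t_confidence_interval(sample))
    print(sample_size_for_margin(sigma=2.0, margin=0.5))
    print(t_test(sample, mu0=10.0))
```

Rounding $n$ up rather than to the nearest integer keeps the achieved margin of error at or below the specified $E$ for the assumed $\sigma$.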