CH. 7 The t test
2009. 12. 5 (Sat.)
Jin-ju Yang

Talk outline
1. Application of the t distribution
2. Confidence interval for the mean from a small sample
3. Difference of sample mean from population mean (one sample t test)
4. Difference between means of two samples
5. Difference between means of paired samples (paired t test)

• The CLT is very powerful, but it has two limitations: 1) it depends on a large sample size, and 2) to use it, we need to know the standard deviation of the population (σ).
• In reality, we usually don't know the standard deviation of the population (σ), so we use the standard deviation of our sample (denoted 's') as an estimate.
• Since we are estimating the standard deviation using our sample, the sampling distribution will not be normal (even though it appears bell-shaped). It is a little shorter and wider than a normal distribution, and it is called a t-distribution.

The t-distribution is actually a family of distributions: there is a different distribution for each value of n − 1 (the degrees of freedom). The shape of t depends on the size of the sample. The larger the sample size, the more confident we can be that 's' is near 'σ', and the closer t gets to Z.

http://ocw.tufts.edu/Content/1/readings/193325

1. Application of the t distribution
• The application of the t distribution to the following four types of problem will now be considered.
1. The calculation of a confidence interval for a sample mean.
2. The mean and standard deviation of a sample are calculated and a value is postulated for the mean of the population. How significantly does the sample mean differ from the postulated population mean?
3. The means and standard deviations of two samples are calculated. Could both samples have been taken from the same population?
4. Paired observations are made on two samples (or in succession on one sample). What is the significance of the difference between the means of the two sets of observations?

2. CI for the mean from a small sample
• To find the number by which we must multiply the standard error to give the 95% confidence interval, we enter Table B at the degrees of freedom (17 in this example) in the left-hand column and read across to the column headed 0.05 to find the number 2.110.
• The 95% confidence intervals of the mean are now set as follows:
Mean + 2.110 SE to Mean - 2.110 SE
• Likewise, from Table B the 99% confidence interval of the mean is as follows:
Mean + 2.898 SE to Mean - 2.898 SE

3. One Sample t test
To test a sample of normal continuous data, we need:
• An expected value = the population or true mean (μ)
• An observed mean = the average of your sample
• A measure of spread: standard error
• Degrees of freedom (df) = n − 1, where n is the number of values used to calculate the SD or SE
• Then, we can calculate a test statistic to be compared to a known distribution. In the case of continuous, normal data, it is the t-statistic and the t-distribution.

http://ocw.tufts.edu/Content/1/readings/193325

4. Two Samples t test
• We can use the t test to compare the means of two different groups when the outcome is continuous, comparing the test statistic with the appropriate distribution to get a P value.
• Under the null hypothesis, we propose that the difference between the two group means equals 0.
• We can calculate an estimate of the SE of this difference from our data.

H0 : σ₁ = σ₂ = σ (Equal standard deviations)
• Obtain the standard deviation in sample 1: S₁
• Obtain the standard deviation in sample 2: S₂
• Multiply the square of the standard deviation of sample 1 by its degrees of freedom, which is the number of subjects minus one; repeat for sample 2.
• Add the two together and divide by the total degrees of freedom to obtain the pooled variance:
$S_p^2 = \dfrac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$
• The standard error of the difference between the means is
$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{S_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}$
• When the difference between the means is divided by this standard error the result is t. Thus,
$t = \dfrac{\bar{x}_1 - \bar{x}_2}{SE(\bar{x}_1 - \bar{x}_2)}$
• The table of the t distribution (Table B, appendix), which gives two-sided P values, is entered at $n_1 + n_2 - 2$ degrees of freedom.
• A 95% confidence interval for the difference between the two sample means $\bar{x}_1$ and $\bar{x}_2$ is given by
$(\bar{x}_1 - \bar{x}_2) \pm t_{0.05} \times SE(\bar{x}_1 - \bar{x}_2)$
where $t_{0.05}$ is the value in the 0.05 column of Table B at $n_1 + n_2 - 2$ degrees of freedom, and $n_1$, $n_2$ are the two sample sizes.

H₁ : σ₁ ≠ σ₂ (Unequal standard deviations)
• Rather than use the pooled estimate of variance, compute
$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}$
• This is analogous to calculating the standard error of the difference in two proportions under the alternative hypothesis, as described in Chapter 6.
• We now compute
$d = \dfrac{\bar{x}_1 - \bar{x}_2}{SE(\bar{x}_1 - \bar{x}_2)}$
• We then test this using a t statistic, in which the degrees of freedom are:
$df = \dfrac{\left( S_1^2/n_1 + S_2^2/n_2 \right)^2}{\dfrac{(S_1^2/n_1)^2}{n_1 - 1} + \dfrac{(S_2^2/n_2)^2}{n_2 - 1}}$
• In other words, there is a slight modification to allow for unequal variances: it uses a slightly different SE computation and adjusts the d.f. for the test.

5. Paired t test
• Sometimes data are paired. In this case, the "before" and "after" measurements are not independent: they are taken from the same person.
• What you are testing is the change in the same individual. When your data are paired, you create one set of data by calculating each person's change, then do a one-sample t test on those changes.
• Find the mean of the differences, $\bar{d}$.
• Find the standard deviation of the differences, SD.
• Calculate the standard error of the mean:
$SE(\bar{d}) = \dfrac{SD}{\sqrt{n}}$
• To calculate t, divide the mean of the differences by the standard error of the mean:
$t = \dfrac{\bar{d}}{SE(\bar{d})}$, with $n - 1$ degrees of freedom, where n is the number of pairs
• A 95% confidence interval for the mean difference is given by
$\bar{d} \pm t_{0.05} \times SE(\bar{d})$

Exercises
7.1 In 22 patients with an unusual liver disease the plasma alkaline phosphatase was found by a certain laboratory to have a mean value of 39 King-Armstrong units, standard deviation 3.4 units. What is the 95% confidence interval within which the mean of the population of such cases whose specimens come to the same laboratory may be expected to lie?
7.2 In the 18 patients with Everley's syndrome the mean level of plasma phosphate was 1.7 mmol/l, standard deviation 0.8. If the mean level in the general population is taken as 1.2 mmol/l, what is the significance of the difference between that mean and the mean of these 18 patients?
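
The calculations in this chapter can be sketched in a few lines of Python. This is a minimal illustration, not part of the original talk: the function names and the `T_CRIT_05` lookup are hypothetical, the critical value 2.110 (17 degrees of freedom) is the one quoted from Table B, and 2.080 (21 degrees of freedom) is assumed from a standard t table. The two-sample functions use the usual pooled-variance and unequal-variance formulas that match the steps described in section 4. The example at the bottom applies the sketch to the figures given in exercises 7.1 and 7.2.

```python
import math

# Two-sided 5% critical values of t, keyed by degrees of freedom.
# 2.110 (df = 17) is quoted in the slides; 2.080 (df = 21) is taken from a
# standard t table and is an assumption of this sketch.
T_CRIT_05 = {17: 2.110, 21: 2.080}


def standard_error(sd, n):
    """Standard error of a mean from the sample SD and the sample size."""
    return sd / math.sqrt(n)


def confidence_interval_95(mean, sd, n):
    """95% CI for a mean: mean -/+ t(0.05, n - 1) * SE."""
    margin = T_CRIT_05[n - 1] * standard_error(sd, n)
    return mean - margin, mean + margin


def one_sample_t(sample_mean, sd, n, population_mean):
    """t and df for a sample mean against a postulated population mean."""
    return (sample_mean - population_mean) / standard_error(sd, n), n - 1


def pooled_two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """t and df for two samples, assuming equal population SDs."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df


def unequal_sd_two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Statistic and adjusted df for two samples with unequal SDs."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    se_diff = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (mean1 - mean2) / se_diff, df


def paired_t(differences):
    """t and df for paired data, given each pair's difference."""
    n = len(differences)
    mean_d = sum(differences) / n
    sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in differences) / (n - 1))
    return mean_d / standard_error(sd_d, n), n - 1


if __name__ == "__main__":
    # Exercise 7.1: n = 22, mean 39, SD 3.4
    low, high = confidence_interval_95(39, 3.4, 22)
    print(f"7.1: 95% CI {low:.2f} to {high:.2f}")  # 37.49 to 40.51

    # Exercise 7.2: n = 18, mean 1.7, SD 0.8, population mean 1.2
    t, df = one_sample_t(1.7, 0.8, 18, 1.2)
    print(f"7.2: t = {t:.3f} on {df} df")  # t = 2.652 on 17 df
```

For exercise 7.2, t = 2.652 on 17 degrees of freedom lies between the Table B values 2.110 (0.05 column) and 2.898 (0.01 column) quoted in section 2, so 0.01 < P < 0.05. The critical values are hard-coded so that the sketch depends only on the standard library; a statistics package could supply them for any degrees of freedom.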