CH. 7 The t test
2009. 12. 5 (Sat.)
Jin-ju Yang
Talk outline
1. Application of the t distribution
2. Confidence interval for the mean from a small sample
3. Difference of sample mean from population mean (one sample t test)
4. Difference between means of two samples
5. Difference between means of paired samples (paired t test)
• The central limit theorem (CLT) is very powerful, but it has two limitations: 1) it depends on a large sample size, and 2) to use it, we need to know the standard deviation of the population (σ).
• In reality, we usually don’t know the standard deviation of the population (σ), so we use the standard deviation of our sample (denoted ‘s’) as an estimate.
• Since we are estimating the standard deviation from our sample, the sampling distribution will not be normal (even though it appears bell-shaped). It is a little shorter and wider than a normal distribution, and it is called a t-distribution. The t-distribution is actually a family of distributions: there is a different distribution for each value of n-1 (the degrees of freedom). The shape of t depends on the size of the sample: the larger the sample, the more confident we can be that ‘s’ is near ‘σ’, and the closer t gets to Z.
http://ocw.tufts.edu/Content/1/readings/193325
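The claim that t gets closer to Z as the sample grows can be checked with a small simulation. This is only a sketch under stated assumptions: the function name `tail_fraction`, the number of trials, and the seed are illustrative choices, not part of the original slides. It draws samples from a standard normal population, computes t using the sample standard deviation s, and counts how often |t| exceeds the normal 5% cutoff of 1.96.

```python
import math
import random
import statistics


def tail_fraction(n, trials=5000, cutoff=1.96, seed=0):
    """Fraction of simulated samples of size n whose |t| exceeds `cutoff`.

    Samples come from a normal population with true mean 0, so any
    excess over 5% reflects the extra spread from estimating sigma with s.
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = statistics.fmean(sample)
        s = statistics.stdev(sample)       # sample SD, divisor n - 1
        t = mean / (s / math.sqrt(n))      # (mean - 0) / standard error
        if abs(t) > cutoff:
            exceed += 1
    return exceed / trials


small = tail_fraction(5)     # 4 degrees of freedom: wide tails
large = tail_fraction(100)   # 99 degrees of freedom: close to normal
print(f"n=5:   {small:.3f} of |t| values exceed 1.96")
print(f"n=100: {large:.3f} of |t| values exceed 1.96")
```

With a small sample, noticeably more than 5% of the t values fall beyond 1.96, which is why the multiplier has to be read from the t table rather than the normal table.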
1. Application of the t distribution
• The application of the t distribution to the following four types of problem will now be considered.
1. The calculation of a confidence interval for a sample mean.
2. The mean and standard deviation of a sample are calculated and a value is postulated for the mean of the population. How significantly does the sample mean differ from the postulated population mean?
3. The means and standard deviations of two samples are calculated. Could both samples have been taken from the same population?
4. Paired observations are made on two samples (or in succession on one sample). What is the significance of the difference between the means of the two sets of observations?
2. CI for the mean from a small sample
• To find the number by which we must multiply the standard error to give the 95% confidence interval, we enter Table B at 17 degrees of freedom in the left-hand column and read across to the column headed 0.05, which gives 2.110.
• The 95% confidence interval of the mean is then:
Mean + 2.110 SE to Mean - 2.110 SE
• Likewise, from Table B the 99% confidence interval of the mean is:
Mean + 2.898 SE to Mean - 2.898 SE
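As a minimal sketch of the calculation above, the interval is the mean plus or minus the tabulated multiplier times the standard error. The helper name `mean_confidence_interval` and the sample values (mean 50.0, SD 6.0) are hypothetical; only the multipliers 2.110 and 2.898 for 17 degrees of freedom come from the text, and the multiplier is passed in by hand because it is read from Table B.

```python
import math


def mean_confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) limits: mean -/+ t_crit * SE, with SE = sd / sqrt(n)."""
    se = sd / math.sqrt(n)
    return mean - t_crit * se, mean + t_crit * se


# Hypothetical sample of n = 18, so df = 17 and the Table B multipliers apply.
low95, high95 = mean_confidence_interval(mean=50.0, sd=6.0, n=18, t_crit=2.110)
low99, high99 = mean_confidence_interval(mean=50.0, sd=6.0, n=18, t_crit=2.898)
print(f"95% CI: {low95:.2f} to {high95:.2f}")
print(f"99% CI: {low99:.2f} to {high99:.2f}")
```

The 99% interval is wider than the 95% interval because the multiplier is larger for the same standard error.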
3. One Sample t test
To test a sample of normal continuous data, we need:
• An expected value = the population or true mean (μ)
• An observed mean = the average of your sample
• A measure of spread: standard error
• Degrees of freedom (df) = n-1 (number of values used to calculate SD or SE)
• Then, we can calculate a test statistic to be compared to a known distribution. In the case of continuous, normal data, it’s the t-statistic and the t-distribution.
http://ocw.tufts.edu/Content/1/readings/193325
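The ingredients listed above combine into t = (observed mean - μ) / SE, with n-1 degrees of freedom. The following is a sketch only: the function name `one_sample_t` and the data values are hypothetical, and the resulting t still has to be compared with the t table at the returned degrees of freedom.

```python
import math
import statistics


def one_sample_t(values, mu):
    """Return (t, df) for testing whether the sample mean differs from mu."""
    n = len(values)
    mean = statistics.fmean(values)                 # observed mean
    se = statistics.stdev(values) / math.sqrt(n)    # standard error of the mean
    return (mean - mu) / se, n - 1


# Hypothetical measurements and a hypothetical expected mean of 5.0.
values = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3]
t, df = one_sample_t(values, mu=5.0)
print(f"t = {t:.3f} on {df} degrees of freedom")
```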
4. Two Samples t test
• We can use the t-test to compare the means of two different groups when the outcome is continuous data: we calculate a test statistic and compare it with the appropriate distribution to get a p-value.
• Under the null hypothesis, we propose that the difference between the two group means equals 0.
• We can calculate an estimate of the SE of this difference from our data.
H0 : σ₁ = σ₂ = σ (Equal standard deviations)
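• Obtain the standard deviation in sample 1: S₁
• Obtain the standard deviation in sample 2: S₂
• Multiply the square of the standard deviation of sample 1 by its degrees of freedom, which is the number of subjects minus one: $(n_1 - 1) S_1^2$. Repeat for sample 2: $(n_2 - 1) S_2^2$.
• Add the two together and divide by the total degrees of freedom to obtain the pooled variance:
$s_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$
• The standard error of the difference between the means is
$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$
• When the difference between the means is divided by this standard error the result is t. Thus,
$t = \frac{\bar{x}_1 - \bar{x}_2}{SE(\bar{x}_1 - \bar{x}_2)}$
• The table of the t distribution, Table B (appendix), which gives two-sided P values, is entered at $n_1 + n_2 - 2$ degrees of freedom.
• A 95% confidence interval is given by
$(\bar{x}_1 - \bar{x}_2) \pm t_{0.05} \times SE(\bar{x}_1 - \bar{x}_2)$
where $t_{0.05}$ is the Table B value in the 0.05 column at $n_1 + n_2 - 2$ degrees of freedom.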
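The equal-standard-deviation steps can be sketched from summary statistics as follows. The function name `pooled_t` and the example numbers are hypothetical. The sketch assumes the usual pooled-variance form: the two sums of squares are added, divided by n1 + n2 - 2, and the result is spread over both sample sizes to give the standard error.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t test assuming equal population standard deviations.

    Returns (t, df, se_diff), where se_diff is the standard error of the
    difference between the two means.
    """
    df = n1 + n2 - 2                                    # total degrees of freedom
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se_diff
    return t, df, se_diff


# Hypothetical summary statistics for two independent groups.
t_pooled, df_pooled, se_pooled = pooled_t(10.0, 2.0, 10, 8.0, 3.0, 15)
print(f"t = {t_pooled:.3f} on {df_pooled} df, SE of difference = {se_pooled:.3f}")
```

A 95% confidence interval for the difference then follows by multiplying `se_pooled` by the 0.05-column table value at `df_pooled` degrees of freedom and adding and subtracting the result from the difference in means.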
H₁ : σ₁ ≠ σ₂ (Unequal standard deviations)
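• Rather than use the pooled estimate of variance, compute
$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$
• This is analogous to calculating the standard error of the difference in two proportions under the alternative hypothesis as described in Chapter 6.
• We now compute
$d = \frac{\bar{x}_1 - \bar{x}_2}{SE(\bar{x}_1 - \bar{x}_2)}$
• We then test this using a t statistic, in which the degrees of freedom are:
$df = \frac{\left( S_1^2/n_1 + S_2^2/n_2 \right)^2}{\frac{(S_1^2/n_1)^2}{n_1 - 1} + \frac{(S_2^2/n_2)^2}{n_2 - 1}}$
• This is a slight modification of the two-sample test to allow for unequal variances: it uses a slightly different SE computation and adjusts the d.f. for the test.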
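A sketch of the unequal-variance version follows. It assumes the degrees-of-freedom adjustment meant here is the standard Welch-Satterthwaite approximation; the function name `welch_t` and the example numbers are hypothetical.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t test without assuming equal standard deviations.

    Returns (t, df, se_diff); df uses the Welch-Satterthwaite approximation
    and is generally not a whole number.
    """
    v1 = sd1 ** 2 / n1                 # variance of the first sample mean
    v2 = sd2 ** 2 / n2                 # variance of the second sample mean
    se_diff = math.sqrt(v1 + v2)       # no pooling of the two variances
    t = (mean1 - mean2) / se_diff
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se_diff


# Hypothetical summary statistics with clearly different spreads.
t_w, df_w, se_w = welch_t(10.0, 2.0, 10, 8.0, 5.0, 15)
print(f"t = {t_w:.3f} on about {df_w:.1f} df")

# With equal SDs and equal group sizes the adjusted df matches n1 + n2 - 2.
_, df_equal, _ = welch_t(10.0, 2.0, 12, 8.0, 2.0, 12)
print(f"equal-case df = {df_equal:.1f}")
```

Because the adjusted degrees of freedom are usually fractional, a printed table is entered at the nearest lower whole number as a conservative choice.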
5. Paired t test
• Sometimes data are paired. In this case, the “before” and “after” measurements are not independent: they are taken from the same person.
• What you are testing is the change within the same individual. When your data are paired, you create one set of data by calculating each person’s change and then do a one-sample t-test on those changes.
• Find the mean of the differences, $\bar{d}$.
• Find the standard deviation of the differences, SD.
• Calculate the standard error of the mean: $SE(\bar{d}) = SD / \sqrt{n}$, where n is the number of pairs.
• To calculate t, divide the mean of the differences by the standard error of the mean: $t = \bar{d} / SE(\bar{d})$, with n-1 degrees of freedom.
• A 95% confidence interval for the mean difference is given by $\bar{d} \pm t_{0.05} \times SE(\bar{d})$
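The paired procedure reduces to a one-sample test on the within-person differences, which can be sketched as below. The function name `paired_t` and the before/after values are hypothetical.

```python
import math
import statistics


def paired_t(before, after):
    """Paired t test: returns (t, df, mean_diff, se) from matched observations."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    diffs = [a - b for b, a in zip(before, after)]    # each person's change
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)       # SE of the mean difference
    return mean_diff / se, n - 1, mean_diff, se


# Hypothetical measurements on the same five people.
before = [12.0, 15.0, 11.0, 14.0, 13.0]
after = [14.0, 16.0, 11.5, 17.0, 14.5]
t_paired, df_paired, mean_diff, se_paired = paired_t(before, after)
print(f"mean change = {mean_diff:.2f}, t = {t_paired:.3f} on {df_paired} df")
```

The degrees of freedom count pairs, not individual measurements, so ten measurements on five people give 4 degrees of freedom.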
Exercises
7.1 In 22 patients with an unusual liver disease the plasma alkaline phosphatase was found by a certain laboratory to have a mean value of 39 King-Armstrong units, standard deviation 3.4 units. What is the 95% confidence interval within which the mean of the population of such cases whose specimens come to the same laboratory may be expected to lie?
7.2 In the 18 patients with Everley's syndrome the mean level of plasma phosphate was 1.7 mmol/l, standard deviation 0.8. If the mean level in the general population is taken as 1.2 mmol/l, what is the significance of the difference between that mean and the mean of these 18 patients?