10 - Comparing Two Population Means ($\mu_1$ vs. $\mu_2$)
General Idea:
Example of Independent vs. Dependent Samples -
10.1 - Comparing Two Population Means Using Dependent Samples
When using dependent samples, each observation from population 1 has a one-to-one
correspondence with an observation from population 2. One of the most common cases
where this arises is when we measure the response on the same subjects before and after
treatment. This is commonly called a “pre-test/post-test” situation. However, sometimes
we have pairs of subjects in the two populations meaningfully matched on some
pre-specified criteria. For example, we might match individuals who are the same race,
gender, socio-economic status, height, weight, etc. to control for the influence these
characteristics might have on the response of interest. When this is done we say that we
are “controlling for the effects of race, gender, etc.” By using matched pairs of subjects
we are in effect removing the effect of potential confounding factors, thus giving us a
clearer picture of the difference between the two populations being studied.
DATA FORMAT
Matched Pair    $X_{1i}$    $X_{2i}$    $d_i = X_{1i} - X_{2i}$
1               $X_{11}$    $X_{21}$    $d_1$
2               $X_{12}$    $X_{22}$    $d_2$
3               $X_{13}$    $X_{23}$    $d_3$
...             ...         ...         ...
n               $X_{1n}$    $X_{2n}$    $d_n$
The hypotheses are
$H_0: \mu_d = 0$
$H_a: \mu_d > 0$ or $H_a: \mu_d < 0$ or $H_a: \mu_d \ne 0$
For the sample paired differences (the $d_i$'s) find the sample mean ($\bar{d}$)
and standard deviation ($s_d$).
We can actually hypothesize any size difference we want for the
mean of the paired differences. For example,
if we wanted to show that a certain diet resulted in at least a 10 lb.
decrease in weight, then we could test whether the paired
differences $d$ = Initial weight - After-diet weight have a
mean greater than 10 ($H_a: \mu_d > 10$ lbs.).
Test Statistic for a Paired t-test
$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} \sim \text{t-distribution with } df = n - 1$$
Note: $\mu_d$ = the hypothesized value for the mean paired difference.
Confidence Interval for $\mu_d$
$$\bar{d} \pm t \left( \frac{s_d}{\sqrt{n}} \right)$$
where $t$ comes from the appropriate quantile of the t-distribution with df = n - 1.
This interval has a specified % chance of covering the true mean paired difference.
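The paired t statistic and the confidence interval can be sketched in a few lines of Python. This is only an illustration using the standard library: the function names `paired_t` and `paired_ci` are made up for this sketch, the weights are hypothetical (they are not data from these notes), and the t quantile must be supplied from a t-table or from JMP because the standard library has no t-distribution.

```python
import math
import statistics


def paired_t(x1, x2, mu_d0=0.0):
    """Return (d_bar, s_d, t, df) for paired samples x1 and x2.

    mu_d0 is the hypothesized mean paired difference (0 by default).
    """
    if len(x1) != len(x2):
        raise ValueError("paired samples must have the same length")
    d = [a - b for a, b in zip(x1, x2)]      # d_i = X_1i - X_2i
    n = len(d)
    d_bar = statistics.mean(d)               # sample mean of the differences
    s_d = statistics.stdev(d)                # sample SD (n - 1 denominator)
    t = (d_bar - mu_d0) / (s_d / math.sqrt(n))
    return d_bar, s_d, t, n - 1


def paired_ci(d_bar, s_d, n, t_quantile):
    """Confidence interval d_bar +/- t * s_d / sqrt(n)."""
    margin = t_quantile * s_d / math.sqrt(n)
    return d_bar - margin, d_bar + margin


# Hypothetical weights (lb) for five dieters, NOT data from these notes.
before = [210.0, 195.0, 240.0, 188.0, 225.0]
after = [196.0, 186.0, 222.0, 179.0, 210.0]

# Test H0: mu_d = 10 against Ha: mu_d > 10, as in the diet illustration.
d_bar, s_d, t, df = paired_t(before, after, mu_d0=10.0)
# 2.776 is the tabled 0.975 quantile of the t-distribution with df = 4.
low, high = paired_ci(d_bar, s_d, len(before), t_quantile=2.776)
print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.2f}, t = {t:.3f}, df = {df}")
print(f"95% CI for mu_d: ({low:.2f}, {high:.2f})")
```

The p-value would then be read from the t-distribution with the reported df, for example with the t-Probability Calculator in JMP.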
Example 1: Effect of Captopril on Blood Pressure
In order to estimate the effect of the drug Captopril on blood pressure (both systolic and
diastolic), the drug was administered to a random sample of n = 15 subjects. Each subject's
blood pressure was recorded before taking the drug and again 30 minutes after taking the
drug. The variables recorded are listed below.
Syspre – initial systolic blood pressure
Syspost – systolic blood pressure 30 minutes after taking the drug
Diapre – initial diastolic blood pressure
Diapost – diastolic blood pressure 30 minutes after taking the drug
Research Questions:
• Is there evidence to suggest that Captopril results in a systolic blood pressure
decrease of at least 10 mmHg on average in patients 30 minutes after taking it?
• Is there evidence to suggest that Captopril results in a diastolic blood pressure
decrease of at least 5 mmHg on average in patients 30 minutes after taking it?
For each blood pressure we need to consider paired differences of the form
$d_i = \text{BPpre}_i - \text{BPpost}_i$. For paired differences defined this way, positive values
correspond to a reduction in blood pressure ½ hour after taking Captopril. To
answer the research questions above we need to conduct the following hypothesis tests:
$H_0: \mu_{\text{syspre} - \text{syspost}} = 10$ mmHg vs. $H_a: \mu_{\text{syspre} - \text{syspost}} > 10$ mmHg
and
$H_0: \mu_{\text{diapre} - \text{diapost}} = 5$ mmHg vs. $H_a: \mu_{\text{diapre} - \text{diapost}} > 5$ mmHg
Below are the relevant statistical summaries of the paired differences for both blood
pressure measurements.
The t-statistics for both tests are given below:
Systolic BP
Diastolic BP
We can use the t-Probability Calculator in JMP to find the associated p-values or better
yet use JMP to conduct the entire t-test.
Systolic Blood Pressure
Diastolic Blood Pressure
Both tests result in rejection of the null hypotheses. Thus we have sufficient evidence to
suggest that taking Captopril will result in a mean decrease in systolic blood pressure
exceeding 10 mmHg (p = _______) and a mean decrease in diastolic blood pressure
exceeding 5 mmHg (p = _______). Furthermore, we estimate that the mean change in
systolic blood pressure will be somewhere between _______ mmHg and ______ mmHg,
and that the mean change in diastolic blood pressure could be as large as ______ mmHg.
Example 2 - Middle Ear Effusion in Breast-Fed and Bottle-Fed Infants (Hw 3)
A common symptom of otitis media in young children is the prolonged presence of
fluid in the middle ear, known as middle-ear effusion. The presence of fluid may result in
temporary hearing loss and interfere with normal learning skills in the first two years of
life. One hypothesis is that babies who are breast-fed for at least 1 month build up some
immunity against the effects of the disease and have less prolonged effusion than do
bottle-fed babies. A small study of 24 pairs of babies is set up, where the babies are
matched on a one-to-one basis according to age, sex, socioeconomic status, and type of
medications taken. One member of the matched pair is a breast-fed baby, and the other
member is a bottle-fed baby. The outcome variable is the duration of middle-ear effusion
after the first episode of otitis media. The results are shown below.
Pair      Duration of effusion    Duration of effusion    Paired Difference
Number    in breast-fed baby      in bottle-fed baby      d = Bottle - Breast
1         20                      18                      -2
2         11                      35                      24
3         3                       7                       4
4         24                      182                     158
5         7                       6                       -1
6         28                      33                      5
7         58                      223                     165
8         7                       7                       0
9         39                      57                      18
10        17                      76                      59
11        17                      186                     169
12        12                      29                      17
13        52                      39                      -13
14        14                      15                      1
15        12                      21                      9
16        30                      28                      -2
17        7                       8                       1
18        15                      27                      12
19        65                      77                      12
20        10                      12                      2
21        7                       8                       1
22        19                      16                      -3
23        34                      28                      -6
24        25                      20                      -5
Do these data provide evidence that breast-fed babies have shorter durations of effusion
when compared to bottle-fed babies that are the same age, sex, socioeconomic status, and
on the same medications?
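As a rough check on a hand calculation, the summary statistics and paired t statistic for this question can be computed directly from the two duration columns tabulated above. This is a minimal sketch using only the Python standard library; it recomputes each difference as Bottle - Breast and tests $H_0: \mu_d = 0$ against the upper-tail alternative $H_a: \mu_d > 0$ (bottle-fed babies have longer effusion). It stops at the test statistic, since the p-value needs a t-table or JMP.

```python
import math
import statistics

# Duration of effusion for the 24 matched pairs, as tabulated above.
breast = [20, 11, 3, 24, 7, 28, 58, 7, 39, 17, 17, 12,
          52, 14, 12, 30, 7, 15, 65, 10, 7, 19, 34, 25]
bottle = [18, 35, 7, 182, 6, 33, 223, 7, 57, 76, 186, 29,
          39, 15, 21, 28, 8, 27, 77, 12, 8, 16, 28, 20]

d = [bo - br for bo, br in zip(bottle, breast)]   # d = Bottle - Breast
n = len(d)
d_bar = statistics.mean(d)                        # sample mean difference
s_d = statistics.stdev(d)                         # sample SD of the differences
t = (d_bar - 0) / (s_d / math.sqrt(n))            # H0: mu_d = 0

print(f"n = {n}, d_bar = {d_bar:.2f}, s_d = {s_d:.2f}, t = {t:.2f}, df = {n - 1}")
```

A few pairs have very large differences, so it would be sensible to look at a normal quantile plot of the differences before relying on the t-test.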
10.2 – Comparing Two Population Means Using Independent Samples
Basic Form of Hypotheses:
$H_0: \mu_1 = \mu_2$ or $(\mu_1 - \mu_2) = 0$
$H_a: \mu_1 > \mu_2$ or $(\mu_1 - \mu_2) > 0$ (upper-tail test)
$H_a: \mu_1 < \mu_2$ or $(\mu_1 - \mu_2) < 0$ (lower-tail test)
$H_a: \mu_1 \ne \mu_2$ or $(\mu_1 - \mu_2) \ne 0$ (two-tailed test, use CI for $(\mu_1 - \mu_2)$ to make decision)
Example 1: Normal Body Temperature and Gender
These data were taken from the results of a study presented in Mackowiak, P. A.,
Wasserman, S. S., and Levine, M. M. (1992), "A Critical Appraisal of 98.6 Degrees F,
the Upper Limit of the Normal Body Temperature, and Other Legacies of Carl Reinhold
August Wunderlich", Journal of the American Medical Association, 268, 1578-1580.
The data consist of body temperatures and heart rates for random samples of 65 healthy
men and 65 healthy women.
Research Questions:
1) Is there evidence that 98.6 degrees F is not the normal human body temperature?
2) Is there evidence that the normal mean human body temperature differs across
gender?
Research Question 1: Is 98.6°F the normal mean body temperature?
Research Question 2: Is there evidence to suggest that the mean normal human body
temperature differs across gender?
Assumptions for Independent Samples Test
1. The two groups must be independent of each other.
2. The two populations should be normally distributed.
3. Decide whether or not we wish to assume the population variances are equal.
(see two cases below)
Assessing Normality of the Two Sampled Populations
To assess normality we select Normal Quantile Plot from the Oneway Analysis pull-down menu as shown below.
Normality appears to be satisfied here.
Two Cases
Case 1 - Equal Population Variances/Standard Deviations
($\sigma_1^2 = \sigma_2^2 = \sigma^2$, the variance common to both populations)
Rule of Thumb for Checking Variance Equality
If the larger sample variance is more than twice the smaller sample variance do not assume
the variances are equal.
Formally Checking the Equality of the Population Variances
To test the equality of the population variances select Unequal Variances from the
Oneway Analysis pull-down menu.
The test is:
$H_0: \sigma_F = \sigma_M$
$H_a: \sigma_F \ne \sigma_M$
JMP gives four different tests for examining the equality of population variances. To use
the results of these tests simply examine the resulting p-values. If any/all are less than
.10 or .05, then worry about the assumption of equal variances and use the unequal
variance t-Test instead of the pooled t-Test.
[JMP output: p-values for testing equality of population variances]
Confidence Interval for Difference in Population Means ($\mu_1 - \mu_2$)
$$(\bar{X}_1 - \bar{X}_2) \pm t \cdot SE(\bar{X}_1 - \bar{X}_2)$$
where the standard error of the difference when we assume equal population variances is
given by the following:
$$SE(\bar{X}_1 - \bar{X}_2) = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$
where
$$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} \quad \text{if } n_1 \ne n_2$$
$$s_p^2 = \frac{s_1^2 + s_2^2}{2} \quad \text{if } n_1 = n_2$$
$s_p^2$ is called the “pooled estimate of the common variance ($\sigma^2$)”. The degrees of
freedom for the t-distribution is $df = n_1 + n_2 - 2$.
CI for ($\mu_F - \mu_M$)
Hypothesis Testing ($\mu_1$ vs. $\mu_2$)
The general null hypothesis says that the two population means are equal, or equivalently
their difference is zero. The alternative or research hypothesis can be any one of the
three usual choices (upper-tail, lower-tail, or two-tailed). For the two-tailed case we can
perform the test by using the confidence interval for the difference in the population means
discussed above.
$H_0: \mu_1 = \mu_2$ or equivalently $(\mu_1 - \mu_2) = 0$
$H_a: \mu_1 > \mu_2$ or equivalently $(\mu_1 - \mu_2) > 0$ (upper-tail)
or
$H_a: \mu_1 \ne \mu_2$ or equivalently $(\mu_1 - \mu_2) \ne 0$ (two-tailed, USE CI!)
etc.
Test Statistic
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{SE(\bar{X}_1 - \bar{X}_2)} \sim \text{t-distribution with } df = n_1 + n_2 - 2$$
where $SE(\bar{X}_1 - \bar{X}_2)$ is as defined in the confidence interval section above and
$(\mu_1 - \mu_2)$ is the hypothesized difference (0 under $H_0$).
Computing the Test Statistic for Body Temperature Example
Summary Statistics
$\bar{x}_F = 98.39$, $\bar{x}_M = 98.10$
$s_F = .743$, $s_M = .699$
$n_F = 65$, $n_M = 65$
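The pooled test statistic for the body temperature comparison can be worked out from these summary statistics alone. The sketch below uses only the Python standard library; the function name `pooled_t` is made up for this illustration, and the hypothesized difference `delta0` defaults to 0 as in the null hypothesis above.

```python
import math


def pooled_t(mean1, mean2, s1, s2, n1, n2, delta0=0.0):
    """Pooled two-sample t statistic; returns (s_p^2, SE, t, df)."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df   # pooled variance
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))            # SE of the difference
    t = ((mean1 - mean2) - delta0) / se
    return sp2, se, t, df


# Summary statistics for females (F) and males (M) given above.
sp2, se, t, df = pooled_t(98.39, 98.10, 0.743, 0.699, 65, 65)
print(f"s_p^2 = {sp2:.4f}, SE = {se:.4f}, t = {t:.2f}, df = {df}")
# prints: s_p^2 = 0.5203, SE = 0.1265, t = 2.29, df = 128
```

Because the summary statistics are rounded, the value JMP reports from the raw data may differ slightly in the last digit.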
Performing the Test to Compare the Population Means in JMP
To perform the two-sample t-test for independent samples:
• assuming equal population variances, select the Means/Anova/Pooled t option
from the Oneway Analysis pull-down menu.
• assuming unequal population variances, select t-Test from the Oneway Analysis
pull-down menu.
Because we have no evidence against the equality of the population variances
assumption, we will use a pooled t-Test to compare the population means.
Several new boxes of output will appear below the graph once the appropriate option has
been selected, some of which we will not concern ourselves with. The relevant box for us
will be labeled t Test as shown below for the mean body temperature comparison.
Because we have concluded that the equality of variance assumption is reasonable for
these data, we can refer to the output for the t-Test assuming equal variances.
Example 2: Effect of Cadmium Oxide on Hemoglobin Levels in Dogs
An experiment was conducted to examine the potential effect cadmium
oxide might have on the hemoglobin levels of dogs. It is thought that cadmium oxide
exposure would lead to decreased hemoglobin levels. Ten dogs were randomly assigned to
the control group and 15 were randomly assigned to the cadmium oxide exposure group.
Research Question: Is there evidence to suggest that cadmium oxide exposure lowers the
hemoglobin level found in dogs?
To answer the question of interest we need tools for comparing the population mean
hemoglobin level for dogs not exposed to cadmium oxide vs. that for dogs that have had
cadmium oxide exposure, i.e. how does $\mu_{\text{control}}$ compare to $\mu_{\text{exposed}}$?
Assumptions for Independent Samples Test
1. The two groups must be independent of each other.
2. The observations from each group should be normally distributed.
3. Decide whether or not we wish to assume the population variances are equal.
Cadmium Exposure and Hemoglobin Levels: 95% CI for ($\mu_C - \mu_E$)
Hypothesis Test
Case 2 - Unequal Population Variances/Standard Deviations ($\sigma_1 \ne \sigma_2$)
Assumptions:
For this case we make the following assumptions
1. The samples from the two populations were drawn independently.
2. The population variances/standard deviations are NOT equal.
(This can be formally tested, or use the rule of thumb presented above.)
3. The populations are both normally distributed. This assumption can be relaxed
when the samples from both populations are “large”.
Confidence Interval for ($\mu_1 - \mu_2$)
$$(\bar{X}_1 - \bar{X}_2) \pm t \cdot SE(\bar{X}_1 - \bar{X}_2)$$
where
$$SE(\bar{X}_1 - \bar{X}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
and
$$df = \frac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{\left( s_1^2 / n_1 \right)^2}{n_1 - 1} + \dfrac{\left( s_2^2 / n_2 \right)^2}{n_2 - 1}}$$
rounded down to the nearest integer.
The t-quantiles are the same as those we have seen previously.
Hypothesis Testing
Test Statistic
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{SE(\bar{X}_1 - \bar{X}_2)} \sim \text{t-distribution with } df \text{ given by the formula above}$$
where $SE(\bar{X}_1 - \bar{X}_2)$ is as defined above.
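The unequal-variance standard error and the degrees-of-freedom formula are tedious by hand, so a short sketch may help. It uses only the Python standard library; the function names `welch_se_df` and `welch_t` are made up for this illustration. Since no summary statistics are given in these notes for an unequal-variance example, the body temperature summary statistics from Case 1 are reused purely to show the arithmetic.

```python
import math


def welch_se_df(s1, s2, n1, n2):
    """Return (SE, df) for the unequal-variance two-sample t procedure."""
    v1 = s1**2 / n1                        # s_1^2 / n_1
    v2 = s2**2 / n2                        # s_2^2 / n_2
    se = math.sqrt(v1 + v2)
    df_raw = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return se, math.floor(df_raw)          # round df down, as in the notes


def welch_t(mean1, mean2, s1, s2, n1, n2, delta0=0.0):
    """Unequal-variance t statistic for a hypothesized difference delta0."""
    se, df = welch_se_df(s1, s2, n1, n2)
    return ((mean1 - mean2) - delta0) / se, df


# Illustration only: body temperature summary statistics (F vs. M).
t, df = welch_t(98.39, 98.10, 0.743, 0.699, 65, 65)
print(f"t = {t:.2f}, df = {df}")
```

With equal sample sizes the standard error is the same as in the pooled case, so only the degrees of freedom change here (127 rather than 128).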
Example: Cell Radii of Malignant vs. Benign Breast Tumors
In your previous work with these data you noticed that the radii of malignant breast
tumor cells were generally larger than the radii of benign breast tumor cells. Assuming
the researchers initially hypothesized that cancerous breast tumor cells have larger radii
than non-cancerous cells, conduct a test to see if this is supported by these data.
The cell radii of the malignant tumors certainly appear to be larger than the cell radii of
the benign tumors. The summary statistics support this, with sample means/medians of
roughly 17 and 12 units, respectively. The 95% CIs for the mean cell radius for the two
tumor groups do not overlap, which further supports the existence of a significant
difference in the cell radii.
FYI: Formally Testing the Equality of Population Variances
$H_0: \sigma_1^2 = \sigma_2^2$ or equivalently $H_0: \sigma_1 = \sigma_2$
$H_a: \sigma_1^2 \ne \sigma_2^2$ or equivalently $H_a: \sigma_1 \ne \sigma_2$
Test Statistic
$$F = \max\left( \frac{s_1^2}{s_2^2}, \frac{s_2^2}{s_1^2} \right)$$
which has an F-distribution with numerator df = $n_1 - 1$ and denominator df = $n_2 - 1$ if
$s_1^2 > s_2^2$; these are reversed if $s_1^2 < s_2^2$.
If F is large then one variance is several times larger than
the other and we should reject the null in favor of the
alternative. There is a separate F-table for each level of
significance. If our test statistic value exceeds the value in
the table for the appropriate level of significance and degrees of
freedom we reject the null hypothesis. BETTER TO JUST
USE JMP!!!
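The F statistic and the earlier rule of thumb (do not assume equal variances when the larger sample variance is more than twice the smaller) are both simple ratios. The sketch below uses plain Python; the function names are made up for this illustration, and the body temperature standard deviations are reused as example inputs because the tumor radii summary statistics are not listed in these notes.

```python
def variance_ratio(s1, s2, n1, n2):
    """Return (F, numerator df, denominator df) with the larger variance on top."""
    v1, v2 = s1**2, s2**2
    if v1 >= v2:
        return v1 / v2, n1 - 1, n2 - 1
    return v2 / v1, n2 - 1, n1 - 1


def rule_of_thumb_equal(s1, s2):
    """True if the larger sample variance is at most twice the smaller one."""
    v1, v2 = s1**2, s2**2
    return max(v1, v2) <= 2 * min(v1, v2)


# Body temperature summary statistics: s_F = 0.743, s_M = 0.699, n = 65 each.
F, df_num, df_den = variance_ratio(0.743, 0.699, 65, 65)
print(f"F = {F:.3f} on ({df_num}, {df_den}) df")
print("assume equal variances?", rule_of_thumb_equal(0.743, 0.699))
```

The computed F would still have to be compared with an F-table value (or a p-value from JMP) to complete the formal test.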
Because we conclude that the population variances are unequal, we should use the non-pooled version of the two-sample t-test. No one does this by hand, so we will use JMP.
Conclusion:
10.3 - Comparing Two Population Proportions Using Independent Samples ($p_1$ vs. $p_2$)
Confidence Interval for ($p_1 - p_2$)
$$(\hat{p}_1 - \hat{p}_2) \pm z \cdot SE(\hat{p}_1 - \hat{p}_2) \quad \text{(provided } n_1 \text{ and } n_2 \text{ are “large”)}$$
where
$$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}$$
“Large” sample sizes: Both samples should be larger than 25, and both samples should
have more than 5 “successes” and more than 5 “failures”.
The value of z depends on the confidence level:
Confidence Level          z
95% ($\alpha = .05$)      1.96
90% ($\alpha = .10$)      1.645
99% ($\alpha = .01$)      2.576
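Hypothesis Testing
Basic Hypotheses
Test Statistic
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{SE(\hat{p}_1 - \hat{p}_2)} \sim \text{standard normal dist. provided } n_1, n_2 \text{ are “large” (see above)}$$
where, for the test, the standard error is computed from the pooled proportion rather
than from the confidence interval formula above:
$$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\bar{p}\,\bar{q} \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$
where
$$\bar{p} = \frac{\text{number of successes in combined sample}}{n_1 + n_2} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}, \qquad \bar{q} = 1 - \bar{p}$$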
Example: In a study conducted to investigate the non-clinical factors associated with
the method of surgical treatment received for early-stage breast cancer, some patients
underwent a modified radical mastectomy while others had a partial mastectomy
accompanied by radiation therapy. We are interested in determining whether the age of
the patient affects the type of treatment she receives. In particular, we want to know
whether the proportions of women under 55 are identical in the two treatment groups.
A sample of n = 658 women who underwent a partial mastectomy and subsequent
radiation therapy contained 292 women under 55, which is a sample percentage of 44.4%.
Another independently drawn sample of n = 1580 women who received a modified
radical mastectomy contained 397 women under 55, which is a sample percentage of 25.1%.
Conduct a test comparing the proportion of women in each group under the age of 55
and construct a 95% confidence interval for the difference in these two proportions.
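The calculations for this example can be sketched with the Python standard library. The function names below are made up for this illustration. The confidence interval uses the unpooled standard error and the test uses the pooled standard error, matching the two formulas in this section; the two-tailed p-value uses the identity $2(1 - \Phi(|z|)) = \text{erfc}(|z| / \sqrt{2})$ for the standard normal distribution.

```python
import math
from statistics import NormalDist


def two_prop_ci(x1, n1, x2, n2, conf=0.95):
    """Large-sample CI for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)      # about 1.96 for 95%
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


def two_prop_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2 using the pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)                     # pooled proportion
    se = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def large_enough(x, n):
    """Guideline from the notes: n > 25, more than 5 successes and 5 failures."""
    return n > 25 and x > 5 and (n - x) > 5


# Group 1: partial mastectomy (292 of 658 under 55).
# Group 2: modified radical mastectomy (397 of 1580 under 55).
assert large_enough(292, 658) and large_enough(397, 1580)
z = two_prop_z(292, 658, 397, 1580)
p_two_tailed = math.erfc(abs(z) / math.sqrt(2))       # 2 * (1 - Phi(|z|))
low, high = two_prop_ci(292, 658, 397, 1580)
print(f"z = {z:.2f}, two-tailed p = {p_two_tailed:.2g}")
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

If the interval excludes 0, the two-tailed test at the matching significance level rejects the hypothesis of equal proportions.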
Fisher’s Exact Test for Comparing Two Proportions (in JMP)
Enter these data as you would for setting up a 2 × 2 contingency table.
In JMP, select Analyze > Fit Y by X and place Surgery in the X box and Age in the Y box.
The following output from JMP is obtained.
The results of Fisher's Exact Test are always included in the JMP output whenever we are working with a
2 × 2 contingency table.
The three p-values given are for testing the following:
(1) Left, p-value = 1.000 is for testing if the proportion of women under age 55 is larger for the modified
radical mastectomy group than the partial mastectomy group. This is clearly not supported as the p-value
>> .05. Obviously we would not conclude this when only 25.1% of women in the modified radical mastectomy
group were under age 55 compared to 44.4% in the partial mastectomy group.
(2) Right, p-value < .0001 is for testing if the proportion of women under age 55 is larger for the partial
mastectomy group than the modified radical mastectomy group. The fact that this p-value is highly significant
suggests that the proportion of women under age 55 in the partial mastectomy group is indeed greater than the
proportion under 55 in the modified radical mastectomy group. This was the research hypothesis.
(3) 2-Tail, p-value < .0001 is for testing if the proportion of women under age 55 differs between the two
surgery groups. The fact that this p-value is highly significant suggests that the proportion of women under
age 55 is not the same for both surgery groups.
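For readers curious about what lies behind the one-sided p-values, the sketch below computes both tail probabilities exactly from the hypergeometric distribution, which is the distribution of a cell count in a 2 × 2 table when all margins are held fixed. It uses only the Python standard library, and the function name `fisher_tails` is made up for this illustration. It is not a reproduction of JMP's output, and which tail JMP labels "Left" or "Right" depends on how JMP orders the table levels, which is not assumed here.

```python
from math import comb


def fisher_tails(a, row1_total, col1_total, grand_total):
    """One-sided Fisher exact p-values for a 2 x 2 table.

    a is the observed count in the (row 1, column 1) cell. With all margins
    fixed, that count X is hypergeometric under H0. Returns
    (P(X <= a), P(X >= a)).
    """
    lo = max(0, row1_total + col1_total - grand_total)   # smallest possible X
    hi = min(row1_total, col1_total)                     # largest possible X
    denom = comb(grand_total, row1_total)

    def ways(k):
        # Number of tables with k column-1 subjects in row 1, given the margins.
        return comb(col1_total, k) * comb(grand_total - col1_total, row1_total - k)

    lower = sum(ways(k) for k in range(lo, a + 1)) / denom
    upper = sum(ways(k) for k in range(a, hi + 1)) / denom
    return lower, upper


# Row 1 = partial mastectomy (n = 658), column 1 = under 55.
# Counts from the example: 292 of 658 and 397 of 1580 were under 55.
p_lower, p_upper = fisher_tails(292, 658, 292 + 397, 658 + 1580)
print(f"P(X <= 292) = {p_lower:.4f}")
print(f"P(X >= 292) = {p_upper:.3g}")
```

Here the upper tail corresponds to the research hypothesis that the under-55 proportion is larger in the partial mastectomy group, and the lower tail to the opposite direction.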
Example 2: Low Birth Weight and Smoking
These data come from a study looking at the effects of smoking during pregnancy on
birth weight. Amongst the 381 non-smokers in the study, 13 had babies with low birth
weight, while amongst the 299 mothers who smoked during pregnancy, 28 had babies
with low birth weight. Is there evidence to suggest that the proportion of babies born
with low birth weight is greater for mothers who smoked during pregnancy?
Smoking Status    Normal Birth Weight    Low Birth Weight    Row Totals
Nonsmoker         368 (96.59%)           13 (3.41%)          381
Smoker            271 (90.64%)           28 (9.36%)          299
Column Totals     639                    41                  680
Hypothesis Test:
1)
2)
3)
4)
5)
Construct and interpret a 95% CI for ($p_{\text{smoker}} - p_{\text{nonsmoker}}$)
Fisher’s Exact Test Results from JMP
Conclusion: