CH9: Testing the Difference Between Two
Means, Two Proportions, and Two
Variances
CH9: Testing the Difference Between Two Means or Two Proportions
Santorico - Page 343
Section 9-1 Testing the Difference Between Two
Means: Using the Z Test
Suppose we are interested in determining if a certain
medication relieves patients’ headaches.
We give the drug/treatment to one group and give a placebo to
a control group and compare the mean incidences of patient
relief from the headache between the two groups.
If the treatment group had a statistically significant
improvement in headache symptoms over the control group,
then we can conclude the drug works.
So our question might be, “Is the mean incidence of headache
relief different for the two groups?”
Let $\mu_1$ = mean headache relief from the treatment group and
$\mu_2$ = mean headache relief from the control group.

Then our hypotheses would be:
H0: $\mu_1 = \mu_2$
H1: $\mu_1 \ne \mu_2$
Alternatively, we could state the hypotheses as:
H0: $\mu_1 - \mu_2 = 0$
H1: $\mu_1 - \mu_2 \ne 0$
Assumptions for the Test to Determine the Difference
Between Two Means
- The samples must be independent of each other. That is,
there can be no relationship between the subjects in each
sample.
- The populations from which the samples come must be
(approximately) normally distributed or the sample sizes
of both groups should be at least 30.
- The standard deviations of both populations must be
known.
We can compare the groups by the difference in their
population means, $\mu_1 - \mu_2$, where $\mu_1$ is the population mean for
group 1 and $\mu_2$ is the population mean for group 2.
We estimate $\mu_1 - \mu_2$ with $\bar{x}_1 - \bar{x}_2$.
The standard deviation of $\bar{x}_1 - \bar{x}_2$ is
$$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
When both populations are normally distributed or the
sample size for each group is at least 30, then $\bar{x}_1 - \bar{x}_2$ has a
normal distribution.
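Formula for the z test for Comparing Two Means from
Independent Populations
H0: $\mu_1 - \mu_2 = k$ (or $\ge k$ or $\le k$)
Note: We often use k = 0, but it doesn’t have to be.

Test value:
$$z^* = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(\bar{x}_1 - \bar{x}_2) - k}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
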
The observed difference between the sample means may be
due to chance, in which case the null hypothesis will not be
rejected.
If the difference is statistically significant, the null hypothesis is
rejected and the researcher can conclude the population means
are different.
The same approach to finding critical values and P-values that
was used in Section 8-2 will be used here (Table E or Table F
with d.f. = ∞).
Example: Dr. Cribari would like to determine if there is a
statistically significant difference between her two Math 2830
classes. To make this comparison she will compare the results
from exam 1. Class one had 35 students take the exam with a
mean of 82.6 and a population standard deviation of 1.41.
Class two had 32 students take the exam with a mean of 84 and
a population standard deviation of 3.63. Can Dr. Cribari
conclude that there is a difference in the mean test grades
between the two classes at α = 0.05?
Step 1 State the hypotheses and identify the claim.
H0: $\mu_1 = \mu_2$
H1: $\mu_1 \ne \mu_2$ (CLAIM)
Step 2 Find the critical value(s) from the appropriate table.
As stated, the problem is giving the population standard
deviations. This means that we will be doing a z-test.

Two-sided test
α = 0.05
critical values = ±1.96
Step 3 Compute the test value and determine the P-value.
$$z^* = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(82.6 - 84) - 0}{\sqrt{\dfrac{1.41^2}{35} + \dfrac{3.63^2}{32}}} = -2.05$$
p-value = 2 × 0.0202 = 0.0404
Step 4 Make the decision to reject or not reject the null
hypothesis.
Since the p-value is smaller than our α, the null hypothesis
is rejected.
[Or, since our test value, -2.05, falls within the rejection
region, the null hypothesis is rejected.]
Step 5 Summarize the results.
That is, there is evidence to support the claim that the exam
1 grades differ between the two sections of MATH2830.
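The arithmetic in Steps 3 and 4 can be checked with a short Python sketch. It assumes only the standard library; the function name `two_sample_z` is illustrative and does not come from the notes. Because it keeps the unrounded z, the p-value comes out near 0.041 rather than the table-based 0.0404.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2, k=0.0):
    """Return (z, two-sided p-value) for H0: mu1 - mu2 = k with known sigmas."""
    # Standard deviation of the difference in sample means.
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = ((mean1 - mean2) - k) / se
    # Two-sided p-value: twice the area in the tail beyond |z|.
    p_two_sided = 2 * NormalDist().cdf(-abs(z))
    return z, p_two_sided


# Exam 1 example: class one (n = 35) versus class two (n = 32).
z, p = two_sample_z(82.6, 84, 1.41, 3.63, 35, 32)
print(f"z = {z:.2f}, p-value = {p:.4f}")
```

Since the p-value is below α = 0.05, the sketch agrees with the decision to reject the null hypothesis.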
Example: A survey found that the average hotel room in New
Orleans is $88.42 and the average room rate in Phoenix is
$80.61. Assume that the data were obtained from two samples
of 50 hotels each and that the (population) standard deviations
were $5.62 and $4.83, respectively. At α = 0.01, can it be
concluded that the average hotel room in New Orleans costs
more than in Phoenix?
Step 1 State the hypotheses and identify the claim.
Step 2 Find the critical value(s) from the appropriate table.
Step 3 Compute the test value and determine the P-value.
Step 4 Make the decision to reject or not reject the null
hypothesis.
Step 5 Summarize the results.
Formula for the z Confidence Interval for Difference
Between Two Means
Assumptions:
1. The data for each group are independent random samples.
2. The data are from normally distributed populations and/or
the sample sizes of the groups are greater than 30.
3. The population standard deviation is (assumed) known.
Formula:
$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
Note: When $n_1$ and $n_2$ are at least 30, then $s_1$ and $s_2$ can be used
in place of $\sigma_1$ and $\sigma_2$.
Example: Two brands of cigarettes are selected, and their
nicotine content is compared. The data are shown below.
Find the 95% confidence interval of the true difference in
the means.
Brand A: $\bar{x}_1 = 28.6$ mg, $\sigma_1 = 5.1$ mg, $n_1 = 30$
Brand B: $\bar{x}_2 = 32.9$ mg, $\sigma_2 = 4.4$ mg, $n_2 = 40$

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = (28.6 - 32.9) \pm 1.96\sqrt{\frac{5.1^2}{30} + \frac{4.4^2}{40}}$$
$$= -4.3 \pm 2.278158$$
$$= (-6.58, -2.02)$$
At α = 0.05, is there convincing evidence that the mean
amount of nicotine differs between the brands?
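The 95% interval for the cigarette example can be reproduced with the following sketch. It is a minimal illustration that assumes Python's standard library; `z_interval` is a made-up helper name, and the critical value is taken from `NormalDist().inv_cdf` rather than from Table E, so it uses 1.95996... instead of the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean1, mean2, sigma1, sigma2, n1, n2, confidence=0.95):
    """Confidence interval for mu1 - mu2 when both sigmas are known."""
    # z_{alpha/2}: the point with area alpha/2 to its right.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = mean1 - mean2
    return diff - margin, diff + margin


# Brand A versus Brand B nicotine content (mg).
low, high = z_interval(28.6, 32.9, 5.1, 4.4, 30, 40)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

Changing `confidence` to 0.98 gives the kind of interval asked for in the hotel example.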

Example: For the hotel example, construct a 98%
confidence interval of the true difference in the
means.
Section 9-2: Testing the Difference Between Two
Means of Independent Samples: Using the t Test
Many times the conditions set forth by the z test in Section 9-1
cannot be met (e.g., the population standard deviations are
not known).
In these cases, a t test is used to test the difference between
means when the two samples are independent and when the
samples are taken from two normally or approximately
normally distributed populations.
Formula for the t Test for Testing the Difference Between
Two Means: Independent Samples.
Variances are assumed to be unequal:
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
where the degrees of freedom are equal to the smaller of $n_1 - 1$ or
$n_2 - 1$.

We will use Table F to find our critical values and our p-values.
WARNING: Your calculator will perform a
2-sample t-test (it's #4 under STAT then
TESTS). However, it uses a more complicated
formula to determine the degrees of
freedom, which affects the confidence
intervals and p-values the calculator
reports. We will come back
to this point at the end of the section.
Example: A real estate agent wishes to determine whether tax
assessors and real estate appraisers agree on the values of
homes. A random sample of the two groups appraised 10
homes. Is there a significant difference in the values of the
homes for each group? Let α = 0.05. Assume the data are from
normally distributed populations.
Real Estate Appraisers: $\bar{X}_1 = \$83{,}256$, $s_1 = \$3256$, $n_1 = 10$
Tax Assessors: $\bar{X}_2 = \$88{,}354$, $s_2 = \$2341$, $n_2 = 10$
Sample standard deviations given! Use a t-test.

Step 1 State the hypotheses and identify the claim.
H0: $\mu_1 = \mu_2$ H1: $\mu_1 \ne \mu_2$
Step 2 Find the critical value(s) from the appropriate table.
- T-test means use the t-table (Table F).
- We have 9 degrees of freedom since $n_1 = 10$ and $n_2 = 10$.
The smallest of $n_1 - 1$ and $n_2 - 1$ is 9.
- Information we need: two-tailed test, α = 0.05, df = 9
- T critical value is ±2.262
Step 3 Compute the test value and determine the P-value.
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(83{,}256 - 88{,}354) - 0}{\sqrt{\dfrac{3256^2}{10} + \dfrac{2341^2}{10}}} = \frac{-5098}{1268.141} = -4.02$$
[Figure: t distribution curve, P(t) against t from -6 to 6, two-tailed, with the critical regions marked and the test value t* = -4.02.]
p-value = 2 P(t < -4.02) = 2(0.0015) = 0.003
Step 4 Make the decision to reject or not reject the null
hypothesis.
The null hypothesis is rejected. This decision can be based
on:
- the fact that the test value (-4.02) is within the critical
region since it is less than -2.262 or
- the fact that the p-value (0.003) is smaller than α = 0.05
Step 5 Summarize the results.
There is significant evidence that tax assessors and real
estate appraisers disagree on the values of homes.
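The test value and the two competing choices of degrees of freedom can be compared with a small sketch. It assumes that the "complicated formula" the calculator warning refers to is the Welch-Satterthwaite approximation; the notes do not name it, so treat that as an assumption. The function name `unequal_variance_t` is illustrative.

```python
from math import sqrt


def unequal_variance_t(mean1, mean2, s1, s2, n1, n2):
    """Return (t, conservative d.f., Welch-Satterthwaite d.f.) for H0: mu1 = mu2."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (mean1 - mean2) / sqrt(v1 + v2)
    # Rule used in these notes: the smaller of n1 - 1 and n2 - 1.
    df_conservative = min(n1, n2) - 1
    # Assumed calculator rule (Welch-Satterthwaite); usually not an integer.
    df_welch = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df_conservative, df_welch


# Real estate appraisers versus tax assessors.
t, df_small, df_welch = unequal_variance_t(83256, 88354, 3256, 2341, 10, 10)
print(f"t = {t:.2f}, d.f. (notes) = {df_small}, d.f. (Welch) = {df_welch:.2f}")
```

The test value is the same under both rules; only the degrees of freedom, and therefore the critical values and p-values, differ.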
Example: A researcher suggests that male nurses earn more
than female nurses. A survey of 16 male nurses and 20 female
nurses reports these data. Is there enough evidence to support
the claim that male nurses earn more than female nurses? Use
α = 0.01. Assume the data are from normally distributed
populations.
Females: $\bar{X}_1 = \$23{,}750$, $s_1 = \$250$, $n_1 = 20$
Males: $\bar{X}_2 = \$23{,}900$, $s_2 = \$300$, $n_2 = 16$

Step 1 State the hypotheses and identify the claim.


Step 2 Find the critical value(s) from the appropriate table.
Step 3 Compute the test value and determine the P-value.
Step 4 Make the decision to reject or not reject the null
hypothesis.
Step 5 Summarize the results.
Confidence Intervals for the Difference of Two Means:
Small Independent Samples
Variances assumed to be unequal:
$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
where d.f. = smaller value of $n_1 - 1$ or $n_2 - 1$.
WARNING: The way our calculator determines the
degrees of freedom is not the same as the book. So
you will NOT be able to use your calculator
STAT/TESTS function to calculate your confidence
interval because you will get a VERY different
confidence interval. This is because the t multiplier
from the book's method is noticeably different from
the one the calculator finds.
Example: Let’s find the 95% Confidence Interval for the first
problem.
$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (83256 - 88354) \pm 2.262\sqrt{\frac{3256^2}{10} + \frac{2341^2}{10}}$$
$$= -5098 \pm 2868.535$$
$$= (-7967, -2229)$$
Example: Let’s find the 99% Confidence Interval for the second
problem.
Section 9-3: Testing the Difference
Between Two Means: Dependent Samples
So far we have only compared two means
when the samples were independent.
Samples are considered to be dependent when
the subjects are paired or matched in some
way.
Examples of paired data:
- Each person is measured twice where the 2
measurements measure the same thing but under
different conditions
- Similar individuals are paired prior to an experiment and
each member of a pair receives a different treatment
- Two different variables are measured for each individual
and there is interest in the amount of difference between
the 2 variables
When using paired data, you are interested primarily in
the “difference” and not the data itself.
When samples are dependent, a special t test for dependent
means is used. The test uses the difference in the values of the
matched pairs.
IMPORTANT: We cannot use the t test we had learned for a
difference in independent means.
To determine whether one set of observations tends to be larger
than or different from the paired observations, we take the
difference between the matched observations and perform the
analysis on the differences.
A classic example would be studies of weight loss:
weight before versus weight after.
We are interested in the CHANGE!
An aside: this study also used a placebo group. Why?
Hypotheses:
Right-tailed: H0: $\mu_D = 0$ H1: $\mu_D > 0$
Left-tailed: H0: $\mu_D = 0$ H1: $\mu_D < 0$
Two-tailed: H0: $\mu_D = 0$ H1: $\mu_D \ne 0$
$\mu_D$ = population mean of differences = $\mu_1 - \mu_2$
Here, $\mu_1$ is the mean of the population of the first set of
measurements and $\mu_2$ is the mean of the population of the
second set of measurements.
Actually, you can also use $\mu_D = \mu_2 - \mu_1$ as long as you are
consistent with your statement of hypotheses and calculation
of $\bar{D}$.
Formulas for the t Test for Dependent Samples
$$t = \frac{\bar{D} - \mu_D}{s_D/\sqrt{n}}$$
with d.f. = $n - 1$, where
$$\bar{D} = \frac{\sum D}{n}$$
is the mean of the sample of differences and
$$s_D = \sqrt{\frac{n\sum D^2 - \left(\sum D\right)^2}{n(n-1)}}$$
is the sample standard deviation of the sample of differences.
The good news is we can find the mean of the differences, $\bar{D}$,
and the standard deviation of the differences, $s_D$, using the LIST
and STAT functions in your TI-83/84.
1. Go to STAT -> EDIT -> Edit.
2. Enter the first set of observations under L1.
3. Enter the second set of observations under L2.
4. Highlight L3 in list, type L1 – L2 and hit enter. The set of
differences should now be calculated.
5. Go to STAT -> CALC -> 1-Var Stats, hit enter. Type L3
(after 1-Var Stats on your screen) and hit enter. Your
calculator will calculate the sample mean and standard
deviation for you.
Example: A physical education director claims that by taking a
special vitamin, a weight lifter can increase his strength. Eight
athletes are selected and given a test of strength, using the
standard bench press. After 2 weeks of regular training,
supplemented with the vitamin, they are tested again. Test the
effectiveness of the vitamin regimen at α = 0.05. Each value in
these data represents the maximum number of pounds the
athlete can bench press. Assume the variable is approximately
normally distributed.
Step 1 State the hypotheses and identify the claim.
I will base my differences on:
D = strength after – strength before.
H0: $\mu_D = 0$ H1: $\mu_D > 0$
Step 2 Find the critical value(s) from the appropriate table.
We have 8 lifters which gives 7 degrees of freedom.
Our α = 0.05.
We have a right-tailed test ⇒ critical value will be positive.
t critical value = 1.895
(see next page)
Step 3 Compute the test value and determine the P-value.
Use your calculator to get the mean difference and standard
deviation of the differences.
$$t = \frac{\bar{D} - \mu_D}{s_D/\sqrt{n}} = \frac{2.375 - 0}{4.838/\sqrt{8}} = 1.388$$
p-value = P(t > 1.388) = 0.104
Found using tcdf(1.388,E99,7) on the TI calculator
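For readers without a TI calculator, the same right-tail probability can be approximated in Python. This is only a sketch under stated assumptions: it integrates the t density numerically with Simpson's rule instead of calling a library t distribution, and the helper names `t_pdf` and `t_upper_tail` are made up for illustration.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t].

    steps must be even. By symmetry, P(T > t) = 0.5 - P(0 < T < t).
    """
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Summary statistics from the weight-lifter example.
d_bar, s_d, n = 2.375, 4.838, 8
t_value = (d_bar - 0) / (s_d / sqrt(n))
p_value = t_upper_tail(t_value, n - 1)
print(f"t = {t_value:.3f}, p-value = {p_value:.3f}")
```

For a two-tailed test the returned tail area would be doubled.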
Step 4 Make the decision to reject or not reject the null
hypothesis.
The null hypothesis is NOT rejected. We can base this
decision on either of the two facts:
- The p-value is larger than α = 0.05
- The test value (1.388) is smaller than the critical value
(1.895). That is, our test value is within the non-rejection
region.
[Figure: right-tailed t distribution, P(t) against t from -3 to 3, with the test value 1.388 inside the non-rejection region and the rejection region in the right tail.]
Step 5 Summarize the results.
There is not sufficient evidence to support the physical education
director's claim that by taking a special vitamin, a weight lifter
can increase his strength.
Example: A sample of 10 college students in a class were
asked how many hours per week they watch TV and how many
hours a week they used a computer. Is there a difference in the
mean number of hours a college student spends on a computer
versus watching TV at α = 0.01? Assume the population of
differences is approximately normally distributed.
The data (hours per week):
Student   Comp   TV
1         30     2
2         20     1.5
3         10     14
4         10     2
5         10     6
6         0      20
7         35     14
8         20     1
9         2      14
10        5      10
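The calculator steps for paired data (enter two lists, subtract, run 1-Var Stats) have a direct analogue in Python. The sketch below assumes the differences are taken as computer hours minus TV hours; the notes leave that choice to the reader, and the opposite order only flips the sign of the mean. It also confirms that the shortcut formula for $s_D$ agrees with the library's sample standard deviation.

```python
from math import sqrt
from statistics import mean, stdev

# Hours per week for students 1..10, as listed in the data table.
comp = [30, 20, 10, 10, 10, 0, 35, 20, 2, 5]
tv = [2, 1.5, 14, 2, 6, 20, 14, 1, 14, 10]

# Assumed direction of the difference: D = Comp - TV.
diffs = [c - t for c, t in zip(comp, tv)]
n = len(diffs)

d_bar = mean(diffs)   # mean of the differences
s_d = stdev(diffs)    # sample standard deviation (divides by n - 1)

# Shortcut formula from the notes: sqrt((n*sum(D^2) - (sum D)^2) / (n(n-1))).
s_d_shortcut = sqrt((n * sum(d * d for d in diffs) - sum(diffs) ** 2) / (n * (n - 1)))

print(f"D-bar = {d_bar:.2f}, s_D = {s_d:.3f}, shortcut s_D = {s_d_shortcut:.3f}")
```

These two summary values are all that the test value and the confidence interval for the mean difference require.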
Step 1 State the hypotheses and identify the claim.
Step 2 Find the critical value(s) from the appropriate table.
Step 3 Compute the test value and determine the P-value.
Step 4 Make the decision to reject or not reject the null
hypothesis.
Step 5 Summarize the results.
Confidence Interval for the Mean Difference
$$\bar{D} \pm t_{\alpha/2}\frac{s_D}{\sqrt{n}}$$
where d.f. = $n - 1$
Let’s find the 99% confidence interval for the mean difference
of the last example of TV watching vs. Computer Usage.
Section 9-4: Testing the Difference Between
Proportions
Let p1 be the proportion of a population having some
characteristic of interest.
Similarly, let p2 be the proportion of a different population
having that characteristic.
We estimate these parameters by taking samples from each
population and using the sample proportions as estimates.
Let x1 be the number of observations in sample 1 having the
characteristic of interest and x2 be the number of observations
in sample 2 having that characteristic.
The sample proportion for the first sample is $\hat{p}_1 = x_1/n_1$ and the
sample proportion for the second sample is $\hat{p}_2 = x_2/n_2$.

We will learn how to perform a hypothesis test for the
difference in population proportions.

Hypotheses:
Right-tailed test:
H0: p1 = p2, H1: p1 > p2
or
H0: p1 - p2 = k, H1: p1 - p2 > k
Left-tailed test:
H0: p1 = p2, H1: p1 < p2
or
H0: p1 - p2 = k, H1: p1 - p2 < k
Two-tailed test:
H0: p1 = p2, H1: p1 ≠ p2
or
H0: p1 - p2 = k, H1: p1 - p2 ≠ k
Formula for the z Test for Comparing Two Proportions
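H0: $p_1 - p_2 = 0$
(or more generally $p_1 - p_2 = k$)
Test value:
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\,\bar{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\bar{p}\,\bar{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
where
$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$$
and $\bar{q} = 1 - \bar{p}$. The second form of the test value applies when the null hypothesis is $p_1 - p_2 = 0$.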
What is $\bar{p}$?
We are assuming in the null hypothesis that $p_1 = p_2 = p$,
where p is the value of the common proportion.
Under this assumption, we should combine the
information from both samples to estimate the
common population proportion p.
$\bar{p}$ is an estimate of p combining the information from
both samples.
P-values:
(computed as before, depending on the alternative hypothesis)
Right-tailed test: $P(Z > z^*)$
Left-tailed test: $P(Z < z^*)$
Two-tailed test: $2P(Z > |z^*|)$

Since we are performing a z test we will use Table E for p-values and
Table F (d.f. = ∞) for critical values.
Assumptions:
1. The samples are independent random samples.
2. All counts must be at least 5:
a. $n_1\hat{p}_1 = x_1$
b. $n_1\hat{q}_1 = n_1 - x_1$
c. $n_2\hat{p}_2 = x_2$
d. $n_2\hat{q}_2 = n_2 - x_2$
Example: It is believed that a sweetener called xylitol helps prevent
ear infections. In a randomized experiment n1  165 children took a
placebo and 68 of them got ear infections. Another sample of
n2  159 children took xylitol and 46 of them got ear infections. We
believe that the proportion of ear infections in the placebo group will
be greater than in the xylitol group. Test this hypothesis at α = 0.025.
Step 1 State the hypotheses and identify the claim.
H0: p1 = p2 H1: p1 > p2 (CLAIM)
Step 2 Find the critical value(s) from the appropriate table.
Test of two proportions, right-tailed ⇒ positive z value
Based on α = 0.025, we get z = 1.96.
(I pulled my critical value from the bottom of Table F)
Step 3 Compute the test value and determine the P-value.
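Let’s find $\bar{p}$ first:
$$\bar{p} = \frac{68 + 46}{165 + 159} = 0.352$$
Using this:
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\,\bar{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{\dfrac{68}{165} - \dfrac{46}{159}}{\sqrt{0.352(1 - 0.352)\left(\dfrac{1}{165} + \dfrac{1}{159}\right)}} = \frac{0.122813}{0.0530751} = 2.31$$
p-value = P(Z > 2.31) = 0.0104
(I found my p-value using the Z table - Table E).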
Step 4 Make the decision to reject or not reject the null
hypothesis.
Given that our p-value = 0.0104, which is smaller than our
α = 0.025, the null hypothesis is rejected.
We could also reach this conclusion by noting that our test
value = 2.31 is greater than our critical value = 1.96.
Step 5 Summarize the results.
There is significant evidence to support the claim that
xylitol helps prevent ear infections. Specifically, children who took
xylitol had a lower proportion of ear infections than children who
were given a placebo.
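The xylitol calculation, including the pooled proportion and the count check from the assumptions, can be sketched in Python as follows. This is a minimal illustration using only the standard library; `two_proportion_z` is an illustrative name. It keeps full precision for the pooled proportion rather than rounding to 0.352, so the last digits may differ slightly from the hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Return (z, right-tailed p-value) for H0: p1 = p2 using the pooled estimate."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Pooled proportion: combine both samples under H0.
    p_bar = (x1 + x2) / (n1 + n2)
    q_bar = 1 - p_bar
    se = sqrt(p_bar * q_bar * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_right = 1 - NormalDist().cdf(z)
    return z, p_right


# Placebo: 68 of 165 infected; xylitol: 46 of 159 infected.
x1, n1, x2, n2 = 68, 165, 46, 159

# Assumption check: every success and failure count is at least 5.
counts_ok = all(c >= 5 for c in (x1, n1 - x1, x2, n2 - x2))

z, p = two_proportion_z(x1, n1, x2, n2)
print(f"counts ok: {counts_ok}, z = {z:.2f}, p-value = {p:.4f}")
```

For a two-tailed test, such as the surgeon and general practitioner example, the p-value would instead be twice the tail area beyond |z|.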
Example: In a sample of 200 surgeons, 15% thought the
government should control health care. In a sample of 200
general practitioners, 21% felt the same way. At α = 0.01, is
there a difference in the proportions?
Step 1 State the hypotheses and identify the claim.
Step 2 Find the critical value(s) from the appropriate table.
Step 3 Compute the test value and determine the P-value.
Step 4 Make the decision to reject or not reject the null
hypothesis.
Step 5 Summarize the results.
Confidence Interval for the Difference Between Two
Proportions
$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$
Example: Let’s find the 95% confidence interval for the xylitol
problem.
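First, note that
$\hat{p}_1 = \frac{68}{165} = 0.412$, $\hat{p}_2 = \frac{46}{159} = 0.289$,
and $z_{\alpha/2} = 1.96$.
$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = (0.412 - 0.289) \pm 1.96\sqrt{\frac{0.412(1 - 0.412)}{165} + \frac{0.289(1 - 0.289)}{159}}$$
$$= 0.123 \pm 0.103$$
$$= (0.020, 0.226)$$
We can say with 95% confidence that the proportion of ear infections
among children receiving xylitol is between 0.020 and 0.226 lower
(that is, 2.0 to 22.6 percentage points lower) than among those
receiving placebo.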
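The interval can be double-checked with a short sketch. It assumes Python's standard library and an illustrative helper name, `two_proportion_interval`. Unlike the test statistic, the interval uses the unpooled standard error, because no common proportion is assumed here.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_interval(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - margin, diff + margin


# Placebo: 68 of 165; xylitol: 46 of 159.
low, high = two_proportion_interval(68, 165, 46, 159)
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

Multiplying the endpoints by 100 gives the difference in percentage points.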
Example: Let’s find the 90% confidence interval for the health
care problem.
Decision Tree for Deciding Which Hypothesis Test to Use: