Comparing Two Population Means
The Two-Sample T-Test and T-Interval
Example
Do male and female college students differ with respect to
their fastest reported driving speed?
Population of all male
college students
Sample of n1 = 17 males
report average of 102.1 mph
Population of all female
college students
Sample of n2 = 21 females
report average of 85.7 mph
Comparative Observational
Study
• A research study in which two or more
groups are compared with respect to some
measurement or response.
• The groups, determined by their natural
characteristics, are merely “observed.”
Graphical summary of sample data
[Figure: Fastest Driving Speed (mph) by Gender (male, female); axis from 75 to 145 mph]
Numerical summary of sample data

Gender     N    Mean  Median  TrMean  StDev  SE Mean
female    21   85.71   85.00   85.26   9.39     2.05
male      17  102.06  100.00  101.00  17.05     4.14

Gender   Minimum  Maximum     Q1      Q3
female     75.00   105.00  77.50   92.50
male       75.00   145.00  90.00  115.00
The difference in the sample means is
102.06 - 85.71 = 16.35 mph
The Question
in Statistical Notation
Let $\mu_M$ = the average fastest speed of all male students,
and $\mu_F$ = the average fastest speed of all female students.
Then we want to know whether $\mu_M \neq \mu_F$.
This is equivalent to knowing whether $\mu_M - \mu_F \neq 0$.
All possible questions
in statistical notation
In general, we can always compare two averages by seeing
how their difference compares to 0:
This comparison…       is equivalent to…
$\mu_1 \neq \mu_2$     $\mu_1 - \mu_2 \neq 0$
$\mu_1 > \mu_2$        $\mu_1 - \mu_2 > 0$
$\mu_1 < \mu_2$        $\mu_1 - \mu_2 < 0$
Set up hypotheses
• Null hypothesis:
– H0: $\mu_M = \mu_F$ [equivalent to $\mu_M - \mu_F = 0$]
• Alternative hypothesis:
– Ha: $\mu_M \neq \mu_F$ [equivalent to $\mu_M - \mu_F \neq 0$]
Make initial assumption
• Assume null hypothesis is true.
• That is, assume $\mu_M = \mu_F$.
• Or, equivalently, assume $\mu_M - \mu_F = 0$.
Determine the P-value
• P-value = “How likely is it that our sample
means would differ by at least 16.35 mph
if the difference in population means
really is 0?”
• The P-value, 0.001, is small. Our sample
result is not likely if the null hypothesis is
true.
• Reject the null hypothesis.
Make a decision
• There is sufficient evidence, at the 0.05
level of significance, to conclude that the
average reported fastest driving speed of all
male college students differs from the
average reported fastest driving speed of all
female students.
How the P-value is calculated
The P-value is determined by standardizing, that
is, by calculating the two-sample test statistic...
$$t = \frac{\text{difference in sample means} - \text{hypothesized difference}}{\text{standard error of the difference}}$$
…and comparing the value of the test statistic to
the appropriate sampling distribution.
The sampling distribution depends on how you estimate
the standard error of the difference.
If variances of the measurements
of the two groups are not equal...
Estimate the standard error of the difference as:
$$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
Then the sampling distribution is an approximate t
distribution with a complicated formula for d.f.
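As a rough check of the unequal-variance case, the sketch below computes the standard error, the t statistic, and the approximate degrees of freedom from the summary statistics of the driving-speed example. The d.f. formula used is the Welch-Satterthwaite approximation, which is an assumption here: the slides only say the formula is complicated. The function name `welch_t` is made up for this illustration.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2, hypothesized_diff=0.0):
    """Unpooled two-sample t statistic from summary statistics.

    Returns (standard error, t statistic, approximate d.f.).
    The d.f. uses the Welch-Satterthwaite approximation (an assumption;
    the slides do not give the formula).
    """
    v1 = sd1 ** 2 / n1  # estimated variance of the first sample mean
    v2 = sd2 ** 2 / n2  # estimated variance of the second sample mean
    se = math.sqrt(v1 + v2)
    t = ((mean1 - mean2) - hypothesized_diff) / se
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, t, df

# Driving-speed example: female minus male, using the reported summaries.
se, t, df = welch_t(85.71, 9.39, 21, 102.06, 17.05, 17)
print(f"SE = {se:.2f}, t = {t:.2f}, d.f. = {df:.1f}")
```

The t statistic comes out near -3.54, and the approximate d.f. is a little under 24. Software commonly rounds this d.f. down to a whole number.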
If variances of the measurements
of the two groups are equal...
Estimate the standard error of the difference using the
common pooled variance:
$$\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
where
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
Then the sampling distribution is a t distribution
with n1+n2-2 degrees of freedom.
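The equal-variance calculation can be sketched the same way from summary statistics. This is a minimal illustration using the driving-speed numbers reported in this transcript. The function name `pooled_t` is hypothetical, and because the inputs are rounded the results only approximate what software computes from the raw data.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2, hypothesized_diff=0.0):
    """Pooled two-sample t statistic from summary statistics.

    Returns (pooled standard deviation, standard error, t statistic, d.f.).
    """
    df = n1 + n2 - 2
    # Pooled variance: a weighted average of the two sample variances.
    sp2 = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t = ((mean1 - mean2) - hypothesized_diff) / se
    return math.sqrt(sp2), se, t, df

# Driving-speed example: female minus male.
sp, se, t, df = pooled_t(85.71, 9.39, 21, 102.06, 17.05, 17)
print(f"pooled SD = {sp:.2f}, SE = {se:.2f}, t = {t:.2f}, d.f. = {df}")
```

With these inputs the t statistic is close to -3.75 on 36 degrees of freedom, and the pooled standard deviation is roughly 13.35.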
Assume variances are equal only if neither sample standard deviation
is more than twice that of the other sample standard deviation.
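The rule of thumb above is easy to express as a check. The sketch below applies it to the sample standard deviations reported for the three examples in this transcript. The function name and the `max_ratio` parameter are illustrative choices, not part of the slides.

```python
def variances_look_equal(sd1, sd2, max_ratio=2.0):
    """Rule of thumb: treat variances as equal only if neither sample
    standard deviation is more than max_ratio times the other."""
    larger, smaller = max(sd1, sd2), min(sd1, sd2)
    return larger <= max_ratio * smaller

# Sample standard deviations reported in this transcript's three examples.
examples = {
    "driving speed": (9.39, 17.05),
    "laundry": (1.81, 3.88),
    "Turkey guesses": (8.50, 54.8),
}
for name, (sd_a, sd_b) in examples.items():
    choice = "pooled" if variances_look_equal(sd_a, sd_b) else "unpooled"
    ratio = max(sd_a, sd_b) / min(sd_a, sd_b)
    print(f"{name}: ratio = {ratio:.2f} -> {choice}")
```

Only the driving-speed data pass the check, so under this rule the pooled test is an option there, while the other two examples call for the unpooled test.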
Two-sample t-test in Minitab
• Select Stat. Select Basic Statistics.
• Select 2-sample t to get a Pop-Up window.
• Click on the radio button before Samples in
one Column. Put the measurement variable in
Samples box, and put the grouping variable in
Subscripts box.
• Specify your alternative hypothesis.
• If appropriate, select Assume Equal Variances.
• Select OK.
Pooled two-sample t-test

Two sample T for Fastest

Gender    N    Mean  StDev  SE Mean
female   21   85.71   9.39      2.0
male     17   102.1   17.1      4.1

95% CI for mu (female) - mu (male): (-25.2, -7.5)
T-Test mu (female) = mu (male) (vs not =): T = -3.75  P = 0.0006  DF = 36
Both use Pooled StDev = 13.4
(Unpooled) two-sample t-test

Two sample T for Fastest

Gender    N    Mean  StDev  SE Mean
female   21   85.71   9.39      2.0
male     17   102.1   17.1      4.1

95% CI for mu (female) - mu (male): (-25.9, -6.8)
T-Test mu (female) = mu (male) (vs not =): T = -3.54  P = 0.0017  DF = 23
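To see where P-values like these come from, the sketch below approximates a two-sided P-value by numerically integrating the t density. It uses only the standard library. Simpson's rule is a choice made for this illustration, not a description of how Minitab computes the value, and the function names are hypothetical.

```python
import math

def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided P-value: the area in both tails beyond |t|.

    Integrates the density from 0 to |t| with Simpson's rule
    (steps must be even) and subtracts twice that area from 1.
    """
    upper = abs(t)
    h = upper / steps
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        # Simpson weights: 4 at odd interior points, 2 at even ones.
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_area = total * h / 3
    return 1.0 - 2.0 * central_area

print(f"pooled:   P = {two_sided_p(-3.75, 36):.4f}")
print(f"unpooled: P = {two_sided_p(-3.54, 23):.4f}")
```

The results should land close to the reported P = 0.0006 and P = 0.0017. Small differences are expected because the T values entered here are already rounded.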
Assumptions for correct P-values
• Data in each group follow a normal
distribution.
• If the pooled t-test is used, the variances of the
two groups are equal.
• The samples are independent. That is, who
is in the second sample doesn’t depend on
who is in the first sample (and vice versa).
Confidence interval for
difference in two means
We can be “such-and-such” confident that the
difference in the population means falls in the interval...
difference in sample means ± (t* × standard error)
where the t* multiplier depends on the confidence
level and is obtained from the appropriate t
distribution.
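A short sketch of the interval formula, applied to the driving-speed summaries reported in this transcript. The t* multipliers (2.028 for 36 d.f. and 2.069 for 23 d.f., both for 95% confidence) are taken from a standard t table and are not stated in the slides. The function name `mean_diff_ci` is illustrative.

```python
import math

def mean_diff_ci(diff, se, t_star):
    """Confidence interval: difference in sample means ± t* × standard error."""
    margin = t_star * se
    return diff - margin, diff + margin

diff = 85.71 - 102.06  # female minus male, driving-speed example

# Pooled: standard error from the pooled variance, t* for 36 d.f.
sp2 = (20 * 9.39 ** 2 + 16 * 17.05 ** 2) / 36
se_pooled = math.sqrt(sp2 * (1 / 21 + 1 / 17))
pooled_ci = mean_diff_ci(diff, se_pooled, t_star=2.028)  # table value (assumed)

# Unpooled: standard error from separate variances, t* for 23 d.f.
se_unpooled = math.sqrt(9.39 ** 2 / 21 + 17.05 ** 2 / 17)
unpooled_ci = mean_diff_ci(diff, se_unpooled, t_star=2.069)  # table value (assumed)

print("pooled:   (%.1f, %.1f)" % pooled_ci)
print("unpooled: (%.1f, %.1f)" % unpooled_ci)
```

Both intervals should agree, to one decimal place, with the 95% intervals in the pooled and unpooled Minitab output for this example.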
Interpreting a confidence interval
for the difference in two means…
If the confidence interval contains…   then we conclude…
zero                                    the two means may not differ
only positive numbers                   first mean is larger than second mean
only negative numbers                   first mean is smaller than second mean
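The interpretation rule can be written as a small helper. The sketch below applies it to two intervals reported in this transcript, the pooled driving-speed interval and the laundry interval. The function name is hypothetical.

```python
def interpret_interval(lower, upper):
    """Conclusion about (first mean - second mean) from its confidence interval."""
    if lower > 0:
        return "first mean is larger than second mean"
    if upper < 0:
        return "first mean is smaller than second mean"
    # Otherwise the interval contains zero.
    return "the two means may not differ"

print(interpret_interval(-25.2, -7.5))  # driving speed: female minus male
print(interpret_interval(-2.11, 0.47))  # laundry: M minus F
```

The first interval contains only negative numbers, so the female mean is judged smaller than the male mean. The second contains zero, so the two means may not differ.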
Two-sample confidence interval
in Minitab
• Select Stat. Select Basic Statistics.
• Select 2-sample t to get a Pop-Up window.
• Click on the radio button before Samples in
one Column. Put the measurement variable in
Samples box, and put the grouping variable in
Subscripts box.
• Specify confidence level.
• If appropriate, select Assume Equal Variances.
• Select OK.
Example
Two sample T for laundry

gender    N   Mean  StDev  SE Mean
M        44   3.07   1.81     0.27
F        44   3.89   3.88     0.58

95% CI for mu (M) - mu (F): (-2.11, 0.47)
T-Test mu (M) = mu (F) (vs not =): T = -1.27  P = 0.21  DF = 60
Example
Do the average guesses of the population of Turkey
differ depending on preliminary information received?
Population of all people
seeing “80 million”
Sample of n1 = 34 people
Population of all people
seeing “10 million”
Sample of n2 = 33 people
Randomized comparative
experiment
• A study in which two or more groups are
randomly assigned to a “treatment” to see
how the treatment affects some “response.”
• If each “experimental unit” has the same
chance of receiving any treatment, then the
experiment is called a “completely
randomized design.”
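One way a completely randomized assignment might be carried out is to shuffle the experimental units and then cut the shuffled list into groups. The sketch below does this for group sizes matching the Turkey example. The participant IDs, the seed, and the function name are illustrative assumptions, since the slides do not say how the assignment was actually done.

```python
import random

def completely_randomized_assignment(units, group_sizes, seed=None):
    """Randomly split units into treatment groups of the given sizes.

    Shuffling means every unit has the same probability as every other
    unit of ending up in a given treatment group.
    """
    if sum(group_sizes.values()) != len(units):
        raise ValueError("group sizes must add up to the number of units")
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    assignment, start = {}, 0
    for treatment, size in group_sizes.items():
        assignment[treatment] = shuffled[start:start + size]
        start += size
    return assignment

# Hypothetical participant IDs; the sizes match the Turkey example.
participants = list(range(1, 68))
groups = completely_randomized_assignment(
    participants, {"80 million": 34, "10 million": 33}, seed=1)
print({form: len(members) for form, members in groups.items()})
```

Fixing the seed makes the assignment reproducible, which is convenient for documenting a study. Leaving it as `None` gives a fresh random split each run.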
Graphical summary of data
[Figure: Guess of Population of Turkey by Form (80, 10); axis from 0 to 300]
Two-sample t-test results

Two sample T for Turkey

Form2    N   Mean  StDev  SE Mean
10      33  12.50   8.50      1.5
80      34   62.8   54.8      9.4

95% CI for mu (10) - mu (80): (-69.6, -30.9)
T-Test mu (10) = mu (80) (vs <): T = -5.28  P = 0.0000  DF = 34
Conclusions of
Turkey experiment
• There is sufficient evidence, at the 0.05
level, to conclude that the average guesses
of the population of Turkey differ between
the two forms.
• The population mean guess of the “10
million” form is lower than the population
mean guess of the “80 million” form.