Lecture notes 7b: Inference for a difference in means
Outline:
• Hypothesis test for two means using independent samples (example 1)
• CI for two means using independent samples
• “Paired” differences
• Hypothesis test for a paired difference (example 2)
• CI for a paired difference
Inference for a difference between population means
• We have looked at how hypothesis tests and confidence intervals can be used to draw inference on a population mean.
• In practice, it is more common to want to investigate how two means differ from one another, or whether they differ at all.
• We will look at how we compare the means from two separate groups (“independent samples”), as well as how we compare the means of two sets of observations taken on the same group (“paired data”).
Hypothesis tests for two means (independent samples)
• We will first look at inferential procedures for comparing means when we have “independent samples”, i.e. two different groups of subjects.
• We’ll start with a hypothesis test, and then look at constructing a corresponding confidence interval.
• The following hypothesis test is usually referred to as a “two sample t-test”.
Hypothesis tests for two means
• This is the formula for the test statistic which compares two sample means to one another:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

• Here, the subscripts refer to populations 1 and 2.
• In the numerator of this formula, $\bar{x}_1 - \bar{x}_2$ is the point estimate for the difference between population means 1 and 2.
Hypothesis tests for two means
• $(\mu_1 - \mu_2)_0$ is the hypothesized difference between the means. This is almost always zero, because the null hypothesis is almost always that the means are equal. If we reject this null, it will be in favor of the alternative hypothesis that the means are different.
• When $(\mu_1 - \mu_2)_0$ is zero, we can simplify the test statistic:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
Hypothesis tests for two means
• The denominator of this equation is the standard error of $\bar{x}_1 - \bar{x}_2$. It combines the standard errors of both $\bar{x}_1$ and $\bar{x}_2$.
• And so this statistic follows the same general formula as the statistic for a one sample t-test:

$$t = \frac{\text{point estimate} - \text{hypothesized value}}{\text{standard error}}$$
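To make the formula concrete, here is a minimal Python sketch (not part of the original notes) that computes the simplified statistic directly from two lists of observations; the data are from example 1 below:

```python
from math import sqrt

def two_sample_t(x1, x2, hypothesized_diff=0.0):
    """Two-sample t statistic with unpooled standard error."""
    n1, n2 = len(x1), len(x2)
    mean1, mean2 = sum(x1) / n1, sum(x2) / n2
    var1 = sum((v - mean1) ** 2 for v in x1) / (n1 - 1)  # sample variance of group 1
    var2 = sum((v - mean2) ** 2 for v in x2) / (n2 - 1)  # sample variance of group 2
    se = sqrt(var1 / n1 + var2 / n2)  # standard error of (mean1 - mean2)
    return (mean1 - mean2 - hypothesized_diff) / se

elderly = [18, 11, 16, 24, 18, 14, 41, 21, 17]
young = [10, 16, 22, 12, 14, 12, 18, 37]
print(two_sample_t(elderly, young))  # about 0.56
```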
Example 1
• A researcher who is studying the relationship between concentration and balance conducts an experiment in which nine elderly subjects and eight young subjects each stand barefoot on a “force platform”, which measures how much a person sways (in millimeters) in the forward/backward and side-to-side directions.
• Subjects are asked to maintain a stable upright position and to react as quickly as possible to a randomly timed noise by pressing a handheld button.
• The researcher would like to know if there is a difference between the elderly and the young with regard to how well they maintain balance in this scenario.
Example 1
Here is the data:

Elderly subjects: 18 11 16 24 18 14 41 21 17
Young subjects: 10 16 22 12 14 12 18 37
Step 1: the hypotheses
State the null and alternative hypotheses:
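One way to fill this in (using subscripts E for elderly and Y for young; the alternative is two-sided because the research question asks only whether there is a difference):

$$H_0: \mu_E = \mu_Y \qquad H_a: \mu_E \neq \mu_Y$$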
Step 2: α and the critical value
• This time we didn’t specify a level of significance. By convention, when no level of significance is specified, we default to α = 0.05.
• As with the one sample test for a mean, the test statistic used for a two sample test follows a t-distribution. So, we will use a t-critical value.
• As it turns out, the formula for finding degrees of freedom for the sampling distribution of this test statistic is pretty involved:

$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$
Step 2: α and the critical value
In practice, software can be used to find these degrees of freedom. In our class, you will be given the degrees of freedom for two sample procedures.
In this case, df = 11.
Find the critical value and sketch the sampling distribution of the test statistic under H0:
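A worked sketch of this step: from a t table with df = 11 and α = 0.05 split between two tails, the critical value is t* = 2.201, so H0 is rejected if the test statistic falls below −2.201 or above 2.201 (the sketch would shade these two tails of the t distribution).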
Step 3: The test statistic and p-value
Compute the test statistic and corresponding p-value:
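A worked sketch of this step, with values computed from the data above (numbers rounded):

$$\bar{x}_E = 20.0,\ s_E^2 = 76.0,\ n_E = 9; \qquad \bar{x}_Y = 17.625,\ s_Y^2 \approx 76.0,\ n_Y = 8$$

$$t = \frac{20.0 - 17.625}{\sqrt{\dfrac{76.0}{9} + \dfrac{76.0}{8}}} \approx \frac{2.375}{4.24} \approx 0.56$$

With df = 11, the corresponding two-sided p-value is roughly 0.59.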
Step 4: The statistical decision
State and interpret the statistical decision:
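Continuing the sketch: since |t| ≈ 0.56 < 2.201 (equivalently, p ≈ 0.59 > 0.05), we fail to reject H0. The data do not provide sufficient evidence of a difference in mean sway between the elderly and the young.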
Example 1 confidence interval
• We can also construct a confidence interval for the true difference in mean millimeters of sway between the elderly and the young.
• Recall the general form of a confidence interval:
CI = point estimate ± margin of error
or CI = point estimate ± (critical value) × (standard error)
Example 1 confidence interval
• Using this general form as a guide, we see that

$$CI = (\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$$

Construct a 95% CI for the true difference in mean sway between the young and elderly groups:
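A worked sketch, using the values from the hypothesis test (elderly minus young): $2.375 \pm 2.201 \times 4.24 \approx 2.375 \pm 9.32$, giving an interval of roughly $(-6.9,\ 11.7)$ millimeters.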
Example 1 confidence interval
Do the results of the hypothesis test and confidence interval agree? Why or why not?
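One way to answer: yes, they agree. The 95% CI contains 0, and the two-sided test at α = 0.05 failed to reject H0; a two-sided test at level α and a (1 − α) confidence interval always lead to the same conclusion about the hypothesized difference.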
A brief aside: pooling variance
• The two sample t-test we just conducted involved finding estimates for the standard deviations of our two groups (young and elderly) separately.
• There is a slight variation on this method which involves “pooling” the standard deviations. Usually this is referred to as “pooling the variances”; mathematically this is the same thing, since the standard deviation is the square root of the variance.
• This method assumes that the population standard deviations of the two groups are equal, and so we can estimate one standard deviation that applies to both groups. We will not be using this technique in our class, but in practice it is often used.
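For reference (this formula is not used in our class), the pooled variance is a weighted average of the two sample variances:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

and the pooled test uses df = $n_1 + n_2 - 2$.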
Statistical Tests Involving Paired Data
• Sometimes, when testing for a difference in means, we are able to measure the same subjects twice and test for differences in the two measurements.
• Examples include “before” and “after” type studies (e.g. comparing blood pressure before a subject starts a drug to blood pressure after), or studies where each subject can be measured under two different treatments, or under a treatment and a control (e.g. conduct a vision test with your right eye, then with your left eye, and compare the results).
• These “paired” studies have an advantage over studies using independent samples in that there is less natural variability to account for.
• Example: suppose we want to see if beer and wine consumption have different effects on short term memory. We could take a sample of participants, randomly assign them to either the beer or wine group, and then administer a memory test after they’ve consumed some specified quantity of alcohol.
• If we do this, any differences we observe between the groups may be attributable to the beer or wine, but they may also be attributable to many other variables that affect memory (age, genetics, physical health, etc.) and that differ from person to person.
• Using paired observations, i.e. multiple observations on the same subjects, in effect “controls” for these other variables.
• If each participant undergoes a memory test once after consuming beer and once after consuming wine, then we know that any difference between these paired observations will not be attributable to these other variables (age, genetics, physical health, etc.).
• Because of this, the amount of random variability that we normally expect to see in our data will be reduced, which might make it easier to test our research hypotheses.
• Using paired observations can introduce other possible problems, and so this type of test is not always the most appropriate.
• For instance, human subjects may perform differently on a test the 2nd time due to having gone through it once.
• Also, you may not be able to test against a placebo in a paired study.
• If conducting a paired study is feasible and appropriate, then it will generally be preferable to a two-sample study because it results in a smaller standard error.
• The formula for a paired t-test is essentially identical to the formula for a t-test for a single mean. The only difference is that the data we use are not raw observations of a variable; rather, they are the differences between the paired observations.
• We note that our data take the form of differences by using the subscript “d” in our notation. Apart from this, the formula is identical to that of a one sample test for a mean:

$$t = \frac{\bar{x}_d - \mu_{d_0}}{s_d / \sqrt{n}}$$
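As an illustration (not part of the original notes), here is a minimal Python sketch of the paired test statistic; it is simply the one-sample statistic applied to the column of differences:

```python
from math import sqrt

def paired_t(before, after, hypothesized_diff=0.0):
    """Paired t statistic: a one-sample t test applied to the differences."""
    diffs = [a - b for a, b in zip(after, before)]  # After - Before
    n = len(diffs)
    mean_d = sum(diffs) / n
    s_d = sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))  # SD of the differences
    return (mean_d - hypothesized_diff) / (s_d / sqrt(n))
```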
Example 2
A physical fitness program is designed to increase a person’s upper body strength. To determine the effectiveness of this program, an SRS of 31 members of a health club was selected, and each member was asked to do as many push-ups as possible in 1 minute. After 1 month on the program the participants were once again asked to do as many push-ups as possible in 1 minute. These values were recorded and the difference (After - Before) was computed.
Example 2
Test to see if the program is effective in increasing upper body strength, as defined by the “number of push-ups in one minute” metric, using α = 0.05.
Here are summary statistics for the variable “difference”:

Variable     N    Mean   Median   StDev   SE Mean
difference   31   9.17   8.00     8.06    1.45

Note here that “SE Mean” (the standard error of the mean) is found by dividing the standard deviation by the square root of the sample size.
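As a quick check: $8.06 / \sqrt{31} \approx 8.06 / 5.57 \approx 1.45$.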
Sample data (Partial Listing)

Subject   Before   After   Difference
1         28       32        4
2         34       32       -2
3         28       42       14
4         60       64        4
5         20       41       21
6         25       33        8
7         32       49       17
8         19       32       13
…         …        …         …

This shows how the paired differences are calculated. For the purposes of this test, our data will be the “difference” column. Note that this isn’t all of the data.
Step 1: The Hypotheses
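One way to fill this in (the differences are After - Before, and the program claims to increase strength, so the alternative is one-sided):

$$H_0: \mu_d = 0 \qquad H_a: \mu_d > 0$$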
Step 2: α and the critical value
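A worked sketch of this step: df = n − 1 = 30, and from a t table the one-sided critical value at α = 0.05 is t* = 1.697, so H0 is rejected if the test statistic exceeds 1.697.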
Step 3: test statistic and p-value
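A worked sketch of this step, using the summary statistics above:

$$t = \frac{\bar{x}_d - 0}{s_d / \sqrt{n}} = \frac{9.17}{1.45} \approx 6.32$$

With df = 30, this is far out in the right tail, so the p-value is extremely small (well below 0.0001).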
Step 4: the statistical decision
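Continuing the sketch: since t ≈ 6.32 > 1.697 (equivalently, p < 0.05), we reject H0. The data provide strong evidence that the program increases the mean number of push-ups done in one minute.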
Example 2 confidence interval
We can also construct a CI for the true mean difference in upper body strength, before and after the program:
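A worked sketch, using the two-sided critical value t* = 2.042 for a 95% CI with df = 30: $9.17 \pm 2.042 \times 1.45 \approx 9.17 \pm 2.96$, or roughly $(6.2,\ 12.1)$ push-ups.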
Example 2 confidence interval
Finally, we can interpret this confidence interval, and note how it relates to the hypothesis test we conducted:
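One way to interpret it: we are 95% confident that the true mean change (After - Before) is between about 6.2 and 12.1 push-ups. The interval lies entirely above 0, which agrees with the test’s rejection of H0 (keeping in mind that the test was one-sided while the interval is two-sided).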
Conclusion
• Hopefully the basic process of performing a hypothesis test and constructing a confidence interval has become familiar.
• The statistical techniques that we study throughout the remainder of the class will always involve a hypothesis test, a confidence interval, or both. These are the “bread and butter” of statistical inference.
• In the next set of notes, we will consider the assumptions that underlie these procedures, as well as some areas of controversy in statistical inference.