• Study Resource
• Explore

Survey

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
```Statistics for the Behavioral
and Social Sciences:
A Brief Course
Fifth Edition
Arthur Aron, Elaine N. Aron, Elliot Coups
Prepared by:
Genna Hymowitz
Stony Brook University
This multimedia product and its contents are protected under copyright law.
The following are prohibited by law:
-any public performance or display, including transmission of any image over a network;
-preparation of any derivative work, including the extraction, in whole or in part, of any images;
-any rental, lease, or lending of the program.
The t Test for Independent Means
Chapter 9
Chapter Outline
• The Distribution of Differences Between Means
• Estimating the Population Variance
• Hypothesis Testing with a t Test for Independent
Means
• Assumptions of the t Test for Independent Means
• Effect Size and Power for the t Test for Independent
Means
• Review and Comparison of the Three Kinds of Tests
• The t Test for Independent Means in Research
Articles
t Tests for Independent Means
• Hypothesis-testing procedure used for studies
with two sets of scores
– Each set of scores is from an entirely different group of
people and the population variance is not known.
• e.g., a study that compares a treatment group to a
control group
The Distribution of Differences
Between Means
• When you have one score for each person with two different
groups of people, you can compare the mean of one group to
the mean of the other group.
– The t test for independent means focuses on the difference
between the means of the two groups.
• The comparison distribution is a distribution of differences between
means.
– created by randomly selecting one mean from the distribution of means from
the first group’s population
– randomly selecting one mean from the distribution of means for the second
group’s population
– Subtract the mean from the second distribution of means from the mean
from the first distribution of means to create a difference score between the
two selected means.
– Repeat this process a large number of times and you will have a distribution
of differences between means.
» Note that this is not the actual way a distribution of means is created, but
conceptually this is what a distribution of means is.
The Logic of a Distribution of
Differences Between Means
• The null hypothesis is that Population M1 = Population M2
– If the null hypothesis is true, the two population means from which
the samples are drawn are the same.
• The population variances are estimated from the sample scores.
• The variance of the distribution of differences between means is
based on estimated population variances.
– The goal of a t test for independent means is to decide whether the
difference between means of your two actual samples is a more
extreme difference than the cutoff difference on this distribution of
differences between means.
Mean of the Distribution of
Differences Between Means
• With a t test for independent means, two populations are
considered.
– An experimental group is taken from one of these
populations and a control group is taken from the other
population.
• If the null hypothesis is true:
– The populations have equal means.
– The distribution of differences between means has a mean
of 0.
Estimating the Population
Variance
• In a t test for independent means, you calculate two estimates of
the population variance.
– Each estimate is weighted by a proportion consisting of its sample’s
degrees of freedom divided by the total degrees of freedom for both
samples.
• The estimates are weighted to account for differences in sample size.
– More weight is given to the larger sample.
» The most weight is given to the sample that provides the most information
as determined by the degrees of freedom each sample provides.
– The weighted estimates are averaged.
• This is known as the pooled estimate of the population variance.
– S2Pooled = df1(S21) + df2(S22)
df Total
df Total
– df Total = df1 + df2
Figuring the Variance of Each of the
Two Distributions of Means
• The pooled estimate of the population variance is the best
estimate for both populations.
• Even though the two populations have the same variance,
if the samples are not the same size, the distributions of
means taken from them do not have the same variance.
– This is because the variance of a distribution of means is the
population variance divided by the sample size.
• S2M1 = S2Pooled / N1
• S2M2 = S2Pooled / N2
The Variance and Standard Deviation
of the Distribution of Differences
Between Means
• The Variance of the distribution of differences between means
(S2Difference) is the variance of Population 1’s distribution of
means plus the variance of Population 2’s distribution of means.
– S2Difference = S2M1 + S2M2
• The standard deviation of the distribution of difference between
means (SDifference ) is the square root of the variance.
– SDifference = √S2Diifference
Steps to Find the Standard Deviation
of the Distribution of Differences
Between Means
•
•
Figure the estimated population variances based on each sample.
• S2 = [∑(X – M)2] / (N – 1)
Figure the pooled estimate of the population variance.
• S2Pooled = df1(S21) + df2(S22)
df Total
df Total
df1 = N1 – 1 and df2 = N2 – 1; dfTotal = df1 + df2
•
•
•
Figure the variance of each distribution of means.
• S2M1 = S2Pooled / N1
• S2M2 = S2 Pooled / N2
Figure the variance of the distribution of differences between means.
• S2Difference = S2M1 + S2M2
Figure the standard deviation of the distribution of differences between
means.
• SDifference =√ S2Difference
The Shape of the Distribution of
Differences Between Means
• Since the distribution of differences between
means is based on estimated population
variances:
– The distribution of differences between means is a t
distribution.
– The variance of this distribution is figured based on
population variance estimates from two samples.
• The degrees of freedom of this t distribution are the sum
of the degrees of freedom of the two samples.
– dfTotal = df1 + df2
The t score for the Difference
Between the Two Actual Means
• Figure the difference between your two samples’
means.
• Figure out where this difference is on the distribution
of differences between means.
– t = M1 – M2 / SDifference
Hypothesis Testing with a t
Test for Independent Means
• The comparison distribution is a distribution
of differences between means.
• The degrees of freedom for finding the cutoff
on the t table is based on two samples.
• Your samples’ score on the comparison
distribution is based on the difference
Summary of t Test for Independent
Means
(Steps 1 and 2)
•
•
Restate the question as a research hypothesis and a null hypothesis about the
populations.
Determine the characteristics of the comparison distribution.
–
–
The mean will be 0.
standard deviation
•
Calculate the estimated population variances based on each sample.
•
•
Figure the estimated population variances based on each sample.
• S2 = [∑(X – M)2] / (N – 1)
Figure the pooled estimate of the population variance.
» S2Pooled = df1 (S21) + df2 (S22)
df Total
df Total
df1 = N1 – 1 and df2 = N2 – 1; dfTotal = df1 + df2
•
•
Figure the variance of each distribution of means.
» S2M1 = S2Pooled / N1
» S2M2 = S2 Pooled / N2
Figure the variance of the distribution of differences between means.
•
•
Figure the standard deviation of the distribution of differences between means.
•
•
S2Difference = S2M1 + S2M2
SDifference = √S2Diifference
The comparison distribution will be a t distribution with df total degrees of freedom.
Summary of t Test for Independent
Means
(Steps 3, 4, & 5)
•
Determine the cutoff sample score on the comparison distribution at
which the null hypothesis should be rejected.
– Determine the degrees of freedom (dfTotal), desired significance level, and
tails in the test (one or two).
– Look up the appropriate cutoff in a t table.
– If the exact df is not given, use the df below it.
•
Determine your sample’s score on the comparison distribution.
–
•
t = (M1 – M2) / SDifference
Decide whether to reject the null hypothesis.
–
Compare your samples’ score on the comparison distribution to the cutoff t score.
Example of The t Test for
Independent Means: Step 1
•
Use the expressive writing study example from the text.
–
–
–
–
•
You have a sample of 20 students who were recruited to take part in the study.
10 students were randomly assigned to the expressive writing group and wrote about
their thoughts and feelings associated with their most traumatic life events.
10 students were randomly assigned to the control group and wrote about their plans
for the day.
One month later, all of the students rated their overall level of physical health on a
scale from 0 (very poor health) to 100 (perfect health).
Restate the question as a research hypothesis and a null hypothesis about the
populations.
–
–
–
–
Population 1: students who do expressive writing
Population 2: students who write about a neutral topic (their plans for the day)
Research hypothesis: Population 1 students would rate their health differently from
Population 2 students (two-tailed tests).
Null hypothesis: Population 1 students would rate their health the same as Population
2 students.
Example of t Test for Independent
Means: Step 2
•
Determine the characteristics of the comparison distribution.
–
–
–
•
–
•
–
•
•
•
–
•
–
•
–
The comparison distribution is a distributions of differences between means.
Its mean = 0.
Figure the estimate population variances based on each sample.
S21 = 94.44 and S22 = 111.33
Figure the pooled estimate of the population variance.
S2Pooled = 102.89
Figure the variance of each distribution of means.
S2Pooled / N = S2M
S2M1 = 10.29
S2M2 = 10.29
Figure the variance of the distribution of differences between means.
Adding up the variances of the two distributions of means would come out to S2Difference
= 20.58
Figure the standard deviation of the distribution of difference between means.
S Difference= √S2Difference = √20.58 = 4.54
The shape of the comparison distribution will be a t distribution with a total of 18
degrees of freedom.
Example of t Test for Independent
Means: Step 3
• Determine the cutoff sample score on the
comparison distribution at which the null
hypothesis should be rejected.
– You will use a two-tailed test.
– If you also chose a significance level of
.05, the cutoff scores from the t table would
be 2.101 and -2.101.
Example of t Test for Independent
Means: Step 4
• Determine your sample’s score on the
comparison distribution.
– t = (M1 – M2) / SDifference
– (79.00 – 68.00) / 4.54
– 11.00 / 4.54 = 2.42
Example of t Test for Independent
Means: Step 5
• Decide whether to reject the null
hypothesis.
– Compare your samples’ score on the
comparison distribution to the cutoff t
score.
– Your samples’ score is 2.42, which is
larger than the cutoff score of 2.10.
– You can reject the null hypothesis.
Assumptions of the t Test for
Independent Means
• The population distributions are normal.
• The two populations have the same variance.
• Even if the distributions are not exactly
normal or the variances are not exactly equal,
the t test is still pretty accurate.
– If the populations are very far from normal, the
variances are very different, or the variances are
very different and the populations are far from
normal, then the t test does not give accurate
results and alternative tests should be used.
Effect Size and Power for the t
Test for Independent Means
• Estimated Effect Size = (M1 – M2) / Spooled
– The estimated effect size is the difference
between the sample means divided by the
pooled estimate of the population's standard
deviation.
• Power
– determined using a power table, computer
software, or a power calculator (found
online)
Power When Sample Sizes
Are Not Equal
• Harmonic Mean
– special average influenced more by smaller numbers
• It is used in a t test for independent means when the number of
scores in the two groups differ.
– In such cases, the harmonic mean is used as the equivalent of
each group’s sample size when determining power.
• It gives the equivalent sample size for what you would have if
• It is two times the first sample size multiplied by the second
sample size—all divided by the sum of the two sample sizes.
Harmonic Mean = [2(N1)(N2)] / (N1 + N2)
Review of the t Test for a Single Sample, t Test
for Dependent Means, and the t Test for
Independent Means
•
t Test for a Single Sample
–
–
–
–
–
–
•
t Test for Dependent Means
–
–
–
–
–
–
–
•
Population Variance is not known.
Population mean is known.
There is 1 score for each participant.
The comparison distribution is a t distribution.
df = N – 1
Formula t = (M – Population M) / Population SM
Population variance is not known.
Population mean is not known.
There are 2 scores for each participant.
The comparison distribution is a t distribution.
t test is carried out on a difference score.
df = N – 1
Formula t = (M – Population M) / Population SM
t Test for Independent Means
–
–
–
–
–
–
Population variance is not known.
Population mean is not known.
There is 1 score for each participant.
The comparison distribution is a t distribution.
df total = df1 + df2 (df1 = N1 – 1; df2 = N2 – 1)
Formula t = (M1 – M2) / SDifference
The t Test for Independent
Means in Research Articles
• When found in research articles, the
results of these tests are accompanied
by reporting of the means and
sometimes the standard deviations.
– t = (dftotal) = x.xx, p < .01
Key Points
•
•
•
•
•
A t test for independent means is used for hypothesis testing with scores from two entirely
separate groups of people. The comparison distribution is a distribution of differences between
means of samples. The distribution can be thought of as being built up in two steps. Each
population of individuals produces a distribution of means, and then a new distribution is created of
differences between pairs of means selected from these two distributions of means.
The distribution of differences between means has a mean of 0 and a t distribution with the total of
the degrees of freedom from the two samples. Its standard deviation is figured in several steps.
– Figure the estimated population variance based on each sample.
– Figure the pooled estimate of the population variance.
– Figure the variance of each distribution of means.
– Figure the variance of the distribution of differences between means.
– Figure the standard deviation of the distribution of differences between means.
The assumptions of the t test for independent means are that the two populations have a normal
distribution and have the same variance. However, the t test gives fairly accurate results when the
distribution is not exactly normal or the variances are not exactly equal.
Estimated effect size for a t test for independent means is the difference between the samples’
means divided by the pooled estimate of the population standard deviation. Power is greatest
when the sample sizes of the two groups are equal. When they are not equal, you use the
harmonic mean of the two sample sizes when looking up power on a table. Power for a t test for
independent means can be determined using a t table, a power software package, or an internet
power calculator.
t tests for independent means are usually reported in research articles with the means of the two
groups and the degrees of freedom, t score, and significance level. Results may also be reported
in a table where significant differences are noted by asterisks.