Chapter 9
Three Tests of Significance
Winston Jackson and Norine Verberg
Methods: Doing Social Research, 4e
Inferential Statistics
Inferential statistics do two things:
1. Allow us to judge the accuracy of generalizing from a limited sample to the larger population
   ◦ E.g., we can be 95% certain that the population value lies within 4 percentage points of the sample estimate of 30%
2. Conduct hypothesis testing
   • Indicate whether a study outcome is a fluke or reflects a true difference in the population
   • i.e., say whether the findings are statistically significant
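As a rough illustration of the "30%, plus or minus 4%" example, the sketch below computes a 95% margin of error for a sample proportion. The sample size of 500 and the function name `margin_of_error` are assumptions made for illustration; the slides do not state a sample size.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Half-width of an approximate 95% confidence interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)


p_hat = 0.30  # sample proportion from the slide's example
n = 500       # hypothetical sample size (not given in the slides)

moe = margin_of_error(p_hat, n)
print(f"{p_hat:.0%} plus or minus {moe:.1%}")
```

With these assumed inputs the margin comes out close to 4 percentage points; a smaller sample would widen it.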
© 2007 Pearson Education Canada
What Does Statistically Significant Mean?
• A test of significance reports the probability that an observed difference or association could be the result of sampling fluctuations alone rather than a reflection of a “true” difference in the population from which the sample was selected
• Three tests of statistical significance are introduced in Chapter 9
   ◦ Chi-square test, t-test, and F-test
Preliminary Considerations
1. Research and null hypothesis
2. The sampling distribution
   ◦ Standard error of the means
3. One- and two-tailed tests of significance
1. Research and Null Hypothesis
• Tests of significance are used to test hypotheses
   ◦ Set up in the form of a “research hypothesis” and a “null hypothesis”
• Research Hypothesis (or Alternative Hypothesis)
   ◦ States one’s prediction of the relationship between the variables
• Null Hypothesis
   ◦ States the prediction that there is no relation between the variables
Research and Null Hypothesis (cont’d)
• Research hypothesis 1: The greater the participation, the higher the self-esteem
   ◦ Null hypothesis 1: There is no relation between levels of participation and levels of self-esteem
• Research hypothesis 2: Male university faculty members earn more money than their female counterparts after controlling for qualifications, achievements, and experience
   ◦ Null hypothesis 2: There is no relation between gender and earnings of faculty members, after controls
Research and Null Hypothesis (cont’d)
• It is the null hypothesis that is tested
   ◦ The test leads us to accept or reject the null hypothesis
• If the null hypothesis is accepted:
   ◦ Conclude that the association or difference may simply be the result of sampling fluctuations and may not reflect an association or difference in the population being studied
   ◦ The research hypothesis is therefore deemed not to be supported
Research and Null Hypothesis (cont’d)
• If the null hypothesis is rejected:
   ◦ Argue that there is an association between the variables in the population, and that this association is of a magnitude that probably has not occurred because of chance fluctuations in sampling
• Would then examine the data to see if the association is in the predicted direction
   ◦ i.e., consistent with the prediction (it could differ from what was predicted)
Findings and Probability
• When the results of a study lead to the rejection of the null hypothesis, this only means that there is probably a relationship between the variables under examination
   ◦ It is one piece of evidence that the relationship exists
• Other researchers will test it again, and either confirm or disconfirm the past findings
   ◦ Research conclusions are therefore treated as tentative, open to disconfirmation
Did they fail if they accept the null?
• Some researchers believe they have failed if they accept the null hypothesis (i.e., find no relation between the variables rather than support for the predicted relationship)
   ◦ Not so: it is just as important to show that two variables are not associated as it is to find out that they are associated
2. The Sampling Distribution
• Tests of significance report whether an observed relationship could be the result of sampling fluctuations or reflects a “real” difference in the population from which the sample has been taken
• Sampling fluctuation is the idea that each time we select a sample we will get somewhat different results
   ◦ If we draw 1,000 samples of 50 cases, each will be slightly different from the first sample
The Sampling Distribution (cont’d)
• If the means of the same variable for each of the samples were plotted, a normal curve would result, but it would be peaked (or leptokurtic)
• Example: The mean weights of respondents are plotted
   ◦ The sample means range from 70 to 80 kg, but the majority of samples would cluster around the true mean weight of 75 kg
   ◦ Note: We are plotting the mean weight of the respondents in each of the 1,000 samples drawn
The Sampling Distribution (cont’d)
• The distribution is quite peaked because we are plotting the mean weight for each sample rather than individual weights
• To measure the dispersion of the means of the samples, we use a statistic called the standard error of the means
$$\text{Standard error of the means} = \frac{SD_{\text{population}}}{\sqrt{N}}$$
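The standard error formula can be checked with a small simulation: draw many samples, take the mean of each, and compare the spread of those means with the population SD divided by the square root of N. This is only a sketch; the population values (mean 75 kg, SD 10 kg) and the seed are assumptions, while 1,000 samples of 50 cases follows the slides.

```python
import random
import statistics

random.seed(42)

# Hypothetical population of weights in kg (mean 75, SD 10 are assumed values).
population = [random.gauss(75, 10) for _ in range(100_000)]
sd_population = statistics.pstdev(population)

n = 50              # cases per sample, as in the slides
num_samples = 1000  # number of repeated samples, as in the slides

# Mean weight of each repeated sample.
sample_means = [
    statistics.fmean(random.sample(population, n)) for _ in range(num_samples)
]

observed_se = statistics.stdev(sample_means)  # spread of the sample means
predicted_se = sd_population / n ** 0.5       # SD of population / sqrt(N)
print(f"observed: {observed_se:.2f}  predicted: {predicted_se:.2f}")
```

The two figures should agree closely, which is why the distribution of sample means is so much narrower than the distribution of individual weights.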
The Sampling Distribution (cont’d)
• Relevance to hypothesis testing?
• In doing tests of significance, we are assessing whether the results of one sample fall within the null hypothesis acceptance zone (usually 95% of the distribution) or outside the zone, in which case we reject the null hypothesis
• Four key points can be made about probability sampling procedures where repeated samples are taken
Four Key Points: Repeated Samples
1. Plotting the means of repeated samples will
produce a normal distribution: it will be more
peaked than when raw data are plotted (as
shown in Figure 9.1)
Four Key Points (cont’d)
2. The larger the sample sizes, the more
peaked the distribution and the closer the
means of the samples to the population
mean (shown in Figure 9.2)
Four Key Points (cont’d)
3. The greater the variability in the population,
the greater the variations in the samples
4. When sample sizes are above 100, even if a
variable in the population is not normally
distributed, the means will be normally
distributed when repeated samples are
plotted

   ◦ E.g., the weights of a population of males and females will be bimodal, but if we drew repeated samples, the sample means would be normally distributed
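Point 4 can be illustrated with a simulation: build a bimodal population, draw repeated samples of 100, and check that about 95% of the sample means fall within 1.96 standard errors of the population mean, as a normal curve would predict. The group means (65 kg and 85 kg), the SD of 5 kg, and the seed are assumptions made for this sketch.

```python
import random
import statistics

random.seed(1)

# Hypothetical bimodal population: two equal-sized groups with different
# mean weights (65 kg and 85 kg are assumed values for illustration).
population = (
    [random.gauss(65, 5) for _ in range(50_000)]
    + [random.gauss(85, 5) for _ in range(50_000)]
)
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 100                 # cases per sample
se = sigma / n ** 0.5   # standard error of the means
means = [statistics.fmean(random.sample(population, n)) for _ in range(2000)]

# For a normal distribution, about 95% of values lie within 1.96 SE of the mean.
share_within = sum(abs(m - mu) <= 1.96 * se for m in means) / len(means)
print(f"share of sample means within 1.96 SE: {share_within:.3f}")
```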
3. One- and Two-Tailed Tests
• If the direction of a relationship is predicted, the appropriate test will be one-tailed
   ◦ If the direction of the relationship is not predicted, conduct a two-tailed test
• Example:
   ◦ One-tailed: Females are less approving of violence than are males
   ◦ Two-tailed: There is a gender difference in the acceptance of violence [Note: No prediction about which gender is more approving]
One- and Two-Tailed Tests (cont’d)
• Figure 9.3 (next slide) shows two normal distribution curves
• The first one has the 5% rejection area split between the two tails; this would be a two-tailed test
• The second one has the 5% rejection area all in one tail, indicating a one-tailed test
• The same principle applies to the 1% level
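To make the difference concrete, the sketch below finds where the rejection area begins on a standard normal curve for each kind of test at the 5% level. It uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
z = NormalDist()  # standard normal curve (mean 0, SD 1)

# Two-tailed: 2.5% of the area in each tail.
two_tailed_cutoff = z.inv_cdf(1 - alpha / 2)
# One-tailed: the full 5% in a single tail.
one_tailed_cutoff = z.inv_cdf(1 - alpha)

print(f"two-tailed: +/-{two_tailed_cutoff:.3f}  one-tailed: {one_tailed_cutoff:.3f}")
```

Because the one-tailed cutoff is closer to the centre of the curve, a result in the predicted direction needs to be less extreme to reach significance. Setting `alpha = 0.01` gives the corresponding cutoffs for the 1% level.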
Figure 9.3 Five Percent Probability
Rejection Area: One- and Two-Tailed Tests
Chi-Square: Red and White Balls
• The chi-square test (χ²) is used primarily in contingency table analysis, where the dependent variable is nominal level
• The formula is:
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
where f_o is the observed frequency and f_e is the expected frequency
One Sample Chi-Square Test
Suppose the following incomes:

INCOME                STUDENT SAMPLE (N)    % OF SAMPLE    GENERAL POPULATION (%)
Over $100,000                 30                15.0               7.8
$40,000 to $99,999           160                80.0              68.9
Under $40,000                 10                 5.0              23.3
TOTAL                        200               100.0             100.0
The Computation
• Chi-square tests compare the expected frequencies (assuming the null hypothesis is correct) to the observed frequencies
• To calculate the expected frequencies, simply multiply the proportion in each category of the general population by the total number of cases (e.g., 200 students)
• Why do you do this?
Why?
• If the student sample is drawn equally from all segments of society, then the students should have the same income distribution as the general population (this assumes the null hypothesis is correct)
• So what are the expected frequencies in this case?
Expected Frequencies (fe)

Frequency Observed    Frequency Expected
30                    15.6 (200 x .078)
160                   137.8 (200 x .689)
10                    46.6 (200 x .233)

• Degrees of Freedom = 2
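The one-sample computation can be reproduced in a few lines. The sketch below uses the observed counts and general-population proportions from the income table; the function name `chi_square` is illustrative rather than taken from the text.

```python
def chi_square(observed, expected):
    """Sum of (fo - fe)^2 / fe across all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


# Observed student counts and general-population proportions from the slides.
observed = [30, 160, 10]
population_share = [0.078, 0.689, 0.233]

n = sum(observed)  # 200 students
expected = [share * n for share in population_share]

statistic = chi_square(observed, expected)
df = len(observed) - 1  # categories minus one

print([round(fe, 1) for fe in expected], round(statistic, 2), df)
```

The result rounds to 45.61 with 2 degrees of freedom. Most of that total comes from the lowest income category, where far fewer students are observed than expected.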
Decision
• Look up critical value: Table 9.2, p. 260
• Need to know:
   • 2 degrees of freedom
   • .05 level of significance
   • 1-tailed test (i.e., column one)
• Find the critical value = 4.61
• Compare to the calculated chi-square = 45.61
• Decision: Calculated value exceeds critical value, so reject the null hypothesis
• Inspect the data and conclude that university students come from higher SES backgrounds
Standard Chi-Square Test
• Drug use by Gender (Box 9.4, p. 261)
• 3 categories of drug use (no experience, once or twice, three or more times)
• Expected frequency = (row marginal x column marginal) ÷ total N of cases
• Degrees of freedom = (rows – 1)(columns – 1) = 2
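The expected-frequency rule for a contingency table can be sketched as follows. The counts are made up for illustration and are not the data from Box 9.4, which this transcript does not reproduce; only the layout (three drug-use categories by two genders) follows the slide.

```python
# Hypothetical counts (NOT the Box 9.4 data): rows are drug-use categories,
# columns are the two gender groups.
table = [
    [40, 50],  # no experience
    [30, 25],  # once or twice
    [30, 25],  # three or more times
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Expected frequency for each cell: row marginal x column marginal / total N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (fo - fe) ** 2 / fe
    for obs_row, exp_row in zip(table, expected)
    for fo, fe in zip(obs_row, exp_row)
)
df = (len(table) - 1) * (len(table[0]) - 1)  # (rows - 1)(columns - 1)

print(expected, round(chi2, 2), df)
```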
Decision
• With 2 degrees of freedom, 2-tailed test, .05 level of significance, the critical value is 5.99
• Calculated chi-square is 5.69
• Does not equal or exceed the critical value
• So, your decision is what?
   ◦ Accept the null hypothesis
The t Distribution:
t-Test Groups and Pairs
Used often for experimental data
t-test used when:
• Sample size is small (e.g., < 30)
• Dependent variable measured at ratio level
• Random assignment to treatment/control groups
• Treatment has two levels only
• Population normally distributed
The t Distribution:
t-Test Groups and Pairs
The t-test represents the ratio between the
difference in means between two groups and
the standard error of the difference. Thus:
$$t = \frac{\text{difference between the means}}{\text{standard error of the difference}}$$
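A minimal sketch of this ratio for two independent groups is shown below. It assumes the pooled-variance form of the standard error, which the slides do not spell out, and the scores are hypothetical.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """t = difference between the means / standard error of the difference.

    Assumes equal population variances (pooled-variance standard error).
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.fmean(group_a) - statistics.fmean(group_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / se_diff, n_a + n_b - 2


# Hypothetical scores for a treatment group and a control group.
treatment = [12, 15, 14, 16, 13]
control = [10, 11, 9, 12, 8]

t_value, df = independent_t(treatment, control)
print(round(t_value, 2), df)
```

The resulting t is then compared with the critical value for the given degrees of freedom in a t table.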
Two t-Tests:
Between- and Within-Subject Design
• Between-subjects: used in an experimental design, with an experimental and a control group, where the groups have been independently established
• Within-subjects: in these designs the same person is subjected to different treatments and a comparison is made between the two treatments
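For the within-subjects case, the test works on each person's difference score rather than on two separate groups. The sketch below shows one common way to compute it; the function name `paired_t` and the scores are hypothetical.

```python
import math
import statistics


def paired_t(first, second):
    """Within-subjects t: mean difference / standard error of the differences."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.fmean(diffs) / se, n - 1


# Hypothetical scores for the same five people under two treatments.
treatment_1 = [10, 12, 9, 11, 13]
treatment_2 = [12, 15, 10, 14, 14]

t_value, df = paired_t(treatment_1, treatment_2)
print(round(t_value, 2), df)
```

Because each person serves as their own comparison, the degrees of freedom are the number of pairs minus one rather than the total number of scores minus two.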
The F Distribution: Means, ANOVA
• Box 9.7 provides an illustration of one-way analysis of variance (“Egalitarianism by Country,” p. 269)
• Concern is with how much variation there is within columns compared to variation between columns
• The F represents the ratio of between-column variation to within-column variation
• Probabilities are looked up in Table 9.4, p. 272
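The F ratio can be sketched by hand as the between-group mean square divided by the within-group mean square. The scores below are hypothetical and are not the egalitarianism data from Box 9.7; the function name `one_way_f` is illustrative.

```python
import statistics


def one_way_f(groups):
    """F = between-group mean square / within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.fmean(all_values)
    k, n = len(groups), len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of the scores around their own group mean.
    ss_within = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scores for three groups (columns).
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]

f_ratio, df_between, df_within = one_way_f(groups)
print(round(f_ratio, 2), df_between, df_within)
```

A large F means the group means differ by more than the spread within the groups would lead us to expect from sampling fluctuations alone.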
When Are Tests of Significance
Not Appropriate?
• Total populations studied
• Non-probability sampling procedures used
• High nonparticipation rates
• Nonexperimental research tests for intervening variables
• Research is not guided by formal hypotheses