CHAPTER 7: TESTING HYPOTHESES
Leon-Guerrero and Frankfort-Nachmias,
Essentials of Statistics for a Diverse Society
Chapter 7: Testing Hypotheses
• Overview
• Assumptions of Statistical Hypothesis Testing
• Stating the Research and Null Hypotheses
• Determining What is Sufficiently Improbable: Probability Values and Alpha
• Five Steps in Hypothesis Testing: A Summary
• Errors in Hypothesis Testing
• Testing Hypotheses About Two Samples
• The Sampling Distribution of the Difference Between Means
• The t Statistic
Leon-Guerrero/Frankfort-Nachmias: Essentials of Social Statistics for a Diverse Society
© 2012 SAGE Publications
Overview
• Nominal dependent variable, nominal independent variable: Considers the
distribution of one variable across the categories of another variable
• Nominal dependent variable, interval independent variable: Considers how a
change in a variable affects a discrete outcome
• Interval dependent variable, nominal independent variable: Considers the
difference between one group's mean on a variable and another group's mean
• Interval dependent variable, interval independent variable: Considers the
degree to which a change in one variable results in a change in another
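Overview
• Nominal dependent variable, nominal independent variable: Lambda
• Nominal dependent variable, interval independent variable: Logistic Regression
• Interval dependent variable, nominal independent variable: Confidence Intervals, T-Test
• Interval dependent variable, interval independent variable: Regression, Correlation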
General Examples
Interval dependent variable, nominal independent variable:
• Is one group scoring significantly higher on average than another group?
• Is a group statistically different from another on a particular dimension?
• Is Group A’s mean higher than Group B’s?
Specific Examples
Interval dependent variable, nominal independent variable:
• Do people living in rural communities live longer than those in urban or
suburban areas?
• Do students from private high schools perform better in college than those
from public high schools?
• Is the average number of years with an employer lower or higher for large
firms (over 100 employees) compared to those with fewer than 100 employees?
Testing Hypotheses
• Statistical hypothesis testing – A procedure that
allows us to evaluate hypotheses about population
parameters based on sample statistics.
• Research hypothesis (H1) – A statement reflecting
the substantive hypothesis. It is always expressed
in terms of population parameters, but its specific
form varies from test to test.
• Null hypothesis (H0) – A statement of “no
difference,” which contradicts the research
hypothesis and is always expressed in terms of
population parameters.
Research and Null Hypotheses
One Tail – specifies the hypothesized direction
• Research Hypothesis:
H1: $\mu_1 > \mu_2$, or $\mu_1 - \mu_2 > 0$
• Null Hypothesis:
H0: $\mu_1 = \mu_2$, or $\mu_1 - \mu_2 = 0$
Two Tail – direction is not specified (more common)
• Research Hypothesis:
H1: $\mu_1 \neq \mu_2$, or $\mu_1 - \mu_2 \neq 0$
• Null Hypothesis:
H0: $\mu_1 = \mu_2$, or $\mu_1 - \mu_2 = 0$
One-Tailed Tests



One-tailed hypothesis test – A hypothesis test
in which the alternative is stated in such a way
that the probability of making a Type I error is
entirely in one tail of a sampling distribution.
Right-tailed test – A one-tailed test in which the
sample outcome is hypothesized to be at the
right tail of the sampling distribution.
Left-tailed test – A one-tailed test in which the
sample outcome is hypothesized to be at the
left tail of the sampling distribution.
Two-Tailed Tests

Two-tailed hypothesis test – A
hypothesis test in which the region of
rejection falls equally within both tails
of the sampling distribution.
Probability Values


Z statistic (obtained) – The test statistic
computed by converting a sample statistic
(such as the mean) to a Z score. The formula
for obtaining Z varies from test to test.
P value – The probability associated with
the obtained value of Z.
Probability Values

Alpha (α) – The level of probability at
which the null hypothesis is rejected. It is
customary to set alpha at the .05, .01, or
.001 level.
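As a minimal sketch of how a P value is compared with alpha, the Python below converts an obtained Z into a probability under the standard normal curve. The function names (normal_cdf, p_value, reject_null), the tail labels, and the sample Z of 2.10 are illustrative assumptions, not part of the chapter.

```python
import math


def normal_cdf(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def p_value(z: float, tail: str = "two") -> float:
    """Probability of a Z at least as extreme as the obtained value.

    tail is a hypothetical label: "right", "left", or "two".
    """
    if tail == "right":
        return 1.0 - normal_cdf(z)
    if tail == "left":
        return normal_cdf(z)
    if tail == "two":
        return 2.0 * (1.0 - normal_cdf(abs(z)))
    raise ValueError("tail must be 'right', 'left', or 'two'")


def reject_null(p: float, alpha: float = 0.05) -> bool:
    """Reject H0 when the P value is at or below alpha."""
    return p <= alpha


if __name__ == "__main__":
    z_obtained = 2.10  # illustrative value, not taken from the chapter
    p = p_value(z_obtained, tail="two")
    print(f"p = {p:.4f}, reject H0 at alpha = .05: {reject_null(p)}")
```

A two-tailed test doubles the area in one tail, which is why the same Z yields a larger P value than it would in a one-tailed test.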
Five Steps to Hypothesis Testing
(1) Making assumptions
(2) Stating the research and null hypotheses
and selecting alpha
(3) Selecting the sampling distribution and
specifying the test statistic
(4) Computing the test statistic
(5) Making a decision and interpreting the
results
Type I and Type II Errors
• Type I error (false rejection error) – the probability (equal to α)
associated with rejecting a true null hypothesis.
• Type II error (false acceptance error) – the probability associated with
failing to reject a false null hypothesis.
Based on sample results, the decision made is to…
In the population H0 is ...    reject H0           do not reject H0
true                           Type I error (α)    correct decision
false                          correct decision    Type II error
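The link between alpha and the Type I error can be illustrated with a small simulation. The sketch below assumes a setting that is not in the chapter: samples are drawn from a normal population in which the null hypothesis is true (mean 0, known standard deviation 1), and a two-tailed Z test is run on each sample. The function names, sample size, trial count, and seed are all hypothetical choices.

```python
import math
import random


def z_for_sample_mean(sample, mu0, sigma):
    """Z statistic for a sample mean when the population sigma is known."""
    n = len(sample)
    mean = sum(sample) / n
    return (mean - mu0) / (sigma / math.sqrt(n))


def type_i_error_rate(trials=4000, n=30, z_critical=1.96, seed=7):
    """Share of samples that reject H0 even though H0 is true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: the population mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(z_for_sample_mean(sample, 0.0, 1.0)) > z_critical:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    print(f"Observed Type I error rate: {type_i_error_rate():.3f}")
```

With a critical Z of 1.96, the share of false rejections should land close to .05, which is the alpha level that the critical value corresponds to.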
t-Test



t statistic (obtained) – The test statistic computed to
test the null hypothesis about a population mean
when the population standard deviation is
unknown and is estimated using the sample
standard deviation.
t distribution – A family of curves, each
determined by its degrees of freedom (df). It is
used when the population standard deviation is
unknown and the standard error is estimated from
the sample standard deviation.
Degrees of freedom (df) – The number of scores
that are free to vary in calculating a statistic.
[Figure: t Distribution]
[Table: t Distribution Table]
t-Test for Difference Between Two Means
Is the value of $\mu_1 - \mu_2$ significantly different from 0?
This test gives you the answer:
$$t_{(N_1 + N_2 - 2)} = \frac{\bar{Y}_1 - \bar{Y}_2}{S_{\bar{Y}_1 - \bar{Y}_2}}$$
The numerator is the difference between the two means; the denominator is
the estimated standard error of the difference.
If the absolute value of t is greater than 1.96, the difference between the
means is significantly different from zero at an alpha of .05 (or a 95%
confidence level) in a two-tailed test.
The critical value of t will be higher than 1.96 if the total N is less than
122. See Appendix C for exact critical values when N < 122.
Estimated Standard Error of the Difference
Between Two Means Assuming Unequal Variances
$$S_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}$$
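The standard error formula can be sketched in a few lines of Python. The function name and the input values below are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
import math


def standard_error_of_difference(s1: float, n1: int, s2: float, n2: int) -> float:
    """Estimated standard error of the difference between two means,
    using each sample's own variance (unequal variances assumed)."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


if __name__ == "__main__":
    # Hypothetical samples: both have s = 10 and N = 100,
    # so each variance term is 100 / 100 = 1 and the result is sqrt(2).
    se = standard_error_of_difference(10.0, 100, 10.0, 100)
    print(f"Standard error of the difference: {se:.4f}")
```

Because each variance is divided by its own N, the smaller sample contributes more to the standard error when the two standard deviations are similar.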
t-Test and Confidence Intervals
$$t_{(N_1 + N_2 - 2)} = \frac{\bar{Y}_1 - \bar{Y}_2}{S_{\bar{Y}_1 - \bar{Y}_2}}$$
The t-test is essentially creating a confidence interval around the
difference score. Rearranging the above formula, we can calculate
the confidence interval around the difference between two means:
$$(\bar{Y}_1 - \bar{Y}_2) \pm t \, (S_{\bar{Y}_1 - \bar{Y}_2})$$
If this confidence interval includes zero, then we cannot conclude
that the means of the two populations differ.
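A short sketch of the confidence interval calculation follows. It borrows the summary statistics from Example (2) later in the chapter and uses 1.96 as the critical value, which is the large-sample approximation. The exact critical t for these sample sizes is slightly higher, so treat the interval as approximate. The function name and the t_critical parameter are illustrative assumptions.

```python
import math


def difference_ci(mean1, s1, n1, mean2, s2, n2, t_critical=1.96):
    """Confidence interval around the difference between two means.

    t_critical defaults to 1.96, a large-sample approximation for a
    95% interval; it is an assumption, not an exact table value.
    """
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = mean1 - mean2
    margin = t_critical * se
    return diff - margin, diff + margin


if __name__ == "__main__":
    # Summary statistics from Example (2): men first, women second.
    lower, upper = difference_ci(10.32, 0.9461, 51, 9.68, 1.0550, 57)
    includes_zero = lower <= 0.0 <= upper
    print(f"95% CI: ({lower:.3f}, {upper:.3f}); includes zero: {includes_zero}")
```

An interval that excludes zero leads to the same decision as a t value beyond the critical value, since both rest on the same difference and the same standard error.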
Why a t Score and not a Z Score?
Y  Y  t (S
1
•
•
•
2
)
Y 1 Y 2
Use of the Z distribution has assumes the population standard error of
the difference is known. In practice, we have to estimate it and so we
use a t score.
When N gets larger than 50, the t distribution converges with a Z
distribution so the results would be identical regardless of whether you
used a t or Z.
In most sociological studies, you will not need to worry about the
distinction between Z and t.
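To see how quickly t approaches Z, the sketch below compares a few two-tailed critical values at an alpha of .05. The values are the ones printed in standard t tables for selected degrees of freedom. The dictionary and function names are hypothetical.

```python
# Two-tailed critical values of t at alpha = .05 for selected degrees of
# freedom, as printed in standard t tables.
T_CRITICAL_05 = {10: 2.228, 30: 2.042, 60: 2.000, 120: 1.980}

# Critical value of Z for the same two-tailed test.
Z_CRITICAL_05 = 1.960


def excess_over_z(df: int) -> float:
    """How far the critical t sits above the critical Z for a given df."""
    return round(T_CRITICAL_05[df] - Z_CRITICAL_05, 3)


if __name__ == "__main__":
    for df in sorted(T_CRITICAL_05):
        print(f"df = {df:>3}: critical t = {T_CRITICAL_05[df]:.3f}, "
              f"excess over Z = {excess_over_z(df):.3f}")
```

The gap shrinks steadily as the degrees of freedom grow but never reaches exactly zero for a finite sample, so the choice between t and Z matters most when samples are small.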
Example (1): t-Test
                      Men (N=54)    Women (N=46)
Mean pay              $10.06        $10.29
Standard Deviation    .9051         .8766
$$t_{(N_1 + N_2 - 2)} = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}} = \frac{10.06 - 10.29}{\sqrt{\frac{.9051^2}{54} + \frac{.8766^2}{46}}} = \frac{-.23}{.1785} = -1.29$$
What can we conclude about the difference in wages?
Example (2): t-Test
                      Men (N=51)    Women (N=57)
Mean Pay              $10.32        $9.68
Standard Deviation    .9461         1.0550
$$t_{(N_1 + N_2 - 2)} = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}} = \frac{10.32 - 9.68}{\sqrt{\frac{.9461^2}{51} + \frac{1.0550^2}{57}}} = \frac{.64}{.1925} = 3.32$$
What can we conclude about the difference in wages?
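The two wage examples can be recomputed from their summary statistics with a short script. This is a sketch under stated assumptions: the GroupSummary container and the function names are hypothetical, and the critical value of 2.000 is the standard table entry for 60 degrees of freedom at a two-tailed alpha of .05, used here as a slightly conservative stand-in because both examples have between 60 and 120 degrees of freedom.

```python
import math
from dataclasses import dataclass


@dataclass
class GroupSummary:
    """Summary statistics for one sample (a hypothetical container)."""
    mean: float
    sd: float
    n: int


def t_for_difference(g1: GroupSummary, g2: GroupSummary) -> tuple[float, int]:
    """Obtained t for the difference between two means, with its df."""
    se = math.sqrt(g1.sd ** 2 / g1.n + g2.sd ** 2 / g2.n)
    t = (g1.mean - g2.mean) / se
    df = g1.n + g2.n - 2
    return t, df


def is_significant(t: float, t_critical: float = 2.000) -> bool:
    """Two-tailed decision: reject H0 when |t| exceeds the critical value.

    The default of 2.000 is the table value for df = 60 at alpha = .05,
    an assumed conservative choice for df between 60 and 120.
    """
    return abs(t) > t_critical


if __name__ == "__main__":
    examples = {
        "Example (1)": (GroupSummary(10.06, 0.9051, 54), GroupSummary(10.29, 0.8766, 46)),
        "Example (2)": (GroupSummary(10.32, 0.9461, 51), GroupSummary(9.68, 1.0550, 57)),
    }
    for label, (men, women) in examples.items():
        t, df = t_for_difference(men, women)
        print(f"{label}: t({df}) = {t:.2f}, significant: {is_significant(t)}")
```

Under these assumptions the first difference does not reach significance and the second does, which is the contrast the two examples are set up to show.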
In-Class Exercise
Using these GSS income data, calculate a t-test statistic to determine if the
difference between the two group means is statistically significant.
         Mean          Standard Deviation    N
Men      $22,052.51    $17,734.92            434
Women    $14,221.21    $12,165.89            448
t (N 1 N 2 2) 
Y1  Y2
s12
s22

N1 N2
22, 052.51-14, 221.21
7,831.30
t(880) =
=
= 7.62
2
2
1,027.18
17, 734.92 12,165.89
+

434
448