Download chi-square test for independence

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
14
Chapter
Chi-Square Tests
 Chi-Square Test for Independence
 Chi-Square Tests for Goodness-of-Fit
McGraw-Hill/Irwin
Copyright © 2010 by The McGraw-Hill Companies, Inc. All rights reserved.
Chi-Square Test for Independence
Chi-Square Test
•
•
•
•
In a test of independence for an r x c contingency table, the
hypotheses are
H0: Variable A is independent of variable B
H1: Variable A is not independent of variable B
Use the chi-square test for independence to test these hypotheses.
This non-parametric test is based on frequencies.
The n data pairs are classified into c columns and r rows and then
the observed frequency fjk is compared with the expected
frequency ejk.
14-2
Chi-Square Test for Independence
Chi-Square Distribution
•
•
•
•
The critical value comes from the chi-square probability
distribution with n degrees of freedom.
n = degrees of freedom = (r – 1)(c – 1)
where
r = number of rows in the table
c = number of columns in the table
Appendix E contains critical values for right-tail areas of the chisquare distribution.
The mean of a chi-square distribution is n with variance 2n.
14-3
Chi-Square Test for Independence
Expected Frequencies
•
Assuming that H0 is true, the expected frequency of row j and
column k is:
ejk = RjCk/n
where
Rj = total for row j (j = 1, 2, …, r)
Ck = total for column k (k = 1, 2, …, c)
n = sample size
Steps in Testing the Hypotheses
Step 1: State the Hypotheses
H0: Variable A is independent of variable B
H1: Variable A is not independent of variable B
14-4
Chi-Square Test for Independence
Steps in Testing the Hypotheses
•
Step 2: Specify the Decision Rule
Calculate n = (r – 1)(c – 1)
For a given a, look up the right-tail critical value (c2R) from
Appendix E or by using Excel.
Reject H0 if c2R > test statistic.
•
Step 3: Calculate the Expected Frequencies
ejk = RjCk/n
14-5
Chi-Square Test for Independence
Steps in Testing the Hypotheses
•
Step 4: Calculate the Test Statistic
The chi-square test statistic is
calc
•
Step 5: Make the Decision
Reject H0 if c2R > test statistic or if the
p-value < a.
14-6
Chi-Square Test for Independence
Small Expected Frequencies
•
•
The chi-square test is unreliable if the expected frequencies are
too small.
Rules of thumb:
• Cochran’s Rule requires that ejk > 5 for all cells.
• Up to 20% of the cells may have ejk < 5
•
•
Most agree that a chi-square test is infeasible
if ejk < 1 in any cell.
If this happens, try combining adjacent rows or
columns to enlarge the expected frequencies.
14-7
Chi-Square Test for Goodness-of-Fit
Why Do a Chi-Square Test on Numerical Data?
•
•
•
The researcher may believe there’s a relationship between X and Y, but
doesn’t want to use regression.
There are outliers or anomalies that prevent us from assuming that the
data came from a normal population.
The researcher has numerical data for one variable but not the other.
Purpose of the Test
•
•
•
The goodness-of-fit (GOF) test helps you decide whether your sample
resembles a particular kind of population.
The chi-square test will be used because it is versatile and easy to
understand.
Goodness-of-fit tests may lack power in small samples. As a guideline,
a chi-square goodness-of-fit test should be avoided if n < 25.
14-8
Chi-Square Test for Goodness-of-Fit
Multinomial GOF Test
•
•
A multinomial distribution is defined by any k probabilities p1, p2,
…, pk that sum to unity.
For example, consider the following “official” proportions of M&M
colors. The hypotheses are
H0: p1 = .13, p2 = .13, p3 = .24, p4 = .20, p5 = .16, p6 = .14
H1: At least one of the pj differs from the
hypothesized value
14-9
Chi-Square Test for Goodness-of-Fit
Hypotheses for GOF
•
•
The hypotheses are:
H0: The population follows a _____ distribution
H1: The population does not follow a ______
distribution
The blank may contain the name of any theoretical distribution (e.g.,
uniform, Poisson, normal).
14-10
Chi-Square Test for Goodness-of-Fit
Test Statistic and Degrees of Freedom
for GOF
•
Assuming n observations, the observations are grouped into c
classes and then the chi-square test statistic is found using:
calc
where
fj = the observed frequency of
observations in class j
ej = the expected frequency in class j if
H0 were true
14-11
Uniform Goodness-of-Fit Test
Uniform Distribution
•
•
•
The uniform goodness-of-fit test is a special case of the multinomial
in which every value has the same chance of occurrence.
The chi-square test for a uniform distribution compares all c groups
simultaneously.
The hypotheses are:
H0: p1 = p2 = …, pc = 1/c
H1: Not all pj are equal
14-12
Uniform Goodness-of-Fit Test
Uniform GOF Test: Grouped Data
•
•
•
•
•
•
The test can be performed on data that are already tabulated into
groups.
Calculate the expected frequency ej for each cell.
The degrees of freedom are n = c – 1 since there are no parameters
for the uniform distribution.
Obtain the critical value c2a from Appendix E for the desired level of
significance a.
The p-value can be obtained from Excel.
Reject H0 if p-value < a.
14-13
Uniform Goodness-of-Fit Test
Uniform GOF Test: Raw Data
•
•
•
•
•
First form c bins of equal width (Xmax – Xmin)/c and create a
frequency distribution.
Calculate the observed frequency fj for each bin.
Define ej = n/c and perform the chi-square calculations.
The degrees of freedom are n = c – 1 since there are no
parameters for the uniform distribution.
Obtain the critical value from Appendix E for a given significance
level a and make the decision.
14-14
Uniform Goodness-of-Fit Test
Uniform GOF Test: Raw Data
•
•
•
Calculate the mean and standard deviation of the uniform
distribution as:
s=
[(b – a + 1)2 – 1)/12
m = (a + b)/2
If the data are not skewed and the sample size is large (n > 30),
then the mean is approximately normally distributed.
So, test the hypothesized uniform mean using
14-15
14-15
Related documents