Chapter 11: Analyzing the Association Between Categorical Variables

Section 11.1: Independence and Dependence (Association)
Copyright © 2013, 2009, and 2007, Pearson Education, Inc.
Example: Happiness
Table 11.2 Conditional Percentages for Happiness Categories, Given Each
Category of Family Income.
Example: Happiness
The percentages in a particular row of a table are called
conditional percentages.
They form the conditional distribution for happiness, given
a particular income level.
Table 11.2 Conditional Percentages for Happiness Categories, Given Each
Category of Family Income.
Example: Happiness
Figure 11.1 MINITAB Bar Graph of Conditional Distributions of Happiness, Given
Income, from Table 11.2. For each happiness category, you can compare the three
categories of income on the percent in that happiness category. Question: Based on this
graph, how would you say that happiness depends on income?
Example: Happiness
Guidelines when constructing tables with conditional distributions:
- Make the response variable the column variable.
- Compute conditional proportions for the response variable within each row.
- Include the total sample sizes.
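The guidelines above can be sketched in a few lines. The counts below are hypothetical (not the actual Table 11.2 data): income level in the rows (explanatory variable), happiness in the columns (response variable).

```python
# Hypothetical counts: rows = income level, columns = happiness category.
rows = ["Above average", "Average", "Below average"]
cols = ["Not too happy", "Pretty happy", "Very happy"]
counts = [
    [17, 90, 51],
    [45, 265, 143],
    [31, 139, 71],
]

for label, row in zip(rows, counts):
    n = sum(row)  # include the total sample size for each row
    pct = [100 * c / n for c in row]  # conditional distribution for this row
    print(f"{label}: " +
          ", ".join(f"{c}: {p:.1f}%" for c, p in zip(cols, pct)) +
          f" (n = {n})")
```

Each printed row is one conditional distribution; the percentages within a row sum to 100.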
Independence vs. Dependence
For two variables to be independent, the population
percentage in any category of one variable is the same for all
categories of the other variable.
For two variables to be dependent (or associated), the
population percentages in the categories are not all the same.
Independence vs. Dependence
Are race and belief in life after death independent or dependent?
- The conditional distributions in the table are similar but not exactly identical.
- It is tempting to conclude that the variables are dependent.
Table 11.4 Sample Conditional Distributions for Belief in Life After Death, Given Race
Independence vs. Dependence
Are race and belief in life after death independent or
dependent?
- The definition of independence between variables refers to a population.
- The table is a sample, not a population.
Independence vs. Dependence
Even if variables are independent, we would not expect
the sample conditional distributions to be identical.
Because of sampling variability, each sample percentage
typically differs somewhat from the true population
percentage.
Section 11.2: Testing Categorical Variables for Independence
Testing Categorical Variables for Independence
Create a table of frequencies divided into the categories of the two variables.
The hypotheses for the test are:
- H0: The two variables are independent.
- Ha: The two variables are dependent (associated).
The test assumes random sampling and a large sample size (expected cell counts in the frequency table of at least 5).
Expected Cell Counts If the Variables
Are Independent
The count in any particular cell is a random variable.
- Different samples have different count values.
The mean of its distribution is called an expected cell count.
- This is found under the presumption that H0 is true.
How Do We Find the Expected Cell Counts?
Expected Cell Count: For a particular cell,
Expected cell count = (Row total × Column total) / Total sample size
The expected frequencies are values that have the
same row and column totals as the observed counts, but
for which the conditional distributions are identical (this
is the assumption of the null hypothesis).
How Do We Find the Expected Cell Counts?
Table 11.5 Happiness by Family Income, Showing Observed and Expected Cell
Counts. We use the highlighted totals to get the expected count of 66.86 = (315 *
423)/1993 in the first cell.
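The expected count in the caption can be reproduced directly from the highlighted margins:

```python
# Expected cell count from Table 11.5's margins:
# expected = (row total * column total) / total sample size.
row_total = 315
column_total = 423
n = 1993

expected = row_total * column_total / n
print(round(expected, 2))  # → 66.86
```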
Chi-Squared Test Statistic
The chi-squared statistic summarizes how far the
observed cell counts in a contingency table fall from the
expected cell counts for a null hypothesis.
χ² = Σ (observed count − expected count)² / expected count
Example: Happiness and Family Income
State the null and alternative hypotheses for this test.
- H0: Happiness and family income are independent.
- Ha: Happiness and family income are dependent (associated).
Example: Happiness and Family Income
Report the χ² statistic and explain how it was calculated.
- To calculate the χ² statistic, for each cell, calculate (observed count − expected count)² / expected count.
- Sum the values for all the cells.
- The χ² value is 106.955.
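The cell-by-cell sum can be sketched in a few lines; the 2×2 counts below are hypothetical, not the happiness data:

```python
# Chi-squared statistic: sum over cells of
# (observed - expected)^2 / expected, with expected counts from the margins.
observed = [
    [30, 70],  # hypothetical counts
    [50, 50],
]
n = sum(map(sum, observed))
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected  # one cell's contribution

print(round(chi2, 3))  # → 8.333
```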
The Chi-Squared Test Statistic
Insight: The larger the χ² value, the greater the evidence against the null hypothesis of independence and in support of the alternative hypothesis that happiness and income are associated.
The Chi-Squared Distribution
To convert the χ² test statistic to a P-value, we use the sampling distribution of the χ² statistic.
For large sample sizes, this sampling distribution is well
approximated by the chi-squared probability
distribution.
The Chi-Squared Distribution
Figure 11.3 The Chi-Squared Distribution. The curve has larger mean and standard
deviation as the degrees of freedom increase. Question: Why can’t the chi-squared
statistic be negative?
The Chi-Squared Distribution
Main properties of the chi-squared distribution:
- It falls on the positive part of the real number line.
- The precise shape of the distribution depends on the degrees of freedom: df = (r − 1)(c − 1).
The Chi-Squared Distribution
Main properties of the chi-squared distribution (cont'd):
- The mean of the distribution equals the df value.
- It is skewed to the right.
- The larger the χ² value, the greater the evidence against H0: independence.
The Chi-Squared Distribution
Table 11.7 Rows of Table C Displaying Chi-Squared Values. The values have right-tail
probabilities between 0.250 and 0.001. For a table with r = 3 rows and c = 3 columns,
df = (r - 1) x (c - 1) = 4, and 9.49 is the chi-squared value with a right-tail probability of 0.05.
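Table C's entry can be checked numerically. For even df, the chi-squared right-tail probability has a simple closed form (the helper below is restricted to that case; software handles arbitrary df):

```python
import math

def right_tail_even_df(x, df):
    """P(X > x) for a chi-squared distribution.
    Closed form valid only for even df (df = 4 suffices here)."""
    assert df % 2 == 0
    k = x / 2
    return math.exp(-k) * sum(k ** i / math.factorial(i) for i in range(df // 2))

# Table C: chi-squared value 9.49 with df = 4 leaves 0.05 in the right tail.
print(round(right_tail_even_df(9.49, 4), 3))  # → 0.05
```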
The Five Steps of the Chi-Squared Test
of Independence
1. Assumptions:
- Two categorical variables
- Randomization
- Expected counts ≥ 5 in all cells
The Five Steps of the Chi-Squared Test of
Independence
2. Hypotheses:
- H0: The two variables are independent
- Ha: The two variables are dependent (associated)
The Five Steps of the Chi-Squared Test of
Independence
3. Test Statistic:
χ² = Σ (observed count − expected count)² / expected count
The Five Steps of the Chi-Squared Test of
Independence
4. P-value:
- Right-tail probability above the observed χ² value, for the chi-squared distribution with df = (r − 1)(c − 1).
5. Conclusion:
- Report the P-value and interpret in context. If a decision is needed, reject H0 when P-value ≤ significance level.
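The five steps can be collected into one function. This is a sketch: the P-value helper covers only df = 1 (via the normal relation) and even df (via a closed form), which is enough for small tables; statistical software handles the general case.

```python
import math

def chi2_right_tail(x, df):
    """Right-tail chi-squared probability (sketch: df = 1 or even df only)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))  # chi-squared(1) is z^2
    if df % 2 == 0:
        k = x / 2
        return math.exp(-k) * sum(k ** i / math.factorial(i) for i in range(df // 2))
    raise NotImplementedError("odd df > 1 not covered in this sketch")

def chi2_independence_test(observed, alpha=0.05):
    """Five-step chi-squared test of independence for a table of counts."""
    r, c = len(observed), len(observed[0])
    n = sum(map(sum, observed))
    row_t = [sum(row) for row in observed]
    col_t = [sum(col) for col in zip(*observed)]

    # Step 1: large-sample assumption -- expected counts >= 5 in all cells
    expected = [[row_t[i] * col_t[j] / n for j in range(c)] for i in range(r)]
    assert all(e >= 5 for row in expected for e in row)

    # Step 3: test statistic (step 2's hypotheses are H0: independence vs Ha)
    chi2 = sum((observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
               for i in range(r) for j in range(c))

    # Step 4: P-value, with df = (r - 1)(c - 1)
    df = (r - 1) * (c - 1)
    p = chi2_right_tail(chi2, df)

    # Step 5: decision
    return chi2, df, p, ("reject H0" if p <= alpha else "do not reject H0")

print(chi2_independence_test([[30, 70], [50, 50]]))  # hypothetical counts
```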
Chi-Squared Is Also Used as a "Test of Homogeneity"
The chi-squared test does not depend on which is the response variable and which is the explanatory variable.
When a response variable is identified and the population conditional distributions are identical, they are said to be homogeneous.
- The test is then referred to as a test of homogeneity.
Chi-Squared and the Test Comparing
Proportions in 2x2 Tables
In practice, contingency tables of size 2x2 are very common. They often occur in summarizing the responses of two groups on a binary response variable.
- Denote the population proportion of success by p1 in group 1 and p2 in group 2.
- If the response variable is independent of the group, p1 = p2, so the conditional distributions are equal.
- H0: p1 = p2 is equivalent to H0: independence.
- The test statistics are related by χ² = z², where z = (p̂1 − p̂2)/se0.
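The equivalence between the two-proportion z test and the chi-squared test for a 2×2 table can be checked numerically. A sketch with hypothetical counts (40/100 successes in group 1, 25/100 in group 2), using the pooled standard error for se0:

```python
import math

# Hypothetical 2x2 table: successes and failures in two groups.
x1, n1 = 40, 100
x2, n2 = 25, 100

# z statistic for H0: p1 = p2, with the pooled standard error se0.
p_pool = (x1 + x2) / (n1 + n2)
se0 = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (x1 / n1 - x2 / n2) / se0

# Chi-squared statistic from the same table.
observed = [[x1, n1 - x1], [x2, n2 - x2]]
n = n1 + n2
row_t = [n1, n2]
col_t = [x1 + x2, n - x1 - x2]
chi2 = sum((observed[i][j] - row_t[i] * col_t[j] / n) ** 2
           / (row_t[i] * col_t[j] / n)
           for i in range(2) for j in range(2))

print(round(z ** 2, 6), round(chi2, 6))  # the two agree
```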
Example: Aspirin and Heart Attacks Revisited
Table 11.9 Annotated MINITAB Output for Chi-Squared Test of Independence of Group
(Placebo, Aspirin) and Whether or Not Subject Died of Cancer. The same P-value results as
with a two-sided Z test comparing the two population proportions.
Example: Aspirin and Heart Attacks
Revisited
What are the hypotheses for the chi-squared test for these data?
- The null hypothesis is that whether a doctor has a heart attack is independent of whether he takes placebo or aspirin.
- The alternative hypothesis is that there's an association.
Example: Aspirin and Heart Attacks Revisited
Report the test statistic and P-value for the chi-squared test:
- The test statistic is 11.35 with a P-value of 0.001. This is very strong evidence that the population proportion of heart attacks differed for those taking aspirin and for those taking placebo.
- The sample proportions indicate that the aspirin group had a lower rate of heart attacks than the placebo group.
Limitations of the Chi-Squared Test
If the P-value is very small, strong evidence exists
against the null hypothesis of independence.
But…
The chi-squared statistic and the P-value tell us nothing about the nature or the strength of the association.
We know that there is statistical significance, but the
test alone does not indicate whether there is practical
significance as well.
Limitations of the Chi-Squared Test
The chi-squared test is often misused. Some examples are:
- When some of the expected frequencies are too small.
- When separate rows or columns are dependent samples.
- When the data are not random.
- When quantitative data are classified into categories, which results in a loss of information.
“Goodness of Fit” Chi-Squared Tests
The chi-squared test can also be used for testing particular proportion values for a categorical variable.
- The null hypothesis is that the distribution of the variable follows a given probability distribution; the alternative is that it does not.
- The test statistic is calculated in the same manner, where the expected counts are what would be expected in a random sample from the hypothesized probability distribution.
- For this particular case, the test statistic is referred to as a goodness-of-fit statistic.
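A goodness-of-fit sketch: testing whether a die is fair, using hypothetical counts from 120 rolls (under H0, each face is expected 20 times; df here is the number of categories minus 1).

```python
# Hypothetical counts of each die face in n = 120 rolls.
observed = [18, 22, 21, 19, 24, 16]
n = sum(observed)
expected = [n / 6] * 6  # hypothesized uniform (fair-die) distribution

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
print(round(chi2, 2), df)  # → 2.1 5
```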
Section 11.3: Determining the Strength of the Association
Analyzing Contingency Tables
1. Is there an association?
- The chi-squared test of independence addresses this.
- When the P-value is small, we infer that the variables are associated.
Analyzing Contingency Tables
2. How do the cell counts differ from what independence predicts?
- To answer this question, we compare each observed cell count to the corresponding expected cell count.
Analyzing Contingency Tables
3. How strong is the association?
- Analyzing the strength of the association reveals whether the association is an important one, or if it is statistically significant but weak and unimportant in practical terms.
Measures of Association
A measure of association is a statistic or a parameter that summarizes the strength of the dependence between two variables.
- A measure of association is useful for comparing associations to determine which is stronger.
Difference of Proportions
An easily interpretable measure of association is the
difference between the proportions making a particular
response.
Table 11.12 Decision About Accepting a Credit Card Cross-Tabulated by Income.
Difference of Proportions
Case (a) in Table 11.12 exhibits the weakest possible
association – no association. The difference of
proportions is 0.
Case (b) exhibits the strongest possible association:
The difference of proportions is 1.
Difference of Proportions
In practice, we don’t expect data to follow either extreme
(0% difference or 100% difference), but the stronger the
association, the larger the absolute value of the
difference of proportions.
Example: Student Stress, Depression and
Gender
Table 11.13 Conditional Distributions of Stress and Depression, by Gender
- Which response variable, stress or depression, has the stronger sample association with gender?
- The difference of proportions between females and males was 0.35 − 0.16 = 0.19 for feeling stressed.
- The difference of proportions between females and males was 0.08 − 0.06 = 0.02 for feeling depressed.
Example: Student Stress, Depression and
Gender
In the sample, stress (with a difference of proportions =
0.19) has a stronger association with gender than
depression has (with a difference of proportions = 0.02).
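The two comparisons above, reproduced as differences of proportions (female minus male):

```python
# Differences of proportions from Table 11.13's conditional distributions.
stressed = 0.35 - 0.16   # stress, females minus males
depressed = 0.08 - 0.06  # depression, females minus males
print(round(stressed, 2), round(depressed, 2))  # → 0.19 0.02
```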
The Ratio of Proportions: Relative Risk
Another measure of association is the ratio of two proportions: p1/p2.
In medical applications in which the proportion refers to an adverse outcome, it is called the relative risk.
Example: Seat Belt Use and Outcome of
Auto Accidents
Treating the auto accident outcome as the response
variable, find and interpret the relative risk.
Table 11.14 Outcome of Auto Accident by Whether or Not Subject Wore Seat Belt.
Example: Seat Belt Use and Outcome of Auto Accidents
- The adverse outcome is death.
- The relative risk is formed for that outcome.
- For those who wore a seat belt, the proportion who died equaled 510/412,878 = 0.00124.
- For those who did not wear a seat belt, the proportion who died equaled 1601/164,128 = 0.00975.
Example: Seat Belt Use and Outcome of
Auto Accidents
The relative risk is the ratio:
- 0.00124/0.00975 = 0.127
- The proportion of subjects wearing a seat belt who died was 0.127 times the proportion of subjects not wearing a seat belt who died.
Example: Seat Belt Use and Outcome of
Auto Accidents
Many find it easier to interpret the relative risk by reordering the rows of data so that the relative risk has a value above 1.0.
Reversing the order of the rows, we calculate the ratio:
- 0.00975/0.00124 = 7.9
- The proportion of subjects not wearing a seat belt who died was 7.9 times the proportion of subjects wearing a seat belt who died.
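Both relative risks follow directly from the counts reported above:

```python
# Relative risk from the seat-belt counts in Table 11.14.
p_belt = 510 / 412_878      # proportion who died among seat-belt wearers
p_no_belt = 1601 / 164_128  # proportion who died among non-wearers

print(round(p_belt / p_no_belt, 3))  # → 0.127
print(round(p_no_belt / p_belt, 1))  # → 7.9
```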
Example: Seat Belt Use and Outcome of
Auto Accidents
A relative risk of 7.9 represents a strong association.
- This is far from the value of 1.0 that would occur if the proportion of deaths were the same for each group.
- Wearing a seat belt has a practically significant effect in enhancing the chance of surviving an auto accident.
Properties of the Relative Risk
- The relative risk can equal any nonnegative number.
- When p1 = p2, the variables are independent and relative risk = 1.0.
- Values farther from 1.0 (in either direction) represent stronger associations.
Large χ² Does Not Mean There's a Strong Association
- A large chi-squared value provides strong evidence that the variables are associated.
- It does not imply that the variables have a strong association.
- This statistic merely indicates (through its P-value) how certain we can be that the variables are associated, not how strong that association is.
- For a given strength of association, larger χ² values occur for larger sample sizes. Large χ² values can occur with weak associations, if the sample size is sufficiently large.
Section 11.4: Using Residuals to Reveal the Pattern of Association
Association Between Categorical Variables
- The chi-squared test and measures of association such as (p1 − p2) and p1/p2 are fundamental methods for analyzing contingency tables.
- The P-value for χ² summarizes the strength of evidence against H0: independence.
Association Between Categorical Variables
- If the P-value is small, then we conclude that somewhere in the contingency table the population cell proportions differ from independence.
- The chi-squared test does not indicate whether all cells deviate greatly from independence or perhaps only some of them do so.
Residual Analysis
- A cell-by-cell comparison of the observed counts with the counts that are expected when H0 is true reveals the nature of the evidence against H0.
- The difference between an observed and expected count in a particular cell is called a residual.
Residual Analysis
- The residual is negative when fewer subjects are in the cell than expected under H0.
- The residual is positive when more subjects are in the cell than expected under H0.
Residual Analysis
To determine whether a residual is large enough to indicate strong evidence of a deviation from independence in that cell, we use an adjusted form of the residual: the standardized residual.
Residual Analysis
- The standardized residual for a cell = (observed count − expected count)/se.
- A standardized residual reports the number of standard errors that an observed count falls from its expected count.
- The se describes how much the difference would tend to vary in repeated sampling if the variables were independent. Its formula is complex; software can be used to find its value.
- A large standardized residual value provides evidence against independence in that cell.
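The slide leaves the se formula to software; for reference, a common adjusted form is se = sqrt(expected × (1 − row total/n) × (1 − column total/n)). A sketch with hypothetical 2×2 counts, using that assumed formula:

```python
import math

# Standardized (adjusted) residuals for a hypothetical 2x2 table.
observed = [[30, 10],
            [10, 30]]
n = sum(map(sum, observed))
row_t = [sum(row) for row in observed]
col_t = [sum(col) for col in zip(*observed)]

for i in range(2):
    for j in range(2):
        expected = row_t[i] * col_t[j] / n
        # Assumed adjusted-residual standard error:
        se = math.sqrt(expected * (1 - row_t[i] / n) * (1 - col_t[j] / n))
        z = (observed[i][j] - expected) / se
        print(f"cell ({i}, {j}): standardized residual = {z:.2f}")
```

For this table every expected count is 20, so each residual is ±10/√5, about ±4.47: far more than a few standard errors, i.e. strong evidence against independence in every cell.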
Example: Religiosity and Gender
“To what extent do you consider yourself a religious
person?”
Table 11.17 Religiosity by Gender, With Expected Counts and Standardized Residuals.
Example: Religiosity and Gender
Interpret the standardized residuals in the table.
- The table exhibits large positive residuals for the cells for females who are very religious and for males who are not at all religious.
- In these cells, the observed count is much larger than the expected count.
- There is strong evidence that the population has more subjects in these cells than if the variables were independent.
Example: Religiosity and Gender
- The table exhibits large negative residuals for the cells for females who are not at all religious and for males who are very religious.
- In these cells, the observed count is much smaller than the expected count.
- There is strong evidence that the population has fewer subjects in these cells than if the variables were independent.
Section 11.5: Small Sample Sizes: Fisher's Exact Test
Fisher's Exact Test
- The chi-squared test of independence is a large-sample test.
- When the expected frequencies are small (any of them less than about 5), small-sample tests are more appropriate.
- Fisher's exact test is a small-sample test of independence.
Fisher's Exact Test
- The calculations for Fisher's exact test are complex.
- Statistical software can be used to obtain the P-value for the test that the two variables are independent.
- The smaller the P-value, the stronger the evidence that the variables are associated.
Example: A Tea-Tasting Experiment
This is an experiment conducted by Sir Ronald Fisher.
- His colleague, Dr. Muriel Bristol, claimed that when drinking tea she could tell whether the milk or the tea had been added to the cup first.
Example: A Tea-Tasting Experiment
Experiment:
- Fisher asked her to taste eight cups of tea:
  - Four had the milk added first.
  - Four had the tea added first.
- She was asked to indicate which four had the milk added first.
- The order of presenting the cups was randomized.
Example: A Tea-Tasting Experiment
Results:
Table 11.18 Result of Tea-Tasting Experiment. The table cross-tabulates what was actually
poured first (milk or tea) by what Dr. Bristol predicted was poured first. She had to indicate
which four of the eight cups had the milk poured first.
Example: A Tea-Tasting Experiment
Analysis:
Table 11.19 Result of Fisher's Exact Test for Tea-Tasting Experiment. The chi-squared P-value is listed under Asymp. Sig. and the Fisher's exact test P-values are listed under Exact Sig. "Sig." is short for significance and "asymp." is short for asymptotic.
Example: A Tea-Tasting Experiment
The one-sided version of the test pertains to the
alternative that her predictions are better than random
guessing.
Does the P-value suggest that she had the ability to
predict better than random guessing?
Example: A Tea-Tasting Experiment
The P-value of 0.243 does not give much evidence
against the null hypothesis.
The data did not support Dr. Bristol’s claim that she
could tell whether the milk or the tea had been added to
the cup first.
If she had predicted all four cups correctly, the one-sided
P-value would have been 0.014. We might then believe
her claim.
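Both P-values above can be reproduced from the hypergeometric distribution: with four milk-first cups among eight and four cups chosen, the number of correct "milk first" picks X is hypergeometric. The observed table (3 of 4 correct) is inferred from the reported 0.243, not stated on the slide, so treat it as an assumption:

```python
from math import comb

def p_correct(k):
    """P(X = k correct milk-first picks): hypergeometric probability."""
    return comb(4, k) * comb(4, 4 - k) / comb(8, 4)

# One-sided P-value for the assumed observed result (3 of 4 correct):
# P(X >= 3) = P(X = 3) + P(X = 4) = 17/70.
p_value = p_correct(3) + p_correct(4)
print(round(p_value, 3))  # → 0.243

# Had all four been correct: P(X = 4) = 1/70.
print(round(p_correct(4), 3))  # → 0.014
```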
Summary of Fisher’s Exact Test of
Independence for 2x2 Tables
1. Assumptions:
- Two binary categorical variables
- Data are random
2. Hypotheses:
- H0: The two variables are independent (p1 = p2)
- Ha: The two variables are associated (p1 ≠ p2, p1 > p2, or p1 < p2)
Summary of Fisher’s Exact Test of
Independence for 2x2 Tables
3. Test Statistic:
- First cell count (this determines the others, given the margin totals).
4. P-value:
- Probability that the first cell count equals the observed value or a value even more extreme as predicted by Ha.
5. Conclusion:
- Report the P-value and interpret in context. If a decision is required, reject H0 when P-value ≤ significance level.