Chapter 13
Categorical Data Analysis
Categorical Data and the
Multinomial Distribution
Properties of the Multinomial Experiment
1. The experiment consists of n identical trials
2. There are k possible outcomes to each trial, called classes, categories, or cells
3. The probabilities of the k outcomes remain constant from trial to trial
4. The trials are independent
5. The variables of interest are the cell counts n1, n2, …, nk, the number of observations that fall into each of the k classes
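The five properties can be illustrated with a small simulation. This is only a sketch: the function name `simulate_multinomial`, the probabilities, and the seed are hypothetical choices for illustration and do not come from the chapter.

```python
import random
from collections import Counter

def simulate_multinomial(n, probs, seed=None):
    """Run n identical, independent trials with fixed outcome probabilities
    and return the cell counts n1..nk as a list."""
    rng = random.Random(seed)
    k = len(probs)
    # Each trial picks one of the k classes; the weights stay constant
    # from trial to trial and the draws are independent of each other.
    outcomes = rng.choices(range(k), weights=probs, k=n)
    tally = Counter(outcomes)
    return [tally[i] for i in range(k)]

# Hypothetical experiment: n = 100 trials, k = 3 classes
counts = simulate_multinomial(n=100, probs=[0.2, 0.3, 0.5], seed=1)
print(counts, sum(counts))
```

Whatever the individual counts turn out to be, they always add up to n, because every trial falls into exactly one class.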
Testing Category Probabilities:
One-Way Table
In a multinomial experiment with categorical
data from a single qualitative variable, we
summarize data in a one-way table.
Schema for a one-way table for an experiment with k outcomes
Outcome    1     2     …    k
Count      n1    n2    …    nk
Hypothesis Testing for a One-Way Table
•Based on the χ² statistic, which allows comparison between the observed distribution of counts and an expected distribution of counts across the k classes
•Expected count for class i: E(ni) = n·pi, where n is the total number of trials and pi is the hypothesized probability of being in class i according to H0
•The test statistic, χ², is calculated as
$$\chi^2 = \sum_{i=1}^{k} \frac{[n_i - E(n_i)]^2}{E(n_i)}$$
and the rejection region is determined by the χ² distribution using k-1 df and the desired α
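A minimal sketch of this calculation follows, assuming the observed counts and the null probabilities are supplied as plain lists. The function name `chi_square_one_way` and the data are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def chi_square_one_way(observed, null_probs):
    """Return (statistic, df, expected counts) for a one-way table."""
    n = sum(observed)
    # Expected count in each class under H0: E(ni) = n * pi
    expected = [n * p for p in null_probs]
    # Sum of (observed - expected)^2 / expected over the k classes
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, df, expected

# Hypothetical data: 100 trials, H0: p1 = .25, p2 = .50, p3 = .25
observed = [30, 50, 20]
null_probs = [0.25, 0.50, 0.25]
stat, df, expected = chi_square_one_way(observed, null_probs)
print(stat, df, expected)  # 2.0 2 [25.0, 50.0, 25.0]
```

The statistic would then be compared with the χ² critical value for k-1 = 2 df at the chosen α.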
•The null hypothesis is often formulated as "no difference" among the classes, H0: p1=p2=p3=…=pk=1/k, but it can also specify unequal probabilities
•The alternative hypothesis is Ha: at least one of the multinomial probabilities does not equal its hypothesized value
One-Way Tables: an example
H0: pLegal=.07, pdecrim=.18, pexistlaw=.65, pnone=.10
Ha: At least 2 proportions
differ from proposed plan
Rejection region with α=.01, df = k-1 = 3 is χ² > 11.3449
Since the test statistic
falls in the rejection
region, we reject H0
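The critical value in this example can be sanity-checked without a table. The sketch below uses the closed-form upper-tail probability of a χ² distribution with exactly 3 df. The function name `chi2_upper_tail_df3` is hypothetical, and the formula does not apply to other degrees of freedom.

```python
import math

def chi2_upper_tail_df3(x):
    """P(chi-square with 3 df > x), via the closed form valid for 3 df only."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))

critical_value = 11.3449  # boundary of the rejection region in the example
alpha = chi2_upper_tail_df3(critical_value)
print(round(alpha, 4))
```

The tail area beyond 11.3449 should come out very close to the stated α of .01, so any test statistic larger than 11.3449 leads to rejecting H0 at that level.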
Conditions Required for a Valid χ² Test
•Multinomial experiment has been
conducted
•Sample size is large, with E(ni) at least 5 for
every cell
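The sample-size condition can be checked before collecting data, since E(ni) = n·pi depends only on n and the hypothesized probabilities. The sketch below reuses the null probabilities from the one-way example (.07, .18, .65, .10). The helper names are hypothetical.

```python
import math

def expected_counts(n, null_probs):
    """Expected cell counts under H0: E(ni) = n * pi."""
    return [n * p for p in null_probs]

def large_enough(n, null_probs, minimum=5):
    """True when every expected cell count is at least `minimum`."""
    return all(e >= minimum for e in expected_counts(n, null_probs))

def smallest_valid_n(null_probs, minimum=5):
    """Smallest n for which the rarest class still has E(ni) >= minimum."""
    return math.ceil(minimum / min(null_probs))

null_probs = [0.07, 0.18, 0.65, 0.10]
n_min = smallest_valid_n(null_probs)
print(n_min, large_enough(n_min, null_probs), large_enough(n_min - 1, null_probs))
```

The class with the smallest hypothesized probability drives the requirement: with p = .07, at least 72 trials are needed for that cell's expected count to reach 5.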
Testing Category Probabilities:
Two-Way (Contingency) Table
Used when classifying with two qualitative variables
General r x c Contingency Table
                          Column
Row              1      2      …     c      Row Totals
1                n11    n12    …     n1c    R1
2                n21    n22    …     n2c    R2
…                …      …      …     …      …
r                nr1    nr2    …     nrc    Rr
Column Totals    C1     C2     …     Cc     n
H0: The two classifications are independent
Ha: The two classifications are
dependent
Test Statistic:
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{[n_{ij} - E_{ij}]^2}{E_{ij}}, \quad \text{where } E_{ij} = \frac{R_i C_j}{n}$$
Rejection region: χ² > χ²α, where χ²α has (r-1)(c-1) df
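The test statistic for a contingency table can be sketched as follows, assuming the table is given as a list of rows of observed counts. The function name `chi_square_two_way` and the 2 x 2 data are hypothetical and serve only as illustration.

```python
def chi_square_two_way(table):
    """Return (statistic, df, expected counts) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]        # R1..Rr
    col_totals = [sum(col) for col in zip(*table)]  # C1..Cc
    n = sum(row_totals)
    # Expected count under independence: Eij = Ri * Cj / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected

# Hypothetical 2 x 2 table of observed counts
table = [[20, 30],
         [30, 20]]
stat, df, expected = chi_square_two_way(table)
print(stat, df, expected)  # 4.0 1 [[25.0, 25.0], [25.0, 25.0]]
```

The returned expected counts can also be inspected directly to confirm that every cell has an expected count of at least 5 before the χ² distribution is relied on.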
Conditions Required for a Valid χ² Test
•The n observed counts are a random sample from the population of interest
•Sample size is large, with expected count Eij at least 5 for every cell
[Figure: sample statistical package output]
A Word of Caution about
Chi-Square Tests
•When an expected cell count is less than 5, the χ² probability distribution should not be used
•If H0 is not rejected, do not conclude that the classifications are independent; failing to reject H0 leaves open the possibility of a Type II error.
•Do not infer causality when H0 is rejected.
Contingency table analysis determines
statistical dependence only.