STA 517 – Introduction: Distribution and Inference
1.5 STATISTICAL INFERENCE FOR
MULTINOMIAL PARAMETERS
Recall $\text{multi}(n, \boldsymbol{\pi} = (\pi_1, \pi_2, \ldots, \pi_c))$.
Suppose that each of $n$ independent, identical trials can have outcome in any of $c$ categories. Let
$$y_{ij} = 1 \text{ if trial } i \text{ has outcome in category } j, \qquad y_{ij} = 0 \text{ otherwise.}$$
Then $\mathbf{y}_i = (y_{i1}, \ldots, y_{ic})$ represents a multinomial trial, with $\sum_j y_{ij} = 1$.
Let $n_j = \sum_i y_{ij}$ denote the number of trials having outcome in category $j$. The counts $(n_1, n_2, \ldots, n_c)$ have the multinomial distribution.
Note: the $\{n_j\}$ are random variables.
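The setup above can be sketched numerically. This is an illustrative Python simulation (not from the slides): each trial produces an indicator outcome $y_{ij}$, and the counts $n_j$ accumulate these indicators.

```python
import random

# A minimal sketch: simulate n independent trials, each landing in one of
# c = 3 categories with probabilities pi, and form the counts n_j by
# accumulating the indicator outcomes y_ij.
random.seed(1)
n = 1000
pi = [0.2, 0.3, 0.5]

counts = [0, 0, 0]
for _ in range(n):
    j = random.choices(range(3), weights=pi)[0]  # one multinomial trial
    counts[j] += 1                               # n_j accumulates the y_ij

print(counts, sum(counts))
```

The counts vary from run to run (they are random variables), but they always sum to $n$.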
Example: Mendel’s theory
To test his theories of natural inheritance, Mendel crossed pea plants of a pure yellow strain with plants of a pure green strain.
He predicted that second-generation hybrid seeds would be 75% yellow and 25% green, yellow being the dominant trait.
One experiment produced $n = 8023$ seeds, with $n_1 = 6022$ yellow and $n_2 = 2001$ green observed.
He wanted to test whether the data follow the 3:1 ratio.
1.5.1 Estimation of Multinomial Parameters
To obtain the MLE, note that the multinomial probability mass function is proportional to the kernel
$$\prod_j \pi_j^{n_j}, \qquad \text{where all } \pi_j \ge 0 \text{ and } \textstyle\sum_j \pi_j = 1. \tag{1.14}$$
The MLE are the $\{\hat{\pi}_j\}$ that maximize (1.14).
Log likelihood:
$$L(\boldsymbol{\pi}) = \sum_j n_j \log \pi_j = \sum_{j=1}^{c-1} n_j \log \pi_j + n_c \log\Bigl(1 - \sum_{j=1}^{c-1} \pi_j\Bigr).$$
Differentiating $L$ with respect to $\pi_j$ gives the likelihood equation
$$\frac{\partial L}{\partial \pi_j} = \frac{n_j}{\pi_j} - \frac{n_c}{\pi_c} = 0, \qquad j = 1, \ldots, c-1.$$
The ML solution satisfies $\hat{\pi}_j / \hat{\pi}_c = n_j / n_c$.
MLE
Now summing $\hat{\pi}_j = \hat{\pi}_c\, n_j / n_c$ over all $j$ gives $1 = \hat{\pi}_c\, n / n_c$, so $\hat{\pi}_c = n_c / n$.
Thus
$$\hat{\pi}_j = \frac{n_j}{n}, \qquad j = 1, \ldots, c.$$
The MLE are the sample proportions.
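As a numerical check (an illustration, not from the slides), the sample proportions for Mendel's counts do maximize the multinomial log likelihood: perturbing them in any direction lowers it.

```python
import math

# Sketch: the multinomial MLE is the vector of sample proportions.
# Verify numerically with Mendel's counts that the log likelihood at
# pi_hat beats nearby probability vectors.
n_counts = [6022, 2001]
n = sum(n_counts)
pi_hat = [nj / n for nj in n_counts]  # sample proportions

def loglik(pi):
    return sum(nj * math.log(pj) for nj, pj in zip(n_counts, pi))

best = loglik(pi_hat)
for eps in (0.01, 0.05, 0.1):
    alt = [pi_hat[0] + eps, pi_hat[1] - eps]
    assert loglik(alt) < best  # any perturbation lowers the log likelihood

print([round(p, 4) for p in pi_hat])
```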
1.5.2 Pearson Statistic for Testing a Specified Multinomial
In 1900 the eminent British statistician Karl Pearson
introduced a hypothesis test that was one of the first
inferential methods.
It had a revolutionary impact on categorical data
analysis, which had focused on describing associations.
Pearson’s test evaluates whether multinomial
parameters equal certain specified values.
Pearson Statistic
Consider $H_0: \pi_j = \pi_{j0},\ j = 1, \ldots, c$, where the $\pi_{j0}$ are specified values.
When $H_0$ is true, the expected values of $\{n_j\}$, called expected frequencies, are $\mu_j = n\pi_{j0}$.
Pearson proposed the test statistic
$$X^2 = \sum_j \frac{(n_j - \mu_j)^2}{\mu_j}.$$
Greater differences $\{|n_j - \mu_j|\}$ produce greater $X^2$ values, for fixed $n$.
Let $X_o^2$ denote the observed value of $X^2$. The P-value is $P(X^2 \ge X_o^2)$, where $X^2$ has an asymptotic chi-squared null distribution with df $= c-1$.
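The definition above can be sketched as a short Python function (an illustration, not from the slides), which also shows that larger deviations from the expected frequencies give a larger $X^2$ for fixed $n$.

```python
# Hedged sketch of the Pearson statistic X^2 = sum_j (n_j - mu_j)^2 / mu_j,
# with expected frequencies mu_j = n * pi_j0 under H0.
def pearson_x2(counts, pi0):
    n = sum(counts)
    mu = [n * p for p in pi0]  # expected frequencies
    return sum((nj - mj) ** 2 / mj for nj, mj in zip(counts, mu))

# Larger deviations from the expected frequencies give larger X^2, fixed n:
close = pearson_x2([250, 250, 500], [0.25, 0.25, 0.5])
far = pearson_x2([300, 200, 500], [0.25, 0.25, 0.5])
print(close, far)
```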
1.5.3 Example: Testing Mendel’s Theories
$n_1 = 6022$ yellow, $n_2 = 2001$ green.
MLE: $\hat{\pi}_1 = 6022/8023 = 0.7506$, $\hat{\pi}_2 = 2001/8023 = 0.2494$.
Test whether the data follow the 3:1 ratio, i.e.
$$H_0: \pi_1 = \pi_{10} = 0.75, \qquad \pi_2 = \pi_{20} = 0.25.$$
The expected frequencies are $\mu_1 = 8023 \times 0.75 = 6017.25$ and $\mu_2 = 8023 \times 0.25 = 2005.75$, so
$$X^2 = \frac{(6022 - 6017.25)^2}{6017.25} + \frac{(2001 - 2005.75)^2}{2005.75} = 0.015,$$
with df $= 1$ and P-value $\approx 0.90$.
This does not contradict Mendel’s hypothesis.
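The arithmetic can be checked in a few lines of Python (a sketch, not from the slides); for df $= 1$ the chi-squared tail probability reduces to a normal tail, $P(X^2 \ge x) = \operatorname{erfc}(\sqrt{x/2})$.

```python
import math

# Numeric check of the Mendel calculation: X^2 for n1 = 6022, n2 = 2001
# against H0: (0.75, 0.25), with the df = 1 chi-squared tail computed as
# P(X^2 >= x) = erfc(sqrt(x/2)).
counts = [6022, 2001]
pi0 = [0.75, 0.25]
n = sum(counts)
mu = [n * p for p in pi0]  # 6017.25 and 2005.75
x2 = sum((nj - mj) ** 2 / mj for nj, mj in zip(counts, mu))
p_value = math.erfc(math.sqrt(x2 / 2))
print(round(x2, 3), round(p_value, 2))
```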
SAS code
data D;
input outcome $ w;
cards;
yellow 6022
green 2001
;
proc freq; weight w;
/* TESTP values follow the levels' alphabetical order: green, yellow */
table outcome/chisq TESTP=(0.25 0.75);
run;
Pearson statistic
When $c = 2$, it can be shown that the Pearson chi-squared statistic is the squared score statistic:
$$X^2 = \frac{(y - n\pi_0)^2}{n\pi_0} + \frac{[(n-y) - n(1-\pi_0)]^2}{n(1-\pi_0)} = \frac{(y - n\pi_0)^2}{n\pi_0(1-\pi_0)} = z_S^2.$$
PROOF: by the Maple symbolic engine in MATLAB
syms y n pi0
f=(y-n*pi0)^2/(n*pi0)+((n-y)-n*(1-pi0))^2/(n*(1-pi0));
f1=simple(f)
%result: (y-n*pi0)^2/(n*pi0*(1-pi0))
How about c > 2?
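The same identity can be verified numerically rather than symbolically (a Python sketch, not from the slides), here using Mendel's counts.

```python
import math

# Sketch: for c = 2, the Pearson statistic equals the squared score
# statistic (y - n*pi0)^2 / (n*pi0*(1 - pi0)).
y, n, pi0 = 6022, 8023, 0.75
x2 = ((y - n * pi0) ** 2 / (n * pi0)
      + ((n - y) - n * (1 - pi0)) ** 2 / (n * (1 - pi0)))
score_sq = (y - n * pi0) ** 2 / (n * pi0 * (1 - pi0))
assert math.isclose(x2, score_sq)  # the two statistics agree
print(round(x2, 3))
```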
1.5.5 Likelihood-Ratio Chi-Squared
An alternative test for multinomial parameters uses the likelihood-ratio test.
The kernel of the multinomial likelihood is $\prod_j \pi_j^{n_j}$.
Under $H_0$ the likelihood is maximized when $\hat{\pi}_j = \pi_{j0}$. In the general case, it is maximized when $\hat{\pi}_j = n_j/n$.
The ratio of the likelihoods equals
$$\Lambda = \frac{\prod_j \pi_{j0}^{n_j}}{\prod_j (n_j/n)^{n_j}}.$$
Thus, the likelihood-ratio statistic, $G^2 = -2\log\Lambda$, is
$$G^2 = 2\sum_j n_j \log\frac{n_j}{n\pi_{j0}}.$$
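For Mendel's data, $G^2$ can be computed directly and compared with $X^2$ (a Python sketch, not from the slides); the two statistics are nearly identical here.

```python
import math

# Sketch: G^2 = 2 * sum_j n_j * log(n_j / (n * pi_j0)) for Mendel's data,
# alongside the Pearson X^2 for the same hypothesis.
counts = [6022, 2001]
pi0 = [0.75, 0.25]
n = sum(counts)
g2 = 2 * sum(nj * math.log(nj / (n * pj)) for nj, pj in zip(counts, pi0))
x2 = sum((nj - n * pj) ** 2 / (n * pj) for nj, pj in zip(counts, pi0))
print(round(g2, 3), round(x2, 3))
```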
LR
In the general case, the parameter space consists of $\{\pi_j\}$ subject to $\sum_j \pi_j = 1$, so the dimensionality is $c-1$. Under $H_0$, the $\{\pi_j\}$ are specified completely, so the dimension is 0. The difference in these dimensions equals $c-1$.
For large $n$, $G^2$ has a chi-squared null distribution with df $= c-1$.
Both $X^2$ and $G^2$ have asymptotic chi-squared distributions with df $= c-1$, and the two statistics are asymptotically equivalent.
Wu, Ma, George (2007)
1.5.6 Testing with Estimated Expected Frequencies
Pearson’s chi-squared statistic was proposed for testing $H_0: \pi_j = \pi_{j0}$, where the $\pi_{j0}$ are fixed.
In some applications, $\pi_{j0} = \pi_{j0}(\theta)$ are functions of a smaller set of unknown parameters $\theta$.
ML estimates $\hat{\theta}$ of $\theta$ determine ML estimates $\{\pi_{j0}(\hat{\theta})\}$ and hence ML estimates $\{\hat{\mu}_j = n\pi_{j0}(\hat{\theta})\}$ of the expected frequencies in $X^2$.
Replacing $\{\mu_j\}$ by the estimates $\{\hat{\mu}_j\}$ affects the distribution of $X^2$: the true df $= (c-1) - \dim(\theta)$.
Example
A sample of 156 dairy calves born in Okeechobee County, Florida, was classified according to whether they caught pneumonia within 60 days of birth.
Calves that got a pneumonia infection were also
classified according to whether they got a secondary
infection within 2 weeks after the first infection cleared
up.
Hypothesis: the primary infection had an immunizing
effect that reduced the likelihood of a secondary
infection.
How to test it?
Data structure
Calves that did not get a primary infection could not get
a secondary infection, so no observations can fall in the
category for ‘‘no’’ primary infection and ‘‘yes’’
secondary infection.
That combination is called a structural zero.
Test: whether the probability of primary infection was the same as the conditional probability of secondary infection, given that the calf got the primary infection.
Let $\pi_{ab}$ denote the probability that a calf is classified in row $a$ (primary infection) and column $b$ (secondary infection) of this table. The null hypothesis is
$$H_0: \pi_{11} + \pi_{12} = \frac{\pi_{11}}{\pi_{11} + \pi_{12}}.$$
Let $\pi = \pi_{11} + \pi_{12}$ denote the probability of primary infection. Then under the hypothesis the cell probabilities are
$$\pi_{11} = \pi^2, \qquad \pi_{12} = \pi(1-\pi), \qquad \pi_{22} = 1 - \pi.$$
MLE and chi-squared test
The likelihood is proportional to
$$(\pi^2)^{n_{11}}\,[\pi(1-\pi)]^{n_{12}}\,(1-\pi)^{n_{22}}.$$
The log likelihood is
$$L(\pi) = (2n_{11} + n_{12})\log\pi + (n_{12} + n_{22})\log(1-\pi).$$
Differentiating with respect to $\pi$ and setting the derivative to zero,
$$\frac{2n_{11} + n_{12}}{\pi} - \frac{n_{12} + n_{22}}{1-\pi} = 0,$$
gives the solution
$$\hat{\pi} = \frac{2n_{11} + n_{12}}{2n_{11} + 2n_{12} + n_{22}}.$$
For the example, substituting the observed counts gives $\hat{\pi}$, hence estimated expected counts $\hat{\mu}_{11} = n\hat{\pi}^2$, $\hat{\mu}_{12} = n\hat{\pi}(1-\hat{\pi})$, $\hat{\mu}_{22} = n(1-\hat{\pi})$ for each cell, and $X^2$ is referred to a chi-squared distribution with df $= (c-1) - \dim(\theta) = 2 - 1 = 1$.
Conclusion: the primary infection had an immunizing effect that reduced the likelihood of a secondary infection.
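The closed-form solution can be sketched in Python. The cell counts below are hypothetical placeholders (the slides do not list them here); only the formulas are from the derivation above.

```python
# Sketch with HYPOTHETICAL cell counts (not from the slides):
# MLE pi_hat = (2*n11 + n12) / (2*n11 + 2*n12 + n22), and estimated
# expected counts under H0: pi11 = pi^2, pi12 = pi*(1-pi), pi22 = 1-pi.
n11, n12, n22 = 30, 60, 66  # hypothetical counts summing to n = 156
n = n11 + n12 + n22
pi_hat = (2 * n11 + n12) / (2 * n11 + 2 * n12 + n22)
mu11 = n * pi_hat ** 2              # expected count, both infections
mu12 = n * pi_hat * (1 - pi_hat)    # expected count, primary only
mu22 = n * (1 - pi_hat)             # expected count, no primary infection
x2 = ((n11 - mu11) ** 2 / mu11 + (n12 - mu12) ** 2 / mu12
      + (n22 - mu22) ** 2 / mu22)
print(round(pi_hat, 3), round(x2, 2))
```

Note that the estimated expected counts always sum to $n$, since $\pi^2 + \pi(1-\pi) + (1-\pi) = 1$.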
Standard Error
Since
$$\frac{\partial^2 L}{\partial \pi^2} = -\frac{2n_{11} + n_{12}}{\pi^2} - \frac{n_{12} + n_{22}}{(1-\pi)^2},$$
the information is its negative expected value, which is
$$n\,\frac{2\pi^2 + \pi(1-\pi)}{\pi^2} + n\,\frac{\pi(1-\pi) + (1-\pi)}{(1-\pi)^2},$$
which simplifies to $n(1+\pi)/[\pi(1-\pi)]$.
The asymptotic standard error is the square root of the inverse information, or
$$\mathrm{SE} = \sqrt{\frac{\hat{\pi}(1-\hat{\pi})}{n(1+\hat{\pi})}}.$$
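A quick numeric check of the simplification (again with hypothetical counts, not from the slides): the SE formula agrees with the square root of the inverse information.

```python
import math

# Sketch (hypothetical counts): SE = sqrt(pi_hat*(1-pi_hat) / (n*(1+pi_hat)))
# should equal the square root of the inverse expected information
# n*(1+pi) / (pi*(1-pi)).
n11, n12, n22 = 30, 60, 66  # hypothetical cell counts
n = n11 + n12 + n22
pi_hat = (2 * n11 + n12) / (2 * n11 + 2 * n12 + n22)
info = n * (1 + pi_hat) / (pi_hat * (1 - pi_hat))   # expected information
se = math.sqrt(pi_hat * (1 - pi_hat) / (n * (1 + pi_hat)))
assert math.isclose(se, math.sqrt(1 / info))        # SE = sqrt(1/information)
print(round(se, 4))
```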
How about confidence limits?
SAS code - MLE, test for binomial
proc IML;
y=842; n=1824;pi0=0.5; /*data*/
pihat=y/n; SE=sqrt(pihat*(1-pihat)/n); /*MLE*/
WaldStat=(pihat-pi0)**2/SE**2;
pWald=1-CDF('CHISQUARE', WaldStat, 1);
LR=2*(y*log(pihat/(pi0)) +(n-y)*log((1-pihat)/(1-pi0)));
pLR=1-CDF('CHISQUARE',LR, 1);
ScoreStat=(pihat-pi0)**2/(pi0*(1-pi0)/n);
pScore=1-CDF('CHISQUARE',ScoreStat, 1);
print WaldStat pWald;
print LR pLR;
print ScoreStat pScore;
quit;
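The IML computations can be mirrored in Python as a cross-check (a sketch, not from the slides), with the df $= 1$ chi-squared tails computed via $\operatorname{erfc}$.

```python
import math

# Sketch mirroring the IML code: Wald, likelihood-ratio, and score
# statistics for y = 842 successes in n = 1824 trials, H0: pi = 0.5,
# with df = 1 tails via P(X >= x) = erfc(sqrt(x/2)).
y, n, pi0 = 842, 1824, 0.5
pihat = y / n
se = math.sqrt(pihat * (1 - pihat) / n)             # SE at the MLE (Wald)
wald = (pihat - pi0) ** 2 / se ** 2
lr = 2 * (y * math.log(pihat / pi0)
          + (n - y) * math.log((1 - pihat) / (1 - pi0)))
score = (pihat - pi0) ** 2 / (pi0 * (1 - pi0) / n)  # SE evaluated at pi0
p_values = [math.erfc(math.sqrt(s / 2)) for s in (wald, lr, score)]
print([round(s, 2) for s in (wald, lr, score)],
      [round(p, 4) for p in p_values])
```

The three statistics differ only in how the variance is estimated (at $\hat{\pi}$ for Wald, at $\pi_0$ for the score), so their values are close but not identical.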
SAS code - MLE, test for binomial
data D;
input outcome $ w;
cards;
Yes 842
No 982
;
proc freq;
weight w;
table outcome/all CL BINOMIAL(P=0.5 LEVEL="Yes");
exact binomial;
run;
SAS code – multinomial
data D;
input outcome $ w;
cards;
yellow
green
6022
2001
;
proc freq; weight w;
table outcome/chisq TESTP=(0.25 0.75);
run;
23