Stat120C
Homework 6 ()
Instructor: Zhaoxia Yu
Problem 0: This problem is optional. Let $Y = (Y_1, Y_2, Y_3)$ follow a multinomial distribution with $n$ trials and probabilities $p = (p_1, p_2, p_3)$. Note that the $p_i$ sum to 1, i.e., $p_1 + p_2 + p_3 = 1$. The probability mass function (pmf) is
$$\Pr(Y = (y_1, y_2, y_3)) = \frac{n!}{y_1!\, y_2!\, y_3!}\, p_1^{y_1} p_2^{y_2} p_3^{y_3}.$$
Here $y_1, y_2, y_3$ are nonnegative integers that satisfy $y_1 + y_2 + y_3 = n$.
(a) Show that $Y_1$ follows Binomial$(n, p_1)$ by showing that
$$\Pr(Y_1 = y_1) = \sum_{y_2, y_3} \Pr(Y_1 = y_1, Y_2 = y_2, Y_3 = y_3) = \frac{n!}{y_1!\,(n - y_1)!}\, p_1^{y_1} (1 - p_1)^{n - y_1}.$$
Hint: the binomial theorem is useful:
$$(a + b)^n = \sum_{x=0}^{n} \frac{n!}{x!\,(n - x)!}\, a^x b^{n - x}.$$
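The identity in (a) can be checked numerically before attempting the proof. The following is a minimal sketch, not a substitute for the derivation: the values of `n` and `p` are illustrative assumptions that do not come from the homework, and `marginal_y1` is a hypothetical helper name.

```r
# Illustrative values only; n and p are NOT taken from the homework.
n <- 6
p <- c(0.2, 0.3, 0.5)

# Marginal pmf of Y1: sum the joint multinomial pmf over all (y2, y3)
# with y2 + y3 = n - y1.
marginal_y1 <- function(y1) {
  y2 <- 0:(n - y1)
  sum(sapply(y2, function(k) {
    dmultinom(c(y1, k, n - y1 - k), size = n, prob = p)
  }))
}

marginal <- sapply(0:n, marginal_y1)            # summed joint pmf
binomial <- dbinom(0:n, size = n, prob = p[1])  # Binomial(n, p1) pmf

# Largest absolute difference between the two pmfs; it should be
# zero up to floating-point rounding.
max_gap <- max(abs(marginal - binomial))
print(max_gap)
```

Agreement for one choice of `n` and `p` is only a sanity check; the proof still has to go through the binomial theorem.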
(b) Prove that $\mathrm{Cov}(Y_1, Y_2) = -n p_1 p_2$, $\mathrm{Cov}(Y_1, Y_3) = -n p_1 p_3$, $\mathrm{Cov}(Y_2, Y_3) = -n p_2 p_3$.
Hint: $E[Y_1 Y_2] = \sum_{y_1, y_2, y_3} y_1 y_2 \Pr(Y_1 = y_1, Y_2 = y_2, Y_3 = y_3)$. Show that it equals $n(n - 1) p_1 p_2$. The trinomial theorem is useful:
$$(a + b + c)^n = \sum_{x + y + z = n} \frac{n!}{x!\, y!\, z!}\, a^x b^y c^z.$$
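The two claims in (b) can be sanity-checked by brute-force enumeration of the support. This is a sketch under assumed values: `n` and `p` below are illustrative and not from the homework, and the variable names are hypothetical.

```r
# Illustrative values only; n and p are NOT taken from the homework.
n <- 5
p <- c(0.2, 0.3, 0.5)

# Enumerate the support: all (y1, y2, y3) with y1 + y2 + y3 = n.
grid <- expand.grid(y1 = 0:n, y2 = 0:n)
grid <- grid[grid$y1 + grid$y2 <= n, ]
grid$y3 <- n - grid$y1 - grid$y2

# Joint pmf at each support point.
grid$prob <- apply(grid[, c("y1", "y2", "y3")], 1,
                   function(y) dmultinom(y, size = n, prob = p))

e_y1   <- sum(grid$y1 * grid$prob)            # E[Y1]
e_y2   <- sum(grid$y2 * grid$prob)            # E[Y2]
e_y1y2 <- sum(grid$y1 * grid$y2 * grid$prob)  # E[Y1 Y2]
cov12  <- e_y1y2 - e_y1 * e_y2                # Cov(Y1, Y2)

# Compare with the closed forms stated in the problem.
print(c(e_y1y2, n * (n - 1) * p[1] * p[2]))
print(c(cov12, -n * p[1] * p[2]))
```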
Problem 1: (Modified from 13.8 of Rice with cell values changed) Adult-onset diabetes is known to
be highly genetically determined. A study was done comparing frequencies of a particular allele in a sample
of such diabetics and a sample of nondiabetics. The data are shown in the following table:
            Diabetic   Normal
Bb or bb        3         1
BB              5         6
Are the relative frequencies of the allele significantly different in the two groups? State your hypotheses, test
statistic, significance level and whether you should reject your null based on Fisher’s exact test.
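As a rough sketch of how Fisher's exact test works on these data: with all margins fixed, the count in one cell follows a hypergeometric distribution under the null. The code assumes the table reads as 3 diabetic and 1 normal subject with Bb or bb, and 5 diabetic and 6 normal subjects with BB. The two-sided p-value below uses one common convention (summing the probabilities of all tables no more likely than the observed one); the variable names are illustrative.

```r
# Assumed reading of the table: rows are genotypes, columns are groups.
tab <- matrix(c(3, 5, 1, 6), nrow = 2,
              dimnames = list(genotype = c("Bb or bb", "BB"),
                              group = c("Diabetic", "Normal")))

n_diabetic <- sum(tab[, "Diabetic"])  # column total
n_normal   <- sum(tab[, "Normal"])    # column total
n_carrier  <- sum(tab["Bb or bb", ])  # row total

# Under H0 the top-left cell is hypergeometric given the margins.
support <- max(0, n_carrier - n_normal):min(n_carrier, n_diabetic)
probs   <- dhyper(support, n_diabetic, n_normal, n_carrier)
p_obs   <- dhyper(tab[1, 1], n_diabetic, n_normal, n_carrier)

# Two-sided p-value: total probability of tables at most as likely
# as the observed one (small tolerance guards against rounding ties).
p_manual <- sum(probs[probs <= p_obs * (1 + 1e-7)])

# Cross-check with the built-in test.
p_builtin <- fisher.test(tab)$p.value
print(c(p_manual, p_builtin))
```

The by-hand part mirrors what a written solution would show; `fisher.test` is only a cross-check.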
Problem 2: Suppose that 300 persons are selected at random from a large population, and each person in the sample is classified according to blood type (O, A, B, or AB) and also according to Rh (positive or negative). The observed numbers are given below.
         O     A     B    AB
Rh+     82    89    54    19
Rh-     13    27     7     9
(a) Conduct Pearson's chi-square test (at level $\alpha = 0.05$) of the hypothesis that the two classifications, blood type and Rh, are independent.
(b) Confirm your calculation in (a) using R.
> rhp = c(82, 89, 54, 19)   # Rh+ counts for O, A, B, AB
> rhn = c(13, 27, 7, 9)     # Rh- counts for O, A, B, AB
> chisq.test(rbind(rhp, rhn), correct=FALSE)
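For comparison with the built-in call, the hand calculation in (a) can be sketched as follows: expected counts under independence are (row total × column total) / n, and the statistic sums (observed − expected)² / expected over the eight cells. The variable names are illustrative.

```r
observed <- rbind(rhp = c(82, 89, 54, 19),
                  rhn = c(13, 27, 7, 9))
colnames(observed) <- c("O", "A", "B", "AB")

n <- sum(observed)

# Expected counts under independence.
expected <- outer(rowSums(observed), colSums(observed)) / n

# Pearson's chi-square statistic and its degrees of freedom.
chi2 <- sum((observed - expected)^2 / expected)
df   <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Reference values at level 0.05.
critical <- qchisq(0.95, df)
p_value  <- pchisq(chi2, df, lower.tail = FALSE)

print(round(expected, 2))
print(c(statistic = chi2, df = df, critical = critical, p = p_value))
```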
(c) Calculate the likelihood ratio statistic for testing independence. To do so, first calculate the maximized likelihood under the full model, i.e., the model with no constraints. Denote it by $L_1$. Second, calculate the maximized likelihood under the reduced model, i.e., the model that assumes independence. Denote it by $L_0$. Third, calculate $2(\log L_1 - \log L_0)$.
(d) Compare the test statistic in (c) to Pearson's chi-square statistic. Under the null hypothesis of independence, the likelihood ratio statistic approximately follows a chi-squared distribution with three degrees of freedom. Based upon the likelihood ratio statistic, would you reject the null hypothesis at level $\alpha = 0.05$?
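The three steps in (c) can be sketched numerically. This assumes the standard multinomial results: under the full model the fitted cell probabilities are the observed proportions, and under independence they are products of the row and column proportions. The multinomial coefficient is the same in both likelihoods, so it is left out. The variable names are illustrative.

```r
observed <- rbind(c(82, 89, 54, 19),   # Rh+
                  c(13, 27, 7, 9))     # Rh-
n <- sum(observed)

# Fitted cell probabilities under each model.
p_full    <- observed / n
p_reduced <- outer(rowSums(observed), colSums(observed)) / n^2

# Maximized log-likelihoods, up to the shared multinomial coefficient.
loglik1 <- sum(observed * log(p_full))     # log L1 (full model)
loglik0 <- sum(observed * log(p_reduced))  # log L0 (independence)

lr_stat <- 2 * (loglik1 - loglik0)
p_value <- pchisq(lr_stat, df = 3, lower.tail = FALSE)

print(c(statistic = lr_stat, p = p_value))
```

The two statistics are usually close for tables with moderately large counts, which is the comparison (d) asks for.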
Problem 3: Consider random samples taken from $J$ populations, where each observation can be classified as one of $I$ different types. Let $n_{ij}$ be the number of subjects classified as the $i$th type from population $j$. The data can be arranged in the following $I \times J$ table.
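$$\begin{array}{cccc|c}
n_{11} & n_{12} & \cdots & n_{1J} & n_{1\cdot} \\
n_{21} & n_{22} & \cdots & n_{2J} & n_{2\cdot} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
n_{I1} & n_{I2} & \cdots & n_{IJ} & n_{I\cdot} \\
\hline
n_{\cdot 1} & n_{\cdot 2} & \cdots & n_{\cdot J} & n_{\cdot\cdot}
\end{array}$$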
Let $p_{ij}$ denote the probability that an observation chosen at random from the $j$th population will be of type $i$. Thus
$$\sum_{i=1}^{I} p_{ij} = 1 \quad \text{for } j = 1, \dots, J,$$
and the data from the $j$th population $(n_{1j}, n_{2j}, \dots, n_{Ij})$ come from a multinomial distribution with $n_{\cdot j}$ trials and cell probabilities $(p_{1j}, p_{2j}, \dots, p_{Ij})$.
The null hypothesis of homogeneity claims
$$H_0: p_{i1} = p_{i2} = \cdots = p_{iJ} = p_i \quad \text{for } i = 1, \dots, I.$$
Derive the maximum likelihood estimates of $p_1, p_2, \dots, p_I$ under the assumption of homogeneity.
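One possible route to the derivation is a Lagrange-multiplier argument, outlined below. It is a sketch rather than a full solution: $C$ collects the multinomial coefficients (which do not depend on the $p_i$), $\lambda$ is a multiplier introduced here for the constraint $\sum_i p_i = 1$, and the check that the stationary point is a maximum is left out.

```latex
\begin{align*}
% Likelihood under H_0: the J independent multinomials share (p_1, ..., p_I)
L(p_1, \dots, p_I)
  &= C \prod_{j=1}^{J} \prod_{i=1}^{I} p_i^{\,n_{ij}}
   = C \prod_{i=1}^{I} p_i^{\,n_{i\cdot}}, \\
% Log-likelihood with the constraint attached
\ell(p_1, \dots, p_I, \lambda)
  &= \log C + \sum_{i=1}^{I} n_{i\cdot} \log p_i
     - \lambda \Bigl( \sum_{i=1}^{I} p_i - 1 \Bigr), \\
% Stationarity in each p_i
\frac{\partial \ell}{\partial p_i}
  &= \frac{n_{i\cdot}}{p_i} - \lambda = 0
  \quad \Longrightarrow \quad p_i = \frac{n_{i\cdot}}{\lambda}, \\
% Summing over i and using the constraint gives lambda = n_{..}
\hat{p}_i &= \frac{n_{i\cdot}}{n_{\cdot\cdot}}, \qquad i = 1, \dots, I.
\end{align*}
```

The resulting estimate is the pooled proportion of type $i$ across all $J$ samples.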
Problem 4: In a study, 1200 schoolchildren were questioned on whether they had severe colds at the age of 10 and at the age of 12. The data are summarized in the table below:
                            Severe colds at age 12
Severe colds at age 10        Yes        No
Yes                           200       165
No                            235       600
Conduct a statistical analysis to test whether there was a significant change in the prevalence of severe colds.
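Because the same children are measured at both ages, the two samples are paired. The problem does not name a test; McNemar's test is one standard choice for paired yes/no data, and the sketch below assumes it. It also assumes that 200 and 600 are the concordant cells (yes/yes and no/no) and that 165 and 235 are the discordant cells. The statistic depends only on the discordant counts and is unchanged if they are swapped.

```r
# Assumed arrangement: rows = age 10 (Yes, No), columns = age 12 (Yes, No).
tab <- matrix(c(200, 235, 165, 600), nrow = 2,
              dimnames = list(age10 = c("Yes", "No"),
                              age12 = c("Yes", "No")))

# Discordant pairs: children whose status changed between the two ages.
yes_then_no <- tab["Yes", "No"]
no_then_yes <- tab["No", "Yes"]

# McNemar statistic without continuity correction; compare to a
# chi-square distribution with 1 degree of freedom.
stat    <- (yes_then_no - no_then_yes)^2 / (yes_then_no + no_then_yes)
p_value <- pchisq(stat, df = 1, lower.tail = FALSE)

# Cross-check with the built-in test.
builtin <- mcnemar.test(tab, correct = FALSE)
print(c(statistic = stat, p = p_value))
print(builtin)
```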