Modern Statistics and Econometrics from a Computational Point
of View, G63.2707: Assignment I
by Neil Chriss
This assignment covers the material in the first two lectures. Please
note that you will have to use the book Judge, Hill, Griffiths, etc. as a
reference for some of the problems. The homework also has problems
from the discussion section as well. You may work in teams of size no
larger than 3 for this exercise, and each team may hand in a single report.
1. (From Poirier) Prove that the two experiments on page 124 of Poirier’s article have the stated likelihood functions.
2. (From Poirier) Suppose that you run experiments one and two on
page 123–124 as follows:
• Protocol 1: you flip the coin 12 times and see three heads and
nine tails.
• Protocol 2: you flip the coin until you see 3 heads. The outcome of the experiment is 12 flips with three heads and nine tails.
What is the maximum likelihood estimate in each protocol of the
probability p of flipping a head?
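As a hedged sketch for Problems 1 and 2, assume Protocol 1 follows the standard binomial model and Protocol 2 the standard negative binomial model; the forms stated in Poirier's article may be written differently. Under that assumption the two likelihoods differ only by a constant factor, so they share the same score equation:

```latex
\begin{align*}
L_1(p) &= \binom{12}{3} p^{3} (1-p)^{9} && \text{(Protocol 1: 12 flips fixed)} \\
L_2(p) &= \binom{11}{2} p^{3} (1-p)^{9} && \text{(Protocol 2: stop at the 3rd head)} \\
\frac{d}{dp} \log L_k(p) &= \frac{3}{p} - \frac{9}{1-p} = 0
  \quad\Longrightarrow\quad \hat{p} = \frac{3}{12}, \qquad k = 1, 2.
\end{align*}
```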
3. (Previous Problem Continued): Write a computer program to repeatedly simulate protocols 1 and 2 above, setting p = 3/12 in your program, where p is
the probability of flipping a head. Note that protocol 2 means "flip until you see 3 heads" and will not always produce
an outcome of 12 flips. Simulate each protocol 1,000 times and,
from the results, estimate the probability of flipping a head and the
standard deviation of that estimate.
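One possible way to set up the simulation in Problem 3 is sketched below. The function names (`protocol_1`, `protocol_2`, `simulate`) and the fixed seed are illustrative assumptions, not part of the assignment, and the estimate used in each run is simply heads divided by flips.

```python
import random
import statistics


def protocol_1(p, n_flips, rng):
    """Flip a fixed number of times; return (heads, flips)."""
    heads = sum(rng.random() < p for _ in range(n_flips))
    return heads, n_flips


def protocol_2(p, n_heads, rng):
    """Flip until n_heads heads have appeared; return (heads, flips)."""
    heads = flips = 0
    while heads < n_heads:
        flips += 1
        if rng.random() < p:
            heads += 1
    return heads, flips


def simulate(protocol, arg, p=3 / 12, n_runs=1000, seed=0):
    """Run a protocol n_runs times.

    Returns the mean and standard deviation of the per-run
    estimate heads / flips.
    """
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    estimates = []
    for _ in range(n_runs):
        heads, flips = protocol(p, arg, rng)
        estimates.append(heads / flips)
    return statistics.mean(estimates), statistics.stdev(estimates)


mean_1, sd_1 = simulate(protocol_1, 12)  # 12 flips per run
mean_2, sd_2 = simulate(protocol_2, 3)   # flip until 3 heads per run
print("Protocol 1:", mean_1, sd_1)
print("Protocol 2:", mean_2, sd_2)
```

Comparing the two printed means against p = 3/12 is a natural starting point for discussing how the stopping rule affects the estimate.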
4. (Previous Problem Continued): Suppose protocol 1 yields a string
of flips containing 3 heads and 9 tails. Write a program to compute the bootstrap standard error of the maximum likelihood estimate for the
probability p for protocol 1. Can this be done for protocol 2? Explain why or why not.
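A minimal bootstrap sketch for protocol 1 follows. It assumes the observed string is coded as 1 for heads and 0 for tails; the particular ordering used here is hypothetical, since only the counts enter the estimate. The name `bootstrap_se` and the seed are likewise illustrative.

```python
import random
import statistics


def bootstrap_se(flips, n_boot=1000, seed=0):
    """Bootstrap standard error of the proportion of heads.

    flips is a list of 0/1 outcomes. Each replication resamples the
    flips with replacement and recomputes the proportion of heads.
    """
    rng = random.Random(seed)
    n = len(flips)
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(flips, k=n)  # draw n flips with replacement
        estimates.append(sum(resample) / n)
    return statistics.stdev(estimates)


# Hypothetical ordering of the observed string: 3 heads, then 9 tails.
observed = [1] * 3 + [0] * 9
print("bootstrap SE:", bootstrap_se(observed))
```

As a sanity check, the result can be compared with the plug-in binomial value sqrt(p̂(1 − p̂)/12) evaluated at p̂ = 3/12.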
5. (Maximum likelihood): Let Y1, ..., YN be a sample from N(µ, σ).
• What is the likelihood function of the sample?
• Assume that µ = 0 is known. What is the maximum likelihood
estimate of σ? Is it unbiased? If it is biased, provide a formula
for the bias.
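As a starting point for Problem 5, the likelihood of an independent normal sample can be written as below. This sketch assumes that σ denotes the standard deviation (so the variance is σ²) and uses lowercase y_i for the observed values; adjust the notation if the lecture parametrizes the normal by its variance.

```latex
\begin{align*}
L(\mu, \sigma) &= \prod_{i=1}^{N} \frac{1}{\sigma\sqrt{2\pi}}
  \exp\!\left( -\frac{(y_i - \mu)^2}{2\sigma^2} \right)
  = (2\pi\sigma^2)^{-N/2}
  \exp\!\left( -\frac{1}{2\sigma^2} \sum_{i=1}^{N} (y_i - \mu)^2 \right), \\
\log L(\mu, \sigma) &= -\frac{N}{2}\log(2\pi) - N \log\sigma
  - \frac{1}{2\sigma^2} \sum_{i=1}^{N} (y_i - \mu)^2 .
\end{align*}
```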
6. (Mixture of Distributions) Let f be a normal p.d.f. N(0, 1) and let
δa be a point mass at a. Let ε be a small number and write
fa,ε = (1 − ε)f + εδa
We call this the contaminated normal, where the contamination is
a point mass.
• Argue that to sample from fa,ε one can do a two-stage sampling. Stage 1: determine the state, f or δa, by sampling uniformly from [0, 1]. If the sample is in [0, 1 − ε] then the state
is f; otherwise it is δa. Next, sample from the appropriate
distribution.
• Write a computer program to sample from fa,ε for a = 5, 10, 20, 50
and for ε = 0.01, and repeat the following exercise for each value
of a.
– Generate a sample of size 100 and compute the sample
mean µ̂ of the sample. Now compute the standard error
of µ̂ two ways: first by assuming that the "parent" population is normal, and then by using the bootstrap (use 1,000
replications). Describe the difference between the two estimates.
– Repeat the above example with sample size 200.
– Repeat again with sample size 1,000.
– Comment on the nature of the standard errors as the sample size grows.
– Report a one-paragraph description of your impressions
of the possible problems with using the standard error estimation assuming a normal parent when in fact the parent
population is contaminated.
– Can you use maximum likelihood estimation to estimate
the mean of the distribution fa,ε? If so, outline the
procedure for carrying out the estimation.
– Given that you know a and ε, what is the true mean of the
distribution fa,ε? State your answer in terms of a and ε.
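The two-stage sampler and the two standard-error calculations in Problem 6 could be sketched roughly as follows. The helper names, the seeds, and the use of s/√n as the "normal parent" standard error are assumptions made for illustration; the mixing weight is called `eps` in the code.

```python
import math
import random
import statistics


def sample_contaminated(n, a, eps, rng):
    """Draw n values from the contaminated normal by two-stage sampling."""
    out = []
    for _ in range(n):
        u = rng.random()                      # stage 1: choose the state
        if u <= 1 - eps:
            out.append(rng.gauss(0.0, 1.0))   # stage 2: draw from N(0, 1)
        else:
            out.append(float(a))              # stage 2: point mass at a
    return out


def normal_se(xs):
    """Standard error of the mean under a normal parent: s / sqrt(n)."""
    return statistics.stdev(xs) / math.sqrt(len(xs))


def bootstrap_se(xs, n_boot=1000, rng=None):
    """Bootstrap standard error of the sample mean."""
    rng = rng or random.Random(0)
    n = len(xs)
    means = [statistics.fmean(rng.choices(xs, k=n)) for _ in range(n_boot)]
    return statistics.stdev(means)


rng = random.Random(42)
for a in (5, 10, 20, 50):
    xs = sample_contaminated(100, a, 0.01, rng)
    print(a, statistics.fmean(xs), normal_se(xs), bootstrap_se(xs, rng=rng))
```

Changing the sample size argument from 100 to 200 or 1,000 covers the remaining parts of the exercise.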
7. (Maximum Entropy Estimation) Suppose a six-sided die is rolled
a large number of times and you want to use maximum entropy estimation to find the probabilities p1, ..., p6 of rolling i = 1, ..., 6.
Suppose after the j-th experiment you only know the average roll
of the die aj. Find the Lagrange multiplier solution to the estimation problem described in the lecture notes. Make a table of
the results with the rows containing results for values of aj =
1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5. Let the columns be the probabilities
p1, ..., p6.
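The sketch below assumes the standard formulation of this problem: maximize the entropy −Σ pi log pi subject to Σ pi = 1 and Σ i·pi = aj. The lecture notes may state it differently. Under that assumption the Lagrange conditions give pi proportional to exp(λ·i), and the multiplier λ can be found numerically. The function names and the bisection bounds are illustrative choices.

```python
import math

FACES = range(1, 7)


def probs_for(lam):
    """Probabilities p_i proportional to exp(lam * i) for i = 1..6."""
    shift = max(lam * i for i in FACES)       # avoid overflow in exp
    weights = [math.exp(lam * i - shift) for i in FACES]
    total = sum(weights)
    return [w / total for w in weights]


def implied_mean(lam):
    """Mean roll implied by the multiplier lam."""
    return sum(i * p for i, p in zip(FACES, probs_for(lam)))


def maxent_die(mean, lo=-50.0, hi=50.0, tol=1e-12):
    """Solve for lam by bisection so that the implied mean equals `mean`.

    The implied mean is increasing in lam, so bisection applies.
    `mean` must lie strictly between 1 and 6.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if implied_mean(mid) < mean:
            lo = mid
        else:
            hi = mid
    return probs_for((lo + hi) / 2)


# One table row per target average, columns p1..p6.
for a in (1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5):
    print(a, " ".join(f"{p:.4f}" for p in maxent_die(a)))
```

A target average of 3.5 should return the uniform distribution, which is a quick check that the solver behaves sensibly.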
8. (Previous Problem Continued) Design a simulation to test whether
maximum entropy estimation is appropriate in the previous problem. Describe the simulation and explain why it lends insight into the problem. Carry out the simulation.