STAT6390 Bayesian Methods, Spring 2007
Homework 1
Due Feb 5, 2007
1. Show that random variables that are i.i.d. given $\theta$ are exchangeable.
2. #1, p. 29 of GCSR
3. The Weibull distribution is often used for lifetimes of
equipment/parts. It actually has two parameters, but we will assume that one of
them is fixed at 2. The Weibull(2) density is
$f(y \mid \theta) = 2\theta y e^{-\theta y^2}$ for $0 < y < \infty$.
The parameter $\theta$ is something like an “inverse lifetime” parameter (large $\theta$
means short lifetimes; small $\theta$ means long lifetimes). The mean of the
distribution is $0.886\,\theta^{-0.5}$. Suppose we observe data $y_1, \ldots, y_n$ as independent
(given $\theta$) samples from the Weibull(2) distribution.
(a) Show that the gamma distribution is the conjugate prior distribution and derive the
posterior distribution (i.e., identify $\alpha$ and $\beta$ in the posterior distribution).
(b) Derive the marginal distribution $p(y_1, \ldots, y_n)$. Explain why we would say that
$y_1, \ldots, y_n$ are independent conditionally but not independent in their marginal
distribution.
(c) In one application the lifetime of a kind of gear was measured in 100,000’s of
hours (so y = 1 means the part lasted 100,000 hours and y = 0.5 means that it
lasted 50,000 hours). Gears tend to last between 50,000 and 500,000 hours
(depending on the size and manufacturer), with 100,000 being a typical lifetime.
Select a reasonable gamma prior to capture this information and justify your
choice. Sketch a graph of this prior.
(d) Suppose that we observe n = 10 with y = (0.25, 0.52, 0.60, 0.91, 0.97, 1.00, 1.07,
1.09, 1.18, 1.48). Find and graph the posterior distribution. Give the posterior
mean and variance. Give a 95% posterior interval for $\theta$.
(e) Find the predictive distribution $p(\tilde{y} \mid y)$ for a single new observation. Compare
this to a reasonable frequentist prediction.
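As a rough illustration of how the computation in part (d) might be set up, the sketch below assumes that the Weibull(2) likelihood is proportional to $\theta^n \exp(-\theta \sum y_i^2)$, so that a Gamma(a, b) prior (shape a, rate b) would update to Gamma(a + n, b + $\sum y_i^2$). The function names and the prior values a = 2, b = 1.5 are placeholders chosen only to make the sketch runnable; they are not a suggested answer to part (c). The interval is approximated by simulation rather than by exact gamma quantiles.

```python
import random

# Data from part (d), in units of 100,000 hours.
y = [0.25, 0.52, 0.60, 0.91, 0.97, 1.00, 1.07, 1.09, 1.18, 1.48]

def gamma_posterior(a, b, data):
    """Update an assumed Gamma(shape=a, rate=b) prior with Weibull(2) data."""
    n = len(data)
    sum_sq = sum(v * v for v in data)
    return a + n, b + sum_sq

def summarize(shape, rate, draws=20000, seed=0):
    """Posterior mean, variance, and a simulated 95% equal-tailed interval."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    sample = sorted(rng.gammavariate(shape, 1.0 / rate) for _ in range(draws))
    lo = sample[int(0.025 * draws)]
    hi = sample[int(0.975 * draws) - 1]
    return shape / rate, shape / rate ** 2, (lo, hi)

# Hypothetical prior values (assumption, not taken from the assignment).
a0, b0 = 2.0, 1.5
shape, rate = gamma_posterior(a0, b0, y)
mean, var, interval = summarize(shape, rate)

print(f"posterior: Gamma(shape={shape:.1f}, rate={rate:.4f})")
print(f"mean = {mean:.3f}, variance = {var:.4f}")
print(f"approx. 95% interval = ({interval[0]:.3f}, {interval[1]:.3f})")
```

Changing a0 and b0 shows how sensitive the posterior summaries are to the prior when n is only 10.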
4. We are interested in studying the proportion of children born that are male. A
sample of 12 families, each having six children, is obtained. Let $y_i$ denote the
number of male children in family i for i = 1, 2, …, 12. The entire vector of 12
observations is y = (4, 3, 5, 6, 4, 3, 3, 4, 1, 3, 2, 1).
(a) Let $\theta$ denote the proportion of male children born in this population. Give a
plausible model for the vector y.
(b) Assume that $p(\theta)$ is uniform on (0.4, 0.6). This prior distribution expresses our
experience that there are approximately half boys and half girls born. This is not a
conjugate prior because of the restricted range. Use a numerical method (similar
to the one we used in class) to approximate the posterior distribution $p(\theta \mid y)$.
Calculate the Bayes estimate and a 95% credibility interval for $\theta$, and a 95%
credibility interval for the odds of a male birth.
(c) Simulate a sample of 1000 draws from the posterior. Re-estimate the quantities
above from the simulation. Compare results.
(d) This problem shows that using a prior distribution that assigns zero probability to
some part of the parameter space makes computing the posterior directly more
difficult. However, there is a more important problem with such a prior
distribution. Repeat the analysis of part (b) using data z = y + 4, where these data
represent the number of boys in families of size 10. Explain what happens. What
would happen if more and more families were observed, with data approximately
similar to those observed for z? Why is this behavior for the posterior a problem?
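One way the numerical approach in problem 4 might be sketched is a grid approximation. The code below assumes a binomial model with a common $\theta$ across families, so the likelihood is proportional to $\theta^{s}(1-\theta)^{f}$ with s total boys and f total girls. The function names, the grid size, and the random seed are illustrative choices, and this is not necessarily the method used in class.

```python
import math
import random

def grid_posterior(successes, failures, lo=0.4, hi=0.6, m=2001):
    """Normalized posterior weights on an evenly spaced grid over [lo, hi].

    Assumes a uniform prior on (lo, hi) and a likelihood proportional to
    theta^successes * (1 - theta)^failures.
    """
    grid = [lo + (hi - lo) * i / (m - 1) for i in range(m)]
    log_w = [successes * math.log(t) + failures * math.log(1 - t) for t in grid]
    top = max(log_w)  # subtract the maximum to avoid underflow
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return grid, [v / total for v in w]

def grid_mean(grid, probs):
    return sum(t * p for t, p in zip(grid, probs))

def grid_quantile(grid, probs, q):
    """Smallest grid point whose cumulative probability reaches q."""
    cum = 0.0
    for t, p in zip(grid, probs):
        cum += p
        if cum >= q:
            return t
    return grid[-1]

y = [4, 3, 5, 6, 4, 3, 3, 4, 1, 3, 2, 1]
boys = sum(y)
girls = 6 * len(y) - boys

# Part (b): posterior mean, 95% interval for theta and for the odds.
grid, probs = grid_posterior(boys, girls)
post_mean = grid_mean(grid, probs)
ci = (grid_quantile(grid, probs, 0.025), grid_quantile(grid, probs, 0.975))
odds_ci = tuple(t / (1 - t) for t in ci)  # odds are monotone in theta

# Part (c): 1000 draws from the discretized posterior.
rng = random.Random(1)
draws = rng.choices(grid, weights=probs, k=1000)
sim_mean = sum(draws) / len(draws)

# Part (d): z = y + 4 boys per family, families of size 10.
z_boys = boys + 4 * len(y)
z_girls = 10 * len(y) - z_boys
grid_z, probs_z = grid_posterior(z_boys, z_girls)
mean_z = grid_mean(grid_z, probs_z)

print(f"(b) mean = {post_mean:.4f}, 95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"(b) odds 95% CI = ({odds_ci[0]:.3f}, {odds_ci[1]:.3f})")
print(f"(c) simulated mean = {sim_mean:.4f}")
print(f"(d) mean with z data = {mean_z:.4f}")
```

Under these assumptions the sample proportion for z is 87/120, which lies outside (0.4, 0.6), so the posterior weights from `grid_posterior` can only accumulate near the upper endpoint of the prior's support.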