STAT6390 Bayesian Methods, Spring 2007
Homework 1
Due Feb 5, 2007

1. Show that random variables that are i.i.d. given $\theta$ are exchangeable.

2. #1, p. 29 of GCSR.

3. The Weibull distribution is often used for lifetimes of equipment or parts. It actually has two parameters, but we will assume that one of them is fixed at 2. The Weibull(2) density is

$$f(y \mid \theta) = 2\theta y\, e^{-\theta y^{2}} \quad \text{for } 0 < y < \infty.$$

The parameter $\theta$ is something like an "inverse lifetime" parameter (large $\theta$ means short lifetimes; small $\theta$ means long lifetimes). The mean of the distribution is $0.886\,\theta^{-0.5}$. Suppose we observe data $y_1, \ldots, y_n$ as independent (given $\theta$) samples from the Weibull(2) distribution.

(a) Show that the gamma distribution is the conjugate prior distribution and derive the posterior distribution (i.e., identify $\alpha$ and $\beta$ in the posterior distribution).

(b) Derive the marginal distribution $p(y_1, \ldots, y_n)$. Explain why we would say that $y_1, \ldots, y_n$ are independent conditionally but not independent in their marginal distribution.

(c) In one application the lifetime of a kind of gear was measured in 100,000's of hours (so y = 1 means the part lasted 100,000 hours and y = 0.5 means that it lasted 50,000 hours). Gears tend to last between 50,000 and 500,000 hours (depending on the size and manufacturer), with 100,000 being a typical lifetime. Select a reasonable gamma prior to capture this information and justify your choice. Sketch a graph of this prior.

(d) Suppose that we observe n = 10 with y = (0.25, 0.52, 0.60, 0.91, 0.97, 1.00, 1.07, 1.09, 1.18, 1.48). Find and graph the posterior distribution. Give the posterior mean and variance. Give a 95% posterior interval for $\theta$.

(e) Find the predictive distribution $p(\tilde{y} \mid y)$ for a single new observation. Compare this to a reasonable frequentist prediction.

4. We are interested in studying the proportion of children born that are male. A sample of 12 families, each having six children, is obtained. Let $y_i$ denote the number of male children in family i for i = 1, 2, …, 12. The entire vector of 12 observations is y = (4, 3, 5, 6, 4, 3, 3, 4, 1, 3, 2, 1).

(a) Let $\theta$ denote the proportion of male children born in this population. Give a plausible model for the vector y.

(b) Assume that $p(\theta)$ is uniform on (.4, .6). This prior distribution expresses our experience that approximately half of the children born are boys and half are girls. This is not a conjugate prior because of the restricted range. Use a numerical method (similar to the one we used in class) to approximate the posterior distribution $p(\theta \mid y)$. Calculate the Bayes estimate and a 95% credibility interval for $\theta$, and a 95% credibility interval for the odds of a male birth.

(c) Simulate a sample of 1000 draws from the posterior. Re-estimate the quantities above from the simulation. Compare results.

(d) This problem shows that using a prior distribution that assigns zero probability to some part of the parameter space makes computing the posterior directly more difficult. However, there is a more important problem with such a prior distribution. Repeat the analysis of part (b) using data z = y + 4, where these data represent the number of boys in families of size 10. Explain what happens. What would happen if more and more families were observed, with data similar to those observed for z? Why is this behavior of the posterior a problem?
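The handout does not say which numerical method was used in class, so the following is only a minimal sketch of one common choice, a grid approximation, for parts (b) and (c) of Problem 4. It assumes the family counts are independent binomials with a common success probability and six children per family. The names `grid_posterior` and `quantile`, the grid size of 2001 points, and the random seed are illustrative choices, not part of the assignment.

```python
import math
import random
from bisect import bisect_left
from itertools import accumulate

# Data from Problem 4: number of boys in each of 12 families of size 6.
y = [4, 3, 5, 6, 4, 3, 3, 4, 1, 3, 2, 1]
M = 6  # children per family (assumed binomial number of trials)


def grid_posterior(y, m, lo=0.4, hi=0.6, n_grid=2001):
    """Grid points and normalized posterior weights under a uniform(lo, hi) prior."""
    boys = sum(y)
    girls = m * len(y) - boys
    theta = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]
    # The prior is flat on the grid, so the log posterior is the binomial
    # log likelihood up to an additive constant.
    log_post = [boys * math.log(t) + girls * math.log(1 - t) for t in theta]
    peak = max(log_post)  # subtract the maximum to avoid underflow
    w = [math.exp(lp - peak) for lp in log_post]
    total = sum(w)
    return theta, [wi / total for wi in w]


def quantile(theta, cdf, p):
    """Smallest grid point whose cumulative posterior weight reaches p."""
    return theta[min(bisect_left(cdf, p), len(theta) - 1)]


theta, w = grid_posterior(y, M)
cdf = list(accumulate(w))

# Part (b): posterior mean (Bayes estimate under squared error loss) and intervals.
post_mean = sum(t * wi for t, wi in zip(theta, w))
interval = (quantile(theta, cdf, 0.025), quantile(theta, cdf, 0.975))
# The odds theta / (1 - theta) are increasing in theta, so the endpoints map directly.
odds_interval = tuple(t / (1 - t) for t in interval)

# Part (c): 1000 draws from the discretized posterior.
rng = random.Random(0)
draws = sorted(rng.choices(theta, weights=w, k=1000))
sim_mean = sum(draws) / len(draws)
sim_interval = (draws[24], draws[974])  # approximate 2.5% and 97.5% sample quantiles

print(post_mean, interval, odds_interval)
print(sim_mean, sim_interval)
```

Under the same assumptions, part (d) could be explored by calling `grid_posterior([v + 4 for v in y], 10)` and examining where the posterior weight concentrates relative to the prior bounds.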