MATH 472 – Bayesian Estimation Problems

You may use any of the results we have derived in class. Just be sure to state the result so that I know you are using it.

1. The first step in testing a new cancer drug is to determine a safe dosage. This is done in what are called Phase I trials. Any cancer drug will cause a toxicity (that is, an adverse reaction) in some unknown proportion $p$ of the patient population. In order to estimate $p$, researchers give the new drug to a random sample of $n$ patients and observe the number $k$ of toxicities. Suppose we have a brand new drug, with no prior information on its toxicity. At the beginning of our Phase I trials we give the drug to $n = 20$ patients and observe $k = 6$ toxicities. Please answer the following. You may want to include pictures from the Maple Worksheet made available to you to show the graphs of the distributions you find.
(a) Find the posterior distribution of $p$ given this information.
(b) Find a 90% credibility interval for $p$.
(c) We continue the Phase I testing by giving the drug to $m = 10$ new patients selected at random. What is the predicted probability (using the posterior distribution in (a)) that the first of these new patients will have a toxic reaction to the drug?
(d) Find the predicted probability (using the posterior distribution in (a)) that exactly $j$ of the new patients will experience toxicities.
(e) Suppose it now turns out that $j = 4$ of the 10 new patients experience toxicities. Now what is the posterior distribution of $p$ given all the results up to this point?

2. Suppose you have an underlying Bernoulli distribution with unknown probability of success $p$. You repeatedly sample from the population. For the first sample you have a sample size of $n_1$ with $y_1$ successes, for the second sample you have a sample size of $n_2$ with $y_2$ successes, etc. You repeat this process $m$ times. Let $N = \sum_{i=1}^{m} n_i$ and $Y = \sum_{i=1}^{m} y_i$. Thus $N$ is the cumulative number of trials and $Y$ is the cumulative number of successes.
(a) Starting with a Beta prior with parameters $\alpha$ and $\beta$, and updating your prior on each step to be the posterior from the previous step, find the posterior distribution of $p$ after the last sample has been collected and used to update the posterior (i.e. the posterior after you have repeated this process $m$ times).
(b) Find the mean of the distribution in part (a). Determine the limit of the mean as $N \to \infty$. Explain how this is related to a frequency approach to the estimation of the probability of success $p$.
(c) Find the variance of the distribution in part (a). Determine the limit of the variance as $N \to \infty$. Explain why this makes sense.

3. Let $X_i$, $i = 1, \dots, n$, be a random sample from a Poisson distribution with mean $\lambda$, and let $Y = \sum_{i=1}^{n} X_i$. Let the prior distribution of $\lambda$ be gamma with parameters $\alpha$ and $\beta$. Show that the posterior distribution of $\lambda$ given the data from your random sample (i.e. $X_i = x_i$, $i = 1, \dots, n$, and $y = \sum_{i=1}^{n} x_i$) is gamma with parameters $\alpha + y$ and $1/(n + 1/\beta)$. Be sure to show and justify all steps in your derivation.

4. Consider the Galton Board from which you have collected data. In class we discussed how the proportion of balls landing in slots zero through six could be modeled using the binomial distribution with parameters $n = 6$ and $p = 0.5$. To determine if this model is reasonable for the Galton Board, you will use the Maple Worksheet made available to you and Bayesian Analysis. Let $\theta$ be the probability a ball will land in slot 2, 3, or 4 (i.e. “success” is landing in slot 2, 3, or 4).
(a) Using the binomial model above, find a theoretical probability for $\theta$.
(b) Using the Maple worksheet, decide on an informative beta prior that reflects your certainty about this theoretical probability. Clearly indicate the “spread” you entered on the Maple worksheet and the alpha and beta values for this prior.
(c) Now use your data to find the posterior distribution of $\theta$.
(d) Lastly, determine a 95% credibility region for $\theta$, the probability a ball will land in slot 2, 3, or 4 (i.e. “success” is landing in slot 2, 3, or 4), based on your posterior distribution. Include a graph of this in your solution (cut and paste the graph from Maple onto a document).
(e) Based on your results, do you believe the binomial model above is a good model for this Galton Board? Why or why not?
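
For readers who want to check their work on Problems 1 and 2 outside of Maple, the sequential Beta update can be sketched in a few lines of Python. This is only an illustrative sketch, not the course's worksheet: the function names are made up for this example, and the flat Beta(1, 1) prior used to represent "no prior information" is an assumption that you should state and justify in your own solution.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Log of the Beta function B(a, b), computed via log-gamma."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def update_beta(alpha, beta, successes, trials):
    """Conjugate update: Beta(alpha, beta) prior plus binomial data."""
    return alpha + successes, beta + trials - successes


def beta_mean(alpha, beta):
    return alpha / (alpha + beta)


def beta_variance(alpha, beta):
    total = alpha + beta
    return alpha * beta / (total ** 2 * (total + 1))


def predictive_pmf(j, m, alpha, beta):
    """Beta-binomial probability of exactly j successes in m new trials."""
    log_ratio = log_beta(alpha + j, beta + m - j) - log_beta(alpha, beta)
    return comb(m, j) * exp(log_ratio)


# Assumption: a flat Beta(1, 1) prior stands in for "no prior information".
a1, b1 = update_beta(1, 1, successes=6, trials=20)    # after the first 20 patients
a2, b2 = update_beta(a1, b1, successes=4, trials=10)  # after 10 more patients

print("posterior after 20 patients:", (a1, b1))
print("posterior mean:", beta_mean(a1, b1))
print("predictive P(j = 4 of 10):", predictive_pmf(4, 10, a1, b1))
print("posterior after all 30 patients:", (a2, b2))
```

Updating in two steps gives the same parameters as a single update with all 30 patients, which is the pattern Problem 2 asks you to generalize. The sketch does not compute credibility intervals, since those need the inverse of the Beta distribution function, which the Python standard library does not provide.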
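
The gamma posterior claimed in Problem 3 can be sanity-checked numerically before you attempt the derivation. The Python sketch below assumes the gamma prior's second parameter is a scale (so the prior mean is shape times scale), and it uses made-up counts and prior values purely for illustration. It checks that prior times likelihood differs from the claimed posterior density only by a constant that does not depend on the Poisson mean. This is a numerical check, not a substitute for the proof.

```python
from math import lgamma, log


def update_gamma(shape, scale, counts):
    """Claimed conjugate update for Poisson data under a gamma(shape, scale) prior."""
    n = len(counts)
    y = sum(counts)
    return shape + y, 1.0 / (n + 1.0 / scale)


def log_gamma_pdf(x, shape, scale):
    """Log density of a gamma distribution in the shape/scale parameterization."""
    return (shape - 1) * log(x) - x / scale - lgamma(shape) - shape * log(scale)


def log_poisson_likelihood(rate, counts):
    """Log likelihood of independent Poisson counts with a common mean."""
    return sum(c * log(rate) - rate - lgamma(c + 1) for c in counts)


def log_prior_times_likelihood(rate, shape, scale, counts):
    return log_gamma_pdf(rate, shape, scale) + log_poisson_likelihood(rate, counts)


# Hypothetical data and prior, chosen only to exercise the formulas.
counts = [2, 0, 3, 1]
prior_shape, prior_scale = 2.0, 1.5

post_shape, post_scale = update_gamma(prior_shape, prior_scale, counts)

# If the claimed posterior is right, this difference is the same at every rate.
diff_a = (log_prior_times_likelihood(0.8, prior_shape, prior_scale, counts)
          - log_gamma_pdf(0.8, post_shape, post_scale))
diff_b = (log_prior_times_likelihood(2.5, prior_shape, prior_scale, counts)
          - log_gamma_pdf(2.5, post_shape, post_scale))

print("posterior shape and scale:", post_shape, post_scale)
print("differences agree:", abs(diff_a - diff_b) < 1e-9)
```

The constant difference is the log of the marginal probability of the data, which is the normalizing constant your written derivation should account for.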