MATH 472 – Bayesian Estimation Problems
You may use any of the results we have derived in class. Just be sure to state the result so that I
know you are using it.
1. The first step in testing a new cancer drug is to determine a safe dosage. This is done in what
are called Phase I trials. Any cancer drug will cause a toxicity – that is, an adverse reaction – in
some unknown proportion p of the patient population. In order to estimate p, researchers give
the new drug to a random sample of n patients, and observe the number k of toxicities. Suppose
we have a brand new drug, with no prior information on its toxicity. At the beginning of our
Phase I trials we give the drug to $n = 20$ patients, and observe $k = 6$ toxicities. Please answer
the following. You may want to include pictures from the Maple Worksheet made available to
you to show the graphs of the distributions you find.
(a) Find the posterior distribution of p given this information.
(b) Find a 90% credibility interval for p.
(c) We continue the Phase I testing by giving the drug to $m = 10$ new patients selected at
random. What is the predicted probability (using the posterior distribution in (a)) that the first of
these new patients will have a toxic reaction to the drug?
(d) Find the predicted probability (using the posterior distribution in (a)) that exactly j of the
new patients will experience toxicities.
(e) Suppose it now turns out that $j = 4$ of the 10 new patients experience toxicities. Now what is
the posterior distribution of p given all the results up to this point?
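
The conjugate updating behind Problem 1 can be sketched numerically. This is only an illustrative sketch, not the Maple Worksheet: it assumes that "no prior information" is modeled by a uniform Beta(1, 1) prior, which is one common choice among several, and the helper names (`update`, `predictive_pmf`, `beta_cdf`, `beta_quantile`) are hypothetical. The interval computed is an equal-tailed one obtained from a simple numerical integration of the Beta density.

```python
from math import comb, exp, lgamma, log


def log_beta_fn(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def update(a, b, k, n):
    """Beta(a, b) prior plus k toxicities in n patients gives Beta(a + k, b + n - k)."""
    return a + k, b + n - k


def predictive_pmf(j, m, a, b):
    """Beta-binomial probability of j toxicities among m future patients."""
    return comb(m, j) * exp(log_beta_fn(a + j, b + m - j) - log_beta_fn(a, b))


def beta_cdf(x, a, b, steps=2000):
    """Midpoint-rule approximation of the Beta(a, b) CDF for 0 < x <= 1."""
    h = x / steps
    norm = log_beta_fn(a, b)
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += exp((a - 1) * log(t) + (b - 1) * log(1 - t) - norm)
    return total * h


def beta_quantile(q, a, b):
    """Invert the approximate CDF by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


a0, b0 = 1, 1                            # ASSUMPTION: uniform prior for "no prior information"
a1, b1 = update(a0, b0, k=6, n=20)       # posterior after the first 20 patients: Beta(7, 15)
interval = (beta_quantile(0.05, a1, b1), beta_quantile(0.95, a1, b1))
p_first = predictive_pmf(1, 1, a1, b1)   # same as the posterior mean a1 / (a1 + b1)
pred = [predictive_pmf(j, 10, a1, b1) for j in range(11)]
a2, b2 = update(a1, b1, k=4, n=10)       # posterior after all 30 patients: Beta(11, 21)

print("posterior:", (a1, b1), "equal-tailed 90% interval:", interval)
print("P(first new patient has a toxicity):", p_first)
print("updated posterior:", (a2, b2))
```

Under a different prior the same functions apply; only `a0` and `b0` change.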
2. Suppose you have an underlying Bernoulli distribution with unknown probability of success
p. You repeatedly sample from the population. For the first sample you have a sample size of
n1 with y1 successes, for the second sample you have a sample size of n2 with y2 successes, etc.
You repeat this process $m$ times. Let $N = \sum_{i=1}^{m} n_i$ and $Y = \sum_{i=1}^{m} y_i$. Thus $N$ is the
cumulative number of trials and $Y$ is the cumulative number of successes.
(a) Starting with a Beta prior with parameters $\alpha$ and $\beta$, and updating your prior on the
next step to be the posterior from the previous step, find the posterior distribution of p after the
last sample has been collected and used to update the posterior (i.e. the posterior after you have
repeated this process m times).
(b) Find the mean of the distribution in part (a). Determine the limit as $N \to \infty$ of the mean.
Explain how this is related to a frequency approach to the estimation of the probability of success
p.
(c) Find the variance of the distribution in part (a). Determine the limit as $N \to \infty$ of the
variance. Explain why this makes sense.
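
A small numerical experiment can make the sequential updating in Problem 2 concrete. The sketch below is illustrative only: the sample sizes, success counts, and the Beta(2, 2) starting prior are hypothetical values chosen for the example, and the function names are not from the course materials. It compares updating one sample at a time against a single update with the cumulative totals, then scales the data up to show how the posterior mean and variance behave as the number of trials grows.

```python
def sequential_update(alpha, beta, samples):
    """Apply the conjugate Beta update once per (n_i, y_i) sample, in order."""
    for n_i, y_i in samples:
        alpha, beta = alpha + y_i, beta + (n_i - y_i)
    return alpha, beta


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def beta_variance(a, b):
    """Variance of a Beta(a, b) distribution."""
    return a * b / ((a + b) ** 2 * (a + b + 1))


samples = [(10, 3), (20, 7), (15, 4)]   # HYPOTHETICAL (n_i, y_i) pairs
alpha0, beta0 = 2, 2                    # HYPOTHETICAL prior parameters

N = sum(n for n, _ in samples)          # cumulative number of trials
Y = sum(y for _, y in samples)          # cumulative number of successes

step_by_step = sequential_update(alpha0, beta0, samples)
all_at_once = (alpha0 + Y, beta0 + N - Y)
print("sequential:", step_by_step, "batch:", all_at_once)

# Keep the success fraction Y/N fixed while the number of trials grows.
for scale in (1, 10, 100, 1000):
    bigger = [(n * scale, y * scale) for n, y in samples]
    a, b = sequential_update(alpha0, beta0, bigger)
    print(N * scale, beta_mean(a, b), beta_variance(a, b))
```

In this example the two routes give the same posterior parameters, and the printed means settle near the sample proportion Y/N while the variances shrink.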
3. Let $X_i$, $i = 1, \dots, n$, be a random sample from a Poisson distribution with mean $\lambda$, and
let $Y = \sum_{i=1}^{n} X_i$. Let the prior distribution of $\lambda$ be gamma with parameters $\alpha$ and
$\beta$. Show that the posterior distribution of $\lambda$ given the data from your random sample (i.e.
$X_i = x_i$, $i = 1, \dots, n$, and $y = \sum_{i=1}^{n} x_i$) is gamma with parameters $\alpha + y$ and
$\frac{1}{n + 1/\beta}$. Be sure to show and justify all steps in your derivation.
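
Before writing out the derivation for Problem 3, it can help to check the claimed result numerically. The sketch below is a sanity check, not a proof. It assumes the gamma distribution is parameterized by a shape (first parameter) and a scale (second parameter), which is the reading under which a posterior second parameter of 1 / (n + 1/scale) makes sense; the data and prior values are hypothetical. If the claim holds, the log of prior times likelihood minus the log of the claimed posterior density should be the same constant at every value of the Poisson mean.

```python
from math import lgamma, log


def gamma_logpdf(x, shape, scale):
    """Log density of a gamma distribution in the shape/scale parameterization."""
    return (shape - 1) * log(x) - x / scale - lgamma(shape) - shape * log(scale)


def poisson_loglik(mean, xs):
    """Log likelihood of independent Poisson counts xs with a common mean."""
    return sum(x * log(mean) - mean - lgamma(x + 1) for x in xs)


def log_ratio(mean, xs, shape, scale):
    """log(prior * likelihood) - log(claimed posterior), evaluated at `mean`."""
    n, y = len(xs), sum(xs)
    post_shape = shape + y
    post_scale = 1.0 / (n + 1.0 / scale)
    return (gamma_logpdf(mean, shape, scale)
            + poisson_loglik(mean, xs)
            - gamma_logpdf(mean, post_shape, post_scale))


xs = [2, 0, 3, 1, 4]             # HYPOTHETICAL observed counts
prior_shape, prior_scale = 2.0, 1.5   # HYPOTHETICAL prior parameters

values = [log_ratio(m, xs, prior_shape, prior_scale) for m in (0.5, 2.0, 7.3)]
print(values)   # the entries should agree up to floating-point rounding
```

The common value is the log of the normalizing constant (the marginal probability of the data), which is why it does not depend on the Poisson mean.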
4. Consider the Galton Board from which you have collected data. In class we discussed how
the proportion of balls landing in slots zero through six could be modeled using the binomial
distribution with parameters $n = 6$ and $p = 0.5$. To determine if this model is reasonable for the
Galton Board, you will use the Maple Worksheet made available to you and Bayesian Analysis.
Let $\theta$ be the probability a ball will land in slot 2, 3, or 4 (i.e. “success” is landing in slot 2, 3, or
4).
(a) Using the binomial model above, find a theoretical probability for $\theta$.
(b) Using the Maple worksheet, decide on an informative beta prior that reflects your certainty
about this theoretical probability. Clearly indicate the “spread” you entered on the Maple
worksheet and the alpha and beta values for this prior.
(c) Now use your data to find the posterior distribution of $\theta$.
(d) Lastly determine a 95% credibility region for $\theta$, the probability a ball will land in slot 2, 3,
or 4 (i.e. success is landing in slot 2, 3, or 4) based on your posterior distribution. Include a
graph of this in your solution (cut and paste the graph from Maple onto a document).
(e) Based on your results, do you believe the binomial model above is a good model for this
Galton Board? Why or why not?
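
As a rough cross-check on the Maple work in Problem 4, the model probability of landing in slot 2, 3, or 4 and a conjugate beta update can be computed directly. This is a sketch under stated assumptions: the prior parameters and the ball counts below are hypothetical placeholders, not the class data, and the prior is specified directly by its two beta parameters rather than through the worksheet's "spread" input, whose definition is not given here.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Probability of slot 2, 3, or 4 under the Binomial(6, 0.5) model: 50/64.
model_prob = sum(binomial_pmf(k, 6, 0.5) for k in (2, 3, 4))

prior_a, prior_b = 25.0, 7.0   # HYPOTHETICAL informative prior with mean 25/32
balls, hits = 200, 150         # HYPOTHETICAL data: 150 of 200 balls in slots 2-4

post_a = prior_a + hits
post_b = prior_b + (balls - hits)
post_mean = post_a / (post_a + post_b)

print("model probability:", model_prob)
print("posterior parameters:", (post_a, post_b), "posterior mean:", post_mean)
```

Whether the model probability falls inside the credibility region from the real posterior is the comparison that part (e) asks about.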