Special Topic: Bayesian Finite Population Survey Sampling

Sudipto Banerjee
Division of Biostatistics, School of Public Health, University of Minnesota
April 22, 2010

Overview

- Scientific survey sampling (Neyman, 1934) represents one of statistics' greatest contributions to science.
- It provides the remarkable ability to obtain useful inferences about large populations from modest samples, with measurable uncertainty.
- It forms the backbone of data collection in science and government.
- Traditional (and widely favoured) approach: randomization-based.
- Advocating Bayes for survey sampling is like "swimming upstream": modelling assumptions of any kind are anathema here, let alone the priors and further subjectivity that Bayes brings along!

Fundamental Notions

- For a population with $N$ units, let $Y = (Y_1, \ldots, Y_N)$ denote the population variable.
- $Y_i$ is the value of unit $i$ in the population.
- Let $I = (I_1, \ldots, I_N)$ be the set of inclusion indicator variables: $I_i = 1$ if unit $i$ is selected in the sample; $I_i = 0$ otherwise.
- Let $I'$ denote the complement of $I$, obtained by "switching" the 1's to 0's and 0's to 1's in $I$. Thus $I'$ represents the exclusion indicator variables, indexing the units not selected in the sample.

Simple random sampling

- Simple Random Sampling With Replacement (SRSWR): we draw a unit from the population, put it back, and draw again.
- Simple Random Sampling Without Replacement (SRSWOR): we draw a unit from the population, keep it aside, and draw again.
- SRSWR leads to the binomial distribution; the chances of a unit being drawn remain the same between draws.
- SRSWOR leads to the hypergeometric distribution; the chances of a unit being drawn change from draw to draw.
- Key questions: What are the random variables? What are the parameters? (A small simulation below illustrates the two schemes.)
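The contrast between the two schemes is easy to check empirically. Here is a minimal simulation sketch (our own, not from the slides) using NumPy: under SRSWR the number of times a fixed unit appears in the sample behaves like a $\mathrm{Bin}(n, 1/N)$ count, while under SRSWOR each unit's inclusion indicator is Bernoulli with success probability $n/N$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, reps = 50, 10, 20_000  # illustrative population/sample sizes (our choice)

# SRSWR: count how often unit 0 appears in each replicated sample of size n
wr_counts = np.array([np.sum(rng.integers(0, N, size=n) == 0) for _ in range(reps)])
print("SRSWR  E[W_0]:", wr_counts.mean(), "(theory n/N =", n / N, ")")

# SRSWOR: inclusion indicator of unit 0 in each replicated sample
wor_incl = np.array([0 in rng.choice(N, size=n, replace=False) for _ in range(reps)])
print("SRSWOR E[I_0]:", wor_incl.mean(), "(theory n/N =", n / N, ")")
```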
Estimating the population mean

- Let $Y_I$ be the collection of units selected in the sample; we often write $Y_I$ as $y = (y_1, \ldots, y_n)$.
- IMPORTANT: the lower-case $y_i$'s simply index the sample; thus $y_i$ does not necessarily represent $Y_i$, but rather the $i$-th member of the sample.
- Let $\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$ be the population mean.
- Let $\bar{Y}_I = \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ be the sample mean, i.e. the mean of $Y_I$.
- Let $Y_{I'}$ be the collection of units excluded from the sample; let $\bar{Y}_{I'}$ be the mean of $Y_{I'}$.
- Then $\bar{Y} = f\bar{y} + (1-f)\bar{Y}_{I'}$, where $f = n/N$ is the sampling fraction.
- Key questions answered: What are the random variables? Answer: the $I_i$'s. What are the parameters? Answer: the $Y_i$'s.
- For SRSWOR: $E[I_i] = P(I_i = 1) = \binom{N-1}{n-1}\big/\binom{N}{n} = \frac{n}{N}$ for each $i$.
- Unbiasedness of the sample mean $\bar{y}$: write $\bar{y} = \frac{1}{n}\sum_{i=1}^{N} I_i Y_i$ and see:
$$E[\bar{y}] = \frac{1}{n}\sum_{i=1}^{N} E[I_i]\, Y_i = \frac{1}{n}\sum_{i=1}^{N} \frac{n}{N}\, Y_i = \bar{Y}.$$
- For SRSWR: $\bar{y} = \frac{1}{n}\sum_{i=1}^{N} W_i Y_i$, where $W_i$ is the number of times unit $i$ appears in the sample. Then $W_i \sim \mathrm{Bin}(n, \frac{1}{N})$, so $E[W_i] = \frac{n}{N}$, hence
$$E[\bar{y}] = \frac{1}{n}\sum_{i=1}^{N} E[W_i]\, Y_i = \frac{1}{n}\sum_{i=1}^{N} \frac{n}{N}\, Y_i = \bar{Y}.$$
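The unbiasedness argument above is easy to verify by simulation. A small sketch (ours, not from the slides): hold a fixed population $Y$ and average SRSWOR sample means over many replications.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n, reps = 200, 25, 50_000
Y = rng.gamma(shape=2.0, scale=3.0, size=N)  # an arbitrary fixed population (our choice)

# average of ybar over repeated SRSWOR draws should approach Ybar
means = np.array([Y[rng.choice(N, size=n, replace=False)].mean() for _ in range(reps)])
print("average of SRSWOR sample means:", means.mean())
print("population mean Ybar:         ", Y.mean())
```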
Bayesian modelling

- We need a prior for each population unit $Y_i$.
- Suppose $Y_i \overset{iid}{\sim} N(\mu, \sigma^2)$.
- Let us first assume $\sigma^2$ is known and $P(\mu) \propto 1$.
- What is "seen"? $D = (Y_I, I)$.
- Note that $P(D \mid \mu, \sigma^2) = P(I)\, P(Y_I \mid \mu, \sigma^2) \propto P(Y_I \mid \mu, \sigma^2)$; so the likelihood is a product of $n$ independent $N(\mu, \sigma^2)$ densities.
- We need the posterior distribution
$$P(\bar{Y} \mid D, \sigma^2) = \int P(\bar{Y} \mid D, \mu, \sigma^2)\, P(\mu \mid D, \sigma^2)\, d\mu.$$
- Let us take a closer look at $P(\bar{Y} \mid D, \sigma^2)$. Observe that $\bar{Y} = f\bar{y} + (1-f)\bar{Y}_{I'}$. Given $D$, $\bar{y}$ is fixed; so the posterior distribution of $\bar{Y}_{I'}$ will determine the posterior distribution of $\bar{Y}$.
- So, what is the posterior distribution $P(\bar{Y}_{I'} \mid D, \sigma^2)$?
$$P(\bar{Y}_{I'} \mid D, \sigma^2) = \int P(\bar{Y}_{I'} \mid \mu, \sigma^2)\, P(\mu \mid D, \sigma^2)\, d\mu,$$
where we use $P(\bar{Y}_{I'} \mid D, \mu, \sigma^2) = P(\bar{Y}_{I'} \mid \mu, \sigma^2)$. Note $P(\bar{Y}_{I'} \mid \mu, \sigma^2) = N\!\left(\mu, \frac{\sigma^2}{N(1-f)}\right)$.
- Can we simplify this posterior without integrating?
- What is the posterior distribution $P(\mu \mid D, \sigma^2)$? Note that $P(\mu \mid D, \sigma^2) \propto P(D \mid \mu, \sigma^2) \implies P(\mu \mid D, \sigma^2) = N\!\left(\bar{y}, \frac{\sigma^2}{n}\right)$.
- Conditional on $D$ and $\sigma^2$, we can write:
$$\bar{Y} = f\bar{y} + (1-f)\bar{Y}_{I'}; \qquad \bar{Y}_{I'} = \mu + u_1, \quad u_1 \sim N\!\left(0, \frac{\sigma^2}{N(1-f)}\right)$$
$$\Rightarrow\; \bar{Y} = f\bar{y} + (1-f)\mu + (1-f)u_1.$$
- Also, conditional on $D$ and $\sigma^2$, $\mu = \bar{y} + u_2$, with $u_2 \sim N\!\left(0, \frac{\sigma^2}{n}\right)$.
- Then $\bar{Y} = \bar{y} + (1-f)u_2 + (1-f)u_1$, so the posterior $P(\bar{Y} \mid D, \sigma^2)$ is
$$N\!\left(\bar{y},\, (1-f)\frac{\sigma^2}{n}\right).$$
- Now let $\sigma^2$ be unknown, with prior $P(\mu, \sigma^2) \propto \frac{1}{\sigma^2}$. Then we reproduce the classical result:
$$\frac{\bar{Y} - \bar{y}}{\sqrt{\frac{s^2}{n}(1-f)}} \sim t_{n-1}, \qquad \text{where } s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2 \text{ is the sample variance.}$$
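The variance $(1-f)\frac{\sigma^2}{n}$ in the known-$\sigma^2$ posterior above follows in one line from the error decomposition (a quick check we add here): $u_1$ and $u_2$ are independent, and $\frac{\sigma^2}{N} = f\,\frac{\sigma^2}{n}$, so
$$\operatorname{Var}(\bar{Y} \mid D, \sigma^2) = (1-f)^2\frac{\sigma^2}{n} + (1-f)^2\frac{\sigma^2}{N(1-f)} = (1-f)\frac{\sigma^2}{n}\bigl[(1-f) + f\bigr] = (1-f)\frac{\sigma^2}{n}.$$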
Bayesian computation

- Note that the marginal posterior distribution is $P(\sigma^2 \mid D) = IG\!\left(\frac{n-1}{2}, \frac{(n-1)s^2}{2}\right)$, where $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2$ is the sample variance.
- We can perform exact posterior sampling as follows (implemented in the sketch below). For $l = 1, \ldots, L$ do the following:
  - Draw $\sigma^2_{(l)} \sim IG\!\left(\frac{n-1}{2}, \frac{(n-1)s^2}{2}\right)$.
  - Draw $\mu_{(l)} \sim N\!\left(\bar{y}, \frac{\sigma^2_{(l)}}{n}\right)$.
  - Draw $\bar{Y}_{I'}^{(l)} \sim N\!\left(\mu_{(l)}, \frac{\sigma^2_{(l)}}{N(1-f)}\right)$.
  - Set $\bar{Y}^{(l)} = f\bar{y} + (1-f)\bar{Y}_{I'}^{(l)}$.
- The resulting collection $\{\bar{Y}^{(l)}\}_{l=1}^{L}$ is a sample from the posterior distribution of the population mean.
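A minimal NumPy implementation of this sampler (our sketch; the function name and defaults are ours). The inverse-gamma draw is obtained as the scale parameter divided by a unit-scale gamma variate.

```python
import numpy as np

def posterior_mean_draws(y, N, L=10_000, seed=0):
    """Exact posterior draws of the population mean Ybar from SRS data y
    (population size N), under the prior p(mu, sigma^2) propto 1/sigma^2."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n = y.size
    f = n / N
    ybar, s2 = y.mean(), y.var(ddof=1)

    # sigma^2 ~ IG((n-1)/2, (n-1)s^2/2), via the reciprocal of a Gamma draw
    a, b = (n - 1) / 2.0, (n - 1) * s2 / 2.0
    sigma2 = b / rng.gamma(a, 1.0, size=L)

    mu = rng.normal(ybar, np.sqrt(sigma2 / n))                   # mu | sigma^2, D
    ybar_excl = rng.normal(mu, np.sqrt(sigma2 / (N * (1 - f))))  # Ybar_{I'} | mu, sigma^2
    return f * ybar + (1 - f) * ybar_excl                        # Ybar draws
```

As a check, the standardized draws `(draws - y.mean()) / np.sqrt(s2 / n * (1 - f))` should be approximately $t_{n-1}$, matching the classical result on the previous slide.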
Homework

1. A simple random sample of 30 households was drawn from a township containing 576 households. The numbers of persons per household in the sample were as follows: 5, 6, 3, 3, 2, 3, 3, 3, 4, 4, 3, 2, 7, 4, 3, 5, 4, 4, 3, 3, 4, 3, 3, 1, 2, 4, 3, 4, 2, 4. Use a non-informative Bayesian analysis to estimate the total number of people in the area, and compute the posterior probability that the population total lies within 10% of the sample estimate.

2. From a list of 468 small 2-year colleges in the North-Eastern United States, a simple random sample of 100 colleges was drawn. Data for the number of students ($y$) and the number of teachers ($x$) for these colleges were summarized as follows: the total number of students in the sample was 44,987; the total number of teachers in the sample was 3,079. Also given are the sample sums of squares $\sum_{i=1}^{n} y_i^2 = 29{,}881{,}219$ and $\sum_{i=1}^{n} x_i^2 = 111{,}090$. Assuming a non-informative Bayesian setting in which the student and teacher populations are independent, find the posterior mean and a 95% credible interval for the student-teacher ratio in the population (i.e. all 468 colleges combined).
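For the first problem, the sampler sketched above applies directly, since the population total is $T = N\bar{Y}$. One possible starting point (our sketch, not a supplied solution; it reuses `posterior_mean_draws` from the previous code block):

```python
households = [5, 6, 3, 3, 2, 3, 3, 3, 4, 4, 3, 2, 7, 4, 3,
              5, 4, 4, 3, 3, 4, 3, 3, 1, 2, 4, 3, 4, 2, 4]
N = 576
totals = N * posterior_mean_draws(households, N, L=200_000)  # T = N * Ybar
estimate = N * np.mean(households)                           # sample-based estimate of T
print("posterior mean of the total:", totals.mean())
print("P(T within 10% of estimate):",
      np.mean(np.abs(totals - estimate) <= 0.10 * estimate))
```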
Missing Data Inference

- A general theory for Bayesian missing data analysis.
- Let $y = (y_1, \ldots, y_N)$ be the units in the population. We assume a likelihood for this population, say $p(y \mid \theta)$, where $\theta$ are the parameters in the model.
- Let $I = (I_1, \ldots, I_N)$ be the set of "inclusion indicators".
- Let us denote by "obs" the index set $\{i : I_i = 1\}$ and by "mis" the index set $\{i : I_i = 0\}$.
- So $y_{obs}$ is the set of observed data points and $y_{mis}$ is the unobserved data. We will often write $y = (y_{obs}, y_{mis})$.

Building the models

- We will break the joint probability model into two parts:
  - the model for the "complete data", $p(y \mid \theta)$ – including observed and missing data;
  - the model for the inclusion vector $I$, say $p(I \mid y, \phi)$.
- We define the complete likelihood as $p(y, I \mid \theta, \phi) = p(y \mid \theta)\, p(I \mid y, \phi)$.
- Parameters: scientific interest usually revolves around estimation of $\theta$ (sometimes called the "super-population" parameters); the parameters $\phi$ index the missingness model (often called "design" parameters).
- The "complete likelihood" is not the data likelihood, as it involves the missing data as well. The actual information available is $(y_{obs}, I)$, so the data likelihood is
$$p(y_{obs}, I \mid \theta, \phi) = \int p(y, I \mid \theta, \phi)\, dy_{mis}.$$
- The joint posterior distribution is
$$p(\theta, \phi \mid y_{obs}, I) \propto p(\theta, \phi)\, p(y_{obs}, I \mid \theta, \phi) \propto p(\theta, \phi) \int p(y \mid \theta)\, p(I \mid y, \phi)\, dy_{mis}.$$
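One step worth making explicit here (our addition, following the standard ignorability argument): if the design does not depend on $y$, i.e. $p(I \mid y, \phi) = p(I \mid \phi)$, and the priors factor as $p(\theta, \phi) = p(\theta)\,p(\phi)$, then
$$p(\theta, \phi \mid y_{obs}, I) \propto p(\theta)\,p(\phi)\,p(I \mid \phi) \int p(y \mid \theta)\, dy_{mis} = \bigl[p(\theta)\,p(y_{obs} \mid \theta)\bigr] \times \bigl[p(\phi)\,p(I \mid \phi)\bigr],$$
so $p(\theta \mid y_{obs}, I) \propto p(\theta)\,p(y_{obs} \mid \theta)$: the design is ignorable for inference on $\theta$. This is exactly what happened in the normal-model example, where $P(D \mid \mu, \sigma^2) \propto P(Y_I \mid \mu, \sigma^2)$.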
Building the models

- These integrals are evaluated by drawing samples $(y_{mis}, \theta, \phi)$ from $p(y_{mis}, \theta, \phi \mid y_{obs}, I)$.
- Typically, we first compute samples $(\theta^{(l)}, \phi^{(l)})$ from $p(\theta, \phi \mid y_{obs}, I)$.
- Then we draw $y_{mis}^{(l)} \sim p(y_{mis} \mid y_{obs}, I, \theta^{(l)}, \phi^{(l)})$.
- Computations often simplify from assumptions such as:
  - $p(I \mid y, \phi) = p(I \mid y_{obs}, \phi)$;
  - $p(I \mid y, \phi) = p(I \mid \phi) = p(I)$;
  - $p(\phi \mid \theta) = p(\phi)$;
  - $p(y_{mis} \mid \theta^{(l)}, y_{obs}) = p(y_{mis} \mid \theta^{(l)})$.
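Under the last two simplifications, the two-stage scheme reduces to: draw $\theta^{(l)}$ from $p(\theta \mid y_{obs})$, then impute $y_{mis}^{(l)} \sim p(y_{mis} \mid \theta^{(l)})$. Below is a minimal sketch for the iid normal model with the flat prior used earlier (our construction; the function name is ours).

```python
import numpy as np

def impute_population(y_obs, N, L=5_000, seed=2):
    """Two-stage composition sampling for the iid N(mu, sigma^2) model under an
    ignorable design: draw theta | y_obs, then impute y_mis | theta."""
    rng = np.random.default_rng(seed)
    y_obs = np.asarray(y_obs, dtype=float)
    n = y_obs.size
    ybar, s2 = y_obs.mean(), y_obs.var(ddof=1)
    totals = np.empty(L)
    for l in range(L):
        sigma2 = ((n - 1) * s2 / 2.0) / rng.gamma((n - 1) / 2.0)  # sigma^2 | y_obs
        mu = rng.normal(ybar, np.sqrt(sigma2 / n))                # mu | sigma^2, y_obs
        y_mis = rng.normal(mu, np.sqrt(sigma2), size=N - n)       # y_mis | theta
        totals[l] = y_obs.sum() + y_mis.sum()                     # a function of (y_obs, y_mis)
    return totals
```

Any function of the completed population $(y_{obs}, y_{mis})$ then has posterior draws; for the population mean, this reproduces the exact sampler given earlier, since $\frac{1}{N}\bigl(\sum y_{obs} + \sum y_{mis}\bigr) = f\bar{y} + (1-f)\bar{Y}_{I'}$.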