Bayes for Beginners
Reverend Thomas Bayes (1702-61)
Velia Cardin
Marta Garrido
http://www.fil.ion.ucl.ac.uk/spm/software/spm2/
“In addition to WLS estimators and classical inference, SPM2 also supports Bayesian estimation and inference. In this instance the statistical parametric maps become Posterior Probability Maps (PPMs), where the posterior probability is a probability of an effect given the data. There is no multiple comparison problem in Bayesian inference and the posterior probabilities do not require adjustment.”
Overview
Velia
Bayes vs Frequentist approach
An example
Bayes’ theorem
Marta
Bayesian Inference
Posterior Probability Distribution
Bayes in SPM
Summary
Frequentist statistics vs Bayesian statistics
In general, we want to relate an event (E) to a hypothesis (H)
…and the probability of E given H
With a frequentist approach….
We obtain a p-value: the probability, given a true H0, of an outcome as extreme as or more extreme than the observed outcome.
If the p-value is sufficiently small, you reject the null hypothesis, but…
it doesn’t say anything about the probability of Hi.
The frequentist conclusion is restricted to the data at hand; it doesn’t take into account previous, valuable information.
In general, we want to relate an event (E) to a hypothesis (H)
and the probability of E given H
With a Bayesian approach…
The probability of H being true is determined.
A probability distribution of the parameter or hypothesis is obtained.
You can compare the probabilities of different hypotheses H for the same E.
Conclusions depend on previous evidence. The Bayesian approach is not data analysis per se; it brings different types of evidence to bear on the questions of importance.
Given a prior state of knowledge or belief, it tells us how to update beliefs based upon observations (current data).
Dana Plato died of a drug
overdose at age 34
Todd Bridges on suspicion of
shooting and stabbing alleged
drug dealer in a crack house. ...
Macaulay Culkin
Busted for Drugs!
Feldman, arrested
and charged with
heroin possession
Corey Haim in a spiral of
prescription drug abuse!
Our observations….
DREW BARRYMORE REVEALS
ALCOHOL AND DRUG PROBLEMS
STARTED AGED EIGHT
Our hypothesis is: “Young actors have a higher probability of becoming drug addicts.”
We took a random sample of 40 people; 10 of them were young stars, 3 of whom were addicted to drugs. Of the other 30, just one was.
Drug-addicted (D+) vs not drug-addicted (D-):

                     D+    D-   Total
Young actors (YA+)    3     7      10
Control (YA-)         1    29      30
Total                 4    36      40
With a frequentist approach we will test:
Hi: ‘Conditions A and B have different effects’ (young actors have a different probability of becoming drug addicts than the rest of the people)
H0: ‘There is no difference in the effect of conditions A and B’
The statistical test of choice is χ² with Yates’ correction:
χ² = 3.33
p = 0.07
We can’t reject the null hypothesis, and the information the p-value gives us is basically that if we “did this experiment” many times, 7% of the time we would obtain a result like this if there were no difference between the two conditions.
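As a quick check, the χ² and p-value above can be reproduced in a few lines of Python; this is a minimal sketch assuming scipy is available (the slides themselves contain no code):

from scipy.stats import chi2_contingency

# 2x2 contingency table from the example:
#                D+   D-
# young actors    3    7
# control         1   29
table = [[3, 7],
         [1, 29]]

# correction=True applies Yates' continuity correction to the 2x2 table
chi2, p, dof, expected = chi2_contingency(table, correction=True)
print(f"chi2 = {chi2:.2f}, p = {p:.2f}")  # chi2 = 3.33, p = 0.07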
This is not what we want to know!!!
…and we have strong beliefs that young actors have a higher probability of becoming drug addicts!!!
With a Bayesian approach…
We want to know whether p(D+ | YA+) > p(D+ | YA-).

          D+           D-            Total
YA+       3 (0.075)    7 (0.175)     10 (0.25)
YA-       1 (0.025)    29 (0.725)    30 (0.75)
Total     4 (0.1)      36 (0.9)      40 (1)

p(D+ | YA+) = p(D+ and YA+) / p(YA+) = 0.075 / 0.25 = 0.3
p(D+ | YA-) = p(D+ and YA-) / p(YA-) = 0.025 / 0.75 = 0.033
Since 0.3 > 0.033, p(D+ | YA+) > p(D+ | YA-).
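These conditional probabilities follow directly from the joint proportions in the table; a tiny Python sketch (variable names are illustrative, not from the slides):

# joint and marginal probabilities from the 40-person sample
p_Dpos_and_YApos = 3 / 40   # 0.075
p_Dpos_and_YAneg = 1 / 40   # 0.025
p_YApos = 10 / 40           # 0.25
p_YAneg = 30 / 40           # 0.75

# conditional probabilities p(D+ | YA+) and p(D+ | YA-)
p_Dpos_given_YApos = p_Dpos_and_YApos / p_YApos   # 0.3
p_Dpos_given_YAneg = p_Dpos_and_YAneg / p_YAneg   # 0.033...
print(p_Dpos_given_YApos > p_Dpos_given_YAneg)    # True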
Reformulating,
p(D+ and YA+) = p(YA+ | D+) * p(D+)
Substituting p(D+ and YA+) into
p(D+ | YA+) = p(D+ and YA+) / p(YA+)
gives
p(D+ | YA+) = p(YA+ | D+) * p(D+) / p(YA+)
Note that p(D+ | YA+) ≠ p(YA+ | D+):
p(YA+ | D+) = p(D+ and YA+) / p(D+) = 0.075 / 0.1 = 0.75
and 0.3 ≠ 0.75.
This is Bayes’ Theorem !!!
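As a numerical check, the theorem reproduces the conditional probability computed directly above (again an illustrative sketch):

p_Dpos = 4 / 40                          # 0.1
p_YApos = 10 / 40                        # 0.25
p_YApos_given_Dpos = (3 / 40) / p_Dpos   # 0.75

# Bayes' theorem: p(D+ | YA+) = p(YA+ | D+) * p(D+) / p(YA+)
print(p_YApos_given_Dpos * p_Dpos / p_YApos)   # 0.3, same as the direct calculation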
Bayes’ Theorem for a given parameter θ:
p(θ | data) = p(data | θ) p(θ) / p(data)
1/p(data) is basically a normalizing constant.
Posterior ∝ likelihood × prior
The prior is the probability of the parameter and represents what was
thought before seeing the data.
The likelihood is the probability of the data given the parameter and
represents the data now available.
The posterior represents what is thought given both prior information
and the data just seen.
It relates the conditional density of a parameter (the posterior probability) to its unconditional density (the prior, since it depends only on information present before the experiment).
In fMRI….
• Classical: ‘What is the likelihood of getting these data given no activation occurred?’
• Bayesian (SPM2): ‘What is the chance of getting these parameters, given these data?’
Bayesian Inference
In order to make probability statements about θ given y, we begin with a model providing the joint probability distribution
p(θ, y) = p(θ) p(y | θ)
from which
p(θ | y) = p(θ, y) / p(y) = p(θ) p(y | θ) / p(y)
where
p(y) = Σθ p(θ) p(y | θ)   (discrete case)
or
p(y) = ∫ p(θ) p(y | θ) dθ   (continuous case)
In words: posterior ∝ prior × likelihood, i.e.
p(θ | y) ∝ p(θ) p(y | θ)
What you know about the model after the data arrive, p(θ | y), is what you knew before, p(θ), and what the data told you, p(y | θ).
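A minimal sketch of these formulas on a discrete grid of parameter values, using a hypothetical coin-bias example and assuming numpy (none of this appears on the slides); the normalizing constant p(y) is just the sum over the grid:

import numpy as np

# hypothetical data: y = 7 heads in 10 coin flips; theta = probability of heads
theta = np.linspace(0.01, 0.99, 99)        # discrete grid over the parameter
prior = np.ones_like(theta) / theta.size   # flat prior p(theta)
likelihood = theta**7 * (1 - theta)**3     # p(y | theta), binomial kernel

p_y = np.sum(prior * likelihood)           # p(y) = sum over theta of p(theta) p(y | theta)
posterior = prior * likelihood / p_y       # p(theta | y), sums to 1

print(posterior.sum())                     # 1.0
print(theta[np.argmax(posterior)])         # posterior mode, about 0.7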
Posterior Probability Distribution
precision λ = 1/σ²
Likelihood: p(y | θ) = N(Md, λd⁻¹)
Prior: p(θ) = N(Mp, λp⁻¹)
Posterior: p(θ | y) ∝ p(y | θ) p(θ) = N(Mpost, λpost⁻¹)
λpost = λd + λp
Mpost = (λd Md + λp Mp) / λpost
[Figure: prior N(Mp, λp⁻¹), likelihood N(Md, λd⁻¹) and posterior N(Mpost, λpost⁻¹) plotted over θ.]
The effects of different precisions (figure panels): λp = λd; λp < λd; λp > λd; λp ≈ 0.
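A small sketch of this precision-weighted combination in plain Python (the numbers are illustrative only, not from the slides):

def gaussian_posterior(m_d, l_d, m_p, l_p):
    """Combine a Gaussian likelihood N(m_d, 1/l_d) with a Gaussian prior N(m_p, 1/l_p)."""
    l_post = l_d + l_p                         # posterior precision
    m_post = (l_d * m_d + l_p * m_p) / l_post  # precision-weighted mean
    return m_post, l_post

m_d, m_p = 2.0, 0.0   # likelihood (data) mean and prior mean

print(gaussian_posterior(m_d, l_d=1.0, m_p=m_p, l_p=1.0))   # lp = ld : mean halfway between
print(gaussian_posterior(m_d, l_d=1.0, m_p=m_p, l_p=0.25))  # lp < ld : mean pulled towards the data
print(gaussian_posterior(m_d, l_d=1.0, m_p=m_p, l_p=4.0))   # lp > ld : mean pulled towards the prior
print(gaussian_posterior(m_d, l_d=1.0, m_p=m_p, l_p=1e-9))  # lp ≈ 0  : posterior ≈ likelihood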
Multivariate Distributions
Bayes in SPM
SPM uses priors for estimation in…
• spatial normalization
• segmentation
…and Bayesian inference in…
• Posterior Probability Maps (PPMs)
• Dynamic Causal Modelling (DCM)
Shrinkage Priors
[Figure: four cases: large, variable effect; small, variable effect; large, consistent effect; small, consistent effect.]
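Shrinkage can be illustrated with the same Gaussian combination as above: a zero-mean prior pulls the estimated effect towards zero, and the pull is strongest for variable (low-precision) effects. An illustrative sketch, not SPM's actual estimation code:

l_p = 1.0   # precision of a zero-mean shrinkage prior (illustrative value)

for label, m_d, l_d in [("large, consistent effect", 3.0, 4.0),
                        ("large, variable effect",   3.0, 0.25),
                        ("small, consistent effect", 0.5, 4.0),
                        ("small, variable effect",   0.5, 0.25)]:
    m_post = (l_d * m_d) / (l_d + l_p)   # prior mean is 0, so only the data term remains
    print(f"{label}: {m_d} shrunk to {m_post:.2f}")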
Thresholding
p(θ > γ | y) = 0.95
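For a Gaussian posterior this exceedance probability can be read off the normal survival function; a sketch assuming scipy and purely illustrative numbers:

from math import sqrt
from scipy.stats import norm

m_post, l_post = 1.2, 4.0   # illustrative posterior mean and precision
gamma = 0.5                 # effect-size threshold γ

# p(theta > gamma | y) under a N(m_post, 1/l_post) posterior
p_exceed = norm.sf(gamma, loc=m_post, scale=1 / sqrt(l_post))
print(p_exceed)             # report the effect if this exceeds 0.95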
Summary
Bayesian methods use probability models for quantifying
uncertainty in inferences based on statistical data analysis.
In Bayesian estimation we…
1. …start with the formulation of a model that we hope is adequate to describe the situation of interest, with priors over the parameters.
2. …observe the data; when the information available changes, it is necessary to update the degrees of belief (probability), obtaining posterior distributions.
3. …evaluate the fit of the model. If necessary we compute predictive distributions for future observations, and the posteriors serve as new priors over the parameters.
Prejudices or scientific judgment?
The selection of a prior is subjective and arbitrary.
It is reasonable to draw conclusions in the light of some reason.
References
• http://www.stat.ucla.edu/history/essay.pdf (Bayes’ original essay!!!)
• http://www.cs.toronto.edu/~radford/res-bayes-ex.html
• http://www.gatsby.ucl.ac.uk/~zoubin/bayesian.html
• Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian Data Analysis, 2nd ed. Chapman & Hall/CRC.
• MacKay D. Information Theory, Inference and Learning Algorithms. Chapter 37: Bayesian inference and sampling theory. Cambridge University Press, 2003.
• Berry D, Stangl D. Bayesian Methods in Health-Related Research. In: Bayesian Biostatistics. Berry D and Stangl D (eds). Marcel Dekker Inc, 1996.
• Friston KJ, Penny W, Phillips C, Kiebel S, Hinton G, Ashburner J. Classical and Bayesian inference in neuroimaging: theory. Neuroimage. 2002 Jun;16(2):465-83.
Bayes for Beginners
Reverend Thomas Bayes (1702-61)
“We don’t see what we don’t seek.”
E. M. Forster
…good-bayes!!!