Preparatory Statistics
• Beta Handbook: all (most) distributions
• Wikibook in statistics: http://en.wikibooks.org/Statistics
• MIT open course, EE&CS: Introductory Probability and Statistics
• Mainly as lookup literature

Kolmogorov Axioms

Fundamentals of Bayes' method
$f(\theta \mid D) \propto P(D \mid \theta)\, f(\theta)$
• $D$: observations from a simple or complex observation space
• $\theta$: state or parameter from the state or parameter space $\Theta$
• $P(D \mid \theta)$: likelihood function, the probability that $D$ is generated in state $\theta$
• $f(\theta)$: prior, what we know about the state
• $f(\theta \mid D)$: posterior, what we know after seeing $D$
• The sign between the left and right parts is proportionality: multiply the right part by a constant $c$ that normalises the left part so that it integrates to 1 over $\Theta$.

State space alternatives
• Discrete, two or more states (e.g., diseased, non-diseased)
• Continuous, e.g., an interval of reals
• High-dimensional, a vector of reals (e.g., target position and velocity)
• Composite, a vector plus a label (e.g., missile type x at point p with velocity v)
• Product space: vector of non-fixed dimension (e.g., one or more targets approaching, one or more disease-causing genes)
• Voxel set (medical imaging, atmosphere mapping)

Testing for disease
• State space (d, n): disease or not
• Observation space (P, N): positive or negative
• Prior: 0.1% of the population has the disease, so Prior = [0.001, 0.999]
• Likelihood: the test gives 5% false negatives and 10% false positives, so, with entries ordered as (d, n), P: [0.95, 0.10], N: [0.05, 0.90]
• Combining prior and likelihood:
  Prior.*P = [0.00095, 0.0999], normalised to approximately [0.01, 0.99]
  Prior.*N = [0.00005, 0.8991], normalised to approximately [0.0001, 0.9999]

Deciding target type
• Attack aircraft: small, dynamic
• Bomber aircraft: large, dynamic
• Civilian: large, slow dynamics
• Prior: (0.5, 0.4, 0.1)
• Observer 1: probably small, likelihood (0.8, 0.1, 0.1)
• Observer 2: probably fast, likelihood (0.4, 0.4, 0.2)

Target classification, Matlab
>> prior=[0.5,0.4,0.1];
>> lik1=[0.8,0.1,0.1];
>> lik2=[0.4,0.4,0.2];
>> post1=prior.*lik1; post1=post1/sum(post1)
post1 =
    0.8889    0.0889    0.0222
>> post2=prior.*lik2; post2=post2/sum(post2)
post2 =
    0.5263    0.4211    0.0526
>> post12=post1.*lik2; post12=post12/sum(post12)
post12 =
    0.8989    0.0899    0.0112

Odds and Bayes' Factor
Odds for A against B is P(A|D)/P(B|D):
>> OddsCiv=post12(3)/(1-post12(3))   % Civilian vs not
OddsCiv =
    0.0114
>> OddsAtt=post12(1)/(1-post12(1))   % Attack vs not
OddsAtt =
    8.8889
In the first case the odds are conveniently low: not C. In the second case they are high: probably A.

Inference on a probability
• Bayes' original problem: estimating the success probability p from an experiment with f failures and s successes, n=s+f
• The prior is the uniform distribution for p
• In particular s=9, n=12
• Likelihood: $p^{9}(1-p)^{3}$

Estimating a probability
>> p=[0:0.01:1];
>> likh=p.^9.*(1-p).^3;
>> posterior=likh/sum(likh);
>> plot(p,posterior)
>> print -depsc beta
NOTE: In the Lecture notes example, s and f are swapped and the computation is analytic instead of numeric!
>> postcum=cumsum(posterior);
>> plot(p,postcum,'b-',[0,1],[0.025 0.025],...
        'r-',[0,1],[0.975 0.975],'r-');

95% credible interval
The 95% credible interval for p is [0.46, 0.91]; in other words, fairness is not rejected.
Estimate p by the posterior mean:
>> sum(posterior.*p)
ans =
    0.7143
>> postcum([50:51])   % cumulative posterior at p=0.49 and p=0.50
ans =
    0.0427    0.0497

Is the coin balanced (LN 2.1.10)?
• Use the outcome D:(s,f) of flipping n=s+f times
• Evaluate using two models: H_r, where the probability is 0.5, and H_u, where it is uniformly distributed over [0,1]
• $P(D{:}(s,f) \mid H_r) = 2^{-n}$
• $P(D{:}(s,f) \mid H_u) = s!\,f!/(n+1)!$ (the normalization in the Beta distribution)
• For s=3, f=9, the Bayes factor $P(D \mid H_u)/P(D \mid H_r) \approx 1.4$, or, with equal prior probabilities for the two models, $P(H_r \mid D) \approx 0.42$ and $P(H_u \mid D) \approx 0.58$
HW 1

Is the coin balanced (LN 2.1.10)?
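The two model likelihoods and the posterior model probabilities quoted for the coin can be connected by a short derivation. This is a sketch, assuming equal prior probabilities for H_r and H_u; the symbol K for the Bayes factor is introduced here for illustration and is not taken from the slides.

```latex
\begin{align*}
P(D \mid H_r) &= \left(\tfrac{1}{2}\right)^{s}\left(\tfrac{1}{2}\right)^{f} = 2^{-n},\\
P(D \mid H_u) &= \int_0^1 p^{s}(1-p)^{f}\,dp = B(s+1,\,f+1) = \frac{s!\,f!}{(n+1)!},\\
K &= \frac{P(D \mid H_u)}{P(D \mid H_r)} = \frac{2^{n}\,s!\,f!}{(n+1)!},\\
P(H_u \mid D) &= \frac{K}{1+K}, \qquad P(H_r \mid D) = \frac{1}{1+K}.
\end{align*}
```

With the rounded value K ≈ 1.4 for s=3, f=9, this gives 1.4/2.4 ≈ 0.58 and 1/2.4 ≈ 0.42.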
>> s=3;f=9;
>> gamma(s+1)*gamma(f+1)/gamma(s+f+2)*2^(s+f)
ans =
    1.4322
>> s=6; f=18;
>> gamma(s+1)*gamma(f+1)/gamma(s+f+2)*2^(s+f)
ans =
    4.9859
>> s=30;f=90;
>> gamma(s+1)*gamma(f+1)/gamma(s+f+2)*2^(s+f)
ans =
   6.4717e+05
% in logs:
>> exp(gammaln(s+1)+gammaln(f+1)-gammaln(s+f+2)+...
       log(2)*(s+f))
ans =
   6.4717e+05

Dissecting Master Bayes' formula
• Parametrized and composite models

Recursive & Dynamic inference
• Repeated measurements improve accuracy
• Chapman-Kolmogorov: tracking in time
• Retrodiction: what happened? Retrodiction (smoothing) gives additional precision, but later.

MCMC: PET camera
[Formula with its two factors labelled: likelihood, prior]
• D: film, count by detector j
• X: radioactivity in voxel i
• a_ij: camera geometry, the fraction of emission from voxel i reaching detector j
• Inference about X gives a posterior; its mean is often a good picture of the patient.

MCMC: Stochastic solution of probability problems
• Generate a sequence of states with the same distribution as the posterior, in this case (X1, X2, …). Each member is a full 3D image.
• Estimate X by taking the mean over the trace.

MCMC: PET camera, main MCMC loop
• We have (X1, X2, …, Xk) and want to compute X(k+1).
• Propose a new image Z by changing the value in one voxel.
• Compute the acceptance probability $a = \pi(Z)/\pi(X_k)$, where $\pi$ denotes the posterior density.
• Accept X(k+1)=Z if a>1, otherwise accept with probability a. If Z is not accepted, X(k+1)=Xk.
• Matlab: if a>rand, X{k+1}=Z; else, X{k+1}=X{k}; end   % X: cell array of images
• In practice: compute in logarithms to avoid underflow.
• Differential computation: most of the terms in $\pi(Z)$ are the same as in $\pi(X_k)$.

Sinogram and reconstruction
[Figures: tumour; fruit fly, Drosophila family (X-ray)]

Does Bayes give the right answer?
• Output is a posterior. How accurate? That depends on the prior and likelihood assessed.
• If data are generated by a distribution $g(x)$ and inference is for the parameter $\theta$ of $f(x \mid \theta)$, then asymptotically the posterior for $\theta$ will concentrate on $\arg\min_{\theta} \mathrm{KL}(g(\cdot), f(\cdot \mid \theta))$.
• KL: Kullback-Leibler distance

Does Bayes give the right answer?
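The concentration of the posterior on the KL-minimising parameter can be illustrated numerically. The following is a minimal sketch under assumptions that are not in the slides: the data come from g = Uniform(0,3), the model is f(x|theta) = N(theta,1), and the prior is flat on a grid. For this model the KL-minimising value is the mean of g, here 1.5, so the posterior should narrow around that value as n grows even though the model is wrong.

```matlab
% Sketch: posterior concentration under a misspecified model.
% Assumptions (illustrative only): g = Uniform(0,3), f(x|theta) = N(theta,1),
% flat prior for theta on a grid.
rng(1);                                   % reproducible draws
theta = 0:0.01:3;                         % grid over the parameter
x = 3*rand(1,2000);                       % data generated by g, not by the model
for n = [20 200 2000]
    % Gaussian log-likelihood of the first n points for every theta on the grid
    loglik = -0.5*(sum(x(1:n).^2) - 2*theta*sum(x(1:n)) + n*theta.^2);
    post = exp(loglik - max(loglik));     % subtract the max to avoid underflow
    post = post/sum(post);                % normalise on the grid
    m = sum(post.*theta);                 % posterior mean
    sd = sqrt(sum(post.*(theta-m).^2));   % posterior standard deviation
    fprintf('n=%4d  mean=%.3f  sd=%.3f\n', n, m, sd);
end
```

The grid approach mirrors the numeric style of the coin example; the likelihood is evaluated in logarithms because a product of 2000 densities would underflow.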
• Coherence: If you evaluate bets on uncertain events, anyone who does not use Bayes' rule to evaluate them will potentially lose an unlimited amount to you, who use Bayes' rule. (Freedman & Purves, 1969)
• Consistency: Observing the properties of Boolean (set) algebra, the calculus of plausibility has to be embeddable in an ordered field where + and × correspond to the combination functions for deriving the plausibilities of disjunction and conjunction. (Jaynes Ch 2; Arnborg, Sjödin MaxEnt 2000, ECCAI 2000)
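Returning to the main MCMC loop described for the PET camera, the steps (propose a change in one voxel, compute the acceptance ratio in logarithms, accept or keep, average the trace) can be sketched on a toy problem. Everything specific here is an assumption made for illustration and not the lecture's actual setup: the Poisson count model, the exponential prior, the sizes, the step length, the idealised counts D, and the choice to discard the first half of the trace as burn-in. The names A, Xtrue, logpost and Xtrace are hypothetical.

```matlab
% Toy Metropolis sketch of the main MCMC loop (illustrative values only).
% Assumed model: D(j) ~ Poisson(sum_i X(i)*A(i,j)), exponential prior on X(i) >= 0.
rng(1);
nvox = 4; ndet = 6;
A = rand(nvox, ndet); A = A ./ sum(A, 2);  % A(i,j): fraction from voxel i reaching detector j
Xtrue = [50 200 100 20];                   % hypothetical activity per voxel
D = round(Xtrue*A);                        % idealised detector counts
% log of the unnormalised posterior: Poisson log-likelihood plus log prior
logpost = @(X) sum(D.*log(X*A) - X*A) - sum(X)/1000;
niter = 20000;
X = 100*ones(1, nvox);                     % starting image X1
Xtrace = zeros(niter, nvox);
for k = 1:niter
    Z = X;
    v = randi(nvox);                       % pick one voxel
    Z(v) = Z(v) + 10*randn;                % propose a new value in that voxel
    if Z(v) > 0                            % proposals outside the prior support are rejected
        loga = logpost(Z) - logpost(X);    % log of a = pi(Z)/pi(Xk)
        if loga > log(rand)                % accept if a > 1, else with probability a
            X = Z;
        end
    end                                    % if not accepted, X(k+1) = X(k)
    Xtrace(k, :) = X;
end
Xhat = mean(Xtrace(niter/2+1:end, :), 1);  % estimate X by the mean over the trace
disp(Xhat);
```

Because the proposal changes a single voxel, a real implementation would update only the affected terms of the log posterior instead of recomputing it in full, which is the differential computation mentioned on the slide.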