Bayesian inference review
• Problem statement
  – Objective is to estimate or infer an unknown parameter θ based on observations y. The result is given as a probability distribution.
  – Identify the parameter θ that we would like to estimate.
  – Identify the observations, i.e., the data y and the type of distribution associated with θ.
• Bayesian inference
  – Establish the prior of θ, if any.
  – Establish the likelihood of y conditional on θ.
  – Derive the posterior distribution of θ.
• Posterior analysis & prediction
  – Analyze the posterior distribution of θ.
  – Predict the distribution of a new observation ỹ based on the posterior distribution of θ.
• Bayesian updating
  – If new data come in, the old posterior becomes the new prior, and the process repeats.
Bayesian inference review
• Bayesian inference
  – Establish the prior of θ, if any.
  – Establish the likelihood of y conditional on θ.
  – Derive the posterior distribution of θ.
  Posterior distribution ∝ Likelihood function × Prior distribution:
  $$ p(\theta \mid y) \;\propto\; L(y \mid \theta)\, p(\theta) $$
  where y is the observed data, L(y | θ) is the likelihood function, p(θ) is the prior distribution, and p(θ | y) is the posterior distribution.
  [Figure: prior and posterior pdfs of θ plotted together with the observed data y; the posterior mean lies between the prior mean and the data, θ_post = k·y + (1 − k)·θ_prior.]
  (A small numerical illustration of this relation follows below.)
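As a quick illustration of posterior ∝ likelihood × prior, here is a minimal MATLAB sketch that evaluates the relation on a grid of θ values; the prior N(8, 3²), the single observation y = 12, and σ = 2 are illustrative numbers, not taken from the slides.

```matlab
% Grid illustration of posterior ∝ likelihood × prior (illustrative numbers)
theta = linspace(0, 20, 1000);      % grid of candidate parameter values
y     = 12;  sigma = 2;             % one observation and its (known) stdev
prior = normpdf(theta, 8, 3);       % assumed prior: N(8, 3^2)
like  = normpdf(y, theta, sigma);   % likelihood of y as a function of theta
post  = prior .* like;              % unnormalized posterior
post  = post / trapz(theta, post);  % normalize numerically on the grid

plot(theta, prior, '--', theta, post, '-');
legend('prior', 'posterior'); xlabel('\theta'); ylabel('pdf');
```

The posterior is both narrower than the prior and pulled toward the data, which is exactly the weighted-average behavior sketched in the figure above.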
Bayesian inference review
• Bayesian inference of a genetic probability
  – Unknown parameter θ
  – Observation y and its likelihood
  – Posterior distribution of θ
• Bayesian inference of a binomial problem (a conjugate-prior sketch follows below)
  – Unknown parameter θ
  – Observation y and its likelihood
  – Posterior distribution of θ
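The binomial example itself is not reproduced in this transcript; as a reminder of how that case works, here is a minimal sketch assuming the standard conjugate treatment (Beta prior on the success probability θ, binomial likelihood for y successes in n trials, Beta posterior). The prior Beta(1,1) and the data y = 7, n = 10 are illustrative, not from the slides.

```matlab
% Conjugate Beta-binomial update (illustrative numbers)
a = 1; b = 1;                      % assumed Beta(1,1), i.e., uniform prior on theta
n = 10; y = 7;                     % assumed data: y successes in n trials
apost = a + y;  bpost = b + n - y; % Beta posterior parameters

theta = linspace(0, 1, 500);
plot(theta, betapdf(theta, a, b), '--', theta, betapdf(theta, apost, bpost), '-');
legend('prior', 'posterior'); xlabel('\theta'); ylabel('pdf');
```

The posterior mean (a + y)/(a + b + n) sits between the prior mean and the sample proportion y/n, mirroring the weighted-average form on the previous slide.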
Bayesian inference of normal distribution
• Problem statement
  – Objective is to estimate the unknown parameters θ of a normal distribution based on observations y.
  – The normal distribution has two parameters: mean μ and variance σ² (stdev σ).
• Cases of study
  – We have observations y that follow a normal distribution:
    a single observation y, or multiple observations y = {y1, y2, …}.
  – Estimate the mean μ with known variance σ².
  – Estimate the variance σ² with known mean μ.
  – Estimating both parameters together will be addressed later.
Fundamentals of normal distribution by MATLAB
• Normal distribution y ~ N(μ, σ²)
  – Probability density function: normpdf(y,m,s)
    $$ p(y \mid \mu,\sigma) = f_Y(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(y-\mu)^2}{2\sigma^2}\right) $$
  – Cumulative distribution function: normcdf(y,m,s)
    $$ F_Y(y) = \int_{-\infty}^{y} f_Y(t)\, dt = \int_{-\infty}^{y} \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(t-\mu)^2}{2\sigma^2}\right) dt $$
  – Inverse of the CDF: norminv(p,m,s)
    e.g., 95% confidence intervals
  – Random sampling and analysis of data: normrnd,
    [mean(y) std(y)], [prctile(y,2.5) prctile(y,97.5)]
    (the short script below pulls these together)
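A short MATLAB script exercising the functions listed above; the parameters μ = 10, σ = 2 and the sample size N = 1e5 are illustrative choices.

```matlab
% Normal distribution basics in MATLAB (illustrative parameters)
mu = 10; s = 2; N = 1e5;

y = linspace(mu - 4*s, mu + 4*s, 400);
plot(y, normpdf(y, mu, s)); xlabel('y'); ylabel('pdf');  % density curve
ci = norminv([0.025 0.975], mu, s)                       % exact 95% interval

ys = normrnd(mu, s, N, 1);                               % random samples
[mean(ys) std(ys)]                                       % sample mean and stdev
[prctile(ys, 2.5) prctile(ys, 97.5)]                     % sample 95% interval
```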
Estimating mean with known variance
• Case of single data (no prior)
  – Variance σ² is known. The mean θ = μ is unknown. We have a single observation y.
  – No prior on θ: take p(θ) ∝ constant (or simply ignore the prior).
  – Likelihood of y:  y | θ ~ N(θ, σ²), i.e.,
    $$ p(y \mid \theta) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(y-\theta)^2}{2\sigma^2}\right) $$
  – Posterior density:
    $$ p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta) \propto \exp\!\left(-\frac{(\theta-y)^2}{2\sigma^2}\right), \qquad \theta \mid y \sim N(y, \sigma^2) $$
• This means that when we have no prior knowledge, the posterior distribution of the mean has the same shape as the sampling distribution, centered at the single observation.
• Practice (see the MATLAB sketch below)
  – We have only one observation y = 90 with σ = 10. We want to estimate the unknown mean.
  – Plot the shape of the posterior pdf. Superpose the original normal distribution.
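A minimal MATLAB sketch of this practice, using the result θ | y ~ N(y, σ²) derived above. Here the "original" distribution is interpreted as the sampling distribution N(y, σ²), so the two curves coincide, which is exactly the point of the slide.

```matlab
% Posterior of the unknown mean from a single observation, no prior
y = 90; sigma = 10;                % single observation and known stdev
theta = linspace(50, 130, 500);    % grid for the unknown mean

post = normpdf(theta, y, sigma);   % posterior: theta | y ~ N(y, sigma^2)
orig = normpdf(theta, y, sigma);   % sampling pdf N(y, sigma^2) for comparison

plot(theta, post, '-', theta, orig, '--');
legend('posterior of \theta', 'original N(90, 10^2)');
xlabel('\theta'); ylabel('pdf');
```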
• Posterior prediction
  $$ p(\tilde{y} \mid y) = \int p(\tilde{y} \mid \theta)\, p(\theta \mid y)\, d\theta
     \propto \int \exp\!\left(-\frac{(\tilde{y}-\theta)^2}{2\sigma^2}\right) \exp\!\left(-\frac{(\theta-y)^2}{2\sigma^2}\right) d\theta $$
Estimating mean with known variance
• Posterior prediction (no prior)
  – Rearrange the integrand in terms of θ (complete the square) to obtain
    $$ p(\tilde{y} \mid y) \propto \int \exp\!\left(-\frac{1}{\sigma^2}\left(\theta - \frac{\tilde{y}+y}{2}\right)^{\!2}\right) \exp\!\left(-\frac{(\tilde{y}-y)^2}{4\sigma^2}\right) d\theta $$
  – Ignore the first term, which becomes a constant (independent of ỹ) after integration. Then
    $$ p(\tilde{y} \mid y) \propto \exp\!\left(-\frac{(\tilde{y}-y)^2}{2\,(2\sigma^2)}\right), \qquad \tilde{y} \mid y \sim N(y,\, 2\sigma^2) $$
  – The mean is equal to the posterior mean y.
    The variance is 2σ² (stdev √2 σ). So
    $$ E[\tilde{y} \mid y] = y, \qquad \operatorname{var}[\tilde{y} \mid y] = 2\sigma^2 $$
    (a small Monte Carlo check of this result follows below)
Estimating mean with known variance
• Case of single data (conjugate prior)
  – Likelihood of y:  y | θ ~ N(θ, σ²), i.e.,
    $$ p(y \mid \theta) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(y-\theta)^2}{2\sigma^2}\right) $$
  – Conjugate prior: as a function of θ, the likelihood is the exponential of a quadratic form, so the conjugate prior takes the same (normal) form:
    $$ p(\theta) \propto \exp\!\left(-\frac{(\theta-\mu_0)^2}{2\tau_0^2}\right), \qquad \theta \sim N(\mu_0, \tau_0^2) $$
  – Posterior density:
    $$ p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta) \propto \exp\!\left(-\tfrac{1}{2}\left[\frac{(y-\theta)^2}{\sigma^2} + \frac{(\theta-\mu_0)^2}{\tau_0^2}\right]\right) \propto \exp\!\left(-\frac{(\theta-\mu_1)^2}{2\tau_1^2}\right) $$
    or  θ | y ~ N(μ₁, τ₁²), where
    $$ \mu_1 = w\,\mu_0 + (1-w)\,y, \qquad w = \frac{c_0}{c_0 + c}, \qquad c_0 = \frac{1}{\tau_0^2}, \quad c = \frac{1}{\sigma^2}, \qquad \tau_1^2 = \frac{1}{c_0 + c} $$
• The posterior mean is a weighted average of the prior mean μ₀ and the observation y, with weights proportional to the precisions (inverses of the variances); see the MATLAB sketch below.
• If τ₀ → ∞ then c₀ → 0 and p(θ) is constant over (−∞, ∞); we recover μ₁ = y and τ₁ = σ.
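A minimal MATLAB sketch of this conjugate update. The prior N(μ₀ = 80, τ₀² = 15²) and the data y = 90, σ = 10 are illustrative values, not from the slides.

```matlab
% Conjugate normal update for the mean (known variance), illustrative numbers
y = 90;  sigma = 10;           % single observation, known stdev
mu0 = 80; tau0 = 15;           % assumed prior N(mu0, tau0^2)

c0 = 1/tau0^2;  c = 1/sigma^2; % precisions
w  = c0/(c0 + c);              % weight on the prior mean
mu1  = w*mu0 + (1 - w)*y;      % posterior mean
tau1 = sqrt(1/(c0 + c));       % posterior stdev

theta = linspace(40, 130, 500);
plot(theta, normpdf(theta, mu0, tau0), '--', theta, normpdf(theta, mu1, tau1), '-');
legend('prior', 'posterior'); xlabel('\theta'); ylabel('pdf');
```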
Estimating mean with known variance
• Posterior prediction (conjugate prior)
  $$ p(\tilde{y} \mid y) = \int p(\tilde{y} \mid \theta)\, p(\theta \mid y)\, d\theta $$
  – Rearrange the integrand as a quadratic in θ; the θ-dependent factor integrates to a constant, leaving
    $$ p(\tilde{y} \mid y) \propto \exp\!\left(-\frac{1}{2}\,\frac{(\tilde{y}-\mu_1)^2}{\sigma^2 + \tau_1^2}\right), \qquad \tilde{y} \mid y \sim N(\mu_1,\, \sigma^2 + \tau_1^2) $$
  – The mean is equal to the posterior mean μ₁.
    The variance has two components: the inherent variance σ² and the variance τ₁² due to the posterior uncertainty in θ. So
    $$ E[\tilde{y} \mid y] = \mu_1, \qquad \operatorname{var}[\tilde{y} \mid y] = \sigma^2 + \tau_1^2 $$
• If τ₀ → ∞ then c₀ → 0, and μ₁ = y, τ₁ = σ (the no-prior result); this is checked numerically below.
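A short Monte Carlo check that the predictive variance is σ² + τ₁², continuing the illustrative numbers from the previous sketch.

```matlab
% Monte Carlo check: predictive distribution is N(mu1, sigma^2 + tau1^2)
y = 90; sigma = 10; mu0 = 80; tau0 = 15;            % same illustrative numbers as above
c0 = 1/tau0^2; c = 1/sigma^2;
mu1 = (c0*mu0 + c*y)/(c0 + c);  tau1 = sqrt(1/(c0 + c));

N = 1e6;
theta  = normrnd(mu1, tau1, N, 1);                  % theta | y ~ N(mu1, tau1^2)
ytilde = normrnd(theta, sigma);                     % ytilde | theta ~ N(theta, sigma^2)

[mean(ytilde)  mu1;  var(ytilde)  sigma^2 + tau1^2] % simulated vs analytical
```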
-9-
Estimating mean with known variance
• Case of multiple data (no prior)
  – Independent and identically distributed (iid) observations y = {y₁, y₂, …, yₙ}.
  – Posterior density
    $$ p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta) \propto \prod_{i=1}^{n} p(y_i \mid \theta) \propto \exp\!\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i-\theta)^2\right) $$
    $$ \propto \exp\!\left(-\frac{1}{2\sigma^2}\left(n\theta^2 - 2\theta\sum y_i + \sum y_i^2\right)\right) \propto \exp\!\left(-\frac{(\theta-\bar{y})^2}{2\sigma^2/n}\right), \qquad \theta \mid y \sim N\!\left(\bar{y},\, \frac{\sigma^2}{n}\right) $$
  – The posterior distribution of the unknown mean is normal, with
    mean equal to the sample mean ȳ and
    stdev equal to σ/√n.
  – Posterior prediction
    $$ \tilde{y} \mid y \sim N\!\left(\mu_n,\, \sigma^2 + \tau_n^2\right) = N\!\left(\bar{y},\, \left(1 + \frac{1}{n}\right)\sigma^2\right) $$
Estimating mean with known variance
• Practice (see the MATLAB sketch below)
  – Observations are n = 20 data from a normal distribution with ȳ = 2.9 and stdev σ = 0.2.
  – Plot the posterior pdf of the unknown population mean.
  – Superpose the original normal distribution, assuming ȳ is the true mean.
  – Plot the cdfs of the two together.
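A minimal MATLAB sketch of this practice, using θ | y ~ N(ȳ, σ²/n) from the previous slide.

```matlab
% Posterior of the mean for n = 20 observations, ybar = 2.9, known sigma = 0.2
n = 20; ybar = 2.9; sigma = 0.2;
theta = linspace(2.2, 3.6, 500);

postpdf = normpdf(theta, ybar, sigma/sqrt(n));  % posterior of the mean
origpdf = normpdf(theta, ybar, sigma);          % original sampling distribution

subplot(1,2,1); plot(theta, postpdf, '-', theta, origpdf, '--');
legend('posterior of \theta', 'original N(2.9, 0.2^2)'); xlabel('\theta'); ylabel('pdf');

subplot(1,2,2); plot(theta, normcdf(theta, ybar, sigma/sqrt(n)), '-', ...
                     theta, normcdf(theta, ybar, sigma), '--');
legend('posterior cdf', 'original cdf'); xlabel('\theta'); ylabel('cdf');
```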
Estimating variance with known mean
• Case of multiple data (non-informative prior)
  – In the case of no information, the prior is
    $$ p(\sigma^2) \propto (\sigma^2)^{-1} $$
  – Sample distribution (observations):
    $$ p(y \mid \sigma^2) = \prod_{i=1}^{n} p(y_i \mid \theta) \propto \sigma^{-n} \exp\!\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i-\theta)^2\right) = (\sigma^2)^{-n/2} \exp\!\left(-\frac{n}{2\sigma^2}\,v\right) $$
    where
    $$ v = \frac{1}{n}\sum_{i=1}^{n}(y_i-\theta)^2 $$
  – Posterior distribution:
    $$ p(\sigma^2 \mid y) \propto (\sigma^2)^{-n/2-1} \exp\!\left(-\frac{n}{2\sigma^2}\,v\right) $$
    This expression is called the scaled inverse chi-square distribution, σ² | y ~ Inv-χ²(n, v).
• Remark
  – σ² | y ~ Inv-χ²(n, v) is identical to
    $$ \frac{n v}{\sigma^2} \sim \chi^2(n) $$
Estimating variance with known mean
• Evaluation of the posterior pdf of σ² | y
  – Note that σ² | y ~ Inv-χ²(n, v).
  – Let us denote
    $$ z = \frac{nv}{\sigma^2}, \qquad \text{then} \qquad z = \frac{nv}{\sigma^2} \sim \chi^2(n), \qquad f_Z(z) \propto z^{n/2-1} e^{-z/2} $$
  – Then, by the definition of the CDF, one can derive the following relation:
    $$ \int_{z_a}^{z_b} f_Z(z)\, dz
       = \int_{\sigma_b^2}^{\sigma_a^2} \frac{nv}{\sigma^4}\, f_Z\!\left(\frac{nv}{\sigma^2}\right) d\sigma^2
       = \int_{\sigma_b^2}^{\sigma_a^2} f_{\sigma^2}(\sigma^2)\, d\sigma^2,
       \qquad z_a = \frac{nv}{\sigma_a^2},\quad z_b = \frac{nv}{\sigma_b^2} $$
    so that
    $$ f_{\sigma^2}(\sigma^2) = f_Z(z)\,\frac{z}{\sigma^2}, \qquad z = \frac{nv}{\sigma^2} $$
  – If we need the pdf value at σ²:
    compute z = nv/σ², next calculate the χ² pdf value at z, then multiply by z/σ².
  – If we need the cdf of σ², i.e., P[σ² ≤ c], which is the same as P[z ≥ z_c] for the χ² variable:
    compute z_c = nv/c, next calculate 1 − (the χ² cdf value at z_c).
Fundamentals of Chi2 distribution by MATLAB
• Chi2 distribution z ~ χ²(n), with
  $$ f_Z(z) = \frac{1}{2^{n/2}\,\Gamma(n/2)}\, z^{n/2-1} e^{-z/2} $$
  – pdf function: chi2pdf(z,n)
    Let's plot the function when n = 5 using the pdf function and the original expression.
  – cdf function: chi2cdf(z,n)
  – random sample generation: chi2rnd(n,...) (the first argument is the degrees of freedom)
• Posterior distribution of the variance σ²
  – pdf evaluation:
    compute z = nv/σ², next calculate the χ² pdf value at z, then multiply by z/σ².
• Simulation (see the sketch below)
  – Random samples of the χ² distribution and comparison with its pdf
  – Random samples of the posterior distribution and comparison with its pdf
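A MATLAB sketch of the χ² checks listed above: the χ²(5) pdf from chi2pdf against the explicit expression, overlaid on a histogram of random draws. The corresponding check for the posterior of σ² is given with the practice on the next slide.

```matlab
% Chi-square(n) distribution: chi2pdf vs the explicit expression, plus a sampling check
n = 5;
z = linspace(0.01, 20, 500);

f1 = chi2pdf(z, n);                                       % built-in pdf
f2 = z.^(n/2 - 1) .* exp(-z/2) / (2^(n/2) * gamma(n/2));  % explicit expression

zs = chi2rnd(n, 1e5, 1);                                  % random draws, df = n
histogram(zs, 'Normalization', 'pdf'); hold on;
plot(z, f1, '-', z, f2, '--'); hold off;
legend('random samples', 'chi2pdf', 'explicit formula'); xlabel('z'); ylabel('pdf');
```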
Estimating variance with known mean
• Practice (see the MATLAB sketch below)
  – 5 samples of a normal distribution are given, with sample variance 0.04.
  – Plot the posterior pdf of the unknown variance conditional on the observations, using the analytical expression.
  – Plot the distribution also using simulation draws.
  – Superpose the two in one graph.
  – Calculate the 95% credible interval, i.e., the 2.5% and 97.5% percentiles of the distribution.
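A minimal MATLAB sketch of this practice, using σ² | y ~ Inv-χ²(n, v) with n = 5 and v = 0.04, the change of variables f(σ²) = f_Z(nv/σ²)·(z/σ²), and draws obtained as σ² = nv/z with z ~ χ²(n).

```matlab
% Posterior of the variance with known mean: analytic pdf vs simulation, 95% interval
n = 5; v = 0.04;                      % number of samples and sample variance

s2 = linspace(0.005, 0.5, 1000);      % grid for sigma^2
z  = n*v ./ s2;                       % change of variables z = n*v/sigma^2
postpdf = chi2pdf(z, n) .* z ./ s2;   % f(sigma^2) = f_Z(z) * z / sigma^2

zs  = chi2rnd(n, 1e6, 1);             % z ~ chi2(n)
s2s = n*v ./ zs;                      % sigma^2 = n*v/z ~ Inv-chi2(n, v)

histogram(s2s, 0:0.01:0.5, 'Normalization', 'pdf'); hold on;
plot(s2, postpdf, '-', 'LineWidth', 1.5); hold off;
legend('simulation draws', 'analytical pdf'); xlabel('\sigma^2'); ylabel('pdf');

ci_sim   = prctile(s2s, [2.5 97.5])              % 95% credible interval from draws
ci_exact = n*v ./ chi2inv([0.975 0.025], n)      % same interval via the chi2 inverse cdf
```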
Homework
• Mean with known variance (no prior)
  n = 20 samples of a normal distribution with sample mean 2.9 and population stdev 0.2 are given.
  1) Write the expression for the posterior distribution of the mean.
  2) Plot the posterior distribution of the mean using the pdf function and simulation with N = 1e6, respectively.
  3) Calculate the 2.5% and 97.5% percentiles of the unknown mean from the posterior distribution using the inv function and the drawn samples, respectively.
• Variance with known mean (non-informative prior)
  n = 20 samples of a normal distribution with known mean 2.9 and sample stdev 0.2 are given.
  1) Write the expression for the posterior distribution of the variance.
  2) Plot the posterior distribution of the variance using the pdf function and simulation with N = 1e6, respectively.
  3) Calculate the 2.5% and 97.5% percentiles of the unknown variance from the posterior distribution using the inv function and the drawn samples, respectively.