5-1-2011
Stochastic Hydrology
Probability and
random variables
Marc F.P. Bierkens
Professor of Hydrology
Faculty of Geosciences
Random variable: definition
A variable that can have a set of different values
generated by some probabilistic mechanism.
We do not know the value of a stochastic variable,
but we do know the probability with which a certain
value can occur.
Example: throwing 2 dice
Probability Pr(d) of each outcome d of the sum D of two dice:

  d      2     3     4     5     6     7     8     9     10    11    12
  Pr(d)  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

[Figure: bar chart of probability (vertical axis, 0.00 to 0.18) against outcome (horizontal axis, 2 to 12)]
Expected value or mean
$$E[D] = \sum_{i=1}^{N_d} d_i \Pr[d_i] = 2 \cdot 1/36 + 3 \cdot 2/36 + \ldots + 12 \cdot 1/36 = 7$$
Estimated as:
$$\hat{E}[D] = \frac{1}{n} \sum_{j=1}^{n} d_j$$
Variance
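$$\mathrm{VAR}[D] = E[(D - E[D])^2] = \sum_{i=1}^{N_d} (d_i - E[D])^2 \Pr[d_i]$$
$$= (2-7)^2 \cdot 1/36 + (3-7)^2 \cdot 2/36 + \ldots + (12-7)^2 \cdot 1/36 = 5.8333$$
Estimated as:
$$\widehat{\mathrm{VAR}}[D] = \frac{1}{n-1} \sum_{i=1}^{n} (d_i - \hat{E}[D])^2$$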
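The expected value and variance of the two-dice sum can be checked by enumeration. The sketch below is only an illustration: it builds the probability mass function from the 36 equally likely outcomes, evaluates the two sums exactly, and compares them with the sample estimators on simulated throws. The seed and the number of throws are arbitrary assumptions.

```python
from fractions import Fraction
from itertools import product
import random

# Probability mass function of the sum D of two fair dice, by enumeration
# of the 36 equally likely outcomes.
pmf = {}
for a, b in product(range(1, 7), repeat=2):
    pmf[a + b] = pmf.get(a + b, Fraction(0)) + Fraction(1, 36)

# Exact expected value and variance from the definitions.
mean = sum(d * p for d, p in pmf.items())
var = sum((d - mean) ** 2 * p for d, p in pmf.items())

# Sample estimates from simulated throws (seed and n are arbitrary choices).
rng = random.Random(42)
n = 10_000
throws = [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n)]
mean_hat = sum(throws) / n
var_hat = sum((d - mean_hat) ** 2 for d in throws) / (n - 1)

print("exact:    ", mean, var, float(var))
print("estimated:", mean_hat, var_hat)
```

The exact variance comes out as the fraction 35/6, which is the 5.8333 quoted on the slide.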
Continuous variables
Histogram (Probability mass function) -> probability density function
$\Pr[Z = z_1] = f_Z(z_1)$ is WRONG!
[Figure: pdf $f_Z(z)$ against $z$, with the point $z_1$ marked]
Pdf = probability mass per unit z
Continuous variables
Pdf = probability mass per unit z
$$\Pr[z_1 \le Z \le z_2] = \int_{z_1}^{z_2} f_Z(z)\,dz$$
[Figure: pdf $f_Z(z)$ against $z$, with the interval from $z_1$ to $z_2$ marked]
Continuous variables
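[Figure: pdf $f_Z(z)$ against $z$, with $\Pr[z \le Z \le z + dz]$ indicated over an interval of width $dz$]
Formal definition of probability density:
$$f_Z(z) = \lim_{dz \to 0} \frac{\Pr[z \le Z \le z + dz]}{dz}$$
where
$$\int_{-\infty}^{\infty} f_Z(z)\,dz = 1$$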
Continuous variables
Cumulative probability distribution function
$$F_Z(z) = \Pr[Z \le z]$$
[Figure: $F_Z(z)$ against $z$, ranging from 0 to 1, with $\Pr[Z \le z_1]$ read off at $z_1$]
Continuous variables
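$$F_Z(z) = \Pr[Z \le z] = \int_{-\infty}^{z} f_Z(z')\,dz'$$
$$f_Z(z) = \frac{dF_Z(z)}{dz}$$
[Figure: two panels. pdf: $f_Z(z)$ against $z$ with $z_1$ marked. cpdf: $F_Z(z)$ against $z$, ranging from 0 to 1, with $\Pr[Z \le z_1]$ read off at $z_1$]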
Continuous variables
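$$\Pr[z_1 \le Z \le z_2] = \int_{z_1}^{z_2} f_Z(z)\,dz$$
$$\Pr[z_1 \le Z \le z_2] = F_Z(z_2) - F_Z(z_1)$$
[Figure: two panels. pdf: $f_Z(z)$ against $z$. cpdf: $F_Z(z)$ against $z$, ranging from 0 to 1. Both with $z_1$ and $z_2$ marked]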
Exercise
Consider the following probability density function:
$$f_Z(z) = \frac{1}{10}\, e^{-z/10}, \quad z \ge 0$$
1) Derive the cumulative probability distribution function.
2) What is the probability that Z lies between 5 and 10?
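One way to check an answer to this exercise numerically is sketched below. It assumes the candidate closed form $F_Z(z) = 1 - e^{-z/10}$ (obtained by integrating the pdf from 0 to z) and compares $F_Z(10) - F_Z(5)$ with a direct trapezoidal integration of the pdf. The helper names and the number of integration steps are illustrative choices, not part of the exercise.

```python
import math

def pdf(z):
    """Exercise pdf: f_Z(z) = (1/10) exp(-z/10) for z >= 0, else 0."""
    return math.exp(-z / 10.0) / 10.0 if z >= 0 else 0.0

def cdf(z):
    """Candidate closed form F_Z(z) = 1 - exp(-z/10) for z >= 0 (assumed answer to part 1)."""
    return 1.0 - math.exp(-z / 10.0) if z >= 0 else 0.0

def integrate(f, a, b, steps=10_000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * h)
    return total * h

p_cdf = cdf(10.0) - cdf(5.0)       # Pr[5 <= Z <= 10] via the cpdf
p_int = integrate(pdf, 5.0, 10.0)  # same probability via the integral of the pdf
print(p_cdf, p_int)
```

If the two printed values agree, the derived cumulative distribution is consistent with the pdf on that interval.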
Probability
Objectivistic definitions
• Classical
$$P(A) = \frac{N_A}{N}$$
where $N_A$ is the number of outcomes resulting in $A$ and $N$ is the total number of possible outcomes.
Example 2 dice: $P(d = 6) = 5/36$ (5+1, 4+2, 3+3, 2+4, 1+5)
• Frequentistic
$$P(A) = \lim_{n \to \infty} \frac{n_A}{n}$$
where $n_A$ is the number of trials resulting in $A$ and $n$ is the total number of trials.
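The classical and frequentistic definitions can be compared on the two-dice example. The sketch below simulates throws and prints the relative frequency of the event "sum equals 6" for a growing number of trials, next to the classical value 5/36. The function name, seed and trial counts are assumptions made for illustration.

```python
import random

def relative_frequency(n, target=6, seed=0):
    """Fraction n_A / n of n simulated two-dice throws whose sum equals target."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n)
               if rng.randint(1, 6) + rng.randint(1, 6) == target)
    return hits / n

classical = 5 / 36  # N_A / N from counting outcomes
for n in (100, 10_000, 200_000):
    print(n, relative_frequency(n), classical)
```

With more trials the relative frequency should settle near the classical value, although any finite run only approximates the limit.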
Probability
Objectivistic definitions
• Axiomatic (Kolmogorov, 1933)
1. The probability of an event A is a non-negative number assigned to this event: $P(A) \ge 0$
2. The probability of the certain event (the event that comprises all possible outcomes) equals 1: $P(S) = 1$
3. If the events A and B are mutually exclusive, then the probability of their union equals the sum of the individual probabilities: $P(A \cup B) = P(A) + P(B)$
Probability
Objectivistic definitions
• Axiomatic (Kolmogorov, 1933)
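$$P(A) = \frac{\text{Area } A}{\text{Area } S}$$
[Figure: Venn diagrams in the sample space S. Exclusive events: A and B do not overlap. Non-exclusive events: A and B overlap]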
Probability
Subjectivistic definitions
• Probability measures our “confidence” about the value or a range of values of a
property whose value is unknown.
• The probability distribution thus reflects our uncertainty about the unknown but
true value of a property.
Example 1: How tall is Marc Bierkens?
Example 2: What is the IQ of George Bush?
Measures of probability distributions
• Mean or Expected value (measure of locality)
Discrete (e.g. throwing dice):
$$E[D] = \sum_{i=1}^{N_d} d_i \Pr[d_i]$$
Continuous (the sum becomes an integral and the histogram a pdf):
$$\mu_Z = E[Z] = \int_{-\infty}^{\infty} z\, f_Z(z)\,dz$$
Estimated from data as:
$$\hat{\mu}_Z = \frac{1}{n} \sum_{i=1}^{n} z_i$$
Measures of probability distributions
• Variance (measure of spread)
$$\sigma_Z^2 = E[(Z - \mu_Z)^2] = \int_{-\infty}^{\infty} (z - \mu_Z)^2 f_Z(z)\,dz$$
Estimated from data as:
$$\hat{\sigma}_Z^2 = \frac{1}{n-1} \sum_{i=1}^{n} (z_i - \hat{\mu}_Z)^2$$
Measures of probability distributions
• Skewness (measure of form)
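$$CS_Z = \frac{E[(Z - \mu_Z)^3]}{\sigma_Z^3} = \frac{\int_{-\infty}^{\infty} (z - \mu_Z)^3 f_Z(z)\,dz}{\sigma_Z^3}$$
Estimated from data as:
$$\widehat{CS}_Z = \frac{\frac{1}{n-1} \sum_{i=1}^{n} (z_i - \hat{\mu}_Z)^3}{\hat{\sigma}_Z^3}$$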
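The three estimators (mean, variance, skewness) can be written out in a few lines. The sketch below follows the formulas on these slides, including the 1/(n-1) factor in both the variance and the third-moment sum. The test sample is drawn from an exponential distribution with mean 10, which is an assumption chosen only because its theoretical mean, variance and skewness (10, 100 and 2) are known.

```python
import random

def sample_moments(z):
    """Return (mean, variance, skewness) estimated from the data list z,
    using 1/(n-1) in the variance and in the third-moment sum."""
    n = len(z)
    mean = sum(z) / n
    var = sum((zi - mean) ** 2 for zi in z) / (n - 1)
    third = sum((zi - mean) ** 3 for zi in z) / (n - 1)
    skew = third / var ** 1.5  # sigma^3 = (sigma^2)^(3/2)
    return mean, var, skew

# Illustrative data: exponential with mean 10 (rate 1/10); seed is arbitrary.
rng = random.Random(1)
sample = [rng.expovariate(1 / 10.0) for _ in range(50_000)]
mean, var, skew = sample_moments(sample)
print(mean, var, skew)
```

A positive skewness estimate reflects the long right tail of the sample; a symmetric sample would give a value near zero.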
Measures of probability distributions
• Rules with expected value and variance:
$$E[a + bZ] = a + b\,E[Z]$$
$$\mathrm{VAR}[a + bZ] = b^2\,\mathrm{VAR}[Z]$$
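These two rules can be verified exactly on a discrete distribution. The sketch below uses the two-dice sum, whose pmf can be written as (6 - |d - 7|)/36, and exact fractions, so both sides of each rule can be compared without rounding. The constants a = 3 and b = -2 are arbitrary choices for illustration.

```python
from fractions import Fraction

# pmf of the sum of two fair dice: Pr[d] = (6 - |d - 7|) / 36 for d = 2..12
pmf = {d: Fraction(6 - abs(d - 7), 36) for d in range(2, 13)}

def expectation(g):
    """E[g(D)] as a sum over the discrete outcomes."""
    return sum(g(d) * p for d, p in pmf.items())

def variance(g):
    """VAR[g(D)] = E[(g(D) - E[g(D)])^2]."""
    m = expectation(g)
    return expectation(lambda d: (g(d) - m) ** 2)

a, b = Fraction(3), Fraction(-2)  # arbitrary constants
lhs_mean = expectation(lambda d: a + b * d)
lhs_var = variance(lambda d: a + b * d)
rhs_mean = a + b * expectation(lambda d: d)
rhs_var = b ** 2 * variance(lambda d: d)
print(lhs_mean == rhs_mean, lhs_var == rhs_var)
```

Note that the shift a drops out of the variance and that a negative b still gives a positive variance, because b enters squared.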
Examples of probability density functions
Gaussian (normal) probability density:
$$f_Z(z) = \frac{1}{\sqrt{2\pi}\,\sigma_Z} \exp\left[-\frac{1}{2}\left(\frac{z - \mu_Z}{\sigma_Z}\right)^2\right]$$
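Relation between normal and lognormal pdf
$Y = \ln Z$, where $Z$ has a lognormal distribution and $Y$ a normal distribution.
$$\mu_Z = e^{\mu_Y + \sigma_Y^2/2}$$
$$\sigma_Z^2 = e^{2\mu_Y + \sigma_Y^2}\left(e^{\sigma_Y^2} - 1\right)$$
$$\mu_Y = \ln \mu_Z - \frac{\sigma_Y^2}{2}$$
$$\sigma_Y^2 = \ln\left(\frac{\sigma_Z^2}{\mu_Z^2} + 1\right)$$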
Exercises
Hydraulic conductivity at some unobserved location is modelled with a log-normal distribution. The mean of Y = ln K is 2.0 and the variance is 1.5. Calculate the mean and the variance of K.
Hydraulic conductivity for an aquifer has a lognormal distribution with mean 10 m/d and variance 200 m²/d². What is the probability that at a non-observed location the conductivity is larger than 30 m/d?
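The relations between the normal and lognormal moments lend themselves to a short numerical check of these exercises. The sketch below is one possible approach, not a model answer: it converts between the moments of Y = ln K and those of K, and evaluates an exceedance probability through the standard normal distribution function built from `math.erf`. All function names are illustrative.

```python
import math

def lognormal_moments(mu_y, var_y):
    """Mean and variance of Z = exp(Y) for Y normal with mean mu_y, variance var_y."""
    mu_z = math.exp(mu_y + var_y / 2.0)
    var_z = math.exp(2.0 * mu_y + var_y) * (math.exp(var_y) - 1.0)
    return mu_z, var_z

def log_moments(mu_z, var_z):
    """Mean and variance of Y = ln Z for Z lognormal with mean mu_z, variance var_z."""
    var_y = math.log(var_z / mu_z ** 2 + 1.0)
    mu_y = math.log(mu_z) - var_y / 2.0
    return mu_y, var_y

def standard_normal_cdf(x):
    """Cumulative distribution of the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_exceeds(threshold, mu_z, var_z):
    """Pr[Z > threshold], using Pr[Z > t] = Pr[Y > ln t] with Y = ln Z normal."""
    mu_y, var_y = log_moments(mu_z, var_z)
    return 1.0 - standard_normal_cdf((math.log(threshold) - mu_y) / math.sqrt(var_y))

print(lognormal_moments(2.0, 1.5))      # first exercise: moments of K
print(prob_exceeds(30.0, 10.0, 200.0))  # second exercise: Pr[K > 30 m/d]
```

The key step in the second exercise is that the event K > 30 is the same as Y > ln 30, so the probability can be read from the normal distribution of Y.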
Two or more random variables
[Figure: surface plot of a bivariate pdf $f_{ZY}(z, y)$ over the $(z, y)$ plane]
$$\Pr[z_1 \le Z \le z_2 \cap y_1 \le Y \le y_2] = \int_{y_1}^{y_2} \int_{z_1}^{z_2} f_{ZY}(z, y)\,dz\,dy$$
Formal definition:
$$f_{ZY}(z, y) = \lim_{dz \to 0,\; dy \to 0} \frac{\Pr[z \le Z \le z + dz \cap y \le Y \le y + dy]}{dz\,dy}$$
Two or more random variables
[Figure: surface plot of a bivariate cpdf $F_{ZY}(z, y)$ over the $(z, y)$ plane, ranging from 0 to 1]
$$F_{ZY}(z, y) = \Pr[Z \le z \cap Y \le y] = \int_{-\infty}^{y} \int_{-\infty}^{z} f_{ZY}(z', y')\,dz'\,dy'$$
$$f_{ZY}(z, y) = \frac{\partial^2 F_{ZY}(z, y)}{\partial z\,\partial y}$$
Two or more random variables
Marginal probability density:
$$f_Z(z) = \int_{-\infty}^{\infty} f_{ZY}(z, y)\,dy$$
Conditional probability:
$$F_{Z|Y}(z \mid y) = \Pr[Z \le z \mid Y = y]$$
Conditional pdf:
$$f_{Z|Y}(z \mid y) = \frac{dF_{Z|Y}(z \mid y)}{dz}$$
Independence of Z and Y:
$$f_{ZY}(z, y) = f_Z(z)\, f_Y(y)$$
Two or more random variables
Bayes' theorem:
$$f_{Z|Y}(z \mid y) = \frac{f_{Y|Z}(y \mid z)\, f_Z(z)}{\int_{-\infty}^{\infty} f_{Y|Z}(y \mid z)\, f_Z(z)\,dz}$$
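Bayes' theorem for densities can be evaluated numerically on a grid of z values. The sketch below assumes a purely hypothetical setup that does not come from the slides: a prior Z ~ N(0, 1) and a likelihood Y | Z = z ~ N(z, 1), with a single observation y = 1. For this particular choice the posterior is known to be normal with mean y/2 and variance 1/2, which gives something to compare against.

```python
import math

def normal_pdf(x, mu, sigma):
    """Gaussian probability density with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)

def posterior_on_grid(y_obs, z_grid, prior, likelihood):
    """Posterior density f_{Z|Y}(z|y_obs) on an evenly spaced grid.
    The denominator integral is approximated by a Riemann sum."""
    dz = z_grid[1] - z_grid[0]
    numer = [likelihood(y_obs, z) * prior(z) for z in z_grid]
    denom = sum(numer) * dz
    return [v / denom for v in numer]

# Hypothetical setup: prior Z ~ N(0, 1), observation Y | Z = z ~ N(z, 1).
z_grid = [-8.0 + 0.01 * k for k in range(1601)]  # grid from -8 to 8
post = posterior_on_grid(
    1.0, z_grid,
    prior=lambda z: normal_pdf(z, 0.0, 1.0),
    likelihood=lambda y, z: normal_pdf(y, z, 1.0),
)

dz = z_grid[1] - z_grid[0]
post_mean = sum(z * p for z, p in zip(z_grid, post)) * dz
post_var = sum((z - post_mean) ** 2 * p for z, p in zip(z_grid, post)) * dz
print(post_mean, post_var)
```

The observation pulls the mean away from the prior value 0 towards y, and the posterior variance is smaller than the prior variance.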
Two or more random variables
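Covariance:
$$\mathrm{COV}[Z, Y] = E[(Z - \mu_Z)(Y - \mu_Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (z - \mu_Z)(y - \mu_Y)\, f_{ZY}(z, y)\,dz\,dy$$
Correlation:
$$\rho_{ZY} = \frac{\mathrm{COV}[Z, Y]}{\sigma_Z\, \sigma_Y}$$
In case of independence: $\mathrm{COV}[Z, Y] = 0$, $\rho_{ZY} = 0$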
Two or more random variables
Properties of variance and covariance:
$$\mathrm{VAR}[aZ + bY] = a^2\,\mathrm{VAR}[Z] + b^2\,\mathrm{VAR}[Y] + 2ab\,\mathrm{COV}[Z, Y]$$
$$\mathrm{VAR}[aZ - bY] = a^2\,\mathrm{VAR}[Z] + b^2\,\mathrm{VAR}[Y] - 2ab\,\mathrm{COV}[Z, Y]$$
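The variance rules for a sum and a difference of two random variables (the covariance term enters with a plus sign for aZ + bY and a minus sign for aZ - bY) also hold exactly for sample variances and covariances, which makes them easy to check. The sketch below generates two correlated samples under arbitrary assumptions (Gaussian noise, coefficients a = 2 and b = 3) and compares both sides.

```python
import random

def mean(x):
    return sum(x) / len(x)

def cov(x, y):
    """Sample covariance with the 1/(n-1) factor; cov(x, x) is the sample variance."""
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)

# Illustrative correlated data (seed and parameters are arbitrary).
rng = random.Random(7)
z = [rng.gauss(0.0, 2.0) for _ in range(1000)]
y = [0.5 * zi + rng.gauss(0.0, 1.0) for zi in z]

a, b = 2.0, 3.0
plus = [a * zi + b * yi for zi, yi in zip(z, y)]
minus = [a * zi - b * yi for zi, yi in zip(z, y)]

rhs_plus = a ** 2 * cov(z, z) + b ** 2 * cov(y, y) + 2 * a * b * cov(z, y)
rhs_minus = a ** 2 * cov(z, z) + b ** 2 * cov(y, y) - 2 * a * b * cov(z, y)
print(cov(plus, plus), rhs_plus)
print(cov(minus, minus), rhs_minus)
```

Because Z and Y are positively correlated here, the sum has a larger variance than the difference.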
Two or more random variables
Bivariate Gaussian probability distribution:
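$$f_{ZY}(z, y) = \frac{1}{2\pi\, \sigma_Z\, \sigma_Y \sqrt{1 - \rho_{ZY}^2}} \exp\left\{-\frac{1}{2(1 - \rho_{ZY}^2)} \left[\left(\frac{z - \mu_Z}{\sigma_Z}\right)^2 - 2\rho_{ZY} \left(\frac{z - \mu_Z}{\sigma_Z}\right)\left(\frac{y - \mu_Y}{\sigma_Y}\right) + \left(\frac{y - \mu_Y}{\sigma_Y}\right)^2\right]\right\}$$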
Two or more random variables
Multivariate Gaussian probability distribution:
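$$\mathbf{z} = \begin{pmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_N \end{pmatrix}, \quad \boldsymbol{\mu} = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_N \end{pmatrix}, \quad \mathbf{C}_{zz} = \begin{pmatrix} \sigma_1^2 & \sigma_1 \sigma_2 \rho_{12} & \cdots & \sigma_1 \sigma_N \rho_{1N} \\ \sigma_2 \sigma_1 \rho_{21} & \sigma_2^2 & \cdots & \vdots \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_N \sigma_1 \rho_{N1} & \cdots & \cdots & \sigma_N^2 \end{pmatrix}$$
$$f_{Z_1 \ldots Z_N}(z_1, \ldots, z_N) = \frac{1}{(2\pi)^{N/2}\, |\mathbf{C}_{zz}|^{1/2}} \exp\left[-\frac{1}{2} (\mathbf{z} - \boldsymbol{\mu})^T \mathbf{C}_{zz}^{-1} (\mathbf{z} - \boldsymbol{\mu})\right]$$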
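The multivariate Gaussian density can be evaluated without forming the inverse of the covariance matrix explicitly. The sketch below is one possible implementation using only the standard library: it factors C_zz with a Cholesky decomposition (a technique chosen here for convenience, not prescribed by the slides), obtains the quadratic form by forward substitution, and takes the determinant from the diagonal of the factor. The 2 x 2 example values are hypothetical.

```python
import math

def cholesky(c):
    """Lower-triangular factor L with L L^T = c, for a symmetric positive-definite c."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(c[i][i] - s)
            else:
                low[i][j] = (c[i][j] - s) / low[j][j]
    return low

def mvn_pdf(z, mu, c):
    """Multivariate Gaussian density at point z with mean vector mu and covariance c."""
    n = len(z)
    low = cholesky(c)
    # Solve L w = (z - mu) by forward substitution; then
    # (z - mu)^T C^{-1} (z - mu) equals w^T w.
    w = []
    for i in range(n):
        s = sum(low[i][k] * w[k] for k in range(i))
        w.append((z[i] - mu[i] - s) / low[i][i])
    quad = sum(wi * wi for wi in w)
    det_sqrt = math.prod(low[i][i] for i in range(n))  # equals |C|^(1/2)
    return math.exp(-0.5 * quad) / ((2.0 * math.pi) ** (n / 2.0) * det_sqrt)

# Hypothetical bivariate case: sigma_1 = 2, sigma_2 = 1, rho_12 = 0.5.
c_zz = [[4.0, 1.0],
        [1.0, 1.0]]
density = mvn_pdf([1.0, 0.5], [0.0, 0.0], c_zz)
print(density)
```

For N = 2 this should agree with the bivariate Gaussian formula, and for N = 1 it reduces to the univariate normal density.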
Appendix:
Elementary probability theory
Probability Rules
Mutually exclusive (no intersection) and exhaustive (filling all of S) events $A_i$:
$$\sum_{i=1}^{M} P(A_i) = P(S) = 1$$
[Figure: sample space S partitioned into the events $A_1, A_2, \ldots, A_i, \ldots, A_M$]
Probability Rules
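$\{A \cap B\}$: intersection
$\{A \cup B\}$: union
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
[Figure: Venn diagram of overlapping events A and B in S, with $\{A \cap B\}$ and $\{A \cup B\}$ indicated]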
Probability Rules
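Conditional probability of A given B:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
$$P(A \cap B) = P(A \mid B)\, P(B) = P(B \mid A)\, P(A)$$
[Figure: Venn diagram of overlapping events A and B in S, with $\{A \cap B\}$ and $\{A \cup B\}$ indicated]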
Probability Rules
Two events A and B are independent if:
$$P(A \cap B) = P(A)\, P(B)$$
Because:
$$P(A \cap B) = P(A \mid B)\, P(B) = P(B \mid A)\, P(A)$$
the following also holds if A and B are independent:
$$P(A \mid B) = P(A)$$
$$P(B \mid A) = P(B)$$
[Figure: Venn diagram of overlapping events A and B in S, with $\{A \cap B\}$ and $\{A \cup B\}$ indicated]
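The union rule, conditional probability and the independence test can all be tried out by counting outcomes of two dice. The sketch below defines three events chosen purely for illustration (first die even, second die at least 5, sum at least 10) and evaluates the rules with exact fractions.

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes (die 1, die 2)

def prob(event):
    """Classical probability: outcomes in the event divided by all outcomes."""
    return Fraction(sum(1 for s in S if event(s)), len(S))

def A(s):  # first die is even
    return s[0] % 2 == 0

def B(s):  # second die shows 5 or 6
    return s[1] >= 5

def C(s):  # sum of both dice is at least 10
    return s[0] + s[1] >= 10

p_a, p_b, p_c = prob(A), prob(B), prob(C)
p_ab = prob(lambda s: A(s) and B(s))
p_a_or_b = prob(lambda s: A(s) or B(s))
p_bc = prob(lambda s: B(s) and C(s))
p_c_given_b = p_bc / p_b

print("union rule holds:   ", p_a_or_b == p_a + p_b - p_ab)
print("A, B independent:   ", p_ab == p_a * p_b)
print("B, C independent:   ", p_bc == p_b * p_c)
print("P(C|B) versus P(C): ", p_c_given_b, p_c)
```

A and B concern different dice and come out independent, whereas knowing that the second die is high changes the probability of a large sum, so B and C are not.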
Probability Rules
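Total probability theorem:
$$P(B) = \sum_{i=1}^{M} P(A_i \cap B) = \sum_{i=1}^{M} P(B \mid A_i)\, P(A_i)$$
[Figure: sample space S partitioned into $A_1, A_2, \ldots, A_i, \ldots, A_M$, with event B overlapping the partition and $\{A_i \cap B\}$ indicated]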
Bayes’ Theorem
$$P(A_i \mid B) = \frac{P(B \mid A_i)\, P(A_i)}{\sum_{j=1}^{M} P(B \mid A_j)\, P(A_j)}$$
Used for updating prior probability P(Ai)
given observations B and likelihood P(B|Ai)
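The updating step can be sketched for a finite set of mutually exclusive and exhaustive events. In the example below the prior probabilities and likelihoods are made-up numbers for three hypothetical events A_1, A_2, A_3. The denominator is the total probability theorem, and dividing by it turns the products of likelihood and prior into posterior probabilities.

```python
from fractions import Fraction

def bayes_update(prior, likelihood):
    """Return (posterior, P(B)) given P(A_i) and P(B|A_i) for i = 1..M.
    Assumes the A_i are mutually exclusive and exhaustive."""
    joint = [l * p for l, p in zip(likelihood, prior)]  # P(B|A_i) P(A_i)
    p_b = sum(joint)                                    # total probability theorem
    return [j / p_b for j in joint], p_b

# Hypothetical numbers, for illustration only.
prior = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]        # P(A_i)
likelihood = [Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)]  # P(B|A_i)

posterior, p_b = bayes_update(prior, likelihood)
print("P(B) =", p_b)
print("P(A_i|B) =", posterior)
```

In this example the event with the smallest prior ends up with the largest posterior, because the observation B is far more likely under it than under the others.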