Binomial Distribution
Bernoulli Process: random process with exactly two possible
outcomes which occur with fixed probabilities (e.g., flip coin,
heads or tails, particle recorded/not recorded, …). Probabilities
from symmetry argument or other information. We call this a
‘Direct’ probability distribution - the frequency distribution of
expected outcomes follows mathematically from the assumptions.
Definitions:
p is the probability of a 'success' (heads, detection of a particle, …), 0 ≤ p ≤ 1
N is the number of independent trials (flips of the coin, particles crossing the detector, …)
r is the number of successes (heads, observed particles, …), 0 ≤ r ≤ N

Then the probability of r successes in N trials is

f(r; N, p) = N! / (r! (N − r)!) · p^r q^(N−r),   where q = 1 − p

The factor N! / (r! (N − r)!) is the number of combinations, the binomial coefficient.
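As a quick numerical sketch (Python, standard library only; the helper name `binomial_pmf` is illustrative), the formula can be evaluated directly:

```python
from math import comb

def binomial_pmf(r, N, p):
    """f(r; N, p) = C(N, r) * p**r * q**(N - r), with q = 1 - p."""
    q = 1.0 - p
    return comb(N, r) * p**r * q**(N - r)

# Flipping a fair coin N = 4 times: probability of r heads
probs = [binomial_pmf(r, 4, 0.5) for r in range(5)]
print(probs)       # [0.0625, 0.25, 0.375, 0.25, 0.0625]
print(sum(probs))  # 1.0 (the distribution is normalized)
```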
Summer Semester 2007
Monte Carlo Methods
Lecture 2 1
Derivation: Binomial Coefficient
Ways to order N distinct objects is N!=N(N-1)(N-2)…1
N choices for first position, then (N-1) for second, then (N-2) …
Now suppose we don’t have N distinct objects, but have subsets
of identical objects. E.g., in flipping a coin, two subsets (tails and heads).
Within a subset, the objects are indistinguishable. For the ith
subset, the ni! combinations are all equivalent. The number of
distinct combinations is then
N! / (n_1! n_2! ⋯ n_k!),   where Σ_i n_i = N
For the binomial case, there are two subclasses (success or
failure, heads or tails, …). The combinatorial coefficient is
therefore

(N choose r) = N! / (r! (N − r)!)
Binomial Distribution-cont.
[Plots of the binomial distribution for p = 0.5 with N = 4, 5, 15, 50;
for p = 0.1 with N = 5, 15; and for p = 0.8 with N = 5, 15]

E[r] = Np
V[r] = Np(1 − p)

Notes:
• for large N and p near 0.5, the distribution is approximately symmetric
• for p near 0 or 1, the variance is reduced
Poisson Distribution
A Poisson distribution applies when there is a large number of
trials, each with a small probability of success, and the trials occur
independently of each other.
High energy physics example: beams collide at a high frequency
(10 MHz, say), and the chance of a ‘good event’ is very small.
The number of good events in a time interval T>>100 ns will follow
a Poisson distribution. A single trial is one crossing of the beams.
Nuclear physics example: radioactive decay of a nucleus. The
number of events observed in some time period, T, follows a
Poisson distribution. A trial is the attempt to observe a decay
within a small time interval, Δt, inside T, and a success is the
positive observation of a decay. The trial interval Δt can be
made arbitrarily small.
Poisson Distribution-cont.
The Poisson distribution can be derived from the binomial distribution
in the limit N → ∞ and p → 0, with Np fixed and finite. For the
previous example of radioactive decay, T = NΔt, and the expected
number of decays is Np. We define λ = Np. Then f(r; N, p) → f(n; λ),
where n is the number of successes and λ is the expectation
based on the rate and the total number of trials. Note that λ will
depend on the observation time (number of trials).

f(n; λ, N) = N! / (n! (N − n)!) · (λ/N)^n · (1 − λ/N)^(N−n)

For N → ∞:

N! / (N − n)! = N(N − 1)⋯(N − n + 1) → N^n

(1 − λ/N)^(N−n) → e^(−λ)
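This limit can be sketched numerically (Python; the helper names are illustrative): holding λ = Np fixed while N grows, the binomial probabilities approach the Poisson ones.

```python
from math import comb, exp, factorial

def binomial_pmf(n, N, p):
    return comb(N, n) * p**n * (1.0 - p)**(N - n)

def poisson_pmf(n, lam):
    return lam**n * exp(-lam) / factorial(n)

# Hold lam = N*p fixed while N grows: the binomial pmf approaches the Poisson pmf
lam = 2.0
for N in (10, 100, 1000):
    print(N, binomial_pmf(3, N, lam / N), poisson_pmf(3, lam))
```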
Poisson Distribution-cont.
So,

f(n; λ) = λ^n e^(−λ) / n!

E[n] = λ   (by definition of λ)
σ² = λ
variance = mean (the most important property)

[Plots of the Poisson distribution for λ = 0.1, 0.5, 1.0, 2.0, 5.0, 10, 20, 50]

Notes:
• As λ increases, the distribution becomes more symmetric
• Approximately Gaussian for λ > 20
• The Poisson formula is much easier to use than the binomial formula.
Poisson Distribution-cont.
Proof of normalization, mean, and variance:

Normalization:

Σ_{n=0}^∞ λ^n e^(−λ) / n! = e^(−λ) Σ_{n=0}^∞ λ^n / n! = e^(−λ) e^λ = 1

Mean:

E[n] = Σ_{n=0}^∞ n · λ^n e^(−λ) / n! = λ e^(−λ) Σ_{n=1}^∞ λ^(n−1) / (n−1)! = λ e^(−λ) e^λ = λ

Variance:

V[n] = E[n²] − E[n]²

E[n²] = Σ_{n=0}^∞ n² · λ^n e^(−λ) / n! = λ e^(−λ) Σ_{n=1}^∞ n · λ^(n−1) / (n−1)!

Writing n = (n − 1) + 1:

E[n²] = λ e^(−λ) [ Σ_{n=1}^∞ (n − 1) λ^(n−1) / (n−1)! + Σ_{n=1}^∞ λ^(n−1) / (n−1)! ]
      = λ (λ + 1) = λ² + λ

V[n] = λ² + λ − λ² = λ
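The normalization, mean, and variance can also be checked numerically by truncating the sums (a sketch; the cutoff n < 100 is an assumption that is amply safe for λ = 5):

```python
from math import exp, factorial

def poisson_pmf(n, lam):
    return lam**n * exp(-lam) / factorial(n)

lam = 5.0
# Truncate the infinite sums at n = 100; the neglected tail is negligible for lam = 5
norm = sum(poisson_pmf(n, lam) for n in range(100))
mean = sum(n * poisson_pmf(n, lam) for n in range(100))
var = sum((n - mean)**2 * poisson_pmf(n, lam) for n in range(100))
print(norm, mean, var)  # close to 1.0, 5.0, 5.0
```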
Poisson Distribution-cont.
Example: Observation of Supernovae – IMB experiment
Number of events in 10 s interval:    0     1    2    3   4   5    6     7     8       9
Occurrences:                        1042  860  307   78  15   3    0     0     0       1
Poisson with mean 0.77:             1064  823  318   82  16   2  0.3  0.03  0.003  0.0003

Note: a 10 s interval contains a very large number of trials, each
with a very small success probability. The 10 s interval itself is not
one trial!
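A sketch of how the "Poisson with mean 0.77" row can be reproduced (the slide's exact numbers may differ slightly due to rounding or the fitted mean):

```python
from math import exp, factorial

observed = [1042, 860, 307, 78, 15, 3, 0, 0, 0, 1]
total = sum(observed)  # total number of 10 s intervals
lam = 0.77

# Expected number of intervals containing n events if counts follow Poisson(lam)
expected = [total * lam**n * exp(-lam) / factorial(n) for n in range(10)]
for n, (o, e) in enumerate(zip(observed, expected)):
    print(f"n={n}: observed {o:5d}, expected {e:9.4f}")
```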
Gaussian Distribution
The Gaussian distribution is the most widely known distribution,
and the most widely used.
P(x; μ, σ) = 1/(√(2π) σ) · e^(−(x − μ)² / (2σ²))

The mean is μ and the variance is σ².
All Gaussians are similar in shape and symmetric, in contrast to the
binomial or Poisson distributions, and are easily characterized.
E.g., 68.3% of the probability lies within 1 standard deviation of
the mean, 95.45% within 2 standard deviations, and 99.7% within 3
standard deviations. FWHM = 2.35 σ.
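These coverage probabilities follow from the error function; a minimal check in Python:

```python
from math import erf, sqrt

def coverage(k):
    """P(|x - mu| < k*sigma) for a Gaussian: erf(k / sqrt(2))."""
    return erf(k / sqrt(2.0))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100.0 * coverage(k):.2f}%")
# within 1 standard deviation(s): 68.27%
# within 2 standard deviation(s): 95.45%
# within 3 standard deviation(s): 99.73%
```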
Derivation of Gauss Distribution
We will consider Gauss' derivation of the Gauss function. It can
also be derived as the limit of the binomial distribution for
N → ∞ with p not too close to 0 or 1. We have
already seen that this leads to a symmetric distribution.

[Plot: binomial with N = 50, p = 0.5 overlaid on a Gaussian with
μ = Np = 25 and σ² = Np(1 − p)]
Derivation
Here we follow the argument used by Gauss. Gauss wanted to
solve the following problem: what is the form of the function
φ(x_i − μ) which gives a maximum probability for μ = arithmetic
mean of the observed values {x_i}?

f(x | μ) = φ(x₁ − μ) φ(x₂ − μ) ⋯ φ(x_n − μ)

is the probability density to get {x_i}.

Gauss wanted this function to peak for μ = x̄ = (1/n) Σ_{i=1}^n x_i:

df/dμ |_{μ=x̄} = 0   ⟹   d/dμ Σ_{i=1}^n ln φ(x_i − μ) |_{μ=x̄} = 0

Assuming f(μ = x̄) ≠ 0,

Σ_i φ′(x_i − x̄) / φ(x_i − x̄) = 0

Define z_i = x_i − x̄ and g(z) = φ′(z)/φ(z). Then Σ_i z_i = 0, and
Σ_i g(z_i) = 0 must hold for all possible {z_i}, so g(z) must be
proportional to z.
Gauss’ derivation-cont.
dφ/φ = kz dz   ⟹   ln φ = kz²/2 + const,   i.e.   φ(z) ∝ exp(kz²/2)

with k < 0 required for normalizability; writing k = −1/σ² gives the
Gaussian form. We get the prefactor via normalization.

Lessons:
• The binomial looks like a Gaussian for large enough N, with p away from 0 and 1
• The Poisson also looks like a Gaussian for large enough λ
• Gauss' formula follows from general arguments (maximizing the
  posterior probability at the arithmetic mean)
• Gauss' formula is much easier to use than the binomial or
  Poisson, so use it when you're allowed.
Comparison Gaussian-Poisson
Four events expected:

            <r>    <(r−μ)²>   <(r−μ)³>
Binomial    4      2.4        0.48       (N = 10, p = 0.4)
Poisson     4      4          4          (λ = 4)
Gaussian    4      2.4        0          (μ = 4, σ² = 2.4)

• In this case, the binomial more closely resembles a Gaussian
  than does the Poisson.
• Note: for the binomial, one can change N and p.
Comparison Gaussian-Poisson
            <r>    <(r−μ)²>   <(r−μ)³>
Binomial    1.8    0.18       −0.14      (N = 2, p = 0.9)
Poisson     1.8    1.8        1.8        (λ = 1.8)
Gaussian    1.8    0.18       0          (μ = 1.8, σ² = 0.18)

• In general, one needs to use the Poisson or the binomial when
  dealing with small statistics or p → 0, 1.
Comparison Gaussian-Poisson
            <r>    <(r−μ)²>   <(r−μ)³>
Binomial    10     9          7.2        (N = 100, p = 0.1)
Poisson     10     10         10         (λ = 10)
Gaussian    10     9          0          (μ = 10, σ² = 9)

• For large numbers, the Gaussian is an excellent approximation.
Random Numbers
We now consider how random numbers are generated on the
computer. Since these are generated with an algorithm, they are
not random, but pseudo-random. This means the distributions of
numbers produced by the algorithm should have the properties
we expect for uncorrelated random numbers.
Note that having a deterministic prescription for generating the
numbers is actually useful, since we often need reproducible
sequences, e.g. for debugging.
Examples:
• linear congruential generators
• lagged Fibonacci generators
• …
Follow Simulation and the Monte Carlo Method, R. Y. Rubinstein
Linear Congruential Generator
Calculate the residues, modulo an integer, of a linear
transformation:

X_{i+1} = (a X_i + c) mod m,   i = 0, …, n

where a is the multiplier, c the increment, and m the modulus,
all non-negative integers.

X_0 is the seed; the remaining values are then completely fixed.
Random numbers in [0, 1) are obtained via:

U_i = X_i / m
Linear Congruential Generator
Once a previous number is reached, then the sequence will
repeat itself. The maximum number of distinct numbers is
therefore m. The sequence is periodic, and the period is therefore
a key value to be determined.
Example:  a = c = X_0 = 3,  m = 5

X_{i+1} = (3 X_i + 3) mod 5
X_0 = 3, X_1 = 2, X_2 = 4, X_3 = 0, X_4 = 3

Period p = 4 (repeats after 4 steps).
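A minimal Python sketch of such a generator, reproducing the example sequence (the function name `lcg` is illustrative):

```python
def lcg(a, c, m, seed):
    """Linear congruential generator: X_{i+1} = (a*X_i + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

# The slide's example: a = c = X_0 = 3, m = 5
gen = lcg(a=3, c=3, m=5, seed=3)
seq = [next(gen) for _ in range(5)]
print(seq)  # [2, 4, 0, 3, 2]: X_4 = X_0, so the period is 4
```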
The best we can do is p = m. A full period is achieved if
1. c is relatively prime to m (c and m have no common divisors)
2. a ≡ 1 (mod g) for every prime factor g of m
3. a ≡ 1 (mod 4) if m is a multiple of 4
The Art of Computer Programming: Seminumerical Algorithms, Vol. 2, D. E. Knuth
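These three conditions (the Hull-Dobell theorem) can be checked mechanically; a sketch in Python (illustrative function names, trial-division factorization, fine for small or power-of-two moduli):

```python
from math import gcd

def prime_factors(m):
    """Distinct prime factors of m by trial division."""
    factors, d = set(), 2
    while d * d <= m:
        while m % d == 0:
            factors.add(d)
            m //= d
        d += 1
    if m > 1:
        factors.add(m)
    return factors

def has_full_period(a, c, m):
    """Hull-Dobell conditions for the LCG X_{i+1} = (a*X_i + c) mod m."""
    return (gcd(c, m) == 1
            and all(a % g == 1 for g in prime_factors(m))
            and (m % 4 != 0 or a % 4 == 1))

print(has_full_period(3, 3, 5))  # False: a mod 5 = 3, consistent with period 4 < 5
print(has_full_period(1664525, 1013904223, 2**32))  # True (a well-known parameter set)
```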
Caveats
m = 2^β, where β represents the word length, guarantees a full
period (the other conditions then mean c should be odd and
a ≡ 1 (mod 4)),
but:

"For instance, if an LCG is used to choose points in an n-dimensional
space, triples of points will lie on, at most, m^(1/n)
hyperplanes. This is due to serial correlation between
successive values of the sequence …

A further problem of LCGs is that the lower-order bits of the
generated sequence have a far shorter period than the
sequence as a whole if m is set to a power of 2 …"

(from Wikipedia)
Tests of pseudorandom number generators
It is important to test the random number generator which you will
use for your calculations (simulations), or use a generator which
has demonstrated properties. There are many tests one can
imagine. The most basic is to check that the values are
uniformly distributed (e.g., by comparing the sample moments to
their theoretical values). In the
following, we look at some distributions generated using the
RNDM generator in the CERN Library.
The method has:
c = 0
X_0 = 20000000011060471625₈ (octal)
a = 20000000343277244615₈ (octal)
m = 2^47
(on a CDC computer)
Tests of RNDM
Expectations:

E[x] = ∫₀¹ x f(x) dx = ∫₀¹ x dx = 1/2

m₂ = ∫₀¹ (x − 1/2)² dx = [x³/3 − x²/2 + x/4]₀¹ = 1/12 = σ²

Let us see how our function performs:
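RNDM itself is not at hand here, so as a stand-in sketch, the same check applied to Python's built-in generator (Mersenne Twister):

```python
import random

random.seed(12345)  # fixed seed, so the sequence is reproducible
N = 100_000
xs = [random.random() for _ in range(N)]

mean = sum(xs) / N
var = sum((x - mean)**2 for x in xs) / N
print(mean, var)  # should be close to 1/2 and 1/12 = 0.0833...
```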
Mean & Variance
[Plots of the sample mean and variance of the RNDM output]
Tests of RNDM
Look at a somewhat more sophisticated quantity, the correlation
between successive random numbers. For the correlation
coefficient, we expect:
ρ_xy = cov[x, y] / (σ_x σ_y) = (E[xy] − μ_x μ_y) / (σ_x σ_y) = (E[xy] − 1/4) / (1/12)

For independent x and y, E[xy] = E[x] E[y], and so ρ_xy = 0.
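A sketch of this check, again using Python's built-in generator as a stand-in for RNDM:

```python
import random

random.seed(42)
N = 1_000_000
xs = [random.random() for _ in range(N + 1)]

# Sample estimate of E[xy] over successive pairs (x_i, x_{i+1})
mean_xy = sum(xs[i] * xs[i + 1] for i in range(N)) / N
rho = (mean_xy - 0.25) / (1.0 / 12.0)
print(rho)  # should be close to 0 for an uncorrelated sequence
```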
Exercise with Cumulative Distribution Function
What is the probability density for xy, if they are uniformly
distributed and independent ?
F(a) = Pr(xy ≤ a) = ∫₀^a f(z) dz,   where z = xy

Two cases for xy ≤ a:
x ≤ a  ⟹  0 ≤ y ≤ 1
x > a  ⟹  y ≤ a/x

So,

F(a) = ∫₀^a ∫₀¹ dy dx + ∫_a^1 ∫₀^(a/x) dy dx
     = a + ∫_a^1 (a/x) dx = a + a [ln x]_a^1 = a − a ln a
To get the pdf, we differentiate:

f(z) = dF(z)/dz = 1 − ln z − 1 = −ln z
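The result F(a) = a − a ln a can be checked by simulation (a sketch using Python's built-in generator; a = 0.3 is an arbitrary test point):

```python
import math
import random

random.seed(7)
N = 100_000
products = [random.random() * random.random() for _ in range(N)]

a = 0.3  # an arbitrary test point in (0, 1)
empirical = sum(z <= a for z in products) / N
analytic = a - a * math.log(a)  # F(a) = a - a ln a
print(empirical, analytic)  # the two should agree to a few parts per thousand
```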
Example
[Histogram of the product of two uniform random numbers, compared
with the pdf f(z) = −ln z]
Kolmogorov-Smirnov test
Define the cumulative distribution function for the sample and
compare with the expected:
F_N(x) = (1/N) Σ_{i=1}^N I_(−∞,x](X_i)

where I_(−∞,x](X) = 1 if −∞ < X ≤ x, and 0 otherwise.

Look at the maximum deviation of this from the expected cdf:

D_N = sup_{−∞ < x < ∞} | F_N(x) − F_X(x) |

D_N should be within a certain value if F_N is really from F_X.
Expected results are tabulated.
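A sketch of computing D_N for a uniform sample (Python's built-in generator as a stand-in):

```python
import random

random.seed(1)
N = 10_000
xs = sorted(random.random() for _ in range(N))

# For the uniform distribution on (0, 1), F_X(x) = x. The supremum of
# |F_N(x) - F_X(x)| is attained at a sample point, just before or after
# a step of the empirical cdf, so it suffices to scan the step edges.
d_n = max(max((i + 1) / N - x, x - i / N) for i, x in enumerate(xs))
print(d_n, 1.36 / N**0.5)  # statistic vs. the 5% critical value
```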
Kolmogorov-Smirnov Test
For N > 35 or so:

Confidence   20%        10%        5%         2%         1%
D_N          1.07/√N    1.22/√N    1.36/√N    1.52/√N    1.63/√N

In our case, N = 5 × 10⁷, so 1/√N = 1.4 × 10⁻⁴.
The maximum deviation is ~10⁻⁴, so we have high
confidence that the two distributions agree.
Exercises
1. Produce a linear congruential generator which generates
   uniform random integers between 0 and 10. Generate a long
   sequence of numbers and look at the mean and variance.
2. Investigate which random number generators are available on
   your computer and look into their properties. Download a
   good generator if you do not have one.
3. Find the cumulative distribution function and the pdf for the
   product of 3 i.i.d. random numbers uniform on (0, 1).