Concept of Probability
AS3105 Astrophysical Processes 1
Dhani Herdiwijaya
Probability in Everyday Life
• Rainfall
• Traffic jams
• Crossing the street
• Catastrophic meteoroid impacts
• Airplane travel. Is it safe to fly?
Laplace (1819)
Probability theory is nothing but common sense reduced to calculation.
Maxwell (1850)
The true logic of this world is the calculus of probabilities. That is, probability is a natural language for describing real-world phenomena.
A mathematical formulation of games of chance began in the middle of the 17th century. Some of the important contributors over the following 150 years include Pascal, Fermat, Descartes, Leibniz, Newton, Bernoulli, and Laplace.
Development
It is remarkable that the theory of probability took so long to develop.
An understanding of probability is elusive due in part to the fact that the probability depends on the status of the information that we have (a fact well known to poker players).
Although the rules of probability are defined by simple mathematical rules, an understanding of probability is greatly aided by experience with real data and concrete problems.
Probability
To calculate the probability of a particular outcome, count the number of all possible results, and then count the number that give the desired outcome.
The probability of the desired outcome is equal to the number of desired outcomes divided by the total number of possible outcomes. Hence, the probability of any one face of a die is 1/6.
Rules of Probability
In 1933 the Russian mathematician A. N.
Kolmogorov formulated a complete set of
axioms for the mathematical definition of
probability.
For each event i, we assign a probability P(i)
that satisfies the conditions
P (i) ≥ 0
P (i) = 0 means that the event cannot occur
P (i) = 1 means that the event must occur
The normalization condition says that the sum of the probabilities of all possible mutually exclusive outcomes is unity: ∑ᵢ P(i) = 1.
Example. Let x be the number of points on the face of a die. What
is the sample space of x?
Solution. The sample space or set of possible events is xi = {1, 2,
3, 4, 5, 6}. These six outcomes are mutually exclusive.
There are many different interpretations of probability because
any interpretation that satisfies the rules of probability may be
regarded as a kind of probability. An interpretation of probability
that is relatively easy to understand is based on symmetry.
Addition rule
For an actual die, we can estimate the probability a posteriori, that is, by observing the outcome of many throws.
Suppose that we know that the probability of rolling any face of a die in one throw is equal to 1/6, and we want to find the probability of obtaining face 3 or face 6 in one throw. The probability of the outcome i or j, where i is distinct from j, is
P (i or j ) = P (i) + P (j ).
(addition rule)
The above relation is generalizable to more than two events.
An important consequence is that if P (i) is the probability of
event i, then the probability of event i not occurring is
1 − P (i).
Combining Probabilities
•If a given outcome can be reached in two (or
more) mutually exclusive ways whose
probabilities are pA and pB, then the probability
of that outcome is: pA + pB.
•This is the probability of having either A or B.
Example
•Paint two faces of a die red. When the die
is thrown, what is the probability of a red
face coming up?
p = 1/6 + 1/6 = 1/3
Example: What is the probability of throwing a three or a six
with one throw of a die?
Solution. The probability that the face exhibits either 3 or 6 is
1/6 + 1/6 = 1/3
Example: What is the probability of not throwing a six with one
throw of die?
Solution. The answer is the probability of either “1 or 2 or 3 or
4 or 5.” The addition rule gives that the probability P (not six)
is
P (not six) = P (1) + P (2) + P (3) + P (4) + P (5)
= 1 − P (6) = 5/6
The probabilities of all outcomes sum to unity. It is very useful to take advantage of this property when solving many probability problems.
Multiplication rule
• Another simple rule is for the probability of the
joint occurrence of independent events.
These events might be the probability of throwing
a 3 on one die and the probability of throwing a 4
on a second die. If two events are independent,
then the probability of both events occurring is the
product of their probabilities
P (i and j ) = P (i) P (j )
(multiplication rule)
• Events are independent if the occurrence of one
event does not change the probability for the
occurrence of the other.
Combining Probabilities
•If a given outcome represents the combination
of two independent events, whose individual
probabilities are pA and pB, then the probability
of that outcome is: pA × pB.
•This is the probability of having both A and B.
Example
•Throw two normal dice. What is the
probability of two sixes coming up?
p(2) = 1/6 × 1/6 = 1/36
Example:
Consider the probability that a person
chosen at random is female and was born
on September 6. We can reasonably
assume equal likelihood of birthdays for all
days of the year, and it is correct to
conclude that this probability is
½ x 1/365
Being a woman and being born on
September 6 are independent events.
• Example. What is the probability of throwing an even
number with one throw of a die?
Solution. We can use the addition rule to find that
P (even) = P (2) + P (4) + P (6) = 1/6 + 1/6 +1/6 = ½
• Example. What is the probability of the same face
appearing on two successive throws of a die?
Solution. We know that the probability of any specific
combination of outcomes, for example, (1,1), (2,2), . . .
(6,6) is 1/6 x 1/6 = 1/36
P (same face) = P (1, 1) + P (2, 2) + . . . + P (6, 6)
= 6 × 1/36 = 1/6
• Example. What is the probability that in two
throws of a die at least one six appears?
Solution. We have already established that P (6) = 1/6 and P (not 6) =
5/6. In two throws, there are four possible outcomes
(6, 6), (6, not 6), (not 6, 6), (not 6, not 6) with the probabilities
P (6, 6) = 1/6 x 1/6 = 1/36
P (6, not 6) = P (not 6, 6) = 1/6 x 5/6 = 5/36
P (not 6, not 6) = 5/6 x 5/6 = 25/36
All outcomes except the last have at least one six. Hence, the
probability of obtaining at least one six is
P (at least one 6) = P (6, 6) + P (6, not 6) + P (not 6, 6)
= 1/36 + 5/36 + 5/36 = 11/36
A more direct way of obtaining this result is to use the normalization
condition. That is,
P (at least one six) = 1 − P (not 6, not 6) = 1 − (5/6)^2 = 1 − 25/36 = 11/36 ≈ 0.306
Example. What is the probability of obtaining at least one six
in four throws of a die?
Solution. We know that in one throw of a die, there are
two outcomes with P (6) = 1/6 and P (not 6) = 5/6 .
Hence, in four throws of a die there are sixteen possible
outcomes, only one of which has no six. That is, in the
fifteen mutually exclusive outcomes, there is at least one six.
We can use the multiplication rule to find that
P (not 6, not 6, not 6, not 6) = P (not 6)^4 = (5/6)^4
and hence
P (at least one six) = 1 − P (not 6, not 6, not 6, not 6)
= 1 − (5/6)^4 = 671/1296 ≈ 0.518
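The counting in this example is easy to check by brute force. The short Python sketch below (not part of the original notes) enumerates all 6^4 = 1296 outcomes of four throws and counts those containing at least one six.

```python
# Brute-force check of P(at least one six in four throws) = 1 - (5/6)^4.
from itertools import product

outcomes = list(product(range(1, 7), repeat=4))      # all 6^4 = 1296 outcomes
at_least_one_six = sum(1 for o in outcomes if 6 in o)

print(at_least_one_six, "/", len(outcomes))          # 671 / 1296
print(at_least_one_six / len(outcomes))              # ~0.518
```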
Complications
p is the probability of success. (1/6 for one die)
q is the probability of failure. (5/6 for one die)
•p + q = 1,
or
q=1–p
When two dice are thrown, what is the
probability of getting only one six?
Complications
•Probability of the six on the first die and not
the second is:
pq = 1/6 × 5/6 = 5/36
•Probability of the six on the second die and
not the first is the same, so:
p(1) = 2pq = 10/36 = 5/18
Simplification
•Probability of no sixes coming up is:
p(0) = qq = 5/6 × 5/6 = 25/36
•The sum of all three probabilities is:
•p(2) + p(1) + p(0) = 1
Simplification
•p(2) + p(1) + p(0) = 1
•p² + 2pq + q² =1
•(p + q)² = 1
The exponent is the number of dice (or
tries).
•Is this general?
Three Dice
• (p + q)³ = 1
• p³ + 3p²q + 3pq² + q³ = 1
• p(3) + p(2) + p(1) + p(0) = 1
• It works! It must be general!
(p + q)^N = 1
Renormalization
Suppose we know that P (i) is proportional
to f (i), where f (i) is a known function. To
obtain the normalized probabilities, we
divide each function f (i) by the sum of all
the unnormalized probabilities. That is,
if P(i) ∝ f(i), and Z = ∑ᵢ f(i), then P(i) = f(i)/Z.
This procedure is called normalization.
• Example. Suppose that in a given class it is three times
as likely to receive a C as an A, twice as likely to obtain a
B as an A, one-fourth as likely to be assigned a D as an
A, and nobody fails the class. What are the probabilities
of getting each grade?
Solution. We first assign the unnormalized probability of
receiving an A as f (A) = 1. Then
f (B ) = 2, f (C ) = 3, and f (D) = 0.25.
Then Z = ∑ f (i) = 1 + 2 + 3 + 0.25 = 6.25.
Hence,
P (A) = f (A)/Z = 1/6.25 = 0.16,
P (B ) = 2/6.25 = 0.32,
P (C ) = 3/6.25 = 0.48, and
P (D) = 0.25/6.25 = 0.04.
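The normalization procedure is easy to express in code. The following minimal Python sketch (the function name normalize is ours, not from the notes) reproduces the grade example above.

```python
# Normalize unnormalized weights f(i) by dividing by Z = sum of all weights.
def normalize(weights):
    Z = sum(weights.values())
    return {k: v / Z for k, v in weights.items()}

f = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 0.25}   # unnormalized weights from the example
print(normalize(f))   # {'A': 0.16, 'B': 0.32, 'C': 0.48, 'D': 0.04}
```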
Meaning of Probability
• How can we assign the probabilities of the various events? If
we say that event E1 is more probable than event E2 (P
(E1 ) > P (E2 )), we mean that E1 is more likely to occur than
E2 . This statement of our intuitive understanding of
probability illustrates that probability is a way of classifying the
plausibility of events under conditions of uncertainty.
Probability is related to our degree of belief in the occurrence
of an event.
• Probability assessments depend on who does the
evaluation and the status of the information the evaluator
has at the moment of the assessment. We always evaluate
the conditional probability, that is, the probability of an event E
given the information I , P (E | I ). Consequently, several
people can have simultaneously different degrees of belief
about the same event, as is well known to investors in the
stock market.
[Figure: IHSG (Jakarta Composite Index) chart]
If rational people have access to the same
information, they should come to the same
conclusion about the probability of an event.
The idea of a coherent bet forces us to make
probability assessments that correspond to our
belief in the occurrence of an event.
Probability assessments should be kept separate
from decision issues.
Decisions depend not only on the probability of the
event, but also on the subjective importance of
say, a given amount of money
Probability and Knowledge
• Probability as a measure of the degree of belief
in the occurrence of an outcome implies that
probability depends on our prior knowledge,
because belief depends on prior knowledge.
• Probability depends on what knowledge we
bring to the problem. If we have no knowledge
other than the possible outcomes, then the best
estimate is to assume equal probability for all
events. However, this assumption is not a
definition, but an example of belief. As an
example of the importance of prior knowledge,
consider the following problem.
Large numbers
We can estimate probabilities empirically by
sampling, that is, by making repeated
measurements of the outcome of independent
events.
Intuitively we believe that if we perform more and more measurements, the calculated average will approach the exact mean of the quantity of interest. A computer can be used to generate the random outcomes; for example, the applet/application at stp.clarku.edu/simulations/cointoss simulates multiple tosses of a single coin.
This idea is called the law of large numbers.
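As a stand-in for the coin-toss applet mentioned above, here is a small Python simulation (our own sketch, using only the standard library) that shows the running fraction of heads approaching 1/2 as the number of tosses grows.

```python
# Law of large numbers: the fraction of heads approaches 1/2 for a fair coin.
import random

random.seed(0)
heads = 0
for n in range(1, 100_001):
    heads += random.random() < 0.5                  # one toss of a fair coin
    if n in (10, 100, 1_000, 10_000, 100_000):
        print(f"n = {n:6d}   fraction of heads = {heads / n:.4f}")
```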
Mean Value
• Consider the probability distribution P (1), P (2), . . . P (n) for
the n possible values of the variable x. In many cases it is
more convenient to describe the distribution of the possible
values of x in a less detailed way. The most familiar way is to
specify the average or mean value of x, which we will denote
as <x>. The definition of the mean value of <x> is
<x> ≡ x₁P(1) + x₂P(2) + . . . + x_n P(n) = ∑ᵢ xᵢ P(i),
where P(i) is the probability of xᵢ. If f(x) is a function of x, then the mean value of f(x) is defined by
<f(x)> ≡ ∑ᵢ f(xᵢ) P(i).
Example: You are offered either a certain $50, or $100 if you flip a coin and get a head and $0 if you get a tail. The mean value for the second choice is
mean value = ∑ᵢ Pᵢ × (value of i),
where the sum is over the possible outcomes and Pᵢ is the probability of outcome i. In this case the mean value is 1/2 × $100 + 1/2 × $0 = $50. We see that the two choices have the same mean value. (Most people prefer the first choice because its outcome is certain.)
If f (x) and g(x) are any two functions of x, then
<f (x) + g(x)> = ∑ [f (xi) + g(xi )] P (i)
= ∑ f (xi) P (i) + ∑ g(xi) P (i)
or
<f (x) + g (x)> = <f (x)> + <g (x)>
if c is a constant, then
<c f (x)> = c <f (x)>
In general, we can define the mth moment of the probability distribution P as
<x^m> ≡ ∑ᵢ xᵢ^m P(i),
where we have let f(x) = x^m. The mean of x is the first moment of the probability distribution.
The mean value of x is a measure of the central value of x
about which the various values of xi are distributed. If
we measure x from its mean, we have that
Δx ≡ x − <x>
<Δx> = <(x − <x>)> = <x> − <x> = 0
That is, the average value of the deviation of x from its
mean vanishes
If only one outcome j were possible, we would have P (i) =
1 for i = j and zero otherwise, that is, the probability
distribution would have zero width. In general, there is
more than one outcome and a possible measure of the
width of the probability distribution is given by
<Δx^2> ≡ <(x − <x>)^2>
The quantity <Δx^2> is known as the dispersion or variance
and its square root is called the standard deviation. It is
easy to see that the larger the spread of values of x
about <x>, the larger the variance.
The use of the square of x − <x> ensures that the
contribution of x values that are smaller and
larger than <x> enter with the same sign. A
useful form for the variance can be found by
noting that
<(x − <x>)^2> = <x^2 − 2x<x> + <x>^2>
= <x^2> − 2<x><x> + <x>^2
= <x^2> − <x>^2
Because <Δx^2> is always nonnegative, it follows that <x^2> ≥ <x>^2.
it is useful to interpret the width of the probability
distribution in terms of the standard deviation σ,
which is defined as the square root of the
variance. The standard deviation of the
probability distribution P (x) is given by
σ_x = sqrt(<Δx^2>) = sqrt(<x^2> − <x>^2)
Example: Find the mean value <x>, the variance <Δx^2>, and the standard deviation σ_x for the value of a single throw of a die.
Solution. Because P(i) = 1/6 for i = 1, . . . , 6, we have that
<x> = 1/6 (1 + 2 + 3 + 4 + 5 + 6) = 7/2
<x^2> = 1/6 (1 + 4 + 9 + 16 + 25 + 36) = 91/6
<Δx^2> = <x^2> − <x>^2 = 91/6 − 49/4 = 35/12 ≈ 2.92
σ_x = sqrt(35/12) ≈ 1.71
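These numbers are easy to verify; a few lines of Python (our own check, not part of the notes) reproduce <x> = 7/2, <x^2> = 91/6, variance 35/12, and σ ≈ 1.71.

```python
# Mean, variance, and standard deviation of a single die with P(i) = 1/6.
faces = range(1, 7)
P = 1 / 6

mean  = sum(P * x for x in faces)         # 3.5
mean2 = sum(P * x**2 for x in faces)      # 15.1666... = 91/6
var   = mean2 - mean**2                   # 2.9166... = 35/12
print(mean, mean2, var, var**0.5)         # sigma ~ 1.708
```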
Homework
Consider a one-dimensional lattice with lattice constant a, as shown in Fig. 1. An atom hops from a site to a nearest-neighbor site every τ seconds. The probabilities of hopping to the right and to the left are p and q = 1 − p, respectively.
(a) Calculate the average position <x> of the atom at the time t = Nτ, where N >> 1.
(b) Calculate the mean square deviation <(x − <x>)^2> at the time t.
Ensemble
Another way of estimating the probability is to perform a
single measurement on many copies or replicas of the
system of interest.
For example, instead of flipping a single coin 100 times in
succession, we collect 100 coins and flip all of them at
the same time. The fraction of coins that show heads is
an estimate of the probability of that event.
The collection of identically prepared systems is called an
ensemble and the probability of occurrence of a single
event is estimated with respect to this ensemble.
The ensemble consists of a large number M of identical
systems, that is, systems that satisfy the same known
conditions.
Information and Uncertainty
Let us define the uncertainty function S(P₁, P₂, . . . , Pᵢ, . . .), where Pᵢ is the probability of event i.
Consider first the case where all the probabilities Pᵢ are equal. Then
P₁ = P₂ = . . . = Pᵢ = 1/Ω,
where Ω is the total number of outcomes. In this case we have S = S(1/Ω, 1/Ω, . . .), or simply S(Ω).
If only one outcome is possible, Ω = 1 and there is no uncertainty:
S(Ω = 1) = 0, and
S(Ω₁) > S(Ω₂) if Ω₁ > Ω₂.
That is, S(Ω) is an increasing function of Ω.
We next consider multiple events.
For example, suppose that we throw a die with Ω₁ equally probable outcomes and flip a coin with Ω₂ equally probable outcomes. The total number of outcomes is Ω = Ω₁Ω₂. If the result of the die is known, the uncertainty associated with the die is reduced to zero, but there still is uncertainty associated with the toss of the coin.
Similarly, we can reduce the uncertainty in the reverse order, but the total uncertainty is still nonzero. These considerations suggest that
S(Ω₁Ω₂) = S(Ω₁) + S(Ω₂), or
S(xy) = S(x) + S(y).
This generalization is consistent with S(Ω) being an increasing function of Ω.
First we take the partial derivative of S(xy) with respect to x and then with respect to y. We let z = xy and obtain
∂S(z)/∂x = (dS(z)/dz)(∂z/∂x) = y dS(z)/dz
∂S(z)/∂y = (dS(z)/dz)(∂z/∂y) = x dS(z)/dz
From S(xy) = S(x) + S(y) we also have
∂S(z)/∂x = dS(x)/dx and ∂S(z)/∂y = dS(y)/dy.
By comparing the right-hand sides,
y dS(z)/dz = dS(x)/dx and x dS(z)/dz = dS(y)/dy.
If we multiply the first by x and the second by y, we obtain
z dS(z)/dz = x dS(x)/dx = y dS(y)/dy.
The first term depends only on z, the second only on x, and the third only on y. Because x and y are independent variables, the three terms must be equal to a constant. Hence we have the desired condition
x dS(x)/dx = A,
where A is a constant. It can be integrated to give
S(x) = A ln x + B.
The integration constant B must be equal to zero to satisfy the condition S(Ω = 1) = 0. The constant A is arbitrary, so we choose A = 1. Hence for equal probabilities we have that
S(Ω) = ln Ω.
What if the probabilities for the various events are unequal? The general form of the uncertainty S is
S = −∑ᵢ Pᵢ ln Pᵢ.
Note that if all the probabilities are equal, then Pᵢ = 1/Ω for all i. In this case
S = −∑ᵢ (1/Ω) ln(1/Ω) = Ω × (1/Ω) ln Ω = ln Ω.
We also see that if outcome j is certain, Pⱼ = 1 and Pᵢ = 0 for i ≠ j, and
S = −1 ln 1 = 0.
That is, if the outcome is certain, the uncertainty is zero and there is no missing information. We have shown that if the Pᵢ are known, then the uncertainty or missing information S can be calculated.
Usually the problem is to determine the probabilities.
Suppose we flip a perfect coin for which there are two possibilities.
We know intuitively that P1 (heads) = P2 (tails) = 1/2.
That is, we would not assign a different probability
to each outcome unless we had information to justify it.
Intuitively we have adopted the principle of least bias or maximum
uncertainty.
Let us reconsider the toss of a coin. In this case S is given by
S = −[P₁ ln P₁ + (1 − P₁) ln(1 − P₁)],
where we have used the fact that P₁ + P₂ = 1. To maximize S we take the derivative with respect to P₁, using d(ln x)/dx = 1/x:
dS/dP₁ = −[ln P₁ + 1 − ln(1 − P₁) − 1] = −ln[P₁/(1 − P₁)] = 0.
The solution satisfies P₁/(1 − P₁) = 1, which is satisfied by P₁ = 1/2.
We can check that this solution is a maximum by calculating the second derivative:
d²S/dP₁² = −[1/P₁ + 1/(1 − P₁)] = −4,
which is less than zero, as expected for a maximum.
Example. The toss of a three-sided die yields events E1 , E2 , and E3 with
a face of one, two, and three points. As a result of tossing many dice, we
learn that the mean number of points is f = 1.9, but we do not know the
individual probabilities. What are the values of P1 , P2 , and P3 that
maximize the uncertainty?
Solution. We have, S = − [ P1 ln P1 + P2 ln P2 + P3 ln P3 ]
We also know that, f = 1P1 + 2P2 + 3P3 ,
and P1 + P2 + P3 = 1. We use the latter condition to eliminate P3
using P3 = 1 − P1 − P2 , and rewrite the above as
f = P1 + 2P2 + 3(1 − P1 − P2 ) = 3 − 2P1 − P2 .
We then use this to eliminate P2 and P3 from the first eq. using
P2 = 3 − f − 2P1 and P3 = f − 2 + P1, then
S = −[P1 ln P1 + (3 − f − 2P1 ) ln(3 − f − 2P1 ) + (f − 2 + P1 ) ln(f − 2 + P1 )].
Because S depends on only P₁, we can differentiate S with respect to P₁ to find its maximum value:
dS/dP₁ = −[ln P₁ − 2 ln(3 − f − 2P₁) + ln(f − 2 + P₁)] = 0,
which gives P₁(f − 2 + P₁) = (3 − f − 2P₁)². For f = 1.9 the solution is P₁ ≈ 0.38, P₂ ≈ 0.33, and P₃ ≈ 0.28.
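These values can be checked numerically. The sketch below (a simple grid search of our own; the step size is arbitrary) scans P1, eliminates P2 and P3 using the constraints, and picks the value that maximizes S.

```python
# Maximize S = -sum P ln P for the three-sided die with mean f = 1.9,
# using the constraints P2 = 3 - f - 2*P1 and P3 = f - 2 + P1.
from math import log

f = 1.9
best = None
P1 = 0.0
while P1 < 0.55:
    P1 += 1e-5
    P2 = 3 - f - 2 * P1
    P3 = f - 2 + P1
    if P2 <= 0 or P3 <= 0:
        continue                              # outside the allowed range
    S = -(P1 * log(P1) + P2 * log(P2) + P3 * log(P3))
    if best is None or S > best[0]:
        best = (S, P1, P2, P3)

print(best)   # maximum near P1 ~ 0.38, P2 ~ 0.33, P3 ~ 0.28
```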
Microstates and Macrostates
Each possible outcome is called a
“microstate”.
The combination of all microstates that give
the same number of spots is called a
“macrostate”.
The macrostate that contains the most
microstates is the most probable to occur.
Microstates and Macrostates
The evolution of a system can be represented by a trajectory in the multidimensional (configuration, phase) space of microparameters. Each point in this space represents a microstate.
During its evolution, the system will only pass through accessible microstates – the ones that do not violate the conservation laws: e.g., for an isolated system, the total internal energy must be conserved.
Microstate: the state of a system specified by describing the quantum state of each molecule in the system. For a classical particle – 6 parameters (xᵢ, yᵢ, zᵢ, p_xi, p_yi, p_zi); for a macro system – 6N parameters.
The statistical approach: to connect the macroscopic observables (averages) to the probability for a certain microstate {ω₁, ω₂, . . . , ω_N} to appear along the system's trajectory in configuration space, P(ω₁, ω₂, . . . , ω_N).
Macrostate: the state of a macro system specified by its macroscopic
parameters. Two systems with the same values of macroscopic parameters
are thermodynamically indistinguishable. A macrostate tells us nothing about a
state of an individual particle.
For a given set of constraints (conservation laws), a system can be in many
macrostates.
The Phase Space vs. the Space of Macroparameters
[Figure: a single macrostate – one point (P, V, T) on the surface defined by an equation of state – corresponds to numerous microstates in the multi-dimensional configuration (phase) space.]
Examples: Two-Dimensional Configuration Space
[Figure: motion of a particle in a one-dimensional box, −L ≤ x ≤ L, shown in the (x, p_x) plane at fixed kinetic energy K = K₀.] Here "macrostates" are characterized by a single parameter: the kinetic energy K₀.
Another example: the one-dimensional harmonic oscillator with potential U(x) and K + U = const. [Figure: a closed curve of constant energy in the (x, p_x) plane.] Each "macrostate" corresponds to a continuum of microstates, which are characterized by specifying the position and momentum.
The Fundamental Assumption of Statistical Mechanics
The ergodic hypothesis: an isolated system in an equilibrium state, evolving in time, will pass through all the accessible microstates at the same recurrence rate, i.e., all accessible microstates are equally probable.
The ensemble of all equi-energetic states (microstates which correspond to the same energy) is called a microcanonical ensemble.
Note that the assumption that a system is isolated is important. If a system is coupled to a heat reservoir and is able to exchange energy, then in order to replace the system's trajectory by an ensemble, we must determine the relative occurrence of states with different energies. For example, an ensemble whose states' recurrence rate is given by their Boltzmann factor (e^(−E/k_B T)) is called a canonical ensemble.
The average over long times will equal the average over the ensemble of all equi-energetic microstates: if we take a snapshot of a system with N microstates, we will find the system in any of these microstates with the same probability.
Probability for a stationary system: many identical measurements on a single system are equivalent to a single measurement on many copies of the system.
Probability of a Macrostate, Multiplicity
Probability of a particular microstate of a microcanonical ensemble = 1 / (# of all accessible microstates)
The probability of a certain macrostate is determined by how many microstates correspond to this macrostate – the multiplicity Ω of the given macrostate:
Probability of a particular macrostate = Ω (# of microstates that correspond to the given macrostate) / (# of all accessible microstates)
This approach will help us to understand why some of the macrostates are more probable than others, and, eventually, by considering interacting systems, we will understand the irreversibility of processes in macroscopic systems.
Probability
"Probability theory is nothing but common sense reduced to calculations" – Laplace (1819)
An event (very loosely defined) – any possible outcome of some measurement. An event is a statistical (random) quantity if the probability of its occurrence, P, in the process of measurement is < 1.
The "sum" of two events: in the process of measurement, we observe either one of the events. Addition rule for mutually exclusive events:
P(i or j) = P(i) + P(j)
(mutually exclusive events – the occurrence of one event excludes the occurrence of the other).
The "product" of two events: in the process of measurement, we observe both events. Multiplication rule for independent events:
P(i and j) = P(i) × P(j)
(independent events – one event does not change the probability for the occurrence of the other).
Example: What is the probability of the same face appearing on two successive throws of a die?
The probability of any specific combination, e.g., (1,1), is 1/6 × 1/6 = 1/36 (multiplication rule). Hence, by the addition rule, P(same face) = P(1,1) + P(2,2) + . . . + P(6,6) = 6 × 1/36 = 1/6.
A macroscopic observable A (averaged over all accessible microstates):
<A> = ∑ P(ω₁, . . . , ω_N) A(ω₁, . . . , ω_N)
Two Interacting Einstein Solids, Macropartitions
Suppose that we bring two Einstein solids A and B (two sub-systems with
NA, UA and NB, UB) into thermal contact, to form a larger isolated system.
What happens to these solids (macroscopically) after they have been
brought into contact?
The combined system: N = N_A + N_B, U = U_A + U_B.
[Figure: two solids A (N_A, U_A) and B (N_B, U_B) exchanging energy.]
Macropartition: a given pair of macrostates for sub-systems A and B that are consistent with conservation of the total energy U = U_A + U_B. Different macropartitions amount to different ways that the energy can be macroscopically divided between the sub-systems.
Example: the pair of macrostates where U_A = 2ε and U_B = 4ε is one possible macropartition of the combined system with U = 6ε.
As time passes, the system of two solids will randomly shift between
different microstates consistent with the constraint that U = const.
Question: what would be the most probable macropartition for given
NA, NB , and U ?
Problem:
Consider the system consisting of two Einstein solids in thermal contact. A certain macropartition has a multiplicity of 6×10^24, while the total number of microstates available to the system in all macropartitions is 3×10^34. What is the probability of finding the system in this macropartition?
Imagine that the system is initially in the macropartition with a multiplicity of 6×10^24. Consider another macropartition of the same system with a multiplicity of 6×10^26. If we look at the system a short time later, how many times more likely is it to have moved to the second macropartition than to have stayed with the first?
The Multiplicity of Two Sub-Systems Combined
The probability of a macropartition is proportional to its multiplicity:
Ω_AB = Ω_A × Ω_B
(macropartition A+B; sub-system A; sub-system B)
Example: two one-atom "solids" brought into thermal contact, with the total U = 6ε. Possible macropartitions for N_A = N_B = 3 (oscillators), U = q_A + q_B = 6 (U_A, U_B in units of ε):
Macropartition   U_A   U_B   Ω_A   Ω_B   Ω_AB
0:6               0     6     1    28     28
1:5               1     5     3    21     63
2:4               2     4     6    15     90
3:3               3     3    10    10    100
4:2               4     2    15     6     90
5:1               5     1    21     3     63
6:0               6     0    28     1     28
Grand total # of microstates:
Ω = (U/ε + N − 1)! / [(U/ε)! (N − 1)!] = (6 + 6 − 1)! / [6! (6 − 1)!] = 462
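The table can be reproduced with a few lines of Python using the Einstein-solid multiplicity Ω(N, q) = C(q + N − 1, q); this is our own check, not code from the notes.

```python
# Macropartition multiplicities for two Einstein solids with N_A = N_B = 3
# oscillators sharing U = 6 energy units.
from math import comb

def omega(N, q):
    """Multiplicity of an Einstein solid with N oscillators and q energy units."""
    return comb(q + N - 1, q)

NA = NB = 3
U = 6
total = 0
for qA in range(U + 1):
    qB = U - qA
    w = omega(NA, qA) * omega(NB, qB)
    total += w
    print(f"{qA}:{qB}  Omega_A = {omega(NA, qA):3d}  Omega_B = {omega(NB, qB):3d}  Omega_AB = {w:4d}")

print("grand total =", total, "=", omega(NA + NB, U))   # 462
```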
Where is the Maximum? The Average Energy per Atom
Let's explore how the macropartition multiplicity for two sub-systems A and B (N_A, N_B, ε_A = ε_B = ε) in thermal contact depends on the energy of one of the sub-systems. In the high-T limit (q >> N) the multiplicities are
Ω_A(N_A, U_A) ≈ (e U_A / (N_A ε))^(N_A),   Ω_B(N_B, U_B) ≈ (e U_B / (N_B ε))^(N_B),
so, with U_B = U − U_A,
Ω_AB = Ω_A(N_A, U_A) Ω_B(N_B, U_B) ≈ (e/(N_A ε))^(N_A) (e/(N_B ε))^(N_B) U_A^(N_A) (U − U_A)^(N_B).
Setting dΩ_AB/dU_A = 0 gives
N_A U_A^(N_A − 1) (U − U_A)^(N_B) − N_B U_A^(N_A) (U − U_A)^(N_B − 1) = 0,
that is, N_A / U_A = N_B / (U − U_A), or
U_A / N_A = U_B / N_B.
For two systems in thermal contact, the equilibrium (most probable) macropartition of the combined system is the one where the average energy per atom in each system is the same (the basis for introducing the temperature).
For two identical sub-systems (N_A = N_B), Ω_AB(U_A) is peaked at U_A = U_B = ½U.
[Figure: Ω_A(U_A) increases with U_A, Ω_B(U_A) decreases, and the product Ω_AB(U_A) = Ω_A × Ω_B is sharply peaked at U_A = U/2.]
At home: find the position of the maximum of Ω_AB(U_A) for N_A = 200, N_B = 100, U = 180ε.
Sharpness of the Multiplicity Function
How sharp is the peak? Let's consider small deviations from the maximum for two identical sub-systems (N_A = N_B = N):
U_A = (U/2)(1 + x),   U_B = (U/2)(1 − x),   x << 1
Ω_AB ∝ (U_A U_B)^N = [(U/2)^2 (1 − x^2)]^N = (U/2)^(2N) (1 − x^2)^N
Example: N = 100,000, x = 0.01:
(1 − x)^N (1 + x)^N = (1 − x^2)^N ≈ (0.9999)^100,000 ≈ 4.5×10^−5 << 1
More rigorously (p. 65):
Ω_AB ≈ (e/(Nε))^(2N) (U/2)^(2N) exp[ −N (ΔU / (U/2))^2 ],   where ΔU = U_A − U/2
– a Gaussian function of ΔU. The peak width is set by N (ΔU / (U/2))^2 ≈ 1, that is,
ΔU ≈ (U/2) / √N.
When the system becomes large, the probability as a function of UA
(macropartition) becomes very sharply peaked.
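The quoted example is quick to reproduce; the sketch below (ours) evaluates (1 − x^2)^N with logarithms so that very large N does not underflow.

```python
# Relative height of the multiplicity function at x = 0.01 away from the peak.
from math import exp, log

x = 0.01
for N in (100, 10_000, 100_000, 1_000_000):
    ratio = exp(N * log(1 - x**2))       # Omega_AB(x) / Omega_AB(0)
    print(f"N = {N:9d}   (1 - x^2)^N = {ratio:.3e}")
# For N = 100,000 this gives ~4.5e-5, as quoted above.
```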
Problem:
Consider the system consisting of two Einstein solids P and Q in thermal equilibrium. Assume that we know the number of atoms in each solid. What do we know if we also know
(a) the quantum state of each atom in each solid?
(b) the total energy of each of the two solids?
(c) the total energy of the combined system?

       the system's   the system's   the system's
       macrostate     microstate     macropartition
(a)        X              X              X
(b)        X                             X
(c)        X                             X (fluctuations)
Implications? Irreversibility!
The vast majority of microstates are in macropartitions close to the most
probable one (in other words, because of the “narrowness” of the
macropartition probability graph). Thus,
(a) If the system is not in the most probable macropartition, it will rapidly
and inevitably move toward that macropartition. The reason for this
“directionality” (irreversibility): there are far more microstates in that
direction than away. This is why energy flows from “hot” to “cold” and
not vice versa.
(b) It will subsequently stay at that macropartition (or very near to it), in
spite of the random shuffling of energy back and forth between the
two solids.
When two macroscopic solids are in thermal equilibrium with each other,
completely random and reversible microscopic processes (leading to
random shuffling between microstates) tend at the macroscopic level to
push the solids inevitably toward an equilibrium macropartition (an
irreversible macro behavior). Any random fluctuations away from the
most likely macropartition are extremely small !
Problem:
Imagine that you discover a strange substance whose multiplicity is always 1, no matter how much energy you put into it. If you put an object made of this substance (sub-system A) into thermal contact with an Einstein solid having the same number of atoms but much more energy (sub-system B), what will happen to the energies of these sub-systems?
A. Energy flows from B to A until they have the same energy.
B. Energy flows from A to B until A has no energy.
C. No energy will flow from B to A at all.
Two model systems with fixed positions of particles and discrete energy levels
These models are attractive because they can be described in terms of discrete microstates which can be easily counted (for a continuum of microstates, as in the example with a freely moving particle, we still need to learn how to do this). This simplifies the calculation of Ω. On the other hand, the results will be applicable to many other, more complicated models. Despite the simplicity of the models, they describe a number of experimental systems in a surprisingly precise manner.
- the two-state paramagnet ("limited" energy spectrum)
- the Einstein model of a solid ("unlimited" energy spectrum)
The Two-State Paramagnet
A system of non-interacting magnetic dipoles in an external magnetic field B; each dipole can have only two possible orientations along the field, either parallel or anti-parallel to this axis (e.g., a particle with spin ½). There are no "quadratic" degrees of freedom (unlike in an ideal gas, where the kinetic energies of molecules are unlimited); the energy spectrum of the particles is confined within a finite interval of E (just two allowed energy levels).
A particular microstate (↑↓↓↑ . . .) is specified if the directions of all spins are specified. A macrostate is specified by the total # of dipoles that point "up", N↑ (the # of dipoles that point "down" is N↓ = N − N↑).
[Figure: the two energy levels of a dipole in the field B: E₂ = +μB (anti-parallel) and E₁ = −μB (parallel); the zero of energy is an arbitrary choice. N = N↑ + N↓.]
N↑ – the number of "up" spins
N↓ – the number of "down" spins
μ – the magnetic moment of an individual dipole (spin)
The total magnetic moment (a macroscopic observable):
M = μN↑ − μN↓ = μ(N↑ − N↓) = μ(2N↑ − N)
The energy of a single dipole in the external magnetic field:
εᵢ = −μᵢ · B = −μB for μ parallel to B, +μB for μ anti-parallel to B
The energy of a macrostate:
U = −M · B = −μB(N↑ − N↓) = μB(N − 2N↑)
Example
Consider two spins. There are four possible microstates: ↑↑, ↑↓, ↓↑, ↓↓, with total magnetic moments
M = 2μ, 0, 0, −2μ.
In zero field, all these microstates have the same energy (degeneracy). Note that the two microstates with M = 0 have the same energy even when B ≠ 0: they belong to the same macrostate, which has multiplicity Ω = 2. The macrostates can be classified by their moment M and multiplicity Ω:
M = 2μ, 0, −2μ
Ω = 1, 2, 1
For three spins there are eight microstates, with moments M = 3μ, μ, μ, μ, −μ, −μ, −μ, −3μ. The corresponding macrostates are:
M = 3μ, μ, −μ, −3μ
Ω = 1, 3, 3, 1
The Multiplicity of a Two-State Paramagnet
Each of the microstates is characterized by N numbers; the number of equally probable microstates is 2^N, and the probability to be in a particular microstate is 1/2^N.
For a two-state paramagnet in zero field, the energy of all macrostates is the same (0). A macrostate is specified by (N, N↑). Its multiplicity is the number of ways of choosing N↑ objects out of N:
Ω(N, 0) = 1
Ω(N, 1) = N
Ω(N, 2) = N(N − 1)/2
Ω(N, 3) = N(N − 1)(N − 2)/(3·2)
...
Ω(N, n) = N(N − 1)···(N − n + 1) / (n···3·2·1) = N! / [n!(N − n)!]
n! – n factorial = 1·2·...·n; 0! ≡ 1 (there is exactly one way to arrange zero objects).
The multiplicity of a macrostate of a two-state paramagnet with (N, N↑):
Ω(N, N↑) = N! / (N↑! N↓!) = N! / [N↑! (N − N↑)!]
The Probability of Macrostates of a Two-State PM (B = 0)
P(N, N↑) = Ω(N, N↑) / (# of all microstates) = Ω(N, N↑) / Ω(N, all N↑) = Ω(N, N↑) / 2^N
Using Stirling's approximation N! ≈ N^N e^(−N):
P(N, N↑) = N! / [N↑! (N − N↑)! 2^N] ≈ N^N e^(−N) / [N↑^(N↑) e^(−N↑) (N − N↑)^(N − N↑) e^(−(N − N↑)) 2^N] = N^N / [N↑^(N↑) (N − N↑)^(N − N↑) 2^N],
because N↑ + (N − N↑) = N.
As the system becomes larger, the P(N, N↑) graph becomes more sharply peaked. For N = 1: Ω(1, N↑) = 1, 2^N = 2, P(1, N↑) = 0.5.
[Figure: P(1, N↑), P(15, N↑), and P(10^23, N↑) plotted versus N↑; the distribution becomes relatively narrower as N grows – random orientation of spins in B = 0 is overwhelmingly more probable. See http://stat-www.berkeley.edu/~stark/Java/Html/]
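The sharpening of P(N, N↑) is easy to see numerically. The following sketch (ours; the choice of N values is arbitrary) computes the exact peak probability C(N, N/2)/2^N and the relative width σ/N for increasing N.

```python
# Peak height and relative width of P(N, N_up) = C(N, N_up) / 2^N for p = 1/2.
from math import comb, sqrt

for N in (1, 15, 1001):
    peak = comb(N, N // 2) / 2**N     # probability of the most probable macrostate
    sigma = sqrt(N) / 2               # standard deviation of N_up for p = q = 1/2
    print(f"N = {N:5d}   P(peak) = {peak:.4f}   sigma/N = {sigma / N:.4f}")
```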
Bernoulli Processes and the
Binomial Distribution
Because most physicists spend little time
gambling, we will have to develop our
intuitive understanding of probability in
other ways. Our strategy will be to first
consider some physical systems, e.g
magnetic moment or spin, for which we can
calculate the probability distribution by
analytical methods. Then we will use the
computer to generate more data to analyze.
• Consider a system of N
noninteracting magnetic dipoles
each having a magnetic moment
µ and associated spin in an
external magnetic field B . The
field B is in the up (+z) direction.
According to quantum mechanics
the component of the magnetic
dipole moment along a given
axis is limited to certain discrete
values. Spin 1/2 implies that a
spin can either point up (parallel
to B ) or down (antiparallel to B ).
The energy of interaction of each
spin with the magnetic field is E
= −µB if the spin is up and +µB if
the spin is down. This model is a
simplification of more realistic
magnetic systems.
Take p to be the probability that the spin (magnetic
moment) is up and q the probability that the spin is down.
Because there are no other possible outcomes, we have
p +q = 1 or q = 1 −p. If B = 0, there is no preferred spatial
direction and p = q = 1/2. For B = 0 we do not yet know
how to calculate p and for now we will assume that p is a
known parameter.
We associate with each spin a random variable si which
has the values ±1 with probability p and q, respectively.
One of the quantities of interest is the magnetization M ,
which is the net magnetic moment of the system. For a
system of N spins the magnetization is given by
M = µ(s1 + s2 + . . . + sN ) = µ ∑ si
take µ = 1
first calculate the mean value of M , then its
variance, and finally the probability
distribution P (M ) that the system has
magnetization M .
To compute the mean value of M , we need
to take the mean values of both sides
<M> = <∑ si> = ∑ < si>
Because the probability that any spin has the value ±1 is
the same for each spin, the mean value of each spin is
the same, that is, <s1> = <s2> = . . . = <sN> ≡ <s>.
Therefore the sum consists of N equal terms and can be
written as
<M> = N<s>
The meaning of above equation is that the mean
magnetization is N times the mean magnetization of a
single spin.
Because <s> = (1 × p) + (−1 × q) = p − q, we have that
<M> = N (p − q)
Let us calculate the variance of M, that is, <(M − <M>)^2>. We write
ΔM = M − <M> = ∑ᵢ Δsᵢ,
where
Δsᵢ ≡ sᵢ − <s>.
• Example: let us calculate <(ΔM)^2> for N = 3 spins.
• Solution:
(ΔM)^2 = (Δs₁ + Δs₂ + Δs₃)(Δs₁ + Δs₂ + Δs₃)
= [(Δs₁)^2 + (Δs₂)^2 + (Δs₃)^2] + 2[Δs₁Δs₂ + Δs₁Δs₃ + Δs₂Δs₃]
We take the mean value, interchange the order of the sums and averages, and write
<(ΔM)^2> = [<(Δs₁)^2> + <(Δs₂)^2> + <(Δs₃)^2>] + 2[<Δs₁Δs₂> + <Δs₁Δs₃> + <Δs₂Δs₃>]
Then <ΔsᵢΔsⱼ> = <Δsᵢ><Δsⱼ> = 0 for i ≠ j, because <Δsᵢ> = 0 and different spins are statistically independent (the spins do not interact), so each cross term vanishes on the average. Then
<(ΔM)^2> = <(Δs₁)^2> + <(Δs₂)^2> + <(Δs₃)^2>
Because each spin is equivalent on the average, each term is equal. Hence, we obtain the desired result
<(ΔM)^2> = 3<(Δs₁)^2>
The variance of M is 3 times the variance of a single spin, that is, the variance is additive.
We can evaluate <(ΔM)^2> further by finding an explicit expression for <(Δs)^2>. We have <s^2> = [1^2 × p] + [(−1)^2 × q] = p + q = 1. Hence
<(Δs)^2> = <s^2> − <s>^2 = 1 − (p − q)^2 = 1 − (2p − 1)^2 = 1 − 4p^2 + 4p − 1 = 4p(1 − p) = 4pq
<(ΔM)^2> = 3(4pq)
For N non-interacting spins,
<(ΔM)^2> = N(4pq) = 4Npq
Because of the simplicity of a system of non interacting spins,
we can calculate the probability distribution itself. Let us
consider the statistical properties of a system of N = 3 non
interacting spins. Because each spin can be in one of two
states, there are 2^N = 2^3 = 8 distinct outcomes. Because each
spin is independent
of the other spins, we can use the multiplication rule to
calculate the probabilities of each outcome. Although each
outcome is distinct, several of the configurations have the
same number of up spins.
One quantity of interest is the probability P_N(n) that n spins are up out of a total of N spins. For example, there are three states with n = 2, each with probability p^2 q, so the probability that two spins are up is equal to 3p^2 q. For N = 3 we see from the figure that
P₃(n = 3) = p^3
P₃(n = 2) = 3p^2 q
P₃(n = 1) = 3pq^2
P₃(n = 0) = q^3
N = 4 → …, N = 6 → …, N = n → …
Example: Find the first two moments of P₃(n).
Solution. The first moment <n> of the distribution is given by
<n> = 0 × q^3 + 1 × 3pq^2 + 2 × 3p^2 q + 3 × p^3
= 3p(q^2 + 2pq + p^2) = 3p(q + p)^2 = 3p
Similarly, the second moment <n^2> of the distribution is given by
<n^2> = 0 × q^3 + 1^2 × 3pq^2 + 2^2 × 3p^2 q + 3^2 × p^3
= 3p(q^2 + 4pq + 3p^2) = 3p(q + 3p)(q + p)
= 3p(q + 3p) = (3p)^2 + 3pq
Hence
<(n − <n>)^2> = <n^2> − <n>^2 = 3pq
First, in each trial there are only two outcomes, for
example, up or down, heads or tails, and right or
left.
Second, the result of each trial is independent of
all previous trials, for example, the drunken
sailor has no memory of his or her previous
steps.
This type of process is called a Bernoulli process (after the mathematician Jacob Bernoulli, 1654–1705).
• Because of the importance of magnetic systems, we will cast our discussion of Bernoulli processes in terms of the non-interacting magnetic moments of spin 1/2.
• The main quantity of interest is the probability P_N(n), which we now calculate for arbitrary N and n. We know that a particular outcome with n up spins and n′ down spins occurs with probability p^n q^(n′). We write the probability P_N(n) as
P_N(n) = W_N(n, n′) p^n q^(n′)
where n′ = N − n and W_N(n, n′) is the number of distinct configurations of N spins with n up spins and n′ down spins. From our discussion of N = 3 non-interacting spins, we already know the first several values of W_N(n, n′).
• We can determine the general form of WN (n, n′ ) by
obtaining a recursion relation between WN and W N −1 . A
total of n up spins and n′ down spins out of N total spins
can be found by adding one spin to N − 1 spins. The
additional spin is either
(a) up if there are (n − 1) up spins and n′ down spins, or
(b) down if there are n up spins and (n′ − 1) down spins.
• Because there are WN (n − 1, n′ ) ways of reaching the
first case and WN (n, n′ − 1) ways in the second case,
we obtain the recursion relation
WN (n, n′ ) = W N −1 (n − 1, n′ ) + W N −1 (n, n′ − 1).
• If we begin with the known values W0 (0, 0) = 1, W1 (1, 0)
= W1 (0, 1) = 1, we can use the recursion relation to
construct WN (n, n′ ) for any desired N . For example,
W₂(2, 0) = W₁(1, 0) + W₁(2, −1) = 1 + 0 = 1
W₂(1, 1) = W₁(0, 1) + W₁(1, 0) = 1 + 1 = 2
W₂(0, 2) = W₁(−1, 2) + W₁(0, 1) = 0 + 1 = 1
It shows that WN (n, n′ ) forms a pyramid or (a Pascal)
triangle.
The values of the
first few coefficients
WN (n, n′ ). Each
number is the sum
of the two numbers
to the left and right
above it. This
construction is
called a Pascal
triangle.
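The recursion relation is simple to implement; the sketch below (our own, not from the notes) builds each row of the triangle from the previous one and checks it against N!/[n!(N − n)!].

```python
# Build W_N(n, n') from W_N(n, n') = W_{N-1}(n-1, n') + W_{N-1}(n, n'-1),
# starting from W_0(0, 0) = 1, and compare with the binomial coefficient.
from math import comb

def pascal_row(N):
    """Return [W_N(0, N), W_N(1, N-1), ..., W_N(N, 0)] built by recursion."""
    row = [1]
    for _ in range(N):
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]
    return row

for N in range(6):
    assert pascal_row(N) == [comb(N, n) for n in range(N + 1)]
    print(N, pascal_row(N))
```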
It is straightforward to show by induction that the general expression is
W_N(n, n′) = N! / (n! n′!) = N! / [n! (N − n)!].
Note the convention 0! = 1.
Binomial Distribution
• Probability of n successes in N attempts:
P_N(n) = N! / [n!(N − n)!] p^n q^(N−n),
where q = 1 − p and (p + q)^N = 1.
Note that for p = q = 1/2, P_N(n) reduces to
P_N(n) = N! / [n!(N − n)!] (1/2)^N.
[Figure: the binomial distribution P₁₆(n) for p = q = 1/2 and N = 16.]
Thermodynamic Probability
• The term with all the factorials in the previous equation is the number of microstates that will lead to the particular macrostate. It is called the "thermodynamic probability", w_n:
w_n = N! / [n!(N − n)!]
Microstates
• The total number of microstates is:
Ω = ∑_n w_n
True probability: P(n) = w_n / Ω
For a very large number of particles, Ω ≈ w_max.
Mean of the Binomial Distribution
<n> = ∑_n P(n) n,   where   P(n) = N! / [n!(N − n)!] p^n q^(N−n)
Notice that, treating p and q as independent variables for the moment,
p ∂P(n)/∂p = n P(n).
Therefore
<n> = ∑_n P(n) n = ∑_n p ∂P(n)/∂p = p ∂/∂p ∑_n P(n) = p ∂/∂p (p + q)^N = pN(p + q)^(N−1) = pN × 1^(N−1)
<n> = pN
Standard Deviation (σ)
σ^2 = <(n − <n>)^2> = ∑_n P(n)(n − <n>)^2
Expanding the square:
<(n − <n>)^2> = <n^2 − 2n<n> + <n>^2> = <n^2> − 2<n><n> + <n>^2 = <n^2> − <n>^2
To find <n^2>, apply the operator (p ∂/∂p) twice:
<n^2> = ∑_n P(n) n^2 = (p ∂/∂p)(p ∂/∂p) ∑_n P(n) = (p ∂/∂p)[pN(p + q)^(N−1)]
= p[N(p + q)^(N−1) + pN(N − 1)(p + q)^(N−2)]
= pN[1 + p(N − 1)]   (using p + q = 1)
= pN(q + pN) = (pN)^2 + Npq
Hence
σ^2 = <n^2> − <n>^2 = (pN)^2 + Npq − (pN)^2 = Npq
σ = √(Npq)
For a Binomial Distribution
<n> = pN
σ = √(Npq)
σ/<n> = √(Npq)/(pN) = √(q/(Np))
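The results <n> = pN and σ = √(Npq) can also be confirmed by direct summation over the distribution, without the generating-function trick; the parameters below (N = 60 throws, p = 1/6) are just an illustrative choice of ours.

```python
# Direct check of the mean and standard deviation of the binomial distribution.
from math import comb, sqrt

def binom_pmf(n, N, p):
    return comb(N, n) * p**n * (1 - p)**(N - n)

N, p = 60, 1 / 6                        # e.g. number of sixes in 60 throws of a die
mean  = sum(n * binom_pmf(n, N, p) for n in range(N + 1))
mean2 = sum(n**2 * binom_pmf(n, N, p) for n in range(N + 1))
sigma = sqrt(mean2 - mean**2)

print(mean, p * N)                      # 10.0  10.0
print(sigma, sqrt(N * p * (1 - p)))     # 2.886...  2.886...
```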
Coins
• Toss 6 coins. Probability of n heads:
P(n) = N! / [n!(N − n)!] p^n q^(N−n) = 6! / [n!(6 − n)!] (1/2)^n (1/2)^(6−n)
P(n) = 6! / [n!(6 − n)!] (1/2)^6
For Six Coins
[Figure: binomial distribution of the number of heads n = 0, . . . , 6 for N = 6 fair coins; the peak probability is about 0.31 at n = 3.]
For 100 Coins
[Figure: binomial distribution for N = 100 fair coins; the peak probability is about 0.08 at n = 50.]
For 1000 Coins
[Figure: binomial distribution for N = 1000 fair coins; the peak probability is about 0.025 at n = 500, and the distribution is relatively much narrower.]
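The three figures can be regenerated with the sketch below (ours; it assumes matplotlib is available, and the styling is arbitrary).

```python
# Binomial distributions for N = 6, 100, 1000 fair coins.
import matplotlib.pyplot as plt
from math import comb

fig, axes = plt.subplots(1, 3, figsize=(12, 3))
for ax, N in zip(axes, (6, 100, 1000)):
    n = range(N + 1)
    P = [comb(N, k) / 2**N for k in n]
    ax.bar(n, P, width=1.0)
    ax.set_title(f"N = {N}")
    ax.set_xlabel("successes n")
axes[0].set_ylabel("probability")
plt.tight_layout()
plt.show()
```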
Math required to bridge the gap between 1 and 10^23
Typically, N is huge for macroscopic systems, and the multiplicity is unmanageably large – for an Einstein solid with 10^23 atoms,
Ω ~ 10^(10^23).
One of the ways to deal with these numbers is to take their logarithm [in fact, the entropy S of the macrostate is k_B ln Ω of the macrostate]. Thus, we need to learn how to deal with logarithms of huge numbers:
e^(ln x) = x
ln(xy) = ln x + ln y
ln(x/y) = ln x − ln y
ln(x^y) = y ln x
e^x = 10^(x/ln 10) ≈ 10^(0.43x)
Stirling's Approximation for N! (N >> 1)
Multiplicity depends on N!, and we need an approximation for ln(N!):
ln N! = ln 1 + ln 2 + ln 3 + · · · + ln N ≈ ∫₁^N ln x dx = [x ln x − x]₁^N ≈ N ln N − N
ln N! ≈ N ln N − N,   or   N! ≈ (N/e)^N
More accurately:
N! ≈ (N/e)^N √(2πN)
Check:
ln N! ≈ N ln N − N + ½ ln N + ½ ln(2π) ≈ N ln N − N,
because ln N << N for large N.
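A quick numerical check of the two forms of Stirling's approximation (our own sketch; math.lgamma(N + 1) returns ln N! directly):

```python
# Compare ln N! with N ln N - N and with the more accurate form.
from math import lgamma, log, pi

for N in (10, 100, 1000):
    exact  = lgamma(N + 1)                                # ln N!
    simple = N * log(N) - N
    better = N * log(N) - N + 0.5 * log(2 * pi * N)
    print(f"N = {N:5d}   ln N! = {exact:11.3f}   NlnN-N = {simple:11.3f}   "
          f"with 0.5*ln(2*pi*N) = {better:11.3f}")
```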
Stirling's Approximation
For large N: ln N! ≈ N ln N − N.
Multiple outcomes for large N
For N trials distributed over outcomes with occupation numbers N₁, N₂, N₃, . . . (∑ᵢ Nᵢ = N), the number of distinct arrangements is
w = N! / (N₁! N₂! N₃! · · ·) = N! / ∏ᵢ Nᵢ!
Taking the logarithm and using Stirling's approximation:
ln w = ln N! − ∑ᵢ ln Nᵢ! ≈ (N ln N − N) − ∑ᵢ (Nᵢ ln Nᵢ − Nᵢ)
ln w ≈ N ln N − ∑ᵢ Nᵢ ln Nᵢ
Number Expected
• Toss 6 coins N times. Probability of n heads:
P(n) = 6! / [n!(6 − n)!] (1/2)^n (1/2)^(6−n) = 6! / [n!(6 − n)!] (1/2)^6
• The number of times n heads is expected to come up is
<number of occurrences of n> = N P(n)
The Gaussian Distribution as a Limit of the
Binomial Distribution
for N>>1, PN (n) is a rapidly varying function of n near n = pN ,
and for this reason we do not want to approximate PN (n)
directly. Because the logarithm of PN (n) is a slowly varying
function, we expect that the power series expansion of ln PN
(n) will converge. Hence, we expand ln PN (n) in a Taylor
series about the value of n = ñ at which ln PN (n) reaches its
maximum value. We will write p(n) instead of PN (n) because
we will treat n as a continuous variable and hence p(n) is a
probability density. We find
ln p(n) = ln p(ñ) + (n − ñ) [d ln p(n)/dn]_(n=ñ) + ½ (n − ñ)^2 [d^2 ln p(n)/dn^2]_(n=ñ) + · · ·
• Because we have assumed that the expansion is about the maximum n = ñ, the first derivative d ln p(n)/dn|_(n=ñ) must be zero. For the same reason the second derivative must be negative. We assume that the higher terms can be neglected and adopt the notation
ln A = ln p(n = ñ)   and   B = −[d^2 ln p(n)/dn^2]_(n=ñ).
Then
ln p(n) ≈ ln A − ½ B (n − ñ)^2,   or   p(n) ≈ A e^(−½ B (n − ñ)^2).
We next use Stirling's approximation to evaluate the first two derivatives of ln p(n) and the value of ln p(n) at its maximum, to find the parameters A, B, and ñ. From the binomial distribution we take the logarithm to obtain
ln p(n) = ln N! − ln n! − ln(N − n)! + n ln p + (N − n) ln q.
Using the relation d(ln n!)/dn ≈ ln n (valid for large n), we have
d(ln p(n))/dn = −ln n + ln(N − n) + ln p − ln q.
The most probable value of n is found by finding the value of n that satisfies the condition d ln p/dn = 0. We find
ln[(N − ñ)p / (ñq)] = 0,
or (N − ñ)p = ñq. If we use the relation p + q = 1, we obtain
ñ = pN.
Note that ñ = <n>, that is, the value of n for which p(n) is a maximum is also the mean value of n.
From d(ln p(n))/dn = −ln n + ln(N − n) + ln p − ln q, the second derivative is
d^2(ln p(n))/dn^2 = −1/n − 1/(N − n).
Then
B = −[d^2 ln p(n)/dn^2]_(n=ñ) = 1/ñ + 1/(N − ñ) = 1/(pN) + 1/(qN) = 1/(Npq),
or B = 1/σ^2, where σ^2 = Npq is the variance of n. Evaluating ln p(ñ) (or simply requiring normalization) gives A = (2πσ^2)^(−1/2), so
p(n) = (2πσ^2)^(−1/2) e^(−(n − ñ)^2 / 2σ^2)
(Gaussian probability density)
• Gaussian probability density is valid for
large values of N and for values of n near
<n>. Even for relatively small values of N ,
the Gaussian approximation is a good
approximation for most values of n.
• The most important feature of the
Gaussian probability distribution is that its
relative width, σn /<n>, decreases as N−1/2 .
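The quality of the Gaussian approximation is easy to inspect for a moderate N; the sketch below (ours, with N = 100 and p = 1/2 chosen arbitrarily) compares the exact binomial probabilities with the Gaussian density near n = pN.

```python
# Binomial P_N(n) versus its Gaussian approximation around n = pN.
from math import comb, exp, pi, sqrt

N, p = 100, 0.5
q = 1 - p
sigma2 = N * p * q

for n in (40, 45, 50, 55, 60):
    binom = comb(N, n) * p**n * q**(N - n)
    gauss = exp(-(n - p * N)**2 / (2 * sigma2)) / sqrt(2 * pi * sigma2)
    print(f"n = {n:3d}   binomial = {binom:.5f}   gaussian = {gauss:.5f}")
```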
The Poisson distribution and Should
You Fly in Air-planes?
• We now return to the question of whether or not it is safe to fly. If the probability of a plane crashing is p = 10^−5, then 1 − p is the probability of surviving a single flight. The probability of surviving N flights is then
P_N = (1 − p)^N.
For N = 400, P_N ≈ 0.996, and for N = 10^5, P_N ≈ 0.368.
Thus, our intuition is verified that if we took 400 flights, we would have only a small chance of crashing.
This type of reasoning is typical when the
probability of an individual event is small,
but there are very many attempts.
Suppose we are interested in the probability of the
occurrence of n events out of N attempts given
that the probability p of the event for each
attempt is very small.
The resulting probability is called the Poisson
distribution, a distribution that is important in
the analysis of experimental data.
To derive the Poisson distribution, we begin with the binomial distribution:
P_N(n) = N! / [n!(N − n)!] p^n (1 − p)^(N−n).
We first use Stirling's approximation to write, for n << N,
N! / (N − n)! ≈ N^n.
For p << 1, we have ln(1 − p) ≈ −p, e^(ln(1−p)) = 1 − p ≈ e^(−p), and (1 − p)^(N−n) ≈ e^(−p(N−n)) ≈ e^(−pN). If we use the above approximations, we find
P_N(n) ≈ (N^n / n!) p^n e^(−pN) = (<n>^n / n!) e^(−<n>),
with <n> = pN.
(Poisson distribution)
Let us apply the Poisson distribution to the airplane survival problem. We want to know the probability of never crashing, that is, P(n = 0). The mean <n> = pN equals 10^−5 × 400 = 0.004 for N = 400 flights, and <n> = 1 for N = 10^5 flights. Thus, the survival probability is
P(0) = e^(−<n>) ≈ 0.996 for N = 400, and P(0) ≈ 0.368 for N = 10^5,
as we calculated previously. We see that if we fly 100,000 times, we have a much larger probability of dying in a plane crash.
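The exact and Poisson survival probabilities agree closely for these parameters, as this small check (ours) shows.

```python
# Exact survival probability (1 - p)^N versus the Poisson estimate e^(-pN).
from math import exp

p = 1e-5
for N in (400, 100_000):
    print(f"N = {N:7d}   exact = {(1 - p)**N:.6f}   Poisson P(0) = {exp(-p * N):.6f}")
```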
Traffic Flow and the Exponential Distribution
The Poisson distribution is closely related to the exponential
distribution as we will see in the following. Consider a
sequence of similar random events and let t1 , t2 , . . . be the
time at which each successive event occurs. Examples of
such sequences are the successive times when a phone call
is received and the times when a Geiger counter registers a
decay of a radioactive nucleus. Suppose that we determine
the sequence over a very long time T that is much greater
than any of the intervals ti − ti−1 . We also suppose that the
average number of events is λ per unit time so that in a time
interval t, the mean number of events is λt.
Assume that the events occur at random and are independent of each other. Given λ, the mean number of events per unit time, we wish to find the probability distribution w(t) of the interval t between the events. We know that if an event occurred at time t = 0, the probability that another event occurs within the interval [0, t] is
∫₀ᵗ w(t′) dt′,
and the probability that no event occurs in the interval t is
1 − ∫₀ᵗ w(t′) dt′.
Thus the probability that the duration of the interval between the two events is between t and t + Δt is given by
• w(t)Δt = probability that no event occurs in the interval [0, t] × probability that an event occurs in the interval [t, t + Δt]
= [1 − ∫₀ᵗ w(t′) dt′] × λΔt.
If we cancel Δt from each side and differentiate both sides with respect to t, we find
dw/dt = −λ w(t),   so that   w(t) = A e^(−λt).
The constant of integration A is determined from the normalization condition:
∫₀^∞ w(t) dt = A/λ = 1,   so   A = λ.
Hence, w(t) is the exponential function
w(t) = λ e^(−λt).
These results for the exponential distribution lead naturally to the Poisson distribution.
Let us divide a long time interval T into n smaller intervals t = T/n. What is the probability that 0, 1, 2, 3, . . . events occur in the time interval t, given λ, the mean number of events per unit time?
We will show that the probability that n events occur in the time interval t is given by the Poisson distribution:
P_n(t) = (λt)^n e^(−λt) / n!.
We first consider the case n = 0. If n = 0, the probability that no event occurs in the interval t is
P₀(t) = 1 − ∫₀ᵗ w(t′) dt′ = e^(−λt).
For the case n = 1, there is exactly one event in the time interval t. This event must occur at some time t′, which may occur with equal probability in the interval [0, t]. Because no event can occur in the interval [t′, t], we have
P₁(t) = ∫₀ᵗ w(t′) P₀(t − t′) dt′ = ∫₀ᵗ λe^(−λt′) e^(−λ(t−t′)) dt′ = λt e^(−λt),
where we have used P₀ with t → (t − t′). In general, if n events are to occur in the interval [0, t], the first must occur at some time t′ and exactly (n − 1) must occur in the time (t − t′). Hence,
P_n(t) = ∫₀ᵗ w(t′) P_(n−1)(t − t′) dt′ = ∫₀ᵗ λe^(−λt′) P_(n−1)(t − t′) dt′.
The above equation is a recurrence formula; iterating it yields P_n(t) = (λt)^n e^(−λt)/n!.
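A short simulation (our own sketch; the rate λ = 3 and the total time are arbitrary choices) illustrates the connection: drawing exponential waiting times and counting events per unit-time window reproduces the Poisson probabilities.

```python
# Simulate a Poisson process via exponential waiting times and compare the
# distribution of counts per unit-time window with lam^n e^(-lam) / n!.
import random
from math import exp, factorial
from collections import Counter

random.seed(1)
lam, T = 3.0, 100_000.0
t, window, events_in_window = 0.0, 0, 0
counts = Counter()
while t < T:
    t += random.expovariate(lam)          # exponential interval with mean 1/lam
    while t >= window + 1:                # close any unit-time windows that have ended
        counts[events_in_window] += 1
        events_in_window = 0
        window += 1
    events_in_window += 1

n_windows = sum(counts.values())
for n in range(7):
    print(f"n = {n}   simulated = {counts[n] / n_windows:.4f}   "
          f"Poisson = {lam**n * exp(-lam) / factorial(n):.4f}")
```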
Simulation
1. Approach to equilibrium (http://stp.clarku.edu/simulations/approachtoequilibrium/index.html)
2. Sensitivity to initial conditions (http://stp.clarku.edu/simulations/sensitive/index.html)
3. Random walks (http://stp.clarku.edu/simulations/randomwalks/index.html)
4. Multiple coin toss (http://stp.clarku.edu/simulations/cointoss/index.html)
5. The Binomial distribution (http://stp.clarku.edu/simulations/binomial/index.html)
6. Monte Carlo estimation (http://stp.clarku.edu/simulations/estimate/index.html)
7. Random multiplicative processes (http://stp.clarku.edu/simulations/multiplicativeprocess/index.html)
Probability Quiz
31 August 2010, 30 minutes
1. What is the probability of obtaining a 3 or a 6 in one throw of a die?
2. What is the probability of obtaining a number other than 6 in one throw of a die?
3. What is the probability of obtaining an even number in one throw of a die?
4. What is the probability of obtaining at least one 6 in two throws of a die?
Next … Boltzmann distribution