8 Probability Distributions and Statistics
The Maxent Principle
All macroscopic systems are far too complex to be fully specified,* but usually we can expect the system to have a few well defined average properties. Statistical mechanics can be characterized as the art of averaging microscopic properties to obtain quantities observed in the macroscopic world. Probability distributions $\{p_i\}$ are used to effect these averages. Here we show how to find the probability $p_i$ that a system is in state $i$ from the information entropy expression, $S = -k \sum_i p_i \ln p_i$.
Averages
Consider a system with possible states $i = 1, 2, \ldots, N$ known to have a quantity $E_i$ associated with each state that contributes to a system average $\langle E \rangle$. We want to show that this average is given by the expression

$\langle E \rangle = \sum_i p_i E_i$.    (1)
Suppose there are $G_1$ occurrences of $E_1$, $G_2$ occurrences of $E_2$, and so on. Then the average is

$\langle E \rangle = \dfrac{G_1 E_1 + G_2 E_2 + \cdots + G_N E_N}{G}$

where $G = G_1 + G_2 + \cdots + G_N$. However, we assign

$p_i = \dfrac{G_i}{G}$

and thereby obtain Eq. (1).
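Eq. (1) is just a weighted sum and is easy to check numerically. The short sketch below uses made-up probabilities and energies (not values from the text) purely to illustrate the computation.

```python
# Weighted average of a state quantity, Eq. (1): <E> = sum_i p_i * E_i.
# The probabilities and energies below are illustrative, not from the text.
probs = [0.5, 0.3, 0.2]        # p_i, must sum to 1
energies = [1.0, 2.0, 3.0]     # E_i for each state (arbitrary units)

assert abs(sum(probs) - 1.0) < 1e-12      # normalization check
E_avg = sum(p * E for p, E in zip(probs, energies))
print(E_avg)                   # 0.5*1.0 + 0.3*2.0 + 0.2*3.0 = 1.7
```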
1. A two-state system has energy levels of 1.0 eV and 2.0 eV. The probability that the system is in the lower energy state is ¾. Find the average energy of the system.
The Maxent Principle
The best “guess” or statistical inference that we can make about a system is to
require that we (i) adhere to the known facts and (ii) include no spurious information.
(One author calls this “the truth, and nothing but the truth.”)
* It is impossible even theoretically to fully specify positions and momenta due to the Heisenberg uncertainty principle.
The known facts are usually averages expressed by constraint equations like Eq. (1). We ensure that no spurious information is included by maximizing the missing information $S$. This is the Maxent or maximum entropy principle.
Symbolically, the best statistical inference follows from

$dS(p_1, p_2, \ldots, p_N) = 0$

constrained by averages,

$\sum_i p_i = 1$
$\sum_i p_i f_1(i) = \langle f_1 \rangle$
$\quad\vdots$    (2)
$\sum_i p_i f_N(i) = \langle f_N \rangle$
2. Find the best statistical inference for a system where we know only that the system must be in one of two states (that is, $p_1 + p_2 = 1$). [ans. ½, ½]
3. (a) Find the best statistical inference for a system that we know has a well defined average energy but at any moment it must be in one of two energy states (that is, $p_1 + p_2 = 1$ and $p_1 E_1 + p_2 E_2 = \langle E \rangle$).
(b) Consider a case where $E_1 = -1$ and $E_2 = +1$ and find expressions for $p_1$ and $p_2$. Use a computer to find the appropriate undetermined multiplier given that $\langle E \rangle = 0.7616$ and use it to evaluate $p_1$. [ans. 0.88]
4. Find the best statistical inference for a system that we know has a well defined standard deviation in energy but at any moment must be in one of two energy states (that is, $p_1 + p_2 = 1$ and $p_1 (E_1 - \langle E \rangle)^2 + p_2 (E_2 - \langle E \rangle)^2 = \sigma^2$).
The last three exercises suggest three widely used probability distributions: the equiprobable distribution, the canonical distribution, and the normal distribution.
Equiprobable Distribution
Consider the case where there are $N$ alternatives with respective probabilities $p_1, p_2, \ldots, p_N$. The only thing we know is that the system must be in one of these states, so

$\sum_i p_i = 1$.    (3)
Now insist that missing information is maximized under this condition. We have

$S' = S + k(\lambda + 1)\left(\sum_i p_i - 1\right)$

with
$S = -k \sum_i p_i \ln p_i$.

The peculiar choice of multiplier $k(\lambda + 1)$ is the result of hindsight. It just turns out neater this way.
Maximizing $S'$ with respect to an arbitrary $p_j$ gives

$\dfrac{\partial S'}{\partial p_j} = -k \ln p_j - k + k(\lambda + 1) = 0$

$p_j = e^{\lambda}$

The result is the same for all $p_j$, so substituting into the constraint equation (3) gives the equiprobable distribution,

$p_j = \dfrac{1}{N}$  for $N$ states.    (4)
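The same result can be checked by direct numerical maximization. The sketch below (with $k = 1$ and an arbitrary $N$, both assumptions made only for illustration) maximizes $S = -\sum_i p_i \ln p_i$ subject to normalization alone and recovers $p_j \approx 1/N$.

```python
import numpy as np
from scipy.optimize import minimize

# Maximize S = -sum(p ln p) subject only to sum(p) = 1 (k set to 1 here);
# the optimizer should return the equiprobable distribution of Eq. (4).
N = 4

def neg_entropy(p):
    p = np.clip(p, 1e-12, 1.0)        # keep the logarithm well defined
    return np.sum(p * np.log(p))      # minimizing this maximizes S

normalization = {"type": "eq", "fun": lambda p: np.sum(p) - 1.0}
p0 = np.random.dirichlet(np.ones(N))  # any valid starting distribution
res = minimize(neg_entropy, p0, method="SLSQP",
               bounds=[(0.0, 1.0)] * N, constraints=[normalization])
print(res.x)                          # ≈ [0.25, 0.25, 0.25, 0.25] for N = 4
```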
5. Derive the equiprobable distribution using information theory. The result justifies that, in the absence of any information, equiprobable states are the best determination.
The equiprobable distribution applies when a system does not exchange energy or
particles with the environment. This is referred to as a microcanonical distribution.
Distribution with a Known Mean
By far the most important probability distribution in statistical mechanics is the
Canonical Distribution where the system possesses a well defined average energy while it
continually exchanges some energy with its surroundings. The canonical distribution is a
special case of a distribution with one known mean.
The basic problem is to find the distribution of $p$'s that maximizes missing information $S$ subject to the constraints

$\sum_i p_i = 1$    (5a)

$\sum_i p_i E_i = \langle E \rangle$    (5b)

For convenience, we use undetermined multipliers $k(\lambda + 1)$ and $-k\beta$ and write

$S' = S + k(\lambda + 1)\left(\sum_i p_i - 1\right) - k\beta\left(\sum_i p_i E_i - \langle E \rangle\right)$

Maximizing $S'$ gives

$p_j = \exp(\lambda - \beta E_j)$.    (6)
In principle, the Lagrange multipliers can be determined from the two constraint equations (5). From (5a) we find

$e^{\lambda} = \dfrac{1}{\sum_i e^{-\beta E_i}}$

and Eq. (6) becomes

$p_j = \dfrac{e^{-\beta E_j}}{Z}$    (7)

with

$Z = \sum_i e^{-\beta E_i}$.    (8)

The quantity $Z$ plays a very prominent role in statistical mechanics, where it is called the partition function. (Note that we are leaving $\beta$ unspecified.)
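Equations (7) and (8) translate directly into a few lines of code. The sketch below uses an illustrative set of energy levels and an arbitrary value of $\beta$ (neither taken from the text) to evaluate the partition function and the canonical probabilities.

```python
import numpy as np

# Canonical probabilities, Eqs. (7)-(8): p_j = exp(-beta * E_j) / Z.
def canonical(energies, beta):
    weights = np.exp(-beta * np.asarray(energies, dtype=float))
    Z = weights.sum()                 # partition function, Eq. (8)
    return weights / Z, Z

# Illustrative levels and beta (arbitrary units, assumed for the example).
p, Z = canonical([0.0, 1.0, 2.0], beta=1.0)
print(Z, p, p.sum())                  # probabilities sum to 1
```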
6. Use the information approach to derive the probability distribution for one known mean quantity $\langle E \rangle$.
Identify $\beta$ for Thermodynamics
The canonical distribution, Eq. (7), connects with thermodynamics only when we identify $k$ as Boltzmann's constant and the Lagrange multiplier $\beta$ as $1/kT$. We saw that $k$ had to be Boltzmann's constant to agree with thermodynamics. The identification of $\beta$ can be seen in two steps: (i) evaluate the entropy $S$ with the canonical distribution and (ii) demand that the result for $dS$ is equivalent to the thermodynamic relation

$dU = T\,dS + dW$    (9)

where $U$ is the more usual notation for $\langle E \rangle$ in thermodynamics.
The algebra is made simple by defining a quantity $F$ such that

$\exp(-\beta F) = Z$

(equivalently, $\lambda = \beta F$), so that $S = -k \sum_i p_i \ln p_i = -k\beta \sum_i p_i (F - E_i) = -k\beta (F - U)$. Assume constant temperature and write the differential $dS = k\beta\,(dU - dF)$, or

$dU = \dfrac{1}{k\beta}\,dS + dF$.

Comparing the latter with Eq. (9) determines that $\beta = \dfrac{1}{kT}$ as required and, incidentally, $dF$ is seen as a form of work.
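The chain of identities above is easy to verify numerically. The sketch below uses an illustrative set of levels with $k = 1$ and an arbitrary $\beta$ (assumptions made for the example, not values from the text) and checks that $-k\sum_i p_i \ln p_i$ equals $k\beta(U - F)$ when $F = -(1/\beta)\ln Z$.

```python
import numpy as np

# Check the identity S = -k sum(p ln p) = k*beta*(U - F), F = -ln(Z)/beta.
# k is set to 1 and the levels/beta are illustrative numbers.
k, beta = 1.0, 0.7
E = np.array([0.0, 1.0, 3.0])

w = np.exp(-beta * E)
Z = w.sum()
p = w / Z
U = p @ E                             # mean energy <E>
F = -np.log(Z) / beta                 # quantity defined by exp(-beta*F) = Z

S_direct = -k * np.sum(p * np.log(p))
S_identity = k * beta * (U - F)
print(S_direct, S_identity)           # the two values agree
```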
7. Repeat the development given above to identify $\beta$ for a system with only two levels, $E_1$ and $E_2$.
On Continuous Distributions
Until now we have considered discrete sets of probabilities. Here we discuss how to accommodate a continuous range of possibilities. Suppose that a system state has an associated continuous value $x$. Since there are an infinite number of possible values or outcomes, there is no chance of a perfectly accurate specific outcome $x = x_0$. It makes more sense to speak of the system being in a neighborhood $dx$ around $x$. In particular, we define a probability density $\rho(x)$ such that

$\rho(x)\,dx$ = probability of being in a neighborhood $dx$ around $x$.
The continuous version of the normalization condition (5a) becomes

$\int \rho(x)\,dx = 1$    (10)

and an average over $x$ analogous to Eq. (5b) is written

$\int x\,\rho(x)\,dx = \langle x \rangle$.    (11)

The last two equations are the constraints for a continuum version of the canonical distribution. Entropy becomes

$S = -k \int \rho(x) \ln\!\left[x_0\,\rho(x)\right] dx$    (12)

where $x_0$ is the hypothetical smallest measurable increment of $x$. It is included for dimensional consistency, but does not enter any results.
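The continuum constraints (10) and (11) can be checked by numerical quadrature. The sketch below uses an exponential density purely as an illustrative choice; it is not a distribution used in the text.

```python
import numpy as np
from scipy.integrate import quad

# Continuum analogues of Eqs. (10)-(11): the density integrates to 1 and its
# mean is the integral of x * rho(x).  The exponential density is illustrative.
lam = 2.0
rho = lambda x: lam * np.exp(-lam * x)        # density on [0, infinity)

norm, _ = quad(rho, 0.0, np.inf)
mean, _ = quad(lambda x: x * rho(x), 0.0, np.inf)
print(norm, mean)                             # 1.0 and 1/lam = 0.5
```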
In the following section we use a continuum analysis to derive the ubiquitous normal distribution.
Normal Distribution
Information theory produces the normal distribution for a continuous system with a standard deviation $\sigma$ around an average value $\langle x \rangle$. The standard deviation is defined by

$\sigma^2 = \int (x - \langle x \rangle)^2 \rho(x)\,dx$.    (13)

This is a measure of the average spread from the mean.
We construct $S'$ from Eq. (12) and the constraint equations,

$S' = S + k(\lambda + 1)\left(\int \rho\,dx - 1\right) - k\gamma\left(\int (x - \langle x \rangle)^2 \rho\,dx - \sigma^2\right)$.

Maximizing with respect to $\rho$ gives

$\rho(x) = \dfrac{e^{\lambda}}{x_0}\,e^{-\gamma (x - \langle x \rangle)^2} = \mathrm{const}\; e^{-\gamma (x - \langle x \rangle)^2}$.
The constants are determined from Eqs.(10) and (13). These give the normal or Gaussian
distribution:
$\rho(x) = \dfrac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x - \langle x \rangle)^2 / 2\sigma^2}$    (14)

[Figure: the normal distribution plotted against $x/\sigma$]
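In practice, probabilities under Eq. (14) are obtained from tabulated values or from the error function. The sketch below computes the probability of falling in an interval using Python's standard library; the mean, standard deviation, and interval are illustrative numbers only.

```python
from math import erf, sqrt

# Probability that x lies in [a, b] under the normal density of Eq. (14),
# computed from the error function rather than a printed table.
def normal_prob(a, b, mean, sigma):
    cdf = lambda x: 0.5 * (1.0 + erf((x - mean) / (sigma * sqrt(2.0))))
    return cdf(b) - cdf(a)

# Illustrative numbers: a standard normal between -1 and +1.
print(normal_prob(-1.0, 1.0, mean=0.0, sigma=1.0))   # ≈ 0.6827
```

The same routine can stand in for a printed table in Exercises 9 and 10 below.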
8. Use the information approach to derive the normal probability distribution.
9. A cohort of U.S. males has an average height of 5′10″ with a standard deviation of 3″. Find the percentage of these men with heights between 5′7″ and 5′9″. (Use a table of the normal distribution.)
10. A sample of the population has an average pulse rate of 70 beats/min with a standard deviation of 10 beats/min. Find the percentage of the population with pulse rate (a) above 80 beats/min, (b) between 75 and 80 beats/min. (Use a table of the normal distribution.)