Analyzing a simple model to gain insight into the fundamental principles of statistical mechanics.

In this chapter we will investigate a simple model to clearly understand the principles underlying statistical mechanics. Following the discussion of the model, we will abstract its main characteristics, which will then be applied in a more formal discussion of the relation between statistical mechanics and thermodynamics in the next section.
1. Description of the model.

Consider a set of $M$ distinguishable, non-interacting harmonic oscillators. Each oscillator is in one of the available energy eigenstates, with energies $\varepsilon_n = n\hbar\omega$, $n = 0, 1, 2, 3, \ldots$, where the zero-point energy of $\tfrac{1}{2}\hbar\omega$ is neglected for convenience. The quantum states of the complete system are specified by listing the quantum number for each oscillator. We will indicate these states as $|n_1, n_2, \ldots, n_M\rangle$, $n_i = 0, 1, 2, \ldots$. The total energy of the system is given by

$E = \sum_{i=1}^{M} n_i \hbar\omega = N\hbar\omega.$

The key feature of the system is that there is a large degree of degeneracy. Any set of quantum numbers that adds up to $N$ has the same energy, but different (ordered) sets correspond to different states. To see how fast this number of states may grow we can give some very simple examples.
In the examples below we list the possible configurations by specifying the occupied quantum numbers, listed from low to high. To get a count of the actual number of states one has to keep track of the number of distinct permutations (= distinct states) of the quantum numbers in a particular configuration.
Example 1: 3 oscillators, $E = 5\hbar\omega$.

    (n_i), sum_i n_i = 5      # of distinct permutations
    arranged low to high      of (n_i) (= states)
    0, 0, 5                   3
    0, 1, 4                   6
    0, 2, 3                   6
    1, 1, 3                   3
    1, 2, 2                   3

The three permutations of (0, 0, 5), for instance, are the states $|5,0,0\rangle$, $|0,5,0\rangle$, $|0,0,5\rangle$. 21 states in total.
Another example:

Example 2: 4 oscillators, $E = 6\hbar\omega$.

    (n_i), sum_i n_i = 6      # of distinct permutations of (n_i)
    arranged low to high      = # of states per configuration
    0, 0, 0, 6                4  = 4!/(3! 1!)
    0, 0, 1, 5                12 = 4!/(2! 1! 1!)
    0, 0, 2, 4                12
    0, 1, 1, 4                12
    0, 0, 3, 3                6  = 4!/(2! 2!)
    0, 1, 2, 3                24 = 4!
    1, 1, 1, 3                4
    0, 2, 2, 2                4
    1, 1, 2, 2                6

84 states in total.
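The state counts in Examples 1 and 2 can be checked by brute force, enumerating every ordered tuple of quantum numbers. A minimal sketch (the helper name `count_states` is mine, not from the text):

```python
# Brute-force check of Examples 1 and 2: enumerate every ordered state
# (n1, ..., nM) and count those whose quantum numbers sum to N.
from itertools import product

def count_states(M, N):
    """Ordered states (n1, ..., nM) of M oscillators with n1 + ... + nM = N."""
    return sum(1 for ns in product(range(N + 1), repeat=M) if sum(ns) == N)

print(count_states(3, 5), count_states(4, 6))  # 21 84
```

The totals agree with the tables (and with the stars-and-bars count $\binom{N+M-1}{M-1}$).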
As the number of oscillators grows, this is a tedious way of keeping track of things. It is more convenient to enumerate the configurations by listing how many oscillators have zero quanta, how many have one quantum, and so forth. This avoids writing long lists of 0's and 1's. In general we will have the situation that $M$, the number of oscillators, is much larger than the highest occupied level, and the alternative representation is more economical. Consider the following example:
Example 3: 6 oscillators, $E = 5\hbar\omega$.

    # of oscillators in level        # of permutations
    0    1    2    3    4    5
    5                         1      6!/5!      = 6
    4    1              1            6!/4!      = 30
    4         1    1                 6!/4!      = 30
    3    2         1                 6!/(3! 2!) = 60
    3    1    2                      6!/(3! 2!) = 60
    2    3    1                      6!/(2! 3!) = 60
    1    5                           6!/5!      = 6

252 states in total.
A configuration can hence be specified by the number of oscillators in each level. If an entry is left empty, it simply means there are no oscillators in this energy level. Let us denote the configuration by the vector $\mathbf{m} = (m_0, m_1, m_2, \ldots, m_N)$. In principle the configuration vector can be thought of as infinite, as there are an infinite number of energy levels for each oscillator. If the total energy is given by $N\hbar\omega$, then $N$ is the highest level that can be occupied.
We have the following constraints on a configuration vector $\mathbf{m}$ that yields a particular total energy:

$\sum_i m_i = M$, the total number of oscillators,

$\sum_i m_i n_i = N$, where $N\hbar\omega$ is the total energy, or equivalently $\sum_i m_i \varepsilon_i = E$.

The number of permutations $W_{\mathbf{m}}$, or the number of states corresponding to a particular configuration, is given by

$W_{\mathbf{m}} = \dfrac{M!}{m_0!\, m_1! \cdots m_N!} = \dfrac{\left(\sum_i m_i\right)!}{\prod_i m_i!}$
This formula is well known in combinatorics, and it is easily checked for the examples given. As the number of oscillators becomes large, the number of states per configuration becomes a highly peaked distribution. This can be seen by revisiting the previous example, but now taking 100 oscillators with the same total energy $E = 5\hbar\omega$:
Example 4: 100 oscillators, $E = 5\hbar\omega$.

    # of oscillators in level        # of permutations, W_m
    0    1    2    3    4    5
    99                        1      100!/99!       = 100
    98   1              1            100!/98!       ≈ 100^2
    98        1    1                 100!/98!       ≈ 100^2
    97   2         1                 100!/(97! 2!)  ≈ 100^3/2
    97   1    2                      100!/(97! 2!)  ≈ 100^3/2
    96   3    1                      100!/(96! 3!)  ≈ 100^4/6
    95   5                           100!/(95! 5!)  ≈ 100^5/120
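The table entries follow directly from the multinomial formula for $W_{\mathbf{m}}$ given above. A short sketch (the helper name `W` is mine) reproducing a few of the numbers exactly:

```python
# W_m = M!/(m0! m1! ... mN!) for configurations of Example 4
# (100 oscillators, E = 5 hw); m = (m0, m1, m2, m3, m4, m5).
from math import factorial

def W(m):
    """Number of states (distinct permutations) of configuration m."""
    out = factorial(sum(m))
    for mi in m:
        out //= factorial(mi)   # each partial quotient is an exact integer
    return out

print(W((99, 0, 0, 0, 0, 1)))  # 100!/99!      = 100
print(W((96, 3, 1, 0, 0, 0)))  # 100!/(96! 3!) = 15684900   (~ 100^4/6)
print(W((95, 5, 0, 0, 0, 0)))  # 100!/(95! 5!) = 75287520   (~ 100^5/120)
```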
As can be seen, the last two configurations are far more likely than the other configurations. This effect grows as the number of oscillators increases.
The basic principle of statistical mechanics is that macroscopic properties are evaluated
as averages over all possible states, and that each state in an isolated system (of specific
total energy, total volume and total number of particles) is equally likely. This is a
fundamental postulate of statistical mechanics, and Gibbs called this the principle of
equal a priori probability. Hence, in Example 4 above, each quantum state is equally likely. But this means that the lowest two configurations in the table are far more likely
than any of the other configurations. They have far more states corresponding to the
configuration. Since any property is the same for each state in a configuration (as they
only differ by a permutation of quantum numbers over equivalent oscillators), it follows
that the average value is dominated by the contributions from the most likely
configurations (configurations that have many states).
If one deals with a very large number of particles (on the order of $10^{23}$, say), then the most likely configuration contains overwhelmingly more states than the other configurations. Hence, if one were to plot the number of states as a function of $\mathbf{m}$, one would find a very peaked distribution, like a delta function. Moreover, configurations that are still somewhat likely compared to the most likely configuration differ only little from it, and any property of such a configuration is close to the same property of the maximum configuration. For example, the most likely configuration may be given by $(100000, 1000, 100, 10)$, and another similar configuration that also corresponds to a large number of states might be $(100009, 990, 101, 10)$. Any calculated property for this configuration (averages over the individual oscillators) would be similar to that for the most likely configuration. For this reason, in statistical mechanics one proceeds by calculating the most likely configuration and obtains properties for this most likely configuration. The error introduced, compared to obtaining the full average (the formally exact definition of the macroscopic property), is negligible if the number of particles is (very) large. We will do some numerical experiments to test this assumption in the exercises, or by looking at computer simulations in class.
2. Determination of the most likely configuration corresponding to a particular total energy.

The number of states corresponding to a particular configuration $\mathbf{m} = (m_0, m_1, m_2, \ldots)$ is given by

$W_{\mathbf{m}} = \dfrac{M!}{m_0!\, m_1! \cdots m_N!} = \dfrac{\left(\sum_i m_i\right)!}{\prod_i m_i!}.$
We wish to find the particular configuration $\mathbf{m}^*$ for which this number is a maximum. However, $W_{\mathbf{m}}$ is a gigantically large number, and typically we therefore calculate $\ln W_{\mathbf{m}}$ (employing Stirling's formula for the factorials), which is a monotonically increasing function of $W_{\mathbf{m}}$. Hence, rather than $W_{\mathbf{m}}$, we maximize $\ln W_{\mathbf{m}}$, which will lead to the same most likely configuration. In addition we need to impose constraints on the configuration vector $\mathbf{m}$ such that it yields a particular total energy $E$ and corresponds to a particular total number of oscillators:

$\sum_i m_i = M$, the total number of oscillators,

$\sum_i m_i \varepsilon_i = E.$
Such a constrained optimization is best performed using Lagrange's method of undetermined multipliers. In this procedure one defines the function to be optimized as

$F(\mathbf{m}, \beta, \gamma) = \ln W_{\mathbf{m}} - \beta \left( \sum_i m_i \varepsilon_i - E \right) - \gamma \left( \sum_i m_i - M \right)$

The function $F$ is the original function to be optimized plus an undetermined multiplier times each constraint. (The signs are chosen with hindsight such that the multipliers will be positive numbers.) This function can then be used in an unconstrained optimization, provided that the function is also made stationary with respect to changes in the multipliers. The function is hence required to be stationary in the variables $m_k$ (for all $k$), $\beta$ and $\gamma$.
The stationarity conditions w.r.t. the Lagrange multipliers yield

$\dfrac{\partial F}{\partial \beta} = 0 \;\Rightarrow\; \sum_i m_i \varepsilon_i - E = 0$

$\dfrac{\partial F}{\partial \gamma} = 0 \;\Rightarrow\; \sum_i m_i - M = 0$

These are precisely the constraints. If these are satisfied, the function $F(\mathbf{m}, \beta, \gamma)$ reduces to $\ln W_{\mathbf{m}}$. The other stationarity conditions are

$\dfrac{\partial F}{\partial m_k} = \dfrac{\partial \ln W_{\mathbf{m}}}{\partial m_k} - \beta \varepsilon_k - \gamma = 0 \quad \forall k$
Hence, to carry out the optimization we need to take the partial derivatives of

$\ln W_{\mathbf{m}} = \ln \dfrac{M!}{\prod_i m_i!} = \ln M! - \ln \prod_i m_i! = \ln M! - \sum_i \ln m_i!$

To evaluate the logarithms of factorials we use Stirling's approximation (see Further Notes, 1).

Stirling: $\ln n! \approx n \ln n - n$

And hence

$\ln W_{\mathbf{m}} \approx M \ln M - M - \sum_i \left( m_i \ln m_i - m_i \right)$
and

$\dfrac{\partial \ln W_{\mathbf{m}}}{\partial m_k} = -\ln m_k - 1 + 1 = -\ln m_k$

Combining this with the expression for $\dfrac{\partial F}{\partial m_k} = 0$, we find

$-\ln m_k - \beta \varepsilon_k - \gamma = 0 \;\Rightarrow\; m_k = e^{-\gamma} e^{-\beta \varepsilon_k}$

Now using the constraint on the total number of oscillators we can eliminate $\gamma$:

$\sum_k m_k = M = e^{-\gamma} \sum_k e^{-\beta \varepsilon_k} \;\Rightarrow\; e^{-\gamma} = \dfrac{M}{\sum_k e^{-\beta \varepsilon_k}}$

$m_k = M \, \dfrac{e^{-\beta \varepsilon_k}}{\sum_j e^{-\beta \varepsilon_j}}$
Let us note that by invoking Stirling's approximation we treat the integers $m_k$ as continuous variables, and at the optimum the $m_k$ may no longer be integers. For very large $M$ this is hardly a problem. For small $M$, we can choose the integers that are closest (by rounding) to the optimal fractional numbers, and following the above recipe we will be 'close' to the most likely configuration.
Imposing the constraint on the total energy in principle determines the parameter $\beta$:

$\sum_k m_k \varepsilon_k = M \, \dfrac{\sum_k \varepsilon_k e^{-\beta \varepsilon_k}}{\sum_k e^{-\beta \varepsilon_k}} = E$

This equation is not easily solved for $\beta$, however. Instead, the most likely configuration can be thought of as an explicit function of $\beta$, which then determines the total energy in the system, $E$.
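Although the energy constraint has no simple closed-form solution for $\beta$, it is easy to solve numerically, e.g. by bisection, since the average energy decreases monotonically with $\beta$. A sketch in units of $\hbar\omega$ (function names are my own, and the infinite level ladder is truncated for the computation):

```python
# Solve sum_k m_k eps_k = E for beta by bisection; eps_k = k (units of hw).
from math import exp

def mean_energy(beta, levels):
    """Average energy per oscillator in the most likely configuration."""
    w = [exp(-beta * e) for e in levels]
    return sum(e * wi for e, wi in zip(levels, w)) / sum(w)

def solve_beta(E, M, levels, lo=1e-6, hi=50.0):
    """Find beta with M * mean_energy(beta) = E (mean energy decreases in beta)."""
    target = E / M
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_energy(mid, levels) > target:
            lo = mid            # too much energy -> need larger beta
        else:
            hi = mid
    return 0.5 * (lo + hi)

levels = list(range(50))        # truncated ladder; higher levels are negligible here
beta = solve_beta(E=5, M=100, levels=levels)
print(round(beta, 3), round(100 * mean_energy(beta, levels), 6))  # beta ~ ln 21 ~ 3.04
```

For Example 4 ($M = 100$, $E = 5\hbar\omega$) the analytic mean occupation $1/(e^{\beta\hbar\omega}-1) = 0.05$ gives $\beta\hbar\omega = \ln 21 \approx 3.04$, which the bisection recovers.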
Let us take stock at this moment of what has been achieved, and provide some further interpretation. For the most likely configuration $\mathbf{m}^*$ we have obtained

$\dfrac{m_k^*}{M} = \dfrac{e^{-\beta \varepsilon_k}}{\sum_k e^{-\beta \varepsilon_k}} = P_k$

For any configuration the quantity $m_k / M$ is the fraction of oscillators in energy level $k$, which in Metiu would be denoted as $\phi_k$, the frequency of the energy level. The most likely configuration takes on a special significance, as this configuration can be used to determine the average values in the ensemble. The fractions in the most likely configuration are therefore called the probabilities $P_k$ to find an oscillator in energy level $k$. In the most likely configuration these frequencies, or equivalently the probabilities, are given by a formula that is identical to the Boltzmann distribution, if we identify $\beta = 1/kT$. The average energy in the ensemble is then given by

$\dfrac{E}{M} = \langle \varepsilon \rangle = \dfrac{\sum_k \varepsilon_k e^{-\beta \varepsilon_k}}{\sum_k e^{-\beta \varepsilon_k}} = \sum_k \varepsilon_k P_k$

and the average energy depends on the parameter $\beta$. For future convenience we can also define a partition function

$Q(\beta) = \sum_k e^{-\beta \varepsilon_k}.$
The quantity $\ln W_{\mathbf{m}} = M \ln M - M - \sum_i (m_i \ln m_i - m_i)$ can be written as

$\ln W_{\mathbf{m}} = M \ln M - \sum_i m_i \ln m_i = M \ln M - M \sum_i \phi_i \left( \ln \phi_i + \ln M \right) = -M \sum_i \phi_i \ln \phi_i$

where $\phi_i = m_i / M$ and we used $\sum_i \phi_i = 1$. For the most likely distribution we would write

$\ln W_{\mathbf{m}^*} = -M \sum_i P_i \ln P_i.$

The quantity $\ln W_{\mathbf{m}^*}$ scales linearly with the number of oscillators, or the size of the system, and is extensive. The average quantity

$\overline{\ln W_{\mathbf{m}^*}} = \dfrac{\ln W_{\mathbf{m}^*}}{M} = -\sum_i P_i \ln P_i$

is independent of the size of the system, and this is the fundamental quantity that is optimized (under constraints) to reach the most likely distribution. It is of interest to note that the sum runs over the levels of the oscillator, and there is no reference anymore to the total number of oscillators $M$. As discussed before in class, following the text by Metiu, entropy is defined as $S = -k_B \sum_i P_i \ln P_i$, and optimizing the probabilities to find the most likely configuration is precisely equivalent to finding the state of maximum 'entropy'.
It is of interest to redo the derivation of finding the most likely configuration, using the probabilities as the fundamental variables. The problem is to find the probability distribution such that $-\sum_i P_i \ln P_i$ is a maximum under the constraints $\sum_i P_i = 1$ (a normalization condition) and the condition that the average energy has a certain value, $\langle \varepsilon \rangle = \sum_i \varepsilon_i P_i$, which reflects the conservation of energy for the complete system. This maximizing probability distribution is overwhelmingly likely, if one assumes that every quantum state of a given total energy is equally likely in an isolated system at equilibrium. All averages can be obtained from the most likely configuration and the associated maximizing probability distribution. Hence, it is of key importance to calculate this most likely distribution. To this end define the function
$F(\{P_i\}, \gamma, \beta) = -\sum_i P_i \ln P_i - \gamma \left( \sum_i P_i - 1 \right) - \beta \left( \sum_i \varepsilon_i P_i - \langle \varepsilon \rangle \right)$

which, as before, includes two Lagrange multipliers to impose the constraints. We impose stationarity of the functional with respect to all parameters to find (besides the constraints, which derive from stationarity with respect to the multipliers):

$\dfrac{\partial F}{\partial P_j} = -\ln P_j - 1 - \gamma - \beta \varepsilon_j = 0 \;\Rightarrow\; P_j = e^{-(\gamma + 1)} e^{-\beta \varepsilon_j}$

and from the normalization condition

$\sum_j P_j = e^{-(\gamma + 1)} \sum_j e^{-\beta \varepsilon_j} = 1 \;\Rightarrow\; e^{-(\gamma + 1)} = 1 \Big/ \sum_j e^{-\beta \varepsilon_j}$

Or: the probabilities in the most likely distribution are given by

$P_j^*(\beta) = e^{-\beta \varepsilon_j} \Big/ \sum_j e^{-\beta \varepsilon_j}$
The parameter $\beta$ is in one-to-one correspondence with the average energy of an oscillator (the energy constraint):

$\dfrac{\sum_i \varepsilon_i e^{-\beta \varepsilon_i}}{\sum_i e^{-\beta \varepsilon_i}} = \langle \varepsilon \rangle$
Since the Lagrange multiplier $\gamma$ corresponding to overall normalization can always easily be eliminated, we can use a shortcut, by maximizing

$f(\{\tilde{P}_i\}, \beta) = -\sum_i \tilde{P}_i \ln \tilde{P}_i - \beta \left( \sum_i \varepsilon_i \tilde{P}_i - \langle \varepsilon \rangle \right)$

$\dfrac{\partial f}{\partial \tilde{P}_i} = -\ln \tilde{P}_i - 1 - \beta \varepsilon_i = 0 \;\Rightarrow\; \tilde{P}_i \propto e^{-\beta \varepsilon_i}$

where the $\tilde{P}_i$ are relative probabilities, which are not yet properly normalized (the constant factor from the $-1$ is irrelevant and is dropped, taking $\tilde{P}_i = e^{-\beta \varepsilon_i}$). Then we can define a partition function and normalized probabilities accordingly:

$Q(\beta) = \sum_i \tilde{P}_i = \sum_i e^{-\beta \varepsilon_i}$

$P_i(\beta) = \tilde{P}_i / Q = e^{-\beta \varepsilon_i} / Q$
In the sequel we will follow this procedure for many different types of "ensembles". We can impose different constraints, but always optimize the basic quantity $-\sum_i P_i \ln P_i$ to find the appropriate formulas for the probabilities and partition functions.
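One can check numerically that the Boltzmann form indeed maximizes $-\sum_i P_i \ln P_i$ among distributions satisfying the same two constraints. The sketch below perturbs the probabilities along a direction that preserves both normalization and mean energy (the chosen levels, $\beta$, and step are illustrative):

```python
# Verify that the Boltzmann distribution has maximal -sum P ln P among
# distributions with the same normalization and mean energy.
from math import exp, log

levels = [0.0, 1.0, 2.0, 3.0]           # eps_i in units of hw
beta = 0.7
w = [exp(-beta * e) for e in levels]
Q = sum(w)
P = [wi / Q for wi in w]

def entropy(p):
    return -sum(x * log(x) for x in p if x > 0)

# Shift (P0, P1, P2) by (+d, -2d, +d): for equally spaced levels this keeps
# both sum P = 1 and sum eps*P unchanged, so the perturbation is feasible.
d = 1e-3
Ppert = [P[0] + d, P[1] - 2 * d, P[2] + d, P[3]]
print(entropy(P) > entropy(Ppert))  # True: any feasible move lowers the entropy
```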
As will be discussed further in one of the problem sets, one can equilibrate a system such as the one introduced above by starting from an arbitrary state $|n_1, n_2, \ldots, n_M\rangle$ of a certain (total) energy, defined by the quantum numbers of each individual oscillator, and then randomly changing the state by raising one (randomly chosen) oscillator by one level and lowering another (randomly chosen) oscillator by one level, such that the total energy is conserved. (This is one possible procedure; see also Further Notes, 2.) This random perturbation of the state is repeated many times, and at each instant in the simulation one can count how many oscillators are in level 0 ($= m_0$), how many in level 1 ($= m_1$), and so forth. This allows one to define the fraction of oscillators in each level, $\phi_i = m_i / M$. By applying many random moves, the system will reach "equilibrium", meaning that the fractions approach the probabilities of the most likely distribution, independent of the starting state. The quantity $-\sum_i \phi_i \ln \phi_i$ gradually increases (while fluctuating) to reach its maximum value $-\sum_i P_i \ln P_i$. Once the system has reached equilibrium, the fractions will keep fluctuating around the most probable 'Boltzmann' values. One can calculate average properties by calculating properties for each state $|n_1, n_2, \ldots, n_M\rangle$ at a particular instant in the simulation and averaging the result. This will agree (within statistical noise) with the averages from statistical mechanics using the probabilities $P_i$. One can also calculate fluctuations from the average values. It is harder to reach convergence for these quantities, and one needs to run the simulation for a long time. It is interesting to reflect on this numerical experiment. There is no real dynamics in the system, just random moves. Because the most likely configuration (particular values of the $m_i$) has by far the most states, and because we randomly sample the various accessible states (all of the same energy), the system spends most of the time in the most likely configuration. It can move away from it, lowering the function $-\sum_i \phi_i \ln \phi_i$, but it is unlikely to deviate much once equilibrium is reached. The fluctuations are very small for simulations (or experiments) involving a very large number of molecules, as are encountered in real systems.
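The equilibration experiment described above can be sketched in a few lines; the system size, number of moves, and seed below are illustrative choices, not prescriptions from the text:

```python
# Random energy-conserving moves on M oscillators carrying N quanta in total:
# lower one randomly chosen oscillator by one quantum, raise another by one.
import random
from collections import Counter

random.seed(1)
M, N = 1000, 1000                  # oscillators, total quanta (so <n> = 1)
n = [0] * M
for _ in range(N):                 # arbitrary starting state with E = N*hw
    n[random.randrange(M)] += 1

for _ in range(200_000):
    i, j = random.randrange(M), random.randrange(M)
    if n[i] > 0:                   # move allowed only if oscillator i has a quantum
        n[i] -= 1
        n[j] += 1

counts = Counter(n)
phi = {k: counts[k] / M for k in sorted(counts)}
# For <n> = 1 the Boltzmann prediction is P_k = (1/2)^(k+1), so the measured
# fractions should fluctuate around 0.5, 0.25, 0.125, ...
print(round(phi[0], 2), round(phi[1], 2))
```

The proposal is symmetric (lowering $i$ and raising $j$ is exactly reversed by lowering $j$ and raising $i$), so the random walk samples the accessible equal-energy states uniformly, which is precisely the equal a priori probability assumption.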
3. Towards a thermodynamical interpretation of the basic quantities.

We will assume the most likely configuration and define average quantities accordingly. Up to this point we have obtained and/or defined the following:

$P_i = e^{-\beta \varepsilon_i} \Big/ \sum_i e^{-\beta \varepsilon_i} = e^{-\beta \varepsilon_i} / Q$

$Q = \sum_i e^{-\beta \varepsilon_i}$

$\langle \varepsilon \rangle = U = \sum_i \varepsilon_i P_i$

$\overline{\ln W} = -\sum_i P_i \ln P_i$

We would like to determine from first-principles considerations how to interpret $\beta$, and also $\overline{\ln W}$, and in this way provide the connection to thermodynamics. Both $U$ and $\overline{\ln W}$ are functions of $\beta$. Let us first investigate their derivatives.

$\dfrac{\partial U}{\partial \beta} = \sum_i \varepsilon_i \dfrac{\partial P_i}{\partial \beta}$ (this expression will be used below)

$\quad = \sum_i \varepsilon_i \dfrac{\partial}{\partial \beta} \left( \dfrac{e^{-\beta \varepsilon_i}}{Q} \right)$

$\quad = -\sum_i \varepsilon_i^2 \dfrac{e^{-\beta \varepsilon_i}}{Q} - \dfrac{1}{Q^2} \left( \sum_i \varepsilon_i e^{-\beta \varepsilon_i} \right) \dfrac{\partial Q}{\partial \beta}$

$\quad = -\sum_i \varepsilon_i^2 P_i + \left( \sum_i \varepsilon_i P_i \right) \left( \sum_j \varepsilon_j P_j \right)$

$\quad = -\langle \varepsilon^2 \rangle + \langle \varepsilon \rangle^2 = -\sum_i \left( \varepsilon_i - \langle \varepsilon \rangle \right)^2 P_i$

Hence we see that $\partial U / \partial \beta$ is always negative. This derivative is directly related to fluctuations around the mean energy, i.e. the variance of the energy. We will return to this issue at a later time. Next consider
$\dfrac{\partial \overline{\ln W}}{\partial \beta} = -\sum_i \left( \dfrac{\partial P_i}{\partial \beta} \ln P_i + P_i \dfrac{1}{P_i} \dfrac{\partial P_i}{\partial \beta} \right)$

$\quad = -\sum_i \dfrac{\partial P_i}{\partial \beta} \left( \ln e^{-\beta \varepsilon_i} - \ln Q \right) - \sum_i \dfrac{\partial P_i}{\partial \beta}$

$\quad = \beta \sum_i \varepsilon_i \dfrac{\partial P_i}{\partial \beta} + \ln Q \sum_i \dfrac{\partial P_i}{\partial \beta} - 0 = \beta \dfrac{\partial U}{\partial \beta}$

where we used the first line of the $\partial U / \partial \beta$ analysis and the fact that $\dfrac{\partial}{\partial \beta} \left( \sum_i P_i \right) = \dfrac{\partial}{\partial \beta} (1) = 0$.

I am using partial derivatives here out of habit. The quantities at this moment only depend on $\beta$, and it would be more appropriate to use total derivatives. It then follows that

$\beta \dfrac{\partial U}{\partial \beta} = \dfrac{\partial \overline{\ln W}}{\partial \beta} = \dfrac{1}{M} \dfrac{\partial \ln W}{\partial \beta}$

which we will use in a moment.
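Both derivative identities, $\partial U / \partial \beta = -(\langle \varepsilon^2 \rangle - \langle \varepsilon \rangle^2)$ and $\partial \overline{\ln W} / \partial \beta = \beta \, \partial U / \partial \beta$, are easy to confirm by finite differences. A sketch with illustrative levels, in units of $\hbar\omega$:

```python
# Finite-difference check of dU/dbeta = -variance and d(lnW/M)/dbeta = beta*dU/dbeta.
from math import exp, log

levels = [0.0, 1.0, 2.0, 3.0, 4.0]

def P(beta):
    w = [exp(-beta * e) for e in levels]
    Q = sum(w)
    return [wi / Q for wi in w]

def U(beta):
    return sum(e * p for e, p in zip(levels, P(beta)))

def lnW_bar(beta):
    return -sum(p * log(p) for p in P(beta))

beta, h = 0.9, 1e-6
dU = (U(beta + h) - U(beta - h)) / (2 * h)          # central difference
var = sum((e - U(beta)) ** 2 * p for e, p in zip(levels, P(beta)))
dlnW = (lnW_bar(beta + h) - lnW_bar(beta - h)) / (2 * h)

print(abs(dU + var) < 1e-6, abs(dlnW - beta * dU) < 1e-6)  # True True
```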
To illuminate the meaning of the parameter $\beta$, consider two systems, the first consisting of $M_1$ oscillators, with $W_1(\beta_1)$ states (in the most likely configuration) and average energy per oscillator $U_1(\beta_1)$. The other system analogously consists of $M_2$ oscillators, with $W_2(\beta_2)$ states in the most likely configuration and average energy per oscillator $U_2(\beta_2)$. The values of $\beta$ are initially different in the two systems. In addition, the oscillators could have totally different energy levels in the two systems. Let us assume that these two systems can exchange energy, and we let them equilibrate such that they reach the most likely configuration, while preserving the total energy. The total number of states is given by $W_1 W_2$, and hence we can maximize the function

$F(\beta_1, \beta_2) = \ln W_1 W_2 - \beta \left( M_1 U_1 + M_2 U_2 \right) = \ln W_1(\beta_1) + \ln W_2(\beta_2) - \beta \left( M_1 U_1(\beta_1) + M_2 U_2(\beta_2) \right)$

which includes a constraint to preserve the total energy (and an associated Lagrange multiplier $\beta$). The maximum of the function is reached if this function is stationary with respect to changes in $\beta_1, \beta_2$, hence
$\left( \dfrac{\partial \ln W_1}{\partial \beta_1} \right)_{\beta_2} - \beta M_1 \left( \dfrac{\partial U_1}{\partial \beta_1} \right)_{\beta_2} = 0$

$\left( \dfrac{\partial \ln W_2}{\partial \beta_2} \right)_{\beta_1} - \beta M_2 \left( \dfrac{\partial U_2}{\partial \beta_2} \right)_{\beta_1} = 0$

and using the relation $\beta \dfrac{\partial U}{\partial \beta} = \dfrac{1}{M} \dfrac{\partial \ln W}{\partial \beta}$, derived previously, we obtain

$M_1 \beta_1 \left( \dfrac{\partial U_1}{\partial \beta_1} \right) - \beta M_1 \left( \dfrac{\partial U_1}{\partial \beta_1} \right) = 0 \;\Rightarrow\; \beta_1 = \beta$

$M_2 \beta_2 \left( \dfrac{\partial U_2}{\partial \beta_2} \right) - \beta M_2 \left( \dfrac{\partial U_2}{\partial \beta_2} \right) = 0 \;\Rightarrow\; \beta_2 = \beta$
and hence we obtain the important result that the most likely (equilibrium) configuration is reached if $\beta_1 = \beta_2$. This situation is well known from thermodynamics: if two systems can exchange energy, they reach (thermal) equilibrium when they have the same temperature. Hence, we deduce that the Lagrange multiplier $\beta$ plays the same role as temperature in thermodynamics: $\beta = f(T)$, where $f$ indicates some monotonic and universal function of temperature. These latter conditions are strong, and one might need to do some hand waving to justify them. (Universal, as we did not make any assumptions about the second system; monotonic, as otherwise one value of $\beta$ might correspond to two different temperatures, or conversely.) With this piece of knowledge established, let us return once more to the relation

$\beta \dfrac{\partial U}{\partial \beta} = \dfrac{\partial \overline{\ln W}}{\partial \beta}$

and substitute $\beta = f(T)$, writing both sides as derivatives with respect to $T$:

$\beta(T) \dfrac{\partial U}{\partial T} \dfrac{dT}{d\beta} = \dfrac{\partial \overline{\ln W}}{\partial T} \dfrac{dT}{d\beta} \;\Rightarrow\; \beta(T) \dfrac{\partial U}{\partial T} = \dfrac{\partial \overline{\ln W}}{\partial T}$

Compare this with the thermodynamic relation $\dfrac{\partial S}{\partial T} = \dfrac{1}{T} \dfrac{\partial U}{\partial T}$.
It is now easy to see that the identifications

$S = k_B \overline{\ln W} = -k_B \sum_i P_i \ln P_i$

and $\beta = 1/k_B T$ are suggested by this relation, where $k_B$ could still be an arbitrary constant, but it turns out to be the Boltzmann constant. The above is only a plausibility argument and not a first-principles derivation of the connection between thermodynamics and statistical mechanics. I am not even sure that the connection necessarily has to be like this. I have not yet encountered fully convincing arguments in the literature.
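The conclusion $\beta_1 = \beta_2$ can also be illustrated numerically: split a fixed total energy between two oscillator collections with different level spacings, locate the split that maximizes $\ln W_1 W_2$, and read off the two $\beta$ values. A sketch with illustrative parameters (sizes, spacings, and the truncated ladders are my own choices):

```python
# Two systems exchanging energy: the split that maximizes lnW1 + lnW2
# is the one at which beta1 = beta2.
from math import exp, log

def beta_and_lnW(U, levels, lo=1e-4, hi=60.0):
    """Per-oscillator energy U -> (beta, -sum P ln P), beta by bisection."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        w = [exp(-mid * e) for e in levels]
        if sum(e * wi for e, wi in zip(levels, w)) / sum(w) > U:
            lo = mid            # mean energy too high -> larger beta
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    w = [exp(-beta * e) for e in levels]
    Q = sum(w)
    return beta, -sum((wi / Q) * log(wi / Q) for wi in w if wi > 0)

levels1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]   # spacing hw
levels2 = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]   # spacing hw/2
M1, M2, E = 40, 60, 30.0

def total_lnW(E1):                          # lnW1*W2 = M1*lnW1/M1 + M2*lnW2/M2
    return (M1 * beta_and_lnW(E1 / M1, levels1)[1]
            + M2 * beta_and_lnW((E - E1) / M2, levels2)[1])

best = max((0.3 * k for k in range(1, 100)), key=total_lnW)
b1 = beta_and_lnW(best / M1, levels1)[0]
b2 = beta_and_lnW((E - best) / M2, levels2)[0]
print(round(b1, 2), round(b2, 2))  # nearly equal at the maximum of ln W1*W2
```

Despite the two systems having entirely different level structures, the maximizing split equalizes $\beta$, as the derivation above requires.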
4. Generalization of the model, and abstraction of the defining features.

We used a concrete system, a collection of non-interacting harmonic oscillators, to provide a simple introduction to the basic principles of statistical mechanics. In the next handout/chapter we will discuss a more formal approach, and it is good to see the connection between the two approaches. It is quite possible that this section makes more sense after digesting more of the next handout. In general one defines an ensemble of $M$ 'similar' non-interacting systems, and assumes the Schrödinger equation for the individual systems, $\hat{H} \psi_n = \varepsilon_n \psi_n$, to be solved. This is analogous to the use of the harmonic oscillator model. The wave function for the complete ensemble is then a product (or (anti)symmetrized product) of the individual system wave functions, $\Psi = \psi^{1}_{n_1} \psi^{2}_{n_2} \cdots \psi^{M}_{n_M}$. This ensemble wave function, analogous to the harmonic oscillator states $|n_1, n_2, \ldots, n_M\rangle$, has a large degeneracy, and the basic assumption is that each state is equally likely to occur (in an isolated ensemble of a particular total energy); we wish to take the average over all possible, accessible ensemble wave functions. This provides the connection to thermodynamic quantities. This averaging procedure is most easily done by defining configurations and counting how many individual systems are in quantum state $\psi_{n_k}$, and so forth. These configurations are analogous to the vectors $\mathbf{m} = (m_0, m_1, \ldots, m_N)$. In practice we only need the frequencies of occurrence $\phi_i = m_i / M$, and statistical averages can be obtained by considering exclusively the most likely configuration, in which the frequencies become the probabilities, $\phi_i \to P_i$. These probabilities corresponding to the most likely configuration are obtained by maximizing $-\sum_i P_i \ln P_i$, subject to constraints that define the particular characteristics of the ensemble. Different types of ensembles (constraints) then lead to different thermodynamic identifications. This will be the subject of the next handout.
5. Further Notes

1. Stirling's approximation. To evaluate $\ln n!$ for large $n$, we use

$\ln(n!) = \ln \left( n(n-1)(n-2) \cdots 1 \right) = \ln n + \ln(n-1) + \cdots + \ln 1 = \sum_{k=1}^{n} \ln k$

This sum is replaced by an integral to obtain Stirling's approximation:

$\sum_{k=1}^{n} \ln k \approx \int_1^n \ln x \, dx = \left[ x \ln x - x \right]_1^n = n \ln n - n + 1 \approx n \ln n - n$
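A quick numerical check of the approximation, using the identity `lgamma(n + 1)` $= \ln n!$:

```python
# Relative error of Stirling's approximation ln n! ~ n ln n - n.
from math import lgamma, log

def stirling(n):
    return n * log(n) - n

for n in (10, 100, 1000):
    rel_err = abs(lgamma(n + 1) - stirling(n)) / lgamma(n + 1)
    print(n, round(rel_err, 5))   # relative error shrinks as n grows
```

The relative error drops below $10^{-3}$ already at $n = 1000$, which is why the approximation is harmless for the huge $m_i$ of interest.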
2. Some comments on the simulation model. It was mentioned that one can start from an arbitrary state $|n_1, n_2, \ldots, n_M\rangle$ and then excite one (randomly chosen) oscillator by one quantum and de-excite another. Because we are using harmonic oscillators as the basic system, which have equidistant energy levels, this procedure precisely conserves the total energy of the state, and it provides a simple way to move randomly through the set of accessible and allowed states. If the energy spacings were arbitrary, it would not be so easy to prescribe a procedure that (randomly!) traverses the space of allowed states while preserving the total energy of the ensemble. So it is good to keep in mind that the suggested random moves work for the simulation model with harmonic oscillators, but one should not draw very general conclusions from this. The procedure needs to be refined to describe more general systems.