Analyzing a simple model to gain insight into the fundamental principles of statistical mechanics

In this chapter we will investigate a simple model to clearly understand the principles underlying statistical mechanics. Following the discussion of the model, we will abstract its main characteristics, and these will then be applied in a more formal discussion of the relation between statistical mechanics and thermodynamics in the next section.

1. Description of the model.

Consider a set of $M$ distinguishable, non-interacting harmonic oscillators. Each oscillator is in one of the available energy eigenstates, with energies $\varepsilon_n = n\varepsilon$, $n = 0, 1, 2, 3, \ldots$, where the zero-point energy of $\tfrac{1}{2}\varepsilon$ is neglected for convenience. The quantum states of the complete system are specified by listing the quantum number of each oscillator. We will indicate these states as $|n_1, n_2, \ldots, n_M\rangle$, $n_i = 0, 1, 2, \ldots$. The total energy of the system is given by

$$E = \sum_{i=1}^{M} n_i \, \varepsilon = N \varepsilon .$$

The key feature of the system is that there is a large degree of degeneracy. Any set of quantum numbers that adds up to $N$ has the same energy, but different (ordered) sets correspond to different states. To see how fast this number of states may grow, consider some very simple examples. In the representation below we list the possible configurations by specifying the quantum numbers that are occupied, listed from low to high. To get a count of the actual number of states, one has to keep track of the number of distinct permutations (= distinct states) of the quantum numbers in a particular configuration.

Example 1: 3 oscillators, $E = 5\varepsilon$.

    (n_i), low to high, sum = 5      # of distinct permutations of (n_i) (= states)
    0 0 5                            3
    0 1 4                            6
    0 2 3                            6
    1 1 3                            3
    1 2 2                            3

For instance, the first configuration comprises the three states $(5,0,0)$, $(0,5,0)$, $(0,0,5)$. In total there are 21 states.

Another example:

Example 2: 4 oscillators, $E = 6\varepsilon$.

    (n_i), low to high, sum = 6      # of distinct permutations = # of states / configuration
    0 0 0 6                          4  = 4!/(3! 1!)
    0 0 1 5                          12 = 4!/(2! 1! 1!)
    0 0 2 4                          12
    0 1 1 4                          12
    0 0 3 3                          6  = 4!/(2! 2!)
    0 1 2 3                          24 = 4!
    1 1 1 3                          4
    0 2 2 2                          4
    1 1 2 2                          6

In total there are 84 states.

As the number of oscillators grows, this is a tedious way of keeping track of things. It is more convenient to enumerate the configurations by listing how many oscillators have zero quanta, how many have one quantum, and so forth. This avoids writing long lists of 0's and 1's. In general we will have the situation that $M$, the number of oscillators, is much larger than the highest occupied level, and the alternative representation is more economical. Consider the following example:

Example 3: 6 oscillators, $E = 5\varepsilon$.

    # of oscillators in level
    0    1    2    3    4    5        # of permutations
    5                        1        6!/5!      = 6
    4    1              1             6!/4!      = 30
    4         1    1                  6!/4!      = 30
    3    2         1                  6!/(3! 2!) = 60
    3    1    2                       6!/(3! 2!) = 60
    2    3    1                       6!/(2! 3!) = 60
    1    5                            6!/5!      = 6

This adds up to 252 states in total. A configuration can hence be specified by the number of oscillators in each level. If an entry is left empty, it simply means there are no oscillators in that energy level. Let us denote the configuration by the vector $\mathbf{m} = (m_0, m_1, m_2, \ldots, m_N)$. In principle the configuration vector can be thought of as infinite, as there is an infinite number of energy levels for each oscillator. If the total energy is given by $N\varepsilon$, then $N$ is the highest level that can be occupied.
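This counting is easy to mechanize. Below is a minimal Python sketch (an addition of mine, not part of the original notes) that enumerates all configurations for given $M$ and $N$ and tallies the multinomial number of states per configuration; it reproduces the totals of Examples 1 and 2 (21 and 84 states) and the 252 states of Example 3.

```python
from itertools import combinations_with_replacement
from collections import Counter
from math import factorial

def count_states(M, N):
    """Enumerate configurations of M oscillators holding N quanta in total
    and add up the distinct permutations (= states) of each configuration."""
    total = 0
    # A configuration is a non-decreasing tuple of M quantum numbers summing to N.
    for config in combinations_with_replacement(range(N + 1), M):
        if sum(config) != N:
            continue
        perms = factorial(M)                 # multinomial: M! / (product of multiplicities!)
        for mult in Counter(config).values():
            perms //= factorial(mult)
        print(config, perms)
        total += perms
    return total

print(count_states(3, 5))   # Example 1: 21
print(count_states(4, 6))   # Example 2: 84
print(count_states(6, 5))   # Example 3: 252
```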
We have the following constraints on the configuration vector $\mathbf{m}$ so that it yields a particular total energy:

$$\sum_i m_i = M, \quad \text{the total number of oscillators},$$

$$\sum_i i \, m_i = N, \quad \text{where } N\varepsilon \text{ is the total energy, or equivalently } \sum_i m_i \varepsilon_i = E .$$

The number of permutations $W_{\mathbf{m}}$, i.e. the number of states corresponding to a particular configuration, is given by

$$W_{\mathbf{m}} = \frac{M!}{m_0! \, m_1! \cdots m_N!} = \frac{\left(\sum_i m_i\right)!}{\prod_i m_i!} .$$

This formula is well known in combinatorics, and it is easily checked against the examples given. As the number of oscillators becomes large, the number of states per configuration becomes a highly peaked distribution. This can be seen by revisiting the previous example, but now taking 100 oscillators with the same total energy $E = 5\varepsilon$.

Example 4: 100 oscillators, $E = 5\varepsilon$.

    # of oscillators in level
    0    1    2    3    4    5        # of permutations, W_m
    99                       1        100!/99!       = 100
    98   1              1             100!/98!       ≈ 100^2
    98        1    1                  100!/98!       ≈ 100^2
    97   2         1                  100!/(97! 2!)  ≈ 100^3 / 2
    97   1    2                       100!/(97! 2!)  ≈ 100^3 / 2
    96   3    1                       100!/(96! 3!)  ≈ 100^4 / 6
    95   5                            100!/(95! 5!)  ≈ 100^5 / 120

As can be seen, the last two configurations comprise far more states than the others, and this effect grows as the number of oscillators increases.

The basic principle of statistical mechanics is that macroscopic properties are evaluated as averages over all possible states, and that each state in an isolated system (of specific total energy, total volume and total number of particles) is equally likely. This is a fundamental postulate of statistical mechanics; Gibbs called it the principle of equal a priori probability. Hence in Example 4 above each quantum state is equally likely. But this means that the last two configurations in the table are far more likely than any of the other configurations: they have far more states corresponding to them. Since any property is the same for each state in a configuration (the states differ only by a permutation of quantum numbers over equivalent oscillators), it follows that the average value is dominated by the contributions from the most likely configurations (configurations comprising many states). If one deals with a very large number of particles (on the order of $10^{23}$, say), then the most likely configuration contains overwhelmingly more states than the other configurations. Hence, if one were to plot the number of states as a function of $\mathbf{m}$, one would find a very peaked distribution, like a delta function. Moreover, configurations that are still somewhat likely compared to the most likely configuration differ only a little from it, and any property of such a configuration is close to the same property of the maximum configuration. For example, the most likely configuration may be given by $(100000, 1000, 100, 10)$, and another similar configuration that also corresponds to a large number of states might be $(100009, 990, 101, 10)$. Any calculated property for this configuration (averages over the individual oscillators) would be similar to that for the most likely configuration. For this reason, in statistical mechanics one proceeds by calculating the most likely configuration, and one obtains properties for this most likely configuration. The error introduced, compared to taking the full average (the formally exact definition of the macroscopic property), is negligible if the number of particles is (very) large. We will do some numerical experiments to test this assumption in the exercises, or by looking at computer simulations in class.
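As a first small check, the exact $W_{\mathbf{m}}$ values of Example 4 can be computed directly. A minimal sketch (my addition, not part of the original notes):

```python
from math import factorial

def W(m, M):
    """States in a configuration m = {level: occupancy}; levels not listed
    are empty, and 0! = 1 leaves the product unchanged."""
    w = factorial(M)
    for occ in m.values():
        w //= factorial(occ)
    return w

M = 100
configs = [                       # the seven configurations with E = 5 epsilon
    {0: 99, 5: 1},
    {0: 98, 1: 1, 4: 1},
    {0: 98, 2: 1, 3: 1},
    {0: 97, 1: 2, 3: 1},
    {0: 97, 1: 1, 2: 2},
    {0: 96, 1: 3, 2: 1},
    {0: 95, 1: 5},
]
for m in configs:
    print(m, W(m, M))
```

The last two entries (about $1.6 \times 10^7$ and $7.5 \times 10^7$) dominate the first (100) by several orders of magnitude, in line with the rough powers of 100 quoted in the table.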
2. Determination of the most likely configuration corresponding to a particular total energy.

The number of states corresponding to a particular configuration $\mathbf{m} = (m_0, m_1, m_2, \ldots)$ is given by

$$W_{\mathbf{m}} = \frac{M!}{m_0! \, m_1! \cdots m_N!} = \frac{\left(\sum_i m_i\right)!}{\prod_i m_i!} .$$

We wish to find the particular configuration $\mathbf{m}^*$ for which this number is a maximum. However, $W_{\mathbf{m}}$ is a gigantically large number, and we therefore typically work with $\ln W_{\mathbf{m}}$ (employing Stirling's formula for the factorials), which is a monotonically increasing function of $W_{\mathbf{m}}$. Hence, rather than $W_{\mathbf{m}}$, we maximize $\ln W_{\mathbf{m}}$, which leads to the same most likely configuration. In addition we need to impose constraints on the configuration vector $\mathbf{m}$ such that it yields a particular total energy $E$ and corresponds to a particular total number of oscillators:

$$\sum_i m_i = M, \qquad \sum_i m_i \varepsilon_i = E .$$

Such a constrained optimization is best performed using Lagrange's method of undetermined multipliers. In this procedure one defines the function to be optimized as

$$F(\mathbf{m}, \beta, \gamma) = \ln W_{\mathbf{m}} - \beta \Big( \sum_i m_i \varepsilon_i - E \Big) - \gamma \Big( \sum_i m_i - M \Big) .$$

The function $F$ is the original function to be optimized, plus an undetermined multiplier times each constraint. (The signs are chosen with hindsight, such that the multipliers will be positive numbers.) This function can then be used in an unconstrained optimization, provided that the function is also made stationary with respect to changes in the multipliers. The function is hence required to be stationary in the variables $m_k$, $\beta$, $\gamma$. The stationarity conditions with respect to the Lagrange multipliers yield

$$\frac{\partial F}{\partial \beta} = 0 \;\Rightarrow\; \sum_i m_i \varepsilon_i - E = 0 , \qquad \frac{\partial F}{\partial \gamma} = 0 \;\Rightarrow\; \sum_i m_i - M = 0 .$$

These are precisely the constraints. If they are satisfied, the function $F(\mathbf{m}, \beta, \gamma)$ reduces to $\ln W_{\mathbf{m}}$. The other stationarity conditions are

$$\frac{\partial F}{\partial m_k} = \frac{\partial \ln W_{\mathbf{m}}}{\partial m_k} - \beta \varepsilon_k - \gamma = 0 .$$

Hence, to carry out the optimization, we need the partial derivatives of

$$\ln W_{\mathbf{m}} = \ln \frac{M!}{\prod_i m_i!} = \ln M! - \sum_i \ln m_i! \; .$$

To evaluate the logarithms of factorials we use Stirling's approximation (see Further Notes, 1):

$$\ln n! \approx n \ln n - n ,$$

and hence

$$\ln W_{\mathbf{m}} = M \ln M - M - \sum_i \left( m_i \ln m_i - m_i \right)$$

and

$$\frac{\partial \ln W_{\mathbf{m}}}{\partial m_k} = -\ln m_k - 1 + 1 = -\ln m_k .$$

Combining this with the expression for $\partial F / \partial m_k = 0$, we find

$$-\ln m_k - \beta \varepsilon_k - \gamma = 0 \;\Rightarrow\; m_k = e^{-\gamma} e^{-\beta \varepsilon_k} .$$

Now, using the constraint on the total number of oscillators, we can eliminate $\gamma$:

$$\sum_k m_k = e^{-\gamma} \sum_k e^{-\beta \varepsilon_k} = M \;\Rightarrow\; e^{-\gamma} = \frac{M}{\sum_k e^{-\beta \varepsilon_k}} \;\Rightarrow\; m_k = M \, \frac{e^{-\beta \varepsilon_k}}{\sum_k e^{-\beta \varepsilon_k}} .$$

Let us note that by invoking Stirling's approximation we treat the integers $m_k$ as continuous variables, and at the optimum the $m_k$ may no longer be integers. For very large $M$ this is hardly a problem. For small $M$, we can choose the integers closest (by rounding) to the optimal fractional numbers, and following the above recipe we will be 'close' to the most likely configuration.

Imposing the constraint on the total energy in principle determines the parameter $\beta$:

$$\sum_k \varepsilon_k m_k = M \, \frac{\sum_k \varepsilon_k e^{-\beta \varepsilon_k}}{\sum_k e^{-\beta \varepsilon_k}} = E .$$

This equation is not easily solved for $\beta$, however. The most likely configuration can instead be thought of as an explicit function of $\beta$, which then determines the total energy in the system, $E$.

Let us take stock at this moment of what has been achieved, and provide some further interpretation. For the most likely configuration $\mathbf{m}^*$ we have obtained

$$\frac{m_k^*}{M} = \frac{e^{-\beta \varepsilon_k}}{\sum_k e^{-\beta \varepsilon_k}} \equiv P_k .$$

For any configuration the quantity $m_k / M$ is the fraction of oscillators in energy level $k$, which in Metiu would be denoted as $\nu_k$, the frequency of the energy level. The most likely configuration takes on a special significance, as this configuration can be used to determine the average values in the ensemble. The fractions in the most likely configuration are therefore denoted the probabilities, $P_k$, to find an oscillator in energy level $k$.
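In practice the implicit equation for $\beta$ is solved numerically. Below is a minimal bisection sketch (my addition; the unit level spacing and the truncation of the ladder at level 200 are illustrative assumptions) that finds the $\beta$ reproducing a prescribed average energy per oscillator:

```python
from math import exp

def avg_energy(beta, levels):
    """<eps>(beta) = sum_k eps_k e^(-beta eps_k) / sum_k e^(-beta eps_k)."""
    w = [exp(-beta * e) for e in levels]
    return sum(e * x for e, x in zip(levels, w)) / sum(w)

def solve_beta(target, levels, lo=1e-6, hi=50.0, tol=1e-12):
    """Bisection on beta: avg_energy decreases monotonically in beta."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if avg_energy(mid, levels) > target:
            lo = mid                 # energy still too high -> larger beta
        else:
            hi = mid
    return 0.5 * (lo + hi)

levels = list(range(201))            # harmonic ladder eps_k = k, truncated at 200
M, N = 100, 5                        # Example 4: 100 oscillators, E = 5 epsilon
beta = solve_beta(N / M, levels)
print(beta)
```

For the untruncated harmonic ladder this can be checked against the closed form $\langle \varepsilon \rangle = \varepsilon / (e^{\beta\varepsilon} - 1)$, which gives $\beta = \ln(1 + M/N) = \ln 21 \approx 3.04$ for Example 4's average energy of $0.05\,\varepsilon$ per oscillator.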
In the most likely configuration these frequencies, or equivalently the probabilities, are given by a formula that is identical to the Boltzmann distribution, if we identify $\beta = 1/kT$. The average energy per oscillator in the ensemble is then given by

$$\langle \varepsilon \rangle = \frac{E}{M} = \sum_k P_k \varepsilon_k = \frac{\sum_k \varepsilon_k e^{-\beta \varepsilon_k}}{\sum_k e^{-\beta \varepsilon_k}} ,$$

and the average energy depends on the parameter $\beta$. Of course we can also define a partition function for future convenience:

$$Q(\beta) = \sum_k e^{-\beta \varepsilon_k} .$$

The quantity $\ln W_{\mathbf{m}} = M \ln M - M - \sum_i (m_i \ln m_i - m_i)$ can be written as

$$\ln W_{\mathbf{m}} = M \ln M - \sum_i m_i \ln m_i = M \ln M - \sum_i M \nu_i \ln (M \nu_i) = -M \sum_i \nu_i \ln \nu_i ,$$

where $\nu_i = m_i / M$ and we used $\sum_i \nu_i = 1$. For the most likely distribution we would write

$$\ln W_{\mathbf{m}^*} = -M \sum_i P_i \ln P_i .$$

The quantity $\ln W_{\mathbf{m}^*}$ scales linearly with the number of oscillators, i.e. with the size of the system, and is extensive. The average quantity

$$\overline{\ln W} \equiv \frac{\ln W_{\mathbf{m}^*}}{M} = -\sum_i P_i \ln P_i$$

is independent of the size of the system, and this is the fundamental quantity that is optimized (under constraints) to reach the most likely distribution. It is of interest to note that the sum runs over the levels of the oscillator; there is no reference anymore to the total number of oscillators $M$. As discussed before in class, following the text by Metiu, entropy is defined as $S = -k_B \sum_i P_i \ln P_i$, and optimizing the probabilities to find the most likely configuration is precisely equivalent to finding the state of maximum 'entropy'.

It is of interest to redo the derivation of the most likely configuration using the probabilities as the fundamental variables. The problem is to find the probability distribution such that $-\sum_i P_i \ln P_i$ is a maximum under the constraint $\sum_i P_i = 1$, a normalization condition, and the condition that the average energy has a certain value, $\sum_i \varepsilon_i P_i = \langle \varepsilon \rangle$, which reflects the conservation of energy for the complete system. This maximizing probability distribution is overwhelmingly likely, if one assumes that every quantum state of a given total energy is equally likely in an isolated system at equilibrium. All averages can be obtained from the most likely configuration and the associated maximizing probability distribution. Hence it is of key importance to calculate this most likely distribution. To this end, define the function

$$F(P_i, \gamma, \beta) = -\sum_i P_i \ln P_i - \gamma \Big( \sum_i P_i - 1 \Big) - \beta \Big( \sum_i \varepsilon_i P_i - \langle \varepsilon \rangle \Big) ,$$

which, as before, includes two Lagrange multipliers to impose the constraints, and impose stationarity of this functional with respect to all parameters. Besides the constraints, which derive from stationarity with respect to the multipliers, we find

$$\frac{\partial F}{\partial P_j} = -\ln P_j - 1 - \gamma - \beta \varepsilon_j = 0 \;\Rightarrow\; P_j = e^{-(\gamma + 1)} e^{-\beta \varepsilon_j} ,$$

and from the normalization condition

$$\sum_j P_j = e^{-(\gamma + 1)} \sum_j e^{-\beta \varepsilon_j} = 1 \;\Rightarrow\; e^{-(\gamma + 1)} = 1 \Big/ \sum_j e^{-\beta \varepsilon_j} .$$

In other words, the probabilities in the most likely distribution are given by

$$P_j^*(\beta) = \frac{e^{-\beta \varepsilon_j}}{\sum_j e^{-\beta \varepsilon_j}} .$$

The parameter $\beta$ is in one-to-one correspondence with the average energy of an oscillator (the energy constraint):

$$\langle \varepsilon \rangle = \frac{\sum_i \varepsilon_i e^{-\beta \varepsilon_i}}{\sum_i e^{-\beta \varepsilon_i}} .$$

Since the Lagrange multiplier corresponding to overall normalization can always easily be eliminated, we can use a shortcut and make

$$f(P_i, \beta) = -\sum_i P_i \ln P_i - \beta \sum_i \varepsilon_i P_i$$

stationary, giving

$$-\ln \tilde{P}_i - 1 - \beta \varepsilon_i = 0 \;\Rightarrow\; \tilde{P}_i \propto e^{-\beta \varepsilon_i} ,$$

where the $\tilde{P}_i$ are relative probabilities, not yet properly normalized (the constant factor coming from the $-1$ is absorbed in the normalization). Then we can define a partition function and normalized probabilities accordingly:

$$Q(\beta) = \sum_i \tilde{P}_i = \sum_i e^{-\beta \varepsilon_i} , \qquad P_i(\beta) = \tilde{P}_i / Q = e^{-\beta \varepsilon_i} / Q .$$

In the sequel we will follow this procedure for many different types of "ensembles". We can impose different constraints, but we always optimize the basic quantity $-\sum_i P_i \ln P_i$ to find the appropriate formulas for the probabilities and partition functions.
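For the harmonic ladder $\varepsilon_j = j\varepsilon$ the partition function is a geometric series, $Q(\beta) = 1/(1 - e^{-\beta\varepsilon})$, which gives the closed form $\langle \varepsilon \rangle = \varepsilon / (e^{\beta\varepsilon} - 1)$. A quick numerical cross-check of these expressions, and of the identity $-\sum_i P_i \ln P_i = \beta \langle \varepsilon \rangle + \ln Q$ that follows from $\ln P_i = -\beta\varepsilon_i - \ln Q$ (my addition; $\varepsilon = 1$ and the truncation level are illustrative choices):

```python
from math import exp, log

beta, eps, N = 0.7, 1.0, 400          # inverse temperature, level spacing, truncation

boltz = [exp(-beta * eps * j) for j in range(N + 1)]
Q = sum(boltz)                        # truncated partition function
P = [b / Q for b in boltz]            # Boltzmann probabilities

avg_E = sum(j * eps * p for j, p in enumerate(P))
entropy = -sum(p * log(p) for p in P if p > 0)

print(Q, 1 / (1 - exp(-beta * eps)))          # Q versus geometric-series closed form
print(avg_E, eps / (exp(beta * eps) - 1))     # <eps> versus closed form
print(entropy, beta * avg_E + log(Q))         # -sum P ln P = beta <eps> + ln Q
```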
As will be discussed further in one of the problem sets, one can equilibrate a system like the one introduced above by starting from an arbitrary state $|n_1, n_2, \ldots, n_M\rangle$ of a certain (total) energy, defined by the quantum numbers of the individual oscillators, and then randomly changing the state by raising one (randomly chosen) oscillator by one level and lowering another (randomly chosen) oscillator by one level, such that the total energy is conserved. (This is one possible procedure; see also Further Notes, 2.) This random perturbation of the state is repeated many times, and at each instant in the simulation one can count how many oscillators are in level 0 ($m_0$), how many in level 1 ($m_1$), and so forth. This allows one to define the fraction of oscillators in each level, $\nu_i = m_i / M$. By applying many random moves, the system will reach "equilibrium", meaning that the fractions reach the probabilities of the most likely distribution, independent of the starting state. The quantity $-\sum_i \nu_i \ln \nu_i$ gradually increases (while fluctuating) to reach its maximum value $-\sum_i P_i \ln P_i$. Once the system has reached equilibrium, the fractions keep fluctuating around the most probable 'Boltzmann' values. One can calculate average properties by calculating properties for each state $|n_1, n_2, \ldots, n_M\rangle$ at a particular instant in the simulation and averaging the results. This will agree (within statistical noise) with the averages from statistical mechanics using the probabilities $P_i$. One can also calculate fluctuations from the average values; it is harder to reach convergence for these quantities, and one needs to run the simulation for a long time.

It is interesting to reflect on this numerical experiment. There is no real dynamics in the system, just random moves. Because the most likely configuration (particular values of the $m_i$) has by far the most states, and because we randomly sample the various accessible states (all of the same energy), the system spends most of its time in the most likely configuration. It can move away from it, lowering the function $-\sum_i \nu_i \ln \nu_i$, but it is unlikely to deviate much once equilibrium is reached. The fluctuations in the $\nu_i$ are very small for simulations (or experiments) involving a very large number of molecules, as are encountered in real systems. A minimal implementation of this numerical experiment is sketched below.
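The following Python sketch is my addition, not part of the original notes; the system size, total energy, and number of moves are arbitrary illustrative choices. Moves that would take an oscillator below the ground state are rejected, which keeps the random walk over the accessible equal-energy states unbiased.

```python
import random
from math import log

M, N, moves = 1000, 5000, 200_000   # oscillators, total quanta, random moves
random.seed(1)

n = [N // M] * M                    # arbitrary start: quanta spread evenly

for _ in range(moves):
    i, j = random.randrange(M), random.randrange(M)
    if n[i] > 0:                    # lower oscillator i by one quantum...
        n[i] -= 1
        n[j] += 1                   # ...and raise oscillator j: energy conserved

# Fractions nu_k = m_k / M of oscillators found in each level k.
m = [0] * (max(n) + 1)
for q in n:
    m[q] += 1
nu = [mk / M for mk in m]

print([round(x, 4) for x in nu[:6]])           # compare to P_k
print(-sum(x * log(x) for x in nu if x > 0))   # approaches -sum_k P_k ln P_k
```

With an average of $N/M = 5$ quanta per oscillator, the fractions should settle near the geometric Boltzmann values $P_k = (1/6)(5/6)^k \approx 0.167, 0.139, 0.116, \ldots$, independent of the starting state.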
3. Towards a thermodynamical interpretation of the basic quantities.

We will assume the most likely configuration and define average quantities accordingly. Up to this point we have obtained and/or defined the following:

$$P_i = \frac{e^{-\beta\varepsilon_i}}{\sum_i e^{-\beta\varepsilon_i}} = \frac{e^{-\beta\varepsilon_i}}{Q} , \qquad Q = \sum_i e^{-\beta\varepsilon_i} , \qquad U = \sum_i P_i \varepsilon_i , \qquad \overline{\ln W} = -\sum_i P_i \ln P_i .$$

We would like to determine from first-principles considerations how to interpret $\beta$, and also $\ln W$, and in this way provide the connection to thermodynamics. Both $U$ and $\overline{\ln W}$ are functions of $\beta$. Let us first investigate their derivatives:

$$\frac{\partial U}{\partial \beta} = \sum_i \frac{\partial P_i}{\partial \beta} \, \varepsilon_i \qquad \text{(this expression will be used below)}$$

$$= \sum_i \varepsilon_i \frac{\partial}{\partial \beta}\left(\frac{e^{-\beta\varepsilon_i}}{Q}\right) = -\sum_i \varepsilon_i^2 P_i + \Big(\sum_i \varepsilon_i P_i\Big)\Big(\sum_j \varepsilon_j P_j\Big) = -\big( \langle \varepsilon^2 \rangle - \langle \varepsilon \rangle^2 \big) = -\sum_i P_i \left( \varepsilon_i - \langle \varepsilon \rangle \right)^2 .$$

Hence we see that $\partial U / \partial \beta$ is always negative. This derivative is directly related to fluctuations around the mean energy, i.e. the variance of the energy. We will return to this issue at a later time. Next consider

$$\frac{\partial \overline{\ln W}}{\partial \beta} = -\sum_i \left( \frac{\partial P_i}{\partial \beta} \ln P_i + P_i \frac{1}{P_i} \frac{\partial P_i}{\partial \beta} \right) = -\sum_i \frac{\partial P_i}{\partial \beta} \left( -\beta \varepsilon_i - \ln Q \right) - \sum_i \frac{\partial P_i}{\partial \beta}$$

$$= \beta \sum_i \frac{\partial P_i}{\partial \beta} \, \varepsilon_i + (\ln Q - 1) \sum_i \frac{\partial P_i}{\partial \beta} = \beta \, \frac{\partial U}{\partial \beta} ,$$

where we used the first line of the $\partial U / \partial \beta$ analysis and the fact that $\frac{\partial}{\partial \beta} \sum_i P_i = 0$. (I am using partial derivatives here out of habit. The quantities at this moment depend only on $\beta$, and it would be more appropriate to use total derivatives.)

It then follows that

$$\frac{\partial U}{\partial \beta} = \frac{1}{\beta} \frac{\partial \overline{\ln W}}{\partial \beta} = \frac{1}{M \beta} \frac{\partial \ln W}{\partial \beta} ,$$

which we will use in a moment.

To illuminate the meaning of the parameter $\beta$, consider two systems, the first consisting of $M_1$ oscillators, with $W_1(\beta_1)$ states (in the most likely configuration) and average energy per oscillator $U_1(\beta_1)$. The other system analogously consists of $M_2$ oscillators, with $W_2(\beta_2)$ states in the most likely configuration and average energy per oscillator $U_2(\beta_2)$. The values of $\beta$ are initially different in these two systems. In addition, the oscillators could have totally different energy levels in the two systems. Let us assume that these two systems can exchange energy, and we let them equilibrate such that they reach the most likely configuration while preserving the total energy. The total number of states is given by $W_1 W_2$, and hence we maximize the function

$$F(\beta_1, \beta_2) = \ln W_1 W_2 - \lambda \big( M_1 U_1 + M_2 U_2 \big) = \ln W_1(\beta_1) + \ln W_2(\beta_2) - \lambda \big( M_1 U_1(\beta_1) + M_2 U_2(\beta_2) \big) ,$$

which includes a constraint to preserve the total energy (and an associated Lagrange multiplier $\lambda$). The maximum is reached when this function is stationary with respect to changes in $\beta_1, \beta_2$, hence

$$\frac{\partial \ln W_1}{\partial \beta_1} - \lambda M_1 \frac{\partial U_1}{\partial \beta_1} = 0 , \qquad \frac{\partial \ln W_2}{\partial \beta_2} - \lambda M_2 \frac{\partial U_2}{\partial \beta_2} = 0 ,$$

and using the relation $\partial \ln W / \partial \beta = M \beta \, \partial U / \partial \beta$ derived previously, we obtain

$$M_1 \beta_1 \frac{\partial U_1}{\partial \beta_1} - \lambda M_1 \frac{\partial U_1}{\partial \beta_1} = 0 , \qquad M_2 \beta_2 \frac{\partial U_2}{\partial \beta_2} - \lambda M_2 \frac{\partial U_2}{\partial \beta_2} = 0 .$$

Since $\partial U / \partial \beta$ never vanishes, this gives $\beta_1 = \lambda = \beta_2$: we obtain the important result that the most likely (equilibrium) configuration is reached if $\beta_1 = \beta_2$. This situation is well known from thermodynamics: if two systems can exchange energy, they reach (thermal) equilibrium when they have the same temperature. Hence we deduce that the Lagrange multiplier $\beta$ plays the same role as temperature in thermodynamics: $\beta = f(T)$, where $f$ indicates some monotonic and universal function of temperature. These latter conditions are strong, and one might need to do some hand waving to justify them. (Universal, as we did not make any assumptions about the second system; monotonic, as otherwise one value of $\beta$ might correspond to two different temperatures, or conversely.)

With this piece of knowledge established, let us return once more to the relation $\partial U / \partial \beta = \frac{1}{\beta} \, \partial \overline{\ln W} / \partial \beta$ and substitute $\beta = f(T)$. Hence

$$\frac{dU}{dT} = \frac{dU}{d\beta} \frac{d\beta}{dT} = \frac{1}{f(T)} \frac{d \overline{\ln W}}{dT} .$$

Comparing with the thermodynamic relation $dU = T \, dS$, i.e. $dU/dT = T \, dS/dT$, it is now easy to see that the identifications

$$S = k_B \overline{\ln W} = -k_B \sum_i P_i \ln P_i \qquad \text{and} \qquad \beta = \frac{1}{k_B T}$$

are suggested by this relation, where $k_B$ could still be an arbitrary constant, but it turns out to be the Boltzmann constant. The above is only a plausibility argument, not a first-principles derivation of the connection between thermodynamics and statistical mechanics. I am not even sure the connection necessarily has to be the way it is; I have not yet encountered fully convincing arguments in the literature.
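The fluctuation relation $\partial U / \partial \beta = -(\langle \varepsilon^2 \rangle - \langle \varepsilon \rangle^2)$ derived above is easy to confirm numerically. A small sketch (my addition; the unit level spacing and the truncated ladder are illustrative assumptions):

```python
from math import exp

def moments(beta, levels):
    """Return <eps> and <eps^2> for the Boltzmann distribution on the given levels."""
    w = [exp(-beta * e) for e in levels]
    Q = sum(w)
    avg = sum(e * x for e, x in zip(levels, w)) / Q
    avg2 = sum(e * e * x for e, x in zip(levels, w)) / Q
    return avg, avg2

levels = list(range(300))             # harmonic ladder eps_k = k, truncated
beta, h = 0.9, 1e-6

U_plus, _ = moments(beta + h, levels)   # central finite difference for dU/dbeta
U_minus, _ = moments(beta - h, levels)
dU_dbeta = (U_plus - U_minus) / (2 * h)

avg, avg2 = moments(beta, levels)
print(dU_dbeta, -(avg2 - avg ** 2))   # the two numbers should agree
```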
4. Generalization of the model, and abstraction of the defining features.

We used a concrete system, a collection of non-interacting harmonic oscillators, to provide a simple introduction to the basic principles of statistical mechanics. In the next handout/chapter we will discuss a more formal approach, and it is good to see the connection between the two approaches. It may well be that this section makes more sense after digesting more of the next handout.

In general one defines an ensemble of $M$ 'similar' non-interacting systems, and assumes the Schrödinger equation for the individual systems, $\hat{H} \psi_n = \varepsilon_n \psi_n$, to be solved. This is analogous to the use of the harmonic oscillator model. The wave function for the complete ensemble is then a product (or (anti)symmetrized product) of the individual system wave functions, $\Psi = \psi_{1 n_1} \psi_{2 n_2} \cdots \psi_{M n_M}$. This ensemble wave function, analogous to the harmonic oscillator states $|n_1, n_2, \ldots, n_M\rangle$, has a large degeneracy, and the basic assumption is that each state is equally likely to occur (in an isolated ensemble of a particular total energy). We wish to take the average over all possible, accessible, ensemble wave functions; this provides the connection to thermodynamic quantities. This averaging procedure is most easily done by defining configurations, counting how many individual systems are in quantum state $n_k$, and so forth. These configurations are analogous to the vectors $\mathbf{m} = (m_0, m_1, \ldots, m_N)$. In practice we only need the frequencies of occurrence $\nu_i = m_i / M$, and statistical averages can be obtained by considering exclusively the most likely configuration, in which the frequencies become the probabilities, $\nu_i \to P_i$. These probabilities corresponding to the most likely configuration are obtained by maximizing $-\sum_i P_i \ln P_i$, subject to constraints that define the particular characteristics of the ensemble. Different types of ensembles (constraints) then lead to different thermodynamic identifications. This will be the subject of the next handout.

5. Further Notes

1. Stirling's approximation. To evaluate $\ln n!$ for large $n$, we use

$$\ln(n!) = \ln\big( n (n-1)(n-2) \cdots 1 \big) = \ln n + \ln(n-1) + \cdots + \ln(1) = \sum_{k=1}^{n} \ln k .$$

This sum is replaced by an integral to obtain Stirling's approximation:

$$\sum_{k=1}^{n} \ln k \approx \int_1^n \ln x \, dx = \big[ x \ln x - x \big]_1^n = n \ln n - n + 1 \approx n \ln n - n .$$

2. Some comments on the simulation model. It was mentioned that one can start from an arbitrary state $|n_1, n_2, \ldots, n_M\rangle$ and then excite one (randomly chosen) oscillator by one quantum and de-excite another. Because we are using a harmonic oscillator as the basic system, which has equidistant energy levels, this procedure precisely conserves the total energy of the state, and it provides a simple way to move randomly through the set of accessible and allowed states. If the energy spacings were general, it would not be so easy to prescribe a procedure that (randomly!) traverses the space of allowed states while preserving the total energy of the ensemble. So it is good to keep in mind that the suggested random moves work for the simulation model with harmonic oscillators, but one should not draw very general conclusions from it. It would need to be refined to describe more general systems.
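As a quick numerical check of note 1 (my addition), one can compare $\ln n!$, available exactly via the log-gamma function, with $n \ln n - n$:

```python
from math import lgamma, log

# lgamma(n + 1) equals ln(n!) up to floating-point rounding.
for n in (10, 100, 1000, 10**6):
    exact = lgamma(n + 1)
    approx = n * log(n) - n
    print(n, exact, approx, exact - approx)
```

The absolute difference grows only like $\tfrac{1}{2}\ln(2\pi n)$ (the next term of the Stirling series), so the relative error vanishes for the large $n$ relevant here.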