PHYS 3651 The Physical Universe
Dr. S.C.Y. Ng
[email protected]
Contents
Overview
1 Spherical Astronomy
  1.1 Sky and Celestial Sphere
  1.2 Equatorial Coordinate System
    1.2.1 Longitude and Latitude
    1.2.2 Motion of the Sun
    1.2.3 Special Points
    1.2.4 Equatorial Coordinates
    1.2.5 Circumpolar Stars
    1.2.6 Great Circle
  1.3 Other Celestial Coordinate Systems
  1.4 Limitations of Coordinate Systems
    1.4.1 Precession
    1.4.2 Aberration of Light
    1.4.3 Parallax
2 Light and Telescopes
  2.1 Electromagnetic Wave
  2.2 Magnitudes
  2.3 Spectrum, Spectral Lines, and Atoms
  2.4 Optics and Telescopes
    2.4.1 Basics
    2.4.2 Refracting telescopes
    2.4.3 Reflecting and catadioptric telescopes
    2.4.4 Magnification and resolution
    2.4.5 Lens speed
  2.5 CCD
3 Celestial Mechanics
  3.1 Newton’s Laws of Motion
  3.2 Newton’s Gravitation
    3.2.1 Roche Lobe
    3.2.2 Critical Density of the Universe
    3.2.3 Virial Theorem
  3.3 Two-body Problem
    3.3.1 Kepler’s Laws of Planetary Motion
    3.3.2 Orbits in Two-body Problem
    3.3.3 Proof of Kepler’s Laws
  3.4 Impact Parameter and Scattering Angle
  3.5 Restricted Three-body Problem
4 Introduction to Radiative Processes
  4.1 Solid Angle
  4.2 Specific Intensity and Flux
  4.3 Emission and Absorption
  4.4 Basics of Statistical Mechanics (Optional)
    4.4.1 Thermodynamics
    4.4.2 Isolated Systems
    4.4.3 Systems in a Heat Bath
    4.4.4 The Perfect Classical Gas
    4.4.5 The Partition Function
    4.4.6 The Perfect Quantal Gas and Quantum Statistics
    4.4.7 The Partition Function
    4.4.8 Derivation of Blackbody Radiation
  4.5 Physics of Blackbody Radiation
    4.5.1 Stefan-Boltzmann law
    4.5.2 Rayleigh-Jeans Law
    4.5.3 Wien Law
    4.5.4 Wien’s Displacement Law
    4.5.5 Monotonicity
    4.5.6 Temperature Definitions
  4.6 Scattering Cross Section
  4.7 Chemical Potential and Saha Equation (Optional)
Overview
• What is this course about?
This is an introductory course to astrophysics, likely your first formal course
in astrophysics.
• What will I learn in this course?
The aim is to provide essential knowledge of the mathematical tools and physical concepts that are used in astrophysics, in order to help you appreciate the physical principles of how the Universe works and to set the stage for more serious
courses in astrophysics. Throughout this course, you may not see pretty astronomy pictures, but you will see a lot of physics equations. These are what professional
astrophysicists see most of the time, and how we extract scientific results from
observations.
• What is astrophysics about?
Astrophysics is the study of the physical properties of objects in the Universe
and their interactions. It has a close connection with many other fields in
physics, including mechanics, electromagnetism, statistical mechanics, relativity, and even quantum physics in some cases. As we believe that the laws
of physics are universal, we can apply those laws we developed on Earth to
the celestial objects to understand how they work. In some cases, we can also
do it the other way round: use the Universe as a laboratory to test the laws
of physics, in particular, under the most extreme conditions that can never be
reproduced on Earth. Some may consider astronomers to be those who study the
physical properties of celestial objects and astrophysicists to be those who use
celestial objects to study physics.
• Why observations?
Unlike other branches of physics, astrophysics relies heavily on observations.
This is because, owing to the distances and scales of celestial objects, it is
very difficult (if not impossible) to collect samples for detailed studies or to
reproduce them in laboratories. Fundamentally, there are only four things
one can measure with astronomical observations: position, spectrum (flux),
time, and polarisation. In Chapter 1, we will introduce the coordinate systems, which are the fundamentals in positional astronomy. They are also
like the language of astronomy when we want to specify an object to another
astronomer. Different coordinates and their limitations will be discussed.
• Course overview:
In traditional astronomy, electromagnetic radiation is the main cosmic messenger we rely on. In Chapter 2, we will talk about the basics of light and
how we detect it using telescopes. In Chapter 4, we will study how radiation is generated and how it propagates. This can give insights into the physical
processes happening inside celestial objects. Together these topics provide an
introduction to spectroscopy.
In Chapter 3, we will review Newtonian mechanics and apply it to simple
two-body problems. We will see how the orbital motions of celestial bodies
are governed by simple laws of physics.
A note on units: Throughout this course, we will follow the tradition in this
field and use cgs units, i.e. cm, g, and s. This is what you will see in research
papers. Force is expressed in dyn (= g cm/s²) and energy in erg
(= g cm²/s²). Charge is in electrostatic units of charge (esu), so that Coulomb’s law
becomes F = q₁q₂/r².
Exercise: 1 N = ____ dyn and 1 J = ____ erg.
Syllabus
Ch. 1 Spherical Astronomy:
Sky and Celestial Sphere, Equatorial Coordinate System, Other Celestial Coordinate Systems, limitations of the coordinate systems.
Ch. 2 Light and Telescope: Electromagnetic Wave, Magnitudes, Spectrum,
Spectral Lines, and Atoms, Optics and Telescopes, CCD.
Ch. 3 Celestial Mechanics: Newton’s Laws of Motion, Kepler’s Laws, Two-body
Problems, Scattering, Restricted Three-body Problem.
Ch. 4 Introduction to Radiation Processes: Solid Angle, Specific Intensity
and Flux, Emission and Absorption, Blackbody Radiation.
Learning objectives
Ch. 1 Spherical Astronomy:
Understand the definition and limitations of coordinate systems.
Manage to apply the equatorial coordinates and to determine the angular separation
between points.
Ch. 2 Light and Telescope:
Be able to convert between magnitude and flux.
Understand the formation of spectral lines.
Be able to compare the working principles of different types of telescopes and to
calculate the diffraction limit.
Ch. 3 Celestial Mechanics:
Manage to calculate orbits of celestial bodies using Newton’s gravitation law.
Manage to solve the two-body problem and derive Kepler’s laws.
Understand the physical significance of Lagrangian points.
Ch. 4 Introduction to Radiation Processes:
Understand the basic terminology of radiative transfer and the transfer equation.
Manage to apply the blackbody radiation to astrophysical situations.
Chapter 1
Spherical Astronomy
(Chapters 1 and 3.1 in textbook.)
Astronomy is the science studying objects in the sky. We need to have a coordinate
system to tell others where those objects are. How do we define such a system?
The first natural choice you may come up with is the altitude-azimuth coordinates
(or horizontal coordinates). When you take a picture of the sky, you specify
the altitude (elevation) and azimuth (angle around the horizon). However, what
is the problem with this system? Objects are constantly moving in the sky and
these coordinates depend on the observer’s location! Therefore, we need a coordinate
system fixed on the sky. Since the sky, or more precisely the celestial sphere, is a sphere,
we have to study spherical geometry.
1.1
Sky and Celestial Sphere
What we see as the sky is, in fact, the mostly
empty space outside the Earth’s surface. During the daytime, the scattering of sunlight by the
atmosphere brightens up the sky and we cannot see much beyond the atmosphere or even
the clouds. At night, without the interference
of the Sun, we can see much further, including
the stars, galaxies and many other objects in the
universe.
Figure 1.1: Earth inside the infinitely large celestial sphere.
Since the sky or the universe is all around us, our ancestors thought that we were inside a large
sphere, the celestial sphere, and that all stars, including the Sun, were moving on the
celestial sphere. We now know that the celestial sphere is not real, but the concept that
we are inside an imaginary infinite sphere is still very useful in astronomy (Fig. 1.1),
especially when we want to consider the positions of stars. This is what we mean
by a celestial sphere in what follows.
A fine point about the celestial sphere is where exactly its center is. When we
observe objects near us, like artificial satellites or even the Moon, we have
to be precise because their positions relative to the distant stars depend on the
center. Two common choices for the center are the center of the Earth, which
defines the geocentric coordinate system, and the position of the observer on
the surface of the Earth, which defines the topocentric coordinate system. We
will not worry about the difference in what follows.
Because the Earth is rotating from west to east, everything in the sky moves
from east to west. As a result, the Sun and almost all the stars rise in the
east and set in the west. If the celestial sphere were fixed to the Earth, the positions
of the stars on it would change from minute to minute. That would not be very convenient.
The celestial sphere is, hence, fixed to the stars. (Here we implicitly assume that
stars do not move. In fact, they do move in space. However, they are so far away
that we can only detect the motions of very few of them.) For observers on the
Earth, it rotates once a day, just like the stars do. However, how far the stars
are from us is not shown on the celestial sphere. In other words, the celestial
sphere is just a two-dimensional projection of the three-dimensional universe. We
are not going to talk much about distance measurement of celestial objects because,
although very important, it is very difficult.
1.2
Equatorial Coordinate System
1.2.1
Longitude and Latitude
Before we go into the details of the coordinate systems on the celestial sphere, we
first briefly review the coordinate system on the Earth’s surface.
The Earth’s surface is also (very nearly) a sphere. We use longitude and latitude to specify the
position of a city, say, on the Earth. The latitude is defined as the angle subtended
at the center of the Earth from the equator, where the equator is the great circle
mid-way between the north and south poles.
Longitude is defined as the angular distance east or west from a reference
meridian, an imaginary line running from the north pole to the south pole. This meridian is chosen as
the one through the Greenwich Observatory in England. For example, the latitude
and longitude of Hong Kong are about 22.5◦ N and 114.2◦ E, i.e. 22.5◦ north of
equator and 114.2◦ east of Greenwich.
Figure 1.2: Coordinate system on Earth and the celestial sphere with equatorial
coordinate system. (Labels: north and south celestial poles, equator, ecliptic, vernal equinox, lines of right ascension and declination, Greenwich, Hong Kong.)
We now return to the celestial sphere, Fig. 1.2. Like the surface of the Earth, it has two
poles: the north and south celestial poles. They lie directly above the Earth’s
poles. The celestial equator lies directly above the Earth’s equator.
An observer standing on the Earth’s surface can only see half of the celestial sphere
at one time. The other half is blocked by the Earth itself. The visible half is
bounded by the observer’s horizon. The point on the celestial sphere directly
above the observer is called the zenith. Due to the rotation of the Earth, the zenith
not only depends on the position of the observer on the Earth, but is also not a
fixed point on the celestial sphere.
1.2.2
Motion of the Sun
Figure 1.3: Solar day vs. sidereal day.
We use solar time in daily life. A solar day is 24 hours, defined as the time
for the Sun to return to the same position in the sky as observed from Earth.
Since the Earth orbits the Sun, it has to rotate about 361° in 24 hours
(see Figure 1.3). On the other hand, a sidereal day is the time for a distant star to
return to the same position in the sky. The Earth rotates ____° during this period.
Therefore, a ____ day is shorter and stars rise earlier/later every day by ____ min.
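The relation between the two kinds of day can be checked numerically. The following is a minimal sketch, assuming a year of 365.2422 solar days (a value not given in these notes); the function names are illustrative only.

```python
# Sketch: length of the sidereal day from the solar day and the length of the year.
# Assumption (not from the notes): one year = 365.2422 mean solar days.

SOLAR_DAY_S = 24 * 3600.0   # mean solar day in seconds
DAYS_PER_YEAR = 365.2422    # assumed length of the year in solar days


def sidereal_day_seconds(solar_day_s=SOLAR_DAY_S, days_per_year=DAYS_PER_YEAR):
    """Over one year the Earth turns once more relative to the stars than
    relative to the Sun, so a year contains (days_per_year + 1) sidereal days."""
    return solar_day_s * days_per_year / (days_per_year + 1.0)


def extra_rotation_deg_per_solar_day(days_per_year=DAYS_PER_YEAR):
    """Rotation beyond 360 degrees needed each day to bring the Sun back."""
    return 360.0 / days_per_year


sidereal = sidereal_day_seconds()
print(f"sidereal day          : {sidereal:.1f} s")
print(f"solar minus sidereal  : {(SOLAR_DAY_S - sidereal) / 60.0:.2f} min")
print(f"rotation per solar day: {360.0 + extra_rotation_deg_per_solar_day():.2f} deg")
```

The extra rotation of roughly one degree per solar day is the "about 361 degrees" quoted in the text.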
As the Earth revolves around the Sun, the latter appears to move with respect to the background stars. Over one year, the Sun appears to move around the celestial sphere
once. The path of this motion is called the ecliptic. Since the rotation axis of the Earth is
not perpendicular to the plane of revolution, the ecliptic does not coincide with the
celestial equator. It makes an angle of 23.5° with the celestial equator, the same
angle by which the rotational axis of the Earth tilts from the revolution axis.
Questions:
There are 88 constellations in the sky but only 12 are zodiac constellations. Why?
What does the ecliptic look like in the equatorial coordinate system?
1.2.3
Special Points
The two points where the ecliptic intersects the celestial equator are called the
equinoxes. The vernal equinox is the point where the Sun crosses the celestial equator from the southern to the northern half of the celestial sphere, around March 21
each year. It is in the constellation Pisces. The vernal equinox also marks the origin of
the celestial coordinate system, as we will discuss below.
The autumnal equinox is the point where the Sun goes from the northern half to the
southern half, around September 23 each year. It is in the constellation Virgo. The
Sun rises due east and sets due west on the days of the equinoxes.
There are two other special points on the ecliptic. At the summer solstice, the
Sun reaches the greatest distance from the celestial equator in the northern half of
the celestial sphere, around June 21. For an observer on Earth, the Sun rises and
sets in different directions on different days during the year. On the summer solstice, it
rises and sets at the northernmost points. It is the longest day and shortest night
for the northern hemisphere. While it is summer for the northern hemisphere, it is
winter for the southern hemisphere.
At the winter solstice, the Sun reaches the greatest distance from the celestial
equator in the south, around December 22. The Sun rises and sets at the southernmost
points of the year. It is the shortest day and winter for the northern hemisphere.
The names of the solstices are biased towards people in the northern hemisphere.
As a note, have you ever wondered why Easter, unlike many other festivals, is not on
the same date every year? This is because Easter is held on the first Sunday after
the first full moon occurring on or after the vernal equinox.
Discussion: At the summer solstice, from which part of the Earth will one see the
Sun pass directly overhead? This is called the Tropic of Cancer.
What is the difference between the Sun’s path over a year north and south of this line?
On the same day, from which parts of the Earth will one not see the Sun rise or the Sun
set? (Ans: the Antarctic Circle and the Arctic Circle, respectively. What are their
latitudes?)
The counterpart of the Tropic of Cancer in the southern hemisphere is called the
Tropic of Capricorn. What is its latitude and what is special about this line?
Question: Do you know why there are four seasons?
1.2.4
Equatorial Coordinates
The most common coordinate system on the celestial sphere is the equatorial coordinate system. As discussed above, this system is like an extension of longitude
and latitude on Earth to the sky. The reference plane is the celestial equator and
the coordinates used are the right ascension and declination. Just like longitude,
we need to choose a reference point. (How is 0◦ longitude defined on Earth?) This
is chosen using the vernal equinox.
Declination (dec., symbol δ) is the celestial equivalent of latitude on the Earth. It
is measured in degrees, from 0◦ at the celestial equator to 90◦ at the poles, positive
values or with an additional symbol “N” for the northern half and negative values
or “S” for the southern half of the celestial sphere.
Right ascension (RA, symbol α) is the equivalent of the longitude. The zero line
of right ascension is chosen to pass through the vernal equinox. Right ascension is
measured eastwards from the vernal equinox in hours, minutes and seconds, from 0
to 24 hours.
Point               RA (h m)    Dec. (°)
vernal equinox       0 0           0
summer solstice      6 0         +23.5
autumnal equinox    12 0           0
winter solstice     18 0         −23.5
Table 1.1: The coordinates of the four special points on the ecliptic.
In the equatorial coordinate system, vernal equinox is at 0h 0m and 0◦ ; autumnal
equinox at 12h 0m and 0◦ ; summer solstice at 6h 0m and 23.5◦ and winter solstice
at 18h 0m and −23.5◦ , Table 1.1. (It is customary to write the hour and minute of
right ascension as superscripts.) Be careful! One common source of confusion is
that the minute and second in RA are in units of time, but those in declination are
arcminutes and arcseconds.
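To keep the two unit conventions apart in practice, the conversion to decimal degrees can be sketched as below. The function names and the explicit `south` flag are illustrative choices, not part of these notes; the only fact used is that 24 hours of right ascension span the full 360° circle.

```python
def ra_to_degrees(hours, minutes=0.0, seconds=0.0):
    """Right ascension given in units of TIME (h, m, s) -> decimal degrees.
    24 h of right ascension correspond to 360 degrees."""
    total_hours = hours + minutes / 60.0 + seconds / 3600.0
    return total_hours * 360.0 / 24.0


def dec_to_degrees(degrees, arcmin=0.0, arcsec=0.0, south=False):
    """Declination given in units of ANGLE (deg, arcmin, arcsec) -> decimal degrees.
    The hemisphere is passed separately so that values such as -0d 30' are unambiguous."""
    magnitude = abs(degrees) + arcmin / 60.0 + arcsec / 3600.0
    return -magnitude if south else magnitude


# Summer solstice from Table 1.1: RA = 6h 0m, Dec = +23.5 deg
print(ra_to_degrees(6, 0), dec_to_degrees(23.5))
```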
Questions:
1h = ____°, 1m = ____′, and 1s = ____′′.
Why do the zodiac signs start with Aries, while the vernal equinox is currently in Pisces?
1.2.5
Circumpolar Stars
We now discuss which stars an observer can and cannot see. As we have
mentioned before, at any particular time, an observer on Earth can only see half
the celestial sphere. The other half is blocked by the Earth itself (see Fig. 1.4).
If we take the rotation of the Earth into account, can the observer see the whole
celestial sphere? Usually not.
If seen from the poles, stars move in circles parallel to the horizon and never rise
or set. If seen from the Earth’s equator, the rotation of the Earth does allow the
observer to see the whole celestial sphere. At intermediate latitudes, say L° north,
some stars never rise (see homework), and hence the observer cannot see them at all.
At the other extreme, some stars never set. They are called circumpolar stars.
For people in the northern hemisphere, there is a star, called Polaris, only 1° away from
the north celestial pole. It is often called the North Star.
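The three cases above can be sketched as a simple classifier. This is a geometric sketch only, under the assumption (not derived in these notes, which leave it to the homework) that a star is circumpolar when its lowest point in the sky stays above the horizon and never rises when its highest point stays below it; atmospheric refraction is ignored and the function name is hypothetical.

```python
def visibility(dec_deg, lat_deg):
    """Classify a star of declination dec_deg for an observer at latitude lat_deg
    (positive north, negative south). Purely geometric: no refraction, flat horizon."""
    colatitude = 90.0 - abs(lat_deg)
    # For a southern observer, mirror the sky so the same test applies.
    dec = dec_deg if lat_deg >= 0 else -dec_deg
    if dec > colatitude:
        return "circumpolar"      # lowest altitude (dec - colatitude) is above the horizon
    if dec < -colatitude:
        return "never rises"      # highest altitude (colatitude + dec) is below the horizon
    return "rises and sets"


# Observer in Hong Kong (latitude about 22.5 deg N, as used in these notes)
for dec in (89.0, 0.0, -80.0):
    print(dec, visibility(dec, 22.5))
```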
What is the angular separation between two points? We would like to first relate
the equatorial coordinate system with a Cartesian coordinate system. Assume that
the celestial sphere is the unit sphere with the z-axis passing through the north
celestial pole and the x-axis passing through vernal equinox (see Fig. 1.4).
Figure 1.4: Left: an observer on Earth can only see half the celestial sphere at a
time. Right: spherical and Cartesian coordinate systems.
A point P with Cartesian coordinates r = (x, y, z), x² + y² + z² = 1, has spherical
coordinates (θ, ϕ) given by
\[
x = \sin\theta\cos\phi, \qquad
y = \sin\theta\sin\phi, \qquad
z = \cos\theta .
\tag{1.1}
\]
For the equatorial coordinate system, α = ϕ and δ = π/2 − θ, therefore,
\[
\vec{r} =
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} \cos\delta\cos\alpha \\ \cos\delta\sin\alpha \\ \sin\delta \end{pmatrix} .
\tag{1.2}
\]
The angular separation ∆Θ between two points with coordinates (α1, δ1) and (α2, δ2)
is
\[
\cos(\Delta\Theta) = \cos\delta_1\cos\alpha_1\cos\delta_2\cos\alpha_2
 + \cos\delta_1\sin\alpha_1\cos\delta_2\sin\alpha_2
 + \sin\delta_1\sin\delta_2 .
\tag{1.3}
\]
The proof is left as an exercise (hint: use the dot product).
Exercise: Star A is at (17h 55m 0s , −60◦ 0′ 0′′ ), Star B is at (18h 5m 0s , −60◦ 0′ 0′′ ).
What is their angular separation in the sky in arcminutes? (10m × 15?)
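Eq. (1.3) can be evaluated directly. The sketch below applies it to the two stars of the exercise; the function name is illustrative, and the result is clamped before taking the arccosine to guard against rounding slightly outside [−1, 1].

```python
import math


def angular_separation_deg(ra1_deg, dec1_deg, ra2_deg, dec2_deg):
    """Angular separation from Eq. (1.3); all angles in decimal degrees."""
    a1, d1, a2, d2 = map(math.radians, (ra1_deg, dec1_deg, ra2_deg, dec2_deg))
    cos_sep = (math.cos(d1) * math.cos(a1) * math.cos(d2) * math.cos(a2)
               + math.cos(d1) * math.sin(a1) * math.cos(d2) * math.sin(a2)
               + math.sin(d1) * math.sin(d2))
    cos_sep = max(-1.0, min(1.0, cos_sep))  # guard against rounding error
    return math.degrees(math.acos(cos_sep))


# Stars A and B of the exercise: RA converted from hours to degrees (1 h = 15 deg)
ra_a = (17 + 55 / 60) * 15.0
ra_b = (18 + 5 / 60) * 15.0
sep_arcmin = 60.0 * angular_separation_deg(ra_a, -60.0, ra_b, -60.0)
print(f"separation = {sep_arcmin:.2f} arcmin")
```

The result is close to 75′, half of the naive 10m × 15 = 150′, because circles of constant declination shrink by a factor cos δ away from the celestial equator.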
For a small angular separation ∆Θ ≪ 1, let α1 = α, δ1 = δ, α2 = α + ∆α, and
δ2 = δ + ∆δ. We then use the identities
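\[
\sin(a+b) = \sin a\cos b + \cos a\sin b
\tag{1.4}
\]
\[
\cos(a+b) = \cos a\cos b - \sin a\sin b
\tag{1.5}
\]
and the Taylor expansions
\[
\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots
\tag{1.6}
\]
\[
\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots
\tag{1.7}
\]
to obtain, keeping terms up to second order in ∆α and ∆δ,
\[
\begin{aligned}
1 - \frac{\Delta\Theta^2}{2}
&= \cos\delta\cos\alpha\cos(\delta+\Delta\delta)\cos(\alpha+\Delta\alpha)
 + \cos\delta\sin\alpha\cos(\delta+\Delta\delta)\sin(\alpha+\Delta\alpha)
 + \sin\delta\sin(\delta+\Delta\delta) \\
&= \cos\delta\cos\alpha
   \left(\cos\delta - \Delta\delta\sin\delta - \frac{\Delta\delta^2}{2}\cos\delta\right)
   \left(\cos\alpha - \Delta\alpha\sin\alpha - \frac{\Delta\alpha^2}{2}\cos\alpha\right) \\
&\quad + \cos\delta\sin\alpha
   \left(\cos\delta - \Delta\delta\sin\delta - \frac{\Delta\delta^2}{2}\cos\delta\right)
   \left(\sin\alpha + \Delta\alpha\cos\alpha - \frac{\Delta\alpha^2}{2}\sin\alpha\right) \\
&\quad + \sin\delta
   \left(\sin\delta + \Delta\delta\cos\delta - \frac{\Delta\delta^2}{2}\sin\delta\right) \\
&= \cos\delta\cos\alpha
   \left(\cos\delta\cos\alpha - \Delta\alpha\cos\delta\sin\alpha - \Delta\delta\sin\delta\cos\alpha
   - \frac{\Delta\alpha^2}{2}\cos\delta\cos\alpha
   + \Delta\delta\,\Delta\alpha\sin\delta\sin\alpha
   - \frac{\Delta\delta^2}{2}\cos\delta\cos\alpha\right) \\
&\quad + \cos\delta\sin\alpha
   \left(\cos\delta\sin\alpha + \Delta\alpha\cos\delta\cos\alpha - \Delta\delta\sin\delta\sin\alpha
   - \frac{\Delta\alpha^2}{2}\cos\delta\sin\alpha
   - \Delta\delta\,\Delta\alpha\sin\delta\cos\alpha
   - \frac{\Delta\delta^2}{2}\cos\delta\sin\alpha\right) \\
&\quad + \sin\delta
   \left(\sin\delta + \Delta\delta\cos\delta - \frac{\Delta\delta^2}{2}\sin\delta\right) \\
&= 1 - \frac{\Delta\alpha^2}{2}\cos^2\delta
     - \frac{\Delta\delta^2}{2}\cos^2\delta
     - \frac{\Delta\delta^2}{2}\sin^2\delta ,
\end{aligned}
\]
and hence
\[
\Delta\Theta^2 = (\Delta\alpha\cos\delta)^2 + \Delta\delta^2 .
\tag{1.8}
\]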
This can be used to calculate the proper motion of a star, dΘ/dt, on the celestial
sphere in terms of small changes in R.A. and Dec. (see Fig. 1.5).
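As a sanity check, the small-angle result (1.8) can be compared numerically with the exact separation obtained from the unit vectors of Eq. (1.2). This is a sketch with illustrative function names; the exact separation is computed with `atan2` of the cross- and dot-products, a choice made here because it stays accurate for very small angles.

```python
import math


def unit_vector(ra_deg, dec_deg):
    """Cartesian unit vector of a point on the celestial sphere, Eq. (1.2)."""
    a, d = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(d) * math.cos(a), math.cos(d) * math.sin(a), math.sin(d))


def exact_separation_deg(ra1_deg, dec1_deg, ra2_deg, dec2_deg):
    """Exact angle between two directions from their dot and cross products."""
    x1, y1, z1 = unit_vector(ra1_deg, dec1_deg)
    x2, y2, z2 = unit_vector(ra2_deg, dec2_deg)
    dot = x1 * x2 + y1 * y2 + z1 * z2
    cross = (y1 * z2 - z1 * y2, z1 * x2 - x1 * z2, x1 * y2 - y1 * x2)
    cross_norm = math.sqrt(sum(c * c for c in cross))
    return math.degrees(math.atan2(cross_norm, dot))


def small_angle_separation_deg(d_alpha_deg, d_delta_deg, dec_deg):
    """Small-angle approximation, Eq. (1.8); offsets and result in degrees."""
    return math.hypot(d_alpha_deg * math.cos(math.radians(dec_deg)), d_delta_deg)


exact = exact_separation_deg(10.0, 40.0, 10.1, 40.05)
approx = small_angle_separation_deg(0.1, 0.05, 40.0)
print(f"exact = {exact:.6f} deg, Eq. (1.8) = {approx:.6f} deg")
```

For offsets of a fraction of a degree the two values agree to better than one part in a thousand; the agreement degrades as the offsets grow or as the points approach a celestial pole.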
Figure 1.5: Proper motion of a star on the celestial sphere.
Question: How does the proper motion of an object relate to its space velocity?
1.2.6
Great Circle
A great circle is the intersection of the sphere with a plane passing
through the origin. Let n0 = (x0, y0, z0) be a unit vector perpendicular to the plane.
Then, all points (x, y, z) on the plane satisfy
\[
x x_0 + y y_0 + z z_0 = 0 .
\tag{1.9}
\]
Substituting Eq. (1.1) into the above equation, we obtain an equation in θ and ϕ
defining the great circle.
For example, the ecliptic is horizontal in Fig. 1.2. Hence, the unit vector n0 is vertical
in the figure and points to (α, δ) = (18h, 66.5°), for which the spherical coordinates are (θ, ϕ) =
(23.5°, 270°), i.e. n0 = (0, −sin 23.5°, cos 23.5°). This point is the ecliptic north pole, see below. The equation of
the ecliptic is
\[
\begin{aligned}
0 &= -\cos\delta\sin\alpha\sin 23.5^\circ + \sin\delta\cos 23.5^\circ \\
\tan\delta &= \tan 23.5^\circ\,\sin\alpha .
\end{aligned}
\tag{1.10}
\]
1.3
Other Celestial Coordinate Systems
Figure 1.6: The ecliptic coordinate system.
The ecliptic coordinates are given by the ecliptic latitude β and ecliptic longitude
λ. Their definitions are similar to those of declination and right ascension, but now
referring to the ecliptic instead of the celestial equator. In particular, β of a point P
is the angle between the point and the ecliptic, positive north of the ecliptic
and negative south of it. λ is the angular distance along the ecliptic from the vernal
equinox to the point where the great circle passing through P and
the north ecliptic pole meets the ecliptic, Fig. 1.6.
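For the point P, using the dot product,
\[
\begin{aligned}
\cos(90^\circ - \beta) &= \mathbf{r}\cdot\mathbf{n}_0 \\
\sin\beta &= -y\sin 23.5^\circ + z\cos 23.5^\circ \\
          &= -\sin 23.5^\circ\cos\delta\sin\alpha + \cos 23.5^\circ\sin\delta .
\end{aligned}
\tag{1.11}
\]
The component of r perpendicular to n0 is r − (r·n0)n0. Hence, r′ is the unit vector
along this direction,
\[
\mathbf{r}' =
\frac{(\cos\delta\cos\alpha,\ \cos\delta\sin\alpha,\ \sin\delta)
      - \sin\beta\,(0,\ -\sin 23.5^\circ,\ \cos 23.5^\circ)}
     {\left|(\cos\delta\cos\alpha,\ \cos\delta\sin\alpha,\ \sin\delta)
      - \sin\beta\,(0,\ -\sin 23.5^\circ,\ \cos 23.5^\circ)\right|} .
\tag{1.12}
\]
The Cartesian coordinates of the vernal equinox are rv = (1, 0, 0), and cos λ = rv·r′,
which is just the x-component of r′,
\[
\cos\lambda =
\frac{\cos\delta\cos\alpha}
     {\left|(\cos\delta\cos\alpha,\ \cos\delta\sin\alpha + \sin\beta\sin 23.5^\circ,\
             \sin\delta - \sin\beta\cos 23.5^\circ)\right|} .
\tag{1.13}
\]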
The readers could work out the expression in the denominator, which is not very
illuminating.
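A numerical version of the conversion avoids the unwieldy denominator. The sketch below uses Eq. (1.11) for β. For λ it makes one assumption beyond the text: since cos λ alone cannot distinguish λ from 360° − λ, the quadrant is fixed with the component of r along n0 × rv = (0, cos 23.5°, sin 23.5°), which is the standard rotation about the x-axis. The function name is illustrative.

```python
import math

OBLIQUITY_DEG = 23.5  # tilt of the ecliptic used throughout these notes


def equatorial_to_ecliptic(ra_deg, dec_deg, obliquity_deg=OBLIQUITY_DEG):
    """Return (ecliptic longitude, ecliptic latitude) in degrees."""
    a, d, eps = map(math.radians, (ra_deg, dec_deg, obliquity_deg))
    x = math.cos(d) * math.cos(a)
    y = math.cos(d) * math.sin(a)
    z = math.sin(d)
    # Eq. (1.11): sin(beta) = r . n0 with n0 = (0, -sin eps, cos eps)
    sin_beta = -y * math.sin(eps) + z * math.cos(eps)
    sin_beta = max(-1.0, min(1.0, sin_beta))  # guard against rounding error
    beta = math.asin(sin_beta)
    # Assumption: component of r along n0 x rv, used only to fix the quadrant of lambda
    y_ecl = y * math.cos(eps) + z * math.sin(eps)
    lam = math.atan2(y_ecl, x) % (2.0 * math.pi)
    return math.degrees(lam), math.degrees(beta)


# The special points of Table 1.1 all lie on the ecliptic, so beta should vanish.
for name, ra, dec in [("vernal equinox", 0.0, 0.0),
                      ("summer solstice", 90.0, 23.5),
                      ("winter solstice", 270.0, -23.5)]:
    lam, beta = equatorial_to_ecliptic(ra, dec)
    print(f"{name:16s} lambda = {lam:7.3f} deg, beta = {beta:7.3f} deg")
```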
Another common coordinate system is the Galactic coordinate system, where
the defining great circle is the Galactic plane, with the north Galactic pole at
α = 12h 51.4m and δ = 27◦ 7′ and the zero point of Galactic longitude at the direction
of Galactic center (17h 45.6m , −28◦ 56′ ). Galactic coordinates are expressed in (l, b),
where l is the Galactic longitude and b is the Galactic latitude. Finally, there is also the
Supergalactic coordinate system, which is used in studies of nearby galaxy
clusters, including the Virgo Supercluster.
1.4
Limitations of Coordinate Systems
As a final remark, we discuss some limitations of the coordinate systems. Specifying
the position of an object in the sky is actually more complicated than it seems. First of all,
stars are moving in space. But they are very far away, hence their proper motions
are generally small, typically of the order of milliarcseconds per year. Even if the objects
were not moving, their apparent positions in the sky would change over
the year due to the motion of the Earth. The two major causes are aberration of
light due to the finite speed of light and parallax for nearby objects.
1.4.1
Precession
The major problem with the equatorial coordinate system is that the vernal equinox
is not fixed with respect to the stars. It is constantly moving due to precession
of the Earth’s rotational axis. This is caused by the fact that the Earth is not
a perfect sphere but has a larger diameter at the equator than at the pole. The
gravitational pull of the Sun and the Moon on the near and far sides of the Earth
is thus different, causing a torque perpendicular to the rotational axis. (You can
find more details at http://courses.physics.northwestern.edu/Phyx125/Precession
of the Earth.pdf)
The precession of the Earth has a period of 26,000 years, which means that each celestial
pole, north or south, traces out a circle with that period. 13,000 years from now, the
Earth’s spin axis will be 47° away from Polaris. Note that the direction of precession
is opposite to the rotation of the Earth. As a result, the intersections between the
ecliptic and the celestial equator, i.e. the equinoxes, are constantly shifting westward
about the ecliptic pole, with a period of 26,000 yr, i.e. about 1.38° per century. This
is also the reason why the zodiac signs start with Aries: the vernal equinox
was in Aries when the constellations were introduced, and only moved
into Pisces in 67 B.C.
Because of precession, when we talk about the equatorial coordinates, we also need
to specify the date and time used for comparing star coordinates, the epoch. The
standard epoch commonly used now is the beginning of the year 2000, denoted
J2000.0. In some old books, you could find the epoch J1975.0 or B1950.0. In
professional observatories, the exact epoch, i.e. the observation date, is needed in
order to point the telescope to the right direction at very high accuracy.
The changes in equatorial coordinates relative to J2000.0 can be approximated by
\[
\Delta\alpha = M + N\sin\alpha\tan\delta
\tag{1.14}
\]
\[
\Delta\delta = N\cos\alpha ,
\tag{1.15}
\]
where
\[
\begin{aligned}
M &= 1.2812323\,T + 0.0003879\,T^2 + 0.0000101\,T^3 \\
N &= 0.5567530\,T - 0.0001185\,T^2 - 0.0000116\,T^3 .
\end{aligned}
\]
M and N are in degrees and T = (t − 2000.0)/100, with t the date in years (including the fractional part).
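Eqs. (1.14) and (1.15) translate directly into a few lines of code. The sketch below is a plain transcription of the formulas above; the function name is illustrative. Note that the tan δ term grows without bound near the celestial poles, where this approximation should not be trusted.

```python
import math


def precession_shift_deg(ra_deg, dec_deg, year):
    """Approximate change (d_alpha, d_delta) in degrees relative to J2000.0,
    following Eqs. (1.14)-(1.15). `year` is the date in (fractional) years."""
    t = (year - 2000.0) / 100.0  # time since J2000.0 in centuries
    m = 1.2812323 * t + 0.0003879 * t**2 + 0.0000101 * t**3
    n = 0.5567530 * t - 0.0001185 * t**2 - 0.0000116 * t**3
    a, d = math.radians(ra_deg), math.radians(dec_deg)
    d_alpha = m + n * math.sin(a) * math.tan(d)
    d_delta = n * math.cos(a)
    return d_alpha, d_delta


# Shift accumulated between J2000.0 and 2050.0 for a point at RA = 6h, Dec = +23.5 deg
d_alpha, d_delta = precession_shift_deg(90.0, 23.5, 2050.0)
print(f"d_alpha = {d_alpha:.4f} deg, d_delta = {d_delta:.4f} deg")
```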
Question: What is the effect of precession on the seasons?
1.4.2
Aberration of Light
Imagine sitting inside a moving vehicle on a rainy day: the rain drops appear
to fall at an angle, even if there is no wind. The same is true for light, due to its
finite speed and the observer’s motion. This effect depends on the position on the
sky and the time of the year. For example, there is no aberration along the direction
of motion, and it is maximum perpendicular to it. The observer’s motion can be
due to: 1. the Earth’s rotation, 2. the Earth’s orbital motion around the Sun, and 3.
the Sun’s motion around the Galaxy.
We can estimate the maximum displacements using classical mechanics since v ≪ c,
although strictly speaking, relativity is needed. Aberration caused by the Earth’s
rotation is called diurnal aberration. It changes every day, but the magnitude is
relatively small, because even at the equator, the rotation velocity is only 460 m/s.
Figure 1.7: Aberration of light.
Figure 1.8: Parallax.
Therefore, the maximum shift is δθ = v/c ≈ 0.3′′. On the other hand, annual
aberration is caused by the motion of the Earth around the Sun, which has a much higher
velocity of 29.8 km/s and a period of 1 year. Hence, δθ = 20.5′′. It is
interesting to note that this motion is always perpendicular to the direction of the Sun; therefore,
the Sun always appears to be 20.5′′ off from its true position. Finally, the motion of the
Sun in the Galaxy results in secular aberration, which is of the order of arcminutes.
However, the Sun takes 230 million years to revolve around the center of the Galaxy.
In practice, this aberration is effectively constant and hence it is often ignored.
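The three estimates follow from the same classical formula δθ = v/c. A short sketch is given below; the speeds of the Earth's rotation and orbit are the ones quoted in the text, while the Sun's orbital speed of 220 km/s is an assumed round value that does not appear in these notes.

```python
import math

C_M_PER_S = 299_792_458.0                     # speed of light
ARCSEC_PER_RAD = math.degrees(1.0) * 3600.0   # about 206265 arcsec per radian


def max_aberration_arcsec(speed_m_per_s):
    """Maximum classical aberration angle v/c, converted to arcseconds (valid for v << c)."""
    return speed_m_per_s / C_M_PER_S * ARCSEC_PER_RAD


speeds = {
    "diurnal (Earth rotation at the equator)": 460.0,
    "annual (Earth orbital motion)": 29.8e3,
    "secular (Sun in the Galaxy, assumed 220 km/s)": 220e3,
}
for label, v in speeds.items():
    print(f"{label:48s} {max_aberration_arcsec(v):8.2f} arcsec")
```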
1.4.3
Parallax
Parallax arises because stars are not at infinite distance. The change of the observer’s viewpoint, mostly due to the motion of the Earth around the Sun, causes a
nearby star to appear to move relative to distant background objects. We can
turn this around and use the parallax to determine distance, which is the most fundamental parameter we wish to know, but also the most difficult one to measure.
1 parsec is defined as the distance that gives an annual parallax of 1′′. Hence, from
Fig. 1.8,
\[
\tan 1'' = \frac{1\,\mathrm{AU}}{d}, \qquad
d = \frac{1\,\mathrm{AU}}{\tan 1''} \approx 206265\ \mathrm{AU} \approx 3.26\ \text{light-years}.
\tag{1.16}
\]
(Remember that 1° = 60′ and 1′ = 60′′. How large is an arcsecond? Put a 10¢ coin
across the harbour. Its diameter as seen from campus is about 1′′!)
The nearest star, Proxima Centauri¹, has a parallax of 0.7687′′; therefore, its distance is
1/0.7687 ≈ 1.30 pc. Stars in the Milky Way have distances from a few pc to kpc. Our
Sun is 8.5 kpc from the Galactic center and the Milky Way has a diameter of 30 kpc.
The Andromeda Galaxy is 0.78 Mpc away and other galaxies are over Mpc away.
The observable Universe has a radius of 15 Gpc.
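Because 1 pc is defined by a parallax of 1′′ and the angles involved are tiny, the distance in parsecs is simply the reciprocal of the parallax in arcseconds. A minimal sketch (the function name is ours):

```python
LY_PER_PC = 3.26  # light-years per parsec, as in Eq. (1.16)

def distance_pc(parallax_arcsec):
    """Distance in parsecs from the annual parallax in arcseconds (d = 1/p)."""
    if parallax_arcsec <= 0:
        raise ValueError("parallax must be positive")
    return 1.0 / parallax_arcsec

d = distance_pc(0.7687)  # Proxima Centauri, parallax quoted in the text
print(f"Proxima Centauri: {d:.2f} pc = {d * LY_PER_PC:.2f} light-years")
```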
The Hipparcos satellite from ESA was able to measure the position and parallax
of 118,218 stars in the solar neighborhood brighter than magnitude 9, providing
valuable information on their distances. It is succeeded by the Gaia satellite,
which reached L2² on 2014 Jan 8 and will measure the distance to
1 billion objects.
¹ Scientists recently discovered an Earth-like planet orbiting Proxima Centauri:
http://www.nature.com/nature/journal/v536/n7617/full/nature19106.html.
² We will talk about the second Lagrangian point, L2, in Chapter 3.
Chapter 2
Light and Telescopes
(Chapters 3.2, 3.3, 5 and 6 in textbook.)
Although we can now go to the Moon and bring back samples of soil and rocks, we
can still only study other objects by investigating their radiation, be it neutrinos, charged-particle cosmic rays, gravitational waves, or electromagnetic waves.
Among them, electromagnetic waves, or EM waves for short, dominate. In
this chapter, we will discuss the properties of EM waves and the detection methods.
2.1
Electromagnetic Wave
EM waves are oscillations of the electric and magnetic fields, Fig. 2.1. They can be
produced by the acceleration of charged particles, and in turn, EM waves, or in
general electric and magnetic fields, will affect the motion of charged particles.
They have no effect on neutral particles.
Figure 2.1: Properties of electromagnetic waves.
Light, radio waves, infrared, ultraviolet, X-rays and gamma rays are EM waves of
different frequencies. Like other kinds of waves, the three fundamental properties of
EM waves are the speed (the speed of light is usually denoted by c), the frequency,
f , and the wavelength, λ. They are related by
\[ c = f\lambda . \qquad (2.1) \]
For EM waves in vacuum, the speed of light is independent of frequency (no dispersion), and is equal to 2.99792458 × 10¹⁰ cm/s. This value is exact in the sense that
the length of one meter is defined by this value together with the definition of the second (which is
defined by a particular atomic transition). The constant speed of light is the starting
point of special relativity.
EM waves can have a very wide range of wavelengths, from shorter than an atom
to as long as the size of the Universe. Radio waves are about 1 mm to 100 m,
followed by microwaves, and then infrared (IR) and visible light. The wavelengths
of visible light are from 400 nm to 700 nm. Our atmosphere is transparent to EM
waves in radio and over a few windows in IR and visible light. The most important
window to human eyes is the optical window between 300 nm and 1100 nm. The
atmosphere is opaque to all shorter wavelength EM waves. For example, if we want
to carry out X-ray (wavelengths from about 10⁻⁷ m to about 10⁻⁹ m) or gamma
ray (wavelengths less than about 10⁻¹⁰ m) astronomy, we have to get above the
atmosphere, for example, with a satellite.
Apart from the three basic properties, different EM waves can also carry different
polarizations. The polarization is the direction in which the electric field in the EM wave points,
Fig. 2.1. The direction of polarization must be perpendicular to the direction of
propagation. Thus, there are two independent polarizations. The EM wave from a source
will in general contain a mixture of waves with different directions of propagation,
different wavelengths and different polarizations.
When EM waves propagate, energy is transported from one place to another. The
amount of energy radiated by a star per unit time is called the luminosity, which
is in units of erg/s. Then at the observer, the amount of energy received per unit
area per unit time is called the intensity or energy flux, which is in units of
erg/s/cm². Notice that flux describes the energy received and it can be understood
as the brightness of an object. If the source is far away, even if it radiates an enormous
amount of energy, the intensity observed could be low. To be more specific, consider
a sphere of radius d centered on the source; its total surface area is 4πd². Hence, the
total energy passing through the sphere per unit time is flux F times the area, i.e.
F × 4πd², which should be equal to the total energy emitted per unit time, i.e. the
source luminosity L. Mathematically,
\[ F = \frac{L}{4\pi d^2} . \qquad (2.2) \]
See Fig. 2.2 below. (What would the relation look like in a 2-D universe?)
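To get a feeling for Eq. (2.2), the sketch below evaluates the flux of sunlight at the Earth. The solar luminosity and the Earth-Sun distance used here are assumed round values that are not quoted in this section, and the function name is ours.

```python
import math

L_SUN = 3.83e33   # erg/s, assumed solar luminosity (not quoted in the text)
AU_CM = 1.496e13  # cm, assumed Earth-Sun distance

def flux(luminosity, distance):
    """Flux (erg/s/cm^2) at a given distance (cm) from an isotropic source, Eq. (2.2)."""
    return luminosity / (4.0 * math.pi * distance ** 2)

print(f"Solar flux at 1 AU: {flux(L_SUN, AU_CM):.3e} erg/s/cm^2")
# Doubling the distance reduces the flux to one quarter (inverse square law).
print(f"Flux ratio at 2 AU versus 1 AU: {flux(L_SUN, 2 * AU_CM) / flux(L_SUN, AU_CM):.2f}")
```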
Figure 2.2: Inverse square law.
So far we have discussed the wave nature of the EM wave, but in some situations,
we find that EM waves behave as particles. For example, atoms can only absorb
one, two or any integral multiple of a certain unit of light. We call the basic unit
the photon and say that atoms can only absorb one photon or two photons, etc., but
not half a photon. (Of course, atoms can absorb no photon at all.) The wave nature
of EM waves is the collective behavior of a lot of photons. The energy, E, of each
photon is given by
\[ E = hf \qquad (2.3) \]
where h is Planck’s constant, with value about 6.63 × 10⁻²⁷ erg s = 4.14 ×
10⁻¹⁵ eV s, and f is the frequency of the photon. The higher the frequency, the higher
the energy of each photon. We find that the particle nature of light is prominent
when the frequency is high. Thus, for gamma rays, we will often speak of
photons, but for radio waves, we often think of them as waves.
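Combining Eq. (2.1) and Eq. (2.3) gives E = hc/λ. The sketch below compares the photon energy of green light with that of a 1 m radio wave. The two example wavelengths are our own choices for illustration.

```python
H_EV_S = 4.14e-15       # Planck's constant in eV s (value quoted in the text)
C_CM_S = 2.99792458e10  # speed of light in cm/s

def photon_energy_ev(wavelength_cm):
    """Photon energy in eV for a given wavelength in cm."""
    frequency = C_CM_S / wavelength_cm  # Eq. (2.1)
    return H_EV_S * frequency           # Eq. (2.3)

print(f"550 nm (visible): {photon_energy_ev(550e-7):.2f} eV")
print(f"1 m (radio):      {photon_energy_ev(100.0):.2e} eV")
```

The radio photon carries about a millionth of the energy of the visible one, which is why the particle nature is hard to notice at radio frequencies.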
2.2
Magnitudes
The brightness of objects in the sky varies a lot. For example, Sirius, the brightest
star (apart from the Sun), is about 1000 times brighter than the dimmest stars we
can see with the naked eye. Therefore, if we use the values of intensity to describe their
brightness, we have to write a lot of zeros. We use a log scale instead. Examples
of log scales include the Richter scale for earthquakes and the decibel (dB) for sound
intensity.
The visual magnitude or apparent magnitude m of a star tells the brightness
of the star as we see it. This system dates back to Hipparchus in ancient Greece. In
modern terms, stars visible to human eyes are classified into 6 magnitudes from the
brightest (m = 1) to the faintest (m = 6), and an m = 1 star is 100 times brighter
than an m = 6 star. In other words, if star A is 100 times brighter than star B, the
magnitude of A is 5 units less than the magnitude of star B. Thus, a bright star
has a smaller or even negative magnitude number.
Exercise: From the definition above, show that each grade of magnitude is about
2.5 times brighter than the next one.
What is the relation between the intensity and magnitude of a star? Similar to
above, one can show that
\[ m = -\frac{5}{2}\log_{10}\left(\frac{I}{I_0}\right), \qquad (2.4) \]
where we arbitrarily choose one fixed intensity I0 as reference.
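The reference intensity I0 cancels when two stars are compared, so Eq. (2.4) can be checked directly against the definition that a factor of 100 in intensity corresponds to 5 magnitudes. A minimal sketch (function names are ours):

```python
import math

def magnitude_difference(intensity_ratio):
    """m_A - m_B for a given intensity ratio I_A / I_B, from Eq. (2.4)."""
    return -2.5 * math.log10(intensity_ratio)

def intensity_ratio(delta_m):
    """I_A / I_B for a given magnitude difference m_A - m_B (inverse of the above)."""
    return 10.0 ** (-0.4 * delta_m)

print(magnitude_difference(100.0))       # a star 100 times brighter has m smaller by 5
print(round(intensity_ratio(-1.0), 3))   # brightness ratio for one magnitude step
```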
We once thought the star Vega had constant brightness and chose it as the reference.
Hence it had magnitude zero. However, we later found out that it is in fact a
variable star, and we have abandoned it as the reference. Its average magnitude
is about 0.03. We don’t have such a “standard” reference star anymore and I0 is
just a number. The apparent magnitude of the Sun is about −26.8, the full Moon
about −12, Sirius about −1.5, the naked eye limit about 6 and the dimmest image
taken by the Hubble Space Telescope is about 30. The actual situation is more
complicated: one star may be brighter than another in the blue band, but fainter in
red. Therefore, if you look up a star in research papers or star catalogues, you may
find its magnitude specified in different bands, e.g. visual, blue, red, or IR. They
are denoted by U, B, V, R, I, Z, etc. The most common one is the V (meaning
visual) band magnitude, which centers on yellow color (551 nm), close to the peak
response of the human eye.
If a star is moved from distance d to D, then m_d will change to m_D, where
\[ m_D = m_d - 5\log_{10}\left(\frac{d}{D}\right). \qquad (2.5) \]
Note that if D is large, then d/D is small, log(d/D) is negative and m_D is large.
This only means that an object farther away is dimmer. If we know the distance to
a star and have measured its apparent magnitude, and we want to compare its intrinsic
brightness (i.e. luminosity) with that of other stars, then it is more convenient to convert to a standard
distance of 10 parsec. This is the definition of the absolute magnitude M. Hence,
it follows from Eq. (2.5) that the apparent and absolute magnitudes are related by
\[ M = m - 5\log_{10}\left(\frac{d}{10\ \mathrm{pc}}\right). \qquad (2.6) \]
The difference m − M is called the distance modulus. Type Ia supernovae have
an absolute magnitude of M = −19.3. This has been used to determine the distance
to their host galaxies, leading to the discovery of the accelerating expansion of the Universe, which was awarded the Nobel Prize in 2011. Finally, we note that in
scientific measurements, the apparent magnitude has to be corrected for the absorption through the atmosphere. Additionally, absorption from the interstellar
medium also needs to be accounted for in the calculation of absolute magnitude.
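Eq. (2.6) can be inverted to turn a distance modulus into a distance. The sketch below does this for a Type Ia supernova with the absolute magnitude quoted above. The apparent magnitude m = 15.7 is a made-up example value, and no correction for absorption is included.

```python
import math

def absolute_magnitude(m, d_pc):
    """Absolute magnitude from apparent magnitude m and distance in pc, Eq. (2.6)."""
    return m - 5.0 * math.log10(d_pc / 10.0)

def distance_from_modulus(m, M):
    """Distance in pc from the distance modulus m - M (Eq. (2.6) solved for d)."""
    return 10.0 * 10.0 ** ((m - M) / 5.0)

M_SN_IA = -19.3    # absolute magnitude of Type Ia supernovae (from the text)
m_observed = 15.7  # hypothetical observed apparent magnitude
d = distance_from_modulus(m_observed, M_SN_IA)
print(f"distance modulus = {m_observed - M_SN_IA:.1f}, d = {d / 1e6:.0f} Mpc")
```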
Exercise: The apparent magnitude of the Sun is −26.8, what is its absolute magnitude? Which one is more luminous when compared with Sirius (M = 1.4) and
Vega (M = 0.6)?
2.3
Spectrum, Spectral Lines, and Atoms
It was discovered by Newton that white light is composed of all the colors of the
rainbow. The image of this range of colors is called the optical spectrum. The
entire electromagnetic spectrum is much wider than the optical spectrum, including
the infrared, ultraviolet, etc. In astronomy, spectrum also means the graph of
intensities versus the frequencies or wavelengths. A typical spectrum is shown in
Fig. 2.3. This graph tells us a lot about the nature of the source of the radiations.
What should be the unit of the y-axis in the figure? Since the integrated area under
the curve is flux, the y-axis is the flux density (i.e. flux per unit frequency or wavelength), and
it has units of erg/s/cm²/Å for an optical spectrum. For a radio spectrum, it is usually
flux per Hz, and per keV for an X-ray spectrum.
Figure 2.3: A continuum spectrum (left) and a spectrum with spectral lines (right).
Both panels plot intensity against frequency; the right panel marks lines at f1 and f2.
Shown on the left of Fig. 2.3 is a continuum spectrum, since the variation
of the intensity is smooth. Usually, a spectrum contains some abrupt changes in
intensity. This is a spectrum with spectral lines, or simply a line spectrum, as
shown on the right of Fig. 2.3. Here we can see two kinds of spectral lines. The line
at frequency f1 is called an absorption line because radiation at this frequency is
absorbed and hence the intensity is lower than the underlying continuum. The line
at frequency f2 is an emission line.
The German physicist Gustav Kirchhoff studied the formation of spectral lines and
summarized his findings in Kirchhoff’s three laws of spectroscopy:
1. A hot, dense gas or hot solid object produces a continuous spectrum with no
dark spectral lines.
2. A hot, diffuse gas produces bright spectral lines (emission lines).
3. A cool, diffuse gas in front of a source of a continuous spectrum produces dark
spectral lines (absorption lines) in the continuous spectrum.
To understand these laws, we have to know some properties of atoms. In conditions
common to us, everything is made up of atoms. The classical picture of
an atom is shown in Fig. 2.4. The nucleus contains
protons, which are positively charged, and neutrons,
which are electrically neutral. Almost all the mass of an
atom is concentrated in the nucleus. There are usually
several electrons going around the nucleus. Electrons are negatively charged. For a neutral atom,
the number of electrons is equal to the number of
protons in the nucleus. There are about 110 different kinds of atoms, the simplest being hydrogen,
with only one proton in the nucleus. Hence, neutral
hydrogen has one electron.
Figure 2.4: A classical picture of an atom with the nucleus
at the center and several electrons orbiting around it.
Quantum mechanics, which is the theory for atomic
or smaller systems, tells us that electrons can only be in certain configurations
relative to the nucleus. These allowed configurations are called states. Each state
corresponds to some definite energy. After solving the Schrödinger equation, it can
be shown that the possible energies of a hydrogen atom are
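\[ \begin{aligned}
E_n &= -\frac{m_e e^4}{2\hbar^2}\,\frac{1}{n^2} && \text{in cgs} && (2.7) \\
    &= -\frac{m_e e^4}{8h^2\epsilon_0^2}\,\frac{1}{n^2} && \text{in MKS,} && (2.8)
\end{aligned} \]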
where ħ ≡ h/2π, me is the electron mass, e = 4.8 × 10⁻¹⁰ esu is its charge in cgs
units (or e = 1.6 × 10⁻¹⁹ C in MKS), ϵ0 is the vacuum permittivity and n is any
positive integer. Different values of n label different states. Numerically,
\[ E_n = -\frac{13.6}{n^2}\ \mathrm{eV}, \qquad (2.9) \]
where eV is a unit of energy with 1 eV = 1.6 × 10⁻¹² erg. With the leading negative
sign, higher states (larger n) have higher energies (less negative). The lowest-energy
state (n = 1) is called the ground state and the others are called excited states.
Since the electron can only be in those states, when it jumps from one state to another,
it can only emit or absorb energy equal to the difference between the energies of
the two states. For example, if the electron of a hydrogen atom in the excited state
n = 5 jumps to a lower state n = 3, it will emit energy
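\[ E = E_5 - E_3 = -13.6\ \mathrm{eV}\left(\frac{1}{5^2} - \frac{1}{3^2}\right) = 0.97\ \mathrm{eV}. \qquad (2.10) \]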
By Eq. (2.3), the photon emitted has a frequency of 2.3 × 10¹⁴ Hz, which is in the
infrared. If the electron jumps to a higher state, it has to absorb energy. In general, the
energy difference between states of the hydrogen atom is given by
\[ E_n - E_m = -13.6\ \mathrm{eV}\left(\frac{1}{n^2} - \frac{1}{m^2}\right) \qquad (2.11) \]
for various positive integers n and m.
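The n = 5 → 3 example above can be reproduced numerically from Eq. (2.9), Eq. (2.11) and Eq. (2.3). This is only a sketch with function names of our own choosing:

```python
RYDBERG_EV = 13.6   # eV, from Eq. (2.9)
H_EV_S = 4.14e-15   # Planck's constant in eV s

def energy_level_ev(n):
    """Energy of hydrogen state n in eV, Eq. (2.9)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return -RYDBERG_EV / n ** 2

def transition_energy_ev(n_upper, n_lower):
    """Energy released when the electron drops from n_upper to n_lower, Eq. (2.11)."""
    return energy_level_ev(n_upper) - energy_level_ev(n_lower)

energy = transition_energy_ev(5, 3)
frequency = energy / H_EV_S  # Eq. (2.3) solved for f
print(f"E = {energy:.3f} eV, f = {frequency:.2e} Hz")
```

A negative result from `transition_energy_ev` means the electron has to absorb that much energy to make the jump.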
Figure 2.5: Radiation with a continuum spectrum passing through low density gas.
It depends on the viewing angle whether the observer sees an emission or an absorption
spectrum.
We can calculate the wavelength of the photon emitted by E = hf = hc/λ, such
that
\[ \frac{1}{\lambda} = \frac{2\pi^2 m_e e^4}{h^3 c}\left(\frac{1}{m^2} - \frac{1}{n^2}\right) \equiv R_\infty\left(\frac{1}{m^2} - \frac{1}{n^2}\right), \qquad (2.12) \]
where R∞ = 1.097 × 10⁵ cm⁻¹ is called the Rydberg constant. It corresponds to the
transition from m = 1 to n = ∞.
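As a sketch of how Eq. (2.12) is used, the function below returns the wavelength for a transition between a lower state m and an upper state n. The three example transitions are our own choices. Passing `math.inf` for n gives the series limit.

```python
import math

R_INF = 1.097e5  # Rydberg constant in cm^-1 (value quoted in the text)

def wavelength_nm(m, n):
    """Wavelength in nm of the hydrogen transition between states m < n, Eq. (2.12)."""
    if not m < n:
        raise ValueError("expects m < n")
    inverse_wavelength_cm = R_INF * (1.0 / m ** 2 - 1.0 / n ** 2)
    return 1.0e7 / inverse_wavelength_cm  # 1 cm = 1e7 nm

print(f"n = 2 -> 1:   {wavelength_nm(1, 2):.1f} nm")
print(f"n = 5 -> 3:   {wavelength_nm(3, 5):.0f} nm")
print(f"n = inf -> 1: {wavelength_nm(1, math.inf):.1f} nm")
```

The n = 5 → 3 line comes out at about 1.3 µm, consistent with the infrared frequency found above.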
The transition lines of hydrogen are particularly important, since H is the most
abundant element in the Universe. The transitions from n ≥ 2 to n = 1 are called
the Lyman series, with n = 2 → 1 called Ly-α, n = 3 → 1 called Ly-β,
n = 4 → 1 called Ly-γ, and so on. The Balmer series consists of transitions from n ≥ 3
to n = 2, i.e. n = 3 → 2 called Balmer-α, n = 4 → 2 called Balmer-β,
etc. These are also called Hα, Hβ, Hγ, etc. For completeness, the transitions
to n = 3 are called the Paschen series, to n = 4 the Brackett series, and to
n = 5 the Pfund series.
Exercise: What are the wavelengths of Hα and Hβ? Which part of the EM
spectrum (or color) do they correspond to? Hence, why are they so important?
Now, it is easy to understand Kirchhoff’s laws. The continuous spectrum comes
from blackbody radiation emitted by any object with temperature above absolute zero. We will discuss more on blackbody radiation in Chapter 4 later in
this course. Emission lines are produced by electrons making downward transitions
(“falling”) from a higher orbit to a lower orbit. When radiation with a continuum
spectrum passes through a gas of atoms at low pressure, those atoms will absorb photons with energy equal to the differences of energies of their states. The atoms
will be excited. When they fall back down, they produce emission lines, but the
photons emitted will travel in all directions. As a result, depending on the
viewpoint, the observer will see emission or absorption lines, Fig. 2.5.
Note that different atoms have different sets of states and hence different spectral lines. The
spectral lines of hydrogen correspond to energies given by Eq. (2.11). We can tell
from the spectrum of a star which elements are present in the outer atmosphere of
the star.
Not just atoms, molecules also have their own sets of spectral lines. We can find
all those lines in laboratories on the Earth. However, the spectral lines observed from
a star are often shifted to other wavelengths, due to the relative motion
of the star and the Earth. This is the Doppler effect. Unlike sound waves in the classical case,
EM waves do not require a medium to propagate, and the Doppler effect for light
includes the time dilation of special relativity. The fractional change in wavelength, called the
redshift, is given by
\[ z = \frac{\Delta\lambda}{\lambda_0} = \sqrt{\frac{1 + v/c}{1 - v/c}} - 1, \qquad (2.13) \]
where ∆λ is the observed wavelength minus the original wavelength, λ0, v is the
relative speed between the source and the observer (positive when they recede from each other), and c is the speed of light. The
motion of the star shifts all of its spectral lines by the same factor, 1 + z. Note that
for v ≪ c, z ≈ v/c.
This is an extremely useful technique in astronomy for velocity measurements, leading to all kinds of important discoveries, including extrasolar planets, binary star
systems, rotation of galaxies (hence dark matter), and the expansion of the Universe
(hence big bang and dark energy).
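The sketch below evaluates Eq. (2.13) and compares it with the low-speed approximation z ≈ v/c. The sample speeds are arbitrary illustration values, and v/c is taken as positive for a receding source.

```python
import math

def redshift(v_over_c):
    """Relativistic Doppler redshift of Eq. (2.13); v_over_c > 0 for a receding source."""
    if not -1.0 < v_over_c < 1.0:
        raise ValueError("|v| must be smaller than c")
    return math.sqrt((1.0 + v_over_c) / (1.0 - v_over_c)) - 1.0

for beta in (1e-4, 0.01, 0.1, 0.5):
    print(f"v/c = {beta:<6g} z = {redshift(beta):.6f}  (v/c approximation: {beta:.6f})")
```

For a speed like 30 km/s (v/c = 10⁻⁴) the two agree to better than one part in ten thousand, while at v/c = 0.5 the approximation is clearly inadequate.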
2.4
Optics and Telescopes
2.4.1
Basics
Astronomical telescopes have only one main purpose: to collect more light, not to
magnify or to focus. Therefore, larger is better. Not all telescopes can focus light.
Since high energy radiation can penetrate into materials, it is very difficult to deflect
the X-ray photons (need grazing at shallow incidence angles) and impossible to focus
gamma rays. Therefore, all gamma-ray telescopes and some X-ray telescopes are
non-focusing. (What is the advantage of focusing telescopes?)
In this section, we mainly discuss optical telescopes, since they are the most common
type and have the longest history. However, the general principles apply to all kinds
of telescopes. A large collecting surface enables us to detect dimmer objects. In
the dark adapted condition, the pupil of a human eye can relax to about 7 mm in
diameter and stars of magnitude 6 can be seen. If a telescope of diameter 20 cm
is used, the intensity of stars is amplified by a factor of (200/7)² = 816. Thus, by
Eq. (2.4), through this telescope, stars of magnitude 6 + (5/2) log10(816) = 13 can be
seen. In general, if the diameter of the telescope is D mm, the dimmest star that can be
detected is of magnitude
\[ 6 + \frac{5}{2}\log_{10}\left(\frac{D}{7}\right)^2 = 6 + 5\log_{10}\left(\frac{D}{7}\right). \qquad (2.14) \]
This is a rough estimate. Many other factors affect what we can see. Also, if we use
other detectors, like CCD, instead of our eyes, we usually can detect much dimmer
objects (why?).
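Eq. (2.14) is easy to tabulate for a few apertures. In this sketch the 7 mm pupil and the naked-eye limit of magnitude 6 are the values used above, and the list of apertures is arbitrary. Like the equation itself, it is only a rough estimate for visual observation.

```python
import math

def limiting_magnitude(diameter_mm, pupil_mm=7.0, eye_limit=6.0):
    """Rough visual limiting magnitude of a telescope of aperture D in mm, Eq. (2.14)."""
    return eye_limit + 5.0 * math.log10(diameter_mm / pupil_mm)

for d_mm in (7, 50, 200, 1000):
    print(f"D = {d_mm:5d} mm -> limiting magnitude about {limiting_magnitude(d_mm):.1f}")
```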
One common misconception is to ask “how far can I see with this telescope?” This is
not a well-defined question. Provided that the object is bright enough, no matter
how far, we can still see that object. If it is very dim, we cannot see it even if it is
near us. Another common mistake is paying too much attention to the “maximum
magnification” of a telescope.
Figure 2.6: Principles of refracting (left) and reflecting (right) telescopes.
2.4.2
Refracting telescopes
There are two ways to bend light: refraction with lenses or reflection with mirrors.
Therefore, there are three kinds of telescopes: refractors, reflectors, and hybrids that use both. When an EM wave enters a medium,
its speed will change. (If it enters the medium from vacuum, it must slow
down. Nothing can travel faster than the speed of light in vacuum.) If the light ray
enters the medium at an angle, the ray will be bent. This is refraction. It can be
described by Snell’s law
\[ \frac{\sin\theta_1}{\sin\theta_2} = \frac{n_2}{n_1}, \qquad (2.15) \]
where n1 and n2 are the indices of refraction of the two media, and θ1 and θ2
are the incident and refraction angles, respectively, measured from the normal. For
example, air has n = 1.0003 and water has n = 1.33 relative to vacuum. The main
component of a refracting telescope, or refractor, is a lens, which focuses the light
rays by refraction (Fig. 2.6 left). The distance between the focus and the lens is
called the focal length.
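A small sketch of Snell’s law, Eq. (2.15), using the indices of air and water quoted above. The 45° incidence angle is an arbitrary example and the function name is ours.

```python
import math

def refraction_angle_deg(theta1_deg, n1, n2):
    """Refraction angle (degrees from the normal) for light passing from index n1 to n2."""
    sin_theta2 = n1 * math.sin(math.radians(theta1_deg)) / n2  # Eq. (2.15)
    if abs(sin_theta2) > 1.0:
        # Possible only when going from a denser to a less dense medium.
        raise ValueError("total internal reflection: no refracted ray")
    return math.degrees(math.asin(sin_theta2))

N_AIR, N_WATER = 1.0003, 1.33
print(f"air -> water at 45 deg: {refraction_angle_deg(45.0, N_AIR, N_WATER):.2f} deg")
```

The ray bends towards the normal on entering the denser medium, as expected.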
Refractors have two main disadvantages. First, there is dispersion: in media
other than vacuum, EM waves with different frequencies travel at different speeds.
Dispersion leads to chromatic aberration in refractors. It means that light rays
of different colors (different wavelengths) focus at different points. For example,
if we pick the best focus for green color, the resulting image will have a red halo
around it. This problem can be mitigated with a long focal length or using a lens
system of two or more lenses, but then the optics is complicated and the cost will
increase. An achromatic lens uses two lens elements made of different materials. It
can focus two colors to the same point. An apochromatic (“APO”) lens can bring
three colors to the same focus.
A much bigger problem is that a refractor requires a large and perfect lens. It is
extremely difficult to manufacture a large piece of glass without bubbles in it. Even
if that can be done, the lens is usually so heavy that it would deform differently
under its own gravity when the telescope points in different directions. As a result, all
modern telescopes for astronomical research are reflectors. The largest refractor
still in use today has a diameter of 102 cm.
2.4.3
Reflecting and catadioptric telescopes
If a thin layer of metal is deposited onto a polished glass surface, the reflectance
will be increased, and we have a mirror. A reflector uses a curved mirror to converge the light rays
to the focus, Fig. 2.6 right. The surface of the mirror can only deviate from the
desired shape by one quarter of the wavelength. This is about 100 nm for visible
light (2 cm for radio waves). Since we only use one surface of the glass, defects
in the bulk are irrelevant. Also, since the angle of reflection is the same for all colors,
there is no chromatic aberration. One additional advantage is that we can support
the mirror from “behind,” not just the edge as in refractors. Thus, we can build
very large reflectors. The largest is 10 m in diameter. To focus the light rays from
infinity (i.e. parallel incident rays) into a point, the shape of the mirror has to be
a paraboloid. (Do you know how to prove that mathematically?) This kind of mirror
is much more difficult to polish than a spherical one. Using a spherical mirror will
result in spherical aberration.
The reflectors are not without shortcomings. The focus in Fig. 2.6 is in front of
the mirror. We have to divert the light rays to a position convenient for viewing.
Usually, a secondary mirror will be introduced. Two possible configurations are
shown in Fig. 2.7. Fig. 2.7(a) is the Newtonian design, while in Fig. 2.7(b), a hole is
opened at the center of the main mirror and is called the Cassegrain design. These
designs introduce some obstructions to the light path. The obstruction will scatter
light, and hence the image produced by a perfect reflector is not as sharp as the
image by a perfect refractor with a lens of same diameter.
Figure 2.7: Two possible focus arrangements for reflectors: (a) Newtonian, (b) Cassegrain.
The third type of telescope is called catadioptric telescope. It is a hybrid design using both lenses and mirrors. One design very popular among the amateur
astronomers is Schmidt-Cassegrain, because of its compact size. Some designs use
a spherical primary mirror, which is easy to manufacture, with a corrector plate in
front.
2.4.4
Magnification and resolution
Another important component of an optical telescope is the eyepiece. This is not
necessary if a CCD or photographic film is used to record the image, but is critical for
visual observations. The simplest design of an eyepiece is just a lens. It is usually
placed such that the distance between the eyepiece and the objective (be
it the main mirror in reflectors or the main lens in refractors) is equal to the
sum of their focal lengths, f1 and f2. Fig. 2.8 shows that an object with angular size
θ1 will have an image of angular size θ2. Thus, the angular magnification, or just
magnification, of the telescope is
\[ \text{magnification} = \frac{\theta_2}{\theta_1} = \frac{f_1}{f_2} . \qquad (2.16) \]
Figure 2.8: The magnification of a telescope is determined by the focal lengths of
the objective (f1) and the eyepiece (f2).
Exercise: Using the small angle approximation tan θ ≈ θ for small θ, prove the
equation above.
Changing the magnification is accomplished by simply changing the eyepiece with
a different focal length. Even for the largest telescopes, the magnification used is
seldom over 500, usually between 100 and 200. It is because a large image given
by the high magnification is usually very fuzzy. As we will see below, this is due to
the Earth’s atmosphere most of the time.
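As a simple illustration of Eq. (2.16), the sketch below shows how swapping eyepieces changes the magnification. The focal lengths are hypothetical values for a small amateur telescope.

```python
def magnification(f_objective_mm, f_eyepiece_mm):
    """Angular magnification f1/f2 of Eq. (2.16); both focal lengths in the same unit."""
    if f_objective_mm <= 0 or f_eyepiece_mm <= 0:
        raise ValueError("focal lengths must be positive")
    return f_objective_mm / f_eyepiece_mm

F_OBJECTIVE_MM = 2000.0  # hypothetical objective focal length
for f_eyepiece_mm in (40.0, 20.0, 10.0):  # hypothetical eyepieces
    print(f"{f_eyepiece_mm:4.0f} mm eyepiece -> {magnification(F_OBJECTIVE_MM, f_eyepiece_mm):.0f}x")
```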
Even without considering atmospheric effects, theoretically is there a physical limit
on the angular resolution? Due to diffraction of light, a point source, like a star,
will not be focused to an infinitely small point by a telescope. The best image is
actually a blurred disk, called the Airy disk. If two stars are very close to each
other, their Airy disks could overlap and the observer cannot tell if it is one star
or two. We say that the two stars cannot be resolved. For a circular aperture, the
blurring produced by diffraction limits the angular resolution to an amount given
by the Rayleigh criterion:
\[ \theta_{\text{diffraction limit}} = \frac{1.22\,\lambda}{D} \qquad (2.17) \]
where D is the diameter of the objective of the telescope and λ is the wavelength
of the light. For example, for
yellow light, λ = 600 nm, if D = 20 cm, the angular resolution is 0.75′′, which means
that two stars separated by less than 0.75′′ would be seen as one star,
even in ideal conditions. (The value given by Eq. (2.17) is in radians,
whereas 0.75′′ is in arcseconds. How do we convert from one to the other?)
Figure 2.9: In (a), the Airy disks produced by a large telescope are small. The two
stars can be resolved. In (b), the Airy disks of the same pair of stars by a small
telescope are larger. The two stars cannot be resolved.
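The Rayleigh criterion of Eq. (2.17) can be evaluated directly. The sketch below reproduces the 20 cm example and adds two other apertures chosen only for comparison, namely the 7 mm pupil mentioned earlier and a 10 m mirror.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi  # about 206265

def diffraction_limit_arcsec(wavelength_m, diameter_m):
    """Rayleigh diffraction limit of a circular aperture in arcseconds, Eq. (2.17)."""
    return 1.22 * wavelength_m / diameter_m * ARCSEC_PER_RADIAN

WAVELENGTH_M = 600e-9  # yellow light, as in the text
for diameter_m in (0.007, 0.20, 10.0):
    theta = diffraction_limit_arcsec(WAVELENGTH_M, diameter_m)
    print(f"D = {diameter_m:6.3f} m -> {theta:.3f} arcsec")
```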
Exercise: Sirius is the brightest star in the sky (except the Sun). It is at a distance
of 2.64 pc and has a diameter of 3.4 solar radii (i.e. 2.4 × 10⁶ km). How large a telescope
is needed to resolve its image in visible light (λ = 555 nm)?
So practically we can treat stars as unresolved point sources.
If we use a short focal length eyepiece to obtain high magnification, the image of
the Airy disk will be magnified. This will blur the whole image and degrade the
quality. In some department stores, they advertise their telescopes by claiming a
high magnification, say 600 or more. This is an unsound claim and their telescopes
can only be treated as toys.
By Eq. (2.17), it seems that we can obtain high resolution by increasing the size of
our telescope. This is true up to a point. For ground based observers, starlight has
to pass through the atmosphere, which acts as a large refractive medium. The air in
the atmosphere is constantly moving and the image of a star will dance around, like
the bottom of a swimming pool when viewed from above the water. The effect is called
seeing. The seeing limit is usually about 1′′. Thus, even for a small telescope like
the 20 cm one discussed above, the resolution is limited by the seeing, not the diffraction
of its optics. This is one more reason why a high magnification is not useful.
However, we can go above the atmosphere to avoid the bad seeing, for example
with the Hubble Space Telescope. Many ground-based telescopes, including the biggest
ones, are seeing-limited and their resolution is nowhere near the diffraction limit.
This is why some astronomers advocate building more medium-size telescopes, which
are cheaper, instead of extremely large ones.
For radio observations, since the wavelength of radio waves is about 10⁵ times
that of visible light, we have to build an enormous telescope to obtain the same
resolution. One remedy is to use a computer to recombine the signals from several
radio telescopes. The several radio telescopes will function as parts of the dish of
an imaginary radio telescope, with effective diameter equal to the distance between
the actual radio telescopes. (We call the objective of a radio telescope a dish.) This
is called radio interferometry. This technique has been applied to optical light as
well, but it is slightly different due to the much higher frequency of visible light.
2.4.5
Lens speed
For unresolved objects, such as distant stars or quasars, all incident light rays are
focused into a point, hence, the brightness depends on the lens diameter. However,
for extended objects, e.g., the moon, planets, nebulae, or nearby galaxies, the light
coming out from the telescope spreads over some area. Therefore, the surface
brightness depends on the magnification. One important parameter to consider
is the f-number (or f-ratio) of a telescope or a lens, which is defined as
\[ \text{f-number} \equiv \frac{f}{D} , \qquad (2.18) \]
where f is the focal length and D is the diameter of the lens. Note that, for a given focal length, a larger
f-number means a smaller diameter. For photographic lenses, the f-number can be
adjusted in steps of √2, in the series of 1, 1.4, 2, 2.8, 4, 5.6, 8, 11, 16, ... Every
successive step reduces the lens’ effective diameter by a factor of √2, such that the amount of
light passing through is reduced by half.
Exercise: For a resolved object with angular size α (in radians), show that the
physical size of its real image by a telescope of focal length f is αf .
The amount of light collected is proportional to D². From the exercise above, we
know that the light collected is spread out over an area ∝ (αf)². As a result, the
surface brightness I of the image scales as
\[ I \propto \left(\frac{D}{f}\right)^2 = \left(\frac{1}{\text{f-number}}\right)^2 . \qquad (2.19) \]
We see that a smaller f-number gives a brighter image (i.e. with larger surface
brightness), since the lens aperture is larger. Telescopes or lenses with smaller
f-numbers are therefore referred to as having a “higher speed” or “faster”.
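The scaling in Eq. (2.19) can be checked with a few lines of code. The f/4 and f/8 settings below are arbitrary examples, and the function names are ours.

```python
import math

def f_number(focal_length, diameter):
    """f-number of a lens or mirror, Eq. (2.18); both lengths in the same unit."""
    return focal_length / diameter

def relative_surface_brightness(f_number_a, f_number_b):
    """Image surface brightness of system A relative to system B, from Eq. (2.19)."""
    return (f_number_b / f_number_a) ** 2

print(relative_surface_brightness(4.0, 8.0))                     # f/4 compared with f/8
print(round(relative_surface_brightness(1.0, math.sqrt(2)), 3))  # one step of sqrt(2)
```

Opening up by one step in the √2 series doubles the surface brightness, consistent with the statement that each step changes the light passing through by a factor of two.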
Exercise: Compare the lens on a cell phone, which has f/2.2, with the
primary mirror of the Hubble Space Telescope, which has f/24. Which one gives a
brighter image? Which lens/mirror is “faster”?
Finally, we should mention that besides the optics, the mount of a telescope is also
very important. Not only must it support the optics, it must also track the stars
across the sky. As we have discussed in the previous chapter, due to the rotation of
the Earth, stars and all other objects move on the celestial sphere. There must be
a mechanism to turn the telescope such that the image of the stars is fixed for us or
the detectors.
2.5
CCD
CCD stands for charge-coupled device. It is a semiconductor device with an appearance
similar to an ordinary computer chip. On the top of the chip, there is a window,
allowing light to go in. After applying the power, each element of the device will
convert photons to electrons. The number of electrons released is proportional to
the number of photons that hit the device. Hence, by reading the amount of charge, we
can tell the intensity of the light source.
A typical CCD consists of an array of light detecting elements, called pixels, usually in the
range of 768×512 to 2048×2048. Thus, we can form a picture of such resolutions.
The size of one pixel depends on the model, with a typical value of 9 µm × 9 µm.
For a 1024 × 1024 CCD, the light detecting area is about 1 cm by 1 cm, which is
less than the size of photographic films. This is one of the disadvantages of CCDs
as compared with films. The other disadvantages are low resolution and that only
black and white pictures can be taken, because a CCD only detects the intensity,
not the color, of the light. There are two methods to obtain a color photo.
One is called the tricolor photo: the observer takes three photos through red,
green and blue filters, then combines the three photos into one with the help of
computer software. The second method is to use a color CCD, in which three pixels
form a group. Each pixel in a group is covered by a red, green or blue filter,
and the electronics of the CCD combine the data to output a color photo.
There seem to be many disadvantages of CCDs, but there is one overwhelming advantage. The quantum efficiency of a professional-grade CCD can reach 80%, which means that it detects most of the photons, compared with only 2% to 4% for photographic film and 1% for the human eye. In astronomical applications, almost all objects in the sky are very dim. A high-efficiency device greatly reduces the exposure time. Note that the quantum efficiency is frequency dependent. It is lowest in the blue.
As mentioned, for an object with angular size α (in radians), the physical size of its real image formed by a telescope of focal length f is αf. For example, the image of the Moon (angular size of about 0.5°) formed by a telescope of focal length 2 m is about 1.7 cm across. The CCD must be larger than this to cover the whole Moon. To match the resolving power of the telescope and the CCD, two pixels should cover the angular resolution. Hence, for example, if the angular resolution is about 1′′ and the pixel size is 9 µm, then the matching focal length is
$$ f = \frac{2 \times 9\,\mu\text{m}}{1''} = \frac{1.8\times10^{-5}\,\text{m}}{4.85\times10^{-6}\,\text{rad}} = 3.7\,\text{m} . \tag{2.20} $$
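The two estimates above (image size αf, and two pixels per resolution element) can be reproduced with a short script. This is only a sketch under the small-angle approximation; the function and variable names are illustrative assumptions, not part of the notes.

```python
import math

ARCSEC_RAD = math.radians(1.0 / 3600.0)  # one arcsecond in radians


def image_size(angular_size_rad: float, focal_length_m: float) -> float:
    """Linear size of the image in the focal plane, alpha * f (small-angle)."""
    return angular_size_rad * focal_length_m


def matching_focal_length(pixel_size_m: float, resolution_rad: float,
                          pixels_per_element: int = 2) -> float:
    """Focal length at which `pixels_per_element` pixels span one resolution element."""
    return pixels_per_element * pixel_size_m / resolution_rad


moon_image_m = image_size(math.radians(0.5), 2.0)   # Moon, f = 2 m
focal_m = matching_focal_length(9e-6, ARCSEC_RAD)   # 9 µm pixels, 1'' resolution
print(f"Moon image: {moon_image_m * 100:.1f} cm")
print(f"Matching focal length: {focal_m:.1f} m")
```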
Astronomical CCDs are usually cooled to low temperature to avoid thermal noise.
Chapter 3
Celestial Mechanics
(Chapter 2 in textbook.)
Since we believe that the laws of physics we developed on Earth should hold anywhere in the Universe, the motions of celestial bodies should be governed by classical mechanics, which was mainly developed by Newton. In this chapter, we will review the basics of the theory of gravitation, then apply it to the two-body problem. Why study the two-body problem? It is the simplest case of celestial motion that can be solved analytically. It is also very useful for describing the motions of objects in a binary system or planetary system.
3.1 Newton’s Laws of Motion
We will briefly review Newton’s laws in this section. The readers are assumed to
know the material well. This section only serves as a reminder.
The first law of mechanics describes the resistance of matter to change in its state of motion: A body at rest will remain at rest, and a body in motion will remain in motion at constant velocity, unless it is acted upon by some external force.
Newton’s formulation of the second law is the familiar
F = m a = m v̇
(3.1)
where F is the force vector, m is the mass of an object and a is the acceleration vector. The mass in this equation is the inertial mass, which describes the response of the body to an external force. The acceleration is the rate of change of the velocity. Velocity describes both the speed and the direction of the motion. Thus, the acceleration is sometimes non-zero even if the speed of the body remains constant.
Eq. (3.1) is valid only in an inertial frame, which is any frame at rest or moving at constant velocity with respect to the fixed stars. Careful experiments show that the “rest” frame on the Earth is not an inertial frame. The concept of an inertial frame becomes very important when we discuss special relativity.
The third law states that whenever there is an action, there is a reaction equal in magnitude and opposite in direction. For example, we feel the gravitational attraction of the Earth pulling us down; at the same time, there is a force of the same strength pulling the Earth “up.” (Do you know how to prove Newton’s laws?)
Momentum, or linear momentum, of a particle is defined as the product
p = mv .
(3.2)
We find from Eq. (3.1) that in the absence of any external force, the total momentum of a system remains constant. This is the conservation of momentum. (Do you know how to prove it?)
The addition of velocities in classical mechanics is very simple. For example, if a
train is moving with velocity vt relative to the station and a ball is moving with
velocity vb relative to the train, then relative to the station, the ball is moving with
velocity vt + vb .
The kinetic energy of a particle is given by
$$ \text{K.E.} = \frac{1}{2} m v^2 = \frac{p^2}{2m} \tag{3.3} $$
where v and p are the magnitude of velocity and momentum respectively. If we
want to change the velocity of the particle, a force must act on it. The change of
kinetic energy is equal to the work done W , which, for constant force, is the dot
product of the force vector F and the displacement vector d of the particle
W =F ·d .
(3.4)
Apart from the kinetic energy, another important form of energy is the potential energy. This is the energy associated with the configurations of the system.
The most important example is the gravitational potential energy. For a particle
with mass m at height h above some reference point on the Earth’s surface, the
gravitational potential energy is
U = mgh
(3.5)
where g is the free-fall acceleration at the Earth’s surface, g ≈ 9.8 m/s².
There are other kinds of energy, such as chemical energy and nuclear energy. If we sum up all kinds of energy in an isolated system, the total energy remains constant. This is the principle of conservation of energy. Energy (strictly speaking, mass-energy) can be neither created nor destroyed. It can be converted from one form into another, and the total energy is always conserved. (Do you know how to prove it?)
Other than the linear motion, rotation of a body is another important subject in
mechanics. We will mainly talk about the rotation of a particle on a plane around
some point, the center. The angular position of the particle is the angle made by the
line joining the center and the particle and some fixed reference line. The angular
velocity is the rate of change of the angular position, usually denoted by ω and measured in radians per second. As the name implies, angular velocity also describes the direction of the rotation. Similarly, we define the angular acceleration.
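Figure 3.1: Can we see sunset twice a day? (Labels in the figure: the direction to the Sun, the observer’s height h, the Earth’s radius R, and the rotation angle θ.)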
Exercise: A simple investigation of the rotation of the Earth tells us that we can indeed see the sunset twice, or even more times, in a day (Fig. 3.1). Everyone knows that on a high mountain the sunset comes later, because it takes more time for the Earth to rotate the Sun out of our sight. If we are at height h above the Earth’s surface, how much later will the sunset be? Referring to the figure, the angle θ is given by
$$ \theta = \cos^{-1}\!\left(\frac{R}{R+h}\right) \approx \sqrt{\frac{2h}{R}} . \tag{3.6} $$
The time for the Earth to rotate by this amount is
$$ \Delta t = \frac{\theta}{2\pi} \times 24\ \text{h} . \tag{3.7} $$
Hence, if we substitute the radius of the Earth for R and take h as 1.7 m, the height of a typical adult, then ∆t ≈ 10 s. To see the sunset twice, all you have to do is sit down and watch the sunset. Just after the Sun sets, stand up immediately, and you can see the Sun set again. The Sun has an angular diameter of 0.5°. Just after sunset, how high do you have to climb (assuming it takes no time) to bring the whole Sun back above the horizon?
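The geometry of this exercise can be explored numerically. The sketch below assumes an observer on the equator, a spherical Earth of radius 6371 km, a 24-hour rotation period and no atmospheric refraction, with the horizon angle defined by cos θ = R/(R + h). These values and the function names are assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6.371e6   # assumed mean radius of the Earth, in metres
DAY_S = 24 * 3600.0        # assumed rotation period of 24 hours, in seconds


def horizon_angle(height_m: float, radius_m: float = EARTH_RADIUS_M) -> float:
    """Angle theta (radians) with cos(theta) = R / (R + h)."""
    # tan(theta) = (tangent length) / R; atan2 is numerically safer than acos near 1.
    tangent = math.sqrt(height_m * (2.0 * radius_m + height_m))
    return math.atan2(tangent, radius_m)


def sunset_delay(height_m: float) -> float:
    """Extra time (seconds) before sunset for an observer raised by height_m."""
    return horizon_angle(height_m) / (2.0 * math.pi) * DAY_S


def height_for_angle(theta_rad: float, radius_m: float = EARTH_RADIUS_M) -> float:
    """Inverse relation: height needed to gain an angle theta_rad."""
    return radius_m * (1.0 / math.cos(theta_rad) - 1.0)


print(f"Delay for h = 1.7 m: {sunset_delay(1.7):.1f} s")
```

The helper `height_for_angle` inverts the same relation, so it can be used to attack the last question of the exercise by passing the Sun’s angular diameter in radians.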
Corresponding to the mass in linear or translational motion, we have moment of
inertia I in rotational motion, defined as
$$ I = m r^2 \tag{3.8} $$
for a particle with mass m at a distance r from the center. The angular momentum L is defined as
$$ \mathbf{L} = I\boldsymbol{\omega} = m r^2 \boldsymbol{\omega} = \mathbf{r} \times \mathbf{p} . \tag{3.9} $$
Obviously this depends on the choice of origin. Angular momentum is a very
important quantity in many astrophysical situations. For example, when material
collapses due to gravity, it always forms a disk first, because angular momentum
is conserved and very hard to get rid of. This results in accretion disks, and it is
also the reason why disk structure is seen everywhere in the Universe, from solar
systems to galaxies. In spacecraft, angular momentum gradually builds up through pointing maneuvers. The angular momentum is stored in reaction wheels and eventually needs to be cancelled by thrusters or other means.
A force acting on a particle does not necessarily change its angular velocity; the
tangential component of the force must be non-zero to do so. We define the torque
τ as the product of the tangential component of the force and the distance of the
application point from the center
τ =r×F .
(3.10)
Newton’s second law applying to rotation becomes
$$ \boldsymbol{\tau} = \frac{d\mathbf{L}}{dt} , \tag{3.11} $$
that is, the rate of change of the angular momentum equals the torque. If the net torque is zero, the angular momentum remains constant. This is the principle of conservation of angular momentum. (Do you know how to prove it?) The most important example is a system with a central force. In such a system, a particle moves under the influence of a force that always points toward or away from a fixed point. Since the force is always radial, its tangential component, and hence the torque, is always zero, so the angular momentum is conserved. Note that this depends on the choice of origin: angular momentum can be conserved about one point but not another, depending on the net torque (indeed, it is usually not conserved about other points). The kinetic energy of rotation is given by
$$ \text{K.E.} = \frac{1}{2} I \omega^2 . \tag{3.12} $$
Before closing this section, we give an application of mechanics to spacecraft trajectories. Almost always, a spacecraft visits a few major planets, sometimes several times, before it goes to its destination, which could be Saturn, for example. The major reason for such a visit is to accelerate the spacecraft, a maneuver called gravity assist. The point is that if we choose the trajectory of the spacecraft appropriately, the speed of the spacecraft relative to the Sun will increase, and a large amount of fuel can be saved. We now see how this works in detail.
Figure 3.2: The figure at the left shows the velocities of the planet and the spacecraft. The middle shows the point of view in the center-of-mass frame, which moves with velocity u. In this frame, the Sun is moving to the left. The right figure shows that, after the encounter with the planet, the spacecraft gains speed by gravity assist.
Suppose a spacecraft of mass m is approaching a planet of mass M ≫ m head-on. Relative to the Sun, the planet moves with speed v1 and the spacecraft moves with speed v2 in the opposite direction (Fig. 3.2). To simplify the problem, we consider the center-of-mass frame, in which the total momentum is zero. We first determine the center-of-mass velocity u relative to the Sun,
$$ (M + m) u = M v_1 - m v_2 . \tag{3.13} $$
We have
$$ u = \frac{M v_1 - m v_2}{M + m} . \tag{3.14} $$
Hence, the speeds of M and m in the center-of-mass frame are
$$ v_1' = v_1 - u = \frac{m (v_1 + v_2)}{M + m} \tag{3.15} $$
$$ v_2' = v_2 + u = \frac{M (v_1 + v_2)}{M + m} \tag{3.16} $$
respectively. We can see that the total momentum in this frame is M v1′ − m v2′ = 0.
Assume for simplicity that after they interact gravitationally, the directions of their velocities are perpendicular to the original velocities in the center-of-mass frame. (Do you know how to calculate their speeds if this assumption is not true? Hint: It depends on the angle of scattering.) The conservation of momentum and energy requires that their speeds do not change. (How can this be proved?) Transforming back to the frame in which the Sun is at rest, the velocity of the spacecraft is given by the vector sum of the velocity of the spacecraft in the center-of-mass frame and the velocity of the center-of-mass frame. We find that its speed increases:
$$ \sqrt{v_2'^2 + u^2} = \sqrt{(v_2 + u)^2 + u^2} > v_2 . \tag{3.17} $$
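To get a feeling for the size of the effect, the simplified head-on, 90-degree-deflection case of Eqs. (3.13) to (3.17) can be evaluated directly. This is a sketch only; the function name and the numbers (a 1000 kg probe, a Jupiter-mass planet of 1.9 × 10^27 kg, speeds of 10 and 13 km/s) are illustrative assumptions.

```python
import math


def gravity_assist_speed(planet_mass, craft_mass, v_planet, v_craft):
    """Spacecraft speed in the Sun's frame after a 90-degree deflection.

    v_planet and v_craft are the head-on approach speeds v1 and v2.
    """
    # Centre-of-mass velocity, u = (M v1 - m v2) / (M + m).
    u = (planet_mass * v_planet - craft_mass * v_craft) / (planet_mass + craft_mass)
    v_craft_cm = v_craft + u  # spacecraft speed in the centre-of-mass frame
    # After the encounter the spacecraft velocity in the centre-of-mass frame
    # is perpendicular to u, so the two components add in quadrature.
    return math.hypot(v_craft_cm, u)


final_speed = gravity_assist_speed(1.9e27, 1000.0, 13.0, 10.0)  # speeds in km/s
print(f"{final_speed:.1f} km/s")
```

Because M ≫ m, u is essentially the planet’s speed, so the spacecraft leaves with roughly the quadrature sum of (v2 + v1) and v1.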
Discussion: Assuming that the high temperature is not a problem, can a spacecraft
use the Sun for gravity assist acceleration?
3.2 Newton’s Gravitation
Over 300 years ago, Newton proposed a theory for gravitation, which essentially
says that everything attracts everything. This theory not only explains the falling
of an apple, but also the motions of planets around the Sun and even the motion
of distant binary stars.
Newton’s law of gravitation states that every particle attracts every other particle with a force
$$ \mathbf{F} = G \frac{m_1 m_2}{r^2}\, \hat{\mathbf{r}} \tag{3.18} $$
where m1 and m2 are the masses of the two particles and r is the distance between them. G is the gravitational constant, whose value is
$$ G = 6.67 \times 10^{-8}\ \text{cm}^3\,\text{g}^{-1}\,\text{s}^{-2} . \tag{3.19} $$
This law can be derived from general relativity in the limit of small potential and low speed. Note that this law holds only between two particles; for extended objects we need to sum over all particles (see the example below). The direction of the force on one particle is toward the other particle. Thus, the gravitational force tends to pull them together. This simple statement has great implications. Since we have not found any “anti-gravity,” the gravitational force cannot be canceled and is cumulative. A greater mass creates a greater force. On astronomical scales, gravity is the dominant force.
A fine point about the masses in Eq. (3.18) is in order. To be precise, the mass in Eq. (3.18) is the gravitational mass, as opposed to the inertial mass in Eq. (3.1). The gravitational mass is a property of the particle that describes the magnitude of its gravitational influence on other objects. The inertial mass describes its response to force. These two kinds of mass need not be the same. The fact that they are exactly the same is called the equivalence principle, and it is the starting point of general relativity.
A brief review of gravitational force, potential energy, and potential:
Force: $F = \frac{GMm}{r^2}$  →  Potential energy: $E = -\frac{GMm}{r}$
(per unit mass)
Force per unit mass: $\frac{GM}{r^2}$  →  Potential: $V = -\frac{GM}{r}$
Example: The gravitational potential due to a point mass M is easily deduced
to be V (r) = −GM/r at a point of distance r from it. We now find out by
direct integration the gravitational potential due to a thin uniform spherical shell
of material.
Figure 3.3: The geometry of a uniform spherical shell of material.
Let the surface mass density of the shell be ρ, the total mass of the shell be M =
4πR2 ρ, and its radius be R. We choose the coordinate system, Fig. 3.3, such that
the point of interest P is on the z-axis with coordinates (0, 0, z0 ), where z0 could
be greater than or less than R (outside or inside the shell).
The surface element shown has area R² sin θ dθ dϕ. Its distance from P is $\sqrt{R^2 \sin^2\theta + (R\cos\theta - z_0)^2}$. Hence, its contribution to the gravitational potential is
$$ dV = -\frac{G R^2 \rho \sin\theta\, d\theta\, d\phi}{\sqrt{R^2\sin^2\theta + (R\cos\theta - z_0)^2}} . \tag{3.20} $$
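The gravitational potential is then given by
$$ \begin{aligned}
V &= -\iint \frac{G R^2 \rho \sin\theta\, d\theta\, d\phi}{\sqrt{R^2\sin^2\theta + (R\cos\theta - z_0)^2}} \\
&= -2\pi G R^2 \rho \int_0^\pi \frac{\sin\theta\, d\theta}{\sqrt{R^2 - 2 R z_0 \cos\theta + z_0^2}} \\
&= 2\pi G R^2 \rho \int_{\theta=0}^{\theta=\pi} \frac{d(\cos\theta)}{\sqrt{R^2 + z_0^2 - 2 R z_0 \cos\theta}} \\
&= -2\pi G R^2 \rho \int_{-1}^{1} \frac{dx}{\sqrt{R^2 + z_0^2 - 2 R z_0 x}} \\
&= -2\pi G R \rho \left( -\frac{1}{z_0}\sqrt{R^2 + z_0^2 - 2 R z_0 x} \right)\Bigg|_{-1}^{1} .
\end{aligned} \tag{3.21} $$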
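Be very careful about how we take the square root. From the very definition of the potential, Eq. (3.20), we have to take the positive roots. If z0 < R,
$$ \begin{aligned}
V &= -2\pi G R \rho \left(-\frac{1}{z_0}\right)\left[(R - z_0) - (R + z_0)\right] \\
&= -4\pi G R \rho \\
&= -\frac{GM}{R} ,
\end{aligned} \tag{3.22} $$
which is independent of z0. The potential is constant, so the force is zero. If z0 > R,
$$ \begin{aligned}
V &= -2\pi G R \rho \left(-\frac{1}{z_0}\right)\left[(z_0 - R) - (z_0 + R)\right] \\
&= -\frac{4\pi G R^2 \rho}{z_0} \\
&= -\frac{GM}{z_0} .
\end{aligned} \tag{3.23} $$
As a result, outside the shell, it acts gravitationally as a point mass.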
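The two limiting results (constant potential −GM/R inside the shell, point-mass potential −GM/z0 outside) can be checked by integrating the one-dimensional integral over x = cos θ numerically. The following is a sketch using a simple midpoint rule in units where G = M = R = 1; the function name and the number of integration steps are arbitrary choices.

```python
import math


def shell_potential(z0: float, R: float = 1.0, M: float = 1.0,
                    G: float = 1.0, n: int = 20000) -> float:
    """Potential of a thin uniform shell at distance z0 from its centre.

    Evaluates V = -2*pi*G*R^2*rho * integral_{-1}^{1} dx / sqrt(R^2 + z0^2 - 2*R*z0*x)
    with the midpoint rule (valid for z0 != R, where the integrand is smooth).
    """
    rho = M / (4.0 * math.pi * R * R)  # surface mass density
    dx = 2.0 / n
    total = 0.0
    for k in range(n):
        x = -1.0 + (k + 0.5) * dx
        total += dx / math.sqrt(R * R + z0 * z0 - 2.0 * R * z0 * x)
    return -2.0 * math.pi * G * R * R * rho * total


print(shell_potential(0.2), shell_potential(0.8))  # inside: both close to -G*M/R
print(shell_potential(2.0))                        # outside: close to -G*M/z0
```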
The gravitational potential energy in Eq. (3.5) is valid only near the Earth’s surface. For an object at any height above the Earth’s surface, the gravitational potential energy is given by
$$ U = -G \frac{M_\oplus m}{r} \tag{3.24} $$
where M⊕ is the mass of the Earth, m is the mass of the particle and r is the distance of the particle from the center of the Earth. By convention, the potential energy at infinity is zero. If the particle is near the Earth’s surface, r = R⊕ + h, where R⊕ is the radius of the Earth and h is small,
$$ \frac{1}{r} = \frac{1}{R_\oplus + h} = \frac{1}{R_\oplus}\left(1 + \frac{h}{R_\oplus}\right)^{-1} \approx \frac{1}{R_\oplus}\left(1 - \frac{h}{R_\oplus}\right) . \tag{3.25} $$
Eq. (3.24) becomes
$$ U = -G \frac{M_\oplus m}{r} \approx -G \frac{M_\oplus m}{R_\oplus} + \frac{G M_\oplus}{R_\oplus^2}\, m h . \tag{3.26} $$
Up to an irrelevant constant term, the potential energy has the form U = mgh. We can numerically check that g is equal to GM⊕/R⊕².
Imagine we throw a rock into the sky; the rock will fall back to the ground. However, if we throw it fast enough, it can escape the gravitational pull of the Earth and not return. The critical speed is called the escape velocity. By conservation of energy, it is easy to calculate the escape velocity. At the Earth’s surface, the K.E. of the rock is ½mv² and its potential energy (P.E.) is −GM⊕m/R⊕. At infinity, both K.E. and P.E. are zero, so
$$ \frac{1}{2} m v^2 - G \frac{M_\oplus m}{R_\oplus} = 0 , \qquad v = \sqrt{\frac{2 G M_\oplus}{R_\oplus}} \approx 11.2\ \text{km s}^{-1} . \tag{3.27} $$
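The energy balance above gives v = √(2GM/R), which is simple to evaluate for any body. The sketch below works in SI units; the Earth’s mass and radius are assumed round values, and the function name is illustrative.

```python
import math

G_SI = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
EARTH_MASS_KG = 5.972e24  # assumed value
EARTH_RADIUS_M = 6.371e6  # assumed mean radius


def escape_velocity(mass_kg: float, radius_m: float) -> float:
    """Escape speed (m/s) from distance radius_m of a body of mass mass_kg."""
    return math.sqrt(2.0 * G_SI * mass_kg / radius_m)


v_esc = escape_velocity(EARTH_MASS_KG, EARTH_RADIUS_M)
print(f"{v_esc / 1000:.1f} km/s")
```

The same function can be applied to the exercise below by supplying an enclosed mass and a starting distance for the solar system or the Milky Way.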
Exercise: What are the escape velocities of the solar system and the Milky Way?
As a warm-up for the next section, we recall uniform circular motion, in which a particle revolves around a center with constant speed v at constant distance r from the center. Hence the angular speed of the particle, ω = v/r, is constant. The period T is given by
$$ T = \frac{2\pi}{\omega} . \tag{3.28} $$
The acceleration is given by
$$ a = \frac{v^2}{r} = \omega^2 r . \tag{3.29} $$
The particle needs a centripetal force to keep it in uniform circular motion. If the force is provided by the gravitational force of an object with mass M at the center, then
$$ G \frac{M m}{r^2} = m a = m \omega^2 r , \tag{3.30} $$
which implies
$$ T^2 = \frac{4\pi^2}{GM}\, r^3 . \tag{3.31} $$
This is a special case of Kepler’s third law.
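As a quick check of the circular-orbit relation T² = 4π²r³/(GM), the sketch below computes the period of a satellite at the geostationary radius, which should come out close to one sidereal day. The constants and the orbital radius are assumed values, and the function name is illustrative.

```python
import math

G_SI = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
EARTH_MASS_KG = 5.972e24  # assumed value


def circular_period(radius_m: float, central_mass_kg: float) -> float:
    """Period (s) of a circular orbit of radius radius_m around central_mass_kg."""
    return 2.0 * math.pi * math.sqrt(radius_m ** 3 / (G_SI * central_mass_kg))


# Assumed geostationary orbital radius of about 42,164 km.
period_s = circular_period(4.2164e7, EARTH_MASS_KG)
print(f"{period_s / 3600:.2f} h")
```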
Exercise: Assuming no air resistance, how fast does a bullet have to travel along the surface of the Earth so that it keeps going around in circular motion? Compare the value with the escape velocity.
3.2.1 Roche Lobe
We now study the gravitational potential around a binary star system. Most stars are in binary systems, and mass exchange can occur between the two stars. The Roche lobe is the equipotential surface that just encloses the two stars. The Roche lobe hugs the larger star tighter. The point where the two lobes of this surface meet is called the Lagrangian point. Note that this is not at the same location as the center of mass: it is closer to the lighter star than to the heavier star. If matter flows out from one star of the binary, for example when it enters the red giant phase, the matter first fills up the Roche lobe and is then channeled to the companion star via the Lagrangian point.
Figure 3.4: The Roche lobe of a pair of stars, separated by distance d, with the Lagrangian point at distance l from the first star. Which star is more massive?
Suppose the two stars have masses m1, at the position with coordinates (0, 0, 0), and m2, at (d, 0, 0). By symmetry, the Lagrangian point must lie on the x-axis. Let its coordinates be (l, 0, 0). At the Lagrangian point, the gravitational forces due to the two stars are equal, so we have
$$ \frac{G m_1}{l^2} = \frac{G m_2}{(d - l)^2} . \tag{3.32} $$
Solving for l,
$$ l = \frac{d}{1 + \sqrt{m_2 / m_1}} . \tag{3.33} $$
Notice that the Lagrangian point is between the two stars and is nearer to the lighter one. This result is, in fact, not exactly correct. See Section 3.5 for a more detailed analysis of Lagrangian points.
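The force-balance estimate above, Gm1/l² = Gm2/(d − l)², has the closed-form solution l = d/(1 + √(m2/m1)). A small sketch makes the dependence on the mass ratio explicit; the function name is illustrative, and, as the text notes, this simple balance ignores the orbital motion of the binary.

```python
import math


def force_balance_point(m1: float, m2: float, d: float) -> float:
    """Distance l from m1 (at the origin) where the pulls of m1 and m2 cancel."""
    if m1 <= 0 or m2 <= 0 or d <= 0:
        raise ValueError("masses and separation must be positive")
    return d / (1.0 + math.sqrt(m2 / m1))


# Equal masses: the point is midway. A heavier m1 pushes it toward m2.
print(force_balance_point(1.0, 1.0, 2.0))
print(force_balance_point(4.0, 1.0, 3.0))
```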
3.2.2 Critical Density of the Universe
We can estimate the critical density of the universe with simple Newtonian gravity, although we need general relativity to derive it rigorously. Let the average mass density of the universe be ρ. Suppose a galaxy is at a distance r from us. Then, the total mass inside the sphere of radius r is
$$ M = \frac{4}{3} \pi r^3 \rho . \tag{3.34} $$
If the mass of the galaxy is m, the potential energy of the galaxy due to the mass in the sphere is given by
$$ U = -\frac{G M m}{r} = -\frac{4}{3} \pi G \rho\, m r^2 . \tag{3.35} $$
Assume that the velocity of the galaxy is radial, and that its speed is given by Hubble’s law: the speed of a galaxy is proportional to its distance from us, i.e. v = H0 r, where H0 is the Hubble constant. The kinetic energy T of the galaxy is
$$ T = \frac{1}{2} m v^2 = \frac{1}{2} m H_0^2 r^2 \tag{3.36} $$
and the total energy is
$$ E = T + U = \frac{1}{2} m r^2 \left( H_0^2 - \frac{8}{3} \pi G \rho \right) . \tag{3.37} $$
If E < 0, the galaxy is bound, which means that the galaxy is not energetic enough to escape from the gravitational pull of the other mass. If E > 0, the galaxy is unbound and will fly away. Therefore, the critical density ρc, for which the galaxy is just bound, follows from
$$ E = 0 = \frac{1}{2} m r^2 \left( H_0^2 - \frac{8}{3} \pi G \rho_c \right) , \qquad \rho_c = \frac{3 H_0^2}{8 \pi G} . \tag{3.38} $$
If we take the value of the Hubble constant as H0 = 70 km s−1 Mpc−1, ρc = 9.2 × 10−30 g/cm3. This is about the mass of five hydrogen atoms per cubic metre, but the average density observed from stars, gas, etc. (excluding dark matter and dark energy) is found to be only 0.2 hydrogen atoms per cubic metre.
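The Newtonian estimate ρc = 3H0²/(8πG) can be evaluated in cgs units to match the value of G quoted in Eq. (3.19). The following is a sketch; the megaparsec-to-centimetre conversion and the hydrogen-atom mass are assumed standard values, and the names are illustrative.

```python
import math

G_CGS = 6.674e-8               # gravitational constant, cm^3 g^-1 s^-2
MPC_IN_CM = 3.0857e24          # assumed: one megaparsec in centimetres
HYDROGEN_MASS_G = 1.6735e-24   # assumed: mass of a hydrogen atom in grams


def critical_density(h0_km_s_mpc: float) -> float:
    """Critical density in g/cm^3 for a Hubble constant given in km/s/Mpc."""
    h0_per_s = h0_km_s_mpc * 1.0e5 / MPC_IN_CM  # convert to s^-1
    return 3.0 * h0_per_s ** 2 / (8.0 * math.pi * G_CGS)


rho_c = critical_density(70.0)
atoms_per_m3 = rho_c * 1.0e6 / HYDROGEN_MASS_G  # 1 m^3 = 1e6 cm^3
print(f"rho_c = {rho_c:.1e} g/cm^3, about {atoms_per_m3:.1f} H atoms per m^3")
```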
3.2.3 Virial Theorem
This theorem states that for a gravitationally bound system in equilibrium, the
time averaged total energy is one-half of the time averaged potential energy
$$ \langle E \rangle = \frac{1}{2} \langle U \rangle . \tag{3.39} $$
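The proof goes as follows. For a system of particles, let p_i and r_i be the linear momentum and position of the i-th particle at some time t. Consider the quantity
$$ Q \equiv \sum_i \mathbf{p}_i \cdot \mathbf{r}_i . \tag{3.40} $$
We note that
$$ \frac{dQ}{dt} = \frac{d}{dt} \sum_i m_i \frac{d\mathbf{r}_i}{dt} \cdot \mathbf{r}_i = \frac{1}{2} \sum_i \frac{d^2}{dt^2} \left( m_i r_i^2 \right) = \frac{1}{2} \frac{d^2 I}{dt^2} , \tag{3.41} $$
where I = Σ_i m_i r_i² is the moment of inertia of the system. In equilibrium, we expect that I should stay roughly the same. Hence, we assume that the time average of its derivative is zero,
$$ \left\langle \frac{dQ}{dt} \right\rangle = 0 . \tag{3.42} $$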
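The time derivative of Q is given by
$$ \frac{dQ}{dt} = \sum_i \left( \frac{d\mathbf{p}_i}{dt} \cdot \mathbf{r}_i + \mathbf{p}_i \cdot \frac{d\mathbf{r}_i}{dt} \right) . \tag{3.43} $$
We recognize the second term as twice the kinetic energy,
$$ \sum_i \mathbf{p}_i \cdot \frac{d\mathbf{r}_i}{dt} = \sum_i \frac{1}{m_i} p_i^2 = 2T . \tag{3.44} $$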
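By Newton’s second law,
$$ \frac{d\mathbf{p}_i}{dt} = \sum_{j \ne i} \frac{G m_i m_j}{|\mathbf{r}_j - \mathbf{r}_i|^3} (\mathbf{r}_j - \mathbf{r}_i) \equiv \sum_{j \ne i} \mathbf{F}_{ij} . \tag{3.45} $$
Here, we sum only over j. Noticing that F_ij = −F_ji, we have
$$ \begin{aligned}
\sum_i \frac{d\mathbf{p}_i}{dt} \cdot \mathbf{r}_i
&= \sum_i \Big( \sum_{j \ne i} \mathbf{F}_{ij} \Big) \cdot \mathbf{r}_i \\
&= \frac{1}{2} \Big( \sum_i \sum_{j \ne i} \mathbf{F}_{ij} \cdot \mathbf{r}_i + \sum_j \sum_{i \ne j} \mathbf{F}_{ji} \cdot \mathbf{r}_j \Big) \\
&= \frac{1}{2} \Big( \sum_i \sum_{j \ne i} \mathbf{F}_{ij} \cdot \mathbf{r}_i + \sum_i \sum_{j \ne i} \mathbf{F}_{ji} \cdot \mathbf{r}_j \Big) \\
&= \frac{1}{2} \sum_i \sum_{j \ne i} \left( \mathbf{F}_{ij} \cdot \mathbf{r}_i + \mathbf{F}_{ji} \cdot \mathbf{r}_j \right) \\
&= \frac{1}{2} \sum_i \sum_{j \ne i} \left( \mathbf{F}_{ij} \cdot \mathbf{r}_i - \mathbf{F}_{ij} \cdot \mathbf{r}_j \right) \\
&= \frac{1}{2} \sum_i \sum_{j \ne i} \mathbf{F}_{ij} \cdot (\mathbf{r}_i - \mathbf{r}_j) \\
&= -\frac{1}{2} \sum_i \sum_{j \ne i} \frac{G m_i m_j}{|\mathbf{r}_j - \mathbf{r}_i|} \\
&= U .
\end{aligned} \tag{3.46} $$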
We need the factor 1/2 to compensate for the double counting of the number of
pairs of particles. In summary, U + 2T = dQ/dt and after taking time average,
⟨U ⟩ + 2 ⟨T ⟩ = 0. Total energy is E = T + U and we have Eq. (3.39).
Stars form in nebulae. At the beginning, the density of gas and dust in a nebula is very low, and the particles are moving slowly. Hence, the total energy is roughly zero. On the other hand, a star is gravitationally bound. Its potential energy is non-zero and negative. By the virial theorem, its total energy is also negative. This implies that energy must be transferred out for a nebula to form a star. Physically, this happens via thermal radiation during star formation.
Another interesting consequence of the virial theorem is that as a star loses energy, the total energy becomes more negative. Since ⟨T⟩ = −⟨U⟩/2, the K.E. of the particles actually increases! The net result is that the star gets hotter while losing energy, somewhat like having a “negative heat capacity”.
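The simplest system on which to check the virial relation 2T + U = 0 is a light body on a circular orbit, where v² = GM/r holds at every instant, so no time averaging is needed. The sketch below uses assumed Sun and Earth values and an illustrative function name; it also shows the “negative heat capacity” behaviour, since a tighter orbit has lower total energy but higher kinetic energy.

```python
G_SI = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2


def circular_orbit_energies(central_mass, orbiting_mass, radius):
    """Return (T, U) for a light body on a circular orbit around a heavy one."""
    speed_squared = G_SI * central_mass / radius   # from G M m / r^2 = m v^2 / r
    kinetic = 0.5 * orbiting_mass * speed_squared
    potential = -G_SI * central_mass * orbiting_mass / radius
    return kinetic, potential


# Assumed values: Sun 1.989e30 kg, Earth 5.972e24 kg, orbital radius 1.496e11 m.
T_wide, U_wide = circular_orbit_energies(1.989e30, 5.972e24, 1.496e11)
T_tight, U_tight = circular_orbit_energies(1.989e30, 5.972e24, 0.5 * 1.496e11)

print((T_wide + U_wide) / U_wide)   # close to 0.5, i.e. E = U / 2
print(T_tight > T_wide, T_tight + U_tight < T_wide + U_wide)
```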
3.3 Two-body Problem
In this section, we talk about the motion of two bodies under their mutual gravitational attraction. Why study this? It can tell us different orbits of planets and
comets around the Sun, e.g. circle, ellipse, parabola, and hyperbola. After a brief
review on Kepler’s laws, we will derive the general solution to the two-body problem
using Newtonian mechanics, and apply it to prove Kepler’s laws.
3.3.1 Kepler’s Laws of Planetary Motion
• First Law: A planet orbits the Sun in an ellipse, with the Sun at one focus of
the ellipse.
• Second Law: A line connecting a planet to the sun sweeps out equal areas in
equal time intervals.
• Third Law: The orbital period P of a planet and the semi-major axis of its
orbit a are related by P 2 ∝ a3 .
3.3.2 Orbits in Two-body Problem
Our goal here is to derive the motion of objects in two-body problem using Newton’s
gravitational force equation. We will employ transformations (r1 , r2 → r and
m1 , m2 → µ) to solve Newton’s 2nd law.
Let us first assume that the force between the two bodies depends only on their relative position: the force F acting on the first body by the second is a function of r1 − r2, where r1 and r2 are the positions of the two bodies. Then, if the masses of the two are m1 and m2, their equations of motion are
$$ m_1 \dot{\mathbf{v}}_1 = \mathbf{F}(\mathbf{r}_1 - \mathbf{r}_2) \tag{3.47} $$
$$ m_2 \dot{\mathbf{v}}_2 = -\mathbf{F}(\mathbf{r}_1 - \mathbf{r}_2) \tag{3.48} $$
where v_i = dr_i/dt are their velocities. Let
$$ \mathbf{R} = \frac{m_1 \mathbf{r}_1 + m_2 \mathbf{r}_2}{m_1 + m_2} \tag{3.49} $$
be the position of the center of mass, and r = r1 − r2 be the relative position. Then
$$ \mathbf{r}_1 = \mathbf{R} + \frac{m_2}{m_1 + m_2}\, \mathbf{r} \tag{3.50} $$
$$ \mathbf{r}_2 = \mathbf{R} - \frac{m_1}{m_1 + m_2}\, \mathbf{r} . \tag{3.51} $$
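Substituting the above equations into the sum of Eq. (3.47) and Eq. (3.48), we have
$$ (m_1 + m_2) \ddot{\mathbf{R}} = 0 , \tag{3.52} $$
which just means that the center of mass of the system moves in a straight line with constant velocity.
If we take m2 times Eq. (3.47) minus m1 times Eq. (3.48), we have
$$ \begin{aligned}
m_1 m_2 \dot{\mathbf{v}}_1 - m_1 m_2 \dot{\mathbf{v}}_2 &= (m_1 + m_2)\, \mathbf{F}(\mathbf{r}) \\
\frac{m_1^2 m_2}{m_1 + m_2}\, \ddot{\mathbf{r}} + \frac{m_1 m_2^2}{m_1 + m_2}\, \ddot{\mathbf{r}} &= (m_1 + m_2)\, \mathbf{F}(\mathbf{r}) \\
\frac{m_1 m_2}{m_1 + m_2}\, \ddot{\mathbf{r}} &= \mathbf{F}(\mathbf{r}) .
\end{aligned} \tag{3.53} $$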
If we define µ ≡ m1 m2/(m1 + m2), then Eq. (3.53) reduces to the one-body problem F(r) = µr̈. Therefore, µ is called the reduced mass. If, for example, m2 ≫ m1, r is the position of the first body relative to the second and the reduced mass is approximately m1. This is the case for planets orbiting the Sun in our solar system.
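A two-line function shows how quickly the reduced mass µ = m1 m2/(m1 + m2) approaches the lighter mass when the mass ratio is large. The masses below are assumed values for the Earth and the Sun, and the function name is illustrative.

```python
def reduced_mass(m1: float, m2: float) -> float:
    """Reduced mass mu = m1 * m2 / (m1 + m2) of a two-body system."""
    return m1 * m2 / (m1 + m2)


EARTH_MASS_KG = 5.972e24  # assumed value
SUN_MASS_KG = 1.989e30    # assumed value

print(reduced_mass(1.0, 1.0))  # equal masses: half of either mass
print(reduced_mass(EARTH_MASS_KG, SUN_MASS_KG) / EARTH_MASS_KG)  # very close to 1
```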
We now assume that F is a central force, which means that F is directed along r and its magnitude is a function of the distance only, F = F(r)r̂. Hence, the torque is zero and angular momentum is conserved. If the angular momentum is denoted by L, it is a constant vector. Since r · L = 0, r always lies in the plane perpendicular to L. Hence, the bodies move in a plane.
On the plane, we usually label the position of a point by its Cartesian coordinates (x, y). Here, we also need the polar coordinate system (r, θ) (Fig. 3.5). Their relations are
$$ \begin{cases} x = r\cos\theta \\ y = r\sin\theta \end{cases} \qquad \begin{cases} r = \sqrt{x^2 + y^2} \\ \theta = \tan^{-1}(y/x) \end{cases} . \tag{3.54} $$
Figure 3.5: A particle moving on a plane with polar coordinates.
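The unit vectors r̂ and θ̂ are illustrated in Fig. 3.5. In terms of the unit vectors x̂ and ŷ along the x- and y-axes, they are
$$ \begin{cases} \hat{\mathbf{r}} = \cos\theta\, \hat{\mathbf{x}} + \sin\theta\, \hat{\mathbf{y}} \\ \hat{\boldsymbol{\theta}} = -\sin\theta\, \hat{\mathbf{x}} + \cos\theta\, \hat{\mathbf{y}} \end{cases} . \tag{3.55} $$
For Newton’s gravitation, from Eq. (3.53),
$$ \mu \ddot{\mathbf{r}} = -G \frac{m_1 m_2}{r^2}\, \hat{\mathbf{r}} \equiv -G \frac{M \mu}{r^2}\, \hat{\mathbf{r}} , \tag{3.56} $$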
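where M ≡ m1 + m2 is the total mass. We would like to express the vector r̈ in terms of the derivatives of r and θ. Since x̂ and ŷ are constant vectors, they do not change with time (but r̂ and θ̂ do change). We have
$$ \begin{cases} \dot{\hat{\mathbf{r}}} = -\sin\theta\, \dot\theta\, \hat{\mathbf{x}} + \cos\theta\, \dot\theta\, \hat{\mathbf{y}} = \dot\theta\, \hat{\boldsymbol{\theta}} \\ \dot{\hat{\boldsymbol{\theta}}} = -\cos\theta\, \dot\theta\, \hat{\mathbf{x}} - \sin\theta\, \dot\theta\, \hat{\mathbf{y}} = -\dot\theta\, \hat{\mathbf{r}} \end{cases} . \tag{3.57} $$
We can now calculate r̈. Starting from r = r r̂,
$$ \dot{\mathbf{r}} = \dot r\, \hat{\mathbf{r}} + r\, \dot{\hat{\mathbf{r}}} = \dot r\, \hat{\mathbf{r}} + r \dot\theta\, \hat{\boldsymbol{\theta}} \tag{3.58} $$
$$ \begin{aligned}
\ddot{\mathbf{r}} &= \ddot r\, \hat{\mathbf{r}} + \dot r\, \dot{\hat{\mathbf{r}}} + \dot r \dot\theta\, \hat{\boldsymbol{\theta}} + r \ddot\theta\, \hat{\boldsymbol{\theta}} + r \dot\theta\, \dot{\hat{\boldsymbol{\theta}}} \\
&= \ddot r\, \hat{\mathbf{r}} + 2 \dot r \dot\theta\, \hat{\boldsymbol{\theta}} + r \ddot\theta\, \hat{\boldsymbol{\theta}} - r \dot\theta^2\, \hat{\mathbf{r}} \\
&= (\ddot r - r \dot\theta^2)\, \hat{\mathbf{r}} + (2 \dot r \dot\theta + r \ddot\theta)\, \hat{\boldsymbol{\theta}} .
\end{aligned} \tag{3.59} $$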
Comparing with Eq. (3.56) and separating into radial and azimuthal parts, we have
$$ \begin{cases} \ddot r - r \dot\theta^2 = -\dfrac{GM}{r^2} \\ 2 \dot r \dot\theta + r \ddot\theta = 0 \end{cases} . \tag{3.60} $$
These are the equations of motion. We are interested only in the equation of the orbit (not in the time dependence), that is, the dependence of r on θ. Notice that, for the azimuthal part,
$$ \frac{d}{dt} (r^2 \dot\theta) = 2 r \dot r \dot\theta + r^2 \ddot\theta = r (2 \dot r \dot\theta + r \ddot\theta) = 0 \tag{3.61} $$
by the second equation of Eq. (3.60). Thus, L ≡ µr²θ̇ is a constant of motion. It is in fact the angular momentum. We can rewrite the radial part of Eq. (3.60) as
$$ \ddot r = -\frac{GM}{r^2} + \frac{L^2}{\mu^2 r^3} , \quad \text{or} \quad \mu \ddot r = -\frac{GM\mu}{r^2} + \frac{L^2}{\mu r^3} . \tag{3.62} $$
This is the modified force law. It has a form similar to the law of gravitation, with one modification: the effective force has an extra, repulsive term representing the centrifugal effect of the angular motion. It can be used to derive the effective potential energy of the system,
$$ U_{\text{eff}} = -\frac{GM\mu}{r} + \frac{L^2}{2 \mu r^2} . \tag{3.63} $$
Although this is not a real potential energy but just a mathematical form, it helps us visualize the orbits of a particle with a given energy (Fig. 3.6).
Figure 3.6: Effective potential.
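As L = µr²θ̇, we can write ṙ in terms of L by
$$ \dot r = \frac{dr}{dt} = \frac{dr}{d\theta} \frac{d\theta}{dt} = \frac{L}{\mu r^2} \frac{dr}{d\theta} . $$
Therefore, the radial part, i.e. the first equation of Eq. (3.60), becomes
$$ \frac{L}{\mu r^2} \frac{d}{d\theta} \left( \frac{L}{\mu r^2} \frac{dr}{d\theta} \right) - r \left( \frac{L}{\mu r^2} \right)^2 = -\frac{GM}{r^2} \tag{3.64} $$
$$ \frac{d}{d\theta} \left( \frac{1}{r^2} \frac{dr}{d\theta} \right) - \frac{1}{r} = -\frac{GM\mu^2}{L^2} . \tag{3.65} $$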
Let u = 1/r, so that dr/dθ = −(1/u²) du/dθ. The above equation gives
$$ \frac{d^2 u}{d\theta^2} + u = \frac{GM\mu^2}{L^2} , \tag{3.66} $$
for which the general solution is
$$ \frac{1}{r} = u = \frac{GM\mu^2}{L^2} \left[ 1 + e \cos(\theta - \theta') \right] \tag{3.67} $$
where e and θ′ are two constants of integration. θ′ just tells us how the orbit is oriented relative to our coordinate system. We can set it to 0 or π such that e ≥ 0. e is an important parameter of the orbit, called the eccentricity.
If e = 0, the particle stays at a constant distance from the center: the orbit is a circle. More generally, for e < 1, the orbit is an ellipse. It is closed and bounded. The particle revolves periodically around the center of mass, which is called the focus of the orbit.
If e = 1, the orbit is a parabola. If e > 1, the orbit is a hyperbola. Both the parabola and the hyperbola are open orbits: the particle comes near the focus once and then goes away. This is the case for some comets, which enter the inner solar system, fly by the Sun only once, and never come back. For a particle on a parabolic orbit, the kinetic energy just balances the potential energy, i.e. the total energy is zero; as it gets farther away from the focus, its speed decreases and tends to zero. A particle on a hyperbolic orbit has a non-zero speed at infinity.
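The orbit equation 1/r = (GMµ²/L²)[1 + e cos(θ − θ′)] and the classification by eccentricity translate directly into code. In the sketch below the combination L²/(GMµ²) is passed in as a single length `p` (a name introduced here for convenience, not used in the notes), and the function names are illustrative.

```python
import math


def classify_orbit(e: float) -> str:
    """Name of the conic section for eccentricity e >= 0."""
    if e < 0:
        raise ValueError("eccentricity must be non-negative")
    if e == 0:
        return "circle"
    if e < 1:
        return "ellipse"
    if e == 1:
        return "parabola"
    return "hyperbola"


def orbit_radius(theta: float, p: float, e: float, theta0: float = 0.0) -> float:
    """r(theta) from 1/r = (1/p) * [1 + e*cos(theta - theta0)], p = L^2/(G*M*mu^2)."""
    denominator = 1.0 + e * math.cos(theta - theta0)
    if denominator <= 0.0:
        return math.inf  # direction never reached on an open orbit
    return p / denominator


print(classify_orbit(0.5), orbit_radius(0.0, 1.0, 0.5), orbit_radius(math.pi, 1.0, 0.5))
print(classify_orbit(2.0), orbit_radius(math.pi, 1.0, 2.0))
```

For the ellipse with e = 0.5 the two printed radii are the nearest and farthest distances from the focus; for the hyperbola the direction θ = π is never reached, so the function returns infinity.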
Figure 3.7: Left: three kinds of orbits (ellipse, parabola and hyperbola) around the focus. Right: conic sections.
All these shapes are called the conic sections (Fig. 3.7). The point on the orbit nearest to the focus is called the perihelion (the farthest point is called the aphelion). The distance between the perihelion and the focus is given by
$$ r_p = \frac{L^2}{GM\mu^2} \frac{1}{1 + e} . \tag{3.68} $$
This occurs when the right-hand side of Eq. (3.67) is the largest, that is, when θ = θ′.
Exercise: Show that the total energy of a two-body system is: (a) minimum for a circular orbit, (b) < 0 for an elliptical orbit, (c) = 0 for a parabolic orbit, and (d) > 0 for a hyperbolic orbit.
For closed orbits, we can calculate the period. We can put θ′ = 0 in Eq. (3.67)
because the period does not depend on it. Then, the equation of orbit is
$$ \frac{1}{r} = u = \frac{GM\mu^2}{L^2} (1 + e \cos\theta) . \tag{3.69} $$
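We would like to calculate the area of the orbit. First, transforming to the Cartesian coordinate system, we have r cos θ = x and r + ex = L²/(GMµ²). Hence,
$$ \begin{aligned}
r &= \frac{L^2}{GM\mu^2} - e x \\
x^2 + y^2 &= \left( \frac{L^2}{GM\mu^2} \right)^2 - \frac{2 e L^2}{GM\mu^2}\, x + e^2 x^2 \\
y^2 + (1 - e^2) x^2 + \frac{2 e L^2}{GM\mu^2}\, x &= \left( \frac{L^2}{GM\mu^2} \right)^2 \\
y^2 + (1 - e^2) \left[ x + \frac{e L^2}{GM\mu^2 (1 - e^2)} \right]^2 &= \frac{1}{1 - e^2} \left( \frac{L^2}{GM\mu^2} \right)^2 .
\end{aligned} \tag{3.70} $$
This is the equation of an ellipse with a shifted center. Its area equals the area of the following ellipse:
$$ y^2 + (1 - e^2) x^2 = \frac{1}{1 - e^2} \left( \frac{L^2}{GM\mu^2} \right)^2 . \tag{3.71} $$
Let $B^2 = \frac{1}{1 - e^2} \left( \frac{L^2}{GM\mu^2} \right)^2$ and $w = x \sqrt{1 - e^2} / B$. The area is
$$ \begin{aligned}
A &= 2 \int_{-B/\sqrt{1 - e^2}}^{B/\sqrt{1 - e^2}} \sqrt{B^2 - (1 - e^2) x^2}\; dx \\
&= \frac{2 B^2}{\sqrt{1 - e^2}} \int_{-1}^{1} \sqrt{1 - w^2}\; dw \\
&= \frac{\pi B^2}{\sqrt{1 - e^2}} \\
&= \frac{\pi}{(1 - e^2)^{3/2}} \left( \frac{L^2}{GM\mu^2} \right)^2 .
\end{aligned} \tag{3.72} $$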
3.3.3 Proof of Kepler’s Laws
First law: As discussed above, in a closed orbit, a planet moves around the center of mass of the system, which is the focus of the orbit. If the star is much more massive than the planet, it will essentially sit at the focus.
Exercise: How much does the Sun move due to the Earth? Due to Jupiter?
Second law: The area element in polar coordinates (Fig. 3.8) is given by
$$ dA = dr\, (r\, d\theta) . \tag{3.73} $$
Integrating over r, the area swept out by a line joining the focus to a point on the ellipse, as θ changes by dθ, is
$$ dA = \frac{1}{2} r^2\, d\theta . \tag{3.74} $$
Then
$$ \frac{dA}{dt} = \frac{1}{2} r^2 \frac{d\theta}{dt} = \frac{1}{2} r^2 \dot\theta = \frac{1}{2} \frac{L}{\mu} . \tag{3.75} $$
Figure 3.8: Area in polar coordinates.
This is a constant because the angular momentum L is a constant. To calculate the period P, it follows from Eqs. (3.72) and (3.75) that
$$ P = \frac{2 \mu A}{L} = \frac{2 \pi L^3}{G^2 M^2 \mu^3 (1 - e^2)^{3/2}} . \tag{3.76} $$
Third law: The semi-major axis a is the average of the perihelion and aphelion distances. From Eq. (3.68) and a similar expression for the aphelion,
$$ a = \frac{L^2}{2 GM\mu^2} \left( \frac{1}{1 + e} + \frac{1}{1 - e} \right) = \frac{L^2}{GM\mu^2} \frac{1}{1 - e^2} . \tag{3.77} $$
Comparing Eqs. (3.76) and (3.77), we have
$$ P^2 = \frac{4 \pi^2}{GM}\, a^3 . \tag{3.78} $$
This is the enhanced version of Kepler’s third law. We have derived the constant of proportionality from Newton’s law. This provides a very powerful tool to estimate the masses of celestial objects, from moons to planets to stars to galaxies.
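As an illustration of this mass-estimation tool, the relation P² = 4π²a³/(GM) can be inverted to give the total mass M = 4π²a³/(GP²). The sketch below applies it to the Earth’s orbit, using assumed values of the astronomical unit and the length of the year; the function name is illustrative.

```python
import math

G_SI = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2


def total_mass_from_orbit(semi_major_axis_m: float, period_s: float) -> float:
    """Total mass M = m1 + m2 (kg) from M = 4*pi^2*a^3 / (G*P^2)."""
    return 4.0 * math.pi ** 2 * semi_major_axis_m ** 3 / (G_SI * period_s ** 2)


AU_M = 1.496e11    # assumed: Earth's semi-major axis in metres
YEAR_S = 3.156e7   # assumed: Earth's orbital period in seconds

solar_mass = total_mass_from_orbit(AU_M, YEAR_S)
print(f"{solar_mass:.2e} kg")
```

Strictly speaking, the result is the combined mass of the Sun and the Earth, but the Earth contributes only a few parts per million.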
Exercise: From Eq. (3.76), what is the relation between velocity and distance?
Figure 3.9: Solid line: observed rotation curve of a galaxy. Dotted line: prediction
from Kepler’s third law.
In a galaxy, stars are concentrated at the center; therefore, the rotation curve is expected to follow Kepler’s third law except at the very center. However, the observed rotation curve is much flatter (see Figure 3.9). The discrepancy is attributed to dark matter. For the Milky Way, it is estimated that 90% of the mass is dark matter, while ordinary matter makes up only 10%.
3.4
Impact Parameter and Scattering Angle
We now study the two body problem from another point of view. Consider the case
that a light body approaches a very heavy body from far away, for example, a
satellite approaching a planet. (Does it sound familiar?) Results of the last section
tell us that the reduced mass is very close to the mass of the light body and the
trajectory is a hyperbola.
Let the mass of the heavy body be M , the mass of the light body be m. We consider
M ≫ m such that µ ≈ m. When the light body is far away from the heavy body,
let its incident velocity be v and the impact parameter, b, be the perpendicular
distance between the heavy body and the incident velocity, Fig. 3.10. We would
like to determine the scattering angle, Θ.
Figure 3.10: Impact parameter of a two body system.
When the light body is far away, the angular momentum L is given by L = mbv,
which is a constant of motion. At perihelion, let its velocity be $v_p$; it is
perpendicular to the line joining the two bodies. Hence, $v_p = r\dot\theta$ at
perihelion. (At other points on the trajectory, the velocity has a radial component,
see Eq. 3.58.) As $L = mr^2\dot\theta$, we have
$$ mbv = L = m r^2 \dot\theta = m r_p (r_p \dot\theta) = m r_p v_p \tag{3.79} $$
and by Eq. (3.68),
$$ v_p = \frac{bv}{r_p} = bv\,\frac{GM(1+e)}{(bv)^2} = \frac{GM(1+e)}{bv} . \tag{3.80} $$
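By conservation of energy,
$$
\begin{aligned}
\frac{1}{2} m v^2 &= \frac{1}{2} m v_p^2 - \frac{GMm}{r_p} \\
v^2 &= \left(\frac{GM(1+e)}{bv}\right)^2 - 2GM\,\frac{GM(1+e)}{(bv)^2} \\
v^2 &= \frac{G^2M^2(1+e)(e-1)}{b^2v^2} \\
\frac{b^2v^4}{G^2M^2} &= e^2 - 1 \\
e &= \sqrt{1 + \frac{b^2v^4}{G^2M^2}} .
\end{aligned}
\tag{3.81}
$$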
We see from this equation that e > 1, and hence we have a mathematical proof
that the trajectory is a hyperbola. For simplicity, let’s define our coordinate system
such that θ′ = 0 in Eq. (3.67) when $r = r_p$. Then from Eq. (3.68), the light body
will go infinitely far away at the angles
$$ 1 + e\cos\theta = 0 , \qquad \cos\theta = -\frac{1}{e} . \tag{3.82} $$
Negative cos θ means that θ > π/2. We can also write
$$ \theta = \pi - \cos^{-1}\frac{1}{e} . \tag{3.83} $$
Finally, from the geometry in Fig. 3.10, the scattering angle Θ is given by
θ + (θ − Θ) = π. Hence,
$$
\begin{aligned}
\Theta &= 2\theta - \pi \\
       &= \pi - 2\cos^{-1}\frac{1}{e} \\
       &= \pi - 2\cos^{-1}\left(1 + \frac{b^2v^4}{G^2M^2}\right)^{-1/2} .
\end{aligned}
\tag{3.84}
$$
Let us check the limit of a very large impact parameter, in which the light body
should not be affected much by the heavy body, so we expect the scattering angle
to be small. If b is large, e is large, $\cos^{-1}(1/e)$ is close to π/2, and Θ is small.
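The limit can also be checked numerically. The following is a minimal sketch of Eqs. (3.81) and (3.84); the planet mass, incident speed and impact parameters describe a hypothetical flyby chosen only for illustration, and the function names are not from the notes.

```python
import math

G = 6.674e-11  # gravitational constant in SI units (assumed reference value)

def eccentricity(b, v, M):
    """Eq. (3.81): e = sqrt(1 + b^2 v^4 / (G^2 M^2))."""
    return math.sqrt(1.0 + (b * v**2 / (G * M))**2)

def scattering_angle(b, v, M):
    """Eq. (3.84): Theta = pi - 2 arccos(1/e), in radians."""
    return math.pi - 2.0 * math.acos(1.0 / eccentricity(b, v, M))

# Hypothetical flyby: a probe passing a Jupiter-mass planet at 10 km/s.
M_PLANET = 1.898e27   # kg (assumed)
V_INF = 1.0e4         # m/s (assumed)
IMPACT_PARAMETERS = (1.0e9, 1.0e10, 1.0e11)   # m (assumed)

angles = [scattering_angle(b, V_INF, M_PLANET) for b in IMPACT_PARAMETERS]
for b, theta in zip(IMPACT_PARAMETERS, angles):
    print(f"b = {b:.1e} m  ->  Theta = {math.degrees(theta):.2f} deg")
```

Combining Eqs. (3.81) and (3.84) gives the equivalent form tan(Θ/2) = GM/(bv²), which makes the dependence explicit: the deflection shrinks as either b or v grows.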
This calculation also sheds some light on the problem of gravity assist. Since in
this section we have assumed that the heavy body does not move, we are essentially in
the center-of-mass frame. We have mentioned in Section 3.1 that the final speed of
the spacecraft depends on the scattering angle. We now know that we can control
the scattering angle by adjusting the impact parameter and the incident velocity.
3.5
Restricted Three-body Problem
In general, the three-body problem is very difficult. It is sometimes chaotic and does
not have a general analytical solution. In this section, we will discuss a special case
and say some more about Lagrangian points.
Figure 3.11: The positions of the three bodies.
We assume that two point masses $m_1$ and $m_2$ revolve around their center of mass
in circular motion. The origin of the coordinates is chosen to be at the center of
mass, and at time t = 0, $m_1$ is on the positive X-axis. If a and b are the distances of
the masses from the origin, Fig. 3.11, then $a/b = m_1/m_2$. The angular speed of the
point masses is n, where $n^2 = G(m_1 + m_2)/(a + b)^3$. Let c ≡ cos nt and s ≡ sin nt.
Then, the positions of $m_1$ and $m_2$ are respectively (cb, sb) and (−ca, −sa).
A third body is assumed to always lie in the orbital plane defined by the two
masses. We also assume that it is very light (i.e. a test particle) and that it does not
influence the motion of the other two. This is the case, for example, if the two
masses are the Sun and Jupiter, while the third body is an asteroid.
If the position of the third body is (X, Y), then $\mathbf{R}_1$ and $\mathbf{R}_2$ defined by the figure
are
$$ \mathbf{R}_1 = (X, Y) - (cb, sb) = (X - cb,\; Y - sb) \tag{3.85} $$
$$ \mathbf{R}_2 = (X, Y) - (-ca, -sa) = (X + ca,\; Y + sa) . \tag{3.86} $$
The equations of motion are
$$ \ddot X = -\frac{Gm_1}{R_1^3}(X - cb) - \frac{Gm_2}{R_2^3}(X + ca) \tag{3.87} $$
$$ \ddot Y = -\frac{Gm_1}{R_1^3}(Y - sb) - \frac{Gm_2}{R_2^3}(Y + sa) \tag{3.88} $$
where $R_i$ is the magnitude of $\mathbf{R}_i$.
We now transform to a moving coordinate system which rotates with the two masses.
This is called the co-rotating frame. It is not an inertial reference frame, meaning
that there will be extra terms in Newton’s 2nd law. Physically, it could mean that
we see the Sun and the asteroid from Jupiter. Mathematically, we define (x, y) by
$$ X \equiv cx - sy \tag{3.89} $$
$$ Y \equiv sx + cy . \tag{3.90} $$
Note that even if x and y are time independent, X and Y still depend on time
through c and s. In terms of x and y, $R_1^2 = (x - b)^2 + y^2$ and $R_2^2 = (x + a)^2 + y^2$.
Since
$$ \dot X = -nsx + c\dot x - ncy - s\dot y \tag{3.91} $$
$$ \ddot X = -n^2cx - 2ns\dot x + c\ddot x + n^2sy - 2nc\dot y - s\ddot y \tag{3.92} $$
$$ \dot Y = ncx + s\dot x - nsy + c\dot y \tag{3.93} $$
$$ \ddot Y = -n^2sx + 2nc\dot x + s\ddot x - n^2cy - 2ns\dot y + c\ddot y , \tag{3.94} $$
the equations of motion are
$$ -n^2cx - 2ns\dot x + c\ddot x + n^2sy - 2nc\dot y - s\ddot y = -\frac{Gm_1}{R_1^3}(cx - cb - sy) - \frac{Gm_2}{R_2^3}(cx + ca - sy) \tag{3.95} $$
$$ -n^2sx + 2nc\dot x + s\ddot x - n^2cy - 2ns\dot y + c\ddot y = -\frac{Gm_1}{R_1^3}(sx - sb + cy) - \frac{Gm_2}{R_2^3}(sx + sa + cy) . \tag{3.96} $$
If we calculate the sum of c times Eq. (3.95) and s times Eq. (3.96), we have
$$ -n^2x + \ddot x - 2n\dot y = -\frac{Gm_1}{R_1^3}(x - b) - \frac{Gm_2}{R_2^3}(x + a) . \tag{3.97} $$
Subtracting s times Eq. (3.95) from c times Eq. (3.96), we have
$$ 2n\dot x - n^2y + \ddot y = -\frac{Gm_1}{R_1^3}\,y - \frac{Gm_2}{R_2^3}\,y . \tag{3.98} $$
Figure 3.12: Left: The positions of the five Lagrangian points relative to the two
masses. Right: Gravitational potential of the system in the co-rotating frame.
These two equations are the main equations. We will not study the general solutions.
Instead, we will only find the equilibrium points, i.e. the solutions with
fixed x and y values.
If x and y do not depend on time, $\dot x = \ddot x = \dot y = \ddot y = 0$. Substituting into Eq. (3.97)
and Eq. (3.98), we have
$$ n^2 x = \frac{Gm_1}{R_1^3}(x - b) + \frac{Gm_2}{R_2^3}(x + a) \tag{3.99} $$
$$ n^2 y = \frac{Gm_1}{R_1^3}\,y + \frac{Gm_2}{R_2^3}\,y . \tag{3.100} $$
If y = 0, Eq. (3.99) gives us three solutions. They are the Lagrangian points $L_1$,
$L_2$ and $L_3$. For $m_1 \gg m_2$, Eq. (3.99) can be solved to obtain
$$
\begin{aligned}
L_1 &: \left(-a + (a+b)\left(\frac{b}{3a}\right)^{1/3},\; 0\right) \\
L_2 &: \left(-a - (a+b)\left(\frac{b}{3a}\right)^{1/3},\; 0\right) \\
L_3 &: \left(a + b + \frac{5}{12}\,b,\; 0\right) .
\end{aligned}
\tag{3.101}
$$
$L_1$ is in between the two masses, and it is in fact the Lagrangian point that we
found in Eq. (3.33), but now we have the correction (the left hand side of Eq. (3.99))
coming from the rotation of the two masses. All three of these Lagrangian points lie
on the line joining the two masses, Fig. 3.12.
If y ≠ 0, by Eq. (3.100), we have
$$ n^2 = \frac{Gm_1}{R_1^3} + \frac{Gm_2}{R_2^3} . \tag{3.102} $$
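Substituting this into Eq. (3.99), we have
$$
\begin{aligned}
\left(\frac{Gm_1}{R_1^3} + \frac{Gm_2}{R_2^3}\right) x &= \frac{Gm_1}{R_1^3}(x - b) + \frac{Gm_2}{R_2^3}(x + a) \\
0 &= -\frac{Gm_1}{R_1^3}\,b + \frac{Gm_2}{R_2^3}\,a \\
R_1^3 &= R_2^3
\end{aligned}
\tag{3.103}
$$
because $m_1 b = m_2 a$. Since $R_1 = R_2$ requires $(x - b)^2 = (x + a)^2$, we have x = (b − a)/2.
From the facts that $n^2 = G(m_1 + m_2)/(a + b)^3$, $R_1^3 = R_2^3$ and Eq. (3.102), we have
$$
\begin{aligned}
R_2^3 &= (a + b)^3 \\
\left(\frac{a+b}{2}\right)^2 + y^2 &= (a + b)^2 \\
y &= \pm\frac{\sqrt{3}}{2}(a + b) .
\end{aligned}
\tag{3.104}
$$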
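Thus, the coordinates of the fourth and fifth Lagrangian points $L_4$, $L_5$ are
$$
\begin{aligned}
L_4 &: \left(\frac{1}{2}(b - a),\; \frac{\sqrt{3}}{2}(a + b)\right) \\
L_5 &: \left(\frac{1}{2}(b - a),\; -\frac{\sqrt{3}}{2}(a + b)\right) .
\end{aligned}
\tag{3.105}
$$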
It can be shown that L1, L2 and L3 are unstable¹, in the sense that material at
these points would fly away if slightly perturbed. We saw that material will be
transferred from one star to another through L1. On the other hand, L4 and L5
are stable if m1/m2 > 25. This is the case for the Sun-Earth and Earth-Moon
systems.
The Lagrangian points have been used as parking lots for satellites. For the Sun-Earth
system, the solar observatory SOHO is at L1, such that it can observe the
Sun continuously. L2 is used by the microwave probes WMAP and Planck, which
can observe the sky without any interference from the Sun and the Earth. It will
also be used by the James Webb Space Telescope in future. L3 is too far to be
useful, but it is a popular place for the hypothetical counter-Earth. The STEREO
satellites were able to observe L3 and have ruled out the existence of any large
objects there. L4 and L5 are home to many asteroids. This class of objects is
called Trojan asteroids. The STEREO satellites have visited L4 and L5 to detect
Trojan asteroids. We have also discovered many asteroids at L4 and L5 of the
Sun-Jupiter system.
Question: In Fig. 3.12 right, why do the Lagrangian points seem to lie on top of
the potential? And why does the potential fall off at large distance?
¹ See http://wmap.gsfc.nasa.gov/media/ContentMedia/lagrange.pdf.
Chapter 4
Introduction to Radiative Processes
(Chapters 3.4–3.6, 9.1–9.4 in textbook.)
In astrophysics, one studies distant objects through their emission. It is therefore very important to understand how the radiation is generated and transmitted. We will
briefly introduce radiative transfer and blackbody radiation.
4.1
Solid Angle
Recall that one way to define an angle is that the angle subtended by an arc is
the ratio of the length of the arc to the radius of the arc, the left diagram of Fig. 4.1,
θ = l/r. The angle of a whole circle is, of course, 2π.
Figure 4.1: Angle is the ratio of arc length to radius. The solid angle is the ratio of
the area to the square of the distance.
Suppose there is a sphere at some distance from us. We could talk about its angular
size, with units of degrees or radians. However, if we want to talk about how
much of the sky is blocked by the sphere, we are talking about the solid angle,
Fig. 4.1. It is defined as the ratio of the area to the square of the distance, Ω = A/r²,
and has units of steradians (sr). The solid angle of the whole sky is 4π sr.
Figure 4.2: The differential solid angle is dΩ = sin θ dθ dϕ.
Figure 4.3: The shaded area is one eighth of a face of a cube, from the point of view
of +z-axis.
To find the infinitesimal solid angle, consider the infinitesimal area in Fig. 4.2.
The two sides are r dθ and r sin θ dϕ, hence the area is dA = r² sin θ dθ dϕ and
dΩ = dA/r². This gives
$$ d\Omega = \sin\theta\,d\theta\,d\phi . \tag{4.1} $$
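As a quick illustration, integrating Eq. (4.1) over a cone of half-angle θ_max around the z-axis gives Ω = 2π(1 − cos θ_max). The sketch below evaluates this for the whole sky and for a disc the size of the Sun; the angular radius of 0.267° is an assumed round value, and the function name is only illustrative.

```python
import math

def cap_solid_angle(theta_max):
    """Integral of sin(theta) dtheta dphi over 0 <= theta <= theta_max, 0 <= phi < 2 pi."""
    return 2.0 * math.pi * (1.0 - math.cos(theta_max))

whole_sky = cap_solid_angle(math.pi)

sun_angular_radius = math.radians(0.267)   # assumed angular radius of the Sun seen from Earth
sun_solid_angle = cap_solid_angle(sun_angular_radius)
small_angle_estimate = math.pi * sun_angular_radius**2

print(f"whole sky: {whole_sky:.4f} sr")
print(f"Sun: {sun_solid_angle:.3e} sr (small-angle estimate {small_angle_estimate:.3e} sr)")
```

For small angles the result reduces to π θ_max², i.e. the area of a small disc divided by the square of its distance.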
Example: If you are at the center of a cube, what will be the solid angle subtended
by one face of the cube?
By symmetry, there are six faces and the solid angle of the whole sky is 4π, hence the
solid angle of one face should be 2π/3. Let’s do it the hard way.
Let the length of one side of the cube be 2a. We consider the face at z = a,
−a ≤ x, y ≤ a. The required solid angle is eight times the solid angle of the
shaded area in Fig. 4.3. We have to figure out the ranges of ϕ and θ. It is easy
for ϕ, 0 ≤ ϕ ≤ π/4. The length of the dark line in Fig. 4.3 is a/cos ϕ, and the
θ-coordinate of the point (x, y, a) at its end satisfies
$$ a\tan\theta = \frac{a}{\cos\phi} . \tag{4.2} $$
Hence, the range of θ is 0 ≤ θ ≤ α, where $\alpha \equiv \tan^{-1}(1/\cos\phi)$.
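The required solid angle is
$$
\begin{aligned}
\Omega &= 8\int_{\phi=0}^{\pi/4}\int_{\theta=0}^{\alpha} \sin\theta\,d\theta\,d\phi \\
       &= 8\int_{0}^{\pi/4} (1 - \cos\alpha)\,d\phi \\
       &= 2\pi - 8\int_{0}^{\pi/4} \cos\alpha\,d\phi .
\end{aligned}
\tag{4.3}
$$
We need
$$ \cos^2\alpha = \frac{1}{1 + \tan^2\alpha} = \frac{1}{1 + 1/\cos^2\phi} = \frac{\cos^2\phi}{1 + \cos^2\phi} . \tag{4.4} $$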
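If we put u = sin ϕ, then
$$
\begin{aligned}
\Omega &= 8\int_{0}^{\pi/4}\left(1 - \frac{\cos\phi}{\sqrt{1 + \cos^2\phi}}\right) d\phi \\
       &= 2\pi - 8\int_{0}^{\sqrt{2}/2} \frac{du}{\sqrt{2 - u^2}} \\
       &= 2\pi - 8\left[\sin^{-1}\frac{u}{\sqrt{2}}\right]_{0}^{\sqrt{2}/2} \\
       &= 2\pi - 8(\pi/6) \\
       &= \frac{2\pi}{3} .
\end{aligned}
\tag{4.5}
$$
4.2
Specific Intensity and Flux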
In this section, we will discuss various flux, intensity and energy density. These are
scientific terms to describe a radiation field and light rays.
Figure 4.4: For the definition of specific intensity.
Imagine a region filled with radiation, or photons, of all frequencies and going
in all directions. Consider the photons (or light rays) travelling in a specific
direction $\vec n$. The specific intensity, $I_\nu$, is defined as the energy carried by
photons that pass through a small area dA perpendicular to $\vec n$, travelling in
directions within a small solid angle dΩ around $\vec n$, with frequency between ν and
ν + dν, in time dt (see Fig. 4.4), per unit area, time, solid angle and frequency:
$$ I_\nu \equiv \frac{dE_\nu}{dA\,dt\,d\Omega\,d\nu} . \tag{4.6} $$
Physically, the specific intensity could be understood as the brightness. It is
generally a function of Ω, position, ν and time, and has a dimension of energy per
unit area per unit time per unit solid angle per unit frequency. In c.g.s. the units
are erg/s/cm2 /sr/Hz. Note that Iν is independent of distance! It is like surface
brightness (magnitude per square arcsec). Stars do not fade with distance, but just
get smaller, i.e. smaller solid angle. If we take two pictures of the Sun, one from
Earth and one from Venus, using the same camera setting, the surface brightness
of the sun will look identical in the two pictures.
If we want to express the specific intensity per unit wavelength instead of per unit
frequency, we use the relation between wavelength and frequency c = νλ and
define
$$ I_\lambda = \frac{c}{\lambda^2}\,I_\nu , \tag{4.7} $$
so that $|I_\nu\,d\nu| = |I_\lambda\,d\lambda|$ and
$$ \int_0^\infty I_\nu\,d\nu = \int_0^\infty I_\lambda\,d\lambda . \tag{4.8} $$
What is the unit of $I_\lambda$?
Figure 4.5: For the definition of energy flux.
Next, we consider the case when the light ray direction is not perpendicular to dA.
As shown in Figure 4.5, if θ = 90° the flux passing through the
area dA will be zero, and the flux is maximum when θ = 0. For a general θ, the
effective area is reduced by a factor of cos θ; therefore, the specific energy flux
passing through a small area dA from the direction dΩ is
$$ dF_\nu = I_\nu\cos\theta\,d\Omega . \tag{4.9} $$
To obtain the net flux (actually flux density, as it is per unit frequency), we have
to integrate over all directions
$$ F_\nu = \int I_\nu\cos\theta\,d\Omega . \tag{4.10} $$
It is in units of erg s⁻¹ cm⁻² Hz⁻¹.
Exercise: Show that if the radiation is isotropic, i.e., $I_\nu$ does not depend on Ω,
then $F_\nu = 0$.
Hint: You will see this type of integral $\int d\Omega$ a lot in astrophysics. In the actual
calculation, one needs to express dΩ in terms of θ and ϕ first. Also, it is sometimes
useful to make a substitution µ = cos θ, which gives dµ = − sin θ dθ, such that
$$ \int_0^\pi f(\theta)\sin\theta\,d\theta = \int_{-1}^{1} f(\mu)\,d\mu . \tag{4.11} $$
Example: Constancy of specific intensity along rays in free space.
Consider any ray L and any two points along the ray. As shown in Fig. 4.6, construct
areas dA1 and dA2 normal to the ray at these points. By energy conservation, energy
carried by the set of rays passing through both dA1 and dA2 can be expressed in
two ways:
$$ dE_1 = I_{\nu_1}\,dA_1\,dt\,d\Omega_1\,d\nu_1 = dE_2 = I_{\nu_2}\,dA_2\,dt\,d\Omega_2\,d\nu_2 . $$
Here $d\Omega_1$ is the solid angle subtended by $dA_2$ at $dA_1$ and so forth. Since $d\Omega_1 = dA_2/R^2$,
$d\Omega_2 = dA_1/R^2$, and $d\nu_1 = d\nu_2$, we have $I_{\nu_1} = I_{\nu_2}$, i.e. $I_\nu$ = constant along a ray.
Figure 4.6: Constancy of intensity along rays.
We now prove the inverse square law. For a distant star, one can take cos θ = 1.
Since $I_\nu$ is independent of distance,
$$ F_\nu = \int I_\nu\cos\theta\,d\Omega = I_\nu\,\Omega . \tag{4.12} $$
The solid angle is Ω = πR²/d², where R is the stellar radius and d is the distance.
Hence, $F_\nu \propto d^{-2}$. Physically, the surface brightness of the star does not fade
with distance, but the solid angle gets smaller; therefore, the total flux is smaller.
A photon with energy E carries a momentum E/c. The momentum
flux $p_\nu$ is the momentum per unit time per unit area transported perpendicular to dA,
which also equals the pressure. Imagine a photon striking a wall: only the perpendicular
component of the momentum will change and exert pressure. Therefore, only the
perpendicular component matters and this introduces an extra factor of cos θ in the
momentum flux
$$ p_\nu = \frac{1}{c}\int I_\nu\cos^2\theta\,d\Omega . \tag{4.13} $$
We define the specific energy density, $u_\nu$, as the energy per unit volume per unit
frequency. It has units of erg/cm³/Hz. Consider the cylinder in Figure 4.7 of cross
section dA and length ds = c dt; the total energy in the volume is $u_\nu\,dA\,c\,dt\,d\nu$. On
the other hand, after time dt, all photons inside will have come out, passing through the
area dA into all solid angles. From the definition of $I_\nu$, the total energy passing
through can also be expressed as $(\int I_\nu\,d\Omega)\,dA\,dt\,d\nu$. Equating these two,
$$ \left(\int I_\nu\,d\Omega\right) dA\,dt\,d\nu = u_\nu\,dA\,c\,dt\,d\nu $$
$$ u_\nu = \frac{1}{c}\int I_\nu\,d\Omega . \tag{4.14} $$
Figure 4.7: For the definition of energy flux.
Finally, we are going to prove that for isotropic radiation, the radiation pressure
p is equal to 1/3 of the energy density. Consider a system consisting of a container
with an isotropic radiation field inside. Since the photons have to turn around at the
boundary, the radiation pressure on the boundary is twice the momentum flux; but
we integrate only over 2π solid angle,
$$ p = 2\int p_\nu\,d\nu = \frac{2}{c}\int\!\!\int_{\theta=0}^{\pi/2} I_\nu\cos^2\theta\,d\Omega\,d\nu . \tag{4.15} $$
By isotropy, $I_\nu$ does not depend on Ω, therefore,
$$ p = \frac{2}{c}\int I_\nu\,d\nu \int_0^{2\pi} d\phi \int_0^{\pi/2} \cos^2\theta\,\sin\theta\,d\theta = \frac{4\pi}{3c}\int I_\nu\,d\nu . \tag{4.16} $$
On the other hand, the energy density is
$$ u = \int u_\nu\,d\nu = \frac{1}{c}\int\!\!\int I_\nu\,d\Omega\,d\nu = \frac{4\pi}{c}\int I_\nu\,d\nu = 3p . \tag{4.17} $$
This relation is the equation of state for radiation. In cosmology, it can be used to
describe the radiation-dominated era in the early Universe.
Similarly, the energy flux flowing out of the boundary is
$$ F \equiv \int F_\nu\,d\nu = \int I_\nu\,d\nu \int_{\theta=0}^{\pi/2} \cos\theta\,d\Omega = \pi\int I_\nu\,d\nu . \tag{4.18} $$
It is related to the energy density by F = cu/4.
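The angular integrals in this section are easy to verify numerically. The sketch below uses a simple midpoint rule for an isotropic field; setting the frequency-integrated intensity and the speed of light to 1 is an arbitrary choice of units, and the helper name is hypothetical.

```python
import math

def angular_moment(power, theta_max, steps=20000):
    """Midpoint-rule estimate of the integral of cos(theta)**power over solid
    angle, for 0 <= theta <= theta_max and all phi (dOmega = sin(theta) dtheta dphi)."""
    h = theta_max / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += math.cos(theta)**power * math.sin(theta)
    return 2.0 * math.pi * total * h

intensity = 1.0   # frequency-integrated isotropic intensity, arbitrary units (assumed)
c = 1.0           # speed of light in the same arbitrary units (assumed)

net_flux = intensity * angular_moment(1, math.pi)             # Eq. (4.10), whole sphere
outward_flux = intensity * angular_moment(1, math.pi / 2)     # Eq. (4.18), one hemisphere
energy_density = intensity * angular_moment(0, math.pi) / c   # Eq. (4.14)
pressure = 2.0 * intensity * angular_moment(2, math.pi / 2) / c   # Eq. (4.15)

print("net flux over the whole sphere:", net_flux)
print("F / (pi I):", outward_flux / (math.pi * intensity))
print("p / u:", pressure / energy_density)
print("F / (c u):", outward_flux / (c * energy_density))
```

The same helper with other powers of cos θ gives the higher angular moments that appear in radiative transfer.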
4.3
Emission and Absorption
The monochromatic emission coefficient $j_\nu$ is defined as the energy emitted
per unit time per unit solid angle per unit volume per unit frequency:
$$ j_\nu \equiv \frac{dE}{dV\,d\Omega\,dt\,d\nu} \tag{4.19} $$
and it has units of erg/cm³/s/sr/Hz. When a beam of cross section dA travels
through an emitting region for a distance ds, the volume it covers is dV = dA ds. The
intensity it gains is
$$ dI_\nu = j_\nu\,ds . \tag{4.20} $$
For absorption, we first note that the amount of absorption depends on the intensity
of the incident beam, e.g., no absorption can occur if the beam contains no energy.
Therefore, the absorption coefficient $\alpha_\nu$ is defined through the fractional change
in the beam intensity after traveling a distance ds:
$$ dI_\nu = -\alpha_\nu I_\nu\,ds . \tag{4.21} $$
$\alpha_\nu$ is in units of cm⁻¹. Putting Eqs. (4.20) and (4.21) together, the general form of
the radiative transfer equation is
$$ \frac{dI_\nu}{ds} = -\alpha_\nu I_\nu + j_\nu . \tag{4.22} $$
For pure emission, $\alpha_\nu = 0$, the solution is
$$ I_\nu(s) = I_\nu(s_0) + \int_{s_0}^{s} j_\nu(s')\,ds' . \tag{4.23} $$
For pure absorption, $j_\nu = 0$, the solution is
$$ I_\nu(s) = I_\nu(s_0)\exp\left(-\int_{s_0}^{s} \alpha_\nu(s')\,ds'\right) . \tag{4.24} $$
The integral in the exponent is called the optical depth
$$ \tau_\nu \equiv \int_{s_0}^{s} \alpha_\nu(s')\,ds' . \tag{4.25} $$
This is an important parameter to describe the emission properties of plasmas in
astrophysics. A medium is optically thick or opaque when $\tau_\nu \gg 1$, meaning that
photons cannot travel a long distance without being absorbed. A medium
with $\tau_\nu \ll 1$ is optically thin or transparent, such that photons can propagate
more or less freely. For example, the early Universe had a high temperature and most
material was ionized, absorbing all electromagnetic radiation. Therefore, it was
opaque to light (i.e. optically thick). It was not until last scattering, when
matter decoupled from photons, that light could travel freely, resulting in the cosmic
microwave background we see today. Also, the surface, i.e. the photosphere, of a star
(e.g. the Sun) is defined as the point where τ = 2/3, such that photons can escape
freely into space.
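We can rewrite Eq. (4.22) as
$$ \frac{1}{\alpha_\nu}\frac{dI_\nu}{ds} = -I_\nu + \frac{j_\nu}{\alpha_\nu} . \tag{4.26} $$
From the definition of $\tau_\nu$, we have $d\tau_\nu = \alpha_\nu\,ds$. Define the source function
$S_\nu \equiv j_\nu/\alpha_\nu$, then
$$ \frac{dI_\nu}{d\tau_\nu} = -I_\nu + S_\nu . \tag{4.27} $$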
It is easy to see that if $I_\nu > S_\nu$, then $dI_\nu/d\tau_\nu < 0$. Physically, it means that $I_\nu$
tends to decrease along the ray until it is the same as $S_\nu$. Conversely, $I_\nu$ increases
if $I_\nu < S_\nu$. In other words, $I_\nu$ always tries to approach $S_\nu$. The general solution to
Eq. (4.27) is
$$ I_\nu = I_\nu(0)\,e^{-\tau_\nu} + \int_0^{\tau_\nu} e^{-(\tau_\nu - \tau_\nu')}\,S_\nu(\tau_\nu')\,d\tau_\nu' . \tag{4.28} $$
For the simple case of a constant source function $S_\nu$, this reduces to
$$ I_\nu = I_\nu(0)\,e^{-\tau_\nu} + S_\nu\,(1 - e^{-\tau_\nu}) . \tag{4.29} $$
Again, $\tau_\nu \to \infty$ implies $I_\nu \to S_\nu$, i.e., given a large optical depth (e.g., after
traveling a sufficient distance in an optically thick medium), the observed intensity
will approach the source function.
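The behaviour of Eqs. (4.27) and (4.29) can be explored with a short numerical sketch. It integrates the transfer equation with a fourth-order Runge-Kutta step for a constant source function and compares with the closed form; the intensities and optical depths are arbitrary illustrative numbers, and the function names are not from the notes.

```python
import math

def formal_solution(I0, S, tau):
    """Eq. (4.29): intensity after optical depth tau for a constant source function S."""
    return I0 * math.exp(-tau) + S * (1.0 - math.exp(-tau))

def integrate_transfer(I0, S, tau_max, steps=10000):
    """Integrate dI/dtau = -I + S (Eq. 4.27) from 0 to tau_max with RK4 steps."""
    h = tau_max / steps
    intensity = I0
    for _ in range(steps):
        k1 = S - intensity
        k2 = S - (intensity + 0.5 * h * k1)
        k3 = S - (intensity + 0.5 * h * k2)
        k4 = S - (intensity + h * k3)
        intensity += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return intensity

I0, S = 2.0, 0.5   # arbitrary incident intensity and source function (assumed)
for tau in (0.01, 1.0, 10.0):
    numeric = integrate_transfer(I0, S, tau)
    exact = formal_solution(I0, S, tau)
    print(f"tau = {tau:5.2f}: numerical {numeric:.6f}, Eq. (4.29) {exact:.6f}")
```

For τ ≪ 1 the intensity stays close to its incident value (optically thin), while for τ ≫ 1 it converges to the source function regardless of the incident intensity (optically thick).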
4.4
Basics of Statistical Mechanics (Optional)
(Chapter 8.1 in textbook. Chapters 2, 7, 9, 10, 11 in F. Mandl: Statistical
Physics, John Wiley, 1988, 2nd ed.)
4.4.1
Thermodynamics
We very briefly review the basics of thermodynamics. The zeroth law of thermodynamics states that if two systems A and B are in thermal equilibrium with each
other and B and C are in thermal equilibrium, then A and C are in thermal equilibrium. We say that they all have the same temperature. There are many kinds of
temperature scales, for example, one based on the length of a rod at different temperatures.
We can verify by experiments that the same amount of work, no matter of which kind,
produces the same temperature rise. The same change can also be produced without work,
by a form of energy transfer that we call heat. The first
law states that the change in energy of a system is equal to the net heat input plus the net
work done on the system. Note that the energy of a system is a function of state, while
heat and work done on it are not. They depend on its history.
For an isolated system in equilibrium, usually the energy E, volume V and number
of molecules N are fixed. We call it the macrostate: (E, V, N); or (E, V, N, α)
where α are other macroscopic variables, for example, the dependence of density
on location. On the other hand, a microstate specifies the positions, velocities and
internal states of each particle. It is almost impossible to describe fully.
We need to count the number of microstates corresponding to the same macrostate.
The states are discrete for a quantum system and continuous for a classical system. We
will adopt the quantum viewpoint. We denote the number of states with
energy between E and E + δE by Ω(E, V, N, α).
We give an example. Consider spins in a magnetic field. For a single paramagnetic
atom, the energy is E = −µ · B where µ is the magnetic moment and B is the magnetic field.
Let us assume that the spin can only take two values ±ħ/2. Thus, the energy can only be
±µB.
For N such atoms, if n of them align with the field and (N − n) anti-align with the
field, the total energy and the number of states are
$$ E = n(-\mu B) + (N - n)(\mu B) = (N - 2n)\,\mu B , \tag{4.30} $$
$$ \Omega = \frac{N!}{n!\,(N - n)!} . \tag{4.31} $$
We assume that each microstate compatible with the constraints has equal a priori
probability. Then, (one form of) the second law states that the value of α will evolve in
such a way that Ω(E, V, N, α) is always non-decreasing, and equilibrium corresponds
to the value of α for which Ω(E, V, N, α) attains its maximum.
We define entropy by
$$ S(E, V, N, \alpha) = k\ln\Omega(E, V, N, \alpha) . \tag{4.32} $$
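To make the counting concrete, the following sketch evaluates Eqs. (4.30) to (4.32) for the two-state spin system. The choice N = 100 and the use of units in which µB = 1 are illustrative assumptions, and the function names are not from the notes.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K

def multiplicity(N, n):
    """Eq. (4.31): number of microstates with n of the N spins aligned with the field."""
    return math.comb(N, n)

def energy(N, n, mu_B=1.0):
    """Eq. (4.30): total energy; mu_B is the product of magnetic moment and field."""
    return (N - 2 * n) * mu_B

def entropy(N, n):
    """Eq. (4.32): S = k ln(Omega)."""
    return K_B * math.log(multiplicity(N, n))

N = 100   # illustrative number of atoms (assumed)
most_probable_n = max(range(N + 1), key=lambda n: multiplicity(N, n))
print("multiplicity for N = 4, n = 2:", multiplicity(4, 2))
print("n with the largest multiplicity:", most_probable_n)
print("energy at that n (units of mu*B):", energy(N, most_probable_n))
print("entropy at that n:", entropy(N, most_probable_n), "J/K")
```

The multiplicity, and hence the entropy, is largest when half of the spins are aligned (E = 0) and vanishes when all spins point the same way, where only one microstate exists.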
The first law is about energy conservation. The second law is about direction. All experiments show that isolated systems tend to equilibrium, not the opposite. Real
processes are irreversible. Hence, entropy is always non-decreasing for isolated
systems. We also see that the more disordered the system is, the larger its entropy.
Equivalently, the larger the entropy, the less information we have about the microstate.
4.4.2
Isolated Systems
If an isolated system is partitioned into two subsystems and they are nearly independent, then
$$ E = E_1 + E_2 \tag{4.33} $$
$$ V = V_1 + V_2 \tag{4.34} $$
$$ N = N_1 + N_2 \tag{4.35} $$
$$ \Omega(E, V, N, E_1, V_1, N_1) = \Omega_1(E_1, V_1, N_1)\,\Omega_2(E_2, V_2, N_2) . \tag{4.36} $$
Hence,
$$ S(E, V, N, E_1, V_1, N_1) = S_1(E_1, V_1, N_1) + S_2(E_2, V_2, N_2) , \tag{4.37} $$
i.e. entropy is an extensive quantity (proportional to the size of the system), in
contrast to intensive quantities such as temperature.
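To define the absolute temperature scale, first consider a diathermal wall (impermeable
to everything except heat). At equilibrium, entropy is maximum,
$$ 0 = \left(\frac{\partial S}{\partial E_1}\right)_{E,V,N,V_1,N_1} = \left(\frac{\partial S_1}{\partial E_1}\right)_{V_1,N_1} + \left(\frac{\partial S_2}{\partial E_2}\right)_{V_2,N_2} \frac{dE_2}{dE_1} . \tag{4.38} $$
Since $dE_2/dE_1 = -1$, we have
$$ \left(\frac{\partial S_1}{\partial E_1}\right)_{V_1,N_1} = \left(\frac{\partial S_2}{\partial E_2}\right)_{V_2,N_2} . \tag{4.39} $$
We see that $(\partial S_i/\partial E_i)_{V_i,N_i}$ is a measure of temperature, and define the absolute
temperature T by
$$ \left(\frac{\partial S_i}{\partial E_i}\right)_{V_i,N_i} = \frac{1}{T} . \tag{4.40} $$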
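Defined in this way, the absolute temperature scale coincides with the perfect gas temperature scale. By the second law,
$$ 0 < \frac{dS}{dt} = \left(\frac{1}{T_1} - \frac{1}{T_2}\right)\frac{dE_1}{dt} . \tag{4.41} $$
If $T_1 < T_2$, then $dE_1/dt > 0$: heat flows from high temperature to low temperature.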
4.4.3
Systems in a Heat Bath
To consider systems at constant temperature, we study an isolated system such
that our system of interest is subsystem 1 and subsystem 2 is a heat bath,
which means it can absorb or provide as much energy as subsystem 1 needs without
changing its temperature. The probability that subsystem 1 is in a definite state r
is proportional to the number of states of the heat bath compatible with it
$$ p_r = \text{const.}\;\Omega_2(E_0 - E_r) = \text{const.}\;\exp\left(S_2(E_0 - E_r)/k\right) \tag{4.42} $$
where $E_0$ is the total energy of our system and the heat bath. By the definition of
a heat bath, $E_r \ll E_0$,
$$ \frac{S_2(E_0 - E_r)}{k} = \frac{S_2(E_0)}{k} - \frac{E_r}{k}\frac{\partial S_2(E_0)}{\partial E_0} + \frac{1}{2}\frac{E_r^2}{k}\frac{\partial^2 S_2(E_0)}{\partial E_0^2} + \cdots \tag{4.43} $$
By Eq. (4.40), the second term is $-E_r/kT$. The third term describes the change of
temperature of the heat bath, which is, by definition, negligible. Eq. (4.42) is then
$$ p_r = \frac{1}{Z}\,e^{-\beta E_r} \tag{4.44} $$
where $Z = \sum_r \exp(-\beta E_r)$ and β = 1/kT. This is the Boltzmann distribution,
which gives the probability of a microstate of a system at some fixed temperature.
Z is called the partition function.
If there are degeneracies $g(E_r)$, the probability of the system being at a particular energy
is
$$ p(E_r) = \frac{1}{Z}\,g(E_r)\,e^{-\beta E_r} . \tag{4.45} $$
The mean energy of the system is
$$ \bar E = \sum_r p_r E_r = -\frac{\partial \ln Z}{\partial \beta} . \tag{4.46} $$
Scientists usually employ a conceptual construction. The energy, for example,
of a particular system will have some particular time-dependent value, with
fluctuations. We could instead consider a large number of identical systems at some
fixed temperature, called a canonical ensemble. The energy averaged over the
ensemble is given by Eq. (4.46), without any fluctuation.
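A small numerical sketch may help to connect Eqs. (4.44) and (4.46). It builds the Boltzmann probabilities for a hypothetical two-level system with energies −1 and +1 (for instance the spin example above in units where µB = 1) and compares the direct average with a finite-difference derivative of ln Z. The levels, the value of β and the function names are assumptions made for illustration.

```python
import math

def partition_function(energies, beta):
    """Z = sum over states of exp(-beta * E_r)."""
    return sum(math.exp(-beta * E) for E in energies)

def probabilities(energies, beta):
    """Boltzmann distribution, Eq. (4.44)."""
    Z = partition_function(energies, beta)
    return [math.exp(-beta * E) / Z for E in energies]

def mean_energy(energies, beta):
    """Direct average, the first form of Eq. (4.46)."""
    return sum(p * E for p, E in zip(probabilities(energies, beta), energies))

def mean_energy_from_lnZ(energies, beta, d_beta=1e-6):
    """Second form of Eq. (4.46), -d(ln Z)/d(beta), by central finite difference."""
    up = math.log(partition_function(energies, beta + d_beta))
    down = math.log(partition_function(energies, beta - d_beta))
    return -(up - down) / (2.0 * d_beta)

levels = [-1.0, 1.0]   # hypothetical two-level system (assumed)
beta = 0.7             # inverse temperature 1/kT in matching units (assumed)

print("probabilities:", probabilities(levels, beta))
print("mean energy (direct sum):", mean_energy(levels, beta))
print("mean energy (-d ln Z / d beta):", mean_energy_from_lnZ(levels, beta))
```

For this two-level system the mean energy is −tanh(β) in these units, so it tends to the lower level at low temperature and to zero at high temperature.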
Example: From the Boltzmann distribution above, the ratio of the probabilities that a
system will be in state b and in state a is given by
$$ \frac{P(E_b)}{P(E_a)} = \frac{g_b}{g_a}\,e^{-(E_b - E_a)/kT} . \tag{4.47} $$
At what temperature will a gas of neutral hydrogen have equal numbers of atoms in
the ground and first excited states?
For hydrogen atoms, the ground state is n = 1 and the first excited state is n = 2. The
degeneracy is $g_n = 2n^2$. Therefore,
$$
\begin{aligned}
1 &= \frac{2(2^2)}{2(1^2)}\,e^{-[(-13.6\,\mathrm{eV}/2^2) - (-13.6\,\mathrm{eV}/1^2)]/kT} \\
\frac{10.2\,\mathrm{eV}}{kT} &= \ln 4 \\
T &= 8.54\times10^{4}\ \mathrm{K} .
\end{aligned}
$$
This is much higher than the surface temperature of the Sun. However, we know from observations
that some hydrogen atoms are ionized in the Sun. How can that be possible? (The
answer lies in the Saha Equations, which will be discussed later in this chapter.)
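The arithmetic of this example can be reproduced with a few lines of code. The sketch below evaluates Eq. (4.47) for hydrogen; the value of Boltzmann's constant in eV/K is a standard reference value, the temperature of 5800 K for the solar surface is an assumed round number, and the function names are illustrative.

```python
import math

K_B_EV = 8.617333e-5   # Boltzmann constant in eV/K (assumed reference value)
E_GROUND = 13.6        # magnitude of the hydrogen ground-state energy in eV

def level_energy(n):
    """Energy of hydrogen level n in eV."""
    return -E_GROUND / n**2

def degeneracy(n):
    """Degeneracy g_n = 2 n^2."""
    return 2 * n**2

def population_ratio(n_upper, n_lower, T):
    """Eq. (4.47): ratio of the numbers of atoms in levels n_upper and n_lower."""
    dE = level_energy(n_upper) - level_energy(n_lower)
    return degeneracy(n_upper) / degeneracy(n_lower) * math.exp(-dE / (K_B_EV * T))

def equal_population_temperature(n_upper, n_lower):
    """Temperature at which Eq. (4.47) equals one."""
    dE = level_energy(n_upper) - level_energy(n_lower)
    return dE / (K_B_EV * math.log(degeneracy(n_upper) / degeneracy(n_lower)))

T_equal = equal_population_temperature(2, 1)
print(f"equal populations at T = {T_equal:.3e} K")
print(f"N2/N1 at 5800 K = {population_ratio(2, 1, 5800.0):.2e}")
```

At a temperature typical of the solar surface the ratio is tiny, which shows how steeply the exponential factor in Eq. (4.47) suppresses the excited state.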
4.4.4
The Perfect Classical Gas
A gas consists of molecules moving about fairly freely in space. A perfect gas is an idealization in which the potential energy of interaction between the
molecules is negligible compared to their kinetic energy of motion. If the energy
states of one single molecule are $\varepsilon_r$, the partition function of a single molecule is
$$ Z(T, V, 1) = \sum_r \exp(-\beta\varepsilon_r) . \tag{4.48} $$
The partition function of many identical molecules is not $\left(\sum_r \exp(-\beta\varepsilon_r)\right)^N$ (otherwise this leads to the Gibbs paradox). Since the molecules are identical, we cannot
distinguish the cases: the first molecule is at state r and the second is at state s;
or the first molecule is at state s and the second is at state r. We can only say
that one molecule is at state r and one is at state s. The partition function for two
molecules is then
$$ Z(T, V, 2) = \sum_r \exp(-2\beta\varepsilon_r) + \frac{1}{2!}\sum_{\substack{r,s \\ r \neq s}} \exp(-\beta(\varepsilon_r + \varepsilon_s)) . \tag{4.49} $$
The partition function for N molecules is then
$$ Z(T, V, N) = \sum_r \exp(-N\beta\varepsilon_r) + \ldots + \frac{1}{N!}\sum_{\substack{r_1,\ldots,r_N \\ \text{all } r_i \text{ different}}} \exp(-\beta(\varepsilon_{r_1} + \ldots + \varepsilon_{r_N})) . \tag{4.50} $$
We define the classical regime as that in which the probability that any single-particle state
is occupied by more than one molecule is very small. If we define the occupation
number of a state as the number of particles in that state, then the classical regime
is that in which the occupation number of any state is much less than one.
For a classical perfect gas, only the last term in Eq. (4.50) is important. Consider the
function
$$ \frac{1}{N!}\sum_{r_1,\ldots,r_N} \exp(-\beta(\varepsilon_{r_1} + \ldots + \varepsilon_{r_N})) \tag{4.51} $$
The difference between this and the last term of Eq. (4.50) is not significant for a
classical perfect gas. So, our final result for this section is
$$ Z(T, V, N) = \frac{1}{N!}\sum_{r_1,\ldots,r_N} \exp(-\beta(\varepsilon_{r_1} + \ldots + \varepsilon_{r_N})) = \frac{1}{N!}\left(\sum_r \exp(-\beta\varepsilon_r)\right)^N \tag{4.52} $$
for a classical (occupation number small) perfect (interaction negligible) gas (particles
moving freely).
4.4.5
The Partition Function
To calculate the partition function of the gas, we only need to calculate the partition
function for a single molecule. The energy of a molecule can be written as
$$ \varepsilon_r = \varepsilon^{\rm tr}_s + \varepsilon^{\rm int}_\alpha \tag{4.53} $$
where $\varepsilon^{\rm tr}_s$ is the energy of the translational motion and $\varepsilon^{\rm int}_\alpha$ is the energy of the
internal excitations. Hence,
$$ Z(T, V, 1) = \left(\sum_s \exp(-\beta\varepsilon^{\rm tr}_s)\right)\left(\sum_\alpha \exp(-\beta\varepsilon^{\rm int}_\alpha)\right) \equiv Z_1^{\rm tr}\,Z_{\rm int} . \tag{4.54} $$
The internal energy $\varepsilon^{\rm int}_\alpha$ depends on the internal details of the molecules, for example, type, excited states, etc. It does not depend on the volume. We now evaluate
$Z_1^{\rm tr}$, which can be applied to any perfect gas. The translational energy is
$$ \varepsilon^{\rm tr} = \frac{p^2}{2m} . \tag{4.55} $$
We would like to find the number of states f(p)dp with momentum of magnitude
between p and p + dp. f(p) is called the density of states. Consider a cube with
sides of length L. From quantum mechanics, the allowed wavefunctions are
$$ \psi = \text{const.}\;\sin\left(\frac{n_x\pi}{L}x\right)\sin\left(\frac{n_y\pi}{L}y\right)\sin\left(\frac{n_z\pi}{L}z\right) \tag{4.56} $$
with $n_x, n_y, n_z = 1, 2, \ldots$. The magnitude of the wave vector k is defined by
$$ k^2 = \frac{\pi^2}{L^2}\left(n_x^2 + n_y^2 + n_z^2\right) . \tag{4.57} $$
Hence, the volume per allowed point in k-space is (π/L)³. Since the n’s can only
be positive, we have to count only the positive octant, and the volume of the region
lying between the radii k and k + dk in the positive octant is $\frac{1}{8}4\pi k^2\,dk$. The number
of states in this region is
$$ \frac{1}{8}\,4\pi k^2\,dk \Big/ (\pi/L)^3 = \frac{V k^2\,dk}{2\pi^2} . \tag{4.58} $$
The relation between the wave vector and momentum is k = 2πp/h, so the final result
for the density of states is
$$ f(p)\,dp = \frac{V\,4\pi p^2\,dp}{h^3} . \tag{4.59} $$
Note that we have only considered the translational motion. For example, if the
particle has non-zero spin, we have to multiply the above formula by the number
of internal degrees of freedom.
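We come back to calculate the partition function $Z_1^{\rm tr}$.
$$
\begin{aligned}
Z_1^{\rm tr} &= \sum_s \exp(-\beta\varepsilon^{\rm tr}_s) \\
&= \int_0^\infty \exp(-\beta p^2/2m)\,f(p)\,dp \\
&= \frac{4\pi V}{h^3}\int_0^\infty \exp(-\beta p^2/2m)\,p^2\,dp \\
&= \frac{4\pi V}{h^3}\left(\frac{2m}{\beta}\right)^{3/2}\int_0^\infty \exp(-x^2)\,x^2\,dx \\
&= \frac{4\pi V}{h^3}\left(\frac{2m}{\beta}\right)^{3/2}\frac{\sqrt{\pi}}{4} \\
&= V\left(\frac{2\pi mkT}{h^2}\right)^{3/2} .
\end{aligned}
\tag{4.60}
$$
The full partition function is
$$ Z(T, V, N) = \frac{1}{N!}\,V^N\left(\frac{2\pi mkT}{h^2}\right)^{3N/2} Z_{\rm int}^N . \tag{4.61} $$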
If a particle is in state r and the energy of this state $E_r$ depends on the volume,
then $dE_r/dV \equiv -P_r$ is by definition the negative of the pressure (contributed by this
particle), because the derivative is the work done per unit change in volume. The
average pressure is just
$$
\begin{aligned}
P &= \sum_r p_r P_r \\
&= \sum_r p_r\left(-\frac{dE_r}{dV}\right) \\
&= \sum_r \frac{1}{Z}\,e^{-\beta E_r}\left(-\frac{dE_r}{dV}\right) \\
&= \frac{1}{Z\beta}\sum_r \left(\frac{\partial e^{-\beta E_r}}{\partial V}\right)_\beta \\
&= \frac{1}{Z\beta}\left(\frac{\partial Z}{\partial V}\right)_\beta \\
&= \frac{1}{\beta}\left(\frac{\partial \ln Z}{\partial V}\right)_\beta .
\end{aligned}
\tag{4.62}
$$
Figure 4.8: The outward pointing pressure balances the inward pointing gravitational force.
Substituting Eq. (4.61) into the above equation and noticing that $Z_{\rm int}$ is independent
of the volume, we have
$$ P = \frac{1}{\beta}\frac{N}{V} = \frac{NkT}{V} , \tag{4.63} $$
or PV = NkT, which holds irrespective of the internal molecular structure.
To illustrate a simple application of the ideal gas law in astronomy, we can roughly estimate the temperature at the core of a star. We hypothetically divide the star into two halves, regions I and II in Fig. 4.8. The mass of each half is M/2 if the total mass of the star is M. They are separated by a distance R/2 if the radius of the star is R. Thus, the gravitational attraction between the two halves is $G(M/2)^2/(R/2)^2$. The area between them is $4\pi(R/2)^2$. The average pressure is of the order of
$$\langle P\rangle = \frac{G(M/2)^2}{(R/2)^2}\Big/\,4\pi(R/2)^2 = \frac{4}{3}\frac{GM\rho}{R} , \tag{4.64}$$
where $\rho = M/(\tfrac{4}{3}\pi R^3)$ is the average density of the star. For an ideal gas of N particles, the ideal gas law gives the temperature
$$T = \frac{PV}{Nk} = \frac{P m_p}{(N m_p/V)\,k} = \frac{P m_p}{\rho k} = \frac{4GMm_p}{3kR} , \tag{4.65}$$
where k is Boltzmann's constant and $m_p$ is the mass of the proton. The third expression uses $\rho \approx N m_p/V$, because a main-sequence star consists mostly of protons; the last one follows from Eq. (4.64). If we substitute the data for our Sun, the result is $T \approx 3\times10^7$ K. A more detailed calculation of the core temperature gives $1.5\times10^7$ K.
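The estimate is easy to reproduce numerically. The sketch below evaluates Eq. (4.65) in CGS units; the constants and the solar mass and radius are assumed standard values that are not quoted in these notes, and the function name `core_temperature` is only illustrative.

```python
# Order-of-magnitude core temperature from Eq. (4.65): T = 4 G M m_p / (3 k R).
# All constants below are assumed CGS values, not taken from the notes.
G = 6.674e-8         # gravitational constant [cm^3 g^-1 s^-2]
K_B = 1.380649e-16   # Boltzmann constant [erg K^-1]
M_P = 1.6726e-24     # proton mass [g]
M_SUN = 1.989e33     # solar mass [g]
R_SUN = 6.957e10     # solar radius [cm]


def core_temperature(mass, radius):
    """Rough core temperature [K] of a star with the given mass [g] and radius [cm]."""
    return 4.0 * G * mass * M_P / (3.0 * K_B * radius)


T_SUN = core_temperature(M_SUN, R_SUN)
print(f"T ~ {T_SUN:.1e} K")
```

With these inputs the result is about $3\times10^7$ K, roughly a factor of two above the detailed value, which is as good as can be expected from such a crude argument.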
4.4.6 The Perfect Quantal Gas and Quantum Statistics
One basic assumption of a classical gas of identical particles is that the mean occupation number of each single-particle state is much less than one. For a quantal gas, the mean occupation number can be close to or even greater than one. The main quantum effect is then the quantum statistics, i.e. how particles occupy the single-particle states, and Eq. (4.52) is no longer correct.
Let $n_r$ be the occupation number of the single-particle state with energy $\varepsilon_r$. What are the possible values of $n_r$? There are two mutually exclusive classes.
Bose-Einstein statistics (BE): there is no restriction on $n_r$, i.e. $n_r = 0, 1, 2, \ldots$. Such particles are called bosons and have integral spin $0, \hbar, 2\hbar, \ldots$.
Fermi-Dirac statistics (FD): $n_r$ can only be 0 or 1. Such particles are called fermions and have half-integral spin $\tfrac{1}{2}\hbar, \tfrac{3}{2}\hbar, \ldots$. Equivalently, fermions satisfy the Pauli exclusion principle: no two fermions can be in the same single-particle state.
The particles under consideration can be fundamental or composite, as long as they are identical. A composite particle consisting of an odd number of fermions is a fermion. A composite particle consisting of an even number of fermions is a boson.
4.4.7 The Partition Function
We give the general form of the partition function in this subsection. Suppose the energies of the single-particle states are $\varepsilon_1, \varepsilon_2, \varepsilon_3, \ldots$, and the corresponding occupation numbers are $n_1, n_2, n_3, \ldots$. For a gas of N particles,
$$\sum_r n_r = N . \tag{4.66}$$
Also,
$$n_r = \begin{cases} 0, 1 & \text{fermions} \\ 0, 1, 2, 3, \ldots & \text{bosons} \end{cases} . \tag{4.67}$$
Any set of {nr } that satisfies these two conditions defines a state of the gas. The
partition function is then
$$Z(T, V, N) = \sum_{n_1, n_2, \ldots} \exp\Big(-\beta\sum_r n_r\varepsilon_r\Big) , \tag{4.68}$$
where the first sum is over all sets of {nr }. The mean occupation number is
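$$\bar{n}_i = \frac{\displaystyle\sum_{n_1, n_2, \ldots} n_i \exp\Big(-\beta\sum_r n_r\varepsilon_r\Big)}{\displaystyle\sum_{n_1, n_2, \ldots} \exp\Big(-\beta\sum_r n_r\varepsilon_r\Big)} = -\frac{1}{\beta}\left(\frac{\partial \ln Z}{\partial \varepsilon_i}\right)_{T,\,\varepsilon_r (r\neq i)} . \tag{4.69}$$

4.4.8 Derivation of Blackbody Radiation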
In this section, we discuss the thermal gas of photons, using the method developed in the last section. An ideal black body is an object that absorbs all radiation falling on it. A black body is also a perfect radiator, and its radiation is a thermal gas of photons. Most hot objects, including stars and a piece of hot metal, behave roughly like black bodies.
Photons have spin 1, so they are bosons and obey BE statistics. They also do not interact with each other (Maxwell's equations are linear). Hence, a photon gas is a perfect gas.
Since photons can be emitted or absorbed, the photon number is not constant and Eq. (4.66) does not apply. The occupation number of each state r can take any value, $n_r = 0, 1, 2, \ldots$. The partition function is
$$\begin{aligned}
Z(T, V) &= \sum_{n_1, n_2, \ldots} \exp\Big(-\beta\sum_r n_r\varepsilon_r\Big) \\
&= \Big(\sum_{n_1} e^{-\beta n_1\varepsilon_1}\Big)\Big(\sum_{n_2} e^{-\beta n_2\varepsilon_2}\Big)\cdots \\
&= \frac{1}{1-\exp(-\beta\varepsilon_1)}\,\frac{1}{1-\exp(-\beta\varepsilon_2)}\cdots \\
&= \prod_{r=1}^{\infty}\frac{1}{1-\exp(-\beta\varepsilon_r)}
\end{aligned} \tag{4.70}$$
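because
$$\sum_n e^{-\beta n\varepsilon} = 1 + e^{-\beta\varepsilon} + (e^{-\beta\varepsilon})^2 + \cdots = \frac{1}{1-\exp(-\beta\varepsilon)} . \tag{4.71}$$
Thus, we have $\ln Z(T, V) = -\sum_{r=1}^{\infty}\ln(1 - e^{-\beta\varepsilon_r})$ and the mean occupation number of state r is
$$\bar{n}_r = -\frac{1}{\beta}\frac{\partial \ln Z}{\partial \varepsilon_r} = \frac{e^{-\beta\varepsilon_r}}{1 - e^{-\beta\varepsilon_r}} = \frac{1}{e^{\beta\varepsilon_r} - 1} . \tag{4.72}$$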
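The closed form can be checked against a direct average over occupation numbers. The sketch below is only a numerical sanity check, not part of the derivation: it truncates the sum at an assumed cutoff `n_max` and compares it with $1/(e^{\beta\varepsilon}-1)$ for a few values of the dimensionless product $\beta\varepsilon$. The function names are illustrative.

```python
import math


def mean_occupation_closed(beta_eps):
    """Closed form of Eq. (4.72): 1 / (exp(beta*eps) - 1)."""
    return 1.0 / math.expm1(beta_eps)


def mean_occupation_sum(beta_eps, n_max=2000):
    """Average of n with Boltzmann weights exp(-beta*eps*n), truncated at n_max."""
    weights = [math.exp(-beta_eps * n) for n in range(n_max + 1)]
    z = sum(weights)  # single-state partition function, Eq. (4.71)
    return sum(n * w for n, w in enumerate(weights)) / z


for x in (0.1, 1.0, 5.0):
    print(x, mean_occupation_sum(x), mean_occupation_closed(x))
```

For $\beta\varepsilon \ll 1$ the occupation number is large (many photons per state), while for $\beta\varepsilon \gg 1$ it falls off as $e^{-\beta\varepsilon}$.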
We now derive Planck's law of black-body radiation. The energy and momentum of a photon of frequency ν are ε = hν and p = hν/c. The density of states is given by Eq. (4.59), with one more complication for photons: there are two polarizations for each translational state (for example, two perpendicular directions of linear polarization), so the number of internal degrees of freedom is two. The label r specifies the translational motion (and hence the frequency) and the polarization. In terms of frequency ν, we have
$$f(\nu)\,d\nu = 2\,\frac{V\,4\pi(h\nu/c)^2\,h\,d\nu/c}{h^3} = \frac{8\pi V\nu^2\,d\nu}{c^3} . \tag{4.73}$$
Combining the mean occupation number (Eq. 4.72) and the density of states above, we have the number of photons in the frequency range between ν and ν + dν as
$$dN_\nu = \frac{8\pi V}{c^3}\frac{\nu^2\,d\nu}{\exp(\beta h\nu) - 1} . \tag{4.74}$$
The total energy of photons in volume V in this frequency range is
$$dE_\nu = h\nu\,dN_\nu = \frac{8\pi Vh}{c^3}\frac{\nu^3\,d\nu}{\exp(\beta h\nu) - 1} . \tag{4.75}$$
The energy density per unit frequency is defined as
$$u_\nu = \frac{1}{V}\frac{dE_\nu}{d\nu} = \frac{8\pi h\nu^3}{c^3\,(e^{h\nu/kT} - 1)} . \tag{4.76}$$
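The energy density per unit wavelength $u_\lambda$ is defined by $u_\lambda|d\lambda| = u_\nu|d\nu|$, where λ = c/ν is the wavelength. We have $|d\nu/d\lambda| = c/\lambda^2$, and
$$u_\lambda = \frac{c}{\lambda^2}\frac{8\pi h(c/\lambda)^3}{c^3\,(e^{hc/\lambda kT} - 1)} = \frac{8\pi hc}{\lambda^5}\frac{1}{\exp\left(\frac{hc}{\lambda kT}\right) - 1} . \tag{4.77}$$

4.5 Physics of Blackbody Radiation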
Astrophysical emission can be classified by origin as thermal or non-thermal. Thermal radiation is emitted by the thermal motion of charged particles; examples are blackbody radiation and thermal Bremsstrahlung. Non-thermal radiation is generated by other processes, in which the particles do not follow a thermal distribution. Examples are synchrotron radiation and inverse Compton scattering.
In this course, we will only discuss blackbody radiation. It is one of the most common and important radiation mechanisms. You will find it in the Sun and other stars, and even in the cosmic microwave background of the Universe. When we say the Universe has a temperature of 3 K, how do we measure it?
A blackbody is an idealized object that absorbs all radiation at all frequencies. It can be approximated by a cavity with a small hole in it, such that any incident light entering the hole practically never comes out (Figure 4.9). The photons inside are then in thermal equilibrium with the surroundings, and their distribution follows Bose-Einstein statistics. This gives the blackbody radiation. Every object with a finite temperature emits blackbody radiation. As we will see below, the peak frequency depends only on temperature. This explains why objects glow with different colors at different temperatures, e.g. from red to blue as they heat up.
Figure 4.9: A blackbody.
The specific intensity of blackbody radiation is given by Planck's law, which can be derived from Eqs. (4.14) and (4.76):
$$I_\nu^{\rm BB} \equiv B_\nu(T) = \frac{2h\nu^3/c^2}{\exp(h\nu/k_BT) - 1} , \tag{4.78}$$
where h is the Planck constant and $k_B$ is the Boltzmann constant. Examples of blackbody spectra are shown in Figure 4.10.
Figure 4.10: Blackbody radiation spectrum.
Note that thermal radiation is NOT the same as blackbody radiation. For the former, the source function $S_\nu$ is equal to the blackbody intensity $B_\nu$,
$$S_\nu = B_\nu(T) , \tag{4.79}$$
so that the emission and absorption coefficients are related by
$$j_\nu = \alpha_\nu B_\nu(T) . \tag{4.80}$$
This is Kirchhoff's law. For blackbody radiation, the intensity itself satisfies $I_\nu = B_\nu$. Thermal radiation becomes blackbody radiation in optically thick media. We describe below some properties of blackbody radiation.
4.5.1 Stefan-Boltzmann law
The flux from blackbody radiation is
$$F = \sigma_{SB}T^4 , \tag{4.81}$$
where $\sigma_{SB}$ is the Stefan-Boltzmann constant
$$\sigma_{SB} = \frac{2\pi^5k_B^4}{15c^2h^3} = 5.67\times10^{-5}\ {\rm erg\,cm^{-2}\,K^{-4}\,s^{-1}} . \tag{4.82}$$
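It can be proved by integrating the Planck spectrum directly:
$$F = \int_0^\infty F_\nu\,d\nu = \int_0^\infty\!\!\int B_\nu\cos\theta\,d\Omega\,d\nu = \int_0^\infty\!\!\int_0^{\pi/2}\!\!\int_0^{2\pi} B_\nu\cos\theta\sin\theta\,d\phi\,d\theta\,d\nu . \tag{4.83}$$
We consider only radiation going out from a surface; therefore, θ goes from 0 to π/2 only.
$$F = 2\pi\int_0^\infty\!\!\int_0^{\pi/2} B_\nu\cos\theta\sin\theta\,d\theta\,d\nu = 2\pi\int_0^\infty B_\nu\,d\nu\int_0^1\mu\,d\mu = \pi\int_0^\infty B_\nu\,d\nu , \tag{4.84}$$
where we have substituted µ = cos θ. Finally,
$$F = \pi\int_0^\infty\frac{2h\nu^3/c^2}{\exp(h\nu/k_BT) - 1}\,d\nu = \frac{2\pi h}{c^2}\left(\frac{k_BT}{h}\right)^4\int_0^\infty\frac{u^3}{e^u - 1}\,du , \tag{4.85}$$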
where u = hν/kB T . The integral is nontrivial but it can be shown that the answer
is π 4 /15. This gives Stefan-Boltzmann law above.
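The value of the integral and the resulting constant can be checked numerically. The sketch below is a simple midpoint-rule estimate under assumed CGS values of the constants (they are not quoted in these notes); the upper limit `u_max` and the number of steps are arbitrary choices that are large enough for the integrand to be negligible beyond the cutoff.

```python
import math

# Assumed CGS constants, not quoted in the notes.
H = 6.62607015e-27    # Planck constant [erg s]
K_B = 1.380649e-16    # Boltzmann constant [erg K^-1]
C = 2.99792458e10     # speed of light [cm s^-1]


def bose_integral(u_max=60.0, steps=200_000):
    """Midpoint-rule estimate of the integral of u^3 / (e^u - 1) from 0 to u_max."""
    du = u_max / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du
        total += u**3 / math.expm1(u)
    return total * du


INTEGRAL = bose_integral()
# Prefactor of Eq. (4.85) with T^4 divided out: (2 pi h / c^2) (k_B / h)^4.
SIGMA_SB = 2.0 * math.pi * K_B**4 / (C**2 * H**3) * INTEGRAL

print(INTEGRAL, math.pi**4 / 15.0)
print(SIGMA_SB)
```

The two printed values of the integral should agree to many digits, and the constant should reproduce the value in Eq. (4.82).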
This law gives the total energy emitted per unit area per unit time. As an example, the Sun has a surface temperature T = 5800 K. To estimate the solar luminosity $L_\odot$, we just need to integrate over its surface area $4\pi R_\odot^2$:
$$L_\odot = 4\pi R_\odot^2\,\sigma_{SB}T^4 . \tag{4.86}$$

4.5.2 Rayleigh-Jeans Law
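In the low-energy limit, $h\nu \ll k_BT$, we can expand
$$\exp\left(\frac{h\nu}{k_BT}\right) - 1 = \frac{h\nu}{k_BT} + \ldots \tag{4.87}$$
Substituting this into Eq. (4.78) gives the Rayleigh-Jeans limit of Planck's law,
$$I_\nu^{\rm RJ}(T) = \frac{2\nu^2}{c^2}\,k_BT . \tag{4.88}$$
Physically, it is the classical limit; if applied at all frequencies, it leads to the ultraviolet catastrophe.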
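4.5.3 Wien Law
In the high-energy limit, $h\nu \gg k_BT$, we can approximate
$$\exp\left(\frac{h\nu}{k_BT}\right) - 1 \approx \exp\left(\frac{h\nu}{k_BT}\right) . \tag{4.89}$$
Substituting this into Eq. (4.78) gives the Wien limit of Planck's law,
$$I_\nu^{\rm W}(T) = \frac{2h\nu^3}{c^2}\exp\left(-\frac{h\nu}{k_BT}\right) . \tag{4.90}$$
Unlike the Rayleigh-Jeans law, the Wien law is a quantum effect, as you can guess from the fact that it contains the constant h.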
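To see how good the two limits are, one can compare them with the full Planck function of Eq. (4.78). The sketch below assumes CGS constants and writes the limits as $2\nu^2k_BT/c^2$ (Rayleigh-Jeans) and $(2h\nu^3/c^2)\,e^{-h\nu/k_BT}$ (Wien), which follow from the expansions (4.87) and (4.89); the function names and the test temperature are illustrative.

```python
import math

# Assumed CGS constants, not quoted in the notes.
H = 6.62607015e-27    # Planck constant [erg s]
K_B = 1.380649e-16    # Boltzmann constant [erg K^-1]
C = 2.99792458e10     # speed of light [cm s^-1]


def planck(nu, temperature):
    """Planck function B_nu(T), Eq. (4.78)."""
    return 2.0 * H * nu**3 / C**2 / math.expm1(H * nu / (K_B * temperature))


def rayleigh_jeans(nu, temperature):
    """Low-frequency limit, valid for h*nu << k_B*T."""
    return 2.0 * nu**2 * K_B * temperature / C**2


def wien(nu, temperature):
    """High-frequency limit, valid for h*nu >> k_B*T."""
    return 2.0 * H * nu**3 / C**2 * math.exp(-H * nu / (K_B * temperature))


T_EXAMPLE = 5800.0  # example temperature [K]
for x in (0.01, 1.0, 10.0):  # x = h*nu / (k_B*T)
    nu = x * K_B * T_EXAMPLE / H
    b = planck(nu, T_EXAMPLE)
    print(x, rayleigh_jeans(nu, T_EXAMPLE) / b, wien(nu, T_EXAMPLE) / b)
```

The ratios approach 1 only in the appropriate regime: Rayleigh-Jeans for small $h\nu/k_BT$, Wien for large $h\nu/k_BT$. Near $h\nu \sim k_BT$ neither limit is accurate.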
4.5.4 Wien's Displacement Law
At what frequency does the blackbody radiation peak? We can find the answer by setting
$$\left.\frac{\partial B_\nu}{\partial \nu}\right|_{\nu=\nu_{\max}} = 0 . \tag{4.91}$$
This results in
$$x = 3(1 - e^{-x}) , \tag{4.92}$$
where $x \equiv h\nu_{\max}/k_BT$. This can only be solved numerically:
$$h\nu_{\max} = 2.82\,k_BT , \tag{4.93}$$
or
$$\nu_{\max} = 5.88\times10^{10}\,T\ {\rm Hz} , \tag{4.94}$$
where T is in K. This is Wien's displacement law. Similarly, we can also derive the law in wavelength:
$$\left.\frac{\partial B_\lambda}{\partial \lambda}\right|_{\lambda=\lambda_{\max}} = 0 . \tag{4.95}$$
Solving $y = 5(1 - e^{-y})$, where $y \equiv hc/\lambda_{\max}k_BT$, we obtain
$$\lambda_{\max} = 0.290\,T^{-1}\ {\rm cm} , \tag{4.96}$$
where T is in K. However, note that $\lambda_{\max}\nu_{\max} \neq c$. (Why?) For example, the Sun has a temperature of 5800 K, which corresponds to $\lambda_{\max}$ = 500 nm. This is the wavelength of green light, coincident with the peak sensitivity of human vision.
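The two transcendental equations can be solved with a few lines of code. The sketch below uses plain fixed-point iteration, which is one possible choice and converges here because the map $x \mapsto n(1-e^{-x})$ is a contraction near the nonzero root; the constants are assumed CGS values and the function name is illustrative.

```python
import math

# Assumed CGS constants, not quoted in the notes.
H = 6.62607015e-27    # Planck constant [erg s]
K_B = 1.380649e-16    # Boltzmann constant [erg K^-1]
C = 2.99792458e10     # speed of light [cm s^-1]


def solve_fixed_point(n, tol=1e-12, max_iter=200):
    """Nonzero root of x = n * (1 - exp(-x)), found by fixed-point iteration."""
    x = float(n)  # starting guess close to the root
    for _ in range(max_iter):
        x_new = n * (1.0 - math.exp(-x))
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")


X_NU = solve_fixed_point(3)       # root of Eq. (4.92)
Y_LAMBDA = solve_fixed_point(5)   # root of y = 5 (1 - exp(-y))

NU_MAX_PER_K = X_NU * K_B / H                  # nu_max / T   [Hz K^-1]
LAMBDA_MAX_TIMES_T = H * C / (Y_LAMBDA * K_B)  # lambda_max T [cm K]

print(X_NU, Y_LAMBDA)
print(NU_MAX_PER_K, LAMBDA_MAX_TIMES_T)
print(NU_MAX_PER_K * LAMBDA_MAX_TIMES_T / C)   # equals X_NU / Y_LAMBDA, not 1
```

The last line shows numerically that the product $\lambda_{\max}\nu_{\max}$ is only a fraction of c, because the two roots differ.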
4.5.5 Monotonicity
One important property of blackbody radiation is that the curves for different temperatures in Figure 4.10 never cross, i.e. a curve of higher temperature lies entirely above one of lower temperature. This can be proved by noting that
$$\frac{\partial B_\nu}{\partial T} = \frac{2h^2\nu^4}{c^2k_BT^2}\frac{\exp(h\nu/k_BT)}{[\exp(h\nu/k_BT) - 1]^2} \tag{4.97}$$
is always positive. This has two consequences. First, although the blackbody peak shifts toward shorter wavelengths (blue color) as the temperature increases, the intensity at long wavelengths (red color) always increases with temperature. Second, at a given ν, there is a one-to-one correspondence between $I_\nu$ and T. As we will show below, this can be used to define temperature.
4.5.6 Temperature Definitions
Brightness temperature
At a given frequency ν, one can equate the brightness of an object $I_\nu$ to the blackbody intensity. The temperature obtained this way is called the brightness temperature $T_b$, i.e.
$$I_\nu = B_\nu(T_b) . \tag{4.98}$$
We can then measure the brightness in units of K. This is often used in radio astronomy, where we express the surface brightness of a nebula in terms of brightness temperature, and also the system noise temperature of antennas. At radio frequencies, the Rayleigh-Jeans law is usually applicable, so that
$$T_b = I_\nu\frac{c^2}{2k_B\nu^2} . \tag{4.99}$$
This is essentially how infrared thermometers (pyrometers) work. They provide a useful way to measure temperature when conventional methods are not practical, e.g. for fast-moving objects, objects far away, or objects too hot to touch. The brightness temperature of giant pulses from some pulsars can exceed $5\times10^{39}$ K, the highest known brightness temperature in the Universe.
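As a sketch of how the definition is used in practice, the code below inverts Eq. (4.98) exactly and compares the result with the Rayleigh-Jeans approximation of Eq. (4.99). The constants are assumed CGS values, and the example frequency (1.4 GHz) and temperature (100 K) are arbitrary illustrative choices, not values from these notes.

```python
import math

# Assumed CGS constants, not quoted in the notes.
H = 6.62607015e-27    # Planck constant [erg s]
K_B = 1.380649e-16    # Boltzmann constant [erg K^-1]
C = 2.99792458e10     # speed of light [cm s^-1]


def planck(nu, temperature):
    """Planck function B_nu(T), Eq. (4.78)."""
    return 2.0 * H * nu**3 / C**2 / math.expm1(H * nu / (K_B * temperature))


def brightness_temperature(i_nu, nu):
    """Exact solution of I_nu = B_nu(T_b) for T_b, Eq. (4.98)."""
    return (H * nu / K_B) / math.log1p(2.0 * H * nu**3 / (C**2 * i_nu))


def brightness_temperature_rj(i_nu, nu):
    """Rayleigh-Jeans approximation, Eq. (4.99)."""
    return i_nu * C**2 / (2.0 * K_B * nu**2)


NU_RADIO = 1.4e9                      # example radio frequency [Hz]
I_EXAMPLE = planck(NU_RADIO, 100.0)   # intensity of a 100 K blackbody

print(brightness_temperature(I_EXAMPLE, NU_RADIO))
print(brightness_temperature_rj(I_EXAMPLE, NU_RADIO))
```

At radio frequencies the two values agree to better than a tenth of a percent, which is why Eq. (4.99) is normally sufficient there; at infrared or optical frequencies the exact inversion should be used instead.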
Color temperature
If the measured spectrum of an object has a shape more or less like that of a blackbody, we can perform a fit to obtain the temperature. Even simpler, one could just fit the peak of the emission and then apply Wien's displacement law. This gives the color temperature. This is the same as the color temperature you set in digital photography or on TV screens. Why does a lower color temperature give a "warmer" tone, while a higher color temperature gives a "cooler" tone? Some advanced infrared thermometers measure two or more frequency bands to estimate the color temperature from the intensity ratio. This can give more accurate measurements.
Effective temperature
Recall that the Stefan-Boltzmann law relates the total flux to the temperature. This can also be used to define a temperature. The effective temperature $T_{\rm eff}$ is the temperature of a blackbody that emits the same amount of flux as the observed object:
$$F_{\rm obs} = \sigma_{SB}T_{\rm eff}^4 . \tag{4.100}$$
The Sun is not a perfect blackbody, but we can derive its effective temperature of 5800 K from the total flux. The same applies to other stars.
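As a quick illustration, the effective temperature of the Sun can be recovered from its luminosity and radius by combining Eq. (4.100) with $F = L/4\pi R^2$. The solar luminosity and radius below are assumed nominal values that are not given in these notes, and the function name is illustrative.

```python
import math

SIGMA_SB = 5.67e-5   # Stefan-Boltzmann constant [erg cm^-2 K^-4 s^-1], Eq. (4.82)
L_SUN = 3.828e33     # assumed nominal solar luminosity [erg s^-1]
R_SUN = 6.957e10     # assumed nominal solar radius [cm]


def effective_temperature(luminosity, radius):
    """T_eff [K] of a sphere with the given luminosity [erg/s] and radius [cm]."""
    flux = luminosity / (4.0 * math.pi * radius**2)   # surface flux F_obs
    return (flux / SIGMA_SB) ** 0.25


T_EFF_SUN = effective_temperature(L_SUN, R_SUN)
print(f"T_eff ~ {T_EFF_SUN:.0f} K")
```

The result is close to the 5800 K quoted above. Because of the fourth root, even a 10% error in the flux changes $T_{\rm eff}$ by only about 2.5%.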
4.6 Scattering Cross Section
How strongly do two particles interact with each other? How can we describe it? Experimentally, we usually send a uniform beam of particles, all with the same mass and energy, toward a target particle. Some incident particles will be absorbed, some will be scattered, and some may even be chemically changed into other particles. The number of incident particles affected is proportional to the number of incident particles. If this ratio is larger, the interaction between the incident particles and the target is considered to be stronger.
Figure 4.11: The total cross section depends on the orientation. Panels (a) and (b).
The intensity of the beam is defined to be the number of incident particles crossing unit area normal to the beam in unit time. The total cross section is defined to be
$$\sigma_T = \frac{\text{number of particles affected per unit time}}{\text{incident intensity}} . \tag{4.101}$$
Note that the total cross section has the dimension of an area, because the numerator has the dimension of a pure number per time, while the denominator has the dimension of a pure number per time per area.
To illustrate the idea, let us say there is a board of area A which absorbs every particle incident on it. If the board is normal to the incident particles, Fig. 4.11a, it follows immediately that the total cross section is
$$\sigma_T = A . \tag{4.102}$$
A larger total cross section does mean a stronger interaction. If the board is parallel
to the direction of motion of the incident particles (and the board is very thin),
Fig. 4.11b, the total cross section is zero. Hence, the cross section depends on the
orientation in general.
For simplicity, from now on, we assume that the incident particles are only scattered. To give a finer description of scattering, we define the differential cross section, σ(Ω), as
$$\sigma(\Omega)\,d\Omega = \frac{\text{number of particles scattered into solid angle } d\Omega \text{ per unit time}}{\text{incident intensity}} . \tag{4.103}$$
Figure 4.12: The differential cross section.
In this formula, we have assumed that the place where we actually do the measurement is far away from the scattering center. If an incident particle is scattered, it must be scattered in some direction. We have
$$\sigma_T = \int\sigma(\Omega)\,d\Omega , \tag{4.104}$$
where the integration is over the 4π solid angle.
Figure 4.13: The geometry for the calculation of differential cross section.
We are going to calculate the differential cross section of the gravitational interaction. We know the relation between the impact parameter and the scattering angle from Eq. (3.84). Because of the cylindrical symmetry, we only have to calculate the θ dependence of the differential cross section, σ(Ω) = σ(θ).
Referring to Fig. 4.13, all incident particles passing through the annulus at the left, with radii between b and b + db, will be scattered into the annulus at the right, with angles between θ and θ + dθ. If the incident intensity is I particles per unit normal area per unit time, then the number of particles passing between b and b + db is $I\,2\pi b\,db$ per unit time. This number should be equal to $2\pi I\sigma(\theta)\sin\theta\,d\theta$ by the definition of the differential cross section, where the 2π comes from the integration over ϕ. We have
$$\sigma(\theta) = \frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right| . \tag{4.105}$$
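We take the absolute value because b and θ usually vary in opposite directions. Then, by Eq. (3.84), we have
$$\begin{aligned}
\cos\left(\frac{\pi - \theta}{2}\right) &= \left(1 + \frac{b^2v^4}{G^2M^2}\right)^{-1/2} \\
\frac{1}{\sin^2(\theta/2)} &= 1 + \frac{b^2v^4}{G^2M^2} \\
\frac{b^2v^4}{G^2M^2} &= \cot^2\frac{\theta}{2} \\
\frac{bv^2}{GM} &= \cot\frac{\theta}{2} \\
\frac{v^2}{GM}\frac{db}{d\theta} &= -\frac{1}{2\sin^2(\theta/2)} \\
\frac{db}{d\theta} &= -\frac{GM}{2v^2}\frac{1}{\sin^2(\theta/2)} .
\end{aligned} \tag{4.106}$$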
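The differential cross section is
$$\begin{aligned}
\sigma(\theta) &= \frac{b}{\sin\theta}\,\frac{GM}{2v^2\sin^2(\theta/2)} \\
&= \frac{G^2M^2}{2v^4}\frac{\cot(\theta/2)}{\sin\theta\,\sin^2(\theta/2)} \\
&= \frac{G^2M^2}{4v^4\sin^4(\theta/2)} \\
&= \frac{(GMm)^2}{16E^2\sin^4(\theta/2)} ,
\end{aligned} \tag{4.107}$$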
where $E = mv^2/2$ is the energy of the incident particles. $M \gg m$ is assumed, such that the reduced mass $\mu \approx m$. In principle, we could get the total cross section by integration:
$$\sigma_T = \int\sigma(\theta)\sin\theta\,d\phi\,d\theta . \tag{4.108}$$
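If we actually do the integration, we find that $\sigma_T$ diverges to infinity:
$$\begin{align}
\sigma_T &= \int\frac{(GMm)^2}{16E^2\sin^4\frac{\theta}{2}}\sin\theta\,d\phi\,d\theta \notag \\
&= \frac{2\pi(GMm)^2}{16E^2}\int_0^\pi\frac{2\sin\frac{\theta}{2}\cos\frac{\theta}{2}\,d\theta}{\sin^4\frac{\theta}{2}} \notag \\
&= \frac{2\pi(GMm)^2}{4E^2}\int_{\theta=0}^{\theta=\pi}\frac{d(\sin\frac{\theta}{2})}{\sin^3\frac{\theta}{2}} \notag \\
&= -\frac{2\pi(GMm)^2}{8E^2}\left.\frac{1}{\sin^2\frac{\theta}{2}}\right|_0^\pi \tag{4.109} \\
&= \infty . \tag{4.110}
\end{align}$$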
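The divergence comes entirely from small scattering angles. A way to see this, sketched below, is to integrate only over angles larger than a cutoff $\theta_{\min}$ (an assumption introduced here for illustration, not part of the derivation above). Writing $K \equiv (GMm)^2/16E^2$, the integral in Eq. (4.109) from $\theta_{\min}$ to π evaluates to $4\pi K\,[1/\sin^2(\theta_{\min}/2) - 1]$, which the code checks against a direct numerical integration.

```python
import math


def sigma_diff(theta, k=1.0):
    """Differential cross section of Eq. (4.107); k stands for (G M m)^2 / (16 E^2)."""
    return k / math.sin(theta / 2.0) ** 4


def sigma_total_closed(theta_min, k=1.0):
    """Cross section for scattering angles above theta_min (closed form)."""
    return 4.0 * math.pi * k * (1.0 / math.sin(theta_min / 2.0) ** 2 - 1.0)


def sigma_total_numeric(theta_min, k=1.0, steps=200_000):
    """Midpoint-rule integral of 2*pi*sigma(theta)*sin(theta) from theta_min to pi."""
    d_theta = (math.pi - theta_min) / steps
    total = 0.0
    for i in range(steps):
        theta = theta_min + (i + 0.5) * d_theta
        total += sigma_diff(theta, k) * math.sin(theta)
    return 2.0 * math.pi * total * d_theta


# The cross section grows without bound as the cutoff angle shrinks.
for theta_min in (1.0, 0.1, 0.01, 0.001):
    print(theta_min, sigma_total_closed(theta_min))
```

For small cutoffs the result scales as $1/\theta_{\min}^2$, so every factor of ten in angular resolution raises the measured cross section by a factor of a hundred.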
Physically, this is because the total cross section describes how far away the target
can affect the incident particles. But gravitational interaction is infinitely long
range, all incident particles will be affected, although the effect could be tiny.
4.7 Chemical Potential and Saha Equation (Optional)
We would like to study the relative abundance of the reactants and products of a
reaction in equilibrium. We will only derive the simplest case in the first half of
this section and state the main result, Saha equation, in the second.
The total energy of a particle moving with momentum p is given by $E^2 = m^2c^4 + p^2c^2$. If it is not moving near the speed of light, we have
$$E = \sqrt{m^2c^4 + p^2c^2} \approx mc^2 + \frac{p^2}{2m} . \tag{4.111}$$
Note that the rest mass energy $mc^2$ includes the internal energy, for example, the binding energy. Consider a simple reaction
$$A \rightleftharpoons B , \tag{4.112}$$
where A and B could be two excited states of a single particle. Let the total number of particles be N. We have the constraint $N = N_A + N_B$ in obvious notation. The total energy of the microstate in which there are $N_A$ particles of type A with momenta $p_i^A$, $i = 1, \ldots, N_A$, and $N_B$ particles of type B with momenta $p_j^B$, $j = 1, \ldots, N_B$, is
$$\sum_i\left(m_Ac^2 + \frac{(p_i^A)^2}{2m_A}\right) + \sum_j\left(m_Bc^2 + \frac{(p_j^B)^2}{2m_B}\right) . \tag{4.113}$$
There are $N!/(N_A!\,N_B!)$ microstates with the same energy. Hence, the probability that the system is in a state with $N_A$ particles of type A with momenta $\{p_i^A\}$ and $N_B$ particles of type B with momenta $\{p_j^B\}$ is, according to Eq. (4.44), proportional to
$$\frac{N!}{N_A!\,N_B!}\exp\left\{-\beta\left[\sum_i\left(m_Ac^2 + \frac{(p_i^A)^2}{2m_A}\right) + \sum_j\left(m_Bc^2 + \frac{(p_j^B)^2}{2m_B}\right)\right]\right\} . \tag{4.114}$$
We do not care about the momenta of the particles, so we sum (integrate) over all microstates with different momenta. We have already done this in Eq. (4.60). The probability that the system is in a state with $N_A$ particles of type A and $N_B$ particles of type B (with any momenta) is proportional to
$$\frac{N!}{N_A!\,N_B!}\,e^{-\beta N_Am_Ac^2}\,V^{N_A}\left(\frac{2\pi m_AkT}{h^2}\right)^{3N_A/2} e^{-\beta N_Bm_Bc^2}\,V^{N_B}\left(\frac{2\pi m_BkT}{h^2}\right)^{3N_B/2} . \tag{4.115}$$
By Stirling's formula for large s, $\ln s! \approx s\ln s - s$, we can rewrite one factor in the above formula as
$$\frac{1}{N_A!}\,e^{-\beta N_Am_Ac^2}\,V^{N_A} = e^{-\beta N_Am_Ac^2 + N_A - N_A\ln N_A + N_A\ln V} = e^{-\beta N_A\left(m_Ac^2 + kT\ln(N_A/V) - kT\right)} . \tag{4.116}$$
For a classical ideal gas of particles with mass m at temperature T, the chemical potential is defined by
$$\mu = mc^2 - kT\ln\left(\frac{g_sn_Q}{n}\right) , \tag{4.117}$$
where n is the number density of particles in the gas, $n_Q$ is called the quantum concentration,
$$n_Q = \left(\frac{2\pi mkT}{h^2}\right)^{3/2} , \tag{4.118}$$
and $g_s$ is the number of internal degrees of freedom of the particle. It depends on the spin of the particle and, for example, on the excited states of the hydrogen atom. For an electron, proton or neutron, $g_s = 2$. We have implicitly assumed that for our particles A and B, $g_A = g_B = 1$.
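In terms of these, with $n_A = N_A/V$ and $n_B = N_B/V$, Eq. (4.115) is
$$\begin{aligned}
& N!\; e^{-\beta N_A\left(m_Ac^2 + kT\ln(N_A/V) - kT\right) + N_A\ln n_{QA}}\; e^{-\beta N_B\left(m_Bc^2 + kT\ln(N_B/V) - kT\right) + N_B\ln n_{QB}} \\
&= N!\, e^N\, e^{-\beta N_A\left(m_Ac^2 - kT\ln(n_{QA}/n_A)\right)}\, e^{-\beta N_B\left(m_Bc^2 - kT\ln(n_{QB}/n_B)\right)} \\
&= N!\, e^N\, e^{-\beta N_A\mu_A}\, e^{-\beta N_B\mu_B} \\
&= N!\, e^N\, e^{-\beta N\mu_B}\exp[-\beta N_A(\mu_A - \mu_B)] .
\end{aligned} \tag{4.119}$$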
Since $\mu_B$ depends only very weakly on $N_A$ (through $N_B = N - N_A$), the probability is greatest when $\mu_A = \mu_B$. This is the main result: in equilibrium, the chemical potentials of the reactant and the product are equal.
Let $\Delta E = m_Ac^2 - m_Bc^2$. The equality of chemical potentials implies
$$\begin{aligned}
m_Ac^2 - kT\ln\left(\frac{n_{QA}}{n_A}\right) &= m_Bc^2 - kT\ln\left(\frac{n_{QB}}{n_B}\right) \\
\Delta E &= -kT\ln\left(\frac{n_A\,n_{QB}}{n_B\,n_{QA}}\right) .
\end{aligned} \tag{4.120}$$
If ΔE is much smaller than $m_Ac^2$, then $n_{QA} \approx n_{QB}$ and
$$n_A = n_B\,e^{-\Delta E/kT} . \tag{4.121}$$
Chemical potential can be interpreted as how much energy is needed to put one
more particle into the system. It depends on the physical parameters of the system.
For example, to keep the temperature unchanged, we have to speed up the particle
before injecting it to the system. Since it costs arbitrarily low energy to create a
photon with very long wavelength, we claim that the chemical potential of photon
is zero.
For reactions with more reactants,
$$A + B \rightleftharpoons C + D , \tag{4.122}$$
at thermodynamic equilibrium, the energy needed to create particles A and B must equal the energy needed to create particles C and D. Hence, we claim that
$$\mu(A) + \mu(B) = \mu(C) + \mu(D) . \tag{4.123}$$
Substituting the expression for the chemical potential in Eq. (4.117) into this relation gives the Saha equation.
Let us consider a very important example,
$$\gamma + H_n \rightleftharpoons e^- + p , \tag{4.124}$$
the ionization of a hydrogen atom in the n-th excited state by absorbing a photon γ. To satisfy the assumption that they are ideal gases, their densities must be much less than the quantum concentration, $n \ll n_Q$. As mentioned, µ(γ) = 0 and
$$\mu(e) = m_ec^2 - kT\ln\left(\frac{g_en_{Qe}}{n_e}\right) , \tag{4.125}$$
$$\mu(p) = m_pc^2 - kT\ln\left(\frac{g_pn_{Qp}}{n_p}\right) , \tag{4.126}$$
$$\mu(H_n) = m(H_n)c^2 - kT\ln\left(\frac{g(H_n)\,n_{QH}}{n(H_n)}\right) . \tag{4.127}$$
The energy of the n-th excited hydrogen atom is $E_n$, Eq. (2.7), hence we have
$$m(H_n)c^2 = m_ec^2 + m_pc^2 + E_n . \tag{4.128}$$
Substituting these into the chemical potential equation, we have
$$-E_n = kT\ln\left(\frac{n(H_n)}{g(H_n)\,n_{QH}}\,\frac{g_en_{Qe}}{n_e}\,\frac{g_pn_{Qp}}{n_p}\right) . \tag{4.129}$$
Since the mass of a hydrogen atom is approximately equal to the mass of a proton, $n_{QH} = n_{Qp}$. Also, let $\varepsilon_n = |E_n|$. The Saha equation becomes
$$\exp(-\varepsilon_n/kT) = \frac{g(H_n)}{g_eg_p}\,\frac{n_en_p}{n_{Qe}\,n(H_n)} . \tag{4.130}$$
For free electrons, $g_e = 2$, so
$$\frac{n_p}{n(H_n)} = \frac{2g_p}{n_e\,g(H_n)}\left(\frac{2\pi m_ekT}{h^2}\right)^{3/2}\exp(-\varepsilon_n/kT) . \tag{4.131}$$
In most cases, there is no net charge, $n_e = n_p$, and all the g factors are of order unity. We finally have
$$\frac{n_e^2}{n(H_n)} \sim \left(\frac{2\pi m_ekT}{h^2}\right)^{3/2}\exp(-\varepsilon_n/kT) . \tag{4.132}$$
We see that there is a significant change in the percentage of ionization when the temperature is around $\varepsilon_1/k$, which is about 160,000 K.
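Example: In the Sun's photosphere, $n_e = 1.88\times10^{13}$ cm$^{-3}$ and T = 5777 K. What is the number ratio between Ca II (singly-ionized calcium) and Ca I (neutral calcium), given that $g_{\rm II} = 2.30$, $g_{\rm I} = 1.32$ and the ionization energy of Ca I is $\chi_{\rm I} = 6.11$ eV?
Using the Saha equation,
$$\begin{aligned}
\frac{N_{\rm II}}{N_{\rm I}} &= \frac{2g_{\rm II}}{g_{\rm I}\,n_e}\frac{(2\pi m_ekT)^{3/2}}{h^3}\,e^{-\chi_{\rm I}/kT} \\
&= \frac{2\times2.30}{1.32\times1.88\times10^{13}}\frac{(2\pi m_ekT)^{3/2}}{h^3}\,e^{-6.11\,{\rm eV}/kT} \\
&\approx 927 .
\end{aligned}$$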
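The example can be reproduced with a short script. The sketch below evaluates the same expression in CGS units; the physical constants are assumed values that are not quoted in these notes, and the function name `saha_ratio` is illustrative. The last digits of the result depend on the constants used, so it should be read as a check of the order of magnitude (several hundred to a thousand), not of the exact figure.

```python
import math

# Assumed CGS constants, not quoted in the notes.
H = 6.62607015e-27     # Planck constant [erg s]
K_B = 1.380649e-16     # Boltzmann constant [erg K^-1]
M_E = 9.1093837e-28    # electron mass [g]
EV = 1.602176634e-12   # 1 eV in erg


def saha_ratio(n_e, temperature, g_upper, g_lower, chi_ev):
    """Number ratio N_II / N_I from the Saha equation.

    n_e is the electron density [cm^-3], chi_ev the ionization energy [eV],
    g_upper and g_lower the statistical weights of the ionized and neutral species.
    """
    kt = K_B * temperature
    n_q = (2.0 * math.pi * M_E * kt / H**2) ** 1.5   # electron quantum concentration
    return 2.0 * g_upper / (g_lower * n_e) * n_q * math.exp(-chi_ev * EV / kt)


RATIO_CA = saha_ratio(n_e=1.88e13, temperature=5777.0,
                      g_upper=2.30, g_lower=1.32, chi_ev=6.11)
print(f"N_II / N_I ~ {RATIO_CA:.0f}")
```

The ratio comes out at roughly $9\times10^2$, consistent with the value above: almost all calcium in the solar photosphere is singly ionized, even though kT is only about 0.5 eV, far below the ionization energy.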