PHYS 3651 The Physical Universe

Dr. S.C.Y. Ng

Contents

Overview
1 Spherical Astronomy
    1.1 Sky and Celestial Sphere
    1.2 Equatorial Coordinate System
        1.2.1 Longitude and Latitude
        1.2.2 Motion of the Sun
        1.2.3 Special Points
        1.2.4 Equatorial Coordinates
        1.2.5 Circumpolar Stars
        1.2.6 Great Circle
    1.3 Other Celestial Coordinate Systems
    1.4 Limitations of Coordinate Systems
        1.4.1 Precession
        1.4.2 Aberration of Light
        1.4.3 Parallax
2 Light and Telescopes
    2.1 Electromagnetic Wave
    2.2 Magnitudes
    2.3 Spectrum, Spectral Lines, and Atoms
    2.4 Optics and Telescopes
        2.4.1 Basics
        2.4.2 Refracting telescopes
        2.4.3 Reflecting and catadioptric telescopes
        2.4.4 Magnification and resolution
        2.4.5 Lens speed
    2.5 CCD
3 Celestial Mechanics
    3.1 Newton's Laws of Motion
    3.2 Newton's Gravitation
        3.2.1 Roche Lobe
        3.2.2 Critical Density of the Universe
        3.2.3 Virial Theorem
    3.3 Two-body Problem
        3.3.1 Kepler's Laws of Planetary Motion
        3.3.2 Orbits in Two-body Problem
        3.3.3 Proof of Kepler's Laws
    3.4 Impact Parameter and Scattering Angle
    3.5 Restricted Three-body Problem
4 Introduction to Radiative Processes
    4.1 Solid Angle
    4.2 Specific Intensity and Flux
    4.3 Emission and Absorption
    4.4 Basics of Statistical Mechanics (Optional)
        4.4.1 Thermodynamics
        4.4.2 Isolated Systems
        4.4.3 Systems in a Heat Bath
        4.4.4 The Perfect Classical Gas
        4.4.5 The Partition Function
        4.4.6 The Perfect Quantal Gas and Quantum Statistics
        4.4.7 The Partition Function
        4.4.8 Derivation of Blackbody Radiation
    4.5 Physics of Blackbody Radiation
        4.5.1 Stefan-Boltzmann law
        4.5.2 Rayleigh-Jeans Law
        4.5.3 Wien Law
        4.5.4 Wien's Displacement Law
        4.5.5 Monotonicity
        4.5.6 Temperature Definitions
    4.6 Scattering Cross Section
    4.7 Chemical Potential and Saha Equation (Optional)

Overview

• What is this course about? This is an introductory course in astrophysics, likely your first formal course in the subject.

• What will I learn in this course? The aim is to provide essential knowledge of the mathematical tools and physical concepts that are used in astrophysics, in order to help you appreciate the physical principles of how the Universe works and to set the stage for more advanced courses in astrophysics. Throughout this course, you may not see pretty astronomy pictures, but you will see a lot of physics equations. These are what professional astrophysicists see most of the time, and they are how we extract scientific results from observations.

• What is astrophysics about? Astrophysics is the study of the physical properties of objects in the Universe and their interactions. It has close connections with many other fields of physics, including mechanics, electromagnetism, statistical mechanics, relativity, and even quantum physics in some cases. As we believe that the laws of physics are universal, we can apply the laws developed on Earth to celestial objects to understand how they work. In some cases, we can also do it the other way round: use the Universe as a laboratory to test the laws of physics, in particular under the most extreme conditions, which can never be reproduced on Earth. Some may consider astronomers to be those who study the physical properties of celestial objects, and astrophysicists to be those who use celestial objects to study physics.

• Why observations? Unlike other branches of physics, astrophysics relies heavily on observations.
This is because, owing to the distance and scale of celestial objects, it is very difficult (if not impossible) to collect samples for detailed study or to reproduce them in laboratories. Fundamentally, there are only four things one can measure with astronomical observations: position, spectrum (flux), time, and polarisation. In Chapter 1, we will introduce the coordinate systems, which are the foundation of positional astronomy. They are also the language astronomers use when specifying an object to one another. Different coordinate systems and their limitations will be discussed.

• Course overview: In traditional astronomy, electromagnetic radiation is the main cosmic messenger we rely on. In Chapter 2, we will talk about the basics of light and how we detect it using telescopes. In Chapter 4, we will study how radiation is generated and how it propagates. This can give insights into the physical processes happening inside celestial objects. Together these topics provide an introduction to spectroscopy. In Chapter 3, we will review Newtonian mechanics and apply it to simple two-body problems. We will see how the orbital motions of celestial bodies are governed by simple laws of physics.

A note on units: Throughout this course, we will follow the tradition in this field and use cgs units, i.e. cm, g, and s. This is what you will see in research papers. Force will be expressed in dyn (= g cm s$^{-2}$) and energy in erg (= g cm$^2$ s$^{-2}$). Charge is in electrostatic units of charge (esu), so that Coulomb's law becomes $F = q_1 q_2 / r^2$.

Exercise: 1 N = ____ dyn and 1 J = ____ erg.

Syllabus

Ch. 1 Spherical Astronomy: Sky and Celestial Sphere, Equatorial Coordinate System, Other Celestial Coordinate Systems, Limitations of Coordinate Systems.

Ch. 2 Light and Telescopes: Electromagnetic Wave, Magnitudes, Spectrum, Spectral Lines, and Atoms, Optics and Telescopes, CCD.

Ch. 3 Celestial Mechanics: Newton's Laws of Motion, Kepler's Laws, Two-body Problems, Scattering, Restricted Three-body Problem.

Ch. 4 Introduction to Radiative Processes: Solid Angle, Specific Intensity and Flux, Emission and Absorption, Blackbody Radiation.

Learning objectives

Ch. 1 Spherical Astronomy: Understand the definition and limitations of coordinate systems. Be able to apply the equatorial coordinates and to determine the angular separation between points.

Ch. 2 Light and Telescopes: Be able to convert between magnitude and flux. Understand the formation of spectral lines. Be able to compare the working principles of different types of telescopes and to calculate the diffraction limit.

Ch. 3 Celestial Mechanics: Be able to calculate orbits of celestial bodies using Newton's law of gravitation. Be able to solve the two-body problem and derive Kepler's laws. Understand the physical significance of Lagrangian points.

Ch. 4 Introduction to Radiative Processes: Understand the basic terminology of radiative transfer and the transfer equation. Be able to apply blackbody radiation to astrophysical situations.

Chapter 1 Spherical Astronomy

(Chapters 1 and 3.1 in textbook.)

Astronomy is the science that studies objects in the sky. We need a coordinate system to tell others where those objects are. How do we define such a system? The first natural choice you may come up with is the altitude-azimuth coordinates (or horizontal coordinates). When you take a picture of the sky, you specify the altitude (elevation) and azimuth (angle around the horizon). However, what is the problem with this system? Objects are constantly moving across the sky, and these coordinates depend on the observer's location! Therefore, we need a coordinate system fixed on the sky. Since the sky, or more precisely the celestial sphere, is a sphere, we have to study spherical geometry.

1.1 Sky and Celestial Sphere

What we see as the sky is, in fact, the mostly empty space outside the Earth's surface.
During the daytime, the scattering of sunlight by the atmosphere brightens the sky, and we cannot see much beyond the atmosphere or even the clouds. At night, without the interference of the Sun, we can see much further, including the stars, galaxies and many other objects in the universe.

Figure 1.1: Earth inside the infinitely large celestial sphere.

Since the sky or the universe is all around us, our ancestors thought that we were inside a large sphere, the celestial sphere, and that all stars, including the Sun, were moving on the celestial sphere. We now know that the celestial sphere is not real, but the concept that we are inside an imaginary infinite sphere is still very useful in astronomy (Fig. 1.1), especially when we want to consider the positions of stars. This is what we mean by the celestial sphere in what follows.

A fine point about the celestial sphere is where exactly its center lies. When we observe objects near to us, like artificial satellites or even the Moon, we have to be precise, because their positions relative to the distant stars depend on the choice of center. Two common choices are the center of the Earth, which defines the geocentric coordinate system, and the position of the observer on the surface of the Earth, which defines the topocentric coordinate system. We will not worry about the difference in what follows.

Because the Earth rotates from west to east, everything in the sky appears to move from east to west. As a result, the Sun and almost all the stars rise in the east and set in the west. If the celestial sphere were fixed to the Earth, the positions of the stars would change from minute to minute. That would not be very convenient. The celestial sphere is, hence, fixed to the stars. (Here we implicitly assume that stars do not move. In fact, they do move in space. However, they are so far away that we can only detect the motions of very few of them.)
For observers on the Earth, the celestial sphere appears to rotate once a day, just as the stars do. However, how far away the stars are from us is not shown on the celestial sphere. In other words, the celestial sphere is just a two-dimensional projection of the three-dimensional universe. We are not going to talk much about distance measurements of celestial objects, because they are very difficult, although very important.

1.2 Equatorial Coordinate System

1.2.1 Longitude and Latitude

Before we go into the details of the coordinate systems on the celestial sphere, we first briefly review the coordinate system on the Earth's surface. The Earth's surface is also a sphere. We use longitude and latitude to specify the position of, say, a city on the Earth. The latitude is defined as the angle subtended at the center of the Earth, measured from the equator, where the equator is the great circle mid-way between the north and south poles. Longitude is defined as the angular distance east or west of a reference meridian, an imaginary line running from the north pole to the south pole. The reference meridian is chosen as the one through the Greenwich Observatory in England. For example, the latitude and longitude of Hong Kong are about 22.5° N and 114.2° E, i.e. 22.5° north of the equator and 114.2° east of Greenwich.

Figure 1.2: Coordinate system on Earth and the celestial sphere with equatorial coordinate system.

We now go back to the celestial sphere (Fig. 1.2). Like the surface of the Earth, it has two poles: the north and south celestial poles. They lie directly above the Earth's poles. The celestial equator lies directly above the Earth's equator. An observer standing on the Earth's surface can only see half of the celestial sphere at one time. The other half is blocked by the Earth itself.
The visible half is bounded by the observer's horizon. The point on the celestial sphere directly above the observer is called the zenith. The zenith depends on the position of the observer on the Earth and, because of the rotation of the Earth, it is not a fixed point on the celestial sphere.

1.2.2 Motion of the Sun

Figure 1.3: Solar day vs. sidereal day.

We use solar time in daily life. A solar day is 24 hours, which is defined as the time period for the Sun to return to the same position in the sky as observed on Earth. Since the Earth orbits around the Sun, it has to rotate about 361 degrees in 24 hours (see Figure 1.3). On the other hand, a sidereal day is the time for a distant star to return to the same position in the sky. The Earth rotates ____° during this period. Therefore, a ____ day is shorter and stars rise earlier/later every day by ____ min.

As the Earth revolves around the Sun, the latter appears to move with respect to the background stars. Over one year, the Sun appears to move around the celestial sphere once. The path of this motion is called the ecliptic. Since the rotation axis of the Earth is not perpendicular to the plane of revolution, the ecliptic does not coincide with the celestial equator. It makes an angle of 23.5° with the celestial equator, the same angle by which the rotational axis of the Earth is tilted from the revolution axis.

Questions: There are 88 constellations in the sky but only 12 are zodiac constellations. Why? What does the ecliptic look like in the equatorial coordinate system?

1.2.3 Special Points

The two points where the ecliptic intersects the celestial equator are called the equinoxes. The vernal equinox is the point where the Sun crosses the celestial equator from the southern to the northern half of the celestial sphere, around March 21 each year. It is in the constellation Pisces. The vernal equinox also marks the origin of the celestial coordinate system, as we will discuss below.
The autumnal equinox is the point where the Sun goes from the northern half to the southern half, around September 23 each year. It is in the constellation Virgo. On the days of the equinoxes, the Sun rises due east and sets due west.

There are two other special points on the ecliptic. At the summer solstice, the Sun reaches the greatest distance from the celestial equator in the northern half of the celestial sphere, around June 21. For an observer on Earth, the Sun rises and sets in different directions on different days during the year. On the summer solstice, it rises and sets at the northernmost points. It is the longest day and shortest night for the northern hemisphere. While it is summer for the northern hemisphere, it is winter for the southern hemisphere. At the winter solstice, the Sun reaches the greatest distance from the celestial equator in the south, around December 22. The Sun rises and sets at the southernmost points of the year. It is the shortest day, and winter, for the northern hemisphere. The names of the solstices are biased towards people in the northern hemisphere.

As a note, have you ever wondered why Easter, unlike many other festivals, is not on the same day every year? This is because Easter is held on the first Sunday after the first full moon occurring on or after the vernal equinox.

Discussion: At the summer solstice, from which part of the Earth will one see the Sun pass directly overhead? This is called the Tropic of Cancer. What is the difference in the Sun's path over a year north and south of this line? On the same day, from which parts of the Earth will one not see the Sun rise and not see the Sun set? (Ans: the Antarctic Circle and the Arctic Circle, respectively. What are their latitudes?) The counterpart of the Tropic of Cancer in the southern hemisphere is called the Tropic of Capricorn. What is its latitude and what is special about this line?

Question: Do you know why there are four seasons?
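As a numerical aside on the solar and sidereal days of Section 1.2.2, the following minimal sketch estimates the length of the sidereal day. It assumes a year of 365.2422 mean solar days (a value not given in these notes) and that the Earth makes exactly one extra turn relative to the stars per year; the function and constant names are illustrative only.

```python
# Rough estimate of the sidereal day from the length of the year.
# Assumption (not from the notes): one year = 365.2422 mean solar days.

SOLAR_DAY_S = 24 * 3600.0   # mean solar day in seconds
DAYS_PER_YEAR = 365.2422    # assumed length of the year in solar days


def sidereal_day_seconds(days_per_year=DAYS_PER_YEAR):
    """Length of the sidereal day in seconds.

    In one year the Earth turns (days_per_year + 1) times relative to the
    distant stars, but only days_per_year times relative to the Sun.
    """
    return SOLAR_DAY_S * days_per_year / (days_per_year + 1.0)


def rotation_per_solar_day_deg(days_per_year=DAYS_PER_YEAR):
    """Angle in degrees the Earth turns relative to the stars in one solar day."""
    return 360.0 * (days_per_year + 1.0) / days_per_year


if __name__ == "__main__":
    sid = sidereal_day_seconds()
    print(f"sidereal day: {sid:.1f} s")
    print(f"shorter than the solar day by {(SOLAR_DAY_S - sid) / 60.0:.2f} min")
    print(f"rotation in one solar day: {rotation_per_solar_day_deg():.3f} deg")
```

Under these assumptions the rotation in one solar day comes out slightly below 361°, consistent with the "about 361 degrees" quoted in Section 1.2.2.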
1.2.4 Equatorial Coordinates

The most common coordinate system on the celestial sphere is the equatorial coordinate system. As discussed above, this system is like an extension of longitude and latitude on Earth to the sky. The reference plane is the celestial equator and the coordinates used are the right ascension and declination. Just as for longitude, we need to choose a reference point. (How is 0° longitude defined on Earth?) The reference point is chosen to be the vernal equinox.

Declination (dec., symbol $\delta$) is the celestial equivalent of latitude on the Earth. It is measured in degrees, from 0° at the celestial equator to 90° at the poles, with positive values (or an additional symbol "N") for the northern half and negative values (or "S") for the southern half of the celestial sphere.

Right ascension (RA, symbol $\alpha$) is the equivalent of longitude. The zero line of right ascension is chosen to pass through the vernal equinox. Right ascension is measured eastwards from the vernal equinox in hours, minutes and seconds, from 0 to 24 hours.

                    RA                        Dec.
vernal equinox      $0^{\rm h}\,0^{\rm m}$    0°
summer solstice     $6^{\rm h}\,0^{\rm m}$    23.5°
autumnal equinox    $12^{\rm h}\,0^{\rm m}$   0°
winter solstice     $18^{\rm h}\,0^{\rm m}$   −23.5°

Table 1.1: The coordinates of the four special points on the ecliptic.

In the equatorial coordinate system, the vernal equinox is at $0^{\rm h}\,0^{\rm m}$ and 0°; the autumnal equinox at $12^{\rm h}\,0^{\rm m}$ and 0°; the summer solstice at $6^{\rm h}\,0^{\rm m}$ and 23.5°; and the winter solstice at $18^{\rm h}\,0^{\rm m}$ and −23.5° (Table 1.1). (It is customary to write the hour and minute of right ascension as superscripts.) Be careful! One common source of confusion is that the minutes and seconds in RA are in units of time, but those in declination are arcminutes and arcseconds.

Questions: $1^{\rm h}$ = ____°, $1^{\rm m}$ = ____′, and $1^{\rm s}$ = ____″. Why do the zodiac signs start with Aries, when the vernal equinox is currently in Pisces?

1.2.5 Circumpolar Stars

We now discuss which stars an observer can see and which stars the observer cannot.
As we have mentioned before, at any particular time, an observer on Earth can only see half the celestial sphere. The other half is blocked by the Earth itself (see Fig. 1.4). If we take the rotation of the Earth into account, can the observer see the whole celestial sphere? Usually not. If seen from the poles, stars move in circles parallel to the horizon and never rise or set. If seen from the Earth's equator, the rotation of the Earth does allow the observer to see the whole celestial sphere. At intermediate latitudes, say $L$° north, some stars never rise (see homework), and hence the observer cannot see them at all. At the other extreme, some stars never set. They are called circumpolar stars. For people in the northern hemisphere, there is a star, called Polaris, only about 1° away from the north celestial pole. It is often called the North Star.

What is the angular separation between two points? We would like to first relate the equatorial coordinate system to a Cartesian coordinate system. Assume that the celestial sphere is the unit sphere, with the $z$-axis passing through the north celestial pole and the $x$-axis passing through the vernal equinox (see Fig. 1.4).

Figure 1.4: Left: an observer on Earth can only see half the celestial sphere at a time. Right: spherical and Cartesian coordinate systems.

A point $P$ with Cartesian coordinates $\vec r = (x, y, z)$, $x^2 + y^2 + z^2 = 1$, has spherical coordinates

$$ x = \sin\theta\cos\phi, \qquad y = \sin\theta\sin\phi, \qquad z = \cos\theta . \tag{1.1} $$

For the equatorial coordinate system, $\alpha = \phi$ and $\delta = \pi/2 - \theta$, therefore,

$$ \vec r = \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \cos\delta\cos\alpha \\ \cos\delta\sin\alpha \\ \sin\delta \end{pmatrix} . \tag{1.2} $$

The angular separation $\Delta\Theta$ between two points with coordinates $(\alpha_1, \delta_1)$ and $(\alpha_2, \delta_2)$ is

$$ \cos(\Delta\Theta) = \cos\delta_1\cos\alpha_1\cos\delta_2\cos\alpha_2 + \cos\delta_1\sin\alpha_1\cos\delta_2\sin\alpha_2 + \sin\delta_1\sin\delta_2 . \tag{1.3} $$

The proof is left as an exercise (hint: use the dot product).

Exercise: Star A is at ($17^{\rm h}\,55^{\rm m}\,0^{\rm s}$, −60° 0′ 0″), Star B is at ($18^{\rm h}\,5^{\rm m}\,0^{\rm s}$, −60° 0′ 0″). What is their angular separation in the sky in arcminutes? ($10^{\rm m} \times 15$?)

For a small angular separation, $\Delta\Theta \ll 1$, let $\alpha_1 = \alpha$, $\delta_1 = \delta$, $\alpha_2 = \alpha + \Delta\alpha$, and $\delta_2 = \delta + \Delta\delta$. We then use the identities

$$ \sin(a + b) = \sin a\cos b + \cos a\sin b \tag{1.4} $$
$$ \cos(a + b) = \cos a\cos b - \sin a\sin b \tag{1.5} $$

and the Taylor expansions

$$ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots \tag{1.6} $$
$$ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots \tag{1.7} $$

to obtain, keeping terms up to second order,

$$
\begin{aligned}
1 - \frac{\Delta\Theta^2}{2}
 &= \cos\delta\cos\alpha\cos(\delta+\Delta\delta)\cos(\alpha+\Delta\alpha)
  + \cos\delta\sin\alpha\cos(\delta+\Delta\delta)\sin(\alpha+\Delta\alpha)
  + \sin\delta\sin(\delta+\Delta\delta) \\
 &= \cos\delta\cos\alpha
    \left(\cos\delta - \Delta\delta\sin\delta - \frac{\Delta\delta^2}{2}\cos\delta\right)
    \left(\cos\alpha - \Delta\alpha\sin\alpha - \frac{\Delta\alpha^2}{2}\cos\alpha\right) \\
 &\quad + \cos\delta\sin\alpha
    \left(\cos\delta - \Delta\delta\sin\delta - \frac{\Delta\delta^2}{2}\cos\delta\right)
    \left(\sin\alpha + \Delta\alpha\cos\alpha - \frac{\Delta\alpha^2}{2}\sin\alpha\right) \\
 &\quad + \sin\delta\left(\sin\delta + \Delta\delta\cos\delta - \frac{\Delta\delta^2}{2}\sin\delta\right) \\
 &= \cos\delta\cos\alpha
    \Big(\cos\delta\cos\alpha - \Delta\alpha\cos\delta\sin\alpha - \Delta\delta\sin\delta\cos\alpha
    - \frac{\Delta\alpha^2}{2}\cos\delta\cos\alpha \\
 &\qquad\qquad + \Delta\delta\,\Delta\alpha\sin\delta\sin\alpha - \frac{\Delta\delta^2}{2}\cos\delta\cos\alpha\Big) \\
 &\quad + \cos\delta\sin\alpha
    \Big(\cos\delta\sin\alpha + \Delta\alpha\cos\delta\cos\alpha - \Delta\delta\sin\delta\sin\alpha
    - \frac{\Delta\alpha^2}{2}\cos\delta\sin\alpha \\
 &\qquad\qquad - \Delta\delta\,\Delta\alpha\sin\delta\cos\alpha - \frac{\Delta\delta^2}{2}\cos\delta\sin\alpha\Big) \\
 &\quad + \sin\delta\left(\sin\delta + \Delta\delta\cos\delta - \frac{\Delta\delta^2}{2}\sin\delta\right) \\
 &= 1 - \frac{\Delta\alpha^2}{2}\cos^2\delta - \frac{\Delta\delta^2}{2}\cos^2\delta - \frac{\Delta\delta^2}{2}\sin^2\delta , \\
\Delta\Theta^2 &= (\Delta\alpha\cos\delta)^2 + \Delta\delta^2 .
\end{aligned}
\tag{1.8}
$$

This can be used to calculate the proper motion of a star, $d\Theta/dt$, on the celestial sphere in terms of small changes in R.A. and Dec. (see Fig. 1.5).

Figure 1.5: Proper motion of a star on the celestial sphere.

Question: How does the proper motion of an object relate to its space velocity?

1.2.6 Great Circle

A great circle is the intersection of the sphere with a plane passing through the origin. Let $\vec n_0 = (x_0, y_0, z_0)$ be a unit vector perpendicular to the plane. Then, all points $(x, y, z)$ on the plane satisfy

$$ x x_0 + y y_0 + z z_0 = 0 . \tag{1.9} $$

Substituting Eq. (1.1) into the above equation, we obtain an equation in $\theta$ and $\phi$ that defines the great circle. For example, the ecliptic is horizontal in Fig. 1.2. Hence, the unit vector is vertical in the figure and points to ($18^{\rm h}$, 66.5°), for which the spherical coordinates are $(\theta, \phi) = (23.5^\circ, 270^\circ)$. This point is the ecliptic north pole, see below.
The equation of the ecliptic is

$$ 0 = -\cos\delta\sin\alpha\sin 23.5^\circ + \sin\delta\cos 23.5^\circ
   \quad\Rightarrow\quad \tan\delta = \tan 23.5^\circ \sin\alpha . \tag{1.10} $$

1.3 Other Celestial Coordinate Systems

Figure 1.6: The ecliptic coordinate system.

The ecliptic coordinates are given by the ecliptic latitude $\beta$ and the ecliptic longitude $\lambda$. Their definitions are similar to those of declination and right ascension, but now referring to the ecliptic instead of the celestial equator. In particular, $\beta$ of a point $P$ is the angle between the point and the ecliptic, positive for the northern hemisphere and negative for the southern hemisphere. $\lambda$ is the angular distance, measured along the ecliptic, between the vernal equinox and the point where the great circle passing through $P$ and the north ecliptic pole intersects the ecliptic (Fig. 1.6). For the point $P$, using the dot product,

$$
\begin{aligned}
\cos(90^\circ - \beta) &= \vec r \cdot \vec n_0 \\
\sin\beta &= -y\sin 23.5^\circ + z\cos 23.5^\circ \\
          &= -\sin 23.5^\circ\cos\delta\sin\alpha + \cos 23.5^\circ\sin\delta .
\end{aligned}
\tag{1.11}
$$

The component of $\vec r$ perpendicular to $\vec n_0$ is $\vec r - (\vec r\cdot\vec n_0)\,\vec n_0$. Hence, $\vec r\,'$ is the unit vector along this direction,

$$ \vec r\,' = \frac{(\cos\delta\cos\alpha,\ \cos\delta\sin\alpha,\ \sin\delta) - \sin\beta\,(0,\ -\sin 23.5^\circ,\ \cos 23.5^\circ)}{\text{the magnitude}} . \tag{1.12} $$

The Cartesian coordinates of the vernal equinox are $\vec r_v = (1, 0, 0)$, and $\cos\lambda = \vec r_v\cdot\vec r\,'$, which is just the $x$-component of $\vec r\,'$,

$$ \cos\lambda = \frac{\cos\delta\cos\alpha}{\left|(\cos\delta\cos\alpha,\ \cos\delta\sin\alpha + \sin\beta\sin 23.5^\circ,\ \sin\delta - \sin\beta\cos 23.5^\circ)\right|} . \tag{1.13} $$

The reader can work out the expression in the denominator, which is not very illuminating.

Another common coordinate system is the Galactic coordinate system, where the defining great circle is the Galactic plane, with the north Galactic pole at $\alpha = 12^{\rm h}\,51.4^{\rm m}$ and $\delta = 27^\circ\,7'$, and the zero point of Galactic longitude in the direction of the Galactic center ($17^{\rm h}\,45.6^{\rm m}$, $-28^\circ\,56'$). Galactic coordinates are expressed as $(l, b)$, where $l$ is the Galactic longitude and $b$ is the Galactic latitude.
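The formulas of Sections 1.2.5 and 1.3 can be checked numerically. The sketch below, offered as an illustration rather than a definitive implementation, builds the unit vector of Eq. (1.2), evaluates the angular separation of Eq. (1.3) as a dot product, and converts equatorial to ecliptic coordinates using Eq. (1.11). One step goes beyond the notes: since Eq. (1.13) gives only $\cos\lambda$, the code uses `atan2` on the two in-plane components of $\vec r$ to fix the quadrant of $\lambda$. All function names are hypothetical, and the obliquity of 23.5° is the rounded value used in the text.

```python
import math

OBLIQUITY_DEG = 23.5  # tilt of the ecliptic, rounded value used in the notes


def unit_vector(ra_deg, dec_deg):
    """Cartesian unit vector of Eq. (1.2); RA and Dec are given in degrees."""
    a, d = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(d) * math.cos(a), math.cos(d) * math.sin(a), math.sin(d))


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Angular separation of Eq. (1.3) in degrees, computed as a dot product."""
    r1, r2 = unit_vector(ra1, dec1), unit_vector(ra2, dec2)
    dot = sum(p * q for p, q in zip(r1, r2))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(dot))


def equatorial_to_ecliptic_deg(ra_deg, dec_deg, eps_deg=OBLIQUITY_DEG):
    """Return ecliptic (longitude, latitude) in degrees.

    The latitude follows Eq. (1.11). The longitude is obtained with atan2
    from the components of r along the vernal equinox (x) and along the
    in-plane axis 90 degrees east of it, n0 x rv = (0, cos eps, sin eps).
    This quadrant-safe form is an addition to Eq. (1.13).
    """
    x, y, z = unit_vector(ra_deg, dec_deg)
    eps = math.radians(eps_deg)
    sin_beta = -y * math.sin(eps) + z * math.cos(eps)
    sin_beta = max(-1.0, min(1.0, sin_beta))
    beta = math.asin(sin_beta)
    y_ecl = y * math.cos(eps) + z * math.sin(eps)
    lam = math.atan2(y_ecl, x) % (2.0 * math.pi)
    return math.degrees(lam), math.degrees(beta)


if __name__ == "__main__":
    # Stars A and B of the exercise in Section 1.2.5 (RA converted to degrees).
    sep = angular_separation_deg(268.75, -60.0, 271.25, -60.0)
    print(f"separation: {sep * 60.0:.2f} arcmin")
    # Summer solstice from Table 1.1: RA 6h = 90 deg, Dec 23.5 deg.
    print(equatorial_to_ecliptic_deg(90.0, 23.5))
```

For the two stars in the exercise this gives roughly 75′, i.e. the naive $10^{\rm m} \times 15 = 150'$ reduced by the factor $\cos 60^\circ$ that appears in Eq. (1.8). The summer solstice maps to $\lambda = 90^\circ$, $\beta = 0^\circ$, as expected for a point on the ecliptic.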
Finally, there is also the Supergalactic coordinate system, which is used in studies of nearby galaxy clusters, including the Virgo Supercluster.

1.4 Limitations of Coordinate Systems

As a final remark, we discuss some limitations of the coordinate systems. Specifying the position of objects in the sky is actually more complicated than described so far. First of all, stars are moving in space. But they are very far away, hence their proper motions are generally small, typically of the order of milliarcseconds per year. Even if the objects are not moving, their apparent positions in the sky change over the course of a year due to the motion of the Earth. The two major causes are the aberration of light, due to the finite speed of light, and parallax for nearby objects.

1.4.1 Precession

The major problem with the equatorial coordinate system is that the vernal equinox is not fixed with respect to the stars. It is constantly moving due to the precession of the Earth's rotational axis. This is caused by the fact that the Earth is not a perfect sphere but has a larger diameter at the equator than at the poles. The gravitational pull of the Sun and the Moon on the near and far sides of the Earth is thus different, causing a torque perpendicular to the rotational axis. (You can find more details at http://courses.physics.northwestern.edu/Phyx125/Precession of the Earth.pdf)

The precession of the Earth has a period of 26,000 years, which means that the north or south celestial pole traces out a circle with that period. 13,000 years from now, the Earth's spin axis will be 47° away from Polaris. Note that the direction of precession is opposite to the rotation of the Earth. As a result, the intersections between the ecliptic and the celestial equator, i.e. the equinoxes, are constantly shifting westward about the ecliptic pole with a period of 26,000 yr, i.e. about 1.38° per century.
This is also the reason why the zodiac signs start with Aries: the vernal equinox was in Aries when the constellations were introduced, and only moved into Pisces in 67 B.C.

Because of precession, when we talk about equatorial coordinates, we also need to specify the date and time used for comparing star coordinates, the epoch. The standard epoch commonly used now is the beginning of the year 2000, denoted J2000.0. In some old books, you may find the epoch J1975.0 or B1950.0. In professional observatories, the exact epoch, i.e. the observation date, is needed in order to point the telescope in the right direction with very high accuracy. The changes in equatorial coordinates relative to J2000.0 can be approximated by

$$ \Delta\alpha = M + N\sin\alpha\tan\delta \tag{1.14} $$
$$ \Delta\delta = N\cos\alpha , \tag{1.15} $$

where

$$ M = 1.2812323\,T + 0.0003879\,T^2 + 0.0000101\,T^3 $$
$$ N = 0.5567530\,T - 0.0001185\,T^2 - 0.0000116\,T^3 . $$

$M$ and $N$ are in degrees and $T = (t - 2000.0)/100$, with $t$ the date in years, including fractions of a year.

Question: What is the effect of precession on the seasons?

1.4.2 Aberration of Light

Imagine sitting inside a moving vehicle on a rainy day: the rain drops appear to fall at an angle, even if there is no wind. The same is true for light, due to its finite speed and the observer's motion. This effect depends on the position in the sky and the time of the year. For example, there is no aberration along the direction of motion, and it is maximum in the perpendicular direction.

Figure 1.7: Aberration of light.

The observer's motion can be due to:
1. the Earth's rotation,
2. the Earth's orbital motion around the Sun,
3. the Sun's motion around the Galaxy.

We can estimate the maximum displacements using classical mechanics since $v \ll c$, although, strictly speaking, relativity is needed. Aberration caused by the Earth's rotation is called diurnal aberration. It changes over the course of a day, but the magnitude is relatively small, because even at the equator the rotation velocity is only 460 m/s. Therefore, the maximum shift is $\delta\theta = v/c \approx 0.3''$. On the other hand, annual aberration is caused by the motion of the Earth around the Sun, which has a high velocity of 29.8 km/s and a period of 1 year. Hence, $\delta\theta = 20.5''$. It is interesting to note that this motion is always perpendicular to the direction of the Sun; therefore, the Sun always appears to be $20.5''$ off from its true position. Finally, the motion of the Sun in the Galaxy results in secular aberration, which is of the order of arcminutes. However, the Sun takes 230 million years to revolve around the center of the Galaxy. In practice, this aberration is effectively constant and hence it is often ignored.

1.4.3 Parallax

Parallax arises because stars are not at infinite distance. The change of the observer's viewpoint, mostly due to the motion of the Earth around the Sun, causes a nearby star to appear to move relative to distant objects in the background. We can turn this around and use the parallax to determine distance, which is the most fundamental parameter we wish to know, but also the most difficult one to measure.

Figure 1.8: Parallax.

1 parsec is defined as the distance that gives an annual parallax of $1''$. Hence, from Fig. 1.8,

$$ \tan 1'' = \frac{1\,\mathrm{AU}}{d} \quad\Rightarrow\quad d = \frac{1\,\mathrm{AU}}{\tan 1''} \approx 206\,265\ \mathrm{AU} = 3.26\ \text{light-years} . \tag{1.16} $$

(Remember that 1° = 60′ and 1′ = 60″. How large is an arcsecond? Put a 10¢ coin across the harbour. Its diameter as seen from campus is about 1″!)

The nearest star, Proxima Centauri[1], has a parallax of 0.7687″, therefore its distance is ____ pc. Stars in the Milky Way have distances from a few pc to kpc. Our Sun is 8.5 kpc from the Galactic center and the Milky Way has a diameter of 30 kpc. The Andromeda Galaxy is 0.78 Mpc away and other galaxies are over a Mpc away. The observable Universe has a radius of about 15 Gpc.
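The parsec definition of Eq. (1.16) and the parallax-distance relation can be sketched numerically as follows. The values of the astronomical unit and the light-year in cm are standard constants assumed here (they are not quoted in these notes), and the function names are illustrative. The relation $d\,[\mathrm{pc}] = 1/p\,['']$ is the small-angle limit of the geometry in Fig. 1.8.

```python
import math

AU_CM = 1.495978707e13       # astronomical unit in cm (assumed constant)
LIGHT_YEAR_CM = 9.4607e17    # light-year in cm (assumed constant)
ARCSEC_RAD = math.pi / (180.0 * 3600.0)  # one arcsecond in radians


def parsec_cm():
    """Distance at which 1 AU subtends an angle of 1 arcsecond, Eq. (1.16)."""
    return AU_CM / math.tan(ARCSEC_RAD)


def distance_pc(parallax_arcsec):
    """Distance in parsec from the annual parallax in arcseconds.

    Uses the small-angle approximation tan(p) ~ p, which is excellent for
    all stellar parallaxes (p < 1 arcsec).
    """
    if parallax_arcsec <= 0.0:
        raise ValueError("parallax must be positive")
    return 1.0 / parallax_arcsec


if __name__ == "__main__":
    pc = parsec_cm()
    print(f"1 pc = {pc:.4e} cm = {pc / LIGHT_YEAR_CM:.2f} light-years")
    # Parallax of Proxima Centauri as quoted in the text.
    print(f"Proxima Centauri: {distance_pc(0.7687):.3f} pc")
```

With the parallax of 0.7687″ quoted above, the second line evaluates to about 1.3 pc.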
The Hipparcos satellite from ESA measured the positions and parallaxes of 118,218 stars in the solar neighborhood brighter than magnitude 9, providing valuable information on their distances. It is succeeded by the Gaia satellite, which was launched on 2013 Dec 19, arrived at L2² in 2014 January, and is designed to measure the distances to 1 billion objects.

¹ Scientists recently discovered an Earth-like planet orbiting Proxima Centauri: http://www.nature.com/nature/journal/v536/n7617/full/nature19106.html.
² We will talk about the second Lagrangian point, L2, in Chapter 3.

Chapter 2
Light and Telescopes

(Chapters 3.2, 3.3, 5 and 6 in textbook.)

Although we can now go to the Moon and bring back samples of soil and rocks, we can still only study most other objects by investigating the radiation they emit, whether it is neutrinos, charged-particle cosmic rays, gravitational waves, or electromagnetic waves. Among them, electromagnetic waves, or EM waves for short, dominate. In this chapter, we will discuss the properties of EM waves and the methods used to detect them.

2.1 Electromagnetic Wave

EM waves are oscillations of the electric and magnetic fields (Fig. 2.1). They can be produced by the acceleration of charged particles and, in turn, EM waves, or more generally electric and magnetic fields, affect the motion of charged particles. They have no effect on neutral particles.

[Figure 2.1: Properties of electromagnetic waves.]

Light, radio waves, infrared, ultraviolet, X-rays and gamma rays are EM waves of different frequencies. Like other kinds of waves, the three fundamental properties of EM waves are the speed (the speed of light is usually denoted by c), the frequency, f, and the wavelength, λ. They are related by c = fλ .
(2.1)

For EM waves in vacuum, the speed of light is independent of frequency (no dispersion), and is equal to $2.99792458 \times 10^{10}$ cm/s. This value is exact, because the meter is defined in terms of this value and the second (which is itself defined by a transition of the caesium atom). The constant speed of light is the starting point of special relativity.

EM waves can have a very wide range of wavelengths, from shorter than an atom to as long as the size of the Universe. Radio waves are about 1 mm to 100 m in wavelength, with microwaves at the short-wavelength end, followed by infrared (IR) and visible light. The wavelengths of visible light are from 400 nm to 700 nm. Our atmosphere is transparent to EM waves in the radio and over a few windows in the IR and visible. The most important window to human eyes is the optical window between 300 nm and 1100 nm. The atmosphere is opaque to all EM waves of shorter wavelength. For example, if we want to carry out X-ray (wavelengths from about $10^{-7}$ m to about $10^{-9}$ m) or gamma-ray (wavelengths less than about $10^{-10}$ m) astronomy, we have to get above the atmosphere, for example with a satellite.

Apart from the three basic properties, EM waves can also differ in polarization, which is the direction in which the electric field of the wave points (Fig. 2.1). The direction of polarization must be perpendicular to the direction of propagation. Thus, there are two independent polarizations. The EM radiation from a source will in general contain a mixture of waves with different directions of propagation, different wavelengths and different polarizations.

When EM waves propagate, energy is transported from one place to another. The amount of energy radiated by a star per unit time is called the luminosity, which is in units of erg/s. At the observer, the amount of energy received per unit area per unit time is called the intensity or energy flux, which is in units of erg/s/cm².
Notice that flux describes the energy received, and it can be understood as the apparent brightness of an object. If the source is far away, the observed intensity can be low even if the source radiates an enormous amount of energy. To be more specific, consider a sphere of radius d centered on the source; its total surface area is 4πd². Hence, the total energy passing through the sphere per unit time is the flux F times the area, i.e. F × 4πd², which should be equal to the total energy emitted per unit time, i.e. the source luminosity L. Mathematically,

$$F = \frac{L}{4\pi d^2} . \tag{2.2}$$

See Fig. 2.2. (What would the relation look like in a 2-D universe?)

[Figure 2.2: Inverse square law.]

So far we have discussed the wave nature of EM waves, but in some situations we find that EM waves behave as particles. For example, atoms can only absorb one, two or any integer multiple of a certain unit of light. We call this basic unit the photon and say that atoms can only absorb one photon or two photons, etc., but not half a photon. (Of course, atoms can absorb no photon at all.) The wave nature of EM waves is the collective behavior of a large number of photons. The energy, E, of each photon is given by

$$E = hf \tag{2.3}$$

where h is Planck's constant, with a value of about $6.63 \times 10^{-27}$ erg s = $4.14 \times 10^{-15}$ eV s, and f is the frequency of the photon. The higher the frequency, the higher the energy of each photon. We find that the particle nature of light is prominent when the frequency is high. Thus, for gamma rays, we often speak of photons, but for radio waves, we often think of them as waves.

2.2 Magnitudes

The brightness of objects in the sky varies a lot. For example, Sirius, the brightest star (apart from the Sun), is roughly 1000 times brighter than the dimmest stars we can see with the naked eye. Therefore, if we use the values of intensity to describe their brightness, we have to write a lot of zeros. We use a log scale instead.
Examples of log scales include the Richter scale for earthquakes and the decibel (dB) for sound intensity. The visual magnitude or apparent magnitude m of a star tells us the brightness of the star as we see it. This system dates back to Hipparchus in ancient Greece. In modern terms, stars visible to human eyes are classified into 6 magnitudes from the brightest (m = 1) to the faintest (m = 6), and an m = 1 star is 100 times brighter than an m = 6 star. In other words, if star A is 100 times brighter than star B, the magnitude of A is 5 units less than the magnitude of star B. Thus, a bright star has a smaller or even negative magnitude number.

Exercise: From the definition above, show that each grade of magnitude is about 2.5 times brighter than the next one.

What is the relation between the intensity and the magnitude of a star? Similar to the above, one can show that

$$m = -\frac{5}{2}\log_{10}\left(\frac{I}{I_0}\right) , \tag{2.4}$$

where we arbitrarily choose one fixed intensity I0 as the reference. We once thought that the star Vega had constant brightness and chose it as the reference. Hence it had magnitude zero. However, we later found out that it is in fact a variable star, and we have abandoned it as the reference. Its average magnitude is about 0.03. We no longer have such a "standard" reference star, and I0 is just a number.

The apparent magnitude of the Sun is about −26.8, the full Moon about −12, Sirius about −1.5, the naked eye limit about 6, and the dimmest objects in images taken by the Hubble Space Telescope about 30.

The actual situation is more complicated: one star may be brighter than another in the blue band, but fainter in the red. Therefore, if you look up a star in research papers or star catalogues, you may find its magnitude specified in different bands, e.g. visual, blue, red, or IR. They are denoted by U, B, V, R, I, Z, etc. The most common one is the V (meaning visual) band magnitude, which is centered on yellow light (551 nm), close to the peak response of the human eye.
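The logarithmic relation in Eq. (2.4) is easy to explore numerically. The short Python sketch below is only an illustration (the function names are hypothetical): it converts between a magnitude difference and an intensity ratio, checks that 5 magnitudes correspond to a factor of 100, and compares the Sun with the full Moon using the apparent magnitudes quoted above.

```python
import math


def magnitude_difference(intensity_ratio):
    """Return m_A - m_B for a given I_A / I_B, following Eq. (2.4)."""
    return -2.5 * math.log10(intensity_ratio)


def intensity_ratio(m_a, m_b):
    """Return I_A / I_B for two apparent magnitudes (inverse of Eq. 2.4)."""
    return 10.0 ** (-0.4 * (m_a - m_b))


# A star 100 times brighter has a magnitude smaller by 5.
delta_m_100 = magnitude_difference(100.0)

# Brightness ratio corresponding to a single magnitude step.
one_step = intensity_ratio(1.0, 2.0)

# Sun (m = -26.8) compared with the full Moon (m = -12).
sun_vs_moon = intensity_ratio(-26.8, -12.0)

print(f"100x brighter  -> delta m = {delta_m_100:.1f}")
print(f"1 magnitude    -> factor of {one_step:.3f}")
print(f"Sun vs. Moon   -> factor of {sun_vs_moon:.2e}")
```

Because only magnitude differences enter, the reference intensity I0 cancels and never needs to be specified.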
If a star is moved from distance d to D, then its magnitude will change from m_d to m_D, where

$$m_D = m_d - 5\log_{10}\left(\frac{d}{D}\right) . \tag{2.5}$$

Note that if D is large, then d/D is small, log(d/D) is negative and m_D is large. This simply means that an object farther away is dimmer.

If we know the distance to a star and have measured its apparent magnitude, and we want to compare its intrinsic brightness (i.e. luminosity) with that of other stars, then it is convenient to convert to a standard distance of 10 parsec. This is the definition of the absolute magnitude M. Hence, it follows from Eq. (2.5) that the apparent and absolute magnitudes are related by

$$M = m - 5\log_{10}\left(\frac{d}{10\ \mathrm{pc}}\right) . \tag{2.6}$$

The difference m − M is called the distance modulus. Type Ia supernovae have an absolute magnitude of M = −19.3 at peak brightness. This has been used to determine the distances to their host galaxies, leading to the discovery of the accelerating expansion of the Universe. This discovery was awarded the Nobel Prize in Physics in 2011.

Finally, we note that in scientific measurements, the apparent magnitude has to be corrected for absorption by the atmosphere. Additionally, absorption by the interstellar medium also needs to be accounted for in the calculation of the absolute magnitude.

Exercise: The apparent magnitude of the Sun is −26.8. What is its absolute magnitude? Is it more or less luminous than Sirius (M = 1.4) and Vega (M = 0.6)?

2.3 Spectrum, Spectral Lines, and Atoms

It was discovered by Newton that white light is composed of all the colors of the rainbow. The image of this range of colors is called the optical spectrum. The entire electromagnetic spectrum is much wider than the optical spectrum, including the infrared, ultraviolet, etc. In astronomy, spectrum also means the graph of intensity versus frequency or wavelength. A typical spectrum is shown in Fig. 2.3. This graph tells us a lot about the nature of the source of the radiation. What should be the unit of the y-axis in the figure?
Since the integrated area under the curve is the flux, the y-axis is the flux density (i.e. flux per unit frequency or per unit wavelength), and it has units of erg/s/cm²/Å for an optical spectrum. For a radio spectrum, it is usually flux per Hz, and for an X-ray spectrum, flux per keV.

[Figure 2.3: A continuum spectrum (left) and a spectrum with spectral lines (right). Axes: intensity versus frequency; the lines are at frequencies f1 and f2.]

Shown on the left of Fig. 2.3 is a continuum spectrum, in which the intensity varies smoothly. Usually, a spectrum contains some abrupt changes in intensity. This is a spectrum with spectral lines, or simply a line spectrum, as shown on the right of Fig. 2.3. Here we can see two kinds of spectral lines. The line at frequency f1 is called an absorption line because radiation at this frequency is absorbed and hence the intensity is lower than the underlying continuum. The line at frequency f2 is an emission line.

The German physicist Gustav Kirchhoff studied the formation of spectral lines and summarized his findings in Kirchhoff's three laws of spectroscopy:

1. A hot, dense gas or hot solid object produces a continuous spectrum with no dark spectral lines.
2. A hot, diffuse gas produces bright spectral lines (emission lines).
3. A cool, diffuse gas in front of a source of a continuous spectrum produces dark spectral lines (absorption lines) in the continuous spectrum.

To derive these laws, we have to know some properties of atoms. Under everyday conditions, everything is made up of atoms. The classical picture of an atom is shown in Fig. 2.4. The nucleus contains protons, which are positively charged, and neutrons, which are electrically neutral. Almost all the mass of an atom is concentrated in the nucleus. There are usually several electrons going around the nucleus. Electrons are negatively charged. For a neutral atom, the number of electrons is equal to the number of protons in the nucleus.
There are more than 110 different kinds of atoms (elements), the simplest being hydrogen, with only one proton in the nucleus. Hence, neutral hydrogen has one electron.

[Figure 2.4: A classical picture of an atom, with the nucleus at the center and several electrons orbiting around it.]

Quantum mechanics, which is the theory for atomic or smaller systems, tells us that electrons can only be in certain configurations relative to the nucleus. These allowed configurations are called states. Each state corresponds to some definite energy. After solving the Schrödinger equation, it can be shown that the possible energies of a hydrogen atom are

$$E_n = -\frac{m_e e^4}{2\hbar^2}\,\frac{1}{n^2} \quad \text{in cgs} \tag{2.7}$$
$$E_n = -\frac{m_e e^4}{8 h^2 \epsilon_0^2}\,\frac{1}{n^2} \quad \text{in MKS,} \tag{2.8}$$

where ħ ≡ h/2π, m_e is the electron mass, e = $4.8 \times 10^{-10}$ esu is the magnitude of its charge in cgs units (or e = $1.6 \times 10^{-19}$ C in MKS), ε0 is the vacuum permittivity and n is any positive integer. Different values of n label different states. Numerically,

$$E_n = -\frac{13.6}{n^2}\ \mathrm{eV} , \tag{2.9}$$

where eV is a unit of energy with 1 eV = $1.6 \times 10^{-12}$ erg. Because of the leading negative sign, higher states (larger n) have higher energies (less negative). The lowest energy state (n = 1) is called the ground state and the others are called the excited states.

Since the electron can only be in those states, when it jumps from one state to another, it can only emit or absorb energy equal to the difference between the energies of the two states. For example, if the electron of a hydrogen atom in the excited state n = 5 jumps to a lower state n = 3, it will emit energy

$$E = E_5 - E_3 = 13.6\left(\frac{1}{3^2} - \frac{1}{5^2}\right)\ \mathrm{eV} = 0.97\ \mathrm{eV} . \tag{2.10}$$

By Eq. (2.3), the photon emitted has a frequency of $2.3 \times 10^{14}$ Hz, which is in the infrared. If the electron jumps to a higher state, it has to absorb energy. In general, the energy difference between two states of the hydrogen atom is given by

$$E_n - E_m = -13.6\ \mathrm{eV}\left(\frac{1}{n^2} - \frac{1}{m^2}\right) \tag{2.11}$$

for various positive integers n and m.
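The n = 5 → 3 example can be reproduced with a few lines of Python. This is a minimal sketch using only the rounded constants quoted in these notes (13.6 eV and h = 4.14 × 10⁻¹⁵ eV s); the function names are hypothetical.

```python
RYDBERG_ENERGY_EV = 13.6      # ground-state binding energy of hydrogen, Eq. (2.9)
PLANCK_EV_S = 4.14e-15        # Planck's constant in eV s, as quoted with Eq. (2.3)


def energy_level(n):
    """Energy of hydrogen state n in eV, Eq. (2.9)."""
    return -RYDBERG_ENERGY_EV / n**2


def emitted_photon(n_upper, n_lower):
    """Photon energy (eV) and frequency (Hz) for a downward jump n_upper -> n_lower."""
    energy_ev = energy_level(n_upper) - energy_level(n_lower)   # Eq. (2.11)
    frequency_hz = energy_ev / PLANCK_EV_S                      # Eq. (2.3), E = h f
    return energy_ev, frequency_hz


energy_ev, frequency_hz = emitted_photon(5, 3)
print(f"E = {energy_ev:.3f} eV, f = {frequency_hz:.2e} Hz")
```

Swapping the two arguments gives a negative energy, which corresponds to the case where the atom must absorb a photon of that energy to make the upward jump.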
[Figure 2.5: Radiation with a continuum spectrum passing through low-density gas. Whether the observer sees an emission or an absorption spectrum depends on the viewing angle.]

We can calculate the wavelength of the photon emitted using E = hf = hc/λ, such that

$$\frac{1}{\lambda} = \frac{2\pi^2 m_e e^4}{h^3 c}\left(\frac{1}{m^2} - \frac{1}{n^2}\right) \equiv R_\infty\left(\frac{1}{m^2} - \frac{1}{n^2}\right) , \tag{2.12}$$

where R∞ = $1.097 \times 10^5$ cm⁻¹ is called the Rydberg constant. It corresponds to the transition between m = 1 and n = ∞.

The transition lines of hydrogen are particularly important, since H is the most abundant element in the Universe. The transitions from n ≥ 2 to n = 1 are called the Lyman series, with n = 2 → 1 called Ly-α, n = 3 → 1 called Ly-β, n = 4 → 1 called Ly-γ, and so on. The Balmer series consists of the transitions from n ≥ 3 to n = 2, i.e. n = 3 → 2 is called Balmer-α, n = 4 → 2 is called Balmer-β, etc. These are also called Hα, Hβ, Hγ, etc. For completeness, the transitions to n = 3 are called the Paschen series, those to n = 4 the Brackett series, and those to n = 5 the Pfund series.

Exercise: What are the wavelengths of Hα and Hβ? Which part of the EM spectrum (or color) do they correspond to? Hence, why are they so important?

Now, it is easy to understand Kirchhoff's laws. The continuous spectrum comes from blackbody radiation, which is emitted by any object with a temperature above absolute zero. We will discuss blackbody radiation further in Chapter 4. Emission lines are produced by electrons making downward transitions ("falling") from a higher orbit to a lower orbit. When radiation with a continuum spectrum passes through a gas of atoms at low pressure, those atoms will absorb photons with energies equal to the differences between the energies of their states. The atoms will be excited. When they fall back down, they will produce emission lines, but the photons emitted will travel in all directions.
As a result, depending on the viewpoint, the observer will see emission or absorption lines (Fig. 2.5).

Note that different atoms have different sets of states and hence different spectral lines. The spectral lines of hydrogen correspond to the energies given by Eq. (2.11). We can tell from the spectrum of a star which elements are present in the outer atmosphere of the star. Not just atoms: molecules also have their own sets of spectral lines. We can measure all those lines in laboratories on the Earth.

However, the observed spectral lines of a star are often shifted to other wavelengths, due to the relative motion of the star and the Earth. This is the Doppler effect. Unlike the classical case of sound, EM waves do not require a medium to propagate, and the Doppler effect for light has to be treated with special relativity, which includes time dilation. The fractional change in wavelength, called the redshift, is given by

$$z = \frac{\Delta\lambda}{\lambda_0} = \sqrt{\frac{1 + v/c}{1 - v/c}} - 1 , \tag{2.13}$$

where ∆λ is the observed wavelength minus the original wavelength λ0, v is the relative speed between the source and the observer along the line of sight (positive when they are moving apart), and c is the speed of light. The motion of the star shifts all of its spectral lines by the same factor, which depends only on β ≡ v/c. Note that for v ≪ c, z ≈ v/c. This is an extremely useful technique in astronomy for velocity measurements, leading to all kinds of important discoveries, including extrasolar planets, binary star systems, the rotation of galaxies (hence dark matter), and the expansion of the Universe (hence the big bang and dark energy).

2.4 Optics and Telescopes

2.4.1 Basics

Astronomical telescopes have only one main purpose: to collect more light, not to magnify or to focus. Therefore, larger is better. Not all telescopes can focus light. Since high energy radiation can penetrate into materials, it is very difficult to deflect X-ray photons (this requires grazing incidence at shallow angles) and impossible to focus gamma rays. Therefore, all gamma-ray telescopes and some X-ray telescopes are non-focusing.
(What is the advantage of focusing telescopes?) In this section, we mainly discuss optical telescopes, since they are the most common type and have the longest history. However, the general principles apply to all kinds of telescopes.

A large collecting surface enables us to detect dimmer objects. In the dark-adapted condition, the pupil of a human eye can open to about 7 mm in diameter and stars of magnitude 6 can be seen. If a telescope of diameter 20 cm is used, the intensity of stars is amplified by a factor of (200/7)² = 816. Thus, by Eq. (2.4), through this telescope, stars of magnitude 6 + (5/2) log10(816) = 13 can be seen. In general, if the diameter of the telescope is D mm, the dimmest star that can be detected is of magnitude

$$6 + \frac{5}{2}\log_{10}\left(\frac{D}{7}\right)^2 = 6 + 5\log_{10}\left(\frac{D}{7}\right) . \tag{2.14}$$

This is a rough estimate. Many other factors affect what we can see. Also, if we use other detectors, like a CCD, instead of our eyes, we can usually detect much dimmer objects (why?).

One common misconception is to ask "how far can I see with this telescope?" This is not a well-defined question. Provided that the object is bright enough, we can still see it no matter how far away it is. If it is very dim, we cannot see it even if it is near us. Another common mistake is paying too much attention to the "maximum magnification" of a telescope.

[Figure 2.6: Principles of refracting (left) and reflecting (right) telescopes.]

2.4.2 Refracting telescopes

There are two ways to bend light: refraction with lenses or reflection with mirrors. Therefore, there are three kinds of telescopes: those using lenses, those using mirrors, and hybrids of the two. When an EM wave enters a medium, its speed will change. (If it enters the medium from vacuum, it must slow down. Nothing can travel faster than the speed of light in vacuum.) If the light ray enters the medium at an angle, the ray will be bent. This is refraction.
It can be described by Snell's law

$$\frac{\sin\theta_1}{\sin\theta_2} = \frac{n_2}{n_1} , \tag{2.15}$$

where n1 and n2 are the indices of refraction of the two media, and θ1 and θ2 are the angles of incidence and refraction, respectively, measured from the normal. For example, air has n = 1.0003 and water has n = 1.33 relative to vacuum.

The main component of a refracting telescope, or refractor, is a lens, which focuses the light rays by refraction (Fig. 2.6 left). The distance between the focus and the lens is called the focal length.

Refractors have two main disadvantages. First, there is dispersion: in media other than vacuum, EM waves with different frequencies travel at different speeds. Dispersion leads to chromatic aberration in refractors. It means that light rays of different colors (different wavelengths) focus at different points. For example, if we pick the best focus for green, the resulting image will have a red halo around it. This problem can be mitigated with a long focal length or by using a system of two or more lenses, but then the optics become complicated and the cost increases. An achromatic lens uses two lens elements made of different materials. It can focus two colors to the same point. An apochromatic ("APO") lens can bring three colors to the same focus.

A much bigger problem is that a refractor requires a large piece of perfect lens. It is extremely difficult to manufacture a large piece of glass without bubbles in it. Even if that can be done, the lens is usually so heavy that it deforms differently under its own weight when the telescope points in different directions. As a result, all modern telescopes for astronomical research are reflectors. The largest refractor still in use today has a diameter of 102 cm.

2.4.3 Reflecting and catadioptric telescopes

If a thin layer of metal is deposited onto a polished glass surface, the reflectance is increased, and we have a mirror. A reflector uses a curved mirror to converge the light rays to the focus (Fig. 2.6 right).
The surface of the mirror may deviate from the desired shape by at most one quarter of the wavelength. This is about 100 nm for visible light (2 cm for radio waves). Since we only use one surface of the glass, defects in the bulk are irrelevant. Also, since the angle of reflection is the same for all colors, there is no chromatic aberration. One additional advantage is that we can support the mirror from "behind," not just at the edge as in refractors. Thus, we can build very large reflectors. The largest is about 10 m in diameter.

To focus the light rays from infinity (i.e. parallel incident rays) into a point, the shape of the mirror has to be a paraboloid. (Do you know how to prove that mathematically?) This kind of mirror is much more difficult to polish than a spherical one. Using a spherical mirror will result in spherical aberration.

Reflectors are not without shortcomings. The focus in Fig. 2.6 is in front of the mirror. We have to divert the light rays to a position convenient for viewing. Usually, a secondary mirror is introduced. Two possible configurations are shown in Fig. 2.7. Fig. 2.7(a) is the Newtonian design, while in Fig. 2.7(b), a hole is opened at the center of the main mirror; this is called the Cassegrain design. These designs introduce some obstruction into the light path. The obstruction will scatter light, and hence the image produced by a perfect reflector is not as sharp as the image produced by a perfect refractor with a lens of the same diameter.

[Figure 2.7: Two possible focus arrangements for reflectors: (a) Newtonian, (b) Cassegrain.]

The third type of telescope is called a catadioptric telescope. It is a hybrid design using both lenses and mirrors. One design very popular among amateur astronomers is the Schmidt-Cassegrain, because of its compact size. Some designs use a spherical primary mirror, which is easy to manufacture, with a corrector plate in front.
2.4.4 Magnification and resolution

Another important component of an optical telescope is the eyepiece. This is not necessary if a CCD or photographic film is used to record the image, but it is critical for visual observations. The simplest design of an eyepiece is just a lens. It is usually placed so that the distance between the eyepiece and the objective (whether this is the main mirror in reflectors or the main lens in refractors) is equal to the sum of their focal lengths. Fig. 2.8 shows that an object with angular size θ1 will have an image of angular size θ2. Thus, the angular magnification, or just magnification, of the telescope is

$$\text{magnification} = \frac{\theta_2}{\theta_1} = \frac{f_1}{f_2} . \tag{2.16}$$

[Figure 2.8: The magnification of a telescope is determined by the focal lengths of the objective (f1) and the eyepiece (f2).]

Exercise: Using the small angle approximation tan θ ≈ θ for small θ, prove the equation above.

Changing the magnification is accomplished by simply changing to an eyepiece with a different focal length. Even for the largest telescopes, the magnification used is seldom over 500, and is usually between 100 and 200. This is because a large image produced by high magnification is usually very fuzzy. As we will see below, this is due to the Earth's atmosphere most of the time.

Even without considering atmospheric effects, is there a theoretical limit on the angular resolution? Due to the diffraction of light, a point source, like a star, will not be focused to an infinitely small point by a telescope. The best image is actually a blurred disk, called the Airy disk. If two stars are very close to each other, their Airy disks may overlap and the observer cannot tell if there is one star or two. We say that the two stars cannot be resolved.
For a circular aperture, the blurring produced by diffraction limits the angular resolution to an amount given by the Rayleigh criterion:

$$\theta_{\text{diffraction limit}} = \frac{1.22\,\lambda}{D} \tag{2.17}$$

where D is the diameter of the objective of the telescope and λ is the wavelength of the light. (The value of θ is in radians.) For example, for yellow light, λ = 600 nm, if D = 20 cm, the angular resolution is 0.75′′, which means that even in ideal conditions we would see a single star if the two stars are in fact separated by less than 0.75′′. (The numerical value given by Eq. (2.17) is in radians, whereas 0.75′′ is in arcseconds. How do we convert one to the other?)

[Figure 2.9: In (a), the Airy disks produced by a large telescope are small. The two stars can be resolved. In (b), the Airy disks of the same pair of stars produced by a small telescope are larger. The two stars cannot be resolved.]

Exercise: Sirius is the brightest star in the sky (except the Sun). It is at a distance of 2.64 pc and has a diameter of 3.4 solar radii (i.e. $2.4 \times 10^6$ km). How large a telescope is needed to resolve its image in visible light (λ = 555 nm)?

So in practice we can treat stars as unresolved point sources.

If we use a short focal length eyepiece to obtain high magnification, the image of the Airy disk will be magnified. This will blur the whole image and degrade its quality. Some department stores advertise their telescopes by claiming a high magnification, say 600 or more. This is an unsound claim and such telescopes can only be treated as toys.

By Eq. (2.17), it seems that we can obtain high resolution by increasing the size of our telescope. This is true up to a point. For ground-based observers, star light has to pass through the atmosphere, which acts as a large refractive medium. The air in the atmosphere is constantly moving and the image of a star will dance around, like the bottom of a swimming pool when viewed from above the water.
The effect is called the seeing. The seeing limit is usually about 1′′. Thus, even for a small telescope like the 20 cm one discussed above, the resolution is limited by the seeing, not by the diffraction of its optics. This is one more reason why a high magnification is not useful. However, we can go above the atmosphere to avoid bad seeing, as the Hubble Space Telescope does. Many ground-based telescopes, including the biggest ones, are seeing-limited and their resolution is nowhere near the diffraction limit. This is why some astronomers advocate building more medium-size telescopes, which are cheaper, instead of extremely large ones.

For radio observations, since the wavelength of radio waves is about $10^5$ times that of visible light, we have to build an enormous telescope to obtain the same resolution. One remedy is to use a computer to combine the signals from several radio telescopes. The telescopes then function as parts of the dish of an imaginary radio telescope, with an effective diameter equal to the distance between the actual radio telescopes. (We call the objective of a radio telescope a dish.) This is called radio interferometry. This technique has been applied in the optical as well, but the implementation is slightly different due to the much higher frequency of visible light.

2.4.5 Lens speed

For unresolved objects, such as a distant star or quasar, all incident light rays are focused into a point; hence, the brightness depends on the lens diameter. However, for extended objects, e.g., the Moon, planets, nebulae, or nearby galaxies, the light coming out of the telescope spreads over some area. Therefore, the surface brightness depends on the magnification. One important parameter to consider is the f-number (or f-ratio) of a telescope or a lens, which is defined as

$$\text{f-number} \equiv \frac{f}{D} , \tag{2.18}$$

where f is the focal length and D is the diameter of the lens. Note that, for a given focal length, a larger f-number means a smaller diameter.
For photographic lenses, the f-number can be adjusted in steps of √2, in the series 1, 1.4, 2, 2.8, 4, 5.6, 8, 11, 16, ... Every successive step reduces the lens' effective diameter by a factor of √2, such that the amount of light passing through is reduced by half.

Exercise: For a resolved object with angular size α (in radians), show that the physical size of its real image formed by a telescope of focal length f is αf.

The amount of light collected is proportional to D². From the exercise above, we know that the light collected is spread out over an area ∝ (αf)². As a result, the surface brightness I of the image depends on

$$I \propto \left(\frac{D}{f}\right)^2 = \left(\frac{1}{\text{f-number}}\right)^2 . \tag{2.19}$$

We see that a smaller f-number gives a brighter image (i.e. with larger surface brightness), since the lens aperture is larger. Telescopes or lenses with smaller f-numbers are therefore referred to as having a "higher speed" or being "faster".

Exercise: Compare the lens of a cell phone camera, with f/2.2, and the primary mirror of the Hubble Space Telescope, which has f/24. Which one gives a brighter image? Which lens/mirror is "faster"?

Finally, we should mention that besides the optics, the mount of a telescope is also very important. Not only must it support the optics, it must also track the stars across the sky. As we discussed in the previous chapter, due to the rotation of the Earth, stars and all other objects move on the celestial sphere. There must be a mechanism to turn the telescope so that the image of the stars stays fixed for us or for the detector.

2.5 CCD

CCD stands for charge-coupled device. It is a semiconductor device with an appearance similar to an ordinary computer chip. On the top of the chip, there is a window that allows light to enter. Once power is applied, each element of the device converts photons to electrons. The number of electrons released is proportional to the number of photons that hit the device.
Hence, by reading the amount of charge, we can tell the intensity of the light source. A typical CCD consists of an array of light-detecting elements, called pixels, usually in the range of 768 × 512 to 2048 × 2048. Thus, we can form a picture of such resolution. The size of one pixel depends on the model, with a typical value of 9 µm × 9 µm. For a 1024 × 1024 CCD, the light-detecting area is about 1 cm by 1 cm, which is smaller than the size of photographic film. This is one of the disadvantages of CCDs as compared with film. The other disadvantages are low resolution and that only black and white pictures can be taken, because a CCD only detects the intensity, not the color, of the light source.

There are two methods to obtain a color photo. One is called the tricolor photo. The observer takes three photos through red, green and blue filters, then combines the three photos into one with the help of computer software. The second method is to use a color CCD, in which three pixels form a group. Each pixel in a group is covered by either a red, a green or a blue filter, and the electronics of the CCD combine the data to output a color photo.

There seem to be many disadvantages of CCDs, but there is one overwhelming advantage. The quantum efficiency of a professional-grade CCD can go up to 80%, which means that it can detect most of the photons, as compared with only 2% to 4% for photographic film and 1% for human eyes. In astronomical applications, almost all objects in the sky are very dim. A high-efficiency device will greatly reduce the exposure time. Note that the quantum efficiency is frequency dependent. It is lowest in the blue.

As mentioned, for an object with angular size α (in radians), the physical size of its real image formed by a telescope of focal length f is αf. For example, the image of the Moon (angular size of about 0.5◦) formed by a telescope of focal length 2 m is about 0.5 × (π/180) × 2 m ≈ 1.7 cm.
The CCD must be larger than this to cover the whole Moon. To match the resolving power of the telescope to the CCD, two pixels should cover one angular resolution element. Hence, for example, if the angular resolution is about 1″ ($\approx 4.85 \times 10^{-6}$ rad) and the pixel size is 9 µm, then the matching focal length is
\[ f = \frac{2 \times 9\ \mu\mathrm{m}}{4.85 \times 10^{-6}\ \mathrm{rad}} = 3.7\ \mathrm{m} . \tag{2.20} \]
Astronomical CCDs are usually cooled to low temperature to reduce thermal noise.

Chapter 3 Celestial Mechanics

(Chapter 2 in textbook.)

Since we believe that the laws of physics we developed on Earth should hold anywhere in the Universe, the motions of celestial bodies should be governed by classical mechanics, which was mainly developed by Newton. In this chapter, we will review the basics of the theory of gravitation, then apply it to the two-body problem. Why study the two-body problem? It is the simplest case of celestial motion that can be solved analytically. It is also very useful for describing the motions of objects in a binary system or planetary system.

3.1 Newton's Laws of Motion

We will briefly review Newton's laws in this section. Readers are assumed to know the material well; this section only serves as a reminder.

The first law of mechanics describes the resistance of matter to change in its state of motion: A body in motion will remain in motion, unless it is acted upon by some external force. Newton's formulation of the second law is the familiar
\[ \mathbf{F} = m\mathbf{a} = m\dot{\mathbf{v}} \tag{3.1} \]
where $\mathbf{F}$ is the force vector, $m$ is the mass of an object and $\mathbf{a}$ is the acceleration vector. The mass in this equation is the inertial mass, which characterizes the response of the body to an external force. The acceleration is the rate of change of the velocity. Velocity describes both the speed and the direction of the motion. Thus, the acceleration is sometimes non-zero even if the speed of the body remains constant.

Eq. (3.1) is valid only in an inertial frame, which is any frame at rest or moving at constant velocity with respect to the fixed stars.
If we are careful enough, we can show by experiment that the "rest" frame on the Earth is not an inertial frame. The concept of an inertial frame becomes very important when we discuss special relativity.

The third law states that whenever there is an action, there is a reaction equal in magnitude but opposite in direction. For example, we feel the gravitational attraction of the Earth pulling us down; at the same time, there is a force of the same strength pulling the Earth "up." (Do you know how to prove Newton's laws?)

Momentum, or linear momentum, of a particle is defined as the product
\[ \mathbf{p} = m\mathbf{v} . \tag{3.2} \]
In the absence of any external force, Eq. (3.1) implies that the total momentum of a system remains constant. This is the conservation of momentum. (Do you know how to prove it?)

The addition of velocities in classical mechanics is very simple. For example, if a train is moving with velocity $\mathbf{v}_t$ relative to the station and a ball is moving with velocity $\mathbf{v}_b$ relative to the train, then relative to the station, the ball is moving with velocity $\mathbf{v}_t + \mathbf{v}_b$.

The kinetic energy of a particle is given by
\[ \mathrm{K.E.} = \frac{1}{2}mv^2 = \frac{p^2}{2m} \tag{3.3} \]
where $v$ and $p$ are the magnitudes of the velocity and momentum respectively. If we want to change the velocity of the particle, a force must act on it. The change of kinetic energy is equal to the work done $W$, which, for a constant force, is the dot product of the force vector $\mathbf{F}$ and the displacement vector $\mathbf{d}$ of the particle
\[ W = \mathbf{F} \cdot \mathbf{d} . \tag{3.4} \]
Apart from the kinetic energy, another important form of energy is the potential energy. This is the energy associated with the configuration of the system. The most important example is the gravitational potential energy. For a particle with mass $m$ at height $h$ above some reference point on the Earth's surface, the gravitational potential energy is
\[ U = mgh \tag{3.5} \]
where $g$ is the free-fall acceleration at the Earth's surface, $g \approx 9.8\ \mathrm{m/s^2}$.
There are other kinds of energy, such as chemical energy or nuclear energy. If we sum up all kinds of energy in an isolated system, the total energy also remains constant. This is the principle of conservation of energy. Energy (strictly speaking, mass-energy) can be neither created nor destroyed. It can be converted from one form into another, and the total energy is always conserved. (Do you know how to prove it?)

Besides linear motion, the rotation of a body is another important subject in mechanics. We will mainly talk about the rotation of a particle in a plane around some point, the center. The angular position of the particle is the angle between the line joining the center to the particle and some fixed reference line. The angular velocity is the rate of change of the angular position, usually denoted by $\omega$ and measured in radians per second. As the name implies, angular velocity also describes the direction of the rotation. Similarly, we define the angular acceleration.

[Figure 3.1: Can we see sunset twice a day? The sketch shows the direction to the Sun, the Earth of radius $R$, an observer at height $h$ above the surface, and the angle $\theta$.]

Exercise: A simple investigation of the rotation of the Earth tells us that we can indeed see the sunset twice, or many more times, in a day (Fig. 3.1). Everyone knows that on a high mountain the sunset is later, because it takes more time for the Earth to rotate the Sun out of sight. If we are at height $h$ above the Earth's surface, how much later will the sunset be? Referring to the figure, the angle $\theta$ is given by
\[ \cos\theta = \frac{R}{R+h} , \quad \text{so for } h \ll R, \quad \theta \approx \sqrt{\frac{2h}{R}} . \tag{3.6} \]
The time for the Earth to rotate by this amount is
\[ \Delta t = \frac{\theta}{\omega} = \frac{\theta}{2\pi} \times 24\ \mathrm{h} . \tag{3.7} \]
Hence, if we take $R$ to be the radius of the Earth and $h$ to be 1.7 m, the height of a typical adult, then $\Delta t \approx 10$ s (for an observer at the equator, where the Sun sets perpendicular to the horizon). In order to see the sunset twice, all you have to do is sit down to watch the sunset. Just after the Sun sets, stand up immediately. You can see the Sun set again.
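The sunset exercise above can be checked numerically. The following is a minimal sketch, not part of the original notes: the values of the Earth's radius and the rotation period, and the function names, are assumptions made for illustration. It also assumes an observer at the equator with the Sun setting perpendicular to the horizon, and it ignores atmospheric refraction.

```python
import math

R_EARTH = 6.371e6   # mean radius of the Earth in metres (assumed value)
DAY = 86400.0       # rotation period in seconds (a 24 h day is assumed)


def horizon_angle(h, R=R_EARTH):
    """Angle theta (radians) defined by cos(theta) = R / (R + h)."""
    return math.acos(R / (R + h))


def sunset_delay(h, R=R_EARTH, period=DAY):
    """Extra time in seconds before sunset for an observer at height h (m)."""
    omega = 2.0 * math.pi / period      # angular speed of the Earth's rotation
    return horizon_angle(h, R) / omega


if __name__ == "__main__":
    h = 1.7  # height of a typical adult in metres, as in the text
    print("theta (exact)       =", horizon_angle(h))
    print("theta (sqrt(2h/R))  =", math.sqrt(2.0 * h / R_EARTH))
    print("sunset delay in s   =", sunset_delay(h))
```

For $h = 1.7$ m the delay comes out at roughly ten seconds, and the small-angle form $\theta \approx \sqrt{2h/R}$ is indistinguishable from the exact angle at this height.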
The Sun has an angular diameter of 0.5°. Just after sunset, how high do you have to climb (assuming it takes no time) to bring the whole Sun back above the horizon?

Corresponding to the mass in linear or translational motion, we have the moment of inertia $I$ in rotational motion, defined as
\[ I = mr^2 \tag{3.8} \]
for a particle with mass $m$ at a distance $r$ from the center. The angular momentum $L$ is defined as
\[ L = I\omega = mr^2\omega , \qquad \text{or in vector form} \qquad \mathbf{L} = \mathbf{r} \times \mathbf{p} . \tag{3.9} \]
Obviously this depends on the choice of origin. Angular momentum is a very important quantity in many astrophysical situations. For example, when material collapses due to gravity, it always forms a disk first, because angular momentum is conserved and very hard to get rid of. This results in accretion disks, and it is also the reason why disk structures are seen everywhere in the Universe, from solar systems to galaxies. In spacecraft, angular momentum gradually builds up through pointing. The angular momentum is stored in reaction wheels and eventually needs to be cancelled by thrusters or other means.

A force acting on a particle does not necessarily change its angular velocity; the tangential component of the force must be non-zero to do so. We define the torque $\tau$ as the product of the tangential component of the force and the distance of the application point from the center, or in vector form
\[ \boldsymbol{\tau} = \mathbf{r} \times \mathbf{F} . \tag{3.10} \]
Newton's second law applied to rotation becomes
\[ \boldsymbol{\tau} = \frac{d\mathbf{L}}{dt} , \tag{3.11} \]
that is, the rate of change of the angular momentum equals the torque. If the net torque is zero, the angular momentum remains constant. This is the principle of conservation of angular momentum. (Do you know how to prove it?) The most important example is a system with a central force. In such a system, a particle moves under the influence of a force which always points toward or away from a fixed point. Since the force is always radial, its tangential component and hence the torque are always zero. The angular momentum is conserved.
Note that this depends on the choice of origin: angular momentum can be conserved about one point but not another, depending on the net torque (indeed it is usually not conserved about other points).

The kinetic energy of rotation is given by
\[ \mathrm{K.E.} = \frac{1}{2}I\omega^2 . \tag{3.12} \]
Before closing this section, we give an application of mechanics to spacecraft trajectories. Very often, a spacecraft will fly by one or more major planets, sometimes several times, before it goes to its destination, which could be Saturn, for example. The major reason for such a visit is to accelerate the spacecraft, which is called gravity assist. The point is that if we choose the trajectory of the spacecraft appropriately, the speed of the spacecraft relative to the Sun will be increased. A large amount of fuel can be saved. We now see how this works in detail.

[Figure 3.2: The figure at the left shows the velocities of the planet and the spacecraft. The middle shows the point of view in the center-of-mass frame. In this frame, the Sun is moving to the left. After the encounter with the planet, the right figure shows that the spacecraft gains speed by gravity assist.]

Suppose a spacecraft of mass $m$ is approaching a planet of mass $M \gg m$ head-on. Relative to the Sun, the planet moves with speed $v_1$ and the spacecraft with speed $v_2$ in the opposite direction, Fig. 3.2. To simplify the problem, we consider the center-of-mass frame, in which the total momentum is zero. We first determine the center-of-mass velocity $u$ relative to the Sun,
\[ (M + m)u = Mv_1 - mv_2 . \tag{3.13} \]
We have
\[ u = \frac{Mv_1 - mv_2}{M + m} . \tag{3.14} \]
Hence, the speeds of $M$ and $m$ in the center-of-mass frame are
\[ v_1' = v_1 - u = \frac{m(v_1 + v_2)}{M + m} \tag{3.15} \]
\[ v_2' = v_2 + u = \frac{M(v_1 + v_2)}{M + m} \tag{3.16} \]
respectively. We can see that the total momentum in this frame is $Mv_1' - mv_2' = 0$.

Assume for simplicity that after they interact gravitationally, the directions of their velocities are perpendicular to the original velocities in the center-of-mass frame.
(Do you know how to calculate their speeds if this assumption is not true? Hint: It depends on the angle of scattering.) The conservation of momentum and energy requires that their speeds do not change. (How to prove it?) Transforming back to the frame in which the Sun is at rest, the velocity of the spacecraft is given by the vector sum of the velocity of the spacecraft in the center-of-mass frame and the velocity of the center-of-mass frame. We find that its speed increases:
\[ \sqrt{v_2'^2 + u^2} = \sqrt{(v_2 + u)^2 + u^2} > v_2 . \tag{3.17} \]
Discussion: Assuming that the high temperature is not a problem, can a spacecraft use the Sun for gravity assist acceleration?

3.2 Newton's Gravitation

Over 300 years ago, Newton proposed a theory of gravitation, which essentially says that everything attracts everything. This theory explains not only the falling of an apple, but also the motions of planets around the Sun and even the motion of distant binary stars. Newton's law of gravitation states that every particle attracts any other particle with a force
\[ \mathbf{F} = G\frac{m_1 m_2}{r^2}\hat{\mathbf{r}} \tag{3.18} \]
where $m_1$ and $m_2$ are the masses of the two particles and $r$ is the distance between them. $G$ is the gravitational constant, whose value is
\[ G = 6.67 \times 10^{-8}\ \mathrm{cm^3\,g^{-1}\,s^{-2}} . \tag{3.19} \]
This law can be derived from general relativity in the limit of small potential and low speed. Note that this law holds only between two particles; for extended objects we need to sum over all particles (see the example below).

The direction of the force on one particle is toward the other particle. Thus, the gravitational force tends to pull them together. This simple statement has great implications. Since we have not found any "anti-gravity," the gravitational force cannot be cancelled and is cumulative. A greater mass creates a greater force. On astronomical scales, gravity is the dominant force.

A fine point about the masses in Eq. (3.18) is in order. To be precise, the mass in Eq. (3.18) is the gravitational mass, as opposed to the inertial mass in Eq. (3.1). The gravitational mass is a property of the particle which describes the magnitude of its gravitational influence on other objects. The inertial mass describes its response to force. These two kinds of mass need not be the same. The fact that they are exactly the same is called the equivalence principle, and it is the starting point of general relativity.

A brief review of gravitational force, potential energy, and potential (magnitudes are shown for the force):

Force: $F = GMm/r^2$  →  Potential energy: $E = -GMm/r$
(per unit mass ↓)        (per unit mass ↓)
Force per unit mass: $GM/r^2$  →  Potential: $V = -GM/r$

Example: The gravitational potential due to a point mass $M$ is easily deduced to be $V(r) = -GM/r$ at a point at distance $r$ from it. We now find by direct integration the gravitational potential due to a thin uniform spherical shell of material.

[Figure 3.3: The geometry of a uniform spherical shell of material. The point $P$ lies on the z-axis at $z_0$; $\theta$ and $\phi$ are the polar and azimuthal angles.]

Let the surface mass density of the shell be $\rho$, its radius be $R$, and its total mass be $M = 4\pi R^2\rho$. We choose the coordinate system, Fig. 3.3, such that the point of interest $P$ is on the z-axis with coordinates $(0, 0, z_0)$, where $z_0$ could be greater than or less than $R$ (outside or inside the shell). The surface element shown has area $R^2 \sin\theta\, d\theta\, d\phi$. Its distance from $P$ is $\sqrt{R^2\sin^2\theta + (R\cos\theta - z_0)^2}$. Hence, its contribution to the gravitational potential is
\[ dV = -\frac{GR^2(\rho \sin\theta\, d\theta\, d\phi)}{\sqrt{R^2\sin^2\theta + (R\cos\theta - z_0)^2}} . \tag{3.20} \]
The gravitational potential is then given by
\[ V = -\int\!\!\int \frac{GR^2\rho \sin\theta\, d\theta\, d\phi}{\sqrt{R^2\sin^2\theta + (R\cos\theta - z_0)^2}} \]
\[ = -2\pi GR^2\rho \int_0^{\pi} \frac{\sin\theta\, d\theta}{\sqrt{R^2 - 2Rz_0\cos\theta + z_0^2}} \]
\[ = 2\pi GR^2\rho \int_0^{\pi} \frac{d(\cos\theta)}{\sqrt{R^2 + z_0^2 - 2Rz_0\cos\theta}} \]
\[ = -2\pi GR^2\rho \int_{-1}^{1} \frac{dx}{\sqrt{R^2 + z_0^2 - 2Rz_0 x}} \]
\[ = -2\pi GR\rho \left[ -\frac{1}{z_0}\sqrt{R^2 + z_0^2 - 2Rz_0 x} \right]_{-1}^{1} . \tag{3.21} \]
Be very careful about how we take the square root. From the very definition of the potential, Eq. (3.20), we have to take the positive roots.
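As a cross-check on the integral in Eq. (3.21), the $\theta$ integral can be evaluated numerically and compared with the point-mass value $-GM/z_0$ outside the shell and the constant value $-GM/R$ inside it. This is only a sketch under stated assumptions: the midpoint rule, the function name `shell_potential`, and the unit choices $R = 1$, $M = 1$ are illustrative and not part of the notes.

```python
import math

G = 6.67e-8  # gravitational constant in cgs units, as in Eq. (3.19)


def shell_potential(z0, R=1.0, M=1.0, n=20000):
    """Potential of a thin uniform shell at distance z0 from its center.

    Evaluates the theta integral of Eq. (3.21) with the midpoint rule,
    always taking the positive square root for the distance.
    """
    rho = M / (4.0 * math.pi * R**2)   # surface mass density of the shell
    dtheta = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * dtheta
        dist = math.sqrt(R * R + z0 * z0 - 2.0 * R * z0 * math.cos(theta))
        total += math.sin(theta) / dist * dtheta
    return -2.0 * math.pi * G * R**2 * rho * total


if __name__ == "__main__":
    for z0 in (0.3, 0.6, 2.5):
        point_mass = -G * 1.0 / max(z0, 1.0)   # -GM/R inside, -GM/z0 outside
        print(z0, shell_potential(z0), point_mass)
```

The numerical values agree with $-GM/R$ for points inside the shell and with $-GM/z_0$ for points outside, which is what the positive-root evaluation of Eq. (3.21) should give.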
If $z_0 < R$,
\[ V = -\frac{2\pi GR\rho}{z_0}\left[(R + z_0) - (R - z_0)\right] = -4\pi GR\rho = -\frac{GM}{R} \tag{3.22} \]
which is independent of $z_0$. The potential is constant, so the force is zero. If $z_0 > R$,
\[ V = -\frac{2\pi GR\rho}{z_0}\left[(R + z_0) - (z_0 - R)\right] = -\frac{4\pi GR^2\rho}{z_0} = -\frac{GM}{z_0} . \tag{3.23} \]
As a result, outside the shell, it acts gravitationally as a point mass.

The gravitational potential energy in Eq. (3.5) is valid only near the Earth's surface. For an object above the Earth's surface, the gravitational potential energy is given by
\[ U = -G\frac{M_\oplus m}{r} \tag{3.24} \]
where $M_\oplus$ is the mass of the Earth, $m$ is the mass of the particle and $r$ is the distance of the particle from the center of the Earth. By convention, the potential energy at infinity is zero. If the particle is near the Earth's surface, $r = R_\oplus + h$, where $R_\oplus$ is the radius of the Earth and $h$ is small,
\[ \frac{1}{r} = \frac{1}{R_\oplus + h} = \frac{1}{R_\oplus}\left(1 + \frac{h}{R_\oplus}\right)^{-1} \approx \frac{1}{R_\oplus}\left(1 - \frac{h}{R_\oplus}\right) . \tag{3.25} \]
Eq. (3.24) becomes
\[ U = -G\frac{M_\oplus m}{r} \approx -G\frac{M_\oplus m}{R_\oplus} + \frac{GM_\oplus}{R_\oplus^2}mh . \tag{3.26} \]
Up to an irrelevant constant term, the potential energy is of the form $U = mgh$. We can check numerically that $g$ is equal to $GM_\oplus/R_\oplus^2$.

Imagine we throw a rock into the sky. The rock will fall back to the ground. However, if we throw it fast enough, it can escape the gravitational pull of the Earth and not return. The critical speed is called the escape velocity. By conservation of energy, it is easy to calculate the escape velocity. At the Earth's surface, the K.E. of the rock is $\frac{1}{2}mv^2$ and the potential energy (P.E.) is $-GM_\oplus m/R_\oplus$. At infinity, both K.E. and P.E. are zero;
\[ \frac{1}{2}mv^2 - G\frac{M_\oplus m}{R_\oplus} = 0 , \qquad v = \sqrt{\frac{2GM_\oplus}{R_\oplus}} \approx 11.2\ \mathrm{km\,s^{-1}} . \tag{3.27} \]
Exercise: What are the escape velocities of the solar system and the Milky Way?

As a warm-up for the next section, we recall uniform circular motion, in which a particle revolves around a center with constant speed $v$ and at constant distance $r$ from the center. Hence the angular speed of the particle, $\omega = v/r$, is constant. The period $T$ is given by
\[ T = \frac{2\pi}{\omega} . \tag{3.28} \]
The acceleration is given by
\[ a = \frac{v^2}{r} = \omega^2 r . \tag{3.29} \]
The particle needs a centripetal force to keep it in uniform circular motion.
If the force is provided by the gravitational force of an object with mass $M$ at the center, then
\[ G\frac{Mm}{r^2} = ma = m\omega^2 r , \tag{3.30} \]
which implies
\[ T^2 = \frac{4\pi^2}{GM}r^3 . \tag{3.31} \]
This is a special case of Kepler's third law.

Exercise: Assuming no air resistance, how fast does a bullet have to travel at the surface of the Earth so that it keeps going around in circular motion? Compare the value with the escape velocity.

3.2.1 Roche Lobe

We now study the gravitational potential around a binary star system. A large fraction of stars are in binary systems. Mass exchange can occur between the two stars. The Roche lobe is the equipotential surface which just encloses the two stars. The Roche lobe hugs the larger star tighter. The intersection point is called the Lagrangian point. Note that this is not at the same location as the center of mass. It is closer to the lighter star than to the heavier star. If matter flows out from one star of the binary, for example if one of them enters the red giant phase, it will first fill up the Roche lobe and then be channelled to the companion star via the Lagrangian point.

[Figure 3.4: The Roche lobe of a pair of stars, with the separation $d$ and the distance $l$ marked. Which star is more massive?]

Suppose the two stars have masses $m_1$, at the position with coordinates $(0, 0, 0)$, and $m_2$, at $(d, 0, 0)$. By symmetry, the Lagrangian point must lie on the x-axis. Let its coordinates be $(l, 0, 0)$. At the Lagrangian point, the gravitational forces due to the two stars are equal, so we have
\[ \frac{Gm_1}{l^2} = \frac{Gm_2}{(d - l)^2} . \tag{3.32} \]
Solving for $l$,
\[ l = \frac{d}{1 + \sqrt{m_2/m_1}} . \tag{3.33} \]
Notice that the Lagrangian point is between the two stars and is nearer to the lighter one. This is, in fact, not exactly correct. See Section 3.5 for a more detailed analysis of Lagrangian points.

3.2.2 Critical Density of the Universe

We can estimate the critical density of the universe with simple Newtonian gravity, although we need general relativity to derive it rigorously. Let the average mass density of the universe be $\rho$.
Suppose a galaxy is at a distance $r$ from us. Then, the total mass inside the sphere of radius $r$ is
\[ M = \frac{4}{3}\pi r^3\rho . \tag{3.34} \]
If the mass of the galaxy is $m$, the potential energy of the galaxy due to the mass in the sphere is given by
\[ U = -\frac{GMm}{r} = -\frac{4}{3}\pi G\rho m r^2 . \tag{3.35} \]
Assume that the velocity of the galaxy is radial, and that the speed is given by Hubble's law: the speed of a galaxy is proportional to its distance from us, i.e. $v = H_0 r$, where $H_0$ is the Hubble constant. The kinetic energy $T$ of the galaxy is
\[ T = \frac{1}{2}mv^2 = \frac{1}{2}mH_0^2 r^2 \tag{3.36} \]
and the total energy is
\[ E = T + U = mr^2\left(\frac{1}{2}H_0^2 - \frac{4}{3}\pi G\rho\right) . \tag{3.37} \]
If $E < 0$, the galaxy is bound, which means that the galaxy is not energetic enough to escape from the gravitational pull of the other mass. If $E > 0$, the galaxy is unbound and will fly away. Therefore, the critical density $\rho_c$ for which the galaxy is just bound is given by
\[ E = 0 \quad \Rightarrow \quad \rho_c = \frac{3H_0^2}{8\pi G} . \tag{3.38} \]
If we take the value of the Hubble constant as $H_0 = 70\ \mathrm{km\,s^{-1}\,Mpc^{-1}}$, then $\rho_c \approx 9.2 \times 10^{-30}\ \mathrm{g/cm^3}$. This is about the mass of five hydrogen atoms per cubic metre, but the average density observed in stars, gas, etc. (excluding dark matter and dark energy) is found to be only 0.2 hydrogen atoms per cubic metre.

3.2.3 Virial Theorem

This theorem states that for a gravitationally bound system in equilibrium, the time-averaged total energy is one-half of the time-averaged potential energy
\[ \langle E \rangle = \frac{1}{2}\langle U \rangle . \tag{3.39} \]
The proof goes as follows. For a system of particles, let $\mathbf{p}_i$ and $\mathbf{r}_i$ be the linear momentum and position of the $i$-th particle at some time $t$. Consider the quantity
\[ Q \equiv \sum_i \mathbf{p}_i \cdot \mathbf{r}_i . \tag{3.40} \]
We note that
\[ \frac{dQ}{dt} = \frac{d}{dt}\sum_i m_i \frac{d\mathbf{r}_i}{dt} \cdot \mathbf{r}_i = \frac{1}{2}\sum_i \frac{d^2}{dt^2}(m_i r_i^2) = \frac{1}{2}\frac{d^2 I}{dt^2} . \tag{3.41} \]
In equilibrium, we expect that $I$ should stay roughly the same. Hence, we assume that the time average of its derivative is zero
\[ \left\langle \frac{dQ}{dt} \right\rangle = 0 . \tag{3.42} \]
The time derivative of $Q$ is given by
\[ \frac{dQ}{dt} = \sum_i \left( \frac{d\mathbf{p}_i}{dt} \cdot \mathbf{r}_i + \mathbf{p}_i \cdot \frac{d\mathbf{r}_i}{dt} \right) . \tag{3.43} \]
We recognize the second term as twice the kinetic energy
\[ \sum_i \mathbf{p}_i \cdot \frac{d\mathbf{r}_i}{dt} = \sum_i \frac{1}{m_i}p_i^2 = 2T . \tag{3.44} \]
By Newton's second law,
\[ \frac{d\mathbf{p}_i}{dt} = \sum_{j \neq i} \frac{Gm_i m_j}{|\mathbf{r}_j - \mathbf{r}_i|^3}(\mathbf{r}_j - \mathbf{r}_i) \equiv \sum_{j \neq i} \mathbf{F}_{ij} . \tag{3.45} \]
Here, we sum only over $j$. Noticing that $\mathbf{F}_{ij} = -\mathbf{F}_{ji}$, we have
\[ \sum_i \frac{d\mathbf{p}_i}{dt} \cdot \mathbf{r}_i = \sum_i \sum_{j \neq i} \mathbf{F}_{ij} \cdot \mathbf{r}_i \]
\[ = \frac{1}{2}\left( \sum_i \sum_{j \neq i} \mathbf{F}_{ij} \cdot \mathbf{r}_i + \sum_j \sum_{i \neq j} \mathbf{F}_{ji} \cdot \mathbf{r}_j \right) \]
\[ = \frac{1}{2}\left( \sum_i \sum_{j \neq i} \mathbf{F}_{ij} \cdot \mathbf{r}_i + \sum_i \sum_{j \neq i} \mathbf{F}_{ji} \cdot \mathbf{r}_j \right) \]
\[ = \frac{1}{2}\sum_i \sum_{j \neq i} \left( \mathbf{F}_{ij} \cdot \mathbf{r}_i + \mathbf{F}_{ji} \cdot \mathbf{r}_j \right) \]
\[ = \frac{1}{2}\sum_i \sum_{j \neq i} \left( \mathbf{F}_{ij} \cdot \mathbf{r}_i - \mathbf{F}_{ij} \cdot \mathbf{r}_j \right) \]
\[ = \frac{1}{2}\sum_i \sum_{j \neq i} \mathbf{F}_{ij} \cdot (\mathbf{r}_i - \mathbf{r}_j) \]
\[ = -\frac{1}{2}\sum_i \sum_{j \neq i} \frac{Gm_i m_j}{|\mathbf{r}_j - \mathbf{r}_i|} = U . \tag{3.46} \]
We need the factor 1/2 to compensate for the double counting of pairs of particles. In summary, $U + 2T = dQ/dt$, and after taking the time average, $\langle U \rangle + 2\langle T \rangle = 0$. The total energy is $E = T + U$, and we have Eq. (3.39).

Stars are formed in nebulae. At the beginning, the density of gas and dust in a nebula is very low, and the material is moving slowly. Hence, the total energy is roughly zero. On the other hand, a star is gravitationally bound. Its potential energy is non-zero and negative. By the virial theorem, its total energy is also negative. This implies that energy must be transferred out for a nebula to form a star. Physically this happens via thermal radiation during star formation.

Another interesting consequence of the virial theorem is that as a star loses energy, its total energy becomes more negative. Since $\langle T \rangle = -\langle U \rangle/2$, the kinetic energy of the particles actually increases! The net result is that the star gets hotter while losing energy, somewhat like having a "negative heat capacity".

3.3 Two-body Problem

In this section, we talk about the motion of two bodies under their mutual gravitational attraction. Why study this? It can tell us the different orbits of planets and comets around the Sun, e.g. circle, ellipse, parabola, and hyperbola. After a brief review of Kepler's laws, we will derive the general solution to the two-body problem using Newtonian mechanics, and apply it to prove Kepler's laws.
3.3.1 Kepler's Laws of Planetary Motion

• First Law: A planet orbits the Sun in an ellipse, with the Sun at one focus of the ellipse.
• Second Law: A line connecting a planet to the Sun sweeps out equal areas in equal time intervals.
• Third Law: The orbital period $P$ of a planet and the semi-major axis $a$ of its orbit are related by $P^2 \propto a^3$.

3.3.2 Orbits in Two-body Problem

Our goal here is to derive the motion of objects in the two-body problem using Newton's law of gravitation. We will employ the transformations $\mathbf{r}_1, \mathbf{r}_2 \to \mathbf{r}$ and $m_1, m_2 \to \mu$ to solve Newton's second law.

Let us first assume that the force between the two bodies depends only on their relative position, so that the force $\mathbf{F}$ acting on the first body due to the second is a function of $\mathbf{r}_1 - \mathbf{r}_2$, where $\mathbf{r}_1$ and $\mathbf{r}_2$ are the positions of the two bodies. Then, if the masses of the two are $m_1$ and $m_2$, their equations of motion are
\[ m_1\dot{\mathbf{v}}_1 = \mathbf{F}(\mathbf{r}_1 - \mathbf{r}_2) \tag{3.47} \]
\[ m_2\dot{\mathbf{v}}_2 = -\mathbf{F}(\mathbf{r}_1 - \mathbf{r}_2) \tag{3.48} \]
where $\mathbf{v}_i = d\mathbf{r}_i/dt$ are their velocities. Let
\[ \mathbf{R} = \frac{m_1\mathbf{r}_1 + m_2\mathbf{r}_2}{m_1 + m_2} \tag{3.49} \]
be the position of the center of mass, and $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$ be the relative position. Then
\[ \mathbf{r}_1 = \mathbf{R} + \frac{m_2}{m_1 + m_2}\mathbf{r} \tag{3.50} \]
\[ \mathbf{r}_2 = \mathbf{R} - \frac{m_1}{m_1 + m_2}\mathbf{r} . \tag{3.51} \]
Substituting the above equations into the sum of Eq. (3.47) and Eq. (3.48), we have
\[ (m_1 + m_2)\ddot{\mathbf{R}} = 0 , \tag{3.52} \]
which just means that the center of mass of the system moves in a straight line with constant velocity. If we take $m_2$ times Eq. (3.47) minus $m_1$ times Eq. (3.48), we have
\[ m_1 m_2\dot{\mathbf{v}}_1 - m_1 m_2\dot{\mathbf{v}}_2 = (m_1 + m_2)\mathbf{F}(\mathbf{r}) \]
\[ \frac{m_1 m_2^2}{m_1 + m_2}\ddot{\mathbf{r}} + \frac{m_1^2 m_2}{m_1 + m_2}\ddot{\mathbf{r}} = (m_1 + m_2)\mathbf{F}(\mathbf{r}) \]
\[ \frac{m_1 m_2}{m_1 + m_2}\ddot{\mathbf{r}} = \mathbf{F}(\mathbf{r}) . \tag{3.53} \]
If we define $\mu \equiv m_1 m_2/(m_1 + m_2)$, then Eq. (3.53) reduces to a one-body problem, $\mathbf{F}(\mathbf{r}) = \mu\ddot{\mathbf{r}}$. Therefore, $\mu$ is called the reduced mass. If, for example, $m_2 \gg m_1$, then $\mathbf{r}$ is the position of the first body relative to the second and the reduced mass is approximately $m_1$. This is the case for planets orbiting the Sun in our solar system.
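The change of variables in Eqs. (3.49) to (3.53) can be illustrated with a few lines of code. This is a minimal sketch: the function names and the mass values for the Sun and the Earth are assumptions introduced for illustration, and positions are treated in one dimension for brevity.

```python
M_SUN = 1.989e33    # mass of the Sun in grams (assumed value)
M_EARTH = 5.972e27  # mass of the Earth in grams (assumed value)


def reduced_mass(m1, m2):
    """Reduced mass mu = m1 m2 / (m1 + m2), as defined after Eq. (3.53)."""
    return m1 * m2 / (m1 + m2)


def split_positions(R, r, m1, m2):
    """Recover r1 and r2 from the center-of-mass position R and the
    relative position r = r1 - r2, following Eqs. (3.50) and (3.51)."""
    r1 = R + m2 / (m1 + m2) * r
    r2 = R - m1 / (m1 + m2) * r
    return r1, r2


if __name__ == "__main__":
    mu = reduced_mass(M_EARTH, M_SUN)
    # For m2 >> m1 the reduced mass is very nearly m1.
    print("mu / m_earth =", mu / M_EARTH)

    r1, r2 = split_positions(R=2.0, r=3.0, m1=1.0, m2=4.0)
    # r1 - r2 reproduces r, and the mass-weighted mean reproduces R.
    print(r1 - r2, (1.0 * r1 + 4.0 * r2) / 5.0)
```

For the Earth and the Sun the ratio $\mu/m_1$ differs from 1 by only a few parts in a million, which is why the planet can be treated as a single body moving around a fixed Sun to a good approximation.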
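We now assume that $\mathbf{F}$ is a central force, which means that $\mathbf{F}$ is directed along $\mathbf{r}$ and its magnitude is a function of the distance only, $\mathbf{F} = F(r)\hat{\mathbf{r}}$. Hence, the torque is zero and angular momentum is conserved. If the angular momentum is denoted by $\mathbf{L}$, it is a constant vector. Since $\mathbf{r} \cdot \mathbf{L} = 0$, $\mathbf{r}$ always lies in the plane perpendicular to $\mathbf{L}$. Hence, the bodies move in a plane.

In the plane, we usually label the position of a point by its Cartesian coordinates $(x, y)$. Here, we also need the polar coordinate system $(r, \theta)$, Fig. 3.5. Their relations are
\[ \begin{cases} x = r\cos\theta \\ y = r\sin\theta \end{cases} \qquad \begin{cases} r = \sqrt{x^2 + y^2} \\ \theta = \tan^{-1}(y/x) \end{cases} . \tag{3.54} \]

[Figure 3.5: A particle moving on a plane with polar coordinates.]

The unit vectors $\hat{\mathbf{r}}$ and $\hat{\boldsymbol{\theta}}$ are illustrated in Fig. 3.5. In terms of the unit vectors $\hat{\mathbf{x}}$ and $\hat{\mathbf{y}}$ along the x- and y-axes, they are
\[ \begin{cases} \hat{\mathbf{r}} = \cos\theta\,\hat{\mathbf{x}} + \sin\theta\,\hat{\mathbf{y}} \\ \hat{\boldsymbol{\theta}} = -\sin\theta\,\hat{\mathbf{x}} + \cos\theta\,\hat{\mathbf{y}} \end{cases} . \tag{3.55} \]
For Newton's gravitation, from Eq. (3.53),
\[ \mu\ddot{\mathbf{r}} = -G\frac{m_1 m_2}{r^2}\hat{\mathbf{r}} \equiv -G\frac{M\mu}{r^2}\hat{\mathbf{r}} , \tag{3.56} \]
where $M \equiv m_1 + m_2$ is the total mass. We would like to express the vector $\ddot{\mathbf{r}}$ in terms of $r$, $\theta$ and their time derivatives. Since $\hat{\mathbf{x}}$ and $\hat{\mathbf{y}}$ are constant vectors, they do not change with time (but $\hat{\mathbf{r}}$ and $\hat{\boldsymbol{\theta}}$ do change). We have
\[ \begin{cases} \dot{\hat{\mathbf{r}}} = -\dot\theta\sin\theta\,\hat{\mathbf{x}} + \dot\theta\cos\theta\,\hat{\mathbf{y}} = \dot\theta\,\hat{\boldsymbol{\theta}} \\ \dot{\hat{\boldsymbol{\theta}}} = -\dot\theta\cos\theta\,\hat{\mathbf{x}} - \dot\theta\sin\theta\,\hat{\mathbf{y}} = -\dot\theta\,\hat{\mathbf{r}} \end{cases} . \tag{3.57} \]
We can now calculate $\ddot{\mathbf{r}}$:
\[ \mathbf{r} = r\hat{\mathbf{r}} \]
\[ \dot{\mathbf{r}} = \dot r\hat{\mathbf{r}} + r\dot{\hat{\mathbf{r}}} = \dot r\hat{\mathbf{r}} + r\dot\theta\,\hat{\boldsymbol{\theta}} \tag{3.58} \]
\[ \ddot{\mathbf{r}} = \ddot r\hat{\mathbf{r}} + \dot r\dot{\hat{\mathbf{r}}} + \dot r\dot\theta\,\hat{\boldsymbol{\theta}} + r\ddot\theta\,\hat{\boldsymbol{\theta}} + r\dot\theta\,\dot{\hat{\boldsymbol{\theta}}} = \ddot r\hat{\mathbf{r}} + 2\dot r\dot\theta\,\hat{\boldsymbol{\theta}} + r\ddot\theta\,\hat{\boldsymbol{\theta}} - r\dot\theta^2\hat{\mathbf{r}} = (\ddot r - r\dot\theta^2)\hat{\mathbf{r}} + (2\dot r\dot\theta + r\ddot\theta)\hat{\boldsymbol{\theta}} . \tag{3.59} \]
Comparing with Eq. (3.56) and separating into radial and azimuthal parts, we have
\[ \begin{cases} \ddot r - r\dot\theta^2 = -\dfrac{GM}{r^2} \\ 2\dot r\dot\theta + r\ddot\theta = 0 \end{cases} . \tag{3.60} \]
These are the equations of motion. We are interested only in the equation of the orbit (not in the time dependence), that is, the dependence of $r$ on $\theta$. Notice that for the azimuthal part
\[ \frac{d}{dt}(r^2\dot\theta) = 2r\dot r\dot\theta + r^2\ddot\theta = r(2\dot r\dot\theta + r\ddot\theta) = 0 \tag{3.61} \]
by the second equation of Eq. (3.60). Thus, $L \equiv \mu r^2\dot\theta$ is a constant of motion. It is in fact the angular momentum. We can rewrite the radial part of Eq. (3.60) as
\[ \ddot r = -\frac{GM}{r^2} + \frac{L^2}{\mu^2 r^3} , \qquad \text{or} \qquad \mu\ddot r = -\frac{GM\mu}{r^2} + \frac{L^2}{\mu r^3} . \tag{3.62} \]
This is the modified force law.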
It has a form similar to the law of gravitation, but slightly modified. This effective force has an extra term representing the centrifugal effect of the angular motion. It can be used to derive the effective potential energy of the system
\[ U_{\mathrm{eff}} = -\frac{GM\mu}{r} + \frac{L^2}{2\mu r^2} . \tag{3.63} \]
Although this is not a real potential energy but just a mathematical form, it helps us visualize the orbits of a particle with a given energy (Fig. 3.6).

[Figure 3.6: Effective potential.]

As $L = \mu r^2\dot\theta$, we can write $\dot r$ in terms of $L$ as
\[ \dot r = \frac{dr}{dt} = \frac{dr}{d\theta}\frac{d\theta}{dt} = \frac{L}{\mu r^2}\frac{dr}{d\theta} . \tag{3.64} \]
Therefore, the radial part, i.e. the first equation of Eq. (3.60), becomes
\[ \frac{L}{\mu r^2}\frac{d}{d\theta}\left( \frac{L}{\mu r^2}\frac{dr}{d\theta} \right) - r\left( \frac{L}{\mu r^2} \right)^2 = -\frac{GM}{r^2} \]
\[ \frac{d}{d\theta}\left( \frac{1}{r^2}\frac{dr}{d\theta} \right) - \frac{1}{r} = -\frac{GM\mu^2}{L^2} . \tag{3.65} \]
Let $u = 1/r$, so that $dr/d\theta = -(1/u^2)\,du/d\theta$. The above equation gives
\[ \frac{d^2 u}{d\theta^2} + u = \frac{GM\mu^2}{L^2} , \tag{3.66} \]
for which the general solution is
\[ \frac{1}{r} = u = \frac{GM\mu^2}{L^2}\left[ 1 + e\cos(\theta - \theta') \right] \tag{3.67} \]
where $e$ and $\theta'$ are two constants of integration. $\theta'$ just tells us how the orbit is oriented relative to our coordinate system. We can set it to 0 or $\pi$ such that $e \geq 0$. $e$ is an important parameter of the orbit, called the eccentricity.

If $e = 0$, the particle is at constant distance from the center. The orbit is a circle. More generally, for $e < 1$, the orbit is an ellipse. It is closed and bound. The particle revolves periodically around the center of mass, which is called the focus of the orbit. If $e = 1$, the orbit is a parabola. If $e > 1$, the orbit is a hyperbola. Both the parabola and the hyperbola are open orbits. The particle comes near the focus once and then goes away. This is the case for some comets: they enter the inner solar system, fly by the Sun only once, and never come back. For a particle on a parabolic orbit, the kinetic energy just balances the potential energy, i.e. the total energy is zero.
As it gets farther away from the focus, its speed decreases and tends to zero, while a particle on a hyperbolic orbit has a non-zero speed at infinity.

[Figure 3.7: Left: three kinds of orbits around the focus (ellipse, parabola, hyperbola). Right: conic sections.]

All these shapes are called conic sections, Fig. 3.7. The point on the orbit nearest to the focus is called the perihelion (the farthest point is called the aphelion). The distance between the perihelion and the focus is given by
\[ r_p = \frac{L^2}{GM\mu^2}\frac{1}{1 + e} . \tag{3.68} \]
This occurs when the right-hand side of Eq. (3.67) is the largest, that is, when $\theta = \theta'$.

Exercise: Show that the total energy of a two-body system is: (a) minimum for a circular orbit, (b) < 0 for an elliptical orbit, (c) = 0 for a parabolic orbit, and (d) > 0 for a hyperbolic orbit.

For closed orbits, we can calculate the period. We can put $\theta' = 0$ in Eq. (3.67) because the period does not depend on it. Then, the equation of the orbit is
\[ \frac{1}{r} = u = \frac{GM\mu^2}{L^2}(1 + e\cos\theta) . \tag{3.69} \]
We would like to calculate the area of the orbit. First transform to the Cartesian coordinate system: we have $r\cos\theta = x$ and $r + ex = L^2/GM\mu^2$. Hence,
\[ r = \frac{L^2}{GM\mu^2} - ex \]
\[ x^2 + y^2 = \left( \frac{L^2}{GM\mu^2} \right)^2 - \frac{2eL^2}{GM\mu^2}x + e^2 x^2 \]
\[ y^2 + (1 - e^2)x^2 + \frac{2eL^2}{GM\mu^2}x = \left( \frac{L^2}{GM\mu^2} \right)^2 \]
\[ y^2 + (1 - e^2)\left[ x + \frac{eL^2}{GM\mu^2(1 - e^2)} \right]^2 = \frac{1}{1 - e^2}\left( \frac{L^2}{GM\mu^2} \right)^2 . \tag{3.70} \]
This is the equation of an ellipse with a shifted center. Its area is equal to the area of the following ellipse
\[ y^2 + (1 - e^2)x^2 = \frac{1}{1 - e^2}\left( \frac{L^2}{GM\mu^2} \right)^2 . \tag{3.71} \]
Let $B^2 = \frac{1}{1 - e^2}\left( \frac{L^2}{GM\mu^2} \right)^2$ and $w = x\sqrt{1 - e^2}/B$. The area is
\[ A = 2\int_{-B/\sqrt{1 - e^2}}^{B/\sqrt{1 - e^2}} \sqrt{B^2 - (1 - e^2)x^2}\,dx = \frac{2B^2}{\sqrt{1 - e^2}}\int_{-1}^{1}\sqrt{1 - w^2}\,dw = \frac{\pi B^2}{\sqrt{1 - e^2}} = \frac{\pi}{(1 - e^2)^{3/2}}\left( \frac{L^2}{GM\mu^2} \right)^2 . \tag{3.72} \]

3.3.3 Proof of Kepler's Laws

First law: As discussed above, in a closed orbit, a planet moves around the center of mass of the system, which is the focus of the orbit.
If the star is much more massive than the planet, it will essentially sit at the focus.

Exercise: How much does the Sun move due to the Earth? Due to Jupiter?

Second law: The area element in polar coordinates is given by
\[ dA = dr\,(r\,d\theta) . \tag{3.73} \]
Integrating over $r$, the area swept out by a line joining the focus to a point on the ellipse, as the angle changes by $d\theta$, is
\[ dA = \frac{1}{2}r^2\,d\theta . \tag{3.74} \]
Then
\[ \frac{dA}{dt} = \frac{1}{2}r^2\frac{d\theta}{dt} = \frac{1}{2}r^2\dot\theta = \frac{1}{2}\frac{L}{\mu} . \tag{3.75} \]
This is a constant because the angular momentum $L$ is a constant.

[Figure 3.8: Area in polar coordinates, showing the area element $dA = r\,dr\,d\theta$.]

To calculate the period $P$, it follows from Eqs. (3.72) and (3.75) that
\[ P = \frac{2\mu A}{L} = \frac{2\pi L^3}{G^2 M^2 \mu^3 (1 - e^2)^{3/2}} . \tag{3.76} \]
Third law: The semi-major axis $a$ is the average of the perihelion and aphelion distances. From Eq. (3.68) and a similar expression for the aphelion,
\[ a = \frac{L^2}{2GM\mu^2}\left( \frac{1}{1 + e} + \frac{1}{1 - e} \right) = \frac{L^2}{GM\mu^2}\frac{1}{1 - e^2} . \tag{3.77} \]
Comparing Eqs. (3.76) and (3.77), we have
\[ P^2 = \frac{4\pi^2}{GM}a^3 . \tag{3.78} \]
This is the enhanced version of Kepler's third law. We have derived the constant of proportionality from Newton's law. This provides a very powerful tool to estimate the mass of celestial objects, from moons to planets to stars to galaxies.

Exercise: From Eq. (3.76), what is the relation between velocity and distance?

[Figure 3.9: Solid line: observed rotation curve of a galaxy. Dotted line: prediction from Kepler's third law.]

In a galaxy, stars are concentrated at the center; therefore, the rotation curve is expected to follow Kepler's third law except at the very center. However, the observed rotation curves are much flatter (see Figure 3.9). The discrepancy is attributed to dark matter. For the Milky Way, it is estimated that about 90% of the mass is dark matter, while ordinary matter makes up only about 10%.

3.4 Impact Parameter and Scattering Angle

We now study the two-body problem from another point of view.
Consider a light body approaching a very heavy body from far away, for example, a satellite approaching a planet. (Does it sound familiar?) The results of the last section tell us that the reduced mass is very close to the mass of the light body and that the trajectory is a hyperbola. Let the mass of the heavy body be $M$ and the mass of the light body be $m$. We consider $M \gg m$ such that $\mu \approx m$. When the light body is far away from the heavy body, let its incident velocity be $v$, and let the impact parameter $b$ be the perpendicular distance between the heavy body and the line of the incident velocity (Fig. 3.10). We would like to determine the scattering angle $\Theta$.

Figure 3.10: Impact parameter of a two body system.

When the light body is far away, the angular momentum is $L = mbv$, which is a constant of motion. At perihelion, its velocity $v_p$ is perpendicular to the line joining the two bodies. Hence, $v_p = r\dot{\theta}$ at perihelion. (At other points on the trajectory, the velocity has a radial component, see Eq. 3.58.) As $L = m r^2 \dot{\theta}$, we have

\[ mbv = L = m r^2 \dot{\theta} = m r_p (r_p \dot{\theta}) = m r_p v_p \tag{3.79} \]

and by Eq. (3.68),

\[ v_p = \frac{bv}{r_p} = bv\, \frac{GM(1+e)}{(bv)^2} = \frac{GM(1+e)}{bv} . \tag{3.80} \]

By conservation of energy,

\begin{align}
\frac{1}{2} m v^2 &= \frac{1}{2} m v_p^2 - \frac{GMm}{r_p} \notag \\
v^2 &= \left( \frac{GM(1+e)}{bv} \right)^2 - 2GM\, \frac{GM(1+e)}{(bv)^2} \notag \\
v^2 &= \frac{G^2 M^2 (1+e)(e-1)}{b^2 v^2} \notag \\
\frac{b^2 v^4}{G^2 M^2} &= e^2 - 1 \notag \\
e &= \sqrt{1 + \frac{b^2 v^4}{G^2 M^2}} . \tag{3.81}
\end{align}

We see from this equation that $e > 1$, and hence we have a mathematical proof that the trajectory is a hyperbola.

For simplicity, let us define our coordinate system such that $\theta' = 0$ in Eq. (3.67) when $r = r_p$. Then from Eq. (3.68), the light body goes infinitely far away at the angles satisfying

\[ 1 + e \cos\theta = 0 , \qquad \cos\theta = -\frac{1}{e} . \tag{3.82} \]

A negative $\cos\theta$ means that $\theta > \pi/2$. We can also write

\[ \theta = \pi - \cos^{-1} \frac{1}{e} . \tag{3.83} \]

Finally, from the geometry in Fig. 3.10, the scattering angle $\Theta$ is given by $\theta + (\theta - \Theta) = \pi$. Hence,

\begin{align}
\Theta &= 2\theta - \pi \notag \\
&= \pi - 2 \cos^{-1} \frac{1}{e} \notag \\
&= \pi - 2 \cos^{-1} \left( 1 + \frac{b^2 v^4}{G^2 M^2} \right)^{-1/2} . \tag{3.84}
\end{align}

Let us check the limit of a very large impact parameter, in which the light body should not be affected much by the heavy body, so we expect a small scattering angle. If $b$ is large, $e$ is large, $\cos^{-1}(1/e)$ is close to $\pi/2$, and $\Theta$ is indeed small.

This calculation also sheds some light on the problem of gravity assist. Since in this section we have assumed that the heavy body does not move, we are essentially in the center-of-mass frame. We mentioned in Section 3.1 that the final speed of the spacecraft depends on the scattering angle. We now know that we can control the scattering angle by adjusting the impact parameter and the incident velocity.

3.5 Restricted Three-body Problem

In general, the three-body problem is very difficult. It is sometimes chaotic and does not have an analytical solution. In this section, we will discuss a special case and say some more about Lagrangian points.

Figure 3.11: The positions of the three bodies.

We assume that two point masses $m_1$ and $m_2$ revolve around their center of mass in circular motion. The origin of the coordinates is chosen to be at the center of mass, and at time $t = 0$, $m_1$ is on the positive x-axis. If $a$ and $b$ are the distances of the masses from the origin (Fig. 3.11), then $a/b = m_1/m_2$. The angular speed of the point masses is $n$, where $n^2 = G(m_1 + m_2)/(a + b)^3$. Let $c \equiv \cos nt$ and $s \equiv \sin nt$. Then, the positions of $m_1$ and $m_2$ are respectively $(cb, sb)$ and $(-ca, -sa)$.

The third body is assumed to always lie in the orbital plane defined by the two masses. We also assume that it is very light (i.e. a test particle) and does not influence the motion of the other two. This is the case, for example, if the two masses are the Sun and Jupiter, while the third body is an asteroid.

If the position of the third body is $(X, Y)$, then $\mathbf{R}_1$ and $\mathbf{R}_2$ defined in the figure are

\begin{align}
\mathbf{R}_1 &= (X, Y) - (cb, sb) = (X - cb,\; Y - sb) \tag{3.85} \\
\mathbf{R}_2 &= (X, Y) - (-ca, -sa) = (X + ca,\; Y + sa) . \tag{3.86}
\end{align}

The equations of motion are

\begin{align}
\ddot{X} &= -\frac{Gm_1}{R_1^3} (X - cb) - \frac{Gm_2}{R_2^3} (X + ca) \tag{3.87} \\
\ddot{Y} &= -\frac{Gm_1}{R_1^3} (Y - sb) - \frac{Gm_2}{R_2^3} (Y + sa) \tag{3.88}
\end{align}

where $R_i$ is the magnitude of $\mathbf{R}_i$.

We now transform to a moving coordinate system which rotates with the two masses. This is called the co-rotating frame. It is not an inertial reference frame, meaning that there will be extra terms in Newton's second law. Physically, it could mean that we see the Sun and the asteroid from Jupiter. Mathematically, we define $(x, y)$ by

\begin{align}
X &\equiv cx - sy \tag{3.89} \\
Y &\equiv sx + cy . \tag{3.90}
\end{align}

Note that even if $x$ and $y$ are time independent, $X$ and $Y$ still depend on time through $c$ and $s$. In terms of $x$ and $y$, $R_1^2 = (x - b)^2 + y^2$ and $R_2^2 = (x + a)^2 + y^2$. Since

\begin{align}
\dot{X} &= -nsx + c\dot{x} - ncy - s\dot{y} \tag{3.91} \\
\ddot{X} &= -n^2 cx - 2ns\dot{x} + c\ddot{x} + n^2 sy - 2nc\dot{y} - s\ddot{y} \tag{3.92} \\
\dot{Y} &= ncx + s\dot{x} - nsy + c\dot{y} \tag{3.93} \\
\ddot{Y} &= -n^2 sx + 2nc\dot{x} + s\ddot{x} - n^2 cy - 2ns\dot{y} + c\ddot{y} , \tag{3.94}
\end{align}

the equations of motion are

\begin{align}
-n^2 cx - 2ns\dot{x} + c\ddot{x} + n^2 sy - 2nc\dot{y} - s\ddot{y}
&= -\frac{Gm_1}{R_1^3} (cx - cb - sy) - \frac{Gm_2}{R_2^3} (cx + ca - sy) \tag{3.95} \\
-n^2 sx + 2nc\dot{x} + s\ddot{x} - n^2 cy - 2ns\dot{y} + c\ddot{y}
&= -\frac{Gm_1}{R_1^3} (sx - sb + cy) - \frac{Gm_2}{R_2^3} (sx + sa + cy) . \tag{3.96}
\end{align}

If we calculate the sum of $c$ times Eq. (3.95) and $s$ times Eq. (3.96), we have

\[ -n^2 x + \ddot{x} - 2n\dot{y} = -\frac{Gm_1}{R_1^3} (x - b) - \frac{Gm_2}{R_2^3} (x + a) . \tag{3.97} \]

Subtracting $s$ times Eq. (3.95) from $c$ times Eq. (3.96), we have

\[ 2n\dot{x} - n^2 y + \ddot{y} = -\frac{Gm_1}{R_1^3}\, y - \frac{Gm_2}{R_2^3}\, y . \tag{3.98} \]

Figure 3.12: Left: The positions of the five Lagrangian points relative to the two masses. Right: Gravitational potential of the system in the co-rotating frame.

These two equations are the main equations. We will not study the general solutions. Instead, we will only find the equilibrium points, that is, the solutions with fixed $x$ and $y$ values.
If $x$ and $y$ do not depend on time, $\dot{x} = \ddot{x} = \dot{y} = \ddot{y} = 0$. Substituting into Eq. (3.97) and Eq. (3.98), we have

\begin{align}
n^2 x &= \frac{Gm_1}{R_1^3} (x - b) + \frac{Gm_2}{R_2^3} (x + a) \tag{3.99} \\
n^2 y &= \frac{Gm_1}{R_1^3}\, y + \frac{Gm_2}{R_2^3}\, y . \tag{3.100}
\end{align}

If $y = 0$, Eq. (3.99) gives us three solutions. They are the Lagrangian points $L_1$, $L_2$ and $L_3$. For $m_1 \gg m_2$, Eq. (3.99) can be solved to obtain

\begin{align}
L_1 &: \left( -a + (a + b) \left( \frac{b}{3a} \right)^{1/3},\; 0 \right) \notag \\
L_2 &: \left( -a - (a + b) \left( \frac{b}{3a} \right)^{1/3},\; 0 \right) \tag{3.101} \\
L_3 &: \left( a + b + \frac{5}{12}\, b,\; 0 \right) . \notag
\end{align}

$L_1$ lies between the two masses, and it is in fact the Lagrangian point that we found in Eq. (3.33), but now we have the correction (the left hand side of Eq. (3.99)) coming from the rotation of the two masses. All three of these Lagrangian points lie on the line joining the two masses (Fig. 3.12).

If $y \neq 0$, by Eq. (3.100), we have

\[ n^2 = \frac{Gm_1}{R_1^3} + \frac{Gm_2}{R_2^3} . \tag{3.102} \]

Substituting this into Eq. (3.99), we have

\begin{align}
\left( \frac{Gm_1}{R_1^3} + \frac{Gm_2}{R_2^3} \right) x &= \frac{Gm_1}{R_1^3} (x - b) + \frac{Gm_2}{R_2^3} (x + a) \notag \\
0 &= -\frac{Gm_1}{R_1^3}\, b + \frac{Gm_2}{R_2^3}\, a \notag \\
R_1^3 &= R_2^3 \tag{3.103}
\end{align}

because $m_1 b = m_2 a$. Hence, $x = (b - a)/2$. From the facts that $n^2 = G(m_1 + m_2)/(a + b)^3$, $R_1^3 = R_2^3$ and Eq. (3.102), we have

\begin{align}
R_2^3 &= (a + b)^3 \notag \\
\left( \frac{a + b}{2} \right)^2 + y^2 &= (a + b)^2 \notag \\
y &= \pm \frac{\sqrt{3}}{2} (a + b) . \tag{3.104}
\end{align}

Thus, the coordinates of the fourth and fifth Lagrangian points $L_4$, $L_5$ are

\begin{align}
L_4 &: \left( \frac{1}{2} (b - a),\; \frac{\sqrt{3}}{2} (a + b) \right) \notag \\
L_5 &: \left( \frac{1}{2} (b - a),\; -\frac{\sqrt{3}}{2} (a + b) \right) . \tag{3.105}
\end{align}

It can be shown that $L_1$, $L_2$ and $L_3$ are unstable¹, in the sense that material at these points would fly away if slightly perturbed. We saw that material will be transferred from one star to another through $L_1$. On the other hand, $L_4$ and $L_5$ are stable if $m_1/m_2 > 25$. This is the case for the Sun-Earth and Earth-Moon systems.

The Lagrangian points have been used as parking lots for satellites. For the Sun-Earth system, the solar observatory SOHO is at $L_1$, such that it can observe the Sun continuously. $L_2$ is used by the microwave probes WMAP and Planck, which can observe the sky without any interference from the Sun and the Earth.
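To get a feel for the numbers, Eqs. (3.99) to (3.105) can be evaluated directly. The following is a minimal sketch rather than part of the derivation: the function names `lagrange_points` and `residual`, the choice $G = 1$, and the rounded Sun-Earth values in the usage note are all assumptions made for illustration. `residual` returns the left side minus the right side of Eqs. (3.99) and (3.100), so it should vanish at a true equilibrium point.

```python
import math


def lagrange_points(m1, m2, sep):
    """Lagrangian points in the co-rotating frame.

    Convention of the text: m1 sits at (b, 0) and m2 at (-a, 0), with
    a + b = sep and m1 * b = m2 * a.  L1 to L3 use the m1 >> m2
    approximations of Eq. (3.101); L4 and L5 follow Eq. (3.105).
    """
    b = sep * m2 / (m1 + m2)
    a = sep - b
    offset = sep * (b / (3.0 * a)) ** (1.0 / 3.0)
    y45 = math.sqrt(3.0) / 2.0 * sep
    return {
        "L1": (-a + offset, 0.0),
        "L2": (-a - offset, 0.0),
        "L3": (sep + 5.0 * b / 12.0, 0.0),
        "L4": ((b - a) / 2.0, y45),
        "L5": ((b - a) / 2.0, -y45),
    }


def residual(x, y, m1, m2, sep, G=1.0):
    """Left side minus right side of Eqs. (3.99) and (3.100) at (x, y)."""
    b = sep * m2 / (m1 + m2)
    a = sep - b
    n2 = G * (m1 + m2) / sep**3          # n^2 of the circular orbit
    r1 = math.hypot(x - b, y)            # distance to m1
    r2 = math.hypot(x + a, y)            # distance to m2
    fx = n2 * x - G * m1 * (x - b) / r1**3 - G * m2 * (x + a) / r2**3
    fy = n2 * y - G * m1 * y / r1**3 - G * m2 * y / r2**3
    return fx, fy
```

For example, `lagrange_points(1.0, 3.0e-6, 1.496e8)` (masses in solar masses, separation in km, both rounded values assumed here) places $L_1$ and $L_2$ roughly 1.5 million km from the Earth. Calling `residual` at $L_4$ or $L_5$ returns zero up to rounding error for any mass ratio, because Eq. (3.105) is exact, whereas at $L_1$ to $L_3$ it is only small, because Eq. (3.101) is an approximation.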
It will also be used by the James Webb Space Telescope in the future. $L_3$ is too far away to be useful, but it is a popular place for the hypothetical counter-Earth. The STEREO satellites were able to observe $L_3$ and have ruled out the existence of any large objects there. $L_4$ and $L_5$ are home to many asteroids. This class of objects is called Trojan asteroids. The STEREO satellites have visited $L_4$ and $L_5$ to search for Trojan asteroids. Many asteroids have also been discovered at $L_4$ and $L_5$ of the Sun-Jupiter system.

Question: In Fig. 3.12 right, why do the Lagrangian points seem to lie on top of the potential? And why does the potential fall off at large distance?

¹ See http://wmap.gsfc.nasa.gov/media/ContentMedia/lagrange.pdf.

Chapter 4 Introduction to Radiative Processes

(Chapters 3.4–3.6, 9.1–9.4 in textbook.)

In astrophysics, one studies distant objects through their emission. It is very important to understand how the radiation is generated and transmitted. We will briefly introduce radiative transfer and blackbody radiation.

4.1 Solid Angle

Recall that one way to define an angle is as the ratio of the length of the arc that subtends it to the radius of the arc, $\theta = l/r$ (left diagram of Fig. 4.1). The angle of a whole circle is, of course, $2\pi$.

Figure 4.1: Angle is the ratio of arc length to radius. The solid angle is the ratio of the area to the square of the distance.

Suppose there is a sphere at some distance from us. We could talk about its angular size, in units of degrees or radians. However, if we want to talk about how much of the sky is blocked by the sphere, we are talking about the solid angle (Fig. 4.1). It is defined as the ratio of the area to the square of the distance, $\Omega = A/r^2$, and has units of steradians (sr). The solid angle of the whole sky is $4\pi$ sr.

To find the infinitesimal solid angle, consider the infinitesimal area in Fig. 4.2.
The two sides are $r\,d\theta$ and $r \sin\theta\, d\phi$, hence the area is $dA = r^2 \sin\theta\, d\theta\, d\phi$ and $d\Omega = dA/r^2$. This gives

\[ d\Omega = \sin\theta\, d\theta\, d\phi . \tag{4.1} \]

Figure 4.2: The differential solid angle is $d\Omega = \sin\theta\, d\theta\, d\phi$.

Figure 4.3: The shaded area is one eighth of a face of a cube, from the point of view of the +z-axis.

Example: If you are at the center of a cube, what is the solid angle subtended by one face of the cube? By symmetry, there are six faces and the solid angle of the whole sky is $4\pi$, hence the solid angle of one face should be $2\pi/3$.

Let's do it the hard way. Let the length of one side of the cube be $2a$. We consider the face at $z = a$, $-a \le x, y \le a$. The required solid angle is eight times the solid angle of the shaded area in Fig. 4.3. We have to figure out the ranges of $\phi$ and $\theta$. It is easy for $\phi$: $0 \le \phi \le \pi/4$. The length of the dark line in Fig. 4.3 is $a/\cos\phi$, and the $\theta$-coordinate of the point $(x, y, a)$ at its end satisfies

\[ \tan\theta = \frac{a/\cos\phi}{a} = \frac{1}{\cos\phi} . \tag{4.2} \]

Hence, the range of $\theta$ is $0 \le \theta \le \alpha$, where $\alpha \equiv \tan^{-1}(1/\cos\phi)$. The required solid angle is

\begin{align}
\Omega &= 8 \int_{\phi=0}^{\pi/4} \int_{\theta=0}^{\alpha} \sin\theta\, d\theta\, d\phi \notag \\
&= 8 \int_0^{\pi/4} \left[ -\cos\theta \right]_0^{\alpha} d\phi \notag \\
&= 8 \int_0^{\pi/4} (1 - \cos\alpha)\, d\phi . \tag{4.3}
\end{align}

We need

\[ \cos^2\alpha = \frac{1}{1 + \tan^2\alpha} = \frac{1}{1 + 1/\cos^2\phi} = \frac{\cos^2\phi}{1 + \cos^2\phi} . \tag{4.4} \]

If we put $u = \sin\phi$, then

\begin{align}
\Omega &= 8 \int_0^{\pi/4} \left( 1 - \frac{\cos\phi}{\sqrt{1 + \cos^2\phi}} \right) d\phi \notag \\
&= 2\pi - 8 \int_0^{\sqrt{2}/2} \frac{du}{\sqrt{2 - u^2}} \notag \\
&= 2\pi - 8 \left[ \sin^{-1} \frac{u}{\sqrt{2}} \right]_0^{\sqrt{2}/2} \notag \\
&= 2\pi - 8 (\pi/6) \notag \\
&= \frac{2\pi}{3} . \tag{4.5}
\end{align}

4.2 Specific Intensity and Flux

In this section, we will discuss the various fluxes, the intensity and the energy density. These are the quantities used to describe a radiation field and light rays.

Figure 4.4: For the definition of specific intensity.

Imagine being inside a region filled with radiation, or photons, with all kinds of frequencies and going in all directions.
Consider photons (or light rays) traveling in a specific direction $\vec{n}$. The specific intensity, $I_\nu$, is defined as the energy carried by the photons that pass through a small area $dA$ perpendicular to $\vec{n}$, with directions within a small solid angle $d\Omega$ around $\vec{n}$, with frequencies between $\nu$ and $\nu + d\nu$, in a time $dt$ (see Fig. 4.4):

\[ I_\nu \equiv \frac{dE_\nu}{dA\, dt\, d\Omega\, d\nu} . \tag{4.6} \]

Physically, the specific intensity can be understood as the brightness. It is generally a function of direction $\Omega$, position, $\nu$ and time, and has dimensions of energy per unit area per unit time per unit solid angle per unit frequency. In c.g.s. the units are erg/s/cm²/sr/Hz. Note that $I_\nu$ is independent of distance! It is like surface brightness (magnitude per square arcsec). The surface brightness of a star does not fade with distance; the star just gets smaller, i.e. it covers a smaller solid angle. If we take two pictures of the Sun, one from Earth and one from Venus, using the same camera setting, the surface brightness of the Sun will look identical in the two pictures.

If we want to express the specific intensity per unit wavelength instead of per unit frequency, we use the relation between wavelength and frequency, $c = \nu\lambda$, and define

\[ I_\lambda = \frac{c}{\lambda^2}\, I_\nu , \tag{4.7} \]

so that $|I_\nu\, d\nu| = |I_\lambda\, d\lambda|$ and

\[ \int_0^\infty I_\nu\, d\nu = \int_0^\infty I_\lambda\, d\lambda . \tag{4.8} \]

What is the unit of $I_\lambda$?

Figure 4.5: For the definition of energy flux.

Next, we consider the case when the light ray direction is not perpendicular to $dA$. As shown in Figure 4.5, if $\theta = 90^\circ$ the flux passing through the area $dA$ is zero, and the flux is maximum when $\theta = 0$. For a general $\theta$, the effective area is reduced by a factor of $\cos\theta$. Therefore, the specific energy flux passing through a small area $dA$ from the directions within $d\Omega$ is

\[ dF_\nu = I_\nu \cos\theta\, d\Omega . \tag{4.9} \]

To obtain the net flux (actually the flux density, as it is per unit frequency), we have to integrate over all directions:

\[ F_\nu = \int I_\nu \cos\theta\, d\Omega . \tag{4.10} \]

It is in units of erg s⁻¹ cm⁻² Hz⁻¹.
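As an illustration of how Eq. (4.10) is evaluated in practice, the sketch below integrates it numerically for a source of uniform intensity that fills a circular cap $0 \le \theta \le \theta_c$ centred on the normal of $dA$, with no radiation from elsewhere. This geometry, the function name `flux_from_cap` and the midpoint rule are assumptions chosen for the example, not something prescribed by the text. Using $d\Omega = \sin\theta\, d\theta\, d\phi$ from Eq. (4.1), the $\phi$ integral contributes a factor $2\pi$.

```python
import math


def flux_from_cap(intensity, theta_c, n=100_000):
    """Midpoint-rule estimate of F = integral of I cos(theta) dOmega.

    The intensity is assumed uniform over the cap 0 <= theta <= theta_c
    (in radians) and zero elsewhere, so the phi integral gives 2*pi.
    """
    dtheta = theta_c / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        total += math.cos(theta) * math.sin(theta) * dtheta
    return 2.0 * math.pi * intensity * total
```

For this simple geometry the integral can also be done by hand, giving $F_\nu = \pi I_\nu \sin^2\theta_c$, which the numerical result should reproduce. For a small cap this is approximately $I_\nu$ times the solid angle $\pi\theta_c^2$ of the cap.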
Exercise: Show that if the radiation is isotropic, i.e., $I_\nu$ does not depend on $\Omega$, then $F_\nu = 0$.

Hint: You will see this type of integral $\int d\Omega$ a lot in astrophysics. In the actual calculation, one needs to express $d\Omega$ in terms of $\theta$ and $\phi$ first. Also, it is sometimes useful to make the substitution $\mu = \cos\theta$, which gives $d\mu = -\sin\theta\, d\theta$, such that

\[ \int_0^\pi f(\theta) \sin\theta\, d\theta = \int_{-1}^{1} f(\mu)\, d\mu . \tag{4.11} \]

Example: Constancy of specific intensity along rays in free space. Consider any ray L and any two points along the ray. As shown in Fig. 4.6, construct areas $dA_1$ and $dA_2$ normal to the ray at these points. By energy conservation, the energy carried by the set of rays passing through both $dA_1$ and $dA_2$ can be expressed in two ways:

\[ dE_1 = I_{\nu 1}\, dA_1\, dt\, d\Omega_1\, d\nu_1 = dE_2 = I_{\nu 2}\, dA_2\, dt\, d\Omega_2\, d\nu_2 . \]

Here $d\Omega_1$ is the solid angle subtended by $dA_2$ at $dA_1$ and so forth. Since $d\Omega_1 = dA_2/R^2$, $d\Omega_2 = dA_1/R^2$, and $d\nu_1 = d\nu_2$, we have $I_{\nu 1} = I_{\nu 2}$, i.e. $I_\nu$ = constant along a ray.

Figure 4.6: Constancy of intensity along rays.

We now prove the inverse square law. For a distant star, one can take $\cos\theta = 1$. Since $I_\nu$ is independent of distance,

\[ F_\nu = \int I_\nu \cos\theta\, d\Omega = I_\nu \Omega . \tag{4.12} \]

The solid angle of the star is $\Omega = \pi R^2/d^2$, where $R$ is the stellar radius and $d$ is the distance. Hence, $F_\nu \propto d^{-2}$. Physically, the surface of the star does not get fainter with distance, but its solid angle gets smaller, and therefore the total flux is smaller.

A photon with energy $E$ carries a momentum $E/c$. The momentum flux $p_\nu$ is the momentum normal to $dA$ that is carried across it per unit time per unit area, which equals the pressure. Imagine a photon striking a wall: only the perpendicular component of the momentum changes and exerts pressure. Therefore, only the perpendicular component matters, and this introduces an extra factor of $\cos\theta$ in the momentum flux:

\[ p_\nu = \frac{1}{c} \int I_\nu \cos^2\theta\, d\Omega . \tag{4.13} \]

We define the specific energy density, $u_\nu$, as the energy per unit volume per unit frequency. It has units of erg/cm³/Hz.
Consider the cylinder in Figure 4.7, of cross section $dA$ and length $ds = c\,dt$. The total energy in the volume is $u_\nu\, dA\, c\,dt\, d\nu$.

Figure 4.7: For the definition of energy density.

On the other hand, after a time $dt$, all photons inside will have come out, passing through the area $dA$ into all solid angles. From the definition of $I_\nu$, the total energy passing through can also be expressed as $\left( \int I_\nu\, d\Omega \right) dA\, dt\, d\nu$. Equating these two,

\begin{align}
\left( \int I_\nu\, d\Omega \right) dA\, dt\, d\nu &= u_\nu\, dA\, c\, dt\, d\nu \notag \\
u_\nu &= \frac{1}{c} \int I_\nu\, d\Omega . \tag{4.14}
\end{align}

Finally, we are going to prove that for isotropic radiation, the radiation pressure $p$ is equal to 1/3 of the energy density. Consider a system consisting of a container with an isotropic radiation field inside. Since the photons have to turn around at the boundary, the radiation pressure on the boundary is twice the momentum flux, but we integrate only over $2\pi$ of solid angle:

\[ p = 2 \int p_\nu\, d\nu = \frac{2}{c} \int \int_{\theta=0}^{\pi/2} I_\nu \cos^2\theta\, d\Omega\, d\nu . \tag{4.15} \]

By isotropy, $I_\nu$ does not depend on $\Omega$, and $\int_{\theta=0}^{\pi/2} \cos^2\theta\, d\Omega = 2\pi/3$. Therefore,

\[ p = \frac{4\pi}{3c} \int I_\nu\, d\nu . \tag{4.16} \]

On the other hand, by Eq. (4.14), the energy density is

\[ u = \int u_\nu\, d\nu = \frac{4\pi}{c} \int I_\nu\, d\nu = 3p . \tag{4.17} \]

This relation is the equation of state for radiation. In cosmology, it can be used to describe the radiation-dominated era in the early Universe.

Similarly, the energy flux flowing out of the boundary is

\[ F \equiv \int F_\nu\, d\nu = \int I_\nu\, d\nu \int_{\theta=0}^{\pi/2} \cos\theta\, d\Omega = \pi \int I_\nu\, d\nu . \tag{4.18} \]

It is related to the energy density by $F = cu/4$.

4.3 Emission and Absorption

The monochromatic emission coefficient $j_\nu$ is defined as the energy emitted per unit time per unit solid angle per unit volume per unit frequency:

\[ j_\nu \equiv \frac{dE}{dV\, d\Omega\, dt\, d\nu} \tag{4.19} \]

and it has units of erg/cm³/s/sr/Hz. When a beam of cross section $dA$ travels through an emitting region for a distance $ds$, the volume it covers is $dV = dA\, ds$. The intensity it gains is

\[ dI_\nu = j_\nu\, ds . \tag{4.20} \]

For absorption, we first note that the amount of absorption depends on the intensity of the incident beam, e.g., no absorption can occur if the beam contains no energy.
Therefore, the absorption coefficient $\alpha_\nu$ is defined through the fractional change in the beam intensity after traveling a distance $ds$:

\[ dI_\nu = -\alpha_\nu I_\nu\, ds . \tag{4.21} \]

$\alpha_\nu$ is in units of cm⁻¹. Putting Eqs. (4.20) and (4.21) together, the general form of the radiative transfer equation is

\[ \frac{dI_\nu}{ds} = -\alpha_\nu I_\nu + j_\nu . \tag{4.22} \]

For pure emission, $\alpha_\nu = 0$, the solution is

\[ I_\nu(s) = I_\nu(s_0) + \int_{s_0}^{s} j_\nu(s')\, ds' . \tag{4.23} \]

For pure absorption, $j_\nu = 0$, the solution is

\[ I_\nu(s) = I_\nu(s_0) \exp\left( -\int_{s_0}^{s} \alpha_\nu(s')\, ds' \right) . \tag{4.24} \]

The integral in the exponent is called the optical depth

\[ \tau_\nu \equiv \int_{s_0}^{s} \alpha_\nu(s')\, ds' . \tag{4.25} \]

This is an important parameter for describing the emission properties of plasmas in astrophysics. A medium is optically thick, or opaque, when $\tau_\nu \gg 1$, meaning that photons cannot travel a long distance without being absorbed. A medium with $\tau_\nu \ll 1$ is optically thin, or transparent, such that photons can propagate more or less freely. For example, the early Universe had a high temperature and most matter was ionized and interacted strongly with electromagnetic radiation. Therefore, it was opaque to light (i.e. optically thick). It was not until the epoch of last scattering, when matter decoupled from photons, that light could travel freely, resulting in the cosmic microwave background we see today. Also, the surface, i.e. the photosphere, of a star (e.g. the Sun) is defined as the point where $\tau = 2/3$, such that photons can escape freely into space.

We can rewrite Eq. (4.22) as

\[ \frac{1}{\alpha_\nu} \frac{dI_\nu}{ds} = -I_\nu + \frac{j_\nu}{\alpha_\nu} . \tag{4.26} \]

From the definition of $\tau_\nu$, we have $d\tau_\nu = \alpha_\nu\, ds$. Define the source function $S_\nu \equiv j_\nu/\alpha_\nu$, then

\[ \frac{dI_\nu}{d\tau_\nu} = -I_\nu + S_\nu . \tag{4.27} \]

It is easy to see that if $I_\nu > S_\nu$, then $dI_\nu/d\tau_\nu < 0$. Physically, it means that $I_\nu$ tends to decrease along the ray until it is the same as $S_\nu$. Conversely, $I_\nu$ increases if $I_\nu < S_\nu$. In other words, $I_\nu$ always tries to approach $S_\nu$.

The general solution to Eq. (4.27) is

\[ I_\nu = I_\nu(0)\, e^{-\tau_\nu} + \int_0^{\tau_\nu} e^{-(\tau_\nu - \tau_\nu')} S_\nu(\tau_\nu')\, d\tau_\nu' . \tag{4.28} \]

For the simple case of a constant source function $S_\nu$, this reduces to

\[ I_\nu = I_\nu(0)\, e^{-\tau_\nu} + S_\nu (1 - e^{-\tau_\nu}) . \tag{4.29} \]

Again, $\tau_\nu \to \infty$ implies $I_\nu \to S_\nu$, i.e., given a large optical depth (e.g., after traveling a sufficient distance in an optically thick medium), the observed intensity approaches the source function.

4.4 Basics of Statistical Mechanics (Optional)

(Chapter 8.1 in textbook. Chapters 2, 7, 9, 10, 11 in F. Mandl: Statistical Physics, John Wiley, 1988, 2nd ed.)

4.4.1 Thermodynamics

We very briefly review the basics of thermodynamics.

The zeroth law of thermodynamics states that if two systems A and B are in thermal equilibrium with each other, and B and C are in thermal equilibrium, then A and C are in thermal equilibrium. We say that they all have the same temperature. There are many kinds of temperature scales, for example, one based on the length of a rod at different temperatures.

We can verify by experiment that the same amount of work, no matter of which kind, produces the same temperature rise. The same rise can also be produced without doing work, and we call that form of energy transfer heat. The first law states that the change in energy of a system is equal to the net heat input plus the net work done on the system. Note that the energy of a system is a function of state, while the heat supplied to it and the work done on it are not. They depend on its history.

For an isolated system in equilibrium, usually the energy $E$, volume $V$ and number of molecules $N$ are fixed. We call this the macrostate: $(E, V, N)$; or $(E, V, N, \alpha)$, where $\alpha$ stands for other macroscopic variables, for example, the dependence of density on location. On the other hand, a microstate specifies the positions, velocities and internal states of every particle. It is almost impossible to describe fully. We need to count the number of microstates corresponding to the same macrostate. The states are discrete for a quantum system and continuous for a classical system. We will adopt the quantum system viewpoint.
We denote the number of states with energy between $E$ and $E + \delta E$ by $\Omega(E, V, N, \alpha)$.

We give an example. Consider spins in a magnetic field. For a single paramagnetic atom, the energy is $E = -\boldsymbol{\mu} \cdot \mathbf{B}$, where $\boldsymbol{\mu}$ is the magnetic moment and $\mathbf{B}$ is the magnetic field. Let us assume that the spin can only take the two values $\pm\hbar/2$. Thus, the energy can only be $\pm\mu B$. For $N$ such atoms, if $n$ of them align with the field and $(N - n)$ anti-align with the field, the total energy and the number of states are

\begin{align}
E &= n(-\mu B) + (N - n)(\mu B) = (N - 2n)\mu B , \tag{4.30} \\
\Omega &= \frac{N!}{n!\,(N - n)!} . \tag{4.31}
\end{align}

We assume that each microstate compatible with the constraints has equal a priori probability. Then, (one form of) the second law states that the value of $\alpha$ will evolve in such a way that $\Omega(E, V, N, \alpha)$ is always non-decreasing, and equilibrium corresponds to the value of $\alpha$ for which $\Omega(E, V, N, \alpha)$ attains its maximum. We define the entropy by

\[ S(E, V, N, \alpha) = k \ln \Omega(E, V, N, \alpha) . \tag{4.32} \]

The first law is about energy conservation. The second law is about direction. All experiments show that isolated systems tend to equilibrium, not the opposite. Real processes are non-reversible. Hence, entropy is always non-decreasing for isolated systems. We also see that the more disordered the system is, the larger its entropy. Equivalently, larger entropy means less information.

4.4.2 Isolated Systems

If an isolated system is partitioned into two subsystems and they are nearly independent, then

\begin{align}
E &= E_1 + E_2 \tag{4.33} \\
V &= V_1 + V_2 \tag{4.34} \\
N &= N_1 + N_2 \tag{4.35} \\
\Omega(E, V, N, E_1, V_1, N_1) &= \Omega_1(E_1, V_1, N_1)\, \Omega_2(E_2, V_2, N_2) . \tag{4.36}
\end{align}

Hence,

\[ S(E, V, N, E_1, V_1, N_1) = S_1(E_1, V_1, N_1) + S_2(E_2, V_2, N_2) , \tag{4.37} \]

i.e. entropy is an extensive quantity (proportional to the size of the system), in contrast to an intensive quantity such as temperature.

To define the absolute temperature scale, first consider a diathermal wall (one that is impermeable to everything except heat).
At equilibrium, the entropy is maximum,

\[ 0 = \left( \frac{\partial S}{\partial E_1} \right)_{E, V, N, V_1, N_1} = \left( \frac{\partial S_1}{\partial E_1} \right)_{V_1, N_1} + \left( \frac{\partial S_2}{\partial E_2} \right)_{V_2, N_2} \frac{dE_2}{dE_1} . \tag{4.38} \]

Since $dE_2/dE_1 = -1$, we have

\[ \left( \frac{\partial S_1}{\partial E_1} \right)_{V_1, N_1} = \left( \frac{\partial S_2}{\partial E_2} \right)_{V_2, N_2} . \tag{4.39} \]

We see that $(\partial S_i/\partial E_i)_{V_i, N_i}$ is a measure of temperature, and define the absolute temperature $T$ by

\[ \left( \frac{\partial S_i}{\partial E_i} \right)_{V_i, N_i} = \frac{1}{T} . \tag{4.40} \]

Defined in this way, the perfect gas temperature scale is equal to the absolute temperature scale. By the second law,

\[ 0 < \frac{dS}{dt} = \left( \frac{1}{T_1} - \frac{1}{T_2} \right) \frac{dE_1}{dt} . \tag{4.41} \]

Heat flows from high temperature to low temperature.

4.4.3 Systems in a Heat Bath

To consider systems at constant temperature, we study an isolated system in which our system of interest is subsystem 1 and subsystem 2 is a heat bath, which means it can absorb or provide as much energy as subsystem 1 needs without changing its temperature. The probability that subsystem 1 is in a definite state $r$ is proportional to the number of states of the heat bath compatible with it:

\[ p_r = \text{const.}\; \Omega_2(E_0 - E_r) = \text{const.}\; \exp\left( S_2(E_0 - E_r)/k \right) , \tag{4.42} \]

where $E_0$ is the total energy of our system and the heat bath. By the definition of a heat bath, $E_r \ll E_0$, so

\[ \frac{S_2(E_0 - E_r)}{k} = \frac{S_2(E_0)}{k} - \frac{E_r}{k} \frac{\partial S_2(E_0)}{\partial E_0} + \frac{1}{2} \frac{E_r^2}{k} \frac{\partial^2 S_2(E_0)}{\partial E_0^2} + \cdots \tag{4.43} \]

By Eq. (4.40), the second term is $-E_r/kT$. The third term involves the change in temperature of the heat bath, which is negligible by definition. Eq. (4.42) then becomes

\[ p_r = \frac{1}{Z}\, e^{-\beta E_r} , \tag{4.44} \]

where $Z = \sum_r \exp(-\beta E_r)$ and $\beta = 1/kT$. This is the Boltzmann distribution, which gives the probability of a microstate of a system at some fixed temperature. $Z$ is called the partition function. If there are degeneracies $g(E_r)$, the probability of finding the system at a particular energy is

\[ p(E_r) = \frac{1}{Z}\, g(E_r)\, e^{-\beta E_r} . \tag{4.45} \]

The mean energy of the system is

\[ \bar{E} = \sum_r p_r E_r = -\frac{\partial \ln Z}{\partial \beta} . \tag{4.46} \]

Scientists usually employ a conceptual construction here. The energy, for example, of any one particular system will have some particular time-dependent value.
There will be fluctuations. We could instead consider a large number of identical systems at some fixed temperature, called a canonical ensemble. The energy averaged over the systems of the ensemble is given by Eq. (4.46), without any fluctuation.

Example: From the Boltzmann distribution above, the ratio of the probabilities that a system will be in state $b$ and in state $a$ is given by

\[ \frac{P(E_b)}{P(E_a)} = \frac{g_b}{g_a}\, e^{-(E_b - E_a)/kT} . \tag{4.47} \]

At what temperature will a gas of neutral hydrogen have equal numbers of atoms in the ground and first excited states? For hydrogen atoms, the ground state is $n = 1$ and the first excited state is $n = 2$. The degeneracy is $g_n = 2n^2$. Therefore,

\begin{align*}
1 &= \frac{2(2^2)}{2(1^2)}\, e^{-[(-13.6\ \mathrm{eV}/2^2) - (-13.6\ \mathrm{eV}/1^2)]/kT} \\
\frac{10.2\ \mathrm{eV}}{kT} &= \ln 4 \\
T &= 8.54 \times 10^4\ \mathrm{K} .
\end{align*}

This is much higher than the surface temperature of the Sun. However, we know from observations that some hydrogen atoms are ionized in the Sun. How can that be possible? (The answer lies in the Saha equation, which will be discussed later in this chapter.)

4.4.4 The Perfect Classical Gas

A gas consists of molecules moving about fairly freely in space. The perfect gas is an idealization in which the potential energy of interaction between the molecules is negligible compared to their kinetic energy of motion.

If the energy states of one single molecule are $\varepsilon_r$, the partition function of a single molecule is

\[ Z(T, V, 1) = \sum_r \exp(-\beta \varepsilon_r) . \tag{4.48} \]

The partition function of many identical molecules is not $\left( \sum_r \exp(-\beta \varepsilon_r) \right)^N$ (otherwise this leads to the Gibbs paradox). Since the molecules are identical, we cannot distinguish the cases: the first molecule is in state $r$ and the second is in state $s$; or the first molecule is in state $s$ and the second is in state $r$. We can only say that one molecule is in state $r$ and one is in state $s$. The partition function for two molecules is then

\[ Z(T, V, 2) = \sum_r \exp(-2\beta \varepsilon_r) + \frac{1}{2!} \sum_{\substack{r, s \\ r \neq s}} \exp(-\beta(\varepsilon_r + \varepsilon_s)) . \tag{4.49} \]

The partition function for $N$ molecules is then

\[ Z(T, V, N) = \sum_r \exp(-N\beta \varepsilon_r) + \ldots + \frac{1}{N!} \sum_{\substack{r_1, \ldots, r_N \\ \text{all } r_i \text{ different}}} \exp(-\beta(\varepsilon_{r_1} + \ldots + \varepsilon_{r_N})) . \tag{4.50} \]

We define the classical regime as the one in which the probability that any single-particle state is occupied by more than one molecule is very small. If we define the occupation number of a state as the number of particles in that state, then the classical regime is the one in which the occupation number of any state is much less than one.

For a classical perfect gas, only the last term in Eq. (4.50) is important. Consider the function

\[ \frac{1}{N!} \sum_{r_1, \ldots, r_N} \exp(-\beta(\varepsilon_{r_1} + \ldots + \varepsilon_{r_N})) . \tag{4.51} \]

The difference between this and the last term of Eq. (4.50) is not significant for a classical perfect gas. So, our final result for this section is

\begin{align}
Z(T, V, N) &= \frac{1}{N!} \sum_{r_1, \ldots, r_N} \exp(-\beta(\varepsilon_{r_1} + \ldots + \varepsilon_{r_N})) \notag \\
&= \frac{1}{N!} \left( \sum_r \exp(-\beta \varepsilon_r) \right)^N \tag{4.52}
\end{align}

for a classical (occupation number small) perfect (interaction negligible) gas (particles moving freely).

4.4.5 The Partition Function

Calculating the partition function of the gas reduces to calculating the partition function of a single molecule. The energy of a molecule can be written as

\[ \varepsilon_r = \varepsilon_s^{\rm tr} + \varepsilon_\alpha^{\rm int} , \tag{4.53} \]

where $\varepsilon_s^{\rm tr}$ is the energy of the translational motion and $\varepsilon_\alpha^{\rm int}$ is the energy of the internal excitations. Hence,

\[ Z(T, V, 1) = \left( \sum_s \exp(-\beta \varepsilon_s^{\rm tr}) \right) \left( \sum_\alpha \exp(-\beta \varepsilon_\alpha^{\rm int}) \right) \equiv Z_1^{\rm tr} Z_{\rm int} . \tag{4.54} \]

The internal energy $\varepsilon_\alpha^{\rm int}$ depends on the internal details of the molecules, for example, their type, excited states, etc. It does not depend on the volume. We now evaluate $Z_1^{\rm tr}$, which can be applied to any perfect gas.

The translational energy is

\[ \varepsilon^{\rm tr} = \frac{p^2}{2m} . \tag{4.55} \]

We would like to find the number of states $f(p)\,dp$ with momentum of magnitude between $p$ and $p + dp$. $f(p)$ is called the density of states. Consider a cube with sides of length $L$. From quantum mechanics, the allowed wavefunctions are

\[ \psi = \text{const.}\; \sin\left( \frac{n_x \pi}{L} x \right) \sin\left( \frac{n_y \pi}{L} y \right) \sin\left( \frac{n_z \pi}{L} z \right) \tag{4.56} \]

with $n_x, n_y, n_z = 1, 2, \ldots$. The magnitude of the wave vector $k$ is defined by

\[ k^2 = \frac{\pi^2}{L^2} (n_x^2 + n_y^2 + n_z^2) . \tag{4.57} \]

Hence, the volume per allowed point in $k$-space is $(\pi/L)^3$. Since the $n$'s can only be positive, we have to count only the positive octant, and the volume of the region lying between the radii $k$ and $k + dk$ in the positive octant is $\frac{1}{8} 4\pi k^2\, dk$. The number of states in this region is

\[ \frac{1}{8}\, 4\pi k^2\, dk \Big/ (\pi/L)^3 = \frac{V k^2\, dk}{2\pi^2} . \tag{4.58} \]

The relation between the wave vector and the momentum is $k = 2\pi p/h$, so the final result for the density of states is

\[ f(p)\, dp = \frac{V\, 4\pi p^2\, dp}{h^3} . \tag{4.59} \]

Note that we have only considered the translational motion. If, for example, the particle has non-zero spin, we have to multiply the above formula by the number of internal degrees of freedom.

We come back to calculate the partition function $Z_1^{\rm tr}$:

\begin{align}
Z_1^{\rm tr} &= \sum_s \exp(-\beta \varepsilon_s^{\rm tr}) \notag \\
&= \int_0^\infty \exp(-\beta p^2/2m)\, f(p)\, dp \notag \\
&= \frac{4\pi V}{h^3} \int_0^\infty \exp(-\beta p^2/2m)\, p^2\, dp \notag \\
&= \frac{4\pi V}{h^3} \left( \frac{2m}{\beta} \right)^{3/2} \int_0^\infty \exp(-x^2)\, x^2\, dx \notag \\
&= \frac{4\pi V}{h^3} \left( \frac{2m}{\beta} \right)^{3/2} \frac{\sqrt{\pi}}{4} \notag \\
&= V \left( \frac{2\pi m k T}{h^2} \right)^{3/2} . \tag{4.60}
\end{align}

The full partition function is

\[ Z(T, V, N) = \frac{1}{N!}\, V^N \left( \frac{2\pi m k T}{h^2} \right)^{3N/2} Z_{\rm int}^N . \tag{4.61} \]

Figure 4.8: The outward pointing pressure balances the inward pointing gravitational force.

If a particle is in state $r$ and the energy $E_r$ of this state depends on the volume, then $dE_r/dV \equiv -P_r$ is by definition the negative of the pressure (contributed by this particle), because the derivative is the work done per unit change in volume. The average pressure is just

\begin{align}
P &= \sum_r p_r P_r \notag \\
&= \sum_r p_r \left( -\frac{dE_r}{dV} \right) \notag \\
&= \sum_r \frac{1}{Z}\, e^{-\beta E_r} \left( -\frac{dE_r}{dV} \right) \notag \\
&= \frac{1}{Z\beta} \sum_r \left( \frac{\partial e^{-\beta E_r}}{\partial V} \right)_\beta \notag \\
&= \frac{1}{Z\beta} \left( \frac{\partial Z}{\partial V} \right)_\beta \notag \\
&= \frac{1}{\beta} \left( \frac{\partial \ln Z}{\partial V} \right)_\beta . \tag{4.62}
\end{align}

Substituting Eq. (4.61) into the above equation and noticing that $Z_{\rm int}$ is independent of the volume, we have

\[ P = \frac{1}{\beta} \frac{N}{V} = \frac{NkT}{V} , \tag{4.63} \]

or $PV = NkT$, which holds irrespective of the internal molecular structure.
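The step from Eq. (4.62) to Eq. (4.63) can be checked numerically. The sketch below is only an illustration under stated assumptions: it sets $Z_{\rm int} = 1$ (a gas with no internal structure), uses SI values for $k$ and $h$, and replaces the derivative in Eq. (4.62) by a central finite difference. The function names `ln_z_translational` and `pressure` are made up for this example.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant in J/K
H = 6.62607015e-34    # Planck constant in J s


def ln_z_translational(T, V, N, m):
    """ln Z from Eq. (4.61), assuming Z_int = 1 (no internal structure)."""
    ln_z1 = math.log(V) + 1.5 * math.log(2.0 * math.pi * m * K_B * T / H**2)
    # lgamma(N + 1) = ln(N!), which keeps the expression finite for large N
    return N * ln_z1 - math.lgamma(N + 1.0)


def pressure(T, V, N, m, rel_step=1e-4):
    """Eq. (4.62): P = kT d(ln Z)/dV at fixed T, by central difference."""
    dV = V * rel_step
    dlnz = ln_z_translational(T, V + dV, N, m) - ln_z_translational(T, V - dV, N, m)
    return K_B * T * dlnz / (2.0 * dV)
```

For any choice of $T$, $V$, $N$ and particle mass $m$, `pressure(T, V, N, m)` should agree with $NkT/V$ to within the accuracy of the finite difference. The terms in $\ln Z$ that do not depend on $V$, including $\ln N!$ and the factor involving $m$, drop out of the derivative, which is why the result is independent of the kind of molecule.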
To illustrate one simple application of the ideal gas law in astronomy, we can roughly estimate the temperature at the core of a star. We hypothetically divide the star into two halves, the regions I and II in Fig. 4.8. The mass of each half is $M/2$ if the total mass of the star is $M$. They are separated by a distance $R/2$ if the radius of the star is $R$. Thus, the gravitational attraction of the two halves is $G(M/2)^2/(R/2)^2$. The area between them is $4\pi(R/2)^2$. The average pressure is of the order of

\[ \langle P \rangle = \frac{G(M/2)^2/(R/2)^2}{4\pi (R/2)^2} = \frac{4}{3} \frac{GM\rho}{R} , \tag{4.64} \]

where $\rho$ is the average density of the star. For an ideal gas with $N$ particles, the ideal gas law gives us the temperature

\begin{align}
T &= \frac{PV}{Nk} \notag \\
&= \frac{P m_p}{(N m_p/V)\, k} \notag \\
&= \frac{P m_p}{\rho k} \notag \\
&= \frac{4 G M m_p}{3 k R} , \tag{4.65}
\end{align}

where $k$ is the Boltzmann constant and $m_p$ is the mass of the proton. The third equality holds because a main sequence star consists mostly of protons, so that $\rho \approx N m_p/V$. The fourth equality is given by Eq. (4.64). If we substitute the data for our Sun, the result is $T = 3 \times 10^7$ K. A more detailed calculation of the core temperature gives $1.5 \times 10^7$ K.

4.4.6 The Perfect Quantal Gas and Quantum Statistics

One basic assumption for a classical gas of identical particles is that the mean occupation number of the single-particle states is much less than one. For a quantal gas, the mean occupation number can be near or even greater than one. The main quantum effect is the quantum statistics: how particles occupy single-particle states. Eq. (4.52) is no longer correct.

Let $n_r$ be the occupation number of the single-particle state with energy $\varepsilon_r$. What are the possible values of $n_r$? There are two mutually exclusive classes.

Bose-Einstein statistics (BE): there is no restriction on $n_r$, i.e. $n_r = 0, 1, 2, \ldots$. Such particles are called bosons, and have integral spin $0, \hbar, 2\hbar, \ldots$.

Fermi-Dirac statistics (FD): $n_r$ can only be 0 or 1. Such particles are called fermions, and have half-integral spin $\frac{1}{2}\hbar, \frac{3}{2}\hbar, \ldots$.
Another way to describe fermions is that they satisfy the Pauli exclusion principle: no two fermions can be in the same single-particle state.

The particles under consideration can be fundamental or composite, as long as they are identical. A composite particle consisting of an odd number of fermions is a fermion. A composite particle consisting of an even number of fermions is a boson.

4.4.7 The Partition Function

We will give the general form of the partition function in this subsection. Suppose the energies of the single-particle states are $\varepsilon_1, \varepsilon_2, \varepsilon_3, \ldots$, and the corresponding occupation numbers are $n_1, n_2, n_3, \ldots$. For a gas of $N$ particles,

\[ \sum_r n_r = N . \tag{4.66} \]

Also

\[ n_r = \begin{cases} 0, 1 & \text{fermions} \\ 0, 1, 2, 3, \ldots & \text{bosons} . \end{cases} \tag{4.67} \]

Any set of $\{n_r\}$ that satisfies these two conditions defines a state of the gas. The partition function is then

\[ Z(T, V, N) = \sum_{n_1, n_2, \ldots} \exp\left( -\beta \sum_r n_r \varepsilon_r \right) , \tag{4.68} \]

where the first sum is over all sets of $\{n_r\}$. The mean occupation number is

\begin{align}
\bar{n}_i &= \frac{\displaystyle \sum_{n_1, n_2, \ldots} n_i \exp\left( -\beta \sum_r n_r \varepsilon_r \right)}{\displaystyle \sum_{n_1, n_2, \ldots} \exp\left( -\beta \sum_r n_r \varepsilon_r \right)} \notag \\
&= -\frac{1}{\beta} \left( \frac{\partial \ln Z}{\partial \varepsilon_i} \right)_{T,\, \varepsilon_r (r \neq i)} . \tag{4.69}
\end{align}

4.4.8 Derivation of Blackbody Radiation

In this section, we discuss the thermal gas of photons, using the method developed in the last section. An ideal black body is an object that absorbs all the radiation that falls on it. A black body is also a perfect radiator and its radiation is a thermal gas of photons. Most hot objects, including stars and a piece of hot metal, behave roughly like black bodies.

Photons have spin 1, so they are bosons and obey BE statistics. They also do not interact with each other (Maxwell's equations are linear). Hence, a photon gas is a perfect gas.

Since photons can be emitted or absorbed, the photon number is not a constant. Eq. (4.66) does not apply. The occupation number of each state $r$ can take any value, $n_r = 0, 1, 2, \ldots$.
The partition function is

$$ Z(T, V) = \sum_{n_1, n_2, \ldots} \exp\Big(-\beta \sum_r n_r \varepsilon_r\Big) = \Big(\sum_{n_1} e^{-\beta n_1 \varepsilon_1}\Big)\Big(\sum_{n_2} e^{-\beta n_2 \varepsilon_2}\Big)\cdots = \frac{1}{1 - \exp(-\beta\varepsilon_1)}\,\frac{1}{1 - \exp(-\beta\varepsilon_2)}\cdots = \prod_{r=1}^{\infty} \frac{1}{1 - \exp(-\beta\varepsilon_r)} \tag{4.70} $$

because

$$ \sum_n e^{-\beta n \varepsilon} = 1 + e^{-\beta\varepsilon} + (e^{-\beta\varepsilon})^2 + \cdots = \frac{1}{1 - \exp(-\beta\varepsilon)} . \tag{4.71} $$

Thus, we have $\ln Z(T, V) = -\sum_{r=1}^{\infty} \ln(1 - e^{-\beta\varepsilon_r})$, and the mean occupation number of state $r$ is

$$ \bar{n}_r = -\frac{1}{\beta}\frac{\partial \ln Z}{\partial \varepsilon_r} = \frac{e^{-\beta\varepsilon_r}}{1 - e^{-\beta\varepsilon_r}} = \frac{1}{e^{\beta\varepsilon_r} - 1} . \tag{4.72} $$

We now derive Planck's law of black-body radiation. The energy and momentum of a photon of frequency $\nu$ are $\varepsilon = h\nu$ and $p = h\nu/c$. The density of states is given by Eq. (4.59). However, there is one more complication for photons. There are two polarizations for each translational degree of freedom, for example two perpendicular directions of linear polarization. The number of internal degrees of freedom is therefore two. The label $r$ specifies the translational motion (and hence the frequency) and the polarization. In terms of the frequency $\nu$, we have

$$ f(\nu)\,d\nu = 2\,\frac{V\,4\pi (h\nu/c)^2\,(h\,d\nu/c)}{h^3} = \frac{8\pi V \nu^2\,d\nu}{c^3} . \tag{4.73} $$

Combining the mean occupation number, Eq. (4.72), with the density of states above, the number of photons in the frequency range between $\nu$ and $\nu + d\nu$ is

$$ dN_\nu = \frac{8\pi V}{c^3}\,\frac{\nu^2\,d\nu}{\exp(\beta h\nu) - 1} . \tag{4.74} $$

The total energy of photons in volume $V$ in this frequency range is

$$ dE_\nu = h\nu\,dN_\nu = \frac{8\pi V h}{c^3}\,\frac{\nu^3\,d\nu}{\exp(\beta h\nu) - 1} . \tag{4.75} $$

The energy density per unit frequency is defined as

$$ u_\nu = E_\nu/V = \frac{8\pi h \nu^3}{c^3\,(e^{h\nu/kT} - 1)} . \tag{4.76} $$

The energy density per unit wavelength $u_\lambda$ is defined by $u_\lambda |d\lambda| = u_\nu |d\nu|$, where $\lambda = c/\nu$ is the wavelength. We have $|d\nu/d\lambda| = c/\lambda^2$, and

$$ u_\lambda = \frac{c}{\lambda^2}\,\frac{8\pi h (c/\lambda)^3}{c^3\,(e^{hc/\lambda kT} - 1)} = \frac{8\pi h c}{\lambda^5}\,\frac{1}{\exp\!\big(\frac{hc}{\lambda kT}\big) - 1} . \tag{4.77} $$

4.5 Physics of Blackbody Radiation

Astrophysical emission can be classified as thermal or non-thermal in origin. Thermal radiation is emitted by the thermal motion of charged particles; examples are blackbody radiation and thermal Bremsstrahlung.
Non-thermal radiation is generated by other processes, and the emitting particles do not follow a thermal distribution. Examples are synchrotron radiation and inverse Compton scattering. In this course, we will only discuss blackbody radiation. It is one of the most common and important radiation mechanisms. You will find it in the Sun and other stars, and even in the cosmic microwave background of the Universe. When we say the Universe has a temperature of 3 K, how do we measure it?

A blackbody is an idealized object that absorbs all radiation at all frequencies. It can be approximated by a cavity with a small hole in it, such that any light entering the hole never comes out (Figure 4.9). The photons inside are then in thermal equilibrium with the surroundings. Their distribution follows Bose-Einstein statistics. This gives the blackbody radiation. Every object at a finite temperature emits blackbody radiation. As we will see below, the peak frequency depends only on temperature. This explains why objects glow with different colors at different temperatures, e.g. from red to blue as they heat up.

Figure 4.9: A blackbody.

The specific intensity of blackbody radiation is given by Planck's law, which can be derived from Eqs. (4.14) and (4.76):

$$ I_\nu^{BB} \equiv B_\nu(T) = \frac{2h\nu^3/c^2}{\exp(h\nu/k_B T) - 1} , \tag{4.78} $$

where $h$ is the Planck constant and $k_B$ is the Boltzmann constant. Examples of blackbody spectra are shown in Figure 4.10.

Figure 4.10: Blackbody radiation spectrum.

Note that thermal radiation is NOT the same as blackbody radiation. For the former, the source function $S_\nu$ is equal to the blackbody intensity $B_\nu$,

$$ S_\nu = B_\nu(T) , \tag{4.79} $$

so that the emission and absorption coefficients are related by

$$ j_\nu = \alpha_\nu B_\nu(T) . \tag{4.80} $$

This is Kirchhoff's law. For blackbody radiation, the intensity itself satisfies $I_\nu = B_\nu$. Thermal radiation therefore becomes blackbody radiation in optically thick media. We describe below some properties of blackbody radiation.
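As a rough numerical companion to Eq. (4.78), the following Python sketch evaluates $B_\nu(T)$ in CGS units. The function name `planck_nu`, the constant names and the sample frequencies and temperatures are illustrative choices, not part of the notes.

```python
import math

# CGS constants; the names are illustrative choices for this sketch.
H_PLANCK = 6.62607015e-27   # Planck constant, erg s
K_B = 1.380649e-16          # Boltzmann constant, erg / K
C_LIGHT = 2.99792458e10     # speed of light, cm / s


def planck_nu(nu, temperature):
    """Blackbody specific intensity B_nu(T) of Eq. (4.78).

    nu in Hz, temperature in K; result in erg s^-1 cm^-2 Hz^-1 sr^-1.
    """
    x = H_PLANCK * nu / (K_B * temperature)
    # math.expm1(x) computes exp(x) - 1 without loss of precision at small x.
    return 2.0 * H_PLANCK * nu**3 / C_LIGHT**2 / math.expm1(x)


if __name__ == "__main__":
    # Sample temperatures and frequencies chosen only for illustration.
    for temperature in (3000.0, 5800.0, 10000.0):
        for nu in (1e13, 1e14, 1e15):
            b = planck_nu(nu, temperature)
            print(f"T = {temperature:7.0f} K  nu = {nu:.0e} Hz  B_nu = {b:.3e}")
```

Using `math.expm1` rather than `math.exp(x) - 1` is a small design choice that keeps the result accurate when $h\nu \ll k_B T$.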
4.5.1 Stefan-Boltzmann law

The flux from blackbody radiation is

$$ F = \sigma_{SB} T^4 , \tag{4.81} $$

where $\sigma_{SB}$ is the Stefan-Boltzmann constant

$$ \sigma_{SB} = \frac{2\pi^5 k_B^4}{15 c^2 h^3} = 5.67\times10^{-5}\ \mathrm{erg\ cm^{-2}\ K^{-4}\ s^{-1}} . \tag{4.82} $$

It can be proved by integrating the Planck spectrum directly:

$$ F = \int_0^\infty F_\nu\,d\nu = \int_0^\infty\!\!\int B_\nu \cos\theta\,d\Omega\,d\nu = \int_0^\infty\!\!\int_0^{\pi/2}\!\!\int_0^{2\pi} B_\nu \cos\theta \sin\theta\,d\phi\,d\theta\,d\nu . \tag{4.83} $$

We consider only radiation going out from a surface; therefore, $\theta$ goes from 0 to $\pi/2$ only. Then

$$ F = 2\pi \int_0^\infty\!\!\int_0^{\pi/2} B_\nu \cos\theta \sin\theta\,d\theta\,d\nu = 2\pi \int_0^\infty B_\nu\,d\nu \int_0^1 \mu\,d\mu = \pi \int_0^\infty B_\nu\,d\nu , \tag{4.84} $$

where we have substituted $\mu = \cos\theta$. Finally,

$$ F = \pi \int_0^\infty \frac{2h\nu^3/c^2}{\exp(h\nu/k_B T) - 1}\,d\nu = \frac{2\pi h}{c^2}\left(\frac{k_B T}{h}\right)^4 \int_0^\infty \frac{u^3}{e^u - 1}\,du , \tag{4.85} $$

where $u = h\nu/k_B T$. The integral is nontrivial, but it can be shown that its value is $\pi^4/15$. This gives the Stefan-Boltzmann law above.

This law tells us the total energy emitted per unit area per unit time. As an example, the Sun has a surface temperature $T = 5800$ K. To estimate the solar luminosity $L_\odot$, we just need to integrate over its surface area $4\pi R_\odot^2$:

$$ L_\odot = 4\pi R_\odot^2\,\sigma_{SB} T^4 . \tag{4.86} $$

4.5.2 Rayleigh-Jeans Law

In the low-energy limit, $h\nu \ll k_B T$, we can expand

$$ \exp\left(\frac{h\nu}{k_B T}\right) - 1 = \frac{h\nu}{k_B T} + \ldots \tag{4.87} $$

Substituting into Eq. (4.78) gives the Rayleigh-Jeans limit of Planck's law,

$$ I_\nu^{RJ}(T) = \frac{2\nu^2 k_B T}{c^2} . \tag{4.88} $$

Physically, it is the classical limit, and it leads to the ultraviolet catastrophe.

4.5.3 Wien Law

In the high-energy limit, $h\nu \gg k_B T$, we can approximate

$$ \exp\left(\frac{h\nu}{k_B T}\right) - 1 \approx \exp\left(\frac{h\nu}{k_B T}\right) . \tag{4.89} $$

Substituting into Eq. (4.78) gives the Wien limit of Planck's law,

$$ I_\nu^{W}(T) = \frac{2h\nu^3}{c^2} \exp\left(-\frac{h\nu}{k_B T}\right) . \tag{4.90} $$

Unlike the Rayleigh-Jeans law, the Wien law is a quantum effect, as you can guess from the fact that it contains the constant $h$.

4.5.4 Wien's Displacement Law

At what frequency does the blackbody radiation peak? We can find the answer by setting

$$ \left.\frac{\partial B_\nu}{\partial \nu}\right|_{\nu = \nu_{max}} = 0 . \tag{4.91} $$

This results in

$$ x = 3(1 - e^{-x}) , \tag{4.92} $$

where $x \equiv h\nu_{max}/k_B T$.
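One way to handle Eq. (4.92) numerically is fixed-point iteration, sketched below in Python. The function name `solve_wien`, the starting guess and the tolerance are assumptions made for illustration; any standard root finder would serve equally well.

```python
import math

# CGS constants; names are illustrative.
H_PLANCK = 6.62607015e-27   # Planck constant, erg s
K_B = 1.380649e-16          # Boltzmann constant, erg / K


def solve_wien(coefficient=3.0, x0=3.0, tol=1e-12, max_iter=200):
    """Find the nonzero root of x = coefficient * (1 - exp(-x)).

    Uses plain fixed-point iteration x <- coefficient * (1 - exp(-x)),
    which converges here because the slope of the right-hand side is
    well below 1 near the root.
    """
    x = x0
    for _ in range(max_iter):
        x_new = coefficient * (1.0 - math.exp(-x))
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    x_max = solve_wien()  # coefficient 3 corresponds to Eq. (4.92)
    print(f"x = h nu_max / (k_B T) = {x_max:.4f}")
    print(f"nu_max / T = {x_max * K_B / H_PLANCK:.3e} Hz/K")
```

The starting guess must be away from the trivial root $x = 0$. Calling `solve_wien(coefficient=5.0)` applies the same iteration to the wavelength form $y = 5(1 - e^{-y})$.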
Equation (4.92) can only be solved numerically, giving

$$ h\nu_{max} = 2.82\,k_B T , \tag{4.93} $$

or

$$ \nu_{max} = 5.88\times10^{10}\,T\ \mathrm{Hz} , \tag{4.94} $$

where $T$ is in K. This is Wien's displacement law. Similarly, we can also derive the law in terms of wavelength:

$$ \left.\frac{\partial B_\lambda}{\partial \lambda}\right|_{\lambda = \lambda_{max}} = 0 . \tag{4.95} $$

Solving $y = 5(1 - e^{-y})$, we obtain

$$ \lambda_{max} = 0.290\,T^{-1}\ \mathrm{cm} , \tag{4.96} $$

where $T$ is in K. However, note that $\lambda_{max}\nu_{max} \neq c$. (Why?) For example, the Sun has a temperature of 5800 K, which corresponds to $\lambda_{max} = 500$ nm. This is the wavelength of green light, coincident with the peak sensitivity of human vision.

4.5.5 Monotonicity

One important property of blackbody radiation is that the curves for different temperatures in Figure 4.10 never cross, i.e. a curve of higher temperature lies entirely above one of lower temperature. This can be proved by noting that

$$ \frac{\partial B_\nu}{\partial T} = \frac{2h^2\nu^4}{c^2 k_B T^2}\,\frac{\exp(h\nu/k_B T)}{[\exp(h\nu/k_B T) - 1]^2} \tag{4.97} $$

is always positive. This has two consequences. First, although the blackbody peak shifts toward shorter wavelengths (blue) as the temperature increases, the intensity at long wavelengths (red) always increases with temperature. Second, at a given $\nu$, there is a one-to-one correspondence between $I_\nu$ and $T$. As we will show below, this can be used to define temperature.

4.5.6 Temperature Definitions

Brightness temperature

At a given frequency $\nu$, one can equate the brightness of an object $I_\nu$ to the blackbody intensity. The temperature obtained this way is called the brightness temperature $T_b$, i.e.

$$ I_\nu = B_\nu(T_b) . \tag{4.98} $$

We can then measure the brightness in units of K. This is often used in radio astronomy, where we express the surface brightness of a nebula in terms of brightness temperature, and also the system noise temperature of antennas. At radio frequencies, the Rayleigh-Jeans law is usually applicable, so that

$$ T_b = \frac{I_\nu c^2}{2 k_B \nu^2} . \tag{4.99} $$

This is essentially how infrared thermometers (pyrometers) work. They provide a useful way to measure temperature when conventional methods are not practical, e.g. for fast-moving objects, distant objects, or objects too hot to touch. The brightness temperature of giant pulses from some pulsars can exceed $5\times10^{39}$ K, the highest known brightness temperature in the Universe.

Color temperature

If the measured spectrum of an object has a shape more or less like that of a blackbody, we can perform a fit to obtain the temperature. Even simpler, one can just locate the peak of the emission and then apply Wien's displacement law. This gives the color temperature. It is the same as the color temperature you set in digital photography or on TV screens. Why does a lower color temperature give a "warmer" tone, while a higher temperature gives a "cooler" tone? Some advanced infrared thermometers measure two or more frequency bands to estimate the color temperature from the intensity ratio. This can give more accurate measurements.

Effective temperature

Recall that the Stefan-Boltzmann law relates the total flux to the temperature. This can also be used to define a temperature. The effective temperature $T_{eff}$ is the temperature of a blackbody that emits the same flux as the observed object:

$$ F_{obs} = \sigma_{SB} T_{eff}^4 . \tag{4.100} $$

The Sun is not a perfect blackbody, but we can derive its effective temperature of 5800 K from the total flux. The same applies to other stars.

4.6 Scattering Cross Section

How strongly do two particles interact with each other? How can we describe it? Experimentally, we usually send a uniform beam of particles, all with the same mass and energy, toward a target particle. Some incident particles will be absorbed, some will be scattered, and some may even be transformed into other particles. The number of incident particles affected is proportional to the number of incident particles.
If this ratio is larger, the interaction between the incident particles and the target is considered to be stronger.

Figure 4.11 (panels a and b): The total cross section depends on the orientation.

The intensity of the beam is defined as the number of incident particles crossing a unit area normal to the beam per unit time. The total cross section is defined as

$$ \sigma_T = \frac{\text{number of particles affected per unit time}}{\text{incident intensity}} . \tag{4.101} $$

Note that the total cross section has the dimension of an area, because the numerator has the dimension of a pure number per time, while the denominator has the dimension of a pure number per time per area.

To illustrate the idea, suppose there is a board of area $A$ which absorbs every particle incident on it. If the board is normal to the incident particles, Fig. 4.11a, it is immediate that the total cross section is

$$ \sigma_T = A . \tag{4.102} $$

A larger total cross section does mean a stronger interaction. If the board is parallel to the direction of motion of the incident particles (and the board is very thin), Fig. 4.11b, the total cross section is zero. Hence, the cross section depends on the orientation in general.

For simplicity, from now on we assume that the incident particles are only scattered. For a finer description of scattering, we define the differential cross section, $\sigma(\Omega)$, by

$$ \sigma(\Omega)\,d\Omega = \frac{\text{number of particles scattered into solid angle } d\Omega \text{ per unit time}}{\text{incident intensity}} . \tag{4.103} $$

Figure 4.12: The differential cross section.

In this formula, we have assumed that the place where we actually do the measurement is far away from the scattering center. If an incident particle is scattered, it must be scattered into some direction. We have

$$ \sigma_T = \int \sigma(\Omega)\,d\Omega , \tag{4.104} $$

where the integration is over the full $4\pi$ solid angle.

Figure 4.13: The geometry for the calculation of the differential cross section (impact parameter $b$, scattering angles $\theta$ and $\theta + d\theta$).

We are going to calculate the differential cross section of the gravitational interaction.
We know the relation between the impact parameter and the scattering angle from Eq. (3.84). Notice that there is a cylindrical symmetry, so we only have to calculate the $\theta$ dependence of the differential cross section, $\sigma(\Omega) = \sigma(\theta)$. Referring to Fig. 4.13, all incident particles passing through the annulus at the left, with radii between $b$ and $b + db$, will be scattered into the annulus at the right, with angles between $\theta$ and $\theta + d\theta$. If the incident intensity is $I$ particles per unit normal area per unit time, then the number of particles passing between $b$ and $b + db$ is $I\,2\pi b\,db$ per unit time. This number should be equal to $2\pi I \sigma(\theta) \sin\theta\,d\theta$ by the definition of the differential cross section, where the $2\pi$ comes from the integration over $\phi$. We have

$$ \sigma(\theta) = \frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right| . \tag{4.105} $$

We take the absolute value because $b$ and $\theta$ usually vary in opposite directions. Then, by Eq. (3.84), we have

$$ \begin{aligned} \cos\left(\frac{\pi - \theta}{2}\right) &= \left(1 + \frac{b^2 v^4}{G^2 M^2}\right)^{-1/2} \\ \frac{1}{\sin^2(\theta/2)} &= 1 + \frac{b^2 v^4}{G^2 M^2} \\ \frac{b^2 v^4}{G^2 M^2} &= \cot^2\frac{\theta}{2} \\ \frac{b v^2}{GM} &= \cot\frac{\theta}{2} \\ \frac{v^2}{GM}\frac{db}{d\theta} &= -\frac{1}{2\sin^2(\theta/2)} \\ \frac{db}{d\theta} &= -\frac{GM}{2v^2}\,\frac{1}{\sin^2(\theta/2)} . \end{aligned} \tag{4.106} $$

The differential cross section is

$$ \begin{aligned} \sigma(\theta) &= \frac{b}{\sin\theta}\,\frac{GM}{2v^2 \sin^2(\theta/2)} \\ &= \frac{G^2 M^2 \cot(\theta/2)}{2v^4 \sin\theta\,\sin^2(\theta/2)} \\ &= \frac{G^2 M^2}{4v^4 \sin^4(\theta/2)} \\ &= \frac{(GMm)^2}{16E^2 \sin^4(\theta/2)} , \end{aligned} \tag{4.107} $$

where $E = mv^2/2$ is the energy of the incident particles. We have assumed $M \gg m$, such that the reduced mass $\mu \approx m$. In principle, we could get the total cross section by integration,

$$ \sigma_T = \int \sigma(\theta) \sin\theta\,d\phi\,d\theta . \tag{4.108} $$

If we actually do the integration, we find that $\sigma_T$ diverges:

$$ \begin{aligned} \sigma_T &= \int \frac{(GMm)^2}{16E^2 \sin^4\frac{\theta}{2}} \sin\theta\,d\phi\,d\theta \\ &= \frac{2\pi (GMm)^2}{16E^2} \int_0^\pi \frac{2\sin\frac{\theta}{2}\cos\frac{\theta}{2}}{\sin^4\frac{\theta}{2}}\,d\theta \\ &= \frac{2\pi (GMm)^2}{4E^2} \int_0^\pi \frac{d(\sin\frac{\theta}{2})}{\sin^3\frac{\theta}{2}} \\ &= -\frac{2\pi (GMm)^2}{8E^2} \left[\frac{1}{\sin^2\frac{\theta}{2}}\right]_0^\pi \end{aligned} \tag{4.109} $$

$$ = \infty . \tag{4.110} $$

Physically, this is because the total cross section describes how far away the target can affect the incident particles.
But the gravitational interaction has infinite range, so all incident particles are affected, although the effect may be tiny.

4.7 Chemical Potential and Saha Equation (Optional)

We would like to study the relative abundance of the reactants and products of a reaction in equilibrium. We will derive only the simplest case in the first half of this section and state the main result, the Saha equation, in the second.

The total energy of a particle moving with momentum $p$ is given by $E^2 = m^2c^4 + p^2c^2$. If it is not moving near the speed of light, we have

$$ E = \sqrt{m^2c^4 + p^2c^2} \approx mc^2 + \frac{p^2}{2m} . \tag{4.111} $$

Note that the rest mass energy $mc^2$ includes the internal energy, for example the binding energy. Consider a simple reaction

$$ A \rightleftharpoons B , \tag{4.112} $$

where $A$ and $B$ could be two excited states of a single particle. Let the total number of particles be $N$. We have the constraint $N = N_A + N_B$ in obvious notation. The total energy of the microstate in which there are $N_A$ particles of type $A$ with momenta $p_i^A$, $i = 1, \ldots, N_A$, and $N_B$ particles of type $B$ with momenta $p_j^B$, $j = 1, \ldots, N_B$, is

$$ \sum_i \left(m_A c^2 + \frac{(p_i^A)^2}{2m_A}\right) + \sum_j \left(m_B c^2 + \frac{(p_j^B)^2}{2m_B}\right) . \tag{4.113} $$

There are $N!/(N_A!\,N_B!)$ microstates with the same energy. Hence, the probability that the system is in a state with $N_A$ particles of type $A$ with momenta $\{p_i^A\}$ and $N_B$ particles of type $B$ with momenta $\{p_j^B\}$ is, according to Eq. (4.44), proportional to

$$ \frac{N!}{N_A!\,N_B!} \exp\left\{-\beta\left[\sum_i \left(m_A c^2 + \frac{(p_i^A)^2}{2m_A}\right) + \sum_j \left(m_B c^2 + \frac{(p_j^B)^2}{2m_B}\right)\right]\right\} . \tag{4.114} $$

We do not care about the momenta of the particles, so we sum (integrate) over all microstates with different momenta. We have already done this in Eq. (4.60). The probability that the system is in a state with $N_A$ particles of type $A$ and $N_B$ particles of type $B$ (with any momenta) is proportional to

$$ \frac{N!}{N_A!\,N_B!}\, e^{-\beta N_A m_A c^2}\, V^{N_A} \left(\frac{2\pi m_A kT}{h^2}\right)^{3N_A/2} e^{-\beta N_B m_B c^2}\, V^{N_B} \left(\frac{2\pi m_B kT}{h^2}\right)^{3N_B/2} . \tag{4.115} $$

By the Stirling formula for large $s$, $\ln s! \approx s \ln s - s$, we can rewrite one factor in the above formula as

$$ \frac{1}{N_A!}\, e^{-\beta N_A m_A c^2}\, V^{N_A} = e^{-\beta N_A m_A c^2 + N_A - N_A \ln N_A + N_A \ln V} = e^{-\beta N_A \left(m_A c^2 + kT \ln(N_A/V) - kT\right)} . \tag{4.116} $$

For a classical ideal gas of particles with mass $m$ at temperature $T$, the chemical potential is defined by

$$ \mu = mc^2 - kT \ln\left(\frac{g_s n_Q}{n}\right) , \tag{4.117} $$

where $n$ is the number density of particles in the gas, $n_Q$ is called the quantum concentration,

$$ n_Q = \left(\frac{2\pi m kT}{h^2}\right)^{3/2} , \tag{4.118} $$

and $g_s$ is the number of internal degrees of freedom of the particle. It depends on the spin of the particle and, for example, on the excited state of the hydrogen atom. For an electron, proton or neutron, $g_s = 2$. We have implicitly assumed that for our particles $A$ and $B$, $g_A = g_B = 1$. In terms of these quantities, with $n_A = N_A/V$ and $n_B = N_B/V$, Eq. (4.115) is

$$ \begin{aligned} & N!\; e^{-\beta N_A \left(m_A c^2 + kT \ln(N_A/V) - kT\right) + N_A \ln n_{QA}}\; e^{-\beta N_B \left(m_B c^2 + kT \ln(N_B/V) - kT\right) + N_B \ln n_{QB}} \\ &= N!\, e^N\, e^{-\beta N_A \left(m_A c^2 - kT \ln(n_{QA}/n_A)\right)}\, e^{-\beta N_B \left(m_B c^2 - kT \ln(n_{QB}/n_B)\right)} \\ &= N!\, e^N\, e^{-\beta N_A \mu_A}\, e^{-\beta N_B \mu_B} \\ &= N!\, e^N\, e^{-\beta N \mu_B} \exp[-\beta N_A (\mu_A - \mu_B)] . \end{aligned} \tag{4.119} $$

Since $\mu_B$ depends only very weakly on $N_A$ (through $N_B = N - N_A$), the probability is greatest when $\mu_A = \mu_B$. This is the main result: in equilibrium, the chemical potentials of the reactant and the product are equal. Let $\Delta E = m_A c^2 - m_B c^2$. The equality of chemical potentials implies

$$ \begin{aligned} m_A c^2 - kT \ln\left(\frac{n_{QA}}{n_A}\right) &= m_B c^2 - kT \ln\left(\frac{n_{QB}}{n_B}\right) \\ \Delta E &= -kT \ln\left(\frac{n_A\, n_{QB}}{n_B\, n_{QA}}\right) . \end{aligned} \tag{4.120} $$

If $\Delta E$ is much smaller than $m_A c^2$, then $n_{QA} \approx n_{QB}$ and

$$ n_A = n_B\, e^{-\Delta E/kT} . \tag{4.121} $$

The chemical potential can be interpreted as the energy needed to put one more particle into the system. It depends on the physical parameters of the system. For example, to keep the temperature unchanged, we have to speed up the particle before injecting it into the system. Since it costs arbitrarily little energy to create a photon of very long wavelength, we claim that the chemical potential of the photon is zero.
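To get a feeling for the magnitudes in Eqs. (4.117) and (4.118), the Python sketch below evaluates the quantum concentration and the offset $\mu - mc^2$ for an electron gas. The function names and the sample density and temperature are hypothetical values chosen only for illustration.

```python
import math

# CGS constants; names are illustrative.
K_B = 1.380649e-16          # Boltzmann constant, erg / K
H_PLANCK = 6.62607015e-27   # Planck constant, erg s
M_E = 9.1093837e-28         # electron mass, g


def quantum_concentration(mass, temperature):
    """n_Q of Eq. (4.118), in cm^-3 for mass in g and temperature in K."""
    return (2.0 * math.pi * mass * K_B * temperature / H_PLANCK**2) ** 1.5


def chemical_potential_offset(n, mass, temperature, g_s=1.0):
    """mu - m c^2 = -kT ln(g_s n_Q / n), from Eq. (4.117), in erg."""
    n_q = quantum_concentration(mass, temperature)
    return -K_B * temperature * math.log(g_s * n_q / n)


if __name__ == "__main__":
    temperature = 1.0e4   # K, assumed sample value
    n_e = 1.0e13          # cm^-3, assumed sample value
    n_q = quantum_concentration(M_E, temperature)
    offset = chemical_potential_offset(n_e, M_E, temperature, g_s=2.0)
    print(f"n_Q = {n_q:.3e} cm^-3, n / n_Q = {n_e / n_q:.1e}")
    print(f"mu - m c^2 = {offset:.3e} erg = {offset / (K_B * temperature):.1f} kT")
```

For a dilute gas with $n \ll n_Q$ the logarithm is large and positive, so $\mu - mc^2$ comes out negative, amounting to many $kT$ for these sample values.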
For reactions with more reactants,

$$ A + B \rightleftharpoons C + D , \tag{4.122} $$

at thermodynamic equilibrium the energy needed to create particles $A$ and $B$ must equal the energy needed to create particles $C$ and $D$. Hence, we claim that

$$ \mu(A) + \mu(B) = \mu(C) + \mu(D) . \tag{4.123} $$

Substituting the expressions for the chemical potential from Eq. (4.117) into this relation yields the Saha equation. Let us consider a very important example,

$$ \gamma + H_n \rightleftharpoons e^- + p , \tag{4.124} $$

the ionization of a hydrogen atom in the $n$-th excited state by absorption of a photon $\gamma$. To satisfy the assumption that these are ideal gases, their densities must be much less than the quantum concentration, $n \ll n_Q$. As mentioned, $\mu(\gamma) = 0$, and

$$ \mu(e) = m_e c^2 - kT \ln\left(\frac{g_e n_{Qe}}{n_e}\right) , \tag{4.125} $$

$$ \mu(p) = m_p c^2 - kT \ln\left(\frac{g_p n_{Qp}}{n_p}\right) , \tag{4.126} $$

$$ \mu(H_n) = m(H_n) c^2 - kT \ln\left(\frac{g(H_n)\, n_{QH}}{n(H_n)}\right) . \tag{4.127} $$

The energy of the $n$-th excited state of the hydrogen atom is $E_n$, Eq. (2.7); hence we have

$$ m(H_n) c^2 = m_e c^2 + m_p c^2 + E_n . \tag{4.128} $$

Substituting these into the chemical potential equation, we have

$$ -E_n = kT \ln\left(\frac{n(H_n)}{g(H_n)\, n_{QH}}\,\frac{g_e n_{Qe}}{n_e}\,\frac{g_p n_{Qp}}{n_p}\right) . \tag{4.129} $$

Since the mass of the hydrogen atom is approximately equal to the mass of a proton, $n_{QH} = n_{Qp}$. Also, let $\varepsilon_n = |E_n|$. The Saha equation becomes

$$ \exp(-\varepsilon_n/kT) = \frac{g(H_n)}{g_e g_p}\,\frac{n_e n_p}{n_{Qe}\, n(H_n)} . \tag{4.130} $$

For free electrons, $g_e = 2$, so

$$ \frac{n_p}{n(H_n)} = \frac{2 g_p}{n_e\, g(H_n)} \left(\frac{2\pi m_e kT}{h^2}\right)^{3/2} \exp(-\varepsilon_n/kT) . \tag{4.131} $$

In most cases there is no net charge, so $n_e = n_p$, and all the $g$ factors are of order unity. We finally have

$$ \frac{n_e^2}{n(H_n)} \sim \left(\frac{2\pi m_e kT}{h^2}\right)^{3/2} \exp(-\varepsilon_n/kT) . \tag{4.132} $$

We see that there is a significant change in the degree of ionization when the temperature is around $\varepsilon_1/k$, which is about 160,000 K.

Example: In the Sun's photosphere, $n_e = 1.88\times10^{13}$ cm$^{-3}$ and $T = 5777$ K. What is the number ratio between Ca II (singly ionized calcium) and Ca I (neutral calcium)? Take $g_{II} = 2.30$, $g_I = 1.32$, and the ionization energy $\chi_I = 6.11$ eV for Ca I.
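Using the Saha equation,

$$ \begin{aligned} \frac{N_{II}}{N_I} &= \frac{2 g_{II}}{g_I n_e}\,\frac{(2\pi m_e kT)^{3/2}}{h^3}\, e^{-\chi_I/kT} \\ &= \frac{2 \times 2.30}{1.32 \times 1.88\times10^{13}}\,\frac{(2\pi m_e kT)^{3/2}}{h^3}\, e^{-6.11\ \mathrm{eV}/kT} \\ &\approx 927 . \end{aligned} $$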
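The arithmetic in this example can be checked with a short Python sketch of the same formula in CGS units. The function name `saha_ratio` and the constant names are illustrative, and the constants are standard values not taken from the notes. The result should come out close to the value quoted above, with small differences possible from rounding of the constants.

```python
import math

# CGS constants; names are illustrative.
K_B = 1.380649e-16          # Boltzmann constant, erg / K
H_PLANCK = 6.62607015e-27   # Planck constant, erg s
M_E = 9.1093837e-28         # electron mass, g
EV = 1.602176634e-12        # one electron volt in erg


def saha_ratio(n_e, temperature, g_upper, g_lower, chi_ev):
    """Number ratio of the higher to the lower ionization stage.

    Implements N_II / N_I = (2 g_II / (g_I n_e))
                            * (2 pi m_e k T / h^2)^(3/2) * exp(-chi / kT),
    with n_e in cm^-3, temperature in K and chi_ev in eV.
    """
    kt = K_B * temperature
    n_q = (2.0 * math.pi * M_E * kt / H_PLANCK**2) ** 1.5
    return 2.0 * g_upper / (g_lower * n_e) * n_q * math.exp(-chi_ev * EV / kt)


if __name__ == "__main__":
    # Values from the Ca II / Ca I example in the text.
    ratio = saha_ratio(n_e=1.88e13, temperature=5777.0,
                       g_upper=2.30, g_lower=1.32, chi_ev=6.11)
    print(f"N_II / N_I = {ratio:.0f}")
```

Because of the exponential factor, the ratio is very sensitive to temperature: rerunning with a slightly different `temperature` changes the result noticeably.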