PHY104 - Introduction to Astrophysics
S. P. Littlefair
June 4, 2013
Chapter 1
Properties of Light
1.1
Introduction
The only information we have about our Universe comes from the light
emitted by objects within it. A good understanding of light is essential in
all of astrophysics. We must learn what light is and how it behaves. We
must understand its properties, and learn how to use those properties to
discover the information we seek. Finally, we must understand how light
interacts with the matter around it. Much of the rest of the astrophysics
course at Sheffield deals with what we know. By covering the basic physics
of light, this course aims to explain how we acquire that knowledge. Our
starting point is a question which seems quite simple; “what is light”?
1.2
The wave nature of light
It is easy to demonstrate that light behaves like a wave. The famous Young’s
slit experiment is an elegant demonstration, and is shown in the left-hand
side of figure 1.1. In this experiment, a thin plate with two parallel slits
is illuminated by a single light source, and the light passing through the
slits strikes a screen behind them. When we look on the screen, we see
a diffraction pattern, made up of a series of bright and dark fringes. The
diffraction pattern is easily understood if we think of light as a wave propagating through some medium, like water waves on a lake. In our experiment,
each slit acts as a source of light waves; a wavefront of light spreads out from
each slit. The two wavefronts of light hit the screen, and the brightness of
light at that point depends on how the wavefronts interfere. If the light
waves are in phase (if the peaks of both waves line up), then we get a bright
Figure 1.1: Left: A schematic of the Young’s double slit experiment. A
light source behind S1 illuminates the two slits at S2. These slits act as
secondary sources of light, and light waves spread out from the slits like
water waves on a lake. Right: the principle of superposition. Light in
phase adds to give brighter light, but light which is out of phase cancels out
to produce dark regions.
region. If the two waves are out of phase (the peak of one wave corresponds
to a trough in the other), the light waves cancel out, and we see a dark
region (see RHS of figure 1.1).
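The principle of superposition can be checked numerically. The sketch below is only an illustration, assuming two waves of equal amplitude that differ by a fixed phase; the function name `combined_intensity` is hypothetical and not part of the course material.

```python
import math

def combined_intensity(phase_difference, amplitude=1.0):
    """Intensity of two superposed equal-amplitude waves,
    relative to the intensity of one wave on its own."""
    # Add the two waves as phasors: A + A * exp(i * phase_difference).
    real = amplitude + amplitude * math.cos(phase_difference)
    imag = amplitude * math.sin(phase_difference)
    # Intensity is proportional to the squared amplitude of the sum.
    return (real**2 + imag**2) / amplitude**2

in_phase = combined_intensity(0.0)          # peaks line up: bright fringe
out_of_phase = combined_intensity(math.pi)  # peak meets trough: dark fringe
print(in_phase, out_of_phase)
```

In phase, the combined wave carries four times the intensity of a single wave; half a cycle out of phase, the intensity drops to zero (up to rounding error).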
We will return to the wave nature of light in a moment, but first let us
look at the startling property of light that emerges from quantum mechanics.
1.3
The particle nature of light
Whilst it is easy to demonstrate that light behaves like a wave, it is also
possible (though nowhere near as easy!) to demonstrate that light behaves
like a particle. When a metal plate is illuminated with blue or ultraviolet
light, electrons absorb the energy from the light, and can escape from the
metal. This phenomenon is known as the photoelectric effect. If light is
a wave, we might think that the energy of the escaping electrons would
increase as the intensity of light increased, but the frequency of that light
wouldn’t matter. In fact, the energy of the released electrons increases in
proportion to the frequency of the light, and below a certain frequency, no
electrons are emitted from the metal at all (see figure 1.2). It is extremely
difficult to explain this result by thinking about light as a wave.
The photoelectric effect’s dependence upon frequency was predicted by
Einstein, in 1905, based upon a model of light as particles of light, called
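[Figure 1.2, right-hand panel: electron energy plotted against frequency ν, following the line E = hν − hν0, which reaches zero at the frequency ν0.]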
Figure 1.2: The photoelectric effect. Blue light is shone onto a metal plate.
Electrons in the metal absorb the energy from the light and escape from
the metal (left hand side). When the energy of the emerging electrons is
measured, it turns out to be proportional to the frequency of the light (right
hand side). Furthermore, below a certain frequency, no electrons are released
from the metal plate. This effect led Einstein to propose the particle nature
of light in 1905.
photons. He suggested that each photon has an energy which is proportional
to its frequency, E = hν. An electron requires a certain amount of energy
to free it from the metal. If we call this amount of energy W (known as
the work function) then the energy of the freed electron should be given by
E_e = hν − W. Thus, the energy of the emerging electron increases linearly with
the frequency of the incident light, just as we see in the photoelectric effect.
What happens if the frequency of the light is reduced, so that the energy
carried by the photons is less than W ? In this case, no single photon has
enough energy to liberate an electron, and so no electrons can escape the
metal. These were the predictions made by Einstein for the behaviour of
the photoelectric effect. Einstein’s predictions of the frequency dependence
of the photoelectric effect, based upon the photon model, were confirmed in
painstaking experiments by Millikan in 1913-1914. Although Millikan didn’t
believe in the particle nature of light at the time, his experiments earned
him the Nobel prize, and gave tremendous support to the picture of light as
discrete particles of light.
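Einstein's prediction is easy to try out with numbers. The following is a minimal sketch, assuming rounded values for the physical constants and an illustrative work function of 2.3 eV (a value chosen for the example, not taken from these notes); the function names are hypothetical.

```python
H = 6.626e-34   # Planck constant in J s (rounded)
EV = 1.602e-19  # joules per electronvolt (rounded)

def electron_energy_ev(frequency_hz, work_function_ev):
    """Energy of the freed electron, h*nu - W, in eV.
    Returns None when a single photon cannot free an electron."""
    photon_energy_ev = H * frequency_hz / EV
    if photon_energy_ev <= work_function_ev:
        return None
    return photon_energy_ev - work_function_ev

def threshold_frequency_hz(work_function_ev):
    """Frequency below which no electrons escape (h * nu0 = W)."""
    return work_function_ev * EV / H

W = 2.3  # illustrative work function in eV (assumption)
print(threshold_frequency_hz(W))       # roughly 5.6e14 Hz
print(electron_energy_ev(6.5e14, W))   # blue light: about 0.39 eV
print(electron_energy_ev(4.5e14, W))   # red light: None, no electrons
```

Making the red light brighter in this model adds more photons, but each one still carries too little energy, so the result stays `None`.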
The photoelectric effect tells us that each photon has an energy given
by E = hν. Photons also have a momentum, given by p = E/c = hν/c,
although to understand why requires us to study Einstein’s theory of special
relativity, which is beyond this course.
1.4
The wave-particle nature of light
So we have some experiments which show that light behaves as a wave,
and others which show quite clearly that light behaves as a particle. In fact,
there are even experiments that show that light can behave as both a particle
and a wave at the same time! We are going to go back to the Young’s slit
experiment described earlier, but perform an experiment in which we reduce
the intensity of the light so much that only one photon illuminates the plate
with the slits at any one time. Now, let us put a special camera in place of
the screen, that can detect each photon as it arrives at the screen. What we
find is astonishing. The light is clearly behaving like photons, because we
see each one hit the screen individually. The location of each photon as it
hits the screen is seemingly random; one photon arrives at a given location
and a moment later another photon arrives somewhere else. Over time,
however, as we watch the photons arrive one-by-one, we find more photons
are hitting the screen where the bright regions of the original diffraction
pattern were! How can this be? The diffraction pattern was made by the
interference of waves of light. The photons are going through the slits one-by-one and so cannot be interfering with each other. What is going on is
that the photons are interfering with themselves; behaving as a particle and
a wave at the same time. Thus, whilst light sometimes behaves as a wave,
and sometimes as a particle, in reality it is neither. The true nature of light
is much stranger, and is given by quantum mechanics, which you will study
later in the Physics course. In the remainder of this module, we will choose
to describe light as either a particle or a wave, depending on which suits us
most!
1.5
The electro-magnetic spectrum
If light is a wave, what is it a wave in? The answer is that light is an electromagnetic wave. Modern theory describes light as electric and magnetic fields
which oscillate in phase, perpendicular to the direction of propagation, but
in planes oriented at 90 degrees to each other. Confused? Have a look at
figure 1.3, or take a look at the JAVA applet at
http://www.phys.hawaii.edu/~teb/java/ntnujava/emWave/emWave.html.
Figure 1.3: An electromagnetic wave.
Like any wave, its speed of propagation is given by the wave equation,
c = νλ, where c is the speed of light, ν is the frequency, and λ is the
wavelength of the light. Light of different frequencies is referred to by different
names; visible light is only a tiny portion of the electro-magnetic spectrum,
which is shown in figure 1.4. In astronomy one of the most useful techniques
available to us is to exploit the full electro-magnetic spectrum. Because light
is described both by the wave equation (c = νλ) and by E = hν, we can describe a position on the electro-magnetic spectrum by wavelength, frequency,
or energy. Visible light (frequencies around 10^15 Hz) tells us much about
the thermal emission from stars and galaxies, but infrared and microwave
radiation (wavelengths from 1 micron to 1 cm) can show us the location of
very cool stars and dust, whilst high-energy radiation (γ-rays, X-rays and
UV emission) tells us about the most energetic processes in the Universe.
Only a small portion of the electromagnetic spectrum is observable from
the surface of the Earth, however. The optical, infrared and radio portions
of the spectrum are visible through atmospheric windows, but the Earth’s
atmosphere is opaque to other wavelengths of light. The transparency of the
Earth’s atmosphere is shown in figure 1.5. Regions of the electromagnetic
spectrum not visible from the Earth must be observed from satellites in
space; there now exist satellites which cover most of the electromagnetic
spectrum.
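Since c = νλ and E = hν tie wavelength, frequency and energy together, converting between them takes only a few lines. This is a rough sketch with rounded constants; the function names and the sample wavelengths are illustrative choices, not values from the notes.

```python
C = 2.998e8     # speed of light in m/s (rounded)
H = 6.626e-34   # Planck constant in J s (rounded)
EV = 1.602e-19  # joules per electronvolt (rounded)

def frequency_hz(wavelength_m):
    """Frequency from the wave equation c = nu * lambda."""
    return C / wavelength_m

def photon_energy_ev(wavelength_m):
    """Photon energy E = h * nu = h * c / lambda, in eV."""
    return H * C / wavelength_m / EV

# Illustrative wavelengths for a few regions of the spectrum.
samples = [("X-ray", 1e-10), ("visible", 550e-9),
           ("infrared", 10e-6), ("microwave", 1e-2)]
for name, wavelength in samples:
    print(name, frequency_hz(wavelength), photon_energy_ev(wavelength))
```

Green light at 550 nm comes out at about 5.5 × 10^14 Hz and a little over 2 eV per photon, consistent with the order-of-magnitude frequency quoted for visible light.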
Figure 1.4: The electromagnetic spectrum, showing the position of visible
light.
Figure 1.5: Transparency of the Earth’s atmosphere.
1.6
Measuring Light - Brightness
The most basic thing we can measure about the light from an astrophysical
object is its quantity: how much light is emitted from a star? To do this
usefully we need to define some quantities. The first is the monochromatic
flux. This is the energy falling on a unit area, per unit time, at a given
frequency. Actually, we need to be careful how we state that last part, “at
a given frequency”. In the same way that no light falls on a single point,
because a point has no area, no light is emitted at a single frequency. Instead,
we should properly ask how much light is emitted in an infinitesimally small
range of frequency, dν. Suppose we have a perfect telescope, which detects
all the light that falls on it. The collecting area of this telescope is ∆A.
Suppose we tune that telescope so it is only sensitive to light over a frequency
range ∆ν, and collect light for a time interval of ∆t. We detect an amount
of energy given by ∆E. The amount of energy detected per unit time, area
and frequency is then given by ∆E/(∆t ∆A ∆ν). To find the monochromatic flux,
we must let all these intervals become infinitesimally small. In other words,
the monochromatic flux is given by
$$F_\nu = \lim_{\substack{\Delta A \to 0 \\ \Delta\nu \to 0 \\ \Delta t \to 0}} \frac{\Delta E}{\Delta t \, \Delta A \, \Delta\nu}. \tag{1.1}$$
As well as the monochromatic flux, which measures the amount of light as
a function of wavelength, we may want to know the amount of light across
all wavelengths. This quantity is known as the bolometric flux and is given
by
$$F^{\rm bol} = \int_0^\infty F_\nu \, d\nu. \tag{1.2}$$
1.6.1
Monochromatic flux in more detail
Above we talk about the monochromatic flux, and we define it as the amount
of energy detected per unit time, area and per unit frequency. It is given
the symbol Fν . We can also define a monochromatic flux as the amount of
energy detected per unit time, area and per unit wavelength. We give this
a symbol Fλ . How are the two related? Over a small frequency range dν,
what is the total energy received per unit area and time? It is an amount
equal to Fν dν. Clearly, this amount of energy doesn’t change, whether we
measure the monochromatic flux in wavelength or frequency units. Using
this fact, we can write
$$F_\lambda \, d\lambda = F_\nu \, d\nu, \tag{1.3}$$
where dλ is the small wavelength range that corresponds to the frequency
range dν. We can find dλ from
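$$d\lambda = \left|\frac{d\lambda}{d\nu}\right| d\nu. \tag{1.4}$$
We take the modulus of dλ/dν because we only care about the size of the
wavelength range dλ. Let’s substitute equation (1.4) into equation (1.3) to
get
$$F_\lambda \left|\frac{d\lambda}{d\nu}\right| d\nu = F_\nu \, d\nu, \quad \text{or} \quad F_\lambda \left|\frac{d\lambda}{d\nu}\right| = F_\nu.$$
This is our answer; an equation linking Fλ and Fν. You might be wondering
how we can calculate |dλ/dν|, but in fact it is rather simple, because λ and ν
are related via the wave equation, c = νλ. Re-arranging, and differentiating
this equation we find
$$\lambda = \frac{c}{\nu}, \qquad \frac{d\lambda}{d\nu} = \frac{-c}{\nu^2}, \qquad \left|\frac{d\lambda}{d\nu}\right| = \frac{c}{\nu^2} = \frac{\lambda^2}{c},$$
which gives
$$F_\nu = F_\lambda \frac{\lambda^2}{c}. \tag{1.5}$$
1.6.2
Luminosity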
Whilst the flux is generally what we can measure from the Earth (or just
above it), what we would often like to be able to measure is the total energy
emitted by the source (in all directions). This is known as the luminosity and
is a fundamental property of the source. The monochromatic luminosity is
defined as the energy emitted by the source in unit time, per unit frequency,
i.e.
$$L_\nu = \lim_{\substack{\Delta\nu \to 0 \\ \Delta t \to 0}} \frac{\Delta E}{\Delta t \, \Delta\nu}. \tag{1.6}$$
Just as flux has a monochromatic and bolometric definition, we can also
define the bolometric luminosity, which is the total energy emitted by the
source per unit time, at all wavelengths,
$$L^{\rm bol} = \int_0^\infty L_\nu \, d\nu. \tag{1.7}$$
1.6.3
Inverse Square Law
Figure 1.6: Geometry for proving the inverse square law.
How are the luminosity and flux related? Consider a sphere at a distance
d from the source (shown in figure 1.6). Assuming the source is isotropic, the
energy from the source is evenly spread over the surface of a sphere of radius d;
the energy is spread over an area 4πd². The total amount of energy emitted
per unit time is the bolometric luminosity, L^bol. Therefore, the total energy
arriving at the sphere, per unit time and per unit area (the bolometric flux)
is given by the inverse square law
$$F^{\rm bol} = \frac{L^{\rm bol}}{4\pi d^2}. \tag{1.8}$$
A similar equation can be written relating the monochromatic flux and luminosity, Fν and Lν.
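As a quick check of equation (1.8), the sketch below estimates the bolometric flux of the Sun at the Earth. The solar luminosity and the astronomical unit are rounded values supplied for the example, not quantities given in these notes, and the function name is hypothetical.

```python
import math

L_SUN = 3.828e26  # bolometric luminosity of the Sun in W (assumed value)
AU = 1.496e11     # Earth-Sun distance in m (assumed value)

def bolometric_flux(luminosity_w, distance_m):
    """Inverse square law: luminosity spread over a sphere of radius d."""
    return luminosity_w / (4.0 * math.pi * distance_m**2)

flux_at_earth = bolometric_flux(L_SUN, AU)
flux_at_2au = bolometric_flux(L_SUN, 2 * AU)
print(flux_at_earth)                # a little under 1400 W per square metre
print(flux_at_earth / flux_at_2au)  # doubling the distance quarters the flux
```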
Chapter 2
Magnitudes
2.1
An astronomy quirk - magnitudes
Generally speaking the flux is a perfectly useful measure of how much light
we receive on Earth. There are some practical difficulties in measuring the
flux of course; our detectors are not 100% efficient, and we have to correct
for the amount of light absorbed by the atmosphere. Nevertheless, once
these corrections have been made, there should be no problem reporting the
amount of light received as fluxes, right? However, astronomy is a science
with a tendency towards unconventional units of measure, and no exception
is made here.
In astronomy, we often use the apparent magnitude instead of the flux.
The scale upon which magnitude is now measured has its origin in the
ancient Greek practice of dividing those stars visible to the naked eye into six
magnitudes. The brightest stars were said to be of first magnitude (m = 1),
while the faintest were of sixth magnitude (m = 6), the limit of human visual
perception (without the aid of a telescope). Each grade of magnitude was
considered to be twice the brightness of the following grade (a logarithmic
scale). Nowadays, magnitude is still a logarithmic scale, but has been put on
a formal footing. The apparent magnitude is related to the monochromatic
flux by
$$m_\nu = -2.5 \log_{10} F_\nu + c. \tag{2.1}$$
However, the ancient Greek convention has been retained; brighter stars
have smaller magnitudes than fainter ones - courtesy of that minus sign in
equation (2.1). So a star with magnitude -3 is much brighter than a star of
magnitude 14. What is the value of the constant, c in equation (2.1)? In
fact, we are free to choose any value we like, as long as we keep the same
value between measurements. Although magnitudes can be a bit confusing
on first acquaintance, they have one big advantage; the numbers involved
tend to be easy to deal with. A magnitude of 0 is a lot quicker to grasp and
remember than a flux of 3.44 × 10^−8 W m^−2 µm^−1!
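Because the constant c cancels when two magnitudes are subtracted, equation (2.1) turns a magnitude difference into a flux ratio and back again. The sketch below illustrates this; the function names are hypothetical.

```python
import math

def magnitude_difference(flux_1, flux_2):
    """m1 - m2 for two fluxes measured in the same band (c cancels)."""
    return -2.5 * math.log10(flux_1 / flux_2)

def flux_ratio(m_1, m_2):
    """F1 / F2 for two magnitudes, the inverse of the relation above."""
    return 10 ** (-0.4 * (m_1 - m_2))

# A source 100 times brighter is 5 magnitudes smaller.
print(magnitude_difference(100.0, 1.0))
# The magnitude -3 and magnitude 14 stars mentioned in the text.
print(flux_ratio(-3.0, 14.0))
```

For the two stars mentioned above, a difference of 17 magnitudes corresponds to a flux ratio of roughly six million.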
2.2
Measuring magnitude: Photometry
[Figure 2.1, left panel: transmission (0 to 1) plotted against wavelength (500 to 700 nm) for an idealised filter with central wavelength λc and width ∆λ.]
Figure 2.1: Left: an idealised astronomical filter and right: the Johnson-Cousins filter set
The technique of measuring accurate fluxes and magnitudes of astronomical sources is called photometry. Photometric systems are defined by
the sets of filters which are used to isolate individual wavelength ranges in
order to measure a monochromatic flux. An example filter set is shown
in figure 2.1. This is the Johnson-Cousins filter set, which is widely used
throughout astronomy.
In principle, astronomical photometry is simple. Consider the idealised
filter shown in figure 2.1. This filter has a central (average) wavelength of
λc and a width of ∆λ. This filter is placed on a camera on a telescope
with collecting area ∆A. The camera is exposed to light for a time ∆t.
Not every photon which falls on the telescope will be recorded. Suppose our
camera/telescope combination detects a fraction, η of the photons which fall
on the telescope (we say it has an efficiency of η). Therefore, if we detect N
photons, the total number of photons which fell on our telescope was N/η.
What is the average energy of each of these photons? They have an average
energy of E = hν = hc/λ. So the total energy which arrived is ∆E = Nhc/(ηλ).
The flux in our filter follows from
$$F_\lambda = \frac{\Delta E}{\Delta A \, \Delta t \, \Delta\lambda}.$$
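The recipe above translates directly into a short calculation. This is a sketch only: the photon count, efficiency, telescope area, exposure time and filter width below are made-up numbers for illustration, and `filter_flux` is a hypothetical name.

```python
H = 6.626e-34  # Planck constant in J s (rounded)
C = 2.998e8    # speed of light in m/s (rounded)

def filter_flux(n_detected, efficiency, wavelength_m,
                area_m2, exposure_s, bandwidth_m):
    """Monochromatic flux F_lambda in W per m^2 per metre of wavelength."""
    n_arrived = n_detected / efficiency           # photons that reached the telescope
    energy_j = n_arrived * H * C / wavelength_m   # each photon carries h*c/lambda
    return energy_j / (area_m2 * exposure_s * bandwidth_m)

# Illustrative observation (all values are assumptions).
flux = filter_flux(n_detected=1e6, efficiency=0.5, wavelength_m=550e-9,
                   area_m2=1.0, exposure_s=10.0, bandwidth_m=90e-9)
print(flux)  # about 8e-7 W m^-2 m^-1
```

Note that a less efficient camera implies more photons actually arrived for the same count, so halving η doubles the inferred flux.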
Fluxes in astronomical filters are normally referred to using the name of
the filter as a subscript. As an example, the flux as measured in the Johnson
V-band would be called FV . The V-band magnitude can be calculated in
the usual way
$$m_V = -2.5 \log_{10} F_V + c_V.$$
Notice that I have called the constant cV , to indicate that this constant is
specific to the V-band. In the Johnson photometric system, the constant c
is chosen separately for each filter. The value of c is chosen so that the
bright star, Vega, has a magnitude of 0 in every band.
Magnitudes can be measured for a star in each photometric band and
colour indices determined by subtracting the magnitudes as measured in
different filters, e.g.
$$\begin{aligned} m_B - m_V = B - V &= -2.5 \log_{10} F_B + c_B + 2.5 \log_{10} F_V - c_V \\ B - V &= +2.5 \log_{10} \frac{F_V}{F_B} + \text{const.} \end{aligned} \tag{2.2}$$
Colour indices provide crude information about the spectra of astronomical
sources (they are often just called the “colour”), and can be used to estimate
the temperatures of stars. A couple of things are worth noting - because
magnitudes are a logarithmic scale, subtracting two magnitudes is equivalent
to dividing two fluxes. Also, since the magnitude scale is fixed so that Vega
has a magnitude of 0 in each band, Vega also has colour indices of zero,
which fixes the constant in the equation above.
2.3
Absolute Magnitudes
We noted in the last section that the luminosity is a fundamental property
of the source. The flux, by contrast, is a property both of the source, and
its distance from us. The inverse square law tells us that a source can be
faint (low in flux) because it is either intrinsically dim (low in luminosity),
or very far away. Since the apparent magnitude is related to the flux, it too
depends on the distance to the source. We would like a measure of intrinsic
brightness for the magnitude scale; the magnitude equivalent of luminosity.
This measure, which we will call the absolute magnitude, is the magnitude
an object would have, if it were 10 parsecs away (parsecs are a standard unit of
distance in astrophysics; we’ll see why in the next section). How are the absolute and
apparent magnitude related? Fν is the flux of our object, and we’ll call the
flux it would have at 10 parsecs $F_\nu^{10}$. Let us measure the distance to our
object, d, in parsecs. The inverse square law tells us that
$$\frac{F_\nu}{F_\nu^{10}} = \frac{L/4\pi d^2}{L/4\pi 10^2} = \left(\frac{10}{d}\right)^2. \tag{2.3}$$
Using equation (2.1) we can then write
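$$\begin{aligned} M &= -2.5 \log_{10} F_\nu^{10} + c, \text{ and} \\ m - M &= -2.5 \log_{10} F_\nu + 2.5 \log_{10} F_\nu^{10}, \text{ or} \\ m - M &= -2.5 \log_{10} \frac{F_\nu}{F_\nu^{10}}, \end{aligned}$$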
where M is the absolute magnitude. Combining this last equation with
equation (2.3), we find
$$m - M = 5 \log_{10} \frac{d}{10}, \tag{2.4}$$
remembering all the while that d is measured in parsecs! Equation (2.4) tells
us that the apparent and absolute magnitudes are related by the distance
to the source; in fact, the quantity m − M is called the distance modulus.
We can derive a general form of equation 2.4 too, which tells us how the
apparent magnitude of an object changes with distance. We simply replace
the 10 parsec distance we chose earlier by a general distance d2, and ask
what the magnitude of our object, m2, is at that distance:
$$\begin{aligned} m - m_2 &= 5 \log_{10} \frac{d}{d_2}, \text{ or} \\ m_1 - m_2 &= 5 \log_{10} \frac{d_1}{d_2}, \end{aligned}$$
where I have replaced m and d with m1 and d1 .
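Equation (2.4) can be run in either direction: from a distance to a distance modulus, or from a pair of magnitudes to a distance. A minimal sketch, with hypothetical function names and distances in parsecs throughout:

```python
import math

def distance_modulus(distance_pc):
    """m - M for an object at the given distance in parsecs."""
    return 5.0 * math.log10(distance_pc / 10.0)

def distance_pc(apparent_mag, absolute_mag):
    """Distance in parsecs, inverting m - M = 5 log10(d / 10)."""
    return 10.0 * 10 ** ((apparent_mag - absolute_mag) / 5.0)

print(distance_modulus(10.0))   # zero: at 10 pc, m and M agree by definition
print(distance_modulus(100.0))  # ten times further away adds 5 magnitudes
print(distance_pc(10.0, 0.0))   # a distance modulus of 10 means 1000 pc
```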
If we can measure the apparent magnitude for an object, and somehow
know its absolute magnitude (perhaps all sources of a given type have the
same absolute magnitude), then we can calculate the distance modulus, and
from that, the distance. Generally speaking, however, we do not know the
absolute magnitude of an object, and can only measure apparent magnitudes. How then, do we calculate the distance to astronomical objects? We
will cover distance measurement in the next chapter.
Chapter 3
Distance Measurement
3.1
Parallax
[Figure 3.1 labels: parallax angle p, distance d, Earth-Sun distance 1 AU, and the Earth's positions in Summer and Winter.]
Figure 3.1: Astronomical Parallax
Parallax is the name given to the phenomenon where an object appears
to move, relative to the background, when viewed along two different lines
of sight. Parallax is easily experienced by holding your thumb up at arm’s
length and then closing one eye, and then the other. Your thumb appears to
move relative to the background. The effects of parallax mean that each eye
sees a subtly different scene; having two eyes with overlapping fields of view
is what allows your brain to measure the distances to everyday objects and
gives you depth perception. Therefore, you are already an expert in applying
parallax to measure distance, and what follows will surely be simple revision.
In astrophysics, parallax is the only method we have for obtaining direct measures of distance for objects outside our solar system. The way it
works is illustrated in figure 3.1. As the Earth moves round the sun, we observe different lines of sight towards a nearby star. Because nearby objects
experience larger parallax than further objects (it should be obvious why,
from figure 3.1), the nearby star appears to move, relative to the distant,
background stars. From the right hand side of figure 3.1, we can see that
tan p = (Earth-Sun distance)/d.
Since p is such a small angle, we can use the small angle approximation,
tan p ≈ p, where p is measured in radians, to get
p (radians) = (Earth-Sun distance)/d.
We can make life much easier for ourselves by some careful choice of units.
For a start, parallax angles are so small that a radian is an awkward unit.
We would be better off using arcseconds. (Just like an hour is divided into
minutes and seconds, so a degree is divided into arcminutes and arcseconds:
1° = 60 arcminutes (60′), 1′ = 60 arcseconds (60″), so 1° = 60 × 60 = 3600″.)
Also, the Earth-Sun distance and
typical stellar distances are too large to be easily measured in metres. What
we will do is define a new unit, the parsec, so that an object at a distance
of 1 parsec (1 pc) will have a parallax of 1 arcsecond. If we do this, the
parallax equation becomes
$$p\,(\text{arcseconds}) = \frac{1}{d\,(\text{pc})}. \tag{3.1}$$
How large is a parsec? Basic trigonometry reveals that a parsec is 206,265
AU, 3.086 × 10^16 m, or 3.26 light years. It is a very convenient unit for
distances in astronomy.
3.2
Typical parallaxes
How large is the parallax for the nearest star beyond the Sun? Proxima Centauri has a
parallax of p = 0″.75, which corresponds to a distance of d = 1.33 pc. This
is a very small amount of parallax; roughly the same angle as that subtended
by a 2 pence piece at a distance of 5 km! It’s comparable to the resolution
of ground-based telescopes (which you’ll remember are limited in resolution
by the seeing). In other words, even for the nearest stars, we’re looking for
a shift in position which is comparable to the size of the stellar image itself.
Parallax is a very small effect, which requires extremely careful data taking
and analysis to use.
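The size of the parsec follows from the small angle approximation, and the parallax equation is then a one-line function. The sketch below assumes a rounded value for the astronomical unit; the names are illustrative.

```python
import math

AU_M = 1.496e11                             # Earth-Sun distance in m (assumed value)
ARCSEC_RAD = math.pi / (180.0 * 3600.0)     # one arcsecond in radians

# Small angle approximation: p (radians) = 1 AU / d, so at p = 1 arcsecond
# the distance is 1 / ARCSEC_RAD astronomical units.
PARSEC_AU = 1.0 / ARCSEC_RAD
PARSEC_M = PARSEC_AU * AU_M

def distance_pc(parallax_arcsec):
    """Distance in parsecs from a parallax in arcseconds (equation 3.1)."""
    return 1.0 / parallax_arcsec

print(PARSEC_AU)          # about 206,265 AU
print(PARSEC_M)           # about 3.09e16 m
print(distance_pc(0.75))  # the 0.75 arcsecond parallax quoted for Proxima Centauri
print(distance_pc(0.001)) # a 0.001 arcsecond parallax corresponds to 1000 pc
```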
As an historical aside, the difficulty of observing parallax was one of
the main reasons the Heliocentric model of the Solar System took so long
to be accepted. In general, people assumed that the stars were not much
further away than the planets of our own Solar System. Since they didn’t
show large parallaxes, the logical conclusion was that the Earth itself did
not move. The first successful measurement of stellar parallax did not come
until 1838, when Friedrich Bessel measured the parallax of 61 Cygni to be
∼0″.3. Modern measurements of parallax have achieved an astonishing
level of accuracy. By moving to space, we can overcome the blurring effects
of the atmosphere and measure very small shifts in stellar positions. The
current state of the art is the Hipparcos satellite, which achieved accuracies
of ±0.001 arcseconds for the brightest stars.
It should be clear that parallax is only useful for measuring the distance
to nearby stars. Even at the 0.001 arcsecond accuracy provided by Hipparcos, we can only detect the parallax for stars closer than 1000 pc, or 1 kpc.
In order to measure the distance of the furthest stars, and to have any hope
of measuring distances to Galaxies, we need to apply new techniques.
3.3
“Standard Candles” and the Distance Ladder
A standard candle is the name given to an object of accurately known luminosity. Before going on to describe some types of standard candle, we
should ask why they are so useful. The reason is that we can easily measure
the flux of these objects from Earth and, if the luminosity is known, we can
apply the inverse square law to find the distance
$$d = \left(\frac{L}{4\pi F}\right)^{1/2}.$$
In order to establish that an object is a standard candle, we need some
way of knowing its luminosity. That means that, for at least some objects,
we already know the distance. Thus, ultimately, all distances from standard
candles are based upon distances obtained from parallax. For example, we might
discover a standard candle in our own Galaxy and measure its distance
using parallax. Because the luminosity of our standard candle is then known,
we could use standard candles of the same type to measure the distances to nearby
galaxies. In these galaxies, we may find another type of standard candle.
Perhaps it is much brighter, allowing us to measure the distance to even
more distant galaxies, and so on. In this manner, we build up a “distance ladder”
which enables us to measure distances to all objects, right out to the edge of
the Universe. Ultimately though, they are all based on distances measured
in our own Galaxy, through parallax.
3.3.1
A standard candle example: Cepheid Variables
Let’s look at standard candles in more detail by considering a specific, and
important, example. Cepheid variables are pulsating stars, whose brightness
varies periodically. They are very bright, and can be seen at great distances.
Figure 3.2: Calibration of the Cepheid period-luminosity relationship from
Laney & Stobie (1994; bibcode 1994MNRAS.266..441L). Here, the logarithm of the period is shown against the
absolute magnitude. The distance to any Cepheid can be found by measuring
its period and using this relationship to infer its absolute magnitude. This is
compared to the measured apparent magnitude to yield a distance modulus,
and hence a distance.
In 1893, Henrietta Leavitt began work measuring the brightness of stars
in the Magellanic Clouds. (At that time, women did much work in astrophysics,
but they were usually carrying out the laborious lab work and number crunching
for male professors. These “computers”, as they were called, rarely received
direct credit for their work.) In 1908 she published her results, noting a large
number of periodic variable stars which showed a pattern; the brighter stars
seemed to have longer periods. In 1912 she published another study showing
that these variables - Cepheid Variables - showed a close and predictable
relationship between their luminosity and the period. In 1913, Hertzsprung
used parallax to measure the distance to several Cepheids in our Galaxy.
In this way, the relationship between luminosity and period was calibrated.
As a result, if the period of a Cepheid variable can be measured, then its
luminosity is known, and the distance to it can be calculated. Today, the
relationship is very well established by studying Cepheid
variables with known parallaxes from Hipparcos (see figure 3.2).
At the time of Leavitt’s discovery, it still wasn’t clear that what we now
call galaxies were actually outside of our own Milky Way. Soon, however,
Cepheids started to be discovered in other galaxies. In 1923 Edwin Hubble
used Cepheids in the Andromeda Galaxy to show that it was located far
outside our Milky Way. In this way Cepheid variables revolutionised our
understanding of the Universe.
3.3.2
Type Ia Supernovae
Cepheid Variables are bright; whilst the best parallaxes can yield distances
to 1 kpc, Cepheids can be observed up to 10-20 Mpc (1 Mpc = 1 million
pc). There are other types of variable stars which can be used as standard
candles in a similar way to Cepheids (notably the RR Lyrae stars and W
Virginis stars), but these are fainter than Cepheids. If we wish to measure
the distance to the furthest galaxies, we will need to find another rung on
the distance ladder. This rung comes from Type Ia supernovae.
Type Ia supernovae are the explosions of white dwarf stars in binary
systems. White dwarfs have a maximum mass, known as the Chandrasekhar
limit. This is somewhere around 1.4 M⊙. If a white dwarf in a binary system
accretes mass from its companion, it can be pushed over the Chandrasekhar
limit. The white dwarf must then collapse, but the collapse causes the
centre of the white dwarf to become extremely dense and hot, which triggers
explosive nuclear burning in the core. The resulting supernova explosion
is extremely bright; it is about 14 magnitudes brighter than the brightest
Cepheid variable.
Using the general form of equation 2.4, we can see that type Ia supernovae can be seen to much larger distances than Cepheids. Suppose we have
an object on the very limit of visibility, which we then make 14 magnitudes
brighter. Now, we move it away until it is once again at the very limit of
visibility. How far has it moved?
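$$m_1 - m_2 = 5 \log_{10} \frac{d_1}{d_2} \tag{3.2}$$
$$\Delta m = 5 \log_{10} \frac{d_1}{d_2} \tag{3.3}$$
$$14 = 5 \log_{10} \frac{d_1}{d_2} \tag{3.4}$$
$$10^{14/5} = \frac{d_1}{d_2} \tag{3.5}$$
$$\frac{d_1}{d_2} \approx 630 \tag{3.6}$$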
Type Ia supernovae can be seen 630 times further away than a Cepheid
variable! Since Cepheids can be seen to a distance of 20 Mpc, we can see
type Ia supernovae to over 10,000 Mpc; that is a significant fraction of the
entire visible Universe!
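The arithmetic in equations (3.2) to (3.6) can be reproduced in a couple of lines. This sketch simply evaluates the ratio for a 14 magnitude gain and scales the 20 Mpc Cepheid limit quoted above; the function name is hypothetical.

```python
def distance_ratio(delta_m):
    """d1 / d2 for a brightness gain of delta_m magnitudes (equation 3.5)."""
    return 10 ** (delta_m / 5.0)

ratio = distance_ratio(14.0)
supernova_reach_mpc = 20.0 * ratio  # scaling the 20 Mpc Cepheid limit

print(ratio)                # roughly 630
print(supernova_reach_mpc)  # over 10,000 Mpc
```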
But are type Ia supernovae standard candles? In fact, there is a good reason to believe they are. Since no white dwarf can exceed the Chandrasekhar
mass, all type Ia supernovae result from the explosion of a white dwarf at
exactly the Chandrasekhar mass. Therefore, we can reasonably expect them
all to share the same luminosity! By combining parallax, Cepheid variables
and type Ia supernovae, we now have a distance ladder which covers almost
the entire Universe.
Chapter 4
Motion of celestial objects
4.1
Proper Motion
The stars are in constant motion. Much of it is apparent motion. Each night,
the stars appear to move because of the rotation of the Earth (usually called
diurnal motion). Over the course of a year the Earth moves round the Sun,
causing the stars which are visible from night to night to change. And
finally, as we have discussed, the Earth’s motion around the Sun causes the
position of nearby stars to change, due to the phenomenon of parallax.
Once all these apparent sources of motion have been accounted for, however, the stars still move! A star’s motion through space is unsurprisingly
called its space velocity. Viewed from Earth, this velocity can be broken
down into two components. The component along our line of sight is called
the radial velocity, which we will discuss in more detail shortly. The second
component lies in the plane of the night sky, and is called the proper motion
- see figure 4.1. The proper motion is visible as a gradual shift in the position of a star, relative to the stars around it. It is convenient to measure it
as the rate of angular displacement of the star, in arcseconds/yr.
Proper motions of stars are generally quite small. The star with the
largest proper motion, Barnard’s star, moves around 10.3 arcseconds every
year. Nevertheless over significant periods of time, the proper motions of
stars can build up to quite significant changes. Because a star’s proper
motion is measured in arcseconds/yr, a star which is very far away can show
a small proper motion, even if it is moving rather fast across our line of sight.
Therefore, proper motions are extremely useful for making a sample of stars
which are nearby, but become increasingly hard to measure for objects which
are far away.
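To put numbers on this, here is a minimal sketch relating proper motion, distance and speed across the sky. It assumes the standard small-angle relation v_t = 4.74 µ d (v_t in km/s, µ in arcseconds/yr, d in parsecs), which is not derived in these notes; the function names and the example speed of 90 km/s are illustrative assumptions.

```python
AU_KM = 1.495978707e8          # one astronomical unit in km
YEAR_S = 365.25 * 24 * 3600    # one Julian year in seconds
K = AU_KM / YEAR_S             # 1 AU/yr expressed in km/s, about 4.74

def tangential_velocity(mu_arcsec_yr, d_pc):
    """Tangential velocity in km/s from proper motion (arcsec/yr) and distance (pc)."""
    return K * mu_arcsec_yr * d_pc

def proper_motion(v_t_kms, d_pc):
    """Proper motion in arcsec/yr for a given tangential velocity (km/s) and distance (pc)."""
    return v_t_kms / (K * d_pc)

# The same (assumed) tangential velocity seen at two very different distances.
for d in (2.0, 2000.0):
    print(f"d = {d:7.1f} pc -> mu = {proper_motion(90.0, d):.5f} arcsec/yr")
```

The factor 4.74 appears because, by the definition of the parsec, an angle of one arcsecond at a distance of one parsec spans one astronomical unit.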
Figure 4.1: Stellar motion. The space velocity is resolved into a radial velocity and a tangential velocity; the proper motion µ is also marked.
4.2
Radial velocity and Doppler shift
We return now to the component of the space velocity which is directed
along our line of sight; the radial velocity. How do we measure this? A
star which has no proper motion but high radial velocity does not appear
to move in the night sky... Fortunately, we can measure the radial velocity
of celestial objects using the Doppler shift.
Figure 4.2: Doppler shift. The rest wavelength λ0, the observed wavelength λ′ and the radial velocity vr are marked.
The cause of the Doppler shift is shown in figure 4.2. Imagine a star
which is moving away from us as it emits light. We can think of the light
as being stretched out because of the movement of the star. We can work
out how much the light is stretched out using some simple maths. What is
the time taken by the star to emit one wavelength of light? Let us call the
wavelength emitted by an object at rest λ0 . Since the wave travels at the
speed of light, the time between peaks is given by
\[ \Delta t = \frac{\lambda_0}{c}. \]
Now consider the star as it moves away from us, with a radial velocity of vr .
It still takes the same time to emit one wave, ∆t. However, in that time,
the star itself has moved a distance
\[ v_r \Delta t = \frac{v_r \lambda_0}{c}. \]
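The distance between successive peaks of the light is given by the old wavelength $\lambda_0$, plus the extra distance moved by the star. So the new wavelength,
$\lambda'$, is given by
\[ \lambda' = \lambda_0 + \frac{v_r \lambda_0}{c}, \]
or, rearranging,
\[ \frac{\Delta\lambda}{\lambda_0} \equiv \frac{\lambda' - \lambda_0}{\lambda_0} = \frac{v_r}{c}. \tag{4.1} \]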
The quantity ∆λ is called the Doppler shift. An object moving away from
us has a positive value of vr . That means the Doppler shift is positive, the
wavelength increases, and so the emitted light becomes increasingly red. We
say the light has been redshifted. Conversely, if an object moves towards us
the emitted wavelength becomes shorter, and the light is blueshifted.
4.2.1
Measuring Doppler shifts
Figure 4.3: Measuring Doppler shift. Normalised flux Fλ is plotted against wavelength (300 to 700 nm), with the rest wavelength λ0 and the shift Δλ marked.
How do we measure Doppler shifts? We can do it by taking a spectrum
of the source, as illustrated in figure 4.3. Astronomical spectra often show
sharp features at an easily measured wavelength (e.g. absorption or emission lines). In many cases, the rest wavelengths of these lines are known
from laboratory experiments performed on Earth, so by comparing the rest
wavelength with the observed wavelength from the spectrum, we can easily
calculate the Doppler shift.
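As a rough illustration of the measurement, the sketch below turns a measured line wavelength into a radial velocity using equation (4.1). The rest wavelength is that of the hydrogen H-alpha line; the observed wavelength is a made-up value, and the function name is my own.

```python
C_KMS = 299792.458  # speed of light in km/s

def radial_velocity(lambda_obs_nm, lambda_rest_nm):
    """Radial velocity in km/s from equation (4.1); positive means receding."""
    return C_KMS * (lambda_obs_nm - lambda_rest_nm) / lambda_rest_nm

H_ALPHA_REST_NM = 656.28      # laboratory wavelength of the hydrogen H-alpha line
observed_nm = 656.50          # hypothetical measured wavelength

v_r = radial_velocity(observed_nm, H_ALPHA_REST_NM)
print(f"v_r = {v_r:.1f} km/s")
```

A shift of only a fraction of a nanometre already corresponds to a velocity of order 100 km/s, which is why sharp spectral lines are so useful here.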
4.2.2
The expansion of the Universe
The combination of measuring Doppler shifts together with the work done
on the distance ladder led directly to one of the most important discoveries made in Astrophysics. Following on from his work in which he used
Cepheid Variables to measure the distance of the Andromeda galaxy, Edwin
Hubble began work combining the distances to galaxies (as measured by
Cepheids), with the velocity at which the galaxies were moving away from
us (measured by Doppler shifts). Hubble used a measure called the redshift,
$z = \Delta\lambda/\lambda_0 = v_r/c$, which is proportional to the speed at which the galaxy
is moving away from us. Using the 2.5m telescope at the Mount Wilson
observatory, Hubble collected redshifts and distances for 46 galaxies. He
discovered a proportionality between the redshift and the distance. Since
the redshift is proportional to the velocity, this means that the radial velocities of galaxies are proportional to their distances from us, a result now known
as Hubble’s law
\[ v = H_0 D, \tag{4.2} \]
where H0 is a constant of proportionality known as Hubble’s constant. Hubble made several mistakes in his work, including getting all his distances
wrong, so that his estimate of Hubble’s constant was out by nearly a factor
of 10! Nevertheless, the observations changed our view of the Universe forever. At first glance, the result seems staggering. Every galaxy is moving
away from us, and at a speed proportional to its distance from us. It is
hard to explain without assuming that the Earth is located at the centre
of the Universe. Astrophysicists don’t like to do that, because it violates
an assumption we make that the Earth is not located anywhere particularly
special (the Copernican principle). The solution to our predicament is that
the entire Universe is expanding; that is, every point in space is moving
away from every other point! As bizarre as it sounds, this behaviour had
already been predicted, using Einstein’s General theory of Relativity, when
Hubble published his results. Hubble’s work provided convincing proof that
we live in an expanding Universe.
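Hubble’s law is simple enough to try out directly. The sketch below assumes a Hubble constant of 70 km/s/Mpc, which is only an illustrative round number and not a value taken from these notes; the function names are likewise my own.

```python
C_KMS = 299792.458   # speed of light in km/s
H0 = 70.0            # assumed Hubble constant in km/s/Mpc (illustrative value)

def recession_velocity(distance_mpc, h0=H0):
    """Hubble's law, equation (4.2): v = H0 * D, in km/s."""
    return h0 * distance_mpc

def distance_from_redshift(z, h0=H0):
    """Distance in Mpc for a small redshift, using v = c z and v = H0 D."""
    return C_KMS * z / h0

print(recession_velocity(100.0))        # km/s for a galaxy at 100 Mpc
print(distance_from_redshift(0.01))     # Mpc for a galaxy with z = 0.01
```

The second function is only sensible for small redshifts, where equation (4.1) is a good approximation.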
Measurements of more distant galaxies also show us that equation (4.1) is wrong! Or at least,
needs some correction. That’s because the redshifts of very distant galaxies can be very
large; galaxies have been observed with redshifts of 4 and beyond. Since
$z = \Delta\lambda/\lambda_0 = v_r/c$ according to equation (4.1), a redshift of 4 would
suggest the galaxy was moving away from us at 4 times the speed
of light! Since nothing can go faster than light, clearly something is wrong
with our formula for Doppler shift. In fact, the problem is that we need to
make corrections to take account of special relativity. Since that is beyond
the level of this course, I will just give the answer here, in the relativistic
Doppler equation
\[ \frac{\Delta\lambda}{\lambda_0} = \left(\frac{c + v_r}{c - v_r}\right)^{1/2} - 1. \tag{4.3} \]
This equation should be used in place of equation (4.1) whenever the radial
velocity is a significant fraction of the speed of light.
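To see what equation (4.3) implies, here is a small sketch that evaluates it and also inverts it to get the velocity from a redshift. The inversion, v_r/c = ((1+z)^2 - 1)/((1+z)^2 + 1), follows from rearranging equation (4.3); the function names are illustrative.

```python
import math

def redshift_from_velocity(beta):
    """Equation (4.3): z for a radial velocity v_r = beta * c (with -1 < beta < 1)."""
    return math.sqrt((1 + beta) / (1 - beta)) - 1

def velocity_from_redshift(z):
    """Equation (4.3) solved for beta = v_r / c."""
    s = (1 + z) ** 2
    return (s - 1) / (s + 1)

beta = velocity_from_redshift(4.0)
print(f"z = 4 -> v_r = {beta:.3f} c")
print(f"beta = 0.001 -> z = {redshift_from_velocity(0.001):.6f}")
```

A redshift of 4 now corresponds to a speed below that of light, and for small velocities the result agrees closely with equation (4.1).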
Note that I’ve been spelling Doppler shift with a capital D. That’s because it is named after its discoverer, Christian Doppler. Doppler originally
proposed it (in 1842) as an explanation of why different stars had different
colours. With our modern understanding of Doppler shift, we know that
the colour of stars is not caused by Doppler shift. The maximum velocity
of stars in the Galaxy is around 300 km/s. At the centre of the optical
waveband (500nm) this velocity would cause a Doppler shift of only 0.5nm;
nowhere near enough to significantly change the colour of a star. We need
to find a different explanation for the colours of stars.
4.3
The colours of stars
4.3.1
Thermal continuum radiation
In everyday life, most objects are visible because of the light which they
reflect. You and I are visible because of the sunlight we reflect, not because
of the light we emit ourselves. We do, however, emit light. All material
objects give off some radiation: the hotter they are, the more radiation they
emit. Can we be more specific about the properties of light emitted by warm
objects?
We can if we make one very important approximation; that the radiation
emitted by an object is in thermal equilibrium with its surroundings.
What do we mean by that? Suppose we heat an object up; it will radiate
light. This radiation carries away energy and so the object cools. Now,
we’ll put our object into a special box, which absorbs and re-emits all of
the radiation which falls upon it. Now, as our object cools, the box fills
with photons. Some of these will be absorbed by our object, and provide a
small heating effect. Eventually, a balance will be set up, where for every
photon absorbed another is emitted by the object. At this point the box,
the radiation and the object are all in thermal equilibrium. This is a stable
situation; the spectrum of the radiation does not change with time. The
radiation within the box is called thermal equilibrium radiation. It is often
called black-body radiation, because the best way to get the box walls to
absorb and emit all the radiation is to make them black.
Suppose we now make a tiny, tiny hole in the box. Just large enough
to allow some light to escape, but not so large as to break the thermal
equilibrium within the box. What does the spectrum of the light look like?
Black-body spectra for objects at three temperatures are shown in figure 4.4.
Figure 4.4: Black body light curves. Monochromatic intensity (monochromatic flux per unit solid angle) is plotted against wavelength for black bodies
at three temperatures. The wavelength range of visible light is shown.
The curves show a steep rise to a well-defined peak, and then a tail of
emission towards longer wavelengths. As the temperature increases, the
peak moves to shorter wavelengths, and the area under the curve increases.
A derivation of the formula which describes the black-body spectrum is not
easy, and produced the first quantum mechanical formula ever known. The
German physicist, Max Planck, first determined the formula empirically, by
fitting the observed curve with a function that gave an extremely good fit.
Classical physics was completely unable to explain why this formula worked
so well, but quantum mechanics provided the answer. The formula, now
known as the Planck function is
\[ F_\nu(T) = \frac{2h\pi\nu^3}{c^2}\,\frac{1}{\exp(h\nu/kT) - 1}\ \mathrm{W\,m^{-2}\,Hz^{-1}}. \tag{4.4} \]
Fν (T ) is the monochromatic surface flux. It is still the amount of light per
unit time, per unit frequency and per unit area, but now that unit area refers
to a small unit of area on the surface of the emitting object. The Planck
function is extremely important in astrophysics, and we shall examine some
of the implications of it in the next lecture. But before we do; what objects
actually emit as black bodies?
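For readers who like to experiment, the Planck function of equation (4.4) can be evaluated in a few lines. This is only a sketch: the constants are standard SI values, while the function name and the sample temperatures are my own choices.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
K_B = 1.380649e-23   # Boltzmann constant, J/K

def planck_nu(nu_hz, temp_k):
    """Monochromatic surface flux F_nu(T) of equation (4.4), in W m^-2 Hz^-1."""
    x = H * nu_hz / (K_B * temp_k)
    return 2 * math.pi * H * nu_hz**3 / C**2 / math.expm1(x)

# Flux at the frequency of 500 nm light for three illustrative temperatures.
nu = C / 500e-9
for temp in (4000.0, 6000.0, 12000.0):
    print(f"T = {temp:7.0f} K  F_nu = {planck_nu(nu, temp):.3e} W m^-2 Hz^-1")
```

Using `math.expm1` rather than `math.exp(x) - 1` keeps the result accurate at low frequencies, where the exponent is tiny.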
4.3.2
Black bodies in Astrophysics
The pedantic answer to the question above is that nothing emits as a black
body. Perfect thermal equilibrium is impossible to achieve. However, there
are important classes of object that are very nearly in thermal equilibrium,
and so emit roughly as black bodies. What properties does an object need
to be close to thermal equilibrium? It must be a near perfect absorber of
light, or else it will reflect and the spectrum will differ from a black body.
Also, since it absorbs perfectly, it must also emit perfectly, or else it will
steadily absorb energy and heat up. Obviously, the object needs to have a
stable temperature (it needs to be in thermal equilibrium).
A good example is dust grains. Dust grains are bathed in radiation from
the surrounding stars. Moreover, they are good absorbers and emitters.
Therefore, they quickly reach a temperature where the radiation emitted
from the grains balances that absorbed from the background starlight. Another good example is the Cosmic Microwave Background. This background
radiation, emitted at a time when the Universe was small, hot and opaque,
is almost a perfect black-body. It has cooled as the Universe has expanded,
and has a black-body spectrum corresponding to a temperature of 2.725K
(see figure 4.5).
Figure 4.5: Cosmic Microwave Background spectrum as measured by COBE.
The theoretical curve is the Planck function for a 2.725K black body.
However, the most important examples of black bodies in astrophysics
are stars themselves. Since the energy lost through radiation is balanced
by heat from nuclear fusion in their cores, stars have a stable temperature.
And they are so dense in their interiors that nearly every photon is absorbed
(they are very good absorbers and re-emitters). However, stars are not
perfect black bodies. This is because the outer layers of the star are not
perfect absorbers/emitters of radiation. Instead, near the surface of the
star, the absorption of light is a strong function of wavelength. Starlight
is therefore only approximated by the Planck curve. Figure 4.6 shows the
Solar spectrum. After correction for the absorption by the atmosphere, the
Sun’s spectrum is pretty closely predicted by the Planck function, but the
agreement is not perfect. The fact that we can, to a rough approximation,
treat stars as black bodies allows us to make many important deductions
about their properties. This will be the topic of the next section.
Figure 4.6: Monochromatic flux from the Sun, as observed at ground level
(red) and after correction for the absorption by the atmosphere (yellow).
The spectrum is reasonably well fit by a black body curve for a temperature
of 5250 K.
Chapter 5
Thermal Continuum Radiation
Last lecture we looked at the radiation emitted by hot objects. The most
important thing we learnt is that if radiative thermal equilibrium is approximately obeyed, the surface flux from the object obeys the Planck curve, equation (4.4):
\[ F_\nu(T) = \frac{2h\pi\nu^3}{c^2}\,\frac{1}{\exp(h\nu/kT) - 1}\ \mathrm{W\,m^{-2}\,Hz^{-1}}. \]
The emission from such objects is often called black-body radiation.
Some very important astrophysical objects, including stars, are close to
thermal equilibrium, and so their spectra are approximated by the Planck
curve. In this lecture, we are going to look at some important consequences
of that fact, by looking at the properties of the Planck curve in detail.
5.1
Blackbody radiation in wavelength units
Above is the formula for the spectrum of blackbody radiation in frequency
units. It is a monochromatic surface flux, so it gives the energy emitted
per unit surface area, per unit time and per unit frequency. What is the
corresponding curve in wavelength units? You might remember we answered this
question by realising that the amount of energy emitted in a small frequency
range must equal the amount of energy contained within the corresponding
wavelength range, $F_\nu\,d\nu = F_\lambda\,d\lambda$. That led to equation (1.5),
\[ F_\nu = F_\lambda \frac{\lambda^2}{c}. \]
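We can now use that equation to write the Planck curve in wavelength units
\[ F_\lambda = F_\nu \frac{c}{\lambda^2} = \frac{2h\pi\nu^3}{c\lambda^2}\,\frac{1}{\exp(h\nu/kT) - 1}. \]
We now substitute in $\nu = \frac{c}{\lambda}$ into the equation above to obtain
\[ F_\lambda(T) = \frac{2h\pi c^2}{\lambda^5}\,\frac{1}{\exp(hc/\lambda kT) - 1}\ \mathrm{W\,m^{-2}\,nm^{-1}}. \tag{5.1} \]
5.2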
Wien’s Law
At what wavelength does a hot body emit the most light? Assuming the
hot object is roughly a black body, this question boils down to finding out
at what wavelength the Planck curve peaks. The calculation is simple in
principle, and a bit complicated in practice. In principle, we find the maximum (or minimum) of a function by finding where the derivative is zero. In
other words, the peak wavelength, λpeak is the one which satisfies
\[ \frac{dF_\lambda(T)}{d\lambda} = 0. \]
The solution to this equation is not so straightforward. I’m going to include (most of) it here, because it serves as a useful example of how to
solve a complex differentiation problem. The derivation is non-examinable,
however, and so the truly math-phobic might want to skip to the answer.
5.2.1
The derivation
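We want to solve
\[ \frac{dF_\lambda(T)}{d\lambda} = \frac{d}{d\lambda}\left[\frac{2h\pi c^2}{\lambda^5}\,\frac{1}{\exp(hc/\lambda kT) - 1}\right] = 0. \]
We make a start by writing $F_\lambda(T) = uv$, where
\[ u = \frac{2h\pi c^2}{\lambda^5}, \quad \text{and} \quad v = \frac{1}{\exp(hc/\lambda kT) - 1}. \]
From the product rule, $\frac{dF}{d\lambda} = u\frac{dv}{d\lambda} + v\frac{du}{d\lambda}$. First, we calculate $v\frac{du}{d\lambda}$ easily
\[ v\frac{du}{d\lambda} = 2\pi hc^2 \left(\frac{-5}{\lambda^6}\right) \frac{1}{\exp(hc/\lambda kT) - 1}. \]
Next, we calculate $u\frac{dv}{d\lambda}$. This is a little harder.
\[ u\frac{dv}{d\lambda} = \frac{2h\pi c^2}{\lambda^5}\,\frac{d}{d\lambda}\left[\frac{1}{\exp(hc/\lambda kT) - 1}\right] \]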
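We can’t easily do the differential in the square brackets, so we try to apply
a few standard tricks to make it easier. In this case, we make a substitution
$x = \frac{hc}{\lambda kT}$,
\[ u\frac{dv}{d\lambda} = \frac{2h\pi c^2}{\lambda^5}\,\frac{d}{d\lambda}\left[\frac{1}{\exp(x) - 1}\right], \]
and use the chain rule
\[ \frac{d}{d\lambda}\left[\frac{1}{\exp(x) - 1}\right] = \frac{dx}{d\lambda}\,\frac{d}{dx}\left[\frac{1}{\exp(x) - 1}\right]. \]
Since $x = \frac{hc}{\lambda kT}$,
\[ \frac{dx}{d\lambda} = \frac{-hc}{\lambda^2 kT}. \]
Substituting this in to our earlier equation, we find
\[ u\frac{dv}{d\lambda} = \frac{-2h^2\pi c^3}{\lambda^7 kT}\,\frac{d}{dx}\left[\frac{1}{\exp(x) - 1}\right]. \]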
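But we still can’t simply write the answer for the differential in the formula
above. However, we are now very close, because we can apply the quotient
rule,
\[ \frac{d}{dx}\left[\frac{f(x)}{g(x)}\right] = \frac{g\,\frac{df(x)}{dx} - f\,\frac{dg(x)}{dx}}{g(x)^2}, \]
and set $f(x) = 1$, and $g(x) = e^x - 1$. It follows that $df/dx = 0$ and
$dg/dx = e^x$. If we use this, we find
\[ \frac{d}{dx}\left[\frac{1}{\exp(x) - 1}\right] = \frac{-e^x}{(e^x - 1)^2}. \]
So now we have,
\begin{align}
u\frac{dv}{d\lambda} &= \frac{-2h^2\pi c^3}{\lambda^7 kT}\,\frac{d}{dx}\left[\frac{1}{\exp(x) - 1}\right] = \frac{2h^2\pi c^3}{\lambda^7 kT}\,\frac{e^x}{(e^x - 1)^2} \\
&= \frac{2h^2\pi c^3}{\lambda^7 kT}\,\frac{\exp(hc/\lambda kT)}{(\exp(hc/\lambda kT) - 1)^2}.
\end{align}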
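Phew! So now going back to $\frac{dF}{d\lambda} = u\frac{dv}{d\lambda} + v\frac{du}{d\lambda} = 0$, we find
\begin{align}
\frac{dF}{d\lambda} &= 2\pi hc^2 \left(\frac{-5}{\lambda^6}\right) \frac{1}{\exp(hc/\lambda kT) - 1} + \frac{2h^2\pi c^3}{\lambda^7 kT}\,\frac{\exp(hc/\lambda kT)}{(\exp(hc/\lambda kT) - 1)^2} \\
&= \frac{2\pi hc^2}{\lambda^6 (\exp(hc/\lambda kT) - 1)} \left[-5 + \frac{hc}{\lambda kT}\,\frac{\exp(hc/\lambda kT)}{\exp(hc/\lambda kT) - 1}\right] \\
&= 0.
\end{align}
The term outside the square bracket in that equation is always positive, and
never zero. That means the term inside the square brackets should be zero
and so we arrive at...
\[ -5 + \frac{hc}{\lambda kT}\,\frac{\exp(hc/\lambda kT)}{\exp(hc/\lambda kT) - 1} = 0 \]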
Unfortunately the answer we’ve reached is not very useful. We can’t use it
to find the value of the wavelength at the peak of the black body function
yet. Worse, it is fiendishly difficult (though possible) to solve this equation
analytically. Instead, we can use a computer to solve it numerically, in which
case we arrive at...
5.2.2
The answer
\[ \frac{hc}{\lambda kT} = 4.9651 \]
We can make this answer even more memorable by re-arranging, and substituting in S.I. values for h, c and k, to get
\[ \lambda_{\rm peak} T = 2.898 \times 10^{-3}\ \mathrm{m\,K}, \tag{5.2} \]
where the wavelength is measured in metres, and the temperature in Kelvin.
This formula is known as Wien’s Law, after Wilhelm Wien, who derived it
(following a different line of argument) in 1893. It shows that the peak
wavelength of the Planck curve is inversely proportional to temperature.
We have derived the property discussed last lecture that as the temperature
of a body increases, the peak wavelength gets shorter (see figure 4.4).
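The numerical solution mentioned above is easy to reproduce. The sketch below solves the condition -5 + x e^x/(e^x - 1) = 0, with x = hc/λkT, by simple bisection and then forms the constant in Wien’s Law. The bisection routine and the bracketing interval of 1 to 10 are my own choices, not necessarily how the quoted value was originally obtained.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
K_B = 1.380649e-23   # Boltzmann constant, J/K

def wien_condition(x):
    """Left-hand side of -5 + x e^x / (e^x - 1) = 0, rewritten as x / (1 - e^-x) - 5."""
    return x / (1.0 - math.exp(-x)) - 5.0

def bisect(func, lo, hi, tol=1e-12):
    """Simple bisection; assumes func changes sign exactly once between lo and hi."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x_peak = bisect(wien_condition, 1.0, 10.0)   # x = hc / (lambda_peak k T)
wien_constant = H * C / (x_peak * K_B)       # lambda_peak * T, in m K

print(f"x = {x_peak:.4f}")
print(f"lambda_peak T = {wien_constant:.4e} m K")
```

Bisection is slow compared with other root finders, but it cannot fail here because the function rises steadily from negative to positive across the interval.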
At a room temperature of ∼ 290 K, λpeak is about 10 microns. This is
well into the infrared part of the spectrum, and our eyes are not sensitive to
these wavelengths. This explains why we see objects only from the sunlight
they reflect and scatter. If our eyes were sensitive to infrared light, we would
see the thermal radiation from everyday objects, like in figure 5.1.
Figure 5.1: An image taken at a wavelength of 12 microns of Prof Ned Wright
(Caltech). Prof Wright is the lead scientist on NASA’s WISE mission, which
aims to survey the night sky at infrared wavelengths between 3 and 12
microns. These wavelengths are not accessible from the ground, because the
water vapour in the atmosphere makes it opaque to them.
Our Sun has a temperature of around 5800 K. Using Wien’s law, the Sun’s spectrum
peaks at about 0.5 microns. It is by no means a coincidence that this is
close to the middle of the range of light to which our eyes are sensitive! The
middle curve in figure 4.4 is a decent approximation to the Sun’s spectrum.
You’ll notice that the Sun emits light more or less evenly across the whole
visual range. Sunlight is thus a pretty even mix of colours, and we perceive it
as white. However, the sunlight which reaches the Earth has passed through
the atmosphere. Since the atmosphere preferentially scatters blue light, much
of the blue part of the Sun’s spectrum is scattered. That is why the Sun
looks yellow, and the sky looks blue.
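The two peak wavelengths quoted above follow directly from equation (5.2). A minimal sketch, with an illustrative function name:

```python
WIEN_CONSTANT = 2.898e-3   # m K, from equation (5.2)

def peak_wavelength_microns(temp_k):
    """Peak wavelength of the Planck curve (in wavelength units), in microns."""
    return WIEN_CONSTANT / temp_k * 1e6

for label, temp in (("room temperature", 290.0), ("Sun", 5800.0)):
    print(f"{label}: T = {temp:.0f} K -> peak at {peak_wavelength_microns(temp):.2f} microns")
```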
5.3
Colours and colour temperature
In principle, Wien’s Law gives us a way to obtain a measurement of stellar
temperature. In practice we often don’t have sufficient data to measure the
peak wavelength accurately. However, we can easily measure the colours of
stars, and since Wien’s law tells us that objects get bluer as they get hotter,
it shouldn’t be a surprise that the colour of a star can provide a measure of
the temperature.
The colour index, which we defined in equation (2.2), was the difference
in magnitude, as measured in two filters, e.g.
\[ B - V = +2.5 \log_{10}\left(\frac{F_V}{F_B}\right) + \mathrm{const}. \]
Using the Planck curve, we can see that the colour index is a direct measure of temperature. Figure 5.2 shows two blackbodies, one at 4000 K and
another at 12,000 K. The 4000 K blackbody emits less light at B than at
V, and so its B − V colour index is positive. By contrast, the 12,000 K
blackbody emits more light at B than at V, and so its B − V colour index
is negative. Thus, we see that B − V gets smaller as the star gets hotter
and bluer.
We can get a temperature estimate directly from a star’s B − V colour.
We simply ask what black body curve would show the same B − V colour
as we observe. Temperatures derived in this manner are called colour temperatures. Of course, since the star is not a perfect black body, the colour
temperature is only an approximation to the star’s true temperature.
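The procedure can be sketched in code. The example below makes several assumptions that are not in these notes: the B and V filters are represented by single effective wavelengths of 440 nm and 550 nm, the constant in the colour index is set to zero, and the matching temperature is found by bisection. Real colour temperatures need the full filter responses and the proper zero point.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
K_B = 1.380649e-23   # Boltzmann constant, J/K

LAMBDA_B = 440e-9    # assumed effective wavelength of the B filter, m
LAMBDA_V = 550e-9    # assumed effective wavelength of the V filter, m

def planck_lambda(wavelength_m, temp_k):
    """Monochromatic surface flux F_lambda(T) of equation (5.1), per metre of wavelength."""
    x = H * C / (wavelength_m * K_B * temp_k)
    return 2 * math.pi * H * C**2 / wavelength_m**5 / math.expm1(x)

def bv_colour(temp_k):
    """B - V of a black body with the constant set to zero (an assumption)."""
    ratio = planck_lambda(LAMBDA_V, temp_k) / planck_lambda(LAMBDA_B, temp_k)
    return 2.5 * math.log10(ratio)

def colour_temperature(bv, t_lo=1000.0, t_hi=100000.0):
    """Temperature whose black-body colour matches bv, found by bisection.

    Relies on bv_colour decreasing steadily with temperature between t_lo and t_hi.
    """
    for _ in range(100):
        t_mid = 0.5 * (t_lo + t_hi)
        if bv_colour(t_mid) > bv:
            t_lo = t_mid   # model is too red, so the match must be hotter
        else:
            t_hi = t_mid
    return 0.5 * (t_lo + t_hi)

for temp in (4000.0, 12000.0):
    print(f"T = {temp:6.0f} K -> B - V = {bv_colour(temp):+.2f}")
print(f"colour temperature for B - V = 0.2: {colour_temperature(0.2):.0f} K")
```

With the constant set to zero, only the trend and the sign change between cool and hot black bodies are meaningful, not the absolute values of B − V.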
Figure 5.2: Colour indices and temperature. The top panel shows a black
body at a temperature of 4000 K, and the bottom panel shows a black
body at a temperature of 10,000 K. Clearly, the ratio of B-band flux to Vband flux increases with increasing temperature. On our magnitude scale,
that corresponds to a B − V colour index which decreases with increasing
temperature.
5.4
The Stefan-Boltzmann Law
Let’s use the Planck curve to work out another property of black bodies.
What is the total flux emitted if we sum over all frequencies/wavelengths?
In other words, what is the bolometric surface flux from a black body? To
find this, we need to integrate the Planck curve over all frequencies (I’m
going to use frequency here, rather than wavelength, because it makes the
integration a bit easier). So, the bolometric surface flux is given by:
\[ F(T) = \int_0^\infty F_\nu(T)\,d\nu = \int_0^\infty \frac{2h\pi\nu^3}{c^2}\,\frac{1}{\exp(h\nu/kT) - 1}\,d\nu \]
This is quite a hard integral, although it’s a lot easier than the derivation of
Wien’s Law above. We’ll make it look a bit simpler by substituting $x = \frac{h\nu}{kT}$,
so $\nu = \frac{xkT}{h}$ and $d\nu = \frac{kT}{h}\,dx$:
\[ F(T) = \int_0^\infty \frac{2\pi h}{c^2}\left(\frac{xkT}{h}\right)^3 \frac{1}{e^x - 1}\,\frac{kT}{h}\,dx \]
If we tidy a few terms up, and take all the constant terms outside the
integral, we get
\[ F(T) = \frac{2\pi k^4 T^4}{h^3 c^2}\int_0^\infty \frac{x^3}{e^x - 1}\,dx. \]
The integral $\int_0^\infty \frac{x^3}{e^x - 1}\,dx$ is not trivial, but fortunately it can be looked up
in a table of standard integrals. Its value is simply a constant;
\[ \int_0^\infty \frac{x^3}{e^x - 1}\,dx = \frac{\pi^4}{15}, \]
which we can substitute to get
\[ F(T) = \frac{2k^4\pi^5}{15h^3c^2}\,T^4\ \mathrm{W\,m^{-2}}. \]
This is the Stefan-Boltzmann equation. It shows that the bolometric surface flux from a black body is proportional to the temperature to the fourth
power. The constant of proportionality in the equation above is called Stefan’s constant, and given the symbol σ. In this form, the Stefan-Boltzmann
equation looks like
\[ F(T) = \sigma T^4\ \mathrm{W\,m^{-2}}, \tag{5.3} \]
where $\sigma = 5.67 \times 10^{-8}\ \mathrm{W\,m^{-2}\,K^{-4}}$.
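Both the standard integral and the value of Stefan’s constant can be checked numerically. The sketch below uses a plain midpoint rule, cutting the integral off at x = 50 where the integrand is negligible; the step count and cut-off are my own choices.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
K_B = 1.380649e-23   # Boltzmann constant, J/K

def planck_integral(x_max=50.0, steps=20000):
    """Midpoint-rule estimate of the integral of x^3 / (e^x - 1) from 0 to x_max."""
    dx = x_max / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += x**3 / math.expm1(x)
    return total * dx

integral = planck_integral()
sigma = 2 * math.pi * K_B**4 / (H**3 * C**2) * integral

print(f"integral = {integral:.5f}, pi^4/15 = {math.pi**4 / 15:.5f}")
print(f"sigma    = {sigma:.4e} W m^-2 K^-4")
```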
5.4.1
Flux and luminosity from black bodies
The Stefan-Boltzmann law gives the bolometric surface flux from the black
body. In other words, it is the total energy emitted at all frequencies, per
second and per unit area of the black body’s surface. To find the total energy
emitted per second from the black body, we need to multiply by the surface
area of the black body, so
\[ L_{\rm bol} = 4\pi R_*^2 \sigma T^4\ \mathrm{W}. \tag{5.4} \]
Finally, we can use the inverse square law $F = \frac{L}{4\pi d^2}$, to calculate the
bolometric flux at a distance, d from the black body
\[ F_{\rm bol} = \left(\frac{R_*}{d}\right)^2 \sigma T^4\ \mathrm{W\,m^{-2}}. \tag{5.5} \]
5.4.2
Effective Temperatures
Equation (5.5) can be used to estimate the surface temperature of a star.
We measure the bolometric flux from Earth. Usually this involves measuring
the monochromatic flux at as many wavelengths as possible, and integrating
over them to obtain the bolometric flux. Provided we know the distance to
the star, and its radius, we can estimate the temperature. Temperatures
obtained this way are known as effective temperatures. They are the surface
temperature the star would have if it radiated as a perfect black body. Again
- since stars are not perfect black bodies, the effective temperature is only
an approximation to the actual surface temperature of the star.
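As a worked example, equation (5.5) can be rearranged to give T and applied to the Sun. The solar values used below (bolometric flux at Earth, Earth-Sun distance and solar radius) are approximate figures that I am assuming for illustration; they are not taken from these notes.

```python
SIGMA = 5.67e-8   # Stefan's constant, W m^-2 K^-4, from equation (5.3)

def effective_temperature(f_bol, distance_m, radius_m):
    """Solve equation (5.5) for T, given bolometric flux, distance and stellar radius."""
    return (f_bol * (distance_m / radius_m) ** 2 / SIGMA) ** 0.25

# Approximate values for the Sun (assumed for this example).
F_BOL_SUN = 1361.0     # bolometric flux at Earth, W m^-2
D_SUN = 1.496e11       # Earth-Sun distance, m
R_SUN = 6.957e8        # solar radius, m

t_eff = effective_temperature(F_BOL_SUN, D_SUN, R_SUN)
print(f"T_eff = {t_eff:.0f} K")
```

Because the temperature depends on the fourth root of the flux, even a fairly rough flux measurement gives a respectable temperature estimate.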
5.5
Other continuum emission mechanisms
The thermal radiation emitted by hot objects is a continuum emission mechanism. By that we mean that the light emitted is spread out over a broad
range of wavelengths, and has no sharp features in its spectrum. It is not
the only continuum emission mechanism encountered in astrophysics, and
for completeness I’ll briefly mention some of the other mechanisms here.
5.5.1
Synchrotron radiation
Synchrotron radiation is emitted by electrons which move at close to the
speed of light, in the presence of strong magnetic fields. The electrons feel
a force from the magnetic fields, which results in them spiralling around
the field lines. Put another way, the electrons feel an acceleration from
the magnetic field, and since all accelerated charges emit radiation, light is
emitted from the electrons. Synchrotron radiation is emitted from regions
where there are both fast moving electrons, and strong magnetic fields.
Figure 5.3: Synchrotron radiation is produced by electrons spiralling around
the magnetic field lines
Synchrotron radiation has a characteristic power-law spectrum, given by
\[ F_\nu \propto \nu^{-n}, \]
where n ∼ 0.5–1.5. This means that most of the light emitted as synchrotron
radiation is emitted at low frequencies (long wavelengths). Synchrotron radiation is a source of radio waves, with wavelengths longer than a cm or
so.
5.5.2
Bremsstrahlung
Figure 5.4: A sketch of the contribution of various continuum sources at
different frequencies. log Fν is plotted against log ν, from radio through optical to X-rays; the curves are labelled synchrotron (Fν ∝ ν^−n), black-body (Fν ∝ ν^2) and bremsstrahlung (Fν = const).
Bremsstrahlung or braking radiation is emitted from ionised gases. As
the electrons move past the protons, they feel an electric force, which decelerates them. Once again, accelerated charged particles emit radiation,
so the electrons emit light as they are braked. For the gas to be ionised it
must be very hot, so that there is enough energy to free the electrons from
atoms. It is not surprising that bremsstrahlung is mostly emitted at high
frequencies. In fact, the spectrum from bremsstrahlung is flat; the amount
of flux emitted is independent of frequency. Below some cutoff frequency,
however, the flux is proportional to the square of the frequency, so very little
light is emitted at low frequencies (the spectrum is sketched in figure 5.4).
The cutoff frequency is high; bremsstrahlung is a good source of X-rays.
Figure 5.4 also illustrates the point that the different continuum emission
mechanisms produce light at very different wavelengths. As a result, the appearance of an astrophysical object can change drastically with wavelength.
In the optical, we see thermal radiation from hot objects, like stars. The
infrared is also dominated by thermal emission, but from cooler objects, like
dust. At radio wavelengths, if an object has a strong magnetic field and can
produce very fast moving electrons, we will see synchrotron radiation from
those electrons. In the X-rays, we might see bremsstrahlung from hot gas,
at least so long as there is sufficient ionised gas!
Chapter 6
Thermal Properties of Matter
Up until now we’ve been focussing on the properties of radiation. We’ve
seen that for an object where the radiation is in thermal equilibrium with
the matter (a so-called black body), the spectrum of the radiation is given
by the Planck curve, and we’ve looked at a few important properties of the
Planck curve that allow us to obtain estimates of the temperature of stars.
Now I’d like to take a brief detour, and look at some of the properties of
matter. In particular, we’re going to continue to study our idealised object in
which thermal equilibrium between the matter and radiation is maintained.
This has two very important consequences:
1. the radiation field and the matter have a well defined temperature
and, since they are in thermal equilibrium with each other, the same
temperature describes the radiation field and the matter
2. since the matter is in thermal equilibrium, its properties are described
by the laws of statistical physics
Statistical physics is a way of looking at large systems. In a small
room, there might be something like $10^{26}$ air molecules. These molecules
will be spread over a large range of speeds and energies. It is not possible
in practice to calculate the behaviour of a single molecule in this room.
Nevertheless, the gas in the room does have a few well defined properties,
such as the pressure, temperature and density. Calculating these properties
is the domain of statistical physics.
The idea behind statistical physics is that although we cannot predict a single
particle’s energy, or speed, what we can do is calculate the probability that
it will have a given energy or speed. One of the most important results in
statistical physics is that the probability of a particle having an energy, E
depends upon the energy and temperature, like so
\[ P(E) \propto e^{-E/kT}. \tag{6.1} \]
This is known as the Boltzmann distribution. The behaviour it predicts
makes intuitive sense. The probability that a particle has energy, E depends
on the energy and the temperature. At a given temperature, a particle is
not likely to have energies much higher than kT . Also, as the temperature
increases, it becomes more likely that particles will have higher energies.
Deriving this equation is beyond us at the moment, but everything we
will discuss today follows directly on from this result.
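A quick way to get a feel for equation (6.1) is to compare the relative probabilities of two energies. In the sketch below energies are expressed in units of kT, so no constants are needed; the function name is an illustrative choice.

```python
import math

def boltzmann_ratio(e_high_over_kt, e_low_over_kt):
    """P(E_high) / P(E_low) from equation (6.1), with energies given in units of kT."""
    return math.exp(-(e_high_over_kt - e_low_over_kt))

# How much rarer are energies of several kT compared with E = kT?
for multiple in (2, 5, 10):
    print(f"P({multiple} kT) / P(kT) = {boltzmann_ratio(multiple, 1):.2e}")
```

The ratio drops very quickly, which is the quantitative version of the statement that energies much higher than kT are unlikely.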
6.1
Maxwell-Boltzmann distribution of particle speeds
Let’s consider our object in which the radiation and matter are in thermal
equilibrium. Not every particle is going to have the same speed, so there will
be a distribution of speeds. What is it? Amazingly, we can derive it using
nothing more than the Boltzmann distribution. The details of the derivation
involve some difficult maths, so I’m just going to cover the important steps
below.
Since the gas is in thermal equilibrium, the Boltzmann distribution
states that the probability that a gas particle has energy E is proportional
to $e^{-E/kT}$. But for a gas particle with speed v and mass m, the energy is
just the kinetic energy, $E = \frac{mv^2}{2}$. So the probability that a particle has a
speed v, is given by
\[ P(v) \propto e^{-mv^2/2kT}. \]
What we want to know is how many gas particles there are with speeds between v and v + dv. Since dv is a very small change in speed, the probability
that a particle has a speed in this range will simply be P (v), to a good approximation. The fraction of particles between v and v + dv is proportional
to the probability a particle has this speed, multiplied by the number of
possible speeds between v and v + dv. The number of possible speeds turns
out to be proportional to $v^2\,dv$. Finally, the total number of particles with
speeds in the range of interest is the fraction of particles with these speeds
multiplied by the total number of particles, n, and so we can write that the
number of gas particles with speeds between v and v + dv is
\[ n(v)\,dv \propto n v^2 e^{-mv^2/2kT}\,dv. \tag{6.2} \]
We can get rid of the proportionality by realising that if we integrate equation (6.2) over all speeds, the answer must equal the total number of particles, n. Doing this gives the Maxwell-Boltzmann distribution of speeds in a
gas:
\[ n(v)\,dv = 4\pi n \left(\frac{m}{2\pi kT}\right)^{3/2} v^2 e^{-mv^2/2kT}\,dv. \tag{6.3} \]
This result gives the distribution of speeds in a gas which is in thermal
equilibrium. The speeds of individual gas particles will change constantly as
particles collide with each other, but the distribution of speeds within the
gas as a whole does not change.
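The normalisation of equation (6.3) can be checked numerically: integrating n(v)/n over all speeds should give 1. The sketch below does this for hydrogen atoms at 6000 K with a simple midpoint rule. The hydrogen atom mass, the upper speed limit of 100 km/s and the step count are assumptions made for this example.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
M_H = 1.6735e-27      # mass of a hydrogen atom, kg (assumed value)

def speed_fraction(v, temp_k, mass_kg):
    """n(v)/n from equation (6.3): fraction of particles per unit speed, in s/m."""
    a = mass_kg / (2 * K_B * temp_k)
    return 4 * math.pi * (a / math.pi) ** 1.5 * v**2 * math.exp(-a * v**2)

def integrate_speeds(temp_k, mass_kg, v_max=1.0e5, steps=20000):
    """Midpoint-rule integral of n(v)/n from 0 to v_max (m/s)."""
    dv = v_max / steps
    return sum(speed_fraction((i + 0.5) * dv, temp_k, mass_kg) for i in range(steps)) * dv

total = integrate_speeds(6000.0, M_H)
print(f"integral of n(v)/n over all speeds = {total:.6f}")
```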
6.2
Properties of the Maxwell-Boltzmann distribution
Figure 6.1: Maxwell-Boltzmann distribution, n(v)/n, for hydrogen atoms at
a temperature of 6000 K. The most probable speed is labelled.
What does the Maxwell-Boltzmann distribution look like? Figure 6.1
shows the distribution of speeds of hydrogen atoms at 6000K. At low speeds
the $v^2$ term in equation (6.3) dominates. Note that because of this there are
no particles with zero speed, regardless of temperature! At high speeds, the
exponential term dominates, which makes very high speeds (where $mv^2 \gg 2kT$) unlikely. In between there is a maximum of the Maxwell-Boltzmann
distribution, which defines the most probable speed.
The most probable speed can be found by differentiating the Maxwell-Boltzmann distribution, and finding the point at which the slope is zero.
Once again, the maths is awkward and adds nothing to our physical understanding, so I will just give the result, that the most probable speed
is
$$v_p = \left(\frac{2kT}{m}\right)^{1/2}. \tag{6.4}$$
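To get a feel for the numbers, the sketch below evaluates equation (6.4) for hydrogen atoms at 6000 K, and compares it with the peak of the speed distribution found by brute force on a grid of speeds. The function names, the hydrogen mass and the 1 m/s grid are choices made for this illustration. For these values the most probable speed should come out close to 10 km/s.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
M_H = 1.6735575e-27     # mass of a hydrogen atom, kg (assumed value)

def most_probable_speed(T, m=M_H):
    """Equation (6.4): v_p = sqrt(2kT/m)."""
    return math.sqrt(2.0 * K_B * T / m)

def speed_shape(v, T, m=M_H):
    """Un-normalised Maxwell-Boltzmann shape, v^2 exp(-m v^2 / 2kT)."""
    return v**2 * math.exp(-m * v**2 / (2.0 * K_B * T))

T = 6000.0
v_p = most_probable_speed(T)

# Brute-force search for the peak on a 1 m/s grid, as an independent check.
grid = range(1, 40001)
v_peak = max(grid, key=lambda v: speed_shape(float(v), T))

print(f"v_p from equation (6.4): {v_p / 1000:.2f} km/s")
print(f"peak found on the grid:  {v_peak / 1000:.2f} km/s")
```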
Using $E = \frac{1}{2}mv^2$, we can convert this to the kinetic energy of a particle moving at the most probable speed, and find
$$E_p = kT. \tag{6.5}$$
6.2.1
Mean energy
If we look again at figure 6.1, we see that it is not symmetric around the most
probable speed. Instead, there is a tail extending towards higher velocities.
What this means is that the mean of the distribution is not the same as the
most probable speed (known as the mode of the distribution).
How do we calculate the mean speed or, more interestingly, the mean
energy of the Maxwell-Boltzmann distribution? The mean energy is defined
by
$$\bar{E} = \frac{\int_0^\infty n(E)\,E\,dE}{n}. \tag{6.6}$$
This equation should make sense to you; to calculate a mean you add up all
the particle energies ($\int n(E)\,E\,dE$) and divide by the total number of particles, $n$. We solve the integral in equation (6.6) by using $E = \frac{1}{2}mv^2$ to convert
the Maxwell-Boltzmann distribution into the distribution of energies, $n(E)$.
In doing so, we find the mean particle energy is
$$\bar{E} = \frac{3}{2}kT. \tag{6.7}$$
6.2.2
Equipartition
Before we move on to discuss the pressure of our gas, let’s take a quick look
at our result for the mean energy. Our particles have a mean (kinetic) energy
of $\frac{3}{2}kT$. Of course, our particles are free to move in three dimensions, so
their speed $v$ has components directed along each axis ($v_x$, $v_y$, $v_z$). Each of
these components has a corresponding kinetic energy, ($\frac{1}{2}mv_x^2$, $\frac{1}{2}mv_y^2$, $\frac{1}{2}mv_z^2$),
and since there is no reason to think that any one component will be larger
than any other, we must conclude that all of these components share, on
average, an equal part of the total energy!
This means that each direction of motion (we call them degrees of freedom) has, on average, an energy $\frac{1}{2}kT$ associated with it. This is known as
the equipartition theorem. Although we have derived it in quite a specific
setting it actually applies to all physical systems of many particles, and is
tremendously useful throughout astrophysics.
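Equipartition can be illustrated with a small random experiment. The sketch below draws velocities for a large number of hydrogen atoms and averages the kinetic energy along each axis; each average should come out near kT/2, and their sum near 3kT/2. It relies on one assumption not derived in these notes: that in thermal equilibrium each velocity component follows a Gaussian distribution with zero mean and variance kT/m. The function name, particle number and random seed are arbitrary choices for the illustration.

```python
import math
import random

K_B = 1.380649e-23      # Boltzmann constant, J/K
M_H = 1.6735575e-27     # mass of a hydrogen atom, kg (assumed value)

def sample_mean_energies(T, m=M_H, n_particles=200_000, seed=1):
    """Draw random velocities and return the mean kinetic energy per axis.

    Assumption: in thermal equilibrium each velocity component is drawn from
    a Gaussian with zero mean and variance kT/m.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(K_B * T / m)
    sums = [0.0, 0.0, 0.0]
    for _ in range(n_particles):
        for axis in range(3):
            v = rng.gauss(0.0, sigma)
            sums[axis] += 0.5 * m * v * v
    return [s / n_particles for s in sums]

T = 6000.0
per_axis = sample_mean_energies(T)
for name, energy in zip("xyz", per_axis):
    print(f"mean KE along {name}: {energy / (K_B * T):.3f} kT")
print(f"mean total KE:   {sum(per_axis) / (K_B * T):.3f} kT")
```

Because the sample is finite, the printed values scatter slightly around 0.5 and 1.5; increasing the number of particles reduces the scatter.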
6.3
Pressure
6.3.1
Gas Pressure
[Figure 6.2 diagram: a cube drawn with axes x, y and z.]
Figure 6.2: A cubic volume of gas in thermal equilibrium
Having looked at the Maxwell-Boltzmann distribution, and the theory
of equipartition, we are now in a position to derive the pressure caused
by a gas in thermal equilibrium. We are going to make one simplifying
assumption; we are going to assume our gas consists of randomly moving,
non-interacting particles. Such a gas is called an ideal gas. Because our ideal
gas is in thermal equilibrium, the speeds of the particles follow the Maxwell-Boltzmann distribution and all the results we calculated for average energy
etc. apply.
We’re going to think about a cubic box of gas, shown in figure 6.2. The
force on the walls of our box is caused by the collisions of gas particles with
the walls. During a collision the particle’s momentum is changed. The force
felt by the wall is equal in size to the rate of change of momentum of the
particles. Let’s look at just one wall of the box. Symmetry tells us the
pressure must be the same on all walls of the box, so we can pick any wall
we choose.
Let’s look at the right-hand wall. All of our gas particles are moving in
three dimensions, but it is only the x-component of their velocity, vx , which
causes them to hit this wall. When the particle hits the wall we assume
the collision is elastic. This means that no kinetic energy is lost. Therefore,
before the collision, the x-component of the particle’s velocity was vx ; after
the collision it is $-v_x$. The change in momentum of the particle has magnitude
$$\Delta p = 2mv_x.$$
On average, the time taken between collisions with the right-hand wall will
be the time it takes a particle with an x-velocity $v_x$ to travel to the opposite
wall, rebound and collide with the right-hand wall again. If our box has sides
of length $L$, this time is $\Delta t = 2L/v_x$. The rate of change of momentum is
thus
$$\frac{\Delta p}{\Delta t} = 2mv_x \cdot \frac{v_x}{2L} = \frac{mv_x^2}{L}.$$
Since force is equal to the rate of change of momentum the force on the wall
of the box is $mv_x^2/L$. The pressure on the box wall is the force per unit area.
The area of the box wall is $L^2$ and so the pressure due to a single particle
(labelled $i$), is
$$P_i = \frac{mv_{x,i}^2}{L^3} = \frac{mv_{x,i}^2}{V},$$
where V is the volume of the box. If we have N particles in the box, then
we find the total pressure by adding up the contribution from all particles
$$P = \sum_{i=1}^{N} P_i = \frac{m}{V}\sum_{i=1}^{N} v_{x,i}^2 = \frac{m}{V}\left(v_{x,1}^2 + v_{x,2}^2 + \ldots + v_{x,N}^2\right).$$
Since $(v_{x,1}^2 + v_{x,2}^2 + \ldots + v_{x,N}^2) = N\overline{v_x^2}$, we can write
$$P = \frac{Nm\overline{v_x^2}}{V}. \tag{6.8}$$
Our equation gives the pressure in terms of the x-component of the particle
velocities. It would be more interesting to rewrite it in terms of the particle
speed. We know that the speed, $v$, obeys $v^2 = v_x^2 + v_y^2 + v_z^2$. Also, since there
is no preferred direction of motion for our particles the mean square velocities in
the x, y, and z directions should all be equal, so $\overline{v_x^2} = \overline{v_y^2} = \overline{v_z^2}$. Therefore we
can write
$$\overline{v_x^2} = \frac{1}{3}\overline{v^2}.$$
Substituting this result into equation (6.8) we can replace $v_x$ with $v$ and
obtain
$$P = \frac{1}{3}\frac{Nm}{V}\overline{v^2}. \tag{6.9}$$
The reason we have gone to all this trouble to write equation (6.9) in terms
of the particle speed is that $m\overline{v^2}$ is just twice the mean energy of the gas
particles. But since the gas is in thermal equilibrium, the speed distribution
is the Maxwell-Boltzmann distribution, and we worked out in section 6.2.1
that the mean energy of the gas particles was $\frac{3}{2}kT$. So we substitute in
$m\overline{v^2} = 3kT$ to obtain
$$P = \frac{N}{V}kT = nkT, \tag{6.10}$$
where n is the number of particles, per unit volume. This is the ideal gas
law, which is hopefully familiar to you!
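The derivation can be checked numerically by following the same steps: add up $mv_x^2/V$ over a sample of particles, as in the sum that led to equation (6.8), and compare the result with $nkT$. The sketch below does this for a made-up box of hydrogen atoms. The box contents, the function names and the assumption that $v_x$ follows a Gaussian distribution with variance kT/m are all choices made for this illustration.

```python
import math
import random

K_B = 1.380649e-23      # Boltzmann constant, J/K
M_H = 1.6735575e-27     # mass of a hydrogen atom, kg (assumed value)

def kinetic_pressure(vx_values, m, volume):
    """Pressure on one wall: P = (m/V) * sum of v_x^2 over all particles."""
    return m * sum(vx * vx for vx in vx_values) / volume

def ideal_gas_pressure(n_particles, volume, T):
    """Equation (6.10): P = (N/V) k T."""
    return n_particles / volume * K_B * T

# Illustrative, made-up box: 100,000 hydrogen atoms in one cubic metre at 6000 K.
N, V, T = 100_000, 1.0, 6000.0
rng = random.Random(42)
sigma = math.sqrt(K_B * T / M_H)   # assumed Gaussian spread of v_x
vx_sample = [rng.gauss(0.0, sigma) for _ in range(N)]

p_kinetic = kinetic_pressure(vx_sample, M_H, V)
p_ideal = ideal_gas_pressure(N, V, T)
print(f"kinetic sum: {p_kinetic:.3e} Pa")
print(f"nkT:         {p_ideal:.3e} Pa")
```

The two numbers agree to within the statistical scatter of the random sample, which shrinks as the number of particles grows.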
The ideal gas law is an equation of state for an ideal gas. An equation
of state relates pressure, density and temperature. We made a starting
assumption in our derivation; that the gas particles do not interact with
each other. This is not true of real atoms and molecules, so we can expect
the ideal gas law to be an approximation to the behaviour of real gases. In
fact, it is a good approximation up to very high densities and the ideal gas
law is used widely in astrophysics. It is used, for example to describe the
state of matter in the interiors of stars. Only when the gas density becomes
extremely high can we no longer safely ignore the interactions between gas
particles. For very compact objects (e.g. white dwarfs and neutron stars) the
ideal gas law breaks down and we need to derive new equations of state.
Figure 6.3: NanoSail-D - a solar sail spacecraft designed by NASA to test
methods of deploying solar sails for space travel. Unfortunately the spacecraft was lost in a launch failure of the rocket it was onboard.
6.3.2
Radiation Pressure
We saw in lecture 2 that photons also carry momentum. Remember, the
momentum of a photon is p = E/c, where E is the photon energy. Since
photons carry momentum they too can exert a pressure. This pressure is,
unsurprisingly, known as radiation pressure.
We could go through the derivation of the ideal gas law again, and replace
our gas particles (with momentum mv) with photons (with momentum E/c).
Rather than go through all the steps again, I shall just state the result
$$P_{\rm rad} = \frac{1}{3}aT^4, \tag{6.11}$$
where $a = 4\sigma/c$ ($\sigma$ is the Stefan-Boltzmann constant, $5.67 \times 10^{-8}$ W m$^{-2}$ K$^{-4}$).
Radiation pressure is generally a very small effect. Note however that it is
strongly temperature dependent; radiation pressure is a significant effect in
the highest mass stars, which are also the hottest stars. Radiation pressure
is also the reason why solar sails (figure 6.3) could be used for long distance
space travel. The radiation pressure on a solar sail is tiny, but over time
it could accelerate a spacecraft to a reasonable speed. The great advantage
is that no fuel need be carried, so the range of a solar sail spacecraft is
unlimited!
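To see how strongly radiation pressure depends on temperature, the sketch below evaluates equation (6.11) at a few temperatures and compares it with the gas pressure from equation (6.10). The number density used for the gas is a hypothetical value chosen only for illustration, as are the function names and the temperatures.

```python
SIGMA = 5.670374419e-8   # Stefan-Boltzmann constant, W m^-2 K^-4
C = 2.99792458e8         # speed of light, m/s
K_B = 1.380649e-23       # Boltzmann constant, J/K

def radiation_pressure(T):
    """Equation (6.11): P_rad = a T^4 / 3, with a = 4 sigma / c."""
    a = 4.0 * SIGMA / C
    return a * T**4 / 3.0

def gas_pressure(n, T):
    """Equation (6.10): P = n k T, with n the number density in m^-3."""
    return n * K_B * T

# Hypothetical number density, chosen only for illustration.
n = 1.0e23
for T in (6.0e3, 6.0e5, 6.0e7):
    ratio = radiation_pressure(T) / gas_pressure(n, T)
    print(f"T = {T:.0e} K: P_rad = {radiation_pressure(T):.3e} Pa, "
          f"P_rad / P_gas = {ratio:.3e}")
```

At fixed density the ratio of radiation pressure to gas pressure grows as $T^3$, which is why radiation pressure only becomes important in the hottest environments.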
Chapter 7
A brief history of
astronomical spectroscopy
Spectroscopy is the task of measuring the amount of light as a function of
wavelength. When taking spectra of astronomical objects, we are measuring
the monochromatic flux as a function of wavelength. The history of spectroscopy in general, and spectroscopy in astronomy and astrophysics are
closely linked. A continuing theme throughout the history of spectroscopy
has been that new technologies have allowed scientific breakthroughs. So the
best place to start is with Isaac Newton, and the technology that allowed
spectroscopy in the first place; the prism.
7.1
Isaac Newton and the nature of light
When Isaac Newton started studying light, it was already well known that
a beam of light, shone through a prism, would split into many different
colours. At the time, it was believed that white light had no colour, and
the prism itself caused the light to have colour. Newton showed that this
wasn’t true with a very clever experimental setup (see figure 7.1).
In 1672, Newton shone window light through a prism and showed that
the white light split into many colours. Then, he isolated the red light from
this beam and passed this through a second prism. The red light remained
unchanged. This proved that white light was made up of coloured light, and
the prism merely split the white light up into its constituent colours. The
prism was to remain the spectroscopic tool of choice for nearly 200 years.
Figure 7.1: Newton’s sketch of his experiment in the nature of sunlight,
which he called his crucial experiment.
7.2
Wollaston and the dark bands in sunlight (1802)
Newton’s experiments were some of the first spectroscopic observations of
a star (the Sun). It wasn’t until 1802, however, that William Wollaston
showed that sunlight was not simply a continuous spectrum. Wollaston was
a chemist principally, but he was also very interested in optics. He invented
the first camera lens, and also the Wollaston prism, used for measuring the
polarisation of light. His prisms were of much higher quality than Newton’s,
and they allowed him to see for the first time that sunlight was not an
unbroken, continuous spectrum. Instead, there were several prominent dark
bands which he observed. Wollaston believed that these dark bands were
“gaps” in the colours of the Sun.
7.3
Joseph von Fraunhofer
Fraunhofer was a Bavarian, orphaned at the age of 11, when he went to work
as an apprentice to a glassmaker. At the age of 14, Fraunhofer was involved
in an accident when the workshop he was in collapsed. In a bizarre twist
of fate, he was rescued by the Bavarian prince, Maximilian IV Joseph, who
took an interest in Fraunhofer’s life, providing him with access to books,
time to do research and glassmaking materials.
Fraunhofer became probably the best maker of optics in the world. He
discovered the dark bands in the Sun’s spectrum, independently of Wollaston, in 1814. In total, he found 574 dark lines, which are named after him;
the Fraunhofer lines (see figure 7.2).
Figure 7.2: A graphical representation of the Fraunhofer lines in the Sun’s
spectrum. Fraunhofer labelled the strongest lines A through K, whilst the
weaker lines were also labelled with lower case letters. Astronomers sometimes still use these names today.
Fraunhofer’s optics allowed him to see that the Fraunhofer lines were
not “gaps” in the solar spectrum, as Wollaston thought, but were instead
absorption lines; discrete wavelengths at which the Sun was fainter, but at
which it still emitted light. Fraunhofer was not really interested in these
lines from a scientific point of view. Instead, he was looking for a way to
calibrate his spectroscope, the precursor to the modern spectrograph. An
example of a spectroscope is shown in figure 7.3; light from the object of
study comes down a scope and is focussed on a prism, which disperses the
light. Another scope then views the light which emerges from the prism;
by moving this scope’s position you detect light of different wavelengths.
Fraunhofer was using the Sun’s dark lines as fixed wavelength references,
so he could calibrate the relationship between the spectroscope’s position
and wavelength. Neither he nor anyone else at the time had any understanding of
where the dark lines came from.
7.4
Kirchhoff & Bunsen
A major breakthrough in the understanding of astronomical spectra came
with the work of Kirchhoff and Bunsen, between 1859 and 1861. Using a
spectroscope of a type designed by Fraunhofer (figure 7.3), they examined
Figure 7.3: Kirchhoff & Bunsen’s spectroscope, which they used to look at
the spectra produced by burning various elements. Light from the flame
is focussed on prism F by scope B. The prism disperses this light and, by
moving scope C, the amount of light at a given wavelength can be measured.
the spectra of the flames produced by burning various elements. What they
observed was that each element produced numerous bright emission lines,
and that the wavelengths of these lines were characteristic of each individual
element, and could be used as a “fingerprint” for that element. Kirchhoff
and Bunsen also produced, by experiment, a series of rules for the type of
spectrum observed from various types of light sources (figure 7.4). They
found that hot, dense gases (or solids) produced a continuous spectrum
(the Planck function we discussed earlier). Hot gases which were of low
density produced a series of emission lines, characteristic of the composition of the gas. Also, they found that when a hot solid or dense gas was
observed through a cooler, less dense gas the spectrum observed was a continuous spectrum, but with a series of absorption lines superimposed. The
absorption lines were characteristic of the cool gas, and thus presumably
produced by it.
Kirchhoff and Bunsen’s work allowed an understanding of the origin of
the Fraunhofer lines in the Sun’s spectrum. They showed that the Fraunhofer
lines could be identified with lines emitted by known elements (Hydrogen,
Calcium, Sodium, etc). The absorption line nature of the solar spectrum
implied that we were observing the light from the hot, inner regions of the
Sun, after it passed through the cooler surface layers.
[Figure 7.4 diagram: hot, high density gas or hot solid: continuous spectrum; hot, low density gas: emission lines; hot, dense gas or hot solid seen through cool, low density gas: continuous spectrum with absorption lines.]
Figure 7.4: Kirchhoff & Bunsen’s empirical rules for the type of spectrum
observed from different types of light sources.
7.5
Huggins (1863-1864)
With the work of Kirchhoff and Bunsen, finally there was a framework for
understanding astronomical spectroscopy. The effect on astronomers was
astounding; the notable British astronomer William Huggins summed up
the mood by saying
“Astronomy, the oldest of the sciences, has more than renewed
her youth. At no time in the past has she been so bright with
unbounded aspirations and hopes” - Huggins (1891).
Huggins in particular took Kirchhoff and Bunsen’s results and put them
to great use. Combining large telescopes with early precursors of the modern
astronomical spectrograph he obtained many spectra of nearby stars and
used the absorption lines in their spectra to work out their composition. In
a result that startled many at the time, it turned out that stars were made
out of the same material as the Sun, and by extension, from the common
elements found on the Earth. In his paper with Miller in 1864, Huggins
wrote:
“It is remarkable that the elements most widely diffused through
the host of stars are some of those most closely connected with
the living organisms of our globe, including hydrogen, sodium,
magnesium and iron...” - Huggins & Miller (1864)
Working with his wife, Margaret, Huggins also took some steps towards
the understanding of Nebulae. These faint, diffuse, smudges of light were
hotly debated at the time. Some claimed they were clouds of brightly glowing gas whilst others argued that they were distant collections of many stars.
By taking spectra of the brightest nebulae, Huggins showed that they possessed emission line spectra. Following the work of Kirchhoff and Bunsen,
this proved that the brightest nebulae were clouds of hot, sparse gas. Later
on, many of the fainter nebulae were shown to have absorption line spectra.
It was the work of Hubble in the 1920s, building on Leavitt’s earlier results, which showed that these
nebulae, now called galaxies, were very distant collections of stars, like our
own Milky Way.
7.6
Spectral classification of stars: Secchi (1866)
Figure 7.5: Secchi’s spectral classification scheme (1866)
Many people were by now acquiring large collections of stellar spectra.
It wasn’t long before they noticed patterns emerge, and started to place
stars into groups defined by their spectral appearance. This act of spectral
classification was not motivated by an understanding of why stars shared
similar properties, but instead was a purely empirical exercise, somewhat
akin to the classifying of animals into species within Biology. Nevertheless,
the eventual spectral sequence of stars would prove to be hugely important
within astrophysics, and understanding the spectral sequence will form the
next part of our course.
The first spectral classification scheme was devised by a Jesuit priest,
Angelo Secchi, in 1866. He had acquired of the order of a thousand stellar
spectra, and divided them into 4 main classes, based upon the types of
absorption lines which appeared in their spectra. Secchi’s sequence is shown
in figure 7.5. The basic idea of Secchi’s sequence; classifying stars by their
absorption lines, remains with us today. However, the development of the
modern stellar spectral classification scheme required the analysis of a very
large number of stellar spectra. In turn, this relied on two developments in
astrophysics; one technological, and one social.
7.7
The Harvard spectral classification sequence
(1886-1922)
Figure 7.6: The Harvard group
The Harvard astronomer Edward Pickering was amongst the first to
use objective prism spectroscopy to collect large numbers of stellar spectra.
Objective prism spectroscopy uses a large prism to disperse the light from
all the stars in the field of view of a telescope. Using the large photographic
plates which were newly available, all of these spectra could be recorded in
a single image. However, such a large amount of data was being produced,
that Pickering and his colleagues could not keep up with its analysis. A
perhaps apocryphal tale suggests that Pickering, exasperated with the rate
of progress provided by his postdoctoral assistants, claimed that even his
housemaid could be more productive. Whether or not this is true, Pickering
certainly hired his housemaid, Williamina Fleming, and a sizeable staff of
other women astronomers. Pickering’s staff was hardworking, dedicated,
talented and (most importantly) cheap to employ. Because they were female,
they were employed at very low wages, allowing Pickering to process a large
amount of data, for very little money.
The large number of women on Pickering’s staff created a stir at the
time, and the Harvard group acquired several nicknames, from the slightly
insulting “Harvard Computers”, to the downright patronising “Pickering’s
Harem”. Despite this, the contribution of the Harvard group to astronomy was considerable. As well as the modern spectral classification scheme,
we have already discussed the contribution of Henrietta Leavitt, one of the
Harvard group, to the measuring of distance in astrophysics, and our understanding of the scale of the Universe. The Harvard group was also the
beginning of large-scale contribution to astronomy by women.
By processing many thousands of stellar spectra between 1886 and 1922,
the Harvard group assembled a classification scheme which persists to this
day. The scheme is based upon the strengths of absorption lines in the stellar
spectrum, and the requirement is that the absorption line strengths must
vary smoothly and continuously along the spectral sequence. The Harvard
scheme divides stars into seven spectral classes, each denoted by an upper
case letter. The letters are OBAFGKM, and they can be remembered, in
order, using the (terrible) mnemonic “Oh, Be A Fine Girl, Kiss Me”.
The details of how line strengths vary across the Harvard spectral classification sequence are shown in figure 7.7. By measuring the relative line
strengths in the spectrum of any star it is a simple matter to assign it to a
spectral class in the Harvard sequence.
It is worth remembering that, at the time, this work was taxonomical.
Although stars could be placed on a sequence on the basis of their line
strengths, the physical meaning of this, and what properties of the star
changed along the Harvard sequence, were unknown. This is because the
classical physics of the time was completely unable to explain the reason why
elements produced characteristic absorption and emission lines. If we wish
to understand this, and by doing so understand what the Harvard spectral
sequence actually represents, we must turn to the beginnings of quantum
theory, which was being developed at the turn of the 20th century.
[Figure 7.7 plot: line strength against spectral type (O5, B0, A0, F0, G0, K0, M0), with curves labelled H, He I, He II, ionised metals, neutral metals and molecules.]
Figure 7.7: The variation of line strength along the Harvard spectral classification sequence. The top graph shows the detailed behaviour of individual
element species along the sequence. The species labelling denotes element
and ionisation state. For example H I represents neutral, atomic hydrogen,
and He II represents singly ionised helium atoms. TiO represents molecules
of titanium oxide. The bottom graph shows a schematic representation of
the Harvard sequence, at the level you are expected to learn it for this course.
Chapter 8
The Bohr model of the atom
The spectroscopic work carried out up till the beginning of the 20th century
has left us with three facts which need explaining:
1. the presence of discrete lines in the emission spectra of elements;
2. the Kirchhoff-Bunsen rules, which dictate what kind of spectrum will
be observed from different sources;
3. the nature of the Harvard Spectral Sequence.
In this section, I will try and tackle the first two questions. The question we
are really seeking to answer is how does an atom of, say, Hydrogen interact
with light. Why does it emit and absorb at discrete wavelengths? To answer
this question, we need to examine the structure of an atom.
8.1
The atom
In the final years of the 19th century, J. J. Thomson discovered the electron.
Since matter (and hence atoms) is neutral, this discovery meant that the
atom consists of negatively charged electrons, and some distribution of positive charge. In 1911, Ernest Rutherford showed that the positive charge was
confined to a tiny, massive nucleus. He did so by firing energetic alpha particles at thin metal foils. Astoundingly, some of the alpha particles bounced
off the foil. Rutherford wrote:
“It was quite the most incredible event that has ever happened
to me in my life. It was almost as incredible as if you fired a
15-inch shell at a piece of tissue paper and it came back and hit
you.”
Rutherford’s work led to a picture of the atom in which negatively
charged electrons orbited around a tiny, positively charged and massive nucleus. This atomic picture had two major flaws. Firstly, it was known
from Maxwell’s theories of electromagnetism, that accelerating charges emit
radiation. An electron in a circular orbit is constantly accelerating¹ and
so it should emit radiation, lose energy and spiral into the nucleus. This
should all happen in less than $10^{-8}$ s. Obviously matter is stable on very
long timescales, so this is a major flaw in the model! Secondly, the atom as
described couldn’t explain the work of Kirchhoff and Bunsen, who showed
that atoms absorb and emit light at discrete wavelengths.
8.2
Niels Bohr and the ‘semi-classical’ atom
At the same time as Rutherford was beginning to understand the nature
of the atom, theoretical physicists were beginning to grasp the quantised
nature of light. Max Planck had derived the Planck curve for black body emission by assuming that energy is emitted and absorbed in discrete quanta,
and Einstein’s work on the photoelectric effect showed that light itself
exists as photons with quantised energy.
The Danish physicist Niels Bohr made a great step towards solving the
structure of the atom by making a massive leap of intuition. Bohr noted
that the dimensions of Planck’s constant [J s] are equivalent to [kg m$^2$ s$^{-1}$],
the dimensions of angular momentum. What if the angular momentum of
the electron was quantised? Just as an electromagnetic wave, made out
of $n$ photons of frequency $\nu$, could only have an energy of $E = nh\nu$, Bohr
wondered about the consequences of an atom in which the electrons could
only have quantised angular momenta given by L = nh/2π.
As we will see, Bohr’s model atom allows us to understand why atoms
only absorb and emit light at certain wavelengths. He was also able to explain why the atom was stable; an electron in an orbit with an “allowed”
angular momentum could not spiral into the nucleus, since that would involve passing through “forbidden” values of angular momentum. Bohr was
not able to explain why the electron’s angular momentum was quantised.
This would require full quantum mechanical theory, which described electrons in the atom not as particles orbiting a nucleus, but as probability
waves, which describe the likely position, energy and momentum of an electron. Nevertheless, Bohr’s success in overcoming the problems faced by the
classical model of the atom was a sign he was on the right track. To show
this, let’s look at why Bohr’s model explains the line emission from atoms.
¹ Its speed may be constant, but its direction is constantly changing. This is because it
feels the attractive force of the nucleus; it is this force that provides the acceleration.
8.3
Energy levels of electrons in the Bohr atom
Bohr’s model meant that electrons could only occupy certain orbits; those
with allowed values of angular momentum L = nh/2π. What happens when
an electron moves from one orbit to another? The electron’s energy must
change, and that change results in the absorption or emission of a photon
of the same energy. To calculate the energy involved, we need to work out
the energy of electron orbits in the Bohr model. We’ll look at a hydrogen
atom, as that is the simplest case we can consider.
Figure 8.1: The Bohr model of the hydrogen atom.
In a hydrogen atom we have an electron of mass $m$ and charge $-e$ in
orbit around a proton of charge $+e$. The electron is in an allowed orbit,
a distance r from the proton, and orbits with a speed v (see figure 8.1).
We need to work out the energy of the orbit. We start by noting that the
electrostatic attraction between the electron and proton must provide the
centripetal force, which gives
$$\frac{Ze^2}{4\pi\epsilon_0 r^2} = \frac{mv^2}{r}, \tag{8.1}$$
where Z is the atomic number (the number of protons in the nucleus, Z = 1
for hydrogen). Re-arranging equation (8.1) gives
$$\frac{1}{2}mv^2 = \frac{Ze^2}{8\pi\epsilon_0 r}, \tag{8.2}$$
but $\frac{1}{2}mv^2$ is the kinetic energy of the electron, so
$$\mathrm{K.E} = \frac{Ze^2}{8\pi\epsilon_0 r}. \tag{8.3}$$
We’re trying to work out the total energy of the orbit, so we also need
to know the potential energy due to the electrostatic attraction. We can
calculate the potential energy by looking at the work done assembling the
atom. Start with the electron at an infinite distance from the proton, and
move it a small distance dr towards the proton. The work done in moving
that small distance dr is F dr, where F is the electric force on the electron.
To find the potential energy of the atom, we have to add up all the work
done in moving the electron from infinity to r. The potential energy is then
given by
$$\mathrm{P.E} = \int_\infty^r F\,dr = -\int_r^\infty F\,dr = -\int_r^\infty \frac{Ze^2}{4\pi\epsilon_0 r^2}\,dr = -\frac{Ze^2}{4\pi\epsilon_0 r}. \tag{8.4}$$
The total energy is the sum of the potential and kinetic energy:
$$E = \mathrm{P.E} + \mathrm{K.E} = \frac{Ze^2}{8\pi\epsilon_0 r} - \frac{Ze^2}{4\pi\epsilon_0 r} = -\frac{Ze^2}{8\pi\epsilon_0 r}. \tag{8.5}$$
The total energy is negative because the electron is bound to the proton; if we
wish to free the electron from the proton, we must add energy. So far, our
derivation has been entirely classical. We make a semi-classical model by
adding Bohr’s hypothesis that the angular momentum is quantised, so
$$L = mvr = \frac{nh}{2\pi}, \tag{8.6}$$
where $n = 1, 2, 3, \ldots, \infty$. We can use this to solve for the radius of the electron’s
orbit, $r$. We take equation (8.1), which we obtained by equating the
Coulomb force to the centripetal force, and we re-write it like so
$$\frac{4\pi\epsilon_0}{mr}(mvr)^2 = Ze^2. \tag{8.7}$$
We then substitute equation (8.6) for mvr in equation (8.7) to find
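$$\frac{4\pi\epsilon_0}{mr}\left(\frac{nh}{2\pi}\right)^2 = Ze^2, \tag{8.8}$$
which can be re-arranged to give the radius of the $n$th orbit, $r_n$, as
$$r_n = \frac{4\pi\epsilon_0}{mZe^2}\left(\frac{nh}{2\pi}\right)^2. \tag{8.9}$$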
Now we know the radius of the orbit, we can substitute this back into the
equation for the total energy of the nth orbit, equation (8.5), to find
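$$E_n = -\frac{Ze^2}{8\pi\epsilon_0 r_n} = -\frac{Ze^2}{8\pi\epsilon_0}\,\frac{mZe^2}{4\pi\epsilon_0}\left(\frac{2\pi}{nh}\right)^2 = -\frac{Z^2 e^4 m}{8\epsilon_0^2 h^2}\,\frac{1}{n^2} = -\frac{W}{n^2}, \tag{8.10}$$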
where W is a constant for any given atom. Therefore, the integer n, known as
the principal quantum number, completely determines the radius and energy
of each orbit of the Bohr atom. When the electron is in the lowest energy
level ($n = 1$, the ground state), its energy is simply $E = -W$. It
would take an amount of energy equal to W to remove the electron from
the atom; W is the ionisation energy of the atom.
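Putting numbers into equations (8.9) and (8.10) shows the scale of the Bohr atom. The sketch below evaluates the orbit radius and energy for the first few levels of hydrogen, using standard values for the physical constants; the function names and the conversion to nanometres and electron volts are choices made for this illustration. For $n = 1$ the radius should come out at about 0.053 nm and the energy at about −13.6 eV.

```python
import math

# Standard values for the physical constants (SI units).
E_CHARGE = 1.602176634e-19     # elementary charge, C
M_E = 9.1093837015e-31         # electron mass, kg
EPS0 = 8.8541878128e-12        # permittivity of free space, F/m
H = 6.62607015e-34             # Planck constant, J s

def bohr_radius(n, Z=1):
    """Equation (8.9): r_n = (4 pi eps0 / (m Z e^2)) * (n h / 2 pi)^2."""
    return 4.0 * math.pi * EPS0 / (M_E * Z * E_CHARGE**2) * (n * H / (2.0 * math.pi)) ** 2

def bohr_energy(n, Z=1):
    """Equation (8.10): E_n = -W / n^2, with W = Z^2 e^4 m / (8 eps0^2 h^2)."""
    W = Z**2 * E_CHARGE**4 * M_E / (8.0 * EPS0**2 * H**2)
    return -W / n**2

for n in (1, 2, 3):
    r_nm = bohr_radius(n) * 1e9
    e_ev = bohr_energy(n) / E_CHARGE   # convert joules to electron volts
    print(f"n = {n}: r = {r_nm:.4f} nm, E = {e_ev:.2f} eV")
```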
8.4
Atomic lines of hydrogen
We are now in a position to understand the lines emitted in the spectrum of
hydrogen. When an electron moves between one orbit and another, it emits
or absorbs a photon. The energy of the photon is given by the difference in
energy between the two orbits, $\Delta E = E_{n_1} - E_{n_2}$. Equation (8.10) leads to
an equation for the energy of the emitted or absorbed photon
$$E_{\rm photon} = E_{n_1} - E_{n_2} = W\left(\frac{1}{n_2^2} - \frac{1}{n_1^2}\right). \tag{8.11}$$
Since, for a photon E = hν, the corresponding frequency of the photon is
$$\nu = \frac{W}{h}\left(\frac{1}{n_2^2} - \frac{1}{n_1^2}\right). \tag{8.12}$$
And, finally, we can use the wave equation νλ = c to obtain the wavelength
of the photon as
$$\lambda = \frac{ch}{W}\left(\frac{1}{n_2^2} - \frac{1}{n_1^2}\right)^{-1} = \frac{1}{R}\left(\frac{1}{n_2^2} - \frac{1}{n_1^2}\right)^{-1}, \tag{8.13}$$
where $R = 1.097 \times 10^7$ m$^{-1}$ is the Rydberg constant. These equations give
the wavelengths/frequencies of the radiation that would be emitted/absorbed
when an electron jumps from one level to another.
When an electron jumps from the $n = 3$ orbit to the $n = 2$ level, a photon of
wavelength $\lambda = \frac{1}{R}\left(\frac{1}{4} - \frac{1}{9}\right)^{-1} = 656$ nm is emitted. The reverse process can
also occur; an electron in the n = 2 orbit can absorb a photon of wavelength
656 nm and jump to the n = 3 orbit.
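The same calculation is easily repeated for other pairs of levels. The sketch below wraps equation (8.13) in a small function and lists the wavelengths for jumps between the $n = 2$ orbit and the next few orbits above it, using the value of R quoted in the text. The function and argument names are chosen for this illustration; `n_upper` and `n_lower` play the roles of $n_1$ and $n_2$ for an emitted photon.

```python
R = 1.097e7   # Rydberg constant from the text, m^-1

def wavelength_nm(n_upper, n_lower, rydberg=R):
    """Equation (8.13): wavelength of the n_upper -> n_lower line, in nm."""
    inverse = rydberg * (1.0 / n_lower**2 - 1.0 / n_upper**2)
    return 1e9 / inverse

# First few lines ending on (or starting from) the n = 2 orbit.
for n_upper in (3, 4, 5, 6):
    print(f"n = {n_upper} -> 2: {wavelength_nm(n_upper, 2):.1f} nm")
```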
8.4.1
Hydrogen line series
Considering the energy levels in more detail, it is clear that the line spectrum
of hydrogen will exhibit a number of “series”, associated with transitions to
and from a given orbit. For example, transitions from the n = 3, 4, . . . , ∞
energy level to the n = 2 level cause a series of emission lines known as the
Balmer series, often denoted with the letter H (see figure 8.2). The line we
considered above, when the electron jumps from n = 3 to n = 2 is part of
the Balmer series. In fact, it is the first line of the Balmer series, Hα. Its
wavelength of 656 nm is in the red part of the optical spectrum.
The Balmer series is exactly the same series of lines observed by Kirchhoff
and Bunsen. There are also series of lines in the ultraviolet, corresponding
to transitions to and from the n = 1 ground state (the Lyman series) and an
infrared series of lines, corresponding to transitions to and from the n = 3
orbit (the Paschen series).
Look at the spacing of the energy levels in figure 8.2, and compare it
to equation (8.10). As n increases, the energies of the orbits become more
closely spaced. Therefore the difference in energy, or frequency, or wavelength between successive lines in a series gets smaller as n increases. Eventually, the spacing of the energy levels approaches zero, and the emission
or absorption lines get very closely spaced indeed. We say the series has
reached its limit.
Figure 8.2: Energy level diagram for a hydrogen atom showing the Lyman,
Balmer and Paschen lines (downward arrows indicate emission lines; upward
arrows indicate absorption lines).
8.4.2
Hydrogen-like atoms
When we derived the energy levels for hydrogen, we set the atomic number,
Z = 1. However, you would get a similar series of energy levels for any atom
consisting of a single electron orbiting a nucleus containing Z > 1 protons.
For example, singly ionised helium is such an atom, with Z = 2. Although
the energy level diagram for any hydrogen-like atom has the same form as for
hydrogen, the exact spacing of the levels depends upon the atomic number,
Z. In the case of singly ionised helium, for example, equation (8.10) tells us
that the spacing between the energy levels is four times that between the
energy levels in a hydrogen atom.
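The factor of four can be checked with a few lines of Python. This is only a sketch: it assumes the standard Bohr scaling E_n = -Z^2 W / n^2 for a one-electron atom, with W = 13.6 eV for hydrogen, and the function names are hypothetical.

```python
W_HYDROGEN_EV = 13.6  # ionisation energy of hydrogen in eV


def level_energy_ev(n, z=1):
    """Bohr-model energy (eV) of level n for one electron around Z protons.

    Assumes the scaling E_n = -Z^2 W / n^2.
    """
    return -(z**2) * W_HYDROGEN_EV / n**2


def level_spacing_ev(n_upper, n_lower, z=1):
    """Energy gap (eV) between two levels of a hydrogen-like atom."""
    return level_energy_ev(n_upper, z) - level_energy_ev(n_lower, z)


if __name__ == "__main__":
    hydrogen_gap = level_spacing_ev(3, 2, z=1)
    helium_ion_gap = level_spacing_ev(3, 2, z=2)
    print("gap ratio, He II / H:", helium_ion_gap / hydrogen_gap)
```

Because every level scales by the same factor Z^2, the ratio is the same whichever pair of levels is chosen.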
8.4.3
More complex atoms
Figure 8.3: Some of the energy levels of a helium atom (2 protons, 2 electrons). A small number of possible transitions is also indicated.
Bohr’s model is very successful in describing the line spectrum of hydrogen-like atoms. However, it is important to note that Bohr’s model atom is not
correct. Although the angular momentum is quantised, it is not quantised
in the way suggested by Niels Bohr. To some extent, it is a matter of good
luck that we obtained the correct energy level diagram for hydrogen! To
calculate the energy levels of more complex atoms, it is necessary to use
the complete quantum theory, and to account for the interactions between
electrons, as well as between the electron and the nucleus. This calculation
rapidly becomes extremely complex, and the number of possible energy levels grows rapidly as the number of electrons in the atom rises. Figure 8.3
shows a simplified energy level diagram for atomic helium. Even with a single extra electron, the energy level diagram is already much more complex,
and there are many more possible transitions with corresponding emission
and absorption lines.
8.5
The Kirchhoff-Bunsen laws
We are now in a position to understand the Kirchhoff-Bunsen laws, which
describe what kind of spectrum will be seen from a given source.
• A hot dense gas or hot solid produces a continuum spectrum with no
spectral lines (see footnote 2). If the body is in thermal equilibrium, this spectrum is
described by the Planck curve.
• A hot, diffuse gas produces emission lines. Because the gas is hot,
electrons exist in excited states. When the electrons decay to lower
energy orbits the energy lost is carried away by a single photon. This
photon can only have certain, discrete energies, corresponding to the
differences in energy between allowed orbits.
• A cool, diffuse gas in front of a continuous spectrum source (a hot solid
or dense gas) produces absorption lines in the continuous spectrum.
Absorption lines are produced when an electron makes a transition
from a lower energy orbit to a higher energy orbit. If a photon in the
continuous spectrum has exactly the right amount of energy, equal to
the energy difference between two orbits, that photon can be absorbed
and the electron makes the transition to a higher orbit. The cool
diffuse gas thus absorbs light from the continuous spectrum, but only
at discrete wavelengths. This produces an absorption line spectrum.
Footnote 2: Since a solid is made of atoms, why don’t solids also emit a line spectrum? The answer
is that interactions between the closely spaced atoms change the energy levels available to
the electrons, so that electrons can have a large range of energies. As a result, the object
can emit or absorb light across a large range of wavelengths.
8.6
The use of spectral lines
It turns out that the presence of absorption and emission lines in the spectra of astrophysical objects is one of the most powerful tools available to
astronomers. We can use the lines to measure velocities, using the Doppler
effect. The wavelengths of lines act as fingerprints for the material in an object. In the next section we will see that the relative strengths
of these lines depend upon the physical properties of the gas that produces them. These properties (temperature, density and pressure) can be determined by a careful examination of spectral lines. We will see that the
strengths of the spectral lines (and thus the explanation for the Harvard
spectral sequence) depend strongly on the temperature, giving us yet
more ways of measuring the temperature of astrophysical objects!
Chapter 9
Line strength
In the last section we dealt with two of the puzzles arising from early astronomical spectroscopy. Now, we turn to the last puzzle - the Harvard spectral
classification sequence.
9.1
The Harvard sequence in more detail
[Figure 9.1 sketch: line strength (vertical axis) against spectral type (horizontal axis: O5, B0, A0, F0, G0, K0, M0), with curves labelled He II, He I, H, ionised metals, ionised calcium, neutral metals and molecules.]
Figure 9.1: A crude sketch of the variation of line strength along the Harvard
spectral classification sequence, at the level you are expected to remember
it for this course.
The Harvard sequence classifies stars according to the strength of their
absorption lines. There are various ways of presenting this information. In
table 9.1 I present a summary of the trends in text form (see footnote 1). Figure 9.1 shows
the important trends in line strength in a graphical sketch.
Table 9.1: Harvard spectral classification
Spectral Type   Characteristics
O               Blue-white stars with few lines.
                Strong He II lines
                He I absorption lines becoming stronger
B               Blue-white stars
                He I lines peak at B2
                H I (Balmer) lines increasing in strength
A               White stars
                Balmer lines strongest at A0, becoming weaker
                Ca II lines becoming stronger
F               Yellow-white
                Ca II lines strengthen as Balmer lines weaken
                Neutral metal lines (Fe I, Cr I) appear
G               Yellow (solar type).
                Ca II lines continue to strengthen
                Neutral metal lines getting stronger
K               Cool orange
                Ca II (Fraunhofer H & K) lines peak at K0
                Spectra dominated by neutral metal lines
M               Cool red
                Spectra dominated by molecular absorption bands (especially TiO and VO)
                Strong neutral metal lines
What is the physical basis for the Harvard sequence? Since it is based
on absorption line strengths, we must try and understand what controls the
absorption line strength in stars. Why does one star have strong hydrogen
lines, and another have weak hydrogen lines? Our first guess might be that
it is related to the abundance of hydrogen in the star’s photosphere. Some
easy observations show that this is not the case; figure 9.2 shows the Orion
nebula. Located halfway down Orion’s sword, this cloud of dust and gas is a
stellar nursery. The stars we see here are only a million years old. Crucially,
they have all formed from the same cloud of gas, and so we expect them
to have the same composition. Nevertheless, the familiar Harvard sequence
can still be seen in the young stars of Orion. So the abundance of elements
in stars is not the primary cause of their line strength variations.
9.2
Line strengths
To see what factors control line strengths, consider the first line of the Balmer
series, Hα. An Hα absorption line is caused by an electron absorbing a
photon and moving from the n = 2 energy level to the n = 3 level. Of
course, for this to occur means that some electrons had to be in the n = 2
energy level in the first place. Since an excited electron will tend to decay
into the ground state (n = 1), how do the electrons get into the n = 2 level
in the first place?
Electrons can be excited into higher energy levels by two mechanisms. As
we have seen, they can absorb photons of the correct energy. This process is
particularly important in stellar atmospheres. Collisions between atoms can
also excite electrons by passing energy from one atom to another. Both of
these processes depend upon the temperature; the higher the temperature,
the higher the mean energy of atoms and photons, and more photons or
atoms are capable of exciting electrons. Therefore, we might expect the
number of electrons in the n = 2 energy level of hydrogen to increase with
increasing temperature.
Footnote 1: Note that in the table the term metal is used to denote any element heavier than
helium. This is a standard convention in astronomy. It arises because hydrogen and
helium are by far the most abundant elements in the Universe.
Figure 9.2: The Orion Nebula, as seen from the Hubble Space Telescope.
More than three thousand stars appear in this image, with spectral types
ranging from mid-O to early-M. This is despite all these stars being formed
from the same cloud of gas and dust.
9.2.1
The Boltzmann equation
To understand this process in a quantitative way, we need to return to statistical physics, which we discussed earlier. Remember that, in thermal
equilibrium, the probability of a particle having energy E was given by
$$ P(E) \propto e^{-E/kT} . \qquad (9.1) $$
If we are comparing two energy levels (labelled 1 and 2), then the ratio of
the probability P2 that an electron is in level 2 to the probability P1 that
an electron is in level 1 is given by
$$ \frac{P_2}{P_1} = \frac{e^{-E_2/kT}}{e^{-E_1/kT}} = e^{-(E_2 - E_1)/kT} . \qquad (9.2) $$
Suppose E2 is greater than E1. Then, as the temperature tends towards
zero, the quantity −(E2 − E1)/kT tends towards −∞, and P2/P1 tends
towards zero. In this case, all the electrons would be in level 1. However, as
the temperature increases, the proportion of electrons in energy level 2 also
increases.
In many atoms, there may exist many quantum states available to an
electron which have the same energy. These quantum states are said to be
degenerate. We define gn to be the number of states with energy En . Then,
the ratio of the probability that an electron will be found in any of the g2
states with energy E2 , to the probability that it will be found in any of the
g1 states with energy E1 is given by
$$ \frac{P(E_2)}{P(E_1)} = \frac{g_2}{g_1} e^{-(E_2 - E_1)/kT} . \qquad (9.3) $$
Since astronomical objects contain very large numbers of atoms, the number
of atoms N(E2) with energy E2 is, to very good accuracy, proportional to the probability that
an atom has energy E2. Thus, the ratio of the numbers of atoms in one
energy level to another is given by the Boltzmann equation
$$ \frac{N(E_2)}{N(E_1)} = \frac{g_2}{g_1} e^{-(E_2 - E_1)/kT} . \qquad (9.4) $$
Let’s look at a concrete example of the Boltzmann equation, and work
out the relative populations of the n = 2 and n = 1 energy levels in hydrogen.
Recall from last week that the energy of an electron orbit with quantum
number n was given by
$$ E_n = - \frac{W}{n^2} , $$
where W is the ionisation energy of the atom (13.6 eV for hydrogen). We
also need to know the degeneracy of the n = 2 and n = 1 levels. For this
we need the full quantum mechanical theory, but I’ll simply state there are
2 quantum states with energy E1 and 8 quantum states with energy E2 .
Therefore the number of electrons in state n = 2, divided by the number of
electrons in state n = 1 is given by
$$ \frac{N(E_2)}{N(E_1)} = \frac{8}{2} \, e^{-[(-13.6 \ {\rm eV}/2^2) - (-13.6 \ {\rm eV}/1^2)]/kT} , $$
or
$$ \frac{N(E_2)}{N(E_1)} = 4 e^{-10.2 \ {\rm eV}/kT} . $$
Figure 9.3: The number of electrons in n = 2 (N2 ) divided by the total number of electrons N1 + N2 for hydrogen gas, as determined by the Boltzmann
equation.
Figure 9.3 shows the number of electrons in energy level n = 2, divided
by the total number of electrons, derived using the formula above. We can
see that the number of electrons in n = 2 is a rapidly rising function of
temperature.
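To get a feel for how steeply this fraction rises, the Boltzmann ratio for hydrogen can be evaluated directly. The sketch below is illustrative only: the function names are my own, and the Boltzmann constant in eV per kelvin is a value I have supplied rather than one taken from the text.

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV per kelvin (supplied here)


def n2_over_n1(temperature_k):
    """Boltzmann ratio N(E2)/N(E1) for hydrogen: 4 exp(-10.2 eV / kT)."""
    return 4.0 * math.exp(-10.2 / (K_BOLTZMANN_EV * temperature_k))


def fraction_in_n2(temperature_k):
    """N2 / (N1 + N2), the quantity described for figure 9.3."""
    ratio = n2_over_n1(temperature_k)
    return ratio / (1.0 + ratio)


if __name__ == "__main__":
    for temperature in (5000, 10000, 15000, 20000):
        print(temperature, "K:", fraction_in_n2(temperature))
```

Even at 10,000 K only a tiny fraction of the atoms have their electron in n = 2, but that fraction keeps climbing as the temperature goes up.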
This provides us with something of a puzzle. Recall that the Balmer
lines are produced by electrons in the n = 2 level absorbing photons. The
Balmer lines reach their peak strength at spectral type A0, corresponding
to temperatures of ∼ 9500 K. Clearly, according to the Boltzmann equation,
at temperatures higher than 9500 K an even greater number of electrons will
be excited to the n = 2 level. If this is the case, why do the Balmer lines
decrease in strength towards the hotter O and B stars?
9.2.2
The Saha equation
The answer lies in considering the number of atoms in different states
of ionisation. Consider the ionisation of a species I
$$ A_I + h\nu \rightleftharpoons A_{I+1} + e^- , $$
where $h\nu > E_I$, the ionisation energy of the species I. Clearly, as the temperature increases, the number of photons with $h\nu > E_I$ will increase and
the degree of ionisation of an element will increase correspondingly. This is
why the Balmer lines decrease in strength above T ∼ 9500 K; it is due to
the rapid ionisation of hydrogen above about 10,000 K. This process is illustrated in
figure 9.4.
Figure 9.4: The electron’s position in the hydrogen atom at different temperatures. In (a), the electron is in the ground state. Balmer absorption
lines can only be produced when the electron is excited to the n = 2 level,
as shown in (b). In (c) the atom has been ionised, and no longer produces
absorption lines.
Just as we did for electron excitation above, we can apply statistical
physics to the process of ionisation to derive the Saha equation
$$ \frac{N_{I+1}}{N_I} = \frac{2 Z_{I+1}}{N_e Z_I} \left( \frac{2 \pi m_e k T}{h^2} \right)^{3/2} e^{-E_I/kT} . \qquad (9.5) $$
Although the derivation of this equation is beyond the scope of this course, we can at least examine
it to see if it makes intuitive sense. The ratio given by the Saha equation is proportional to
$e^{-E_I/kT}$; we should now expect this from our familiarity with statistical
physics. The electron density $N_e$ also enters the equation. This is not too
surprising. Ionisation involves the creation of a free electron. The more free
electrons that are present, the more likely an ionised atom is to capture an
electron. Therefore the amount of ionisation should decrease as the electron
density increases. This is what we see in the Saha equation. The term Z I
also appears in the Saha equation. This is a quantity known as the partition
function. It represents a weighted sum of the number of ways a species can
arrange its electrons with the same energy, with more energetic (and hence
less likely) configurations receiving less weight. One very important point
to keep in mind; all of these results in statistical physics assume thermal
equilibrium. The Saha equation, like the Boltzmann equation, is only strictly
valid for systems in thermal equilibrium.
If we combine the Saha and Boltzmann equations, we can calculate the
number of electrons in the n = 2 level of hydrogen as a function of temperature. The results are shown in figure 9.5. The number of electrons in the
n = 2 level peaks around 9900 K. This is in reasonable agreement with the
temperature of A0 stars (around 9500 K), where the Balmer line strength
peaks.
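The combined calculation can be sketched in a few lines of Python. Several inputs here are my own assumptions rather than values given in these notes: an electron pressure of 20 N m^-2 (so that N_e = P_e / kT), a partition function of 2 for neutral hydrogen (counting only the ground state) and of 1 for ionised hydrogen. The function names are also hypothetical, so read this as a rough illustration of the method, not a reproduction of figure 9.5.

```python
import math

K_B = 1.381e-23   # Boltzmann constant (J/K)
H = 6.626e-34     # Planck constant (J s)
M_E = 9.109e-31   # electron mass (kg)
EV = 1.602e-19    # one electron volt in joules

ELECTRON_PRESSURE = 20.0  # ASSUMED electron pressure (N m^-2), not from the notes
Z_NEUTRAL = 2.0           # ASSUMED partition function of neutral hydrogen
Z_IONISED = 1.0           # ASSUMED partition function of ionised hydrogen


def ionised_over_neutral(t):
    """Saha equation (9.5) for hydrogen, with N_e = P_e / kT."""
    kt = K_B * t
    n_e = ELECTRON_PRESSURE / kt
    return (2.0 * Z_IONISED / (n_e * Z_NEUTRAL)
            * (2.0 * math.pi * M_E * kt / H**2) ** 1.5
            * math.exp(-13.6 * EV / kt))


def n2_over_n1(t):
    """Boltzmann ratio for the n = 2 and n = 1 levels of hydrogen."""
    return 4.0 * math.exp(-10.2 * EV / (K_B * t))


def n2_over_total(t):
    """Fraction of all hydrogen (neutral plus ionised) with its electron in n = 2."""
    excited = n2_over_n1(t) / (1.0 + n2_over_n1(t))   # N2 / (N1 + N2)
    neutral = 1.0 / (1.0 + ionised_over_neutral(t))   # neutral / (neutral + ionised)
    return excited * neutral


def peak_temperature(t_min=5000, t_max=25000, step=10):
    """Temperature (K) on a simple grid at which n2_over_total is largest."""
    return max(range(t_min, t_max + 1, step), key=n2_over_total)


if __name__ == "__main__":
    print("peak near", peak_temperature(), "K")
```

With these assumed inputs the maximum falls in the neighbourhood of 10,000 K. Changing the assumed electron pressure shifts the peak, which is a reminder that the ionisation balance depends on density as well as temperature.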
9.3
A physical interpretation of the Harvard sequence
Finally, we are in a position to understand the Harvard spectral sequence
as a sequence in temperature. The line strengths of species vary along the
sequence due to the interplay of electron excitation and ionisation.
• Balmer lines of hydrogen - at low temperatures (spectral types K
to A) the excitation effect dominates and line strength rises as the
population of the n = 2 energy level rises. However, at high temperatures (spectral types A to O) the ionisation of hydrogen increases and
the Balmer line strength drops.
Figure 9.5: The number of electrons in the n = 2 level of hydrogen, divided
by the total number of hydrogen atoms. This calculation takes account
of electron excitation (the Boltzmann equation) and ionisation (the Saha
equation). The peak occurs at approximately 9900 K, in good agreement
with the temperature of early-A stars, where the Balmer line strength peaks.
• Metal lines - at low temperatures (spectral types M to G), lines
from neutral metals dominate, but neutral metal line strengths (e.g.
Ca I, Fe I) decrease from K to G as the gas becomes ionised. From
spectral types G to A the lines from singly ionised metals (e.g. Mg II,
Si II) become more prominent as the temperature rises and ionisation
increases, but eventually the gas becomes even more highly ionised and
these lines also decrease in strength between spectral types A and B.
Some metals (e.g. Fe and Ca) have quite low ionisation energies, and
Ca II lines are strong between G and M-type stars.
• Molecular bands - the M stars are dominated by molecular bands,
especially those from titanium oxide (TiO) and vanadium oxide (VO).
They generally become weaker as the temperature increases because
those molecules are dissociated to form atoms of Ti, V and O. Like
excitation and ionisation, dissociation can also occur due to collisions
or the absorption of photons.
9.3.1
Stellar temperatures re-visited
The Saha and Boltzmann equations give us two more ways of measuring
the temperature of the stellar photosphere. Remember, the photospheric
temperature can be derived from the peak of the continuum spectrum (the
Wien temperature), or from measurements of the flux at two wavelengths
(the colour temperature), or from the bolometric luminosity and distance
(the effective temperature).
The two new methods both rely on measurements of line strengths. The excitation temperature is measured from
the Boltzmann equation, after using the relative line strengths to measure
the population of electrons in different excited states. The ionisation
temperature is measured from the relative populations in different ionisation
stages (for example He I and He II), using the Saha equation.
Table 9.2 shows the temperature of the Sun’s photosphere, measured
using some of these techniques. The temperature measurements do not
agree with each other, which by now should come as no surprise to you.
All of these temperature estimates are only approximate, because they all
assume the Sun’s photosphere is in perfect thermal equilibrium, which is not
true!
Table 9.2: Different temperature estimates of the Sun’s photosphere
Method                    Temperature
Colour temperature        5640 K
Wien temperature          6200 K
Effective temperature     5778 K
Excitation temperature    5600 K
Ionisation temperature    6200 K
Table 9.3: Solar abundances by mass
Element   Abundance
H         73.4%
He        24.9%
C         0.29%
N         0.10%
O         0.77%
Fe        0.16%
9.4
Abundances from line strengths
Once the line strength variations due to temperature have been accounted
for, it turns out we can see line strength variations caused by differences in
the abundances of elements in the stellar photosphere. It was found that
differences in the abundances of main sequence stars of the same population
were very small. By far the most abundant element is hydrogen; the
abundances of elements in the Sun are shown in table 9.3.
However, there are abundance variations between stars. Stars in the
haloes of galaxies (Population II stars) have lower metal abundances than
stars in the disk (Population I stars). This observation allowed astronomers
to realise that the Population II stars are an older generation than the
Population I stars.
Chapter 10
Gravitational Astrophysics
Throughout this course, I hope you have been struck by how astronomy
is a science of remote sensing; using our understanding of physics we can
interpret the observations we make of the night sky, and deduce from them
facts about the objects of our study. In no area of astrophysics is this
more apparent than when we use Newton’s law of gravity to understand the
motions of objects in gravitationally bound systems (for example, binary
stars, galaxies, planetary systems). The application of gravity leads to some
of the most subtle and elegant measurement techniques in astrophysics. We
will spend the remainder of the course studying these techniques, so we had
better have a firm grasp of gravity itself.
10.1
History
At the end of the 16th Century, the Danish nobleman Tycho Brahe was busy
developing a huge collection of observations of the Solar system. As the official astronomer to the Holy Roman empire, Brahe had the best observatory
in the world, and was credited with taking the most accurate astronomical
observations of the time. As well as being famous for the painstaking accuracy of his work, Brahe is also famous for his nose. Having lost part of his
nose in a duel, Brahe replaced it with a prosthetic nose made of copper.
Brahe’s observations were put to good use by Johannes Kepler; a German mathematician and astronomer who was, for a time, Brahe’s assistant.
Kepler wanted to deduce the rules which governed the Solar system; rules
which he believed were created by God. He used Brahe’s observations to
tease out 3 laws which all bodies in the Solar system obey. Kepler’s laws
were not a physical theory; there was no framework in place to understand
them. Instead they were a tour-de-force of empirical deduction. Kepler’s
laws are still used by astronomers today, and played a crucial role in the
development of a theory of gravity.
10.2
Kepler’s Laws
10.2.1
The 1st Law
“The orbit of every planet is an ellipse with the Sun at a focus”
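Figure 10.1: Planetary orbits: an ellipse with the Sun at a focus. (The figure labels the radius r, the angle θ, the semi-major axis a, the semi-minor axis b and the centre-to-focus distance ea.)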
Derived from Brahe’s observations of the orbit of Mars, this observation
is a very useful result in orbital theory. Although we won’t use them much in
this course, where we will mostly consider circular orbits, a few properties of
ellipses are summarised here. An ellipse has semi-major axis a, semi-minor
axis b and eccentricity e. An ellipse has two foci - each focus of an ellipse
is a distance ae from the centre. A circle is a special case of an ellipse with
e = 0; both foci of a circle lie at the centre of the circle. The size of the
semi-minor axis b, the semi-major axis a and the eccentricity e are related
by
$$ e^2 = 1 - \left( \frac{b}{a} \right)^2 . $$
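These relations are easy to play with numerically. The sketch below uses hypothetical function names, and the values quoted for Mars (a of about 1.524 AU and e of about 0.0934) are approximate figures supplied for illustration, not data from these notes.

```python
import math


def eccentricity(a, b):
    """Eccentricity of an ellipse from its semi-major axis a and semi-minor axis b."""
    return math.sqrt(1.0 - (b / a) ** 2)


def semi_minor_axis(a, e):
    """Semi-minor axis from the semi-major axis a and the eccentricity e."""
    return a * math.sqrt(1.0 - e**2)


def focus_offset(a, e):
    """Distance ae of each focus from the centre of the ellipse."""
    return a * e


if __name__ == "__main__":
    a_mars, e_mars = 1.524, 0.0934  # approximate, illustrative values (AU, dimensionless)
    print("b  =", semi_minor_axis(a_mars, e_mars), "AU")
    print("ae =", focus_offset(a_mars, e_mars), "AU")
```

Notice how close b is to a even for Mars, one of the more eccentric planetary orbits; this is part of why circular orbits are such a good approximation in what follows.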
Although the equation of an ellipse can be written in Cartesian co-ordinates
(x,y) it is more useful to use polar co-ordinates with a focus at the origin
(as in figure 10.1). In this case, the equation of an ellipse is given by
$$ r = \frac{a (1 - e^2)}{1 \pm e \cos\theta} . $$
10.2.2
The 2nd Law
“For any planet the radius vector sweeps out equal areas in equal
times”
Figure 10.2: Kepler’s 2nd Law. (The figure marks four positions P1 to P4 on the orbit and the two swept areas A1 and A2.)
Kepler’s 2nd law is illustrated in figure 10.2. A1 is the area swept out
as the planet moves from P1 to P2. A2 is the area swept out as the planet
moves from P3 to P4. If the time taken to go from P1 to P2 equals the
time taken to go from P3 to P4, then A1 equals A2. Just from looking
at figure 10.2, you should be able to see that this means a planet will move
faster when it is closer to the Sun.
10.2.3
The 3rd Law
“The cubes of the semi-major axes of the planetary orbits are
proportional to the squares of the planetary periods”
Kepler’s 3rd law is a bit of a mouthful, but is more succinctly expressed
in equation form,
$$ a^3 \propto P^2 . \qquad (10.1) $$
It turns out Kepler’s third law is incredibly useful, and we will use it again
and again in this section of the course. Because of that, we really need to
work out the constant of proportionality in equation (10.1) above. To do
so, we need a full theory of gravity.
10.3
Newton’s law of gravity
On the 5th July 1687, Isaac Newton published his “Philosophiæ Naturalis
Principia Mathematica”. In it he set himself the incredible task of writing
down the laws which governed the behaviour of everything in the Universe,
from the smallest mote of dust to the planets themselves. It took him just
three sentences:
• A body continues in a state of rest or uniform motion in a straight line
unless compelled by some external force to act otherwise;
• The net force on an object is equal to the mass of the object multiplied
by its acceleration;
• When a first body exerts a force on a second body, the second body exerts a force on the first body which is equal in magnitude and opposite
in direction.
In the Principia, Newton also produced a derivation of Kepler’s laws
from first principles. Since the planets are not moving in a straight line,
some force must act upon them. Newton was able to show that his force of
gravity reproduced Kepler’s laws in full. Newton’s gravitational force was, of
course,
$$ F = \frac{G m_1 m_2}{r^2} , \qquad (10.2) $$
where $m_1$ and $m_2$ are the masses of the two attracting bodies, and r is the
distance between them. $G = 6.673 \times 10^{-11}$ m$^3$ kg$^{-1}$ s$^{-2}$ is the gravitational
constant.
It’s worth mentioning here that gravity is a tremendously weak force.
The electrostatic repulsion between two protons is $e^2 / 4\pi\epsilon_0 r^2$, whilst the
gravitational attraction between them is $G m_p^2 / r^2$. The ratio of these two
quantities is $e^2 / 4\pi\epsilon_0 G m_p^2$. This expression is independent of radius, so the
relative strength of the two forces is the same throughout all space. The value of
this expression is about $10^{36}$! The electrostatic force is $10^{36}$ times stronger everywhere than the force of gravity, and yet it is gravity, not electromagnetism,
that controls the motions of the stars and planets. This is because matter
is mostly neutral. Large amounts of matter have negligible net charge, but
very large masses, allowing gravity to become the dominant force.
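The size of this ratio is easy to verify. In the sketch below the proton charge, proton mass and permittivity of free space are standard values that I have supplied, and the function names are my own.

```python
import math

E_CHARGE = 1.602e-19    # proton charge (C)
EPSILON_0 = 8.854e-12   # permittivity of free space (F/m)
G = 6.673e-11           # gravitational constant (m^3 kg^-1 s^-2)
M_PROTON = 1.673e-27    # proton mass (kg)


def electrostatic_force(r):
    """Repulsive force (N) between two protons a distance r (m) apart."""
    return E_CHARGE**2 / (4.0 * math.pi * EPSILON_0 * r**2)


def gravitational_force(r):
    """Attractive gravitational force (N) between two protons a distance r (m) apart."""
    return G * M_PROTON**2 / r**2


def force_ratio(r):
    """Electrostatic force divided by gravitational force at separation r."""
    return electrostatic_force(r) / gravitational_force(r)


if __name__ == "__main__":
    # An atomic-scale separation and an astronomical one
    for separation in (1.0e-10, 1.0e11):
        print(separation, "m:", force_ratio(separation))
```

Both separations give the same ratio, as they must, since each force falls off as the inverse square of the distance.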
An understanding of gravity is an incredibly versatile tool for an astrophysicist because it can be used to measure a basic property that, so far, we
have no way of measuring; mass. By manipulating the laws of gravity we
can make observations that allow us to measure the mass of astronomical
objects as small as tiny satellites of Jupiter and as large as giant clusters of
galaxies. To do so however, we need to spend a little bit of time developing
some tools to use later in the course.
10.4
Some results on gravity
We’re going to derive some very useful results which follow from Newton’s
law of gravity. We will need these results later on in the course. But before
we do, let’s return to Kepler’s 3rd law and show that it can be derived
from Newton’s law of gravity (finding the constant of proportionality in the
process).
10.4.1
Kepler’s 3rd law revisited
As in the rest of this course, we will consider circular orbits only. A full
treatment of elliptical orbits is possible, but only serves to complicate the
mathematics; circular orbits capture all of the essential physics. In the
Solar system, Kepler stated that the planets orbit around the Sun. This is
because the Sun is much heavier than the planets. In the general case of
two bodies orbiting under their mutual gravitational attraction, both bodies
perform circular orbits around the centre of mass. This situation is shown
in figure 10.3.
We can derive Kepler’s third law by equating force and mass × acceleration. The acceleration of an object in a circular orbit is $v^2/r$, so for star b
(using the notation from figure 10.3),
$$ \frac{G M_a M_b}{a^2} = \frac{M_b v_b^2}{r_b} . $$
Similarly, for star a,
$$ \frac{G M_a M_b}{a^2} = \frac{M_a v_a^2}{r_a} . $$
Dividing the first equation by $M_b$ and the second by $M_a$, and adding the results, we find
$$ \frac{G (M_a + M_b)}{a^2} = \frac{v_a^2}{r_a} + \frac{v_b^2}{r_b} . $$
Figure 10.3: Circular orbits around the centre of mass (COM). The centre
of mass is marked with a cross, whilst the circular orbit of star b around the
COM is shown as a dotted line. (The figure labels the masses $M_a$ and $M_b$, their speeds $v_a$ and $v_b$, their orbital radii $r_a$ and $r_b$, and the separation $a = r_a + r_b$.)
Now, we use one of the neat mathematical tricks that crop up throughout
gravitational astrophysics. The distance round a circular orbit of radius r is
2πr, and the time taken to go round the orbit is P . Therefore we can write
$$ v = \frac{2 \pi r}{P} , $$
and substitute this into our equation above to get
$$ \frac{G (M_a + M_b)}{a^2} = \frac{4\pi^2 r_a}{P^2} + \frac{4\pi^2 r_b}{P^2} = \frac{4\pi^2}{P^2} (r_a + r_b) = \frac{4\pi^2}{P^2} a . $$
Re-arranging, we find
$$ \frac{G P^2 (M_a + M_b)}{4\pi^2} = a^3 , \qquad (10.3) $$
which you can see is Kepler’s third law, $P^2 \propto a^3$. The form of Kepler’s third
law given in equation (10.3) crops up again and again. It is worth spending
some time memorising it.
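Equation (10.3) can be used in either direction: to predict a period from known masses, or to find the total mass from a measured period and separation. The sketch below does both. The function names are hypothetical, and the Earth-Sun numbers are approximate values I have supplied for illustration.

```python
import math

G = 6.673e-11  # gravitational constant (m^3 kg^-1 s^-2), as quoted in the text


def orbital_period(a, m_a, m_b):
    """Period (s) of two bodies of mass m_a and m_b (kg) separated by a (m)."""
    return 2.0 * math.pi * math.sqrt(a**3 / (G * (m_a + m_b)))


def total_mass(a, period):
    """Total mass (kg) of a binary with separation a (m) and period (s)."""
    return 4.0 * math.pi**2 * a**3 / (G * period**2)


if __name__ == "__main__":
    a_earth = 1.496e11   # Earth-Sun separation (m), approximate
    m_sun = 1.989e30     # mass of the Sun (kg), approximate
    m_earth = 5.97e24    # mass of the Earth (kg), approximate
    period = orbital_period(a_earth, m_sun, m_earth)
    print("period:", period / 86400.0, "days")
    print("total mass recovered:", total_mass(a_earth, period), "kg")
```

The second function is the one that matters for the rest of the course: a period and a separation are both observable, and together they give a mass.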
10.4.2
Gravitational potential energy
It is useful to define the gravitational potential energy: the work done by gravity as
two bodies are separated to an infinite distance from an initial separation r. We
start by asking how much work gravity does when they move apart by a small distance
dr. Since work = force × distance,
$$ {\rm Work} = - \frac{G m_1 m_2}{r^2} \, dr . $$
Why the minus sign? Because the force and the displacement dr point in
opposite directions. If we choose our radius axis so that dr is positive, then
the gravitational force is $-G m_1 m_2 / r^2$. The total work done in moving the bodies
to infinity is the sum of all the little steps along the way,
$$ {\rm Grav.\ P.E.} = - \int_r^\infty \frac{G m_1 m_2}{r^2} \, dr = \left[ \frac{G m_1 m_2}{r} \right]_r^\infty = - \frac{G m_1 m_2}{r} . \qquad (10.4) $$
This quantity is negative; as expected because we have to put energy in to
separate the two bodies. The gravitational potential energy leads to the
concept of escape velocity. One object will escape the gravitational field
of another if its kinetic energy is larger than the size of the gravitational
potential well,
$$ \frac{1}{2} m_2 v^2 > \frac{G m_1 m_2}{r} , $$
so that
$$ v_{\rm esc} > \left( \frac{2 G m_1}{r} \right)^{1/2} . \qquad (10.5) $$
10.4.3
Gravitational theorem #1
“A body inside a spherical shell of matter experiences no net
gravitational force from that shell”
Figure 10.4: A point P inside a hollow, thin shell experiences no net gravitational force. (The figure shows two cones of solid angle Ω drawn from P, with labels A1, R1, A2 and R2.)
This turns out to be straightforward to derive. We imagine the shell to be thin,
with a density of ρ kg per unit surface area. We pick a point P inside
the shell and draw two cones of the same solid angle radiating out from the
point P, so that they include two small areas of the shell on opposite sides:
these two areas will exert gravitational attraction on a mass at P in opposite
directions. We will show that these forces exactly cancel out. The situation
is shown in figure 10.4.
Since the cones have the same solid angle Ω, and the area of the base of
a cone of solid angle Ω is $A = \Omega r^2$, we see that the ratio of the areas A1 and
A2 at distances r1 and r2 is given by $A_1/A_2 = r_1^2/r_2^2$. Since the masses of
the bits of the shell are proportional to the areas, the ratio of the masses of
the shell sections is also $r_1^2/r_2^2$. It follows that the ratio of the gravitational
forces from the two bits of shell is
$$ F_1/F_2 = \frac{M_1 r_2^2}{M_2 r_1^2} = \frac{r_1^2 r_2^2}{r_2^2 r_1^2} = 1 . \qquad (10.6) $$
So the forces on a particle at P due to these sections of shell are the same
size and in opposite directions; they cancel exactly. In fact, the gravitational
pull from every small part of the shell is balanced by a part on the opposite
side; you just have to construct a lot of cones going through P to see this. So
the net force on a particle inside the shell is zero.
What if the shell is not thin? A particle inside a spherical cavity in a
dust cloud is such a situation. We can consider it as an infinite number
of thin shells nested inside each other. The force from each shell is zero,
so the net force inside a cavity like this is also zero. This result will be
tremendously important when we look at measuring mass in galaxies.
10.4.4
Gravitational theorem #2
“The gravitational force on a body that lies outside a closed
spherical shell of matter is the same as it would be if all the
shell’s matter were concentrated into a point at its centre.”
We’ve already assumed that this theorem is true, when we derived Kepler’s 3rd law; we implicitly assumed we could treat the stars as point masses,
even though they are spheres of finite size. There is a beautiful and elegant
proof of this theorem, which can be written in about three lines, using a
different way of writing the law of gravity known as Gauss’s theorem. Unfortunately for you, the maths used is (I believe) more advanced than you
have learned to date. By contrast, Newton’s derivation took him several
pages to write and years to figure out! Therefore we will take this theorem
to be true without proof; the curious will find complete derivations using Newton’s method at http://galileo.phys.virginia.edu/classes/152.mf1i.spring02/GravField.htm or http://en.wikipedia.org/wiki/Shell_theorem. Gauss’s theorem is really quite beautiful mathematics, and
allows you to easily work out the gravity from complex objects - you can find
a good basic introduction at http://www.pgccphy.net/1030/gravity.pdf
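Although we are not proving theorem #2, both shell theorems can at least be checked numerically. The sketch below is my own construction rather than either Newton's or Gauss's method: it slices a thin uniform shell into rings about the line joining the centre to the test mass, and adds up the pull of each ring along that line. The function name and the choice of units (G = 1, shell mass = 1, shell radius = 1) are assumptions made for convenience.

```python
import math


def shell_force_towards_centre(d, radius=1.0, shell_mass=1.0, g=1.0, m=1.0,
                               n_rings=20000):
    """Net inward force on a test mass m at distance d from the shell's centre.

    The shell is cut into n_rings rings of equal width in u = cos(theta),
    where theta is measured from the axis through the test mass. Sideways
    pulls cancel around each ring, so only the axial component is summed.
    The point d == radius (exactly on the shell) is not handled.
    """
    du = 2.0 / n_rings
    ring_mass = shell_mass * du / 2.0  # equal steps in cos(theta) hold equal mass
    total = 0.0
    for i in range(n_rings):
        u = -1.0 + (i + 0.5) * du                  # midpoint of this ring
        axial_offset = d - radius * u              # test mass minus ring, along the axis
        distance_sq = radius**2 + d**2 - 2.0 * radius * d * u
        total += g * m * ring_mass * axial_offset / distance_sq**1.5
    return total


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print("inside,  d =", d, "force =", shell_force_towards_centre(d))
    for d in (1.5, 3.0):
        print("outside, d =", d, "force =", shell_force_towards_centre(d),
              "point mass =", 1.0 / d**2)
```

Inside the shell the sum comes out at zero to within the accuracy of the integration, as theorem #1 requires, while outside it matches the force from a point mass at the centre, as theorem #2 claims.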
Chapter 11
Measuring mass
Having dealt with the theory of gravity, let’s start to use it to measure mass.
The mass of an object is probably the most fundamental and important
measurement we can make for an object (for example, a star’s brightness,
lifetime and evolution are mainly determined by its mass). In almost all
cases, measuring the mass of an object involves measuring the effect of its
gravity on nearby objects. We will start with an example which is close to
home; the masses of planets in our solar system.
11.1
Planets in the solar system
If a planet has a moon orbiting it, we can use Kepler’s 3rd law to calculate
the mass of the planet, relative to the mass of the Sun. Figure 11.1 shows
the geometry. A planet of mass m orbits the Sun (mass M) with a semi-major axis of a. The planet also has a moon, with a mass m1, which orbits
the planet with a semi-major axis a1 . Using Kepler’s 3rd law as written in
equation (10.3), we find the following equation for the period of the planet’s
orbit around the Sun
$$P = \frac{2\pi}{\sqrt{G}} \left( \frac{a^3}{m + M} \right)^{1/2},$$
and a corresponding equation for the period of the moon’s orbit around the
planet
$$P_1 = \frac{2\pi}{\sqrt{G}} \left( \frac{a_1^3}{m + m_1} \right)^{1/2}.$$
Figure 11.1: Geometry of a planet with a moon, orbiting the Sun (not to
scale!)
We divide the two equations to get
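$$\left( \frac{P}{P_1} \right)^2 = \left( \frac{a}{a_1} \right)^3 \frac{m + m_1}{m + M} = \left( \frac{a}{a_1} \right)^3 \frac{m}{M} \, \frac{1 + m_1/m}{1 + m/M}.$$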
But here we can use a (very good) approximation. Since the mass of the
moon is much less than the mass of the planet (the Moon is around 1% the
mass of the Earth), we have 1 + m1 /m ≈ 1, and using the same argument
for the planet and Sun, 1 + m/M ≈ 1. Re-arranging, we find
$$\frac{m}{M} = \left( \frac{P}{P_1} \right)^2 \left( \frac{a_1}{a} \right)^3.$$
Thus the relative mass of any planet with a moon can be found once the
periods and semi-major axes of the moon and planet are known. The periods
are easy to measure by tracking the motion of planets in the night sky, and
the same data can yield distances using the parallax method.
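As a quick numerical check of the relation above, here is a minimal Python sketch applied to the Earth and Moon. The function name `relative_mass` and the rounded orbital values are assumptions chosen for illustration; they are not taken from the notes.

```python
def relative_mass(P_planet, P_moon, a_planet, a_moon):
    """Planet mass as a fraction of the Sun's mass: m/M = (P/P1)^2 (a1/a)^3.

    Periods must share one unit, and semi-major axes must share one unit.
    """
    return (P_planet / P_moon) ** 2 * (a_moon / a_planet) ** 3


# Illustrative (rounded) values for the Earth and the Moon:
# periods in days, semi-major axes in km.
earth_over_sun = relative_mass(
    P_planet=365.256, P_moon=27.322, a_planet=1.496e8, a_moon=3.844e5
)
print(f"m_Earth / M_Sun is roughly {earth_over_sun:.2e}")
```

The result is about 3 millionths, so the Sun is roughly 330,000 times more massive than the Earth. No absolute masses were needed to find this.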
11.1.1
Absolute planetary masses
The equation above yields the masses of the planets in the Solar system,
relative to the mass of the Sun, M. If we can measure the absolute mass
$m_p$ of a single planet, we can find the mass of the Sun from
$M = m_p / (m_p/M)$, where $m_p/M$ is that planet’s relative mass from the
equation above. In a rare case of astronomy progressing by experiment, the
mass of the Earth was measured in a beautiful experiment by Henry Cavendish
in 1798 (more than 100 years after the Principia was published).
Figure 11.2: A sketch of Cavendish’s experiment to measure the mass of the
Earth
Cavendish’s experiment is shown (in sketch form) in figure 11.2. He
attached a bar holding small masses to a torsion wire and placed two much
larger masses close by. The large masses were equidistant from the smaller,
test masses - a distance r. By considering the forces on the small masses
due to the large masses, and the Earth, Cavendish could calculate the mass
of the Earth.
The force on the test masses due to the large masses is
$$F_p = \frac{G m M_p}{r^2}, \qquad (11.1)$$
and the force on the test masses due to the Earth is
$$F_E = \frac{G m M_E}{r_E^2}. \qquad (11.2)$$
Dividing equation (11.2) by equation (11.1) gives
$$\frac{F_E}{F_p} = \frac{r^2 M_E}{r_E^2 M_p}$$
or
$$M_E = \frac{F_E}{F_p} \left( \frac{r_E}{r} \right)^2 M_p. \qquad (11.3)$$
All of the quantities on the right hand side of equation (11.3) can be
measured. $F_p$ is measured from the twist of the torsion wire. $F_E = ma$
can be found by measuring the gravitational acceleration a of the test
masses when dropped [1]. The radius of the Earth is easily measured using
geometrical techniques.
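The sketch below shows how equation (11.3) is used in practice. All the numbers are illustrative assumptions of roughly laboratory scale, not Cavendish’s data. In particular, the "measured" force `F_p` is manufactured from a modern value of G purely as a stand-in for the torsion-wire measurement.

```python
# Illustrative laboratory values (assumptions, not Cavendish's measurements).
m = 0.73          # test mass, kg
M_p = 158.0       # large mass, kg
r = 0.23          # separation of test mass and large mass, m
r_E = 6.371e6     # radius of the Earth, m
g = 9.81          # measured acceleration of a dropped test mass, m/s^2

# Stand-in for the torsion-balance measurement of F_p. In the real experiment
# this force comes from the twist of the wire, not from a known value of G.
G_MODERN = 6.674e-11
F_p = G_MODERN * m * M_p / r**2

# Force on the test mass due to the Earth, F_E = m a.
F_E = m * g


def earth_mass(F_E, F_p, r_E, r, M_p):
    """Equation (11.3): M_E = (F_E / F_p) (r_E / r)^2 M_p."""
    return (F_E / F_p) * (r_E / r) ** 2 * M_p


M_E = earth_mass(F_E, F_p, r_E, r, M_p)
print(f"F_p is about {F_p:.1e} N, giving M_E of about {M_E:.2e} kg")
```

Note how tiny `F_p` is compared with `F_E`: the whole difficulty of the experiment lies in measuring a force of order one ten-millionth of a newton.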
In this way, Cavendish determined the mass of the Earth, and hence
the absolute mass of all planets. Cavendish’s equipment was remarkably
sensitive. The force involved in twisting the torsion balance was very small,
roughly equivalent to the weight of a large grain of sand. To prevent air
currents and temperature changes from interfering with the measurements,
Cavendish placed the entire apparatus in a wooden box about 0.6 m thick,
3 m tall, and 3 m wide, all in a closed shed on his estate. Through two holes
in the walls of the shed, Cavendish used telescopes to observe the movement
of the torsion balance’s horizontal rod. The motion of the rod was only about
4 mm, and Cavendish had to account for the constant swaying of the rod,
which was never still. Cavendish’s experiment was repeated many times,
but his accuracy wasn’t bettered for nearly 100 years.
11.2
Stellar masses
It is phenomenal to consider that we can measure the mass of stars so
impossibly distant that the light we see from them is hundreds of years old.
All of the direct measurements of stellar masses we have come from stars in
multiple systems: collections of stars bound together by their own gravity.
The most important of these are the binary stars - two stars which orbit
each other around a common centre of mass. Binary stars are surprisingly
common: over half of the stars visible to the naked eye are actually in
binary systems. Binary systems are classified according to their
observational characteristics. Different types of binary systems provide
different ways of measuring mass, and some types of binary system allow a
rich set of data to be collected. In the sections that follow, we will
consider some types of binary star in turn.
[1] The distance s travelled in a time t, under constant acceleration a, is
$s = at^2/2$.
11.3
Visual Binaries
Figure 11.3: Circular orbits around the centre of mass (COM). The centre
of mass is marked with a cross, whilst the circular orbit of star b around the
COM is shown as a dotted line.
Remember that the resolving power of a telescope is not infinite; in
practice an image taken from a ground-based telescope has an image quality
dictated by the atmosphere. This is called seeing, and means that the typical
size of a stellar disc in a ground-based image is around one arcsecond [2].
Therefore, if the two stars in a binary are very close together, so that their
separation on the sky is less than an arcsecond, the light from the stars will
be blurred together. We will not see the stars as a binary system at all; such
a binary system is unresolved.
[2] Space-based telescopes can do rather better, being above the atmosphere
and hence limited by their optics.
Binaries in which we can clearly see both components are called visual
binaries. For a star to be a visual binary the components must be widely
separated, and both components must be bright enough to detect. Visual
binaries are very useful for measuring masses simply, with a minimum of
observations.
In some visual binaries, we are fortunate that the orbital period is short
enough that we can actually watch the stars move around their orbits. By
patiently watching visual binaries, we can measure the size of the orbits.
Figure 11.3 shows the geometry of a binary star system with circular orbits.
From the definition of the centre of mass, we know that the sizes of the
stars’ orbits, $r_a$ and $r_b$, are related through
$$m_a r_a = m_b r_b. \qquad (11.4)$$
Therefore the mass ratio (ma /mb ) is given by
$$\frac{m_a}{m_b} = \frac{r_b}{r_a}.$$
From our images, we can measure the angular sizes of the orbits αa , αb .
Since these angles are small, they are related to the sizes of the orbits by
ra = dαa and rb = dαb , where d is the distance to the binary. As a result,
the mass ratio is given by
$$\frac{m_a}{m_b} = \frac{\alpha_b}{\alpha_a},$$
and can be found by a simple measurement of the angular sizes of the orbits,
without knowing the distance to the stars.
If the distance is known (e.g. from a measured parallax), we can calculate
the physical sizes of the orbits, ra and rb . It follows that we can also measure
the binary separation, a = ra + rb and use this in Kepler’s third law to find
the total mass of the binary, since
$$m_a + m_b = \frac{4\pi^2 a^3}{G P^2},$$
and P is also directly measured from the stars’ orbit. Once we know the
mass ratio, and the total mass, we can find the individual masses through
some simple algebra. You should be able to convince yourselves that
$$m_b = \frac{m_a + m_b}{1 + m_a/m_b}.$$
Since all the terms on the right hand side of this equation can be measured,
it follows we can measure individual masses of stars within a visual binary
system, armed with nothing more than images of the stars as they orbit the
centre of mass, and a distance to the binary!
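The steps above can be strung together in a few lines of Python. This is a minimal sketch for a face-on, circular orbit. The function name `visual_binary_masses`, the rounded physical constants and the example binary are all assumptions made for illustration.

```python
import math

G = 6.674e-11                      # m^3 kg^-1 s^-2
PC = 3.086e16                      # one parsec in metres
YEAR = 3.156e7                     # one year in seconds
M_SUN = 1.989e30                   # kg
ARCSEC = math.pi / (180 * 3600)    # one arcsecond in radians


def visual_binary_masses(alpha_a, alpha_b, d_pc, period_yr):
    """Individual masses (in solar masses) of a face-on visual binary.

    alpha_a, alpha_b : angular sizes of the two orbits, in arcseconds
    d_pc             : distance to the binary, in parsecs
    period_yr        : orbital period, in years
    """
    d = d_pc * PC
    r_a = d * alpha_a * ARCSEC          # r = d * alpha (small angles)
    r_b = d * alpha_b * ARCSEC
    a = r_a + r_b                       # binary separation
    P = period_yr * YEAR
    total = 4 * math.pi**2 * a**3 / (G * P**2)   # Kepler's third law
    ratio = alpha_b / alpha_a           # m_a / m_b
    m_b = total / (1 + ratio)
    m_a = total - m_b
    return m_a / M_SUN, m_b / M_SUN


# A made-up binary: orbits of 1 and 2 arcseconds, 10 pc away, 100 year period.
m_a, m_b = visual_binary_masses(alpha_a=1.0, alpha_b=2.0, d_pc=10.0, period_yr=100.0)
print(f"m_a is about {m_a:.2f} M_sun, m_b is about {m_b:.2f} M_sun")
```

For this made-up system the separation is 30 AU, so the total mass comes out close to 2.7 Solar masses, split 2:1 between the two stars.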
11.3.1
Orbital Inclination
Our discussion of using visual binaries to measure mass above presents quite
a simplified picture. In reality, the analysis of the data is more complex
because we do not know the inclination of the binary orbit to our line of
sight. The true situation is shown in figure 11.4.
Figure 11.4: An orbit inclined with respect to our line of sight. The angle i
between the orbit and the plane of the sky is called the orbital inclination.
We do not see the true orbit, instead we see the orbit projected on the plane
of the sky.
When we track the motion of stars in a visual binary, we do not see the true
orbit, but the projection of the orbit onto the plane of the sky. Instead of
measuring the true sizes of the orbits $r_a$ and $r_b$ we measure the
projected sizes, for example $r_a' = r_a \cos i$. We can still measure the
mass ratio without knowing the inclination because
$$\frac{m_a}{m_b} = \frac{\alpha_b}{\alpha_a} = \frac{\alpha_b \cos i}{\alpha_a \cos i} = \frac{\alpha_b'}{\alpha_a'}.$$
We do, however, need to know the orbital inclination to measure the total
mass, as
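$$m_a + m_b = \frac{4\pi^2 a^3}{G P^2} = \frac{4\pi^2}{G P^2} \left( \frac{a'}{\cos i} \right)^3 = \frac{4\pi^2}{G P^2} \left( \frac{\alpha' d}{\cos i} \right)^3,$$
where $\alpha' = \alpha_a' + \alpha_b'$ is the projected angular separation of the binary.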
Therefore, whilst a visual binary can yield the mass ratio without knowing
the orbital inclination or distance, a full solution for the individual
masses needs knowledge of both the orbital inclination and distance to the
binary. We might measure the distance using a parallax measurement, but
how do we measure the orbital inclination? It turns out that very detailed
observations of the binary star orbits can tell us the orbital inclination as
well. Figure 11.5 shows the basic idea. A circular orbit (shown in red) is
inclined to our line of sight at 60 degrees. Its projection onto the plane of
the sky is an ellipse. Therefore, a star on this orbit appears to follow an
elliptical orbit. However, if the orbit is measured accurately, it is clear
that something is not right. The star orbits on an elliptical orbit, but the
centre of the orbit is not at one of the foci of the ellipse. Instead, the
centre of the orbit is in the centre of the ellipse. Thus, the star appears to
violate Kepler’s first law! Therefore the inclination of the true orbit can be
determined by comparing the observed stellar positions with mathematical
projections of various orbits onto the plane of the sky.
Figure 11.5: The projection of an inclined, circular orbit onto the plane of
the sky. A circular orbit (red) is inclined to our line of sight by 60 degrees.
Its projection on the sky (blue) is an ellipse. The centre of the circular orbit
projects to the centre of the ellipse (both marked by dots).
Therefore, detailed measurements of the orbits of visual binaries can be
used to measure the mass ratio, and the inclination of the binary orbit.
Combined with a distance measurement, the total mass of the binary (and
hence the individual stellar masses) can also be determined.
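To make the projection argument concrete, the sketch below projects a circular orbit onto the plane of the sky and recovers the inclination from the axis ratio of the resulting ellipse, using the $r' = r \cos i$ relation of figure 11.4. It is a simplified illustration only: the function names are hypothetical, the orbit is assumed circular, and the line about which the orbit is tilted is assumed to lie along the x-axis. Real orbit fitting must also handle eccentric orbits and an unknown orientation.

```python
import math


def project_circular_orbit(radius, incl_deg, n_points=360):
    """Sky positions (x, y) of a circular orbit tilted by incl_deg about the x-axis."""
    cos_i = math.cos(math.radians(incl_deg))
    points = []
    for k in range(n_points):
        theta = 2 * math.pi * k / n_points
        x = radius * math.cos(theta)            # unchanged by the tilt
        y = radius * math.sin(theta) * cos_i    # foreshortened by cos i
        points.append((x, y))
    return points


def inclination_from_projection(points):
    """Recover i (degrees) from the axis ratio of the projected ellipse."""
    semi_major = max(abs(x) for x, _ in points)
    semi_minor = max(abs(y) for _, y in points)
    return math.degrees(math.acos(semi_minor / semi_major))


sky_points = project_circular_orbit(radius=1.0, incl_deg=60.0)
recovered = inclination_from_projection(sky_points)
print(f"recovered inclination is about {recovered:.1f} degrees")
```

The projected ellipse is centred on the origin, which is also where the centre of the true orbit projects to. That is exactly the property the text uses to tell a tilted circular orbit from a genuinely elliptical one.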
11.3.2
A visual binary case study: Sirius
Figure 11.6: A bright Geminid meteor, and Sirius (the bright star in the
bottom left).
We need not look far for an example of a visual binary. Sirius is the
brightest star in the night sky, and in many ways is the archetypal visual
binary. Sirius consists of two stars, separated by around 7.5 arcseconds.
The brighter star, Sirius A, has a luminosity of 25.4 $L_\odot$ and an
effective temperature of 9,940 K. In many ways it is a typical star of
spectral type A2. Sirius B is much hotter, at 25,200 K, but much fainter,
with a luminosity of 0.026 $L_\odot$. Since, for a black body,
$L = 4\pi R^2 \sigma T_{\rm eff}^4$, we can immediately tell that Sirius B is
much smaller than Sirius A; in fact it is over 200 times smaller.
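The factor of 200 follows directly from the black-body relation, since $R \propto \sqrt{L}/T_{\rm eff}^2$. A minimal sketch using the luminosities and temperatures quoted above (the function name `radius_ratio` is an assumption for illustration):

```python
import math


def radius_ratio(L_a, T_a, L_b, T_b):
    """R_a / R_b for two black bodies, from L = 4 pi R^2 sigma T^4.

    Luminosities must share one unit; temperatures are in kelvin.
    """
    return math.sqrt(L_a / L_b) * (T_b / T_a) ** 2


# Sirius A and B, using the values quoted in the text (luminosities in L_sun).
ratio = radius_ratio(L_a=25.4, T_a=9940.0, L_b=0.026, T_b=25200.0)
print(f"R_A / R_B is about {ratio:.0f}")
```

Because only ratios appear, the constants $4\pi\sigma$ cancel and the luminosities can be left in Solar units.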
Sirius is extremely bright in part because it is close to us. It is
therefore not surprising that it shows a large proper motion. The motion of
Sirius A and B in the sky is shown in figure 11.7. As well as a large proper
motion relative to the background stars, the paths of the stars also reveal
the motions of the two stars around the centre of mass.
Figure 11.7: The paths of Sirius A (solid line) and Sirius B (dashed line) in
the sky. Background stars are marked with dots and numbers. On top of
the binary motion, there is a very large proper motion.
Figure 11.8 shows the motions of Sirius A and Sirius B again, but this
time the proper motion, and the motion of Sirius A have been subtracted, so
Sirius A appears stationary. This makes the orbit of Sirius B more obvious.
The orbit is elliptical, but is not centred on one of the foci of the ellipse.
This tells us the orbit is inclined at an angle to our line of sight. Detailed
analysis of the orbital shape reveals that the orbital inclination of Sirius AB
is roughly 44 degrees. The size of the orbit is α = 7.56 arcseconds. Parallax
measurements reveal the distance to the binary is d = 2.6 parsecs. The size
of the orbit is thus given by a = αd = 19.7 AU. We can apply Kepler’s third
law to find the total mass of the binary - about 3 Solar masses.
Figure 11.8: The ’apparent’ orbit of Sirius B. This is the orbital motion of
Sirius B after subtracting the proper motion of the binary, and the motion
of Sirius A.
Look again at the orbits of Sirius A and B in figure 11.7. It is clear that
the orbit of Sirius A is about half the size of Sirius B’s orbit. Since
$$\frac{m_a}{m_b} = \frac{r_b}{r_a} = \frac{\alpha_b}{\alpha_a}$$
we can immediately see that Sirius A is roughly twice as massive as Sirius
B. Since the total mass of Sirius AB is 3 Solar masses it follows that Sirius
A is roughly 2 Solar masses and Sirius B is roughly the same mass as the
Sun.
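The Sirius numbers can be checked in a few lines. The sketch below uses the angular size and distance quoted above, together with one value that is not given in these notes: an orbital period of about 50 years, which is assumed here. It also uses the convenient fact that Kepler’s third law reduces to $m_a + m_b = a^3/P^2$ when a is in AU, P is in years and the masses are in Solar masses, and that an angle of 1 arcsecond at 1 parsec corresponds to 1 AU.

```python
# Values quoted in the text.
ALPHA_ARCSEC = 7.56     # angular size of the orbit
DISTANCE_PC = 2.6       # distance from parallax
MASS_RATIO = 2.0        # m_a / m_b, read off from the orbit sizes

# Assumption: the orbital period is NOT given in the notes.
PERIOD_YR = 50.1

# 1 arcsecond at 1 parsec subtends 1 AU, so a [AU] = alpha [arcsec] * d [pc].
a_au = ALPHA_ARCSEC * DISTANCE_PC

# Kepler's third law in Solar units: M [M_sun] = a^3 [AU] / P^2 [yr].
total_mass = a_au**3 / PERIOD_YR**2

m_b = total_mass / (1 + MASS_RATIO)
m_a = total_mass - m_b

print(f"a is about {a_au:.1f} AU, total mass is about {total_mass:.1f} M_sun")
print(f"Sirius A is about {m_a:.1f} M_sun, Sirius B is about {m_b:.1f} M_sun")
```

With these inputs the total mass comes out close to 3 Solar masses, split into roughly 2 and 1, in line with the figures in the text.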
Whilst Sirius A is essentially a normal star of spectral type A2, with a
typical mass and luminosity, Sirius B is very odd indeed. It has roughly the
same mass as the Sun, is nearly four times hotter than the Sun and yet its
radius is 200 times smaller than a typical A2 star. That means the radius of
Sirius B is around the same as that of the Earth! Sirius B is one of the
earliest known white dwarfs, extremely dense stars which are the ultimate
fate of stars like our Sun. They are formed from the hot dense core of the
star as it reaches the end of its life.
The extremely high density of white dwarfs can only be supported against
gravitational collapse by electron degeneracy pressure; a curious quirk of
quantum mechanics. The Heisenberg uncertainty principle states that you
cannot simultaneously define the position and momentum of an electron to
arbitrary precision. If the position is known more accurately, the momentum
becomes more uncertain, and vice versa. In a white dwarf, the electrons are
confined within a very small radius. Their positions are thus well known,
so their momenta must be very uncertain! This means that some electrons
will have high momenta, and be moving at high speeds. Just like a gas in a
box, high electron speeds cause a pressure, which supports the white dwarf
against gravity.
11.4
Spectroscopic Binaries
What if we cannot resolve each of the stars individually? In that case, we
cannot measure the orbit of the binary directly, but there is still a wealth
of information that can be extracted from the spectra of binary stars. If the
orbital motion has a component along the line of sight, a periodic radial
velocity shift will be observable [3] - as shown in figure 11.9.
[3] Remember, we can determine a star’s radial velocity because the Doppler
shift will change the wavelength of absorption or emission lines from the star.
Figure 11.9: The orbital paths and radial velocities of two stars in circular
orbits. In this example, $M_1 = 1\,M_\odot$, $M_2 = 2\,M_\odot$ and the orbital
period is P = 30 d. The whole binary is moving away from us with a radial
velocity of $v_{cm} = 42$ km s$^{-1}$. $v_1$ and $v_2$ are the velocities of
star 1 and star 2 respectively. (a) The plane of the circular orbits lies
along the line of sight of the observer. (b) The observed radial velocity
curves.
As with the visual binaries before, the angle of inclination between the
line of sight and the orbit affects the observed radial velocities. Figure 11.10
shows that if the star has a velocity v around its orbit, what we actually
observe is $v' = v \sin i$, where i is the inclination angle of the orbit. To
obtain the actual velocities of the stars it is thus necessary to determine the
orbital inclination somehow.
Figure 11.10: A star follows a circular orbit in a binary with velocity v.
The orbit is inclined at an angle i. This figure shows that the component of
velocity along our line of sight - the radial velocity - is given by v sin i.
11.4.1
Double-lined binaries
If both stars are comparably bright, we will be able to see absorption lines
from both stars in the binary. Such a binary is called a double-lined
spectroscopic binary. Double-lined binaries are very useful for mass
determinations, as we will see below. We will assume the orbits are circular,
in which case the speeds of the stars around the orbits are constant and given
by $v_1 = 2\pi a_1/P$ and $v_2 = 2\pi a_2/P$. Since
$$\frac{m_1}{m_2} = \frac{a_2}{a_1},$$
we can use the formulae for the speed above to replace a1 and a2 with v1
and v2 to get
$$\frac{m_1}{m_2} = \frac{v_2}{v_1} = \frac{v_2 \sin i}{v_1 \sin i} = \frac{v_2'}{v_1'}.$$
Hence, as for visual binaries, we can determine the mass ratio without
knowing the orbital inclination, in this case using the observed radial
velocities, $v_1'$ and $v_2'$.
However, as is also the case with visual binaries, finding the total mass
of the binary does require knowledge of the orbital inclination. The total
size of the orbit, a, can be written as
$$a = a_1 + a_2 = \frac{P}{2\pi} (v_1 + v_2).$$
We can use this to replace a in Kepler’s third law
$$P^2 = \frac{4\pi^2}{G} \frac{a^3}{m_1 + m_2},$$
and solve for the total mass,
$$m_1 + m_2 = \frac{P}{2\pi G} (v_1 + v_2)^3.$$
Re-writing this in terms of the observed radial velocities, we find
$$m_1 + m_2 = \frac{P}{2\pi G} \frac{(v_1' + v_2')^3}{\sin^3 i}.$$
Hence, provided we know some way of measuring the orbital inclination,
double-lined spectroscopic binaries can yield individual stellar masses via
radial velocity measurements.
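The two results for double-lined binaries (mass ratio and total mass) combine into individual masses as sketched below. The function name and the radial velocity amplitudes are illustrative assumptions. The amplitudes were chosen to be roughly those expected for the 1 and 2 Solar mass stars with a 30 day period used in figure 11.9, seen edge-on.

```python
import math

G = 6.674e-11        # m^3 kg^-1 s^-2
M_SUN = 1.989e30     # kg
DAY = 86400.0        # s


def double_lined_masses(v1_obs, v2_obs, period, incl_deg):
    """Masses (kg) of both stars in a double-lined binary with circular orbits.

    v1_obs, v2_obs : observed radial velocity amplitudes v' = v sin i, in m/s
    period         : orbital period, in seconds
    incl_deg       : orbital inclination, in degrees
    """
    sin_i = math.sin(math.radians(incl_deg))
    total = period * (v1_obs + v2_obs) ** 3 / (2 * math.pi * G * sin_i**3)
    # m1 / m2 = v2' / v1', so star 1 carries the fraction v2' / (v1' + v2').
    m1 = total * v2_obs / (v1_obs + v2_obs)
    m2 = total - m1
    return m1, m2


# Illustrative amplitudes (assumed): 65.9 km/s and 32.95 km/s, P = 30 d, i = 90 deg.
m1, m2 = double_lined_masses(65.9e3, 32.95e3, 30 * DAY, 90.0)
print(f"m1 is about {m1 / M_SUN:.2f} M_sun, m2 is about {m2 / M_SUN:.2f} M_sun")
```

The faster-moving star is the lighter one. Lowering `incl_deg` with the same observed amplitudes increases both masses by a factor of $1/\sin^3 i$.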
11.4.2
Double-lined, eclipsing binaries
Eclipsing binaries are the heavyweight champions of precise stellar
measurements. In large part, this is because an eclipsing system tells us the
orbit is very close to edge on. Put another way, if we see eclipses, we know
the orbital inclination is close to 90°. Even if it were assumed that i = 90°,
while the actual value was close to i = 75°, the resulting error in
$\sin^3 i$ would only be around 10%, with a corresponding error in the total
mass. Thus, the observation of eclipses in a binary star’s lightcurve
(figure 11.11) immediately allows us to roughly guess the orbital inclination,
and get a decent estimate of the stellar masses.
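The 10% figure is easy to verify. Assuming i = 90° amounts to setting $\sin^3 i = 1$, so the total mass is underestimated by the fraction $1 - \sin^3 i$. A minimal sketch (the function name is an assumption for illustration):

```python
import math


def mass_error_if_edge_on_assumed(true_incl_deg):
    """Fractional underestimate of the total mass when i = 90 deg is assumed."""
    return 1.0 - math.sin(math.radians(true_incl_deg)) ** 3


for incl in (90.0, 85.0, 80.0, 75.0):
    error = mass_error_if_edge_on_assumed(incl)
    print(f"true i = {incl:4.1f} deg -> mass underestimated by {100 * error:4.1f}%")

err75 = mass_error_if_edge_on_assumed(75.0)
```

The error grows slowly near 90° because $\sin i$ is flat there, which is why simply knowing that eclipses occur is already so useful.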
We can get a better estimate of the inclination from the eclipse shape.
Figure 11.11 shows the lightcurve of a binary with i = 90°. When the smaller
star is eclipsed by the larger one a nearly constant minimum occurs in the
brightness of the binary as a whole. Similarly, even though the larger star
is not completely eclipsed by the smaller one, a constant amount of area is
obscured, and so again a nearly constant drop in brightness is observed. The
eclipse is described as ’flat-bottomed’ and is a clear indicator of very high
inclinations. When the inclination is a little lower, one star is not completely
eclipsed by its companion. In this case, the minima of the lightcurve are no
longer constant, implying that i < 90°.
However, double-lined eclipsing binaries allow much more than just the
stellar masses to be measured. Detailed analysis of the eclipse also allows
the stellar radii, and the ratio of the effective temperatures, to be measured
directly. It is for these reasons that eclipsing binaries are so useful in
stellar physics.
Figure 11.11: The light curves of two eclipsing binaries with different
inclinations. In the top panel is a binary for which i = 90°. The bottom
curve shows a partially eclipsing binary, of lower inclination. The times
indicated on the light curves correspond to the positions of the smaller star
relative to its larger companion. It is assumed that the smaller star is
hotter than its companion, so that the luminosities of the two stars are
similar ($L = 4\pi\sigma R^2 T^4$).
Stellar radii
We refer again to figure 11.11 and look at the binary with i = 90°. Let
us label the large star as star 1, and the smaller star as star 2. The relative
velocity of the two stars is $v = v_1 + v_2$, so the time taken for the small
star to move from a to b is $t_b - t_a = 2r_2/v$. Since $t_b - t_a$ can be
measured from the lightcurve, we can immediately find the radius of the small
star
$$r_2 = \frac{v}{2} (t_b - t_a).$$
Similarly, by considering the time taken for the small star to move between
b and c, the size of the larger star can be determined
$$r_1 = \frac{v}{2} (t_c - t_a) = r_2 + \frac{v}{2} (t_c - t_b).$$
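As a sketch of how these two expressions turn eclipse timings into radii, the Python below uses a made-up relative velocity and made-up contact times. None of these numbers come from the notes, and the function name is hypothetical.

```python
HOUR = 3600.0       # s
R_SUN = 6.957e8     # m


def stellar_radii(v, t_a, t_b, t_c):
    """Radii (r1, r2) of the large and small star from eclipse contact times.

    v             : relative velocity v1 + v2 of the two stars, in m/s
    t_a, t_b, t_c : times (s) of the points a, b and c on the light curve
    """
    r2 = 0.5 * v * (t_b - t_a)   # small star: crosses its own diameter from a to b
    r1 = 0.5 * v * (t_c - t_a)   # large star
    return r1, r2


# Made-up example: v = 100 km/s, with b and c occurring 2 h and 6 h after a.
v_rel = 100e3
r1, r2 = stellar_radii(v_rel, t_a=0.0, t_b=2 * HOUR, t_c=6 * HOUR)
print(f"r1 is about {r1 / R_SUN:.2f} R_sun, r2 is about {r2 / R_SUN:.2f} R_sun")
```

Note that v here is the true relative velocity, which is known from the radial velocity curves once the inclination is fixed at close to 90° by the eclipses.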
Effective temperatures
By assuming the stars emit as black bodies we can find the ratio of the stars’
effective temperatures. Recall that, for a black body the surface flux (energy
emitted/second per unit surface area of the star) is given by
$F = \sigma T_{\rm eff}^4$. The total light from the binary when both stars
are visible is
$$L_T = \pi r_1^2 F_1 + \pi r_2^2 F_2.$$
When the small star (star 2) is fully eclipsed the light from the binary is
$$L_2 = \pi r_1^2 F_1.$$
When the larger star (star 1) is eclipsed most of the stellar disc is still
visible, but an area equal to $\pi r_2^2$ is obscured by the smaller star. The
light from the binary is therefore
$$L_1 = \pi r_2^2 F_2 + \pi (r_1^2 - r_2^2) F_1.$$
The depth of the eclipse when star 1 is eclipsed is LT − L1 . Likewise for star
2. A remarkable thing happens when we look at the relative depths of the
two eclipses
$$\frac{L_T - L_2}{L_T - L_1} = \frac{\pi r_1^2 F_1 + \pi r_2^2 F_2 - \pi r_1^2 F_1}{\pi r_1^2 F_1 + \pi r_2^2 F_2 - \left[ \pi r_2^2 F_2 + \pi (r_1^2 - r_2^2) F_1 \right]},$$
which simplifies to
$$\frac{L_T - L_2}{L_T - L_1} = \frac{F_2}{F_1} = \left( \frac{T_2}{T_1} \right)^4.$$
This is why double-lined eclipsing binaries are such precious objects for
stellar astronomers. They are the only way to directly measure both the
mass and radius of a star, and they also allow effective temperatures to be
measured. Furthermore, notice that we did not need to know the distance
to the star. All that is needed is observations of the radial velocity curves
and the light curve. Unlike visual binaries, double-lined eclipsing binaries
can yield measurements of the properties of stars which are too distant for
a parallax measurement.
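The cancellation of the radii in the ratio of eclipse depths can be checked numerically. The sketch below builds the three light levels for a made-up pair of stars (the radii and temperatures are assumptions for illustration) and then recovers the temperature ratio from the light levels alone.

```python
import math

SIGMA = 5.670e-8    # Stefan-Boltzmann constant, W m^-2 K^-4


def temperature_ratio(L_total, L_star2_eclipsed, L_star1_eclipsed):
    """T2 / T1 from the two eclipse depths: ((L_T - L_2) / (L_T - L_1))^(1/4)."""
    return ((L_total - L_star2_eclipsed) / (L_total - L_star1_eclipsed)) ** 0.25


# Made-up binary: star 1 is the larger, cooler star.
r1, r2 = 2.0e9, 1.0e9          # radii in m
T1, T2 = 5000.0, 6000.0        # effective temperatures in K
F1, F2 = SIGMA * T1**4, SIGMA * T2**4

L_T = math.pi * r1**2 * F1 + math.pi * r2**2 * F2            # both stars visible
L_2 = math.pi * r1**2 * F1                                   # star 2 fully eclipsed
L_1 = math.pi * r2**2 * F2 + math.pi * (r1**2 - r2**2) * F1  # star 2 in front of star 1

recovered_ratio = temperature_ratio(L_T, L_2, L_1)
print(f"recovered T2 / T1 is about {recovered_ratio:.3f}")
```

Changing `r1` or `r2` alters the individual eclipse depths but leaves the recovered ratio unchanged, which is the remarkable result noted in the text.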
11.4.3
Single-lined binaries
Obviously, the mass ratio and total mass can only be measured if the radial
velocities of both stars are measurable. This requires that absorption or
emission lines from both stars are visible in the spectrum of the binary. If
one star is much brighter than the other the spectrum of the fainter star will
be overwhelmed. Such a binary is called a single-lined binary. Recall that,
for a double-lined binary,
$$m_1 + m_2 = \frac{P (v_1' + v_2')^3}{2\pi G \sin^3 i},$$
and
$$\frac{m_1}{m_2} = \frac{v_2'}{v_1'}.$$
Suppose that only star 1 is visible, so we can only measure $v_1'$. We can use
the latter equation to replace $v_2'$ in the first equation with
$v_2' = v_1' m_1/m_2$ to give
$$m_1 + m_2 = \frac{P {v_1'}^3}{2\pi G \sin^3 i} \left( 1 + \frac{m_1}{m_2} \right)^3.$$
Re-arranging terms gives
$$\frac{m_2^3}{(m_1 + m_2)^2} \sin^3 i = \frac{P}{2\pi G} {v_1'}^3. \qquad (11.5)$$
The right hand side of this equation is known as the mass function. It only
depends on observable quantities of a single-lined binary, the period and radial velocity of the visible component. The left-hand side of equation (11.5)
is always less than m2 , since m1 + m2 > m2 and sin i ≤ 1. Therefore, the
mass function provides a lower limit for the mass of the unseen component,
m2 . As we will see in the following case study - this can still be very useful.
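A minimal sketch of the mass function in use is given below. The function name and the example velocity are assumptions. The velocity is roughly that of a 1 Solar mass star orbiting an unseen 2 Solar mass companion every 30 days, seen edge-on.

```python
import math

G = 6.674e-11        # m^3 kg^-1 s^-2
M_SUN = 1.989e30     # kg
DAY = 86400.0        # s


def mass_function(period, v1_obs):
    """Right hand side of equation (11.5): P v1'^3 / (2 pi G), in kg.

    period : orbital period in seconds
    v1_obs : observed radial velocity amplitude of the visible star, in m/s
    """
    return period * v1_obs**3 / (2 * math.pi * G)


# Illustrative single-lined binary (assumed values): P = 30 d, v1' = 65.9 km/s.
f_m = mass_function(30 * DAY, 65.9e3)
print(f"mass function is about {f_m / M_SUN:.2f} M_sun")
print("so the unseen star must be more massive than this")
```

For the assumed system the true value of $m_2^3 \sin^3 i/(m_1 + m_2)^2$ is 8/9 of a Solar mass, comfortably below the actual companion mass of 2 Solar masses, as the lower-limit argument requires.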
11.5
Exoplanets
The planets of our solar system have been recognised since the time of the
Babylonians, nearly 2000 years BC. In the four millennia that followed they
were the only planets known to exist. Then, on October 6th 1995, Michel Mayor
and Didier Queloz announced the discovery of an exoplanet orbiting the
main-sequence star 51 Peg. This discovery started a new era of planet
discovery; as of April 2010 there are 452 known planets outside our solar
system.
Exoplanets were first discovered and characterised by measuring the radial velocity of the star as it orbits the centre of mass of the star-planet
system. This radial velocity of the host star is often called a Doppler wobble. To date, the vast majority of exoplanets have been discovered by looking
for stars which show a detectable Doppler wobble. Another way of searching
for exoplanets is to look for the tiny dip in light caused by the planet passing in front of the host star - an exoplanetary transit. Transit searches have
become a popular alternative to the Doppler wobble technique for exoplanet
hunting. Transit searches have a number of advantages. Because many stars
can fit on a CCD image, many more stars can be studied in a given time.
Also, because the starlight is not being divided into many wavelengths for
study, quite small telescopes can be used - in contrast to the Doppler wobble
technique which uses the biggest telescopes available.
11.5.1
Exoplanets as single-lined binaries
The planets themselves are extremely difficult to see directly; exoplanetary
systems are therefore a type of single-lined spectroscopic binary, and can be
analysed and treated as such. We measure the Doppler wobble of the planetary host, and can construct the mass function, as given by equation (11.5),
$$\frac{m_p^3}{(m_p + m_s)^2} \sin^3 i = \frac{P}{2\pi G} {v_s'}^3,$$
where the subscript p denotes the exoplanet and the subscript s denotes
the host star. However, the mass of the star is much greater than that of
the planet (Jupiter, for example, is around 1000 times less massive than the
Sun); we can use this to re-write the mass function in a much more useful
form. The term mp + ms can be written as
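$$m_p + m_s = m_s \left( 1 + \frac{m_p}{m_s} \right) \approx m_s,$$
because $m_p/m_s \ll 1$. Substituting $m_p + m_s \approx m_s$ into the mass
function, we get
$$\frac{m_p^3 \sin^3 i}{m_s^2} = \frac{(m_p \sin i)^3}{m_s^2} \approx \frac{P}{2\pi G} {v_s'}^3. \qquad (11.6)$$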
It is normally possible to get a good estimate of the stellar mass, $m_s$.
This is because, on the main-sequence, there are well known relationships
between a star’s mass and a number of observable quantities, such as its
luminosity, effective temperature, or spectral type. So by measuring (for
example) the spectral type of the host star, we can estimate the stellar mass
to an accuracy of a few percent. Armed with an estimate of the stellar mass
and Doppler wobble measurements of the host star (which reveal both the
observed radial velocity $v_s'$ and the period P), we can calculate the
quantity $m_p \sin i$, which is a lower limit to the mass of the planet.
11.5.2
Doppler wobble measurement
The majority of the known exoplanets have been found by searching for
Doppler wobbles in nearby stars, and as outlined above, the Doppler wobble
gives a lower limit to the planetary mass. But how large is it? Consider the
Doppler wobble of the Sun, as caused by Jupiter. The period of Jupiter is
11.86 years, and its mass is $1.9 \times 10^{27}$ kg, compared to the Sun’s
$2 \times 10^{30}$ kg. Putting these quantities into equation (11.6), and
assuming that i = 90°, so all of the stellar motion is along our line of
sight, we find a Doppler wobble of around 12 m s$^{-1}$. To put this in some
sort of context, that’s just less than 30 mph. Detecting exoplanets thus
requires us to measure radial velocities of objects which are many parsecs
away, and moving away from us at a similar speed to city centre traffic!
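The 12 m s$^{-1}$ figure can be reproduced by solving equation (11.6) for the stellar velocity, $v_s' = (2\pi G/P)^{1/3} \, m_p \sin i / m_s^{2/3}$. The sketch below does this with the masses and period quoted above. The function name and the rounded constants are assumptions for illustration.

```python
import math

G = 6.674e-11       # m^3 kg^-1 s^-2
YEAR = 3.156e7      # s


def doppler_wobble(m_planet, m_star, period, incl_deg=90.0):
    """Observed stellar radial velocity amplitude (m/s) from equation (11.6).

    Valid only when the planet is much less massive than the star.
    """
    sin_i = math.sin(math.radians(incl_deg))
    return (2 * math.pi * G / period) ** (1 / 3) * m_planet * sin_i / m_star ** (2 / 3)


# Jupiter and the Sun, using the values quoted in the text.
v_sun = doppler_wobble(m_planet=1.9e27, m_star=2e30, period=11.86 * YEAR)
print(f"Sun's wobble due to Jupiter is about {v_sun:.1f} m/s")
print(f"which is about {v_sun * 3600 / 1609.344:.0f} mph")
```

Setting `incl_deg` below 90 reduces the observed wobble, which is why a measured wobble only ever gives $m_p \sin i$.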
Such observations are extremely challenging, and this explains why the
discovery of exoplanets was so recent. The problem lies in calibrating a spectrograph. If you measure the spectrum of a planet-hosting star you need
to measure the wavelength of the absorption lines to measure the Doppler
shift. What you actually measure is the position of an absorption line on
your detector, and there are lots of flaws in the instrument that can cause
this to change, even if the star itself shows no motion. For example, the spectrograph can flex as the telescope moves, or as the instrument cools during
the night. To get round this, astronomers calibrate their spectrographs, by
observing lamps which emit lines of known wavelength. In conventional
spectrographs this is done every hour or so. The calibration usually limits
accuracies to a few km/s: much worse than we need to detect exoplanets!
To do better, spectrographs have been built where the starlight shines
through an iodine cell before the spectrum is measured. The iodine
superimposes absorption lines on the star’s spectrum. The position of the
star’s absorption lines can then be measured relative to the iodine lines.
This greatly improves precision, and the best spectrographs can reach radial
velocity accuracies of 1 m/s; a slow walking pace!
11.5.3
Planets found from Doppler Wobble: observational bias
Figure 11.12 shows the properties of the known exoplanets as of April 2010.
The known exoplanets are totally unlike the planets in our own solar system.
Most of the known exoplanets are of a similar mass to Jupiter, and yet have
orbits similar to that of Earth. Some exoplanets (so-called hot Jupiters) are
similar in mass to Jupiter, but have orbits smaller than Mercury’s! This
raises the immediate question of whether these planets are typical or not.
There is good reason to suspect they are not, because planet-hunting using
Doppler wobble is very strongly biased. Look in detail at equation (11.6).
In order for the Doppler wobble $v_s'$ to be large we need the mass of the
planet $m_p$ to be large, and the period P to be small! A small period
implies a small orbit [4], so it is hardly surprising that the Doppler wobble
technique is finding lots of heavy planets orbiting close to their host stars.
Note also that
there are almost no planets found using the Doppler wobble technique with
orbital periods longer than 10 years. This is because we need to see the
radial velocity curve repeat itself in order to measure the period and be sure
we are seeing an exoplanet orbiting the star. Since we have been monitoring
stars for only 15 years or so, it is natural that exoplanets with long periods
are rare! Exoplanet searches are steadily becoming more accurate and as
time goes on it is hoped we will start to find exoplanetary systems more like
our own Solar system.
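The size of this bias can be illustrated by comparing the wobble that different planets would induce on a Sun-like star, using equation (11.6) solved for the stellar velocity. The three planets below are illustrative assumptions: a Jupiter-mass planet on a 4 day orbit stands in for a hot Jupiter, and the 1 m/s precision is the figure quoted earlier for the best spectrographs.

```python
import math

G = 6.674e-11        # m^3 kg^-1 s^-2
M_SUN = 1.989e30     # kg
M_JUP = 1.9e27       # kg
M_EARTH = 5.97e24    # kg
DAY = 86400.0        # s
YEAR = 3.156e7       # s
PRECISION = 1.0      # m/s, best radial velocity accuracy quoted in the text


def doppler_wobble(m_planet, m_star, period):
    """Stellar radial velocity amplitude (m/s) for an edge-on orbit, from (11.6)."""
    return (2 * math.pi * G / period) ** (1 / 3) * m_planet / m_star ** (2 / 3)


# (planet mass, orbital period) for three illustrative cases.
cases = {
    "hot Jupiter (4 d)": (M_JUP, 4 * DAY),
    "Jupiter (11.86 yr)": (M_JUP, 11.86 * YEAR),
    "Earth (1 yr)": (M_EARTH, 1 * YEAR),
}

wobbles = {name: doppler_wobble(m, M_SUN, P) for name, (m, P) in cases.items()}
for name, v in wobbles.items():
    verdict = "detectable" if v > PRECISION else "below 1 m/s precision"
    print(f"{name:20s} wobble of about {v:8.2f} m/s ({verdict})")
```

The hot Jupiter produces a wobble of over 100 m/s, while an Earth analogue produces less than 0.1 m/s. The technique therefore picks out heavy, close-in planets first, whether or not they are typical.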
11.5.4
Transiting exoplanets
Recall the mass function for planetary systems,
$$\frac{(m_p \sin i)^3}{m_s^2} \approx \frac{P}{2\pi G} {v_s'}^3.$$
Obviously, to get more than a minimum mass for the exoplanet we need
to know the orbital inclination. By comparison to spectroscopic binaries,
the obvious way to do this is to look for systems in which the exoplanet
passes in front of the host star. In stellar binaries these are called eclipses;
in exoplanet systems, they are known as transits.
[4] Kepler’s third law says that $P^2 \propto a^3$.
Figure 11.12: Properties of known exoplanets as of April 2010. The y-axis
shows planetary mass (we assume mass is equal to the minimum mass given
by the mass function). On the x-axis we show either the size of the orbit
(bottom axis) or the period of the orbit (top axis). The dots are colour
coded according to their discovery method. Blue dots are planets detected
by Doppler wobble. Green dots represent planets discovered from their
transits.
Figure 11.13: Transit of exoplanet WASP-4b in front of its host star.
As with spectroscopic binaries, the presence of transits allows us to measure
the inclination, and more besides. Again, in a similar way to eclipsing
binaries, the presence of transits tells us that the inclination is close to
90°, and this might be enough for our needs. If not, the detailed transit
shape tells us the precise inclination, as shown in figure 11.14. The transit
depth can also tell us the radius of the exoplanet. The luminosity of the
star/exoplanet system outside of transit is just given by the luminosity of
the star. Assuming the star radiates as a black body this is given by
$$L_{\rm out} = \pi r_s^2 \sigma T^4.$$
During transit, the exoplanet blocks some of the surface of the star. The
visible surface area of the star is now $\pi (r_s^2 - r_p^2)$, so the
luminosity during transit is
$$L_{\rm in} = \pi (r_s^2 - r_p^2) \sigma T^4.$$
Now, we calculate the transit depth, as a fraction of the total out-of-transit
light
$$\frac{L_{\rm out} - L_{\rm in}}{L_{\rm out}} = \left( \frac{r_p}{r_s} \right)^2. \qquad (11.7)$$
Using equation (11.7) we can measure the planetary radius using the
transit lightcurve. We need to know the radius of the host star but this, like
the mass of the host star earlier, can be estimated from the spectral type
and main-sequence mass-radius relationships.
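To get a feel for the numbers in equation (11.7), the following sketch evaluates the transit depth for a Jupiter-sized and an Earth-sized planet crossing a Sun-like star, and inverts the relation to recover a radius from a measured depth. The function names are hypothetical, and the choice of a Sun-like host is an assumption made for illustration.

```python
import math

R_SUN = 6.957e8      # solar radius, m
R_JUP = 7.1492e7     # Jupiter equatorial radius, m
R_EARTH = 6.371e6    # Earth mean radius, m


def transit_depth(r_planet, r_star):
    """Fractional drop in light during transit, equation (11.7)."""
    return (r_planet / r_star) ** 2


def planet_radius(depth, r_star):
    """Invert equation (11.7): r_p = r_s * sqrt(depth)."""
    return r_star * math.sqrt(depth)


depth_jupiter = transit_depth(R_JUP, R_SUN)
depth_earth = transit_depth(R_EARTH, R_SUN)
print(f"Jupiter-sized planet: {100 * depth_jupiter:.2f}% dip")
print(f"Earth-sized planet:   {100 * depth_earth:.4f}% dip")
print(f"1% dip implies r_p = {planet_radius(0.01, R_SUN) / R_JUP:.2f} Jupiter radii")
```

For a Sun-like star the dip is about 1% for a Jupiter-sized planet, but less than 0.01% for an Earth-sized one.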
Note that equation (11.7) predicts that transit depths are very small, normally only 1 or 2% of the total light from the star. Measuring planetary
FIG. 1: Definition of transit light-curve observables. Two schematic light curves are shown on the bottom (solid and dotted
lines), and the corresponding geometry of the star and planet is shown on the top. Indicated on the solid light curve are the
transit depth $\Delta F$, the total transit duration $t_T$, and the transit duration between ingress and egress $t_F$ (i.e., the “flat part” of
the transit light curve when the planet is fully superimposed on the parent star). First, second, third, and fourth contacts are
noted for a planet moving from left to right (not needed for this problem set). Also defined are $R_*$, $R_p$, and impact parameter
$b$ corresponding to orbital inclination $i$. Different impact parameters $b$ (or different $i$) will result in different transit shapes, as
shown by the transits corresponding to the solid and dotted lines.
Figure 11.14: How the inclination affects the transit shape. The solid line
shows the transit shape for $i = 90°$, whilst the top shaded circles show the
position of the planet at beginning and end of the ingress and egress from
transit. The dashed line shows the transit shape for a lower inclination,
and the lower row of circles shows the planetary positions at ingress and
egress. Transits at lower inclinations last for shorter times, and have slower
transitions into and out of transit.
FIG. 2: Transit planet schematic for deriving non-central transit parameters. Note the definition of $i$ for orbital inclination
($i = 90°$ corresponds to “edge-on”). Figure from R. Santana.
transits thus requires very accurate photometry, and transit searches are
inevitably biassed towards larger planets, which cause the deepest transits.
The transit depth does not depend on the distance of the planet from the
star; unlike the Doppler wobble searches, transit searches are sensitive to
planets in large orbits around their host stars. They are still slightly biassed
against these planets though; planets in large orbits are less likely to show
transits in the first place. The dependence of transit depth on the planetary
radius means that detecting Earth-like planets needs better accuracy than
can be obtained from the ground. To get round this, several satellites aimed
at transit searches have been launched. Examples include NASA’s Kepler
satellite and the CNES-led CoRoT satellite. These satellites offer a good
chance of detecting a truly Earth-like planet within the next few years.
11.6
Weighing Galaxies
So far we have measured the mass of planets in our own Solar system, planets
around other stars and stars in various types of binary systems. Now we
take a step up in scale, and ask how we can measure the mass of galaxies.
Galaxies have been known since the 10th century, but their nature was
only recently understood. The idea that galaxies were disks of stars, like
our Milky Way, has been around since the mid-18th century. However, it
was also thought possible that nebulae, as galaxies were then called, were
bright clouds of gas and dust within the Milky Way itself. As we discussed
earlier, the matter was finally resolved in the 1920s, when astronomers used
Cepheid variables to measure the distances to nearby galaxies.
Of course, galaxy masses are measured using the effects of gravity, but
unlike stars and planets, we can measure the mass of a single galaxy, even if
it is isolated in space. This is because we can use the rotation of the galaxy
to measure the mass.
11.6.1
Galaxy rotation
Figure 11.15: Galaxy rotation
Spiral galaxies, like the Milky Way, rotate. Due to the Doppler shift,
light from one side of the galaxy will appear blueshifted, whilst light from the
other side of the galaxy appears redshifted (see figure 11.15). By measuring
how the rotation speed changes with distance from the galaxy centre, we
can measure the mass of the galaxy. Before we show how that works, we
should quickly discuss how the rotational velocity is measured.
The visible light from galaxies is dominated by stars. Therefore, the
optical spectrum of a galaxy looks like the spectra of lots of stars added
together, and contains many absorption lines that can be used to measure
the Doppler shift. However, there is more to the galaxy than just starlight;
galaxies also contain significant amounts of gas and dust, which can extend
well beyond the regions of the galaxy that contain stars. Since hydrogen is
the most abundant constituent of this gas, we can track the motion of this
gas using hydrogen lines, but not the hydrogen lines which appear in the
optical, ultraviolet and infrared (the Balmer, Lyman and Paschen series).
Remember that absorption and emission lines are created when electrons hop
between allowed energy levels in an atom. These energy levels are labelled
with a principal quantum number, $n$. In earlier lectures we stated that all
energy levels with the same value of $n$ had the same energy. This is not
quite true. It turns out there is a tiny, tiny difference in energy between
electron orbits in which the electron and proton spin in the same direction,
and orbits in which the electron and proton spin in opposite directions. This
phenomenon is known as hyperfine splitting. The difference in energy is
$9.5 \times 10^{-25}$ J, and electrons changing from one level to the other give
rise to light with a wavelength of 21 cm. This is in the radio region of the
electromagnetic spectrum. Thus, optical spectroscopy tells us about the
rotation speeds of those parts of the galaxy which contain stars, and radio
spectroscopy tells us about the rotation of the parts of the galaxy where
stars are absent.
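The link between the quoted energy difference and the 21 cm wavelength can be checked with the photon energy relation $E = hc/\lambda$. The short sketch below does this; the variable names are hypothetical and the constants are rounded.

```python
H = 6.626e-34   # Planck constant, J s
C = 2.998e8     # speed of light, m/s

delta_e = 9.5e-25               # hyperfine energy difference quoted above, J
wavelength = H * C / delta_e    # lambda = h c / E, in metres
frequency = delta_e / H         # nu = E / h, in Hz

print(f"wavelength = {100 * wavelength:.1f} cm")
print(f"frequency  = {frequency / 1e6:.0f} MHz")
```

The wavelength comes out close to 21 cm, corresponding to a frequency of roughly 1.4 GHz, which is well inside the radio band.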
11.6.2
Galaxy rotation curves
Before we look at the observed rotation of galaxies, let us work out what we
expect to see. We look at the rotation under gravity of material a distance $r$
from the centre of the galaxy. We assume that the mass inside $r$ and outside
$r$ is distributed spherically. Our simple model is shown in figure 11.16. To
calculate the expected rotation, we need to remember two theorems we
proved in section 10.4.3. One was that the gravity due to the mass inside $r$
is the same as if all that mass was concentrated in a point at the centre of the
galaxy. The other was that the mass outside $r$ exerts no net gravitational
force. We can find the rotation of material at $r$ by equating the force of
gravity to the centripetal force needed to keep the material on a circular
orbit. This gives
$$\frac{G M(r)\, m}{r^2} = \frac{m v^2(r)}{r},$$
Figure 11.16: A simple model of galaxy rotation
where $M(r)$ is the mass inside $r$, $m$ is the mass of a small test mass at $r$,
and $v(r)$ is the rotational speed at $r$. Re-arranging gives
$$v(r) = \sqrt{\frac{G M(r)}{r}} . \qquad (11.8)$$
Let’s assume the galaxy has a constant density $\rho$. Inside the galaxy
$$M(r) = \frac{4}{3}\pi r^3 \rho .$$
Substituting this back into equation (11.8) tells us that the rotational velocity should increase linearly with radius, $v(r) \propto r$. Outside the galaxy,
$M(r)$ is constant (no more mass is enclosed by spheres of larger radii). This
suggests that outside the galaxy, the rotation speed should slowly drop off
as $v(r) \propto r^{-1/2}$.
A graph of rotation speed v(r) versus radius r is known as a galaxy
rotation curve, and our calculations predict it should look like the red line
in figure 11.17.
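The predicted curve can be tabulated directly from equation (11.8). The sketch below does so for a uniform-density sphere; the function names and the galaxy parameters ($10^{11}$ solar masses inside 15 kpc) are assumptions chosen purely for illustration.

```python
import math

G = 6.674e-11       # gravitational constant, N m^2 kg^-2
KPC = 3.086e19      # one kiloparsec in metres
M_SUN = 1.989e30    # solar mass, kg


def enclosed_mass(r, r_gal, m_gal):
    """Mass inside radius r for a uniform-density sphere of radius r_gal."""
    if r < r_gal:
        return m_gal * (r / r_gal) ** 3
    return m_gal


def rotation_speed(r, r_gal, m_gal):
    """Circular speed from equation (11.8): v = sqrt(G M(r) / r)."""
    return math.sqrt(G * enclosed_mass(r, r_gal, m_gal) / r)


# Assumed, illustrative galaxy: 1e11 solar masses within 15 kpc
r_gal, m_gal = 15 * KPC, 1e11 * M_SUN
for r_kpc in (5, 10, 15, 30, 60):
    v = rotation_speed(r_kpc * KPC, r_gal, m_gal)
    print(f"r = {r_kpc:2d} kpc   v = {v / 1e3:6.1f} km/s")
```

Inside 15 kpc the speed doubles when the radius doubles; outside, it falls by a factor of two each time the radius increases fourfold, as expected for $v(r) \propto r^{-1/2}$.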
Figure 11.17: A sketch of actual versus predicted rotation curves for a spiral
galaxy
Actual galaxy rotation curves
The first indication that real life doesn’t follow our simple calculation came
from the Swiss astronomer, Fritz Zwicky, in the 1930s. Zwicky was an amazing
character; an outspoken man who considered “humbleness a lie”, he was
not liked by his colleagues. Indeed, he referred to his colleagues as “spherical
bastards”, because they were “bastards, whichever way you looked at
them”. Perhaps because of his unpopularity, Zwicky is often not credited
for discoveries to which he can rightly claim priority. A prime example is
that, in the 1930s, Zwicky showed that the motion of galaxies seemed to be
at odds with what one would expect from theory.
Figure 11.17 shows the predicted, and actual, rotation curve of a spiral
galaxy. As seen above, theory predicts that the rotation speed should drop
with increasing radius outside the galaxy. Instead, the rotation speed is
roughly constant, even out to distances well beyond the radius which contains
all the visible starlight. Re-arranging equation (11.8) gives $M(r) = v^2(r)\, r / G$,
so a flat rotation curve, with $v(r)$ constant, implies that $M(r) \propto r$. Since
$M(r)$ increases with $r$, even at these large radii, there must still be material
there. In other words, even at radii well outside the visible extent of the
galaxy, there are still large amounts of matter.
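To see what a flat rotation curve means in practice, equation (11.8) can be rearranged to $M(r) = v^2 r / G$ and evaluated at several radii. In the sketch below the constant speed of 220 km/s and the chosen radii are assumed values for illustration only.

```python
G = 6.674e-11       # gravitational constant, N m^2 kg^-2
KPC = 3.086e19      # one kiloparsec in metres
M_SUN = 1.989e30    # solar mass, kg


def enclosed_mass(v, r):
    """Equation (11.8) rearranged: M(r) = v^2 r / G."""
    return v ** 2 * r / G


v_flat = 220e3      # assumed constant rotation speed, m/s
for r_kpc in (10, 20, 40):
    m = enclosed_mass(v_flat, r_kpc * KPC)
    print(f"r = {r_kpc} kpc   M(r) = {m / M_SUN:.2e} solar masses")
```

With these numbers the enclosed mass is of order $10^{11}$ solar masses at 10 kpc and doubles every time the radius doubles, even where little starlight is seen.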
Unsurprisingly, Zwicky labelled this material “Dark Matter”, since it
was not visible, except through its gravitational influence. This was a major
discovery which might have revolutionised astronomy at the time. Naturally,
Zwicky’s colleagues ignored it, until the effect was once again discovered
forty years later.
How much of the mass of galaxies is dark matter? When we measure the
rotation of galaxies, we actually observe $v'(r) = v(r) \sin i$. If we know the
inclination of the galaxy, we can work out the true mass of the galaxy. We
can estimate the total amount of mass contained within the visible components
of the galaxy (e.g. stars) by measuring the luminosity, and working out what
mass of stars is necessary to create that luminosity. This only accounts for
10-20% of the total mass of the galaxy, as measured from the rotation curves.
Therefore, something like 80-90% of the mass of galaxies is contained in dark
matter.
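The bookkeeping described here can be sketched in a few lines: correct the observed speed for inclination, convert it to a dynamical mass with equation (11.8), and compare with the mass inferred from the starlight. All of the input numbers and function names below are hypothetical, chosen only to illustrate the calculation.

```python
import math

G = 6.674e-11       # gravitational constant, N m^2 kg^-2
KPC = 3.086e19      # one kiloparsec in metres
M_SUN = 1.989e30    # solar mass, kg


def true_speed(v_observed, inclination_deg):
    """Undo the projection v' = v sin i."""
    return v_observed / math.sin(math.radians(inclination_deg))


def dark_fraction(m_visible, m_total):
    """Fraction of the total mass not accounted for by visible matter."""
    return 1.0 - m_visible / m_total


# Assumed, illustrative measurements
v = true_speed(150e3, 60.0)            # observed 150 km/s at i = 60 degrees
m_total = v ** 2 * (30 * KPC) / G      # dynamical mass inside 30 kpc, from (11.8)
m_visible = 3e10 * M_SUN               # mass inferred from the starlight
print(f"v = {v / 1e3:.0f} km/s, dark fraction = {dark_fraction(m_visible, m_total):.2f}")
```

Because the mass scales as $v^2$, an error in the adopted inclination feeds through strongly into the inferred dark matter fraction.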
11.6.3
Dark Matter
What is dark matter? We know that it has mass, and that it does not emit
light. This does not rule out the possibility that dark matter is made out of
the same stuff as ordinary matter. Black holes, for example, have mass and
emit no light by definition. Very low mass stars and brown dwarfs do emit
some light, but they are so faint as to be invisible at the distances of nearby
galaxies. Perhaps dark matter is made of large numbers of brown dwarfs,
in the outer regions of galaxies? We can test this idea by carrying
out extremely deep searches for brown dwarfs in the outer regions of our
own galaxy. Such studies reveal that brown dwarfs only constitute around
6% of the dark matter in our own galaxy. Black holes can be searched for
by gravitational lensing. Einstein predicted that gravity bends light. If a
black hole passes between us and a background star, its gravity will bend the
light from the background star in a distinctive way. Studies using gravitational
lensing have shown that black holes are not a big component of dark matter.
We are left with the possibility that dark matter is made of material
entirely unlike normal matter. Dark matter must be made of particles which
create and feel gravity, but interact only very weakly with light
and normal matter. These particles are called WIMPs - “weakly interacting
massive particles”. As yet, no-one has actually managed to detect a WIMP
directly, so we know very little about their properties. Nevertheless, the
search is currently ongoing, so next year the story may be different.
11.6.4
Galaxy Clusters
Galaxies are often not alone in space. Instead they tend to form large
clusters of galaxies. How much do these clusters weigh? A while back, we
discussed the thermal properties of matter, and derived the equipartition
theorem, which states that matter in thermal equilibrium has $\frac{1}{2}kT$ of
thermal energy for every degree of freedom. I also said that this could be
used to solve difficult problems in astrophysics. The equipartition theorem
is closely related to another theorem for systems in equilibrium, known as
the virial theorem. The virial theorem states that the kinetic energy of a
large, self-gravitating system is minus 1/2 its gravitational potential energy,
or
$$-2\,\mathrm{K.E} = \mathrm{P.E}. \qquad (11.9)$$
A cluster of galaxies is a large, self-gravitating system, and we can apply
the virial theorem to measure the mass of the cluster.
We assume for simplicity that the cluster is a sphere of radius $R$, containing $N$ galaxies, all with the same mass $m$. The kinetic energy of galaxy
$i$ is $m v_i^2/2$, so the total kinetic energy of the cluster is
$$\mathrm{K.E} = \frac{1}{2}\sum_{i=1}^{N} m v_i^2 = \frac{1}{2} N m \langle v^2 \rangle = \frac{1}{2} M_c \langle v^2 \rangle ,$$
where $M_c = N m$ is the total mass of the cluster. Previously, we derived
equation (10.4) for the gravitational potential energy of two masses,
$$\mathrm{Grav.\ P.E} = -\frac{G m_1 m_2}{r} .$$
For a spherical collection of galaxies, the gravitational potential energy is
hard to work out, but is approximately given by
$$\mathrm{P.E} \approx -\frac{3}{5}\frac{G M_c^2}{R} .$$
Applying the virial theorem, we find that
$$M_c \langle v^2 \rangle = \frac{3}{5}\frac{G M_c^2}{R} .$$
Of course, we cannot measure the true speeds of the galaxies. Instead,
we use the Doppler shift to measure the radial velocity of each galaxy, $v_r$.
On average, a galaxy is as likely to be moving along our line of sight as in
the other two directions $(\theta, \phi)$, and so $\langle v_r^2 \rangle = \langle v_\theta^2 \rangle = \langle v_\phi^2 \rangle$. Hence we can
write $3\langle v_r^2 \rangle = \langle v^2 \rangle$, which gives
$$3 M_c \langle v_r^2 \rangle = \frac{3}{5}\frac{G M_c^2}{R} .$$
Re-arranging for the mass of the galaxy cluster gives
$$M_c = \frac{5 R \langle v_r^2 \rangle}{G} . \qquad (11.10)$$
Masses measured in this way are known as virial masses. An appropriate
value for the size of the cluster must be adopted. This comes from combining
the angular size of the cluster with a distance estimate (from Hubble’s law
and the redshift of the cluster). When we calculate the virial masses of
clusters, we find once again that most of the mass of the galaxy cluster is
not explained by the amount of visible material. As well as being a major
component of the galaxies themselves, it seems the space between galaxies
is also full of dark matter!
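Equation (11.10) is simple to apply to a list of measured radial velocities. The sketch below is one way of doing so, under stated assumptions: the velocities and the 1 Mpc radius are invented for illustration, and the function subtracts the mean velocity of the cluster before averaging, a step not spelled out in the derivation above but needed so that the recession of the cluster as a whole is not counted as internal motion.

```python
G = 6.674e-11       # gravitational constant, N m^2 kg^-2
MPC = 3.086e22      # one megaparsec in metres
M_SUN = 1.989e30    # solar mass, kg


def virial_mass(radial_velocities, radius):
    """Cluster mass from equation (11.10): M_c = 5 R <v_r^2> / G.

    Velocities are taken relative to the cluster mean, so that the
    motion of the cluster as a whole does not inflate <v_r^2>.
    """
    n = len(radial_velocities)
    mean_v = sum(radial_velocities) / n
    mean_sq = sum((v - mean_v) ** 2 for v in radial_velocities) / n
    return 5.0 * radius * mean_sq / G


# Assumed, illustrative radial velocities (m/s) for eight member galaxies
velocities = [6.1e6, 7.9e6, 6.8e6, 7.4e6, 5.6e6, 8.3e6, 7.0e6, 6.9e6]
mass = virial_mass(velocities, 1.0 * MPC)
print(f"virial mass = {mass / M_SUN:.1e} solar masses")
```

With velocities scattered by several hundred km/s over a radius of about a megaparsec, the virial mass comes out at several times $10^{14}$ solar masses.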