AST 6416: Physical Cosmology
Instructor: Gonzalez
Fall 2009
This document contains the lecture notes for the graduate level cosmology course at the
University of Florida, AST 6416. The course is 15 weeks long, with three class periods
per week (one on Tuesday and two on Friday). These notes are based upon a collection of
sources. The most notable of these are lecture notes from George Blumenthal and Henry
Kandrup, and the textbooks by Coles & Lucchin (2002), Peacock (1999), and Peebles (1993).
1 Introduction, Early Cosmology
Week 1 Reading Assignment: Chapter 1
1.1 Course Overview
Cosmology, defined as man’s attempt to understand the origin of the universe, is as old as
mankind. Cosmology, as a field of scientific inquiry, is one of the newest of topics. The first
theoretical underpinnings of the field date to the dawn of the 20th century; a significant
fraction of the landmark cosmological observations have occurred in the past two decades
— and the field certainly holds a plethora of fundamental unanswered questions. It is only
during this past century that we have gained the ability to start answering questions about
the origin of the universe, and I hope to share with you some of the excitement of this field.
The title of this course is Physical Cosmology, and the central aim of this semester will be
for you to understand the underlying physics that defines the formation and evolution of the
universe. We will together explore the development of Big Bang cosmology, investigate the
successes (and failures) of the current paradigm, and discuss topics of current relevance. By
the end of the course, you will have a basic understanding of the foundations upon which our
current picture of the universe is based and (hopefully) a sense of the direction in which this
field is headed. What you will not have is comprehensive knowledge of the entire discipline
of cosmology. The field has grown dramatically in recent years, and a semester is sufficient
time to cover only a fraction of the material. This semester will be primarily taught from a
theoretical perspective, with limited discussion of the details of the observations that have
helped define our current picture of the universe.
General Relativity is the foundation that underpins all of modern cosmology, as it defines
the structure of spacetime and thereby provides the physical framework for describing the
Universe. I realize that not all of you have taken a course in GR, and a detailed discussion
of GR is beyond the scope of this class. Consequently, I will tread lightly in this area, and
GR will not be considered a prerequisite for this class. For many of the key results, we
will use pseudo-Newtonian derivations to facilitate intuition (which is useful even if you do
know GR). In practice, this detracts very little from the scope of the course. Once we have
established a few fundamental equations, the bulk of the semester will be quite independent
of one’s knowledge of General Relativity. For those of you who wish to learn more about
General Relativity, I refer you to PHZ 6607.
Finally, please review your copy of the syllabus for the semester. You will see that the
textbook is Coles & Lucchin. The advantages of this text are that it is generally readable and
should serve as a good reference source for you both for this class and in the future. Moreover,
this text is used both for my course and for Observational Cosmology. Be aware however
that the organization of this course does not directly parallel the organization of the book
– we will be jumping around, and sometimes covering material in a different fashion than
the text. The first half of the semester will be dedicated to what I would call “classical”
cosmology, which broadly refers to the fundamental description of the universe that was
developed from 1916-1970 – the global structure of the universe, expansion of the universe,
and development of the Big Bang model, Big Bang nucleosynthesis, etc. The second half of
the semester will focus upon more recent topics in the field – things such as dark matter,
dark energy, inflation, modern cosmological tests, and gravitational lensing. I emphasize
that the division between the two halves of the semester is only a preliminary plan, and the
schedule may shift depending on the pace of the course. Homework will be assigned every
two weeks, starting on Friday, and will comprise 50% of your grade. I strongly encourage you
to work together on these assignments. Astronomy and cosmology are collaborative fields
and you are best served by helping each other to learn the material. Make sure though that
you clearly understand everything you write down – otherwise you will be poorly served for
the exam and the future. The final will be comprehensive for the semester and account for
the other 50%.
1.2 The Big Questions
Before we begin, it is worth taking a few moments to consider the scope of the field of
cosmology by considering, in broad terms, the aim of the subject. More so than most other
fields, cosmology is all encompassing and aims for a detailed understanding of the universe
and our place therein. Fundamental questions that the field aims to address include:
• What is the history of the Universe? How did it begin? How did the structures
that we see today – matter, galaxies, and everything else – come to be?
• What is the future of the Universe? What happens next? How does the Universe
end, or does it end?
• How does the Universe, and the matter/energy it contains, change with
time?
• What are the matter/energy constituents of the Universe and how were
they made?
• What is the geometry of the Universe?
• Why are the physical laws in the Universe as they are?
• What, if anything, exists outside our own Universe?
Clearly this is an ambitious set of questions. We by no means have complete answers to all of
the above, but the progress toward answers, and the rate of that progress, in recent times is
remarkable. In this course we will touch upon all these topics, but primarily
focus upon the first five.
1.3 Olbers' Paradox
And so with that introduction, let us begin. Let us for a moment step back 200 years to
1807. Newtonian physics and calculus were well-established, but electromagnetism was still
over 50 years in the future, and the nature of Messier's nebulae was still unresolved (hence
the concepts of Galaxy and Universe were essentially equivalent). Copernicus had successfully
displaced us from the center of the solar system, but our position in the larger Universe was
essentially unknown. At the time, as would remain the case for another 100 years, cosmology
was the realm of the philosopher – but even in this realm one can ask physically meaningful
questions to attempt to understand the Universe. Consider Olbers' Paradox, which was
actually first posited in ancient Greece before being rediscovered by several people in the
18th and 19th centuries. When Olbers posed the paradox in 1826, the general belief was that
the Universe was infinite, uniform, and unchanging ("as fixed as the stars in the firmament").
The question that Olbers asked was:
Why is the night sky dark?
Let us make the following assumptions:
1. Stars (or in a more modern version galaxies) are uniformly distributed throughout the
universe with mean density n and luminosity L. This is a corollary of the Cosmological
Principle, which we will discuss in a moment.
2. The universe is infinitely old and static, so ṅ = L̇ = 0.
3. The geometry of space is Euclidean. And in 1800, what else would one even consider?
4. There is no large scale systematic motion of stars (galaxies) in the Universe. Specifically, the Universe is not expanding or contracting.
5. The known laws of physics, derived locally, are valid throughout the Universe.
For a Euclidean geometry, the flux from an object is defined simply as
\[ f = \frac{L}{4\pi r^2} , \tag{1} \]
where L is the luminosity and r is the distance to the object. In this case, the total incident
flux arriving at the Earth is
\[ f_{\rm tot} = \int_0^\infty 4\pi r^2\,dr\,n\,\frac{L}{4\pi r^2} = nL\int_0^\infty dr = \infty . \tag{2} \]
The incident flux that we observe should therefore be infinite, as should the energy
density, \langle u \rangle \equiv f/c. Clearly the night sky is not that bright!
Can we get around this by including some sort of absorption of the radiation? Adding
absorbing dust between us and the sources doesn't help much. For a static, infinitely old universe (assumption 2), the dust must eventually come into thermodynamic equilibrium with the stars
and itself radiate. This would predict a night sky as bright as the surface of a typical star.
We get the same result if we include absorption by the stars themselves (through their
geometric cross section). Specifically, consider the paradox in terms of surface brightness.
For a Euclidean geometry, surface brightness (flux per unit solid angle) is independent of
distance since
\[ I \equiv \frac{f}{d\Omega} = \frac{L}{4\pi r^2}\Big/\frac{\pi d^2}{r^2} = \frac{L}{4\pi^2 d^2} , \tag{3} \]
where d is the physical size of the object. If the surface brightness is constant, and there is a
star in every direction that we look (which is the logical result of the above assumptions),
then every point in space should have the same surface brightness as the surface of a star –
and hence Tsky ≈ 5000 K. That the sky looks dark to us tells us that Tsky < 1000 K, and
from modern observations of the background radiation we know that Tsky = 2.726 K.
Which assumption is wrong?! Assumption 1 is required by a Copernican view of the Universe. We now know that the stars themselves are not uniformly distributed, but the galaxy
density is essentially constant on large scales. We are also loath to abandon assumption 5,
without which we cannot hope to proceed. Assumption 3, Euclidean geometry, turns out to
be unnecessary. For a non-Euclidean space, the surface area and volume elements within a
solid angle dΩ are defined as:
\[ dA = r^2 f(r,\Omega)\,d\Omega \tag{4} \]
and
\[ dV = d^3r = r^2 f(r,\Omega)\,dr\,d\Omega . \tag{5} \]
Therefore, from a given solid angle dΩ,
\[ \langle u \rangle_\Omega = \int \frac{n}{c}\,r^2\,dr\,f(r,\Omega)\,\frac{L}{r^2 f(r,\Omega)} = \int dr\,\frac{nL}{c} , \tag{6} \]
independent of f (r, Ω).
Relaxing assumption 2 (infinite and static) does avoid the paradox. If the Universe is
young, then:
• Absorption can work because the dust may not be hot yet.
• Stars may not have shined long enough for the light to reach us from all directions.
If we define the present time at t0, then we can only see sources out to R = ct0, so
\[ \langle u \rangle = \int_0^R dr\,\frac{nL}{c} = \frac{nLR}{c} = nLt_0 , \tag{7} \]
which is finite and can yield a dark sky for sufficiently small t0.
Relaxing assumption 4 can also avoid the paradox. Radial motion gives a Doppler shift
\[ \nu_{\rm observed} = \nu_{\rm emitted}\cdot\gamma\cdot(1 - v_r/c) . \tag{8} \]
Since luminosity is energy per unit time, it behaves like frequency squared, i.e.
\[ L_{\rm observed} = L_{\rm emitted}\cdot\gamma^2\cdot(1 - v_r/c)^2 \le L_{\rm emitted} . \tag{9} \]
One avoids the paradox if vr ∼ c at large distances. This can be achieved if the Universe is
expanding.
Olbers’ paradox therefore tells us that the universe must be either young or
expanding – or both. In practice, it would be another century before such conclusions
would be drawn, and before there would be additional observational evidence.
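As a rough numerical illustration of equation 7 (an aside of mine, not from the notes; the galaxy density and luminosity below are assumed order-of-magnitude values):

```python
# Order-of-magnitude evaluation of <u> = n L t0 (eq. 7) for a young universe.
# The values of n and L are assumed, illustrative round numbers.
MPC_M = 3.086e22          # meters per Mpc
L_SUN = 3.828e26          # solar luminosity in W
YR_S = 3.156e7            # seconds per year

n = 0.01 / MPC_M**3       # assumed galaxy density: ~0.01 galaxies per Mpc^3
L = 1e10 * L_SUN          # assumed galaxy luminosity: ~1e10 L_sun
t0 = 13.7e9 * YR_S        # age of the universe, ~13.7 Gyr

u = n * L * t0            # J m^-3; finite, unlike the infinitely old case
print(f"<u> ~ {u:.1e} J/m^3")   # ~6e-16 J/m^3, nowhere near a stellar surface
```

The result is tens of orders of magnitude below the energy density at a stellar surface, consistent with a dark night sky.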
2 Definitions and Guiding Principles (Assumptions)
Olbers' paradox has begun to introduce us to some of the fundamental concepts underlying
modern cosmology. It is now time to step forward 100 years to the start of the 20th century,
explicitly lay out these concepts, and establish working definitions for terms that we will use
throughout the course.
2.1 Definitions
Let us begin by introducing the concepts of a co-moving observer, homogeneity, and
isotropy.
• Co-moving Observer: Imagine a hypothetical set of observers at every point in the
universe (the cosmological equivalent of test particles). A co-moving observer is defined
as an observer who is at rest and unaccelerated with respect to nearby material. More
specifically, any observer can measure the flow velocity, v(r), of nearby material at
any time. If the observer finds v(0) = 0 and v̇(0) = 0, then the observer is comoving.
Co-moving observers are expected to be inertial observers (who feel no force) in a
homogeneous universe. Note, however that all inertial observers are not necessarily
comoving – an inertial observer must have v̇(0) = 0, but can have v(0) ≠ 0.
• Homogeneity: A universe is homogeneous if all co-moving observers would observe
identical properties for the universe. In other words, all spatial positions are equivalent
(translational invariance). A simple example of a homogeneous geometry would be the
2-D surface of a sphere. Equivalently, an example of an inhomogeneous universe would
be the interior of a 3-D sphere, since some points are closer to the surface than others.
• Isotropy: A universe is isotropic if, for every co-moving observer, there is no preferred
direction. In other words, the properties of the universe must look the same in all directions. This is equivalent to saying that an isotropic Universe is rotationally invariant
at all points. Going back to the same examples from before, the two-dimensional surface of a sphere is isotropic – any direction along the surface of the sphere looks the
same. On the other hand, the interior of a 3-D sphere is not isotropic. It is rotationally
invariant at the center, but for any other point the distance to the surface is shorter
for some directions than others.
So are the conditions of homogeneity and isotropy equivalent? Not quite. One can prove
that an isotropic universe is always homogeneous, but the converse is not true. Here
are the proofs.
Assume that the first statement is false, such that there exists a universe that is isotropic
everywhere, but not homogeneous. For an inhomogeneous universe, there must exist some
observable quantity φ(r) that is position dependent. The quantity φ must be a scalar,
because if it were a vector it would have a direction and thus violate the assumption of
isotropy. Consider the vector D, defined by
\[ \mathbf{D} = \nabla\phi(\mathbf{r}) . \tag{10} \]
Since φ is not a constant, D must be non-zero somewhere. Since D is a vector, it picks out
a direction at some point, and therefore the universe cannot appear isotropic to an observer
at that point. This contradicts our assumption of an isotropic but inhomogeneous universe
and therefore proves that an isotropic universe is always homogeneous.
Now, what about the converse statement? How can we have a universe that is homogeneous but not isotropic? One example would be the 2-D surface of an infinite cylinder (Figure
1). The surface is clearly homogeneous (translationally invariant). However, at any point
on the surface the direction parallel to the axis of the cylinder is clearly different from the
direction perpendicular to the axis since a path perpendicular to the axis will return to the
starting point. A few examples of homogeneous, inhomogeneous, isotropic, and anisotropic
universes are shown in Figure 2.
The fact that a geometry is dynamic need not affect its isotropy or homogeneity. A
dynamic universe can be both homogeneous and isotropic. Consider the surface of a sphere
whose radius is increasing as some function of time. The surface of a static sphere is isotropic
and homogeneous. The mere fact that the size of the sphere is increasing in no way picks
out a special position or direction along the surface. The same considerations also apply to
a uniform, infinite sheet that is being uniformly stretched in all directions.
2.2 The Cosmological Principle
In the early days of cosmology at the start of the 20th century, theoretical development
was very much unconstrained by empirical data (aside from the night sky being dark).
Consequently, initial progress relied on making some fundamental assumptions about the nature
Figure 1 An example of a homogeneous, but anisotropic universe. On the 2-D surface of
an infinite cylinder there is no preferred location; however, not all directions are equivalent.
The surface is translationally, but not rotationally invariant.
Figure 2 Slices through four possible universes. The upper left panel shows a homogeneous
and isotropic example. The upper right shows a non-homogeneous and non-isotropic universe. The lower panels illustrate universes that are homogeneous (on large scales), but not
isotropic. In one case the galaxies are clustered in a preferred direction; in the other the
expansion of the universe occurs in only one direction.
of the Universe. As we have seen above, the geometry, dynamics, and matter distribution of
a universe can be arbitrarily complex. In the absence of any knowledge of these quantities,
where should we begin?
The most logical approach is the spherical cow approach – start with the simplest physical
system, adding complexity only when required. Towards this end, Albert Einstein introduced
what is known as the Cosmological Principle. The Cosmological Principle states that
the Universe is homogeneous and isotropic.
It is immediately obvious that this principle is incorrect on small scales – this classroom
for instance is clearly not homogeneous and isotropic. Similarly, there are obvious inhomogeneities on galaxy, galaxy cluster, and even supercluster scales. However, if you average
over larger scales, then the distribution of matter is indeed approximately uniform. The
Cosmological Principle should therefore be thought of as a reasonable approximation of the
Universe on large scales – specifically scales much greater than the size of gravitationally
collapsed structures. Both the global homogeneity and isotropy (at least from our perspective) have been remarkably confirmed by observations such as cosmic microwave background
experiments (COBE, WMAP) and large galaxy redshift surveys. The success of the Cosmological Principle is remarkable given that it was proposed at a time when the existence of
external galaxies was still a subject of debate.
2.2.1 Spatial Invariance of Physical Laws
If we ponder the implications of the Cosmological Principle, we see that it has important
physical consequences. Perhaps the most fundamental implication of accepting the Cosmological Principle is that the known laws of physics, derived locally, must remain valid
everywhere else in the Universe. Otherwise the assumption of homogeneity would be violated. Reassuringly, modern observations appear to validate this assumption, at least within
the observable universe, with the properties of distant astrophysical objects being consistent
with those observed locally. Within our own Galaxy, period changes for binary pulsars are
consistent with the slowdown predicted by General Relativity as a result of gravitational
radiation. On a much more distant scale, the light curves of type Ia supernovae are similar
in all directions out to z ≈ 1 (d ∼ 8 billion light years), and have the same functional form
as those at z = 0. Indeed, terrestrial physics has been remarkably successful in explaining
astrophysical phenomena, and the absence of failures is a powerful argument for spatial invariance. As an aside, it is worth noting that dark matter and dark energy, which we will
discuss later, are two instances in which standard physics cannot yet adequately describe the
universe. Neither of these phenomena violate spatial invariance though – they’re a problem
everywhere.
2.2.2 The Copernican Principle
Additionally, the Cosmological Principle has a philosophical implication for the place of
mankind in the Universe. The assumption of isotropy explicitly requires that we are not in
a preferred location in the Universe, unlike the center of the 3-D sphere discussed above.
The Cosmological Principle therefore extends Copernicus’ displacement of the Earth from
the center of the solar system. The statement that we are not in a preferred location is
sometimes called the Copernican Principle.
2.2.3 The Perfect Cosmological Principle
It is worth noting that there exists a stronger version of the Cosmological Principle called
the “Perfect Cosmological Principle”. The Perfect Cosmological Principle requires that
the Universe also be the same at all times, and gave rise to the "steady-state" cosmology
(Hoyle 1948), in which continuous creation of matter and stars maintained the density and
luminosity of the expanding Universe. We now know that the Universe is not infinitely old
(and could have deduced as much from Olbers' paradox!), yet this can still be considered relevant in larger
contexts such as eternal inflation, where our Universe is one of an infinite number. In this
case we may have a preferred time in our own Universe, but the Universe itself is not at a
preferred “time”.
2.2.4 Olbers' Paradox Revisited
Finally, it is worth taking one last look at Olbers’ Paradox in light of the Cosmological
Principle. Of the five assumptions listed before, the first and fifth are simply implications of
the cosmological principle. Since we showed that the third was unnecessary, we return to
the conclusion that either 2 or 4 must be false.
2.3 Expansion and the Cosmological Principle
One of the most influential observations of the 20th century was the discovery by Edwin
Hubble of the expansion of the Universe (Hubble 1929). Hubble’s law states that the recessional velocity of external galaxies is linearly related to their distance. Specifically, v = H0 d,
where v is velocity, d is the distance of a galaxy from us, and H0 is the “Hubble constant”. [It
turns out that this “constant” actually isn’t, and the relation is only linear on small scales,
but we'll get to this later.]
It is straightforward to derive Hubble’s law as a natural consequence of the Cosmological Principle. Consider a triangle, sufficiently small that both Euclidean geometry is a valid
approximation (even in a universe with curved geometry) and v << c so that Galilean transformations are valid. As the universe expands or contracts, the conditions of homogeneity
and isotropy require that the expansion is identical in all locations. Consequently, the triangle must grow self-similarly. If we define the present time at t0 and the scale factor of the
expansion as a(t), with a0 = a(t0 ) being the scale factor at t0 , then this self-similarity requires
that any distance x increases by the same scale factor. Mathematically, this is equivalent to saying that
\[ x = \frac{a}{a_0}x_0 . \tag{11} \]
Taking the derivative,
\[ \dot x = \frac{\dot a}{a_0}x_0 = \frac{\dot a}{a}x , \tag{12} \]
or
\[ v = Hx , \tag{13} \]
where the Hubble parameter is defined as H ≡ ȧ/a. The Hubble constant, H0, is defined
as the value of the Hubble parameter at t0, i.e. H0 = ȧ0/a0. Note that the Cosmological
Principle does not require H > 0 – it is perfectly acceptable to have a static or contracting
universe.
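A quick numerical sketch of the argument above (my own illustration, not part of the notes): scale a set of points self-similarly and the relative velocities obey v = Hx about every point, so homogeneity is preserved.

```python
# Self-similar expansion implies v = H x about *every* comoving point.
import numpy as np

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(100, 3))   # comoving positions, a0 = 1

a, adot = 2.0, 0.5        # assumed scale factor and its time derivative
x = a * x0                # proper positions: x = (a/a0) x0
v = adot * x0             # velocities: xdot = (adot/a0) x0

origin = 7                # any particle may call itself the center
dx, dv = x - x[origin], v - v[origin]
print(np.allclose(dv, (adot / a) * dx))      # True: Hubble's law, H = adot/a
```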
3 Dynamics of the Universe - Conservation Laws, Friedmann Equations
To solve for the dynamics of the universe, it is necessary to use the Cosmological Principle
(or another symmetry principle) along with General Relativity (or another theory of gravity). In this lecture we shall use a Newtonian approximation to derive the evolution of the
universe. The meaning of these solutions within the framework of GR will then be discussed
to illustrate the effect of spatial curvature and the behavior of light as it propagates. It turns
out that the trajectory of light cannot be treated self-consistently within the framework of
Newtonian gravity – essentially because of the need for Lorentzian rather than Galilean
invariance for relativistic velocities.
As a reminder, for a Galilean transformation,
\[ x' = x - vt , \tag{14} \]
\[ t' = t , \tag{15} \]
while for a Lorentz transformation
\[ x' = \frac{x - vt}{\sqrt{1 - (v/c)^2}} , \tag{16} \]
\[ t' = \frac{t - vx/c^2}{\sqrt{1 - (v/c)^2}} . \tag{17} \]

3.1 Conservation Laws in the Universe
Let us approximate a region of the universe as a uniform density sphere of non-relativistic
matter. We will now use the Eulerian equations for conservation of mass and momentum to
derive the dynamical evolution of the universe.
3.1.1 Conservation of Mass
If we assume that mass is conserved, then the mass density ρ satisfies the continuity equation
\[ \frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0 . \tag{18} \]
The Cosmological Principle demands that the density ρ be independent of position. Using
the fact that ∇·v = 3H(t), the continuity equation becomes
\[ \frac{d\rho}{dt} + 3H(t)\rho = 0 , \tag{19} \]
or
\[ \frac{d\rho}{\rho} = -3H(t)\,dt , \tag{20} \]
which integrates to
\[ \ln\left(\frac{\rho}{\rho_0}\right) = -3\int_{t_0}^{t}dt\,H(t) = -3\int_{a_0}^{a}\frac{da}{a} = -3\ln\left(\frac{a}{a_0}\right) . \tag{21} \]
This can be rewritten as
\[ \rho(t) = \rho_0\left(\frac{a_0}{a}\right)^3 , \tag{22} \]
so the time dependence is determined solely by the evolution of the scale factor and for a
matter dominated universe ρ ∝ a−3 . This intuitively makes sense, as it’s equivalent to saying
that the matter density is inversely proportional to the volume.
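As a numerical sanity check of equations 19-22 (my addition; the toy scale factor a(t) = t^{2/3} is an arbitrary assumption made purely for the test), direct integration of the continuity equation indeed returns ρ ∝ a^{-3}:

```python
# Integrate d(rho)/dt = -3 H(t) rho and compare with rho = rho0 (a0/a)^3.
# A toy a(t) = t^(2/3), i.e. H(t) = 2/(3t), is assumed purely for illustration.
import numpy as np

t = np.linspace(1.0, 10.0, 20001)
dt = t[1] - t[0]
H = lambda tt: 2.0 / (3.0 * tt)

rho = np.empty_like(t)
rho[0] = 1.0
for i in range(len(t) - 1):                    # midpoint (RK2) integration
    rho_mid = rho[i] - 0.5 * dt * 3.0 * H(t[i]) * rho[i]
    rho[i + 1] = rho[i] - dt * 3.0 * H(t[i] + 0.5 * dt) * rho_mid

a = t ** (2.0 / 3.0)                           # a0 = a(1) = 1
print(np.max(np.abs(rho - a**-3)))             # ~1e-8: matches eq. (22)
```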
3.1.2 Conservation of Momentum
We would like to apply conservation of momentum to the Universe using Newton’s theory of
gravity. This approach would seem, at first glance, to be inconsistent with the Cosmological
Principle. Euler’s equation for momentum conservation is
\[ \frac{\partial(\rho\mathbf{v})}{\partial t} + \nabla\cdot(\rho\mathbf{v}\,\mathbf{v}) + \nabla p = \mathbf{F}\rho , \tag{23} \]
where v is the local fluid velocity with respect to a co-moving observer, p is the pressure,
and F is the force (in this case gravitational) per unit mass. An immediate problem is that
it is difficult to define the gravitational potential in a uniform unbounded medium. We could
apply Newton’s laws to a universe which is the interior of a large sphere. This violates the
Cosmological Principle since we sacrifice isotropy; however it doesn’t violate it too badly if we
consider only regions with size x ≪ R_sphere. In fact, Milne & McCrea (1934) demonstrated
that Newtonian cosmology is a reasonable approximation to GR. In that spirit, we shall use
Euler’s equations in an unbounded medium to represent conservation of momentum.
The above version of Euler’s equation makes the physical meaning of each term apparent,
but let us now switch to the more commonly used form,
\[ \frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} = \mathbf{F} - \frac{\nabla p}{\rho} . \tag{24} \]
The Cosmological Principle requires that the pressure gradient be zero. Inserting v = H(t)x
and using the fact that (x·∇)x = x, the equation becomes
\[ \mathbf{x}\left[\dot H + H^2\right] = \mathbf{F} . \tag{25} \]
Poisson's equation for the gravitational force is
\[ \nabla\cdot\mathbf{F} = -4\pi G\rho . \tag{26} \]
Taking the divergence of both sides above, and using ∇·x = 3, we get
\[ \frac{dH}{dt} + H^2 = -\frac{4\pi G\rho}{3} . \tag{27} \]
Using
\[ H(t) = \frac{\dot a}{a} , \tag{28} \]
along with mass conservation, this can be converted into an equation for the scale factor:
\[ \frac{\ddot a}{a} - \left(\frac{\dot a}{a}\right)^2 + \left(\frac{\dot a}{a}\right)^2 = -\frac{4\pi G\rho}{3} , \tag{29} \]
which simplifies to
\[ \ddot a = -\frac{4\pi G\rho}{3}a , \tag{30} \]
or, using our result for the evolution of the matter density,
\[ a^2\ddot a = -\frac{4\pi G\rho_0 a_0^3}{3} . \tag{31} \]
This is the basic differential equation for the time evolution of the scale factor. It is also
the equation for the radius of a spherical self-gravitating ball.
Looking at the equation, it is clear that the only case in which ä = 0 is when ρ0 = 0 –
an empty universe. [We will revisit this with the more general form from GR, but this basic
result is OK.] To obtain a static universe, Einstein modified GR to give it the most general
form possible. His modification was to add a constant (for which there is no justification in
Newtonian gravity), corresponding to a modification of Poisson's Law,
\[ \nabla\cdot\mathbf{F} = -4\pi G\rho + \Lambda , \tag{32} \]
where Λ is referred to as the cosmological constant. The cosmological constant Λ must have
dimensions of t^{-2} to match the units of ∇·F. If |Λ| ∼ H_0^2, it would have virtually no effect
on gravity in the solar system, but would affect the large-scale universe.
If we include Λ, our previous derivation is modified such that
\[ 3(\dot H + H^2) = -4\pi G\rho + \Lambda , \tag{33} \]
\[ \frac{\ddot a}{a} = -\frac{4\pi G\rho}{3} + \frac{\Lambda}{3} , \tag{34} \]
or equivalently
\[ \ddot a = \left(-\frac{4\pi G\rho}{3} + \frac{\Lambda}{3}\right)a = -\frac{4\pi G\rho_0 a_0^3}{3}a^{-2} + \frac{\Lambda}{3}a . \tag{35} \]
Note that a positive Λ corresponds to a repulsive force that can counteract gravity. We
now multiply both sides by ȧ and integrate with respect to t:
\[ \frac{1}{2}\dot a^2 = \frac{4\pi G\rho_0 a_0^3}{3}\frac{1}{a} + \frac{\Lambda}{3}\frac{a^2}{2} + K , \tag{36} \]
or
\[ \frac{\dot a^2}{a^2} = \frac{8\pi G\rho_0}{3}\frac{a_0^3}{a^3} + \frac{\Lambda}{3} + Ka^{-2} , \tag{37} \]
where K is an arbitrary constant of integration. For the case of a self-gravitating sphere with
Λ = 0, K/2 is just the total energy per unit mass (kinetic plus potential) at the surface of
the sphere. In GR, we shall see that K is associated with the spatial curvature. The above
equation describes what are called the Friedmann solutions for the scale factor of the
universe. It implicitly assumes that the universe is filled with zero pressure, non-relativistic
material (also known as the dust-filled model).
The above equations give some intuition for the evolution of the scale factor of the universe. The equation shows that for an expanding universe, where a(0) = 0, the gravitational
term should dominate for early times when a is small. As the universe expands though, first
the curvature term and later the cosmological constant term are expected to dominate the
right hand side of the equation.
Let us now introduce one additional non-Newtonian tweak to the equations. The above
equations correspond to a limiting case of the fully correct equations from GR in which the
pressure is zero and the energy density is dominated by the rest mass of the particles. To
be fully general, the matter density term should be replaced by an “effective density”
\[ \rho_{\rm eff} = \rho + \frac{3p}{c^2} , \tag{38} \]
where ρ should now be understood to be the total energy density (kinetic + rest mass).
With this modification, Equation 34 becomes
\[ \frac{\ddot a}{a} = -\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^2}\right) + \frac{\Lambda}{3} . \tag{39} \]
It is worth emphasizing at this point that the energy density ρ will include contributions
from both matter and radiation, which as we shall see have different dependences upon the
scale factor.
Finally, we can now re-obtain the equation above for the first derivative if we take into
account that the expansion of the universe will be adiabatic, i.e.
\[ dE = -p\,dV \;\rightarrow\; d(\rho c^2 a^3) = -p\,da^3 . \tag{40} \]
This equation can be rewritten
\[ a^3\,d(\rho c^2) + (\rho c^2)\,da^3 = -p\,da^3 , \tag{41} \]
\[ (\rho c^2 + p)\,da^3 + a^3\,d(\rho c^2 + p) = a^3\,dp , \tag{42} \]
\[ d\left[a^3(\rho c^2 + p)\right] = a^3\,dp , \tag{43} \]
\[ \dot p\,a^3 = \frac{d}{dt}\left[a^3(\rho c^2 + p)\right] , \tag{44} \]
which yields
\[ \dot\rho + 3\left(\rho + \frac{p}{c^2}\right)\frac{\dot a}{a} = 0 . \tag{45} \]
If we now return to deriving the equation for the first derivative,
\[ \frac{\ddot a}{a} = -\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^2}\right) + \frac{\Lambda}{3} , \tag{46} \]
\[ \frac{1}{2}\frac{d\dot a^2}{dt} = -\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^2}\right)a\dot a + \frac{\Lambda}{3}a\dot a . \tag{47} \]
The expression for adiabatic expansion can be rewritten,
\[ \frac{3p}{c^2}\frac{\dot a}{a} = -\dot\rho - 3\rho\frac{\dot a}{a} , \tag{48} \]
which can be inserted to yield
\[ \frac{1}{2}\frac{d\dot a^2}{dt} = -\frac{4\pi G}{3}\left(\rho a\dot a - \dot\rho a^2 - 3\rho a\dot a\right) + \frac{\Lambda}{3}a\dot a , \tag{49} \]
\[ \frac{1}{2}\frac{d\dot a^2}{dt} = \frac{4\pi G}{3}\left(2\rho a\dot a + \dot\rho a^2\right) + \frac{\Lambda}{3}a\dot a , \tag{50} \]
\[ \frac{1}{2}\frac{d\dot a^2}{dt} = \frac{4\pi G}{3}\frac{d(\rho a^2)}{dt} + \frac{\Lambda}{6}\frac{d(a^2)}{dt} , \tag{51} \]
and hence
\[ \dot a^2 = \frac{8\pi G\rho a^2}{3} + \frac{\Lambda a^2}{3} - k . \tag{52} \]
In the context of GR, we will come to associate the constant k with the spatial curvature
of the universe. GR is fundamentally a geometric theory in which gravity is described as a
curved spacetime rather than a force. In this Newtonian analogy the quantity −k/2 would
be interpreted as the energy per unit mass for a particle at the point a(t) in the expanding
system.
3.2 Conclusions
The two Friedmann equations,
\[ \frac{\ddot a}{a} = -\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^2}\right) + \frac{\Lambda}{3} , \tag{53} \]
\[ \dot a^2 = \frac{8\pi G\rho a^2}{3} + \frac{\Lambda a^2}{3} - k , \tag{54} \]
together fully describe the time evolution of the scale factor of the universe and will be used
extensively during the next few weeks.
3.3 An Example Solution and Definitions of Observable Quantities
Let us now work through one possible solution to the Friedmann equations. For a simple
case, we will start with Λ = 0. At the present time,
\[ \left.\frac{da}{dt}\right|_{t=t_0} = \dot a_0 = a_0 H_0 . \tag{55} \]
We can now evaluate the constant k in terms of observable present day quantities.
\[ \dot a^2 = \frac{8\pi G\rho a^2}{3} - k , \tag{56} \]
\[ \dot a_0^2 \equiv H_0^2 a_0^2 = \frac{8\pi G\rho_0 a_0^2}{3} - k , \tag{57} \]
\[ k = -a_0^2\left(H_0^2 - \frac{8\pi G\rho_0}{3}\right) = \frac{8\pi G}{3}a_0^2\left(\rho_0 - \frac{3H_0^2}{8\pi G}\right) . \tag{58} \]
Clearly, k = 0 only if ρ0 is equal to what we will define as the critical density,
\[ \rho_{\rm crit} = \frac{3H_0^2}{8\pi G} . \tag{59} \]
With this definition,
\[ k = \frac{8\pi G}{3}a_0^2\,\rho_{\rm crit}\left(\frac{\rho_0}{\rho_{\rm crit}} - 1\right) = \frac{8\pi G}{3}a_0^2\,\rho_{\rm crit}\,(\Omega_0 - 1) , \tag{60} \]
where we have further defined,
\[ \Omega_0 \equiv \frac{\rho_0}{\rho_{\rm crit}} = \frac{8\pi G\rho_0}{3H_0^2} . \tag{61} \]
Note that this has the corollary definition
\[ H_0^2 = \frac{8\pi G\rho_0}{3\Omega_0} . \tag{62} \]
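For orientation (a numerical aside of mine; H0 = 70 km s^-1 Mpc^-1 is an assumed fiducial value, not one adopted in the notes), the critical density is astonishingly small:

```python
# Critical density rho_crit = 3 H0^2 / (8 pi G), eq. (59).
import math

G = 6.674e-11                    # m^3 kg^-1 s^-2
MPC_M = 3.086e22                 # meters per Mpc
H0 = 70e3 / MPC_M                # assumed 70 km/s/Mpc, converted to s^-1

rho_crit = 3.0 * H0**2 / (8.0 * math.pi * G)
print(f"rho_crit ~ {rho_crit:.1e} kg/m^3")   # ~9e-27 kg/m^3: ~5 protons per m^3
```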
Inserting the definition for the curvature back into the Friedmann equation, we see that
\[ \dot a^2 = \frac{8\pi G\rho_0 a_0^3}{3a} + \frac{8\pi Ga_0^2\rho_{\rm crit}}{3}(1 - \Omega_0) , \tag{63} \]
or
\[ \left(\frac{\dot a}{a_0}\right)^2 = \frac{\Omega_0 H_0^2 a_0}{a} + \frac{8\pi G\rho_{\rm crit}}{3}(1 - \Omega_0) , \tag{64} \]
\[ \left(\frac{\dot a}{a_0}\right)^2 = \frac{\Omega_0 H_0^2 a_0}{a} + H_0^2(1 - \Omega_0) . \tag{65} \]
We now consider big bang solutions, i.e. a(0) = 0. At very early times (a ∼ 0), the
first term on the rhs (the gravitational term) will dominate the second term. Thus, at
early times the form of the solution should be independent of the density. However, at later
times the nature of the solution depends critically upon whether the second (energy) term
is positive, negative, or zero. Equivalently, it depends on whether Ω0 is less than, equal to, or
greater than 1. If Ω0 < 1 and the energy term is positive, the solution for a(t) is analogous
to the trajectory of a rocket launched with a velocity greater than the escape velocity.
Consider now the case Ω0 = 1, which is called the Einstein-deSitter universe. This case
must always be a good approximation at early times. Then
\[ \frac{da}{dt} = \frac{H_0 a_0^{3/2}}{a^{1/2}} , \tag{66} \]
\[ a^{1/2}\,da = H_0 a_0^{3/2}\,dt , \tag{67} \]
or, assuming a(0) = 0,
\[ \frac{a}{a_0} = \left(\frac{3H_0 t}{2}\right)^{2/3} . \tag{68} \]
Thus, a(t) is a very simple function for the Einstein-deSitter case. We can also very
easily solve for the age of the universe,
\[ t_0 = \frac{2}{3}H_0^{-1} . \tag{69} \]
Indeed, H_0^{-1} overestimates the age of the universe for all Friedmann models with Λ = 0.
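As a quick check of equations 68 and 69 (my addition; units with H0 = a0 = 1 are assumed so times come out in units of H0^{-1}), one can integrate t = ∫ da/ȧ for the Einstein-deSitter case:

```python
# Check the Einstein-deSitter age t0 = (2/3) H0^-1 by integrating dt = da/adot,
# with adot from eq. (65) for Omega_0 = 1; H0 = a0 = 1 assumed.
import numpy as np

H0, a0 = 1.0, 1.0
a = np.linspace(1e-8, a0, 200001)
adot = H0 * np.sqrt(a0**3 / a)                # only the gravitational term
integrand = 1.0 / adot
t0 = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(a))
print(t0)                                     # ~0.6667 = 2/(3 H0)
```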
Now consider the case of Ω0 > 1. The maximum scale factor a_max occurs when ȧ = 0 in
equation 65,
\[ \frac{a_{\rm max}}{a_0} = \frac{\Omega_0}{\Omega_0 - 1} . \tag{70} \]
We can obtain a parametric solution by letting
\[ a(t) = a_{\rm max}\sin^2\theta = \frac{\Omega_0 a_0}{\Omega_0 - 1}\sin^2\theta . \tag{71} \]
Substituting this into equation 65 gives
\[ \left(\frac{\Omega_0}{\Omega_0 - 1}\right)^2 4\sin^2\theta\cos^2\theta\,\dot\theta^2 = H_0^2(\Omega_0 - 1)\frac{\cos^2\theta}{\sin^2\theta} , \tag{72} \]
\[ H_0 t = \frac{2\Omega_0}{(\Omega_0 - 1)^{3/2}}\int_0^\theta dx\,\sin^2 x = \frac{\Omega_0}{(\Omega_0 - 1)^{3/2}}\left(\theta - \frac{1}{2}\sin 2\theta\right) . \tag{73} \]
The above equation represents a parametric solution for the scale factor when Ω0 > 1.
Since the lifetime of the universe extends from θ = 0 to θ = π, the total lifetime of the
universe is
\[ t_{\rm lifetime} = \frac{\pi\Omega_0}{H_0(\Omega_0 - 1)^{3/2}} . \tag{74} \]
A similar parametric solution for H0 t can be derived for Ω0 < 1 by replacing sin θ with sinh θ
in the expression for a(t). In this case, a(t) ∝ t for large t.
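A short sketch of the Ω0 > 1 parametric solution (mine; Ω0 = 2 and H0 = 1 are arbitrary assumed values) showing the expansion to a_max and the recollapse:

```python
# Evaluate the parametric closed-universe solution, eqs. (71) and (73),
# for assumed Omega_0 = 2 and H0 = 1.
import numpy as np

Omega0, H0, a0 = 2.0, 1.0, 1.0
theta = np.linspace(0.0, np.pi, 7)

a = (Omega0 * a0 / (Omega0 - 1.0)) * np.sin(theta) ** 2
t = (Omega0 / (H0 * (Omega0 - 1.0) ** 1.5)) * (theta - 0.5 * np.sin(2.0 * theta))

for th, ti, ai in zip(theta, t, a):
    print(f"theta={th:.2f}  H0 t={ti:.3f}  a/a0={ai:.3f}")
# a peaks at a_max/a0 = Omega0/(Omega0-1) = 2 and returns to 0 at
# H0 t = pi*Omega0/(Omega0-1)^(3/2) = 2*pi, as in eqs. (70) and (74).
```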
3.4 The Friedmann Equations from General Relativity
Before moving on to a discussion of spacetime metrics, it is worth at least briefly mentioning
the origin of the Friedmann equations in the context of General Relativity. They are derived
directly from Einstein's field equations,
\[ G_{ij} \equiv R_{ij} - \frac{1}{2}Rg_{ij} = \frac{8\pi G}{c^4}T_{ij} , \tag{75} \]
or, including the cosmological constant,
\[ R_{ij} - \frac{1}{2}Rg_{ij} - \Lambda g_{ij} = \frac{8\pi G}{c^4}T_{ij} . \tag{76} \]
The g_{ij} comprise the metric tensor, describing the metric of spacetime. T is the energy-momentum tensor, and encapsulates all the information about the energy and momentum
conservation laws that we discussed in the Newtonian context. The conservation law in this
context is simply T^j_{i;j} = 0, which means that the covariant divergence is zero. The Ricci
tensor (R_{ij}) and Ricci scalar (R) together make up the Einstein tensor.
In cosmology, the energy-momentum tensor of greatest relevance is a perfect fluid,
\[ T_{ij} = (\rho c^2 + p)U_i U_j - pg_{ij} , \tag{77} \]
where U_k is the fluid four-velocity. Remember that we assumed a perfect fluid in the Newtonian analog. The covariant derivative of this tensor provides the analog to the Euler
equations. Substituting this expression for the stress tensor yields, after some math, the
Friedmann equations.
4 Spacetime Metrics
It is important to interpret the solutions for the scale factor obtained from Newtonian theory
in the last section within the framework of GR. While Newtonian theory treats gravity as a
force, in GR the presence of a mass is treated as curving or warping spacetime so that it is
no longer Euclidean. Particles moving under the influence of gravity travel along geodesics,
the shortest distance between two points in curved spacetime. It is therefore necessary to
be able to describe spatial curvature in a well-defined way.
4.1 Example Metrics
Curvature is most easily visualized by considering the analogy with 2D creatures living on
the surface of a sphere (balloon). Such creatures, who live in a closed universe, could easily
detect curvature by noticing that the sum of the angles of any triangle is greater than 180◦ .
However, this space is locally flat (Euclidean) in the sense that in a small enough region
of space the geometry is well-approximated by a Euclidean geometry. This space has the
interesting property that the space expands if the sphere (balloon) is inflated, and such an
expansion in no way changes the nature of the geometry.
It is also possible to define a metric along the surface. A metric, or distance measure,
describes the distance, ds, between two points in space or spacetime. The general form for
a metric is
\[ ds^2 = g_{ij}\,dx^i dx^j , \tag{78} \]
where the gij are the metric coefficients that we saw in the Einstein field equations.
The distance ds along the surface of a unit sphere is given by
\[ ds^2 = d\theta^2 + \sin^2\theta\,d\phi^2 = d\theta^2\left[1 + \sin^2\theta\left(\frac{d\phi}{d\theta}\right)^2\right] . \tag{79} \]
The metric given by the above equation relates the difference between the coordinates θ and
φ of two points to the physically measurable distance between those points. Since the metric
provides the physical distance between two nearby points, its value should not change if
different coordinates are used. A change of coordinates from (θ, φ) to two other coordinates
must leave the value of the metric unchanged even though its functional form may be very
different.
The minimum distance between two points on the surface of the sphere is obtained by
minimizing the distance given by equation 79.
4.2 Geodesics
In general, for any metric the shortest distance between two points comes from minimizing
the quantity
\[ I = \int_{P_1}^{P_2} ds = \int_{P_1}^{P_2}\frac{ds}{dt}\,dt = \int_{P_1}^{P_2} L\,dt , \tag{80} \]
where the two points P1 and P2 are held fixed, t is a dummy variable that varies continuously
along a trajectory, and the Lagrangian L = ds/dt. Minimization of this integral yields
the equation of motion in special relativity.
If P1 and P2 are held fixed then the integral is minimized when Lagrange’s equations
are satisfied (same as in classical mechanics),
\[ \frac{\partial L}{\partial x_i} = \frac{d}{dt}\frac{\partial L}{\partial\dot x_i} , \quad i = 1..N. \tag{81} \]
Consider the example of the shortest distance (geodesic) between two points on the
surface of a unit sphere. Let the independent variable be θ instead of t. Then the Lagrangian
is
\[ L \equiv \frac{ds}{d\theta} = \left[1 + \sin^2\theta\left(\frac{d\phi}{d\theta}\right)^2\right]^{1/2} , \tag{82} \]
and Lagrange's equation is
\[ \frac{\partial L}{\partial\phi} = \frac{d}{d\theta}\frac{\partial L}{\partial\dot\phi} , \tag{83} \]
\[ \frac{d}{d\theta}\left[\frac{\sin^2\theta\,\dot\phi}{\sqrt{1 + \sin^2\theta\,\dot\phi^2}}\right] = 0 , \tag{84} \]
where φ̇ = dφ/dθ.
Integrating and squaring this equation gives
\[ \sin^4\theta\left[\frac{d}{d\theta}(\phi - C_2)\right]^2 = C_1\left[1 + \sin^2\theta\left(\frac{d}{d\theta}(\phi - C_2)\right)^2\right] . \tag{85} \]
Let y = cos(φ − C_2) and x = cot θ. Then the differential equation becomes
\[ \left(\frac{dy}{dx}\right)^2 = C_1(1 - y^2) + C_1(1 + x^2)\left(\frac{dy}{dx}\right)^2 , \tag{86} \]
with the solution
\[ y = \left(\frac{C_1}{1 - C_1}\right)^{1/2}x = C_1' x , \tag{87} \]
or,
\[ \cos(\phi - C_2) = C_1'\cot\theta . \tag{88} \]
The above equation gives the geodesics along the surface of a sphere. But this is just the
expression for a great circle! To see this, consider that a plane through the origin,
\[ x + Ay + Bz = 0 , \tag{89} \]
produces the following locus of intersection with a unit sphere:
\[ \sin\theta\cos\phi + A\sin\theta\sin\phi + B\cos\theta = 0 , \tag{90} \]
\[ B + \tan\theta(A\sin\phi + \cos\phi) = 0 , \tag{91} \]
\[ -B\cot\theta = C\cos(\phi - D) . \tag{92} \]
Therefore we have demonstrated that geodesics on the surface of a sphere are great circles.
Of course, this can be proven much more easily, but the above derivation illustrates the
general method for determining geodesics for an arbitrary metric.
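A small numerical confirmation (my own; the constants C1' and C2 are arbitrary choices) that points satisfying cos(φ − C2) = C1' cot θ all lie in one plane through the origin, i.e. on a great circle:

```python
# Points on cos(phi - C2) = C1p * cot(theta) lie in the plane
# x cos(C2) + y sin(C2) - C1p z = 0, so the geodesic is a great circle.
import numpy as np

C1p, C2 = 0.7, 0.3                        # arbitrary assumed constants
theta = np.linspace(0.9, np.pi - 0.9, 50)
phi = C2 + np.arccos(C1p / np.tan(theta))

x = np.sin(theta) * np.cos(phi)
y = np.sin(theta) * np.sin(phi)
z = np.cos(theta)

residual = x * np.cos(C2) + y * np.sin(C2) - C1p * z
print(np.max(np.abs(residual)))           # ~1e-16: all points in one plane
```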
4.3 Special Relativity and Curvature
Week 3 Reading Assignment: Chapter 2
For special relativity, in a Lorentz frame we can define a distance in spacetime as
\[ ds^2 = c^2 dt^2 - dx^2 = c^2 dt^2\left(1 - \frac{v^2}{c^2}\right) . \tag{93} \]
This metric also relates physically measurable distances to differences in coordinates. For
example, the time measured by a moving clock (the proper time) is given by ds/c. Thus,
proper time intervals are proportional to, but not equal to, dt.
Let’s look at the above metric for a moment. For light, the metric clearly yields ds2 = 0.
Light is therefore said to follow a null geodesic, which simply means that the physical distance
travelled is equal to ct. Everything that we see in the universe by definition lies along null
geodesics, as the light has just had enough time to reach us. Consider Figure 3. The null
geodesics divide the spacetime plane into two types of world lines. World lines with ds2 > 0
are said to be timelike because the time component is larger. Physically, this means that
we observed (received the light from) events with timelike world lines some time in the past.
World lines with ds2 < 0 are said to be spacelike. Spacetime points that lie along spacelike
world lines are sufficiently far that light has not yet had time to reach us.
Now, consider the equation of motion for a particle in special relativity. For a free
particle, the equation of motion follows from minimizing the distance between two fixed
points in spacetime, analogous to the case with the surface of the sphere,
\[ \delta\int_1^2 ds = \delta\int_{t_1}^{t_2} L\,dt = 0. \tag{94} \]
Since the Lagrangian is
\[ L = c\left[1 - \left(\frac{v}{c}\right)^2\right]^{1/2} = c\left(1 - \frac{v^2}{2c^2} + ...\right) , \tag{95} \]
and since the first term is constant, for nonrelativistic free particles (v << c) the special
relativistic Lagrangian reduces to the usual nonrelativistic Lagrangian without interactions.
Figure 3 Light cones for a flat geometry. Light travels along the null geodesics, while particles
travel along timelike geodesics. Points with ds^2 < 0 are not observable at the present time.
Note that in the case of external forces, the situation is not quite so simple. Recall that
the classical Lagrangian is given by
\[ L = \frac{1}{2}mv^2 - U . \tag{96} \]
The analog in special relativity is
\[ L = -mc^2\sqrt{1 - v^2/c^2} - U . \tag{97} \]
If one wishes to calculate the motion of a relativistic particle undergoing electromagnetic
interactions, then one must include the electrostatic potential Φ and the vector potential A
in the Lagrangian as
\[ U = e\Phi - \frac{e}{c}\mathbf{v}\cdot\mathbf{A} . \tag{98} \]
In general relativity, gravity is treated as an entity that modifies the geometry of spacetime. Particles travel along geodesics in that geometry with the equation of motion
\[ \delta\int_1^2 ds = 0 . \tag{99} \]
Thus, gravitational forces, as such, do not exist. The presence of massive bodies simply
affects the geometry of spacetime. When spacetime is curved due to the presence of gravitational mass, particles no longer travel on straight lines in that geometry. If one wishes to
include, say, electromagnetic forces in addition to gravity, then the Lagrangian would have
to be modified as in special relativity.

Figure 4 Geometries with the three different curvatures.
What distinguishes a curved from flat geometry? At any point in a metric, one can
define an invariant quantity called the curvature, which characterizes the local deviation of
the geometry from flatness. Since it is an invariant quantity, the curvature does not depend
on the choice of coordinate system. For the surface of a unit sphere, the value of the curvature
is +1. The curvature of flat space is zero, and the curvature of an open hyperboloid is -1. It
is useful to picture the three types of curvature geometrically (Figure 4). The properties
of the three cases are:
• k=0: Flat, Euclidean geometry. The sum of angles in a triangle is 180◦ .
• k=1: Closed, spherical geometry. The sum of angles in a triangle is greater than 180◦ .
• k=-1: Open, hyperbolic geometry. The sum of angles in a triangle is less than 180◦ . The
standard analogy for visualization is a saddle, where all directions extend to infinity.
Since the value of the curvature is invariant, there can be no global coordinate transformation that converts a curved metric, such as the surface of a sphere, into the metric of flat
spacetime. In other words, there is no mapping x = x(θ, φ), y = y(θ, φ), z = z(θ, φ) that
converts the metric for a unit sphere to
\[ ds^2 = dx^2 + dy^2 + dz^2 . \tag{100} \]
This is why, for example, flat maps of the world always have some intrinsic distortion in
them.
Similarly, there is no coordinate transformation that converts the metric of special relativity (called the Minkowski metric)
\[ ds^2 = c^2 dt^2 - dx^2 \tag{101} \]
into a curved geometry.
4.4 The Robertson-Walker Metric
We have looked at examples of metrics for a unit sphere and for special relativity. Let
us now turn our attention to the question of whether we can construct a metric that is
valid in a cosmological context. Assume that (1) the cosmological principle is true, and (2)
each point in spacetime has one and only one co-moving, timelike geodesic passing through
it. Assumption (2) is equivalent to assuming the existence of worldwide simultaneity or
universal time. Then for a co-moving observer, there is a metric for the universe called the
Robertson-Walker metric, or sometimes the Friedmann-Lemaître-Robertson-Walker metric
(named after the people who originally derived it). The Robertson-Walker metric is
\[ ds^2 = (c\,dt)^2 - a(t)^2\left[\frac{d\tilde r^2}{1 - k\tilde r^2} + \tilde r^2 d\eta^2\right] , \tag{102} \]
where k is the sign of the curvature (k = −1, 0, 1), a(t) is the scale factor, and r̃ is the
co-moving distance. The dη term is short-hand for the solid angle,
\[ d\eta^2 = \sin^2\theta\,d\phi^2 + d\theta^2 . \tag{103} \]
For a given curvature, this metric completely specifies the geometry of the universe to
within one undetermined factor, a(t), which is determined from the Friedmann equations.
Together, the Friedmann equations and Robertson-Walker metric completely describe the
geometry.
The above form of the metric is the one given in the text; however, there are in fact three
commonly used forms for the metric,
\[ ds^2 = (c\,dt)^2 - a(t)^2\left[d\bar r^2 + \left(\frac{\sin k\bar r}{k}\right)^2 d\eta^2\right] , \tag{104} \]
\[ ds^2 = (c\,dt)^2 - \frac{a(t)^2}{(1 + \frac{1}{4}kr^2)^2}\left[dr^2 + r^2 d\eta^2\right] , \tag{105} \]
\[ ds^2 = (c\,dt)^2 - a(t)^2\left[\frac{d\tilde r^2}{1 - k\tilde r^2} + \tilde r^2 d\eta^2\right] . \tag{106} \]
All three forms are equivalent, yielding the same value for the distance between two points.
Transformation between the forms is possible given the appropriate variable substitutions.
These transformations are left as a homework exercise.
In the above equations, k is the same curvature that we discussed in the context of
special relativity. The phrases "open" and "closed" now take on added significance in the
sense that, for Λ = 0, a "closed" universe will recollapse while an "open" universe will expand forever.
In contrast, the recent discovery that Λ ≠ 0 has given rise to the phrase: "Geometry is not
destiny". In the presence of a cosmological constant, the strict relation above does not hold.
4.4.1 Proper and Co-moving Distance
Given the above metric, we will be able to measure distances. Looking at the equation, let
us start with two distance definitions
• Proper Distance: Proper distance is defined as the actual spatial distance between
two co-moving observers. This distance is what you would actually measure, and is a
function of time as the universe expands.
• Co-moving (or coordinate) distance: The co-moving distance is defined such that
the distance between two co-moving observers is independent of time. The standard
practice is to define the co-moving distance at the present time t0 .
As an illustration, consider two co-moving observers currently separated by a proper
distance r0 . At any lookback time t, the proper separation will be
\[ D_P = (a/a_0)\,r_0 , \tag{107} \]
while the co-moving distance will be
\[ D_C = r_0 . \tag{108} \]
Note that it is a common practice to set a0 = 1.
4.4.2 Derivation of the Robertson-Walker Metric
We shall now derive the Robertson-Walker metric. While the metric can be derived by
several methods, we will go with a geometric approach for clarity. Consider an arbitrary
event (t, r) in spacetime. This event must lie within a spacelike 3D hypersurface within
which the universe everywhere appears identical to its appearance at the point in question
(homogeneity). The set of co-moving timelike geodesics (world lines of co-moving observers)
through each point on this hypersurface defines the universal time axis. The metric can then
be expressed in the form
\[ ds^2 = c^2 dt^2 - d\chi^2 , \tag{109} \]
where dχ is the distance measured within the spacelike hypersurface. There are no cross
terms dχdt because the time axis must be perpendicular to the hypersurface; otherwise the
cross term would pick out a preferred spacelike direction, thus violating isotropy. If
we choose a polar coordinate system, then dχ^2 can be written in the form
\[ d\chi^2 = Q(r,t)\left[d\tilde r^2 + \tilde r^2 d\eta^2\right] , \tag{110} \]
where Q(r, t) includes both the time and spatial dependence. Again by isotropy, all cross
terms like drdη must vanish. The second term inside the brackets can have a different
coefficient than the first term, but we have the freedom to define r so that the coefficients
are the same.
The proper distance δx between two radial points r and r + δr is
\[ \delta x = Q^{1/2}\,\delta r . \tag{111} \]
Locally, geometry is Euclidean, and local Galilean invariance implies that Hubble’s law is
valid:
\[ H(t) = \frac{1}{\delta x}\frac{\partial}{\partial t}\delta x = \frac{1}{2Q}\frac{\partial Q}{\partial t} . \tag{112} \]
Hubble’s law must be independent of position, r, because of the Cosmological Principle.
Therefore Q(r, t) must be separable,
\[ Q(r,t) = a^2(t)\,G(r) , \tag{113} \]
so the metric is
\[ ds^2 = c^2 dt^2 - a^2(t)G(r)\left[dr^2 + r^2 d\eta^2\right] . \tag{114} \]
Let us now transform the radial coordinates to
\[ d\chi^2 = d\tilde r^2 + F^2(\tilde r)\,d\eta^2 \tag{115} \]
using the change of variables
\[ F(\tilde r) = G^{1/2}(r)\,r , \tag{116} \]
\[ d\tilde r = G^{1/2}(r)\,dr . \tag{117} \]
For a Euclidean geometry,
\[ d\chi^2 = d\tilde r^2 + \tilde r^2\,d\eta^2 , \tag{118} \]
so F(r̃) = r̃ in the Euclidean case. Since spacetime locally appears Euclidean, we therefore
require in the limit r̃ → 0 that F(0) = 0 and F′(0) = 1.
Now consider the triangles below.
If the angles α, β, γ are small, and if x, y, z are proper distances, we get 3 identities:
\[ F(\tilde r)\alpha = F(\epsilon + \tau)\gamma , \tag{119} \]
\[ F(\tilde r + \epsilon + \tau)\alpha = F(\epsilon + \tau)\beta , \tag{120} \]
\[ F(\tilde r + \epsilon)\alpha = F(\epsilon)\beta + F(\tau)\gamma . \tag{121} \]
Eliminating β and γ from the three equations, we get
\[ F(\tilde r + \epsilon) = F(\epsilon)\frac{F(\tilde r + \epsilon + \tau)}{F(\epsilon + \tau)} + F(\tau)\frac{F(\tilde r)}{F(\epsilon + \tau)} , \tag{122} \]
\[ F(\epsilon + \tau)F(\tilde r + \epsilon) = F(\epsilon)F(\tilde r + \epsilon + \tau) + F(\tau)F(\tilde r) . \tag{123} \]
Figure 5 Geometric Derivation of Robertson-Walker Metric
Take the limit ε → 0 and expand to first order in ε:
\[ \left[F(\tau) + \epsilon F'(\tau)\right]\left[F(\tilde r) + \epsilon F'(\tilde r)\right] = \epsilon F(\tilde r + \tau) + F(\tau)F(\tilde r) , \tag{124} \]
\[ F(\tau)F(\tilde r) + \epsilon F(\tau)F'(\tilde r) + \epsilon F'(\tau)F(\tilde r) + \epsilon^2 F'(\tau)F'(\tilde r) = \epsilon F(\tilde r + \tau) + F(\tau)F(\tilde r) , \tag{125} \]
\[ F(\tilde r)F'(\tau) + F(\tau)F'(\tilde r) = F(\tilde r + \tau) . \tag{126} \]
Expand to second order in τ:
\[ F(\tilde r)\left[1 + \tau F''(0) + \tfrac{1}{2}\tau^2 F'''(0)\right] + F'(\tilde r)\left[F(0) + \tau F'(0) + \tfrac{1}{2}\tau^2 F''(0)\right] = F(\tilde r) + \tau F'(\tilde r) + \tfrac{1}{2}\tau^2 F''(\tilde r) , \tag{127} \]
or, using the limits for F(0) and F′(0), the first order terms give
\[ F''(0) = 0 , \tag{128} \]
and the second order terms give
\[ F''(\tilde r) = F'''(0)F(\tilde r) . \tag{129} \]
Define k ≡ (−F′′′(0))^{1/2}. Then
\[ F''(\tilde r) = -k^2 F(\tilde r) , \tag{130} \]
and this has the general solution
\[ F(\tilde r) = A\sin(k\tilde r + B) . \tag{131} \]
From the boundary conditions, F(0) = 0 implies B = 0, and F′(0) = 1 implies kA = 1.
Therefore, the solution is
\[ F(\tilde r) = \frac{\sin k\tilde r}{k} . \tag{132} \]
Verify the third derivative:
\[ F'''(0) = -k^2\cos 0 = -k^2 . \tag{133} \]
Correct. The sign of k determines the nature of the solution:
• k = 1 → F (r̃) = sin r̃
• k = 0 → F (r̃) = r̃
• k = −1 → F (r̃) = sinh r̃.
Thus, we have the Robertson-Walker metric,
\[ ds^2 = (c\,dt)^2 - a(t)^2\left[d\bar r^2 + \left(\frac{\sin k\bar r}{k}\right)^2 d\eta^2\right] , \tag{134} \]
which can be converted to the other standard forms.

5 Redshift
OK. Stepping back for a second, we now have a means of describing the evolution of the
size of the universe (Friedmann equation) and of measuring distances within the universe
(Robertson-Walker metric). It’s time to recast these items in terms of observable quantities
and use this machinery to develop a more concise description of our Universe. We don't directly
observe the scale factor, a(t), but we can observe the cosmological redshift of objects due
to the expansion of the universe. As you may recall, the Doppler shift of light (redshift or
blueshift) is defined as
\[ z = \frac{\lambda_o - \lambda_e}{\lambda_e} = \frac{\nu_e - \nu_o}{\nu_o} , \tag{135} \]
where λ_o and λ_e are the observed and emitted wavelengths, and ν_o and ν_e are the corresponding frequencies. This can be recast in terms of frequency as
\[ 1 + z = \frac{\nu_e}{\nu_o} . \tag{136} \]
We know that light travels along null geodesics (ds = 0). Therefore, for light travelling
to us (i.e. along the radial direction) the RW metric implies
\[ c^2 dt^2 = a^2\frac{dr^2}{1 - kr^2} , \tag{137} \]
\[ \frac{c\,dt}{a} = \frac{dr}{\sqrt{1 - kr^2}} = f(r) . \tag{138} \]
Consider two photons at distance R, emitted at times t_e and t_e + δt_e, that are observed at
times t_o and t_o + δt_o. Since both are emitted at distance R, f(r) is the same and
\[ \int_{t_e}^{t_o}\frac{c\,dt}{a} = \int_{t_e+\delta t_e}^{t_o+\delta t_o}\frac{c\,dt}{a} . \tag{139} \]
If δt_e is small, then the above equation becomes
\[ \frac{\delta t_o}{a_o} = \frac{\delta t_e}{a_e} , \tag{140} \]
\[ \nu_o a_o = \nu_e a_e , \tag{141} \]
\[ \frac{\nu_e}{\nu_o} = \frac{a_o}{a_e} = 1 + z , \tag{142} \]
where the last relation comes from the definition of redshift. Taking a_o to be now (t_0), and
defining a_0 ≡ 1 we therefore have the final relation
\[ a = \frac{1}{1 + z} . \tag{143} \]
Note that there is a one-to-one correspondence between redshift and scale factor – and
hence also time. The variables z, a, and t are therefore interchangeable. From this point on,
we will work in terms of redshift since this is an observable quantity. We do, however, need
to be aware that the cosmological expansion is not the only source of redshift. The other
sources are
• Gravitational redshift: Light emitted from deep within a gravitational potential well
will be redshifted as it escapes. This effect can be the dominant source of redshift in
some cases, such as light emitted from near the event horizon of a black hole.
• Peculiar velocities: Any motion relative to the uniform expansion will also yield a
Doppler shift. Galaxies (and stars for that matter) do not move uniformly with the
expansion, but rather have peculiar velocities relative to the Hubble flow of several
hundred km s−1 – or even > 1000 km s−1 for galaxies in clusters. In fact, some of the
nearest galaxies to us are blueshifted rather than redshifted. This motion, which is
a natural consequence of gravitational attraction, dominates the observed redshift for
nearby galaxies.
The total observed redshift for all three sources is
\[ (1 + z) = (1 + z_{\rm cosmological})(1 + z_{\rm grav})(1 + z_{\rm pec}) . \tag{144} \]
Also, between two points at redshifts z_1 and z_2 (z_1 being larger), the relative redshift is
\[ 1 + z_{12} = \frac{1 + z_1}{1 + z_2} = \frac{a_2}{a_1} . \tag{145} \]
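Since the sources combine multiplicatively (equation 144) rather than additively, a tiny sketch (mine; the component values are arbitrary) may help:

```python
# Redshift components combine multiplicatively, eq. (144).
def total_redshift(z_cosmo, z_grav, z_pec):
    """Total observed z from cosmological, gravitational, and peculiar parts."""
    return (1.0 + z_cosmo) * (1.0 + z_grav) * (1.0 + z_pec) - 1.0

# Arbitrary values; z_pec < 0 is a blueshift (peculiar motion toward us).
print(total_redshift(0.5, 1e-5, -0.001))   # ~0.4985, not 0.5 + 1e-5 - 0.001
```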
6 The Friedmann Equations 1: Observable Quantities
Recall again the Friedmann equation,
\[ \dot a^2 + kc^2 = \frac{8\pi G}{3}\rho a^2 + \frac{\Lambda c^2}{3}a^2 . \tag{146} \]
We will now recast this in a simpler form corresponding to observable quantities. First,
let us list and define these quantities.
6.1 The Hubble parameter (H)
We have previously defined the Hubble parameter as
\[ H = \frac{\dot a}{a} , \tag{147} \]
and the Hubble constant as
\[ H_0 = \frac{\dot a_0}{a_0} . \tag{148} \]

7 The density parameter (Ω0)
We have previously defined the density parameter as the ratio of the actual density to the
critical density at the current time (t0). The critical density ρ_c is the density required to
just halt the expansion of the universe for models with Λ = 0, and is given by
\[ \rho_c = \frac{3H_0^2}{8\pi G} . \tag{149} \]
The matter density parameter at the current time is thus
\[ \Omega_0 = \frac{\rho_0}{\rho_c} = \frac{8\pi G\rho_0}{3H_0^2} . \tag{150} \]

8 The cosmological constant density parameter (ΩΛ)
Consider an empty universe (Ω0 = 0). The "critical" value of the cosmological constant is
defined as the value required for a flat universe in this model (k = 0). Specifically, for time
t0 the Friedmann equation above becomes
\[ \frac{\dot a_0^2}{a_0^2} - \frac{\Lambda_c c^2}{3} = 0 , \tag{151} \]
\[ \Lambda_c = \frac{3H_0^2}{c^2} . \tag{152} \]
The parameter ΩΛ is defined as
\[ \Omega_\Lambda = \frac{\Lambda}{\Lambda_c} = \frac{\Lambda c^2}{3H_0^2} . \tag{153} \]
This is basically a statement describing the contribution of the energy density in the cosmological constant as a fraction of the total required to close the universe.
9 The Observable Friedmann Equation
Using the above equations, let's now proceed to recast the Friedmann equation:
\[ \dot a^2 + kc^2 = \frac{8\pi G}{3}\rho a^2 + \frac{\Lambda c^2}{3}a^2 \tag{154} \]
\[ = a^2\left[\frac{8\pi G\rho_0}{3H_0^2}\frac{\rho}{\rho_0}H_0^2 + \frac{\Lambda c^2}{3H_0^2}H_0^2\right] , \tag{155} \]
\[ \dot a^2 = a^2 H_0^2\left[\Omega_0\frac{\rho}{\rho_0} + \Omega_\Lambda - \frac{kc^2}{H_0^2 a^2}\right] , \tag{156} \]
\[ H^2 = H_0^2\left[\Omega_0\frac{\rho}{\rho_0} + \Omega_\Lambda - \frac{kc^2}{H_0^2 a^2}\right] . \tag{157} \]
Now, at time t0 (taking a0 = 1),
\[ H_0^2 = H_0^2\left[\Omega_0\frac{\rho_0}{\rho_0} + \Omega_\Lambda - \frac{kc^2}{H_0^2 a_0^2}\right] , \tag{158} \]
\[ \Omega_0 + \Omega_\Lambda - \frac{kc^2}{H_0^2} = 1 , \tag{159} \]
\[ \Omega_k \equiv -\frac{kc^2}{H_0^2} , \tag{160} \]
\[ \Omega_0 + \Omega_\Lambda + \Omega_k = 1 , \tag{161} \]
where we have now defined the curvature term in terms of the other quantities,
\[ \Omega_k = 1 - \Omega_0 - \Omega_\Lambda . \tag{162} \]
This tells us that the general description of the evolution of the scale factor, in terms of redshift, is

H² = H0² [ Ω0(ρ/ρ0) + ΩΛ + Ωk(1 + z)² ]  (163)

or

H² = H0² [ Ω0(ρ/ρ0) + ΩΛ + (1 − Ω0 − ΩΛ)(1 + z)² ].  (164)

This definition is commonly written as H = H0 E(z), where

E(z) = [ Ω0(ρ/ρ0) + ΩΛ + (1 − Ω0 − ΩΛ)(1 + z)² ]^{1/2}.  (165)
10 The Equation of State

OK – looks like we’re making progress. Now, what is ρ/ρ0? Well, we worked this out earlier for pressureless, non-relativistic matter, assuming adiabatic expansion of the universe – ρ ∝ (1 + z)³. However, ρ is an expression for the total energy density. We need to correctly model the evolution of the density for each component, which requires us to use the appropriate equation of state for each component.

Recall that for the matter case, we started with the adiabatic assumption

p dV = −dE,  (166)

p da³ = −d(ρa³),  (167)

and set p = 0. Let us now assume a more general equation of state,

p = (1 − γ)ρc² = wρc².  (168)
In general w is defined as the ratio of the pressure to the energy density. One can (and people do) invent more complicated equations of state, such as p = (1 − γ)ρc² + p0, where w is no longer defined by the simple relation above, but the above equation is the standard generalization that encompasses most models. For this generalization,

wρ da³ = −ρ da³ − a³ dρ,  (169)

(1 + w)ρ da³ = −a³ dρ,  (170)

dρ/ρ = −(1 + w) da³/a³,  (171)

ρ = ρ0 (a/a0)^{−3(1+w)} = ρ0 (1 + z)^{3(1+w)}.  (172)
For the “dust-filled” universe case that we discussed before, which corresponds to non-relativistic, pressureless material, we had w = 0. In this case, the above equation reduces to ρ = ρ0(1 + z)³. More generally, a non-relativistic fluid or gas can be described by a somewhat more complicated equation of state that includes the pressure. For an ideal gas with thermal energy much smaller than the rest mass (kBT << mpc²), matter density ρm, and adiabatic index γ,

p = nkBT = (ρm/mp)kBT = ρc² (kBT/(mpc²)) [1 + kBT/((γ − 1)mpc²)]^{−1} = w(T)ρc².  (173)
In most instances w(T) << 1, and the gas is well-approximated by the dust case.
———————————
Aside on adiabatic processes

As a reminder, an adiabatic process is defined by

PV^γ = constant,  (174)

where γ is called the adiabatic index. For an ideal gas, we know from basic thermodynamics that

pV = nkBT;  (175)

E = (3/2)nkBT.  (176)

The equation of state for an ideal gas can be obtained in the following fashion. Integrating

dE = −p dV = −C V^{−γ} dV,  (177)

one gets

E = −C V^{1−γ}/(1 − γ) = PV/(γ − 1) = nkBT/(γ − 1).  (178)

It is now simple to see that the total energy density is

ρ = ρm + ρ_kBT = ρm [1 + kBT/((γ − 1)mpc²)].  (179)
———————————

At the other extreme, for photons and ultra-relativistic particles, where the rest mass makes a negligible contribution to the energy density, w = 1/3. In this case, ρ ∝ (1 + z)⁴. Thus, the radiation and matter densities have different dependences on redshift.

For radiation, the added 1 + z term can be understood physically as corresponding to the redshifting of the light. Since E ∝ ν ∝ 1/(1 + z), the energy of the received photons is a factor of 1 + z less than that of the emitted photons.
What about other equations of state described by other values of w? As we noted earlier, the special case of w = −1 is indistinguishable from a cosmological constant. More generally, let us consider constraints on arbitrary values of w. The adiabatic sound speed for a fluid is

vs = (∂p/∂ρ)^{1/2} = (wc²)^{1/2}  (180)

[where the derivative is taken at constant entropy], so for w > 1 the sound speed is greater than the speed of light, which is unphysical. Thus, we require that w ≤ 1; all values up to one are physically possible. The range 0 ≤ w ≤ 1 is called the Zel’dovich interval. This interval contains the full range of matter- to radiation-dominated equations of state (0 ≤ w ≤ 1/3), as well as any other equations of state where the pressure increases with the energy density. Exploring equations of state with w < 0 is currently a hot topic in cosmology as a means of distinguishing exotic dark energy models from a cosmological constant. Additionally, in the above discussion we have generally made the approximation that w is independent of time. For the ideal gas case, which depends upon temperature, this is not the case, since the temperature will change with the expansion. More generally, for the negative-w cases there is also a great deal of effort being put into models where w varies with time. We will likely talk more about these topics later in the semester.
11 Back to the Friedmann Equation

For now, let us return to the present topic, which is the Friedmann equation in terms of observable quantities. What is the appropriate expression for ρ/ρ0 that we should insert into the equation? Well, we know that in general the universe can include multiple constituents with different densities and equations of state, so the E(z) expression in the Friedmann equation should really be expressed as a summation over all these components,

E(z) = [ Σi Ω0i(1 + z)^{3(1+wi)} + (1 − Σi Ω0i)(1 + z)² ]^{1/2}.  (181)
To be more concrete, if we consider the main components to be matter (Ω0M), radiation (Ω0r), neutrinos (Ω0ν), a cosmological constant (Ω0Λ), and any unknown exotic component (Ω0X), then the equation becomes

E(z) = [ Ω0M(1 + z)³ + Ω0r(1 + z)⁴ + Ω0ν(1 + z)⁴ + Ω0Λ + Ω0X(1 + z)^{3(1+wX)} + Ωk(1 + z)² ]^{1/2},  (182)

where

Ωk = 1 − Ω0M − Ω0r − Ω0ν − Ω0Λ − Ω0X.  (183)
When people talk about dark energy, they’re basically suggesting replacing the Ω0Λ term with the Ω0X term with −1 < wX < 0. The radiation and neutrino densities are currently orders of magnitude lower than the matter density, so in most textbooks you will see the simpler expression

E(z) = [ Ω0M(1 + z)³ + Ω0Λ + (1 − Ω0M − Ω0Λ)(1 + z)² ]^{1/2}.  (184)
The expression for E(z) can be considered the fundamental component of the Friedmann
equation upon which our measures for the distance and evolution of other quantities will be
based.
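To make this concrete, here is a minimal numerical sketch of equation (182) in Python; the parameter defaults are illustrative placeholders, not measured inputs:

    import numpy as np

    def E(z, Om=0.27, Orad=0.0, Onu=0.0, OL=0.73, OX=0.0, wX=-1.0):
        """Dimensionless Hubble function E(z) = H(z)/H0, equation (182)."""
        Ok = 1.0 - Om - Orad - Onu - OL - OX          # curvature term, equation (183)
        return np.sqrt(Om*(1+z)**3 + (Orad+Onu)*(1+z)**4 + OL
                       + OX*(1+z)**(3*(1+wX)) + Ok*(1+z)**2)

    print(E(1.0))   # H(z=1)/H0 ~ 1.70 for a flat Om=0.27, OL=0.73 model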
So, given the above expression for E(z) (whichever you prefer), what does this tell us about all the other possible observable quantities? We have already seen that

H = H0 E(z);  (185)

ρ = ρ0(1 + z)^{3(1+w)}.  (186)

12 Distances, Volumes, and Times

Cosmography is the measurement of the Universe. We’re now ready to take a look at how we can measure various distances and times.
12.1 Hubble Time and Hubble Distance

The simplest time that we can define is the Hubble time,

tH = 1/H0,  (188)

which is roughly (actually slightly greater than) the age of the universe. The simplest distance that we can define is the Hubble distance, the distance that light travels in a Hubble time,

DH = c tH = c/H0.  (189)
12.2 Radial Comoving Distance

Now, if we want to know the radial (line-of-sight) comoving distance between ourselves and an object at redshift z,

DC ≡ ∫₀^r̃ dr̃′/√(1 − kr̃′²) = ∫_{te}^{t0} c dt/a,  (190)

DC = c ∫_a^{a0} da′/(a′ ȧ′) = c ∫_a^{a0} da′/(a′² H),  (191)

and using a = (1 + z)^{−1} and da = −dz/(1 + z)²,

DC = ∫₀^z c dz′/H(z′) = (c/H0) ∫₀^z dz′/E(z′),  (192)

DC = DH ∫₀^z dz′/E(z′).  (193)
This can also be derived directly from Hubble’s law, v = Hd. Recalling that v = cz, for a small distance change ∆d,

∆v = c∆z = H∆d,  (194)

DC = ∫ dd = ∫₀^z c dz/H = DH ∫₀^z dz/E(z).  (195)
We shall see below that all other distances can be expressed in terms of the radial comoving
distance.
Finally, note that at the start of this section we used

DC = ∫₀^r̃ dr̃/√(1 − kr̃²),  (196)

which relates DC to r̃. We could just as easily have used

DC = ∫₀^r dr/(1 + kr²/4),  (197)

or

DC = ∫₀^r̄ dr̄ = r̄.  (198)

The important thing is to be consistent in your definition of r when relating to other quantities!
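As an illustration, a minimal sketch of the integral in equation (193), evaluated numerically; the value H0 = 70 km s⁻¹ Mpc⁻¹ is an assumed placeholder:

    import numpy as np
    from scipy.integrate import quad

    c_kms = 2.998e5                  # speed of light [km/s]
    H0 = 70.0                        # assumed Hubble constant [km/s/Mpc]
    DH = c_kms / H0                  # Hubble distance [Mpc], equation (189)

    def E(z, Om=0.27, OL=0.73):
        Ok = 1.0 - Om - OL
        return np.sqrt(Om*(1+z)**3 + OL + Ok*(1+z)**2)

    def D_C(z):
        """Radial comoving distance, equation (193)."""
        integral, _ = quad(lambda zp: 1.0/E(zp), 0.0, z)
        return DH * integral

    print(D_C(1.0))    # ~3360 Mpc for the flat concordance parameters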
12.3 Transverse Comoving Distance

Now consider two events at the same redshift that are separated by some angle δθ. The comoving distance between these two objects, known as the transverse comoving distance or the proper motion distance, is defined by the coefficient of the angular term in the RW metric. Using the r̄ version of the metric,

DM = k^{−1/2} sin(k^{1/2} r̄) = k^{−1/2} sin(k^{1/2} DC),  (199)

which for the three cases of curvature corresponds to

DM = sinh DC    (k = −1, Ωk > 0)  (200)
DM = DC    (k = 0, Ωk = 0)  (201)
DM = sin DC    (k = +1, Ωk < 0)  (202)
Note that David Hogg posted a nice set of notes about cosmography, which are widely used, on astro-ph (astro-ph/9905116). In these notes, he instead recasts the equations in terms of Ωk and DH, which gives the transverse comoving distance as:

DM = (DH/√Ωk) sinh(√Ωk DC/DH)    (k = −1, Ωk > 0)  (203)
DM = DC    (k = 0, Ωk = 0)  (204)
DM = (DH/√|Ωk|) sin(√|Ωk| DC/DH)    (k = +1, Ωk < 0)  (205)

This is equivalent to our formulation above.
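In code, Hogg’s curvature-dependent form is a few lines; a minimal sketch (DC and DH must share the same length units):

    import numpy as np

    def D_M(DC, DH, Ok):
        """Transverse comoving distance, equations (203)-(205)."""
        if Ok > 0:                                    # open, k = -1
            return DH/np.sqrt(Ok) * np.sinh(np.sqrt(Ok)*DC/DH)
        if Ok < 0:                                    # closed, k = +1
            return DH/np.sqrt(-Ok) * np.sin(np.sqrt(-Ok)*DC/DH)
        return DC                                     # flat, k = 0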
12.4 Angular Diameter Distance

The angular diameter distance relates an object’s physical transverse size to its angular size. It is defined such that for a rod with proper length l,

l = a k^{−1/2} sin(k^{1/2} r̄) dθ ≡ DA dθ,  (207)

or

DA = a k^{−1/2} sin(k^{1/2} r̄) = DM/(1 + z).  (208)

Note that we are using proper distance because physically we typically care about the actual size of the observed source (say, the size of a star-forming region or galaxy) rather than some comoving scale.

It is of interest to note that the angular diameter distance does not increase indefinitely. At large redshift the (1 + z)^{−1} term dominates and DA decreases, so the angular size of an object of fixed proper size eventually grows with redshift. In practice, the maximum of DA (and hence the minimum angular size) occurs at z ∼ 1 for the observed cosmological parameters.
12.5 Comoving Area and Volume

It is also often of interest to measure volumes so that one can determine the density of the objects being observed (e.g. galaxies or quasars). In this instance, what one typically cares about is the comoving volume, since you want to know how the population is evolving (and hence the comoving density is changing) rather than how the scale factor is changing the proper density. The differential comoving volume is simply the product of the differential comoving area and the comoving radial extent of the volume element,

dVC = dAC dDC.  (209)

The comoving area is simply defined from the solid angle term of the RW metric,

dAC = [k^{−1/2} sin(k^{1/2} r̄)]² sin θ dθ dφ,  (210)

dAC = [k^{−1/2} sin(k^{1/2} r̄)]² dΩ,  (211)

dAC = DM² dΩ.  (212)

Using the above information in the volume relation, we get

dVC = (DH dz/E(z)) DM² dΩ = DH (1 + z)² DA²/E(z) dΩ dz,  (214)

or

dVC/(dΩ dz) = DH (1 + z)² DA²/E(z) = DH DM²/E(z).  (215)
The integral over the full sky, out to redshift z, gives the total comoving volume within
that redshift. It will likely be a homework assignment for you to derive an analytic solution
for this volume and plot it for different values of Ω0 and ΩΛ .
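As a warm-up, a minimal numerical sketch of equation (215) integrated over the full sky, again taking H0 = 70 km s⁻¹ Mpc⁻¹ as a placeholder and a flat model (so DM = DC):

    import numpy as np
    from scipy.integrate import quad

    c_kms, H0 = 2.998e5, 70.0                    # assumed H0 [km/s/Mpc]
    DH = c_kms / H0
    E = lambda z: np.sqrt(0.27*(1+z)**3 + 0.73)
    D_C = lambda z: DH * quad(lambda x: 1.0/E(x), 0.0, z)[0]

    def dV_dz(z):
        """Full-sky differential comoving volume: 4*pi times equation (215)."""
        return 4.0*np.pi * DH * D_C(z)**2 / E(z)

    V, _ = quad(dV_dz, 0.0, 1.0)                 # comoving volume out to z = 1
    print(V / 1e9)                               # ~160 Gpc^3 (1 Gpc^3 = 1e9 Mpc^3)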
12.6 Luminosity Distance

OK – so at this point we have a means of measuring comoving distances and volumes and figuring out how large something is. What about figuring out the luminosity of a source? The luminosity distance to an object is defined such that the observed flux, f, is

f = L/(4πDL²),  (216)

just as in the Euclidean case, where L is the bolometric luminosity of the source. Now, looking at this from a physical perspective, the flux is going to be the observed luminosity divided by the area of a spherical surface passing through the observer. This sphere should have area 4π(a0 r̄)² = 4πr̄². Additionally, the observed luminosity differs from the intrinsic luminosity of the source. During their flight the photons are redshifted by a factor of (1 + z) – so the energy is decreased by this factor – and time dilation also dilutes the incident flux by a factor of (1 + z), since δt0 = (1 + z)δte.
The net effect then is that

f = Lobs/(4πr̄²) = L(1 + z)^{−2}/(4πr̄²),  (217)

or

DL = r̄(1 + z) = DM(1 + z) = DA(1 + z)².  (218)
Note the very different redshift dependences of the angular diameter and luminosity
distances. While the angular diameter distance eventually decreases, the luminosity distance
is monotonic. This is good, as otherwise the flux could diverge at large redshift!
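A short sketch contrasting the two distances numerically (same placeholder parameters as the earlier sketches):

    import numpy as np
    from scipy.integrate import quad

    c_kms, H0 = 2.998e5, 70.0                    # assumed H0 [km/s/Mpc]
    DH = c_kms / H0
    E = lambda z: np.sqrt(0.27*(1+z)**3 + 0.73)
    D_M = lambda z: DH * quad(lambda x: 1.0/E(x), 0.0, z)[0]   # flat: D_M = D_C

    D_L = lambda z: (1+z) * D_M(z)               # equation (218)
    D_A = lambda z: D_M(z) / (1+z)               # equation (208)

    for z in (0.5, 1.0, 2.0, 5.0):
        # D_A turns over near z ~ 1-2 while D_L increases monotonically
        print(z, round(D_A(z)), round(D_L(z)))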
12.7 Flux from a Fixed Passband: k-corrections

On a related practical note, the luminosity distance above is defined for a bolometric luminosity. In astronomy, one always observes the flux within some fixed passband. For any spectrum the differential flux fν, which is the flux at frequency ν within a passband of width ∆ν, is related to the differential luminosity Lν by

fν = (∆ν′/∆ν)(Lν′/Lν) × Lν/(4πDL²),  (219)

where ν′ is the emitted frequency and is related to ν by ν′ = (1 + z)ν. Similarly, Lν′ is the emitted luminosity at frequency ν′.

The first term in the expression accounts for the change in the width of the passband due to the redshift. Consider two emitted frequencies ν1′ and ν2′. These are related to the observed frequencies by

ν1′ = (1 + z)ν1;  (220)
ν2′ = (1 + z)ν2;  (221)
ν1′ − ν2′ = (1 + z)(ν1 − ν2);  (222)
∆ν′/∆ν = (1 + z).  (223)
The second term accounts for the fact that you are looking at a different part of the spectrum than you would be in the rest frame. This quantity will be one for a source with a flat spectrum. Thus, the expression for the observed flux is

fν = (1 + z)(Lν(1+z)/Lν) × Lν/(4πDL²).  (224)

It is worth noting that it is a common practice in astronomy to look at the quantity νfν, because this eliminates the (1 + z) redshifting of the passband. Since ν = νe/(1 + z),

νfν = νe Lνe/(4πDL²),  (225)

where νe = ν(1 + z) is the emitted frequency.
12.8 Lookback time and the age of the Universe

Equivalent to asking how far away an object lies, one can also ask how long ago the observed photons left that object. This quantity is called the lookback time. The definition of the lookback time is straightforward,

tL = ∫_t^{t0} dt = ∫_a^{a0} da/ȧ = ∫_a^{a0} da/(aH),  (226)

tL = ∫₀^z dz/((1 + z)H0E(z)) = tH ∫₀^z dz/((1 + z)E(z)).  (227)

The complement of the lookback time is the age of the universe at redshift z, which is simply the integral from z to infinity of the same quantity,

tU = tH ∫_z^∞ dz/((1 + z)E(z)).  (228)
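A minimal numerical check of equations (227) and (228), with H0 = 70 km s⁻¹ Mpc⁻¹ assumed (so tH ≈ 14 Gyr):

    import numpy as np
    from scipy.integrate import quad

    tH = 9.78e9 / 0.7                            # Hubble time [yr] for h = 0.7
    E = lambda z: np.sqrt(0.27*(1+z)**3 + 0.73)

    t_L = lambda z: tH * quad(lambda x: 1.0/((1+x)*E(x)), 0.0, z)[0]      # eq (227)
    t_U = lambda z: tH * quad(lambda x: 1.0/((1+x)*E(x)), z, np.inf)[0]   # eq (228)

    print(t_L(1.0)/1e9, t_U(0.0)/1e9)   # ~7.9 Gyr lookback to z=1; age ~13.9 Gyr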
12.9 Surface Brightness Dimming

While we are venturing into the realm of observable quantities, another that is of particular relevance to observers is the surface brightness of an object, which is the flux per unit solid angle. In the previous sections, we have just seen that for a source of a given luminosity f ∝ DL^{−2}, and for a source of a given size,

dΩ = dθdφ ∝ DA^{−2}.  (229)

We also know that DL = DA(1 + z)².

From this information one can quickly show that

Σ = f/dΩ ∝ DA²/DL² ∝ (1 + z)^{−4}.  (230)

The above equation, which quantifies the effect of cosmological dimming, is an important result. It says that the observed surface brightness of objects must decrease very rapidly as one moves to high redshift purely due to cosmology, and that this effect is completely independent of the cosmological parameters.
12.10 Deceleration Parameter

There is one additional quantity that should be mentioned in this section, which is primarily of historical significance but also somewhat useful for physical intuition. There was a period during the mid-20th century when observational cosmology was considered essentially a quest for two parameters, the Hubble constant (H0) and the deceleration parameter (q0). The idea was that measurement of the instantaneous velocity and deceleration at the present time would completely specify the time evolution. The deceleration parameter is defined by

q0 = −ä0 a0/ȧ0² = −ä0/(H0² a0),  (231)

which originates from a Taylor expansion for the scale factor at low redshift,

a = a0 [1 + H0(t − t0) − (1/2)q0 H0²(t − t0)² + ...].  (232)

For a general Friedmann model, the deceleration parameter is given by

q0 = (1/2)Ω0 − ΩΛ.  (233)
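Equation (233) is trivial to evaluate; a one-line check against the q0 values quoted later in Table 1:

    def q0(Om, OL):
        """Deceleration parameter, equation (233)."""
        return 0.5*Om - OL

    print(q0(1.0, 0.0), q0(0.27, 0.73))   # 0.5 (EdS), -0.595 ~ -0.6 (concordance)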
13 The Steady-State Universe

Although we won’t go into this topic during the current semester, it is worth pointing out that there have been proposed alternatives to the standard cosmological model that we have presented thus far. One that is of particular historical interest is the “steady-state” universe. The steady-state universe follows from the perfect cosmological principle, which states that the universe is isotropic and homogeneous in time as well as space. This means that all observable quantities must be constant in time, and that all observers must observe the same properties for the universe no matter when or where they live. It does not mean that the universe is motionless – a flowing river or glacier has motion but does not change with time (global warming aside). The expansion of the universe implies that the scale factor (which is not itself a directly observable quantity) must increase with time.

The metric must again be the RW metric because the cosmological principle is contained within the perfect cosmological principle. For the steady-state universe, the curvature must be k = 0. Otherwise, the three-dimensional spatial curvature (ka^{−2}), which is an observable quantity, varies with time as a changes. Similarly, the Hubble parameter must be a true constant, which implies that

a/a0 = exp[H(t − t0)],  (234)

and the metric must be

ds² = c²dt² − e^{2Ht}[dr² + r²(dθ² + sin²θ dφ²)].  (235)

Note that for the steady-state universe,

q = q0 = −ä0 a0/ȧ0² = −1.  (236)
The mean density of the universe is also observable, which requires that ρ be constant even though the universe is expanding. This requires the continuous creation of matter at a uniform rate per unit volume that just counterbalances the effect of the expansion,

a^{−3} d(ρa³)/dt = 3ρH ∼ 3 × 10^{−47} g cm⁻³ s⁻¹.  (237)

In this model galaxies are constantly forming and evolving in such a way that the mean observed properties do not change. Usually creation of hydrogen and helium was assumed, but in principle the created matter could have been anything. Continuous creation of neutrons, the so-called hot steady-state model, was ruled out because it predicted too large an X-ray background via n → p + e⁻ + ν̄e + γ. Hoyle considered a modified version of GR that no longer conserves mass, and he found a way to obtain the steady-state universe with ρ = ρcrit.

This model is of course only of historical interest – it was originally proposed by Bondi and Gold in 1948, when H0 was thought to be an order of magnitude larger than the currently accepted value. The larger value, and hence younger age of the Universe, resulted in the classic age problem, in which the Universe was younger than some of the stars it contains. The discovery of the black-body microwave background proved to be the fatal blow for this model.
14 Horizons

The discussion of lookback time naturally leads to the issue of horizons – how far we can see. There are two kinds of horizons of interest in cosmology. One represents a horizon of events, while the other represents a horizon of world lines.

The event horizon is the boundary of the set of events from which light can never reach us. You are all probably familiar with the term event horizon in the context of black holes. In the cosmological context, event horizons arise because of the expansion of the universe. In a sense, the universe is expanding so fast that light will never get here. The other type of horizon, the particle horizon, is the boundary of the set of events from which light has not yet had time to reach us.

Consider first event horizons. Imagine a photon emitted towards us at (t1, r1). This photon travels on a null geodesic,

∫_{t1}^t c dt/a = ∫_r^{r1} dr/(1 + kr²/4).  (238)
As t increases, the distance r will decrease as the photon gets closer. The photon lies outside the event horizon if r > 0 at t = ∞ – i.e. if the light never reaches us. Put differently, if

∫_{t1}^∞ c dt/a = ∞,  (239)

then light can reach everywhere, so there is no event horizon; an event horizon exists if and only if

∫_{t1}^∞ c dt/a < ∞.  (240)

Note that for a closed universe, which recollapses, the upper limit is usually set to tcrunch, the time when the universe has recollapsed. The hypersurface corresponding to the event horizon is

∫_{t1}^∞ c dt/a = ∫₀^{r1} dr/(1 + kr²/4).  (241)

For the Einstein-de Sitter universe, where k = 0 and Λ = 0, there is no event horizon. Intuitively, this should make sense because the Einstein-de Sitter universe expands forever, but with an expansion rate asymptotically approaching zero. On the other hand, the steady-state universe does have one, with

r1 = (c/H)e^{−Ht1}.  (242)
This can be seen in that

c ∫_{t1}^∞ dt/a = c ∫_{t1}^∞ e^{−Ht} dt = (c/H)e^{−Ht1}.  (243)
Now, the particle horizon exists only if

∫₀^t c dt/a < ∞.  (244)

For the steady-state universe, the lower limit of the time integral should be −∞, and it is clear that this universe does not, in fact, have a particle horizon. The Einstein-de Sitter universe, for which a ∝ t^{2/3} (which can be derived going back to the section on the lookback time), does have a particle horizon, at

rph ∝ ∫₀^t c dt/t^{2/3} ∝ 3ct^{1/3}.  (245)

Hence one measure of the physical size of the particle horizon at any time t is the proper distance a rph = 3ct. All non-empty isotropic general relativistic cosmologies have a particle horizon.
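A quick numeric check of equation (245), in units where c = 1 and t0 = 1 (so the analytic comoving horizon is 3t^{1/3} = 3):

    from scipy.integrate import quad

    a = lambda t: t**(2.0/3.0)                   # Einstein-de Sitter scale factor
    # comoving particle horizon at t = 1 (integrable singularity at t = 0)
    r_ph, _ = quad(lambda t: 1.0/a(t), 0.0, 1.0)
    print(r_ph)                                  # -> 3.0; proper size a*r_ph = 3ct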
Horizons have a host of interesting properties, some of which are listed below:
1. If there is no event horizon, any event can be observed at any other event.
2. Every galaxy within the event horizon must eventually pass out of the event horizon. This must be true because the integral in equation 241 is a monotonically decreasing function of t1.
3. In big bang models, particles crossing the particle horizon are seen initially with infinite
redshift since the emission occurred at a(temission ) = 0.
4. If both an event horizon and a particle horizon exist, they must eventually cross each
other. Specifically, at some time t, the size of the event horizon corresponding to those
events occurring at time t will equal the size of the particle horizon. This can be seen
as a natural consequence of the previous statements, as the event horizon shrinks with
time while the particle horizon grows.
15 Exploring the Friedmann Models

Having derived a general description for the evolution of the universe, let us now explore how that time evolution depends upon the properties of the universe. Specifically, let us explore the dependence upon the density of the different components and the presence (or absence) of a cosmological constant. In all cases below we will consider only single-component models.
Before we begin though, let us return for a moment to a brief discussion from an earlier
lecture. We discussed that for Λ = 0 the curvature alone determines the fate of the universe.
For a universe with positive curvature, gravity eventually reverses the expansion and the
universe recollapses. For a universe with zero curvature, gravity is sufficient to asymptotically
halt the expansion, but the universe never recollapses. Meanwhile, for a universe with
negative curvature the expansion slows but never stops (analogous to a rocket with velocity
greater than escape velocity).
In the case of a cosmological constant, the above is no longer true. Geometry alone does
not determine the destiny of the universe. Instead, since the cosmological constant dominates
at late times, the sign of the cosmological constant determines the late-time evolution (with
the exception of cases where the matter density is >> the critical density and the universe
recollapses before the cosmological constant has any effect). A positive cosmological constant
ensures eternal expansion; a negative cosmological constant leads to eventual recollapse. This
can be seen in a figure that I will show (showed) in class.
15.1 Empty Universe

A completely empty universe with Λ = 0 has the following properties:

H = H0(1 + z)  (246)

q = q0 = 0  (247)

t0 = H0⁻¹  (248)

Such a universe is said to be “coasting” because there is no gravitational attraction to decelerate the expansion.

In contrast, an empty universe with ΩΛ = 1 has:

H = H0  (250)

q0 = −1  (251)

This universe, which has an accelerating expansion, can be considered the limiting case at late times for a universe dominated by a cosmological constant.
15.2 Einstein - de Sitter (EdS) Universe

The Einstein - de Sitter universe, which we have discussed previously, is a flat model in which Ω0 = 1. By definition, this universe has a Euclidean geometry and the following properties:

H = H0(1 + z)^{3(1+w)/2}  (253)

q = q0 = (1 + 3w)/2  (254)

t0 = 2/(3(1 + w)H0)  (255)

ρ = ρ0c (t/t0)^{−2} = 1/(6π(1 + w)²Gt²).  (256)

The importance of the EdS model is that at early times all Friedmann models with w > −1/3 are well-approximated as an EdS model. This can be seen in a straightforward fashion by looking at E(z),

E(z) ≡ [ Ω0(1 + z)^{3(1+w)} + ΩΛ + (1 − Ω0 − ΩΛ)(1 + z)² ]^{1/2}.  (258)
As z → ∞, the cosmological constant and curvature terms become unimportant as long as
w > −1/3.
15.3 Concordance Model

Current observations indicate that the actual Universe is well-described by a spatially flat, dust-filled model with a non-zero cosmological constant. Specifically, the data indicate that Ω0 ≈ 0.27 and ΩΛ ≈ 0.73. This particular model has

q0 ≈ −0.6  (259)

t0 ≈ tH.  (260)

It is interesting that this model yields an age very close to the Hubble time (consistent to within the observational uncertainties), as this is not a generic property of spatially flat
Table 1. Comparison of Different Cosmological Models

Name                    Ω0     ΩΛ     t0           q0
Einstein-de Sitter      1      0      (2/3) tH     0.5
Empty, no Λ             0      0      tH           0
Example Open            0.3    0      0.82 tH      0.15
Example Closed          2      0      0.58 tH      1
Example Flat, Lambda    0.5    0.5    0.84 tH      −0.25
Concordance             0.27   0.73   ≈ 1.001 tH   −0.6
Steady State            1      0      ∞            −1

Note. — The ages presume a matter-dominated (dust) model.
models with a cosmological constant (see Table 1). I have not seen any discussion of this “coincidence” in the literature. It is also worth pointing out that the values cited assume the presence of a cosmological constant (i.e. w = −1) rather than some other form of dark energy.
15.4 General Behavior of Different Classes of Models

We have talked about the time evolution of the Λ = 0 models, and have also talked about the accelerated late-time expansion in Λ > 0 models. A figure shown in class illustrates the range of expansion histories that can occur once one includes a cosmological constant. Of particular note are the so-called “loitering” models. In these models, the energy density is sufficient to nearly halt the expansion, but right at the point where the cosmological constant becomes dominant. Essentially, the expansion rate temporarily drops to near zero, followed by a period of accelerated expansion that at late times looks like the standard Λ-dominated universe. What this means is that a large amount of time corresponds to a narrow redshift interval, so observationally there exists a preferred redshift range during which a great deal of evolution (stellar/galaxy) occurs. Having a loitering period in the past requires that Ω0Λ > 1, and therefore is not consistent with the current data. Finally, to gain a physical intuition for the different types of models, there is a nice javascript application at http://www.jb.man.ac.uk/∼jpl/cosmo/friedman.html.
16 Classical Cosmological Tests

Week 4 Reading Assignment: §4.7

All right. To wrap up this section of the class, it’s time to spend a little while on something fun – classical cosmological tests, which are basically the application of the above theory to the real Universe. There are three fundamental classical tests that have been used with varying degrees of success: number counts, the angular size - redshift relation, and the magnitude - redshift relation.
16.1 Number Counts

The basic idea here is that the volume is a function of the cosmological parameters, and therefore for a given class of objects the redshift distribution, N(z), will depend upon Ω0 and ΩΛ.

To see this, let us try a simple example. Assume that we have a uniformly distributed population of objects with mean comoving density n0. Within a given redshift interval dz, the differential number of these objects (dN) is given by

dN/dz = n (dVP/dz),  (261)
where the proper density n and the proper volume element dVP are given by

n = n0(1 + z)³  (262)

dVP = A dDP = (DM² dΩ/(1 + z)²) × c dz/((1 + z)H) = c DM² dΩ dz/((1 + z)³ H0 E(z)).  (263)

Inserting this into the above definition,

dN/dz = n (dVP/dz) = dΩ n0 c DM²/(H0 E(z)),  (264)

or, writing DM in terms of the comoving distance for the flat case (DM = DC),

dN/dz = dΩ n0 c DC²/(H0 E(z)).  (265)
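As an illustrative sketch of equation (265) (the density n0 and H0 below are arbitrary placeholders):

    import numpy as np
    from scipy.integrate import quad

    c_kms, H0 = 2.998e5, 70.0                    # assumed H0 [km/s/Mpc]
    DH = c_kms / H0
    E = lambda z: np.sqrt(0.27*(1+z)**3 + 0.73)
    D_C = lambda z: DH * quad(lambda x: 1.0/E(x), 0.0, z)[0]

    def dN_dz(z, n0=1e-3, Omega=4*np.pi):
        """Redshift distribution of a non-evolving comoving density n0 [Mpc^-3],
        equation (265); full sky by default."""
        return Omega * n0 * DH * D_C(z)**2 / E(z)

    print(dN_dz(1.0))    # objects per unit redshift at z = 1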
The challenge with this test, as with all the others, is finding a suitable set of sources
that either do not evolve with redshift, or evolve in a way that is physically well understood.
16.2 Angular Size - Redshift Relation
If one has a measuring stick with fixed proper length, then measuring the angular diameter
versus redshift is an obvious test of geometry and expansion. We have already seen (and I
will show again in class), that the angular diameter distance has an interesting redshift dependence, and is a function of the combination of Ω0 and ΩΛ . A comparison with observation
should in principle directly constrain these two parameters.
In practice, there are several issues that crop up which make this test difficult. First, there
is the issue of defining your standard ruler, as most objects (like galaxies) evolve significantly
over cosmologically interesting distance scales. I will leave the discussion of possible sources
and systematic issues to the observational cosmology class, but will note that there is indeed
one additional fundamental concern. Specifically, the relation that we derived is valid for a
homogeneous universe. In practice, we know that the matter distribution is clumpy. We will
not go into detail on this issue, but it is worth pointing out that gravitational focusing can
flatten out the angular size - redshift relation.
16.3 Alcock-Paczynski Test
The Alcock-Paczynski (Alcock & Paczynski 1979) test perhaps should not be included in the “classical” section since it is relatively modern, but I include it here because it is another geometric test in the same spirit as the others. The basic idea here is as follows. Assume that at some redshift you have a spherical source (the original proposal was a galaxy cluster). In this case, the proper distance measured along the line of sight should be equal to the proper distance measured in the plane of the sky. If one inputs incorrect values for Ω0 and ΩΛ, then the sphere will appear distorted in one of the two directions.
Mathematically, the sizes are

LOS size = dDC/(1 + z) = c dz/((1 + z)H0E(z)),  (266)

Angular size = DA dθ = DM dθ/(1 + z),  (267)

so

dz/dθ = H0E(z)DM/c,  (268)

or in the more standard form

(1/z)(dz/dθ) = H0E(z)DM/(cz).  (269)
In practice, the idea is to average over a number of sources that you expect to be spherical
such that the relation holds in a statistical sense. This test remains of interest in a modern
context, primarily in the application of measuring the mean separations between an ensemble
of uniformly distributed objects (say galaxies). In this case, the mean separation in redshift
and mean angular separation should again satisfy the above relation.
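A sketch of the observable in equation (268) for two parameter sets (placeholder H0; flat models, so DM = DC):

    import numpy as np
    from scipy.integrate import quad

    c_kms, H0 = 2.998e5, 70.0                    # assumed H0 [km/s/Mpc]

    def E(z, Om, OL):
        return np.sqrt(Om*(1+z)**3 + OL + (1-Om-OL)*(1+z)**2)

    def dz_dtheta(z, Om, OL):
        """Alcock-Paczynski observable, equation (268): H0 E(z) D_M / c."""
        DM = (c_kms/H0) * quad(lambda x: 1.0/E(x, Om, OL), 0.0, z)[0]
        return (H0/c_kms) * E(z, Om, OL) * DM

    # a sphere appears undistorted only for the true (Om, OL); compare two models:
    print(dz_dtheta(1.0, 0.27, 0.73), dz_dtheta(1.0, 1.0, 0.0))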
16.4 Magnitude - Redshift Relation

The magnitude - redshift relation utilizes the luminosity distance to constrain the combination of Ω0 − ΩΛ. We have seen that the incident bolometric flux from a source is described by

f = L/(4πDL²),  (270)

which expressed in astronomical magnitudes (m ∝ −2.5 log f) becomes

m = M + 2.5 log(4π) + 5 log DL(z, Ω0, ΩΛ),  (271)

where the redshift dependence and its sensitivity to the density parameters is fully encapsulated in DL. This test requires that you have a class of sources with the same intrinsic luminosity at all redshifts – so-called “standard candles”. Furthermore, real observations are not bolometric, which means that you must include passband effects and k-corrections. The basic principle is however the same.

Figure 6: Pretend that there is a figure here showing the SN mag-z relation.
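A sketch of the apparent-magnitude computation for a standard candle; the absolute magnitude M = −19.3 (roughly a type Ia supernova) and H0 are placeholder inputs:

    import numpy as np
    from scipy.integrate import quad

    c_kms, H0 = 2.998e5, 70.0                    # assumed H0 [km/s/Mpc]
    E = lambda z, Om, OL: np.sqrt(Om*(1+z)**3 + OL + (1-Om-OL)*(1+z)**2)

    def m_apparent(z, M=-19.3, Om=0.27, OL=0.73):
        """Magnitude-redshift relation via the distance modulus m - M = 5 log10(D_L/10 pc)."""
        DC = (c_kms/H0) * quad(lambda x: 1.0/E(x, Om, OL), 0.0, z)[0]
        DL = (1+z) * DC                          # flat case, equation (218)
        return M + 5.0*np.log10(DL*1e6/10.0)     # D_L converted from Mpc to pc

    # the difference between models (~0.4 mag at z=0.5) is the acceleration signal:
    print(m_apparent(0.5, Om=0.27, OL=0.73) - m_apparent(0.5, Om=1.0, OL=0.0))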
The greatest challenge associated with this test lies in the identification of well-understood
standard candles, and the history of attempted application of this method is both long and
interesting. Early attempts included a number of different sources, with perhaps the most
famous being brightest cluster galaxies.
Application of the magnitude-redshift relation to type Ia supernovae provided the first
evidence for an accelerated expansion, and remains a cosmological test of key relevance.
It is hoped that refinement of the supernovae measurements, coupled with other modern
cosmological tests, will also provide a precision constraint upon w. Achieving this goal will
require addressing a number of systematics, including some fundamental issues like bias
induced by gravitational focusing of supernovae. These issues are left for the observational
cosmology course. It is interesting to note though that this test took the better part of a
century to yield meaningful observational constraints!
17 The Hot Big Bang Universe: An overview
Up to this point we have been concerned with the geometry of the universe and measuring
distances within the universe. For the next large section of the course we are going to turn
our attention to evolution of matter in the universe. Before delving into details though, let’s
begin with a brief overview of the time evolution of the constituent particles and fundamental
forces in the Universe.
If we look around at the present time, the radiation from the baryons in our local part
of the universe – stars, galaxies, galaxy clusters – reveals a complex network of structure.
Observations of the microwave background also show that the radiation energy density, and
temperature, are low. Three fundamental questions in cosmology are:
• Can we explain the observed structures in the universe in a self-consistent cosmological
model?
• Can we explain the observed cosmic background radiation?
• Can we explain the abundances of light elements within the same model?
Table 2. Timeline of the Evolution of the Universe

Event                       tU                         TU                    Notes
Planck Time                 10⁻⁴³ s                    10¹⁹ GeV              GR breaks down
Strong Force                10⁻³⁶ s                    10¹⁴ GeV              GUT
Inflation Era               10⁻³⁶ − 10⁻³² s
Weak Force                  10⁻¹² s
Quark-Hadron Transition     10⁻⁵ s                     300 MeV               Hadrons form
Lepton Era                  10⁻⁵ − 10⁻² s              130 MeV − 500 keV     e⁺ − e⁻ annihilation
Nucleosynthesis             10⁻² − 10² s               ∼ 1 MeV               Light elements form
Radiation-Matter Equality   50,000 yrs (z = 3454)      9400 K
Recombination               372,000 yrs (z = 1088)     2970 K (0.3 eV)       CMB
Reionization                ∼ 10⁸ yrs (z = 6 − 20)     50 K
Galaxy Formation            Reionization til now       50 − 2.7 K
Present Day                 13.7 Gyrs                  2.7 K (∼ 10⁻⁴ eV)
The answer to these three questions is largely yes for a hot big bang cosmological model – coupled with inflation to address a few residual details for the second of these questions. For now, we will begin with a broad overview and then explore different critical epochs in greater detail.
The basic picture for the time evolution of the universe is that of an adiabatically expanding, monotonically cooling fluid undergoing a series of phase transitions, with the global
structure defined by the RW metric and Friedmann equations. A standard timeline denoting
major events in the history of the universe typically looks something like Table 2.
Note that our direct observations are limited to t > 372 kyrs, while nucleosynthesis
constraints probe to t ∼ 1 s. The table, however, indicates that much of the action in
establishing what we see in the present-day universe occurs at even earlier times t < 1s.
From terrestrial experiments and the standard model of particle physics, we believe that we have a reasonable description up to ∼ 10⁻³² s, although the details get sketchier as we move to progressively earlier times. At t ∼ 10⁻³² s there is a postulated period of superluminal
expansion (the motivations for which we will discuss later), potentially driven by a change
in the equation of state. At earlier times, we expect that sufficiently high temperatures
are reached that the strong force is unified with the weak and electromagnetic forces, and
eventually a sufficiently high temperature (density) is reached that a theory of quantum
gravity (which currently does not exist in a coherent form) is required. The story at early
times though remains very much a speculative tale; as we shall see there are ways to avoid
ever reaching the Planck density.
Now, looking back at the above table, in some sense it contains several categories of
events. One category corresponds to the unification scales of the four fundamental forces.
A second corresponds to the evolution of particle species with changing temperatures. This
topic is commonly described as the thermal history of the universe. A third category describes
key events related to the Friedmann equations (radiation-matter equality, inflation). Finally,
the last few events in this table correspond to the formation and evolution of the large scale
structures that we see in the universe today. Clearly each of these subjects can fill a semester
(or more) by itself, and in truth all of these “categories” are quite interdependent. We will
aim to focus on specific parts of the overall picture that illuminate the overall evolutionary
history. For now, let us begin at the “beginning”.
18 The Planck Time

Definition

The Planck time (∼ 10⁻⁴³ s) corresponds to the limit in which Einstein’s equations are no longer valid and must be replaced with a more complete theory of quantum gravity if we wish to probe to earlier times. An often used summary of GR is that “space tells matter how to move; matter tells space how to curve”. The Planck time and length essentially correspond to the point at which the two cannot be considered as independent entities.

There are several ways to define the Planck time and Planck length. We will go through two. The first method starts with the Heisenberg uncertainty principle, and defines the Planck time as the point at which the uncertainty of the wavefunction is equal to the particle horizon of the universe,

∆x∆p = lP mP c = ħ,  (272)

where lP = ctP. Now, mP is the mass within the particle horizon, and

mP = ρP lP³.  (273)
At early times we know that ρ ∼ ρc and tU ∼ (1/2)H⁻¹, so the density can be approximated as

ρP ∼ ρc = 3H²/(8πG) ∼ 1/(GtP²) ∼ c²/(GlP²),  (274)

so

lP ≃ (Għ/c³)^{1/2} ≃ 2 × 10⁻³³ cm,  (275)

tP = lP/c ≃ (Għ/c⁵)^{1/2} ≃ 10⁻⁴³ s,  (276)

ρP ≃ 1/(GtP²) ≃ 4 × 10⁹³ g cm⁻³,  (277)

mP ≃ ρP lP³ ≃ (ħc/G)^{1/2} ≃ 3 × 10⁻⁵ g,  (278)

EP = mP c² ≃ (ħc⁵/G)^{1/2} ≃ 10¹⁹ GeV.  (279)

Additionally, we can also define a Planck temperature,

TP ≃ EP/k ≃ 10³² K.  (280)
Just for perspective, it is interesting to make a couple of comparisons. The Large Hadron
Collider (LHC), which will be the most advanced terrestrial accelerator when it becomes
fully operational, is capable of reaching energies E ∼ 7 × 103 GeV, or roughly 10−15 EP .
Meanwhile, the density of a neutron star is ρN ∼ 1014 g cm−3 , or roughly 10−79 ρP .
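These estimates are easy to verify directly from the fundamental constants; a quick sketch (cgs values rounded, so results agree with the above to within the order-unity factors dropped in the estimates):

    import numpy as np

    G, hbar, c, kB = 6.674e-8, 1.055e-27, 2.998e10, 1.381e-16   # cgs units

    l_P = np.sqrt(G*hbar/c**3)        # Planck length [cm]      ~ 1.6e-33
    t_P = l_P / c                     # Planck time [s]         ~ 5.4e-44
    m_P = np.sqrt(hbar*c/G)           # Planck mass [g]         ~ 2.2e-5
    E_P = m_P * c**2 / 1.602e-3       # Planck energy [GeV]     ~ 1.2e19
    T_P = m_P * c**2 / kB             # Planck temperature [K]  ~ 1.4e32
    rho_P = 1.0 / (G * t_P**2)        # Planck density [g/cm^3] ~ 5e93

    print(l_P, t_P, m_P, E_P, T_P, rho_P)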
The second way to think about the Planck length is in terms of the Compton wavelength and the Schwarzschild radius. To see this, consider a particle of mass m. The Compton length of the particle’s wavefunction is

λC ≡ ħ/∆p = ħ/(mc).  (281)

The Compton wavelength in essence defines the scale over which the wavefunction is localized. Now consider the Schwarzschild radius of a body of mass m,

rs = 2Gm/c².  (282)

By definition, any particle within the Schwarzschild radius lies beyond the event horizon and can never escape.

The Planck length can be defined as the scale at which the above two equations are equal. Equating the two relations, we find that the Planck mass is

mP = (ħc/2G)^{1/2} ≃ (ħc/G)^{1/2},  (283)

and the Planck length is

lP = (2G/c²)mP ≃ (Għ/c³)^{1/2},  (284)

from which the other definitions follow. Note that if the Schwarzschild radius were less than the wavefunction, then this would indicate that information (and mass) can escape from within. This would be equivalent to having a naked singularity.
Physical Interpretation
The notion of a Big Bang singularity at t = 0 is an idea that is somewhat ingrained in the common picture of the Big Bang model. In truth, we cannot presently say anything about the early universe at times smaller than the Planck time, and it is not at all clear that a complete theory of quantum gravity would lead to an initial singularity. In this light, the notion of t = 0 is indeed more a matter of convention than of physics. The Planck time should therefore be thought of as the age of the universe when it has the Planck density IF one uniformly extrapolates the expansion to earlier times.

Moreover, it is also physically plausible that the real Universe never reaches the Planck density. Consider again the equation of state of the universe. As discussed in Chapter 2 of your book, there is no initial singularity if w < −1/3. To see this, we return to the Friedmann equations. Recall that

ä = −(4/3)πGρ(1 + 3p/(ρc²))a.  (285)
It is clear that

ä < 0 if p/(ρc²) > −1/3,  (286)

ä > 0 if p/(ρc²) < −1/3.  (287)
In the latter case, the expansion is accelerating with time, so conversely as you look back to
earlier times you fail to approach an initial singularity.
Fluids with w < −1/3 are considered to violate the strong energy condition. Physically, how might one violate this condition? The simplest option is to relax our implicit assumption that matter can be described as an ideal fluid. Instead, consider a generalized imperfect fluid – one which can have thermal conductivity (χ), shear viscosity (η), and bulk viscosity (ζ).

We cannot introduce thermal conductivity or shear viscosity without violating the CP; however, it is possible for the fluid to have a bulk viscosity. In the Euler equation this would look like

ρ[dv/dt + (v · ∇)v] = −∇p + ζ∇(∇ · v).  (289)

The net effect upon the Friedmann equations (which we will not derive right now) is to replace p with an effective pressure p∗,

p → p∗ = p − 3ζH.  (290)

With this redefinition it is possible to get homogeneous and isotropic solutions that never reach the Planck density if ζ > 0.
In fact, there are actually physical motivations for having an early period of exponential
growth in the scale factor, which we will discuss in the context of inflation later in the term.
Given our lack of knowledge of the equation of state close to the Planck time, the above
scenario remains plausible. Having briefly looked at the earliest time, we now shift focus and
will spend a while talking about the evolution from the lepton era through recombination.
19 Temperature Evolution, Recombination and Decoupling

Before exploring the thermal history of the universe in the big bang model, we first need to know how the temperature scales with redshift. This will give us our first glimpse of the cosmic microwave background. Below we are concerned with matter and radiation temperatures when the two are thermally decoupled and evolving independently.
19.1 The Adiabatic and LTE Assumptions

Throughout this course we have been making the assumption that the matter and radiation distributions are well-approximated as an adiabatically expanding ideal fluid. For much of the early history of the universe, we will also be assuming that this fluid is approximately in local thermodynamic equilibrium (LTE). It is worth digressing for a few moments to discuss why these are both reasonable assumptions.

Adiabatic Expansion

In classical thermodynamics, an expansion is considered to be adiabatic if it is “fast” in the sense that the gas is unable to transfer heat to/from an external reservoir on a timescale less than the expansion. The converse would be an isothermal expansion, in which the pressure/volume are changed sufficiently slowly that the gas can transfer heat to/from an external reservoir and maintain a constant temperature. Mathematically, the above condition for adiabatic expansion corresponds to PV^γ = constant and dE = −P dV.

In the case of the universe, the assumption of adiabatic expansion is a basic consequence of the fact that there is no external reservoir with which to exchange heat. Having a non-adiabatic expansion would require a means of transferring heat between our universe and some external system (an adjacent brane?). Moreover, this heat transfer would need to occur on a timescale t << tH for the adiabatic assumption to fail. One can always postulate scenarios in which there is such a transfer (e.g. the steady-state model); however, there is no physical motivation for doing so at present. Indeed, the success of Big Bang nucleosynthesis can be considered a good argument back to t ∼ 1 s for the sufficiency of the adiabatic assumption.
Local Thermodynamic Equilibrium (LTE)
The condition of LTE implies that the processes acting to thermalize the fluid must occur
rapidly enough to maintain equilibrium. In an expanding universe, this is roughly equivalent
to saying that the collision timescale is less than a Hubble time, τ ≤ tH . Equivalently, one
can also say that the interaction rate, Γ ≡ nσ|v| ≥ H. Here n is the number density of
particles, σ is the interaction cross-section, and |v| is the amplitude of the velocity.
Physically, the above comes from noting that T ∝ a⁻¹ (which we will derive shortly) and hence Ṫ/T = −H, which says that the rate of change in the temperature is just set by the expansion rate. Once the interaction rate drops below H, the average particle has a mean free path larger than the Hubble distance, and hence that species of particle evolves independently from the radiation field henceforth. It is worth noting that one cannot assume a departure from thermal equilibrium just because a species is no longer interacting – it is possible for the temperature evolution to be the same as that of the radiation field if no additional processes are acting on either the species or the radiation field.
19.2 Non-relativistic matter

In this section we will look at the temperature evolution of non-relativistic matter in the case where the matter is decoupled from the radiation field. If we assume that the matter can be described as an adiabatically expanding ideal gas, then we know

dE = −P dV;  (291)

E = U + KE = [ρm c² + (3/2)(ρm kB Tm/mp)] a³;  (292)

P = nkB Tm = ρm kB Tm/mp,  (293)

and putting these equations together we can quickly see

d{[ρm c² + (3/2)(ρm kB Tm/mp)] a³} = −(ρm kB Tm/mp) da³.  (294)

Mass conservation requires also that ρm a³ is constant, so

(3/2)(ρm kB a³/mp) dTm = −(ρm kB Tm/mp) da³,  (295)

dTm/Tm = −(2/3) da³/a³,  (296)

Tm = T0m (a0/a)² = T0m (1 + z)².  (297)

So we see that the temperature of the matter distribution goes as (1 + z)².
19.3 Radiation and Relativistic Matter

What about radiation? For a gas of photons, it is straightforward to derive the redshift dependence. The relation between the energy density and temperature for a black body is simply

⟨u⟩ ≡ ρr c² = σr Tr⁴,  (298)

and the pressure is

p = (1/3)ρc² = σr T⁴/3.  (299)

Note that you may have seen this before in the context of the luminosity of a star with temperature T. The quantity σr is the radiation density constant, which in most places you’ll see written as a instead. The value of the radiation constant is

σr = π²kB⁴/(15ħ³c³) = 7.6 × 10⁻¹⁵ erg cm⁻³ K⁻⁴.  (300)

We know from a previous class that

ρr ∝ (1 + z)⁴,  (301)

which tells us that

Tr = T0r(1 + z).  (302)

This is true for any relativistic species, and more generally the temperature of any particle species that is coupled to the radiation field will have this dependence.

One could also derive the same expression using the adiabatic expression

d(σr T⁴a³) = −(σr T⁴/3) da³,  (303)

4T³ dT a³ + T⁴ da³ = −(T⁴/3) da³,  (304)

dT/T = −(1/3) da³/a³,  (305)

T ∝ a⁻¹ ∝ (1 + z).  (306)
19.4 Temperature Evolution Prior to Decoupling

In the above two sections we have looked at the temperature evolution of non-relativistic matter and radiation when they are evolving as independent, decoupled systems:

Tm = T0m(1 + z)²  (307)

Tr = T0r(1 + z).  (308)

What about before decoupling? In this case, the adiabatic assumption becomes

d{[ρm c² + (3/2)(ρm kB Tm/mp) + σr T⁴] a³} = −[(ρm kB Tm/mp) + σr T⁴/3] da³.  (310)

Mass conservation (valid after freeze-out) requires that ρm a³ = constant, as before. We are now going to introduce a dimensionless quantity σrad, which will be important in subsequent discussions. We define

σrad = 4mp σr T³/(3kB ρm).  (311)
Using this expression and mass conservation, the previous equation can be rewritten as

[(3ρm kB/(2mp)) + 4σr T³] a³ dT = −[(ρm kB T/mp) + (4/3)σr T⁴] da³,  (312)

dT [(3ρm kB/(2mp)) + 4σr T³] / [(ρm kB T/mp) + (4/3)σr T⁴] = −da³/a³,  (313)

[(3/2) + 3σrad] dT/T = −(1 + σrad) da³/a³,  (314)

dT/T = −[(1 + σrad)/((1/2) + σrad)] da/a.  (315)

Now, the above is non-trivial to integrate because σrad is in general a function of T³/ρm. Recall though, that after decoupling the radiation temperature T ∝ (1 + z), while the matter density ρm ∝ (1 + z)³. In this case, we have that σrad is constant after decoupling. Note that I have not justified here why it is OK to use the radiation temperature, but bear with me. To zeroth order you can consider σrad as roughly the ratio of the radiation to matter energy densities (to within a constant of order unity), which would have in it Tr⁴/Tm, so the temperature dependence is mostly from the radiation.
Anyway, if σrad is constant after decoupling, then we can compute the present value and take this as also valid at decoupling. Taking T0r = 2.73 K,

σrad(tdecoupling) ≃ σrad(t = t0) ≃ 1.35 × 10⁸ (Ωb h²)⁻¹.  (316)

This value is >> 1, which implies that to first order

dT/T ≃ −da/a,    T = T0r(1 + z).  (317)

This shows that even at decoupling, where we have non-negligible contributions from both the matter and the radiation, the temperature evolution is very well approximated as T ∝ (1 + z). At higher temperatures the matter becomes relativistic, and the temperature should therefore
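The value in equation (316) is easy to reproduce from equation (311) with present-day numbers; a sketch (the baryon density n0b = 1.12 × 10⁻⁵ Ωb h² cm⁻³ is taken from the photon-baryon section below):

    sigma_r = 7.566e-15      # radiation constant [erg cm^-3 K^-4]
    kB  = 1.381e-16          # erg/K
    m_p = 1.673e-24          # g
    T0r = 2.725              # K

    def sigma_rad_today(Obh2):
        """Equation (311) evaluated at t0, with rho_m the present baryon density."""
        rho_b = m_p * 1.12e-5 * Obh2             # g/cm^3
        return 4.0*m_p*sigma_r*T0r**3 / (3.0*kB*rho_b)

    print(sigma_rad_today(1.0))                  # ~1.3e8, cf. equation (316)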
evolve with the same redshift dependence.

20 A Thermodynamic Digression

Before proceeding further, it is worth stepping back for a moment and reviewing some basic thermodynamics and statistical mechanics. You should be all too familiar at this point with adiabatic expansion, but we haven’t yet discussed entropy, chemical potentials, or the equilibrium energy/momentum distributions of particles.
20.1 Entropy

Entropy is a fundamental quantity in thermodynamics that essentially describes the disorder of a system. The classical thermodynamic definition of entropy is given by

dS = dQ/T,  (318)

where S is the entropy and Q is the heat of the system. In a more relevant astrophysical context, the radiation entropy density, for instance, would be

sr = (ρc² + p)/T = (⟨u⟩ + p)/T.  (319)

The statistical mechanics definition is based instead upon the number of internal “micro-states” in a system (essentially internal degrees of freedom) for a given macro-state. As an example, consider the case of 10 coins. There is one macro-state corresponding to all heads, and also only 1 micro-state (configuration of the individual coins) that yields this macro-state. On the other hand, for the macro-state with 5 heads and 5 tails, there are C(10,5) combinations of individual coins – micro-states – that can yield a single macro-state. By this Boltzmann definition, the entropy is S = kB ln ω, where ω is the number of internal micro-states.
21 Chemical Potential

I’ve never liked the name chemical potential, as the ‘chemical’ part is mainly a historical artifact. What we’re really referring to here is the potential for electromagnetic and weak (and, at high T, strong) reactions between particles. In this particular instance, I’ll quote the definition of chemical potential from Kittel & Kroemer (page 118). Consider two systems that can exchange particles and energy. The two systems are in equilibrium with respect to particle exchange when the net particle flow is zero. In this case:

“The chemical potential governs the flow of particles between the systems, just as the temperature governs the flow of energy. If two systems with a single chemical species are at the same temperature and have the same value of the chemical potential, there will be no net particle flow and no net energy flow between them. If the chemical potentials of the two systems are different, particles will flow from the system at higher chemical potential to the system at lower chemical potential.”

In the cosmological context that we are considering, instead of looking at physically moving particles between two systems, what we are instead talking about is converting one species of particle to another. In this interpretation, what the above statement says is that if species of particles are in equilibrium (chemical equilibrium, but again I loathe the terminology), then the chemical potential of a given species is related to the potentials of the other species with which it interacts. For instance, consider four species a, b, c, d that interact as

a + b ←→ c + d.  (320)
For this reaction, µa + µb = µc + µd whenever chemical equilibrium holds.

For photons, µγ = 0, and indeed for all species in the early universe it is reasonable to approximate µi = 0 (i.e. µi << kT). This can be seen, for example, by considering the reaction

γ + γ ⇌ e⁺ + e⁻.  (321)

If this reaction is in thermal equilibrium (i.e. prior to pair annihilation), then the chemical potentials must satisfy

µ₊ + µ₋ = 0.  (322)

In addition, since

ne⁻ ≃ ne⁺ (to a fractional asymmetry of ∼ 10⁻⁹),  (323)

we expect

µ₊ ≃ µ₋.  (324)

Hence,

µ₊ ≃ µ₋ ≃ 0.  (325)
21.1 Distribution Functions

Long ago, in a statistical mechanics class far, far away, I imagine that most of you discussed distribution functions for particles. For species of indistinguishable particles in kinetic equilibrium the distribution of filled occupation states is given either by the Fermi-Dirac distribution (for fermions) or the Bose-Einstein distribution (for bosons), which are given by

f(p) = 1/(e^{(E−µ)/kT} ± 1),  (326)

where p in this equation is the particle momentum and E² = |p|²c² + (mc²)². The “+” corresponds to fermions and the “−” corresponds to bosons.

Physically, the reason that the equation is different for the two types of particles is due to their intrinsic properties. Fermions, which have half-integer spin, obey the Pauli exclusion principle, which means that no two identical fermions can occupy the same quantum state. Bosons, on the other hand, do not obey the Pauli exclusion principle, and hence multiple bosons can occupy the same quantum state. This is a bit of a digression at this point, so I refer the reader to Kittel & Kroemer for a more detailed explanation.

Note that the above distribution functions hold for indistinguishable particles. By definition, particles are indistinguishable if their wavefunctions overlap. Conversely, particles are considered distinguishable if their physical separation is large compared to their De Broglie wavelength. In the classical limit of distinguishable particles, the appropriate distribution function is the Boltzmann distribution function,

f(p) = e^{−(E−µ)/kT},  (327)
which can be seen to be the limiting case of the other distributions when kT << E (i.e. the non-relativistic limit).

Getting back to the current discussion, for any given species of particles that we will be discussing in the context of the early universe, the total number density of particles is found by integrating over the distribution function,

n = (g/h³) ∫ f(p) d³p.  (328)

In the above equation, the quantity g/h³ is the density of states available for occupation (as can be derived from a particle-in-a-box quantum mechanical argument). The quantity g specifically refers to the number of internal degrees of freedom. We will return to this in a moment.

Similar to the number density, the energy density can be written as

⟨u⟩ = ρc² = (g/h³) ∫ E(p) f(p) d³p.  (329)
From the distribution functions and definition of energy, these equations can be rewritten as

n = g/(2π²ħ³c³) ∫_{mc²}^∞ (E² − (mc²)²)^{1/2} E dE / (exp[(E − µ)/kT] ± 1),  (330)

ρc² = g/(2π²ħ³c³) ∫_{mc²}^∞ (E² − (mc²)²)^{1/2} E² dE / (exp[(E − µ)/kT] ± 1).  (331)

In the relativistic limit, the above equations become

n = g/(2π²ħ³c³) ∫₀^∞ E² dE / (exp[(E − µ)/kT] ± 1),  (332)

ρc² = g/(2π²ħ³c³) ∫₀^∞ E³ dE / (exp[(E − µ)/kT] ± 1).  (333)

Note that Kolb & Turner §3.3-3.4 is a good reference for this material.
Now, let us consider a specific example. Photons obey Bose-Einstein statistics, so the number density is

nγ = g/(2π²ħ³c³) ∫₀^∞ E² dE / (e^{(E−µ)/kT} − 1).  (334)

As we will discuss later, for photons the chemical potential is µγ = 0, so making the substitution x = E/kT the above equation becomes

nγ = (g/2π²)(kT/ħc)³ ∫₀^∞ x² dx/(e^x − 1).  (335)

It turns out that this integral corresponds to the Riemann zeta function, which is defined such that

ζ(n)Γ(n) = ∫₀^∞ x^{n−1} dx/(e^x − 1),  (336)
for integer values of n. Also, for photons g = 2 (again, we’ll discuss this in a moment). The equation for the number density of photons thus becomes

nγ = (1/π²)(kT/ħc)³ ζ(3)Γ(3) = (2ζ(3)/π²)(kT/ħc)³.  (337)

For the current Tr = 2.725 K, and given that ζ(3) ≃ 1.202, we have that n0γ ≃ 411 cm⁻³. Note that since nγ scales with T³, nγ ∝ (1 + z)³ for redshift intervals where no species are freezing out.
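A quick numerical check of equation (337), with rounded cgs constants:

    import numpy as np
    from scipy.special import zeta

    kB, hbar, c = 1.381e-16, 1.055e-27, 2.998e10     # cgs
    T0 = 2.725                                       # K

    n_gamma = 2.0*zeta(3)/np.pi**2 * (kB*T0/(hbar*c))**3
    print(n_gamma)                                   # ~411 photons cm^-3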
More generally, we have noted above that the chemical potential for all particle species
in the early universe is zero. Consequently, in a more general derivation it can be shown
that for any relativistic particle species
n_i = \frac{g_i}{2\pi^2}\left(\frac{k_B T}{\hbar c}\right)^3 \int_0^{\infty} \frac{x^2\, dx}{e^x \pm 1} = \alpha\, \frac{g_i\, \zeta(3)}{\pi^2}\left(\frac{k_B T}{\hbar c}\right)^3,    (338)
where α = 3/4 for fermions and α = 1 for bosons. Similarly, the energy density of a given
species is given by
\rho_i c^2 = \frac{g_i}{2\pi^2} \frac{(kT)^4}{(\hbar c)^3} \int_0^{\infty} \frac{x^3\, dx}{e^x \pm 1} = \beta\, \frac{g_i}{2}\, \sigma_r T^4,    (339)
where β = 7/8 for fermions and β = 1 for bosons.
For a multi-species fluid, the total energy density will therefore be
\rho c^2 = \left[\sum_{\rm bosons} g_i + \frac{7}{8}\sum_{\rm fermions} g_i\right] \frac{\sigma_r T^4}{2} \equiv g^* \frac{\sigma_r T^4}{2}.    (340)

21.2 What is g?
A missing link at this point is this mysterious g, which I said is the number of internal
degrees of freedom. In practice, what this means for both bosons and fermions is that
g = 2 \times {\rm spin} + 1.    (341)
For example, for a spin 1/2 electron or muon, g = 2. Two exceptions to this rule are photons
(g = 2) and neutrinos (g = 1), which each have one less degree of freedom than you might
expect from the above relation. The underlying reasons are unimportant for the current
discussion, but basically the photon is down by one because longitudinal E&M waves don’t
propagate, and neutrinos are down one because one helicity state does not exist.
In the above section we showed how to combine the g of the different particle species to
obtain an effective factor g ∗ for computing the energy density for a multispecies fluid. Just
to give one concrete (but not physical) example, consider having a fluid comprised of only
νe and µ+ . In this case,
g^* = 0 + \frac{7}{8}(1 + 2) = \frac{21}{8}.    (342)
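A minimal Python sketch of this bookkeeping (the function name is mine; it just mirrors Eq. (340)):

    def g_star(g_bosons, g_fermions):
        """Effective degrees of freedom for a relativistic multi-species fluid."""
        return sum(g_bosons) + (7.0 / 8.0) * sum(g_fermions)

    print(g_star([2], []))       # photons alone: g* = 2
    print(g_star([], [1, 2]))    # the nu_e + mu+ example of Eq. (342): 21/8 = 2.625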
22 Photon-Baryon Ratio
OK – time to return to cosmology from thermodynamics, although at this point we're still laying a bit of groundwork. One important quantity is the ratio between the present mean number densities of baryons (n0b) and photons (n0γ); the book actually defines the inverse of the ratio we will quote, η0 = n0b/n0γ. The present density of baryons is
n_{0b} = \frac{\rho_{0b}}{m_p} \simeq 1.12 \times 10^{-5}\, \Omega_{0b} h^2\ {\rm cm}^{-3}.    (343)
Meanwhile, we have now calculated the photon density in a previous section,
n_{0\gamma} = \frac{2\zeta(3)}{\pi^2}\left(\frac{k_B T_{0r}}{\hbar c}\right)^3 \simeq 420\ {\rm cm}^{-3}.    (344)

The photon-baryon ratio therefore is

\eta_0^{-1} = \frac{n_{0\gamma}}{n_{0b}} \simeq 3.75 \times 10^7\, (\Omega_{0b} h^2)^{-1}.    (345)
The importance of this quantity should become clearer, but for now the key thing to note is
that there are far more photons than baryons.
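A quick Python sketch of Eqs. (343)–(345), assuming Ω0bh² ≃ 0.02 for illustration:

    n0_gamma = 420.0               # cm^-3, Eq. (344)

    def inv_eta(Obh2):
        n0_b = 1.12e-5 * Obh2      # cm^-3, Eq. (343)
        return n0_gamma / n0_b

    print(inv_eta(0.02))           # ~ 1.9e9 photons per baryon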
23 Radiation Entropy per Baryon
Related to the above ratio, we can also ask: what is the ratio between the radiation entropy density and the baryon density? From our definition of entropy earlier, we have
s_r = \frac{\rho_r c^2 + p_r}{T} = \frac{4}{3}\frac{\rho_r c^2}{T} = \frac{4}{3}\sigma_r T^3.    (346)
We also know that the number density of baryons is
n_b = \rho_b / m_p.    (347)
Recalling that \sigma_{rad} = 4 m_p \sigma_r T_{0r}^3/(3 k_B \rho_{0b}), we can rewrite the equation for the entropy as
s_r = \sigma_{rad}\, k_B\, n_b,    (348)
or
\sigma_{rad} = \frac{s_r}{k_B n_b} = \frac{s_r}{k_B\, \eta\, n_\gamma},    (349)
which tells us that σrad , sr , and η −1 are all proportional. Your book actually takes the above
equation and fills in numbers to get σrad = 3.6η −1 to show that the constant is of order unity.
Finally, and perhaps more interestingly, σrad is also related to the primordial baryon-antibaryon asymmetry. Subsequent to the initial establishment of the asymmetry, (n_b - \bar{n}_b)a^3 must be a conserved quantity (conservation of baryon number). Moreover, in the
observed universe n̄b → 0, so n0b a3 is conserved. At early times when the baryon species are
in equilibrium, we have
n_b \simeq n_{\bar{b}} \simeq n_\gamma \propto T^3 \propto (1+z)^3.    (350)
At this stage the baryon asymmetry is expected to be
\frac{n_b - n_{\bar{b}}}{n_b + n_{\bar{b}}} \simeq \frac{n_b - n_{\bar{b}}}{2 n_\gamma} \simeq \frac{n_{0b}}{2 n_{0\gamma}} \simeq 1.8\, \sigma_{rad}^{-1}.    (351)
From a physical perspective, what this says is that the reason that σrad is so large, and
that there are so many more photons than baryons, is that the baryon-antibaryon asymmetry
is small.
24 Lepton Era
The lepton era corresponds to the time period when the universe is dominated by leptons, which as you will recall are particles that do not interact via the strong force. The three families of leptons are the electron (e±, νe, ν̄e), muon (µ±, νµ, ν̄µ), and tau (τ±, ντ, ν̄τ) families.
The lepton era begins when pions (a type of hadron with a short lifetime) freeze-out at
T ∼ 130 MeV, annihilating and/or decaying into photons. At the start of the lepton era,
the only species that are in equilibrium are the γ, e±, µ±, a small number of baryons, and
neutrinos (all 3 types).
It is of interest to calculate g ∗ at both the beginning and end of the lepton era (both for
practice and physical insight). At the start of the lepton era (right after pion annihilation),
g^*_{start} = 2 + \frac{7}{8} \times (2 \times 2 + 2 \times 2 + 3 \times 2) = 14.25,    (352)
where the terms correspond to the photons, electrons, muons, and neutrinos, respectively. [We'll ignore the baryons.] At the end of the lepton era, right after the electrons annihilate, we are
left with only the photons, so
g^*_{end} = 2.    (353)
We'll work out an example with neutrinos in a moment to show why this matters. The quick physical answer though is that when a species annihilates, g^* decreases and the radiation energy density increases (since particles are being converted into radiation). The value of g^* is used to quantify this jump in energy density and temperature.
To see this, consider that in this entire analysis we are treating the universe as an adiabatically expanding fluid. We discussed previously that this is equivalent to requiring that there is no heat transfer to an external system, or
dS \equiv \frac{dQ}{T} = 0.    (354)
In other words, entropy is conserved as the universe expands. We also discussed previously
that the entropy density is given by
s_r = \frac{\rho c^2 + p}{T},    (355)
which is well-approximated by the radiative components, giving
s_r = \frac{4}{3}\frac{\rho c^2}{T} = \frac{2}{3}\, g^* \sigma_r T^3.    (356)
Now consider pair annihilation of a particle species at temperature T . From conservation
of entropy, we require
s_{before} = s_{after},    (357)
\frac{2}{3}\, g^*_{before}\, \sigma_r T^3_{before} = \frac{2}{3}\, g^*_{after}\, \sigma_r T^3_{after},    (358)
g^*_{before}\, T^3_{before} = g^*_{after}\, T^3_{after},    (359)
or
T_{after} = T_{before}\left(\frac{g^*_{before}}{g^*_{after}}\right)^{1/3}.    (360)
Now, g ∗ is a decreasing function as particles leave equilibrium, so the above equation states
that the radiation temperature is always higher after annihilation of a species.
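A minimal Python sketch of Eq. (360); as an example it uses the e± annihilation values that we will work out in Section 24.3.1 (g* drops from 11/2 to 2):

    def T_after(T_before, g_before, g_after):
        """Photon temperature boost from entropy conservation, Eq. (360)."""
        return T_before * (g_before / g_after) ** (1.0 / 3.0)

    print(T_after(1.0, 11.0 / 2.0, 2.0))   # (11/4)^(1/3) ~ 1.401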
There are two relevant types of interactions during the lepton era that act to keep particles
in equilibrium – electromagnetic and weak interactions. Examples of the electromagnetic
interactions are
p + \bar{p} \rightleftharpoons 2\gamma \rightleftharpoons \pi^0 \rightleftharpoons e^+ + e^- \rightleftharpoons \mu^+ + \mu^- \rightleftharpoons \pi^+ + \pi^- \rightleftharpoons n + \bar{n},    (362)
and examples of weak interactions are
e^- + \mu^+ \rightleftharpoons \nu_e + \bar{\nu}_\mu,    (363)
e^- + e^+ \rightleftharpoons \nu_e + \bar{\nu}_e,    (364)
e^- + p \rightleftharpoons \nu_e + n,    (365)
e^- + \nu_e \rightleftharpoons e^- + \nu_e,    (366)
e^+ + \nu_e \rightleftharpoons e^+ + \nu_e.    (367)
The relevant cross-section for electromagnetic interactions is the Thomson cross-section (σT), while the weak interaction cross-section, σwk ∝ T^2, is given in your book.
Note that neutrinos feel the weak force, but not the electromagnetic force. This property,
coupled with the temperature dependence of the weak interaction cross-section, is the reason
that neutrinos are so difficult to detect.
24.1 Electrons
The electron-positron pairs remain in equilibrium during the entire lepton era since the
creation timescale for pairs is much less than the expansion timescale. Indeed the electrons
remain in equilibrium until recombination, which we will discuss shortly. In practice, the
end of the lepton era is defined by the annihilation of the electrons and positrons at T ∼ 0.5
MeV.
What is the density of electron-positron pairs? For T > 10^{10} K, electromagnetic interactions such as \gamma + \gamma \rightleftharpoons e^+ + e^- keep the pairs in thermal equilibrium. Using the fermion phase space distributions, we have
n_{e\pm} = n_{e^-} + n_{e^+} = \frac{3}{4}\frac{\zeta(3)}{\pi^2}(2 \times g_e)\left(\frac{k_B T_e}{\hbar c}\right)^3 = \frac{3\zeta(3)}{\pi^2}\left(\frac{k_B T_e}{\hbar c}\right)^3,    (368)

\rho_{e\pm} c^2 = \rho_{e^+} c^2 + \rho_{e^-} c^2 = (2+2)\,\frac{7}{8}\,\frac{\sigma_r T_e^4}{2} = \frac{7}{4}\sigma_r T_e^4.    (369)
Since the electrons are in equilibrium, Te = Tr .
24.2 Muons
The muon pairs also remain in equilibrium until T ∼ 1012 K, at which point they annihilate.
It is straightforward to work out that before annihilation the muons should have the same
number and energy densities as the electrons.
24.3 Neutrinos
Electron neutrinos decouple from the rest of the universe when the timescale for weak interaction processes such as \nu_e + \bar{\nu}_e \rightleftharpoons e^+ + e^- equals the expansion timescale. Other neutrino species are coupled to the νe via neutral current interactions, so they decouple no later than this. To be specific, the condition for neutrino decoupling is
t_H \simeq \left(\frac{3}{32\pi G \rho}\right)^{1/2} < t_{collision} \simeq (n_l\, \sigma_{wk}\, c)^{-1},    (370)
where nl is the number density of a generic lepton. At this epoch the τ particles are no longer expected to be in equilibrium, and we have noted above that ne = nµ, so the relevant density is given by the above equation for ne± as nl = (1/2)ne±. Similarly,
\rho_l c^2 = \rho_{e\pm} c^2 / 2 = \frac{7}{8}\sigma_r T^4.    (371)
The condition for decoupling, with insertion of constants, becomes
\frac{t_H}{t_{coll}} \simeq \left(\frac{T}{3 \times 10^{10}\ {\rm K}}\right)^3 < 1.    (372)
24.3.1 Temperature of Relic Species: Neutrinos as an Example
It is interesting to consider the neutrino temperature in order to illustrate the general characteristics of temperature evolution. Given that we know the current temperature of the
radiation field, we can derive the neutrino temperature by expressing it in terms of the
radiation temperature.
Neutrinos decouple from the radiation field at a time when they are still a relativistic species. Consequently, we expect their subsequent temperature evolution to follow the relation
T_{0\nu} = T_{\nu,decoupling}\,(1+z)^{-1}.    (373)
We know that at decoupling Tν = Tr. If no particle species annihilate after the neutrinos decouple, then Tr = T0r(1 + z) and we have T0ν = T0r. However, we know that neutrinos decouple before electron-positron annihilation. We must therefore calculate how much the temperature increased due to this annihilation. We saw before that when a particle species annihilates,
T_{after} = T_{before}\left(\frac{g^*_{before}}{g^*_{after}}\right)^{1/3},    (374)
so what we need to do is calculate the g ∗ before and after annihilation. Before annihilation,
the equilibrium particle species are e± and photons, so
g^* = 2 + \frac{7}{8}(2 \times 2) = 2 + 7/2 = 11/2.    (375)
After annihilation, only the photons remain in equilibrium, so g^* = 2. Consequently,
T_{after} = T_{before}\left(\frac{11}{4}\right)^{1/3},    (376)
which says that at neutrino decoupling
T_{r,decoupling} = T_{0r}\left(\frac{4}{11}\right)^{1/3}(1 + z),    (377)
and hence
T_{0\nu} = T_{\nu,decoupling}(1+z)^{-1} = \left(\frac{4}{11}\right)^{1/3} T_{0r} \simeq 1.9\ {\rm K}.    (378)
24.3.2 Densities of Relic Species: Neutrinos as an Example
Once the neutrino temperature is known, the number density can be calculated in the standard fashion:
n_\nu = \frac{3}{4}\frac{\zeta(3)}{\pi^2}(3 \times 2 \times 1)\left(\frac{k_B T_\nu}{\hbar c}\right)^3 \simeq 324\ {\rm cm}^{-3},    (379)
where we have assumed 3 neutrino species. If neutrinos are massless, we can also compute
the energy density as
\rho_\nu c^2 = (3 \times 2 \times 1)\,\frac{7}{8}\,\frac{\sigma_r T_\nu^4}{2} = \frac{21}{8}\sigma_r T_\nu^4, \quad {\rm i.e.}\ \rho_\nu \simeq 3 \times 10^{-34}\ {\rm g\ cm}^{-3}.    (380)
In the last few years, terrestrial experiments have however demonstrated that there must
be at least one massive neutrino species (a consequence of neutrinos changing flavor, which
cannot happen if they are all massless). The number density calculation above is unaffected
if neutrinos are massive, but looking at the number one sees that this is roughly the same as
the photon number density (or of order 109 times the baryon number density). Consequently,
even if neutrinos were to have a very small rest mass, it is possible for them to contribute
a non-negligible amount to the total energy density. To be specific, the mass density of
neutrinos is
\rho_{0\nu} = \langle m_\nu \rangle\, n_{0\nu} \simeq N_\nu \times 1.92 \times \frac{\langle m_\nu \rangle}{10\,{\rm eV}} \times 10^{-30}\ {\rm g\ cm}^{-3},    (381)
or in terms of critical density,
\Omega_{0\nu} \simeq 0.1 \times N_\nu\, \frac{\langle m_\nu \rangle}{10\,{\rm eV}}\, h^{-2}.    (382)
Astrophysical constraints based upon the CMB and large scale structure indicate that the
combined mass of all neutrino species is
\sum m_\nu \le 1\ {\rm eV}    (383)
(assuming General Relativity is correct), or < mν >≤ 1/3 eV for Nν = 3, which implies that
\Omega_{0\nu} \le 0.005.    (384)
One can ask whether the relic neutrinos remain relativistic at the current time. Roughly speaking, the neutrinos will cease to be relativistic when
\frac{\rho_{kinetic}}{\rho_{restmass}} \le 1.    (385)
We calculated that at the current time, for a given type of neutrino, the kinetic energy
density is
\rho_{kinetic} = 10^{-34}\ {\rm g\ cm}^{-3},    (386)
and the rest mass energy density is
\rho_{0\nu} \simeq 1.92 \times \frac{\langle m_\nu \rangle}{10\,{\rm eV}} \times 10^{-30}\ {\rm g\ cm}^{-3},    (387)
so we see that the neutrinos are no longer relativistic if
m_\nu \ge \frac{10^{-34}}{1.92 \times 10^{-30}} \times 10\ {\rm eV},    (388)
m_\nu \ge 5 \times 10^{-4}\ {\rm eV}.    (389)
We know that at least one of the species of neutrinos is massive (see section 8.5 of your book for more details), but given this limit we cannot say whether some of the neutrino species are non-relativistic at this time.
Finally, it is worth reiterating that we found that the neutrinos, which were relativistic
at decoupling, have a temperature dependence after decoupling of Tν ∝ (1 + z). This will
remain true even after the neutrinos become non-relativistic. Recall that the temperature
is defined in terms of the distribution function – this distribution remains valid when the
particles become non-relativistic. On the other hand, a species that is non-relativistic when
it decouples has T ∝ (1 + z)2 . Thus, the redshift dependence of the temperature for
a given particle subsequent to decoupling is determined simply by whether the
particle is relativistic at decoupling.
24.4 Neutrino Oscillations
This is an aside for interested readers.
Why do we believe that at least one species of neutrinos has a non-zero mass? The
basic evidence comes from observations of solar neutrinos and the story of the search for
neutrino mass starts with the “solar neutrino problem”. In the standard model of stellar
nucleosynthesis, the p − p chain produces neutrinos via reactions such as
p + p → D + e+ + νe ; Eν = 0.26MeV
Be7 + e− → Li7 + νe ; Eν = 0.80MeV
B 8 → Be7 + e+ + νe ; Eν = 7.2MeV.
(390)
(391)
(392)
The physics is well-understood, so if we understand stellar structure then we can make a
precise prediction for the solar neutrino flux at the earth. In practice, terrestrial experiments
to detect solar neutrinos find a factor of a few less νe than are expected based upon the
standard solar model.
Initially, there were two proposed solutions to the solar neutrino problem. One proposed
resolution was that perhaps the central temperature of the sun was slightly lower, which
would decrease the expected neutrino flux. Helioseismology has now yielded sound speeds
over the entire volume of the sun to 0.1% and the resulting constraints on the central temperature eliminate this possibility. The second possible solution is “neutrino oscillations”.
The idea with neutrino oscillations is that neutrinos can transform from one flavor to another as they propagate. Since terrestrial experiments are only capable of detecting νe, we should only observe 1/3 to 1/2 of the expected flux if the electron neutrinos produced in the sun convert to other types.
Now, what does this all have to do with neutrino mass? The answer comes from the
physics behind how one can get neutrinos to change flavors. The basic idea is to postulate
that perhaps the observed neutrino types are not fundamental in themselves, but instead
correspond to linear combinations of neutrino eigenstates. As an example, consider oscillations between only electron and muon neutrinos and two eigenstates ν1 and ν2 . Given a
66
mixing angle θ, one could construct νe and νµ states as
νe = cos θν1 + sinθν2 ,
νµ = sin θν1 − sin θν2 .
(393)
(394)
(imaging the above in bra-ket notation...not sure how to do this properly in latex)
Essentially, the particle precesses between the νe and νµ states. If the energies corresponding to the eigenstates are E1 and E2, then the state will evolve as
|\nu_e(t)\rangle = \cos\theta\, e^{-iE_1 t/\hbar}|\nu_1\rangle + \sin\theta\, e^{-iE_2 t/\hbar}|\nu_2\rangle,    (395)
and the probability of finding a pure electron state will be
P_{\nu_e}(t) = |\langle\nu_e|\nu_e(t)\rangle|^2 = 1 - \sin^2(2\theta)\,\sin^2\left[\frac{1}{2}(E_1 - E_2)t/\hbar\right].    (396)
If both states have the same momenta (and how could they have different momenta?), then the energy difference is simply
\Delta E = \frac{\Delta m^2 c^4}{E_1 + E_2}.    (397)
The important thing to notice in the above equation is that the oscillations do not occur
if the two neutrinos have equal mass. There must therefore be at least one flavor of neutrino
that has a non-zero mass.
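A toy Python sketch of the two-flavor survival probability, Eq. (396), in natural units; the values of θ and ΔE here are arbitrary illustrative inputs, not measured parameters:

    from math import sin

    def P_nu_e(t, theta, dE, hbar=1.0):
        """Electron-neutrino survival probability, Eq. (396)."""
        return 1.0 - sin(2.0 * theta) ** 2 * sin(0.5 * dE * t / hbar) ** 2

    print(P_nu_e(1.0, 0.6, 0.0))   # equal masses (dE = 0): always 1, no oscillation
    print(P_nu_e(1.0, 0.6, 2.0))   # unequal masses: flavor conversion occurs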
25 Matter versus Radiation Dominated Eras
One key benchmark during this period is the transition from a radiation to matter-dominated
universe. We have seen in earlier sections that the redshift evolution of the energy density
for non-relativistic matter is different than for relativistic particles and photons. Specifically,
ρ = ρ0 (1 + z)3(w+1) , where w ≈ 0 for non-relativistic matter and w = 1/3 for relativistic
particles and photons. While non-relativistic matter is the dominant component at z = 0,
the practical consequence of this density definition is that
\frac{\rho_r}{\rho_m} = \frac{\rho_{0r}}{\rho_{0m}}(1 + z),    (398)
which means that above some redshift the energy density of relativistic particles dominates
and the radiative term in the Friedmann equations becomes critical. The current radiation
energy density is
\rho_r = \frac{g}{2}\frac{\sigma_r T^4}{c^2} = \frac{\sigma_r T^4}{c^2} = 4.67 \times 10^{-34}\ {\rm g\ cm}^{-3},    (399)
while the current matter density is
\rho_{0m} = \rho_{0c}\,\Omega_{0m} = (1.9 \times 10^{-29} h^2)(0.27)\ {\rm g\ cm}^{-3} \simeq 2.5 \times 10^{-30}\, h_{70}^2\ {\rm g\ cm}^{-3}.    (400)
Considering just these two components, we would expect the matter and radiation densities to be equal at
1 + z_{eq} = \frac{\rho_{0m}}{\rho_{0r}} \sim 5353.    (401)
In practice, the more relevant timescale is the point when the energy density of nonrelativistic matter (w = 0) is equal to the energy density in all relativistic species (w = 1/3).
To correctly calculate this timescale, we also need to fold in the contribution of neutrinos.
Using the relations that we saw earlier, the energy density of the photons + neutrinos should
be
\rho_{rel} = \rho_r + \rho_\nu = \left[\sigma_r T_r^4 + \frac{7}{8}\frac{g_\nu}{2}(2N_\nu)\,\sigma_r T_\nu^4\right] c^{-2}.    (402)
In our discussion of the lepton era and evolution of the neutrinos we worked out (or will
work out) that Tν = (4/11)1/3 Tr . We also know that gν = 1, and the best current evidence
is that there are 3 neutrino species (Nν = 3) plus their antiparticles (so 3 × 2), which means
that the above equation can be written as
\rho_{rel} = \frac{\sigma_r T_r^4}{c^2}\left[1 + \frac{7}{8}(2 \times N_\nu)\frac{1}{2}\left(\frac{4}{11}\right)^{4/3}\right] = 1.681\,\frac{\sigma_r T_r^4}{c^2} = 1.681\,\rho_r.    (403)
In your book, the coefficient (1.68) used to include the total relativistic contribution from photons and neutrinos is denoted as K0, and at higher densities the factor is called Kc to account for contributions from other relativistic species. In practice, Kc ∼ K0, so we'll stick with K0.
If we now use this energy density in calculating the epoch of matter-radiation equality, we find that
1 + z_{eq} = \frac{\rho_{0m}}{\rho_{0rel}} \sim 5353/1.68 \simeq 3190.    (404)
The most precise current determination from WMAP gives zeq = 3454 (tu ∼ 50 kyrs).
This is the redshift at which the universe transitions from being dominated by radiation
and relativistic particles to being dominated by non-relativistic matter. At earlier times the
universe is well-approximated by a simple radiation-dominated EdS model. During this era,
E(z) \simeq (1 + z)^2 \sqrt{K_0\, \Omega_{0r}}.    (405)
It is worth noting that the most distant direct observations that we currently have come
from the cosmic microwave background at z = 1088, so all existing observations are in the
matter dominated era.
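A quick Python check of the arithmetic in Eqs. (399)–(404), using the density values quoted above:

    rho_r0 = 4.67e-34   # g/cm^3, photons only, Eq. (399)
    rho_m0 = 2.5e-30    # g/cm^3, Eq. (400)
    K0 = 1.681          # photons + neutrinos, Eq. (403)

    print(rho_m0 / rho_r0)          # 1 + z_eq ~ 5350 for photons alone
    print(rho_m0 / (K0 * rho_r0))   # 1 + z_eq ~ 3190 with neutrinos included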
26 Big Bang Nucleosynthesis
Cosmological nucleosynthesis occurs just after electron-positron pairs have annihilated at
the end of the lepton era. From an anthropic perspective this is perhaps one of the most
important events in the history of the early universe – by the end of BBN at t ≃ 3 minutes
the primordial elemental abundances are fixed.
Let us begin with the basic conceptual framework and definitions. There are basically
two ways to synthesize elements heavier than hydrogen. The first method is the familiar
process of stellar nucleosynthesis, as worked out by Burbidge, Burbidge, Fowler, and Hoyle.
This method is good for producing heavy elements (C,N,O, etc), but cannot explain the
observed high fraction of helium in the universe,
Y \equiv \frac{m_{He}}{m_{tot}} \simeq 0.25.    (406)
The second method is cosmological nucleosynthesis. The idea of elemental synthesis in the
early universe was put forward in the 1940's by Gamow, Alpher, and Hermann, with the
basic idea being that at early times the temperature should be high enough to drive nuclear
fusion. These authors found that, unlike stellar nucleosynthesis, cosmological (or Big Bang)
nucleosynthesis could produce a high helium fraction. As we shall see though, BBN does
not produce significant quantities of heavier elements. The current standard picture is that
BBN establishes the primordial abundances of the light elements, while the enrichment of
heavier elements is subsequently driven by stellar nucleosynthesis.
With that introduction, let’s dive in. Your book listed the basic underlying assumptions
that are implicit to this discussion. Some of these are aspects of the Cosmological Principle
(and apply to our discussion of other, earlier epochs as well); some are subtle issues that are
somewhat beyond the scope of our discussion. I reproduce the full list here for completeness:
1. The Universe has passed through a hot phase with T ≥ 1012 K, during which its
components were in thermal equilibrium.
2. The known laws of physics apply at this time.
3. The Universe is homogeneous and isotropic at this time.
4. The number of neutrino types is not high (Nν ≃ 3).
5. The neutrinos have a negligible degeneracy parameter.
6. The Universe does not contain some regions of matter and others of antimatter (subpoint of 2., and part of the CP).
7. There is no appreciable magnetic field at this epoch.
8. The photon density is greater than that of any exotic particles at this time.
26.1 Neutron-Proton Ratio
As a starting point, we need to know the relative abundances of neutrons and protons at
the start of nucleosynthesis. In kinetic equilibrium, the number density of a non-relativistic
particle species obeys the Boltzmann distribution, so
n_i = g_i\left(\frac{m_i k_B T}{2\pi\hbar^2}\right)^{3/2}\exp\left(\frac{\mu_i - m_i c^2}{k_B T}\right).    (407)
Neutrons and protons both have g = 2, and the µi can be ignored, which means that the
ratio of the two number densities is
\frac{n_n}{n_p} \simeq \left(\frac{m_n}{m_p}\right)^{3/2}\exp\left(-\frac{(m_n - m_p)c^2}{k_B T}\right) \simeq e^{-Q/k_B T},    (408)
where Q = (mn − mp )c2 ≃ 1.3 MeV. Equivalently, this expression says that while the two
species are in thermal equilibrium,
\frac{n_n}{n_p} \simeq \exp\left(-\frac{1.5 \times 10^{10}\ {\rm K}}{T}\right).    (409)
Equilibrium between protons and neutrons is maintained by weak interactions such as n + \nu_e \rightleftharpoons p + e^-, which remain efficient until the neutrinos decouple. The ratio is therefore set by the temperature at which the neutrinos decouple. Neutrinos decouple at T ≃ 10^{10} K, in which case
X_n \equiv \frac{n}{n + p} \simeq \frac{1}{1 + \exp(1.5)} = 0.18.    (410)
After this point, free neutrons can still transform to protons via β-decay, which has a mean lifetime of τn ≃ 900 s. Thus, the subsequent relative abundance of free neutrons is given by
X_n(t) = X_n(t_{eq})\, e^{-(t - t_{eq})/\tau_n}.    (411)
In practice, as we shall see, nucleosynthesis lasts << 900 s, so Xn ≃ Xn(teq) for the entire time period of interest.
26.2 Nuclear Reactions
Before proceeding, let us define the nuclear reactions that we may expect to occur. These
include:
p + n \rightleftharpoons d\ ({\rm i.e.}\ {\rm H}^2) + \gamma    (412)
d + n \rightleftharpoons {\rm H}^3 + \gamma    (413)
d + d \rightleftharpoons {\rm H}^3 + p    (414)
d + p \rightleftharpoons {\rm He}^3 + \gamma    (415)
d + d \rightleftharpoons {\rm He}^3 + n    (416)
{\rm H}^3 + d \rightleftharpoons {\rm He}^4 + n    (417)
{\rm He}^3 + n \rightleftharpoons {\rm He}^4 + \gamma    (418)
{\rm He}^3 + d \rightleftharpoons {\rm He}^4 + p.    (419)
The net effect of all of these reactions after the first is essentially
d + d \rightleftharpoons {\rm He}^4 + \gamma.    (423)
What about nuclei with higher atomic weights? The problem is that there are no stable nuclei with atomic weights of either 5 or 8, which means that you can only produce heavier elements by "triple-reactions", such as 3{\rm He}^4 \rightarrow {\rm C}^{12} + \gamma, in which a third nucleus hits the unstable nucleus before it has time to decay. The density during cosmological nucleosynthesis is far lower than the density in stellar interiors, and the total time for reactions is only ∼3 minutes rather than billions of years. Consequently this process is far less efficient during cosmological nucleosynthesis than during stellar nucleosynthesis.
26.3 Deuterium
Let us now consider the expected abundance of helium produced. From the above equations, the key first step is production of deuterium via
p + n \rightleftharpoons d + \gamma.    (424)
The fact that nucleosynthesis cannot proceed further until there is sufficient deuterium is
known as the deuterium bottleneck. From the Boltzmann equation, we saw that for all species
n_i = g_i\left(\frac{m_i k_B T}{2\pi\hbar^2}\right)^{3/2}\exp\left(\frac{\mu_i - m_i c^2}{k_B T}\right),    (425)
where for protons and neutrons gn = gp = 2 (i.e. two spin states), and for deuterium gd = 3
(i.e. the spins can be up-up, up-down, or down-down). For the chemical potentials, we take
the relation
\mu_n + \mu_p = \mu_d    (426)
for equilibrium. Taking the total number density to be n_{tot} \simeq n_n + n_p, we thus find that
X_d \equiv \frac{n_d}{n_{tot}} \simeq \frac{3}{n_{tot}}\left(\frac{m_d k_B T}{2\pi\hbar^2}\right)^{3/2}\exp\left[\frac{\mu_d - m_d c^2}{k_B T}\right].    (427)
The exponential term is equivalent to
\exp\left[\frac{\mu_n + \mu_p - (m_n + m_p)c^2 + B_d}{k_B T}\right],    (428)
where Bd = (mn + mp − md )c2 ≃ 2.225 MeV. Using the expressions for the number density
of neutrons and protons, and defining the new quantity Xp = 1 − Xn , the above expression
can be written in the form:
X_d \simeq n_{tot}\left(\frac{m_d}{m_n m_p}\right)^{3/2}\frac{3}{4}\left(\frac{k_B T}{2\pi\hbar^2}\right)^{-3/2} X_n X_p\, e^{B_d/k_B T}.    (429)
After some algebraic manipulation and insertion of relevant quantities, the above equation
has the observable form
X_d \simeq X_n X_p \exp\left[-29.33 + \frac{25.82}{T_9} - 1.5\ln T_9 + \ln(\Omega_{0b}h^2)\right],    (430)
where the dependence upon Ω0b h^2 comes from ntot, and T9 = T/10^9 K. Looking at the last
equation, the amount of deuterium starts becoming significant for T9 < 1. At this point the
deuterium bottleneck is alleviated and additional reactions can proceed. To be more specific,
for a value of Ω0b h2 consistent with WMAP, we find that Xd ≃ Xn Xp at T ≃ 8 × 108 K, or
t ≃ 200 s. We will call this time t∗ for consistency with the book.
26.4 Helium and Trace Metals
Now, what about Helium? Once the temperature is low enough that the deuterium bottleneck is alleviated, essentially all neutrons are rapidly captured and incorporated into He4 by
reactions such as d + d → He^3 + n and d + He^3 → He^4 + p, because of the large cross-sections for
these reactions. We can consequently assume that almost all the neutrons end up in He4 , in
which case the helium number density fraction is
X_{He} \sim \frac{1}{2}\frac{n_n}{n_{tot}} = \frac{1}{2}X_n,    (431)
and the fraction of helium by mass, Y , is
Y \equiv \frac{m_{He}}{m_{tot}} = 4\frac{n_{He}}{n_{tot}} \simeq 2 X_n.    (432)
If we account for the neutron β-decay, so that at the end of the deuterium bottleneck
X_n \simeq X_n(t_{eq})\exp\left(-\frac{t_* - t_{eq}}{\tau_n}\right),    (433)
then we get that Y ≃ 0.26. This value is in good agreement with the observed helium
abundance in the universe. Note that the helium abundance is fairly insensitive to Ω0b . This
is primarily because the driving factor in determining the abundance is the temperature at
which the n/p ratio is established rather than the density of nuclei.
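A rough Python sketch of Eqs. (432)–(433), taking teq ∼ 1 s, t∗ ≃ 200 s, and τn = 900 s as above; with these round numbers the result lands slightly above the more careful Y ≃ 0.26:

    from math import exp

    Xn_eq = 0.18                                   # Eq. (410)
    Xn_star = Xn_eq * exp(-(200.0 - 1.0) / 900.0)  # beta-decay until t* ~ 200 s
    print(2.0 * Xn_star)                           # Y ~ 0.29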
In contrast, the abundances of other nuclei are much more strongly dependent upon
Ω0b h2 . For nuclei with lower atomic weight than Helium (i.e. deuterium and He3 ) the
abundance decreases with increasing density. The reason is that for higher density these
particles are more efficiently converted into He4 and hence have lower relic abundances. For
nuclei with higher atomic weight the situation is somewhat more complicated. On the one
hand, higher density means that there is a greater likelihood of the collisions required to
make these species. Thus, one would expect for example that the C 12 abundance (if carbon
had time to form) should monotonically increase with density. On the other hand, for nuclei
that require reactions involving deuterium, H3 , or He3 , (such as Li7 ) lower density has the
advantage of increasing the abundances of these intermediate stage nuclei. These competing
effects are the reason that you see the characteristic dip in the lithium abundance in plots
of the relative elemental abundances as a function of baryon density.
27 The Plasma Era
The “plasma era” is essentially defined as the period during which the baryons, electrons,
and photons can be considered a thermal plasma. This period follows the end of the lepton
era, starting when the electrons and positrons annihilate at T ≃ 5 × 109 K and ending at
recombination when the universe becomes optically thin. Furthermore, this era is sometimes
subdivided into the radiative and matter era, which are the intervals within the plasma era
during which the universe is radiation and matter dominated, respectively. In this section
we will discuss the properties of a plasma, and trace the evolution of the universe up to
recombination. We will also briefly discuss the concept of reionization at lower redshift.
27.1 Plasma Properties
What exactly do we mean when we say that the universe was comprised of a plasma of
protons, helium, electrons, and photons? Physically, we mean that the thermal energy of the
particles is much greater than the energy of Coulomb interactions between the particles. If we
define λ as the mean particle separation, then this criterion can be expressed mathematically as
\lambda_D \gg \lambda,    (434)
where λD is the Debye length. The Debye length (λD ) is a fundamental length scale in
plasma physics, and is defined as
\lambda_D = \left(\frac{k_B T}{4\pi n_e e^2}\right)^{1/2},    (435)
where ne is the number density of charged particles and e is the charge in electrostatic units.
It is essentially the separation at which the thermal and Coulomb terms balance.
Another way of looking at this is that for a charged particle in a plasma the effective
electrostatic potential is
\Phi = \frac{e}{4\pi\epsilon_0 r}\, e^{-r/\lambda_D},    (436)
where ǫ0 is the permittivity of free space. From this equation the Debye length can be thought of as the scale beyond which the charge is effectively shielded by the surrounding sea of other charged particles. In this sense, the charge has a sphere of influence with a radius of roughly the Debye length.
Now, given the above definitions, we can look at the Debye radius in a cosmological
context. If we define the particle density to be
n_e \simeq \frac{\rho_{0c}\Omega_{0b}}{m_p}\left(\frac{T}{T_{0r}}\right)^3,    (437)
then we see that
\lambda_D \simeq \left(\frac{k_B T_{0r}^3 m_p}{4\pi e^2 T^2 \rho_{0c}\Omega_{0b}}\right)^{1/2} \propto T^{-1}.    (438)
Similarly, if we define
\lambda \simeq n_e^{-1/3} \simeq \left(\frac{\rho_{0c}\Omega_{0b}}{m_p}\right)^{-1/3}\frac{T_{0r}}{T},    (439)
then the temperature cancels and the ratio of the Debye length to the mean particle separation is
\frac{\lambda_D}{\lambda} \simeq 10^2\, (\Omega_{0b}h^2)^{-1/6}.    (440)
We therefore find that the ratio of these two quantities is independent of redshift, which
means that ionized material in the universe can be appropriately treated as a plasma fluid
at all redshifts up to recombination.
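A Python sketch of Eqs. (435)–(440) in CGS units, assuming Ω0bh² = 0.02; the point to notice is that the ratio comes out the same at any temperature:

    from math import pi

    k_B = 1.381e-16            # erg/K
    e_ch = 4.803e-10           # esu
    m_p = 1.673e-24            # g
    T0r = 2.73                 # K
    rho_b0 = 1.88e-29 * 0.02   # g/cm^3, i.e. rho_0c * (Omega_0b h^2 = 0.02)

    def debye_ratio(T):
        n_e = (rho_b0 / m_p) * (T / T0r) ** 3                     # Eq. (437)
        lam_D = (k_B * T / (4.0 * pi * n_e * e_ch ** 2)) ** 0.5   # Eq. (435)
        lam = n_e ** (-1.0 / 3.0)                                 # Eq. (439)
        return lam_D / lam

    print(debye_ratio(1e4), debye_ratio(1e8))   # ~1.5e2 in both cases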
27.2 Timescales
Now, what are the practical effects of having a plasma? There are several relevant characteristic timescales:
1. τe - This is the time that it takes for an electron to move a Debye length.
2. τeγ - This is the characteristic time for an electron to lose its momentum by electron-photon scattering.
3. τγe – This is the characteristic timescale for a photon to scatter off an electron.
4. τep – This is the relaxation time to reach thermal equilibrium between photons and
electrons.
Before diving in, why do we care about these timescales? We want to use the timescales to assess the relative significance of different physical mechanisms. For instance, we must verify that τep is shorter than a Hubble time or else the assumption of thermal equilibrium is invalid.
Let’s start with τep . We won’t go through the calculations for this one, but
\tau_{ep} \simeq 10^6\, (\Omega_{0b}h^2)^{-1}\, T^{-3/2}\ {\rm s}.    (441)
Assuming Ω0bh^2 ≃ 0.02, this implies that at the start of the radiative era (T ≃ 10^9 K, tU ≃ 10 s), τep ≈ 10^{-6} s. Similarly, at the end of the radiative era (T ≃ 4000 K, tU ≃ 300,000 yrs), τep ≈ 200 s. Clearly, the approximation of thermal equilibrium remains valid in this era.
Next, consider τe. The Coulomb interaction of an electron with a proton or helium nucleus is only felt when the two are within a Debye length of one another. On average, the time for an electron to cross a Debye sphere is
\tau_e = \omega_e^{-1} = \left(\frac{m_e}{4\pi n_e e^2}\right)^{1/2} \simeq 2 \times 10^8\, T^{-3/2}\ {\rm s},    (442)
so any net change to the electron momentum or energy must occur on this timescale.
Let's now compare this with the time that it takes for electron-photon scattering to actually occur. This timescale is
\tau'_{e\gamma} = \frac{3 m_e}{4 \sigma_T \rho_r c} = 4.4 \times 10^{21}\, T^{-4}\ {\rm s}.    (443)
Note that this equation contains a factor of 4/3 in (4/3)\rho_r c^2 because of the contribution of the pressure for a radiative fluid (p = \rho c^2/3).
Combining the two, we see that
\frac{\tau_e}{\tau'_{e\gamma}} \simeq 5 \times 10^{-14}\, T^{5/2},    (444)
so
\tau_e \ll \tau'_{e\gamma} \quad {\rm when} \quad z \ll 2 \times 10^7\, (\Omega_{0b}h^2)^{1/5} \simeq 10^7,    (445)
which is true for most of the plasma era (after the universe is about 1 hour old). What this means is that for z \ll 2 \times 10^7 there is only a very small probability for an e^- to scatter off a γ during the timescale of an e^- - p^+ interaction.
Consequently, the electrons and protons are strongly coupled – essentially stuck together.
On the other hand, at z >> 2 × 107 the electrons and γs have a high probability of
scattering – they are basically stuck together. In this case, the effective mass of the electron
is
m^*_e = m_e + (\rho_r + p_r/c^2)/n_e \simeq \frac{4}{3}\frac{\rho_r}{n_e} \gg m_e,    (446)
when calculating the timescale for an e− + p+ collision. We simply note this for now, but
may utilize this last bit of information later.
For now, the main point is that z = 1 × 107 is essentially a transition point before which
the electrons and photons are stuck together, and after which the electrons and protons are
stuck together.
We also note that the effective timescale for electron-photon scattering is
\tau_{e\gamma} = \frac{3}{4}\frac{m_e + m_p}{\sigma_T \rho_r c} \simeq \frac{3}{4}\frac{m_p}{\sigma_T \rho_r c} \simeq 10^{25}\, T^{-4}\ {\rm s}.    (447)
The final timescale that we will calculate here is that for the typical photon to scatter off an electron (not the same as the converse, since there are more photons than electrons). This is roughly
\tau_{\gamma e} = \frac{1}{n_e \sigma_T c} = \frac{m_p}{\rho_b \sigma_T c} = \frac{4}{3}\frac{\rho_r}{\rho_b}\tau_{e\gamma} \simeq 10^{20}\, (\Omega_{0b}h^2)^{-1}\, T^{-3}\ {\rm s}.    (448)
To put this all together... First, we showed that we are in equilibrium. This means
that the protons and electrons have the same temperature, and also means that the photons
should obey a Planck distribution (Bose-Einstein statistics). If everything is at the same
temperature, Compton interactions should dominate. From the calculation of the Debye
length we found that up until recombination the universe is well-approximated as a thermal
plasma. During this period, we have an initial era where the electron-photon interactions
dominate and these two particle species are strongly stuck together. This is true up to
z ≃ 107 , after which the proton-electron interactions dominate and these two species are
stuck to each other. As we will discuss, the relevance of this change is that any energy
injection prior to z = 107 (say due to evaporation of primordial black holes, among other
things) will be rapidly thermalized and leave no signature in the radiation field. In contrast,
energy injection at lower temperatures can leave some signature. We’ll get back to this in a
minute.
As we discussed previously, the transition from radiation to matter-dominated eras occurs at
1 + z_{eq} = \frac{\rho_{0c}\Omega_0}{K_0\, \rho_{0r}} \simeq 3450,    (449)
or when the universe is roughly 50 kyrs old. This time is squarely in the middle of the
plasma era. At the end of the radiative era, the temperature is T ≃ 104 K, at which point
everything remains ionized, although some of the helium (perhaps half) may be in the form
of He+ rather than He+ + at this point. In general though, recombination occurs during the
matter-dominated era.
27.3 Recombination
Up through the end of the plasma era, all the particles (p, e− , γ, H and helium) remain
coupled. Assuming that thermodynamic equilibrium holds, then we can compute the ionization fraction for hydrogen and helium in the same way that we have computed relative
abundances of different particle species (we're getting a lot of mileage out of a few basic formulas based upon Fermi-Dirac, Bose-Einstein, and Boltzmann statistics). In the
present case, we are considering particles at T ≃ 104 K, at which point p, e− , and H are all
non-relativistic. We therefore can use Boltzmann rather than Bose-Einstein or Fermi-Dirac
statistics, so
n_i \simeq g_i\left(\frac{m_i k_B T}{2\pi\hbar^2}\right)^{3/2}\exp\left(\frac{\mu_i - m_i c^2}{k_B T}\right).    (450)
Considering now only hydrogen (i.e. ignoring helium for simplicity), the ionization fraction
should be
x = \frac{n_e}{n_p + n_H} \simeq \frac{n_e}{n_{tot}},    (451)
and the chemical potentials for e^- + p \rightarrow H + \gamma must obey the relation
\mu_{e^-} + \mu_p = \mu_H.    (452)
Also, the statistical weights of the particles are gp = ge− = 2, as always, and gH = gp × ge− = 4.
Finally, the binding energy is
B_H = (m_p + m_{e^-} - m_H)c^2 = 13.6\ {\rm eV}.    (453)
Using this information and assuming ne = np (charge neutrality), we will derive what is called the Saha equation for the ionization fraction. Let us start by computing the ratio of charged to neutral particles. We know from the above equations that
n_e = 2\left(\frac{m_e k_B T}{2\pi\hbar^2}\right)^{3/2}\exp\left(\frac{\mu_e - m_e c^2}{k_B T}\right),    (454)
n_p = 2\left(\frac{m_p k_B T}{2\pi\hbar^2}\right)^{3/2}\exp\left(\frac{\mu_p - m_p c^2}{k_B T}\right),    (455)
n_H = 4\left(\frac{m_H k_B T}{2\pi\hbar^2}\right)^{3/2}\exp\left(\frac{\mu_H - m_H c^2}{k_B T}\right).    (456)
Therefore,
\frac{n_e n_p}{n_H} = \left(\frac{m_e k_B T}{2\pi\hbar^2}\right)^{3/2}\left(\frac{m_p}{m_H}\right)^{3/2}\exp\left[\frac{\mu_e + \mu_p - \mu_H - (m_e + m_p - m_H)c^2}{k_B T}\right]    (458)
\simeq \left(\frac{m_e k_B T}{2\pi\hbar^2}\right)^{3/2} e^{-B_H/k_B T}.    (459)
Now, we can also see that
\frac{n_e n_p}{n_H} = \frac{n_e^2}{n_{tot} - n_e} = n_{tot}\left(\frac{n_e}{n_{tot}}\right)^2\frac{1}{1 - n_e/n_{tot}} = n_{tot}\frac{x^2}{1 - x},    (460)
which means that
\frac{x^2}{1 - x} = \frac{1}{n_{tot}}\left(\frac{m_e k_B T}{2\pi\hbar^2}\right)^{3/2} e^{-B_H/k_B T}.    (461)
This last expression is the Saha equation, giving the ionization fraction as a function of
temperature and density. Your book gives a table of values corresponding to ionization
fraction as a function of redshift and baryon density. The basic result is that the ionization
fraction falls to about 50% by z ≃ 1400.
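A Python sketch that solves the Saha equation (461) for a pure-hydrogen plasma, assuming ntot follows Eq. (343) scaled by (1+z)^3 with Ω0bh² = 0.02:

    from math import pi, exp, sqrt

    k_B, hbar = 1.381e-16, 1.055e-27   # CGS
    m_e = 9.109e-28                    # g
    B_H = 13.6 * 1.602e-12             # erg
    T0r = 2.73                         # K

    def saha_x(z, Obh2=0.02):
        """Ionization fraction from x^2/(1-x) = S(T), Eq. (461)."""
        T = T0r * (1.0 + z)
        n_tot = 1.12e-5 * Obh2 * (1.0 + z) ** 3        # cm^-3, from Eq. (343)
        S = (m_e * k_B * T / (2.0 * pi * hbar ** 2)) ** 1.5 \
            * exp(-B_H / (k_B * T)) / n_tot
        return 0.5 * (-S + sqrt(S * S + 4.0 * S))      # positive root

    for z in (1500, 1400, 1300):
        print(z, saha_x(z))   # x falls from ~0.9 to ~0.2 across this range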
Now, it is worth pointing out that we have assumed the ions are in thermal equilibrium.
This was in fact a bit of a fudge. Formally, this is only true when the recombination
timescale is shorter than the Hubble time, which is valid for z > 2000. During the actual
interesting period – recombination itself – non-equilibrium processes can alter the ionization
history. Nevertheless, the above scenario conveys the basic physics and is a reasonable
approximation. More detailed calculations with careful treatment of physical processes get
closer to the WMAP value of z = 1088. In general, the residual ionization fraction well after
recombination ends up being x ≃ 10−4.
27.4 Cosmic Microwave Background: A First Look
At recombination, the mean free path of a photon rapidly goes from being very short to essentially infinite as the probability for scattering off an electron suddenly becomes negligible.
This is why you will often hear the cosmic microwave background called the “surface of last
scattering”. We are essentially seeing a snapshot of the universe at z = 1088 when the CMB
photons last interacted with the matter field.
The CMB provides a wealth of cosmological information, and we will return to it in much
greater detail in a few weeks. Right now though, there are a few key things to point out. First,
note that Compton scattering between thermal electrons and photons maintains the photons
in a Bose-Einstein distribution since the photons are conserved in Compton scattering (and
consistent with our previous discussion). To get an (observed) Planck distribution of photons
requires photon production via free-free emission, double Compton scattering, or some other
physical process. These processes do occur rapidly enough in the early universe to yield a
Planck distribution. Specifically,
N_\nu = \left[\exp\left(\frac{h\nu}{k_B T}\right) - 1\right]^{-1},    (462)
and the corresponding energy density per unit frequency should be
u_\nu = \rho_{r,\nu} c^2 = N_\nu\, \frac{8\pi h \nu^3}{c^3},    (463)
which corresponds to an intensity
I_\nu = N_\nu\, \frac{4\pi\hbar \nu^3}{c^2}.    (464)
Integrating over the energy density gives the standard
\rho_r c^2 = \sigma_r T^4    (465)
(see section 21.1 for this derivation). One can see that as the universe expands,
u_\nu\, d\nu = (1+z)^4\, u_{\nu_0}\, d\nu_0.    (466)
This is the standard (1+z)^4 scaling for radiation that we have previously derived. Since d\nu = (1+z)\,d\nu_0, we have
u_\nu(z) = (1+z)^3\, u_{\nu_0} = (1+z)^3\, u_{\nu/(1+z)}.    (467)
Plugging this into the above equation for the energy density, one finds that
u_\nu(z) = (1+z)^3\,\frac{8\pi h}{c^3}\left(\frac{\nu}{1+z}\right)^3\left[\exp\left(\frac{h\nu/(1+z)}{k_B T_0}\right) - 1\right]^{-1}    (468)
= \frac{8\pi h}{c^3}\,\frac{\nu^3}{\exp(h\nu/k_B T) - 1},    (469)
where T = T_0(1+z).
Thus, we see that an initial black body spectrum retains its shape as the temperature cools. This may seem like a trivial statement, but it tells us that when we look at
the CMB we are seeing the same spectrum as was emitted at the surface of last scatter, just
redshifted to lower temperature. Moreover, any distortions in the shape (or temperature) of
the spectrum must therefore be telling us something physical. One does potentially expect
distortions in the far tails of the Planck distribution due to the details of the recombination
process (particularly two-photon decay), but these are largely unobservable due to galactic
dust.
Another means of distorting the black body spectrum is to inject energy during the
plasma era. The book discusses this in some detail, but for the current discussion we will
simply state that there are a range of possible energy sources (black hole evaporation, decay
of unstable particles, damping of density fluctuations by photon diffusion), but the upper
limits indicate that the level of this injection is fairly low. Nonetheless, if there is any
injection, we saw before that it must occur in the redshift range 4 < log z < 7.
Finally, intervening material between us and the CMB can also distort the spectrum.
These “foregrounds” are the bane of the CMB field, but also quite useful in their own right
as we shall discuss later. One particular example is inverse Compton scattering by ionized
gas at lower redshift (think intracluster medium). If the ionized gas is hot, then the CMB
photons can gain energy. This process, called the Sunyaev-Zeldovich effect, for low column densities shifts photons from the Rayleigh-Jeans part of the spectrum to the Wien tail, decreasing the effective CMB brightness temperature at low frequencies while boosting the tail of the distribution.
27.5 Matter-Radiation Coupling and the Growth of Structure
One thing that we have perhaps not emphasized up to this point is that matter is tied in
space, as well as temperature, to the radiation in the universe. In other words, the radiation
exerts a very strong drag force on the matter. We have already talked about electrons and
photons being effectively glued together early in the plasma era. More generally for any
charged particles interacting with a Planck distribution of photons, to first order the force
on the particles is
∆v
4
me v
v
F ≃ me
= − ′ = − σT σr T 4 .
(470)
∆t
τeγ
3
c
or the same equation scaled by the ratio mp /me for a proton. The important thing to note
here is that the force is linear in the velocity, which means that ionized matter experiences
a very strong drag force if it tries to move with respect to the background radiation. The
net result is that any density variations in the matter distribution remain locked at their
original values until the matter and radiation fields decouple. Put more simply, gravity can’t
start forming stars, galaxies, etc. until the matter and radiation fields decouple. Conversely,
the structure that we see at the surface of last scatter was formed at a far earlier epoch.
27.6 Decoupling
As we have discussed before, the matter temperature falls at T ∝ (1 + z)2 once the matter
decouples from the radiation field. To zeroth order this happens at recombination. In
practice though, the matter temperature remains coupled to the radiation temperature until
slightly later because of residual ionization. As before, one can calculate the timescale for
collisions of free electrons with photons,
\tau_{e\gamma} = \frac{1}{x} \times 10^{25}\, T^{-4}\ {\rm s}.    (471)
You can then take the normal approach of comparing this with the Hubble time to estimate when decoupling actually occurs (z ≃ 300).
27.7 Reionization
A general question for the class. I just told you before that after recombination the ionized
fraction drops to 10−4 – essentially all matter in the universe is neutral. Why then at z = 0
do we not see neutral hydrogen tracing out the structure of the universe? This is a bit beyond
where we’ll get in the class, but the idea is that the universe is reionized, perhaps in several
stages, at z = 6 − 20 by the onset of radiation from AGN (with possibly some contribution
from star formation). This reionization is partial – much of the gas at this point has fallen
into galaxies and is either in stars or self-shielding to this radiation.
Related to this last comment, it is good to be familiar with the concept of optical depth,
which is commonly denoted as τ (note: beware of confusion with the timescales above).
The optical depth is related to the probability that a photon has a scattering interaction
with an electron while travelling over a given distance. Specifically, combining
dP = \frac{dt}{\tau_{\gamma e}} = n_e \sigma_T c\, dt = \frac{x\rho_b}{m_p}\sigma_T c\, \frac{dt}{dz}\, dz    (472)
and
dP = -\frac{dN_\gamma}{N_\gamma} = -\frac{dI}{I},    (473)
where Nγ is the photon flux and I is the intensity of the background radiation. The first equation is directly from the definition of P. The second states that the fraction of photons (energy) that reaches an observer is set by the fraction of photons that have not suffered a scattering event. If we now define τ such that
dP = d\tau,    (474)
we see that
N_{\gamma,obs} = N_\gamma \exp(-\tau),    (475)
I_{\gamma,obs} = I \exp(-\tau),    (476)
where obs stands for the observed number and intensity. The book has a slightly different
definition in that it expresses the relation in terms of redshift such that I(t0 , z) and Nγ (t0 , z)
are defined and the intensity and number of photons observed for a initial intensity and
number I(t), Nγ (t).
One can see from the earlier equations that in terms of redshift,
\tau(z) = \frac{\rho_{0c}\Omega_{0b}\sigma_T c}{m_p} \int_0^z x\,(1+z)^3\,\frac{dt}{dz}\, dz,    (477)
which for w = 0 (and full ionization, x = 1) becomes
\tau(z) = \frac{\rho_{0c}\Omega_{0b}\sigma_T c}{m_p H_0} \int_0^z \frac{1+z}{(1+\Omega_0 z)^{1/2}}\, dz.    (478)
For \Omega_0 z \gg 1, this yields
\tau(z) \simeq 10^{-2}\, (\Omega_{0b}h^2)^{1/2}\, z^{3/2}.    (479)
(479)
Finally, the probability that a photon arriving at z = 0 suffered it’s last scattering
between z and z − dz is
1 dI
d
= − [(1 − exp(−τ )] dz = exp(−τ (z))dτ = g(z)dz.
I dz
dz
(480)
The quantity g(z) is called the differential visibility and defines the effective width of the
surface of last scattering. In other words, recombination does not occur instantaneously, but
rather occurs over a range of redshifts. The quantity g(z) measures the effective width of
this transition. To compute it, one would plug in a prescription for the ionization fraction
and integrate to get τ (z). Doing so for the approximations in the book, one finds that g(z)
can be approximated as a Gaussian centered at the surface of last scattering with a width
∆z ≃ 400. WMAP observations indicate that the actual width is about 200.
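A quick Python illustration of Eq. (479): had the universe remained fully ionized, the optical depth would have reached unity at quite a low redshift (which is why reionization matters for the CMB):

    def tau(z, Obh2=0.02):
        """Optical depth of Eq. (479), valid while the gas is ionized."""
        return 1e-2 * Obh2 ** 0.5 * z ** 1.5

    # Redshift at which tau = 1 for a fully (re)ionized universe:
    z1 = (1.0 / (1e-2 * 0.02 ** 0.5)) ** (2.0 / 3.0)
    print(z1)   # z ~ 80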
28 Successes and Failures of Basic Big Bang Picture
[Reading: 7.3-7.13]
In recent weeks we have been working our way forward in time, until at last we have
reached the surface of last scattering at t = 300, 000 years. We will soon explore this epoch
in greater detail, but first we will spend a bit of time exploring the successes and failures of
the basic Big Bang model that we have presented thus far and look in detail at inflation as
a means of alleviating some (but not all) of these failures.
Everything that we have discussed in class up until this point is predicated upon a few
fundamental assumptions:
1. The Cosmological Principle is valid, and therefore on large scales the universe is homogeneous and isotropic.
2. General relativity is a valid description of gravity everywhere (at least outside event
horizons) and at all times back to the Planck time. More generally, the known laws of
physics, derived locally, are valid everywhere. This latter part is a consequence of the
Cosmological Principle.
3. At some early time the contents of the Universe were in thermal equilibrium with
T > 1012 K.
Why do we believe the Big Bang model? There are four basic compelling reasons. Given
the above assumptions, the Big Bang model:
1. provides a natural explanation for the observed expansion of the universe (Hubble
1929). Indeed, it requires that the universe be either expanding or contracting.
2. explains the observed abundance of helium via cosmological production of light elements (Alpher, Hermann, Gamow; late 1940s). Indeed, the high helium abundance
cannot be explained via stellar nucleosynthesis, but is explained remarkably well if one
assumes that it was produced at early times when the universe was hot enough for
fusion.
3. explains the cosmic microwave background. The CMB is a natural consequence of the
cooling expansion.
4. provides a framework for understanding structure formation. Initial fluctuations (from
whatever origin) remain small until recombination, after which they grow via gravity
to produce stars, galaxies, and other observed structure. Numerical simulations show
that this works remarkably well given (a) a prescription for the power spectrum of the
initial fluctuations, and (b) inclusion of non-baryonic dark matter.
Clearly an impressive list of accomplishments, particularly given that all the observational and theoretical progress described above was made in a mere 80 years or so. Not bad progress for a field that started from scratch at the beginning of the 20th century. Still,
the basic model has some gaping holes. These can be divided into two categories. The
first category consists of holes arising from our limited current understanding of gravity and
particle physics. These aren’t so much “problems” with the Big Bang as much as gaps that
need to be filled in. These gaps include:
1. A description for the Universe prior to the Planck time. Also, the current physics is
somewhat sketchy as one gets near the Planck time.
2. The matter-antimatter asymmetry. Why is there an excess of matter?
3. The nature of dark matter. What is it?
4. The cosmological constant (dark energy) problem. There is no good explanation for
the size of the cosmological constant.
All four of the above questions are rather fundamental. The solution to the first item on
the list will require a theory of quantum gravity, or alternatively an explanation of how one
avoids reaching the Planck density. Meanwhile, the last two give us the sobering reminder
that at present we haven't identified the two components that contain roughly 95% of the total energy density in the universe. Clearly a bit of work to be done!
The second category of holes are more in the thread of actual problems for the basic
model. These include:
1. The horizon problem [Why is everything so uniform?]
2. The flatness problem [Why is the universe (nearly) flat?]
3. The magnetic monopole problem [Why don’t we see any?]
4. The origin of the initial fluctuations. [Where’d they come from?]
We will discuss each of these, as well as the cosmological constant problem in greater
detail below, and see how inflation can (or cannot) help.
28.1 The Horizon Problem
The most striking feature of the cosmic microwave background is its uniformity. Across the entire sky (after removing the dipole term due to our own motion) the temperature of the CMB is constant to one part in 10^4. The problem is that in the standard Big Bang model
points in opposite directions in the sky have never been in causal contact, so how can they
possibly “know” to be at the same temperature?
Let us pose this question in a more concrete form. Recall from earlier in the term (and
chapter 2 in your book) that we discussed the definitions of a “particle horizon” and the
“cosmological horizon”. The “particle horizon” is defined as including all points with which
we have ever been in causal contact. It is an actual horizon – we have no way of knowing
anything about what is currently beyond the particle horizon. As we discussed previously,
the particle horizon is defined as
R_H = a \int_0^t \frac{c\, dt'}{a(t')}.    (481)
If the expansion of the universe at early times goes as a ∝ t^β, with β > 0, then
R_H = t^\beta \int_0^t c\, t'^{-\beta}\, dt' = \frac{ct}{1 - \beta},    (482)
and a particle horizon exists if β < 1.
Using the same expansion form in the Friedmann equation,
\ddot{a} = -\frac{4}{3}\pi G\left(\rho + \frac{3p}{c^2}\right) a,    (483)
\ddot{a} = \beta(\beta - 1)\, t^{\beta - 2},    (484)
yields
and
p
4
(485)
β(β − 1) = − πG ρ + 3 2 t2 .
3
c
The existence of an initial singularity requires ä < 0 and hence 0 < β < 1. Combining this
with the result above, we see that there must be a particle horizon.
How does the size of the particle horizon compare to the size of the surface of last
scattering? At zCM B = 1100, the CMB surface that we observe had a radius
rCM B =
ct0
ctlookback
≃
,
1 + zCM B
1 + zCM B
(486)
so opposite directions in the sky show the CMB at sites that were 2rCM B apart at recombination. If the above is a bit unclear, the way to think about it is as follows. Consider us
at the center of a sphere with the observed CMB emitted at a comoving distance from us of
ctlookback . This is the radius above, with the (1 + z) included to convert to proper distance.
The size of the particle horizon at a given redshift for w = 0 is given by
R_H \simeq 3ct \simeq 3ct_0 (1 + z_{CMB})^{-3/2} \simeq 3 r_{CMB} (1 + z_{CMB})^{-1/2} \simeq \frac{r_{CMB}}{10}.    (487)
The implication is that the CMB is homogeneous and isotropic on scales a factor of ten
larger than the particle horizon.
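In Python, the arithmetic of Eq. (487) is one line:

    z_cmb = 1100.0
    print(3.0 * (1.0 + z_cmb) ** -0.5)   # R_H / r_CMB ~ 0.09, i.e. ~1/10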
28.2 Inflation: Basic Idea
The most popular way of getting around the above problem is called inflation. The basic idea
is to simply postulate that the universe underwent a period of accelerated expansion (ä > 0)
at early times. As we will see later, there are many variants of inflation, but they all boil
down to the same result – a finite period of accelerated expansion very early. So how does
inflation help with the horizon problem? If there was a period of accelerated expansion, then
one can envision a scenario in which the entire observable universe was actually in causal
contact at some point prior to this accelerated expansion. In this case the uniformity of the
CMB is no longer so mysterious. Let’s see how this would work.
28.2.1 Cosmological Horizon
When we discussed particle horizons early in the term, we also discussed the “cosmological
horizon”. This term is actually somewhat of a misnomer, but is commonly used. It’s not a
true horizon, but rather is simply defined as being equivalent to the Hubble proper distance
at a given redshift,
R_c = c\,\frac{a}{\dot{a}} = \frac{c}{H(z)},    (488)
or a comoving distance
r_c = c\,\frac{a_0}{\dot{a}} = \frac{c(1+z)}{H(z)},    (489)
which reduces to the familiar
R_c = r_c \equiv D_H = \frac{c}{H_0}    (490)
at z = 0. The relevance of the cosmological horizon, or Hubble distance, is that physical
processes can keep things fairly homogeneous within a scale of roughly the cosmological
horizon. Recall during the thermodynamic discussion that reactions were in thermodynamic
equilibrium if the collision time was less than the Hubble time; similarly, physical processes can act within regions smaller than the Hubble distance (cosmological horizon). There are a
couple important things to remember about the cosmological horizon. First, objects can be
outside the cosmological horizon, but inside the particle horizon. Second, objects can move
in and out of the cosmological horizon (unlike the particle horizon, where an object within
the particle horizon will forever remain within the particle horizon).
[Those reading notes should now refer to figure 7.4 in book, as I will be drawing something
similar on board.]
28.2.2 Inflationary solution
The horizon problem, as discussed above, is that points separated by proper distance l
(comoving distance l0 = l(1 + z)) are only causally connected when l < RH , where RH is
the size of the particle horizon. If we consider a ∝ tβ at early times (with β < 1 as above),
then the size of the horizon grows with time. Put simply, as the universe gets older light can
reach farther so the particle horizon is larger.
Now, imagine that a region l0 that is originally within the cosmological horizon (l0 <
rc (ti )) is larger than the horizon at some later time (l0 > rc (tf )). This can only happen if
the comoving cosmological horizon decreases with time. In other words,
\frac{d}{dt}\left(\frac{c a_0}{\dot{a}}\right) = \frac{-c a_0 \ddot{a}}{\dot{a}^2} < 0,    (491)
or ä > 0.
The inflationary solution thus posits that the universe passes through a period of accelerated expansion, which after some time turns off and returns to a decelerated expansion.
If observers (like us) are unaware of this period of accelerated expansion, then we perceive
the paradox of the horizon problem. The problem is non-existent though in the inflationary
model because everything that we see was at some point in causal contact.
OK. That’s the basic picture. Now let’s work through the details. During the inflationary
era, we require that the Friedmann equation be dominated by a component with w < −1/3
in order to have an accelerated expansion.
If we define the start and finish of the inflationary period as ti and tf , respectively, then
from the Friedmann equation we find,
\left(\frac{\dot{a}}{a_i}\right)^2 = H_i^2\left[\Omega_i\left(\frac{a_i}{a}\right)^{1+3w} + (1 - \Omega_i)\right] \simeq H_i^2\left(\frac{a_i}{a}\right)^{1+3w},    (492)
so that
\frac{\dot{a}}{a}\frac{a}{a_i} = H_i\left(\frac{a_i}{a}\right)^{(1+3w)/2},    (493)
\frac{\dot{a}}{a} = H_i\left(\frac{a}{a_i}\right)^{-3(1+w)/2},    (494)
where we have assumed Ωi ≃ 1 since we are considering early times. Now we have several
cases to consider. The first is w = −1. In this case, we have
\frac{da}{a} = H_i\, dt,    (495)
and integrating from t_i to t we have
a = a_i\, e^{H_i(t - t_i)}.    (496)
This case is called "exponential inflation" for obvious reasons. Now, consider the cases where w ≠ −1. Starting from above and integrating again, we now have
\frac{2}{3(1+w)}\frac{d}{dt}\left(\frac{a}{a_i}\right)^{3(1+w)/2} = H_i,    (497)
-\frac{2}{3(1+w)}\left[1 - \left(\frac{a}{a_i}\right)^{3(1+w)/2}\right] = H_i(t - t_i),    (498)
\left(\frac{a}{a_i}\right)^{3(1+w)/2} = \frac{3(1+w)}{2} H_i(t - t_i) + 1,    (499)
a \simeq a_i\left[1 + \frac{H_i(t - t_i)}{q}\right]^q;\quad {\rm where}\ q = \frac{2}{3(1+w)}.    (500)
For -1 < w < -1/3 and t/t_i > 1, this equation reduces to simply

a \propto t^q, \quad {\rm where}\ q > 1.    (501)
This case is called “standard inflation” or “power-law inflation”. Finally, for w < -1, we
have

a \propto \left[1 - \frac{H_i}{|q|}(t-t_i)\right]^{-|q|} \propto (C-t)^{-|q|} \quad {\rm for}\ t < C\ {\rm and}\ q < 0.    (502)
This latter case is called “super-inflation” because the expansion is super-exponential.
An alternate, and perhaps more concise way to understand this terminology is to look at
the acceleration in terms of H. Recall that H = ȧ/a, so
\ddot a = \dot H a + \dot a H = a(H^2 + \dot H)    (503)
“Standard inflation” corresponds to Ḣ < 0, “exponential inflation” corresponds to Ḣ = 0,
and “super-inflation” corresponds to Ḣ > 0. It is straightforward to show that Ḣ = 0 for
“exponential inflation” yields
a \propto e^{H_i t},    (504)
and the previous solutions for the other cases can be recovered as well.
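To make the three regimes concrete, here is a minimal numerical sketch (Python with numpy, not part of the original notes; the unit choice H_i = 1, t_i = 0 and the sample values of w are assumptions for illustration) that evaluates the closed-form solutions (496), (500), and (502) and checks that ä > 0 in each case:

import numpy as np

H_i, t_i = 1.0, 0.0                      # units with H_i = 1 (assumed)
t = np.linspace(0.0, 3.0, 301)

def a_inflation(t, w, a_i=1.0):
    # scale factor during inflation for equation of state w; eqs (496), (500), (502)
    if np.isclose(w, -1.0):
        return a_i * np.exp(H_i * (t - t_i))     # exponential inflation
    q = 2.0 / (3.0 * (1.0 + w))                  # q > 1 for -1 < w < -1/3; q < 0 for w < -1
    return a_i * (1.0 + H_i * (t - t_i) / q)**q

for w in (-1.0, -2.0/3.0, -1.2):                 # exponential, power-law, super-inflation
    a = a_inflation(t, w)
    a_ddot = np.gradient(np.gradient(a, t), t)   # numerical second derivative
    print(w, bool(np.all(a_ddot[2:-2] > 0)))     # prints True for each: accelerated expansion

All three cases print True: every equation of state with w < -1/3 gives \ddot a > 0.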
28.2.3 Solving the Horizon Problem
Now, there are several requirements for inflation to solve the horizon problem. Let us divide
the evolution of the universe into three epochs:
• Epoch 1: Inflationary era from t_i to t_f, where w < -1/3.
• Epoch 2: Radiation-dominated era from t_f to t_eq, where w = 1/3.
• Epoch 3: Matter-dominated era from t_eq to t_0, where w = 0.
Let the subscripts i and j stand for the starting and ending points of any of these intervals.
For a flat model, where \Omega_{ij} \simeq 1 in any interval, we find (see the equation for the Hubble
parameter, eq 2.1.13 in your book, for the starting point):

\frac{H_i^2}{H_j^2} \simeq \left(\frac{a_j}{a_i}\right)^2\left[\Omega_j\left(\frac{a_j}{a_i}\right)^{1+3w}\right]    (505)

\frac{H_i a_i}{H_j a_j} \simeq \left(\frac{a_i}{a_j}\right)^{-(1+3w)/2}.    (506)
To solve the horizon problem we require that the comoving horizon scale now is much
smaller than at the start of inflation,

r_c(t_0) \equiv \frac{c}{H_0} \ll r_c(t_i) = \frac{c a_0}{\dot a_i},    (507)

which implies that

H_0 a_0 \gg \dot a_i = H_i a_i.    (508)
Consequently, this means that

\frac{H_i a_i}{H_f a_f} \ll \frac{H_0 a_0}{H_f a_f} = \frac{H_0 a_0}{H_{eq} a_{eq}}\,\frac{H_{eq} a_{eq}}{H_f a_f},    (509)

which gives

\left(\frac{a_i}{a_f}\right)^{-(3w+1)/2} \ll \left(\frac{a_0}{a_{eq}}\right)^{-1/2}\left(\frac{a_{eq}}{a_f}\right)^{-1}    (510)

\left(\frac{a_f}{a_i}\right)^{-(3w+1)} \gg \left(\frac{a_0}{a_{eq}}\right)\left(\frac{a_{eq}}{a_f}\right)^{2}.    (511)

If one substitutes in a_{eq}/a_0 = (1+z_{eq})^{-1}, and

\frac{a_{eq}}{a_f} = \frac{1+z_f}{1+z_{eq}} = \frac{T_f}{T_{eq}},    (512)

taking T_{eq} \simeq 10^{-30}\,T_P, where T_P is the Planck temperature, this yields

\left(\frac{a_f}{a_i}\right)^{-(1+3w)} \gg 10^{60}\,(1+z_{eq})^{-1}\left(\frac{T_f}{T_P}\right)^{2}.    (513)
Consequently, for an exponential expansion this implies that the number of e-foldings is

N \equiv \ln(a_f/a_i) \gg 60\,\frac{\ln 10 + \ln(T_f/T_P)/30 - \ln(1+z_{eq})/60}{|1+3w|}.    (514)
For most proposed models, 10^{-5} < T_f/T_P < 1, which means that we require N >> 60.
Think about this for a moment. This says that the universe had to grow by a factor of
at least e^{60} during inflation, or a factor of 10^{26}. As we shall see in a moment, this likely
would have to happen during a time interval of order 10^{-32} s. As an aside, note that while
the expansion rate is >> c, this does not violate standard physics since it is spacetime rather
than matter/radiation undergoing superluminal motion. Any particle initially at rest in the
spacetime remains at rest and just sees everything redshift away.
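As a quick check on these numbers, here is a small sketch (Python; the value z_eq ≈ 3400 and the sampled temperature ratios are assumptions) that evaluates the right-hand side of eq. (514):

import numpy as np

def N_required(Tf_over_TP, z_eq=3400.0, w=-1.0):
    # minimum number of e-foldings from eq (514)
    return 60.0 * (np.log(10.0) + np.log(Tf_over_TP) / 30.0
                   - np.log(1.0 + z_eq) / 60.0) / abs(1.0 + 3.0 * w)

for x in (1.0, 1e-5):                  # the range 1e-5 < Tf/TP < 1 quoted above
    print(x, round(N_required(x), 1))  # ~65 and ~54 e-foldings, i.e. N >> 60 roughly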
28.3 Inflation and the Monopole Problem
Let’s now see how inflation helps (or fails to help) some of the other problems. One problem
with the basic Big Bang model is that most grand unification theories (GUTs) in particle
physics predict that magnetic defects are formed when the strong and electroweak forces
decouple. In the most simple case these defects are magnetic monopoles – analogous to
electrons and positrons, but with a magnetic rather than electric charge.
From these theories one finds that magnetic monopoles should have a charge that is a
multiple of the Dirac charge, g_D, such that

g_M = n g_D = 68.5\,n\,e,    (515)

with n = 1 or n = 2, and a mass

m_M \simeq 10^{16}\ {\rm GeV}.    (516)
[Note that this g is charge rather than degrees of freedom!] One amusing comparison that I
came across is that this is roughly the mass of an amoeba (http://www.orionsarm.com/tech/monopoles.html).
[Note that equation 7.6.4 in the book is wrong.] The mass of a magnetic monopole is close
to the energy/temperature of the universe at the symmetry breaking scale (10^{14}-10^{15} GeV).
In some GUT theories, instead of (or in addition to) magnetic monopoles, this symmetry
breaking produces higher dimensional defects (structures) such as strings, domain walls, and
textures (see figure 7.3 in your book). Magnetic monopoles can also be produced at later
times with m ~ 10^5-10^{12} GeV via later phase transitions in some models.
So what's the problem? Well, first of all we don't actually see any magnetic monopoles
or the other more exotic defects. More worrisome though, one can calculate how common
they should be. We are going to skip the details (which are in your book), but the main
point is that the calculation gives

n_M > 10^{-10}\,n_\gamma \simeq n_{0b},    (517)

so there should be as many magnetic monopoles as baryons.
[Question for the class: Could this be “dark matter”? Why (not)?]
Working out the corresponding density parameter, we see that

\Omega_M > \frac{m_M}{m_p}\,\Omega_b \simeq 10^{16}.    (518)
A definite problem. How does inflation help? Well, consider what inflation does – if you
expand the universe by a factor of 10^{60}, then the density of any particles that exist prior to
inflation goes to \Omega \to 0. This is analogous to our picture of the present universe in which
the current accelerated expansion should eventually make the matter density go to zero. In
this case, the density of magnetic monopoles should go to zero as long as inflation occurs
after GUT symmetry breaking (t > 10^{-36} s).
At this point you may be asking how we have a current matter/energy density larger than
zero if inflation devastated the density of pre-existing particles. We will return to this issue
a bit later in our discussion of phase transitions. For now, I will just say that the universe is
expected to gain energy from the expansion in the standard particle physics interpretations,
and all particles/energy that we see today arise at the end of the inflationary era.
28.4 The Flatness Problem
OK – next on the list is the flatness problem. Specifically, why is the universe flat (or at least
very, very close to it)? You might ask why not (and it certainly does simplify the math),
but in truth there’s no a priori reason to expect it to be flat rather than have some other
curvature.
Indeed, if you look at this from a theoretical perspective, the only characteristic scale in
the evolution of the universe is the Planck scale. One might therefore expect that for a closed
universe the lifetime might be t_u \simeq t_P. Similarly, for an open universe one would expect the
curvature to dominate after roughly a Planck time. Clearly not the case in reality!
Let's start by quantifying how flat the universe is. We'll do this for a model without a
cosmological constant, but the same type of derivation is possible (with a bit more algebra)
for a more general model. We can rearrange the Friedmann equation,

\dot a^2 - \frac{8\pi G}{3}\rho a^2 = -Kc^2,    (519)

to find

\left(\frac{\dot a}{a}\right)^2 - \frac{8\pi G}{3}\rho = -\frac{Kc^2}{a^2}    (520)

H^2\left(1 - \rho/\rho_c\right) = -\frac{Kc^2}{a^2}    (521)

H^2(1-\Omega)\,a^2 = -Kc^2    (522)

H^2\,\frac{\rho}{\rho_c}\,\frac{1-\Omega}{\Omega}\,a^2 = -Kc^2    (523)

(\Omega^{-1} - 1)\,\rho a^2 = \frac{-3Kc^2}{8\pi G} = {\rm constant},    (524)

so

(\Omega^{-1} - 1)\,\rho a^2 = (\Omega_0^{-1} - 1)\,\rho_0 a_0^2.    (525)
We can put this in terms of more observable parameters. Since we know that \rho \propto a^{-4}
for the radiation-dominated era and \rho \propto a^{-3} for the matter-dominated era, we can use the
above to solve for the density parameter at early times. Specifically,

(\Omega^{-1} - 1)\,\rho a^2 = (\Omega_{eq}^{-1} - 1)\,\rho_{eq} a_{eq}^2,    (526)

and

(\Omega_{eq}^{-1} - 1)\,\rho_{eq} a_{eq}^2 = (\Omega_0^{-1} - 1)\,\rho_0 a_0^2.    (527)

So, using \rho \propto a^{-4} between a and a_{eq} (radiation era),

(\Omega^{-1} - 1)\left(\frac{a}{a_{eq}}\right)^{-2} = (\Omega_{eq}^{-1} - 1),    (528)

and using \rho \propto a^{-3} between a_{eq} and a_0 (matter era),

(\Omega_{eq}^{-1} - 1)\left(\frac{a_{eq}}{a_0}\right)^{-1} = (\Omega_0^{-1} - 1),    (529)

i.e.

(\Omega_{eq}^{-1} - 1) = (\Omega_0^{-1} - 1)\left(\frac{a_0}{a_{eq}}\right)^{-1},    (530)
and therefore

\frac{\Omega^{-1}-1}{\Omega_0^{-1}-1} = \left(\frac{a}{a_{eq}}\right)^2\left(\frac{a_0}{a_{eq}}\right)^{-1} = (1+z_{eq})^{-1}\left(\frac{T_{eq}}{T}\right)^2 \simeq 10^{-60}\left(\frac{T_P}{T}\right)^2.    (531)
Consequently, even for an open model with \Lambda = 0, \Omega_0 = 0.3, the above result requires that
\Omega_P^{-1} - 1 \le 10^{-60} at the Planck time. Indeed, right now we know that \Omega_0 + \Omega_\Lambda = 1.02 \pm 0.02
(WMAP, Spergel et al. 2003), which further tightens the flatness constraint in the Planck
era.
28.4.1 Enter inflation
How does inflation help with this one? Well, the basic idea is that inflation drives the
universe back towards critical density, which means that it didn't necessarily have to be so
close to critical density at the Planck time. To see this, divide the history of the universe
into three epochs, as we did before. Going along the same argument as above, we have

(\Omega_i^{-1}-1)\rho_i a_i^2 = (\Omega_f^{-1}-1)\rho_f a_f^2 = (\Omega_{eq}^{-1}-1)\rho_{eq} a_{eq}^2 = (\Omega_0^{-1}-1)\rho_0 a_0^2.    (532)
Rearranging, this gives

\frac{\Omega_i^{-1}-1}{\Omega_0^{-1}-1} = \frac{\rho_0 a_0^2}{\rho_i a_i^2} = \frac{\rho_0 a_0^2}{\rho_{eq} a_{eq}^2}\,\frac{\rho_{eq} a_{eq}^2}{\rho_f a_f^2}\,\frac{\rho_f a_f^2}{\rho_i a_i^2},    (533)

\frac{\Omega_i^{-1}-1}{\Omega_0^{-1}-1} = \left(\frac{a_0}{a_{eq}}\right)^{-1}\left(\frac{a_{eq}}{a_f}\right)^{-2}\left(\frac{a_f}{a_i}\right)^{-(1+3w)},    (534)

or

\left(\frac{a_f}{a_i}\right)^{-(1+3w)} = \frac{\Omega_i^{-1}-1}{\Omega_0^{-1}-1}\,\frac{a_0}{a_{eq}}\left(\frac{a_{eq}}{a_f}\right)^2    (535)

\simeq \frac{1-\Omega_i^{-1}}{1-\Omega_0^{-1}}\,(1+z_{eq})^{-1}\,10^{60}\left(\frac{T_f}{T_P}\right)^2.    (536)
Recall in the horizon section that the horizon problem was solved if

\left(\frac{a_f}{a_i}\right)^{-(1+3w)} \gg (1+z_{eq})^{-1}\,10^{60}\left(\frac{T_f}{T_P}\right)^2.    (537)
The flatness problem is now also resolved as long as the universe is no flatter now than it
was prior to inflation, i.e.

\frac{1-\Omega_i^{-1}}{1-\Omega_0^{-1}} \ge 1.    (538)
To rephrase: the problem in the non-inflationary model is that, at the Planck time, the universe
had to be a factor of 10^{60} closer to critical density than it is now. With inflation, it's
possible to construct cases in which the universe was further from critical density than it is
now. Indeed, inflation can flatten out rather large initial departures from critical density.
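Since (\Omega^{-1}-1)\rho a^2 is conserved (eq. 524, generalized to equation of state w) and \rho \propto a^{-3(1+w)}, the departure from flatness scales as |\Omega^{-1}-1| \propto a^{1+3w}. A two-line sketch (Python; the sample numbers are assumptions) makes the point that inflation shrinks this quantity while radiation domination grows it:

import numpy as np

def flatness_factor(a_ratio, w):
    # factor by which |1/Omega - 1| changes when a grows by a_ratio; |1/Omega - 1| ~ a^(1+3w)
    return a_ratio**(1.0 + 3.0 * w)

print(flatness_factor(np.exp(60.0), -1.0))   # 60 e-folds of w = -1 inflation: ~8e-53, driven flat
print(flatness_factor(1.0e4, 1.0/3.0))       # radiation era: grows as a^2, i.e. 1e8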
28.5 Origin of CMB Fluctuations
Converse to the flatness problem, let us now ask why we see any bumps and wiggles in the
CMB. If all the regions were causally connected, and we know that random fluctuations can't
grow prior to recombination, where did these things come from? Quantum mechanics,
which operates on ridiculously small scales, is the one way in which you can generate
such random fluctuations. Inflation provides an elegant way out of this dilemma. If the
universe expanded exponentially, any quantum fluctuations in the energy density during the
expansion are magnified to macroscopic scales. Since quantum wavefunctions are gaussian,
inflation makes the testable prediction that CMB fluctuations should be gaussian as well.
This is the one currently testable prediction of inflation, and it appears that the fluctuations
are indeed gaussian. We'll talk more about these later.
28.6 The Cosmological Constant Problem: First Pass
Why is there a non-zero cosmological constant that has a value anywhere near the critical
density? This is the basic question. We will return to this in greater detail after our
discussion of phase transitions, but the basic problem is that current particle physics predicts
a cosmological constant (if non-zero) that’s off by about 110 orders of magnitude. Inflation
does not help with this one at all.
28.7 Constraints on the epoch of inflation
Most theories predict that inflation should have occurred at t = 10^{-36}-10^{-32} s. Inflation
must occur no earlier than 10^{-36} s, which is the GUT time, or else we should see magnetic
monopoles or related topological defects. The end time constraint is more nebulous – it
simply must have lasted at least until 10^{-32} s for a 10^{60} increase in the scale factor.
29 The Physics of Inflation
In the previous section we have motivated why a period of accelerated expansion in the early
universe would be a nice thing to have. Now how would one physically achieve such a state?
This question is in fact even more relevant than it was 10 years ago, as we now know that
the universe is currently in the early stage of another accelerated (inflationary!) expansion.
We will not go through all the details, but will qualitatively describe the fundamental (and
partially speculative) physics.
29.1 Phase Transitions
Most of you are familiar with the concept of a phase transition in other contexts. Phase
transitions are defined by abrupt changes in one or more of the physical properties of a
system when some variable (temperature) is changed slightly. Examples of well-known phase
transitions include:
• Freezing and boiling (transformation from the liquid to solid and gas phases).
• Magnetism – materials are ferromagnetic below the Curie temperature, but lose their
ferromagnetism above this temperature
• Superconductivity – for some materials there is a critical temperature below which the
material becomes superconducting.
• Bose-Einstein condensation
What these processes have in common is that as the temperature is lowered slightly
beyond some critical point, the material changes from a disordered to a more ordered state
(i.e. the entropy decreases). For example an amorphous fluid can produce a crystal with an
ordered lattice structure when it solidifies (sodium chloride, quartz, etc). Let the parameter
Φ describe the amount of order in the system. What we are essentially saying is that Φ
increases during the phase transition from the warmer to cooler state. Depending on the
type of system, this order parameter can be defined in assorted ways (like the magnetization
of a ferromagnet), but this basic meaning is unchanged. What we are going to see is that
symmetry breaking in cosmology (i.e. the points at which the forces break apart) can be
considered phase transitions. Think of the universe instantaneously crystallizing to a more
ordered state – quarks for example spontaneously congealing to form hadrons, particles
suddenly gaining mass, etc. These are profound transitions, and are accompanied by a
change in the free energy of the system as the universe settles down to a new minimum state.
A related phrase that I will (without much description) introduce now is the vacuum energy.
The corollary way of thinking about this is that prior to the phase transition the vacuum
has some intrinsic energy density. During the transition this energy is freed and the universe
settles down to a new, lower vacuum energy. Bit nebulous at this point, but we’ll see whether
we can fill in a few details.
Returning to thermodynamics (from which we never quite escape), the free energy of a
system is F = U - TS, where U is the internal energy, T is the temperature, and S is the
entropy. By definition, an equilibrium state corresponds to a minimum in F (i.e. minimizing
the free energy of the system). Consider a case in which for temperatures above the phase
transition the free energy has a minimum at \Phi = 0. During a phase transition, you are
effectively creating new minima at higher values of \Phi. [See figure]
To have true minima, the dependence must be on \Phi^2 rather than \Phi. Why? Consider the
case of a magnet, where the "order parameter" is the magnetization, M. The free energy
doesn't depend on the direction of the magnetic field – only the magnitude of the magnetism
matters (say that ten times fast). Consequently, the free energy must depend on M^2 rather
than M. Put more succinctly, the system needs to be invariant to transformations, so \Phi and
-\Phi need to be treated equally. If we expand F as a power series function of \Phi^2, then we
can write

F(\Phi) \approx F_0 + A\Phi^2 + B\Phi^4.    (539)
If A > 0 and B > 0, then we have a simple curve with the minimum at \Phi = 0 – i.e. the
minimum is in the most disordered state. On the other hand, if A < 0 and B > 0, then
you can create new minima at more ordered states. If you think of the free energy plot as a
potential well, you can see that the phase transition changes the potential and the universe
should roll down to the new minima.
How would you change from one curve to another? Consider the case in which A =
K(T − Tc ). In this case, the sign of A changes when you drop below the critical temperature,
and as you go to lower temperatures the free energy minima for the ordered states get lower
and lower. This type of transition is a second order phase transition. As you can see
from a time sequence, the transition is smooth between T > T_c and T < T_c, and the process
is gradual as the system slowly rolls towards the new minima.
As an alternative, there are also first-order phase transitions. In first-order transitions
the order parameter appears rapidly and the difference in free-energy above and below the
critical temperature is finite rather than infinitesimal. In other words, there is a sharp
change in the minimum free energy right at the critical temperature. The finite change in
the free energy at this transition, ∆F , is the latent heat of the transition (sound familiar
from chemistry?).
Let's look at an example of such a transition. Consider the figure below, in which there
are initially two local minima for T > T_c. As the system cools, these become global minima
at T = T_c, but the system has no way of reaching these minima. At some later time,
after the system has cooled further, it becomes possible for the system to transition to the
more ordered state (either by waiting until the barrier is gone, or via quantum tunneling,
depending on the type of system). In this case, the system rapidly transitions to the new
minimum and releases the latent heat associated with the change in free energy. This process
is supercooling. From a mathematical perspective, one example (as shown in the figure)
can be achieved by making the dependence

F = F_0 + A\Phi^2 + C|\Phi|^3 + B\Phi^4,    (540)

with A > 0, B > 0, and C < 0.
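A minimal numerical sketch of these two cases (Python; the coefficient values are arbitrary assumptions chosen for illustration) locates the free-energy minima of eqs. (539) and (540):

import numpy as np

def F(phi, A, B=1.0, C=0.0, F0=0.0):
    # Landau-style free energy of eqs (539)-(540)
    return F0 + A * phi**2 + C * np.abs(phi)**3 + B * phi**4

phi = np.linspace(-1.0, 1.0, 20001)

# second-order case, eq (539): A = K(T - Tc) changes sign at Tc
for A in (0.2, -0.2):
    print("A =", A, " minimum at phi =", round(phi[np.argmin(F(phi, A))], 3))

# first-order case, eq (540): A > 0, C < 0 puts ordered minima behind a barrier
Fb = F(phi, A=0.2, C=-1.0)
print("first-order minimum at |phi| =", round(abs(phi[np.argmin(Fb)]), 3))

With A < 0 the minimum moves smoothly away from \Phi = 0 (second order); with A > 0 and C < 0 an ordered minimum appears at finite \Phi behind a barrier (first order).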
29.2 Cosmological Phase Transitions
So how does freezing water relate to cosmology? The basic idea is that the Universe undergoes cosmological phase transitions. You may recall that we have used the term “spontaneous
symmetry breaking” in describing the periods during which the fundamental forces separate.
From the particle physics perspective, these events correspond to phase transitions in which
the universe moves from a disordered to a more ordered state (for instance particles acquiring mass, and differentiation of matter into particles with unique properties like quarks and
leptons). The free energy in this interpretation corresponds to a vacuum energy contained
Figure 7 – The x axis is the order parameter, while the y axis is the free energy. The curves
show how the free energy function changes as temperature decreases. The yellow curve is at
the critical temperature; the other two are slightly above and below the critical temperature.
Figure 8 – Similar to previous figure, except that this one shows an example of supercooling
(first-order phase transition). In this case, there is a barrier that prevents the system from
moving to the new minimum until T << T_c, at which point it rapidly transitions to the new
state and releases latent heat.
in some scalar field (referred to as the inflaton field for inflation) – which is equivalent to
the order parameter that we have been discussing (again for reference, temperature is an
example of a scalar field while gravity is an example of a vector field). During the phase
transition this vacuum energy decreases. Note that there are also other phase transitions
in the early universe not associated directly with spontaneous symmetry breaking (think of
the quarks congealing into hadrons for instance). As we shall discuss soon, this vacuum energy can potentially drive a period of inflationary expansion if it dominates the total energy
density. Meanwhile, the latent heat released during the phase transition is also key.
So when during the early universe are cosmological phase transitions possible? Well,
basically most of the time, as the constituents in the universe are rapidly evolving and new,
more ordered structures are forming. The most dramatic phase transitions correspond to
the spontaneous symmetry breaking scales when the forces separate, but it is possible to
have other phase transitions along the way. Your book attempts to divide this time up until
the quark-hadron transition into intervals characterized by the types of phase transitions
occurring at each epoch. This is a reasonable approach, and we shall review these periods
here.
• The Planck Time (~ 10^{19} GeV) – This is the point at which we require a quantum
theory of gravity, and for which any super grand unification theory must unify gravity
with the other forces above this temperature.

• GUT (~ 10^{15} GeV) – This is the temperature at which the strong and electroweak
forces break apart. The GUT scale is when magnetic monopoles are expected to form,
so we require a period of inflation during or after this epoch. We want inflation to
occur very near this epoch though, because only at and above this temperature do
most current models permit creation of a baryon-antibaryon asymmetry. This is not
a hard constraint though, as it is possible to construct scenarios in which baryon
conservation is violated at somewhat lower temperatures.

• Between the GUT and Electroweak scales. The main point in the book is that the
timescale between GUT and Electroweak is from 10^{-37}-10^{-11} s, which logarithmically
leaves a lot of time for other phase transitions to occur. These phase transitions would
not be associated with symmetry breaking.

• Electroweak scale to quark-hadron transition. The universe undergoes a phase transition
when the weak and electromagnetic forces split. It's at this point that leptons
acquire mass, incidentally. Also in this category is the (much lower temperature) quark-hadron
transition, at which free quarks are captured into hadrons.
Any and all of the above transitions can yield a change in the vacuum energy. Not all
of the above however can cause inflation. Keep in mind as we go along that for inflation to
occur, the vacuum energy must dominate the total energy density.
29.3 Return to the Cosmological Constant Problem
As promised, we're now going to finish up talking about the cosmological constant problem
– specifically the problem of how it can be non-zero and small. Recall that the density
corresponding to the cosmological constant (WMAP value) is given by

|\rho_\Lambda| = 0.7\,\rho_c = \frac{\Lambda c^2}{8\pi G} = 1.4\times10^{-29}\ {\rm g\ cm^{-3}} \simeq 10^{-48}\ {\rm GeV^4}.    (541)

Equivalently, one can compute the value of \Lambda, finding \Lambda = 10^{-55}\ {\rm cm^{-2}}.
Small numbers, but is this a problem? The cosmological constant is often interpreted as
corresponding to the vacuum energy of some scalar field. This is analogous to the discussion
of free energy that we saw in previous sections. Modern gauge theories in particle physics
predict that this vacuum energy corresponds to an effective potential,

\rho_v \approx V(\Phi, T),    (542)

and that the drop in the vacuum energy at a phase transition should be of the order

\Delta\rho_v \approx \frac{m^4}{(\hbar c)^3},    (543)

where m is the mass scale of the relevant phase transition. This density change corresponds
to 10^{60} GeV^4 for the GUT scale and values of 10^{-4}-10^{12} GeV^4 for other transitions (like
the electroweak).
Now, if we take all the phase transitions together, this says that

\rho_v(t_{Planck}) = \rho_v(t_0) + \sum_i \Delta\rho_v(m_i) \approx 10^{-48} + 10^{60}\ {\rm GeV^4} = \sum_i \Delta\rho_v(m_i)\,(1 + 10^{-108}).    (544)
In other words, starting with the vacuum density before GUT, the current vacuum density
is a factor of 10^{108} smaller – and this value is very close to the critical density. Your book
regards this as perhaps the most serious problem in all of cosmology. An even greater
mystery, I would argue, is why we find ourselves at a point in the history of the universe
during which we are just entering a new inflationary phase.
29.4 Inflation: Putting the Pieces Together
We’ve now defined inflation in terms of its impact upon the growth of the scale factor (ä > 0),
explored how it can resolve some key problems with the basic big bang, and done a bit of
background regarding phase transitions. Seems like a good idea, so time to start assembling
a coherent picture of how one might incorporate inflation into the Big Bang model. At a
very basic level, all inflationary models have the following properties:
• There must be an epoch in the early universe in which the vacuum energy density,
ρ ∝ V (Φ), dominates the total energy density.
• During this epoch the expansion is accelerated, which drives the radiation and matter
density to zero.
• Vacuum energy is converted into matter and radiation as \Phi oscillates about the new
minimum. This reheats the universe back to a temperature near its value prior to inflation,
with all previous structure having been washed out.
• This must occur during or after the GUT phase to avoid topological defects. (note:
some versions don’t address this directly)
We will now consider the physics of general inflationary models and then discuss a few
of the zoo of different flavors of inflation. For a scalar field \Phi, the Lagrangian of the field is

L_\Phi = \frac{1}{2}\dot\Phi^2 - V(\Phi, T),    (545)

analogous to classical mechanics. Note that the scalar field \Phi is the same as the order
parameter that we have been discussing and the potential V is analogous to the free energy.
The density and pressure associated with this scalar field are

\rho_\Phi = \frac{1}{2}\dot\Phi^2 + V(\Phi, T)    (546)

p_\Phi = \frac{1}{2}\dot\Phi^2 - V(\Phi, T).    (547)
Consider the case of a first-order phase transition (supercooling). In this case, the phase
transition does not occur until some temperature Tb < Tc , at which point Φ assumes the
new minimum value. If this transition is assumed to occur via either quantum tunneling or
thermal fluctuations (rather than elimination of the barrier), then the transition will occur
in a spatially haphazard fashion. In other words, the new phase will appear as nucleating
bubbles in the false vacuum, which will grow until the entire Universe has settled to the new
vacuum.
On the other hand, if the transition is second order, the process is more uniform as
all regions of space descend to the new minima simultaneously. Note however that not all
locations will descend to the same minimum, so you will end up with "bubbles" or domains.
The idea is that one such bubble should eventually encompass our portion of the Universe.
Now, how does this evolution occur? We'll phrase this in terms of the equation of motion
for the scalar field,

\frac{d}{dt}\frac{\partial(L_\Phi a^3)}{\partial\dot\Phi} - \frac{\partial(L_\Phi a^3)}{\partial\Phi} = 0,    (548)

which, using the Lagrangian above, gives

\ddot\Phi + 3\frac{\dot a}{a}\dot\Phi + \frac{\partial V(\Phi)}{\partial\Phi} = 0.    (549)
Let's look at this above equation. If we ignore the \dot a term, then this is equivalent to a
ball oscillating back and forth in the bottom of a potential well. In this analogy, the 3\dot a/a
term then corresponds to friction damping the kinetic energy of the ball. It is standard in
inflation to speak of the vacuum as "rolling down" to the new minimum. More specifically, at
the start of inflation one normally considers what is called the "slow roll phase", in which
the kinetic energy is << the potential energy. This corresponds to the case in which the
motion is friction dominated, so the ball slowly moves down to the new minimum.
Remember that inflation causes \rho_r and \rho_m to trend to zero, so the Friedmann equation
during the phase transition is approximately

\left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G\rho_\Phi}{3} = \frac{8\pi G}{3}\left[\frac{1}{2}\dot\Phi^2 + V(\Phi, T)\right].    (550)

In the slow roll phase, this reduces to

\left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G}{3}V(\Phi, T),    (551)

so

a \propto \exp(t/\tau),    (552)

where

\tau \simeq \left(\frac{3}{8\pi G V}\right)^{1/2}.    (553)
For most models the above timescale works out to roughly \tau = 10^{-34} s. Since we need
a minimum of 60 e-foldings to solve the horizon problem, this means that the inflationary
period should last for at least 10^{-32} s. Note that this assumes inflation starts right at the
phase transition. It's possible to have ongoing inflation for a while before this, but you still
want to have a large number of e-foldings to get rid of monopoles and other relics produced
at the GUT temperature.
As the roll down to the new minimum proceeds, the field eventually leaves the slow
roll phase and rapidly drops down towards, and oscillates about, the new minimum. These
oscillations are damped by the creation of new particles (i.e. conversion of the vacuum
energy into matter and radiation). Mathematically, this corresponds to the addition of a
damping term in the equation of motion:

\ddot\Phi + 3\frac{\dot a}{a}\dot\Phi + \Gamma\dot\Phi + \frac{\partial V(\Phi)}{\partial\Phi} = 0.    (554)
Physically, this has the effect of reheating the universe back up to some temperature,
T < Tcrit , after which we proceed with a normal Big Bang evolution. Note that this new
temperature has to be sufficiently high for baryosynthesis.
So to summarize, the best way to look at things is like this:

1. Vacuum energy starts to dominate, initiating an inflationary expansion.

2. Inflation cools us through a phase transition, which initiates a slow roll down to the
new minimum. Inflation continues during this epoch.

3. Slow roll phase ends and the vacuum drops to the new minimum. Inflation ends.

4. Scalar field oscillates around this new minimum, releasing energy via particle production
until it settles in. This released energy reheats the universe.

5. Back to the way things were before the inflationary period.
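The whole sequence above can be seen in a single numerical integration of eqs. (549) and (550). The sketch below (Python with scipy; the quadratic potential V = ½m²Φ² with minimum at Φ = 0, the mass m = 0.1, and the initial displacement Φ = 3 in units with m_P = 1 are all illustrative assumptions, and the Γ reheating term is omitted) shows a long slow roll followed by oscillations about the minimum:

import numpy as np
from scipy.integrate import solve_ivp

G, m = 1.0, 0.1                        # units with m_P = G^(-1/2) = 1 (assumed)
V  = lambda p: 0.5 * m * m * p * p     # assumed potential with minimum at Phi = 0
dV = lambda p: m * m * p

def rhs(t, y):
    phi, phidot, lna = y
    H = np.sqrt(8.0 * np.pi * G / 3.0 * (0.5 * phidot**2 + V(phi)))  # eq (550)
    return [phidot, -3.0 * H * phidot - dV(phi), H]                  # eq (549)

sol = solve_ivp(rhs, [0.0, 2000.0], [3.0, 0.0, 0.0], rtol=1e-8, max_step=1.0)
print("total e-foldings:", sol.y[2, -1])   # ~55 for these assumed numbers

Once the field reaches the minimum it oscillates with frequency m, and H drops; with a \Gamma\dot\Phi term these oscillations would decay into particles, i.e. reheating.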
29.5 Types of Inflation
OK – so what are the types of inflation? Inflation as a topic could fill the better part of a
semester, with much of the time devoted to the various flavors. Here I am aiming to provide
just an overview of inflation, and will continue that theme with a sparse sampling of types
of inflation.
29.5.1 Old Inflation
The original inflationary model (Guth, 1981) suggested that inflation is associated with a
first-order phase transition. As we discussed, a first-order phase transition implies a spatially
haphazard transition. It turns out that the bubbles produced in this way are too small for
our Universe and never coalesce into a larger bubble, so this model was quickly abandoned.
29.5.2 New Inflation
Shortly after the work by Guth, Andrei Linde (1982) proposed a new version with a second-order rather than first-order phase transition. It turns out that a second order transition
leaves larger spatial domains, and enables the entire universe to be in a single bubble with
the same value of Φ. New inflation has several problems though that inspired other versions.
(see your book for details)
29.5.3 Chaotic Inflation
Chaotic inflation (Linde 1983) was an interesting revision in that it does not require any
phase transitions. Instead, the idea is that near the Planck time Φ (whatever it is) varies
spatially. Consider an arbitrary potential V(Φ) with the one condition that the minimum
is at Φ = 0. Now, take a patch of the universe with a large, non-zero value of Φ. Clearly,
within this region Φ will evolve just as it would right after a second-order phase transition
– starting with a slow roll and eventually reheating and settling into the minima.
The mathematics is the same as before – the main difference now is that we’ve removed
the connection between inflation and normal particle physics. It’s completely independent
of GUT or any other phase transitions.
29.5.4 Stochastic Inflation
Stochastic, or eternal, inflation is an extension of chaotic inflation. Starting with an inhomogeneous universe, the stochastic model incorporates quantum fluctuations as Φ evolves. The
basic idea then is that there are always portions of the universe entering the inflationary
phase, so you have many independent patches of universe that inflate at different times.
What’s kind of interesting about this approach is that it brings us full circle to the Steady
State model in the sense that there is no overall beginning or end – just an infinite number
of Hubble patches evolving separately infinitely into the future.
30 Cosmic Microwave Background
[Chapter 17]
It is now time to return for a more detailed look at the cosmic microwave background –
although not as detailed a look as one would like due to time constraints on this class. We
are now in what should be considered the “fun” part of the term – modern cosmology and
issues that remain relevant/open at the present time. Let us start with a qualitative look
at the CMB and how the encoded information can be represented. We will then discuss the
underlying physics in greater detail and play with some animations and graphics on Wayne
Hu’s web page.
30.1 Extracting information from the CMB
The structure observed in the CMB, as we will see, is a veritable treasure trove of information.
It provides a picture of the matter distribution at the epoch of recombination, constrains
a host of cosmological parameters, and provides information on assorted physics that has
occurred subsequent to recombination (such as the epoch of reionization). Before we get to
the physics though, a first question that we will discuss is how one might go about extracting
the essential information from a 2-d map of the CMB sky.
The standard approach is to parameterize the sky map in terms of spherical harmonics,
such that

\frac{\Delta T}{\langle T\rangle} \equiv \frac{T - \langle T\rangle}{\langle T\rangle} = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} a_{lm}\,Y_{lm}(\theta, \phi),    (555)

where the Y_{lm} are the standard spherical harmonics familiar from quantum mechanics or
(helio)seismology,

Y_{lm}(\theta, \phi) = \left[\frac{2l+1}{4\pi}\frac{(l-m)!}{(l+m)!}\right]^{1/2} P_{lm}(\cos\theta)\,e^{im\phi},    (556)
with the P_{lm} being the associated Legendre polynomials,

P_{lm}(\cos\theta) = \frac{(-1)^m}{2^l\,l!}\left(1-\cos^2\theta\right)^{m/2}\frac{d^{l+m}}{d\cos^{l+m}\theta}\left(\cos^2\theta - 1\right)^l.    (557)
Now, for a given map the coefficients a_{lm} are not guaranteed to be real – in general they will
be complex numbers. The more physical quantity to consider is the power in each
mode, which is defined as

C_l \equiv \langle |a_{lm}|^2 \rangle.    (558)
As we will see in a moment, the angular power spectrum, measured in terms of C_l, is the
fundamental observable for CMB studies. Specifically, when we see a typical angular power
spectrum for the CMB, the y axis is given by [l(l+1)C_l/(2\pi)]^{1/2}. The units are µK, and this
can physically be thought of as the amplitude of temperature fluctuations \Delta T/T for a given
angular scale, appropriately normalized.
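To see how this bookkeeping works in practice, here is a toy sketch (Python; the flat 50 µK band power and the multipole range are arbitrary assumptions) that draws Gaussian a_lm for a known spectrum and then recovers the power via eq. (558):

import numpy as np

rng = np.random.default_rng(1)
ells = np.arange(2, 51)
Cl_in = (50.0**2) * 2.0 * np.pi / (ells * (ells + 1.0))   # flat 50 µK band power (assumed)

Cl_hat = np.zeros_like(Cl_in)
for i, l in enumerate(ells):
    a_l0 = rng.normal(0.0, np.sqrt(Cl_in[i]))             # m = 0 mode (real)
    re = rng.normal(0.0, np.sqrt(Cl_in[i] / 2.0), l)      # m > 0 modes (complex)
    im = rng.normal(0.0, np.sqrt(Cl_in[i] / 2.0), l)
    Cl_hat[i] = (a_l0**2 + 2.0 * np.sum(re**2 + im**2)) / (2 * l + 1)   # eq (558)

band = np.sqrt(ells * (ells + 1.0) * Cl_hat / (2.0 * np.pi))
print(band[:5])   # scatters about 50 µK; the scatter is cosmic variance, ~sqrt(2/(2l+1))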
First though, let's consider the physical interpretation of different l modes. The l = 0
mode corresponds to a uniform offset in temperature, and thus can be ignored. The l = 1
mode is the dipole mode. This term, which for the CMB is several orders of magnitude larger
than any other terms, is interpreted as being due to our motion relative to the CMB.
How does this affect the temperature? Assume that our motion is non-relativistic (which
is the case). In this case, the observed frequency of the CMB is shifted by a factor \nu' =
\nu(1 + \beta\cos\theta), where \beta = v/c and \theta = 0 is defined as the direction of our motion relative to
the CMB. For a black-body spectrum it can be shown that this corresponds to a temperature
distribution

T(\theta) = T_0(1 + \beta\cos\theta).    (559)

Thus, the lowest order anisotropy in the CMB background tells us our velocity (both
speed and direction) relative to the microwave background, and hence essentially relative to
the cosmic rest frame. Not a bad start.
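Numerically, eq. (559) turns the observed dipole directly into a speed (a sketch; the dipole amplitude of roughly 3.35 mK is an approximate literature value, not from the notes):

T0, dT = 2.725, 3.35e-3         # CMB mean temperature and dipole amplitude [K] (approximate)
beta = dT / T0                  # from T(theta) = T0 (1 + beta cos(theta))
print(beta * 2.998e5, "km/s")   # ~370 km/s, our speed relative to the CMB rest frame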
Moving beyond the dipole mode, the l ≥ 2 modes are due primarily to intrinsic anisotropy
produced either at recombination or by subsequent physics. These are the modes that we
care most about. The book provides a rough guide that the angular scale of fluctuations for
large values of l is θ ≃ 60◦ /l – more useful and correct numbers to keep in mind are that
l = 10 corresponds to about 10◦ and l = 100 to about 1◦ .
30.2 Physics
See http://background.uchicago.edu/~whu/intermediate/intermediate.html
The quick summary is that the peaks in the CMB angular power spectrum are due
to acoustic oscillations in the plasma at recombination. The first peak corresponds to a
fundamental mode with size equal to the sound horizon at recombination, while the higher
order peaks are harmonics of this fundamental mode. The location of the first peak depends
upon the angular diameter distance to the CMB, and is consequently determined primarily
by the spatial curvature (with some dependence upon Λ). The relative amplitude of the
second peak constrains the baryon density, while the third peak can be used to measure
the total matter density. Meanwhile, the damping tail provides a cross-check on the above
measurements. Finally, if it can be measured the polarization provides a means of separating
the effects of reionization epoch and gravitational waves. Note that the currently measured
power spectrum of temperature fluctuations is commonly referred to as the scalar power
spectrum (since temperature is a scalar field). Polarization on the other hand also probes
the tensor and vector power spectrum.
30.3 CMB Polarization and Inflation
[see section 13.6 in your book] One of the predictions of inflation is the presence of
gravitational waves, which alter the B-mode of the CMB tensor power spectrum. If one can measure
this polarization, then one can constrain the nature of the inflation potential. Consider the
equation of motion for a scalar field \phi,

\ddot\phi + 3H\dot\phi + V'(\phi) = 0.
Let us define two quantities, which we will refer to as "slow roll parameters", that together
define the shape of the inflation potential (in units with \hbar = c = 1, so that G = m_P^{-2}):

\epsilon = \frac{m_P^2}{16\pi}\left(\frac{V'}{V}\right)^2 \qquad \eta = \frac{m_P^2}{8\pi}\,\frac{V''}{V}
where m_P is the Planck mass, V = V(\phi), and all derivatives are with respect to \phi. In the
slow roll regime, the equation of motion is dominated by the damping term, so

\dot\phi = -\frac{V'}{3H}.

Additionally, the slow roll parameters must both be much less than 1. The requirement
\epsilon << 1 corresponds to V >> \dot\phi^2 – which is the condition necessary for inflation to occur.
The requirement that |\eta| << 1 can be derived from the other two conditions, so is considered
a consistency requirement for the previous two requirements.
We will (if time permits) later see that the primordial power spectrum for structure
formation is normally taken to have the form P_k \propto k^n, where k is the wavenumber. The case
of n = 1 is scale invariant and called the Harrison-Zel'dovich power spectrum. The scalar
and tensor power spectra,

P_k \propto k^n \qquad P_k^T \propto k^{n_T},

are related to the inflation potential via their indices,

n = 1 - 6\epsilon + 2\eta \qquad n_T = -2\epsilon,

where here \epsilon and \eta correspond to the values when the perturbation scale k leaves the horizon.
We now know that n \simeq 1, as expected in the slow-roll limit. A measurement of the tensor
power spectrum provides the information necessary to separately determine \epsilon and \eta and
hence recover the derivatives of the inflaton potential. Now, how much power is in the
tensor spectrum compared to the scalar power spectrum? The ratio is

r = \frac{C_l^T}{C_l^S} = 12.4\,\epsilon.
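As a worked example of these relations (a sketch; the quadratic potential and the field value φ = 3 m_P at horizon exit are assumptions, not from the notes):

import numpy as np

m_P = 1.0   # Planck mass in our units

def slow_roll_observables(V, dV, d2V, phi):
    # slow-roll parameters and spectral indices from the relations above
    eps = m_P**2 / (16.0 * np.pi) * (dV(phi) / V(phi))**2
    eta = m_P**2 / (8.0 * np.pi) * d2V(phi) / V(phi)
    n, n_T, r = 1.0 - 6.0 * eps + 2.0 * eta, -2.0 * eps, 12.4 * eps
    return eps, eta, n, n_T, r

# quadratic potential V = m^2 phi^2 / 2 at phi = 3 m_P (assumed); the mass scale
# cancels out of every ratio, so we can set it to 1
V, dV, d2V = (lambda p: 0.5 * p * p), (lambda p: p), (lambda p: 1.0)
print(slow_roll_observables(V, dV, d2V, 3.0))   # eps = eta ~ 0.009, n ~ 0.96, r ~ 0.11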
Upcoming CMB experiments are typically aiming for r ~ 0.1, i.e. pushing to amplitudes a factor
of 10 smaller than were needed for measuring the scalar spectrum.
Now, the real challenge lies in separating the tensor signal from gravitational waves
from the other tensor signals, like gravitational lensing. As can be seen in the figures
presented in class, gravitational lensing is the dominant signal, and it is only at small l (large
angular scales) that one can reasonably hope to detect the B-mode signal from gravitational
waves associated with inflation.
30.4 Free in the CMB: Sunyaev-Zeldovich
Obviously, in the discussion above we have focused solely on the physics of the CMB and
ignored the ugly observational details associated with foreground sources that contaminate
the signal. While we will largely skip this messy subject, it is worthwhile to note that one
person's trash is another's treasure. In particular, perhaps the most interesting foregrounds
are galaxy clusters, which are visible via what is known as the Sunyaev-Zeldovich effect.
Physically, the Sunyaev-Zeldovich effect is inverse Compton scattering. The CMB photons
gain energy by Thomson scattering off the ionized intracluster medium (temperature of order
a few million degrees K). If one looks at the Rayleigh-Jeans (long-wavelength) tail of the
CMB spectrum, one consequently sees a decrement – the sky looks cooler at the location of
the cluster than the surrounding sky. At shorter wavelengths, one can instead see an enhancement
of photons, so the sky looks hotter. This is a rather distinctive observational signature, and really the only
way that I know of to generate a negative feature on the CMB. Now, there are actually two
components to the SZ effect – the thermal and kinematic SZ. Essentially, the exact frequency
dependence of the modified spectrum is a function of the motion of the scattering electrons.
The part of the effect due to the random motions of the scattering electrons is called the
thermal SZ effect; the part due to bulk motion of the cluster relative to the CMB is called
the kinematic SZ effect. The thermal component is the part upon which people generally
focus at this point in time. For a radiation field passing through an electron cloud, there
is a quantity called the Comptonization factor, y, which is a dimensionless measure of the
time spent by the radiation in the electron distribution. Along a given line of sight,

y = \int dl\, n_e \sigma_T \frac{k_B T_e}{m_e c^2},    (560)
where σT is the Thomson cross-section. For the thermal SZ, along a given line of sight
ne = ne (r) and Te = Te (r), where r is the cluster-centric distance.
Essentially, y gives a measure of the signal strength ("flux"). If the cluster is modelled
as a homogeneous, isothermal sphere of radius R_c, one finds that the maximum temperature
decrement in the cluster center is given by

\frac{\Delta T}{T} = -\frac{4 R_c n_e k_B T_e \sigma_T}{m_e c^2} \propto R_c T_e,    (561)
where ne and Te are again the electron density and temperature in the cluster. Both quantities scale with the cluster mass.
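For a feel for the numbers, here is a sketch evaluating eq. (561) (the cluster parameters n_e = 10^-3 cm^-3, k_B T_e = 5 keV, and R_c = 0.25 Mpc are assumed typical values, not from the notes):

sigma_T, mec2, Mpc = 6.652e-25, 511.0, 3.086e24  # Thomson cross section [cm^2], m_e c^2 [keV], [cm]

n_e, kTe, R_c = 1.0e-3, 5.0, 0.25 * Mpc          # cm^-3, keV, cm (assumed cluster values)

dT_over_T = -4.0 * R_c * n_e * kTe * sigma_T / mec2   # eq (561), isothermal sphere
print(dT_over_T, dT_over_T * 2.725e6, "microK")       # ~ -2e-5, i.e. a ~ -55 microK decrement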
Now, there is something very important to note about both of the previous two equations.
Both of them depend upon the properties of the cluster (ne ,Te ), but are independent of the
distance to the cluster. What this means is that SZ surveys are in principle able to detect
uniform, roughly mass-limited samples of galaxy clusters at all redshifts. The relevance to
cosmology is that the redshift evolution of the cluster mass function is a very strong function
of cosmological parameters (particularly ΩM and w), so measuring the number of clusters
above a given mass as a function of redshift provides important information. The key point is
that this is an extremely sensitive test. The big stumbling block with cluster mass functions
is systematic rather than statistical – relating observed quantities to mass. A nice aspect
of the SZ approach is that they should be roughly mass-limited, although you still want to
have other data (x-ray, optical) to verify this.
Observationally, the SZ folk have been "almost" ready to conduct blind cluster searches for
about a decade (even when I was starting grad school), but it is only in the past year that
clusters have begun to be discovered in this way.
Another application of the SZ effect, which is perhaps less compelling these days, is direct
measurement of the Hubble parameter. This is done by using the ∆T /T relation to get Rc
and then measuring the angular size of the cluster. When done for an ensemble of clusters
to minimize the statistical errors, this can be used to obtain H0 (or more generally ΩM and
ΩΛ if one spans a large redshift baseline). In practice, large systematic uncertainties have
limited the usefulness of this test.
The above is a very quick discussion. If you are particularly interested in the SZ effect,
I recommend Birkinshaw astro-ph/9808050.
31 Dark Matter
Time to turn our attention to the dark side of the universe, starting with dark matter. The
general definition of dark matter is any matter from which we cannot observe electromagnetic
radiation. By this definition, we include such mundane objects as cool white dwarfs as well as
more exotic material. As we shall see though, there is now strong evidence for a component
of exotic, non-baryonic dark matter that dominates the total matter density.
31.1 Classic Observational Evidence
Galaxy Clusters
The first evidence for dark matter was the observation by Fritz Zwicky (1933) that the
velocity dispersion of the Coma cluster is much greater than can be explained by the visible
matter. This is a simple application of standard dynamics, where
\frac{GM}{r^2} = \frac{v^2}{r} = \frac{2\sigma^2}{r}    (562)

\frac{GM}{r} = 2\sigma^2    (563)

\frac{GL}{r}\,\frac{M}{L} = 2\sigma^2    (564)

\frac{M}{L} = \frac{2\sigma^2 r}{GL},    (565)
where L is the total cluster luminosity and M/L is the mass-to-light ratio. Typical stellar
mass to light ratios are of order a few (M⊙ /L⊙ = 1; integrated stellar populations M/L <
10). If you plug in appropriate numbers for galaxy clusters, you get M/L ∼ 200 [100-500]
– a factor of 10-50 higher than the stellar value. This was the first direct evidence that the
bulk of matter on cluster scales is in a form other than stars.
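Plugging rough Coma-like numbers into eq. (565) (a sketch; σ = 1000 km/s, r = 1 Mpc, and L = 2e12 L_sun are assumed illustrative values) shows the size of the discrepancy:

G, M_sun, Mpc = 6.674e-11, 1.989e30, 3.086e22    # SI units

sigma = 1.0e6            # line-of-sight velocity dispersion: 1000 km/s in m/s (assumed)
r, L  = 1.0 * Mpc, 2.0e12                         # radius [m] and luminosity [L_sun] (assumed)

M = 2.0 * sigma**2 * r / G / M_sun                # eq (565) rearranged for mass
print(M, M / L)                                   # ~5e14 M_sun, so M/L ~ 230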
In recent years other observations have confirmed that clusters indeed have such large
masses (gravitational lensing, X-ray temperatures), and M/L has been shown to be a function of the halo mass – i.e. lower mass-to-light ratios for smaller systems (see figure in class).
Still, this observation was considered little more than a curiosity until complementary observations of galaxy rotation curves in the 1970’s.
Rotation Curves
In the early 1970's Rubin and Ford compiled the first large sample of galaxy rotation curves,
finding that the rotation curves were flat at large radii. In other words, the rotation curves
remain flat rather than falling in the Keplerian fashion expected from the visible mass, which
argues that the observed
disk is embedded in a more massive halo component. These observations were the ones that
elevated the idea of dark matter from an idle curiosity to a central feature of galaxies that
required explanation. Subsequent work also showed that the presence of a massive halo is
actually required in galactic dynamics to maintain disk stability, and the above data played
a key role in influencing the later development of the cold dark matter model of structure
formation (Blumenthal et al. 1984).
31.2 Alternatives
Is there any way to avoid the consequence of dark matter? The most popular alternative is
to modify gravity at large distances. One of the more well-known of these theories is called
Modified Newtonian Dynamics (MOND, Milgrom 1983). The idea here is to change the
Newtonian force law at small accelerations from F = ma to F = \mu ma, where \mu = 1 if a > a_0
and \mu = a/a_0 if a < a_0. In our normal everyday experience, we experience a > a_0, so the
modification to the acceleration would only matter for very small accelerations. Now, if we
consider the gravitational attraction of two objects,

F = \frac{GMm}{r^2} = \mu ma.    (566)
If we assume that at large distances a < a_0 so that \mu = a/a_0, then

\frac{GM}{r^2} = \frac{a^2}{a_0}    (567)

a = \frac{\sqrt{GMa_0}}{r}.    (568)

For a circular orbit,

a = \frac{v^2}{r} = \frac{\sqrt{GMa_0}}{r},    (569)

so

v = (GMa_0)^{1/4}.    (570)
As you can see, this yields a circular velocity that is constant with radius – a flat rotation
curve. One can calculate the required constant for the galaxy, finding a_0 \simeq 10^{-10} m s^{-2}.
Similar arguments can be made for explaining the cluster velocity dispersions.
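A short sketch (Python; the galaxy mass, a_0 = 10^-10 m s^-2, and the "simple" interpolating function µ(x) = x/(1+x), chosen because it inverts analytically, are all assumptions for illustration) shows the rotation curve flattening toward (GMa_0)^{1/4}:

import numpy as np

G, M_sun, kpc = 6.674e-11, 1.989e30, 3.086e19
M, a0 = 1.0e11 * M_sun, 1.0e-10             # point-mass galaxy and MOND scale (assumed)

r = np.array([2.0, 10.0, 30.0, 100.0]) * kpc
a_N = G * M / r**2                           # Newtonian acceleration
# invert mu(a/a0) a = a_N for the "simple" mu(x) = x/(1+x):
a = 0.5 * (a_N + np.sqrt(a_N**2 + 4.0 * a_N * a0))

print(np.sqrt(a_N * r) / 1e3)   # Newtonian v [km/s]: falls off as r^(-1/2)
print(np.sqrt(a * r) / 1e3)     # MOND v [km/s]: flattens near (G M a0)^(1/4) ~ 190 km/s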
A limitation of MOND is that, like Newtonian gravity, it is not Lorentz covariant. Consequently, just as GR is required as a foundation for cosmology, one would need a Lorentz
covariant version of the theory to test it in a cosmological context. There is now one such
Lorentz covariant version, TeVeS (Tensor-Vector-Scalar theory; Bekenstein 2004), from which
one can construct cosmological world models. However, in order to provide a viable alternative to dark matter, TeVeS – or any other modified gravity theory – must be as successful
as dark matter in explaining a large range of modern cosmological observations, including
our entire picture of structure formation from initial density fluctuations.
31.3 Modern Evidence for Dark Matter
So why do we believe that dark matter exists? While modified gravity is an interesting
means of attempting to avoid the presence of dark matter, at this point I would argue that
we have a preponderance of evidence against this hypothesis. One relatively clean example
is the Bullet Cluster. For this system, we (Clowe et al. 2004,2006) used weak lensing to
demonstrate that the mass and intracluster gas (which contains the bulk of the baryons)
are offset from one another due to viscous drag on the gas. Hence the baryons cannot be
responsible for the lensing and there must be some other component causing the lensing.
The TeVeS community has attempted to reconcile this observation with modified gravity,
but are unable to do so using baryons alone. They are able to manage rough qualitative (and
I would argue poor) agreement if they assume that 80% of the total matter density is in
2 eV neutrinos. [It is worth noting that 2 eV is the maximum mass that a neutrino can
have if one relies on constraints that are independent of GR, but new experiments should be
running in the next few years that will significantly lower this mass limit.] Thus, even with
modified gravity one still requires 80% of the total matter density to be 'dark'.
Aside from this direct evidence, a compelling argument can be made based upon the
remarkable success of the cold dark matter model in explaining the growth and evolution of
structure in the Universe. Dark matter provides a means for seed density fluctuations to grow
prior to the surface of last scattering, and CDM reproduces the observed growth of structure
in the Universe from the CMB to z = 0. It is not obvious a priori that this should be the
case. As we have seen, the cosmic microwave background provides us with a measurement
of the ratio between total and baryonic matter, arguing that there is roughly a factor of 7
more matter than the baryon density, and yields a measurement of the total matter density
(assuming GR is valid). These results from the CMB, with the baryon density confirmed
by Li abundance measurements, yield densities that, when used as inputs to CDM, produce
the observed structures at the present. The fact that the bulk of the total matter is dark
matter seems unavoidable.
31.4 Baryonic Dark Matter
So what is dark matter? From the CMB observations we now have convincing evidence
that much of the dark matter is non-baryonic. Baryonic dark matter is worth a few words
though, as it actually dominates the baryon contribution. In fact, only about 10% of baryons
are in the form of stars, and even including HI and molecular clouds the majority of baryonic
matter is not observed. Where is this matter? The predominant form of baryonic dark
matter is ionized gas in the intergalactic medium. This is basically all of the gas that
hadn't fallen into galaxies prior to reionization. In addition, there is some contribution from
MACHOs (Massive Compact Halo Objects) – basically old, cold white dwarfs, neutron stars,
and stellar black holes that we can't see.
31.5 Non-Baryonic Dark Matter
Non-baryonic matter is more interesting – it dominates the matter distribution (\Omega_{non-baryonic} \sim 0.23)
and points the way towards a better understanding of fundamental physics if we can
figure out what it is. There are a vast number of dark matter candidates with varying degrees of plausibility. These can largely be subdivided based upon a few underlying properties.
Most dark matter candidates, with the exceptions of primordial black holes and cosmological defects (both relatively implausible), are considered to be relic particles that decoupled
at some point in the early universe. These particles can be classified by the following two
criteria:
• Are the particles in thermal equilibrium prior to decoupling?
• Are the particles relativistic when they decouple?
We will discuss each case below.
31.5.1 Thermal and Non-Thermal Relics
Let’s start with the question of thermal equilibrium. Thermal relics are particle species that
are held in thermal equilibrium until they decouple. An example would be neutrinos. If
relics are thermal, then we can use the same type of formalism as in the case of neutrinos to
derive their temperature and density evolution. On the other hand, non-thermal relics are
species that are not in equilibrium when they decouple, and hence their expected properties
are less well constrained. We will start our discussion with thermal relics.
First, let us write down the equation for the time evolution of a particle species. If no
particles are being created or destroyed, we know that for a particle X the number density
evolves as n_X \propto a^{-3},

\frac{dn}{dt} = -3\frac{\dot a}{a}n_X.    (571)
If we then let particles be created at a rate \psi and destroyed by collisional annihilation,

\frac{dn}{dt} = -3\frac{\dot a}{a}n_X + \psi - \langle\sigma_A v\rangle n_X^2.    (572)
If the creation and annihilation processes have an equilibrium level such that \psi = \langle\sigma_A v\rangle n_{X,eq}^2,
then the above becomes

\frac{dn}{dt} = -3\frac{\dot a}{a}n_X + \langle\sigma_A v\rangle\left(n_{X,eq}^2 - n_X^2\right),    (573)
or converting this to a comoving density via n_c = n(a/a_0)^3 (with a few intermediate steps),

\frac{a}{n_{c,eq}}\frac{dn_c}{da} = -\frac{\langle\sigma_A v\rangle n_{eq}}{\dot a/a}\left[\left(\frac{n_c}{n_{c,eq}}\right)^2 - 1\right].    (574)

Note that

\frac{\langle\sigma_A v\rangle n_{eq}}{\dot a/a} = \frac{\tau_H}{\tau_{coll}},    (575)

so we are left with a differential equation describing the particle evolution with scale factor
as a function of the relevant timescales. In the limiting cases,

n_c \simeq n_{c,eq} \quad {\rm if}\ \tau_{coll} << \tau_H    (576)

n_c \simeq n_{c,decoupling} \quad {\rm if}\ \tau_{coll} >> \tau_H.    (577)
Not surprisingly, we arrive back at a familiar conclusion. The species has an equilibrium
density before it decouples, and then “freezes out” at the density corresponding to equilibrium at decoupling. How the temperature and density evolve before decoupling depends
upon whether the species is relativistic (“hot”) or non-relativistic (“cold”) when it decouples.
31.5.2 Hot Thermal Relics
For the discussion of hot thermal relics we return to the discussion of internal degrees of
freedom from sections 22-24, correcting a bit of sloppiness that I introduced in that discussion.
We have previously shown that for a multi-species fluid the total energy density will be

\rho_c = \left[\sum_{bosons} g_i + \frac{7}{8}\sum_{fermions} g_i\right]\frac{\sigma_r T^4}{2} = g^*\,\frac{\sigma_r T^4}{2}.    (578)
The first bit of sloppiness is that previously I assumed that all components were in thermal
equilibrium, which meant that in the energy density expression I took temperature out of
the g ∗ expression and defined it as
g^* = \sum_{bosons} g_i + \frac{7}{8}\sum_{fermions} g_i.    (579)
To be fully correct, the expression should be

g^* = \sum_{bosons} g_i\left(\frac{T_i}{T}\right)^4 + \frac{7}{8}\sum_{fermions} g_i\left(\frac{T_i}{T}\right)^4.    (580)
We also learned that the entropy for the relativistic components is

s_r = \frac{2}{3}g_S^*\,\sigma_r T^3.    (581)
The second bit of sloppiness is that in the previous discussion I treated g^* and g_S^* interchangeably,
which is valid for most of the history of the universe, but not at late times (like
the present). The definition of g_S^* is

g_S^* = \sum_{bosons} g_i\left(\frac{T_i}{T}\right)^3 + \frac{7}{8}\sum_{fermions} g_i\left(\frac{T_i}{T}\right)^3.    (582)
Now, for a species that is relativistic when it decouples (3kT >> mc^2), entropy conservation
requires that

g^*_{S,X}\,T_{0X}^3 = g^*_{S0}\,T_{0\gamma}^3,    (583)

where

g^*_{S0} = 2 + \frac{7}{8}\times 2\times N_\nu\times\left(\frac{T_{0\nu}}{T_{0\gamma}}\right)^3 \simeq 3.9    (584)

for N_\nu = 3.
Anyway, you can also calculate the number density in the same way as before,

n_X = \alpha\,\frac{g_X\,\zeta(3)}{\pi^2}\left(\frac{k_B T_X}{\hbar c}\right)^3    (585)

\frac{n_{0X}}{n_{0\gamma}} = \alpha\,\frac{g_X}{2}\left(\frac{T_{0X}}{T_{0r}}\right)^3    (586)

n_{0X} = n_{0\gamma}\,\alpha\,\frac{g_X}{2}\,\frac{g^*_{S,0}}{g^*_{S,X}},    (587)
where α = 3/4 or α = 1 depending on whether the particle is a fermion or boson. The
density parameter in this case is
\Omega_X = \frac{m_X n_{0X}}{\rho_{0c}} \simeq 2\alpha g_X\left(\frac{g^*_{S,0}}{g^*_{S,X}}\right)\frac{m_X}{10^2\ {\rm eV}}\,h^{-2}.    (588)
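For a neutrino-like hot relic, eq. (588) reproduces the familiar closure bound (a sketch; g*_{S,X} = 10.75 at neutrino decoupling is the standard value, and the masses sampled are arbitrary assumptions):

def omega_hot_h2(m_eV, alpha=0.75, g_X=2.0, gS_dec=10.75, gS_0=3.9):
    # eq (588): Omega_X h^2 ~ 2 alpha g_X (g*_S0 / g*_SX) m_X / 100 eV
    return 2.0 * alpha * g_X * (gS_0 / gS_dec) * m_eV / 100.0

for m in (0.05, 9.4, 92.0):   # masses in eV
    print(m, "eV -> Omega h^2 =", round(omega_hot_h2(m), 4))

# a ~90 eV neutrino alone would close the universe, the classic Cowsik-McClelland bound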
31.5.3 Cold Thermal Relics
The situation is not as straightforward for non-relativistic ("cold") thermal relics. In this
case, at decoupling the number density is described by the Boltzmann distribution,

n_{decoupling,X} = \frac{g_X}{\hbar^3}\left(\frac{m_X k_B T}{2\pi}\right)^{3/2}\exp\left(-\frac{m_X c^2}{k_B T}\right)    (589)
and hence the present day density is lower by a factor of a^3,

n_{0X} = n_{decoupling,X}\,\frac{g^*_{S,0}}{g^*_{S,X}}\left(\frac{T_{0r}}{T_{decoupling}}\right)^3.    (590)
The catch is figuring out what the decoupling temperature is. As usual, you set \tau_H = \tau_{coll}.
We previously saw that

\tau_H \simeq \left(\frac{3}{32\pi G\rho}\right)^{1/2} \simeq \frac{0.3\,\hbar\,T_P}{\sqrt{g_S^*}\,k_B T^2},    (591)

(as in equation 7.1.9 in your book), and that

\tau_{coll} = (n\sigma v)^{-1}.    (592)
The definition of the \sigma v part is a bit more complex, since the cross-section can be velocity
dependent. If we parameterize \sigma v as

\langle\sigma v\rangle = \sigma_0\left(\frac{k_B T}{m_X c^2}\right)^q,    (593)

with q normally having a value of 0 or 1 (i.e. \sigma \propto v^{-1} or \sigma \propto v), then working through the
algebra one would find that

\rho_{0X} \simeq 10\,g_X^*\,\frac{(k_B T_{0r})^3}{\hbar c^4 \sigma_0 m_P}\left(\frac{m_X c^2}{k_B T_{decoupling}}\right)^{q+1}.    (594)
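The decoupling temperature itself comes from setting τ_H = τ_coll, i.e. n⟨σv⟩ = H. Here is a sketch of that root-find for a cold relic (Python with scipy, in natural units with energies in GeV; the relic mass, ⟨σv⟩, and g* values are assumptions, with σv taken velocity independent, q = 0):

import numpy as np
from scipy.optimize import brentq

m_P  = 1.22e19                     # Planck mass [GeV]
m_X  = 100.0                       # relic mass [GeV] (assumed)
g_X, gstar = 2.0, 90.0             # internal and total relativistic d.o.f. (assumed)
sigv = 3.0e-26 / 1.17e-17          # <sigma v> = 3e-26 cm^3/s converted to GeV^-2

def log_rate_ratio(x):             # x = m_X c^2 / k_B T
    T = m_X / x
    n_eq = g_X * (m_X * T / (2.0 * np.pi))**1.5 * np.exp(-x)   # Boltzmann, eq (589)
    H = 1.66 * np.sqrt(gstar) * T**2 / m_P                     # radiation-era Hubble rate
    return np.log(n_eq * sigv / H)                             # zero when tau_H = tau_coll

x_f = brentq(log_rate_ratio, 1.0, 100.0)
print("freeze-out at m/T =", round(x_f, 1))   # ~25, the standard cold-relic ballpark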
31.5.4 Significance of Hot versus Cold Relics
Physically, there is a much more significant difference between hot and cold relics that how
to calculate the density and temperature. The details of the calculations we will have to
leave for another course (they depend upon Jeans mass calculations, which are covered in
chapter 10). The basic concept though is that after relativistic particles decouple from the
radiation field, they are able to “free-stream” away from the locations of the initial density
perturbations that exist prior to recombination. In essence, the velocity of the particles is
greater than the escape velocity for the density fluctuations that eventually lead to galaxy and
galaxy cluster formation. The net effect is damp the amplitude of these density fluctuations,
which leads to significantly less substructure than is observed on small scales.
In contrast, cold relics (cold dark matter) only damp out structure on scales much smaller
than galaxies, so the fluctuations grow uninterrupted. The difference in the two scenarios
is rather dramatic, and we can easily exclude hot dark matter as a dominant constituent.
Finally, our observations of local structures also tell us that the dark matter must currently be non-relativistic, or else it would not remain bound to galaxies.
31.5.5 Non-Thermal Relics
We have shown how one would calculate the density of particles for relics that were in
equilibrium when they decoupled. There does however exist the possibility that the dark
matter consists of particles that were not in thermal equilibrium. If this is the case, then we are left in a bit of a predicament, as there is no a priori way to calculate the density analogous to the previous sections. As we shall see, one of the leading candidates
for dark matter is a non-thermal relic.
31.6 Dark Matter Candidates
At this point we have argued that the dark matter must be non-baryonic and “cold”, but not
necessarily thermal. While the preferred ideas are that dark matter is a particle relic, there
are non-particle candidates as well. Right now we will briefly review a few of the leading
particle candidates, which are motivated by both cosmology and particle physics.
31.6.1 Thermal Relics: WIMPS
Weakly interacting massive particles (WIMPS) are a favorite class of dark matter candidates. These particles are cold thermal relics. We worked out above a detailed expression for ρ_{0X},
but to first order we can make the approximation that,
Ω_WIMP ≃ 10^{-26} cm^3 s^{-1} / ⟨σv⟩ .    (595)
For Ω_DM ∼ 1 (0.3 being close enough), the annihilation cross section ⟨σv⟩ turns out to
be about what would be predicted for particles with electroweak scale interactions – hence
“weakly interacting” in the name.
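A rough numerical version of this statement — a sketch only, where the electroweak-scale coupling and mass are assumptions for illustration:

    # The "WIMP miracle": an electroweak-scale annihilation cross-section
    # lands within an order of magnitude of the observed Omega_DM ~ 0.3.
    alpha_wk = 0.01      # rough electroweak coupling strength (assumed)
    m_X_GeV  = 100.0     # WIMP mass (assumed)

    # sigma*v ~ alpha^2 / m^2 in natural units; the factor 1.17e-17
    # converts GeV^-2 to cm^3/s (it is (hbar c)^2 * c).
    sigma_v = alpha_wk**2 / m_X_GeV**2 * 1.17e-17   # cm^3 s^-1
    omega_wimp = 1.0e-26 / sigma_v                  # equation (595)
    print(f"<sigma v> ~ {sigma_v:.1e} cm^3/s  ->  Omega ~ {omega_wimp:.2f}")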
From a theoretical perspective, this scale of the annihilation cross-section is potentially a
very exciting clue to both the nature of dark matter and new fundamental physics – specifically the idea of super-symmetry. Stepping back for a moment, the notion of antiparticles
(perhaps rather mundane these days) comes from Dirac (1930), who predicted the existence
of positrons based upon theoretical calculations that indicated electrons should have a symmetric antiparticle. It is now a fundamental element of particle physics that all particles
have associated, oppositely charged antiparticles.
What is relatively new is the idea of “super-symmetry”. Super-symmetry (SUSY) is a
generalization of quantum field theory in which bosons can transform into fermions and vice
versa. In a nutshell, the idea of super-symmetry is that every particle (and antiparticle)
has a super-symmetric partner with opposite spin statistics (spin different by 1/2). In other
words, every boson has a super-symmetric partner that is a fermion, and every fermion has
a super-symmetric partner that is a boson. The partners for quarks and leptons are called
squarks and sleptons; partners for photons are photinos; and those for neutral particles (Higgs, etc.)
are called neutralinos.
Now why would one want to double the number of particles? First, SUSY provides a
framework for potential unification of particle physics and gravity. Of the numerous attempts
to make general relativity consistent with quantum field theory (unifying gravity with the
strong and electroweak forces), all of the most successful attempts have required a new
symmetry. In fact, it has been shown (the Coleman–Mandula theorem) that there is no way
to unify gravity with the standard gauge theories that describe the strong and electroweak
interactions without incorporating some supersymmetry.
There are also several other problems that SUSY addresses – the mass hierarchy problem,
coupling constant unification, and the anomalous muon magnetic moment. We won’t go into
these here, other than to point out that they exist, and briefly explain the coupling constant
problem. Essentially, the strength of the strong, weak, and electromagnetic forces is set by
the coupling constants (like αwk , which the book calls gwk ). These coupling “constants”
(similar to the Hubble constant) are actually not constant, but dependent upon the energy
of the interactions. It was realized several decades ago that the coupling constants for the
three forces should approach the same value at 10^15 GeV, allowing “grand unification” of
the three forces. In recent years though, improved observations of the coupling constants
have demonstrated that in the Standard Model the three coupling constants in fact never
approach the same value. Supersymmetry provides a solution to this problem – if supersymmetric particles exist and have appropriate masses, they can modify the above picture
and force the coupling constants to unify.
The way in which this ties back into dark matter is that the neutral charge supersymmetric particles (broadly grouped under the name neutralinos) become candidate dark
matter particles. Due to a broken symmetry in supersymmetry, the super-symmetric partner
particles do not have the same masses as normal particles, and so can potentially be the dark
matter.
There are many flavors of supersymmetry, but one popular (and relatively simple) version
called the Minimal Super-symmetric Standard Model (MSSM) illustrates the basic idea. In
MSSM, one takes the standard model, and adds the corresponding super-symmetric partners
(plus an extra Higgs doublet). The lightest super-symmetric particle (LSP) is stable (i.e.
doesn’t decay – an obvious key property for dark matter), and typically presumed to be the
main constituent of dark matter in this picture. The combined requirements that the SUSY
model both unify the forces and reproduce the dark matter abundance gives interesting
constraints on the regime of parameter space in which one wants to search. A somewhat
old, but illustrative, example is de Boer et al. (1996). These authors find that there are
two regions of parameter space where the constraints can be simultaneously satisfied. In
the first, the mass of the Higgs particle is relatively light (m_H < 110 GeV) and the LSP abundance is Ω_LSP h^2 = 0.42 with m_LSP = 80 GeV. In the other region, m_H = 110 GeV and the abundance is Ω_LSP h^2 = 0.19. These values clearly bracket the current best observational
data. Incidentally, note that all of these particles are very non-relativistic at the GUT scale
(10^15 GeV), and so quite cold.
31.6.2 Axions
Axions are the favorite among the non-thermal relic candidates, and like WIMPS are popular
for reasons pertaining to particle physics as much as cosmology. The axion was originally
proposed as part of a solution to explain the lack of CP (charge-parity) violation in strong
nuclear interactions – e.g. quarks and gluons, which are the fundamental constituents of protons and neutrons (see http://www.phys.washington.edu/groups/admx/the axion.html and
http://www.llnl.gov/str/JanFeb04/Rosenberg.html for a bit of background). CP is violated
for electroweak interactions, and in the standard model it is difficult to explain why the
strong interaction should be finely tuned not to violate CP in a similar fashion. Somewhat
analogous to the case of supersymmetry, a new symmetry (Peccei-Quinn symmetry) has
been proposed to explain this lack of CP violation. A nice aspect of this solution is that it
explains why neutrons don’t have an electrical dipole moment (although we won’t discuss
this).
An important prediction of this solution is the existence of a particle called the axion.
Axions have no electric charge or spin and interact only weakly with normal matter – exactly
the properties one requires for a dark matter candidate. There are two very interesting
differences between axions and WIMPS though. First, axions are very light. Astrophysical
and cosmological constraints require that 10^{-6} < m_axion < 10^{-3} eV – comparable to the plausible mass range for neutrinos. Specifically, the requirement m < 10^{-3} eV is based upon
SN 1987a – if the axion mass were larger than this value, then the supernova core should
have cooled by both axion and neutrino emission (remember, they’re weakly interacting, but
can interact) and the neutrino burst should have been much shorter than observed.
The lower bound, somewhat contrary to intuition, comes from the requirement that the
total axion density not exceed the observed dark matter density. Axions lighter than 10^{-6} eV would have been overproduced in the Big Bang, yielding Ω >> Ω_M.
At a glance, one might think that the low mass of the axion would be a strong argument
against axions being dark matter. After all, shouldn’t axions be relativistic if they are so
light? The answer would be yes – if they were thermal relics. Axions are never coupled to
the radiation field though, and the mechanism that produces them gives them very little
initial momentum, so axions are in fact expected to be quite cold relics.
31.7 Other Candidates
The above two sections describe what are believed to be the most probable dark matter
candidates. It should be pointed out though that (1) WIMPS are a broad class and there
are many options within this category, and (2) there are numerous other suggestions for
dark matter. These other suggestions include such exotic things as primordial black holes
formed at very early times/high density, and cosmic strings. While I would suspect that
these are rather unlikely, they cannot be ruled out. Similarly, it remains possible that all
of the above explanations are wrong. Fortunately, there are a number of experiments now
underway that should either detect or eliminate some of these candidates. To go out on
a limb, my personal guess is that things will turn out to be somewhat more complicated
than expected. Specifically, it seems plausible that both axions and WIMPS exist and each
contribute at some level to the total matter density.
31.8 Detection Experiments
So how might one go about detecting dark matter? Given the wide range of masses and
interaction cross-sections for the various proposed candidates, the first step is basically this
– pick what you believe is the most plausible candidate and hope that you are correct. If
you are, and can be the first to find it, then a Nobel prize awaits. Conversely, if you pick
the wrong candidate you could very well spend much of your professional career chasing a
ghost. Assuming that you are going to search though, let’s take a look at how people are
attempting to detect the different particles.
Something to keep in mind throughout this discussion is that there are essentially two classes of dark matter searches – terrestrial direct detection experiments and astrophysical indirect detection observations.
32 An Aside on Scalar Fields
In the discussion of inflation we talked about inflation being driven by the vacuum energy in
a scalar field. Since there is some confusion on the concept of scalar fields, let us revisit this
matter briefly. Mathematically, a scalar field is simply a field that at each point in space
can be represented by a scalar value. Everyday examples include things like temperature or
density.
Turning more directly to physics, consider gravitational and electric fields. In Newtonian gravity, the gravitational potential is a scalar field Φ defined by Poisson's equation,

∇^2 Φ = 4πGρ, where F = −∇Φ, and V(Φ) = ∫ ρ(x) Φ(x) d^3x.

Similarly, for an electric potential φ sourced by a charge density ρ_q (in Gaussian units),

∇^2 φ = −4πρ_q, where E = −∇φ, and V(φ) = ∫ ρ_q φ d^3x.
In particle physics (quantum field theory), scalar fields are associated with particles. For
instance the Higgs field is associated with the predicted Higgs particle. The Higgs field is
expected to have a non-zero value everywhere and be responsible for giving all particles mass.
In the context of inflation we are simply introducing a new field that follows the same
mathematical formalism. The term vacuum energy density simply means that a region of
vacuum that is devoid of matter and radiation (i.e. no gravitational or electromagnetic
fields) has a non-zero energy density due to energy contained in a field such as the inflaton
field (named because in the particle physics context it should be associated with an inflaton
particle). During inflation this energy is liberated from the inflaton field. Note that dark
energy does not have to be vacuum energy though.
33 Dark Energy

33.1 Generic Properties
As we have seen during the semester, the observable Friedmann equation is H = H0 E(z),
where

E(z) = [Σ_i Ω_{0i} (1+z)^{3(1+w_i)} + (1 − Σ_i Ω_{0i}) (1+z)^2]^{1/2} ,    (596)

and the energy density of any component goes as

ρ = ρ_0 (1+z)^{3(1+w)} ,    (597)
where w is the equation of state. Recall that w = 0 for dust-like matter, w = 1/3 for
radiation, and w = −1 for a cosmological constant. While we have previously discussed
the possibility of dark energy corresponding to a cosmological constant, this is not the only
possibility. Indeed, the most generic definition is that any substance or field that has an equation of state with w < −1/3, which corresponds to a pressure negative enough to accelerate the expansion, is dark energy. Perhaps the single most popular question in cosmology at present is the nature of
dark energy, and the best means of probing this question is by attempting to measure w(z).
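Equation (596) translates directly into code. Here is a short Python sketch; the component values are illustrative:

    import math

    def E(z, components):
        """E(z) from equation (596); components is a list of (Omega_0i, w_i)."""
        total = sum(om * (1.0 + z)**(3.0 * (1.0 + w)) for om, w in components)
        curvature = (1.0 - sum(om for om, _ in components)) * (1.0 + z)**2
        return math.sqrt(total + curvature)

    # A flat, concordance-like model: matter (w = 0) plus a cosmological
    # constant (w = -1); the curvature term vanishes since the Omegas sum to 1.
    components = [(0.3, 0.0), (0.7, -1.0)]
    for z in (0.0, 0.5, 1.0):
        print(f"E({z}) = {E(z, components):.3f}")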
33.2 Fine Tuning Problem
Let us start by framing this question. As you may have noticed, a recurring theme in
cosmology is the presence of what are called “fine-tuning” problems. These tend to be the
most severe problems and the ones that point the way to new physics (like inflation). In
the current context, there is a very significant fine-tuning problem associated with either a
cosmological constant or a dark energy component with a constant equation of state. For
the specific case of the cosmological constant, the current concordance model values imply
that the universe only started accelerating at z ≃ 0.7 and that the cosmological constant
only began to dominate at z ≃ 0.4. The question is why we should be so close in time to the
era when the dark energy begins to dominate – a point where we can see evidence for the
acceleration, but haven’t yet had structures significantly accelerated away from one another.
Put another way, to get the current ratio of ρ_Λ/ρ_m ≈ 2, we require that at the Planck time ρ_Λ/ρ_r ≈ 10^{-120}. This issue is intricately related to the phrasing of the cosmological constant
problem that we discussed earlier this semester, albeit in a somewhat more general form.
What, then, are possibilities for dark energy, and can these possibilities also alleviate this
fine-tuning problem?
33.3 Cosmological Constant
The cosmological constant remains the leading candidate for dark energy, as recent observations strongly argue that at z = 0 we have w = −1 to within 10%. If it is truly a cosmological
constant, then avoiding the fine tuning problem will require either new physics or a novel
solution (like inflation as a solution to other fine tuning problems).
33.4 Quintessence and Time Variation of the Equation of State
Quintessence is the general name given to models with w ≥ −1. Quintessence models
were introduced as alternatives to the cosmological constant for two reasons: (1) because you can – if we don't know why there should be a cosmological constant, why not propose something else; and (2) because if the equation of state is made to be time-dependent, one can potentially avoid the fine-tuning problem described above.
There are many types of quintessence, but one feature that most have in common is that,
like a cosmological constant, they are interpreted as being associated with the energy density
of scalar fields. These are generally taken to have

ρ = (1/2) φ̇^2 + V(φ)    (598)
p = (1/2) φ̇^2 − V(φ)    (599)
Note that in order to generate an accelerated expansion, the above relations require that

φ̇^2 < V(φ) if w < −1/3 ,    (600)
φ̇^2 < (2/3) V(φ) if w < −1/2 ,    (601)
which is equivalent to saying that the potential term dominates over the kinetic term – not
necessarily quite slow roll, but not too far off. As you go to w = −1, you very much move
into the slow roll regime.
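A short sketch making equations (598)–(601) concrete; the numbers below simply scan the ratio φ̇^2/V, with the potential set to 1 in arbitrary units:

    def w_quintessence(phi_dot, V):
        """Equation of state w = p/rho from equations (598)-(599)."""
        rho = 0.5 * phi_dot**2 + V
        p   = 0.5 * phi_dot**2 - V
        return p / rho

    # The kinetic term controls how close w sits to -1; the boundary values
    # reproduce the acceleration conditions (600) and (601).
    for ratio in (0.0, 2.0 / 3.0, 1.0, 2.0):    # phi_dot^2 / V
        w = w_quintessence(ratio**0.5, 1.0)
        print(f"phi_dot^2/V = {ratio:.2f}  ->  w = {w:+.2f}")

The printed values show w = −1 in the slow roll limit, w = −1/2 at φ̇^2 = (2/3)V, and w = −1/3 at φ̇^2 = V, as claimed above.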
One particularly entertaining class of quintessence models corresponds to what are called
“tracker” models, in which the energy density of the scalar field remains close to the matter/radiation density through most of the history of the universe. It turns out that if the
potential is sufficiently steep that
V′′ V/(V′)^2 > 1 ,    (602)
the scalar field rolling down the potential approaches a common evolutionary path such that
the dark energy tracks the radiation energy density as desired.
33.5 Time Variation of the Equation of State
Now, as this will be part of the discussion as we proceed, there is one important distinction
to note if we have a time variable equation of state. The standard equation

H^2 = H_0^2 [Ω_w (1+z)^{3(1+w)}]    (603)

only holds for a constant value of w. If w is also a function of redshift, then it must be integrated appropriately, and what you end up with is

H^2 = H_0^2 [Ω_w exp(3 ∫_0^{ln(1+z)} (1 + w(x)) d ln(1+x))] .    (604)
The origin of this expression can be seen by returning to the derivation of ρ(z) in §10, where we derived that for a constant w

ρ = ρ_0 (1+z)^{3(1+w)} ,    (605)

given an adiabatic expansion. If we start with the intermediate equation from that derivation,

dρ/ρ = −(1 + w) da^3/a^3 ,    (606)
we see that for a variable w

dρ/ρ = −(1 + w(a)) d ln a^3    (607)
dρ/ρ = (1 + w(z)) d ln(1+z)^3    (608)
dρ/ρ = 3 (1 + w(z)) d ln(1+z)    (609)
ln(ρ/ρ_0) = 3 ∫_0^{ln(1+z)} (1 + w(x)) d ln(1+x)    (610)
ρ = ρ_0 exp[3 ∫_0^{ln(1+z)} (1 + w(x)) d ln(1+x)] .    (611)
In principle, you can insert any functional form of w(z) that you prefer. At the moment though,
the data isn’t good enough to constrain a general function, so people typically use a first
order parameterization along the lines of
w = w_0 + w_1 z/(1+z) .    (612)
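The integral in equation (604) is easy to evaluate numerically; for the parameterization (612) it also has a closed form, ρ/ρ_0 = (1+z)^{3(1+w_0+w_1)} exp(−3w_1 z/(1+z)), which provides a check. A sketch (the values of w_0 and w_1 are illustrative):

    import math

    def rho_ratio(z, w, n_steps=2000):
        """rho(z)/rho_0 from equation (604), trapezoidal rule in ln(1+z)."""
        du = math.log(1.0 + z) / n_steps
        integral = 0.0
        for i in range(n_steps + 1):
            zz = math.exp(i * du) - 1.0
            weight = 0.5 if i in (0, n_steps) else 1.0
            integral += weight * (1.0 + w(zz)) * du
        return math.exp(3.0 * integral)

    w0, w1 = -0.9, 0.2                        # illustrative values
    w = lambda z: w0 + w1 * z / (1.0 + z)     # equation (612)

    z = 2.0
    numeric  = rho_ratio(z, w)
    analytic = (1.0 + z)**(3.0 * (1.0 + w0 + w1)) * math.exp(-3.0 * w1 * z / (1.0 + z))
    print(f"numeric = {numeric:.4f}, analytic = {analytic:.4f}")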
33.6 Phantom Energy
The recent observations have given rise to serious consideration of one of the more bizarre
possibilities – w < −1. Knop et al. (2003) actually showed that if you removed the priors on
Ωm for the data existing at the time, then the dark energy equation of state yielded a 99%
probability of having w < −1. Current data have improved, with uncertainties somewhat
more symmetric about w = −1, but this possibility persists.
Models with w < −1 violate what is known as the weak energy condition, which simply means that for these models ρc^2 + p < 0 – in effect, the universe has a negative inertial energy density. If the weak energy condition is violated and the equation of state is constant, this leads to some rather untenable conclusions, such as:
(1) The scale factor becomes infinite in a finite time after the phantom energy begins to dominate. Specifically, if w < −1,

a ≃ a_eq [(1 + w)(t/t_eq) − w]^{2/3(1+w)} ,    (613)

where the subscript eq denotes the time when the matter and phantom energy densities are equal. Note that the exponent in this equation is negative, which means that the solution is singular (a → ∞) at a finite point in the future when

t = t_eq w/(1 + w) .    (614)
For example, for w = −1.1, this says that the scale factor diverges when t = 11 t_eq (so we're nearly a tenth of the way there!). If we look back at the standard equation for the Hubble parameter, we see that it also diverges (which is consistent), as does the phantom density, which increases as

ρ ∝ [(1 + w)(t/t_eq) − w]^{-2} .    (615)
The above divergences have been termed the “Big Rip”.
(2) The sound speed in the medium, v = (|dp/dρ|)^{1/2}, can exceed the speed of light.

It is important to keep in mind that the above issues only transpire if the value of w is constant. You can get away with temporarily having w < −1.
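A quick numerical illustration of the divergence time (614) — a sketch; note how rapidly the Big Rip recedes as w approaches −1 from below:

    # Big Rip: the scale factor (equation 613) diverges when the bracket
    # (1+w)(t/t_eq) - w reaches zero, i.e. at t = t_eq * w/(1+w) (equation 614).
    for w in (-1.05, -1.1, -1.5):
        t_rip = w / (1.0 + w)     # in units of t_eq
        print(f"w = {w:+.2f}  ->  t_rip = {t_rip:.0f} t_eq")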
33.7 Chaplygin Gas
The Chaplygin gas is yet another way to get a dark energy equation of state. Assume that
there is some fluid which exerts a negative pressure of the form
p = −A/ρ .    (616)

For an adiabatic expansion, where dE = −p dV, or d(ρa^3) = −p da^3, this yields

ρ = (A + B a^{-6})^{1/2} = (A + B (1+z)^6)^{1/2}    (617)

(see the derivation below).
If you look at the limits of this equation, you see that as z → ∞,
ρ → B^{1/2} (1+z)^3 ,    (618)

which is the standard density equation for pressureless dust models, while at late times,

ρ → A^{1/2} = constant,    (619)
similar to a cosmological constant.
The nice aspect of this solution is that you have a simple transition between matter and
dark energy dominated regimes. In practice, there are certain problems with the Chaplygin
gas models though (such as structure formation). The more recent revision to this proposal
has been for what is called a “generalized Chaplygin gas”, where
p ∝ −ρ^{-α} ,    (620)

which gives an equation of state

w(z) = −|w_0| / [|w_0| + (1 − |w_0|)(1+z)^{3(1+α)}] ,    (621)
where w0 is the current value of the equation of state. Note that we are now seeing another
example of a time-dependent equation of state.
Derivation of the equation for the density – starting from the adiabatic expression,

ρ da^3 + a^3 dρ = (A/ρ) da^3    (622)
a^3 ρ dρ = −(ρ^2 − A) da^3    (623)
ρ dρ/(ρ^2 − A) = −da^3/a^3    (624)
(1/2) ln(ρ^2 − A) = (1/2) ln B + ln a^{-3}    (625)
ρ^2 − A = B a^{-6}    (626)
ρ = (A + B a^{-6})^{1/2} ,    (627)

where B is an integration constant.
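A short numerical check of the limiting behavior of equation (617) — a sketch, where A = B = 1 is an arbitrary illustrative choice:

    # Chaplygin gas density (equation 617) and its two limits.
    A, B = 1.0, 1.0    # arbitrary illustrative constants

    def rho(z):
        return (A + B * (1.0 + z)**6)**0.5

    # At high z the density tracks pressureless dust (618); as z -> -1
    # (the far future, a -> infinity) it approaches the constant A^(1/2) (619).
    for z in (1000.0, 10.0, 0.0, -0.9):
        dust = B**0.5 * (1.0 + z)**3
        print(f"z = {z:7.1f}: rho = {rho(z):.4g} (dust limit {dust:.4g})")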
33.8 Cardassian Model
Yet another approach to the entire problem is to modify the Friedmann equation, as with
the Randall-Sundrum model at early times, replacing
H^2 = (8πG/3) ρ    (628)
with a general form of H^2 = g(ρ), where g is some arbitrary function of only the matter and
radiation density. The key aspect of Cardassian models is that they don’t include a vacuum
component or curvature – the “dark energy” is entirely contained in this modification of the
Friedmann equation.
A simple version of these models is
H^2 = (8πG/3) ρ + B ρ^n , where n < 2/3 .    (629)
In Cardassian models the additional term is negligible at early times and only begins to
dominate recently. Once this term dominates, then a ∝ t2/(3n) . The key point though is
that for these models the universe can be flat, matter-dominated, and accelerating with a
sub-critical matter density.
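Since the matter density scales as (1+z)^3, the extra term in equation (629) scales as (1+z)^{3n}, and the expansion history follows immediately. A sketch (Ω_m = 0.3 and n = 0.2 are illustrative choices; the coefficient B is fixed by requiring E(0) = 1):

    import math

    def E_cardassian(z, omega_m=0.3, n=0.2):
        """H/H0 for the simple Cardassian model (equation 629): the B*rho^n
        term scales as (1+z)^(3n), with B tuned so that E(0) = 1."""
        return math.sqrt(omega_m * (1.0 + z)**3
                         + (1.0 - omega_m) * (1.0 + z)**(3.0 * n))

    for z in (0.0, 0.5, 1.0, 3.0):
        print(f"E({z}) = {E_cardassian(z):.3f}")

Note that the additional term acts like a dark energy component with an effective equation of state w_eff = n − 1.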
Moving beyond the above simple example, the “generalized Cardassian model” has

H^2 = (8πG/3) ρ [1 + (ρ/ρ_card)^{q(n−1)}]^{1/q} ,    (630)

where n < 2/3, q > 0, and ρ_card is a critical density such that the modifications only matter when ρ < ρ_card.
Note that in many ways this is eerily reminiscent of MOND. This scenario cannot be
ruled out though given the current observations, and there is a somewhat better motivation
than in the case of MOND. In particular, modified Friedmann equations arise generically in
theories with extra dimensions (Chung & Freese 1999), such as braneworld scenarios.
33.9 Other Alternatives
In the brief amount of time that we have in the semester I can only scratch the surface
(partially because there are a huge number of theories that are only modestly constrained by
the data). For completeness, I will simply list some of the other proposed models of dark energy, so that you know the names if you wish to learn more. These include k-essence,
scalar-tensor models, Quasi-Steady State Cosmology, and Brane world models.
34 Gravitational Lensing
Reading: Chapter 19, Coles & Lucchin
Like many other sections of this course, the topic of gravitational lensing could cover an
entire semester. Here we will aim for a shallow, broad overview. I also note that this section
of the notes is currently more sparse than the other sections thus far and most of the lecture
was not directly from these notes. For more in depth reading, I refer you to the following
excellent text on the subject: Gravitational Lensing: Strong, Weak & Micro, Saas-Fee
Advanced Course 33, Meylan et al. (2005).
34.1 Einstein GR vs. Newtonian
A pseudo-Newtonian derivation for the deflection of light yields
α̂ = 2GM/(r c^2) .    (631)
In general relativity however there is an extra factor of 2 such that the deflection is
α̂ = 4GM/(r c^2) .    (632)
This can be derived directly from the GR spacetime metric for the weak field limit around a mass M,

ds^2 = (1 − 2GM/rc^2) c^2 dt^2 − (1 + 2GM/rc^2) dl^2 ,    (633)

as you will do in your homework.
(Refer to the book's discussion of the deflection of light by the Sun.)
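For the Sun, equation (632) evaluated at the solar limb gives the classic 1.75 arcsecond deflection — a quick check (sketch, SI values):

    import math

    # Deflection of light grazing the solar limb (equation 632).
    G     = 6.674e-11    # m^3 kg^-1 s^-2
    c     = 2.998e8      # m s^-1
    M_sun = 1.989e30     # kg
    R_sun = 6.957e8      # m

    alpha_rad = 4.0 * G * M_sun / (R_sun * c**2)
    alpha_arcsec = alpha_rad * (180.0 / math.pi) * 3600.0
    print(f"deflection = {alpha_arcsec:.2f} arcsec")   # -> 1.75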
34.2 Gravitational Optics
I refer the reader here to Figure 19.1 in Coles & Lucchin or Figure 12 in the Saas-Fee
text. If one considers a beam of light passing through a gravitational field, the amount by
which the beam is deflected is determined by the gradient of the potential perpendicular to
the direction of the beam. Physically, a gradient parallel to the path clearly can have no
effect, and the stronger the gradient the more the light is deflected. The deflection angle is
defined as
α̂ = (2/c^2) ∫ ∇_⊥Φ dl ,    (634)
where l is the direction of the beam. The above is formally only valid in the limit that the
deflection angle is small (i.e. weak field), which for a point source lens is equivalent to saying
that the impact parameter ξ is much larger than the Schwarzschild radius (r_s ≡ 2GM/c^2).
Definitions: The lens plane is considered to be the plane that lies at the distance of the
lens; the source plane is equivalently the plane that lies at the distance of the source. It is
common to talk about the locations of objects in the source plane and the image in the lens
plane.
A Point Source Lens - For a point source the gravitational potential is
Φ(ξ, x) = −GM/(ξ^2 + x^2)^{1/2} ,    (635)
where x is the distance from the lens parallel to the direction of the light ray. Taking the
derivative and integrating along dx, one finds
α̂ = (2/c^2) ∫ ∇_⊥Φ dx = 4GM/(c^2 ξ) ,    (636)
which is the GR deflection angle that we saw before.
Extended Lenses - Now, let us consider the more general case of a mass distribution rather
than a point source. We will make what is called the thin lens approximation that all the
matter lies in a thin sheet. In this case, the surface mass density is
Σ(ξ) = ∫ ρ(ξ, x) dx ,    (637)

the mass within a radius ξ is

M(ξ) = 2π ∫_0^ξ Σ(ξ′) ξ′ dξ′ ,    (638)
and the deflection angle is
α̂(ξ) = (4G/c^2) ∫ (ξ − ξ′) Σ(ξ′)/|ξ − ξ′|^2 d^2ξ′ = 4GM(ξ)/(c^2 ξ) ,    (639)
Another way to think of this is as the continuum limit of the sum of the deflection angles for
a distribution of N point masses. Note that in the above equation α̂ is now a two-dimensional
vector.
The Lens Equation - Now, look at the figure referenced above. In this figure α, called the reduced deflection angle, is the angle in the observer's frame between the observed source and where the unlensed source would be. The angle β is the angle between the lens and the location of the unlensed source. It is immediately apparent that θ, the angle between the lens and the observed source, is related to these two quantities by the lens equation,

β = θ − α(θ) .    (640)
Now, if one assumes that the distances (D_s, D_ds, D_d) are large, as will always be the case, then one can immediately show via Euclidean geometry that

α = (D_ds/D_s) α̂ .    (641)
[Note that equation 19.2.10 in Coles & Lucchin is incorrect – the minus sign should be a
plus sign.]
Note that if there is more than one solution to the lens equation then a source at β will
produce several images at different locations.
If we take the definition of α̂ and rewrite the expression in angular rather than spatial
coordinates (ξ = Dd θ), then
α(θ) = (D_ds/D_s) α̂ = (1/π) ∫ d^2θ′ κ(θ′) (θ − θ′)/|θ − θ′|^2 ,    (642)

where

κ(θ) = Σ(D_d θ)/Σ_cr    (643)

and

Σ_cr = (c^2/4πG) (D_s/(D_d D_ds)) .    (644)
In the above equations κ is the convergence, and is also sometimes called the dimensionless
surface mass density. It is the ratio of the surface mass density to the critical surface mass
density Σcr . The significance of Σcr is that for Σ ≥ Σcr the lens is capable of producing
multiple images of sources (assuming that the sources are in the correct locations). This is
the definition of strong lensing, so Σcr is the dividing line between strong and weak lensing.
Your book also provides another way of interpreting the critical density, which is that for the
critical density one can obtain β = 0 for any angle θ – i.e. for a source directly behind the lens all light rays are focused at a well-defined focal length (which of course will differ depending on the angle θ).
Axisymmetric Lenses – Now, let’s consider the case of a circularly symmetric lens. In
this case,
α(θ) = (D_ds/D_s) α̂ = (D_ds/(D_d D_s)) (4GM(θ)/(c^2 θ)) = 4GM(θ)/(D c^2 θ) ,    (645)

where

D ≡ D_d D_s/D_ds ,    (646)

and

β = θ − 4GM(θ)/(D c^2 θ) .    (647)

The case β = 0 corresponds to

θ_E = (4GM(θ_E)/(D c^2))^{1/2} ,    (648)
where θE is called the Einstein radius. A source at β=0 is lensed into a ring of radius θE .
Note that this angle is again set by the GR deflection angle that we saw before. One can rewrite the lensing equation in this case for a circularly symmetric lens as

β = θ − θ_E^2/θ ,    (649)

or

θ_± = (1/2) [β ± (β^2 + 4θ_E^2)^{1/2}] .    (650)
These solutions correspond to two images – one on each side of the source. One of these is
always at θ < θE , while the other is at θ > θE . In the case of β = 0, the two solutions are
obviously both at the Einstein radius.
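The two-image solution (650) is easy to explore numerically — a sketch, with source offsets β expressed in units of θ_E:

    import math

    def point_lens_images(beta, theta_E=1.0):
        """Image positions theta_+- from equation (650)."""
        disc = math.sqrt(beta**2 + 4.0 * theta_E**2)
        return 0.5 * (beta + disc), 0.5 * (beta - disc)

    # One image always lies inside the Einstein radius and one outside;
    # for beta = 0 both sit on the Einstein ring itself.
    for beta in (0.0, 0.5, 2.0):
        tp, tm = point_lens_images(beta)
        print(f"beta = {beta:.1f}: theta_+ = {tp:+.3f}, theta_- = {tm:+.3f}")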
General Case – Consider the more general case of a lens that lacks any special symmetry. Let us define what is called the deflection potential,

ψ(θ) = (2/(D c^2)) ∫ Φ(D_d θ, x) dx .    (651)
The gradient of this deflection potential with respect to θ is

∇_θ ψ = D_d ∇_ξ ψ = (D_ds/D_s) α̂ = α ,    (652)

and the Laplacian is

∇_θ^2 ψ = 2κ(θ) = 2Σ/Σ_cr .    (653)
The significance of this is that we can express the potential and deflection angle in terms of
the convergence,
ψ(θ) = (1/π) ∫ κ(θ′) ln|θ − θ′| d^2θ′    (654)

α(θ) = (1/π) ∫ κ(θ′) (θ − θ′)/|θ − θ′|^2 d^2θ′    (655)
(skip the last two equations in class).
We will momentarily see why ψ(θ) is a useful quantity. Let us return to the lensing equation,

β = θ − α(θ) ,    (656)

where all quantities can be considered vectors in the lens plane with components in both the x and y directions (which we will call, for example, θ_1 and θ_2).
Let us define a matrix based upon the derivative of β with respect to θ
A_ij = ∂β_i/∂θ_j    (657)
     = δ_ij − ∂α_i(θ)/∂θ_j    (658)
     = δ_ij − ∂^2ψ/∂θ_i ∂θ_j .    (659)
This is the matrix that maps the source plane to the lens (image) plane.

Now, let us see why ψ is particularly useful. Equation 653 can be rewritten as

κ = (1/2)(ψ_11 + ψ_22) ,    (660)
and we can also use the deflection potential to construct a shear tensor,
γ_1 = (1/2)(ψ_11 − ψ_22)    (661)
γ_2 = ψ_12 .    (662)
Recall that convergence corresponds to a global size change of the image, while shear
corresponds to stretching of the image in a given direction. Using these definitions for shear
and convergence, we can rewrite A as
A(θ) = ( 1 − κ − γ_1      −γ_2
          −γ_2       1 − κ + γ_1 )

or

A(θ) = (1 − κ) ( 1 − g_1    −g_2
                 −g_2     1 + g_1 ) ,
where g ≡ γ/(1 − κ) is called the reduced shear tensor. When you look at a lensed image
on the sky it is this reduced shear tensor that is actually observable. What you really want
to measure though is κ, since this quantity is linearly proportional to mass surface density
(at least in the context of general relativity). I will skip the details, but given the mapping
A in terms of κ and g, one can derive an expression for the convergence of
∇ ln(1 − κ) = 1/(1 − g_1^2 − g_2^2) ( 1 − g_1    −g_2    ) ( g_{1,1} + g_{2,2} )
                                     ( −g_2     1 + g_1 ) ( g_{2,1} − g_{1,2} ) .
From this one can then recover the mass distribution. The one caveat here is what is known
as mass sheet degeneracy, which simply put states that the solution is only determined
to within an arbitrary constant. To see this consider a completely uniform sheet of mass.
What is the deflection angle? Zero. Thus, you can always modify your mass distribution
by an arbitrary constant. For determinations of masses for systems like galaxy clusters the
assumption is that far enough away the mass density goes to zero (at least associated with
the cluster).
34.3 Magnification
It is worth pointing out that the magnification of a source is given by the ratio of the
observed solid angle to the unlensed solid angle. This is described by a magnification tensor
M(θ) = A−1 such that the magnification is
μ = ∂^2θ/∂^2β = det M = 1/det A = 1/[(1 − κ)^2 − |γ|^2] .    (663)
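As a concrete check of equation (663) — a sketch, where the κ and γ values are arbitrary:

    # Magnification from convergence and shear (equation 663).
    def magnification(kappa, gamma1, gamma2):
        det_A = (1.0 - kappa)**2 - (gamma1**2 + gamma2**2)
        return 1.0 / det_A       # diverges on critical curves (det A = 0)

    for kappa, g1, g2 in [(0.0, 0.0, 0.0), (0.3, 0.2, 0.1), (0.6, 0.35, 0.1)]:
        print(f"kappa = {kappa}, gamma = ({g1}, {g2}) -> "
              f"mu = {magnification(kappa, g1, g2):.2f}")

The last case, with det A approaching zero, gives μ ≈ 36 — foreshadowing the high magnifications near critical curves discussed in the next section.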
34.4 Critical Curves and Caustics
Definitions: Critical curves are locations in the lens plane where the Jacobian vanishes (det
A(θ) = 0). These are smooth, closed curves and formally correspond to infinite magnification, though the limits of the geometrical optics approximation break down before this
point. Lensed images that lie near critical curves are highly magnified though, and for high
redshift galaxies behind galaxy clusters these magnifications can reach factors of 25-50.
Definitions: Caustics correspond to the mapping of the critical curves into the source
plane – i.e. they are the locations at which sources must lie in the source plane for an image
to appear on the critical curve.