Taylor’s Theorem for smooth functions
Let p be a point of a real affine space M of finite dimension, whose tangent space at any point is the real vector space X of finite dimension.
Let V be an open subset of M, such that for any x ∈ V, the line segment joining x to p lies in V.
This line segment has the natural parametrization x + t(p − x), where t ∈ [0, 1].
Let f : V → Y be a smooth function taking values in a real vector space Y of finite dimension.
For any positive integer k and for any x ∈ V, put:

$$g_k(x) = \int_0^1 \frac{t^{k-1}}{(k-1)!}\, f^{(k)}(x + t(p - x))\, dt,$$

$$h_k(x) = g_k(x)((x - p)^k),$$

$$(x - p)^k = (x - p)^{\otimes k} = (x - p) \otimes (x - p) \otimes \cdots \otimes (x - p) \quad (k \text{ factors of } (x - p) \text{ in this tensor product}).$$
Note that g_k is well-defined and smooth on V and takes values in (X^*)^{⊗k} ⊗ Y.
Also we have the formulas, using the Fundamental Theorem of Calculus and
the Chain Rule:
$$g_k(p) = \frac{f^{(k)}(p)}{(k-1)!} \int_0^1 t^{k-1}\, dt = \frac{f^{(k)}(p)}{k!},$$

$$h_1(x) = g_1(x)(x - p) = \int_0^1 f^{(1)}(x + t(p - x))(x - p)\, dt = -\int_0^1 \frac{d}{dt}\big(f(x + t(p - x))\big)\, dt = -\big[f(x + t(p - x))\big]_{t=0}^{t=1} = f(x) - f(p).$$
Also, again by the Fundamental Theorem of Calculus and the Chain Rule, we have, for any positive integer k and any x ∈ V:

$$h_k(x) - h_{k+1}(x) = \int_0^1 \left( \frac{t^{k-1}}{(k-1)!}\, f^{(k)}(x + t(p - x))((x - p)^k) - \frac{t^k}{k!}\, f^{(k+1)}(x + t(p - x))((x - p)^{k+1}) \right) dt$$

$$= \int_0^1 \frac{d}{dt}\left( \frac{t^k}{k!}\, f^{(k)}(x + t(p - x))((x - p)^k) \right) dt = \left[ \frac{t^k}{k!}\, f^{(k)}(x + t(p - x))((x - p)^k) \right]_{t=0}^{t=1} = \frac{1}{k!}\, f^{(k)}(p)((x - p)^k).$$
Summing this relation from k = 1 to n, the sum on the left-hand side collapses and we get, for any x ∈ V and any positive integer n, the formula:

$$h_1(x) - h_{n+1}(x) = \sum_{k=1}^{n} \frac{1}{k!}\, f^{(k)}(p)((x - p)^k).$$
Rearranging, and using the formula h_1(x) = f(x) − f(p), proved above, we have proved Taylor's theorem for smooth functions, valid for any x ∈ V and any non-negative integer n:

$$f(x) = g_{n+1}(x)((x - p)^{n+1}) + \sum_{k=0}^{n} \frac{1}{k!}\, f^{(k)}(p)((x - p)^k),$$

$$g_{n+1}(x) = \frac{1}{n!} \int_0^1 t^n f^{(n+1)}(x + t(p - x))\, dt, \qquad g_{n+1}(p) = \frac{f^{(n+1)}(p)}{(n+1)!}.$$

Here g_{n+1} : V → (X^*)^{⊗(n+1)} ⊗ Y is smooth.
Note that the formula, for a given non-negative integer n, is also valid, as written, when f is only C^{n+1} on V, except that the remainder function g_{n+1} is then only continuous, not necessarily smooth.
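As a quick sanity check, here is a minimal one-dimensional instance of this formula in sympy; the test function f, the base point p = 0, the order n = 3 and the evaluation point x0 are all arbitrary illustrative choices, not anything fixed by the notes.

```python
import sympy as sp

x, t = sp.symbols('x t')
f = sp.exp(x) * sp.cos(x)   # an arbitrary smooth test function
p = 0                       # base point
n = 3                       # order of the expansion
x0 = sp.Rational(6, 5)      # evaluation point

# Polynomial part: sum_{k=0}^{n} f^(k)(p) (x0 - p)^k / k!
poly = sum(f.diff(x, k).subs(x, p) * (x0 - p)**k / sp.factorial(k)
           for k in range(n + 1))

# Integral remainder: g_{n+1}(x0) ((x0 - p)^{n+1}), with
# g_{n+1}(x0) = (1/n!) * integral_0^1 t^n f^(n+1)(x0 + t(p - x0)) dt.
f_n1 = f.diff(x, n + 1)
g_n1 = sp.integrate(t**n * f_n1.subs(x, x0 + t*(p - x0)),
                    (t, 0, 1)) / sp.factorial(n)

# The two sides should agree; the difference is ~ 0 up to rounding.
print(sp.N(f.subs(x, x0) - poly - g_n1 * (x0 - p)**(n + 1)))
```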
Finally note that if we write f^{(k)} = D^k f, where D is the derivative operator, we have formally (non-rigorously), using the series for e^u:

$$(e^{(x-p).D} f)(p) - f(x) = \big[(e^{t(x-p).D} f)(x + t(p - x))\big]_{t=0}^{t=1} = \int_0^1 \frac{d}{dt}\big((e^{t(x-p).D} f)(x + t(p - x))\big)\, dt$$

$$= \sum_{k=0}^{\infty} \int_0^1 \frac{d}{dt}\left( \frac{t^k}{k!}\, f^{(k)}(x + t(p - x))((x - p)^k) \right) dt = \sum_{k=1}^{\infty} (h_k(x) - h_{k+1}(x)) + \int_0^1 \frac{d}{dt}\big(f(x + t(p - x))\big)\, dt$$

$$= h_1(x) + f(p) - f(x) = 0.$$
Here we needed to assume that h_n(x) → 0 as n → ∞.
So formally, we have the equation (e^{(x−p).D} f)(p) = f(x).
This formula can be rigorously justified in the analytic case, for x in the range of convergence around the base point p.
Note that in the smooth case, however, it is generally false, since, for example, in one dimension, for the smooth function f(x) = e^{−1/x}, for x > 0, and f(x) = 0, for x ≤ 0, we have (e^{x.D} f)(0) = 0 ≠ f(x), for x > 0, since f and all its derivatives vanish at the origin.
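A quick sympy illustration of this failure, using the function just described (only the x > 0 branch is needed for the one-sided limits):

```python
import sympy as sp

x = sp.symbols('x')
f = sp.exp(-1/x)   # the branch of f for x > 0; f = 0 for x <= 0

# All one-sided derivatives at the origin vanish, so the formal Taylor
# series of f at 0 is identically zero...
print([sp.limit(f.diff(x, k), x, 0, '+') for k in range(6)])  # [0, 0, ..., 0]

# ...yet f itself is positive for x > 0:
print(f.subs(x, 1))   # exp(-1)
```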
Vector fields, first-order homogeneous differential operators and derivations
We work in the smooth category.
Fix a real affine space M of finite dimension, whose tangent space at any
point is the real finite dimensional vector space X.
Also fix U a non-empty open subset of M.
The co-ordinate vector function x of U assigns to any x ∈ U the position
vector x ∈ X, relative to a chosen origin for M, so x can be regarded as a
smooth function on U with values in X.
In particular we have, for its (constant) derivative:

x′ = δ ∈ X^* ⊗ X.

Here δ is the Kronecker delta, which gives the identity endomorphism of X.
A derivation v of C^∞(U) is by definition a map v : C^∞(U) → C^∞(U), such that:
• v kills constants: v(c) = 0, for any constant function c on U.
• v is an abelian group homomorphism: v(f + g) = v(f) + v(g), for any f and g in C^∞(U).
• v obeys a Leibniz rule: for any f and g in C^∞(U), we have:
v(f g) = v(f)g + f v(g).
Intuitively v is an infinitesimal automorphism of C^∞(U).
Denote by T (U) the collection of all derivations of C ∞ (U).
Then it is clear that T (U) is a C ∞ (U)-module, under left multiplication by
elements of C ∞ (U).
• By definition, a smooth vector field on U is a derivation of C ∞ (U).
We want to find a concrete description of T (U).
To this end, we introduce the derivative derivation ∂:
∂ : C∞ (U) → X∗ ⊗ C∞ (U).
Here ∂ is defined by the succinct formula, valid for any f ∈ C∞ (U):
∂(f ) = f 0 .
Note that the properties of the derivative immediately give that ∂ is a derivation, as required. Then if w is a smooth function on U with values in X, w determines an operator on C^∞(U), denoted w.∂ : C^∞(U) → C^∞(U), given by the formula, valid for any f ∈ C^∞(U) and any x ∈ U:

(w.∂(f))(x) = f′(x)(w(x)).
So the pairing (w, ∂) → w.∂ is the usual dual pairing of a dual vector with a
vector. Normally one would write this as ∂(w), but we want to make clear
that the derivative operator is not acting on w, so we keep it to the left of the
derivative operator ∂, which acts to the right. Note that immediately from its
definition, by the standard properties of the derivative, we have w.∂ ∈ T (U).
Also note that since x′ is the Kronecker delta, we have:

w.∂(x) = w(x).
This means, in particular, that given the operator w.∂, the smooth function
w is uniquely determined.
• By definition, a first-order smooth homogeneous differential operator is
an operator on C ∞ (U), of the form w.∂, for some w, a smooth function
on U with values in X.
(A general smooth first-order operator is then of the form f → w.∂(f ) + gf ,
where g is a smooth function on U).
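As a small illustration, here is the operator w.∂ in two co-ordinates in sympy; the field w and the test functions are arbitrary choices, and the final line checks the Leibniz rule that makes w.∂ a derivation.

```python
import sympy as sp

x, y = sp.symbols('x y')
coords = (x, y)

def w_del(w, f):
    """Apply the first-order homogeneous operator w.∂ to f:
    (w.∂(f))(x) = sum_i w^i ∂f/∂x^i."""
    return sum(wi * sp.diff(f, xi) for wi, xi in zip(w, coords))

w = (-y, x)                    # rotation field: w.∂ = -y ∂_x + x ∂_y
f, g = x**2 + y**2, sp.exp(x)*y

print(w_del(w, f))             # -> 0 (f is rotation invariant)
# Leibniz rule check: w.∂(fg) = w.∂(f) g + f w.∂(g)
print(sp.simplify(w_del(w, f*g) - (w_del(w, f)*g + f*w_del(w, g))))  # -> 0
```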
The main theorem we want to prove is:
• An operator on C ∞ (U) is a derivation, if and only if it is a smooth vector
field, if and only if it is a first-order smooth homogeneous differential
operator.
Everything in this theorem is immediate from the definitions and the discussion above, except for the following, which we have to prove:
• Given a derivation v of C ∞ (U), we must show that for any f ∈ C ∞ (U)
and for any x ∈ U, we have the formula:
(v(f ))(x) = f 0 (x)(v(x)).
The first key result is the following:
• Let ∅ ≠ V ⊂ U, with V open.
If f ∈ C ∞ (U) is such that f |V vanishes, then v(f )|V = 0.
Proof:
Let p ∈ V.
We must show that v(f )(p) = 0.
Let b ∈ C ∞ (U) be a bump function, supported in V, such that there exists an
open subset W ⊂ V, with p ∈ W, such that b|W = 1 (the constant function).
We know that such a function b exists, by work done earlier this term.
Then we have bf = 0 on U, since b vanishes outside V and f vanishes inside.
We write the Leibniz rule:
0 = v(0) = v(bf ) = bv(f ) + v(b)f.
Evaluate at the point p.
We get, since b(p) = 1 and f (p) = 0:
0 = b(p)v(f )(p) + v(b)(p)f (p) = v(f )(p) + 0 = v(f )(p).
So the required result holds and we are done.
We can rephrase this result as follows: for f ∈ C∞ (U), denote by Z(f )
the closure of the open subset of U on which f is non-zero. Z(f ) is called
the support of f . Note that f vanishes on the open set U − Z(f ). Then the
result just proved implies that for any f ∈ C∞ (U) and for any vector field
v ∈ T (U), we have Z(v(f )) ⊂ Z(f ): i.e. the action of v on C ∞ (U) is support
non-increasing.
5
Now fix p ∈ U.
Pick V, an open ball centered at p with V ⊂ U.
The ball V exists, since U is open.
Also if x ∈ V, the line segment from p to x lies entirely within V.
So we can apply the second-order Taylor Theorem to the set V, based at
p, a special case of the full Taylor Theorem proved above:
• We have the expansion, valid for any x ∈ V:

f(x) = f(p) + f′(p)(x − p) + g(x)((x − p)²).

Here g is a smooth function on V with values in X^* ⊗ X^* and (x − p)² = (x − p) ⊗ (x − p) ∈ X ⊗ X, with the property that g(p) = 2⁻¹f″(p).
Let b ∈ C ∞ (U) be a bump function, supported in V, such that there exists an
open subset W ⊂ V, with p ∈ W, such that b|W = 1 (the constant function).
Again, by work done earlier this term, we know that such a function b exists.
Then for any x ∈ U, put:
• G(x) = b(x)g(x), if x ∈ V,
• G(x) = 0, if x ∉ V.
Then G is a smooth function on U with values in X^* ⊗ X^*.
Also, by construction, we have G(x) = g(x), for any x ∈ W.
Next define a function F by the formula, valid for any x ∈ U:

F(x) = f(p) + f′(p)(x − p) + G(x)(x − p, x − p).
Then F ∈ C ∞ (U) and by construction, we have (F − f )|W = 0.
It follows in particular, by the support non-increasing property of v, proved
above, that v(F − f )(p) = 0.
This gives the relation, using the derivation property of v:

v(f)(p) = v(F)(p) − v(F − f)(p) = v(F)(p) = v(f(p) + f′(p)(x − p) + G(x)(x − p, x − p))(p)
= f′(p)(v(x − p))(p) + v(G(x)(x − p, x − p))(p)
= f′(p)(v(x)(p)) + (v(G(x))(x − p, x − p))(p) + 2(G(x)(v(x − p), x − p))(p)
= f′(p)(v(x)(p)) + v(G(x))(p)(p − p, p − p) + 2G(p)(v(x)(p), p − p)
= f′(p)(v(x)(p)) + v(G(x))(p)(0, 0) + 2G(p)(v(x)(p), 0) = (f′(x)(v(x)))(p).

Since p ∈ U is arbitrary, we have v(f)(x) = f′(x)(v(x)), for any x ∈ U and any f ∈ C^∞(U). So v = w.∂, where w(x) = v(x), for any x ∈ U, and we are done.
We may rephrase the theorem just proved as follows:

• Every derivation V of C^∞(U) is given by the formula V = v(x).∂, for some smooth function v defined on U with values in X. Also v(x) = V(x).

More explicitly, we have, using co-ordinates (x¹, x², ..., xⁿ) and their corresponding partial derivative operators ∂_i = ∂/∂x^i, i = 1, 2, ..., n:

$$V = \sum_{i=1}^{n} v^i \partial_i.$$

Here the coefficients v^i = V(x^i), i = 1, 2, ..., n are smooth functions. More generally, if {e_i, i = 1, 2, ..., k} forms a basis of derivations, we can write:

$$V = \sum_{i=1}^{k} V^i e_i.$$

Here the coefficients V^i are smooth functions. Then if {f_i, i = 1, 2, ..., k} forms another basis, we have the transition formulas:

$$f_i = \sum_{j=1}^{k} m^j_i e_j, \qquad e_i = \sum_{j=1}^{k} n^j_i f_j.$$

Here each of m^j_i and n^j_i is a smoothly invertible matrix of smooth functions on U and each is the inverse of the other:

$$\sum_{l=1}^{k} m^l_i n^j_l = \sum_{l=1}^{k} n^l_i m^j_l = \delta^j_i.$$
Consider the special case where we have two systems of co-ordinates X and Y for U. By definition these are smooth diffeomorphisms X : U → X and Y : U → Y, where X and Y are open subsets of R^k and k is the dimension of U. Then, given a vector field V, the list of functions V(X^i) determines V uniquely, as does the list V(Y^i), and they are connected by the chain rule:

$$V(Y^i) = \sum_{j=1}^{k} J^i_j(X)\, V(X^j).$$
Here J^i_j is the Jacobian matrix of the transformation: writing Y as a composite function of X, J is the Jacobian matrix of the map Y ∘ X⁻¹. For example:
• Let U = {(x, y, z) ∈ R3 : if y = 0, then x > 0}
• Let X −1 : (r, t, w) ∈ R+ × (−π, π) × R → U be given by the formula
X −1 (r, t, w) = (r cos(t), r sin(t), w).
In co-ordinates, we have:

x = r cos(t), y = r sin(t), z = w,
r = √(x² + y²), t = arctan(y x⁻¹), w = z.
• Let Y⁻¹ : (s, p, u) ∈ R⁺ × (0, π) × (−π, π) → U be given by the formula Y⁻¹(s, p, u) = (s cos(u) sin(p), s sin(u) sin(p), s cos(p)).
In co-ordinates, we have:

x = s cos(u) sin(p), y = s sin(u) sin(p), z = s cos(p),
s = √(x² + y² + z²), u = arctan(y x⁻¹), p = arctan((x² + y²)^{1/2} z⁻¹).
Then we have:

Y ∘ X⁻¹(r, t, w) = Y(r cos(t), r sin(t), w) = (s, p, u),
s = √(r² + w²), u = t, p = arctan(r w⁻¹),
r = s sin(p), t = u, w = s cos(p).
The (x, y, z) co-ordinate basis of derivations ∂ = [∂_x, ∂_y, ∂_z] is related to the (r, t, w) co-ordinate derivations [∂_r, ∂_t, ∂_w] by the invertible matrix transformations:

$$[\partial_r, \partial_t, \partial_w] = [\partial_x, \partial_y, \partial_z] \begin{pmatrix} \frac{\partial x}{\partial r} & \frac{\partial x}{\partial t} & \frac{\partial x}{\partial w} \\ \frac{\partial y}{\partial r} & \frac{\partial y}{\partial t} & \frac{\partial y}{\partial w} \\ \frac{\partial z}{\partial r} & \frac{\partial z}{\partial t} & \frac{\partial z}{\partial w} \end{pmatrix} = [\partial_x, \partial_y, \partial_z] \begin{pmatrix} \cos(t) & -r\sin(t) & 0 \\ \sin(t) & r\cos(t) & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

$$= [\partial_x, \partial_y, \partial_z] \begin{pmatrix} x(x^2+y^2)^{-\frac12} & -y & 0 \\ y(x^2+y^2)^{-\frac12} & x & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
The inverse transformations are:

$$[\partial_x, \partial_y, \partial_z] = [\partial_r, \partial_t, \partial_w] \begin{pmatrix} \frac{\partial r}{\partial x} & \frac{\partial r}{\partial y} & \frac{\partial r}{\partial z} \\ \frac{\partial t}{\partial x} & \frac{\partial t}{\partial y} & \frac{\partial t}{\partial z} \\ \frac{\partial w}{\partial x} & \frac{\partial w}{\partial y} & \frac{\partial w}{\partial z} \end{pmatrix} = [\partial_r, \partial_t, \partial_w] \begin{pmatrix} x(x^2+y^2)^{-\frac12} & y(x^2+y^2)^{-\frac12} & 0 \\ -y(x^2+y^2)^{-1} & x(x^2+y^2)^{-1} & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

$$= [\partial_r, \partial_t, \partial_w] \begin{pmatrix} \cos(t) & \sin(t) & 0 \\ -r^{-1}\sin(t) & r^{-1}\cos(t) & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
Similarly, we have:

$$[\partial_s, \partial_p, \partial_u] = [\partial_x, \partial_y, \partial_z] \begin{pmatrix} \frac{\partial x}{\partial s} & \frac{\partial x}{\partial p} & \frac{\partial x}{\partial u} \\ \frac{\partial y}{\partial s} & \frac{\partial y}{\partial p} & \frac{\partial y}{\partial u} \\ \frac{\partial z}{\partial s} & \frac{\partial z}{\partial p} & \frac{\partial z}{\partial u} \end{pmatrix} = [\partial_x, \partial_y, \partial_z] \begin{pmatrix} \cos(u)\sin(p) & s\cos(u)\cos(p) & -s\sin(u)\sin(p) \\ \sin(u)\sin(p) & s\sin(u)\cos(p) & s\cos(u)\sin(p) \\ \cos(p) & -s\sin(p) & 0 \end{pmatrix}$$

$$= [\partial_x, \partial_y, \partial_z] \begin{pmatrix} x(x^2+y^2+z^2)^{-\frac12} & zx(x^2+y^2)^{-\frac12} & -y \\ y(x^2+y^2+z^2)^{-\frac12} & zy(x^2+y^2)^{-\frac12} & x \\ z(x^2+y^2+z^2)^{-\frac12} & -(x^2+y^2)^{\frac12} & 0 \end{pmatrix}.$$
The inverse transformations are:

$$[\partial_x, \partial_y, \partial_z] = [\partial_s, \partial_p, \partial_u] \begin{pmatrix} \frac{\partial s}{\partial x} & \frac{\partial s}{\partial y} & \frac{\partial s}{\partial z} \\ \frac{\partial p}{\partial x} & \frac{\partial p}{\partial y} & \frac{\partial p}{\partial z} \\ \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} & \frac{\partial u}{\partial z} \end{pmatrix}$$

$$= [\partial_s, \partial_p, \partial_u] \begin{pmatrix} x(x^2+y^2+z^2)^{-\frac12} & y(x^2+y^2+z^2)^{-\frac12} & z(x^2+y^2+z^2)^{-\frac12} \\ zx(x^2+y^2)^{-\frac12}(x^2+y^2+z^2)^{-1} & yz(x^2+y^2)^{-\frac12}(x^2+y^2+z^2)^{-1} & -(x^2+y^2)^{\frac12}(x^2+y^2+z^2)^{-1} \\ -y(x^2+y^2)^{-1} & x(x^2+y^2)^{-1} & 0 \end{pmatrix}.$$
Similarly, we have the inverse pair of transformations:

$$[\partial_r, \partial_t, \partial_w] = [\partial_s, \partial_p, \partial_u] \begin{pmatrix} \frac{\partial s}{\partial r} & \frac{\partial s}{\partial t} & \frac{\partial s}{\partial w} \\ \frac{\partial p}{\partial r} & \frac{\partial p}{\partial t} & \frac{\partial p}{\partial w} \\ \frac{\partial u}{\partial r} & \frac{\partial u}{\partial t} & \frac{\partial u}{\partial w} \end{pmatrix} = [\partial_s, \partial_p, \partial_u] \begin{pmatrix} r(r^2+w^2)^{-\frac12} & 0 & w(r^2+w^2)^{-\frac12} \\ w(r^2+w^2)^{-1} & 0 & -r(r^2+w^2)^{-1} \\ 0 & 1 & 0 \end{pmatrix} = [\partial_s, \partial_p, \partial_u] \begin{pmatrix} \sin(p) & 0 & \cos(p) \\ \cos(p)s^{-1} & 0 & -\sin(p)s^{-1} \\ 0 & 1 & 0 \end{pmatrix},$$

$$[\partial_s, \partial_p, \partial_u] = [\partial_r, \partial_t, \partial_w] \begin{pmatrix} \frac{\partial r}{\partial s} & \frac{\partial r}{\partial p} & \frac{\partial r}{\partial u} \\ \frac{\partial t}{\partial s} & \frac{\partial t}{\partial p} & \frac{\partial t}{\partial u} \\ \frac{\partial w}{\partial s} & \frac{\partial w}{\partial p} & \frac{\partial w}{\partial u} \end{pmatrix} = [\partial_r, \partial_t, \partial_w] \begin{pmatrix} \sin(p) & s\cos(p) & 0 \\ 0 & 0 & 1 \\ \cos(p) & -s\sin(p) & 0 \end{pmatrix} = [\partial_r, \partial_t, \partial_w] \begin{pmatrix} r(r^2+w^2)^{-\frac12} & w & 0 \\ 0 & 0 & 1 \\ w(r^2+w^2)^{-\frac12} & -r & 0 \end{pmatrix}.$$
It may be checked that these matrix transformations are compatible.
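As a small check of this compatibility, the following sympy sketch computes the Jacobian matrices of Y ∘ X⁻¹ and X ∘ Y⁻¹ from the formulas above and verifies that they are mutually inverse; the symbol names are ours, and atan2 stands in for arctan(r/w) on the domain considered.

```python
import sympy as sp

r, t, w = sp.symbols('r t w', positive=True)

# Y∘X^{-1}: (r, t, w) -> (s, p, u), as derived in the text.
s = sp.sqrt(r**2 + w**2)
p = sp.atan2(r, w)
u = t
J = sp.Matrix([[sp.diff(f, v) for v in (r, t, w)] for f in (s, p, u)])

# X∘Y^{-1}: (s, p, u) -> (r, t, w): r = s sin(p), t = u, w = s cos(p).
S, P, U = sp.symbols('S P U', positive=True)
K = sp.Matrix([[sp.diff(f, v) for v in (S, P, U)]
               for f in (S*sp.sin(P), U, S*sp.cos(P))])

# Substitute (S, P, U) = (s, p, u) and check that K*J is the identity.
K_sub = K.subs({S: s, P: p, U: u})
print(sp.simplify(K_sub * J))    # -> Matrix eye(3)
```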
Finally note that smooth vector fields have the sheaf property: if we have an open subset V of U, then a smooth vector field v on U restricts naturally to a vector field, denoted v|_V, on V, simply by restricting the coefficient functions v^i.

More invariantly, if we are given f ∈ C^∞(V) and v ∈ T(U) and we wish to compute v|_V(f), we can do this point-wise as follows. Given x ∈ V, pick a smooth bump function b_x on U which has the constant value 1 in an open set containing x and is zero outside V (we know how to do this). Then the product function g = f b_x is smooth on V and extends naturally, by zero, to a smooth function on U, still called g. Then v(g) is well-defined as a smooth function on U. Now put (v|_V(f))(x) = (v(g))(x). It is easy to see that this number is independent of the various choices involved. As x varies in V this gives the required smooth function v|_V(f) on V, and then the map f → v|_V(f), for f ∈ C^∞(V), is the required restriction of the derivation v. Finally this restriction operation makes the collection of derivations into a sheaf of C^∞-modules over the space M.
In particular we can assign a "value" to the vector field V at a point p: if V = Σ_{i=1}^k v^i(x)∂_i, then its value at p is V_p = Σ_{i=1}^k v^i(p)∂_i. This is called a tangent vector at p. The space of all such tangent vectors at p is a vector space of dimension k, denoted T_p. Note that a tangent vector acts on smooth functions by V_p(f) = V(f)(p). This can be shown easily to be independent of the choices made and depends only on the image of f in the space T_p^* = Z_p/Z_p² (called the co-tangent space at p), where Z_p is the space of smooth functions defined near p which vanish at p and Z_p² is the space spanned by all squares of elements of Z_p. Then the image of f in T_p^* is called (df)_p and we have:

$$V_p = \sum_{i=1}^{k} v^i(p)\partial_i, \qquad (df)_p = \sum_{i=1}^{k} (\partial_i f)(p)\, dx^i, \qquad V_p(f) = \sum_{i=1}^{k} v^i(p)(\partial_i f)(p).$$

So the tangent space T_p and the co-tangent space T_p^* at p are dual.
The Lie bracket
For v ∈ T (U) and w ∈ T (U) and f ∈ C ∞ (U) define:
[v, w](f ) = v(w(f )) − w(v(f )).
As f varies this gives a map [v, w] : C ∞ (U) → C ∞ (U).
• Then [v, w] ∈ T (U ).
The vector field [v, w] is called the Lie bracket of the vector fields v and w.
Proof:
The only non-trivial property to be checked is the Leibniz rule.
So let f and g be smooth functions.
Then we have:

[v, w](f g) = v(w(f g)) − w(v(f g))
= v(w(f)g + f w(g)) − w(v(f)g + f v(g))
= v(w(f)g) + v(f w(g)) − w(v(f)g) − w(f v(g))
= v(w(f))g + w(f)v(g) + v(f)w(g) + f v(w(g)) − w(v(f))g − v(f)w(g) − w(f)v(g) − f w(v(g))
= (v(w(f)) − w(v(f)))g + f(v(w(g)) − w(v(g)))
= ([v, w](f))g + f[v, w](g).
From its definition, we see that the Lie bracket is skew symmetric and linear
in each argument over the reals:
[v, w] = −[w, v],
[v + x, w + y] = [v, w] + [x, w] + [v, y] + [x, y],
[av, bw] = ab[v, w].
Here v, w, x and y are in T (U ) and a and b are real constants.
Also we have a Leibniz rule for multiplication by functions:
[f v, gw] = f g[v, w] + f v(g)w − gw(f )v.
This is valid for any f and g in C ∞ (U) and any v and w in T (U).
The proof is direct calculation: for any h ∈ C ∞ (U), we have:
[f v, gw](h) = f v(gw(h))−gw(f v(h)) = f v(g)w(h)+f gv(w(h))−gw(f )v(h)−gf w(v(h))
= f g(v(w(h)) − w(v(h))) + f v(g)w(h) − gw(f )v(h)
= (f g[v, w] + f v(g)w − gw(f )v)(h).
Using this rule we see that if v = Σ_{i=1}^n v^i ∂_i and w = Σ_{j=1}^n w^j ∂_j, where the ∂_i, i = 1, 2, ..., n are the partial derivative derivations and where the coefficient functions v^i and w^j are smooth on U, then we have:

$$[v, w] = \sum_{k=1}^{n} z^k \partial_k, \qquad z^k = v(w^k) - w(v^k) = \sum_{i=1}^{n} \left( v^i \partial_i w^k - w^i \partial_i v^k \right).$$
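A minimal sympy sketch of this coordinate formula; the two fields below are arbitrary illustrative choices.

```python
import sympy as sp

x, y = sp.symbols('x y')
coords = (x, y)

def apply_field(v, f):
    # v(f) = sum_i v^i ∂_i f
    return sum(vi * sp.diff(f, xi) for vi, xi in zip(v, coords))

def bracket(v, w):
    # [v, w]^k = v(w^k) - w(v^k)
    return tuple(sp.simplify(apply_field(v, wk) - apply_field(w, vk))
                 for vk, wk in zip(v, w))

v = (-y, x)            # rotation field
w = (x, y)             # dilation (Euler) field
print(bracket(v, w))   # -> (0, 0): rotations commute with dilations

# Direct check on a test function h: [v, w](h) = v(w(h)) - w(v(h)).
h = sp.sin(x*y) + x**3
lhs = apply_field(v, apply_field(w, h)) - apply_field(w, apply_field(v, h))
rhs = apply_field(bracket(v, w), h)
print(sp.simplify(lhs - rhs))   # -> 0
```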
Note that the partial derivative derivations commute:
[∂i , ∂j ] = 0.
Finally, we have the Jacobi identity for any three vector fields v, w and x in
T (U):
[v, [w, x]] + [w, [x, v]] + [x, [v, w]] = 0.
Again this is checked by direct calculation.
For any h ∈ C^∞(U), we have:

([v, [w, x]] + [w, [x, v]] + [x, [v, w]])(h) = [v, [w, x]](h) + [w, [x, v]](h) + [x, [v, w]](h)
= v([w, x](h)) − [w, x](v(h)) + w([x, v](h)) − [x, v](w(h)) + x([v, w](h)) − [v, w](x(h))
= v(w(x(h)) − x(w(h))) − (w(x(v(h))) − x(w(v(h)))) + w(x(v(h)) − v(x(h))) − (x(v(w(h))) − v(x(w(h)))) + x(v(w(h)) − w(v(h))) − (v(w(x(h))) − w(v(x(h))))
= v(w(x(h))) − v(x(w(h))) − w(x(v(h))) + x(w(v(h))) + w(x(v(h))) − w(v(x(h))) − x(v(w(h))) + v(x(w(h))) + x(v(w(h))) − x(w(v(h))) − v(w(x(h))) + w(v(x(h))) = 0.
For each v ∈ T (U), we define the Lie derivative operator Lv acting on smooth
functions and on vector fields, by the formulas, valid for any vector field
w ∈ T (U) and any smooth function f on U:
• Lv f = v(f ),
• Lv w = [v, w].
Then we have, rewriting the above formulas, valid for any f and g in C^∞(U), any smooth vector fields v, w and x on U, and any real constant function c on U:

L_{fv} = f L_v (as operators on functions; on vector fields, by the Leibniz rule above, L_{fv}w = f[v, w] − w(f)v),
L_{v+w} = L_v + L_w,
L_v w = −L_w v,
L_v c = 0,
L_v(f + g) = L_v f + L_v g,
L_v(f g) = (L_v f)g + f L_v g,
L_v(w + x) = L_v w + L_v x,
L_v(f w) = (L_v f)w + f L_v w,
L_v L_w − L_w L_v = L_{[v,w]}.
The contravariant tensor algebra

The contravariant tensor algebra of U is the (associative) tensor algebra, denoted T(U), generated by T^0(U) = C^∞(U) and by T^1(U), the space of smooth vector fields on U (note the change of notation here: the space T^1(U), by itself, was previously called T(U)), subject to the relations that multiplication of tensors is linear in each argument, over the ring C^∞(U). So each tensor is a (finite) sum of words v = f v₁v₂...v_k (called a word of k letters), with f ∈ C^∞(U) and v_i ∈ T^1(U), i = 1, 2, ..., k, for k a non-negative integer. The product of the word v with the word w = g w₁w₂...w_m, with g ∈ C^∞(U) and w_i ∈ T^1(U), i = 1, 2, ..., m, for m a non-negative integer, is the word of k + m letters x = vw = h x₁x₂...x_{k+m}, where h = f g and x_i = v_i, for 1 ≤ i ≤ k, and x_i = w_{i−k}, for k + 1 ≤ i ≤ k + m. The subspace of T(U) spanned by words of k letters, for a fixed k, is denoted T^k(U) and its elements are called tensors of type k. Then the product maps T^p(U) × T^q(U) into T^{p+q}(U), for any non-negative integers p and q.
In co-ordinates, every tensor τ of T^p(U) can be written uniquely:

$$\tau = \sum_{i_1=1}^{n} \sum_{i_2=1}^{n} \cdots \sum_{i_p=1}^{n} \tau^{i_1 i_2 \ldots i_p}(x)\, \partial_{i_1} \partial_{i_2} \ldots \partial_{i_p}.$$
Here each coefficient τ^{i₁i₂...i_p}(x) is a smooth function on U. To avoid confusion with differential operators, it is traditional to introduce the notation ⊗ for the tensor product, so τ is then written:

$$\tau = \sum_{i_1=1}^{n} \sum_{i_2=1}^{n} \cdots \sum_{i_p=1}^{n} \tau^{i_1 i_2 \ldots i_p}(x)\, \partial_{i_1} \otimes \partial_{i_2} \otimes \cdots \otimes \partial_{i_p}.$$
So the n^p words ∂_{i₁} ⊗ ∂_{i₂} ⊗ ··· ⊗ ∂_{i_p}, with 1 ≤ i_j ≤ n for each j = 1, 2, ..., p, form a basis for T^p(U), over the ring C^∞(U), called the co-ordinate basis.

The symmetric tensor algebra ST(U) is the algebra obtained from the tensor algebra by requiring that all letters commute, so the multiplication in this algebra is commutative and indeed ST(U) is a standard commutative algebra of polynomials in n variables over C^∞(U). So there is a surjective algebra homomorphism from T(U) to ST(U), which maps each word to itself.
For clarity the symmetric tensor product is often represented by the notation ⊙. The image of the co-ordinate basis for T^p(U) gives the following co-ordinate basis of words for the image ST^p(U) of T^p(U):

∂_{i₁} ⊙ ∂_{i₂} ⊙ ··· ⊙ ∂_{i_p}, for 1 ≤ i₁ ≤ i₂ ≤ ··· ≤ i_p ≤ n.

This basis has $\binom{n+p-1}{p}$ elements.
The skew-symmetric tensor algebra ΩT(U) is the algebra obtained from the tensor algebra by requiring that all letters representing vector fields anti-commute, so the multiplication in this algebra is graded-commutative: words with an even number of letters commute with everything, words with an odd number of letters anti-commute with each other. There is a surjective algebra homomorphism from T(U) to ΩT(U), which maps each word to itself. For clarity the skew-symmetric tensor product is often represented by the notation ∧. The image of the co-ordinate basis for T^p(U) gives the following co-ordinate basis for the image ΩT^p(U) of T^p(U):

∂_{i₁} ∧ ∂_{i₂} ∧ ··· ∧ ∂_{i_p}, for 1 ≤ i₁ < i₂ < ··· < i_p ≤ n.

This basis has $\binom{n}{p}$ elements.
In particular, we have ΩT^p(U) = 0 if p > n, and the total dimension of ΩT(U) over C^∞(U) is finite and is

$$\sum_{p=0}^{n} \binom{n}{p} = 2^n.$$
If v is a vector field, the Lie derivative along v, L_v, extends naturally to T(U), such that the Lie derivative acts on vector fields and smooth functions as given above and, for any tensors τ and υ, any smooth vector fields v and w and any smooth function f on U, we have:

L_v(τ + υ) = L_v(τ) + L_v(υ),
L_v(τυ) = L_v(τ)υ + τL_v(υ),
L_{v+w} = L_v + L_w,
L_{fv} = f L_v (on functions, as before),
L_v L_w − L_w L_v = L_{[v,w]}.
Acting on the word w = gw1 w2 . . . wk , we have:
Lv w = v(g)w1 w2 . . . wk +g[v, w1 ]w2 . . . wk +gw1 [v, w2 ] . . . wk +· · ·+gw1 w2 . . . [v, wk ].
In particular Lv preserves the type of a tensor. This action is compatible
with the passages to the symmetric and skew symmetric tensor algebras, so
the Lie derivative also acts naturally on these algebras obeying the same
rules as above.
Finally note that there are natural collections of tensors that map isomorphically to each of the symmetric and skew-symmetric algebras.
• In the case of the symmetric algebra it is the space spanned by all words of the form f w^p (p factors of the same vector field w), for f smooth and w a vector field.
More explicitly there is an idempotent symmetrization map which we describe as follows. Recall the permutation group Sn defined for any given
n ∈ N as the group of bijections from Nn = {1, 2, . . . , n} ⊂ N to itself, with
the group multiplication being composition. Then Sn has n! elements.
Given a word w = f w₁w₂...w_p in T^p(U) as above and given a permutation σ in S_p, we define:

w_σ = f w_{σ(1)}w_{σ(2)}...w_{σ(p)}.

Note that if ρ is another permutation in S_p, we have:

(w_σ)_ρ = f w_{σ(ρ(1))}w_{σ(ρ(2))}...w_{σ(ρ(p))} = f w_{(σ∘ρ)(1)}w_{(σ∘ρ)(2)}...w_{(σ∘ρ)(p)} = w_{σ∘ρ}.

Then the assignment w → w_σ for each word extends naturally to give a homomorphism τ → τ_σ from T^p(U) to itself and we have (τ_σ)_ρ = τ_{σ∘ρ}, for any permutations ρ and σ in S_p.
Then given any tensor τ ∈ T^p(U), we define:

$$S(\tau) = \frac{1}{p!} \sum_{\sigma \in S_p} \tau_\sigma.$$
This gives an endomorphism of T^p(U). We prove that it is idempotent:

$$S^2(\tau) = S\left( \frac{1}{p!} \sum_{\sigma \in S_p} \tau_\sigma \right) = \frac{1}{(p!)^2} \sum_{\rho \in S_p} \sum_{\sigma \in S_p} (\tau_\sigma)_\rho = \frac{1}{(p!)^2} \sum_{\rho \in S_p} \sum_{\sigma \in S_p} \tau_{\sigma \circ \rho}$$

$$= \frac{1}{(p!)^2} \sum_{\rho \in S_p} \sum_{\sigma \in S_p} \tau_{(\sigma \circ \rho^{-1}) \circ \rho} = \frac{1}{(p!)^2} \sum_{\rho \in S_p} \sum_{\sigma \in S_p} \tau_\sigma = \frac{1}{p!} \sum_{\sigma \in S_p} \tau_\sigma = S(\tau).$$

Here we used the fact that as σ runs once through all permutations in S_p, so does σ ∘ ρ⁻¹, for each fixed permutation ρ in S_p. The image of S is then called the space of symmetric tensors. On this space we can define the symmetric tensor product by the formula A ⊙ B = S(AB). This makes the space of symmetric tensors into an algebra naturally isomorphic to the algebra ST(U).
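A small numerical sketch of the idempotency of S, representing a type-3 tensor by its coefficient array at a point; the dimension 4 and the random entries are arbitrary choices.

```python
import numpy as np
from itertools import permutations

def symmetrize(tau):
    """S(τ): average of τ over all permutations of its index slots."""
    p = tau.ndim
    perms = list(permutations(range(p)))
    return sum(np.transpose(tau, s) for s in perms) / len(perms)

rng = np.random.default_rng(0)
tau = rng.normal(size=(4, 4, 4))   # a tensor of type 3 over a 4-dimensional X

S1 = symmetrize(tau)
S2 = symmetrize(S1)
print(np.allclose(S1, S2))         # True: S is idempotent
```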
• In the case of the skew-symmetric tensor algebra, we use the fact, easily proved by induction, that there is a group epimorphism s_n of the symmetric group S_n to Z₂, the multiplicative group of two elements 1 and −1, with 1 the identity and (−1)² = 1. For a permutation σ, the quantity s_n(σ) is called its sign, and σ is called even if its sign is 1 and odd otherwise. A simple transposition is a permutation which fixes all but two elements of N_n. Every permutation is a composition of simple transpositions, and σ is then even if and only if there is an even number of terms in this composition.
Derivations of differential forms

The algebra of forms Ω(U) is the graded algebra over C^∞(U) generated by the zero-forms C^∞(U) and by the exact differential one-forms df, for f ∈ C^∞(U), such that the map d : C^∞(U) → Ω¹(U), f → df, for any f ∈ C^∞(U), is a derivation. Also we have (df)(dg) = −(dg)(df), for any f and g in C^∞(U).
A derivation v of Ω(U) is by definition a map v : Ω(U) → Ω(U), such that:
• v kills constants: v(c) = 0, for any constant function c on U.
• v is an abelian group homomorphism: v(α + β) = v(α) + v(β), for any α and β in Ω(U).
• v obeys a Leibniz rule: for any integers k and l and any α and β in Ω(U), where α has degree k and β degree l, we have:
v(αβ) = v(α)β + (−1)^{kl} v(β)α.
The operator d : C^∞(U) → Ω¹(U) extends naturally and uniquely to a derivation of Ω(U), still called d, such that d² = 0.
Then d is called the exterior derivative.
Denote by D(U) the space of all derivations of Ω(U).
Note that if v is in D(U), then so is γv, the multiplication of v on the left by any element γ of Ω(U), so D(U) is naturally an Ω(U)-module.
A derivation v is said to be local if and only if it annihilates Ω⁰(U).
Define the X^*-valued derivation δ of Ω(U) by the formula, valid for any smooth function f:

δ(f) = 0, δ(df) = f′.

Then δ is a derivation of degree minus one; also every local derivation V of Ω(U) has the unique expression:

V = v.δ.

Let V be any derivation of Ω(U). Then its action on C^∞(U) is that of a derivation, so it may be given by the formula:

V(f) = v.f′.

Here v is a unique smooth form on U, with values in X.
Put W = v.δ, so W is a local derivation. Then we have, since W annihilates functions (as do its even and odd parts):

(V − [W, d])(f) = v.f′ − v.δ(df) = 0.
So we have the unique decomposition of any derivation:

V = [W, d] + T, with T and W local derivations.
Notes on the theorem of Archimedes, Newton, Green, Stokes, Ostrogradsky, Poincaré, de Rham, etc.

The theorem is roughly stated as follows:

• Let R be an oriented smooth n-manifold with boundary ∂R, at least piecewise smooth, where ∂R acquires the induced orientation from that of R. Let ω be an (n − 1)-form, smoothly defined on R and smooth up to and including the boundary ∂R. Then we have:

$$\int_R d\omega = \int_{\partial R} \omega.$$
For us these manifolds will always be embedded inside some Euclidean space.
A fairly simple case and, as we shall see, essentially the base case for a
reverse induction proof of the general case, is when R is a bounded region
in Rn , with a smooth boundary, ∂R, given by a single equation f (x) = 0,
where f is a smooth function on Rn , such that R is the region given by the
inequality f (x) ≤ 0 and on the boundary the differential df 6= 0 (by the
rank theorem this guarantees that the boundary is a smooth manifold). Also
x = (x1 , x2 , . . . xn ), where xj ∈ R, for each j = 1, 2, . . . , n. The gradient
vector ∇f is then non-vanishing on ∂R and is everywhere outward pointing.
Denote by N the unit outward-pointing normal vector, i.e. the unit vector in the direction of ∇f, so we have the relations:

$$\nabla f = N \sqrt{(\nabla f).(\nabla f)}, \qquad N.N = 1.$$
In the following we will often have occasion to divide by a partial derivative of f with respect to a co-ordinate: one or more of these derivatives may vanish at a point, but always at least one is non-vanishing, so the relevant formula will be understood to apply only where the derivative in question is non-zero.
Introduce the volume n-form:

Σ = dx¹dx²...dxⁿ.

Note (though we will not really need this) that we can write this invariantly as:

θ^{a₁}θ^{a₂}...θ^{aₙ} = Σ ε^{a₁a₂...aₙ}.

Here θ^a gives the canonical isomorphism between dual vectors and one-forms: α_a → α_a θ^a. Also the θ^a anti-commute: θ^aθ^b = −θ^bθ^a. Then ε^{a₁a₂...aₙ} is the orientation tensor and is totally skew. Dual to θ^b is the derivation δ_a of degree minus one, which annihilates functions and obeys δ_a θ^b = δ_a^b. More explicitly, the components of δ_a give n derivations, each of degree minus one, δ = (δ₁, δ₂, ..., δₙ), where δ_i(dx^j) = δ_i^j. Then put:

Σ = δΣ = (δ₁Σ, δ₂Σ, ..., δₙΣ) = (dx²dx³dx⁴...dxⁿ, −dx¹dx³dx⁴...dxⁿ, dx¹dx²dx⁴...dxⁿ, ..., (−1)^{n−1}dx¹dx²...dx^{n−1}).
For example:
• When n = 1, we have Σ = (1).
• When n = 2, with Cartesian coordinates (x, y), we have:
Σ = (dy, −dx).
• When n = 3, with Cartesian coordinates (x, y, z), we have:
Σ = (dy dz, dz dx, dx dy).
• When n = 4, with Cartesian coordinates (t, x, y, z), we have:
Σ = (dx dy dz, −dt dy dz, −dt dz dx, −dt dx dy).
Notice that we have the identity:

θ^a Σ_b = δ_b^a Σ.

Proof:
Using the derivation property of δ_b, we have:

θ^a Σ_b = θ^a δ_b Σ = −δ_b(θ^a Σ) + (δ_b θ^a)Σ = δ_b^a Σ.
Here we used also that θa Σ = 0, which is true because the left-hand side is
an (n + 1)-form, which automatically vanishes in n dimensions.
For example, we have in the case n = 3:
dx Σ = dx (dy dz, dz dx, dx dy) = (Σ, 0, 0),
dy Σ = dy (dy dz, dz dx, dx dy) = (0, Σ, 0),
dz Σ = dz (dy dz, dz dx, dx dy) = (0, 0, Σ).
Now we may write the smooth (n − 1)-form ω in terms of the basis of (n − 1)-forms given by Σ as ω = v.Σ = Σ_{i=1}^n v^i Σ_i, where v = Σ_{i=1}^n v^i ∂_i is a smooth vector field on R. Then, since dΣ_i = 0, dx^i Σ_j = δ_j^i Σ and d = Σ_{i=1}^n dx^i ∂_i, we have:

$$d\omega = d\left( \sum_{i=1}^{n} v^i \Sigma_i \right) = \sum_{i=1}^{n} (dv^i)\Sigma_i = \sum_{i,j=1}^{n} dx^j (\partial_j v^i) \Sigma_i = \sum_{i,j=1}^{n} \delta_i^j (\partial_j v^i) \Sigma = (\nabla.v)\Sigma.$$

So the Stokes' Theorem can be rewritten as the divergence theorem:

$$\int_{\partial R} v.\Sigma = \int_R (\nabla.v)\Sigma.$$
Now, restricted to the space ∂R, any two (n − 1)-forms are linearly dependent, since the space of (n − 1)-forms in (n − 1) dimensions is one-dimensional. In terms of the differentials of the co-ordinates, we have the non-trivial linear relation on restriction to the surface f = 0:

0 = df = (∇f).θ = f₁dx¹ + f₂dx² + ··· + fₙdxⁿ.

This allows us to reduce the space of differentials to a space of dimension n − 1. Indeed, consider the (n − 1)-form Σ_i at a point where the differential df is non-zero. Then we can consider the differential df as the first element of a basis of forms at that point, so we have the expression Σ_i = λ_i β + (df)µ_i, for some (n − 1)-form β, some (n − 2)-forms µ_i and some scalars λ_i (here β is the product of the other forms of the basis). Multiplying this equation by df we get the relation:

$$(df)\Sigma_i = \lambda_i (df)\beta = \sum_{j=1}^{n} f_j\, dx^j \Sigma_i = f_i \Sigma.$$
But the form Σ ≠ 0 and at least one of the f_i is non-zero, so neither can the form (df)β be zero, so we get λ_i = k f_i = m N_i, for some non-zero scalars k and m. Put α = mβ, giving the relation:

Σ_i = N_i α + (df)µ_i.

In particular, on restriction to the hypersurface f = 0, since there df = 0, we get the relation, for some (n − 1)-form α:

Σ = Nα.

Taking the dot product of this equation with the unit vector N we get α = N.Σ. Accordingly, put α = N.Σ. Then restricted to the surface f = 0, we have the relation:

$$\Sigma = N\alpha = \frac{(\nabla f)\alpha}{\sqrt{(\nabla f).(\nabla f)}}.$$

This gives the relations, valid on restriction to the boundary surface f = 0:

$$\alpha = \frac{\Sigma_i \sqrt{(\nabla f).(\nabla f)}}{\partial_i f}, \qquad \Sigma_i = \frac{(\partial_i f)(\nabla f).\Sigma}{(\nabla f).(\nabla f)}.$$
So, for example, in two dimensions on f = 0, we have:

$$\alpha = N.\Sigma = \frac{(f_x, f_y).(dy, -dx)}{\sqrt{f_x^2 + f_y^2}} = \frac{f_x\, dy - f_y\, dx}{\sqrt{f_x^2 + f_y^2}} = \frac{dy \sqrt{f_x^2 + f_y^2}}{f_x} = -\frac{dx \sqrt{f_x^2 + f_y^2}}{f_y}.$$
Here we used that on the curve f = 0, we have df = 0, so we have:

$$f_x\, dx + f_y\, dy = 0, \qquad \frac{dx}{f_y} = -\frac{dy}{f_x}, \qquad \frac{dy}{dx} = -\frac{f_x}{f_y}, \qquad \frac{dx}{dy} = -\frac{f_y}{f_x}.$$
In particular we have that the measures

$$ds = \frac{|dx|\sqrt{f_x^2 + f_y^2}}{|f_y|} \quad \text{and} \quad ds = \frac{|dy|\sqrt{f_x^2 + f_y^2}}{|f_x|}$$

are equal and each gives the standard measure of length along the curve:

$$ds^2 = dx^2 + dy^2 = dx^2\left(1 + \left(\frac{dy}{dx}\right)^2\right) = dx^2\left(1 + \left(\frac{f_x}{f_y}\right)^2\right) = \frac{dx^2}{f_y^2}\left(f_x^2 + f_y^2\right) = \frac{dy^2}{f_x^2}\left(f_x^2 + f_y^2\right),$$

$$|\alpha| = ds.$$
Now consider the proof of the divergence theorem in two dimensions, where we put v = [P, Q]:

$$\int_{\partial R} v.\Sigma = \int_R (\nabla.v)\Sigma,$$

$$\int_{\partial R} [P, Q].[dy, -dx] = \int_R (P_x + Q_y)\, dxdy,$$

$$\int_{\partial R} (P\, dy - Q\, dx) = \int_R (P_x + Q_y)\, dxdy.$$

This is linear over the reals in the functions P and Q, so to prove the theorem, it suffices to prove the following two relations:

$$\int_{\partial R} P\, dy = \int_R P_x\, dxdy, \qquad -\int_{\partial R} Q\, dx = \int_R Q_y\, dxdy.$$

Here the two-dimensional integrals are properly oriented, so they reduce to ordinary double integrals.
We will prove the theorem for a simple region R, simple in the sense that it can be parametrized easily to cover the region in two ways:

• For c ≤ y ≤ d, we have a(y) ≤ x ≤ b(y).
• For a ≤ x ≤ b, we have c(x) ≤ y ≤ d(x).

See the sketch:

[Figure: a simple region R in the (x, y)-plane, with the boundary ∂R traced counterclockwise; the left and right boundary curves are x = a(y) and x = b(y), for c ≤ y ≤ d, and the bottom and top boundary curves are y = c(x) and y = d(x), for a ≤ x ≤ b.]
Then using the first parametrization, we have:

$$\int_R P_x\, dxdy = \int_c^d \int_{a(y)}^{b(y)} P_x\, dx\, dy = \int_c^d [P(x, y)]_{a(y)}^{b(y)}\, dy = \int_c^d P(b(y), y)\, dy + \int_d^c P(a(y), y)\, dy = \int_{\partial R} P(x, y)\, dy.$$

The point here is that the right-hand part of the boundary ∂R is parametrized by x = b(y), for y from c to d, whereas the left-hand part is parametrized by x = a(y), for y from d to c.
Next, using the second parametrization, we have:

$$\int_R Q_y\, dxdy = \int_a^b \int_{c(x)}^{d(x)} Q_y\, dy\, dx = \int_a^b [Q(x, y)]_{c(x)}^{d(x)}\, dx = -\int_a^b Q(x, c(x))\, dx - \int_b^a Q(x, d(x))\, dx = -\int_{\partial R} Q(x, y)\, dx.$$
The point here is that the lower part of the boundary ∂R is parametrized
by y = c(x), for x from a to b, whereas the upper part is parametrized by
y = d(x), for x from b to a.
So the required result is proved.
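Here is a sympy sketch checking the two-dimensional divergence form just proved, on the unit disk (a simple region); the test functions P and Q are arbitrary polynomial choices.

```python
import sympy as sp

x, y, th, r = sp.symbols('x y theta r')
P, Q = x**3, y**3

# Right-hand side: the divergence integrated over the disk, in polars.
div = sp.diff(P, x) + sp.diff(Q, y)
rhs = sp.integrate(div.subs({x: r*sp.cos(th), y: r*sp.sin(th)}) * r,
                   (r, 0, 1), (th, 0, 2*sp.pi))

# Left-hand side: the boundary integral over x = cos θ, y = sin θ.
xb, yb = sp.cos(th), sp.sin(th)
lhs = sp.integrate(P.subs({x: xb, y: yb}) * sp.diff(yb, th)
                   - Q.subs({x: xb, y: yb}) * sp.diff(xb, th),
                   (th, 0, 2*sp.pi))

print(lhs, rhs, sp.simplify(lhs - rhs))   # -> 3*pi/2, 3*pi/2, 0
```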
Example

By this theorem, the area A of the region R can be given by any of the formulas:

$$A = \int_R dxdy = \int_{\partial R} x\, dy = -\int_{\partial R} y\, dx = \frac{1}{2}\int_{\partial R} (x\, dy - y\, dx).$$

In polar co-ordinates, we have:

x = r cos(θ), dx = cos(θ)dr − sin(θ)r dθ,
y = r sin(θ), dy = sin(θ)dr + cos(θ)r dθ,

x dy − y dx = r cos(θ)(sin(θ)dr + cos(θ)r dθ) − r sin(θ)(cos(θ)dr − sin(θ)r dθ) = r² dθ,

dxdy = ½ d(x dy − y dx) = ½ d(r² dθ) = r dr dθ.

So the area is also given by the formulas:

$$A = \int_R r\, dr\, d\theta = \frac{1}{2}\int_{\partial R} r^2\, d\theta.$$
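For instance, the boundary formula A = ½∮(x dy − y dx) recovers the area of an ellipse directly; the parametrization below is an arbitrary test curve.

```python
import sympy as sp

th = sp.symbols('theta')
a, b = sp.symbols('a b', positive=True)

# Ellipse boundary, traced counterclockwise.
xb, yb = a*sp.cos(th), b*sp.sin(th)
A = sp.Rational(1, 2) * sp.integrate(xb*sp.diff(yb, th) - yb*sp.diff(xb, th),
                                     (th, 0, 2*sp.pi))
print(A)   # -> pi*a*b
```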
The proof of the divergence theorem in all dimensions is basically the same. In (n + 1) dimensions, let a simple region be one whose projection to each co-ordinate hyperplane is itself a simple region in n dimensions.
In particular, if we write the (n + 1) co-ordinates as (t, x), where x = (x¹, x², ..., xⁿ), then by the linearity of Stokes' Theorem in the vector field v, it suffices to take the vector field v to have only one non-zero component, denoted w, in the t-direction, in which case ∇.v = w_t, so we just need to prove:

$$\int_R w_t\, dt\, d^n x = \int_{\partial R} v.N\, d^n S.$$

Here N is the outward-pointing normal and d^nS is the hyper-surface volume element. The point here is that once we have proved this case, the other cases, where v has just one non-zero component in another co-ordinate direction, follow by appropriate permutation of the co-ordinates, and then the general case, where v is arbitrary, is obtained by summation of these special cases, using linearity over the reals. Now the region R is described by the range t = a(x) to t = b(x), as x ranges over S, the projection of the region R onto the hyper-plane t = 0.
By the Fundamental Theorem of Calculus, the integral on the left-hand side is:

$$\int_R w_t\, dt\, d^n x = \int_{x \in S} \int_{a(x)}^{b(x)} w_t\, dt\, d^n x = \int_{x \in S} [w]_{a(x)}^{b(x)}\, d^n x = \int_{x \in S} (w(b(x), x) - w(a(x), x))\, d^n x.$$
On the upper part of the boundary, given by the equation f(t, x) = t − b(x) = 0, the outward-pointing (non-normalized) gradient vector is:

∇(t − b(x)) = (1, −∇b).

On the lower part, given by the equation f(t, x) = a(x) − t = 0, the outward-pointing (non-normalized) gradient vector is:

∇(a(x) − t) = (−1, ∇a).

So, on the upper boundary, we have, for the surface element d^nS, in terms of the co-ordinates x¹, x², ..., xⁿ:

$$d^n S = \frac{\sqrt{f_t^2 + (\nabla b).(\nabla b)}}{|f_t|}\, d^n x = \sqrt{1 + (\nabla b).(\nabla b)}\, d^n x.$$
Similarly, on the lower boundary, we have:

$$d^n S = \frac{\sqrt{f_t^2 + (\nabla a).(\nabla a)}}{|f_t|}\, d^n x = \sqrt{1 + (\nabla a).(\nabla a)}\, d^n x.$$

Also for the integrand v.N we have the expression, since v = (w, 0, 0, ..., 0), on the upper boundary, where t = b(x):

$$v.N = \frac{w(b(x), x)}{\sqrt{1 + (\nabla b).(\nabla b)}}, \qquad v.N\, d^n S = w(b(x), x)\, d^n x.$$

On the lower boundary, where t = a(x), we have instead:

$$v.N = -\frac{w(a(x), x)}{\sqrt{1 + (\nabla a).(\nabla a)}}, \qquad v.N\, d^n S = -w(a(x), x)\, d^n x.$$
On any vertical side walls of the region, v.N = 0, since v points in the t-direction and there N has no t-component, so they contribute nothing. So we get:

$$\int_{\partial R} v.N\, d^n S = \int_{t=b(x),\, x \in S} v.N\, d^n S + \int_{t=a(x),\, x \in S} v.N\, d^n S = \int_{x \in S} w(b(x), x)\, d^n x - \int_{x \in S} w(a(x), x)\, d^n x$$

$$= \int_{x \in S} (w(b(x), x) - w(a(x), x))\, d^n x.$$

This agrees with the left-hand side of Stokes' Theorem, calculated above, so we are done: the divergence theorem holds.
Next we consider the case where the manifold of integration lies in a higher dimensional space. Stokes' Theorem is additive in the regions under consideration, so using the Rank theorem, we may take for the integration region R the space (t(x), x) as x varies over a simple region S in Rⁿ, the boundary ∂R being given by the same formula (t(x), x), for x ∈ ∂S. Here the map t is a smooth map from S to R^m, for some m, the total dimension of the space inside which the manifold R lies being m + n. Then the pull-back formula for integration tells us that ∫_R dω = ∫_S p*(dω) = ∫_S d(p*(ω)), where p : S → R is the bijection x → p(x) = (t(x), x). But applied to the boundary, we have also the pull-back formula:

$$\int_{\partial R} \omega = \int_{\partial S} p^*(\omega).$$

By the divergence theorem, proved above, we have

$$\int_{\partial S} p^*(\omega) = \int_S d(p^*(\omega)).$$
So Stokes' Theorem follows. An alternative and perhaps instructive way of thinking of this case, without using the pull-back argument, is to restrict to the case that the boundary of S is piecewise flat. Then we form a closed domain, whose (curved) "roof" is the region R, whose flat "floor" is S, and such that the boundary pieces each lie in a hyperplane, so form (flat) vertical "walls", with base the edges of the boundary of S and tops the corresponding curved pieces of the boundary of R. Then we proceed by reverse induction on the dimension n: we assume that Stokes' theorem holds in all higher dimensions, the base case being the highest possible dimension, where the theorem becomes the divergence theorem, which we have already proved. Denote by T the "room" with "ceiling" R, "floor" S and a finite number of "walls", the boundary regions W_i, i = 1, 2, ..., k, each with outward-pointing normal, above the boundary pieces S_i, i = 1, 2, ..., k of S. These various regions form the boundary of the room T. Specifically, in the sense of integration, where we orient both S and R upwards, we have:

$$\partial T = R - S + \sum_{j=1}^{k} W_j, \qquad 0 = \partial^2 T = \partial R - \partial S + \sum_{j=1}^{k} \partial W_j, \qquad \partial R = \partial S - \sum_{j=1}^{k} \partial W_j.$$

The minus sign in front of S in the first equation is there because the orientation chosen for S points inside the region T.
Then we have by the induction hypothesis, for any form α defined on T:

$$\int_T d\alpha = \int_{\partial T} \alpha = \int_R \alpha - \int_S \alpha + \sum_{j=1}^{k} \int_{W_j} \alpha.$$

So now take the case that α = dω.
Then dα = 0, so we get, using the inductive hypothesis:

$$0 = \int_T d\alpha = \int_{\partial T} d\omega = \int_R d\omega - \int_S d\omega + \sum_{j=1}^{k} \int_{W_j} d\omega,$$

$$\int_R d\omega = \int_S d\omega - \sum_{j=1}^{k} \int_{W_j} d\omega.$$
But each of the regions S and W_j, j = 1, 2, ..., k is an ordinary flat region, so the usual divergence theorem applies to each of them and so Stokes' theorem holds for them, giving:

$$\int_R d\omega = \int_{\partial S} \omega - \sum_{j=1}^{k} \int_{\partial W_j} \omega = \int_{\partial S - \sum_{j=1}^{k} \partial W_j} \omega = \int_{\partial R} \omega.$$

So by reverse induction on dimension, the required theorem holds.
Notice that here we are exploiting the idea that d on forms and ∂ on regions are dual to each other: then dual to the idea that d² = 0 is the idea that ∂² = 0, or "the boundary of a boundary is empty". This idea is codified in the theory of singular homology.
Summarizing, if we represent the integral in the style of Paul Dirac, ⟨R|ω⟩ denoting the integral of ω over the region R, then Stokes' Theorem says that the adjoint of d is ∂:

⟨∂R|ω⟩ = ⟨R|dω⟩.

It follows that the adjoint of d² is ∂²:

⟨∂²R|ω⟩ = ⟨∂(∂R)|ω⟩ = ⟨∂R|dω⟩ = ⟨R|d(dω)⟩ = ⟨R|d²ω⟩.

So perhaps it is not surprising that the fact that d² = 0 is related to the fact that ∂² = 0. Finally we can build up Stokes' Theorem for more complicated regions, by splitting them appropriately into simple regions, using the additivity of both sides of Stokes' Theorem under the decomposition of regions.
Special cases of Stokes' Theorem

In one dimension, the theorem is just the Fundamental Theorem of Calculus:

$$\int_a^b df = f(b) - f(a).$$

The same theorem is true in any higher dimension, where f is a function and the integration region is a path R connecting its boundary points ∂R = B − A (the minus sign coming from the orientation):

$$\int_R df = \int_{\partial R} f = f(B) - f(A).$$
In two dimensions, we may write the theorem in divergence form with v = [P, Q] and ∇.v = P_x + Q_y:

$$\int_R (\nabla.v)\, dxdy = \int_R (P_x + Q_y)\, dxdy = \int_{\partial R} (P\, dy - Q\, dx) = \int_{\partial R} [P, Q].N\, ds.$$

Alternatively, we may replace P by Q and Q by −P in this formula, giving the more natural looking:

$$\int_R (\nabla \times v)\, dxdy = \int_R (Q_x - P_y)\, dxdy = \int_{\partial R} (P\, dx + Q\, dy) = \int_{\partial R} v.T\, ds.$$

Here v = [P, Q], ∇ × v = Q_x − P_y and T is the unit tangent vector to the curve ∂R; note that T ds = [dx, dy], so v.T ds = P dx + Q dy.
In three dimensions, we have the divergence theorem, when R is three-dimensional:

$$\int_R (\nabla.v)\, dxdydz = \int_R (P_x + Q_y + R_z)\, dxdydz = \int_{\partial R} v.\Sigma = \int_{\partial R} (P\, dydz + Q\, dzdx + R\, dxdy) = \int_{\partial R} v.N\, d^2S.$$

Here v = [P, Q, R] and Σ = [dydz, dzdx, dxdy] = N d²S.
The lower dimensional Stokes' Theorem reads, for the case that R is two-dimensional:

$$\int_{\partial R} (P\, dx + Q\, dy + R\, dz) = \int_R ((R_y - Q_z)\, dydz + (P_z - R_x)\, dzdx + (Q_x - P_y)\, dxdy),$$

$$\int_{\partial R} v.dx = \int_{\partial R} v.T\, ds = \int_R (\nabla \times v).\Sigma = \int_R [N, \nabla, v]\, d^2S.$$

Here the notation [A, B, C] denotes the vector triple product, the determinant of its vector arguments A, B and C, so [A, B, C] = A.(B × C) = (A × B).C, so [N, ∇, v] = N.(∇ × v) = (N × ∇).v.
Also T is the unit tangent vector in the direction of the curve, so we have T ds = dx = [dx, dy, dz] and N d²S = [dydz, dzdx, dxdy], as usual.
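A sympy sketch of this lower-dimensional theorem on the unit disk in the plane z = 0, where N = [0, 0, 1]; the field v = [P, Q, R] is an arbitrary polynomial choice.

```python
import sympy as sp

x, y, z, th, r = sp.symbols('x y z theta r')
P, Q, R = y*z - y**3, x**3 + z**2, x*y*z

curl_z = sp.diff(Q, x) - sp.diff(P, y)   # only (∇ × v).N survives on z = 0

# Surface side, in polar co-ordinates on the disk.
surf = sp.integrate(curl_z.subs({x: r*sp.cos(th), y: r*sp.sin(th), z: 0}) * r,
                    (r, 0, 1), (th, 0, 2*sp.pi))

# Boundary side: x = cos θ, y = sin θ, z = 0 (so the dz term drops out).
sub = {x: sp.cos(th), y: sp.sin(th), z: 0}
line = sp.integrate(P.subs(sub) * sp.diff(sp.cos(th), th)
                    + Q.subs(sub) * sp.diff(sp.sin(th), th),
                    (th, 0, 2*sp.pi))

print(surf, line, sp.simplify(surf - line))   # equal values, difference 0
```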
In four dimensions, with co-ordinates x^a = (t, x, y, z) = (t, x), where x⁰ = t and x¹ = x, x² = y and x³ = z, there are two dual ways of writing a two-form:

• First we just write it out as a polynomial in dx^a:

$$F = \frac{1}{2}\sum_{a=0}^{3}\sum_{b=0}^{3} F_{ab}\, dx^a dx^b = dt\, E + B,$$

$$E = E.dx = F_{01}\, dx + F_{02}\, dy + F_{03}\, dz, \qquad F_{01} = E^1,\ F_{02} = E^2,\ F_{03} = E^3,$$

$$B = B.\Sigma = B_1\, dx^2 dx^3 + B_2\, dx^3 dx^1 + B_3\, dx^1 dx^2 = B_1\, dydz + B_2\, dzdx + B_3\, dxdy.$$

• Dually, we write:

$$F^* = \frac{1}{2}\sum_{a=0}^{3}\sum_{b=0}^{3} F^{ab}\, \delta_a \delta_b \Omega = \frac{1}{2}\sum_{a=0}^{3}\sum_{b=0}^{3} F^{ab}\, \Sigma_{ab}.$$

Here, if we are in space-time, [F^{01}, F^{02}, F^{03}] = −E and [F^{23}, F^{31}, F^{12}] = B. Also Ω = dtdxdydz. Then we have:

$$\Sigma_{0i} = \delta_0 \delta_i \Omega = -\delta_i(dxdydz) = -\Sigma_i,$$

$$\Sigma_{23} = \delta_2\delta_3\Omega = -dtdx, \qquad \Sigma_{31} = \delta_3\delta_1\Omega = -dtdy, \qquad \Sigma_{12} = \delta_1\delta_2\Omega = -dtdz,$$

$$F^* = -dt\, B.dx + E.\Sigma.$$
So the effect of going from F to F* is the same as replacing (E, B) by (−B, E). The free electromagnetic field equations of James Clerk Maxwell are then:

0 = dF, 0 = dF*.

Here the electromagnetic two-form F in four dimensions encodes both the electric field E and the magnetic field B.
Write d = dt∂_t + ∂, where ∂ = dx∂_x + dy∂_y + dz∂_z = dx.∇.
Then we have, with Σ = dxdydz and Σ = [dydz, dzdx, dxdy] as usual:

$$dF = d(dt\, E + B.\Sigma) = -dt\, (\nabla \times E).\Sigma + dt\, (\partial_t B).\Sigma + (\nabla.B)\Sigma = dt\, (\partial_t B - \nabla \times E).\Sigma + (\nabla.B)\Sigma,$$

$$dF^* = dt\, (\nabla \times B + \partial_t E).\Sigma + (\nabla.E)\Sigma.$$

The free Maxwell equations are then:

0 = dF, 0 = dF*:
0 = ∇.B, ∂_t B = ∇ × E,
0 = ∇.E, ∂_t E = −∇ × B.
Recall the derivative identity: ∇ × (∇ × u) = ∇(∇.u) − Δu.
Here u is a vector field and Δ = ∇.∇ = ∂_x² + ∂_y² + ∂_z² is called the three-dimensional Laplacian.
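This identity can be verified componentwise in sympy for a generic field; the component names u1, u2, u3 are arbitrary.

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
u = [sp.Function(n)(x, y, z) for n in ('u1', 'u2', 'u3')]
X = (x, y, z)

def curl(v):
    return [sp.diff(v[2], y) - sp.diff(v[1], z),
            sp.diff(v[0], z) - sp.diff(v[2], x),
            sp.diff(v[1], x) - sp.diff(v[0], y)]

div_u = sum(sp.diff(ui, xi) for ui, xi in zip(u, X))

# ∇ × (∇ × u) versus ∇(∇.u) − Δu, component by component.
lhs = curl(curl(u))
rhs = [sp.diff(div_u, xi) - sum(sp.diff(ui, xj, 2) for xj in X)
       for ui, xi in zip(u, X)]
print([sp.simplify(l - r) for l, r in zip(lhs, rhs)])   # -> [0, 0, 0]
```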
Then we get, using the free Maxwell equations:

$$\partial_t^2 B = \nabla \times \partial_t E = -\nabla \times (\nabla \times B) = \Delta B, \qquad \partial_t^2 E = -\nabla \times \partial_t B = -\nabla \times (\nabla \times E) = \Delta E,$$

$$\Box B = 0, \qquad \Box E = 0.$$

Here □ is the wave operator, □ = ∂_t² − Δ.
The fact that the free Maxwell fields obey the wave equation in four dimensions is one of the cornerstones of Albert Einstein's theory of relativity. The four-dimensional perspective is due to Hermann Minkowski.
Maxwell’s equations in the presence of currents are the following:
dF = 0,
dF ∗ = J.
Here J is the current three-form, which can be written J = [φ, j].Ω, where we have:

$$\Omega = [\delta_0, \delta_1, \delta_2, \delta_3]\Omega = [dxdydz, -dtdydz, -dtdzdx, -dtdxdy].$$

So we have:

$$J = \phi\Sigma - dt\, j.\Sigma.$$

The equation of current conservation is the equation dJ = 0, which follows by applying d to the equation dF* = J:

$$0 = dJ = (dt\partial_t + \partial)(\phi\, dxdydz - dt\, j.\Sigma) = (\partial_t\phi + \nabla.j)\, dtdxdydz,$$

$$\partial_t\phi + \nabla.j = 0.$$
The Maxwell equations now read:

dF = 0: 0 = ∇.B, ∂_t B = ∇ × E;
J = dF*: φΣ − dt j.Σ = dt(∇ × B + ∂_t E).Σ + (∇.E)Σ, so φ = ∇.E, −j = ∂_t E + ∇ × B.
Now we have:

$$\partial_t^2 E = -\partial_t j - \nabla \times \partial_t B = -\partial_t j - \nabla \times (\nabla \times E) = \Delta E - \partial_t j - \nabla(\nabla.E),$$

$$\Box E = -\partial_t j - \nabla\phi,$$

$$\partial_t^2 B = \nabla \times \partial_t E = -\nabla \times j - \nabla \times (\nabla \times B) = \Delta B - \nabla \times j,$$

$$\Box B = -\nabla \times j.$$

So the electric and magnetic fields obey the wave equation with sources determined by the current.