Taylor's Theorem for smooth functions

Let p be a point of a real affine space M of finite dimension, whose tangent space at any point is the finite-dimensional real vector space X. Let V be an open subset of M such that, for any x ∈ V, the line segment joining x to p lies in V. This line segment has the natural parametrization x + t(p − x), where t ∈ [0, 1]. Let f : V → Y be a smooth function taking values in a finite-dimensional real vector space Y. For any positive integer k and for any x ∈ V, put:

$$g_k(x) = \frac{1}{(k-1)!}\int_0^1 t^{k-1} f^{(k)}(x + t(p-x))\, dt, \qquad h_k(x) = g_k(x)\big((x-p)^k\big),$$

$$(x-p)^k = (x-p)^{\otimes k} = (x-p)\otimes(x-p)\otimes\cdots\otimes(x-p) \quad (k \text{ factors of } (x-p) \text{ in this tensor product}).$$

Note that g_k is well-defined and smooth on V and takes values in (X*)^{⊗k} ⊗ Y. Also we have the formulas, using the Fundamental Theorem of Calculus and the Chain Rule:

$$g_k(p) = \frac{f^{(k)}(p)}{(k-1)!}\int_0^1 t^{k-1}\, dt = \frac{f^{(k)}(p)}{k!},$$

$$h_1(x) = g_1(x)(x-p) = \int_0^1 f^{(1)}(x + t(p-x))(x-p)\, dt = -\int_0^1 \frac{d}{dt}\big(f(x + t(p-x))\big)\, dt = -\big[f(x + t(p-x))\big]_{t=0}^{t=1} = f(x) - f(p).$$

Also, again by the Fundamental Theorem of Calculus and the Chain Rule, we have, for any positive integer k and any x ∈ V:

$$h_k(x) - h_{k+1}(x) = \int_0^1 \left(\frac{t^{k-1}}{(k-1)!}\, f^{(k)}(x + t(p-x))\big((x-p)^k\big) - \frac{t^k}{k!}\, f^{(k+1)}(x + t(p-x))\big((x-p)^{k+1}\big)\right) dt$$
$$= \int_0^1 \frac{d}{dt}\left(\frac{t^k}{k!}\, f^{(k)}(x + t(p-x))\big((x-p)^k\big)\right) dt = \left[\frac{t^k}{k!}\, f^{(k)}(x + t(p-x))\big((x-p)^k\big)\right]_{t=0}^{t=1} = \frac{1}{k!}\, f^{(k)}(p)\big((x-p)^k\big).$$

Summing this relation from k = 1 to n, the sum on the left-hand side collapses and we get, for any x ∈ V and any positive integer n, the formula:

$$h_1(x) - h_{n+1}(x) = \sum_{k=1}^n \frac{1}{k!}\, f^{(k)}(p)\big((x-p)^k\big).$$

Rearranging, and using the formula h₁(x) = f(x) − f(p), proved above, we have proved Taylor's theorem for smooth functions, valid for any x ∈ V and any non-negative integer n:

$$f(x) = \sum_{k=0}^n \frac{1}{k!}\, f^{(k)}(p)\big((x-p)^k\big) + g_{n+1}(x)\big((x-p)^{n+1}\big),$$
$$g_{n+1}(x) = \frac{1}{n!}\int_0^1 t^n f^{(n+1)}(x + t(p-x))\, dt, \qquad g_{n+1}(p) = \frac{f^{(n+1)}(p)}{(n+1)!}.$$

Here g_{n+1} : V → (X*)^{⊗(n+1)} ⊗ Y is smooth. Note that the formula, for a given non-negative integer n, is also valid, as written, when f is only C^{n+1} on V, except that the remainder function g_{n+1} is then only continuous, not necessarily smooth.

Finally note that if we write f^{(k)} = Dᵏf, where D is the derivative operator, we have formally (non-rigorously), using the series for eᵘ:

$$(e^{(x-p)\cdot D} f)(p) - f(x) = \left[(e^{t(x-p)\cdot D} f)(x + t(p-x))\right]_{t=0}^{t=1} = \int_0^1 \frac{d}{dt}\left((e^{t(x-p)\cdot D} f)(x + t(p-x))\right) dt$$
$$= \sum_{k=0}^{\infty} \int_0^1 \frac{d}{dt}\left(\frac{t^k}{k!}\, f^{(k)}(x + t(p-x))\big((x-p)^k\big)\right) dt = \sum_{k=1}^{\infty} \big(h_k(x) - h_{k+1}(x)\big) + \int_0^1 \frac{d}{dt}\big(f(x + t(p-x))\big)\, dt = h_1(x) + f(p) - f(x) = 0.$$

Here we needed to assume that h_n(x) → 0 as n → ∞. So formally we have the equation:

$$(e^{(x-p)\cdot D} f)(p) = f(x).$$

This formula can be rigorously justified in the analytic case, for x in the range of convergence around the base point p. Note that in the smooth case, however, it is generally false: for example, in one dimension, for the smooth function f(x) = e^{−1/x} for x > 0 and f(x) = 0 for x ≤ 0, we have (e^{x·D} f)(0) = 0 ≠ f(x) for x > 0, since f and all its derivatives vanish at the origin.
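The exact identity just proved, polynomial part plus integral remainder, can be checked numerically. A minimal Python sketch (not part of the notes; the helper names are my own), for f = sin in one dimension, whose k-th derivative is sin(u + kπ/2):

```python
# Numerical check of Taylor's theorem with the integral form of the remainder,
# in one dimension, for f = sin. A minimal sketch; helper names are my own.
from math import factorial

import numpy as np
from scipy.integrate import quad

def deriv_sin(k, u):
    # k-th derivative of sin at u
    return np.sin(u + k * np.pi / 2)

def taylor_with_remainder(x, p, n):
    # Polynomial part: sum_{k=0}^{n} f^(k)(p) (x - p)^k / k!
    poly = sum(deriv_sin(k, p) * (x - p) ** k / factorial(k) for k in range(n + 1))
    # Remainder g_{n+1}(x) (x - p)^{n+1}, with
    # g_{n+1}(x) = (1/n!) * integral_0^1 t^n f^(n+1)(x + t (p - x)) dt
    g, _ = quad(lambda t: t ** n * deriv_sin(n + 1, x + t * (p - x)), 0, 1)
    return poly + (g / factorial(n)) * (x - p) ** (n + 1)

x, p = 1.7, 0.3
for n in range(5):
    # exact for every n, up to quadrature error
    print(n, np.sin(x) - taylor_with_remainder(x, p, n))
```

The point of the check is that the formula is an identity for each fixed n, not an approximation that improves with n; only the quadrature error shows up in the output.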
Vector fields, first-order homogeneous differential operators and derivations

We work in the smooth category. Fix a real affine space M of finite dimension, whose tangent space at any point is the real finite-dimensional vector space X. Also fix U, a non-empty open subset of M.

The co-ordinate vector function x of U assigns to any x ∈ U its position vector x ∈ X, relative to a chosen origin for M, so x can be regarded as a smooth function on U with values in X. In particular we have, for its (constant) derivative: x′ = δ ∈ X* ⊗ X. Here δ is the Kronecker delta, which gives the identity endomorphism of X.

A derivation v of C∞(U) is by definition a map v : C∞(U) → C∞(U) such that:

• v kills constants: v(c) = 0, for any constant function c on U.
• v is an abelian group homomorphism: v(f + g) = v(f) + v(g), for any f and g in C∞(U).
• v obeys a Leibniz rule: for any f and g in C∞(U), we have v(fg) = v(f)g + f v(g).

Intuitively, v is an infinitesimal automorphism of C∞(U). Denote by T(U) the collection of all derivations of C∞(U). Then it is clear that T(U) is a C∞(U)-module, under left multiplication by elements of C∞(U).

• By definition, a smooth vector field on U is a derivation of C∞(U).

We want to find a concrete description of T(U). To this end, we introduce the derivative derivation ∂:

$$\partial : C^\infty(U) \to X^* \otimes C^\infty(U).$$

Here ∂ is defined by the succinct formula, valid for any f ∈ C∞(U): ∂(f) = f′. Note that the properties of the derivative immediately give that ∂ is a derivation, as required. Then if w is a smooth function on U with values in X, w determines an operator on C∞(U), denoted w.∂ : C∞(U) → C∞(U), given by the formula, valid for any f ∈ C∞(U) and any x ∈ U:

$$(w.\partial(f))(x) = f'(x)(w(x)).$$

So the pairing (w, ∂) → w.∂ is the usual dual pairing of a dual vector with a vector. Normally one would write this as ∂(w), but we want to make clear that the derivative operator is not acting on w, so we keep w to the left of the derivative operator ∂, which acts to the right. Note that, immediately from its definition, by the standard properties of the derivative, we have w.∂ ∈ T(U). Also note that since x′ is the Kronecker delta, we have:

$$w.\partial(x) = w.$$

This means, in particular, that given the operator w.∂, the smooth function w is uniquely determined.

• By definition, a first-order smooth homogeneous differential operator is an operator on C∞(U) of the form w.∂, for some w a smooth function on U with values in X. (A general smooth first-order operator is then of the form f → w.∂(f) + gf, where g is a smooth function on U.)

The main theorem we want to prove is:

• An operator on C∞(U) is a derivation, if and only if it is a smooth vector field, if and only if it is a first-order smooth homogeneous differential operator.

Everything in this theorem is immediate from the definitions and the discussion above, except for the following, which we have to prove:

• Given a derivation v of C∞(U), we must show that for any f ∈ C∞(U) and for any x ∈ U, we have the formula:

$$(v(f))(x) = f'(x)(v(x)).$$

(Here v(x) denotes the X-valued function obtained by applying v, componentwise, to the co-ordinate function x.)

The first key result is the following:

• Let ∅ ≠ V ⊂ U, with V open. If f ∈ C∞(U) is such that f|V vanishes, then v(f)|V = 0.

Proof: Let p ∈ V. We must show that v(f)(p) = 0. Let b ∈ C∞(U) be a bump function, supported in V, such that there exists an open subset W ⊂ V, with p ∈ W, such that b|W = 1 (the constant function). We know that such a function b exists, by work done earlier this term. Then we have bf = 0 on U, since b vanishes outside V and f vanishes inside. We write the Leibniz rule:

$$0 = v(0) = v(bf) = b\,v(f) + v(b)\,f.$$

Evaluate at the point p. We get, since b(p) = 1 and f(p) = 0:

$$0 = b(p)\,v(f)(p) + v(b)(p)\,f(p) = v(f)(p).$$

So the required result holds and we are done.
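The proof invokes a bump function b, equal to 1 near p and supported in V. For concreteness, here is a minimal Python sketch of the classical one-dimensional construction the notes refer to ("work done earlier this term"); the function names are my own:

```python
# A standard 1-D smooth bump: identically 1 on |t| <= 1/2, supported in (-1, 1).
# A minimal sketch of the classical e^{-1/t} construction; names are my own.
import numpy as np

def phi(t):
    # smooth, zero for t <= 0, positive for t > 0
    return np.where(t > 0, np.exp(-1.0 / np.maximum(t, 1e-300)), 0.0)

def step(t):
    # smooth, 0 for t <= 0, 1 for t >= 1
    return phi(t) / (phi(t) + phi(1.0 - t))

def bump(t):
    # smooth, 1 on |t| <= 1/2, 0 on |t| >= 1
    return step(2.0 - 2.0 * np.abs(t))

print(bump(np.array([0.0, 0.4, 0.6, 0.99, 1.2])))
```

The same e^{−1/t} building block appeared at the end of the Taylor section as the standard example of a smooth non-analytic function.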
We can rephrase this result as follows: for f ∈ C∞(U), denote by Z(f) the closure of the open subset of U on which f is non-zero; Z(f) is called the support of f. Note that f vanishes on the open set U − Z(f). Then the result just proved implies that for any f ∈ C∞(U) and for any derivation v ∈ T(U), we have Z(v(f)) ⊂ Z(f): i.e. the action of v on C∞(U) is support non-increasing.

Now fix p ∈ U. Pick V, an open ball centered at p with V ⊂ U; the ball V exists since U is open. Also, if x ∈ V, the line segment from p to x lies entirely within V. So we can apply the second-order Taylor Theorem to the set V, based at p, a special case of the full Taylor Theorem proved above:

• We have the expansion, valid for any x ∈ V:

$$f(x) = f(p) + f'(p)(x - p) + g(x)\big((x-p)^2\big).$$

Here g is a smooth function on V with values in X* ⊗ X* and (x−p)² = (x−p) ⊗ (x−p) ∈ X ⊗ X, with the property that g(p) = 2⁻¹f″(p).

Let b ∈ C∞(U) be a bump function, supported in V, such that there exists an open subset W ⊂ V, with p ∈ W, such that b|W = 1 (the constant function). Again, by work done earlier this term, we know that such a function b exists. Then for any x ∈ U, put:

• G(x) = b(x)g(x), if x ∈ V,
• G(x) = 0, if x ∉ V.

Then G is a smooth function on U with values in X* ⊗ X*. Also, by construction, we have G(x) = g(x), for any x ∈ W. Next define a function F by the formula, valid for any x ∈ U:

$$F(x) = f(p) + f'(p)(x - p) + G(x)(x - p, x - p).$$

Then F ∈ C∞(U) and by construction (F − f)|W = 0. It follows in particular, by the support non-increasing property of v, proved above, that v(F − f)(p) = 0. This gives the relation, using the derivation property of v:

$$v(f)(p) = v(F)(p) - v(F - f)(p) = v(F)(p) = v\big(f(p) + f'(p)(x - p) + G(x)(x - p, x - p)\big)(p)$$
$$= f'(p)(v(x - p))(p) + v\big(G(x)(x - p, x - p)\big)(p)$$
$$= f'(p)(v(x)(p)) + \big(v(G(x))(x - p, x - p)\big)(p) + 2\big(G(x)(v(x - p), x - p)\big)(p)$$
$$= f'(p)(v(x)(p)) + v(G)(p)(p - p, p - p) + 2G(p)(v(x)(p), p - p)$$
$$= f'(p)(v(x)(p)) + v(G)(p)(0, 0) + 2G(p)(v(x)(p), 0) = f'(p)(v(x)(p)).$$

Since p ∈ U is arbitrary, we have v(f)(x) = f′(x)(v(x)), for any x ∈ U and any f ∈ C∞(U). So v = w.∂, where w = v(x) is the result of applying v to the co-ordinate function x, and we are done.

We may rephrase the theorem just proved as follows:

• Every derivation V of C∞(U) is given by the formula V = v.∂, for some smooth function v defined on U with values in X; indeed v = V(x).

More explicitly, using co-ordinates (x¹, x², …, xⁿ) and their corresponding partial derivative operators ∂ᵢ = ∂/∂xⁱ, i = 1, 2, …, n, we have:

$$V = \sum_{i=1}^n v^i \partial_i.$$

Here the coefficients vⁱ = V(xⁱ), i = 1, 2, …, n, are smooth functions.
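To illustrate the theorem in co-ordinates, here is a short sympy sketch (my own example, not from the notes): a first-order homogeneous operator V = Σ vⁱ∂ᵢ recovers its coefficients as V(xⁱ), and satisfies the Leibniz rule:

```python
# A derivation of C^infty in co-ordinates is the operator sum_i v^i d/dx^i;
# the coefficients are recovered by applying it to the co-ordinate functions.
# A minimal sketch; the particular vector field is my own example.
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
v = (x1 * x2, x1 + x2 ** 2)      # components v^1, v^2

def V(f):
    # the operator w.d : f -> sum_i v^i * df/dx^i
    return v[0] * sp.diff(f, x1) + v[1] * sp.diff(f, x2)

print(V(x1), V(x2))              # x1*x2, x1 + x2**2: the coefficients return

# Leibniz rule: V(fg) = V(f) g + f V(g)
f = sp.sin(x1) * x2
g = sp.exp(x1 + x2)
print(sp.simplify(V(f * g) - (V(f) * g + f * V(g))))   # 0
```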
More generally, if {eᵢ, i = 1, 2, …, k} forms a basis of derivations, we can write:

$$V = \sum_{i=1}^k V^i e_i.$$

Here the coefficients Vⁱ are smooth functions. Then if {fᵢ, i = 1, 2, …, k} forms another basis, we have the transition formulas:

$$f_i = \sum_{j=1}^k m_i^j\, e_j, \qquad e_i = \sum_{j=1}^k n_i^j\, f_j.$$

Here each of mᵢʲ and nᵢʲ is a smoothly invertible matrix of smooth functions on U, and each is the inverse of the other:

$$\sum_{l=1}^k m_i^l\, n_l^j = \sum_{l=1}^k n_i^l\, m_l^j = \delta_i^j.$$

A special case is when we have two systems of co-ordinates X and Y for U. By definition these are smooth diffeomorphisms X : U → X and Y : U → Y, where X and Y lie in ℝᵏ and k is the dimension of U. Then, given a vector field V, the list of functions V(Xⁱ) determines V uniquely, as does the list V(Yⁱ), and they are connected by the chain rule:

$$V(Y^i) = \sum_{j=1}^k J_j^i(X)\, V(X^j).$$

Here Jⱼⁱ is the Jacobian matrix of the transformation. More explicitly, we write Y as a composite function of X, and then J is the Jacobian matrix of that composite map. For example:

• Let U = {(x, y, z) ∈ ℝ³ : if y = 0, then x > 0}.
• Let X⁻¹ : (r, t, w) ∈ ℝ⁺ × (−π, π) × ℝ → U be given by the formula X⁻¹(r, t, w) = (r cos(t), r sin(t), w). In co-ordinates, we have:

$$x = r\cos(t), \quad y = r\sin(t), \quad z = w; \qquad r = \sqrt{x^2 + y^2}, \quad t = \arctan(yx^{-1}), \quad w = z.$$

• Let Y⁻¹ : (s, p, u) ∈ ℝ⁺ × (0, π) × (−π, π) → U be given by the formula Y⁻¹(s, p, u) = (s cos(u) sin(p), s sin(u) sin(p), s cos(p)). In co-ordinates, we have:

$$x = s\cos(u)\sin(p), \quad y = s\sin(u)\sin(p), \quad z = s\cos(p);$$
$$s = \sqrt{x^2 + y^2 + z^2}, \quad p = \arctan\!\left(\frac{(x^2 + y^2)^{1/2}}{z}\right), \quad u = \arctan(yx^{-1}).$$

Then we have:

$$Y \circ X^{-1}(r, t, w) = Y(r\cos(t), r\sin(t), w) = (s, p, u),$$
$$s = \sqrt{r^2 + w^2}, \quad p = \arctan\!\left(\frac{r}{w}\right), \quad u = t; \qquad r = s\sin(p), \quad t = u, \quad w = s\cos(p).$$

The (x, y, z) co-ordinate basis of derivations [∂ₓ, ∂_y, ∂_z] is related to the (r, t, w) co-ordinate derivations [∂ᵣ, ∂ₜ, ∂_w] by the invertible matrix transformations (here ∂(x, y, z)/∂(r, t, w) denotes the matrix of partial derivatives, with columns indexed by r, t, w):

$$[\partial_r, \partial_t, \partial_w] = [\partial_x, \partial_y, \partial_z]\,\frac{\partial(x, y, z)}{\partial(r, t, w)} = [\partial_x, \partial_y, \partial_z]\begin{pmatrix} \cos(t) & -r\sin(t) & 0 \\ \sin(t) & r\cos(t) & 0 \\ 0 & 0 & 1 \end{pmatrix} = [\partial_x, \partial_y, \partial_z]\begin{pmatrix} x(x^2+y^2)^{-1/2} & -y & 0 \\ y(x^2+y^2)^{-1/2} & x & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

The inverse transformations are:

$$[\partial_x, \partial_y, \partial_z] = [\partial_r, \partial_t, \partial_w]\,\frac{\partial(r, t, w)}{\partial(x, y, z)} = [\partial_r, \partial_t, \partial_w]\begin{pmatrix} x(x^2+y^2)^{-1/2} & y(x^2+y^2)^{-1/2} & 0 \\ -y(x^2+y^2)^{-1} & x(x^2+y^2)^{-1} & 0 \\ 0 & 0 & 1 \end{pmatrix} = [\partial_r, \partial_t, \partial_w]\begin{pmatrix} \cos(t) & \sin(t) & 0 \\ -r^{-1}\sin(t) & r^{-1}\cos(t) & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

Similarly, we have:

$$[\partial_s, \partial_p, \partial_u] = [\partial_x, \partial_y, \partial_z]\,\frac{\partial(x, y, z)}{\partial(s, p, u)} = [\partial_x, \partial_y, \partial_z]\begin{pmatrix} \cos(u)\sin(p) & s\cos(u)\cos(p) & -s\sin(u)\sin(p) \\ \sin(u)\sin(p) & s\sin(u)\cos(p) & s\cos(u)\sin(p) \\ \cos(p) & -s\sin(p) & 0 \end{pmatrix}$$
$$= [\partial_x, \partial_y, \partial_z]\begin{pmatrix} x(x^2+y^2+z^2)^{-1/2} & zx(x^2+y^2)^{-1/2} & -y \\ y(x^2+y^2+z^2)^{-1/2} & zy(x^2+y^2)^{-1/2} & x \\ z(x^2+y^2+z^2)^{-1/2} & -(x^2+y^2)^{1/2} & 0 \end{pmatrix}.$$

The inverse transformations are:

$$[\partial_x, \partial_y, \partial_z] = [\partial_s, \partial_p, \partial_u]\,\frac{\partial(s, p, u)}{\partial(x, y, z)} = [\partial_s, \partial_p, \partial_u]\begin{pmatrix} x(x^2+y^2+z^2)^{-1/2} & y(x^2+y^2+z^2)^{-1/2} & z(x^2+y^2+z^2)^{-1/2} \\ zx(x^2+y^2)^{-1/2}(x^2+y^2+z^2)^{-1} & yz(x^2+y^2)^{-1/2}(x^2+y^2+z^2)^{-1} & -(x^2+y^2)^{1/2}(x^2+y^2+z^2)^{-1} \\ -y(x^2+y^2)^{-1} & x(x^2+y^2)^{-1} & 0 \end{pmatrix}.$$

Similarly, we have the inverse pair of transformations:

$$[\partial_r, \partial_t, \partial_w] = [\partial_s, \partial_p, \partial_u]\,\frac{\partial(s, p, u)}{\partial(r, t, w)} = [\partial_s, \partial_p, \partial_u]\begin{pmatrix} r(r^2+w^2)^{-1/2} & 0 & w(r^2+w^2)^{-1/2} \\ w(r^2+w^2)^{-1} & 0 & -r(r^2+w^2)^{-1} \\ 0 & 1 & 0 \end{pmatrix} = [\partial_s, \partial_p, \partial_u]\begin{pmatrix} \sin(p) & 0 & \cos(p) \\ s^{-1}\cos(p) & 0 & -s^{-1}\sin(p) \\ 0 & 1 & 0 \end{pmatrix},$$

$$[\partial_s, \partial_p, \partial_u] = [\partial_r, \partial_t, \partial_w]\,\frac{\partial(r, t, w)}{\partial(s, p, u)} = [\partial_r, \partial_t, \partial_w]\begin{pmatrix} r(r^2+w^2)^{-1/2} & w & 0 \\ 0 & 0 & 1 \\ w(r^2+w^2)^{-1/2} & -r & 0 \end{pmatrix} = [\partial_r, \partial_t, \partial_w]\begin{pmatrix} \sin(p) & s\cos(p) & 0 \\ 0 & 0 & 1 \\ \cos(p) & -s\sin(p) & 0 \end{pmatrix}.$$

It may be checked that these matrix transformations are mutually compatible.
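The compatibility can be checked mechanically. A small sympy sketch for the cylindrical case (variable names are my own):

```python
# Checking the cylindrical basis-change matrices with sympy: the matrix of
# partials d(x,y,z)/d(r,t,w) and its inverse. A minimal sketch; names my own.
import sympy as sp

r, t, w = sp.symbols('r t w', positive=True)
X = sp.Matrix([r * sp.cos(t), r * sp.sin(t), w])   # (x, y, z) in terms of (r, t, w)

J = X.jacobian([r, t, w])                          # columns: d/dr, d/dt, d/dw
print(sp.simplify(J))
# Matrix([[cos(t), -r*sin(t), 0], [sin(t), r*cos(t), 0], [0, 0, 1]])

Jinv = sp.simplify(J.inv())
print(Jinv)
# Matrix([[cos(t), sin(t), 0], [-sin(t)/r, cos(t)/r, 0], [0, 0, 1]])

print(sp.simplify(J * Jinv))                       # the identity matrix
```

The spherical and cylindrical-to-spherical matrices can be checked the same way, by swapping in the relevant parametrization.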
Finally note that smooth vector fields have the sheaf property: if we have an open subset V of U, then a smooth vector field V on U restricts naturally to a vector field, denoted V|V, on V, simply by restricting the coefficient functions vⁱ.

More invariantly, if we are given f ∈ C∞(V) and V ∈ T(U) and we wish to compute V|V(f), we can do this point-wise, as follows. Given x ∈ V, pick a smooth bump function bₓ on U which has the constant value 1 in an open set containing x and is zero outside V (we know how to do this). Then the product function g = f bₓ is smooth on V and extends naturally, by zero, to a smooth function on U, still called g. Then V(g) is well-defined as a smooth function on U. Now put (V|V(f))(x) = (V(g))(x). It is easy to see that this number is independent of the various choices involved. As x varies in V, this gives the required smooth function V|V(f) on V, and then the map f → V|V(f), for f ∈ C∞(V), is the required restriction of the derivation V. Finally, this restriction operation makes the collection of derivations into a sheaf of C∞-modules over the space M.

In particular, we can assign a "value" to the vector field V at a point p: if V = Σᵢ₌₁ᵏ vⁱ(x)∂ᵢ, then its value at p is Vₚ = Σᵢ₌₁ᵏ vⁱ(p)∂ᵢ. This is called a tangent vector at p. The space of all such tangent vectors at p is a vector space of dimension k, denoted Tₚ. Note that a tangent vector acts on smooth functions by Vₚ(f) = V(f)(p). This can easily be shown to be independent of the choices made, and depends only on the image of f in the space Tₚ* = Zₚ/Zₚ² (called the co-tangent space at p), where Zₚ is the space of smooth functions defined near p which vanish at p and Zₚ² is the space spanned by all squares of elements of Zₚ. Then the image of f in Tₚ* is called (df)ₚ and we have:

$$V_p = \sum_{i=1}^k v^i(p)\,\partial_i, \qquad (df)_p = \sum_{i=1}^k (\partial_i f)(p)\, dx^i, \qquad V_p(f) = \sum_{i=1}^k v^i(p)(\partial_i f)(p).$$

So the tangent space Tₚ and the co-tangent space Tₚ* at p are dual.

The Lie bracket

For v ∈ T(U) and w ∈ T(U) and f ∈ C∞(U) define:

$$[v, w](f) = v(w(f)) - w(v(f)).$$

As f varies this gives a map [v, w] : C∞(U) → C∞(U).

• Then [v, w] ∈ T(U). The vector field [v, w] is called the Lie bracket of the vector fields v and w.

Proof: The only non-trivial property to be checked is the Leibniz rule. So let f and g be smooth functions. Then we have:

$$[v, w](fg) = v(w(fg)) - w(v(fg)) = v(w(f)g + f\,w(g)) - w(v(f)g + f\,v(g))$$
$$= v(w(f)g) + v(f\,w(g)) - w(v(f)g) - w(f\,v(g))$$
$$= v(w(f))g + w(f)v(g) + v(f)w(g) + f\,v(w(g)) - w(v(f))g - v(f)w(g) - w(f)v(g) - f\,w(v(g))$$
$$= \big(v(w(f)) - w(v(f))\big)g + f\big(v(w(g)) - w(v(g))\big) = ([v, w](f))g + f\,[v, w](g).$$

From its definition, we see that the Lie bracket is skew-symmetric and linear in each argument over the reals:

$$[v, w] = -[w, v], \qquad [v + x, w + y] = [v, w] + [x, w] + [v, y] + [x, y], \qquad [av, bw] = ab\,[v, w].$$

Here v, w, x and y are in T(U) and a and b are real constants. Also we have a Leibniz rule for multiplication by functions:

$$[fv, gw] = fg\,[v, w] + f\,v(g)\,w - g\,w(f)\,v.$$

This is valid for any f and g in C∞(U) and any v and w in T(U). The proof is direct calculation: for any h ∈ C∞(U), we have:

$$[fv, gw](h) = f\,v(g\,w(h)) - g\,w(f\,v(h)) = f\,v(g)\,w(h) + fg\,v(w(h)) - g\,w(f)\,v(h) - gf\,w(v(h))$$
$$= fg\big(v(w(h)) - w(v(h))\big) + f\,v(g)\,w(h) - g\,w(f)\,v(h) = \big(fg\,[v, w] + f\,v(g)\,w - g\,w(f)\,v\big)(h).$$

Using this rule we see that if v = Σᵢ₌₁ⁿ vⁱ∂ᵢ and w = Σⱼ₌₁ⁿ wʲ∂ⱼ, where the ∂ᵢ, i = 1, 2, …, n, are the partial derivative derivations and where the coefficient functions vⁱ and wʲ are smooth on U, then we have:

$$[v, w] = \sum_{k=1}^n z^k \partial_k, \qquad z^k = v(w^k) - w(v^k) = \sum_{i=1}^n \left(v^i\,\partial_i w^k - w^i\,\partial_i v^k\right).$$

Note that the partial derivative derivations commute: [∂ᵢ, ∂ⱼ] = 0.
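The co-ordinate formula for the bracket is easy to implement; a sympy sketch with two concrete fields in the plane (my own examples, not from the notes):

```python
# Computing a Lie bracket in co-ordinates, via
# [v, w]^k = sum_i (v^i d_i w^k - w^i d_i v^k). A minimal sketch; names my own.
import sympy as sp

x, y = sp.symbols('x y')
coords = (x, y)
v = (x * y, y ** 2)       # v = xy d_x + y^2 d_y
w = (sp.sin(x), x + y)    # w = sin(x) d_x + (x + y) d_y

def apply(vf, f):
    # the derivation vf acting on the function f
    return sum(vf[i] * sp.diff(f, coords[i]) for i in range(2))

def bracket(v, w):
    return tuple(apply(v, w[k]) - apply(w, v[k]) for k in range(2))

print([sp.simplify(c) for c in bracket(v, w)])

# Check against the definition [v,w](f) = v(w(f)) - w(v(f)) on a test function:
f = x ** 2 * sp.exp(y)
lhs = apply(bracket(v, w), f)
rhs = apply(v, apply(w, f)) - apply(w, apply(v, f))
print(sp.simplify(lhs - rhs))    # 0
```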
Finally, we have the Jacobi identity, for any three vector fields v, w and x in T(U):

$$[v, [w, x]] + [w, [x, v]] + [x, [v, w]] = 0.$$

Again this is checked by direct calculation. For any h ∈ C∞(U), we have:

$$\big([v, [w, x]] + [w, [x, v]] + [x, [v, w]]\big)(h) = [v, [w, x]](h) + [w, [x, v]](h) + [x, [v, w]](h)$$
$$= v([w, x](h)) - [w, x](v(h)) + w([x, v](h)) - [x, v](w(h)) + x([v, w](h)) - [v, w](x(h))$$
$$= v\big(w(x(h)) - x(w(h))\big) - \big(w(x(v(h))) - x(w(v(h)))\big) + w\big(x(v(h)) - v(x(h))\big) - \big(x(v(w(h))) - v(x(w(h)))\big) + x\big(v(w(h)) - w(v(h))\big) - \big(v(w(x(h))) - w(v(x(h)))\big)$$
$$= v(w(x(h))) - v(x(w(h))) - w(x(v(h))) + x(w(v(h))) + w(x(v(h))) - w(v(x(h))) - x(v(w(h))) + v(x(w(h))) + x(v(w(h))) - x(w(v(h))) - v(w(x(h))) + w(v(x(h))) = 0.$$

For each v ∈ T(U), we define the Lie derivative operator L_v, acting on smooth functions and on vector fields, by the formulas, valid for any vector field w ∈ T(U) and any smooth function f on U:

• L_v f = v(f),
• L_v w = [v, w].

Then we have, rewriting the above formulas, valid for any f and g in C∞(U), any smooth vector fields v, w and x on U and any real constant function c on U (the first identity holding for the action on functions; on vector fields one has instead L_{fv} w = f L_v w − w(f)v, by the Leibniz rule above):

$$L_{fv} = f L_v, \qquad L_{v+w} = L_v + L_w, \qquad L_v w = -L_w v, \qquad L_v c = 0,$$
$$L_v(f + g) = L_v f + L_v g, \qquad L_v(fg) = (L_v f)g + f\,L_v g,$$
$$L_v(w + x) = L_v w + L_v x, \qquad L_v(fw) = (L_v f)w + f\,L_v w, \qquad L_v L_w - L_w L_v = L_{[v,w]}.$$
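The operator identity L_vL_w − L_wL_v = L_{[v,w]} on functions, and the Jacobi identity, can both be verified symbolically; a sketch with my own example fields:

```python
# Symbolic check of L_v L_w - L_w L_v = L_{[v,w]} on a test function, and of
# the Jacobi identity, componentwise. A minimal sketch; the fields are my own.
import sympy as sp

x, y = sp.symbols('x y')
coords = (x, y)

def apply(vf, f):
    return sum(vf[i] * sp.diff(f, coords[i]) for i in range(2))

def bracket(v, w):
    return tuple(apply(v, w[k]) - apply(w, v[k]) for k in range(2))

v = (y, x ** 2)
w = (sp.exp(x), x * y)
u = (x + y, sp.cos(y))
f = x * sp.sin(y)

# L_v L_w f - L_w L_v f - L_{[v,w]} f = 0:
print(sp.simplify(apply(v, apply(w, f)) - apply(w, apply(v, f))
                  - apply(bracket(v, w), f)))               # 0

# Jacobi identity:
jac = [sp.simplify(bracket(v, bracket(w, u))[k]
                   + bracket(w, bracket(u, v))[k]
                   + bracket(u, bracket(v, w))[k]) for k in range(2)]
print(jac)                                                  # [0, 0]
```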
The contravariant tensor algebra

The contravariant tensor algebra of U is the (associative) tensor algebra, denoted T(U), generated by T⁰(U) = C∞(U) and by T¹(U), the space of smooth vector fields on U (note the change of notation here: the space T¹(U), by itself, was previously called T(U)), subject to the relations that multiplication of tensors is linear in each argument, over the ring C∞(U). So each tensor is a (finite) sum of words v = f v₁v₂…v_k (called a word of k letters), with f ∈ C∞(U) and vᵢ ∈ T¹(U), i = 1, 2, …, k, for k a non-negative integer; the product of the word v with the word w = g w₁w₂…w_m, with g ∈ C∞(U) and wᵢ ∈ T¹(U), i = 1, 2, …, m, for m a non-negative integer, is the (k+m)-letter word x = vw = h x₁x₂…x_{k+m}, where h = fg and xᵢ = vᵢ, for 1 ≤ i ≤ k, and xᵢ = w_{i−k}, for k+1 ≤ i ≤ k+m.

The subspace of T(U) spanned by words of k letters, for a fixed k, is denoted Tᵏ(U) and its elements are called tensors of type k. Then the product maps Tᵖ(U) × T^q(U) into T^{p+q}(U), for any non-negative integers p and q. In co-ordinates, every tensor τ of Tᵖ(U) can be written uniquely:

$$\tau = \sum_{i_1=1}^n \sum_{i_2=1}^n \cdots \sum_{i_p=1}^n \tau^{i_1 i_2 \dots i_p}(x)\, \partial_{i_1} \partial_{i_2} \cdots \partial_{i_p}.$$

Here each coefficient τ^{i₁i₂…iₚ}(x) is a smooth function on U. To avoid confusion with differential operators, it is traditional to introduce the notation ⊗ for the tensor product, so τ is then written:

$$\tau = \sum_{i_1=1}^n \sum_{i_2=1}^n \cdots \sum_{i_p=1}^n \tau^{i_1 i_2 \dots i_p}(x)\, \partial_{i_1} \otimes \partial_{i_2} \otimes \cdots \otimes \partial_{i_p}.$$

So the nᵖ words ∂_{i₁} ⊗ ∂_{i₂} ⊗ ⋯ ⊗ ∂_{iₚ}, with 1 ≤ iⱼ ≤ n for each j = 1, 2, …, p, form a basis for Tᵖ(U), over the ring C∞(U), called the co-ordinate basis.

The symmetric tensor algebra ST(U) is the algebra obtained from the tensor algebra by requiring that all letters commute, so the multiplication in this algebra is commutative, and indeed it is a standard commutative algebra of polynomials in n variables over C∞(U). So there is a surjective algebra homomorphism from T(U) to ST(U), which maps each word to itself. For clarity the symmetric tensor product is often represented by the notation ⊙. The image of the co-ordinate basis for Tᵖ(U) gives the following co-ordinate basis of words for the image STᵖ(U) of Tᵖ(U):

$$\partial_{i_1} \odot \partial_{i_2} \odot \cdots \odot \partial_{i_p}, \qquad 1 \le i_1 \le i_2 \le \cdots \le i_p \le n.$$

This basis has $\binom{n+p-1}{p}$ elements.

The skew-symmetric tensor algebra ΩT(U) is the algebra obtained from the tensor algebra by requiring that all letters representing vector fields anti-commute, so the multiplication in this algebra is graded-commutative: words with an even number of letters commute with everything, words with an odd number of letters anti-commute with each other. There is a surjective algebra homomorphism from T(U) to ΩT(U), which maps each word to itself. For clarity the skew-symmetric tensor product is often represented by the notation ∧. The image of the co-ordinate basis for Tᵖ(U) gives the following co-ordinate basis for the image ΩTᵖ(U) of Tᵖ(U):

$$\partial_{i_1} \wedge \partial_{i_2} \wedge \cdots \wedge \partial_{i_p}, \qquad 1 \le i_1 < i_2 < \cdots < i_p \le n.$$

This basis has $\binom{n}{p}$ elements. In particular, we have ΩTᵖ(U) = 0 if p > n, and the total dimension of ΩT(U) over C∞(U) is finite:

$$\sum_{p=0}^n \binom{n}{p} = 2^n.$$

If v is a vector field, the Lie derivative along v, L_v, extends naturally to T(U), such that the Lie derivative acts on vector fields and smooth functions as given above and, for any tensors τ and υ, any smooth vector fields v and w and any smooth function f on U, we have (with the same caveat for L_{fv} as before):

$$L_v(\tau + \upsilon) = L_v(\tau) + L_v(\upsilon), \qquad L_v(\tau\upsilon) = L_v(\tau)\,\upsilon + \tau\,L_v(\upsilon),$$
$$L_{v+w} = L_v + L_w, \qquad L_{fv} = f L_v, \qquad L_v L_w - L_w L_v = L_{[v,w]}.$$

Acting on the word w = g w₁w₂…w_k, we have:

$$L_v w = v(g)\,w_1 w_2 \dots w_k + g\,[v, w_1]w_2 \dots w_k + g\,w_1[v, w_2] \dots w_k + \dots + g\,w_1 w_2 \dots [v, w_k].$$

In particular, L_v preserves the type of a tensor. This action is compatible with the passage to the symmetric and skew-symmetric tensor algebras, so the Lie derivative also acts naturally on these algebras, obeying the same rules as above.

Finally note that there are natural collections of tensors that map isomorphically to each of the symmetric and skew-symmetric algebras.

• In the case of the symmetric algebra, it is the space spanned by all words of the form f wᵖ, for f smooth and w a vector field. More explicitly, there is an idempotent symmetrization map, which we describe as follows. Recall the permutation group S_n, defined for any given n ∈ ℕ as the group of bijections from ℕ_n = {1, 2, …, n} ⊂ ℕ to itself, with the group multiplication being composition. Then S_n has n! elements. Given a word w = f w₁w₂…wₚ in Tᵖ(U) as above and given a permutation σ in Sₚ, we define:

$$w_\sigma = f\, w_{\sigma(1)} w_{\sigma(2)} \dots w_{\sigma(p)}.$$

Note that if ρ is another permutation in Sₚ, we have:

$$(w_\sigma)_\rho = f\, w_{\rho(\sigma(1))} w_{\rho(\sigma(2))} \dots w_{\rho(\sigma(p))} = w_{\rho \circ \sigma}.$$

Then the assignment w → w_σ, for each word, extends naturally to give a homomorphism τ → τ_σ from Tᵖ(U) to itself, and we have (τ_σ)_ρ = τ_{ρ∘σ}, for any permutations ρ and σ in Sₚ. Then, given any tensor τ ∈ Tᵖ(U), we define:

$$S(\tau) = \frac{1}{p!}\sum_{\sigma \in S_p} \tau_\sigma.$$

This gives an endomorphism of Tᵖ(U). We prove that it is idempotent:

$$S^2(\tau) = S\left(\frac{1}{p!}\sum_{\sigma \in S_p} \tau_\sigma\right) = \frac{1}{(p!)^2}\sum_{\rho \in S_p}\sum_{\sigma \in S_p} (\tau_\sigma)_\rho = \frac{1}{(p!)^2}\sum_{\rho \in S_p}\sum_{\sigma \in S_p} \tau_{\rho \circ \sigma} = \frac{1}{(p!)^2}\sum_{\rho \in S_p}\sum_{\sigma \in S_p} \tau_\sigma = \frac{1}{p!}\sum_{\sigma \in S_p} \tau_\sigma = S(\tau).$$

Here we used the fact that, for each fixed permutation ρ in Sₚ, as σ runs once through all permutations in Sₚ, so does ρ∘σ. The image of S is then called the space of symmetric tensors. On this space we can define the symmetric tensor product by the formula A ⊙ B = S(AB). This makes the space of symmetric tensors into an algebra naturally isomorphic to the algebra ST(U).
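In co-ordinates, where a type-p tensor is just an array of components, the map S is the average over index permutations. A small numpy sketch (names my own) confirming idempotence:

```python
# The symmetrization map on component arrays:
# S(T)_{i_1...i_p} = (1/p!) sum_sigma T_{i_sigma(1)...i_sigma(p)},
# together with a check that S is idempotent. A minimal sketch; names my own.
from itertools import permutations

import numpy as np

def symmetrize(T):
    p = T.ndim
    perms = list(permutations(range(p)))
    return sum(np.transpose(T, axes=s) for s in perms) / len(perms)

rng = np.random.default_rng(0)
T = rng.standard_normal((3, 3, 3))        # a type-3 tensor in dimension 3
S1 = symmetrize(T)
print(np.allclose(symmetrize(S1), S1))    # True: S^2 = S
print(np.allclose(S1, np.transpose(S1, (1, 0, 2))))   # True: S(T) is symmetric
```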
• In the case of the skew-symmetric tensor algebra, we use the fact, easily proved by induction, that there is a group epimorphism s_n from the symmetric group S_n to ℤ₂, the multiplicative group of the two elements 1 and −1, with 1 the identity and (−1)² = 1. For a permutation σ, the quantity s_n(σ) is called its sign, and σ is called even if its sign is 1, and odd otherwise. A simple transposition is a permutation which fixes all but two elements of ℕ_n. Every permutation is a composition of simple transpositions, and σ is even if and only if there is an even number of terms in such a composition.
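The sign is easy to compute from the cycle decomposition, since a cycle of length L is a composition of L − 1 simple transpositions. A Python sketch (names my own):

```python
# The sign homomorphism s_n : S_n -> {1, -1}, via cycle decomposition.
# A minimal sketch; names my own.
def sign(perm):
    """perm is a tuple giving a bijection of {0, ..., n-1}; returns +1 or -1."""
    n = len(perm)
    seen = [False] * n
    s = 1
    for i in range(n):
        if not seen[i]:
            # a cycle of length L contributes L - 1 transpositions
            j, length = i, 0
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                length += 1
            if length % 2 == 0:
                s = -s
    return s

print(sign((1, 0, 2)), sign((1, 2, 0)), sign((0, 1, 2)))   # -1, 1, 1

# sign is a homomorphism: sign(p o q) = sign(p) * sign(q)
p, q = (2, 0, 3, 1), (1, 3, 0, 2)
pq = tuple(p[q[i]] for i in range(4))
print(sign(pq) == sign(p) * sign(q))                        # True
```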
Derivations of differential forms

The algebra of forms Ω(U) is the graded algebra over C∞(U) generated by the zero-forms C∞(U) and by the exact differential one-forms df, for f ∈ C∞(U), such that the map d : C∞(U) → Ω¹(U), f → df, is a derivation. Also we have (df)(dg) = −(dg)(df), for any f and g in C∞(U).

A derivation v of Ω(U) is by definition a map v : Ω(U) → Ω(U), such that:

• v kills constants: v(c) = 0, for any constant function c on U.
• v is an abelian group homomorphism: v(α + β) = v(α) + v(β), for any α and β in Ω(U).
• v obeys a Leibniz rule: for any integers k and l and any α and β in Ω(U), where α has degree k and β degree l, we have: v(αβ) = v(α)β + (−1)^{kl} v(β)α.

The operator d : C∞(U) → Ω¹(U) extends naturally and uniquely to a derivation of Ω(U), still called d, such that d² = 0. Then d is called the exterior derivative. Denote by D(U) the space of all derivations of Ω(U). Note that if v is in D(U), then so is γv, the multiplication of v on the left by any element γ of Ω(U), so D(U) is naturally an Ω(U)-module. A derivation v is said to be local if and only if it annihilates Ω⁰(U). Define the X*-valued derivation δ of Ω(U) by the formula, valid for any smooth function f:

$$\delta(f) = 0, \qquad \delta(df) = f'.$$

Then δ is a derivation of degree minus one; also every local derivation V of Ω(U) has the unique expression V = v.δ. Let V be any derivation of Ω(U). Then its action on C∞(U) is that of a derivation of C∞(U), so it may be given by the formula V(f) = v.f′. Here v is a unique smooth form on U, with values in X. Put W = v.δ, so W is a local derivation. Then we have, since W annihilates functions:

$$(V - [W, d])(f) = v.f' - v.\delta(df) = 0.$$

So we have the unique decomposition of any derivation:

$$V = [W, d] + T,$$

with T and W local derivations.

Notes on the theorem of Archimedes, Newton, Green, Stokes, Ostrogradsky, Poincaré, de Rham, etc.

The theorem is roughly stated as follows:

• Let R be an oriented smooth n-manifold with boundary ∂R, at least piecewise smooth, where ∂R acquires the induced orientation from that of R. Let ω be an (n−1)-form, smoothly defined on R and smooth up to and including the boundary ∂R. Then we have:

$$\int_R d\omega = \int_{\partial R} \omega.$$

For us these manifolds will always be embedded inside some Euclidean space. A fairly simple case and, as we shall see, essentially the base case for a reverse induction proof of the general case, is when R is a bounded region in ℝⁿ, with a smooth boundary ∂R given by a single equation f(x) = 0, where f is a smooth function on ℝⁿ, such that R is the region given by the inequality f(x) ≤ 0 and on the boundary the differential df ≠ 0 (by the rank theorem this guarantees that the boundary is a smooth manifold). Also x = (x¹, x², …, xⁿ), where xʲ ∈ ℝ, for each j = 1, 2, …, n. The gradient vector ∇f is then non-vanishing on ∂R and is everywhere outward pointing. Denote by N the unit outward-pointing normal vector, so the unit vector in the direction of ∇f; then we have the relations:

$$\nabla f = N\sqrt{(\nabla f).(\nabla f)}, \qquad N.N = 1.$$

In the following we will often have occasion to divide by a partial derivative of f with respect to a co-ordinate: one or more of these derivatives may vanish at a point, but there is always at least one that is non-vanishing, so the relevant formula will be understood to apply only to the non-vanishing ones.

Introduce the volume n-form:

$$\Sigma = dx^1 dx^2 \dots dx^n.$$

Note (though we will not really need this) that we can write this invariantly as:

$$\theta^{a_1}\theta^{a_2}\dots\theta^{a_n} = \Sigma\,\epsilon^{a_1 a_2 \dots a_n}.$$

Here θᵃ gives the canonical isomorphism between dual vectors and one-forms: α_a → α_aθᵃ. Also the θᵃ anti-commute: θᵃθᵇ = −θᵇθᵃ. Then ε^{a₁a₂…aₙ} is the orientation tensor and is totally skew. Dual to θᵇ is the derivation δ_a of degree minus one, which annihilates functions and obeys δ_aθᵇ = δ_aᵇ. More explicitly, the components of δ_a give n derivations, each of degree minus one, δ = (δ₁, δ₂, …, δ_n), where δᵢ(dxʲ) = δᵢʲ. Then put:

$$\vec{\Sigma} = \delta\Sigma = (\delta_1\Sigma, \delta_2\Sigma, \dots, \delta_n\Sigma) = \left(dx^2 dx^3 \dots dx^n,\ -dx^1 dx^3 dx^4 \dots dx^n,\ dx^1 dx^2 dx^4 \dots dx^n,\ \dots,\ (-1)^{n-1}\, dx^1 dx^2 \dots dx^{n-1}\right).$$

For example:

• When n = 1, we have Σ⃗ = (1).
• When n = 2, with Cartesian co-ordinates (x, y), we have: Σ⃗ = (dy, −dx).
• When n = 3, with Cartesian co-ordinates (x, y, z), we have: Σ⃗ = (dy dz, dz dx, dx dy).
• When n = 4, with Cartesian co-ordinates (t, x, y, z), we have: Σ⃗ = (dx dy dz, −dt dy dz, −dt dz dx, −dt dx dy).

Notice that we have the identity:

$$\theta^a\,\vec{\Sigma}_b = \delta^a_b\,\Sigma.$$

Proof: Using the derivation property of δ_b, we have:

$$\theta^a\,\vec{\Sigma}_b = \theta^a\,\delta_b\Sigma = -\delta_b(\theta^a\Sigma) + (\delta_b\theta^a)\Sigma = \delta^a_b\,\Sigma.$$

Here we used also that θᵃΣ = 0, which is true because the left-hand side is an (n+1)-form, which automatically vanishes in n dimensions. For example, we have in the case n = 3:

$$dx\,\vec{\Sigma} = dx\,(dy\,dz,\ dz\,dx,\ dx\,dy) = (\Sigma, 0, 0), \qquad dy\,\vec{\Sigma} = (0, \Sigma, 0), \qquad dz\,\vec{\Sigma} = (0, 0, \Sigma).$$

Now we may write the smooth (n−1)-form ω in terms of the basis of (n−1)-forms given by Σ⃗ as ω = v.Σ⃗ = Σᵢ₌₁ⁿ vⁱΣ⃗ᵢ, where v = Σᵢ₌₁ⁿ vⁱ∂ᵢ is a smooth vector field on R. Then, since dΣ⃗ᵢ = 0, dxⁱΣ⃗ⱼ = δⱼⁱΣ and d = Σᵢ₌₁ⁿ dxⁱ∂ᵢ, we have:

$$d\omega = d\left(\sum_{i=1}^n v^i\,\vec{\Sigma}_i\right) = \sum_{i=1}^n (dv^i)\,\vec{\Sigma}_i = \sum_{i,j=1}^n dx^j(\partial_j v^i)\,\vec{\Sigma}_i = \sum_{i,j=1}^n \delta^j_i(\partial_j v^i)\,\Sigma = (\nabla.v)\,\Sigma.$$

So Stokes' Theorem can be rewritten as the divergence theorem:

$$\int_{\partial R} v.\vec{\Sigma} = \int_R (\nabla.v)\,\Sigma.$$
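As a concrete check of this statement, take v = (x, y, z) on the unit ball in three dimensions: ∇.v = 3, and on the unit sphere v.N = 1. A sympy sketch in the spherical co-ordinates (s, p, u) used earlier (the computation is my own example):

```python
# Divergence theorem sanity check on the unit ball for v = (x, y, z):
# the volume integral of div v = 3 is 4*pi, and the flux of v through the
# unit sphere, where v.N = 1, is the sphere area, also 4*pi.
# A minimal sketch using the notes' spherical co-ordinates (s, p, u).
import sympy as sp

s, p, u = sp.symbols('s p u', positive=True)

# Volume integral, with volume element s^2 sin(p) ds dp du:
vol = sp.integrate(3 * s**2 * sp.sin(p), (s, 0, 1), (p, 0, sp.pi), (u, -sp.pi, sp.pi))

# Flux through the sphere s = 1, with area element sin(p) dp du:
flux = sp.integrate(sp.sin(p), (p, 0, sp.pi), (u, -sp.pi, sp.pi))

print(vol, flux)   # both 4*pi
```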
Now, restricted to the space ∂R, the (n−1)-forms must become linearly dependent, since the space of (n−1)-forms in n−1 dimensions is one-dimensional. In terms of the differentials of the co-ordinates, we have the non-trivial linear relation, on restriction to the surface f = 0:

$$0 = df = (\nabla f).\theta = f_1\,dx^1 + f_2\,dx^2 + \dots + f_n\,dx^n.$$

This allows us to reduce the space of differentials to a space of dimension n − 1. Indeed, consider the (n−1)-form Σ⃗ᵢ at a point where the differential df is non-zero. Then we can take the differential df as the first element of a basis of forms at that point, so we have the expression:

$$\vec{\Sigma}_i = \lambda_i\,\beta + (df)\mu_i,$$

for some (n−1)-form β (the product of the other forms of the basis), some (n−2)-forms μᵢ and some scalars λᵢ. Multiplying this equation by df we get the relation:

$$(df)\,\vec{\Sigma}_i = \lambda_i\,(df)\beta = \sum_{j=1}^n f_j\,dx^j\,\vec{\Sigma}_i = f_i\,\Sigma.$$

But the form Σ ≠ 0, so at least one of the fᵢΣ is non-zero, so the form (df)β cannot be zero either, and we get λᵢ = kfᵢ = mNᵢ, for some non-zero scalars k and m. Put α = mβ, giving the relation:

$$\vec{\Sigma}_i = N_i\,\alpha + (df)\mu_i.$$

In particular, on restriction to the hypersurface f = 0, since there df = 0, we get the relation, for some (n−1)-form α:

$$\vec{\Sigma} = N\alpha.$$

Taking the dot product of this equation with the unit vector N, we get α = N.Σ⃗. Accordingly, put α = N.Σ⃗. Then, restricted to the surface f = 0, we have the relation:

$$\vec{\Sigma} = N\alpha = \frac{(\nabla f)\,\alpha}{\sqrt{(\nabla f).(\nabla f)}}.$$

This gives the relations, valid on restriction to the boundary surface f = 0:

$$\alpha = \frac{\vec{\Sigma}_i\sqrt{(\nabla f).(\nabla f)}}{\partial_i f}, \qquad \vec{\Sigma}_i = \frac{(\partial_i f)\,(\nabla f).\vec{\Sigma}}{(\nabla f).(\nabla f)}.$$

So, for example, in two dimensions on f = 0, we have:

$$\alpha = N.\vec{\Sigma} = \frac{(f_x, f_y).(dy, -dx)}{\sqrt{f_x^2 + f_y^2}} = \frac{f_x\,dy - f_y\,dx}{\sqrt{f_x^2 + f_y^2}} = \frac{dy\sqrt{f_x^2 + f_y^2}}{f_x} = -\frac{dx\sqrt{f_x^2 + f_y^2}}{f_y}.$$

Here we used that on the curve f = 0 we have df = 0, so we have:

$$f_x\,dx + f_y\,dy = 0, \qquad \frac{dx}{f_y} = -\frac{dy}{f_x}, \qquad \frac{dy}{dx} = -\frac{f_x}{f_y}, \qquad \frac{dx}{dy} = -\frac{f_y}{f_x}.$$

In particular, the measures ds = |dx|√(fₓ² + f_y²)/|f_y| and ds = |dy|√(fₓ² + f_y²)/|fₓ| are equal, and each gives the standard measure of length along the curve:

$$ds^2 = dx^2 + dy^2 = dx^2\left(1 + \left(\frac{dy}{dx}\right)^2\right) = dx^2\left(1 + \frac{f_x^2}{f_y^2}\right) = \frac{dx^2}{f_y^2}\left(f_x^2 + f_y^2\right) = \frac{dy^2}{f_x^2}\left(f_x^2 + f_y^2\right), \qquad |\alpha| = ds.$$

Now consider the proof of the divergence theorem in two dimensions, where we put v = [P, Q]:

$$\int_{\partial R} v.\vec{\Sigma} = \int_R (\nabla.v)\,\Sigma, \qquad \int_{\partial R} [P, Q].[dy, -dx] = \int_R (P_x + Q_y)\,dxdy, \qquad \int_{\partial R} (P\,dy - Q\,dx) = \int_R (P_x + Q_y)\,dxdy.$$

This is linear over the reals in the functions P and Q, so to prove the theorem it suffices to prove the following two relations:

$$\int_{\partial R} P\,dy = \int_R P_x\,dxdy, \qquad -\int_{\partial R} Q\,dx = \int_R Q_y\,dxdy.$$

Here the two-dimensional integrals are properly oriented, so they reduce to ordinary double integrals. We will prove the theorem for a simple region R, simple in the sense that it can be parametrized easily, covering the region in two ways:

• For c ≤ y ≤ d, we have a(y) ≤ x ≤ b(y).
• For a ≤ x ≤ b, we have c(x) ≤ y ≤ d(x).

[Sketch: a simple region R whose boundary ∂R is traced counterclockwise, with left and right boundary curves x = a(y) and x = b(y) for c ≤ y ≤ d, and lower and upper boundary curves y = c(x) and y = d(x) for a ≤ x ≤ b.]

Then, using the first parametrization, we have:

$$\int_R P_x\,dxdy = \int_c^d \int_{a(y)}^{b(y)} P_x\,dx\,dy = \int_c^d \big[P(x, y)\big]_{a(y)}^{b(y)}\,dy = \int_c^d P(b(y), y)\,dy + \int_d^c P(a(y), y)\,dy = \int_{\partial R} P(x, y)\,dy.$$

The point here is that the right-hand part of the boundary ∂R is parametrized by x = b(y), for y from c to d, whereas the left-hand part is parametrized by x = a(y), for y from d to c. Next, using the second parametrization, we have:

$$\int_R Q_y\,dxdy = \int_a^b \int_{c(x)}^{d(x)} Q_y\,dy\,dx = \int_a^b \big[Q(x, y)\big]_{c(x)}^{d(x)}\,dx = -\int_a^b Q(x, c(x))\,dx - \int_b^a Q(x, d(x))\,dx = -\int_{\partial R} Q(x, y)\,dx.$$

The point here is that the lower part of the boundary ∂R is parametrized by y = c(x), for x from a to b, whereas the upper part is parametrized by y = d(x), for x from b to a. So the required result is proved.

Example

By this theorem, the area A of the region R can be given by any of the formulas:

$$A = \int_R dxdy = -\int_{\partial R} y\,dx = \int_{\partial R} x\,dy = \frac{1}{2}\int_{\partial R} (x\,dy - y\,dx).$$

In polar co-ordinates, we have:

$$x = r\cos(\theta), \quad dx = \cos(\theta)\,dr - \sin(\theta)\,r\,d\theta, \qquad y = r\sin(\theta), \quad dy = \sin(\theta)\,dr + \cos(\theta)\,r\,d\theta,$$
$$x\,dy - y\,dx = r\cos(\theta)\big(\sin(\theta)\,dr + \cos(\theta)\,r\,d\theta\big) - r\sin(\theta)\big(\cos(\theta)\,dr - \sin(\theta)\,r\,d\theta\big) = r^2\,d\theta,$$
$$dxdy = \frac{1}{2}\,d(x\,dy - y\,dx) = \frac{1}{2}\,d(r^2\,d\theta) = r\,drd\theta.$$

So the area is also given by the formulas:

$$A = \int_R r\,drd\theta = \frac{1}{2}\int_{\partial R} r^2\,d\theta.$$
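A quick symbolic check of the boundary area formula on an ellipse x = a cos(θ), y = b sin(θ), whose area is πab (my own example):

```python
# Checking A = (1/2) * oint (x dy - y dx) on an ellipse: the integrand
# x y' - y x' equals a*b, so the integral is pi*a*b. A minimal sketch.
import sympy as sp

th = sp.symbols('theta')
a, b = sp.symbols('a b', positive=True)
x, y = a * sp.cos(th), b * sp.sin(th)

A = sp.Rational(1, 2) * sp.integrate(
    x * sp.diff(y, th) - y * sp.diff(x, th), (th, 0, 2 * sp.pi))
print(sp.simplify(A))   # pi*a*b
```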
The proof of the divergence theorem in all dimensions is basically the same. In (n+1) dimensions, let a simple region be one whose projection to each co-ordinate hyperplane is itself a simple region in n dimensions. In particular, if we write the (n+1) co-ordinates as (t, x), where x = (x¹, x², …, xⁿ), then, by the linearity of Stokes' Theorem in the vector field v, it suffices to take the vector field v to have only one non-zero component, denoted w, in the t-direction, in which case ∇.v = wₜ, so we just need to prove:

$$\int_R w_t\,dt\,d^n x = \int_{\partial R} v.N\,d^n S.$$

Here N is the outward-pointing normal and dⁿS is the hyper-surface volume element. The point here is that once we have proved this case, the other cases, where v has just one non-zero component in another co-ordinate direction, follow by appropriate permutation of the co-ordinates; then the general case, where v is arbitrary, is obtained by summation of these special cases, using linearity over the reals.

Now the region R is described by letting t range from t = a(x) to t = b(x), as x ranges over S, the projection of the region R onto the hyperplane t = 0. By the Fundamental Theorem of Calculus, the integral on the left-hand side is:

$$\int_{x \in S}\int_{a(x)}^{b(x)} w_t\,dt\,d^n x = \int_{x \in S} [w]_{a(x)}^{b(x)}\,d^n x = \int_{x \in S} \big(w(b(x), x) - w(a(x), x)\big)\,d^n x.$$

On the upper part of the region, given by the equation f(t, x) = t − b(x) = 0, the outward-pointing (non-normalized) gradient vector is the vector:

$$\nabla(t - b(x)) = (1, -\nabla b).$$

On the lower part of the region, given by the equation f(t, x) = a(x) − t = 0, the outward-pointing (non-normalized) gradient vector is the vector:

$$\nabla(a(x) - t) = (-1, \nabla a).$$

So, on the upper boundary, we have, for the surface element dⁿS, in terms of the co-ordinates x¹, x², …, xⁿ:

$$d^n S = \frac{\sqrt{f_t^2 + (\nabla b).(\nabla b)}}{|f_t|}\,d^n x = \sqrt{1 + (\nabla b).(\nabla b)}\,d^n x.$$

Similarly, on the lower boundary, we have:

$$d^n S = \sqrt{1 + (\nabla a).(\nabla a)}\,d^n x.$$

Also, for the integrand v.N, we have the expression, since v = (w, 0, 0, …, 0), on the upper boundary, where t = b(x):

$$v.N = \frac{w(b(x), x)}{\sqrt{1 + (\nabla b).(\nabla b)}}, \qquad v.N\,d^n S = w(b(x), x)\,d^n x.$$

On the lower boundary, where t = a(x), we have instead:

$$v.N = -\frac{w(a(x), x)}{\sqrt{1 + (\nabla a).(\nabla a)}}, \qquad v.N\,d^n S = -w(a(x), x)\,d^n x.$$

So we get:

$$\int_{\partial R} v.N\,d^n S = \int_{t=b(x),\,x \in S} v.N\,d^n S + \int_{t=a(x),\,x \in S} v.N\,d^n S = \int_{x \in S} w(b(x), x)\,d^n x - \int_{x \in S} w(a(x), x)\,d^n x = \int_{x \in S} \big(w(b(x), x) - w(a(x), x)\big)\,d^n x.$$

This agrees with the left-hand side of Stokes' Theorem, calculated above, so we are done: the divergence theorem holds.

Next we consider the case where the manifold of integration lies in a higher-dimensional space. Stokes' Theorem is additive in the regions under consideration, so, using the rank theorem, we may take for the integration region R the space (t(x), x), as x varies over a simple region S in ℝⁿ, with the boundary ∂R given by the same formula (t(x), x), for x ∈ ∂S. Here the map t is a smooth map from S to ℝᵐ, for some m, the total dimension of the space inside which the manifold R lies being m + n. Then the pull-back formula for integration tells us that:

$$\int_R d\omega = \int_S p^*(d\omega) = \int_S d(p^*(\omega)),$$

where p : S → R is the bijection x → p(x) = (t(x), x). But applied to the boundary, we have also the pull-back formula:

$$\int_{\partial R} \omega = \int_{\partial S} p^*(\omega).$$

By the divergence theorem, proved above, we have:

$$\int_{\partial S} p^*(\omega) = \int_S d(p^*(\omega)).$$

So Stokes' Theorem follows. An alternative and perhaps instructive way of thinking of this case, without using the pull-back argument, is to restrict to the case that the boundary of S is piecewise flat.
Then we form a closed domain, whose (curved) "roof" is the region R, whose flat "floor" is S, and whose remaining boundary pieces each lie in a hyperplane, so they form (flat) vertical "walls", with bases the edges of the boundary of S and tops the corresponding curved pieces of the boundary of R. Then we proceed by reverse induction on the dimension n: we assume that Stokes' theorem holds in all higher dimensions, the base case being the highest possible dimension, where the theorem becomes the divergence theorem, which we have already proved.

Denote by T the "room" with "ceiling" R, "floor" S and a finite number of "walls", the boundary regions Wᵢ, i = 1, 2, …, k, each with outward-pointing normal, above the boundary pieces Sᵢ, i = 1, 2, …, k of S. These various regions form the boundary of the room T. Specifically, in the sense of integration, where we orient both S and R upwards, we have:

$$\partial T = R - S + \sum_{j=1}^k W_j, \qquad 0 = \partial^2 T = \partial R - \partial S + \sum_{j=1}^k \partial W_j, \qquad \partial R = \partial S - \sum_{j=1}^k \partial W_j.$$

The minus sign in front of S in the first equation is there because the orientation chosen for S points inside the region T. Then we have, by the induction hypothesis, for any form α defined on T:

$$\int_T d\alpha = \int_{\partial T} \alpha = \int_R \alpha - \int_S \alpha + \sum_{j=1}^k \int_{W_j} \alpha.$$

So now take the case that α = dω. Then dα = 0, so we get, using the inductive hypothesis:

$$0 = \int_R d\omega - \int_S d\omega + \sum_{j=1}^k \int_{W_j} d\omega, \qquad \int_R d\omega = \int_S d\omega - \sum_{j=1}^k \int_{W_j} d\omega.$$

But each of the regions S and Wⱼ, j = 1, 2, …, k, is an ordinary flat region, so the usual divergence theorem applies to each of them, and so Stokes' theorem holds for them, giving:

$$\int_R d\omega = \int_{\partial S} \omega - \sum_{j=1}^k \int_{\partial W_j} \omega = \int_{\partial S - \sum_{j=1}^k \partial W_j} \omega = \int_{\partial R} \omega.$$

So by reverse induction on dimension, the required theorem holds.

Notice that here we are exploiting the idea that d on forms and ∂ on regions are dual to each other: dual to the idea that d² = 0 is the idea that ∂² = 0, or "the boundary of a boundary is empty". This idea is codified in the theory of singular homology. Summarizing, if we represent the integral in the style of Paul Dirac, with ⟨R|ω⟩ denoting the integral of ω over the region R, then Stokes' Theorem says that the adjoint of d is ∂:

$$\langle \partial R|\omega\rangle = \langle R|d\omega\rangle.$$

It follows that the adjoint of d² is ∂²:

$$\langle \partial^2 R|\omega\rangle = \langle \partial(\partial R)|\omega\rangle = \langle \partial R|d\omega\rangle = \langle R|d(d\omega)\rangle = \langle R|d^2\omega\rangle.$$

So perhaps it is not surprising that the fact that d² = 0 is related to the fact that ∂² = 0. Finally, we can build up Stokes' Theorem for more complicated regions, by splitting them appropriately into simple regions, using the additivity of both sides of Stokes' Theorem under the decomposition of regions.

Special cases of Stokes' Theorem

In one dimension, the theorem is just the Fundamental Theorem of Calculus:

$$\int_a^b df = f(b) - f(a).$$

The same theorem is true in any higher dimension, where f is a function and the integration region is a path R connecting its boundary points ∂R = B − A (the minus sign coming from the orientation):

$$\int_R df = \int_{\partial R} f = f(B) - f(A).$$

In two dimensions, we may write the theorem in divergence form, with v = [P, Q] and ∇.v = Pₓ + Q_y:

$$\int_{\partial R} (P\,dy - Q\,dx) = \int_{\partial R} [P, Q].N\,ds = \int_R (\nabla.v)\,dxdy = \int_R (P_x + Q_y)\,dxdy.$$

Alternatively, we may replace P by Q and Q by −P in this formula, giving the more natural-looking:

$$\int_{\partial R} (P\,dx + Q\,dy) = \int_{\partial R} v.T\,ds = \int_R (\nabla \times v)\,dxdy = \int_R (Q_x - P_y)\,dxdy.$$

Here v = [P, Q], ∇×v = Qₓ − P_y and T is the unit tangent vector to the curve ∂R; note that T ds = [dx, dy], so v.T ds = P dx + Q dy.
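A quick check of the circulation form on the unit disk, with v = [P, Q] = [−y, x], so that Qₓ − P_y = 2 and both sides equal 2π (my own example):

```python
# Green's theorem in circulation form on the unit disk for v = [-y, x]:
# oint (P dx + Q dy) over the circle and the double integral of Q_x - P_y = 2
# over the disk should agree. A minimal sketch; the field is my own example.
import sympy as sp

th, r = sp.symbols('theta r', positive=True)

# Boundary term: on x = cos(th), y = sin(th), P dx + Q dy = -y dx + x dy
x, y = sp.cos(th), sp.sin(th)
boundary = sp.integrate(
    -y * sp.diff(x, th) + x * sp.diff(y, th), (th, 0, 2 * sp.pi))

# Interior term: integral of 2 over the disk, in polar co-ordinates
interior = sp.integrate(2 * r, (r, 0, 1), (th, 0, 2 * sp.pi))

print(boundary, interior)   # 2*pi, 2*pi
```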
In three dimensions, we have the divergence theorem, when R is three-dimensional:

$$\int_R (\nabla.v)\,dxdydz = \int_R (P_x + Q_y + R_z)\,dxdydz = \int_{\partial R} v.\vec{\Sigma} = \int_{\partial R} (P\,dydz + Q\,dzdx + R\,dxdy) = \int_{\partial R} v.N\,d^2S.$$

Here v = [P, Q, R] and Σ⃗ = [dy dz, dz dx, dx dy] = N d²S.

The lower-dimensional Stokes' Theorem reads, for the case that R is two-dimensional:

$$\int_{\partial R} (P\,dx + Q\,dy + R\,dz) = \int_R \big((R_y - Q_z)\,dydz + (P_z - R_x)\,dzdx + (Q_x - P_y)\,dxdy\big),$$
$$\int_{\partial R} v.dx = \int_{\partial R} v.T\,ds = \int_R (\nabla \times v).\vec{\Sigma} = \int_R [N, \nabla, v]\,d^2S.$$

Here the notation [A, B, C] denotes the vector triple product, the determinant of its vector arguments A, B and C, so [A, B, C] = A.(B × C) = (A × B).C, so [N, ∇, v] = N.(∇ × v) = (N × ∇).v. Also T is the unit tangent vector in the direction of the curve, so we have T ds = dx = [dx, dy, dz] and N d²S = [dy dz, dz dx, dx dy], as usual.

In four dimensions, with co-ordinates xᵃ = (t, x, y, z) = (t, x), where x⁰ = t, x¹ = x, x² = y and x³ = z, there are two dual ways of writing a two-form:

• First we just write it out as a polynomial in the dxᵃ:

$$F = \frac{1}{2}\sum_{a=0}^3\sum_{b=0}^3 F_{ab}\,dx^a dx^b = dt\,E + B,$$
$$E = \mathbf{E}.dx = F_{01}\,dx + F_{02}\,dy + F_{03}\,dz, \qquad F_{01} = E^1, \quad F_{02} = E^2, \quad F_{03} = E^3,$$
$$B = \mathbf{B}.\vec{\Sigma} = B_1\,dx^2dx^3 + B_2\,dx^3dx^1 + B_3\,dx^1dx^2 = B_1\,dydz + B_2\,dzdx + B_3\,dxdy.$$

• Dually, we write:

$$F^* = \frac{1}{2}\sum_{a=0}^3\sum_{b=0}^3 F^{ab}\,\delta_a\delta_b\,\Omega = \frac{1}{2}\sum_{a=0}^3\sum_{b=0}^3 F^{ab}\,\Sigma_{ab}.$$

Here, if we are in space-time, [F⁰¹, F⁰², F⁰³] = −E and [F²³, F³¹, F¹²] = B. Also Ω = dt dx dy dz. Then we have:

$$\Sigma_{0i} = \delta_0\delta_i\,\Omega = -\delta_i(dxdydz) = -\vec{\Sigma}_i,$$
$$\Sigma_{23} = \delta_2\delta_3\,\Omega = -dtdx, \qquad \Sigma_{31} = \delta_3\delta_1\,\Omega = -dtdy, \qquad \Sigma_{12} = \delta_1\delta_2\,\Omega = -dtdz,$$
$$F^* = -dt\,\mathbf{B}.dx + \mathbf{E}.\vec{\Sigma}.$$

So the effect of going from F to F* is the same as replacing (E, B) by (−B, E). The free electromagnetic field equations of James Clerk Maxwell are then:

$$0 = dF, \qquad 0 = dF^*.$$

Here the electromagnetic two-form F in four dimensions encodes both the electric field E and the magnetic field B. Write d = dt ∂ₜ + ∂, where ∂ = dx ∂ₓ + dy ∂_y + dz ∂_z = dx.∇. Then we have, with Σ = dx dy dz and Σ⃗ = [dy dz, dz dx, dx dy] as usual:

$$dF = d(dt\,E + \mathbf{B}.\vec{\Sigma}) = dt\big(-(\nabla \times \mathbf{E}) + \partial_t\mathbf{B}\big).\vec{\Sigma} + (\nabla.\mathbf{B})\,\Sigma,$$
$$dF^* = dt\big((\nabla \times \mathbf{B}) + \partial_t\mathbf{E}\big).\vec{\Sigma} + (\nabla.\mathbf{E})\,\Sigma.$$

The free Maxwell equations are then:

$$0 = dF, \quad 0 = dF^*: \qquad 0 = \nabla.\mathbf{B}, \quad 0 = \nabla.\mathbf{E}, \quad \partial_t\mathbf{B} = \nabla \times \mathbf{E}, \quad \partial_t\mathbf{E} = -\nabla \times \mathbf{B}.$$

Recall the derivative identity:

$$\nabla \times (\nabla \times u) = \nabla(\nabla.u) - \Delta u.$$

Here u is a vector field and ∆ = ∇.∇ = ∂ₓ² + ∂_y² + ∂_z² is called the three-dimensional Laplacian. Then we get, using the free Maxwell equations:

$$\partial_t^2\mathbf{B} = \nabla \times \partial_t\mathbf{E} = -\nabla \times (\nabla \times \mathbf{B}) = \Delta\mathbf{B}, \qquad \partial_t^2\mathbf{E} = -\nabla \times \partial_t\mathbf{B} = -\nabla \times (\nabla \times \mathbf{E}) = \Delta\mathbf{E},$$
$$\Box\mathbf{B} = 0, \qquad \Box\mathbf{E} = 0.$$

Here □ = ∂ₜ² − ∆ is the wave operator. The fact that the free Maxwell fields obey the wave equation in four dimensions is one of the cornerstones of Albert Einstein's theory of relativity. The four-dimensional perspective is due to Hermann Minkowski.
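The curl-of-curl identity used in this derivation can be verified symbolically on a concrete field (the test field below is my own):

```python
# Symbolic check of curl(curl u) = grad(div u) - Laplacian(u), the identity
# used above to derive the wave equation. A minimal sketch; u is my own example.
import sympy as sp

x, y, z = sp.symbols('x y z')
u = sp.Matrix([x**2 * y, sp.sin(z) * x, y * z**3])

def grad(f):  return sp.Matrix([sp.diff(f, c) for c in (x, y, z)])
def div(v):   return sum(sp.diff(v[i], c) for i, c in enumerate((x, y, z)))
def curl(v):
    return sp.Matrix([sp.diff(v[2], y) - sp.diff(v[1], z),
                      sp.diff(v[0], z) - sp.diff(v[2], x),
                      sp.diff(v[1], x) - sp.diff(v[0], y)])
def lap(v):
    return sp.Matrix([sum(sp.diff(v[i], c, 2) for c in (x, y, z))
                      for i in range(3)])

print(sp.simplify(curl(curl(u)) - (grad(div(u)) - lap(u))))   # zero vector
```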
Maxwell's equations in the presence of currents are the following:

$$dF = 0, \qquad dF^* = J.$$

Here J is the current three-form, which can be written J = [φ, j].Ω⃗, where we have:

$$\vec{\Omega} = [\delta_0, \delta_1, \delta_2, \delta_3]\,\Omega = [dxdydz,\ -dtdydz,\ -dtdzdx,\ -dtdxdy].$$

So we have:

$$J = \varphi\,\Sigma - dt\,\mathbf{j}.\vec{\Sigma}.$$

The equation of current conservation is the equation dJ = 0, which follows by applying d to the equation dF* = J:

$$0 = dJ = (dt\,\partial_t + \partial)\big(\varphi\,dxdydz - dt\,\mathbf{j}.\vec{\Sigma}\big) = (\partial_t\varphi + \nabla.\mathbf{j})\,dtdxdydz, \qquad \partial_t\varphi + \nabla.\mathbf{j} = 0.$$

The Maxwell equations now read:

$$dF = 0: \qquad 0 = \nabla.\mathbf{B}, \qquad \partial_t\mathbf{B} = \nabla \times \mathbf{E};$$
$$J = dF^*: \qquad \varphi\,\Sigma - dt\,\mathbf{j}.\vec{\Sigma} = dt\big((\nabla \times \mathbf{B}) + \partial_t\mathbf{E}\big).\vec{\Sigma} + (\nabla.\mathbf{E})\,\Sigma, \qquad \varphi = \nabla.\mathbf{E}, \qquad -\mathbf{j} = \partial_t\mathbf{E} + \nabla \times \mathbf{B}.$$

Now we have:

$$\partial_t^2\mathbf{E} = -\partial_t\mathbf{j} - \nabla \times \partial_t\mathbf{B} = -\partial_t\mathbf{j} - \nabla \times (\nabla \times \mathbf{E}) = \Delta\mathbf{E} - \partial_t\mathbf{j} - \nabla(\nabla.\mathbf{E}), \qquad \Box\mathbf{E} = -\partial_t\mathbf{j} - \nabla\varphi,$$
$$\partial_t^2\mathbf{B} = \nabla \times \partial_t\mathbf{E} = -\nabla \times \mathbf{j} - \nabla \times (\nabla \times \mathbf{B}) = \Delta\mathbf{B} - \nabla \times \mathbf{j}, \qquad \Box\mathbf{B} = -\nabla \times \mathbf{j}.$$

So the electric and magnetic fields obey the wave equation, with sources determined by the current.