Tensors, Vectors, and Linear Forms

Michael Griffith

May 9, 2014

Abstract

The metric tensor and its implications for basic linear algebra and calculus are introduced and discussed. A generalized notion of vectors is introduced to distinguish the algebraic structure from the physical object. Linear forms and their vector structure are developed along the way.

Contents

1 Introduction
2 Vectors
3 Linear Forms
4 Linear Operators
5 Inner Product and Norm
6 Tensors
7 The Metric Tensor
Bibliography

1 Introduction

This capstone project was an exercise in dissociating intuition from rigorous definition. The basic concepts of linear algebra, the vector and vector space behavior, had to be completely separated from the author's physicist understanding. The construction of tensors, and in particular the metric tensor, was the ultimate goal for the paper.

Over the course of the year-long capstone course, most of the material was internalized by reading the same topic in multiple sources. For this reason, it is difficult to cite sources in the text, so the author would like to make note of them here. The primary text used was Elementary Differential Geometry [O'Neill(1966)], which provided most of the conceptual framework for the paper. Other sources were used to supplement the mechanical understanding of the material, and include texts in introductory relativity [Schutz(2009)], Riemannian geometry [Morgan(1993)], calculus on manifolds [Spivak(1965)], and tensors and vectors [Fleisch(2011)].
2 Vectors

It is common practice in physics and introductory linear algebra courses to identify the general notion of a vector with the physical example. A physical vector is, more or less, a coordinate. We see such objects represented by a drawn arrow so often that it is easy to conflate the example with the concept. To think of a vector as simply a directed quantity, a coordinate, or an arrow hides the more general underlying structure involved.

If we wish to understand what a vector is more abstractly, we should start with what we expect from simple examples. Any grand scheme we develop should still look more or less like what we are used to. Suppose we have a set V with some elements \vec{v}_i; then we are interested in determining exactly how those elements must behave in order for V to be reasonably dubbed a vector space.

Fundamentally, addition of two vectors must yield another vector. For each vector, there must exist another whose action is its exact negation. The addition of these two vectors must be a vector, therefore we must have a resultant zero vector whose action is entirely nil. Thus we require that V be closed under addition and contain an additive identity, as well as inverses.

1) \forall \vec{v}_i, \vec{v}_j \in V, \quad \vec{v}_i + \vec{v}_j \in V
2) \exists \vec{0} \in V \mid \forall \vec{v}_i \in V, \quad \vec{v}_i + \vec{0} = \vec{v}_i
3) \forall \vec{v}_i \in V, \exists -\vec{v}_i \in V \mid \vec{v}_i + (-\vec{v}_i) = \vec{0}

When adding vectors, the order in which one chooses to combine them should not matter; the addition should be both associative and commutative.

4) \forall \vec{v}_i, \vec{v}_j, \vec{v}_k \in V, \quad (\vec{v}_i + \vec{v}_j) + \vec{v}_k = \vec{v}_i + (\vec{v}_j + \vec{v}_k)
5) \forall \vec{v}_i, \vec{v}_j \in V, \quad \vec{v}_i + \vec{v}_j = \vec{v}_j + \vec{v}_i

These conditions precisely define the behavior of an abelian group, and indeed a vector space must exhibit group behavior. However, not all abelian groups constitute vector spaces. So what is the true shibboleth of a vector space? Of central importance, even above these basic rules for recombination, is the effect of scaling a vector.
Elements of a vector space must behave linearly under scalar multiplication. That is to say that contracting or dilating the length of some vectors and then adding them must be equivalent to contracting or dilating the sum of the original vectors by precisely the same scalar value. Similarly, adding together multiple scaled copies of a vector must be equivalent to scaling that vector by one overall value, while scaling a single vector twice by different values must amount to scaling by their product. These properties may seem slightly opaque, but they are precisely the rules we would anticipate geometrically when looking at physical vectors.

6) \forall a, b \in \mathbb{R}, \vec{v}_i \in V, \quad a(b\vec{v}_i) = (ab)\vec{v}_i
7) \forall a, b \in \mathbb{R}, \vec{v}_i \in V, \quad (a + b)\vec{v}_i = a\vec{v}_i + b\vec{v}_i
8) \forall a \in \mathbb{R}, \vec{v}_i, \vec{v}_j \in V, \quad a(\vec{v}_i + \vec{v}_j) = a\vec{v}_i + a\vec{v}_j

With these eight properties, the set V can be considered a vector space, and its elements vectors. Note that while the motivations for many of these rules were rooted in our geometric understanding of physical vectors, the nature of the definition is strictly algebraic. Any set of objects which obeys these rules may be treated as a vector space, regardless of whether they can reasonably be represented by arrows on a page, or by any geometric diagramming whatsoever.

The trouble with this new definition is that it leaves us without a sense of how to actually go about adding vectors, or really comparing them in any meaningful way. In stripping out the geometry, we have also removed much of the methodology with which we are familiar. In order to reconstruct these lost tools, we must find a way to assign a generic geometry to our generic vectors. The primary tool in this endeavor is the metric, which defines the "distance" between two vectors. Before we can define a metric, we need a more familiar representation for our vectors. By our definition, any vector can be written as the sum of other vectors.
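As a concrete check of the eight properties above, the following sketch verifies them on sample elements of R^3 with componentwise operations (the sample vectors and scalars are arbitrary choices for illustration):

```python
# R^3 with componentwise operations: check the eight vector-space
# properties on a few sample elements.

def add(u, v):
    """Componentwise addition of two vectors."""
    return tuple(a + b for a, b in zip(u, v))

def scale(a, v):
    """Scalar multiplication."""
    return tuple(a * x for x in v)

zero = (0.0, 0.0, 0.0)
u, v, w = (1.0, 2.0, 3.0), (-4.0, 0.5, 2.0), (0.0, -1.0, 7.0)
a, b = 2.0, -3.0

assert add(u, v) == (-3.0, 2.5, 5.0)                          # 1) closure
assert add(u, zero) == u                                      # 2) additive identity
assert add(u, scale(-1.0, u)) == zero                         # 3) inverses
assert add(add(u, v), w) == add(u, add(v, w))                 # 4) associativity
assert add(u, v) == add(v, u)                                 # 5) commutativity
assert scale(a, scale(b, u)) == scale(a * b, u)               # 6) compatibility
assert scale(a + b, u) == add(scale(a, u), scale(b, u))       # 7) distributivity
assert scale(a, add(u, v)) == add(scale(a, u), scale(a, v))   # 8) distributivity
```

Of course, passing on samples is not a proof; the algebraic verification is what makes R^3 a vector space.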
We could ask then how many vectors are required to be able to write any other vector as a linear combination. A collection whose combinations comprise all other vectors in the space is called a basis. If we can find a finite basis, then the size of the smallest possible basis is called the dimension of the vector space. The elements of the basis are said to span the space.

Suppose we have a vector space V with dimension n. There exist basis vectors \vec{e}_1, \vec{e}_2, ..., \vec{e}_n in V such that any \vec{v} in V can be written as a linear combination of the \vec{e}_i. If we agree upon an ordering of the basis, then we can communicate \vec{v} by listing the scalar coefficients of each basis element used in its combination. These coefficients are the "components" of the vector. It is easy to see that distinct vectors will have distinct component representations.

We will use the Einstein summation convention to compactly denote the component forms of vectors. In this notation, we assume that summation is implied over any index which appears in both the superscript and subscript of multiplied terms. We mark the components of a vector with superscript indices as follows.

    \vec{v} = v^i \vec{e}_i    (1)

We can see that this is exactly the same as our physical vectors, but without the notion of the components having direction. If we wish to add two vectors together, now all we need do is add their components, since the common basis can be factored out entirely.

    \vec{x} + \vec{y} = x^i \vec{e}_i + y^j \vec{e}_j = (x^k + y^k) \vec{e}_k    (2)

3 Linear Forms

To define a geometry on a vector space, we must be able to compare vectors in a meaningful way. One way to do this is by assigning to each vector, by some criterion, a scalar value. To this end we must define real-valued functions which take vectors as their input. This type of object is called a functional, or linear form. A linear form which takes a single vector as its input is called a 1-form.
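Equations (1) and (2) can be checked in a few lines (assuming the standard basis of R^3 as an illustration): expanding v^i e_i recovers the component list, and adding vectors reduces to adding components.

```python
# Expand v = v^i e_i against the standard basis of R^3, and check that
# vector addition is componentwise, as in eq. (2).

basis = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

def from_components(comps):
    """Sum v^i e_i over the repeated index i."""
    out = [0.0, 0.0, 0.0]
    for vi, ei in zip(comps, basis):
        for k in range(3):
            out[k] += vi * ei[k]
    return tuple(out)

x, y = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)

# expanding and re-reading components is the identity
assert from_components(x) == x

# (x^k + y^k) e_k agrees with adding the expanded vectors componentwise
summed = tuple(xi + yi for xi, yi in zip(x, y))
assert from_components(summed) == (5.0, 7.0, 9.0)
```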
A 1-form can be something as simple as assigning to each vector one of its component values, essentially ranking vectors by only one of their parts. We will use a bar rather than an arrow to mark forms. Then we could write the 1-form which assigns to each vector its first component as follows.

    \bar{\omega}(\vec{v}) = v^1    (3)

We could have any more complicated combination of the components of the input vector. For example, a 1-form acting on a 3-dimensional vector could have the form

    \bar{\omega}(\vec{v}) = 4v^1 + v^2 - 3v^3    (4)

We can see that these forms have components of their own in a sense, corresponding to the components of the vectors on which they act. We can write a general 1-form acting on a vector \vec{v}.

    \bar{\omega}(\vec{v}) = \omega_i v^i    (5)

In keeping with the summation convention, the components of these functionals are written in the subscript. This conveniently allows us to differentiate between vector components and form components.

We have seen that we can define a 1-form which gives a single component of a vector as output. If we label these forms \bar{e}^i by analogy with the basis vectors, then we can write 1-forms using almost the same notation we had for vectors.

    \bar{\omega} = \omega_i \bar{e}^i    (6)

It turns out that these functionals constitute their own vector space, related to but distinct from the vector space over which they are defined. To return to our motivating question, how do these forms aid us in defining a geometry over a given vector space? To see their utility, we must compare the behaviors of vectors and 1-forms under a change of basis.

4 Linear Operators

A linear operator is a matrix whose entries send one basis of a vector space to a new basis. Since such a matrix operates on vectors and gives vectors as output, we must use a mixture of upper and lower indices. Let T be the linear operator giving the transformation from a basis \{\vec{e}_i\} to a new basis \{\tilde{\vec{e}}_i\}, and let S be its inverse. We will mark objects in the new basis with a tilde.
Taking advantage of the fact that the representation, but not the vector itself, changes between coordinate frames, we can express the action of T on the basis vectors, and subsequently on an arbitrary vector, as follows.

    T^i_j \vec{e}_i = \tilde{\vec{e}}_j    (7)

    \tilde{v}^i \tilde{\vec{e}}_i = \tilde{v}^j T^i_j \vec{e}_i = v^i \vec{e}_i    (8)
    \tilde{v}^i = S^i_j v^j

The components of the new basis can be written as a linear combination of the components in the original basis, and the transformation of the components is the inverse of the transformation of the basis. If the vector is the same between the two frames, its magnitude, or norm, should be invariant. This is the quantity which will allow us to define a consistent geometry for the vector space.

If we attempt to use the familiar dot product on our generic vector space, we immediately run into problems with the change of bases. Consider the behavior of a vector under the change T.

    \tilde{\vec{v}} \cdot \tilde{\vec{v}} = (T^i_j)^2 \, \vec{v} \cdot \vec{v}    (9)

This shows that the norm is not invariant, as the right-hand side of this equation only equals the dot product in the original basis if the coordinate transformation is the identity, or in other words no change at all. However, the behavior of forms under this change in vector basis is quite different. We note now that the basis forms acting on the basis vectors give the Kronecker delta as output. That is, the ith component of each basis vector is one for the ith vector and zero otherwise.

    \tilde{\omega}_j = \bar{\omega}(\tilde{\vec{e}}_j) = \bar{\omega}(T^i_j \vec{e}_i) = T^i_j \bar{\omega}(\vec{e}_i) = T^i_j \omega_i    (10)

So we see that 1-forms transform from one vector basis to another with the coordinate transformation matrix, while the vectors themselves transform with its inverse. For this reason, we refer to forms as covariant and vectors as contravariant.

5 Inner Product and Norm

Now let us define, for some vector \vec{v}, a dual 1-form whose components are precisely the same in some basis \{\vec{e}_i\}. Then we may perform an operation similar to the dot product, but using the vector and this dual form.
This is called the inner product, and is a simplified version of the method to come.

    \langle \vec{v}, \bar{v} \rangle = v^i v_i    (11)

Now let us perform this same operation, but using the transformed components from before.

    \langle \tilde{\vec{v}}, \tilde{\bar{v}} \rangle = \tilde{v}^i \tilde{v}_i = T^j_i S^i_k v_j v^k = v^k v_k = \langle \vec{v}, \bar{v} \rangle    (12)

The change in the 1-form between frames precisely negates the change in the vector! This is a sort of proof of concept, a demonstration of why these tools are useful in constructing a coordinate-invariant geometry for a vector space. The full formulation of this idea requires a broader category of object than we have previously seen: the tensor.

6 Tensors

A tensor is a multidimensional array of values whose transformation between coordinate frames is given by some number of covariant and contravariant rules. The combination of rules for any particular tensor is described by its indices, and we rank tensors based on the number of upper and lower indices they have. Everything we have seen so far has been a simple example of a tensor, with the following classification:

• A scalar value, with no co- or contravariant indices, is a rank-(0,0) tensor
• A vector, with only one contravariant index, is a rank-(1,0) tensor
• A 1-form, with only one covariant index, is a rank-(0,1) tensor
• A linear operator, with one of each type of index, is a rank-(1,1) tensor

A tensor is a difficult beast to understand intuitively, but we can start by getting a sense for what each index means. From our simple examples, we can glean three basic ideas about the nature of tensors. First, each contravariant index in a tensor behaves like a vector. This means that rank-(n,0) tensors can be thought of as multi-vectors in some capacity. Second, each covariant index behaves like a 1-form. A rank-(0,m) tensor then is a multilinear form with m arguments. Third, multiplication of two tensors is performed by summing the product of their entries over some shared index. We have seen this in the action of linear operators.
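The transformation rules of eqs. (7)-(10) and the invariance claimed in eq. (12) can be checked numerically. The following is a minimal two-dimensional sketch; the matrix T and the sample vector are arbitrary illustrative choices, and T is taken symmetric so that the covariant rule T^i_j omega_i is a plain matrix product.

```python
# Illustrative 2-d basis change: T is an arbitrary symmetric invertible
# matrix, S is its inverse.  Vector components transform with S (eq. 8),
# form components with T (eq. 10), and v^i v_i is invariant (eq. 12).

T = [[2.0, 1.0], [1.0, 1.0]]
S = [[1.0, -1.0], [-1.0, 2.0]]         # inverse of T (det T = 1)

def mat_vec(M, u):
    return tuple(sum(M[i][j] * u[j] for j in range(2)) for i in range(2))

def pair(omega, u):
    """<u, omega> = u^i omega_i, the implied sum of eq. (11)."""
    return sum(w * x for w, x in zip(omega, u))

v = (3.0, 5.0)
omega = v                              # dual 1-form with the same components

v_new = mat_vec(S, v)                  # contravariant: tilde-v^i = S^i_j v^j
omega_new = mat_vec(T, omega)          # covariant: tilde-omega_j = T^i_j omega_i

# the naive dot product changes between frames ...
assert pair(v, v) == 34.0 and pair(v_new, v_new) == 53.0
# ... but the vector-form pairing of eq. (12) does not
assert pair(omega_new, v_new) == pair(omega, v) == 34.0
```

The change in the form's components cancels the change in the vector's components exactly, which is the point of eq. (12).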
The third point is an important one, as it implies that the action of a rank-(n,m) tensor sends a rank-(m,n) tensor to the real numbers. A multilinear form, for example, is a real-valued function over some class of multi-vectors. Thus we can think of the general (n,m) tensor as a function which takes n 1-forms and m vectors, or any collection of other tensors with the right cumulative rank, into the real numbers.

The tensor product is the generalization of the dot product from earlier. Its definition is difficult to read initially, simply due to the fact that we must account for an arbitrary number of indices. In essence, it is just repeated application of the Einstein summation convention. Each tensor multiplication collapses one contravariant index of one input with a covariant index of the other. For example, let i, j, k range from one to three. Then we could have the following tensor product.

    A \otimes \vec{x} = A^k_{ij} x^j = A^k_{i1} x^1 + A^k_{i2} x^2 + A^k_{i3} x^3    (13)

Or, with the same tensor A and vector \vec{x}, we could form an entirely different product.

    A \otimes \vec{x} = A^k_{ij} x^i = A^k_{1j} x^1 + A^k_{2j} x^2 + A^k_{3j} x^3    (14)

In this example, A is a 3×3×3 array, and the change in the indexing just means that we are multiplying the vector argument's entries by a different "column" of A. The tensor product is notably independent in structure from the actual values of any particular tensor. This is very important, because it allows us to define an inner product on a vector space using a tensor rather than a specialized operator. In fact, many of the operations we are used to performing on physical vectors can be stated as tensors.

7 The Metric Tensor

An inner product, by definition, should take two vectors as input and give a real number as output. This is exactly the description of the behavior of a rank-(0,2) tensor. Earlier we used a specially chosen 1-form to create an inner product with a given vector.
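Returning briefly to eqs. (13) and (14), the two contractions can be written out with plain loops; the 3×3×3 values of A below are arbitrary sample data, chosen only to show that contracting over j and over i use different "columns" of A.

```python
# Contract a rank-(1,2) array A^k_{ij} with a vector over j (eq. 13)
# and over i (eq. 14).  A's entries are arbitrary sample values.

n = 3
A = [[[float(k + 3 * i + 9 * j) for j in range(n)] for i in range(n)]
     for k in range(n)]                # A[k][i][j] = k + 3i + 9j
x = (1.0, 2.0, 3.0)

# eq. (13): result^k_i = A^k_{ij} x^j
contract_j = [[sum(A[k][i][j] * x[j] for j in range(n)) for i in range(n)]
              for k in range(n)]

# eq. (14): result^k_j = A^k_{ij} x^i
contract_i = [[sum(A[k][i][j] * x[i] for i in range(n)) for j in range(n)]
              for k in range(n)]

# the two contractions produce genuinely different rank-(1,1) results
assert contract_j[0][0] == 72.0   # 0*1 + 9*2 + 18*3
assert contract_i[0][0] == 24.0   # 0*1 + 3*2 + 6*3
assert contract_j != contract_i
```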
This method is functionally equivalent, but does not require us to determine beforehand whether the 1-form we choose must be expressed in a different basis. The metric tensor is a 2-form, a rank-(0,2) tensor. From our earlier intuition, the action of this type of tensor on a single vector produces a 1-form. Suppose we want to find the inner product of two vectors \vec{x} and \vec{y}. We need only decide on a metric tensor \eta which represents the geometry we wish to consider, and the process of keeping track of changing bases is accomplished for us by the notation.

    \langle \vec{x}, \vec{y} \rangle = \eta \otimes \vec{x} \otimes \vec{y} = \eta_{ij} x^i y^j    (15)

So long as the vectors are chosen from the same basis to begin with, \eta takes care of the expression of the 1-form equivalent to \vec{x}. Essentially, we can think of the product \eta \otimes \vec{x} as a 1-form of its own, which can clearly be applied to \vec{y}.

In simple Euclidean geometry, the metric tensor is just the identity matrix. This is why we are able to multiply physical vectors together without any sense of the complications we have discussed here. The behavioral difference between vectors and 1-forms is invisible under a Euclidean metric.

In physics, General Relativity makes heavy use of tensor notation. The apparent inconsistencies between multiple observers' measurements can be understood in terms of a non-Euclidean metric. In this construction, gravity causes the geometry of the universe to be warped to some degree, resulting in a non-constant interpretation of distance and length of time depending on the location and motion of the observer. Electromagnetism also makes use of tensor mathematics; we are able to express the electric and magnetic fields as a single physical phenomenon by considering them as manifestations of one tensor in differently moving reference frames. All of the fundamental structure of linear algebra and calculus that we familiarize ourselves with at the undergraduate level is actually dependent on one choice of metric tensor.
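Eq. (15) can be sketched directly: the same double sum gives the familiar dot product under the Euclidean identity metric, and a different geometry under the Minkowski metric (the (-,+,+,+) sign convention used below is one common choice, not the only one).

```python
# The inner product <x, y> = eta_ij x^i y^j of eq. (15), for the
# Euclidean identity metric and the Minkowski metric diag(-1, 1, 1, 1).

def inner(eta, x, y):
    """eta_ij x^i y^j: implied sum over both repeated indices."""
    n = len(x)
    return sum(eta[i][j] * x[i] * y[j] for i in range(n) for j in range(n))

euclid = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
minkowski = [[-1.0 if i == j == 0 else (1.0 if i == j else 0.0)
              for j in range(4)] for i in range(4)]

x = (1.0, 2.0, 3.0, 4.0)
assert inner(euclid, x, x) == 30.0      # the familiar dot product
assert inner(minkowski, x, x) == 28.0   # the first component enters with -1
```

Swapping the metric changes the geometry while the formula, and the vectors, stay the same; this is the sense in which the undergraduate toolkit corresponds to one particular choice of eta.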
The shortcuts and heuristics which we have now dismissed are entirely valid when using the Euclidean metric. This is effectively the statement that classical physics works in flat space. However, the language of tensors allows us to consider these topics outside of the familiar, flat space of Euclidean geometry. It allows us to assess Minkowski space-time, for example, which describes the relationship between length of time and spatial distance in terms of a coordinate geometry.

Tensor formulations open up an entirely new area of study. With them defined and at least partially understood, a much more powerful form of linear algebra immediately arises in which we may assign anything from physical quantities to polynomials as vectors, so long as they meet the algebraic requirements. We can then consider their behavior and even their geometry, where before we would have had no sense of the geometry of polynomials because we could not attach a physical diagram to them. It is eventually possible to construct a tensor calculus which is workable in arbitrary curved spaces, whereas our undergraduate tools are very specialized. This generalized notion of integration is the next goal for post-graduate study of tensor-based mathematics.

Bibliography

[O'Neill(1966)] B. O'Neill, Elementary Differential Geometry (Academic Press, 1966).
[Schutz(2009)] B. Schutz, A First Course in General Relativity (Cambridge University Press, 2009).
[Morgan(1993)] F. Morgan, Riemannian Geometry: A Beginner's Guide (Jones and Bartlett Publishers, 1993).
[Spivak(1965)] M. Spivak, Calculus on Manifolds (W. A. Benjamin, Inc., 1965).
[Fleisch(2011)] D. Fleisch, A Student's Guide to Vectors and Tensors (Cambridge University Press, 2011).