Tensors, Vectors, and Linear Forms
Michael Griffith
May 9, 2014
Abstract
The metric tensor and its implications for basic linear algebra and calculus are
introduced and discussed. A generalized notion of vectors is introduced to distinguish the algebraic structure from the physical object. Linear forms and their
vector structure are developed along the way.
Contents

1 Introduction
2 Vectors
3 Linear Forms
4 Linear Operators
5 Inner Product and Norm
6 Tensors
7 The Metric Tensor
Bibliography
1 Introduction
This capstone project was an exercise in dissociating intuition from rigorous definition. The basic concepts of linear algebra, the vector and the behavior of vector spaces, had to be completely separated from the author's understanding as a physicist. The
construction of tensors, and in particular the metric tensor, was the ultimate
goal for the paper. Over the course of the year-long capstone course, most of
the material was internalized by reading the same topic in multiple sources. For
this reason, it is difficult to cite sources in the text, so the author would like
to make note of them here. The primary text used was Elementary Differential Geometry [O'Neill(1966)], which provided most of the conceptual framework for the paper. Other sources were used to supplement the mechanical understanding of the material, and include texts in introductory relativity [Schutz(2009)], Riemannian geometry [Morgan(1993)], calculus on manifolds [Spivak(1965)], and tensors and vectors [Fleisch(2011)].
2 Vectors
It is common practice in physics and introductory linear algebra courses to identify
the general notion of a vector with the physical example. A physical vector is, more
or less, a coordinate. We see such objects represented by a drawn arrow so often
that it is easy to conflate the example with the concept. To think of a vector
as simply a directed quantity, a coordinate, or an arrow, hides the more general
underlying structure involved.
If we wish to understand what a vector is more abstractly, we should start with
what we expect from simple examples. Any grand scheme we develop should still
look more or less like what we are used to. Suppose we have a set $V$ with some elements $\vec v_i$; we are then interested in determining exactly how those elements must behave in order for $V$ to be reasonably dubbed a vector space.
Fundamentally, addition of two vectors must yield another vector. For each
vector, there must exist another whose action is its exact negation. The sum of these two must itself be a vector; therefore we must have a resultant zero vector whose action is entirely nil. Thus we require $V$ be closed under addition
and contain an additive identity, as well as inverses.
1) $\forall\, \vec v_i, \vec v_j \in V,\ \vec v_i + \vec v_j \in V$
2) $\exists\, \vec 0 \in V \mid \forall\, \vec v_i \in V,\ \vec v_i + \vec 0 = \vec v_i$
3) $\forall\, \vec v_i \in V,\ \exists\, {-\vec v_i} \in V \mid \vec v_i + (-\vec v_i) = \vec 0$
When adding vectors, the order in which one chooses to combine them should
not matter; the addition should be both associative and commutative.
4) $\forall\, \vec v_i, \vec v_j, \vec v_k \in V,\ (\vec v_i + \vec v_j) + \vec v_k = \vec v_i + (\vec v_j + \vec v_k)$
5) $\forall\, \vec v_i, \vec v_j \in V,\ \vec v_i + \vec v_j = \vec v_j + \vec v_i$
These conditions precisely define the behavior of an abelian group, and indeed
a vector space must exhibit group behavior. However, not all abelian groups constitute vector spaces. So what is the true shibboleth of a vector space? Of central
importance, even above these basic rules for recombination, is the effect of scaling
a vector.
Elements of a vector space must behave linearly under scalar multiplication.
That is to say that contracting or dilating the length of some vectors and then
adding them must be equivalent to contracting or dilating the sum of the original
vectors by precisely the same scalar value. Similarly, adding together multiple
scaled copies of a vector must be equivalent to scaling that vector by one overall
value, while scaling a single vector twice by different values must amount to scaling
by their product. These properties may seem slightly opaque, but they are precisely
the rules we would anticipate geometrically when looking at physical vectors.
6) $\forall\, a, b \in \mathbb{R},\ \vec v_i \in V,\ a(b\vec v_i) = (ab)\vec v_i$
7) $\forall\, a, b \in \mathbb{R},\ \vec v_i \in V,\ (a + b)\vec v_i = a\vec v_i + b\vec v_i$
8) $\forall\, a \in \mathbb{R},\ \vec v_i, \vec v_j \in V,\ a(\vec v_i + \vec v_j) = a\vec v_i + a\vec v_j$
With these eight properties, the set V can be considered a vector space, and
its elements vectors. Note that while the motivations for many of these rules
were rooted in our geometric understanding of physical vectors, the nature of the
definition is strictly algebraic. Any set of objects which obeys these rules may be
treated as a vector space, regardless of whether they can reasonably be represented
by arrows on a page, or by any geometric diagramming whatsoever.
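To make the abstraction concrete, consider a quick numerical sketch (in Python with numpy, a purely illustrative choice) which treats quadratic polynomials as coefficient triples and spot-checks the eight axioms:

```python
# A minimal sketch: quadratic polynomials a + b*x + c*x^2, stored as
# coefficient triples, behave as vectors under these rules.
import numpy as np

p = np.array([1.0, -2.0, 3.0])   # 1 - 2x + 3x^2
q = np.array([0.0,  4.0, 1.0])   # 4x + x^2
zero = np.zeros(3)               # the additive identity

# Closure, identity, and inverses (axioms 1-3):
assert (p + q).shape == p.shape
assert np.allclose(p + zero, p)
assert np.allclose(p + (-p), zero)

# Associativity and commutativity (axioms 4-5):
r = np.array([2.0, 0.0, -1.0])
assert np.allclose((p + q) + r, p + (q + r))
assert np.allclose(p + q, q + p)

# Linearity under scaling (axioms 6-8):
a, b = 2.0, -0.5
assert np.allclose(a * (b * p), (a * b) * p)
assert np.allclose((a + b) * p, a * p + b * p)
assert np.allclose(a * (p + q), a * p + a * q)
```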
The trouble with this new definition is that it leaves us without a sense of
how to actually go about adding vectors, or really comparing them in any
meaningful way. In stripping out the geometry, we have also removed much of
the methodology with which we are familiar. In order to reconstruct these lost
tools, we must find a way to assign a generic geometry to our generic vectors. The
primary tool in this endeavor is the metric, which defines the “distance” between
two vectors. Before we can define a metric, we need a more familiar representation
for our vectors.
By our definition, any vector can be written as the sum of other vectors. We
could ask then how many vectors are required to be able to write any other vector
as a linear combination. A collection whose combinations comprise every vector in the space is said to span the space; a smallest possible spanning collection is called a basis. If we can find a finite basis, then its size is called the dimension of the vector space.
Suppose we have a vector space $V$ with dimension $n$. There exist basis vectors $\vec e_1, \vec e_2, \ldots, \vec e_n$ in $V$ such that any $\vec v$ in $V$ can be written as a linear combination of the $\vec e_i$. If we agree upon an ordering of the basis, then we can communicate $\vec v$ by
listing the scalar coefficients of each basis element used in its combination. These
coefficients are the “components” of the vector. It is easy to see that distinct
vectors will have distinct component representations.
We will use the Einstein summation convention to compactly denote the component forms of vectors. In this notation, we assume that summation is implied
over any index which appears in both the superscript and subscript of multiplied
terms. We mark the components of a vector with superscript indices as follows.
$\vec v = v^i \vec e_i$  (1)
We can see that this is exactly the same as our physical vectors, but without
the notion of the components having direction. If we wish to add two vectors
together, all we need do is add their components, since the common basis can be factored out entirely.
$\vec x + \vec y = x^i \vec e_i + y^j \vec e_j = (x^k + y^k)\vec e_k$  (2)
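As a sketch of equation (2), we can build two vectors from an assumed orthonormal basis and confirm that adding vectors is the same as adding components:

```python
# With a shared basis, vector addition reduces to componentwise
# addition of the coefficients.
import numpy as np

e = np.eye(3)                      # rows are an assumed basis e_1, e_2, e_3
x = np.array([1.0, 2.0, 3.0])      # components x^i
y = np.array([4.0, 5.0, 6.0])      # components y^j

vec_x = np.einsum('i,ik->k', x, e) # x = x^i e_i, an explicit contraction
vec_y = np.einsum('j,jk->k', y, e) # y = y^j e_j

# The common basis factors out: x + y = (x^k + y^k) e_k
assert np.allclose(vec_x + vec_y, np.einsum('k,kj->j', x + y, e))
```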
3 Linear Forms
To define a geometry on a vector space, we must be able to compare vectors in
a meaningful way. One way to do this is by assigning to each vector, by some
criteria, a scalar value. To this end we must define real valued functions which
take vectors as their input. This type of object is called a functional, or linear form.
A linear form which takes a single vector as its input is called a 1-form. A 1-form can be something as simple as assigning to each vector one of its component
values, essentially ranking vectors by only one of their parts. We will use a bar
rather than an arrow to mark forms. Then we could write the 1-form which assigns
to each vector its first component as follows.
$\bar\omega(\vec v) = v^1$  (3)
We could also have a more complicated linear combination of the components of the
input vector. For example, a 1-form acting on a 3-dimensional vector could have
the form
$\bar\omega(\vec v) = 4v^1 + v^2 - 3v^3$  (4)
We can see that these forms have components of their own in a sense, corresponding to the components of the vectors on which they act. We can write a
general 1-form acting on a vector ~v .
$\bar\omega(\vec v) = \omega_i v^i$  (5)
In keeping with the summation convention, the components of these functionals
are written in the subscript. This conveniently allows us to differentiate between
vector components and form components.
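As a sketch of equation (5), using the particular 1-form of equation (4):

```python
# A 1-form stored as its components omega_i; applying it to a vector's
# components v^i is the single contraction omega_i v^i of equation (5).
import numpy as np

omega = np.array([4.0, 1.0, -3.0])    # the 1-form of equation (4)
v = np.array([1.0, 2.0, 3.0])         # an arbitrary vector's components

value = np.einsum('i,i->', omega, v)  # omega_i v^i
print(value)                          # 4*1 + 1*2 - 3*3 = -3.0
```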
We have seen that we can define a 1-form which gives a single component of a
vector as output. If we label these forms $\bar e^i$ by analogy with the basis vectors, then
we can write 1-forms using almost the same notation we had for vectors.
$\bar\omega = \omega_i \bar e^i$  (6)
It turns out that these functionals constitute their own vector space, related
to but distinct from the vector space over which they are defined.
To return to our motivating question, how do these forms aid us in defining
a geometry over a given vector space? To see their utility, we must compare the
behaviors of vectors and 1-forms under a change of basis.
4 Linear Operators
A linear operator is a map, represented by a matrix, which sends one basis of a vector space to a
new basis. Since such a matrix operates on vectors and gives vectors as output, we
must use a mixture of upper and lower indices. Let T be the linear operator giving
the transformation from a basis $\{\vec e_i\}$ to a new basis $\{\tilde{\vec e}_i\}$, and let $S$ be its inverse. We will mark objects in the new basis with a tilde. Taking advantage of the fact that the representation but not the vector itself changes between coordinate frames, we
can express the action of T on the basis vectors, and subsequently an arbitrary
vector, as follows
$T^i_j \vec e_i = \tilde{\vec e}_j$  (7)
$\tilde v^i \tilde{\vec e}_i = \tilde v^j T^i_j \vec e_i = v^i \vec e_i \quad\Longrightarrow\quad \tilde v^i = S^i_j v^j$  (8)
The components of a vector in the new basis can be written as a linear combination of its components in the original basis, and this transformation is the inverse of the transformation of the basis vectors. If the vector is the same between the two frames, its
magnitude, or norm, should be invariant. This is the quantity which will allow us
to define a consistent geometry for the vector space.
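The following sketch plays out equations (7) and (8) for an arbitrarily chosen invertible matrix $T$ (the particular entries are merely illustrative): the basis transforms with $T$, the components with its inverse $S$, and the vector itself is unchanged.

```python
# Basis vectors transform with T, components with S = T^{-1}, and the
# reconstructed vector v^i e_i is identical in either frame.
import numpy as np

e = np.eye(3)                            # old basis as rows
T = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [1.0, 0.0, 3.0]])          # an assumed change-of-basis matrix
S = np.linalg.inv(T)                     # its inverse

e_new = np.einsum('ij,ik->jk', T, e)     # e~_j = T^i_j e_i   (equation 7)
v = np.array([1.0, -2.0, 0.5])           # components v^i in the old basis
v_new = np.einsum('ij,j->i', S, v)       # v~^i = S^i_j v^j   (equation 8)

# The same geometric vector, expressed in either basis:
assert np.allclose(np.einsum('i,ik->k', v, e),
                   np.einsum('i,ik->k', v_new, e_new))
```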
If we attempt to use the familiar dot product on our generic vector space, we
immediately run into problems with the change of bases. Consider the behavior
of a vector under the change T .
$\tilde{\vec v} \cdot \tilde{\vec v} = \left(T^i_j\right)^2 \vec v \cdot \vec v$  (9)
This shows that the norm is not invariant, as the right-hand side of this equation only equals the dot product in the original basis if the coordinate transformation is the identity, or in other words, no change at all. However, the behavior
of forms under this change in vector basis is quite different. We note now that
the basis forms acting on the basis vectors give the Kronecker delta as output.
That is, the ith component of each basis vector is one for the ith vector and zero
otherwise.
$\tilde\omega_j = \bar\omega(\tilde{\vec e}_j) = \bar\omega(T^i_j \vec e_i) = T^i_j\, \bar\omega(\vec e_i) = T^i_j\, \omega_i$  (10)
So we see that 1-forms transform from one vector basis to another with the coordinate transform matrix, while the vectors themselves transform with its inverse.
For this reason, we refer to forms as covariant and vectors as contravariant.
5 Inner Product and Norm
Now let us define, for some vector ~v , a dual 1-form whose components are precisely
the same in some basis $\{\vec e_i\}$. Then we may perform an operation similar to the dot
product, but using the vector and this dual form. This is called the inner product,
and is a simplified version of the method to come.
$\langle \vec v, \bar v \rangle = v^i v_i$  (11)
Now let us perform this same operation, but using components transformed as before.
$\langle \tilde{\vec v}, \tilde{\bar v} \rangle = \tilde v^i \tilde v_i = S^i_j v^j \, T^k_i v_k = \delta^k_j\, v^j v_k = v^j v_j = \langle \vec v, \bar v \rangle$  (12)
The change in the 1-form between frames precisely negates the change in the
vector! This is a sort of proof of concept, a demonstration of why these tools are
useful in constructing a coordinate invariant geometry for a vector space. The
full formulation of this idea requires a broader category of object than we have
previously seen: the tensor.
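As a numerical sketch of equation (12), with the same illustrative $T$ as above:

```python
# Form components transform with T (covariant), vector components with
# S = T^{-1} (contravariant), so the pairing omega_i v^i never changes.
import numpy as np

T = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [1.0, 0.0, 3.0]])            # an assumed transformation
S = np.linalg.inv(T)

v = np.array([1.0, -2.0, 0.5])             # vector components v^i
omega = v.copy()                           # dual 1-form with equal components

v_new = np.einsum('ij,j->i', S, v)         # v~^i = S^i_j v^j
omega_new = np.einsum('ij,i->j', T, omega) # omega~_j = T^i_j omega_i

# <v~, omega~> = <v, omega>: the transformations cancel exactly.
assert np.isclose(np.dot(omega_new, v_new), np.dot(omega, v))
```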
6 Tensors
A tensor is a multidimensional array of values whose transformation between coordinate frames is given by some number of covariant and contravariant rules. The
combination of rules for any particular tensor is described by its indices, and we
rank tensors based on the number of upper and lower indices they have. Everything we have seen so far has been a simple example of a tensor, with the following classification (a rough array analogy follows the list):
• A scalar value, with no co- or contravariant indices, is a rank-(0,0) tensor
• A vector, with only one contravariant index, is a rank-(1,0) tensor
• A 1-form, with only one covariant index, is a rank-(0,1) tensor
• A linear operator, with one of each type of index, is a rank-(1,1) tensor
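In array terms (a loose analogy, since a plain array does not record the variance of its indices), the four ranks might be sketched as follows:

```python
# Rough array counterparts for the four simple tensor ranks listed above.
import numpy as np

scalar = 3.14                        # rank-(0,0): no indices
vector = np.array([1.0, 2.0, 3.0])   # rank-(1,0): one contravariant index
form = np.array([4.0, 1.0, -3.0])    # rank-(0,1): one covariant index
operator = np.eye(3)                 # rank-(1,1): one index of each type

# Pairing a form with a vector collapses both indices to a scalar.
print(np.einsum('i,i->', form, vector))
```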
A tensor is a difficult beast to understand intuitively, but we can start by
getting a sense for what each index means. From our simple examples, we can
glean three basic ideas about the nature of tensors.
First, each contravariant index in a tensor behaves like a vector. This means
that rank-(n,0) tensors can be thought of as multi-vectors in some capacity. Second, each covariant index behaves like a 1-form. A rank-(0,m) tensor then is a
multilinear form with m arguments. Third, multiplication of two tensors is performed by summing the product of their entries over some shared index. We have
seen this in the action of linear operators.
The third point is an important one, as it implies that the action of a rank-(n,m) tensor sends a rank-(m,n) tensor to the real numbers. A multilinear form,
for example, is a real-valued function over some class of multi-vectors. Thus we
can think of the general (n,m) tensor as a function which takes n 1-forms and m
vectors, or any collection of other tensors with the right cumulative rank, into the
real numbers.
The tensor product is the generalization of the dot product from earlier. Its
definition is difficult to read initially, simply due to the fact that we must account
for an arbitrary number of indices. In essence, it is just repeated application
of the Einstein summation convention. Each tensor multiplication collapses one
contravariant index of one of the inputs with a covariant index of the other. For
example, let i,j,k range from one to three. Then we could have the following tensor
product
$A \otimes \vec x = A^k_{ij} x^j = A^k_{i1} x^1 + A^k_{i2} x^2 + A^k_{i3} x^3$  (13)
Or, with the same tensor A and vector ~x, we could form an entirely different
product
$A \otimes \vec x = A^k_{ij} x^i = A^k_{1j} x^1 + A^k_{2j} x^2 + A^k_{3j} x^3$  (14)
In this example, A is a 3×3×3 array, and the change in the indexing just means
that we are multiplying the vector argument’s entries by a different “column” of
A.
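A sketch of equations (13) and (14), with numpy's einsum standing in for the summation convention and purely illustrative random entries:

```python
# The same rank-(1,2) array A^k_ij contracted against x over different
# covariant slots yields two genuinely different tensor products.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3, 3))         # entries A^k_ij
x = rng.normal(size=3)                 # components of x

first = np.einsum('kij,j->ki', A, x)   # A^k_ij x^j, equation (13)
second = np.einsum('kij,i->kj', A, x)  # A^k_ij x^i, equation (14)

print(first.shape, second.shape)       # both (3, 3), but unequal in general
print(np.allclose(first, second))      # False: a different "column" of A
```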
The tensor product is notably independent in structure from the actual values
of any particular tensor. This is very important, because it allows us to define an
inner product on a vector space using a tensor rather than a specialized operator.
In fact, many of the operations we are used to performing on physical vectors can
be stated as tensors.
7 The Metric Tensor
An inner product, by definition, should take two vectors as input and give a real
number as output. This is exactly the description of the behavior of a rank-(0,2)
tensor. Earlier we used a specially chosen 1-form to create an inner product with
a given vector. The tensor method is functionally equivalent, but does not require us to determine beforehand whether the 1-form we choose must be expressed in a
different basis.
The metric tensor is a 2-form, a rank-(0,2) tensor. From our earlier intuition,
the action of this type of tensor on a single vector produces a 1-form. Suppose we
want to find the inner product of two vectors ~x and ~y . We need only decide on a
metric tensor η which represents the geometry we wish to consider, and the process
of keeping track of changing bases is accomplished for us by the notation.
$\langle \vec x, \vec y \rangle = \eta \otimes \vec x \otimes \vec y = \eta_{ij} x^i y^j$  (15)
So long as the vectors are chosen from the same basis to begin with, η takes
care of the expression of the 1-form equivalent to ~x. Essentially, we can think of
the product η ⊗ ~x as a 1-form of its own, which can clearly be applied to ~y .
In simple euclidean geometry, the metric tensor is just the identity matrix.
This is why we are able to multiply physical vectors together without any sense
of the complications we have discussed here. The behavioral difference between
vectors and 1-forms is invisible under a euclidean metric.
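As a sketch of equation (15), we can evaluate the same contraction with the identity metric and then with one common sign convention for the Minkowski metric (the choice diag(-1, 1, 1, 1) is an assumption; conventions vary):

```python
# <x, y> = eta_ij x^i y^j: the geometry lives entirely in the metric tensor.
import numpy as np

def inner(eta, x, y):
    """Inner product of x and y under a rank-(0,2) metric tensor eta."""
    return np.einsum('ij,i,j->', eta, x, y)

x = np.array([1.0, 2.0, 0.0, -1.0])
y = np.array([3.0, 0.0, 1.0,  2.0])

euclid = np.eye(4)                          # identity metric: the dot product
minkowski = np.diag([-1.0, 1.0, 1.0, 1.0])  # one assumed sign convention

print(inner(euclid, x, y))       # 1.0, same as np.dot(x, y)
print(inner(minkowski, x, y))    # -5.0, same vectors, different geometry
```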
In physics, General Relativity makes heavy use of tensor notation. The apparent inconsistencies between multiple observers’ measurements can be understood
in terms of a non-euclidean metric. In this construction, gravity causes the geometry of the universe to be warped to some degree, resulting in a non-constant
interpretation of distance and length of time depending on the location and motion
of the observer. Electromagnetism also makes use of tensor mathematics; we are
able to express the electric and magnetic fields as a single physical phenomenon by
considering them as manifestations of one tensor in differently moving reference
frames.
All of the fundamental structure of linear algebra and calculus that we familiarize ourselves with at the undergraduate level is actually dependent on one choice
of metric tensor. The shortcuts and heuristics which we have now dismissed are
entirely valid when using the euclidean metric. This is effectively the statement
that classical physics works in flat space. However, the language of tensors allows
us to consider these topics outside of the familiar, flat space of euclidean geometry. It allows us to assess Minkowski space-time, for example, which describes the
relationship between length of time and spatial distance in terms of a coordinate
geometry.
Tensor formulations open up an entirely new area of study. With them defined
and at least partially understood, a much more powerful form of linear algebra
immediately arises in which we may assign anything from physical quantities to
polynomials as vectors, so long as they meet the algebraic requirements. We
can then consider their behavior and even their geometry, where before we would
have had no sense of the geometry of polynomials because we could not attach a
physical diagram to them. It is eventually possible to construct a tensor calculus
which is workable in arbitrary curved spaces, whereas our undergraduate tools
are very specialized. This generalized notion of integration is the next goal for
post-graduate study of tensor-based mathematics.
Bibliography
[O’Neill(1966)] B. O’Neill, Elementary Differential Geometry (Academic Press,
1966).
[Schutz(2009)] B. Schutz, A First Course in General Relativity (Cambridge University Press, 2009).
[Morgan(1993)] F. Morgan, Riemannian Geometry: A Beginner’s Guide (Jones
and Bartlett Publishers, 1993).
[Spivak(1965)] M. Spivak, Calculus on Manifolds (W. A. Benjamin, Inc., 1965).
[Fleisch(2011)] D. Fleisch, A Student’s Guide to Vectors and Tensors (Cambridge
University Press, 2011).