Math 211
Course Summary
table of contents
I. Functions of several variables.
II. Rn.
III. Derivatives.
IV. Taylor's Theorem.
V. Differential Geometry.
VI. Applications.
1. Best affine approximations.
2. Optimization.
3. Lagrange multipliers.
4. Conservation of energy.
I. Functions of several variables.
Definition 1.1. Let S and T be sets. The Cartesian product of S and T is the set of ordered
pairs:
S × T := {(s, t) | s ∈ S, t ∈ T }.
Definition 1.2. Let S and T be sets. A function from S to T is a subset W of the Cartesian
product S × T such that: (i) for each s ∈ S there is an element in W whose first component
is s, i.e., there is an element (s, t) ∈ W for some t ∈ T ; and (ii) if (s, t) and (s, t′) are in W ,
then t = t′. Notation: if (s, t) ∈ W , we write f (s) = t. The subset W , which is by definition
the function f , is also called the graph of f .
Definition 1.3. Let f : S → T be a function between sets S and T .
1. f is one-to-one or injective if f (x) = f (y) only if x = y.
2. The image or range of f is {f (s) ∈ T | s ∈ S}. The image will be denoted by im(f )
or f (S).
3. f is onto if im(f ) = T .
4. The domain of f is S and the codomain of f is T .
5. The inverse image of t ∈ T is f −1 (t) := {s ∈ S | f (s) = t}.
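Definitions 1.2–1.3 can be made concrete for finite sets. The following Python sketch (an illustration of ours, not part of the notes) represents a function by its graph W and computes an image and an inverse image:

```python
# A function between finite sets, represented by its graph W (Definition 1.2):
# a set of pairs (s, t) with exactly one pair for each s in S.
S = {1, 2, 3, 4}
W = {(1, "a"), (2, "b"), (3, "a"), (4, "c")}

def f(s):
    # evaluate f by looking up the unique pair whose first component is s
    (t,) = {t for (x, t) in W if x == s}
    return t

image = {t for (_, t) in W}                    # im(f) = f(S) (Definition 1.3)
preimage_a = {s for (s, t) in W if t == "a"}   # the inverse image of "a"
assert f(3) == "a"
assert image == {"a", "b", "c"}
assert preimage_a == {1, 3}
```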
Definition 1.4. Let f : S → T and g: T → U . The composition of f and g is the function
g ◦ f : S → U given by (g ◦ f )(s) := g(f (s)).
Definition 1.5. Rn is the Cartesian product of R with itself n times. We think of Rn as
the set of ordered n-tuples of real numbers:
Rn := {(a1 , . . . , an ) | ai ∈ R, 1 ≤ i ≤ n}.
The elements of Rn are called points or vectors.
Definition 1.6. A function of several variables is a function of the form f : S → Rm where
S ⊆ Rn . Writing f (x) = (f1 (x), . . . , fm (x)), the function fi : S → R, for each i = 1, . . . , m, is
called the i-th component function of f .
Definition 1.7. Let f be a function of several variables, f : S → Rm , with S ⊆ Rn . If n = 1,
then f is a parametrized curve, if n = 2, then f is a parametrized surface. In general, we say
f is a parametrized n-surface.
Definition 1.8. A vector field is a function of the form f : S → Rm where S ⊆ Rm .
Definition 1.9. If f : S → R with S ⊆ Rn , a level set of f is the inverse image of a point in
R. A drawing showing several level sets is called a contour diagram for f .
II. Rn .
linear structure.
Definition 2.1. The i-th coordinate of a = (a1 , . . . , an ) ∈ Rn is ai . For i = 1, . . . , n, define
the i-th standard basis vector for Rn to be the vector ei whose coordinates are all zero except
the i-th coordinate, which is 1.
Definition 2.2. The additive inverse of a = (a1 , . . . , an ) ∈ Rn is the vector −a :=
(−a1 , . . . , −an ).
Definition 2.3. In Rn , define ~0 := (0, . . . , 0), the vector whose coordinates are all 0.
Definition 2.4. (Linear structure on Rn .) If a = (a1 , . . . , an ) and b = (b1 , . . . , bn ) are points
in Rn and s ∈ R, define
a + b = (a1 , . . . , an ) + (b1 , . . . , bn ) := (a1 + b1 , . . . , an + bn )
sa = s(a1 , . . . , an ) := (sa1 , . . . , san ).
The point a + b is the translation of a by b (or of b by a) and sa is the dilation of a by a
factor of s. Define a − b := a + (−b).
metric structure.
Definition 2.5. The dot product on Rn is the function Rn × Rn → R given by
(a1 , . . . , an ) · (b1 , . . . , bn ) := a1 b1 + · · · + an bn .
The dot product is also called the inner product or scalar product. If a, b ∈ Rn , the dot
product is denoted by a · b, as above, or sometimes by (a, b) or ha, bi.
Definition 2.6. The norm or length of a vector a = (a1 , . . . , an ) ∈ Rn is
|a| := √(a · a) = √(a1² + · · · + an²).
The norm can also be denoted by ||a||.
Definition 2.7. The vector a ∈ Rn is a unit vector if |a| = 1.
Definition 2.8. Let p ∈ Rn and r ∈ R.
1. The open ball of radius r centered at p is the set
Br (p) := {a ∈ Rn | |a − p| < r}.
2. The closed ball of radius r centered at p is the set
B̄r (p) := {a ∈ Rn | |a − p| ≤ r}.
3. The sphere of radius r centered at p is the set
Sr (p) := {a ∈ Rn | |a − p| = r}.
Definition 2.9. The distance between a = (a1 , . . . , an ) and b = (b1 , . . . , bn ) in Rn is
d(a, b) := |a − b| = √((a1 − b1 )² + · · · + (an − bn )²).
Definition 2.10. Points a, b ∈ Rn are perpendicular or orthogonal if a · b = 0.
Definition 2.11. Suppose a, b are nonzero vectors in Rn . The angle between them is defined
to be cos⁻¹ ((a · b)/(|a||b|)).
Definition 2.12. Let a, b ∈ Rn with b ≠ ~0. The component of a along b is the scalar
c := (a · b)/(b · b) = (a · b)/|b|². The projection of a along b is the vector cb where c is the
component of a along b.
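The metric notions above translate directly into code. This Python sketch (an illustration of ours, with all names our own) implements the dot product, norm, angle, component, and projection of Definitions 2.5–2.12:

```python
import math

def dot(a, b):
    # dot product on R^n (Definition 2.5)
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    # norm of a vector (Definition 2.6)
    return math.sqrt(dot(a, a))

def angle(a, b):
    # angle between nonzero vectors (Definition 2.11)
    return math.acos(dot(a, b) / (norm(a) * norm(b)))

def component(a, b):
    # component of a along nonzero b (Definition 2.12)
    return dot(a, b) / dot(b, b)

def projection(a, b):
    # projection of a along b: the vector cb, c the component of a along b
    c = component(a, b)
    return [c * bi for bi in b]

a, b = (1.0, 2.0, 2.0), (3.0, 0.0, 0.0)
assert norm(a) == 3.0          # |(1, 2, 2)| = sqrt(9)
# projecting a along the x-axis keeps only its first coordinate
assert all(abs(p - q) < 1e-12 for p, q in zip(projection(a, b), [1.0, 0.0, 0.0]))
```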
affine subspaces.
Definition 2.13. A nonempty subset W ⊆ Rn is a linear subspace if it is closed under
vector addition and scalar multiplication. This means that: (i) if a, b ∈ W then a + b ∈ W ,
and (ii) if a ∈ W and s ∈ R, then sa ∈ W .
Definition 2.14. A vector v ∈ Rn is a linear combination of vectors v1 , . . . , vk ∈ Rn if there
are scalars a1 , . . . , ak ∈ R such that v = a1 v1 + · · · + ak vk .
Definition 2.15. A subspace W ⊆ Rn is spanned by a subset S ⊆ Rn if every element of
W can be written as a linear combination of elements of S. If W is spanned by S, we write
span(S) = W .
Definition 2.16. The dimension of a linear subspace W ⊆ Rn is the smallest number of
vectors needed to span W .
Definition 2.17. Let W be a subset of Rn and let p ∈ Rn . The set
p + W := {p + w | w ∈ W }
is called the translation of W by p. An affine subspace of Rn is any subset of the form p + W
where W is a linear subspace of Rn . In this case, the dimension of the affine subspace is
defined to be the dimension of W .
Definition 2.18. A k-plane in Rn is an affine subspace of dimension k. A line is a 1-plane,
and a hyperplane is an (n − 1)-plane.
affine functions.
Definition 2.19. A function L: Rn → Rm is a linear function (or transformation or map) if
it preserves vector addition and scalar multiplication. This means that for all a, b ∈ Rn and
for all s ∈ R,
1. L(a + b) = L(a) + L(b);
2. L(sa) = sL(a).
Definition 2.20. (Linear structure on the space of linear functions.) Let L and M be linear
functions with domain Rn and codomain Rm .
1. Define the linear function L + M : Rn → Rm by
(L + M )(v) := L(v) + M (v)
for all v ∈ Rn .
2. If s ∈ R, define the linear function sL: Rn → Rm by
(sL)(v) := sL(v)
for all v ∈ Rn .
Definition 2.21. A function f : Rn → Rm is an affine function (or transformation or map)
if it is the ‘translation’ of a linear function. This means that there is a linear function
L: Rn → Rm and a point p ∈ Rm such that f (v) = p + L(v) for all v ∈ Rn .
Definition 2.22. Let W be a k-dimensional affine subspace of Rn . A parametric equation
for W is any affine function f : Rk → Rn whose image is W .
Definition 2.23. An m × n matrix is a rectangular block of real numbers with m rows and
n columns. The real number appearing in the i-th row and j-th column is called the i, j-th
entry of the matrix. We write A = (aij ) for the matrix whose i, j-th entry is aij .
Definition 2.24. (Linear structure on matrices.) Let A = (aij ) and B = (bij ) be m × n
matrices. Define A + B := (aij + bij ). If s ∈ R, define sA := (saij ).
Definition 2.25. (Multiplication of matrices.) Let A = (aij ) be an m × k matrix, and let
B = (bij ) be a k × n matrix. Define the product AB to be the m × n matrix whose i, j-th
entry is ai1 b1j + · · · + aik bkj .
Definition 2.26. Let A = (aij ) be an m × n matrix. The linear function determined by (or
associated with) A is the function LA : Rn → Rm such that
LA (x1 , . . . , xn ) := (a11 x1 + · · · + a1n xn , . . . , am1 x1 + · · · + amn xn ).
Definition 2.27. Let L: Rn → Rm be a linear function. The matrix determined by (or
associated with) L is the m × n matrix whose i-th column is the image of the i-th standard
basis vector for Rn under L, i.e., L(ei ).
Definition 2.28. An n × n matrix, A, is invertible or nonsingular if there is an n × n matrix
B such that AB = In where In is the identity matrix whose entries consist of 1s along the
diagonal and 0s otherwise. In this case, B is called the inverse of A and denoted A−1 .
theorems
Theorem 2.1. Let a, b, c ∈ Rn and s, t ∈ R. Then
1. a + b = b + a.
2. (a + b) + c = a + (b + c).
3. ~0 + a = a + ~0 = a.
4. a + (−a) = (−a) + a = ~0.
5. 1a = a and (−1)a = −a.
6. (st)a = s(ta).
7. (s + t)a = sa + ta.
8. s(a + b) = sa + sb.
Theorem 2.2. Let a, b, c ∈ Rn and s ∈ R. Then
1. a · b = b · a.
2. a · (b + c) = a · b + a · c.
3. (sa) · b = s(a · b).
4. a · a ≥ 0.
5. a · a = 0 if and only if a = ~0.
Theorem 2.3. Let a, b ∈ Rn and s ∈ R. Then
1. |a| ≥ 0.
2. |a| = 0 if and only if a = ~0.
3. |sa| = |s||a|.
4. |a · b| ≤ |a||b| (Cauchy–Schwarz inequality).
5. |a + b| ≤ |a| + |b| (triangle inequality).
Theorem 2.4. Let a, b ∈ Rn be nonzero vectors. Then
−1 ≤ (a · b)/(|a||b|) ≤ 1.
This shows that our definition of angle makes sense.
Theorem 2.5. (Pythagorean theorem.) Let a, b ∈ Rn . If a and b are perpendicular, then
|a|2 + |b|2 = |a + b|2 .
Theorem 2.6. Any linear subspace of Rn is spanned by a finite subset.
Theorem 2.7. If a = (a1 , . . . , an ) 6= ~0 and p = (p1 , . . . , pn ) are elements of Rn , then
H := {x ∈ Rn | (x − p) · a = 0}
is a hyperplane. In other words, the set of solutions, (x1 , . . . , xn ), to the equation
a1 x1 + · · · + an xn = d, where d = a1 p1 + · · · + an pn , is a hyperplane. Conversely, every
hyperplane is the set
of solutions to an equation of this form.
Theorem 2.8. If L: Rn → Rm is a linear function and W ⊆ Rn is a linear subspace, then
L(W ) is a linear subspace of Rm .
Theorem 2.9. A linear map is determined by its action on the standard basis vectors. In
other words: if you know the images of the standard basis vectors, you know the image of
an arbitrary vector.
Theorem 2.10. The image of the linear map determined by a matrix is the span of the
columns of that matrix.
Theorem 2.11. Let W be a k-dimensional subspace of Rn spanned by vectors v1 , . . . , vk ,
and let p ∈ Rn . Then a parametric equation for the affine space p + W is
f : Rk → Rn
(a1 , . . . , ak ) ↦ p + a1 v1 + · · · + ak vk .
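As a sketch of Theorem 2.11 (an illustration of ours; the helper name is hypothetical), the parametric equation a ↦ p + a1 v1 + · · · + ak vk for a 2-plane in R3:

```python
def affine_param(p, vs):
    # Theorem 2.11: f(a1, ..., ak) = p + a1 v1 + ... + ak vk parametrizes
    # the affine subspace p + W, where W is spanned by vs = [v1, ..., vk].
    def f(*a):
        return [pi + sum(aj * v[i] for aj, v in zip(a, vs))
                for i, pi in enumerate(p)]
    return f

# the 2-plane in R^3 through p = (1, 0, 0) spanned by e2 and e3
f = affine_param([1, 0, 0], [[0, 1, 0], [0, 0, 1]])
assert f(0, 0) == [1, 0, 0]   # the parameters a = 0 give the base point p
assert f(2, 3) == [1, 2, 3]
```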
Theorem 2.12. Let L be a linear function and let A be the matrix determined by L. Then
the linear map determined by A is L. (The converse also holds, switching the roles of L
and A.)
Theorem 2.13. The linear structures on linear maps and on their associated matrices are
compatible: Let L and M be linear functions with associated matrices A and B, respectively,
and let s ∈ R. Then the matrix associated with L + M is A + B, and the matrix associated
with sL is sA.
Theorem 2.14. Let L: Rn → Rk and M : Rk → Rm be linear functions with associated
matrices A and B, respectively. Then the matrix associated with the composition, M ◦ L is
the product BA.
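Theorems 2.12–2.14 can be checked numerically. The sketch below (our code, with lists of lists standing in for matrices) verifies that composing linear maps corresponds to multiplying their matrices:

```python
def mat_vec(A, x):
    # apply the linear function L_A determined by the matrix A (Definition 2.26)
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

def mat_mul(X, Y):
    # matrix product XY (Definition 2.25): the i, j-th entry is sum_l x_il y_lj
    return [[sum(X[i][l] * Y[l][j] for l in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[1, 2], [3, 4], [5, 6]]    # 3 x 2, so L_A : R^2 -> R^3
B = [[1, 0, 1], [0, 1, 0]]      # 2 x 3, so L_B : R^3 -> R^2
x = [1, 1]
# Theorem 2.14: the matrix associated with L_B o L_A is the product BA
assert mat_vec(mat_mul(B, A), x) == mat_vec(B, mat_vec(A, x))
```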
III. Derivatives.
Definition 3.1. A subset U ⊆ Rn is open if for each u ∈ U there is a nonempty open ball
centered at u contained entirely in U : there exists a real number r > 0 such that Br (u) ⊆ U .
Definition 3.2. A point u ∈ Rn is a limit point of a subset S ⊆ Rn if every open ball
centered at u, Br (u), contains a point of S different from u.
Definition 3.3. Let f : S → Rm be a function with S ⊆ Rn . Let s be a limit point of S. The
limit of f (x) as x approaches s is v ∈ Rm if for all real numbers ε > 0, there is a real number
δ > 0 such that 0 < |x − s| < δ and x ∈ S ⇒ |f (x) − v| < ε. Notation: limx→s f (x) = v.
Definition 3.4. Let f : S → Rm with S ⊆ Rn , and let s ∈ S. The function f is continuous
at s ∈ S if for all real numbers ε > 0, there is a real number δ > 0 such that |x − s| < δ
and x ∈ S ⇒ |f (x) − f (s)| < ε. (Thus, f is continuous at a limit point s ∈ S if and only if
limx→s f (x) = f (s) and f is automatically continuous at all points in S which are not limit
points of S.) The function f is continuous on S if it is continuous at each point of S.
Definition 3.5. Let f : U → Rm with U an open subset of Rn , and let ei be the i-th standard
basis vector for Rn . The i-th partial of f at u ∈ U is the vector in Rm
∂f /∂xi (u) := limt→0 [f (u + tei ) − f (u)]/t,
provided this limit exists.
Definition 3.6. Let f : U → R with U an open subset of Rn . Let u ∈ U , and let v ∈ Rn be
a unit vector. The directional derivative of f at u in the direction of v is the real number
fv (u) := limt→0 [f (u + tv) − f (u)]/t,
provided this limit exists. The directional derivative of f at u in the direction of an arbitrary
nonzero vector w is defined to be the directional derivative of f at u in the direction of the
unit vector w/|w|.
Definition 3.7. Let f : U → Rm with U an open subset of Rn . Then f is differentiable at
u ∈ U if there is a linear function Dfu : Rn → Rm such that
limh→~0 |f (u + h) − f (u) − Dfu (h)|/|h| = 0.
The linear function Dfu is then called the derivative of f at u. The notation f ′(u) is
sometimes used instead of Dfu . The function f is differentiable on U if it is differentiable
at each point of U .
Definition 3.8. Let f : U → Rm with U an open subset of Rn . The Jacobian matrix of f at
u ∈ U is the m × n matrix of partial derivatives of the component functions of f :
             [ ∂f1 /∂x1 (u) · · · ∂f1 /∂xn (u) ]
Jf (u) :=    [      ...     ...        ...     ]    = (∂fi /∂xj (u)).
             [ ∂fm /∂x1 (u) · · · ∂fm /∂xn (u) ]
1. The i-th column of the Jacobian matrix is the i-th partial derivative of f at u and is
called the i-th principal tangent vector to f at u.
2. If n = 1, then f is a parametrized curve and the Jacobian matrix consists of a single
column. This column is the tangent vector to f at u or the velocity of f at u, and its
length is the speed of f at u. We write
f ′(u) = (f1′ (u), . . . , fm′ (u))
for this tangent vector.
3. If m = 1, the Jacobian matrix consists of a single row. This row is called the gradient
vector for f at u and denoted ∇f (u) or grad f (u):
∇f (u) := (∂f /∂x1 (u), . . . , ∂f /∂xn (u)).
theorems
Theorem 3.1. Let f : S → Rm and g: S → Rm where S is a subset of Rn .
1. The limit of a function is unique.
2. The limit, limx→s f (x), exists if and only if the corresponding limits for each of the
component functions, limx→s fi (x), exist. In that case,
limx→s f (x) = (limx→s f1 (x), . . . , limx→s fm (x)).
3. Define f +g: S → Rm by (f +g)(x) := f (x)+g(x). If limx→s f (x) = a and limx→s g(x) =
b, then limx→s (f + g)(x) = a + b. Similarly, if t ∈ R, define tf : S → Rm by (tf )(x) :=
t(f (x)). If limx→s f (x) = a, then limx→s (tf )(x) = ta.
4. If m = 1, define (f g)(x) := f (x)g(x) and (f /g)(x) := f (x)/g(x) (provided g(x) 6= 0).
If limx→s f (x) = a and limx→s g(x) = b, then limx→s (f g)(x) = ab and, if b 6= 0, then
limx→s (f /g)(x) = a/b.
5. If m = 1 and g(x) ≤ f (x) for all x, then limx→s g(x) ≤ limx→s f (x) provided these
limits exist.
Theorem 3.2. Let f : S → Rm and g: S → Rm where S is a subset of Rn .
1. The function f is continuous if and only if the inverse image of every open subset of
Rm under f is the intersection of an open subset of Rn with S.
2. The function f is continuous at s if and only if each of its component functions is
continuous at s.
3. The composition of continuous functions is continuous.
4. The functions f + g and tf for t ∈ R as above are continuous at s ∈ S provided f and
g are continuous at s.
5. If m = 1 and f and g are continuous at s ∈ S, then f g and f /g are continuous at s
(provided g(s) 6= 0 in the latter case).
6. A function whose coordinate functions are polynomials is continuous.
Theorem 3.3. If f : Rn → Rm is a linear transformation, then f is differentiable at each
p ∈ Rn , and Dfp = f .
Theorem 3.4. (The chain rule.) Let f : U → Rk and g: V → Rm where U is an open subset
of Rn and V is an open subset of Rk . Suppose that f (U ) ⊆ V so that we can form the
composition, g ◦ f : U → Rm . Suppose that f is differentiable at p ∈ U and g is differentiable
at f (p); then g ◦ f is differentiable at p, and
D(g ◦ f )p = Dgf (p) ◦ Dfp .
In terms of Jacobian matrices, we have,
J(g ◦ f )(p) = Jg(f (p))Jf (p).
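The matrix form of the chain rule can be checked numerically. The following sketch is our illustration (not from the notes); jacobian approximates Definition 3.8 by central differences:

```python
def jacobian(f, p, m, h=1e-6):
    # numerical Jacobian matrix Jf(p) (Definition 3.8) by central differences;
    # f maps R^n -> R^m, given as a function returning a list of length m
    n = len(p)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        pp, pm = list(p), list(p)
        pp[j] += h
        pm[j] -= h
        fp, fm = f(pp), f(pm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

f = lambda x: [x[0] * x[1], x[0] + x[1]]   # f : R^2 -> R^2
g = lambda y: [y[0] ** 2 + y[1]]           # g : R^2 -> R^1
p = [1.0, 2.0]
Jf = jacobian(f, p, 2)
Jg = jacobian(g, f(p), 1)
Jgf = jacobian(lambda x: g(f(x)), p, 1)
# chain rule (Theorem 3.4): J(g o f)(p) = Jg(f(p)) Jf(p)
prod = [[sum(Jg[i][l] * Jf[l][j] for l in range(2)) for j in range(2)]
        for i in range(1)]
assert all(abs(Jgf[i][j] - prod[i][j]) < 1e-4 for i in range(1) for j in range(2))
```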
Theorem 3.5. Let f : U → Rm where U is an open subset of Rn . Then f is differentiable at
p ∈ U if and only if each component function fi : U → R is differentiable, and in that case,
Dfp (v) = (Df1p (v), . . . , Dfmp (v)) for all v ∈ Rn .
Theorem 3.6. Let f : U → R where U is an open subset of Rn . If the directional derivative
of f at u ∈ U in the direction of the unit vector v ∈ Rn exists, it is equal to the dot product
∇f (u) · v.
Theorem 3.7. Let f : U → R be a differentiable function on an open subset U ⊆ Rn . The
gradient vector, ∇f (u), of f at u ∈ U points in the direction of quickest increase of f and
its magnitude gives the rate of increase of f in that direction.
Theorem 3.8. Let f : U → R be a differentiable function on an open subset U ⊆ Rn . The
gradient vector, ∇f (u) of f at u is perpendicular to the level set of f through u, i.e., to
f −1 (f (u)). More precisely, let h: I → U be a differentiable function on an open interval
I ⊆ R containing the origin with h(0) = u. Suppose that f ◦ h is constant, i.e., the image of
h lies in the level set through u. Then the gradient of f at u is perpendicular to the tangent
to h at 0: h′(0) · ∇f (u) = 0.
Theorem 3.9. Let f : U → Rm where U is an open subset of Rn . Suppose that the partial
derivatives of each of the component functions of f exist at a ∈ U . Then each partial
derivative of f exists at a and
∂f /∂xi (a) = (∂f1 /∂xi (a), . . . , ∂fm /∂xi (a)).
Theorem 3.10. Let f : U → Rm where U is an open subset of Rn , and let u ∈ U . The
second partial derivatives, ∂ 2 f (u)/∂xi ∂xj and ∂ 2 f (u)/∂xj ∂xi are equal if they exist and are
continuous.
Theorem 3.11. Let f : U → Rm where U is an open subset of Rn . If f is differentiable at
u ∈ U , then each of the first partial derivatives of each of the component functions exists
and Dfu is the linear map determined by the Jacobian matrix, Jf (u).
Theorem 3.12. Let f : U → Rm where U is an open subset of Rn . If each of the first partial
derivatives of each of the component functions of f exists at u ∈ U and is continuous, then
f is differentiable at u. In this case, f is said to be continuously differentiable at u.
Theorem 3.13. (The inverse function theorem.) Let f : U → Rn be a function with continuous partial derivatives on the open set U ⊆ Rn . Suppose that the Jacobian matrix, Jf (u),
is invertible at some point u ∈ U . Then there is an open subset V ⊆ U containing u such
that f is injective when restricted to V and its inverse (defined on f (V )) is differentiable
with Jacobian matrix Jf (u)−1 .
Theorem 3.14. (The implicit function theorem.) Suppose f : Rn × Rm → Rm is a function
with continuous partial derivatives. Suppose that the m × m matrix (∂fi /∂xn+j (u)), where
1 ≤ i, j ≤ m, is invertible for some u = (u1 , u2 ) ∈ Rn × Rm . Then there is an open
set U ⊆ Rn containing u1 and an open set V ⊆ Rm containing u2 such that for each x ∈ U ,
there is a unique g(x) ∈ V with f (x, g(x)) = 0. The function g is differentiable.
Theorem 3.15. (The rank theorem.) Suppose f : U → V is a smooth mapping and that f
has constant rank k in some open neighborhood of p ∈ U . Then there exist open sets Ũ ⊆ U
containing p and Ṽ ⊆ V containing f (p) along with diffeomorphisms φ : Ũ → U ′ and
ψ : Ṽ → V ′ onto open subsets U ′ ⊆ Rn and V ′ ⊆ Rm such that
ψ ◦ f ◦ φ−1 (x1 , . . . , xn ) = (x1 , . . . , xk , 0, . . . , 0).
IV. Taylor’s Theorem.
Definition 4.1. Let f : U → Rm with U an open subset of Rn , and let k be a nonnegative
integer. A partial derivative of order k of f at u ∈ U is defined recursively as follows: (i) the
zeroth-order partial derivative is f (u); (ii) a k-th order partial derivative, with k > 0 is any
partial derivative of a (k − 1)-th partial derivative.
Definition 4.2. Let f : U → Rm with U an open subset of Rn . Suppose that all partial
derivatives of f of order less than or equal to k exist and are continuous. The Taylor
polynomial of order k for f at u ∈ U is
Puk f (x1 , . . . , xn ) := Σ_{i1 +···+in ≤ k} [1/(i1 ! · · · in !)] (∂^(i1 +···+in) f /∂x1^i1 · · · ∂xn^in )(u) (x1 − u1 )^i1 · · · (xn − un )^in .
If the partial derivatives of every order exist and are continuous at u, one defines the Taylor
series for f at u, Pu f , by replacing k by ∞ in the above displayed equation.
theorems
Theorem 4.1. (Taylor’s theorem in one variable.) Let f : S → R with S ⊆ R. Suppose
that S contains an open interval containing the closed interval [a, b]. Also suppose that all
the derivatives up to order k exist and are continuous on [a, b] and the (k + 1)-th derivative
exists on (a, b). Let x, y ∈ [a, b]. Then there exists a number c between x and y such that
f (x) = Pyk f (x) + [1/(k + 1)!] (d^(k+1) f /dx^(k+1) )(c) (x − y)^(k+1)
where Pyk f is the k-th order Taylor polynomial for f at y.
Theorem 4.2. (Taylor’s theorem in several variables.) Let f : U → Rm where U is an open
subset of Rn . Suppose that the partial derivatives of f up to order k exist and are continuous
on U and that the partial derivatives of order k + 1 exist on U . Let u, x ∈ U and suppose
that the line segment u + t(x − u), 0 ≤ t ≤ 1, is contained in U . Then there exists a number c
between 0 and 1 such that
f (x) = Puk f (x) + r(c(x − u))
where Puk f is the k-th order Taylor polynomial for f at u and r: U → Rm is a function such
that limv→~0 r(v)/|v|^(k+1) = 0.
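A one-variable sketch of Theorem 4.1 (our illustration, using f = exp at y = 0, all of whose derivative values are 1): the remainder is bounded by the (k + 1)-th derivative term.

```python
import math

def taylor_poly_1d(derivs, y, k):
    # k-th order Taylor polynomial at y, given the list of derivative values
    # derivs = [f(y), f'(y), f''(y), ...] (Definition 4.2 with n = 1)
    def P(x):
        return sum(derivs[i] * (x - y) ** i / math.factorial(i)
                   for i in range(k + 1))
    return P

# exp is its own derivative, so every derivative value at y = 0 equals 1
P5 = taylor_poly_1d([1.0] * 6, 0.0, 5)
# Theorem 4.1 bounds the remainder by the (k+1)-th derivative term:
# |exp(x) - P5(x)| <= e^c x^6 / 6! for some c between 0 and x.
err = abs(math.exp(0.1) - P5(0.1))
assert err <= math.exp(0.1) * 0.1 ** 6 / math.factorial(6)
```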
V. Differential geometry.
Definition 5.1. The tangent space to Rn at u ∈ Rn is Rn labeled with the point u; more
formally, it is the Cartesian product {u} × Rn . We denote the tangent space at u by Rnu .
Definition 5.2. If f : U → Rm with U an open subset of Rn is differentiable at u ∈ U , the
tangent map for f at u is defined to be the derivative of f at u, thought of as a mapping
between tangent spaces, Dfu : Rnu → Rmf (u) .
Definition 5.3. Let f : Rn → Rm be a differentiable function. The first fundamental form
for f at p ∈ Rn is the function ⟨ , ⟩p : Tp Rn × Tp Rn → R given by
⟨u, v⟩p := Dfp (u) · Dfp (v).
Given the first fundamental form, define
1. The length of u ∈ Tp Rn is |u|p := √⟨u, u⟩p .
2. Vectors u, v ∈ Tp Rn are perpendicular if ⟨u, v⟩p = 0.
3. The angle between nonzero u, v ∈ Tp Rn is arccos(⟨u, v⟩p /(|u|p |v|p )).
4. The component of u ∈ Tp Rn along nonzero v ∈ Tp Rn is ⟨u, v⟩p /⟨v, v⟩p .
Definition 5.4. Let f : Rn → Rm be a differentiable function. The first fundamental form
matrix is the n × n symmetric matrix, I, with i, j-th entry fxi · fxj :
I = (∂f /∂xi · ∂f /∂xj ).
Definition 5.5. A pseudo-metric on Rn is a symmetric n × n matrix I whose entries are
real-valued functions on Rn . Given any such I, define for each p ∈ Rn and for all u, v ∈ Tp Rn
⟨u, v⟩p := uT I(p) v.
The matrix I is called a metric on Rn if ⟨ , ⟩p is positive definite for each p ∈ Rn , i.e., for all
u ∈ Tp Rn , ⟨u, u⟩p ≥ 0 with equality exactly when u = ~0. Given the form ⟨ , ⟩p , one defines
lengths, distances, angles, and components as with the first fundamental form.
Definition 5.6. Let c : [a, b] → Rn be a parametrized curve. The length of c is
length(c) := ∫_a^b |c′(t)| dt.
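Definition 5.6 can be approximated by summing the chord lengths of a fine polygonal approximation of the curve. This Python sketch (our code) recovers the circumference 2π of the unit circle:

```python
import math

def curve_length(c, a, b, steps=10000):
    # approximate length(c), the integral of |c'(t)| over [a, b] (Definition 5.6),
    # by the total length of a fine inscribed polygon
    total = 0.0
    prev = c(a)
    for k in range(1, steps + 1):
        t = a + (b - a) * k / steps
        pt = c(t)
        total += math.dist(prev, pt)
        prev = pt
    return total

# the unit circle, parametrized once around: length should be 2*pi
circle = lambda t: (math.cos(t), math.sin(t))
L = curve_length(circle, 0.0, 2 * math.pi)
assert abs(L - 2 * math.pi) < 1e-4
```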
Definition 5.7. Let c : [a, b] → Rn be a parametrized curve. Suppose α : [a′, b′] → [a, b] is a
function such that α′ is continuous, α′(t) > 0 for all t, α(a′) = a, and α(b′) = b. Then the
curve c̃ = c ◦ α : [a′, b′] → Rn is called a reparametrization of c.
Definition 5.8. Let u : [a, b] → Rn be a parametrized curve, and let I be a pseudo-metric
on Rn . The length of u with respect to I is
length(u) := ∫_a^b |u′(t)|u(t) dt,
where the length of the vector u′(t) is its length with respect to I as an element of the
tangent space Tu(t) Rn .
Definition 5.9. Let u : [a, b] → Rn and f : Rn → Rm , so c = f ◦ u is a parametrized curve
on the image of f . Then c is a geodesic if
c″(t) · fxi (u(t)) = 0
for i = 1, . . . , n.
Definition 5.10. Fix a metric I, and denote its i, j-th entry by gij . Since I is positive definite,
it turns out that it must be an invertible matrix, and we denote the entries of the inverse by
g^ij :
I = (gij ),    I^−1 = (g^ij ).
For each i, j, ℓ ∈ {1, . . . , n} define a Christoffel symbol
Γ^ℓ_ij := (1/2) Σ_{k=1}^{n} (∂gjk /∂xi − ∂gij /∂xk + ∂gki /∂xj ) g^kℓ .
Let x(t) = (x1 (t), . . . , xn (t)) be a curve in Rn . Then x(t) is a geodesic if it satisfies the
system of differential equations
ẍk + Σ_{i,j} Γ^k_ij ẋi ẋj = 0,    k = 1, . . . , n.
In the case n = 2 we can write
I = [ E F ; F G ],    I^−1 = (1/∆) [ G −F ; −F E ],
where ∆ = det I = EG − F ². In this case, the Christoffel symbols are:
Γ^1_11 = (GEx − 2F Fx + F Ey )/(2∆)    Γ^2_11 = (2EFx − EEy − F Ex )/(2∆)
Γ^1_12 = (GEy − F Gx )/(2∆)            Γ^2_12 = (EGx − F Ey )/(2∆)
Γ^1_22 = (2GFy − GGx − F Gy )/(2∆)     Γ^2_22 = (EGy − 2F Fy + F Gx )/(2∆),
and the equations for a geodesic are:
ẍ1 + ẋ1² Γ^1_11 + 2ẋ1 ẋ2 Γ^1_12 + ẋ2² Γ^1_22 = 0
ẍ2 + ẋ1² Γ^2_11 + 2ẋ1 ẋ2 Γ^2_12 + ẋ2² Γ^2_22 = 0.
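As a sketch of the n = 2 geodesic equations (our code, for the plane parametrized in polar coordinates, where E = 1, F = 0, G = r²; the only nonzero Christoffel symbols are then Γ^1_22 = −r and Γ^2_12 = 1/r), a numerically integrated geodesic traces the straight line x = 1:

```python
import math

def polar_geodesic(state, t_end, steps=20000):
    # Integrate the n = 2 geodesic equations for the metric E = 1, F = 0,
    # G = r^2 of the plane in polar coordinates, f(r, th) = (r cos th, r sin th).
    # With Gamma^1_22 = -r and Gamma^2_12 = 1/r the system is
    #   r'' - r (th')^2 = 0,   th'' + (2/r) r' th' = 0.
    # state = (r, th, rdot, thdot); midpoint (RK2) time stepping.
    def deriv(s):
        r, th, rd, thd = s
        return (rd, thd, r * thd ** 2, -2.0 * rd * thd / r)

    h = t_end / steps
    s = state
    for _ in range(steps):
        k1 = deriv(s)
        mid = tuple(si + 0.5 * h * ki for si, ki in zip(s, k1))
        k2 = deriv(mid)
        s = tuple(si + h * ki for si, ki in zip(s, k2))
    return s

# Start at r = 1, th = 0 with rdot = 0, thdot = 1.  In Cartesian coordinates
# this is the straight line x = 1 traversed as (1, t), so at time t = 1 the
# geodesic should reach r = sqrt(2), th = pi/4.
r, th, _, _ = polar_geodesic((1.0, 0.0, 0.0, 1.0), 1.0)
assert abs(r - math.sqrt(2.0)) < 1e-4
assert abs(th - math.pi / 4) < 1e-4
```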
theorems
Theorem 5.1. Let f : Rn → Rm be a differentiable function, and let p ∈ Rn . For u, v ∈ Tp Rn ,
⟨u, v⟩p = uT I(p) v,
where I(p) is the n × n matrix whose i, j-th entry is fxi (p) · fxj (p).
Theorem 5.2. The first fundamental form is a symmetric, positive definite form:
1. ⟨x, y⟩u = ⟨y, x⟩u for all x, y ∈ R2u .
2. ⟨x + y, z⟩u = ⟨x, z⟩u + ⟨y, z⟩u for all x, y, z ∈ R2u .
3. ⟨sx, y⟩u = s⟨x, y⟩u for all x, y ∈ R2u and s ∈ R.
4. ⟨x, x⟩u ≥ 0 for all x ∈ R2u , and ⟨x, x⟩u = 0 if and only if x = ~0.
Theorem 5.3. Consider Rn with metric I. For each u, v ∈ Rn , there is an ε > 0 and a
unique geodesic h: (−ε, ε) → Rn with h(0) = u and h′(0) = v.
Theorem 5.4. Geodesics give, locally, the shortest distance between two points on a surface:
Consider Rn with metric I, and let u ∈ Rn . Then there is a ball of radius r centered at u,
Br (u), such that for any v ∈ Br (u), the geodesic joining u to v is shorter than any other
curve joining u to v.
VI. Applications.
VI.1. Best affine approximations.
Definition 6.1.1. Let f : U → Rm be a differentiable function on an open set U ⊆ Rn . The
best affine approximation to f at u ∈ U is the affine function
T fu : Rn → Rm
x 7→ f (u) + Dfu (x − u)
where Dfu is the derivative of f at u.
Definition 6.1.2. With notation as in Definition 6.1.1, the image of T fu is called the
(embedded) tangent space to f at u.
theorems
Theorem 6.1.1. Let f : U → Rm be a differentiable function on an open set U ⊆ Rn , and
let u ∈ U . If A: Rn → Rm is an affine function with A(u) = f (u) and
limx→u |f (x) − A(x)|/|x − u| = 0,
then A is T fu , the best affine approximation to f at u.
Theorem 6.1.2. Let f : U → Rm with U an open subset of Rn , and let h: V → U where
V is an open subset of R. Let c := f ◦ h be the corresponding parametrized curve on the
surface f . Let u ∈ U , and suppose that c(0) = f (u). Then f (u) + c′(0) is contained in the
embedded tangent space to f at u. Conversely, each element of the embedded tangent space
to f at u can be written as f (u) + c′(0) for some parametrized curve c on f with c(0) = f (u).
VI.2. Optimization.
Definition 6.2.1. The set S ⊆ Rn is closed if its complement is open, i.e., if for every x not
in S, there is a nonempty open ball centered at x which does not intersect S.
Definition 6.2.2. The set S ⊂ Rn is bounded if it is contained in some open ball centered
at the origin, i.e., if there is a real number r > 0 such that |s| < r for all s ∈ S.
Definition 6.2.3. A point s ∈ S ⊆ Rn is in the interior of S if there is a nonempty open
ball centered at s contained entirely in S. Otherwise, s is on the boundary of S.
Definition 6.2.4. Let f : S → R where S ⊂ Rn .
1. s ∈ S is a (global) maximum for f if f (s) ≥ f (s0 ) for all s0 ∈ S.
2. s ∈ S is a (global) minimum for f if f (s) ≤ f (s0 ) for all s0 ∈ S.
3. s ∈ S is a local maximum for f if s is an interior point and f (s) ≥ f (s0 ) for all s0 in
some nonempty open ball centered at s.
4. s ∈ S is a local minimum for f if s is an interior point and f (s) ≤ f (s0 ) for all s0 in
some nonempty open ball centered at s.
5. A global extremum for f is a global maximum or global minimum for f . A local
extremum for f is a local maximum or local minimum for f .
Definition 6.2.5. Let f : S → R where S ⊆ Rn . A point s ∈ S is a critical or stationary
point of f if it is an interior point, f is differentiable at s, and ∇f (s) = ~0, i.e., all the partial
derivatives of f vanish at s.
Definition 6.2.6. A critical point which is not a local extremum is called a saddle point.
Definition 6.2.7. Let f : U → R be a function on an open set U ⊆ Rn with continuous
partial derivatives up to order 3. Let u ∈ U be a critical point of f . The quadratic form for
f at u is defined by Qu f (x) := Pu2 f (x + u), where Pu2 is the second order Taylor polynomial
for f at u. Hence,
Qu f (x1 , . . . , xn ) := Σ_{i1 +···+in = 2} [1/(i1 ! · · · in !)] (∂² f /∂x1^i1 · · · ∂xn^in )(u) x1^i1 · · · xn^in .
theorems
Theorem 6.2.1. Let S ⊂ Rn be a closed and bounded set. Let f : S → R be a continuous
function. Then f has a maximum and a minimum in S.
Theorem 6.2.2. Let f : U → R be a differentiable function on an open set U ⊆ Rn . If
u ∈ U is a local extremum for f , then u is a critical point for f .
Theorem 6.2.3. Let f : U → R be a function on an open set U ⊆ Rn with continuous
partial derivatives up to order 3. Suppose the quadratic form, Qu f , for f at a critical point
u is nonzero. Then u is a local minimum, a local maximum, or a saddle point according as
Qu f is a local minimum, a local maximum, or a saddle at ~0.
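For n = 2, Theorem 6.2.3 specializes to the familiar second derivative test. The sketch below (our code; it classifies the quadratic form through the Hessian matrix H of second partials, since the quadratic form is (1/2) xT Hx) is an illustration, not part of the notes:

```python
def classify_critical_point_2d(H):
    # Classify a critical point of f : R^2 -> R from its Hessian
    # H = [[fxx, fxy], [fxy, fyy]] via the sign of det H and of fxx
    # (the standard second derivative test, a special case of Theorem 6.2.3).
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    if det < 0:
        return "saddle point"
    if det > 0:
        return "local minimum" if H[0][0] > 0 else "local maximum"
    return "inconclusive"   # the quadratic form is degenerate

# f(x, y) = x^2 - y^2 has a saddle at the origin; x^2 + y^2 a minimum
assert classify_critical_point_2d([[2, 0], [0, -2]]) == "saddle point"
assert classify_critical_point_2d([[2, 0], [0, 2]]) == "local minimum"
```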
VI.3. Lagrange multipliers.
Theorem 6.3.1. Let f and g1 , . . . , gk be k + 1 differentiable, real-valued functions on an
open subset U ⊆ Rn . Let u ∈ U . Suppose that if there are constants a0 , a1 , . . . , ak for which
a0 ∇f (u) + a1 ∇g1 (u) + · · · + ak ∇gk (u) = ~0
then a0 = a1 = · · · = ak = 0. Then there are points u1 and u2 arbitrarily close to u such
that gi (u) = gi (u1 ) = gi (u2 ) for i = 1, . . . , k and f (u1 ) < f (u) < f (u2 ).
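A numerical sketch of Theorem 6.3.1 (our illustration: maximize f (x, y) = x + y subject to the constraint g(x, y) = x² + y² − 1 = 0; at a constrained extremum the gradients of f and g must be linearly dependent, i.e., parallel where ∇g ≠ ~0):

```python
import math

f = lambda x, y: x + y
grad_f = lambda x, y: (1.0, 1.0)
grad_g = lambda x, y: (2.0 * x, 2.0 * y)   # g(x, y) = x^2 + y^2 - 1

# brute-force search over the constraint set by sampling the unit circle
best_k = max(range(100000),
             key=lambda k: f(math.cos(2 * math.pi * k / 100000),
                             math.sin(2 * math.pi * k / 100000)))
t = 2 * math.pi * best_k / 100000
x, y = math.cos(t), math.sin(t)
gf, gg = grad_f(x, y), grad_g(x, y)
# at the maximizer (1/sqrt 2, 1/sqrt 2) the two gradients are parallel:
assert abs(gf[0] * gg[1] - gf[1] * gg[0]) < 1e-3
assert abs(x - math.sqrt(0.5)) < 1e-3 and abs(y - math.sqrt(0.5)) < 1e-3
```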
VI.4. Conservation of energy.
Definition 6.4.1. Let f : U → Rn be a differentiable function on an open set U ⊆ Rn (i.e.,
f is a differentiable vector field on U ). Then f is conservative if there is a differentiable
function φ: U → R such that f = grad φ. In that case, we call ψ := −φ a potential (energy)
function for f (so, in that case, f = −grad ψ).
Definition 6.4.2. Let f : U → Rn be a differentiable function on an open set U ⊆ Rn , and
let h: I → U be a function on an open subset I ⊆ R whose second derivatives exist. Suppose
there is a constant m such that f (h(t)) = mh″(t). Then h is said to satisfy Newton’s Law.
Definition 6.4.3. With notation as in definition 6.4.2, define the kinetic energy of h to be
(1/2) m|h′(t)|².
theorems
Theorem 6.4.1. With notation as in definition 6.4.2, the sum of the potential and the
kinetic energy is constant.
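A sketch of Theorem 6.4.1 (our illustration, for a constant force field on R with potential ψ(x) = m g x, so f = −grad ψ is conservative and h below satisfies Newton's Law): the total energy along h is constant.

```python
# constant gravitational-type field on R^1: f(x) = -m*g_const = -grad(psi),
# with potential energy function psi(x) = m*g_const*x (Definition 6.4.1)
m, g_const = 2.0, 9.8

def energy(x, v):
    # kinetic energy (Definition 6.4.3) plus potential energy
    return 0.5 * m * v ** 2 + m * g_const * x

# h(t) = x0 + v0*t - (1/2)*g_const*t^2 satisfies m*h''(t) = f(h(t))
x0, v0 = 10.0, 3.0
h = lambda t: x0 + v0 * t - 0.5 * g_const * t ** 2
hdot = lambda t: v0 - g_const * t

E0 = energy(h(0.0), hdot(0.0))
# Theorem 6.4.1: the energy is the same at every sampled time
assert all(abs(energy(h(t / 10), hdot(t / 10)) - E0) < 1e-9 for t in range(11))
```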