Math 211 Course Summary

table of contents
I. Functions of several variables.
II. R^n.
III. Derivatives.
IV. Taylor's Theorem.
V. Differential Geometry.
VI. Applications.
  1. Best affine approximations.
  2. Optimization.
  3. Lagrange multipliers.
  4. Conservation of energy.

I. Functions of several variables.

Definition 1.1. Let S and T be sets. The Cartesian product of S and T is the set of ordered pairs: S × T := {(s, t) | s ∈ S, t ∈ T}.

Definition 1.2. Let S and T be sets. A function from S to T is a subset W of the Cartesian product S × T such that: (i) for each s ∈ S there is an element in W whose first component is s, i.e., there is an element (s, t) ∈ W for some t ∈ T; and (ii) if (s, t) and (s, t′) are in W, then t = t′. Notation: if (s, t) ∈ W, we write f(s) = t. The subset W, which is by definition the function f, is also called the graph of f.

Definition 1.3. Let f : S → T be a function between sets S and T.
1. f is one-to-one or injective if f(x) = f(y) only if x = y.
2. The image or range of f is {f(s) ∈ T | s ∈ S}. The image will be denoted by im(f) or f(S).
3. f is onto if im(f) = T.
4. The domain of f is S and the codomain of f is T.
5. The inverse image of t ∈ T is f^{-1}(t) := {s ∈ S | f(s) = t}.

Definition 1.4. Let f : S → T and g : T → U. The composition of f and g is the function g ∘ f : S → U given by (g ∘ f)(s) := g(f(s)).

Definition 1.5. R^n is the Cartesian product of R with itself n times. We think of R^n as the set of ordered n-tuples of real numbers: R^n := {(a1, ..., an) | ai ∈ R, 1 ≤ i ≤ n}. The elements of R^n are called points or vectors.

Definition 1.6. A function of several variables is a function of the form f : S → R^m where S ⊆ R^n. Writing f(x) = (f1(x), ..., fm(x)), the function fi : S → R, for each i = 1, ..., m, is called the i-th component function of f.

Definition 1.7. Let f be a function of several variables, f : S → R^m, with S ⊆ R^n.
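The function machinery in Definitions 1.2–1.4 is easy to experiment with on finite sets. A minimal sketch in plain Python (the helper names `compose` and `inverse_image` are my own, not from the course):

```python
def compose(g, f):
    # The composition g ∘ f (Definition 1.4): (g ∘ f)(s) = g(f(s)).
    return lambda s: g(f(s))

def inverse_image(f, domain, t):
    # The inverse image f^{-1}(t) (Definition 1.3, part 5),
    # computed for a finite domain S.
    return {s for s in domain if f(s) == t}

f = lambda x: x * x      # f: S -> T with f(x) = x^2
g = lambda y: y + 1      # g: T -> U with g(y) = y + 1
print(compose(g, f)(3))  # g(f(3)) = 10

S = {-2, -1, 0, 1, 2}
print(sorted(inverse_image(f, S, 4)))  # [-2, 2]
```

Note that `inverse_image` returns a set of points, matching the definition: f need not be injective, so f^{-1}(t) can contain more than one element.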
If n = 1, then f is a parametrized curve; if n = 2, then f is a parametrized surface. In general, we say f is a parametrized n-surface.

Definition 1.8. A vector field is a function of the form f : S → R^m where S ⊆ R^m.

Definition 1.9. If f : S → R with S ⊆ R^n, a level set of f is the inverse image of a point in R. A drawing showing several level sets is called a contour diagram for f.

II. R^n.

linear structure.

Definition 2.1. The i-th coordinate of a = (a1, ..., an) ∈ R^n is ai. For i = 1, ..., n, define the i-th standard basis vector for R^n to be the vector ei whose coordinates are all zero except the i-th coordinate, which is 1.

Definition 2.2. The additive inverse of a = (a1, ..., an) ∈ R^n is the vector −a := (−a1, ..., −an).

Definition 2.3. In R^n, define ~0 := (0, ..., 0), the vector whose coordinates are all 0.

Definition 2.4. (Linear structure on R^n.) If a = (a1, ..., an) and b = (b1, ..., bn) are points in R^n and s ∈ R, define
a + b = (a1, ..., an) + (b1, ..., bn) := (a1 + b1, ..., an + bn)
sa = s(a1, ..., an) := (sa1, ..., san).
The point a + b is the translation of a by b (or of b by a), and sa is the dilation of a by a factor of s. Define a − b := a + (−b).

metric structure.

Definition 2.5. The dot product on R^n is the function R^n × R^n → R given by
(a1, ..., an) · (b1, ..., bn) := Σ_{i=1}^n ai bi.
The dot product is also called the inner product or scalar product. If a, b ∈ R^n, the dot product is denoted by a · b, as above, or sometimes by (a, b) or ⟨a, b⟩.

Definition 2.6. The norm or length of a vector a = (a1, ..., an) ∈ R^n is
|a| := √(a · a) = √(Σ_{i=1}^n ai²).
The norm can also be denoted by ||a||.

Definition 2.7. The vector a ∈ R^n is a unit vector if |a| = 1.

Definition 2.8. Let p ∈ R^n and r ∈ R.
1. The open ball of radius r centered at p is the set Br(p) := {a ∈ R^n | |a − p| < r}.
2. The closed ball of radius r centered at p is the set B̄r(p) := {a ∈ R^n | |a − p| ≤ r}.
3. The sphere of radius r centered at p is the set Sr(p) := {a ∈ R^n | |a − p| = r}.

Definition 2.9. The distance between a = (a1, ..., an) and b = (b1, ..., bn) in R^n is
d(a, b) := |a − b| = √(Σ_{i=1}^n (ai − bi)²).

Definition 2.10. Points a, b ∈ R^n are perpendicular or orthogonal if a · b = 0.

Definition 2.11. Suppose a, b are nonzero vectors in R^n. The angle between them is defined to be cos^{-1}( (a · b) / (|a||b|) ).

Definition 2.12. Let a, b ∈ R^n with b ≠ ~0. The component of a along b is the scalar c := (a · b)/(b · b) = (a · b)/|b|². The projection of a along b is the vector cb where c is the component of a along b.

affine subspaces.

Definition 2.13. A nonempty subset W ⊆ R^n is a linear subspace if it is closed under vector addition and scalar multiplication. This means that: (i) if a, b ∈ W then a + b ∈ W, and (ii) if a ∈ W and s ∈ R, then sa ∈ W.

Definition 2.14. A vector v ∈ R^n is a linear combination of vectors v1, ..., vk ∈ R^n if there are scalars a1, ..., ak ∈ R such that v = Σ_{i=1}^k ai vi.

Definition 2.15. A subspace W ⊆ R^n is spanned by a subset S ⊆ R^n if every element of W can be written as a linear combination of elements of S. If W is spanned by S, we write span(S) = W.

Definition 2.16. The dimension of a linear subspace W ⊆ R^n is the smallest number of vectors needed to span W.

Definition 2.17. Let W be a subset of R^n and let p ∈ R^n. The set p + W := {p + w | w ∈ W} is called the translation of W by p. An affine subspace of R^n is any subset of the form p + W where W is a linear subspace of R^n. In this case, the dimension of the affine subspace is defined to be the dimension of W.

Definition 2.18. A k-plane in R^n is an affine subspace of dimension k. A line is a 1-plane, and a hyperplane is an (n − 1)-plane.

affine functions.

Definition 2.19. A function L : R^n → R^m is a linear function (or transformation or map) if it preserves vector addition and scalar multiplication. This means that for all a, b ∈ R^n and for all s ∈ R,
1.
L(a + b) = L(a) + L(b);
2. L(sa) = sL(a).

Definition 2.20. (Linear structure on the space of linear functions.) Let L and M be linear functions with domain R^n and codomain R^m.
1. Define the linear function L + M : R^n → R^m by (L + M)(v) := L(v) + M(v) for all v ∈ R^n.
2. If s ∈ R, define the linear function sL : R^n → R^m by (sL)(v) := sL(v) for all v ∈ R^n.

Definition 2.21. A function f : R^n → R^m is an affine function (or transformation or map) if it is the 'translation' of a linear function. This means that there is a linear function L : R^n → R^m and a point p ∈ R^m such that f(v) = p + L(v) for all v ∈ R^n.

Definition 2.22. Let W be a k-dimensional affine subspace of R^n. A parametric equation for W is any affine function f : R^k → R^n whose image is W.

Definition 2.23. An m × n matrix is a rectangular block of real numbers with m rows and n columns. The real number appearing in the i-th row and j-th column is called the i, j-th entry of the matrix. We write A = (aij) for the matrix whose i, j-th entry is aij.

Definition 2.24. (Linear structure on matrices.) Let A = (aij) and B = (bij) be m × n matrices. Define A + B := (aij + bij). If s ∈ R, define sA := (s aij).

Definition 2.25. (Multiplication of matrices.) Let A = (aij) be an m × k matrix, and let B = (bij) be a k × n matrix. Define the product AB to be the m × n matrix whose i, j-th entry is Σ_{ℓ=1}^k aiℓ bℓj.

Definition 2.26. Let A = (aij) be an m × n matrix. The linear function determined by (or associated with) A is the function LA : R^n → R^m such that
LA(x1, ..., xn) = ( Σ_{j=1}^n a1j xj, ..., Σ_{j=1}^n amj xj ).

Definition 2.27. Let L : R^n → R^m be a linear function. The matrix determined by (or associated with) L is the m × n matrix whose i-th column is the image of the i-th standard basis vector for R^n under L, i.e., L(ei).

Definition 2.28.
An n × n matrix A is invertible or nonsingular if there is an n × n matrix B such that AB = In, where In is the identity matrix, whose entries consist of 1s along the diagonal and 0s otherwise. In this case, B is called the inverse of A and denoted A^{-1}.

theorems

Theorem 2.1. Let a, b, c ∈ R^n and s, t ∈ R. Then
1. a + b = b + a.
2. (a + b) + c = a + (b + c).
3. ~0 + a = a + ~0 = a.
4. a + (−a) = (−a) + a = ~0.
5. 1a = a and (−1)a = −a.
6. (st)a = s(ta).
7. (s + t)a = sa + ta.
8. s(a + b) = sa + sb.

Theorem 2.2. Let a, b, c ∈ R^n and s ∈ R. Then
1. a · b = b · a.
2. a · (b + c) = a · b + a · c.
3. (sa) · b = s(a · b).
4. a · a ≥ 0.
5. a · a = 0 if and only if a = ~0.

Theorem 2.3. Let a, b ∈ R^n and s ∈ R. Then
1. |a| ≥ 0.
2. |a| = 0 if and only if a = ~0.
3. |sa| = |s||a|.
4. |a · b| ≤ |a||b| (Cauchy–Schwarz inequality).
5. |a + b| ≤ |a| + |b| (triangle inequality).

Theorem 2.4. Let a, b ∈ R^n be nonzero vectors. Then
−1 ≤ (a · b)/(|a||b|) ≤ 1.
This shows that our definition of angle makes sense.

Theorem 2.5. (Pythagorean theorem.) Let a, b ∈ R^n. If a and b are perpendicular, then |a|² + |b|² = |a + b|².

Theorem 2.6. Any linear subspace of R^n is spanned by a finite subset.

Theorem 2.7. If a = (a1, ..., an) ≠ ~0 and p = (p1, ..., pn) are elements of R^n, then H := {x ∈ R^n | (x − p) · a = 0} is a hyperplane. In other words, the set of solutions (x1, ..., xn) to the equation a1x1 + ··· + anxn = d, where d = Σ_{i=1}^n ai pi, is a hyperplane. Conversely, every hyperplane is the set of solutions to an equation of this form.

Theorem 2.8. If L : R^n → R^m is a linear function and W ⊆ R^n is a linear subspace, then L(W) is a linear subspace of R^m.

Theorem 2.9. A linear map is determined by its action on the standard basis vectors. In other words: if you know the images of the standard basis vectors, you know the image of an arbitrary vector.

Theorem 2.10. The image of the linear map determined by a matrix is the span of the columns of that matrix.

Theorem 2.11.
Let W be a k-dimensional subspace of R^n spanned by vectors v1, ..., vk, and let p ∈ R^n. Then a parametric equation for the affine space p + W is
f : R^k → R^n
(a1, ..., ak) ↦ p + Σ_{i=1}^k ai vi.

Theorem 2.12. Let L be a linear function and let A be the matrix determined by L. Then the linear map determined by A is L. (The converse also holds, switching the roles of L and A.)

Theorem 2.13. The linear structures on linear maps and on their associated matrices are compatible: Let L and M be linear functions with associated matrices A and B, respectively, and let s ∈ R. Then the matrix associated with L + M is A + B, and the matrix associated with sL is sA.

Theorem 2.14. Let L : R^n → R^k and M : R^k → R^m be linear functions with associated matrices A and B, respectively. Then the matrix associated with the composition M ∘ L is the product BA.

III. Derivatives.

Definition 3.1. A subset U ⊆ R^n is open if for each u ∈ U there is a nonempty open ball centered at u contained entirely in U: there exists a real number r > 0 such that Br(u) ⊆ U.

Definition 3.2. A point u ∈ R^n is a limit point of a subset S ⊆ R^n if every open ball Br(u) centered at u contains a point of S different from u.

Definition 3.3. Let f : S → R^m be a function with S ⊆ R^n. Let s be a limit point of S. The limit of f(x) as x approaches s is v ∈ R^m if for all real numbers ε > 0, there is a real number δ > 0 such that
0 < |x − s| < δ and x ∈ S ⇒ |f(x) − v| < ε.
Notation: lim_{x→s} f(x) = v.

Definition 3.4. Let f : S → R^m with S ⊆ R^n, and let s ∈ S. The function f is continuous at s ∈ S if for all real numbers ε > 0, there is a real number δ > 0 such that
|x − s| < δ and x ∈ S ⇒ |f(x) − f(s)| < ε.
(Thus, f is continuous at a limit point s ∈ S if and only if lim_{x→s} f(x) = f(s), and f is automatically continuous at all points in S which are not limit points of S.) The function f is continuous on S if it is continuous at each point of S.

Definition 3.5.
Let f : U → R^m with U an open subset of R^n, and let ei be the i-th standard basis vector for R^n. The i-th partial derivative of f at u ∈ U is the vector in R^m
∂f/∂xi (u) := lim_{t→0} ( f(u + t ei) − f(u) ) / t,
provided this limit exists.

Definition 3.6. Let f : U → R with U an open subset of R^n. Let u ∈ U, and let v ∈ R^n be a unit vector. The directional derivative of f at u in the direction of v is the real number
fv(u) := lim_{t→0} ( f(u + tv) − f(u) ) / t,
provided this limit exists. The directional derivative of f at u in the direction of an arbitrary nonzero vector w is defined to be the directional derivative of f at u in the direction of the unit vector w/|w|.

Definition 3.7. Let f : U → R^m with U an open subset of R^n. Then f is differentiable at u ∈ U if there is a linear function Dfu : R^n → R^m such that
lim_{h→~0} |f(u + h) − f(u) − Dfu(h)| / |h| = 0.
The linear function Dfu is then called the derivative of f at u. The notation f′(u) is sometimes used instead of Dfu. The function f is differentiable on U if it is differentiable at each point of U.

Definition 3.8. Let f : U → R^m with U an open subset of R^n. The Jacobian matrix of f at u ∈ U is the m × n matrix of partial derivatives of the component functions of f:
Jf(u) := ( ∂fi/∂xj (u) ) =
[ ∂f1/∂x1 (u)  ···  ∂f1/∂xn (u) ]
[      ⋮        ⋱        ⋮      ]
[ ∂fm/∂x1 (u)  ···  ∂fm/∂xn (u) ]
1. The i-th column of the Jacobian matrix is the i-th partial derivative of f at u and is called the i-th principal tangent vector to f at u.
2. If n = 1, then f is a parametrized curve and the Jacobian matrix consists of a single column. This column is the tangent vector to f at u or the velocity of f at u, and its length is the speed of f at u. We write f′(u) = (f1′(u), ..., fm′(u)) for this tangent vector.
3. If m = 1, the Jacobian matrix consists of a single row. This row is called the gradient vector for f at u and denoted ∇f(u) or grad f(u):
∇f(u) := ( ∂f/∂x1 (u), ..., ∂f/∂xn (u) ).

theorems

Theorem 3.1.
Let f : S → R^m and g : S → R^m where S is a subset of R^n.
1. The limit of a function is unique.
2. The limit lim_{x→s} f(x) exists if and only if the corresponding limit for each of the component functions, lim_{x→s} fi(x), exists. In that case,
lim_{x→s} f(x) = ( lim_{x→s} f1(x), ..., lim_{x→s} fm(x) ).
3. Define f + g : S → R^m by (f + g)(x) := f(x) + g(x). If lim_{x→s} f(x) = a and lim_{x→s} g(x) = b, then lim_{x→s} (f + g)(x) = a + b. Similarly, if t ∈ R, define tf : S → R^m by (tf)(x) := t(f(x)). If lim_{x→s} f(x) = a, then lim_{x→s} (tf)(x) = ta.
4. If m = 1, define (fg)(x) := f(x)g(x) and (f/g)(x) := f(x)/g(x) (provided g(x) ≠ 0). If lim_{x→s} f(x) = a and lim_{x→s} g(x) = b, then lim_{x→s} (fg)(x) = ab and, if b ≠ 0, then lim_{x→s} (f/g)(x) = a/b.
5. If m = 1 and g(x) ≤ f(x) for all x, then lim_{x→s} g(x) ≤ lim_{x→s} f(x), provided these limits exist.

Theorem 3.2. Let f : S → R^m and g : S → R^m where S is a subset of R^n.
1. The function f is continuous if and only if the inverse image of every open subset of R^m under f is the intersection of an open subset of R^n with S.
2. The function f is continuous at s if and only if each of its component functions is continuous at s.
3. The composition of continuous functions is continuous.
4. The functions f + g and tf for t ∈ R, as above, are continuous at s ∈ S provided f and g are continuous at s.
5. If m = 1 and f and g are continuous at s ∈ S, then fg and f/g are continuous at s (provided g(s) ≠ 0 in the latter case).
6. A function whose component functions are polynomials is continuous.

Theorem 3.3. If f : R^n → R^m is a linear transformation, then f is differentiable at each p ∈ R^n, and Dfp = f.

Theorem 3.4. (The chain rule.) Let f : U → R^k and g : V → R^m where U is an open subset of R^n and V is an open subset of R^k. Suppose that f(U) ⊆ V so that we can form the composition g ∘ f : U → R^m. Suppose that f is differentiable at p ∈ U and g is differentiable at f(p); then g ∘ f is differentiable at p, and
D(g ∘ f)p = Dg_{f(p)} ∘ Dfp.
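The chain rule can be checked numerically: approximate each derivative by its matrix of finite-difference partial derivatives and compare the derivative of the composition with the product of the derivatives. A minimal sketch in plain Python (the example maps `f`, `g` and all helper names are my own choices, not from the course):

```python
def jacobian(F, p, h=1e-6):
    # Approximate the Jacobian of F at p by central differences.
    # (The chain rule is exact; finite differences only approximate it.)
    n, m = len(p), len(F(p))
    J = []
    for i in range(m):
        row = []
        for j in range(n):
            up = list(p); up[j] += h
            dn = list(p); dn[j] -= h
            row.append((F(up)[i] - F(dn)[i]) / (2 * h))
        J.append(row)
    return J

def mat_mul(A, B):
    # Matrix product: (AB)_{ij} = sum_l A_{il} B_{lj}.
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

f = lambda p: (p[0] * p[1], p[0] + p[1])  # f: R^2 -> R^2
g = lambda q: (q[0] + 2 * q[1],)          # g: R^2 -> R^1
gof = lambda p: g(f(p))

p = [1.0, 2.0]
lhs = jacobian(gof, p)                            # derivative of g ∘ f at p
rhs = mat_mul(jacobian(g, f(p)), jacobian(f, p))  # product of the derivatives
print([[round(x, 4) for x in row] for row in lhs])  # [[4.0, 3.0]]
print([[round(x, 4) for x in row] for row in rhs])  # [[4.0, 3.0]]
```

The two matrices agree to within the finite-difference error, as the theorem predicts.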
In terms of Jacobian matrices, we have
J(g ∘ f)(p) = Jg(f(p)) Jf(p).

Theorem 3.5. Let f : U → R^m where U is an open subset of R^n. Then f is differentiable at p ∈ U if and only if each component function fi : U → R is differentiable at p, and in that case,
Dfp(v) = ( D(f1)p(v), ..., D(fm)p(v) )
for all v ∈ R^n.

Theorem 3.6. Let f : U → R where U is an open subset of R^n. If the directional derivative of f at u ∈ U in the direction of the unit vector v ∈ R^n exists, it is equal to the dot product ∇f(u) · v.

Theorem 3.7. Let f : U → R be a differentiable function on an open subset U ⊆ R^n. The gradient vector ∇f(u) of f at u ∈ U points in the direction of quickest increase of f, and its magnitude gives the rate of increase of f in that direction.

Theorem 3.8. Let f : U → R be a differentiable function on an open subset U ⊆ R^n. The gradient vector ∇f(u) of f at u is perpendicular to the level set of f through u, i.e., to f^{-1}(f(u)). More precisely, let h : I → U be a differentiable function on an open interval I ⊆ R containing the origin with h(0) = u. Suppose that f ∘ h is constant, i.e., the image of h lies in the level set through u. Then the gradient of f at u is perpendicular to the tangent to h at 0: h′(0) · ∇f(u) = 0.

Theorem 3.9. Let f : U → R^m where U is an open subset of R^n. Suppose that the i-th partial derivative of each of the component functions of f exists at a ∈ U. Then the i-th partial derivative of f exists at a and
∂f/∂xi (a) = ( ∂f1/∂xi (a), ..., ∂fm/∂xi (a) ).

Theorem 3.10. Let f : U → R^m where U is an open subset of R^n, and let u ∈ U. The second partial derivatives ∂²f(u)/∂xi∂xj and ∂²f(u)/∂xj∂xi are equal if they exist and are continuous.

Theorem 3.11. Let f : U → R^m where U is an open subset of R^n. If f is differentiable at u ∈ U, then each of the first partial derivatives of each of the component functions exists, and Dfu is the linear map determined by the Jacobian matrix Jf(u).

Theorem 3.12.
Let f : U → R^m where U is an open subset of R^n. If each of the first partial derivatives of each of the component functions of f exists at u ∈ U and is continuous, then f is differentiable at u. In this case, f is said to be continuously differentiable at u.

Theorem 3.13. (The inverse function theorem.) Let f : U → R^n be a function with continuous partial derivatives on the open set U ⊆ R^n. Suppose that the Jacobian matrix Jf(u) is invertible at some point u ∈ U. Then there is an open subset V ⊆ U containing u such that f is injective when restricted to V and its inverse (defined on f(V)) is differentiable, with Jacobian matrix Jf(u)^{-1} at the point f(u).

Theorem 3.14. (The implicit function theorem.) Suppose f : R^n × R^m → R^m is a function with continuous partial derivatives, and suppose f(u) = 0 for some u = (u1, u2) ∈ R^n × R^m. Suppose further that the m × m matrix
( ∂fi/∂x_{n+j} (u) ), 1 ≤ i, j ≤ m,
is invertible. Then there is an open set U ⊆ R^n containing u1 and an open set V ⊆ R^m containing u2 such that for each x ∈ U, there is a unique g(x) ∈ V with f(x, g(x)) = 0. The function g is differentiable.

Theorem 3.15. (The rank theorem.) Suppose f : U → V is a smooth mapping and that f has constant rank k in some open neighborhood of p ∈ U. Then there exist open sets Ũ ⊆ U containing p and Ṽ ⊆ V containing f(p), along with diffeomorphisms
φ : Ũ → U′,  ψ : Ṽ → V′
onto open subsets U′ ⊆ R^n and V′ ⊆ R^m, such that
ψ ∘ f ∘ φ^{-1}(x1, ..., xn) = (x1, ..., xk, 0, ..., 0).

IV. Taylor's Theorem

Definition 4.1. Let f : U → R^m with U an open subset of R^n, and let k be a nonnegative integer. A partial derivative of order k of f at u ∈ U is defined recursively as follows: (i) the zeroth-order partial derivative is f(u); (ii) a k-th order partial derivative, with k > 0, is any partial derivative of a (k − 1)-th order partial derivative.

Definition 4.2. Let f : U → R^m with U an open subset of R^n. Suppose that all partial derivatives of f of order less than or equal to k exist and are continuous.
The Taylor polynomial of order k for f at u ∈ U is
P^k_u f(x1, ..., xn) := Σ_{i1+···+in ≤ k} 1/(i1! ··· in!) · ∂^{i1+···+in} f / (∂x1^{i1} ··· ∂xn^{in}) (u) · (x1 − u1)^{i1} ··· (xn − un)^{in}.
If the partial derivatives of every order exist and are continuous at u, one defines the Taylor series for f at u, Pu f, by replacing k by ∞ in the above displayed equation.

theorems

Theorem 4.1. (Taylor's theorem in one variable.) Let f : S → R with S ⊆ R. Suppose that S contains an open interval containing the closed interval [a, b]. Also suppose that all the derivatives up to order k exist and are continuous on [a, b] and that the (k + 1)-th derivative exists on (a, b). Let x, y ∈ [a, b]. Then there exists a number c between x and y such that
f(x) = P^k_y f(x) + 1/(k + 1)! · d^{k+1}f/dx^{k+1} (c) · (x − y)^{k+1},
where P^k_y f is the k-th order Taylor polynomial for f at y.

Theorem 4.2. (Taylor's theorem in several variables.) Let f : U → R^m where U is an open subset of R^n. Suppose that the partial derivatives of f up to order k exist and are continuous on U and that the partial derivatives of order k + 1 exist on U. Let u, x ∈ U and suppose that the line segment u + t(x − u), 0 ≤ t ≤ 1, joining u to x is contained in U. Then there exists a number c between 0 and 1 such that
f(x) = P^k_u f(x) + r(c(x − u)),
where P^k_u f is the k-th order Taylor polynomial for f at u and r is a function such that lim_{v→~0} r(v)/|v|^{k+1} = 0.

V. Differential geometry.

Definition 5.1. The tangent space to R^n at u ∈ R^n is R^n labeled with the point u; more formally, it is the Cartesian product {u} × R^n. We denote the tangent space at u by R^n_u.

Definition 5.2. If f : U → R^m, with U an open subset of R^n, is differentiable at u ∈ U, the tangent map for f at u is defined to be the derivative of f at u, thought of as a mapping between tangent spaces, Dfu : R^n_u → R^m_{f(u)}.

Definition 5.3. Let f : R^n → R^m be a differentiable function.
The first fundamental form for f at p ∈ R^n is the function ⟨ , ⟩_p : Tp R^n × Tp R^n → R given by
⟨u, v⟩_p := Dfp(u) · Dfp(v).
Given the first fundamental form, define:
1. The length of u ∈ Tp R^n is |u|_p := √⟨u, u⟩_p.
2. Vectors u, v ∈ Tp R^n are perpendicular if ⟨u, v⟩_p = 0.
3. The angle between nonzero u, v ∈ Tp R^n is arccos( ⟨u, v⟩_p / (|u|_p |v|_p) ).
4. The component of u ∈ Tp R^n along nonzero v ∈ Tp R^n is ⟨u, v⟩_p / ⟨v, v⟩_p.

Definition 5.4. Let f : R^n → R^m be a differentiable function. The first fundamental form matrix is the n × n symmetric matrix I whose i, j-th entry is f_{xi} · f_{xj}:
I = ( ∂f/∂xi · ∂f/∂xj ).

Definition 5.5. A pseudo-metric on R^n is a symmetric n × n matrix I whose entries are real-valued functions on R^n. Given any such I, define for each p ∈ R^n and for all u, v ∈ Tp R^n
⟨u, v⟩_p := u^T I(p) v.
The matrix I is called a metric on R^n if ⟨ , ⟩_p is positive definite for each p ∈ R^n, i.e., for all u ∈ Tp R^n, ⟨u, u⟩_p ≥ 0 with equality exactly when u = ~0. Given the form ⟨ , ⟩_p, one defines lengths, distances, angles, and components as with the first fundamental form.

Definition 5.6. Let c : [a, b] → R^n be a parametrized curve. The length of c is
length(c) = ∫_a^b |c′(t)| dt.

Definition 5.7. Let c : [a, b] → R^n be a parametrized curve. Suppose α : [a′, b′] → [a, b] is a function such that α′ is continuous, α′(t) > 0 for all t, α(a′) = a, and α(b′) = b. Then the curve c̃ = c ∘ α : [a′, b′] → R^n is called a reparametrization of c.

Definition 5.8. Let u : [a, b] → R^n be a parametrized curve, and let I be a pseudo-metric on R^n. The length of u with respect to I is
length(u) = ∫_a^b |u′(t)|_{u(t)} dt,
where the length of the vector u′(t) is its length with respect to I as an element of the tangent space T_{u(t)} R^n.

Definition 5.9. Let u : [a, b] → R^n and f : R^n → R^m, so c = f ∘ u is a parametrized curve on the image of f. Then c is a geodesic if c″(t) · f_{xi}(u(t)) = 0 for i = 1, ..., n.

Definition 5.10. Fix a metric I, and denote its ij-th entry by g_{ij}.
Since I is positive definite, it turns out that it must be an invertible matrix, and we denote the entries of the inverse by g^{ij}:
I = (g_{ij}),  I^{-1} = (g^{ij}).
For each i, j, ℓ ∈ {1, ..., n} define a Christoffel symbol
Γ^ℓ_{ij} = (1/2) Σ_{k=1}^n ( ∂g_{jk}/∂xi − ∂g_{ij}/∂xk + ∂g_{ki}/∂xj ) g^{kℓ}.
Let x(t) = (x1(t), ..., xn(t)) be a curve in R^n. Then x(t) is a geodesic if it satisfies the system of differential equations
ẍk + Σ_{i,j} Γ^k_{ij} ẋi ẋj = 0,  k = 1, ..., n.
In the case n = 2 we can write
I = [ E  F ]      I^{-1} = (1/Δ) [ G  −F ]
    [ F  G ],                    [ −F  E ],
where Δ = det I = EG − F². In this case, the Christoffel symbols are:
Γ^1_{11} = (1/(2Δ))(G Ex − 2F Fx + F Ey)    Γ^2_{11} = (1/(2Δ))(2E Fx − E Ey − F Ex)
Γ^1_{12} = (1/(2Δ))(G Ey − F Gx)            Γ^2_{12} = (1/(2Δ))(E Gx − F Ey)
Γ^1_{22} = (1/(2Δ))(2G Fy − G Gx − F Gy)    Γ^2_{22} = (1/(2Δ))(E Gy − 2F Fy + F Gx)
and the equations for a geodesic are:
ẍ1 + ẋ1² Γ^1_{11} + 2 ẋ1 ẋ2 Γ^1_{12} + ẋ2² Γ^1_{22} = 0
ẍ2 + ẋ1² Γ^2_{11} + 2 ẋ1 ẋ2 Γ^2_{12} + ẋ2² Γ^2_{22} = 0.

theorems

Theorem 5.1. Let f : R^n → R^m be a differentiable function, and let p ∈ R^n. For u, v ∈ Tp R^n,
⟨u, v⟩_p = u^T I(p) v = [u1 ··· un] [ f_{x1}(p)·f_{x1}(p)  ···  f_{x1}(p)·f_{xn}(p) ] [v1]
                                    [         ⋮            ⋱           ⋮           ] [ ⋮]
                                    [ f_{xn}(p)·f_{x1}(p)  ···  f_{xn}(p)·f_{xn}(p) ] [vn].

Theorem 5.2. The first fundamental form is a symmetric, positive definite form:
1. ⟨x, y⟩_u = ⟨y, x⟩_u for all x, y ∈ R²_u.
2. ⟨x + y, z⟩_u = ⟨x, z⟩_u + ⟨y, z⟩_u for all x, y, z ∈ R²_u.
3. ⟨sx, y⟩_u = s⟨x, y⟩_u for all x, y ∈ R²_u and s ∈ R.
4. ⟨x, x⟩_u ≥ 0 for all x ∈ R²_u, and ⟨x, x⟩_u = 0 if and only if x = ~0.

Theorem 5.3. Consider R^n with metric I. For each u, v ∈ R^n, there is an ε > 0 and a unique geodesic h : (−ε, ε) → R^n with h(0) = u and h′(0) = v.

Theorem 5.4. Geodesics give, locally, the shortest distance between two points on a surface: Consider R^n with metric I, and let u ∈ R^n. Then there is a ball Br(u) of radius r centered at u such that for any v ∈ Br(u), the geodesic joining u to v is shorter than any other curve joining u to v.

VI. Applications.

VI.1. Best affine approximations.

Definition 6.1.1.
Let f : U → R^m be a differentiable function on an open set U ⊆ R^n. The best affine approximation to f at u ∈ U is the affine function
T fu : R^n → R^m
x ↦ f(u) + Dfu(x − u),
where Dfu is the derivative of f at u.

Definition 6.1.2. With notation as in Definition 6.1.1, the image of T fu is called the (embedded) tangent space to f at u.

theorems

Theorem 6.1.1. Let f : U → R^m be a differentiable function on an open set U ⊆ R^n, and let u ∈ U. If A : R^n → R^m is an affine function with A(u) = f(u) and
lim_{x→u} |f(x) − A(x)| / |x − u| = 0,
then A is T fu, the best affine approximation to f at u.

Theorem 6.1.2. Let f : U → R^m with U an open subset of R^n, and let h : V → U where V is an open subset of R. Let c := f ∘ h be the corresponding parametrized curve on the surface f. Let u ∈ U, and suppose that c(0) = f(u). Then f(u) + c′(0) is contained in the embedded tangent space to f at u. Conversely, each element of the embedded tangent space to f at u can be written as f(u) + c′(0) for some parametrized curve c on f with c(0) = f(u).

VI.2. Optimization.

Definition 6.2.1. The set S ⊆ R^n is closed if its complement is open, i.e., if for every x not in S, there is a nonempty open ball centered at x which does not intersect S.

Definition 6.2.2. The set S ⊂ R^n is bounded if it is contained in some open ball centered at the origin, i.e., if there is a real number r > 0 such that |s| < r for all s ∈ S.

Definition 6.2.3. A point s ∈ S ⊆ R^n is in the interior of S if there is a nonempty open ball centered at s contained entirely in S. Otherwise, s is on the boundary of S.

Definition 6.2.4. Let f : S → R where S ⊂ R^n.
1. s ∈ S is a (global) maximum for f if f(s) ≥ f(s′) for all s′ ∈ S.
2. s ∈ S is a (global) minimum for f if f(s) ≤ f(s′) for all s′ ∈ S.
3. s ∈ S is a local maximum for f if s is an interior point and f(s) ≥ f(s′) for all s′ in some nonempty open ball centered at s.
4.
s ∈ S is a local minimum for f if s is an interior point and f(s) ≤ f(s′) for all s′ in some nonempty open ball centered at s.
5. A global extremum for f is a global maximum or global minimum for f. A local extremum for f is a local maximum or local minimum for f.

Definition 6.2.5. Let f : S → R where S ⊆ R^n. A point s ∈ S is a critical or stationary point of f if it is an interior point, f is differentiable at s, and ∇f(s) = ~0, i.e., all the partial derivatives of f vanish at s.

Definition 6.2.6. A critical point which is not a local extremum is called a saddle point.

Definition 6.2.7. Let f : U → R be a function on an open set U ⊆ R^n with continuous partial derivatives up to order 3. Let u ∈ U be a critical point of f. The quadratic form for f at u is defined by Qu f(x) := P²_u f(x + u), where P²_u f is the second order Taylor polynomial for f at u. Hence,
Qu f(x1, ..., xn) := Σ_{i1+···+in = 2} 1/(i1! ··· in!) · ∂²f / (∂x1^{i1} ··· ∂xn^{in}) (u) · x1^{i1} ··· xn^{in}.

theorems

Theorem 6.2.1. Let S ⊂ R^n be a closed and bounded set. Let f : S → R be a continuous function. Then f has a maximum and a minimum in S.

Theorem 6.2.2. Let f : U → R be a differentiable function on an open set U ⊆ R^n. If u ∈ U is a local extremum for f, then u is a critical point for f.

Theorem 6.2.3. Let f : U → R be a function on an open set U ⊆ R^n with continuous partial derivatives up to order 3. Suppose the quadratic form Qu f for f at a critical point u is nonzero. Then u is a local minimum, a local maximum, or a saddle point according as Qu f has a local minimum, a local maximum, or a saddle at ~0.

VI.3. Lagrange multipliers.

Theorem 6.3.1. Let f and g1, ..., gk be k + 1 differentiable, real-valued functions on an open subset U ⊆ R^n. Let u ∈ U. Suppose that if there are constants a0, a1, ..., ak for which
a0 ∇f(u) + a1 ∇g1(u) + ··· + ak ∇gk(u) = ~0,
then a0 = a1 = ··· = ak = 0.
Then there are points u1 and u2 arbitrarily close to u such that gi(u) = gi(u1) = gi(u2) for i = 1, ..., k and f(u1) < f(u) < f(u2).

VI.4. Conservation of energy.

Definition 6.4.1. Let f : U → R^n be a differentiable function on an open set U ⊆ R^n (i.e., f is a differentiable vector field on U). Then f is conservative if there is a differentiable function φ : U → R such that f = grad φ. In that case, we call ψ := −φ a potential (energy) function for f (so, in that case, f = −grad ψ).

Definition 6.4.2. Let f : U → R^n be a differentiable function on an open set U ⊆ R^n, and let h : I → U be a function on an open subset I ⊆ R whose second derivatives exist. Suppose there is a constant m such that f(h(t)) = m h″(t). Then h is said to satisfy Newton's Law.

Definition 6.4.3. With notation as in Definition 6.4.2, define the kinetic energy of h to be (1/2) m |h′(t)|².

theorems

Theorem 6.4.1. With notation as in Definition 6.4.2, and with f conservative with potential function ψ, the sum of the potential and the kinetic energy, ψ(h(t)) + (1/2) m |h′(t)|², is constant.
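Theorem 6.4.1 can be illustrated numerically. The sketch below, in plain Python, is my own illustration (the spring potential ψ(x) = ½kx², the velocity-Verlet integrator, and all variable names are assumptions, not from the course): it integrates Newton's law m h″ = −grad ψ with a small time step and tracks the total energy, which stays approximately constant.

```python
# Illustrative setup: psi(x) = 0.5 * k * x^2, so f = -grad psi = -k x.
m, k, dt = 1.0, 1.0, 1e-3   # mass, spring constant, time step
x, v = 1.0, 0.0             # initial position h(0) and velocity h'(0)

energies = []
for step in range(5000):
    # Velocity Verlet step for m x'' = -k x (chosen for its good
    # long-term energy behavior; this is a numerical approximation).
    a = -k * x / m
    v_half = v + 0.5 * dt * a
    x = x + dt * v_half
    v = v_half + 0.5 * dt * (-k * x / m)
    # Total energy: kinetic (Definition 6.4.3) plus potential psi(x).
    energies.append(0.5 * m * v * v + 0.5 * k * x * x)

print(round(min(energies), 4), round(max(energies), 4))  # both ≈ 0.5
```

The initial energy is ψ(1) = 0.5, and the minimum and maximum over the whole trajectory differ only by the integrator's truncation error, consistent with the theorem.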