Higher-Dimensional Chain Rules

I. Introduction.

The one-dimensional Chain Rule tells us how to find the derivative of a composed function $w_1(t) = k(c(t))$ from the derivatives of $k$ and $c$, in the case where these are all Calculus I functions. As you are well aware, the Chain Rule says: if $k'(c(t))$ and $c'(t)$ both exist, then $w_1'(t)$ exists and is given by

$$\frac{d}{dt}\Big[k(c(t))\Big] = w_1'(t) = k'(c(t))\,c'(t). \tag{1}$$

Now, the notion of composing functions extends to functions with more general domains and ranges. For example, if

$$\vec{C}(t) = \langle \cos(t), \sin(t) \rangle \quad\text{and}\quad k(x, y) = x^2 e^{3y},$$

then the composition of $k$ with $\vec{C}$ is

$$w_2(t) = k(\vec{C}(t)) = \cos^2(t)\,e^{3\sin(t)}, \tag{2}$$

and you might guess that there exist chain rules similar to the first one for situations such as this. That guess would be correct: there is indeed a chain rule for calculating $\frac{d}{dt}\big[k(\vec{C}(t))\big]$ from the derivatives of $k$ and $\vec{C}$. In fact, there is a whole family of chain rules, one for each case

$$\mathbb{R}^n \xrightarrow{\ \vec{C}\ } \mathbb{R}^k \xrightarrow{\ k\ } \mathbb{R}^\ell$$

(here the exponent $k$ is a dimension, not the function $k$). This handout will state and (almost) prove the first and simplest of these new chain rules, the case in which $n = \ell = 1$ and $k = 2$:

$$\mathbb{R} \xrightarrow{\ \vec{C}\ } \mathbb{R}^2 \xrightarrow{\ k\ } \mathbb{R}.$$

Note that Equation (2) above is an instance of this case.

II. The statement of the theorem.

First, let me set the stage and introduce some notation.

[a]: Let $\vec{C} = \vec{C}(t) = \langle f(t), g(t) \rangle$ be a planar curve, with $(x_0, y_0) = \vec{C}(t_0) = \langle f(t_0), g(t_0) \rangle$. I will assume that $\vec{C}'(t_0) = \langle f'(t_0), g'(t_0) \rangle$ exists.

[b]: Let $k = k(x, y)$ be a real-valued function of two variables, defined in a neighborhood of $(x_0, y_0) = \vec{C}(t_0)$. I will assume that $k$ is differentiable at $(x_0, y_0)$, so that

$$\lim_{(x,y)\to(x_0,y_0)} \frac{|k(x, y) - L(x, y)|}{\|\langle x - x_0,\ y - y_0 \rangle\|} = 0, \tag{3}$$

where $L(x, y) = k(x_0, y_0) + k_x(x_0, y_0)(x - x_0) + k_y(x_0, y_0)(y - y_0)$.

[c]: For any $h \neq 0$, let

$$x_h = f(t_0 + h), \qquad y_h = g(t_0 + h).$$

I can now state the theorem.

Theorem. Let $k$ and $\vec{C}$ be as above, and let $w_2(t) = k(\vec{C}(t))$.
Then $w_2$ is differentiable at $t_0$, and

$$w_2'(t_0) = k_x(x_0, y_0)\,f'(t_0) + k_y(x_0, y_0)\,g'(t_0). \tag{4}$$

In class, we will differentiate the function in Equation (2) both by Calculus I methods and with this new formula, and we will compare the answers.

III. The (almost) proof of the theorem.

Step 1. I will start with the (Calculus I) definition of the derivative of $w_2$:

$$\begin{aligned}
w_2'(t_0) &= \lim_{h\to 0} \frac{k(\vec{C}(t_0 + h)) - k(\vec{C}(t_0))}{h}
= \lim_{h\to 0} \frac{k(f(t_0 + h), g(t_0 + h)) - k(f(t_0), g(t_0))}{h} \\
&= \lim_{h\to 0} \frac{k(x_h, y_h) - k(x_0, y_0)}{h}.
\end{aligned} \tag{5}$$

Step 2. I will add and subtract $L(x_h, y_h)$ in the numerator of (5), so as to break (5) into two limit problems:

$$(5) = \lim_{h\to 0} \underbrace{\frac{k(x_h, y_h) - L(x_h, y_h)}{h}}_{(B)} + \lim_{h\to 0} \underbrace{\frac{L(x_h, y_h) - k(x_0, y_0)}{h}}_{(A)},$$

if both of these limits exist.

Step 3: Evaluating the limit of quantity (A). Since $L(x_h, y_h) = k(x_0, y_0) + k_x(x_0, y_0)(x_h - x_0) + k_y(x_0, y_0)(y_h - y_0)$,

$$(A) = \frac{k_x(x_0, y_0)(x_h - x_0) + k_y(x_0, y_0)(y_h - y_0)}{h},$$

so that

$$\begin{aligned}
\lim_{h\to 0} (A) &= \lim_{h\to 0} k_x(x_0, y_0)\,\frac{x_h - x_0}{h} + \lim_{h\to 0} k_y(x_0, y_0)\,\frac{y_h - y_0}{h} \\
&= k_x(x_0, y_0) \lim_{h\to 0} \frac{x_h - x_0}{h} + k_y(x_0, y_0) \lim_{h\to 0} \frac{y_h - y_0}{h} \\
&= k_x(x_0, y_0) \lim_{h\to 0} \frac{f(t_0 + h) - f(t_0)}{h} + k_y(x_0, y_0) \lim_{h\to 0} \frac{g(t_0 + h) - g(t_0)}{h} \\
&= k_x(x_0, y_0)\,f'(t_0) + k_y(x_0, y_0)\,g'(t_0).
\end{aligned}$$

Thus, the limit of quantity (A) is exactly the right-hand side of Equation (4). To prove the theorem, then, I must show that the limit of quantity (B) is zero.

Step 4: Evaluating the limit of quantity (B). Assume, for all $h \neq 0$, that $\langle x_h, y_h \rangle \neq \langle x_0, y_0 \rangle$ (see Footnote 1), so that one can multiply and divide quantity (B) by $\|\langle x_h - x_0,\ y_h - y_0 \rangle\|$. I will do this arithmetic (and also replace quantity (B) with its absolute value):

$$|(B)| = \underbrace{\frac{|k(x_h, y_h) - L(x_h, y_h)|}{\|\langle x_h - x_0,\ y_h - y_0 \rangle\|}}_{(C)} \cdot \underbrace{\frac{\|\langle x_h - x_0,\ y_h - y_0 \rangle\|}{|h|}}_{(D)}.$$

It follows that

$$\lim_{h\to 0} |(B)| = \lim_{h\to 0} \underbrace{\frac{|k(x_h, y_h) - L(x_h, y_h)|}{\|\langle x_h - x_0,\ y_h - y_0 \rangle\|}}_{(C)} \cdot \lim_{h\to 0} \underbrace{\frac{\|\langle x_h - x_0,\ y_h - y_0 \rangle\|}{|h|}}_{(D)}, \tag{6}$$

if the two limits on the right of (6) exist. Furthermore, the limit of quantity (C) exists and equals zero; this follows immediately from Equation (3) and the fact that $\vec{C}$ is continuous at $t_0$ (see Exercise 1). The limit of quantity (D) will require some further computation:

$$\begin{aligned}
\lim_{h\to 0} (D) &= \lim_{h\to 0} \frac{\|\langle x_h, y_h \rangle - \langle x_0, y_0 \rangle\|}{|h|} \\
&= \lim_{h\to 0} \left\| \frac{1}{h}\big( \langle x_h, y_h \rangle - \langle x_0, y_0 \rangle \big) \right\| \\
&= \lim_{h\to 0} \left\| \frac{1}{h}\big( \vec{C}(t_0 + h) - \vec{C}(t_0) \big) \right\| \\
&= \big\| \vec{C}'(t_0) \big\|.
\end{aligned}$$

I can now complete the proof by finishing the computation begun in Equation (6):

$$\lim_{h\to 0} |(B)| = \lim_{h\to 0} \frac{|k(x_h, y_h) - L(x_h, y_h)|}{\|\langle x_h - x_0,\ y_h - y_0 \rangle\|} \cdot \lim_{h\to 0} \frac{\|\langle x_h - x_0,\ y_h - y_0 \rangle\|}{|h|} = (0)\,\big\| \vec{C}'(t_0) \big\| = 0.$$

Since the absolute value of quantity (B) tends to zero, quantity (B) itself tends to zero, which is what Step 3 required.

Exercise 1. Prove that the limit of quantity (C) is zero.

Footnote 1. This additional assumption is what makes this an almost proof. One must employ a trick involving continuity to deal with the case in which $\vec{C}(t) = (x_0, y_0)$ for multiple values of $t$. I will not bother you with the details.

IV. The gradient.

Observe that the Calculus I Chain Rule (Equation (1)) is very compact: it says that you can calculate $w_1'(t)$ by multiplying together two derivatives, one supplied by the function $k$ and the other supplied by the function $c$.

The new Chain Rule (Equation (4)) is messier: this formula instructs us, in order to calculate $w_2'(t)$, to combine four derivative-type quantities (see Footnote 2), two supplied by $k$ and the other two supplied by $\vec{C}$, in a rather complicated way. As it turns out, though, we can do better.
If you recombine $f'(t_0)$ and $g'(t_0)$ into the derivative vector

$$\vec{C}'(t_0) = \langle f'(t_0), g'(t_0) \rangle$$

and you put the two partial derivatives of $k$ together to make the vector

$$\langle k_x(x_0, y_0),\ k_y(x_0, y_0) \rangle, \tag{7}$$

then the right-hand side of (4) is exactly the dot product of these vectors:

$$k_x(x_0, y_0)\,f'(t_0) + k_y(x_0, y_0)\,g'(t_0) = \langle k_x(x_0, y_0),\ k_y(x_0, y_0) \rangle \cdot \langle f'(t_0),\ g'(t_0) \rangle. \tag{8}$$

Equation (8) makes it possible to state the new Chain Rule as succinctly as we can state the old one: you can calculate $w_2'(t)$ by taking the dot product of two vectors, one supplied by the function $k$ and the other supplied by the function $\vec{C}$.

The vector supplied by $k$ (Equation (7)) is important. It is called the gradient of $k$ at $(x_0, y_0)$, and it is denoted "$\nabla k(x_0, y_0)$." With this notation the new Chain Rule can be expressed:

$$w_2'(t_0) = \nabla k(\vec{C}(t_0)) \cdot \vec{C}'(t_0).$$

Footnote 2. There will be six in the $\mathbb{R} \xrightarrow{\ \vec{C}\ } \mathbb{R}^3 \xrightarrow{\ k\ } \mathbb{R}$ case!
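The gradient form of the Chain Rule can be checked numerically on the example of Equation (2). The following is a minimal Python sketch, not part of the handout itself: the function names (`grad_k`, `C_prime`, `chain_rule_derivative`, and so on) are hypothetical, the partial derivatives $k_x = 2x e^{3y}$ and $k_y = 3x^2 e^{3y}$ are computed by hand from $k(x, y) = x^2 e^{3y}$, and the difference quotient is the symmetric (central) variant rather than the one-sided quotient of Step 1. At $t_0 = 0$, for instance, $\nabla k(1, 0) = \langle 2, 3 \rangle$ and $\vec{C}'(0) = \langle 0, 1 \rangle$, so the dot product is 3.

```python
import math


def k(x, y):
    """The function k(x, y) = x^2 e^{3y} from Equation (2)."""
    return x**2 * math.exp(3 * y)


def grad_k(x, y):
    """Gradient <k_x, k_y> of k, with the partials worked out by hand."""
    return (2 * x * math.exp(3 * y), 3 * x**2 * math.exp(3 * y))


def C(t):
    """The planar curve C(t) = <cos t, sin t>."""
    return (math.cos(t), math.sin(t))


def C_prime(t):
    """The derivative vector C'(t) = <-sin t, cos t>."""
    return (-math.sin(t), math.cos(t))


def w2(t):
    """The composition w2(t) = k(C(t))."""
    x, y = C(t)
    return k(x, y)


def chain_rule_derivative(t0):
    """w2'(t0) computed as the dot product grad k(C(t0)) . C'(t0)."""
    kx, ky = grad_k(*C(t0))
    f_prime, g_prime = C_prime(t0)
    return kx * f_prime + ky * g_prime


def calc1_derivative(t0):
    """w2'(t0) from differentiating cos^2(t) e^{3 sin t} with Calculus I rules."""
    e = math.exp(3 * math.sin(t0))
    return -2 * math.cos(t0) * math.sin(t0) * e + 3 * math.cos(t0) ** 3 * e


def difference_quotient(t0, h=1e-6):
    """Central difference approximation to w2'(t0) (an assumed numerical check)."""
    return (w2(t0 + h) - w2(t0 - h)) / (2 * h)


if __name__ == "__main__":
    for t0 in (0.0, 0.7, 2.0):
        print(t0, chain_rule_derivative(t0), calc1_derivative(t0),
              difference_quotient(t0))
```

The first two columns of output should agree up to rounding, since they are the same expression arranged differently; the third should agree only approximately, because it is a finite-difference estimate.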