Lecture 3: First-Order Linear ODE’s

Dr. Michael Dougherty

January 13, 2010

1 Some Definitions

Here we briefly define a few terms which will be useful later. In fact, we will revisit the definition of linear found in Farlow’s text (p. 12), and put it in a slightly expanded context. We will number the coefficients differently, but the idea is the same. Also note that the order of the terms $y, y', y'', \dots, y^{(n)}$ changes according to convenience.

The general $n$th-order linear ODE, where we assume $y = y(x)$, is given by

$$a_n(x)\frac{d^n y}{dx^n} + a_{n-1}(x)\frac{d^{n-1} y}{dx^{n-1}} + \cdots + a_1(x)\frac{dy}{dx} + a_0(x)\,y = f(x). \qquad (1)$$

We note that this can be rewritten using summation notation or matrix multiplication:

$$\sum_{k=0}^{n} a_k(x)\frac{d^k y}{dx^k} = f(x), \qquad
\begin{bmatrix} a_0(x) & a_1(x) & a_2(x) & \cdots & a_{n-1}(x) & a_n(x) \end{bmatrix}
\begin{bmatrix} y \\ y' \\ y'' \\ \vdots \\ y^{(n-1)} \\ y^{(n)} \end{bmatrix} = \bigl[f(x)\bigr].$$

It is important that the coefficients $a_0(x)$, $a_1(x)$, and so on are functions of $x$ only, as is the right-hand side $f(x)$. An ODE which cannot be written in these ways is called nonlinear. With few exceptions, we will not be developing techniques for nonlinear ODE’s.

A very useful notation we will use later in the course is borrowed from Calculus I, where we define the basic differential operator $D = \frac{d}{dx}$. Note that $D^k = \frac{d^k}{dx^k}$. With this we can write our original ODE yet another way, where we define another, albeit more complicated, differential operator $L[\ ]$:

Definition

$$L[y] = a_n(x)D^n y + a_{n-1}(x)D^{n-1} y + \cdots + a_1(x)Dy + a_0(x)y
= \bigl(a_n(x)D^n + a_{n-1}(x)D^{n-1} + \cdots + a_1(x)D + a_0(x)\bigr)y, \qquad (2)$$

which lets us write the linear equation in the more compact form

$$L[y] = f(x), \qquad (3)$$

with the understanding that we are solving an equation, the solution being a function $y = y(x)$. One reason that (1), and thus $L[y] = f(x)$, is called linear is that $L$ is a linear differential operator in the linear-algebraic sense.
The “differential” part means involving derivatives, and the “linear” part means that

$$L[y_1 + y_2] = L[y_1] + L[y_2], \qquad (4)$$

$$L[\beta y] = \beta L[y], \qquad (5)$$

for all relevant functions $y_1, y_2$, and for all “scalars” $\beta \in \mathbb{R}$. In fact (4) and (5) together form the general linear-algebraic definition of a linear operator, with the vector space here being some kind of function space in which $L$ makes sense (such as a space of $n$-times differentiable functions, to be somewhat specific).¹ In linear algebra terms, an $L$ satisfying (4) is said to “preserve (vector) addition,” while an $L$ satisfying (5) is said to “preserve scalar multiplication,” the scalars here being the constants $\beta \in \mathbb{R}$.

To see that $L$ as in (2) does indeed fit the definition of a linear operator, we note first that $D^k$ is a linear operator:

$$D^k(y_1 + y_2) = D^k(y_1) + D^k(y_2), \qquad D^k(\beta y) = \beta D^k y,$$

which is just the fact that the derivative (of any order) of a sum is the sum of the derivatives, and that multiplicative constants “go along for the ride” with derivatives. The fact that the general $L$ is still linear follows similarly, since the coefficient functions $a_k(x)$ themselves “go along for the ride.” To prove this in the more general case it is easier to cite a theorem from linear algebra:

Theorem 1 An operator $L$ is linear, i.e., satisfies (4) and (5), if and only if the following holds for all $y_1, y_2, \alpha, \beta$:

$$L[\alpha y_1 + \beta y_2] = \alpha L[y_1] + \beta L[y_2]. \qquad (6)$$

The proof is a fairly quick linear algebra exercise. In short, if (6) holds, then taking $\alpha = \beta = 1$ gives (4), and taking $\alpha = 0$ gives (5). Conversely, if (4) and (5) both hold, then

$$L[\alpha y_1 + \beta y_2] \underbrace{=}_{\text{by (4)}} L[\alpha y_1] + L[\beta y_2] \underbrace{=}_{\text{by (5)}} \alpha L[y_1] + \beta L[y_2],$$

which is (6), where we used preservation of addition first, and then preservation of scalar multiplication twice. That completes the proof.
The upshot of Theorem 1 is that we can prove one equation, namely (6), that $L$ preserves linear combinations of functions $y_1, y_2$, to show that (2) gives a linear operator in the sense of (4) and (5). To save space, let us just show that an operator with $n$th-, first- and zero-order terms, i.e.,

$$L[y] = a_n(x)\frac{d^n y}{dx^n} + a_1(x)\frac{dy}{dx} + a_0(x)y,$$

is linear (the middle terms would fit in the obvious way if included):

$$\begin{aligned}
L[\alpha y_1 + \beta y_2] &= a_n(x)\frac{d^n}{dx^n}\{\alpha y_1 + \beta y_2\} + a_1(x)\frac{d}{dx}\{\alpha y_1 + \beta y_2\} + a_0(x)\{\alpha y_1 + \beta y_2\} \\
&= \alpha a_n(x)\frac{d^n y_1}{dx^n} + \beta a_n(x)\frac{d^n y_2}{dx^n} + \alpha a_1(x)\frac{dy_1}{dx} + \beta a_1(x)\frac{dy_2}{dx} + \alpha a_0(x)y_1 + \beta a_0(x)y_2 \\
&= \alpha\left(a_n(x)\frac{d^n y_1}{dx^n} + a_1(x)\frac{dy_1}{dx} + a_0(x)y_1\right) + \beta\left(a_n(x)\frac{d^n y_2}{dx^n} + a_1(x)\frac{dy_2}{dx} + a_0(x)y_2\right) \\
&= \alpha L[y_1] + \beta L[y_2],
\end{aligned}$$

as we claimed. It is important that the coefficients $a_k(x)$ are functions of $x$ only; if they were allowed to contain $y$ as well, we would lose linearity.

¹A vector space is a set with operations called “vector addition” and “scalar multiplication,” satisfying several structural axioms. (See any text on linear algebra.) For our part later in the course, the crucial axioms are that it is closed under these operations, meaning if the vector space is $V$, then (1) for all $u, v \in V$ we have $u + v \in V$, and (2) for all $u \in V$ and $\beta \in \mathbb{R}$, we have $\beta u \in V$. More advanced texts define vector spaces which are “function spaces,” specifically

$$C^k(I) = \left\{\, f : I \to \mathbb{R} \;\middle|\; f, f', f'', \dots, f^{(k)} \text{ exist and are continuous on } I \,\right\},$$

where $I$ is some interval, and $f : I \to \mathbb{R}$ means that $f$ inputs values in $I$ and outputs values in $\mathbb{R}$. ($I$ will be the domain, and $\mathbb{R}$ will contain the range.) Thus $C^n(I)$ for a given interval $I$, or even $C^n(\mathbb{R})$, are natural domains of the operator $L$ in (3). Note from calculus that if $f^{(k)}$ is defined and continuous, so is $f^{(k-1)}$, and therefore $f^{(k-2)}$, etc., until we are down to $f$ itself. $C(I)$ is the space of functions which are continuous on $I$, $C^1(I)$ is the set whose first derivatives are also continuous, etc. These are all vector spaces.
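The linearity property (6) can also be spot-checked numerically. The following sketch tests, at a few sample points, that a first-order operator $L[y] = a_1(x)y' + a_0(x)y$ preserves linear combinations; the particular coefficient functions and test functions are illustrative choices, not taken from the text, and derivatives are approximated by centered finite differences.

```python
import math

# Illustrative coefficient functions (assumptions, not from the text).
def a1(x): return x * x + 1.0
def a0(x): return math.cos(x)

def deriv(g, x, h=1e-6):
    """Centered finite-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

def L(g, x):
    """The first-order linear operator L[g](x) = a1(x)g'(x) + a0(x)g(x)."""
    return a1(x) * deriv(g, x) + a0(x) * g(x)

# Two sample functions and scalars for the linear combination.
y1, y2 = math.sin, math.exp
a, b = 2.0, -3.0
combo = lambda x: a * y1(x) + b * y2(x)

# Check L[a*y1 + b*y2] = a*L[y1] + b*L[y2] at several points.
for x in (0.5, 1.0, 1.7):
    lhs = L(combo, x)
    rhs = a * L(y1, x) + b * L(y2, x)
    assert abs(lhs - rhs) < 1e-5
```

Of course this is not a proof, only a numerical confirmation of the algebra carried out above.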
In fact, solving (1) is quite difficult, if not impossible without resorting to numerical methods, except under certain circumstances. If the coefficients $a_0(x), a_1(x), \dots, a_n(x)$ are all constant, then there is great hope, as we will see in a later lecture. Fortunately, when we have such an equation which is only of order 1, a general method is available even when the coefficients are nonconstant. It is a “clever” enough method that it is best memorized, and not re-invented each time it is needed. It is presented below.

2 Solving First-Order Linear ODE’s

By definition, these will be of the form

$$a_1(x)\frac{dy}{dx} + a_0(x)y = g(x). \qquad (7)$$

However, this is not the form upon which we will build our method. Instead, we will divide by $a_1(x)$, to get

$$\frac{dy}{dx} + \frac{a_0(x)}{a_1(x)}y = \frac{g(x)}{a_1(x)},$$

which we then write for convenience as

$$\frac{dy}{dx} + P(x)y = f(x). \qquad (8)$$

Most texts call (8) the standard form of (7). Solving (8) is the subject of Farlow’s Section 2.1.

Note that we divided by $a_1(x)$, which may occasionally be zero. We have not yet discussed the topic of just where we can find a solution, i.e., for which $x$’s we can solve such an equation. Thus any time we try to solve such an equation, we must realize that our method may well break down outside of intervals on which $P(x)$ and $f(x)$ are defined and continuous. Usually it is obvious, from the form of the solution, just where the solution is valid. We will revisit this idea as we continue our development.

Returning to (8), the following technique (trick?) was discovered over the years:

1. Given (8), i.e.,
$$\frac{dy}{dx} + P(x)y = f(x).$$

2. Multiply both sides by $\mu(x) = e^{\int P(x)\,dx}$:
$$\mu(x)\frac{dy}{dx} + \mu(x)P(x)y = \mu(x)f(x), \qquad (9)$$
i.e.,
$$e^{\int P(x)\,dx}\frac{dy}{dx} + e^{\int P(x)\,dx}P(x)y = e^{\int P(x)\,dx}f(x). \qquad (10)$$

3. Recognize that the LHS of (9) (or (10)) is a product rule.
In fact, notice two things about this new equation:

(a) The derivative of $\mu(x)$ is given by the chain rule and the Fundamental Theorem of Calculus:
$$\frac{d\mu(x)}{dx} = \frac{d}{dx}e^{\int P(x)\,dx} = e^{\int P(x)\,dx}\cdot P(x) = P(x)\mu(x); \qquad (11)$$

(b) The RHS is a function of $x$ alone, call it $q(x) = \mu(x)f(x)$. Thus we have
$$\mu\frac{dy}{dx} + y\frac{d\mu}{dx} = q(x). \qquad (12)$$

4. Re-write the LHS as the derivative of a product:
$$\frac{d}{dx}(\mu y) = q(x). \qquad (13)$$

5. This gives
$$\mu y = \int q(x)\,dx, \quad\text{so that}\quad y = \frac{\int q(x)\,dx}{\mu}.$$

If we would like to trace everything through based upon (8), we would get

$$y = \frac{\int e^{\int P(x)\,dx}\, f(x)\,dx}{e^{\int P(x)\,dx}}. \qquad (14)$$

The function $\mu(x) = e^{\int P(x)\,dx}$ is called an integrating factor, because multiplying by this function gives a desirable form, in this case a product-rule form on the LHS of (8), from which we can quickly “integrate,” or solve, the ODE.

A couple of remarks about constants should be made here. First, we can use any constant we would like in the integral appearing in the integrating factor; it is usually easiest to just take the arbitrary constant of integration to be zero there. In fact, note that

$$e^{\int P(x)\,dx + C_2} = e^{\int P(x)\,dx}\,e^{C_2} = C_3\, e^{\int P(x)\,dx},$$

so if we change the constant in $\int P(x)\,dx$ we are simply multiplying the equation in standard form (8) by a nonzero constant, which does not change anything, including the product-rule form in the LHS of the new equation (12). However, the integral in the numerator of (14) contains an arbitrary additive constant which does matter, and which becomes the parameter in the one-parameter family of solutions of the original ODE.

3 The Integrating Factor in Action

One could simply memorize the solution (14) to solve these. However, the above process is usually superior, because the formula is sufficiently complicated, and there are places to catch mistakes if we break it into the smaller steps. Furthermore, all we then need to memorize are the form of the ODE (8), the spirit of the process, and the integrating factor

$$\mu(x) = e^{\int P(x)\,dx}. \qquad (15)$$
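The recipe above can be exercised numerically. The sketch below uses the sample equation $y' + 2xy = x$ (an illustrative choice, not from the text), for which $P(x) = 2x$, $\mu(x) = e^{x^2}$, and the method produces $y = \tfrac12 + Ce^{-x^2}$; it then confirms by finite differences that this $y$ really satisfies the ODE.

```python
import math

# Sample equation (illustrative assumption): y' + 2x*y = x.
def P(x): return 2.0 * x
def f(x): return x

def mu(x):
    """Integrating factor exp(int P dx), with the constant chosen as 0."""
    return math.exp(x * x)          # since int 2t dt = x^2

def y(x, C=1.0):
    """Solution the method yields: y = 1/2 + C*exp(-x^2)."""
    return 0.5 + C * math.exp(-x * x)

def residual(x, C=1.0, h=1e-6):
    """Residual of y' + P(x)*y - f(x), with y' by centered difference."""
    dydx = (y(x + h, C) - y(x - h, C)) / (2 * h)
    return dydx + P(x) * y(x, C) - f(x)

# The residual should vanish (up to finite-difference error) everywhere.
for x in (0.3, 1.0, 2.0):
    assert abs(residual(x)) < 1e-6
```

Changing `C` changes which member of the one-parameter family we check, but the residual remains (numerically) zero, as the remarks about constants predict.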
Example 1 (From a textbook of Dennis G. Zill, which we used for years at SWOSU.) Solve the ODE:

$$y' + 3x^2 y = x^2.$$

Solution: Here $P(x) = 3x^2$, so

$$\mu(x) = e^{\int P(x)\,dx} = e^{\int 3x^2\,dx} = e^{x^3}.$$

Multiplying our ODE by $\mu(x)$ gives us

$$\begin{aligned}
y' + 3x^2 y = x^2 &\implies e^{x^3}y' + 3x^2 e^{x^3}y = e^{x^3}x^2 \\
&\implies \left(e^{x^3}y\right)' = x^2 e^{x^3} \\
&\implies e^{x^3}y = \int x^2 e^{x^3}\,dx \\
&\implies e^{x^3}y = \tfrac{1}{3}e^{x^3} + C \\
&\implies y = \frac{\tfrac{1}{3}e^{x^3} + C}{e^{x^3}}.
\end{aligned}$$

Usually this process gives us a solution which can then be simplified a bit:

$$y = \frac{1}{3} + Ce^{-x^3}.$$

This is a one-parameter family of curves. In fact, the solution is valid for all $x \in \mathbb{R}$, which is where this solution is defined and continuous.

Example 1 is one of the simplest. In fact, it is separable (as the reader should check), unlike the subsequent examples given below. These can become less forgiving as the integrals $\int P(x)\,dx$, and therefore the integrating factor $\mu = \exp\left(\int P(x)\,dx\right)$, become more difficult to compute. Indeed these can become much more involved. It is also crucial that the equation be in the form (8), i.e., $y' + P(x)y = f(x)$.

Example 2 Solve the ODE:

$$\frac{dy}{dx} = x + y.$$

First we need to get the correct form, which again was $y' + P(x)y = f(x)$:

$$\frac{dy}{dx} - y = x.$$

This gives $P(x) = -1$, so $\mu(x) = e^{\int P(x)\,dx} = e^{\int (-1)\,dx} = e^{-x}$. Multiplying by this integrating factor gives

$$\begin{aligned}
y' - y = x &\implies e^{-x}y' - e^{-x}y = e^{-x}x \\
&\implies \left(e^{-x}y\right)' = xe^{-x} \\
&\implies e^{-x}y = \int xe^{-x}\,dx.
\end{aligned}$$

Of course now we must integrate $\int xe^{-x}\,dx$ by parts:

$$u = x, \quad dv = e^{-x}\,dx, \quad du = dx, \quad v = -e^{-x}.$$

This gives us

$$\int xe^{-x}\,dx = uv - \int v\,du = x(-e^{-x}) + \int e^{-x}\,dx = -xe^{-x} - e^{-x} + C.$$

Inserting this into our earlier equation $e^{-x}y = \int xe^{-x}\,dx$ gives us

$$e^{-x}y = -xe^{-x} - e^{-x} + C.$$

Multiplying by $e^x$ then gives us $y = e^x\left(-xe^{-x} - e^{-x} + C\right)$, and so

$$y = -x - 1 + Ce^x.$$

Example 3 (From another edition of Zill’s text) Solve the ODE:

$$\frac{dy}{dx} + y\cot x = 2\cos x.$$

Here $P(x) = \cot x$, and so

$$\mu(x) = e^{\int P(x)\,dx} = e^{\int \cot x\,dx} = e^{\ln|\sin x|} = |\sin x|.$$

Here we can wave our hands a bit.
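The one-parameter families found in Examples 1 and 2 can be verified by substituting them back into their ODEs. The following sketch measures the residual of each candidate solution at a few points, with the derivative approximated by a centered finite difference; the chosen value of $C$ is arbitrary.

```python
import math

def residual(yfun, P, f, x, h=1e-6):
    """Residual of y' + P(x)*y - f(x) for a candidate solution yfun."""
    dydx = (yfun(x + h) - yfun(x - h)) / (2 * h)
    return dydx + P(x) * yfun(x) - f(x)

C = 2.5  # arbitrary member of each one-parameter family

# Example 1: y' + 3x^2 y = x^2, with solution y = 1/3 + C e^{-x^3}.
y1 = lambda x: 1.0 / 3.0 + C * math.exp(-x ** 3)
for x in (-1.0, 0.2, 1.3):
    assert abs(residual(y1, lambda x: 3 * x * x, lambda x: x * x, x)) < 1e-5

# Example 2: y' - y = x, with solution y = -x - 1 + C e^x.
y2 = lambda x: -x - 1.0 + C * math.exp(x)
for x in (-1.0, 0.2, 1.3):
    assert abs(residual(y2, lambda x: -1.0, lambda x: x, x)) < 1e-5
```

As promised in Example 1, the first solution checks out on all of $\mathbb{R}$, including negative $x$.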
After all, $|\sin x| = \pm\sin x$, depending upon whether $\sin x$ is positive or negative, but we can certainly multiply both sides of our ODE by either function (and check that the method works, i.e., that we get a product-rule form on the LHS of our new ODE). For simplicity we will multiply by $\sin x$:

$$\begin{aligned}
y' + y\cot x = 2\cos x &\implies y'\sin x + y\cot x\sin x = 2\cos x\sin x \\
&\implies y'\sin x + y\cos x = 2\cos x\sin x \\
&\implies (y\sin x)' = 2\cos x\sin x \\
&\implies y\sin x = \int 2\cos x\sin x\,dx = \sin^2 x + C \\
&\implies y = \frac{\sin^2 x + C}{\sin x}.
\end{aligned}$$

Thus $y = \sin x + C\csc x$, and this is valid on all intervals of the form $(n\pi, (n+1)\pi)$, i.e., except where $\sin x = 0$.

In the next example, a small amount of work has to be carried out to put the equation into the correct form before using the formula for $\mu$.

Example 4 Solve
$$\cos x\,\frac{dy}{dx} + y\sin x = 1.$$
(This is from Farlow 2.1, #10.)

Solution: It is important that the coefficient of the $y' = dy/dx$ term be 1, so we divide:

$$\begin{aligned}
\cos x\,\frac{dy}{dx} + y\sin x = 1 &\implies \frac{dy}{dx} + (\tan x)y = \sec x \\
&\implies e^{\int \tan x\,dx}\frac{dy}{dx} + e^{\int \tan x\,dx}(\tan x)y = e^{\int \tan x\,dx}\sec x \\
&\implies e^{\ln|\sec x|}\frac{dy}{dx} + e^{\ln|\sec x|}(\tan x)y = e^{\ln|\sec x|}\sec x.
\end{aligned}$$

As before, we have other choices, but we will simply use $\mu(x) = \sec x$ here:

$$\begin{aligned}
\sec x\,\frac{dy}{dx} + (\sec x\tan x)y = \sec^2 x &\implies \frac{d}{dx}(y\sec x) = \sec^2 x \\
&\implies y\sec x = \tan x + C \\
&\implies y = \cos x(\tan x + C) \\
&\implies y = \sin x + C\cos x.
\end{aligned}$$

Such ODE’s can also be part of initial value problems (IVP’s). If we wanted a particular solution, say one whose graph runs through the point $(\pi, 7)$, we would insert this point into the solution:

$$7 = \sin\pi + C\cos\pi \implies 7 = 0 + C(-1) \implies C = -7.$$

The solution to the IVP

$$\cos x\,\frac{dy}{dx} + y\sin x = 1, \qquad y(\pi) = 7$$

would thus be $y = \sin x - 7\cos x$.

4 Derivation of µ(x)

For completeness, the derivation of $\mu$ is given here. The idea is that one assumes an integrating factor exists, and then attempts to use the ODE to find what the form of $\mu$ must be. Recall that the whole point of multiplying by $\mu$ was to make the LHS a product rule, in this case.
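The IVP solution from Example 4 can be checked the same way: both the initial condition $y(\pi) = 7$ and the ODE itself should hold. A quick numerical confirmation:

```python
import math

# IVP solution from Example 4: y = sin(x) - 7*cos(x) should satisfy
# cos(x)*y' + y*sin(x) = 1 with y(pi) = 7.

def y(x):
    return math.sin(x) - 7.0 * math.cos(x)

def dy(x, h=1e-6):
    """Centered finite-difference approximation of y'(x)."""
    return (y(x + h) - y(x - h)) / (2 * h)

# Initial condition: y(pi) = 7.
assert abs(y(math.pi) - 7.0) < 1e-12

# ODE residual at a few sample points.
for x in (0.5, 2.0, 4.0):
    assert abs(math.cos(x) * dy(x) + y(x) * math.sin(x) - 1.0) < 1e-6
```

Note that we check the original (undivided) form of the ODE, so the check is meaningful even at points where $\cos x = 0$.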
Thus we can ignore the RHS (so long as it is a function of $x$). What we really need is for the LHS of the multiplied equation to be $(\mu y)'$, i.e.,

$$(\mu y)' = \mu y' + P(x)\mu y. \qquad (16)$$

Expanding the derivative on the left then gives

$$\mu y' + y\mu' = \mu y' + P(x)\mu y. \qquad (17)$$

Subtracting $\mu y'$ from both sides gives

$$y\mu' = P(x)\mu y, \qquad (18)$$

which, after dividing by $\mu y$, gives us a separable equation²

$$\frac{\mu'}{\mu} = P(x). \qquad (19)$$

Recalling that $\mu = \mu(x)$, and putting in the integrals, gives

$$\int \frac{\mu'(x)}{\mu(x)}\,dx = \int P(x)\,dx. \qquad (20)$$

The integrand on the LHS is the same as $(\ln|\mu(x)|)'$, so we have

$$\ln|\mu(x)| = \int P(x)\,dx, \qquad (21)$$

giving us

$$|\mu(x)| = e^{\int P(x)\,dx}, \qquad (22)$$

so that

$$\mu(x) = \pm e^{\int P(x)\,dx}. \qquad (23)$$

Now as we mentioned before, if $\mu$ works as an integrating factor, so does $-\mu(x)$ (and in fact any nonzero multiple of $\mu$ will work, because multiplicative constants can go along for the ride), so we wave our hands a little and just take

$$\mu(x) = e^{\int P(x)\,dx}. \qquad (24)$$

Again, this derivation is not necessary once we know the method, but it is interesting to see how one could derive it. Furthermore, the techniques used in this derivation can be attempted with other, more exotic ODE’s, so it is included here.³

²Equation (19) is separable in the sense that it can be written in the form $(1/\mu)\cdot d\mu/dx = P(x)$, or $(1/\mu)\,d\mu = P(x)\,dx$. Recall that $d\mu(x) = \mu'(x)\,dx$ (divide by $dx$ if it is unclear), or $d\mu = \mu'\,dx$. Thus we can see the “separation”:

$$\frac{\mu'}{\mu} = P(x) \iff \frac{1}{\mu}\frac{d\mu}{dx} = P(x) \iff \frac{1}{\mu}\,d\mu = P(x)\,dx.$$

³Farlow’s derivation is interesting, except that he first does a simple case of $y' + ay = f(x)$, with $a$ being a constant, which he multiplies by $e^{ax}$ to get

$$e^{ax}(y' + ay) = e^{ax}f(x) \iff (e^{ax}y)' = e^{ax}f(x).$$

From there (p. 31) he derives the more general $\mu$ following much the same procedure as above. Coddington also has similar derivations, with the names of the variables changed, and slightly different ways of looking at the intermediate steps. See Coddington, Section 1.7 (page 43), and adjacent sections for another derivation.
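The key identity the derivation produces, $\mu'(x) = P(x)\mu(x)$ as in (11) and (19), can itself be spot-checked numerically. The sketch below does so for $P(x) = \cot x$ on $(0, \pi)$, where the positive branch of the integrating factor is $\mu(x) = \sin x$, as in Example 3.

```python
import math

# Check mu'(x) = P(x)*mu(x) for P(x) = cot(x) on (0, pi),
# where mu(x) = exp(ln|sin x|) = sin(x) (positive branch).

def P(x):
    return math.cos(x) / math.sin(x)    # cot x

def mu(x):
    return math.sin(x)

def dmu(x, h=1e-6):
    """Centered finite-difference approximation of mu'(x)."""
    return (mu(x + h) - mu(x - h)) / (2 * h)

for x in (0.4, 1.5, 2.8):
    assert abs(dmu(x) - P(x) * mu(x)) < 1e-8
```

Here the identity holds exactly because $(\sin x)' = \cos x = \cot x \cdot \sin x$; the finite-difference check merely confirms the computation.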