Numerical Methods for Evolutionary Systems – Lecture 3
C. W. Gear
Celaya, Mexico, January 2007
Differential-Algebraic Equations (DAEs) and Low Dimensional Manifolds
Differential-Algebraic Equations (DAEs) have been a separate topic for about 45 years.
They refer to combinations of differential equations and coupled (non-linear) algebraic equations.
One place familiar to chemical engineers in which they arise is in reaction kinetics.
Suppose we had a system of seven ODEs, yi' = fi(y1, …, y7), that represents some reaction kinetics
and we knew that, say, the reaction for the 7-th species,
y7, was very fast. For that case it has sometimes been recommended that the corresponding
differential equation be deleted and the right hand side equated to zero.
L3-1
Copyright 2006, C. W. Gear
If we do this we get the same system with the seventh differential equation replaced by the
algebraic relation 0 = f7(y1, …, y7).
These are now a set of DAEs. In this simple case we could compute y7 from the last equation
using y7 = k5 y1 y2 /(k4f y3) and then eliminate it from the other 6 equations to get a system of 6
ODEs in 6 unknowns that presumably requires less computation for its solution than the
original larger system.
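Schematically (a hedged restatement of the elimination just described, with generic right-hand sides f_i standing in for the specific rate expressions, which are not reproduced here):
\[ y_i' = f_i(y_1,\dots,y_7),\; i = 1,\dots,6, \qquad 0 = f_7(y_1,\dots,y_7) \;\Longrightarrow\; y_7 = \frac{k_5\, y_1 y_2}{k_{4f}\, y_3}, \]
and substituting this expression for y_7 into f_1, …, f_6 leaves the six ODEs in six unknowns.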
Today DAEs arise in many contexts and in some cases they are very easy to solve, but in other
cases they are effectively unsolvable by computational methods until the equations have been
significantly modified. Furthermore, some DAEs can give rise to significantly larger numerical
errors in their solution than others with only slightly different properties, so it is important for
anyone using an off-the-shelf code for DAEs to be aware of the issues.
L3-2
Copyright 2006, C. W. Gear
We noted in the last lecture that stiff equations did not become a computational issue until
the digital computer was introduced and problems large enough to have stiff components
could be considered for solution.
DAEs did not become an issue until computers had enough memory and compute capacity
that one could start trying to model systems so large that it became difficult, if not impossible,
to write down a system of ordinary differential equations to describe them. This first
happened in Network Analysis which is still one of the largest sources of DAE problems. In
Network Analysis one is presented with an interconnected system of discrete components
such as resistors, capacitors and transistors (for an Electrical Engineering example) or
beams and columns and other structural elements (for a Civil Engineering example) and one
wants to compute various characteristics. In Transient Analysis one wants to determine the
time-dependent behavior under applied driving conditions (electrical inputs or forces due to
changing loads). In fact, the analysis of these two cases is very similar. Until about 1970
these problems were handled by generating a set of ODEs that described the circuit. Let us
look at a very simple electrical network to see what is involved.
[Circuit diagram: a source Vin = g(t) feeds resistor R1 into a node with capacitor C1 to ground;
resistor R2 connects that node to the output node, which has capacitor C2 and resistor R3 to
ground; Vout = ?]
We are given the graph topology
and parameters of the network and
we want to determine the output
waveform from the input waveform.
This means we need to find the
differential equation that it satisfies.
This can be done in various ways
that will be discussed on the next
slide.
L3-3
Copyright 2006, C. W. Gear
1. We start with the network: [circuit diagram repeated from the previous slide].

2. Next we label all the voltages and currents in the network – in this
case there are only two unknown voltages, V1 and Vout, and one of them is the output
voltage. The nine currents are labeled I1 through I9.

3. Now we write down all of the equations
satisfied by the system. (a) First for the
components in the system:

   Vin − V1 = R1 I1
   V1 − Vout = R2 I2
   Vout = R3 I6
   C1 dV1/dt = I4
   C2 dVout/dt = I5

3 (b) … then for the circuit topology –
called the Kirchhoff current laws:

   I1 − I7 = 0;   I1 = I2 + I4
   I4 = I7 − I8;  I2 = I5 + I3
   I8 = I5 + I9
   I3 − I6 = 0;   I6 − I9 = 0

4. Now eliminate variables until we
only have differential equations – which
in this case is fairly easy:

   C1 dV1/dt = (Vin − V1)/R1 − (V1 − Vout)/R2
   C2 dVout/dt = (V1 − Vout)/R2 − Vout/R3
L3-4
Copyright 2006, C. W. Gear
Obviously, any competent engineer could have written this down much more rapidly without
introducing all those unnecessary variables. However, when the network is very large and
the equations have to be produced by a computer we need an algorithm that is guaranteed to
generate the equations.
Throughout the 1960’s, automatic codes generated sets of ODEs from such networks by
using techniques related to what we just did, although they used some important short cuts
to avoid generating all of those current variables. (The primary technique was to build a
spanning tree of the graph by removing edges in the graph until there were no cycles - if you
don’t know what this means, it’s not important to what follows - the point is that such
techniques then permitted a minimum number of current variables to be defined, knowing
that there were direct methods to express all other currents in terms of those variables.)
Some other techniques were used that could be guaranteed to find the set of ODEs satisfied
by the system.
HOWEVER, as computers grew in capacity, more and more complex networks were analyzed
(today they may have millions of elements), and as more complex elements (not just resistors,
capacitors and simple transistors) were introduced it became increasingly difficult to
generate ODEs. At the same time, because the networks were increasing in size, they were
tending to become more stiff so implicit methods for stiff equations were being used.
If we look back at the equations on the previous slide we see that they can be written as
F(y’,y,t) = 0
where y is the vector of all variables (the voltages and the currents in this example) and F is
the function that represents all of the equations. This could be called an implicit ordinary
differential equation.
L3-5
Copyright 2006, C. W. Gear
Suppose we have the system of equations
F(y’,y,t) = 0
(1)
for an s-dimensional vector function F operating on two s-dimensional vectors, y’ and
y, and
on t. If this equation can be solved for y’ it can be rewritten as
y’ = f(y,t)
(2)
so it really is equivalent to an ordinary differential equation, and it is properly called an implicit
ODE. Note that a condition for the existence of a solution (2) to (1) is that the Jacobian
∂F/∂y'
is non singular. However, even if we can, in principle, solve to get (2) it might be
computationally more efficient to deal with the system (1) directly when we are dealing with stiff
equations because with stiff equations we often need to use implicit methods such as the
Backward Differentiation formulae (BDFs). Recall that these take the form
\[ \sum_{i=0}^{q} \alpha_i\, y_{n+1-i} = h\, y'_{n+1} \]
Suppose we substitute this directly into (1). We get:
\[ F\Big(\sum_{i=0}^{q} \alpha_i y_{n+1-i}/h,\; y_{n+1},\; t_{n+1}\Big) = F\Big(\alpha_0 y_{n+1}/h + \sum_{i=1}^{q} \alpha_i y_{n+1-i}/h,\; y_{n+1},\; t_{n+1}\Big) = 0 \]
This is a (non-linear) equation for yn+1 that we could solve at each time step, thus effectively
solving (2) with a BDF method without ever actually finding equation (2) explicitly. If the
Jacobians of F with respect to y’ and y are sparse, this may be a much more efficient
calculation.
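As a minimal sketch of this idea (illustrative code, not from the lecture), the following Matlab fragment takes a single backward Euler step (the q = 1 BDF) applied directly to a residual F(y', y, t) = 0, solving the resulting nonlinear equation for y_{n+1} by Newton iteration with a finite-difference Jacobian. The residual used is the three-equation example that appears later in the lecture (u = g(t), v = u', w = v') with g(t) = sin t; the names F, yn, tn1 and the tolerances are assumptions made only for the illustration.

% One backward Euler step applied directly to F(y',y,t) = 0 (sketch).
g  = @(t) sin(t);
F  = @(yp,y,t) [ y(2) - yp(1);          % v - u' = 0
                 y(3) - yp(2);          % w - v' = 0
                 g(t) - y(1) ];         % g(t) - u = 0
h  = 0.1;  tn = 0;  tn1 = tn + h;
yn = [g(0); cos(0); -sin(0)];           % correct initial values u0, v0, w0
y  = yn;                                % initial guess for y_{n+1}
for it = 1:10
    r = F((y - yn)/h, y, tn1);          % residual with y' replaced by (y - yn)/h
    J = zeros(3);                       % finite-difference Jacobian w.r.t. y_{n+1}
    for k = 1:3
        e = zeros(3,1);  e(k) = 1e-8;
        J(:,k) = (F((y + e - yn)/h, y + e, tn1) - r)/1e-8;
    end
    dy = -J\r;  y = y + dy;
    if norm(dy) < 1e-12, break, end
end
disp(y')                                % approximation to [u v w] at t = h

The Jacobian built here is exactly the matrix ∂F/∂y'·(α0/h) + ∂F/∂y discussed above, so the step succeeds whenever that matrix is nonsingular.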
L3-6
Copyright 2006, C. W. Gear
Can we apply this method to the electrical network problem we discussed in slides 3 & 4 and
avoid the step of reducing it to an ODE system? Obviously sometimes the answer is “yes” (or we
wouldn’t be discussing it here!) For this example it works fine after we handle a minor difficulty.
However, there are networks (and other systems) that can’t be solved with such a simple
approach.
Looking back at the electrical network problem earlier we see that there are 12 equations (five
describing the behavior of the circuit elements, in this case resistors and capacitors, and 7 from
the Kirchhoff current law). However, there are only 11 variables (2 voltages and 9 currents). If we
studied the current equations carefully we would see that one of them is redundant. In fact, that
will always be true for a fully connected circuit and there are well known ways to avoid generating
the unnecessary equation. Since we are not especially concerned with the network problem – we
are only using it as an illustration – we won’t bother with the details and just drop the first current
equation.
Now we have 11 equations in 11 unknowns that look like
F(y’,y,t) = 0
(3)
but ∂F/∂y' is very clearly singular (since only two of the variables even appear with a derivative).
Hence (3) is not an implicit ODE, it is a DAE. When we solve (3)
\[ F\Big(\sum_{i=0}^{q} \alpha_i y_{n+1-i}/h,\; y_{n+1},\; t_{n+1}\Big) = 0 \]
for yn+1 we only need that
\[ \frac{\partial}{\partial y_{n+1}} F\Big(\sum_{i=0}^{q} \alpha_i y_{n+1-i}/h,\; y_{n+1},\; t_{n+1}\Big) = \frac{\partial F}{\partial y'}\,\frac{\alpha_0}{h} + \frac{\partial F}{\partial y} \qquad (4) \]
be non singular – which happens to be true in this example.
L3-7
Copyright 2006, C. W. Gear
In fact, most (but not all) electrical network problems can be solved by such a method.
Handling problems directly in this way has several advantages:
1. It avoids having to find the regular ODE (which may not always be possible).
2. If the Jacobian of F is sparse, it frequently avoids a lot of fill in and so is
computationally more efficient.
3. It handles automatically generated systems of equations directly – it is fairly easy to
set up a “rule” for a computer program to generate the equations from a circuit
description – and this is important when the “circuit” of an integrated circuit on a chip
may involve millions of elements.
However, being able to solve equation (4) on the last slide is no guarantee that we will solve the
DAE correctly. To see that, let us look at a very simple example of a DAE, namely
1
F ( y, y, t )  0
0
0
1
0
0   u  0
0   v   0
0   w 1
1
0
0
0  u   0 
u 
1   v    0   0, y   v 
 w
0   w  g (t ) 
For this equation we have
1
F  0 F 

 0
y h
y
0
0
1
0
0
0

0  0  0
h
1
0 
1
0
0
0
1 
0 
which is clearly non-singular - its determinant is -1. However, let’s use Backward Euler on it
with variable step sizes.
L3-8
Copyright 2006, C. W. Gear
Note that if we write out the equations in a more direct manner, we actually have
u = g(t)
v = u'
w = v'
In other words, it is not a differential equation at all -- and needs no initial values!
This points out a problem with the DAE formulation – it may not be what it looks like!
There may be far fewer degrees of freedom (initial values) than we might at first think.
To see what might happen we will go ahead and use a Backward Differentiation (BDF)
method on the problem – we will use Backward Euler to simplify the discussion.
When we use Backward Euler we need initial values. Let’s assume that these are u0, v0, w0
and for the moment assume that we have the correct values. (Finding them is one of the
challenges of DAEs that we will discuss later.)
L3-9
Copyright 2006, C. W. Gear
Using Backward Euler starting from t = 0 with step sizes h0, h1, … we get the following
solution at t1:
\[ \begin{pmatrix} u_1\\ v_1\\ w_1 \end{pmatrix} = \begin{pmatrix} g(h_0)\\ (u_1-u_0)/h_0\\ (v_1-v_0)/h_0 \end{pmatrix} = \begin{pmatrix} g(h_0)\\ \,[g(h_0)-g(0)]/h_0\\ \,[v_1-g'(0)]/h_0 \end{pmatrix} \]
u1 is obviously correct. v1 is a first order approximation to g'(t1), which seems like all we can
expect from a first order method. For v1 we get
\[ v_1 = \big(g(t_1) - [\,g(t_1) - h_0 g'(t_1) + h_0^2 g''(t_1)/2 - \cdots\,]\big)/h_0 = g'(t_1) - h_0\, g''(t_1)/2 + \cdots \]
which is order h correct, but from this we find that w1 is
\[ w_1 = [\,g'(t_1) - h_0 g''(t_1)/2 + \cdots - g'(0)\,]/h_0 = [\,g'(t_1) - h_0 g''(t_1)/2 + \cdots - \{g'(t_1) - h_0 g''(t_1) + \cdots\}\,]/h_0 = g''(t_1)/2 + \cdots \]
which is half of what it should be. We could wonder if this is a problem with initial values (although
we started with the correct values). (Indeed, if we went one more step with the same step size we
would get a reasonable approximation to w.) Let us take one more step with a different step size to
see what happens. As before, u and v will have reasonable solutions. For w we get
\[ w_2 = (v_2 - v_1)/h_1 = \big[\{g'(t_2) - h_1 g''(t_2)/2\} - \{g'(t_1) - h_0 g''(t_1)/2\}\big]/h_1 = \frac{g'(t_2)-g'(t_1)}{h_1} - \Big(1 - \frac{h_0}{h_1}\Big)\frac{g''(t_2)}{2} + \cdots \]
L3-10
Copyright 2006, C. W. Gear
We can rewrite w2 as
\[ w_2 = g''(t_2)\,\Big[1 + \frac{h_0}{h_1}\Big]\Big/2 + \cdots \]
If h0 ≠ h1 then this is not an approximation to the true answer. In this case, we can get the correct
answer by using equal step sizes, but that is not necessarily true for non-linear problems (which
most interesting problems are!) This difficulty was not an artifact of the particular problem we
chose (which wasn’t really a differential equation at all) as we will see now.
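A minimal numerical check of this effect (illustrative code, not from the lecture): two backward Euler steps on the u = g(t), v = u', w = v' system with g(t) = sin t, once with equal step sizes and once with h1 = h0/2. The variable names are ad hoc.

g = @(t) sin(t);  gp = @(t) cos(t);  gpp = @(t) -sin(t);
for ratio = [1 0.5]                          % h1 = ratio*h0
    h0 = 0.01;  h1 = ratio*h0;
    u0 = g(0);  v0 = gp(0);                  % correct initial values
    t1 = h0;       u1 = g(t1);  v1 = (u1-u0)/h0;  w1 = (v1-v0)/h0;
    t2 = t1 + h1;  u2 = g(t2);  v2 = (u2-u1)/h1;  w2 = (v2-v1)/h1;
    fprintf('h1/h0 = %4.2f   w2 = %9.6f   g''''(t2) = %9.6f\n', ratio, w2, gpp(t2));
end

With equal steps w2 is a reasonable approximation to g''(t2); with the changed step it is larger by the factor (1 + h0/h1)/2 = 1.5, exactly as the expansion above predicts.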
There is another important source of DAEs that exhibit very similar characteristics. They
come from MECHANICAL SYSTEMS consisting of many rigid components interconnected at
a variety of joints which can be of many kinds (freely rotating, sliding, hinges {which can
rotate only in one direction}, etc.) They are really networks of elements, and, as with
electrical networks, it can be very difficult to write down the governing differential equations.
We will illustrate the class of problems with a very simple example – a classic
example – of a pendulum consisting of a point of mass M
swinging on an inextensible, weightless rod of length L. (Note that
most examples we construct are idealizations of reality. There is
no such thing as an inextensible, weightless rod in real life.
However, when we model problems we often ignore issues that we
do not believe affect the solution to the level of accuracy of
interest.)
[Diagram: pendulum of length L at angle θ from the vertical, with mass M and gravitational force Mg.]
This problem is simple enough that we can write down an ODE
without difficulty. If we parameterize by the angle from the
vertical, θ, it is not difficult to write down the ODE
\[ \theta'' = -g\,\sin(\theta)/L \]
L3-11
Copyright 2006, C. W. Gear
However, writing down the ODE, as we did on the last slide, required a little bit of
geometry/trigonometry – that was easy in this case, but could be very tedious in a complex
case. Suppose, instead we had written down Newton’s equations of motion for the point mass
and added the constraint that it was connected to the rigid pendulum rod. This is an example
of a general approach that leads to the Euler-Lagrange equations of motion.
Instead of using the angle as the variable, we use the usual 2-D Cartesian
coordinates, x and y, and the velocities u and v. Newton's equations of
motion (ignoring the constraint imposed by the pendulum) are
   x' = u
   y' = v
   Mu' = 0
   Mv' = -M gr
where gr is the acceleration due to gravity – which we will take as 1.
However, we also need to take account of the constraint, which is
   g(x,y) = x^2 + y^2 - L^2 = 0
This is done by adding a term to the acceleration equations that provides
a force in the direction the constraint acts whose purpose is to keep the
rod at its fixed length. The resulting system of equations is
   x' = u
   y' = v
   Mu' = λ gx
   Mv' = -M + λ gy
   g(x,y) = 0
where gx and gy are the partials of g with respect to x and y respectively.
[Diagram: the pendulum mass M at (x, -y) with gravitational force Mg.]
In this case, g(x,y) = x^2 + y^2 - L^2. The advantage of this statement of the problem is that
it is easy to generate the equations for a complex, interconnected system – one just writes
down Newton's equations of motion for each separate piece, and then adds the constraints.
L3-12
Copyright 2006, C. W. Gear
For a general system we can write the Euler-Lagrange equations of motion in the form
x' = v
v' = f(x,v) + G^T λ
g(x) = 0
where x is a vector of all spatial coordinates, v is a vector of all velocities, g(x) is a vector
of all of the constraints, G is its Jacobian with respect to x, and λ (called a Lagrange
multiplier) is a vector whose dimension is the number of constraints and which can be
interpreted as the force provided to maintain the constraint (e.g., the force in the pendulum
rod in our earlier example). G^T λ is a force that is perpendicular to the constraint surface –
just as the force in the pendulum rod is perpendicular to the path of the pendulum ball.
The advantage of this representation is that it is relatively easy for a computer program to
generate the equations for a complex mechanical system. The downside to the technique
is that it is difficult to solve the problems automatically in this form with an automatic,
variable step size code. It is easy to solve them with fixed step size BDF methods (but
usually that is inadequate for large problems). The next two slides show a Matlab code and
the results for BDF methods of order 1 through 6 for the pendulum problem. As can be
seen, the correct order is obtained. The different colored lines give the errors in the
different components of the solution. (This code uses a regular Matlab integrator to
determine starting values for the method from an ODE form. Getting correct starting values
is, in general, another difficulty with DAEs.)
L3-13
Copyright 2006, C. W. Gear
Matlab code for fixed step BDF methods for Pendulum problem
% Ex10 Attempted solution of 2-D pendulum problem by BDF, several fixed h's.
% Length of pendulum is L = 1, Mass is 1 (Mass scales out except in lambda).
% We will take initial values that we know are correct.
clear
% Alpha(p,j) are the BDF coefficients for order p. \alpha_i is in Alpha(p,i+1)
Alpha = [[1 -1 0 0 0 0 0]; [3/2 -2 1/2 0 0 0 0]; [11/6 -3 3/2 -1/3 0 0 0];
         [25/12 -4 3 -4/3 1/4 0 0]; [137/60 -5 5 -10/3 5/4 -1/5 0];
         [147/60 -6 7.5 -20/3 3.75 -6/5 1/6]];
Lt = 5;                              % Length of integration interval
x0 = 0.6;                            % initial x
y0 = -sqrt(1-x0^2);                  % initial y
u0 = 0; v0 = 0;
% Parameters for computing very accurate solution
theta0 = [asin(.6); 0];              % Initial angular position & velocity
tol = 1E-12;
for order = 3:6
    alpha_0 = Alpha(order,1);
    alpha = Alpha(order,order+1:-1:2);      % Reverse order
    % Storage for errors and step sizes used:
    x_err = []; y_err = []; u_err = []; v_err = []; lambda_err = [];
    H = [];
    for j = 1:10-order
        N = 2^(j+3);                 % Number of steps
        h = Lt/N;                    % Fixed step size
        % Compute very accurate solution for starting and end point
        options = odeset('RelTol',tol,'AbsTol',tol,'Stats','off');
        [Time,Theta] = ode113(@fun10,[h*(-order+1:0) Lt],theta0,options);
        x_end = sin(Theta(:,1)); y_end = -cos(Theta(:,1));
        u_end = -Theta(:,2).*y_end; v_end = Theta(:,2).*x_end;
        lambda_end = -(Theta(:,2).^2 - y_end)/2;
        x_old = x_end(1:order); y_old = y_end(1:order);    % Starting values for order
        u_old = u_end(1:order); v_old = v_end(1:order);
        lambda_old = lambda_end(1:order);
        t = h*(order - 1);
        X = x_old; Y = y_old; T = (0:order-1)'*h;          % Saved values for plots
        for i = 1:N                                        % integrate N steps
            % We will "predict" the next value by using the current one
            x = x_old(end); y = y_old(end); lambda = lambda_old(end);
            t = t+h;
            Del = 1;     % Del will be a measure of the change in the Newton iteration.
                         % When small, stop iterating. Forces the first iteration.
            while Del > 1E-11                              % test for Newton convergence
                % These are the residuals that we wish to zero
                f = [(alpha_0*(alpha_0*x + alpha*x_old)/h + alpha*u_old - 2*lambda*x*h);
                     (alpha_0*(alpha_0*y + alpha*y_old)/h + alpha*v_old - 2*lambda*y*h + h);
                     (x^2 + y^2 - 1)];
                % Derivative for Newton iteration
                J = [[(alpha_0^2/h-2*lambda*h)  0                         -2*x*h];
                     [0                         (alpha_0^2/h-2*lambda*h)  -2*y*h];
                     [2*x                       2*y                        0   ]];
                Dx = -J\f;                     % Newton change -- J\f means J^(-1)*f
                x = x + Dx(1); y = y + Dx(2); lambda = lambda + Dx(3);  % Update iterate
                Del = sum(abs(Dx));            % Measure of change for convergence test
            end
            % Compute final u and v for error comparison
            u = (alpha_0*x + alpha*x_old)/h; v = (alpha_0*y + alpha*y_old)/h;
            x_old = [x_old(2:end); x]; y_old = [y_old(2:end); y];
            u_old = [u_old(2:end); u]; v_old = [v_old(2:end); v];
            lambda_old = [lambda_old(2:end); lambda];      % Save past values
            X = [X;x]; Y = [Y;y]; T = [T;t];
        end
        x_err = [x_err; x-x_end(end)]; y_err = [y_err; y-y_end(end)];
        u_err = [u_err; u-u_end(end)]; v_err = [v_err; v-v_end(end)];
        lambda_err = [lambda_err; lambda-lambda_end(end)]; H = [H;h];
    end
    figure(10+order);
    loglog(H,abs(x_err),'-b',H,abs(y_err),'-g',H,abs(u_err),'-r',...
           H,abs(v_err),'-m',H,abs(lambda_err),'-k')
    title(['Order ' num2str(order) ' Method'])
    eval(['print -dpsc Ex10order' num2str(order)])
    pause(0.1)
end

function derivative = fun10(t,y)
% Pendulum in angular coordinates
derivative(1,1) = y(2);
derivative(2,1) = -sin(y(1));
L3-14
Copyright 2006, C. W. Gear
Errors for BDF fixed step methods, order 1 to 6. X-axis is step size, h, y-axis is error
L3-15
Copyright 2006, C. W. Gear
In some cases we can even manage to get a variable step size method to work, but we have to
remember that the problem we saw on slide L3-10, where an incorrect value was calculated for one
variable when the step size changed, can cause trouble here too. The plot below was produced by the
variable step size Backward Euler code on the next slide for the pendulum problem. (That code
is very similar to the one two slides back, so many comments have been left out to fit the code
on one slide.) (Because of the linear relation between x and u and between y and v, u and v have been
eliminated to simplify the calculation – that step does not affect the results.) We plot the errors
in x, y, and λ as the requested tolerance is reduced. (Because we are controlling the local error,
which is O(h^2), by tol, h is proportional to √tol, so the global error, which is proportional to h, is also
proportional to √tol.)
L3-16
Copyright 2006, C. W. Gear
% Ex10auto Solution of 2-D pendulum problem by B. Euler with error control.
% Length of pendulum is L = 1, Mass is 1 (Mass scales out except in lambda).
% We will take initial values that we know are correct.
x0 = 0.6; y0 = -0.8; u0 = 0; v0 = 0;
h = 0.001;                               % Initial step size
Lt = 10;
% Compute very accurate solution for starting and end point
options = odeset('RelTol',1E-11,'AbsTol',1E-11,'Stats','off');
[Time,Theta] = ode113(@fun10,[0 Lt],[asin(.6); 0],options);
x_end = sin(Theta(end,1)); y_end = -cos(Theta(end,1));
u_end = -Theta(end,2).*y_end; v_end = Theta(end,2).*x_end;
lambda_end = -(Theta(end,2).^2 - y_end)/2;
Xer = []; Yer = []; Lambdaer = []; Tol = [];
for itol = 1:9                           % Max itol of 9
    tol = 2E-4*2^(-itol);                % Our error tolerance
    Tol = [Tol; tol];
    xn = x0; yn = y0; un = u0; vn = v0; lambda_n = 0; t = 0;   % Past values at last t
    % Bookkeeping arrays (needed for the end-point interpolation below)
    X = xn; Y = yn; Lambda = lambda_n; T = t; H = [];
    first = 1;
    while t < Lt                         % integrate past Lt
        fail = 1;                        % Flag indicating next step not yet successful
        while fail                       % Loop until we have a successful step
            if first
                x = xn; y = yn; lambda = lambda_n;
            else
                % We will "predict" the next value using linear extrapolation
                xp = xn +(xn-x_old)*h/h_old; yp = yn +(yn-y_old)*h/h_old;
                lambda_p = lambda_n +(lambda_n-lambda_old)*h/h_old;
            end
            Del = 1;
            while Del > 1E-10            % test for Newton convergence
                f = [((x - xn)/h - un - 2*lambda*x*h);
                     ((y - yn)/h - vn +h - 2*lambda*y*h);
                     (x^2 + y^2 - 1)];
                J = [[(1/h-2*lambda*h)  0                 -2*x*h];
                     [0                 (1/h-2*lambda*h)  -2*y*h];
                     [2*x               2*y                0   ]];
                Dx = -J\f;
                x = x + Dx(1); y = y + Dx(2); lambda = lambda + Dx(3);
                Del = sum(abs(Dx));
            end
            if first
                h_new = h;
                fail = 0;                % There will be no error test for the first step
            else
                % Not first step, so we have information to test error
                Error_est = sqrt((x-xp)^2 + (y - yp)^2);   % Estimated corrector error
                h_ratio = sqrt(Error_est/tol);             % (error controlled on x and y only)
                if h_ratio >= 1
                    h = 0.9*h/h_ratio;   % Reduce h for failed step
                else
                    if h_ratio < 0.9
                        h_new = 0.9*h/h_ratio;             % Increase h if possible
                    else
                        h_new = h;
                    end
                    fail = 0;
                end
            end
        end
        t = t + h;                       % Advance time by the step size finally accepted
        u = (x-xn)/h; v = (y-yn)/h;      % Compute new u and v
        x_old = xn; y_old = yn; lambda_old = lambda_n;     % Save past values
        xn = x; yn = y; un = u; vn = v; lambda_n = lambda;
        h_old = h; h = h_new;
        X = [X; xn]; Y = [Y; yn]; Lambda = [Lambda; lambda_n]; T = [T; t];
        H = [H; h_old];
        first = 0;
    end
    % Linearly interpolate to end point
    r = (T(end) - Lt)/(T(end) - T(end-1));
    x_e = X(end)*(1-r) + X(end-1)*r;
    y_e = Y(end)*(1-r) + Y(end-1)*r;
    lambda_e = Lambda(end)*(1-r) + Lambda(end-1)*r;
    Xer = [Xer; (x_e - x_end)];
    Yer = [Yer; (y_e - y_end)];
    Lambdaer = [Lambdaer; (lambda_e - lambda_end)];
end
figure(10)
loglog(Tol,abs(Xer),'-b',Tol,abs(Yer),'-g',Tol,abs(Lambdaer),'-k')
xlabel('Tolerance requested'); ylabel('Error');
title('Variable step size Backward Euler for Pendulum')
print -dpsc Ex10

function derivative = fun10(t,y)
% Pendulum in angular coordinates
derivative(1,1) = y(2);
derivative(2,1) = -sin(y(1));
L3-17
Copyright 2006, C. W. Gear
Note that in the code above we are only controlling the errors in x and y, NOT in λ. (See the Error_est
line in the code.) If we tried to control errors in λ, the code would fail to make headway because the variable λ
is like the variable w on Slide L3-10 and has big errors when h changes in size. Since λ is calculated
“from scratch” at each step (it is not given by a differential equation) these errors do not propagate
to future steps so the final error is still reasonable. HOWEVER, if the user is interested in the values
of λ along the trajectory then these errors are an issue. Since the values of λ represent the forces at
the connections, the modeler is very often interested in them – one must be certain that the physical
system is strong enough to stand up to the forces.
How do we know when a system is likely to cause trouble? The most important characteristic of a
DAE problem that determines whether we can compute a solution easily is its index. There are
many definitions of index in the literature. Perhaps the first, and one of the easiest to understand, is
the differentiation index. It is defined as the number of times that one has to differentiate a DAE in
order to be able to find a system of ODEs that have a solution equal to the solution of the original
DAE. Let us illustrate with our first example:
1
F ( y, y, t )  0
0
0
1
0
0   u  0
0   v   0
0   w 1
1
0
0
0  u   0 
u 
1   v    0   0, y   v 
 w
0   w  g (t ) 
Writing this in the simpler form u = g(t); v = u’; w = v’ we see that we have to differentiate 3 times,
first getting u’ = g’(t), v’ = u”, w’ = v”; next getting u” = g”(t), v” = u’’’, w’’ = v’’’; and finally
getting u’’’ = g’’’(t), v’’’ = u””, w’’’ = v””. From this system we can extract the ODE
u’ = v, v’ = w, w’ = g’’’(t). Hence the index of this system is 3.
L3-18
Copyright 2006, C. W. Gear
The Euler-Lagrange equations with constraints also have index 3. If we start with
x' = v
v' = f(x,v) + G^T λ
g(x) = 0
and differentiate the last equation we get
G x' = 0
since G = ∂g/∂x. This leads to the equation Gv = 0 and we still don't have a differential
equation for λ. Hence we differentiate again to get
\[ G v' + G_x v^2 = G f(x,v) + G G^T \lambda + G_x v^2 = 0 \]
One more differentiation yields
\[ \lambda' = -[G G^T]^{-1}\big[\,G_{xx} v^3 + G_x v' v + G_x v v' + G_x f(x,v) + G\{f_x v + f_v (f + G^T \lambda)\}\,\big] \]
There are a number of reasons for not wanting to generate this differential equation. One obvious
one is that it looks like we would get a real mess to evaluate! However, there is a more important
reason, and that is that the three differentiations have introduced three (or more) arbitrary constants
of integration. Going back to our earlier trivial example, we turned
u = g(t); v = u’; w = v’
which has no constants of integration because it is not a differential equation into
u’ = v, v’ = w, w’ = g’’’(t)
which has three since its solution is
u = g(t) + c1 + c2 t + c3 t^2/2,  v = g'(t) + c2 + c3 t,  w = g''(t) + c3
If we were to integrate this system numerically, the computed solution would wander off the manifold
of the solution to the original problem.
L3-19
Copyright 2006, C. W. Gear
Meaning of the differential index
Let us consider a restricted form of DAEs called semi-explicit. (Generally, a DAE could be
brought into this form by a transformation of variables, although it is often not practical.
However, this form arises a lot in practice, and makes it easier to understand what is going on.)
Suppose we have the system:
y' = f(y,z)
0 = g(y,z)
If we differentiate the second equation we get
0 = gy f(y,z) + gz z'
If gz is non-singular we can solve this for z' to get
z' = -gz^{-1} gy f(y,z)
which is a differential equation for z, thus indicating that the differential index is 1. Thus, the
non-singularity of gz is a necessary and sufficient condition for the problem to have index 1. Index 1
problems occur quite frequently and fortunately are very easy to solve. The BDF methods work
for index one problems (replace y' with the linear combination of past y's and solve for the most
recent y and z).
An index-one problem specifies a manifold in the y-z space, given by the solution of g(y,z) = 0,
on which the solution of the DAE must lie. In principle, we could just integrate for y with a
standard ODE package, and each time we need to evaluate y' for a value of y we first solve
g(y,z) = 0 for z(y) and substitute into f(y,z) to find y'. (In practice, this may be inefficient, so
direct substitution with the BDF formula may be better.)
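A minimal sketch (illustrative, not from the lecture) of the direct-substitution approach for a semi-explicit index-1 DAE: backward Euler replaces y' and the constraint is kept in the nonlinear system solved at each step. The particular f and g used here are assumptions chosen only to make the example concrete.

% Semi-explicit index-1 DAE:  y' = -(y + z),  0 = z - y^3,  y(0) = 1.
f = @(y,z) -(y + z);
g = @(y,z) z - y^3;                        % gz = 1, so the index is 1
h = 0.01;  y = 1;  z = y^3;                % consistent initial values
for n = 1:500                              % backward Euler; Newton on (y,z) jointly
    yn = y;
    for it = 1:20
        r = [ (y - yn)/h - f(y,z) ;        % differential equation residual
              g(y,z) ];                    % algebraic constraint residual
        J = [ 1/h + 1,  1 ;                % dr1/dy, dr1/dz  (for f = -(y+z))
              -3*y^2,   1 ];               % dr2/dy, dr2/dz  (for g = z - y^3)
        d = -J\r;  y = y + d(1);  z = z + d(2);
        if norm(d) < 1e-13, break, end
    end
end
fprintf('y(5) = %g,  constraint residual g(y,z) = %g\n', y, g(y,z))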
Note that finding suitable initial values is not difficult. As long as we are only given y0 we do not
have to do anything special.
L3-20
Copyright 2006, C. W. Gear
Suppose gz is singular, so the index is greater than one. Let us take the extreme case
y’ = f(y,z)
0 = g(y)
In this case, the first differentiation gives us
gyf(y,z) = 0
(1)
and we will have to differentiate at least one more time to get
gy fz z' + gy fy f(y,z) = 0
Clearly, if gy fz is non singular we can now solve for z' to get an ODE for z and the index is
two. However, note that equation (1) specifies a second manifold (in addition to g(y,z) = 0)
on which the solution must lie. We call this a hidden manifold and a large part of the
difficulty with DAEs is the restriction that the solution must lie on these hidden manifolds
although they can only be revealed by differentiation (which we would prefer not to do).
In general, for an index m problem there is one visible manifold and m-1 hidden manifolds on
which the solution must lie.
There are no good direct computational methods for the most general problems of high
index. Index 1 problems are easy. Many index 2 problems are OK. Special methods are
needed for higher index problems.
For a good reference on DAEs, use the book K.E. Brenan, S.L. Campbell, and L.R. Petzold,
Numerical solution of initial-value problems in differential-algebraic equations, SIAM,
Philadelphia, 1996
L3-21
Copyright 2006, C. W. Gear
What can we do if we have a high-index problem?
(assuming that it is not feasible to manipulate them symbolically into a simpler form)
1. We could reduce the index by differentiation, but that introduces additional constants of
   integration so we could wander off the constraint manifolds, so we must …
   a) add additional equations as constraints to force the solution to lie in the manifolds
      of the problem
   b) project the solution back onto the constraints after each integration step.
2. "Regularize" the equations – that means convert to a differential equation (or DAE of
   lower index) which is stiff and whose slow solution is (approximately) the solution of the
   DAE.
In either of these methods we have to be able to "get at the structure" of the DAE system to
modify it in some way, and this almost always has to be done symbolically. Generally we can
only deal with DAEs with restricted structures because of the near impossibility of performing
the necessary manipulations on the most general form of equations.
In fact, many DAEs have a special structure because of the way they arise. These are
described on the next slide.
L3-22
Copyright 2006, C. W. Gear
The three most common special cases of DAEs are:

Semi-explicit:
   y' = f(y,z)
   0 = g(y,z)

Linear in derivatives:
   A y' = f(y,t)
   where the matrix A is constant (but singular)

"Hessenberg":
   y1' = f1(y1,y2)
   y2' = f2(y1,y2,y3)
   …
   yk' = fk(y1,y2,…,yk,z)
   0 = g(y1)
The first form, although it looks simpler than the general case F(y',y,t) = 0 really isn't any help
without additional information since we can always convert the latter to the former by introducing
another variable, z, and writing
y' = z
0 = F(z,y,t)
(although this transformation does increase the index by 1.) Hence, the only case of interest here
is the case when gz is nonsingular and we have an index 1 problem.
The second form arises in, for example, finite element modeling where A is the Mass matrix. A
could also depend on t but in that case it would be important that its structure did not change
(e.g., its rank should not change during the calculation).
The third form occurs frequently – for example, the Euler-Lagrange equations have this form, and,
fortunately, Hessenberg DAEs are one type of index 2 DAEs that can be solved without too much
difficulty.
L3-23
Copyright 2006, C. W. Gear
Hessenberg DAEs
y1' = f1(y1,y2)
y2' = f2(y1,y2,y3)
…
yk' = fk(y1,y2,…,yk,z)
0 = g(y1)
One important fact about Hessenberg DAEs is that their index is at least k+1. This follows by
differentiating first g and then f1, f2, … , fk in turn. Only after these k+1 differentiations does
z' appear in the result. By that time we will have
\[ \frac{\partial g}{\partial y_1}\,\frac{\partial f_1}{\partial y_2}\,\frac{\partial f_2}{\partial y_3}\cdots\frac{\partial f_k}{\partial z}\; z' + \cdots = 0 \]
where the other terms are functions of all of the y variables. Hence, if the matrix
\[ \frac{\partial g}{\partial y_1}\,\frac{\partial f_1}{\partial y_2}\,\frac{\partial f_2}{\partial y_3}\cdots\frac{\partial f_k}{\partial z} \]
is nonsingular, the problem has index k+1. Fortunately, in many problems the partial derivatives in this
product are either constant or have a simple structure that enables us to tell whether it is non-singular
or not fairly easily. For example, in the simple Euler-Lagrange equations
x’ = v
v’ = f(x,v) + GTλ
g(x) = 0
we see that the matrix product is GG^T. G specifies the directions in which the different
constraint forces act. These have to be linearly independent or one or more of the constraints would be
redundant (like using two pieces of string to hold a single pendulum). This linear independence of the
rows of G means that GG^T is nonsingular. Thus, this guarantees that the index is 3.
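For the pendulum this check is a one-line computation (a worked instance of the statement above, using the constraint already written down on slide L3-12):
\[ g(x,y) = x^2 + y^2 - L^2, \qquad G = \begin{pmatrix} 2x & 2y \end{pmatrix}, \qquad G G^T = 4(x^2 + y^2) = 4L^2 \neq 0, \]
so the single row of G is trivially linearly independent and the index is 3.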
L3-24
Copyright 2006, C. W. Gear
Because of the structure, it is very obvious how to reduce the index of a Hessenberg system
of DAEs. If we want to reduce the index by 1, we simply replace the "algebraic" equation
g(y1) = 0 with its derivative to get
\[ g_1(w_1) \equiv \frac{\partial g}{\partial y_1}\, f_1(y_1,y_2) = 0 \]
where we have renamed the set of variables (y1,y2) as w1. If we now rename y3 through yk
as w2 through wk-1 we will see that we have the reduced set of equations
w1' = f2(w1,w2)
…
wk-1' = fk(w1,w2,…,wk-1,z)
0 = g1(w1)
where the f's have been suitably redefined. This system has index k.
This reduction is particularly simple for the Euler-Lagrange equations because we already
have the partial of g, namely G, so the new algebraic equation is simply
Gv = 0
This is called the velocity constraint since it says that any velocities must be tangential to
the constraints (so that there is no motion away from the constraint). However, when we
discard the equation g(x) = 0 we no longer have a system that stays on the position
constraint g(x) = 0. Small errors in the numerical integration could cause us to wander
off the constraint, and, for some systems, this could be disastrous, leading to physically
unrealistic systems or unstable systems.
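A minimal sketch (illustrative, not from the lecture) of the "project back onto the constraints" idea mentioned as option 1(b) on slide L3-22, specialized to the pendulum: after an integration step the position is rescaled onto the circle g(x,y) = x^2 + y^2 - L^2 = 0 and the velocity has its normal component removed so that Gv = 0. The function name and interface are assumptions.

% Project a pendulum state back onto the position and velocity constraints.
% (x,y,u,v) : state after an unprojected integration step,  L : rod length.
function [x,y,u,v] = project_pendulum(x,y,u,v,L)
s = L/sqrt(x^2 + y^2);                 % rescale position onto x^2 + y^2 = L^2
x = s*x;  y = s*y;
G = [2*x, 2*y];                        % Jacobian of g(x,y) = x^2 + y^2 - L^2
w = [u; v] - G'*((G*[u; v])/(G*G'));   % remove velocity component normal to the constraint
u = w(1);  v = w(2);
end

Applied after every step this keeps the numerical solution from drifting off g = 0, although, as discussed below, the projected result will then no longer exactly satisfy the other difference equations.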
L3-25
Copyright 2006, C. W. Gear
One approach is to add this equation back to the system, but we would now have an apparently
overdetermined system – more equations than there are unknowns (although we know that the
equations are consistent). If we do this we get the system
x' = v
v' = f(x) + G^T λ
0 = Gv
0 = g(x)
and if we tried to solve this by replacing the derivatives with their backward differentiation
approximations we would not only have an overdetermined system, but it would probably
not have a solution because the numerical approximations to the derivatives introduce
errors. In the pendulum example with backward Euler we get the system:
\[ \begin{pmatrix} x - x_n\\ y - y_n\\ u - u_n\\ v - v_n \end{pmatrix} = h \begin{pmatrix} u\\ v\\ 2\lambda x\\ -1 + 2\lambda y \end{pmatrix}, \qquad 0 = xu + yv, \qquad 0 = x^2 + y^2 - 1 \]
6 equations to solve for the 5 variables x, y, u, v, and λ. (Here we have dropped the subscript n+1
on the new values that have to be calculated to reduce clutter. The n subscript refers to the
previous time point.) In general, they will not have a solution.
L3-26
Copyright 2006, C. W. Gear
How can we enforce an extra constraint that we know is satisfied by the differential system but
will probably not be satisfied by the difference equation? We can ask this question about an
ODE system also. Suppose we have an ODE y’ = f(y) which we know has an invariant (that is, a
function of the variables that doesn’t change). For example, such an invariant could be the total
energy of a closed system, the total mass, or other quantities. (For chemical kinetics, the mass
balance equations which essentially state that the total number of each type of atom present is
constant, are linear relations and it turns out that virtually all numerical methods automatically
preserve these within rounding error so they are not an issue. For other systems, the energy,
which is a non-linear function, is not normally conserved by a numerical method.)
So we suppose that there is a function, g(y), that is conserved by the differential equation (say,
at the value 0), meaning that dg/dt = gyf(y) = 0, and we want to ensure that g = 0 is conserved
by the numerical method. We can do this by appending the constraint with its own Lagrange
multiplier in the form
y' = f(y) + G^T μ      (1)
0 = g(y)               (2)
where G = gy. Here μ is an additional variable (actually, a number of variables equal to the
dimension of g) that gives us the same number of variables as equations. However, note that
this modification does not change the solution of the differential equation because, since we have
the relation Gf = 0, by differentiating (2) and substituting for y' from (1) we get
0 = G y' = G f + G G^T μ = G G^T μ
or μ = 0, so that y' = f(y).
L3-27
Copyright 2006, C. W. Gear
Equations (1) and (2) on the preceding slide can be seen to be an index 2 DAE. It may not seem
very useful to convert an ODE (which is easy to solve) into an index 2 DAE which is often not so
easy to solve. However, note that this DAE is in Hessenberg form, which is relatively easy to
solve numerically.
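A minimal sketch (illustrative, not from the lecture) of solving the index 2 system (1)-(2) by backward Euler and Newton iteration, for the pendulum written in angle coordinates with the energy as the conserved quantity g(y); the step size and names are assumptions.

% Invariant-preserving form y' = f(y) + G'*mu, 0 = g(y), by backward Euler + Newton.
% Here y = [theta; omega], f = [omega; -sin(theta)],
% g(y) = omega^2/2 - cos(theta) - E0 (the energy), and G = [sin(theta), omega].
theta = 1;  omega = 0;  E0 = omega^2/2 - cos(theta);    % initial values
h = 0.05;  mu = 0;
for n = 1:200
    th_n = theta;  om_n = omega;
    for it = 1:20                                       % Newton iteration
        r = [(theta - th_n)/h - omega - mu*sin(theta);
             (omega - om_n)/h + sin(theta) - mu*omega;
             omega^2/2 - cos(theta) - E0];
        J = [1/h - mu*cos(theta),  -1,         -sin(theta);
             cos(theta),            1/h - mu,  -omega;
             sin(theta),            omega,      0];
        d = -J\r;
        theta = theta + d(1);  omega = omega + d(2);  mu = mu + d(3);
        if norm(d) < 1e-12, break, end
    end
end
fprintf('energy drift after 200 steps: %g\n', omega^2/2 - cos(theta) - E0)

Because the energy constraint is part of the system solved at each step, the printed drift stays at the level of the Newton tolerance rather than growing with the integration.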
We can use the same approach to DAEs when we reduce their index by differentiation. Going
back to our pendulum example, we can write the enhanced system after one differentiation as
x' = v + G^T μ
v' = f(x) + G^T λ      (1)
0 = Gv
0 = g(x)
which makes it an index-2 Hessenberg DAE.
We can use this process to reduce the index of any high-index system, but not below index 2.
(However, if the system is not already in Hessenberg form so it is clear which are the
constraints that have to be differentiated, this process will involve a lot of symbolic
manipulation which may not be practical.)
In making this transformation we added a term to the derivative of x in the first equation above.
It modifies x in the direction G^T. This direction is perpendicular to the constraint, g = 0. Thus
we can view this as nothing more than a projection back onto the constraint g = 0, so we
could consider simply integrating the reduced equations one step and then projecting the
solution back onto the original constraint. There is, however, a subtle difference between this
and the process given by equations (1) above. In the equations above we require the projected
result to satisfy all the other difference equations, whereas if we project x after the integration
step we will no longer exactly satisfy the other equations such as Gv = 0.
L3-28
Copyright 2006, C. W. Gear
Initial Conditions for DAEs
When we have an initial value problem for an ODE, we know exactly what initial conditions we
need – the value of each variable at the initial time. When we have a DAE the issue is much
more complex. We need an initial value that lies on each of the manifolds implied by the
constraint – both the visible manifold, g(y) = 0, and the hidden manifolds. This is the
problem of consistent initial conditions.
There is one hidden manifold for each index in excess of 1, so if we have an index one problem
there are no hidden manifolds so it is not difficult to determine if a set of initial conditions are
consistent, although it may not be trivial to determine them in the first place. In the case of an
index one problem written in semi-explicit form
y' = f(y,z)
0 = g(y,z)
the calculation is straightforward. We need an initial condition for y and then can calculate z
from the constraint equation.
Of course, it is possible that the user has some of the y values and some of the z values and
wishes to find the remaining y and z values. If the dimension of y is n and the dimension of z
is m then we need n initial conditions altogether and we could work with any unknowns for
which the Jacobian of g (which has to consist of m equations) with respect to the unknown
values is non-singular.
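A minimal sketch (illustrative, not from the lecture) of this calculation: given the n components of y at t = 0, the m unknown components of z are found by a Newton iteration on the constraint alone. The two-equation g used here is an assumption made only to have something concrete to solve.

% Consistent initialization for a semi-explicit index-1 DAE: solve g(y0,z0) = 0 for z0.
y0 = [2; 1];                               % given initial y  (n = 2)
g  = @(y,z) [ z(1) + z(2) - y(1)^2 ;       % illustrative constraints, m = 2
              z(1)*z(2)  - y(2) ];
gz = @(y,z) [ 1,    1    ;                 % Jacobian of g with respect to z
              z(2), z(1) ];
z = [3; 0.5];                              % initial guess
for it = 1:30
    d = -gz(y0,z)\g(y0,z);
    z = z + d;
    if norm(d) < 1e-14, break, end
end
z0 = z;                                    % consistent initial z
disp(g(y0,z0)')                            % residual should be essentially zero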
L3-29
Copyright 2006, C. W. Gear
When we have a higher index problem (or an index 1 problem not in semi-explicit form)
the consistent initial value issue is much more difficult since it essentially requires that
we determine the hidden manifolds. Since these are only revealed by differentiation we
either have to differentiate the system symbolically or use some numerical technique to
achieve something similar.
In practice, the appropriate technique will depend on the type of problem so it is difficult
to give general methods. In the case of, for example, Euler-Lagrange equations, it is
not difficult to find the first hidden manifold by differentiation, and this serves to ensure
that x and v are mutually consistent. In many cases it is possible to differentiate one
more time which is enough to develop an equation for the Lagrange multiplier, λ.
Specifically, let us suppose that the dimension of x (and v) is n and the number of
visible constraints – the dimension of g – is m. Then we need to know n – m
components of x. With these and the constraint g(x) = 0 we can compute the
remaining m initial values of x. Next we need to know n – m components of v. With
these and the first hidden constraint equation Gv = 0 we can compute the remaining m
initial values for v. We could then differentiate again to find λ, but if we are going to
reduce the index of the problem to 2 by one differentiation (and then add back the
constraint with its own Lagrange multiplier μ as we did earlier) and solve by a method
such as BDF, we do not need initial values for λ or μ.
L3-30
Copyright 2006, C. W. Gear
We have discussed how to reduce the index of DAEs by differentiation and then ways to force
the solution back onto the constraint manifold(s). An alternative approach to reducing the
index is by what is known as a regularization technique. Regularization refers to the
process of taking an ill-posed problem – that is, a problem which is very sensitive to
perturbations – and turning it into a well-posed problem – one whose answer changes only by
an amount more or less of the same size as the perturbation. (This is a very loose definition
since our purpose here is not to delve into complex mathematical analysis but to get an
understanding of issues facing the user of integration methods.)
There is a sense in which DAEs are ill-posed problems. The reason is that, as we have seen
with our first trivial example on slide L3-8, their solution may involve the derivative of given
functions. Hence any perturbation in a derivative will show up in the solution. If we specify a
function numerically and then approximate its derivative by numerical differentiation, the
answer is very sensitive because we have to divide by h, the spacing between the numerical
values, to get the derivative. h has to be small to get a good approximation, so we multiply by
a large number, 1/h, and amplify any errors significantly.
There are several indices defined for DAEs. We have been using the most common. Another
is called the perturbation index and is essentially defined as the number of times that any
numerical errors get multiplied by 1/h (the actual definition is much more technical than this
simplified one, but it will serve to give you the general idea). It is in this sense that a DAE is
ill-posed, and regularization attempts to remove some of these problems (although one can't
really remove them without making too many approximations).
L3-31
Copyright 2006, C. W. Gear
In regularization, we essentially try to change the DAE back to a stiff system of ODEs.
Recall that on the first slide of this lecture we started with a system of ODEs representing
a chemical kinetics problem and on the next slide used the assumption that one of the
species reacted so rapidly that it was essentially in equilibrium the whole time, so that we
could simply set its derivative to zero, thus getting a DAE. In regularization we try to
undo this process and get back to the ODE because we know how to solve ODEs! (Of
course, if we had gotten the DAE from an ODE in the first place, going back would require
little thought.)
Let us illustrate this for the simple index-1 semi-explicit case (although there is no need
to do it in this case because we can solve it directly).
y' = f(y,z);    0 = g(y,z)
changes to
y' = f(y,z);    ε gz z' = -g(y,z)
The algebraic equation has been replaced by a differential equation. We have introduced a
small parameter, ε, so that z changes very rapidly whenever g(y,z) ≠ 0 (i.e., the solution
is not on the constraint manifold). We have also chosen a scaling of z' by gz so that the
eigenvalues of the Jacobian of z' with respect to z are all -1/ε. This means that the solution
of the second differential equation very rapidly approaches a value of z that satisfies the
constraints.
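A minimal sketch (illustrative, not from the lecture) of this regularization in Matlab, reusing the simple index-1 example from earlier (y' = -(y+z), 0 = z - y^3) and a standard stiff solver; eps_reg and the particular f and g are assumptions.

% Regularize  y' = f(y,z), 0 = g(y,z)  as  y' = f(y,z), eps*gz*z' = -g(y,z).
eps_reg = 1e-6;
f   = @(y,z) -(y + z);
g   = @(y,z) z - y^3;                        % gz = 1 for this g
rhs = @(t,w) [ f(w(1),w(2)) ;
               -g(w(1),w(2))/eps_reg ];      % z' = -g/(eps*gz), with gz = 1 here
w0  = [1; 0.5];                              % deliberately inconsistent z(0)
opts = odeset('RelTol',1e-8,'AbsTol',1e-10);
[t,w] = ode15s(rhs,[0 5],w0,opts);
plot(t, w(:,2) - w(:,1).^3)                  % g(y,z) decays rapidly toward zero

Even though z(0) is off the constraint manifold, the fast component decays in a time of order ε and the solution then follows the slow manifold, which is within O(ε) of the DAE solution.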
L3-32
Copyright 2006, C. W. Gear
Applying regularization to higher index DAEs can either reduce them to ODEs or simply
reduce the index to something that can be solved. One of the earliest proposed
regularizations for Euler-Lagrange index 3 problems is due to Baumgarte. It replaces the
constraint equation g(x) = 0 with the differential equation
\[ g''(x) + \alpha\, g'(x) + \beta\, g(x) = 0 \qquad (1) \]
which is
\[ G v' + G_x v^2 + \alpha\, G v + \beta\, g = 0 \qquad (2) \]
If we substitute the equation for v' into equation (2), we get an equation that can be solved
explicitly for λ, so while we have actually reduced the index to one, we can obtain a system of
differential equations with λ eliminated. This system of ODEs has dimension 2n (if x and v have
dimension n) so we have a system that does not necessarily lie on the constraint manifolds.
However, if we look at equation (1) and think of it as an equation for g we see that it has a
stationary value at g(x) = 0. If we choose α and β so that this is a stable equilibrium, then if the
solution is off the constraint g(x) = 0 it will return to the constraint. It is often suggested that we
should choose α = 2η, β = η^2 because then g returns to zero as exp(-ηt). In this case, our final
system is:
x' = v
v' = f + G^T λ
0 = G G^T λ + G f + G_x v^2 + 2η G v + η^2 g
or we could replace the last two equations with
\[ v' = f - G^T [G G^T]^{-1}\big[\,G f + G_x v^2 + 2\eta\, G v + \eta^2 g\,\big] \]
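A minimal sketch (illustrative, not from the lecture) of the Baumgarte-regularized pendulum solved with a standard Matlab integrator; L = 1, M = 1, gravity = 1 and eta = 10 are assumed values.

% Baumgarte regularization of the pendulum: lambda chosen so that
% g'' + 2*eta*g' + eta^2*g = 0 along the solution.
eta = 10;
w0  = [0.6; -0.8; 0; 0];                        % consistent initial x, y, u, v
[T,W] = ode45(@(t,w) pend_baumgarte(t,w,eta), [0 5], w0);
plot(T, W(:,1).^2 + W(:,2).^2 - 1)              % drift of the position constraint

function dw = pend_baumgarte(~,w,eta)
x = w(1); y = w(2); u = w(3); v = w(4);
g    = x^2 + y^2 - 1;                           % position constraint
Gv   = 2*(x*u + y*v);                           % g' = G*v
Gf   = -2*y;                                    % G*f  with  f = (0,-1)
Gxv2 = 2*(u^2 + v^2);                           % G_x v^2 term
GGT  = 4*(x^2 + y^2);
lambda = -(Gf + Gxv2 + 2*eta*Gv + eta^2*g)/GGT; % solve the last equation for lambda
dw = [u; v; 2*lambda*x; -1 + 2*lambda*y];       % x' = v,  v' = f + G'*lambda
end

Larger values of eta pull the solution back onto the constraint faster but make the ODE stiffer, which is exactly the trade-off discussed on the next slide.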
Copyright 2006, C. W. Gear
L3-33
The Baumgarte regularization, which leads to the system of equations on the bottom of the last
slide, has the interesting property that it gives us a system of equations which has a solution
that is a solution of the original Euler-Lagrange DAE, and furthermore, all other solutions of the
Baumgarte regularization are damped towards the solution of the original DAE. Even if we
start with the correct initial values, small numerical errors in the reduced index problem will
cause us to wander off the DAE solution. That suggests that we should choose a large value of
η so that the solution decays rapidly to the solution of the DAE (where g and g' are zero.)
Choosing a large η leads to a stiff system of equations. Since we can reduce the system
completely to an ODE (or a semi-explicit index one DAE which is equally easy to solve), why
not just use a stiff solver? If we do use a stiff solver we may find ourselves running into some
numerical difficulties. The problem can arise because a stiff solver will try to take a large step
(that, after all, is the point of a stiff solver). When it does this, the very fast components look
as if they obey the original DAE and we come back to all of the difficulties of DAEs. We can
see this by considering the very simple problem
ε y' = F(y,t)
When ε is very small, we will essentially be solving the implicit equation F(y,t) = 0 at each step,
and our method will have to deal with all of the problems with the structure of F. Because of
this it is usually recommended that only a modest value be used for η.
L3-34
Copyright 2006, C. W. Gear
There are regularization techniques that are very useful but that do not have the nice property of
the Baumgarte technique that the solution of the original problem is a solution of the new problem.
Recall that a DAE restricts the solution to a manifold in the space of all of its variables. This
manifold is actually the intersection of the visible and all hidden manifolds. In the case of the
Euler-Lagrange equations there are 2n+m variables, but there are only 2(n-m) degrees of freedom in
the solution – that is, the solution lies in a 2(n-m)-dimension manifold. What a regularization
method that reduces the system to an ODE does is to extend the solution to the whole space such
that any solutions that are not close to the manifold of the solutions of the original problem decay
rapidly to it. That is, the regularized problem has a slow manifold to which all other solutions are
attracted. We don't necessarily have to make this slow manifold a solution of the original problem,
it is enough that it be close to the solution of the original problem. We are going to make some
errors in our numerical solution, so we only need for the slow solution of the regularized problem
to introduce no larger errors than we are already making.
One approach to regularization is to realize that a DAE is often an approximation to the real world.
(For example on Slide L3-2 we created a DAE by assuming that a chemical species came to
equilibrium "infinitely fast.") Consider the pendulum problem. There is no such thing as a
pendulum rod that won't stretch. The pendulum rod stretches a small amount proportional to the
force on it (which is proportional to λ). Also there will be a small energy loss due to a less than
perfect material as it stretches and relaxes. Hence, rather than the constraint x1^2 + x2^2 - 1 = 0
we could model the problem more accurately using λ = α S + β S', where S = x1^2 + x2^2 - 1
is the strain in the pendulum rod (that is, the amount it has stretched). The first term says that the
force in the rod is proportional to the extension. The second term involving the derivative of S says
that the energy absorbing factor in the rod (friction) is working to slow down any extending or
contracting of the rod. (This serves to damp out any oscillations.) Since we are not particularly
interested in modeling this behavior accurately (it is too small to be of real consequence to the
problem) we are free to choose α and β to get the best behavior of the regularized system.
Choosing α = -η^2 and β = -2η adds a pair of eigenvalues that are approximately -η.
L3-35
Copyright 2006, C. W. Gear
In the previous regularization, we used a physical basis for choosing the modification –
essentially putting back a physical phenomenon that had been ignored in the mathematical
model (very few problems are really DAEs, they mainly occur because of approximations
to physical reality). However, we don't have to base a modification on physical phenomena
as long as our modification does not change the slow solution very much. For example,
we could change the Euler-Lagrange equations to
x' = v - α G^T [G G^T]^{-1} g(x)
v' = f - β G^T [G G^T]^{-1} g(x)
This change effectively introduces new eigenvalues of -α and -β. If these are
sufficiently negative, their components die away rapidly until the components g and g'
are almost zero.
We have illustrated just three possible regularizations – the "classical" Baumgarte one
and a couple of others. What one might use depends very much on the nature of the
problem, so it is difficult to give a general rule. Sometimes physical intuition can
provide good starting points, other times one must simply look for changes that do not
change the slow manifold too much.
L3-36
Copyright 2006, C. W. Gear
THE CONSISTENT INITIALIZATION PROBLEM
One interesting advantage of using regularization is that now any initial values are valid – the
regularized system will rapidly damp any "initial condition errors" since they just mean that
the initial point is away from the slow manifold. This means that we no longer have to face the
difficult challenge of finding consistent initial values. Even if one is able to use a code
directly on a DAE (as is possible for an index 2 semi-explicit DAE, for example), it may be
worth using a regularization technique initially to get correct initial values.
The problem with this is that one now has to choose initial values for all variables, and then let
the regularized equations drop rapidly onto the slow manifold. If one knew the dimension of
the slow manifold (which one certainly does in the Euler-Lagrange case) and had that number
of initial values, we might want to find initial values for all the variables such that the
prescribed initial values were satisfied.
Typically we do want to start with specific values of some of the variables – for example, we
may know very precisely the concentration of some of our species in the inflow to a chemical
reactor, but not know others. Unfortunately the method just suggested will "fall down" onto
the slow manifold in some place that may not be a suitable starting point for our integration.
In Lecture 4 we will discuss a way in which we can handle this, but before we can look at that, we
have to look at a more fundamental issue – singularly perturbed equations.
L3-37
Copyright 2006, C. W. Gear
SINGULARLY PERTURBED PROBLEMS.
Many stiff equations are examples of singularly perturbed problems, and DAEs are really
examples of the limiting case. The "classical" singularly perturbed problem is the system
y' = f(y,z)
ε z' = g(y,z)
We see that when ε = 0 we have a semi-explicit DAE. Singularly perturbed problems have been
studied for a long time. When ε is non-zero but small, the differential equation may have
rapidly changing solutions because z' is large unless g(y,z) is almost zero. If we have a
boundary value problem so that the boundary values constrain the solution to be finite at both
ends of an interval we may find that we get rapidly changing solutions at the boundaries (or
possibly in the middle) in what are called "boundary layers." These arise, for example, in the
modeling of slightly viscous fluid flow near a boundary where the fluid velocity changes
rapidly from zero (the boundary condition) to its value in the interior. If we have an initial value
problem, ε must be such that the fast solutions rapidly decay until g(y,z) is mearly zero, that is,
they decay down onto a slow manifold very close to the manifold defined by g(y,z) = 0.
An important feature of initial value problems for singularly perturbed problems is that if they
do decay rapidly to the slow manifold, their solution can be expressed as an asymptotic
N
expansion in ε of the form
n
y (t )   n 0  Yn (t )
z (t )   n 0  n Z n (t )
N
where Y0(t) and Z0(t)are solutions of the DAE obtained by setting ε = 0 and Yj(t) and Zj(t) are
solutions of ODEs involving the Yk and Zk for smaller k.
L3-38
Copyright 2006, C. W. Gear
When we regularize a DAE we convert it back into a stiff equation – which we have just
seen can be viewed as a singularly perturbed equation. The solution of this singularly
perturbed equation is only ε distant from the solution of the original DAE since it is the
first term in the asymptotic expansion in ε. Hence if we can use a very small ε in the
regularization we will know that we have not made too much of a perturbation to the
solution of the DAE.
On the other hand, we have seen that we cannot use too small an ε or a stiff method will
run into difficulties because it can't tell the difference between a very stiff equation and a
DAE at large step sizes.
We will discuss one possible solution to this in lecture 4.
L3-39
Copyright 2006, C. W. Gear
ONE-SIDED CONSTRAINTS
One final topic we want to look at in DAEs is that of one-sided constraints. These are
more likely to arise in mechanical problems, but could arise in other engineering areas. In
the typical DAE, such as the Euler-Lagrange formulation of mechanical systems, we
express the system ignoring any constraints caused by items being linked together in
some way, and then add the constraints in with Lagrange multipliers – which turn out to
the be forces in the linkages that are exerted to maintain the constraint. Thus, when we
simulated a pendulum with a rigid rod, the length of the rod was a constraint. Suppose,
instead, we had a pendulum suspended on a piece of string (which for now we will
assume is inextensible and unbreakable!) Now the constraint is that the distance of the
pendulum ball from the point of suspension is no more than the length of the string. As
long as the pendulum only swings below the suspension point, there will be no difference,
but if we started it moving sufficiently fast that it swung above the suspension point, but
not fast enough to keep the string taut while it passed overhead, we know what would
happen – the pendulum ball would no longer stay on a circle but go into free fall until the
string once again became taut. Thus, the model would change from a system with one
constraint to a system with no constraints when the pendulum ball no longer had the
velocity to keep the string taut. When the ball reached a position where the string was taut
again it would be subject to a sudden impulse from the string to force it back onto the
circular arc.
If we wish to simulate something like this we will have to utilize the techniques discussed
in the first lecture for handling discontinuities. If we are initially in a state where there is an
active constraint, we handle it like the DAE it is. When the force in the string changes
from positive (tension) to negative (compression – which a string can't handle) we must
detect the time of the change and change the model. The same must happen in the other
direction.
L3-40
Copyright 2006, C. W. Gear
L3-41
Copyright 2006, C. W. Gear