A Brief Overview of Methods for
Computing Derivatives
Wenbin Yu
Department of Mechanical & Aerospace Engineering
Utah State University, Logan, UT
Finite Difference vs Complex Step
• Forward finite difference
    $f'(x) = \frac{f(x+h) - f(x)}{h} + O(h)$
  - Advantages: easy to use; no need to access the source code, and no need to understand the equations or the code
  - Disadvantages
    - Step-size dilemma: h must be small enough to reduce the truncation error, yet big enough to avoid subtractive cancellation error
    - Expensive: always (n+1) times the analysis time for n perturbations
• Complex step approximation
  - Better than finite difference if implemented correctly
  - Complex variable: $z = x + yI$
  - Complex function: $f = u(z) + v(z)I = u(x,y) + v(x,y)I$
Finite Difference vs Complex Step
• Complex step approximation (cont.)
  - If f is analytic, the Cauchy-Riemann equations hold:
      $\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} = \lim_{h \to 0} \frac{v(x + (y+h)I) - v(x + yI)}{h}$
  - We deal with real functions of real variables, so $y = 0$, $v(x) = 0$, $f(x) = u(x)$, and
      $\frac{\partial f}{\partial x} = \lim_{h \to 0} \frac{v(x + hI)}{h} = \lim_{h \to 0} \frac{\mathrm{Im}[f(x + hI)]}{h}$
  - Not explicitly subject to subtractive cancellation errors, and the truncation error can be made as small as desired:
      $f(x + hI) = f(x) + h f'(x) I - \frac{h^2}{2!} f''(x) - \frac{h^3}{3!} f'''(x) I + \cdots$
      $f'(x) = \frac{\mathrm{Im}[f(x + hI)]}{h} + \frac{h^2}{3!} f'''(x) - \cdots = \frac{\mathrm{Im}[f(x + hI)]}{h} + O(h^2)$
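A minimal sketch (not from the slides) comparing a forward finite difference with the complex-step formula Im[f(x+hI)]/h; the test function f(x) = exp(x)/sqrt(x), the point x = 1.5, and the step sizes are assumptions chosen only for illustration:

    ! Sketch: forward finite difference vs complex step on f(x)=exp(x)/sqrt(x).
    ! The complex step can use an extremely small h without cancellation.
    PROGRAM complex_step_sketch
      IMPLICIT NONE
      INTEGER, PARAMETER :: dp = KIND(1.0D0)
      REAL(dp)    :: x, h, d_exact, d_fd, d_cs
      COMPLEX(dp) :: zf

      x = 1.5_dp
      d_exact = EXP(x)/SQRT(x) - 0.5_dp*EXP(x)/x**1.5_dp   ! analytical derivative

      h    = 1.0E-8_dp                                     ! forward finite difference
      d_fd = (f_real(x + h) - f_real(x))/h

      h    = 1.0E-200_dp                                   ! complex step: h can be tiny
      zf   = f_cmplx(CMPLX(x, h, dp))
      d_cs = AIMAG(zf)/h

      WRITE(*,*) 'exact        ', d_exact
      WRITE(*,*) 'finite diff  ', d_fd
      WRITE(*,*) 'complex step ', d_cs

    CONTAINS
      REAL(dp) FUNCTION f_real(x)
        REAL(dp), INTENT(IN) :: x
        f_real = EXP(x)/SQRT(x)
      END FUNCTION f_real

      COMPLEX(dp) FUNCTION f_cmplx(z)        ! same expression, complex argument
        COMPLEX(dp), INTENT(IN) :: z
        f_cmplx = EXP(z)/SQRT(z)
      END FUNCTION f_cmplx
    END PROGRAM complex_step_sketch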
Dual Number Automatic Differentiation (DNAD)
• Extend all real numbers by adding a second component:
    $x_1 \to \langle x_1, x_1' \rangle = x_1 + x_1' d$
• d is just a symbol, analogous to the imaginary unit, but all powers of d higher than one are equal to zero ($d^2 = 0$)
• Example: $f(x_1, x_2) = x_1 x_2 + \sin(x_1)$
    $f(x_1 + x_1' d,\; x_2 + x_2' d) = (x_1 + x_1' d)(x_2 + x_2' d) + \sin(x_1 + x_1' d)$
    $= x_1 x_2 + x_1 x_2' d + x_2 x_1' d + \sin(x_1) + \cos(x_1)\, x_1' d$
    $= x_1 x_2 + \sin(x_1) + [x_1 x_2' + (x_2 + \cos(x_1))\, x_1'] d$
    $= f(x_1, x_2) + \left( \frac{\partial f}{\partial x_1} x_1' + \frac{\partial f}{\partial x_2} x_2' \right) d$
  i.e., the real part carries the function value $f(x_1, x_2)$ and the dual part carries $\frac{\partial f}{\partial x_1} x_1' + \frac{\partial f}{\partial x_2} x_2'$
Dual Number Automatic Differentiation (DNAD)
• Dual-number arithmetic
    $\langle u, u' \rangle + \langle v, v' \rangle = \langle u + v,\; u' + v' \rangle$
    $\langle u, u' \rangle - \langle v, v' \rangle = \langle u - v,\; u' - v' \rangle$
    $\langle u, u' \rangle \times \langle v, v' \rangle = \langle uv,\; u'v + uv' \rangle$
    $\langle u, u' \rangle / \langle v, v' \rangle = \langle u/v,\; (u'v - uv')/v^2 \rangle$
    $\sin(\langle u, u' \rangle) = \langle \sin u,\; u' \cos u \rangle$
    $\exp(\langle u, u' \rangle) = \langle \exp u,\; u' \exp u \rangle$
    $\log(\langle u, u' \rangle) = \langle \log u,\; u'/u \rangle$
    $\langle u, u' \rangle^k = \langle u^k,\; k u' u^{k-1} \rangle$
• Complex-step arithmetic
    $(u + u'I)(v + v'I) = (uv - u'v') + (u'v + uv')I$
    $(u + u'I)/(v + v'I) = \frac{uv + u'v'}{v^2 + v'^2} + \frac{u'v - uv'}{v^2 + v'^2} I$
    $\log(u + u'I) = \frac{1}{2}\log(u^2 + u'^2) + \arg(u + u'I)\, I$
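As a concrete illustration (not part of the original slides), the dual-number rules above can be applied by hand to the example from the previous slide, f(x1,x2) = x1*x2 + sin(x1). The helper names, seed values, and evaluation point below are assumptions for illustration only:

    ! Sketch: apply the dual rules manually, storing <value, derivative> as a 2-vector.
    ! Seeding x1' = 1, x2' = 0 makes the dual part of f equal to df/dx1.
    PROGRAM dual_rules_sketch
      IMPLICIT NONE
      INTEGER, PARAMETER :: dp = KIND(1.0D0)
      REAL(dp) :: x1(2), x2(2), f(2)

      x1 = (/ 0.7_dp, 1.0_dp /)          ! value 0.7, seed dx1/dx1 = 1
      x2 = (/ 1.3_dp, 0.0_dp /)          ! value 1.3, seed dx2/dx1 = 0

      f = add_d(mul_d(x1, x2), sin_d(x1))
      WRITE(*,*) 'f      =', f(1)        ! x1*x2 + sin(x1)
      WRITE(*,*) 'df/dx1 =', f(2)        ! should equal x2 + cos(x1)
      WRITE(*,*) 'exact  =', 1.3_dp + COS(0.7_dp)

    CONTAINS
      FUNCTION add_d(u, v) RESULT(w)     ! <u,u'> + <v,v'> = <u+v, u'+v'>
        REAL(dp), INTENT(IN) :: u(2), v(2)
        REAL(dp) :: w(2)
        w = u + v
      END FUNCTION add_d

      FUNCTION mul_d(u, v) RESULT(w)     ! <u,u'> * <v,v'> = <uv, u'v + uv'>
        REAL(dp), INTENT(IN) :: u(2), v(2)
        REAL(dp) :: w(2)
        w(1) = u(1)*v(1)
        w(2) = u(2)*v(1) + u(1)*v(2)
      END FUNCTION mul_d

      FUNCTION sin_d(u) RESULT(w)        ! sin(<u,u'>) = <sin u, u' cos u>
        REAL(dp), INTENT(IN) :: u(2)
        REAL(dp) :: w(2)
        w(1) = SIN(u(1))
        w(2) = u(2)*COS(u(1))
      END FUNCTION sin_d
    END PROGRAM dual_rules_sketch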
Dual Number Automatic Differentiation (cont.)
• Comparing DNAD and complex step
  - DNAD is more efficient, as its calculations are never more numerous and are mostly fewer (fewer for *, /, and most intrinsic functions)
  - DNAD is more accurate, as it delivers the analytical derivatives up to machine precision, while complex step is accurate only for extremely small imaginary parts; cancellation and subtraction errors can occur for some functions
  - Complex step only has an implementation and compiler-optimization advantage for codes in languages supporting complex algebra (Fortran), while DNAD as a concept can be used for codes written in any strongly typed language with real numbers defined
  - Complex step is not applicable to codes already having complex operations, and it can only compute sensitivities with respect to one variable
  - Complex step changes the calculation of the original analysis and the program flow, e.g. in branches such as
        IF(ABS(x)>0) THEN
          ......
        ELSE
          ......
        ENDIF
  - Complex step is hard to debug, as many complex operations are defined but are not what you need
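One way to see the ABS issue, and how a dual-number overload can handle it, is sketched below. The rule ABS(<u,u'>) = <|u|, u'*sign(u)> is an illustrative assumption (not necessarily DNAD's actual definition); with it, the derivative of exp(|x|) at x = -3 comes out as -exp(3), the exact value quoted on the next slide:

    ! Sketch: differentiate exp(|x|) at x = -3 with a dual pair stored as a 2-vector,
    ! using ABS(<u,u'>) = <|u|, u'*sign(u)> and exp(<u,u'>) = <exp u, u'*exp u>.
    PROGRAM dual_abs_sketch
      IMPLICIT NONE
      INTEGER, PARAMETER :: dp = KIND(1.0D0)
      REAL(dp) :: x(2), f(2)

      x = (/ -3.0_dp, 1.0_dp /)          ! value -3, seed dx/dx = 1
      f = dual_exp(dual_abs(x))

      WRITE(*,*) 'd/dx exp(|x|) at x=-3:', f(2)   ! expect -exp(3) = -20.0855369...

    CONTAINS
      FUNCTION dual_abs(u) RESULT(w)     ! <|u|, u'*sign(u)>; derivative undefined at u=0
        REAL(dp), INTENT(IN) :: u(2)
        REAL(dp) :: w(2)
        w(1) = ABS(u(1))
        w(2) = SIGN(1.0_dp, u(1))*u(2)
      END FUNCTION dual_abs

      FUNCTION dual_exp(u) RESULT(w)     ! <exp u, u'*exp u>
        REAL(dp), INTENT(IN) :: u(2)
        REAL(dp) :: w(2)
        w(1) = EXP(u(1))
        w(2) = u(2)*w(1)
      END FUNCTION dual_exp
    END PROGRAM dual_abs_sketch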
Performance Comparison
• Accuracy comparison: derivative of exp(|x|) at x = -3
    Complex step: -20.0855362857837
    DNAD:         -20.0855369231877
    Exact:        -20.0855369231877
• Efficiency comparison (both complex step and DNAD implemented in F90/95)
    x=1.0; y=2.0; z=3.0
    ftot=0.0d0
    DO i=1,500000000
      f=x*y-x*sin(y)*log(z)
      ftot=(ftot-f)/exp(z)
    ENDDO
    write(*,*) ftot

  Time (seconds) used by different methods
    # of Design Variables |      1 |      3 |       9 |      15 |      16
    Finite Difference     | 1.64*2 | 1.64*4 | 1.64*10 | 1.64*16 | 1.64*17
    Complex Step          | 3.94   | 3.94*4 | 3.94*9  | 3.94*15 | 3.94*16
    DNAD                  | 2.11   | 2.67   | 14.98   | 22.16   | 25.56
Implementation Using F90/95
• A general-purpose F90/95 module for automatic differentiation of any Fortran code, including Fortran 77/90/95
• Define a new data type DUAL_NUM
    TYPE,PUBLIC:: DUAL_NUM
      REAL(DBL_AD)::x_ad_
      REAL(DBL_AD)::xp_ad_
    END TYPE DUAL_NUM
  (Change xp_ad_ to "xp_ad_(n)", with n as the # of DVs, for sensitivities wrt multiple DVs)
• Overload the functions/operations needed in the analysis codes for this new data type: relational operators, arithmetic operators/functions
Implementation Using F90/95 (cont.)
• Define EXP: $\exp(\langle u, u' \rangle) = \langle \exp u,\; u' \exp u \rangle$
    INTERFACE EXP
      MODULE PROCEDURE EXP_D
    END INTERFACE

    ELEMENTAL FUNCTION EXP_D(u) RESULT(res)
      TYPE (DUAL_NUM), INTENT(IN)::u
      REAL(DBL_AD)::tmp
      TYPE (DUAL_NUM)::res
      tmp=EXP(u%x_ad_)
      res%x_ad_ = tmp
      res%xp_ad_ =u%xp_ad_* tmp
    END FUNCTION EXP_D
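In the same spirit, here is a stripped-down, self-contained sketch of how the multiplication rule <u,u'>*<v,v'> = <uv, u'v + uv'> could be overloaded for the dual type, plus a tiny test driver. The module and procedure names are illustrative; the actual DNAD module may differ in details:

    ! Sketch: minimal dual module with an overloaded multiplication operator.
    MODULE DUAL_SKETCH
      IMPLICIT NONE
      INTEGER, PARAMETER :: DBL_AD = KIND(1.0D0)
      TYPE, PUBLIC :: DUAL_NUM
        REAL(DBL_AD) :: x_ad_     ! value
        REAL(DBL_AD) :: xp_ad_    ! derivative (dual part)
      END TYPE DUAL_NUM
      INTERFACE OPERATOR (*)
        MODULE PROCEDURE MULT_DD
      END INTERFACE
    CONTAINS
      ELEMENTAL FUNCTION MULT_DD(u, v) RESULT(res)   ! <u,u'>*<v,v'> = <uv, u'v + uv'>
        TYPE (DUAL_NUM), INTENT(IN) :: u, v
        TYPE (DUAL_NUM) :: res
        res%x_ad_  = u%x_ad_*v%x_ad_
        res%xp_ad_ = u%xp_ad_*v%x_ad_ + u%x_ad_*v%xp_ad_
      END FUNCTION MULT_DD
    END MODULE DUAL_SKETCH

    PROGRAM TEST_MULT
      USE DUAL_SKETCH
      IMPLICIT NONE
      TYPE (DUAL_NUM) :: x, y, f
      x = DUAL_NUM(2.0D0, 1.0D0)      ! seed dx/dx = 1
      y = DUAL_NUM(3.0D0, 0.0D0)
      f = x*y
      WRITE(*,*) f%x_ad_, f%xp_ad_    ! 6.0 and d(xy)/dx = y = 3.0
    END PROGRAM TEST_MULT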
How to Use DNAD
• To AD a Fortran code using DNAD:
  1. Replace all definitions of real numbers with dual numbers, e.g.
       REAL(8) :: x                      becomes  TYPE(DUAL_NUM) :: x
       REAL(8), PARAMETER:: ONE=1.0D0    becomes  TYPE(DUAL_NUM),PARAMETER::ONE=DUAL_NUM(1.0D0,0.D0)
  2. Insert "USE DNAD" right after Module/Function/Subroutine/Program statements
  3. Change IO commands correspondingly if the code does not use free-format read and write (can be automated by writing some general-purpose utility subroutines)
  4. Recompile the source along with DNAD.o
  5. The whole process can be automated, and even manually it takes only a few minutes for most real analysis codes, although step 3 is code dependent
How to Use DNAD (cont.)
• To use the sensitivity capability
  - Insert 0 after all real inputs not affected by the design variable
  - Insert 1 after the real input if it directly represents the design variable
  - Insert the corresponding sensitivities calculated by other codes if the real inputs are affected indirectly by the design variable, such as the sensitivity of nodal coordinates due to a change of geometry
  - The sensitivities are reported in the outputs as the number following the function value
  - Designers only need to manipulate inputs/outputs of the code
DNAD Example
• Original code:
    PROGRAM CircleArea
      REAL(8),PARAMETER:: PI=3.141592653589793D0
      REAL(8):: radius, area
      READ(*,*) radius
      Area=PI*radius**2
      WRITE(*,*) "AREA=", Area
    END PROGRAM CircleArea
  Input:  5
  Output: AREA=78.5398163397448
• Differentiated code:
    PROGRAM CircleArea
      USE DNAD
      TYPE (DUAL_NUM),PARAMETER:: PI=DUAL_NUM(3.141592653589793D0,0.D0)
      TYPE (DUAL_NUM):: radius,area
      READ(*,*) radius
      Area=PI*radius**2
      WRITE(*,*) "AREA=",Area
    END PROGRAM CircleArea
  Input:  5,1
  Output: AREA=78.5398163397448, 31.4159265358979
Example (VABS-AD)
• VABS: 10,000+ lines of F90/95 code
• An isotropic rectangular section meshed with two 8-noded quads
  [Figure: rectangular section meshed with Elements 1 and 2, 13 nodes, axes x2 and x3, dimensions b and h]
  $E = 2.6\ \mathrm{GPa}$, $\nu = 0.3$, $b = 0.1\ \mathrm{m}$, $h = 0.2\ \mathrm{m}$
    $EA = Ebh = 5200 \times 10^4\ \mathrm{N}$
    $EI_{22} = Ebh^3/12 = 17.333 \times 10^4\ \mathrm{N\,m^2}$
    $EI_{33} = Ehb^3/12 = 4.333 \times 10^4\ \mathrm{N\,m^2}$
    $GJ = 0.229\, Gb^3h = 4.58 \times 10^4\ \mathrm{N\,m^2}$
             | Value   | Sens. (E) | Sens. (b) | Sens. (h)
             | (x10^4) | (x10^-5)  | (x10^6)   | (x10^5)
  EA         | 5200    | 2000      | 520.0     | 2600
  EI22       | 17.333  | 6.667     | 1.733     | 26.00
  EI33       | 4.333   | 1.667     | 1.300     | 2.167
  GJ         | 4.833   | 1.859     | 1.247     | 3.433
  GJ (exact) | 4.58    | 1.762     | 1.374     | 2.29
The loss of accuracy due to the coarse mesh carries over to the sensitivities unchanged; this can be verified with the sensitivity wrt E, which equals GJ/E because GJ is linear in E.
Example (VABS-AD)
Changes to the inputs
• Sensitivity wrt E: change the material-property line
    0.26E+10 .300000000E+00
  to
    0.26E+10 1. .300000000E+00 0.
• Sensitivity wrt b: seed the nodal x2 coordinates with dx2/db
     1 -0.0500000 -0.5 -0.1000000 0.0
     2  0.0500000  0.5 -0.1000000 0.0
     3  0.0000000  0.0 -0.1000000 0.0
     4  0.0500000  0.5  0.1000000 0.0
     5  0.0500000  0.5 -0.0500000 0.0
     6  0.0500000  0.5  0.0000000 0.0
     7  0.0500000  0.5  0.0500000 0.0
     8 -0.0500000 -0.5  0.1000000 0.0
     9  0.0000000  0.0  0.1000000 0.0
    10 -0.0500000 -0.5  0.0500000 0.0
    11 -0.0500000 -0.5  0.0000000 0.0
    12 -0.0500000 -0.5 -0.0500000 0.0
    13  0.0000000  0.0  0.0000000 0.0
• Sensitivity wrt h: seed the nodal x3 coordinates with dx3/dh
     1 -0.0500000 0.0 -0.1000000 -0.5
     2  0.0500000 0.0 -0.1000000 -0.5
     3  0.0000000 0.0 -0.1000000 -0.5
     4  0.0500000 0.0  0.1000000  0.5
     5  0.0500000 0.0 -0.0500000 -0.25
     6  0.0500000 0.0  0.0000000  0.0
     7  0.0500000 0.0  0.0500000  0.25
     8 -0.0500000 0.0  0.1000000  0.5
     9  0.0000000 0.0  0.1000000  0.5
    10 -0.0500000 0.0  0.0500000  0.25
    11 -0.0500000 0.0  0.0000000  0.
    12 -0.0500000 0.0 -0.0500000 -0.25
    13  0.0000000 0.0  0.0000000  0.0
For more complex geometry and mesh, such inputs should be prepared by a mesh generator: so-called geometry sensitivity. Note that the number of design variables is not the same as the number of seeds in the inputs.
Example (GEBT-AD)
GEBT:
• 5,000 lines of Fortran 90/95 code
• 20,000 lines of Fortran 77 code
• Includes BLAS, the MA28 sparse linear solver, LAPACK, and the ARPACK sparse eigensolver
• Sensitivity with respect to L
  [Figure: beam with end force F3, frame vectors a1 and a3, points 1 and 2]

Responses and sensitivities by different analyses ($L = 1\ \mathrm{m}$, $F_3 = 5 \times 10^5\ \mathrm{N}$):

                   | U3max Value | U3max Sensitivity | theta_max Value | theta_max Sensitivity
  Exact (linear)   | 0.9615      | 2.8846            | -1.4423         | -2.8846
  GEBT (linear)    | 0.9615      | 2.8845            | -1.4423         | -2.8846
  GEBT (nonlinear) | 0.5989      | 1.1157            | -1.0540         | -1.2665
  GEBT (follower)  | 0.7114      | 1.3619            | -1.5995         | -3.8806

Exact (linear) formulas: $U_{3\max} = F_3 L^3/(3EI_{22})$, $\theta_{\max} = -F_3 L^2/(2EI_{22})$; sensitivities are derivatives wrt L.
Analytic Method
• If we know the equations, there are even more efficient methods
• A linear system: $K(x)\, q(x) = F(x)$
  - Unknowns: q; design parameter: x; objective function: $\psi(q, x)$
  - Derivative of the objective: $\frac{d\psi}{dx} = \frac{\partial \psi}{\partial x} + \frac{\partial \psi}{\partial q} \frac{dq}{dx}$
• Direct method
    $K'(x) q(x) + K(x) q'(x) = F'(x) \;\Rightarrow\; K(x) q'(x) = F'(x) - K'(x) q(x)$
    $q'(x) = K^{-1} [F'(x) - K'(x) q(x)] \;\Rightarrow\; \frac{d\psi}{dx} = \frac{\partial \psi}{\partial x} + \frac{\partial \psi}{\partial q} K^{-1} [F'(x) - K'(x) q(x)]$
• Adjoint method
    $\lambda^{T} = \frac{\partial \psi}{\partial q} K^{-1}$, i.e., solve $K^{T} \lambda = \left( \frac{\partial \psi}{\partial q} \right)^{T}$
    $\frac{d\psi}{dx} = \frac{\partial \psi}{\partial x} + \lambda^{T} [F'(x) - K'(x) q(x)]$
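A minimal sketch (not from the slides) contrasting the two routes on a hypothetical 2x2 system K(x)q = F(x) with objective psi = c.q; all data, names, and the Cramer's-rule solver are assumptions chosen only for illustration:

    ! Sketch: direct vs adjoint sensitivity for a small linear system.
    ! Here psi = c.q has no explicit x-dependence, so dpsi/dx reduces to the q-term.
    PROGRAM analytic_sens_sketch
      IMPLICIT NONE
      INTEGER, PARAMETER :: dp = KIND(1.0D0)
      REAL(dp) :: x, K(2,2), dK(2,2), F(2), dF(2), c(2)
      REAL(dp) :: q(2), dq(2), lam(2), rhs(2), dpsi_direct, dpsi_adjoint

      x  = 2.0_dp
      K  = RESHAPE((/ 4.0_dp+x, 1.0_dp, 1.0_dp, 3.0_dp+x /), (/2,2/))
      dK = RESHAPE((/ 1.0_dp,   0.0_dp, 0.0_dp, 1.0_dp   /), (/2,2/))   ! dK/dx
      F  = (/ x**2, 1.0_dp /);  dF = (/ 2.0_dp*x, 0.0_dp /)             ! F, dF/dx
      c  = (/ 1.0_dp, 2.0_dp /)                                         ! psi = c.q

      q   = solve2(K, F)                 ! analysis: K q = F
      rhs = dF - MATMUL(dK, q)           ! pseudo-load F' - K' q

      dq = solve2(K, rhs)                ! direct: one extra solve per design variable
      dpsi_direct = DOT_PRODUCT(c, dq)

      lam = solve2(TRANSPOSE(K), c)      ! adjoint: one extra solve per objective
      dpsi_adjoint = DOT_PRODUCT(lam, rhs)

      WRITE(*,*) dpsi_direct, dpsi_adjoint   ! the two routes agree

    CONTAINS
      FUNCTION solve2(A, b) RESULT(s)    ! Cramer's rule for a 2x2 system
        REAL(dp), INTENT(IN) :: A(2,2), b(2)
        REAL(dp) :: s(2), det
        det  = A(1,1)*A(2,2) - A(1,2)*A(2,1)
        s(1) = ( b(1)*A(2,2) - A(1,2)*b(2) ) / det
        s(2) = ( A(1,1)*b(2) - b(1)*A(2,1) ) / det
      END FUNCTION solve2
    END PROGRAM analytic_sens_sketch

With one design variable and one objective the two routes cost the same extra solve; the direct route scales with the number of design variables, the adjoint route with the number of objectives.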
Recommendation
• Neither equations nor source code available: finite difference (FD) method, with its step-size dilemma
• Source code available: computational/algorithmic/automatic differentiation (AD), which applies the chain rule to each operation in the program flow
  - Forward (direct): source transformation (SRT): ADIFOR, OpenAD, TAF, TAPENADE; operator overloading (OO): AUTO_DERIV, HSL_AD02, ADF, SCOOT
  - Reverse (adjoint): very difficult for a general-purpose implementation
• Complex step: better than FD, less accurate/efficient than AD
• Equations known: analytic methods
  - Continuous sensitivity: differentiate, then solve
  - Discrete sensitivity: approximate, then differentiate
  - Direct differentiation (forward) or adjoint (reverse) formulation
  - Source codes can be exploited if the algorithms are also known
Recommendation (cont.)
• Forward (direct) vs reverse (adjoint)
  - Forward mode is in principle more efficient if the number of objectives and constraints is larger than the number of design variables (e.g., geometric sensitivities, many stress constraints)
  - Forward mode is easier and more straightforward to implement, and it is easier to exploit sparsity, etc.
• AD vs analytic methods
  - AD: very little effort to differentiate a code (conditional compilation); can be done by analysts using AD tools developed by professional differentiators; not as efficient
  - Analytic methods: efficient, but one needs to know the equations; exploiting existing codes is possible but requires knowing the algorithms and making more changes to the original codes
Recommendation (cont.)
• Continuous vs discrete sensitivity
  - The continuous method obtains an approximate solution of the exact derivatives, while the discrete method obtains exact derivatives of the approximate solution
  - Continuous sensitivity is more accurate/efficient, particularly for problems with a changing domain (topology/shape design)
  - Continuous sensitivity requires a deep understanding of the problem (GDEs & BCs), and significant effort is needed to derive the sensitivity equations and BCs; discrete sensitivity only requires a nominal understanding of the equations and algorithms
  - Continuous sensitivity usually requires more changes to existing codes, while discrete sensitivity needs fewer changes, most of which can be made automatically
  - Continuous sensitivity is the same as discrete sensitivity if the same discretization, numerical integration, and linear design velocity fields are used for both methods
Sensitivity Analysis of MDO
• Multiple analysis codes: preprocessors (CAD/mesh generators), aerodynamic codes, structural codes, performance analysis codes
• Different people involved:
  - Analysts: developers of analysis codes
  - Differentiators: sensitivity enablers of analysis codes
  - Designers: end users of multiple analysis codes for MDO
  - They could all be different people, and collaboration may not be practical (e.g., sensitivity analysis of NASTRAN)
• Possible two-way communication between analysis codes (e.g., an iterative process); only sensitivities of the converged state are needed, so a linear problem can be solved directly for one code or iteratively for multiple coupled codes
Sensitivity Analysis of MDO (cont.)
• Suggestions
  1. If source codes are not accessible, use finite difference
  2. If source codes are available but we do not know much about the equations/algorithms (NASTRAN, CAD), use AD; if possible, iterative nonlinear solvers should be avoided for efficiency
  3. If we have some knowledge of the equations and algorithms, use the discrete analytical method
  4. If we have a deep knowledge of the equations/algorithms, use the continuous analytical method
  5. If the # of objectives/constraints is larger than the # of design variables, use forward mode; otherwise use reverse (adjoint) mode
  6. Source codes differentiated by AD can be used as excellent tools to verify analytical sensitivity methods
  7. The designer should only deal with inputs and outputs of the code and not have to access the source or recompile the code: similar to finite difference but with the capability to handle multiple design variables
  8. If possible, collaboration should be facilitated between designers, differentiators, and analysts