A Brief Overview of Methods for Computing Derivatives

Wenbin Yu
Department of Mechanical & Aerospace Engineering
Utah State University, Logan, UT

Finite Difference vs Complex Step

Forward finite difference

$$ f'(x) = \frac{f(x+h) - f(x)}{h} + O(h) $$

Advantages: easy to use; no need to access the source code; no need to understand the equations or the code.

Disadvantages:
• Step-size dilemma: the step must be small enough to avoid truncation error, yet big enough to avoid subtractive cancellation error
• Expensive: always n+1 times the analysis time for n perturbations

Complex step approximation
• Better than finite difference if implemented correctly
• Complex variable: $z = x + yI$
• Complex function: $f = u(z) + v(z)I = u(x, y) + v(x, y)I$

Finite Difference vs Complex Step

Complex step approximation (cont.)

If $f$ is analytic, the Cauchy-Riemann equations hold:

$$ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} = \lim_{h \to 0} \frac{v(x + (y + h)I) - v(x + yI)}{h} $$

We deal with real functions of real variables, so $y = 0$, $v(x) = 0$, and $f(x) = u(x)$:

$$ \frac{\partial f}{\partial x} = \frac{\partial u}{\partial x} = \lim_{h \to 0} \frac{v(x + hI)}{h} = \lim_{h \to 0} \frac{\mathrm{Im}[f(x + hI)]}{h} $$

The approximation is not explicitly subject to subtractive cancellation errors, and the truncation error can be made as small as desired:

$$ f(x + hI) = f(x) + h f'(x) I - \frac{h^2}{2!} f''(x) - \frac{h^3}{3!} f'''(x) I + \cdots $$

$$ f'(x) = \frac{\mathrm{Im}[f(x + hI)]}{h} + \frac{h^2}{3!} f'''(x) + \cdots = \frac{\mathrm{Im}[f(x + hI)]}{h} + O(h^2) $$
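As a rough illustration of the forward finite difference and complex step formulas above, the following Fortran sketch evaluates both for one function and compares them with the analytic derivative. The test function exp(x)*sin(x), the evaluation point, the step sizes, and all names in the program are assumptions chosen for illustration; they do not come from the slides.

```fortran
PROGRAM fd_vs_complex_step
  ! Illustrative sketch only: the test function, point, and step sizes are assumed.
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp = KIND(1.0D0)
  REAL(dp), PARAMETER :: x0 = 1.5_dp
  REAL(dp) :: h_fd, h_cs, d_fd, d_cs, d_exact

  h_fd = 1.0E-8_dp    ! compromise between truncation and cancellation error
  h_cs = 1.0E-20_dp   ! can be tiny because no subtraction is involved

  ! Forward finite difference: [f(x+h) - f(x)] / h
  d_fd = (f_real(x0 + h_fd) - f_real(x0)) / h_fd

  ! Complex step: Im[f(x + hI)] / h
  d_cs = AIMAG(f_cmplx(CMPLX(x0, h_cs, KIND=dp))) / h_cs

  ! Analytic derivative of exp(x)*sin(x)
  d_exact = EXP(x0) * (SIN(x0) + COS(x0))

  WRITE(*,*) 'forward difference: ', d_fd, '  abs. error: ', ABS(d_fd - d_exact)
  WRITE(*,*) 'complex step:       ', d_cs, '  abs. error: ', ABS(d_cs - d_exact)
  WRITE(*,*) 'analytic:           ', d_exact

CONTAINS

  ! Real-valued version of the test function
  FUNCTION f_real(x) RESULT(f)
    REAL(dp), INTENT(IN) :: x
    REAL(dp) :: f
    f = EXP(x) * SIN(x)
  END FUNCTION f_real

  ! Same function with complex arguments, as the complex step method requires
  FUNCTION f_cmplx(z) RESULT(f)
    COMPLEX(dp), INTENT(IN) :: z
    COMPLEX(dp) :: f
    f = EXP(z) * SIN(z)
  END FUNCTION f_cmplx

END PROGRAM fd_vs_complex_step
```

The two copies of the function mirror what the complex step method needs in practice: every real variable on the path from input to output has to become complex, which is one reason the method is tied to languages with built-in complex arithmetic.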
Dual Number Automatic Differentiation (DNAD)

• Extend all real numbers by adding a second component: $x_1 \rightarrow \langle x_1, x_1' \rangle = x_1 + x_1' d$
• $d$ is just a symbol, analogous to the imaginary unit, but all powers of $d$ higher than one are equal to zero
• Example: $f(x_1, x_2) = x_1 x_2 + \sin(x_1)$

$$
\begin{aligned}
f(x_1 + x_1' d,\; x_2 + x_2' d) &= (x_1 + x_1' d)(x_2 + x_2' d) + \sin(x_1 + x_1' d) \\
&= x_1 x_2 + x_1 x_2' d + x_2 x_1' d + \sin(x_1) + \cos(x_1)\, x_1' d \\
&= x_1 x_2 + \sin(x_1) + x_1 x_2' d + (x_2 + \cos(x_1))\, x_1' d \\
&= f(x_1, x_2) + \left( \frac{\partial f}{\partial x_1} x_1' + \frac{\partial f}{\partial x_2} x_2' \right) d \\
&= \left\langle f(x_1, x_2),\; \frac{\partial f}{\partial x_1} x_1' + \frac{\partial f}{\partial x_2} x_2' \right\rangle
\end{aligned}
$$

Dual Number Automatic Differentiation (DNAD)

Dual-number arithmetic

$$
\begin{aligned}
\langle u, u' \rangle + \langle v, v' \rangle &= \langle u + v,\; u' + v' \rangle \\
\langle u, u' \rangle - \langle v, v' \rangle &= \langle u - v,\; u' - v' \rangle \\
\langle u, u' \rangle * \langle v, v' \rangle &= \langle uv,\; u'v + uv' \rangle \\
\sin \langle u, u' \rangle &= \langle \sin u,\; u' \cos u \rangle \\
\langle u, u' \rangle / \langle v, v' \rangle &= \langle u/v,\; (u'v - uv')/v^2 \rangle \\
\exp \langle u, u' \rangle &= \langle \exp u,\; u' \exp u \rangle \\
\log \langle u, u' \rangle &= \langle \log u,\; u'/u \rangle \\
\langle u, u' \rangle^k &= \langle u^k,\; u' k u^{k-1} \rangle
\end{aligned}
$$

Complex step arithmetic

$$
\begin{aligned}
(u + u'I) * (v + v'I) &= (uv - u'v') + I(u'v + uv') \\
(u + u'I) / (v + v'I) &= \frac{uv + u'v'}{v^2 + v'^2} + I\, \frac{u'v - uv'}{v^2 + v'^2} \\
\log(u + u'I) &= \frac{1}{2} \log(u^2 + u'^2) + I \arg(u + u'I)
\end{aligned}
$$

Dual Number Automatic Differentiation (cont.)

Comparing DNAD and complex step
• DNAD is more efficient: it never requires more calculations than complex step and mostly requires fewer (fewer for *, /, and most intrinsic functions)
• DNAD is more accurate, as it delivers the analytical derivatives up to machine precision, while complex step is accurate only for extremely small imaginary parts; cancellation and subtraction errors can occur for some functions
• Complex step only has an implementation and compiler-optimization advantage for codes in languages supporting complex algebra (Fortran), while DNAD as a concept can be used for codes written in any strongly typed language with real numbers defined
• Complex step is not applicable to codes that already use complex operations, and it can only compute sensitivities with respect to one variable
• Complex step changes the calculation of the original analysis and the program flow, e.g. in a branch such as:

    IF(ABS(x)>0) THEN
       ......
    ELSE
       ......
    ENDIF

• Complex step is hard to debug, as many complex operations are not defined the way you need
• Example: derivative of Exp(|x|) at x=-3

Performance Comparison

Efficiency Comparison

    x=1.0; y=2.0; z=3.0
    ftot=0.0d0
    DO i=1,500000000
       f=x*y-x*sin(y)*log(z)
       ftot=(ftot-f)/exp(z)
    ENDDO
    write(*,*) ftot

Both complex step and DNAD are implemented in F90/95.

Time (seconds) used by different methods

    # of design variables   Finite difference   Complex step   DNAD
    1                       1.64*2              3.94           2.11
    3                       1.64*4              3.94*4         2.67
    9                       1.64*10             3.94*9         14.98
    15                      1.64*16             3.94*15        22.16
    16                      1.64*17             3.94*16        25.56

Accuracy Comparison

Derivative of Exp(|z|) at z=-3

    Complex step:  -20.0855362857837
    DNAD:          -20.0855369231877
    Exact:         -20.0855369231877

Implementation Using F90/95

• A general-purpose F90/95 module for automatic differentiation of any Fortran code, including Fortran 77/90/95
• Define a new data type DUAL_NUM:

    TYPE,PUBLIC:: DUAL_NUM
       REAL(DBL_AD)::x_ad_    ! function value
       REAL(DBL_AD)::xp_ad_   ! derivative (sensitivity)
    END TYPE DUAL_NUM

  Change xp_ad_ to "xp_ad_(n)", with n as the # of DVs, for sensitivities wrt multiple DVs
• Overload the functions/operations needed in the analysis codes for this new data type: relational operators, arithmetic operators/functions

Implementation Using F90/95 (cont.)

Define EXP, following $\exp \langle u, u' \rangle = \langle \exp u,\; u' \exp u \rangle$:

    INTERFACE EXP
       MODULE PROCEDURE EXP_D
    END INTERFACE

    ELEMENTAL FUNCTION EXP_D(u) RESULT(res)
       TYPE (DUAL_NUM), INTENT(IN)::u
       REAL(DBL_AD)::tmp
       TYPE (DUAL_NUM)::res

       tmp=EXP(u%x_ad_)            ! exp u, computed once and reused
       res%x_ad_ = tmp             ! value part
       res%xp_ad_ =u%xp_ad_* tmp   ! derivative part: u' exp u
    END FUNCTION EXP_D

How to Use DNAD

To AD a Fortran code using DNAD:
1. Replace all the definitions of real numbers with dual numbers:

    REAL(8) :: x
       becomes
    TYPE(DUAL_NUM) :: x

    REAL(8), PARAMETER:: ONE=1.0D0
       becomes
    TYPE(DUAL_NUM),PARAMETER::ONE=DUAL_NUM(1.0D0,0.D0)

2. Insert "Use DNAD" right after Module/Function/Subroutine/Program statements.
3. Change IO commands correspondingly if the code does not use free-format read and write (this can be automated by writing some general-purpose utility subroutines)
4. Recompile the source along with DNAD.o
5. The whole process can be automated, and even manually it takes only a few minutes for most real analysis codes, although step 3 is code dependent

How to Use DNAD (cont.)

To use the sensitivity capability:
• Insert 0 after all real inputs not affected by the design variable
• Insert 1 after the real input if it directly represents the design variable
• Insert the corresponding sensitivities calculated by other codes if the real inputs are affected indirectly by the design variable, such as the sensitivity of nodal coordinates due to a change of geometry
• The sensitivities are reported in the outputs as the number following the function value
• Designers only need to manipulate inputs/outputs of the code

DNAD Example

Original code:

    PROGRAM CircleArea
    REAL(8),PARAMETER:: PI=3.141592653589793D0
    REAL(8):: radius, area
    READ(*,*) radius
    Area=PI*radius**2
    WRITE(*,*) "AREA=", Area
    END PROGRAM CircleArea

    Input:  5
    Output: AREA=78.5398163397448

Code differentiated with DNAD:

    PROGRAM CircleArea
    USE DNAD
    TYPE (DUAL_NUM),PARAMETER:: PI=DUAL_NUM(3.141592653589793D0,0.D0)
    TYPE (DUAL_NUM):: radius,area
    READ(*,*) radius
    Area=PI*radius**2
    WRITE(*,*) "AREA=",Area
    END PROGRAM CircleArea

    Input:  5,1
    Output: AREA=78.5398163397448, 31.4159265358979

Example (VABS-AD)

• VABS: 10,000+ lines of F90/95 code
• An isotropic rectangular section meshed with two 8-noded quads

$$ E = 2.6\ \text{GPa}, \quad \nu = 0.3, \quad b = 0.1\ \text{m}, \quad h = 0.2\ \text{m} $$

$$
\begin{aligned}
EA &= Ebh = 5200 \times 10^4\ \text{N} \\
EI_{22} &= Ebh^3/12 = 17.333 \times 10^4\ \text{N.m}^2 \\
EI_{33} &= Ehb^3/12 = 4.333 \times 10^4\ \text{N.m}^2 \\
GJ &= 0.229\, G b^3 h = 4.58 \times 10^4\ \text{N.m}^2
\end{aligned}
$$

[Figure: rectangular cross-section of width b and height h in the x2-x3 plane, meshed with Element 1 and Element 2, nodes numbered 1 to 13]

                 Value (×10^4)   Sens. (E) (×10^-5)   Sens. (b) (×10^6)   Sens. (h) (×10^5)
    EA           5200            2000                 520.0               2600
    EI22         17.333          6.667                1.733               26.00
    EI33         4.333           1.667                1.300               2.167
    GJ           4.833           1.859                1.247               3.433
    GJ (exact)   4.58            1.762                1.374               2.29

The loss of accuracy due to the coarse mesh remains the same in the sensitivities. This can be verified by the sensitivity wrt E, which equals GJ/E because GJ is linear in E.

Example (VABS-AD)

Changes to the inputs

• Sensitivity wrt E: change

    0.26E+10 .300000000E+00

  to

    0.26E+10 1. .300000000E+00 0.

• Sensitivity wrt b (each line lists the node number, x2 and its sensitivity, then x3 and its sensitivity):

     1  -0.0500000  -0.5  -0.1000000  0.0
     2   0.0500000   0.5  -0.1000000  0.0
     3   0.0000000   0.0  -0.1000000  0.0
     4   0.0500000   0.5   0.1000000  0.0
     5   0.0500000   0.5  -0.0500000  0.0
     6   0.0500000   0.5   0.0000000  0.0
     7   0.0500000   0.5   0.0500000  0.0
     8  -0.0500000  -0.5   0.1000000  0.0
     9   0.0000000   0.0   0.1000000  0.0
    10  -0.0500000  -0.5   0.0500000  0.0
    11  -0.0500000  -0.5   0.0000000  0.0
    12  -0.0500000  -0.5  -0.0500000  0.0
    13   0.0000000   0.0   0.0000000  0.0

• Sensitivity wrt h (same layout):

     1  -0.0500000  0.0  -0.1000000  -0.5
     2   0.0500000  0.0  -0.1000000  -0.5
     3   0.0000000  0.0  -0.1000000  -0.5
     4   0.0500000  0.0   0.1000000   0.5
     5   0.0500000  0.0  -0.0500000  -0.25
     6   0.0500000  0.0   0.0000000   0.0
     7   0.0500000  0.0   0.0500000   0.25
     8  -0.0500000  0.0   0.1000000   0.5
     9   0.0000000  0.0   0.1000000   0.5
    10  -0.0500000  0.0   0.0500000   0.25
    11  -0.0500000  0.0   0.0000000   0.
    12  -0.0500000  0.0  -0.0500000  -0.25
    13   0.0000000  0.0   0.0000000   0.0

For more complex geometry and meshes, such inputs should be prepared by a mesh generator: the so-called geometry sensitivity. Note that the # of design variables is not the same as the # of seeds for inputs.

Example (GEBT-AD)

GEBT:
• 5000 lines of Fortran 90/95 code
• 20,000 lines of Fortran 77 code
• Includes BLAS, MA28 sparse linear solver, LAPACK, ARPACK sparse eigensolver
• Sensitivity with respect to L

[Figure: beam loaded by force F3, with axes a1 and a3]

$$ L = 1\ \text{m}, \quad F_3 = 5 \times 10^5\ \text{N} $$

$$ U_3^{\max} = \frac{F_3 L^3}{3 EI_{22}}, \qquad \theta^{\max} = -\frac{F_3 L^2}{2 EI_{22}} $$

Responses and sensitivities by different analyses

                        U3max                  θmax
                        Value    Sensitivity   Value     Sensitivity
    Exact (linear)      0.9615   2.8846        -1.4423   -2.8846
    GEBT (linear)       0.9615   2.8845        -1.4423   -2.8846
    GEBT (nonlinear)    0.5989   1.1157        -1.0540   -1.2665
    GEBT (follower)     0.7114   1.3619        -1.5995   -3.8806

Analytic Method

If we know the equations, there are even more efficient methods.

A linear system: $K(x)\, q(x) = F(x)$

Unknowns: $q$; design parameter: $x$; objective function: $\psi(q, x)$

Derivative of the objective:

$$ \frac{d\psi}{dx} = \frac{\partial \psi}{\partial x} + \frac{\partial \psi}{\partial q} \frac{dq}{dx} $$

Direct method:

$$
\begin{aligned}
K'(x)\, q(x) + K(x)\, q'(x) &= F'(x) \\
K(x)\, q'(x) &= F'(x) - K'(x)\, q(x) \\
q'(x) &= K^{-1} [F'(x) - K'(x)\, q(x)] \\
\frac{d\psi}{dx} &= \frac{\partial \psi}{\partial x} + \frac{\partial \psi}{\partial q}\, q'(x)
\end{aligned}
$$

Adjoint method:

$$
\begin{aligned}
\lambda^T &= \frac{\partial \psi}{\partial q} K^{-1}, \quad \text{i.e.} \quad K^T \lambda = \left( \frac{\partial \psi}{\partial q} \right)^T \\
\frac{d\psi}{dx} &= \frac{\partial \psi}{\partial x} + \lambda^T [F'(x) - K'(x)\, q(x)]
\end{aligned}
$$

Recommendation

• Neither equations nor source codes are available: finite difference (FD) method, step-size dilemma
• AD: source codes available
  Computational/algorithmic/automatic differentiation (AD): apply the chain rule to each operation in the program flow
  - Forward (direct): SRT (source transformation): ADIFOR, OpenAD, TAF, TAPENADE; OO (operator overloading): AUTO_DERIV, HSL_AD02, ADF, SCOOT
  - Reverse (adjoint): very difficult for a general-purpose implementation
  - Complex step: better than FD, less accurate/efficient than AD
• Analytic methods: equations are known
  - Continuous sensitivity: differentiate then solve
  - Discrete sensitivity: approximate then differentiate
  - Direct differentiation (forward) or adjoint (reverse) formulation
  - Source codes can be exploited if the algorithms are also known

Recommendation (cont.)

Forward (direct) vs reverse (adjoint)
• Forward mode is in principle more efficient if the number of objectives and constraints is larger than the number of design variables (geometric sensitivities, many stress constraints)
• Forward mode is easier and more straightforward to implement, and it is easier to exploit sparsity, etc.

AD vs analytic methods
• AD: very little effort to differentiate a code (conditional compilation); can be done by analysts using AD tools developed by professional differentiators; not efficient
• Analytic methods: efficient; the equations must be known; exploiting existing codes is possible but requires knowing the algorithms and making more changes to the original codes

Recommendation (cont.)

Continuous vs discrete sensitivity
• The continuous method obtains an approximate solution for the exact derivatives, while the discrete method obtains the exact derivatives of an approximate solution
• Continuous sensitivity is more accurate/efficient, particularly for problems with a changing domain (topology/shape design)
• Continuous sensitivity requires a deep understanding of the problem (GDEs & BCs); significant effort is needed to derive the sensitivity equations and BCs.
• Discrete sensitivity only requires a nominal understanding of the equations and algorithms
• Continuous sensitivity usually requires more changes to existing codes, while discrete sensitivity needs fewer changes and most of the changes can be done automatically
• Continuous sensitivity is the same as discrete sensitivity if the same discretization, numerical integration, and linear design velocity fields are used for both methods

Sensitivity Analysis of MDO

• Multiple analysis codes: preprocessors (CAD/mesh generator), aerodynamic codes, structure codes, performance analysis codes
• Different people involved:
  - Analysts: developers of analysis codes
  - Differentiators: sensitivity enablers of analysis codes
  - Designers: end users of multiple analysis codes for MDO
  They could all be different people, and collaboration may not be practical (e.g. sensitivity analysis of NASTRAN)
• Possible two-way communications between analysis codes (e.g. an iterative process); only sensitivities of the converged state are needed, and a linear problem could be solved directly for one code or iteratively for multiple coupled codes

Sensitivity Analysis of MDO (cont.)

Suggestions
1. If source codes are not accessible, use finite difference
2. If source codes are available, but we don't know much about the equations/algorithms (NASTRAN, CAD), use AD; if possible, iterative nonlinear solvers should be avoided for efficiency
3. If we have some knowledge of the equations and algorithms, use the discrete analytical method
4. If we have a deep knowledge of the equations/algorithms, use the continuous analytical method
5. If the # of objectives/constraints is larger than the # of design variables, use forward mode; otherwise use reverse (adjoint) mode
6. Source codes differentiated by AD can be used as excellent tools to verify analytical sensitivity methods
7. The designer should only deal with the inputs and outputs of the code and should not have to access the source and recompile the codes: similar to finite difference but with the capability to handle multiple design variables
8. If possible, collaborations should be facilitated between designers, differentiators, and analysts
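Suggestion 6 can be illustrated with a small sketch that checks an analytic derivative against a dual-number result, using the function f(x1, x2) = x1*x2 + sin(x1) from the DNAD slides. The module below is a hypothetical, stripped-down stand-in and not the DNAD module itself: the names mini_dual, dual, val, and der are assumptions, only +, *, and SIN are overloaded, and the input values 2 and 3 are arbitrary.

```fortran
MODULE mini_dual
  ! Hypothetical minimal dual-number type for illustration (not the DNAD module)
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp = KIND(1.0D0)

  TYPE :: dual
     REAL(dp) :: val   ! function value
     REAL(dp) :: der   ! derivative wrt the seeded variable
  END TYPE dual

  INTERFACE OPERATOR(+)
     MODULE PROCEDURE add_dd
  END INTERFACE
  INTERFACE OPERATOR(*)
     MODULE PROCEDURE mul_dd
  END INTERFACE
  INTERFACE SIN
     MODULE PROCEDURE sin_d
  END INTERFACE

CONTAINS

  ! <u,u'> + <v,v'> = <u+v, u'+v'>
  ELEMENTAL FUNCTION add_dd(u, v) RESULT(res)
    TYPE(dual), INTENT(IN) :: u, v
    TYPE(dual) :: res
    res%val = u%val + v%val
    res%der = u%der + v%der
  END FUNCTION add_dd

  ! <u,u'> * <v,v'> = <uv, u'v + uv'>
  ELEMENTAL FUNCTION mul_dd(u, v) RESULT(res)
    TYPE(dual), INTENT(IN) :: u, v
    TYPE(dual) :: res
    res%val = u%val * v%val
    res%der = u%der * v%val + u%val * v%der
  END FUNCTION mul_dd

  ! sin<u,u'> = <sin u, u' cos u>
  ELEMENTAL FUNCTION sin_d(u) RESULT(res)
    TYPE(dual), INTENT(IN) :: u
    TYPE(dual) :: res
    res%val = SIN(u%val)
    res%der = u%der * COS(u%val)
  END FUNCTION sin_d

END MODULE mini_dual

PROGRAM check_analytic_with_dual
  USE mini_dual
  IMPLICIT NONE
  TYPE(dual) :: x1, x2, f
  REAL(dp) :: dfdx1_analytic

  x1 = dual(2.0_dp, 1.0_dp)   ! seed 1: x1 is the design variable
  x2 = dual(3.0_dp, 0.0_dp)   ! seed 0: x2 is not affected by it

  f = x1 * x2 + SIN(x1)

  ! Hand-derived (analytic) sensitivity: df/dx1 = x2 + cos(x1)
  dfdx1_analytic = x2%val + COS(x1%val)

  WRITE(*,*) 'f                  = ', f%val
  WRITE(*,*) 'df/dx1 (dual)      = ', f%der
  WRITE(*,*) 'df/dx1 (analytic)  = ', dfdx1_analytic
  WRITE(*,*) 'difference         = ', ABS(f%der - dfdx1_analytic)
END PROGRAM check_analytic_with_dual
```

Seeding x1 with 1 and x2 with 0 follows the input convention described in "How to Use DNAD (cont.)"; swapping the two seeds would give the sensitivity with respect to x2 instead.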