IADIS International Conference Applied Computing 2007

AUTOMATIC DIFFERENTIATION OF ASSEMBLER CODE

D. Gendler and U. Naumann
RWTH Aachen University, Department of Computer Science
Seffenter Weg 23, D-52056 Aachen, Germany

B. Christianson
University of Hertfordshire, Department of Computer Science
College Lane, Hatfield, AL10 9AB, UK

ABSTRACT

We present adac – the first working prototype of a tangent-linear assembler code generator. This work complements the development of a large number of tools for automatic differentiation of numerical codes written in various high-level programming languages. From a user's perspective the novel aspect lies in the independence of adac from the programming language used to implement the numerical model, as long as the original program can be transformed into assembler by a suitable compiler front-end. Our current focus is on GCC. Links to a number of working examples are given.

KEYWORDS

Automatic Differentiation, Assembler Code

1. INTRODUCTION

The augmentation of numerical programs with statements that compute directional derivatives has been fairly well understood for some time. The forward mode of automatic differentiation [Wengert 1964] provides the necessary mathematical basis for this semantic source transformation. Compilers have been developed that perform this transformation for programs written in various programming languages including C, FORTRAN, and Matlab; see www.autodiff.org for further information. A large number of successful applications of these tools to problems in science and engineering have been reported in the proceedings of the four international conferences on automatic differentiation held in 1991 [Corliss & Griewank 1991], 1996 [Berz, Bischof, Corliss & Griewank 1996], 2000 [Corliss, Faure, Griewank, Hascoët & Naumann 2002], and 2004 [Bücker, Corliss, Hovland, Naumann & Norris 2005], respectively.

The development of a source transformation tool for automatic differentiation (AD) includes the implementation of a full compiler front-end for the targeted programming language. Although the theory behind compiler front-ends has been well understood for many years, their actual implementation is still a major software development effort. AD tool developers often simplify this task by assuming that the semantic correctness of the input programs has already been verified by one of many well supported compilers. Thus, the code base can be kept smaller by ignoring issues in semantic analysis and the corresponding error handling.

An alternative approach to source transformation AD is the extension of existing compilers with AD capabilities. This route has been taken by a collaborative project between us and British colleagues at the University of Hertfordshire, Hatfield and at The Numerical Algorithms Group Ltd., Oxford. The advantage of this approach is that a commercially supported industrial-strength compiler is taken as the starting point for AD development. Differentiation becomes an intrinsic operation of the compiler.

The AD developers need not take care of the robustness of the front-end. Thus they can focus on their actual field of interest – the development of new AD algorithms. Refer to [Naumann & Riehme 2005b, Naumann & Riehme 2005a] for further information on existing research prototypes of the differentiation-enabled NAGWare FORTRAN compiler.

A major disadvantage of all established AD tools is their focus on a single programming language. The algorithmic cores of these tools are similar. Differences lie in the static data-flow analyses offered [Hascoët, Naumann & Pascual 2005, Hascoët & Araya-Polo 2005] as well as in local preaccumulation techniques [Naumann & Riehme 2005b] and other algorithmic refinements. Would it not be nice to share the algorithmic kernels among AD tools that target different programming languages? A large amount of duplication in development work could be avoided. Moreover, the rather small community could bundle its efforts in order to arrive at better algorithms, resulting in more efficient and robust derivative codes faster. The OpenAD project (http://www-unix.mcs.anl.gov/~utke/OpenAD/) aims to provide a platform for the development of AD algorithms that is language-independent and can be coupled with front-ends and unparsers for various programming languages. First progress has been made in the development of FORTRAN and C tools that share the same AD algorithms. The FORTRAN tool has only recently been applied successfully to a computationally very complex configuration of the MIT general circulation model (http://mitgcm.org). A corresponding manuscript is under review.

A different approach to language-independent AD tool development is taken in this paper. Instead of defining a new abstract intermediate representation for numerical simulation programs written in various programming languages (see OpenAD's intermediate representation XAIF under http://www-unix.mcs.anl.gov/xaif/), we use an existing format. Most compilers generate assembly code at some point during the compilation process, regardless of the high-level programming language in which the program is written. The dependence of the assembly code on the processor model is not really a problem. As the assembly code is merely used as an intermediate representation, we intend to generate the transformed code in a high-level programming language such as C and use a native compiler to build executable code for the target architecture. The generation of high-level derivative code from a given assembler program is the subject of ongoing research. In this paper we describe a first prototype that generates derivative code in assembler. We use Intel's 80386/80387 processor model. The assembly code is generated by gcc or g77 from the GNU Compiler Collection (http://gcc.gnu.org).

As AD of assembler code is original work that (to the best of our knowledge) nobody has looked at so far, we first had to understand the relevant implications. Producing an 80386/80387-to-80386/80387 assembler AD tool helped us evaluate the suitability of assembly code as a (high-level) programming-language-independent intermediate representation for AD. So far the conclusion is the following: while assembly code can be differentiated in principle, the loss of high-level structure may result in decreased efficiency of the derivative code. Optimization of the derivative code is the subject of future work. We believe that pursuing this route further is of interest. The independence of the high-level programming language weighs heavier than (as far as we can tell today) small losses in efficiency.

Figure 1. Approaches to AD by source transformation

We are convinced that the proposed approach represents a reasonable trade-off between flexibility and robustness on one side and efficiency of the transformed code on the other. The different approaches to AD by source transformation are summarized in Figure 1. Starting with a numerical program written in some high-level programming language (P(H)), we can use either a differentiation-enabled compiler (AD Compiler) or one of the established language-specific AD tools (AD Software). The approach taken by the prototype tool that is the subject of this paper (adac stands for Automatic Differentiation of Assembler Code) is enclosed by the dashed line. More specifically, the first version of adac takes the upper route by generating derivative code in assembler (P'(A)) for a given assembler program (P(A)). The route via P'(C), where the tangent-linear code is generated in the original high-level language, offers plenty of interesting problems to be considered in the future. One major advantage of this approach would be the ability to use optimizing compilers on the automatically generated derivative code.

The structure of this paper is as follows: In Section 2.1 we review the fundamentals of automatic differentiation, with a focus on the forward mode. A description of the adac compiler without much technical detail is given in Section 2.2. Conclusions are drawn in Section 3.

2. AUTOMATIC DIFFERENTIATION

2.1 Forward Mode in Automatic Differentiation

This paper presents a method for generating tangent-linear versions of numerical simulation programs written in assembly code (a subset of Intel's 80386/80387 instruction set). The programs are assumed to implement vector functions

$$F : \mathbb{R}^n \to \mathbb{R}^m, \quad y = F(x), \quad x = (x_k)_{k=1,\dots,n}, \quad y = (y_l)_{l=1,\dots,m},$$

where $\mathbb{R}$ denotes the real numbers. Tangent-linear programs $\dot{F} = \dot{F}(x, \dot{x})$ compute directional derivatives $\dot{y}$, that is, products of the Jacobian matrix

$$F' = \left(f'_{l,k}\right)_{\substack{l=1,\dots,m \\ k=1,\dots,n}} \equiv \left(\frac{dy_l}{dx_k}\right)_{\substack{l=1,\dots,m \\ k=1,\dots,n}} \in \mathbb{R}^{m \times n}$$

with a direction $\dot{x}$ in the input space $\mathbb{R}^n$. Formally,

$$\dot{y} = \dot{F}(x, \dot{x}) \equiv F' \cdot \dot{x}.$$
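
To make the forward mode concrete, the following hand-written sketch shows a single-assignment function together with its tangent-linear version. The routine names f and f_tl and the interleaving of original and derivative statements are illustrative assumptions only; they do not reflect the format of adac-generated code.

c     Original model: y = sin(x1) * x2
      subroutine f(x1, x2, y)
      real x1, x2, y
      y = sin(x1)*x2
      end

c     Hand-written tangent-linear version: for input tangents dx1, dx2
c     it computes dy = (dy/dx1)*dx1 + (dy/dx2)*dx2 by the chain rule
c     alongside the original function value y.
      subroutine f_tl(x1, dx1, x2, dx2, y, dy)
      real x1, dx1, x2, dx2, y, dy
      dy = cos(x1)*x2*dx1 + sin(x1)*dx2
      y = sin(x1)*x2
      end

Running f_tl with (dx1, dx2) = (1, 0) and (0, 1) yields the two entries of the 1 x 2 Jacobian of this function at (x1, x2).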

Tangent-linear codes play an important role, for example, in matrix-free Newton-type methods for the solution of systems of non-linear equations $F(x) = 0$, $F : \mathbb{R}^n \to \mathbb{R}^n$. Given a good start estimate $x^0$, the system can be solved by the classical Newton method with quadratic convergence as follows:

$$\delta x^i = -\left(F'(x^i)\right)^{-1} \cdot F(x^i), \qquad x^{i+1} = x^i + \delta x^i$$

for increasing integer values $i$. At each step the algorithm requires the Jacobian $F'$ of $F$ at the current estimate $x^i$. Finite difference quotients can be used to approximate the entries of the Jacobian at the cost of $n+1$ and $2n$ function evaluations when using forward (or backward) and centered differences, respectively.

However, it is well known that step size control is a problem [Heath 1998]. To avoid these problems, the tangent-linear program can be run with $\dot{x}$ ranging over the Cartesian basis vectors in $\mathbb{R}^n$ to obtain $F'$ at roughly the same cost as that of (centered) finite differences but with machine accuracy. The Newton step can be obtained as the solution of the linear system

$$F'(x^i) \cdot \delta x^i = -F(x^i)$$

at each Newton iteration $i$. Direct methods may be prohibitive due to the potentially large size of the Jacobian $F'(x^i)$. Iterative methods are likely to be more suitable. Krylov methods such as GMRES involve the computation of the product of the Jacobian with a vector. The accumulation of the Jacobian can be avoided by using a tangent-linear code for this product.
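
The following minimal Fortran sketch illustrates the column-wise accumulation of the Jacobian described above. It assumes a hypothetical tangent-linear routine f_tl with the interface f_tl(n, x, dx, m, y, dy); the wrapper name and the workspace arguments are assumptions for illustration only, not adac output.

c     Accumulates jac = F'(x) column by column by seeding the tangent
c     dx with the Cartesian basis vectors and calling a tangent-linear
c     routine n times.  f_tl is a hypothetical tangent-linear routine
c     with the interface f_tl(n, x, dx, m, y, dy); dx and dy are
c     caller-supplied workspace arrays.
      subroutine jacobian(n, m, x, dx, y, dy, jac)
      integer n, m, i, j
      real x(n), dx(n), y(m), dy(m), jac(m, n)
      do i = 1, n
         do j = 1, n
            dx(j) = 0
         end do
         dx(i) = 1
         call f_tl(n, x, dx, m, y, dy)
         do j = 1, m
            jac(j, i) = dy(j)
         end do
      end do
      end

For a Krylov solver such as GMRES the Jacobian need not be accumulated at all: a single call with dx set to the Krylov vector v directly yields the required product F'(x)*v.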

2.2 The ADAC Compiler

For a given assembler routine that implements a vector function $F : \mathbb{R}^{n_a + n_p} \to \mathbb{R}^{m_a + m_p}$ as

$$(y_a \; y_p)^T = F(x_a, x_p),$$

where $x_a \in \mathbb{R}^{n_a}$ ($x_p \in \mathbb{R}^{n_p}$) are the active (passive) inputs and $y_a \in \mathbb{R}^{m_a}$ ($y_p \in \mathbb{R}^{m_p}$) are the active (passive) outputs, the adac compiler generates a new assembler routine that implements the tangent-linear function $\dot{F} : \mathbb{R}^{2 \cdot n_a + n_p} \to \mathbb{R}^{2 \cdot m_a + m_p}$ as

$$(y_a \; \dot{y}_a \; y_p)^T = \dot{F}(x_a, \dot{x}_a, x_p).$$

For a given direction $\dot{x}_a$ the tangent-linear routine computes the function values $(y_a, y_p)$ and the directional derivative $\dot{y}_a = F' \cdot \dot{x}_a$ at the current point $(x_a, x_p)$. The matrix $F' = F'(x_a, x_p) \in \mathbb{R}^{m_a \times n_a}$ denotes the Jacobian that contains all partial derivatives of the active outputs with respect to the active inputs.
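
As a concrete hand-written illustration of the active/passive convention, consider a routine with a passive input n, active inputs x(1:n), and a single active output y. The names below, the gen_ prefix, and the placement of each derivative argument directly after its original are assumptions modelled on the adac-generated routine gen_schqfj_f used in the driver shown later; they are not a specification of adac's output.

c     Original routine: n passive input, x(1:n) active inputs,
c     y active output; computes y = sum of x(i)**2.
      subroutine sumsq(n, x, y)
      integer n, i
      real x(n), y
      y = 0
      do i = 1, n
         y = y + x(i)*x(i)
      end do
      end

c     Hand-written tangent-linear version (illustrative only):
c     dy accumulates the directional derivative F'(x)*dx alongside
c     the original computation of y.
      subroutine gen_sumsq(n, x, dx, y, dy)
      integer n, i
      real x(n), dx(n), y, dy
      y = 0
      dy = 0
      do i = 1, n
         dy = dy + 2*x(i)*dx(i)
         y = y + x(i)*x(i)
      end do
      end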

adac is open-source software. It is available for download from our Internet site http://www.stce.rwth-aachen.de/ADAC/. We provide test sets for the following five problems from the MINPACK-2 test problem collection [Averick, Carter & Moré 1991]: Chebyshev quadrature, coating thickness standardization, enzyme reaction, flow in a driven cavity, and solid fuel ignition. Two scripts are provided to build adac and to run a test with the corresponding input problem. Refer to the website for more specific instructions. The current version of adac has been created and tested successfully under Linux in the following configuration: Fedora Core release 2 (Tettnang), gcc (GCC) 3.3.3 20040412 (Red Hat Linux 3.3.3-7). With adac currently being an academic proof-of-concept prototype, we have not put any emphasis on cross-platform compatibility yet. A considerable amount of further work will be necessary to achieve the level of robustness that is desirable for AD software tools.

As an example of the usage of adac we consider its application to the Chebyshev quadrature problem. A configuration file needs to be provided by the user to specify the names of the input and output files, the number of subroutine arguments, the inputs and outputs of the subroutine to be differentiated, and the active and passive program variables.

fileNameSource = dchqfj_f
fileNameDestin = dchqfj_f
varNumber = 4
varInp = 1 2 3
varInd = 3
varOut = 4
varDep = 4

All references to dummy arguments are made by position in the argument list. If the file dchqfj_f.f contains a subroutine with the following interface

      subroutine dchqfj_f(m, n, x, fvec)
      integer m, n
      real x(n), fvec(m)
      end subroutine

then the GNU Fortran compiler is used with the "compile-only" option -S to obtain the corresponding assembler file dchqfj_f.s, which subsequently serves as input to adac. The configuration indicates that we are interested in the Jacobian of the only output (fvec in the FORTRAN source) with respect to the last of the three inputs (x in the FORTRAN source).

All test problems in the MINPACK-2 collection contain hand-written code for the evaluation of the first derivatives of the standardized output vector fvec with respect to the standardized input vector x. While we apply adac only to the function evaluation (isolated in the file dchqfj_f.s), we use the provided Jacobian code to verify our numerical results. Successful tests have been performed for a number of examples including the previously listed MINPACK-2 problems.

The user needs to provide a driver that calls the tangent-linear routine gen_schqfj_f after initializing the directional derivatives dx(j) of all inputs according to their needs. In the following example driver we use the original MINPACK routine schqfj in 'XS' mode to compute start values for the input vector x, followed by the computation of the Jacobian jac of fvec with respect to x ('FJ' mode). These values are used for comparison with the first derivatives computed by the adac-generated tangent-linear assembler code. The corresponding routine gen_schqfj_f is called n times with dx(j) ranging over the Cartesian basis vectors in $\mathbb{R}^n$.

Finally, the results are compared.

      program schqfj_drv_f
      integer i, j, m, n
      parameter (m = 20, n = 19)
      real x(n), dx(n), fvec(m), dfvec(m), jac(m, n)
      call schqfj(m, n, x, fvec, jac, m, 'XS')
      call schqfj(m, n, x, fvec, jac, m, 'FJ')
      do i = 1, n
         do j = 1, n
            if (j == i) then
               dx(j) = 1
            else
               dx(j) = 0
            end if
         end do
         call gen_schqfj_f(m, n, x, dx, fvec, dfvec)
         do j = 1, m
            print*, jac(j, i)
            print*, dfvec(j)
         end do
      end do
      end program

You may wish to verify the equality of the results by downloading the sources from our website, then building and running the corresponding executable. For this purpose we provide two makefiles: one for building adac (makefile_adac) and one for executing the following pipeline (makefile):

1. Compilation and assembly of schqfj.f (→ schqfj.o)
2. Compilation and assembly of schqfj_drv_f.f (→ schqfj_drv_f.o)
3. Compilation of schqfj_f.f (→ schqfj_f.s)
4. Generation of tangent-linear code for schqfj_f.s (→ gen_schqfj_f.s)
5. Assembly of gen_schqfj_f.s (→ gen_schqfj_f.o)
6. Linkage of schqfj.o, schqfj_drv_f.o, and gen_schqfj_f.o (→ schqfj_drv_f)

Two versions of the solid fuel ignition problem from the MINPACK-2 collection, in FORTRAN and C++, are considered to demonstrate the independence of adac from the high-level programming language used to implement the input problem. The corresponding front-ends from the GCC (g77 and g++) are used to transform the source code into assembler. You are welcome to try your own examples. However, with adac still under development, we cannot promise unlimited robustness (yet).

3. CONCLUSION AND OUTLOOK

The dependence on high-level programming languages is a major drawback of existing tools for automatic differentiation. A first functional proof-of-concept prototype of a tangent-linear assembler code generator has been presented. It has been shown to work with numerical simulation codes that were originally implemented in either FORTRAN or C++. Ongoing work focuses both on the robustness of the transformation and on the efficiency of the generated code. We expect to make substantial progress during the following two years of the current funding period.

REFERENCES

Averick, B., Carter, R. & Moré, J., 1991. The MINPACK-2 test problem collection (preliminary version). Technical Memorandum ANL/MCS-TM-150, Mathematics and Computer Science Division, Argonne National Laboratory.

Berz, M., Bischof, C., Corliss, G. & Griewank, A., eds., 1996. Computational Differentiation: Techniques, Applications, and Tools. Proceedings Series, SIAM.

Bücker, M., Corliss, G., Hovland, P., Naumann, U. & Norris, B., eds., 2005. Automatic Differentiation: Applications, Theory, and Tools. Vol. 50 of Lecture Notes in Computational Science and Engineering, Springer.

Corliss, G., Faure, C., Griewank, A., Hascoët, L. & Naumann, U., eds., 2002. Automatic Differentiation of Algorithms – From Simulation to Optimization. Springer.

Corliss, G. & Griewank, A., eds., 1991. Automatic Differentiation: Theory, Implementation, and Application. Proceedings Series, SIAM.

Hascoët, L. & Araya-Polo, M., 2005. The adjoint data-flow analyses: Formalization, properties, and applications. In [Bücker et al. 2005].

Hascoët, L., Naumann, U. & Pascual, V., 2005. "To be recorded" analysis in reverse-mode automatic differentiation. Future Generation Computer Systems, 21(8), 1401–1417.

Heath, M., 1998. Scientific Computing: An Introductory Survey. McGraw-Hill.

Naumann, U. & Riehme, J., 2005a. Computing adjoints with the NAGWare FORTRAN 95 compiler. In [Bücker et al. 2005].

Naumann, U. & Riehme, J., 2005b. A differentiation-enabled FORTRAN 95 compiler. ACM Transactions on Mathematical Software, 31(4), 458–474.

Wengert, R., 1964. A simple automatic derivative evaluation program. Communications of the ACM, 7, 463–464.
