Higher Derivatives in Matlab Using MAD - Autodiff.org

4 downloads 1224 Views 488KB Size Report
email: [email protected] www.amorg.co.uk/AD/staff_SAF.html. 5th European .... FUNCTION: FMAD. % SYNOPSIS: Class constructor for forward Matlab AD objects ..... MAD can differentiate object-oriented code - including itself!
Higher Derivatives in Matlab Using MAD Shaun Forth Engineering Systems Department Cranfield University (DCMT Shrivenham) Shrivenham, Swindon SN6 8LA, U.K. email: [email protected] www.amorg.co.uk/AD/staff_SAF.html

5th European Workshop on Automatic Differentiation University of Hertfordshire, UK, May 21-22 2007

1/ 29

Higher Derivatives in Matlab Using MAD

Plan 1

Introduction

2

MAD’s Forward Mode fmad class derivvec class

3

Higher Derivatives Second Derivatives How The Heck Does This Work? Required Changes to MAD

4

Reverse Mode AD in Matlab

5

Results Minpack Elastic-Plastic Torsion (EPT) Problem Object-Oriented Test Case

6

Conclusions & Future Work

2/ 29

Higher Derivatives in Matlab Using MAD

Introduction

MAD’s overloaded forward mode is efficient for first derivatives [For06]. Requests for higher derivatives of order 2 to 4 for uncertainty analysis and robust optimisation. Could implement a Taylor series class `a la ADOL-C [GJU96] manpower expensive. OR use MAD to recursively differentiate itself (c.f. EuroAD2 - Barak Perlmutter and Jeffrey Siskind, [Gri00, p109]).

3/ 29

Higher Derivatives in Matlab Using MAD

MAD’s Forward Mode

Based on two classes: fmad class - forward mode AD by operator overloading derivvec class - for storage and combination of multiple directional derivatives.

4/ 29

Higher Derivatives in Matlab Using MAD

fmad Class Constructor

e.g. x=fmad([1.1 2 3],[4 5 6]);

5/ 29

Higher Derivatives in Matlab Using MAD

fmad Class Constructor

e.g. x=fmad([1.1 2 3],[4 5 6]); Defines fmad object with I I

5/ 29

value component - row vector [1.1 2 3] deriv component - single directional derivative [4 5 6]

Higher Derivatives in Matlab Using MAD

fmad Class Constructor

e.g. x=fmad([1.1 2 3],[4 5 6]); Defines fmad object with I I

value component - row vector [1.1 2 3] deriv component - single directional derivative [4 5 6]

Perform overloaded operations, e.g., element-wise multiplication via times z=x.*x value = 1.2100 4.0000 9.0000 derivatives = 8.8000 20.0000 36.0000 AD enabled by the times.m function of the fmad class.

5/ 29

Higher Derivatives in Matlab Using MAD

fmad times function for z=x.*y function z=times(x,y) if isa(x,’fmad’) z=x; % deep copy to avoid constructor, no refs. to x if isa(y,’fmad’) z.deriv=y.value.*z.deriv+z.value.*y.deriv; z.value=z.value.*y.value; else z.value=z.value.*y; z.deriv=y.*z.deriv; end else z=y; % deep copy to avoid constructor, no refs. to y z.value=x.*z.value; z.deriv=x.*z.deriv; end

6/ 29

Higher Derivatives in Matlab Using MAD

Working with Multiple Directional Derivatives What if we want the Jacobian?

7/ 29

Higher Derivatives in Matlab Using MAD

Working with Multiple Directional Derivatives What if we want the Jacobian? Seed derivatives with identity I3 x=fmad([1.1 2 3],eye(3));

7/ 29

Higher Derivatives in Matlab Using MAD

Working with Multiple Directional Derivatives What if we want the Jacobian? Seed derivatives with identity I3 x=fmad([1.1 2 3],eye(3)); Overloaded operation with same times function gives value = 1.2100 4.0000 Derivatives Size = 1 3 No. of derivs = 3 derivs(:,:,1) = 2.2000 0 derivs(:,:,2) = 0 4 derivs(:,:,3) = 0 0

7/ 29

9.0000

0 0 6

Higher Derivatives in Matlab Using MAD

The fmad and derivvec classes The fmad constructor function function xad=fmad(x,dx) % FUNCTION: FMAD % SYNOPSIS: Class constructor for forward Matlab AD objects xad.value=x; sx=size(xad.value); sd=size(dx); if prod(sx)==prod(sd) % #values=#derivs => single directional derivative xad.deriv=reshape(dx,sx); else % #values~=#derivs => multiple directional derivatives % Pass derivatives to derivvec constructor xad.deriv=derivvec(dx,size(xad.value)); end 8/ 29

Higher Derivatives in Matlab Using MAD

The fmad and derivvec classes The fmad constructor function function xad=fmad(x,dx) % FUNCTION: FMAD % SYNOPSIS: Class constructor for forward Matlab AD objects xad.value=x; sx=size(xad.value); sd=size(dx); if prod(sx)==prod(sd) % #values=#derivs => single directional derivative xad.deriv=reshape(dx,sx); else % #values~=#derivs => multiple directional derivatives % Pass derivatives to derivvec constructor xad.deriv=derivvec(dx,size(xad.value)); end 8/ 29

Higher Derivatives in Matlab Using MAD

The derivvec class Store derivatives as a matrix with each directional derivative unrolled into a column.

9/ 29

Higher Derivatives in Matlab Using MAD

The derivvec class Store derivatives as a matrix with each directional derivative unrolled into a column. e.g. derivvec(eye(3),[1 3]) derivatives conceptualised as as,   direc 1 direc 2 direc 3 [1, 0, 0] [0, 1, 0] [0, 0, 1]

9/ 29

Higher Derivatives in Matlab Using MAD

The derivvec class Store derivatives as a matrix with each directional derivative unrolled into a column. e.g. derivvec(eye(3),[1 3]) derivatives conceptualised as as,   direc 1 direc 2 direc 3 [1, 0, 0] [0, 1, 0] [0, 0, 1] But stored as, 

   direc  1 direc  2 direc  3 1 0 0   1 0 0       0   1   0  = 0 1 0 0 0 1 0 0 1

9/ 29

Higher Derivatives in Matlab Using MAD

The times operation of the derivvec class e.g. Need to calculate,  direc 1 direc 2 direc 3           1 0 0  1.1 2 3 . ∗    0   1   0   0 0 1 

with multiplication of each of the 3 directional derivatives. Convert value to column matrix and replicate 3 times       1.1 1.1 1.1 1 0 0 1.1 0 0  2 2 2 . ∗  0 1 0  =  0 2 0  3 3 3 0 0 1 0 0 3 columns give required directional derivatives.

10/ 29

Higher Derivatives in Matlab Using MAD

Accessor Functions Getting the value getvalue(z) ans = 1.2100

4.0000

9.0000

Getting external representation of derivatives getderivs(z) ans(:,:,1) = 2.2000 ans(:,:,2) = 0 ans(:,:,3) = 0

0 4 0

0 0 6

Getting unrolled internal representation getinternalderivs(z) ans = 2.2000 0 0 4 0 0

11/ 29

0 0 6

Higher Derivatives in Matlab Using MAD

Higher Derivatives Basic idea is to be able to use MAD recursively. For first derivatives: x=fmad([1.1 2 3],eye(3)); z=x.*x zvalue=getvalue(z) zvalue = 1.2100 4.0000 9.0000 zderivs=getinternalderivs(z) zderivs = 2.2000 0 0 0 4.0000 0 0 0 6.0000 Can we just differentiate the above process a second time with MAD?

12/ 29

Higher Derivatives in Matlab Using MAD

Second Derivatives xx=fmad([1.1 2 3],eye(3)); % xx’s value and derivs. D1=I_3 x=fmad(xx,eye(3)); % x’s value =xx, x’s derivs. D2=I_3 z=x.*x; zvalue=getvalue(getvalue(z)) % z’s value’s value zvalue = 1.2100 4.0000 9.0000 zderivs=getinternalderivs(getvalue(z)) % z’s value’s derivs % in D1 direcs. zderivs = 2.2000 0 0 0 4.0000 0 0 0 6.0000 zderivs=getvalue(getinternalderivs(z)) % value of z’s derivs % in D2 direc. zderivs = 2.2000 0 0 0 4.0000 0 0 0 6.0000 13/ 29

Higher Derivatives in Matlab Using MAD

Second Derivatives (ctd)

z2ndderivs=reshape(... getinternalderivs(getinternalderivs(z)),[3 3 3]) % derivs. in D1 direc. of z’s derivs. in D2 direc. z2ndderivs(:,:,1) = 2 0 0 0 0 0 0 0 0 z2ndderivs(:,:,2) = 0 0 0 0 2 0 0 0 0 z2ndderivs(:,:,3) = 0 0 0 0 0 0 0 0 2

14/ 29

Higher Derivatives in Matlab Using MAD

Why The Heck Does This Work? xx xx=fmad([1.1 2 3],eye(3)); creates object, value

deriv

[1.1 2 3] eye(3) x value

deriv

xx

eye(3) deriv

x=fmad(xx,eye(3)); creates object, value

[1.1 2 3] eye(3)

15/ 29

Higher Derivatives in Matlab Using MAD

How The Heck (ctd.) z=x.*x; forms object, z value

deriv

x.value.*x.value 2.*x.value.*x.deriv

16/ 29

Higher Derivatives in Matlab Using MAD

How The Heck (ctd.) z=x.*x; forms object, z value

deriv 2.*x.value.*x.deriv

xx value

xx deriv

[1.1 2 3] eye(3)

16/ 29

.*

value

deriv

[1.1 2 3] eye(3)

Higher Derivatives in Matlab Using MAD

How The Heck (ctd.) z=x.*x; forms object, z value deriv   2.2 0 0 [1.21, 4.0, 9.0]  0 4 0 0 0 6 value

16/ 29

deriv 2.*x.value.*x.deriv

Higher Derivatives in Matlab Using MAD

How The Heck (ctd.) z=x.*x; forms object, z value deriv   2.2 0 0 [1.21, 4.0, 9.0]  0 4 0 0 0 6

deriv

value

16/ 29

xx value

deriv .* 2 eye(3)

[1.1 2 3] eye(3)

Higher Derivatives in Matlab Using MAD

How The Heck (ctd.) z=x.*x; forms object, z value deriv   2.2 0 0 [1.21, 4.0, 9.0]  0 4 0 0 0 6

deriv

value

16/ 29

xx value deriv deriv value  .* 2 eye(3) .* 2 I3 1.1 1.1 1.1  22 3]2 eye(3) 2  [I3 , I3 , I3 ] [1.1 3 3 3

Higher Derivatives in Matlab Using MAD

How The Heck (ctd.) z=x.*x; forms object, z value deriv   2.2 0 0 [1.21, 4.0, 9.0]  0 4 0 0 0 6

deriv

value

16/ 29

xx deriv value value deriv    .* 2 eye(3) 2.2 0 0 e1  02 3]  [I3 , I3 , I3 ] .* 2  e2  4 0eye(3) [1.1 0 0 6 e3

Higher Derivatives in Matlab Using MAD

How The Heck (ctd.) z=x.*x; forms object, z value deriv   2.2 0 0 [1.21, 4.0, 9.0]  0 4 0 0 0 6

deriv

value

16/ 29

value xx deriv value deriv     .* 2  eye(3)  2.2 0 0 I3 e 1 e 1 e1  02 3]   I3  .* 2e2 e2 e2  4 0eye(3) [1.1 0 0 6 I3 e 3 e 3 e3

Higher Derivatives in Matlab Using MAD

How The Heck (ctd.) z=x.*x; forms object, z value deriv deriv   2.2 0 0 [1.21, 4.0, 9.0]  0 4 0 0 0 6 value

16/ 29

value xx deriv value deriv   .* 2 eye(3)  2.2 0 0 e1 0 0  02 3]   0 e2 0  4 0eye(3) [1.1 0 0 6 0 0 e3

Higher Derivatives in Matlab Using MAD

Required Changes to MAD

Few changes needed Made derivvec class superiorto(’fmad’). Subscript Referencing - make via direct calls to fmad’s subsref function. For efficiency - eliminate unnecessary temporary fmad objects created.

17/ 29

Higher Derivatives in Matlab Using MAD

Made derivvec class superiorto(fmad) derivvec Constructor Function function dv=derivvec(x,shape,nd) . . otherwise error(’derivvec must have 1 or 2 inputs’) end superiorto(’fmad’) Required for binary operations between fmad and derivvec objects. e.g. xfmad.*yderivvec Ensures object precedence is such that derivvec operation called and not the fmad operation. Enables differentiation through the derivvec class.

18/ 29

Higher Derivatives in Matlab Using MAD

Subscript Referencing Cannot make subscript references such as x(1:5,2) within fmad class if x is fmad class. Have to replace overloaded subscripting with direct calls to fmad’s subsref function.

In fmad’s prod.m Product Function localderivs=pval(ones(s(1),1),:)./Aval; replaced with S.type=’()’; S.subs=ones(s(1),1),’:’; localderivs=subsref(pval,S)./Aval;

19/ 29

Higher Derivatives in Matlab Using MAD

Eliminate Unnecessary Temporary fmad Objects A.value of fmad class:

Original Code in prod.m (fmad objects highlighted) Aval=A.value; s=size(Aval); S.subs={ones(s(1),1),’:’}; localderivs=subsref(pval,S)./Aval;

Modified Code in prod.m (fmad objects highlighted) Aval=A.value; s=size(deactivate(Aval)); S.subs={ones(s(1),1),’:’}; localderivs=subsref(pval,S)./Aval; Reduces number of fmad operations performed.

20/ 29

Higher Derivatives in Matlab Using MAD

Reverse Mode AD in Matlab

Uses a madtape class to build up 2 tapes: 1 2

operations tape of all arithmetic operations, including subscripting. value tape of sizes of values involved in calculation, values of those involved nonlinearly.

Functions to set adjoints and propagate back through the computation. Ensure madtape class superior to fmad to enable forward-over-reverse.

21/ 29

Higher Derivatives in Matlab Using MAD

Use of Reverse Mode - Minpack EPT Gradient x=madtape(x0); % initialise madtape object f=MinpackEPT_F(x,Prob); % Evaluate the function % and create tapes fmadtape=getvalue(f) % Extract the objective function value setadjoint(f,1); % initialise adjoint of dependent var. MADCalcAdjoints % propagate adjoints back % through the operations tape gmadtape=getadjoint(x); % extract adjoint of independent

22/ 29

Higher Derivatives in Matlab Using MAD

Results

Very preliminary Minpack EPT case. Objected oriented application

23/ 29

Higher Derivatives in Matlab Using MAD

Minpack EPT Problem Objective function with sparse Hessian

24/ 29

Higher Derivatives in Matlab Using MAD

Minpack EPT Problem Ratio of Hessian time to function time. n = nx ∗ nx 4 16 64 256 hand-coded 12. 13. 12. 14. fmad(sparse) on fmad(sparse) 662. 664. 1154. 18553. fmad(sparse) on reverse 1160. 1147. 1235. 2897. Modify Source Code hand-coded 13. 14. 14. 15. fmad(sparse) on fmad(sparse) 569. 559. 757. 4778. fmad(sparse) on reverse 1114. 1094. 1194. 2727. n2 16 256 4096 65536 All times averaged over a minimum of 5s.

25/ 29

Higher Derivatives in Matlab Using MAD

Source Code Modification [f,fgrad,fhess]=MinpackEPT_FGH(x,Prob) nx=Prob.user.nx; % nx is class double ny=length(x)./nx; % x active => ny active with zero derivs. if ny~=Prob.user.ny error(’Must have length(x)==nx*ny’) end hx = 1/(nx+1); % hx is inactive hy = 1/(ny+1); % hy is active with zero derivs. v = zeros(nx+2,ny+2); % ny active => v active v(2:nx+1,2:ny+1) = reshape(x,nx,ny); % v active allows subscripted adssignment % computer dvdx and dvdy on each edge of the grid dvdx = (v(2:nx+2,:)-v(1:nx+1,:))/hx; dvdy = (v(:,2:ny+2)-v(:,1:ny+1))/hy % unnecessary differentiated division op.

26/ 29

Higher Derivatives in Matlab Using MAD

Source Code Modification [f,fgrad,fhess]=MinpackEPT_FGH(x,Prob) nx=Prob.user.nx; % nx is class double ny=length(x)./nx; % x active => ny active with zero derivs. if ny~=Prob.user.ny error(’Must have length(x)==nx*ny’) end hx = 1/(nx+1); % hx is inactive hy = 1/(ny+1); % hy is active with zero derivs. v = zeros(nx+2,ny+2); % ny active => v active v(2:nx+1,2:ny+1) = reshape(x,nx,ny); % v active allows subscripted adssignment % computer dvdx and dvdy on each edge of the grid dvdx = (v(2:nx+2,:)-v(1:nx+1,:))/hx; dvdy = (v(:,2:ny+2)-v(:,1:ny+1))/hy % unnecessary differentiated division op.

26/ 29

Higher Derivatives in Matlab Using MAD

Source Code Modification [f,fgrad,fhess]=MinpackEPT_FGH(x,Prob) nx=Prob.user.nx; % nx is class double ny=length(x)./nx; % x active => ny active with zero derivs. if ny~=Prob.user.ny error(’Must have length(x)==nx*ny’) end hx = 1/(nx+1); % hx is inactive hy = 1/(ny+1); % hy is active with zero derivs. v = zeros(nx+2,ny+2); % ny active => v active v(2:nx+1,2:ny+1) = reshape(x,nx,ny); % v active allows subscripted adssignment % computer dvdx and dvdy on each edge of the grid dvdx = (v(2:nx+2,:)-v(1:nx+1,:))/hx; dvdy = (v(:,2:ny+2)-v(:,1:ny+1))/hy % unnecessary differentiated division op.

26/ 29

Higher Derivatives in Matlab Using MAD

Source Code Modification [f,fgrad,fhess]=MinpackEPT_FGH(x,Prob) nx=Prob.user.nx; % nx is class double ny=length(x)./nx; % x active => ny active with zero derivs. if ny~=Prob.user.ny error(’Must have length(x)==nx*ny’) end hx = 1/(nx+1); % hx is inactive hy = 1/(ny+1); % hy is active with zero derivs. v = zeros(nx+2,ny+2); % ny active => v active v(2:nx+1,2:ny+1) = reshape(x,nx,ny); % v active allows subscripted adssignment % computer dvdx and dvdy on each edge of the grid dvdx = (v(2:nx+2,:)-v(1:nx+1,:))/hx; dvdy = (v(:,2:ny+2)-v(:,1:ny+1))/deactivate(hy); % deactivate hy => cheaper division

26/ 29

Higher Derivatives in Matlab Using MAD

Object-Oriented Test Case Preliminary Results Test case with 4 classes from aerospace sector n = 21, m = 101. max. 11 nonzeros/row of J. Require derivatives for uncertainty propagation. Users require derivatives to order 3 - results here to order 2 - worked first time! technique FD fmad fmad on fmad

cpu(JF)/cpu(F) 22.0 1.09 -

cpu(∇2 F)/cpu(F) 486.7 2.41

Very little computation performed - nearly all time creating objects and some trivial computation - highlights benefits of OO!! OO destroys performance of FD.

27/ 29

Higher Derivatives in Matlab Using MAD

Jacobian Sparsity

28/ 29

Higher Derivatives in Matlab Using MAD

Conclusions & Future Work

MAD can differentiate object-oriented code - including itself! Further performance optimisation of fmad class needed: I I

Inactive fmad object is needed. Store size of fmad object directly - don’t take size of it’s value.

Sparsity detection needed for first and higher derivatives - some proof of concept done using sparse logical matrices - summer 2007. madtape class needs improvements: I

I

29/ 29

Generate tape once and do multiple adjoints for same inputs x - need to extend to different x with same control flow. Performance analysis and optimisation.

Higher Derivatives in Matlab Using MAD

References

Shaun A. Forth. An efficient overloaded implementation of forward mode automatic differentiation in MATLAB. ACM Trans. Math. Softw., 32(2):195–222, June 2006. Andreas Griewank, David Juedes, and Jean Utke. Algorithm 755: ADOL-C: A package for the automatic differentiation of algorithms written in C/C++. ACM Transactions on Mathematical Software, 22(2):131–167, 1996. Andreas Griewank. Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation. Number 19 in Frontiers in Applied Mathematics. SIAM, Philadelphia, PA, 2000.

30/ 29

Higher Derivatives in Matlab Using MAD