Serial and Parallel Krylov Methods for Implicit Finite Difference Schemes Arising in Multivariate Option Pricing

Manfred Gilli, Evis Këllezi and Giorgio Pauletto¹
First draft: June 1999 This draft: March 2001
Comments are welcome

¹ M. Gilli: Department of Econometrics, University of Geneva, 40 Bd du Pont d'Arve, 1211 Geneva 4, Switzerland, email [email protected]. E. Këllezi: Department of Econometrics and FAME, University of Geneva, 40 Bd du Pont d'Arve, 1211 Geneva 4, Switzerland, email [email protected]. G. Pauletto: Department of Econometrics, University of Geneva, 40 Bd du Pont d'Arve, 1211 Geneva 4, Switzerland, email [email protected]. We would like to thank Nick Webber for comments on an earlier version of this paper. Financial support from the Swiss National Science Foundation (project 12–5248.97) is gratefully acknowledged.
Abstract

This paper investigates computational and implementation issues in the valuation of options on three underlying assets, focusing on finite difference methods. We demonstrate that implicit methods, which have good convergence and stability properties, can now be implemented efficiently thanks to recently developed techniques for the solution of large and sparse linear systems. In the trivariate option valuation problem, we use nonstationary iterative methods (also called Krylov methods) to solve the large and sparse linear systems arising from the implicit methods. Krylov methods are investigated in both serial and parallel implementations. Computational results show that the parallel implementation is particularly efficient when a fine space grid is needed.
JEL codes: C63, C88, G13. Keywords: Multivariate option pricing, finite difference methods, Krylov methods, parallel Krylov methods.
Executive Summary

In recent years the demand for numerical computations in financial applications has greatly increased. Several fields of application have benefited from the combination of efficient algorithms and fast computers. In particular, the valuation of derivative securities has been pushed forward by the use of intensive computational procedures. This paper focuses on computational and implementation issues of finite difference methods for the valuation of multivariate contingent claims. Examples of problems leading to multivariate partial differential equations in finance include the pricing of foreign currency debts, compensation plans, risk-sharing contracts and multifactor interest rate models, to mention a few.

It is generally accepted that the dimensionality of the problem is a nontrivial issue. Up to a dimension of three, methods like finite differences or finite elements can still be used; with a greater number of state variables, Monte Carlo simulation is thought to be the only way out. For bivariate problems, finite difference methods, both explicit and implicit, have been successfully implemented. In the trivariate case the dimensionality of the problem increases and implicit methods become greatly desirable, as much smaller grid sizes can then be used to obtain acceptable precision within reasonable computation times.

Based on computational results obtained in two different computing environments, we conclude that implicit finite difference methods can be used efficiently for the valuation of options on three underlying assets, taking advantage of their good stability and convergence features. The availability of efficient methods for the solution of large and sparse linear systems, namely nonstationary iterative methods, makes the use of implicit finite difference methods possible. In our experiments, the size of the system that we solve in parallel is approximately three million. However, we think that, with faster processors and more memory, we could go even further, allowing for a finer space grid (of the order of 200 points in each of the three directions). In the serial case, the maximum grid size that we can handle on a standard PC in a Matlab environment is about 70 in each direction.
1
Introduction
In recent years the demand for numerical computations in financial applications has greatly increased. Several fields of application have benefited from the combination of efficient algorithms and fast computers. In particular, the valuation of derivative securities has been pushed forward by the use of intensive computational procedures. This paper focuses on computational and implementation issues of finite difference methods for the valuation of multivariate contingent claims. Examples of problems leading to multivariate partial differential equations in finance include the pricing of foreign currency debts, compensation plans, risk-sharing contracts and multifactor interest rate models, to mention a few.

It is generally accepted that the dimensionality of the problem is a nontrivial issue. Up to a dimension of three, methods like finite differences or finite elements can still be used; with a greater number of state variables, Monte Carlo simulation is thought to be the only way out. For bivariate problems, finite difference methods, both explicit and implicit, have been successfully implemented. Financial applications are proposed by Dempster and Hutton (1997) and Izvorski (1998). Other methods like finite elements (Zvan et al., 1998) and the Fourier grid method (Engelmann and Schwendner, 1998) have also been used. In the trivariate case the dimensionality of the problem increases and implicit methods become greatly desirable, as much smaller grid sizes can then be used to obtain acceptable precision within reasonable computation times. The point we want to make in this paper is that, for trivariate problems, implicit finite difference methods can be implemented, taking advantage of their good stability and convergence features. The availability of efficient methods for the solution of large and sparse linear systems, namely nonstationary iterative methods, makes the use of implicit finite difference methods possible.
Previous work has focused on the explicit and the alternating direction implicit (ADI) methods, as fully implicit methods were considered unfeasible. An example is given by Ekvall (1994), who implemented the explicit and the ADI methods on a massively parallel computer.

The paper is organized as follows: Section 2 introduces the valuation problem for a European call on three underlying assets, which leads to a partial differential equation with three state variables. In order to be able to benchmark our results, we consider the case of an option on the maximum of three assets, for which a closed form pricing formula exists. In Section 3 we describe in some detail the discretisation of the partial differential equation, deriving the different solution methods and giving their time and space complexity. The nonstationary iterative methods, and in particular the biconjugate gradient stabilized algorithm (BiCGSTAB), are presented in Section 4. Computational results are given in Section 5; Section 6 concludes.
2
Fundamental PDE for European style options on three assets
Consider a risk free bond B evolving according to the differential equation dB(t) = B(t) r dt, with r the risk free instantaneous interest rate, and three risky assets with values S1, S2 and S3 satisfying the stochastic differential equation

dS(t) = diag(S(t)) [ µ dt + ω dW(t) ] .   (1)

W is a 3-dimensional standard Brownian motion, S = [S1 S2 S3]′ is the vector of prices, µ = [µ1 µ2 µ3]′ the vector of drifts and ω is a three by three matrix such that ωω′ dt represents the covariance matrix of the vector of instantaneous returns on the assets, with generic element ρij σi σj dt. The volatility coefficient of the process for Si is denoted by σi and ρij is the correlation coefficient between dSi/Si and dSj/Sj.

In the absence of arbitrage opportunities, it can be shown that all European-style derivatives on the risky assets must satisfy the partial differential equation

∂C/∂t = − Σ_{i=1}^{3} r Si ∂C/∂Si − (1/2) Σ_{i,j=1}^{3} ρij σi σj Si Sj ∂²C/(∂Si ∂Sj) + r C .

The terminal condition is specific to each type of payoff. For example, in the case of a European call on the maximum of three assets with strike price E, the terminal condition is the option payoff at maturity T:

C(S1, S2, S3, T) = (max{S1(T), S2(T), S3(T)} − E)^+ .   (2)
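As an illustration, the terminal condition (2) can be coded directly. The function name and the vectorized form below are our own choices, not taken from the paper:

```python
import numpy as np

def payoff_max_call(S1, S2, S3, E):
    """Payoff at maturity of a European call on the maximum of three
    assets with strike price E, as in terminal condition (2)."""
    return np.maximum(np.maximum(S1, np.maximum(S2, S3)) - E, 0.0)
```

For example, with asset prices (120, 90, 100) and strike 100 the payoff is 20; the function also accepts arrays of prices, which is convenient for filling a whole grid at once.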
Introducing the changes of variables x = log(S1/E), y = log(S2/E), z = log(S3/E), τ = T − t and C(S1, S2, S3, t) = E u(x, y, z, τ) leads to the following forward parabolic equation

∂u/∂τ = (r − σx²/2) ∂u/∂x + (r − σy²/2) ∂u/∂y + (r − σz²/2) ∂u/∂z
      + (1/2) σx² ∂²u/∂x² + (1/2) σy² ∂²u/∂y² + (1/2) σz² ∂²u/∂z²
      + σx σy ρxy ∂²u/∂x∂y + σx σz ρxz ∂²u/∂x∂z + σy σz ρyz ∂²u/∂y∂z − r u   (3)

with initial condition

u(x, y, z, 0) = (max{e^x − 1, e^y − 1, e^z − 1})^+ .

For notational convenience, the volatility coefficients are indexed by x, y and z.
Equation (3) can also be written more compactly in the following general notation

∂u/∂τ = V · ∇u + (D∇) · ∇u − r u   (4)

where

V = [ r − σx²/2   r − σy²/2   r − σz²/2 ]′

D = | σx²/2       ρxy σx σy   ρxz σx σz |
    | ρyx σy σx   σy²/2       ρyz σy σz |
    | ρzx σz σx   ρzy σz σy   σz²/2     |

are respectively the velocity tensor and the diffusion tensor.
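The tensors V and D are straightforward to assemble from the model parameters. A sketch (the helper name is ours; the correlation matrix ρ is assumed symmetric with unit diagonal):

```python
import numpy as np

def velocity_diffusion(r, sigma, rho):
    """Velocity vector V and diffusion tensor D of equation (4).

    sigma : volatilities (length 3); rho : 3x3 correlation matrix.
    Off-diagonal entries of D are rho_ij * sigma_i * sigma_j;
    diagonal entries are sigma_i**2 / 2."""
    sigma = np.asarray(sigma, dtype=float)
    V = r - 0.5 * sigma**2
    D = np.outer(sigma, sigma) * np.asarray(rho, dtype=float)
    np.fill_diagonal(D, 0.5 * sigma**2)
    return V, D
```

With r = 0.05 and σ = (0.2, 0.3, 0.25), for instance, V[0] = 0.05 − 0.02 = 0.03 and D[0,0] = 0.02.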
3
Finite difference methods
In this section we discuss the solution of the partial differential equation (PDE) given in (3) using finite difference methods. We describe in detail how the discretization of the PDE leads to a large block tridiagonal linear system that can be efficiently solved using nonstationary iterative methods. We favor the completeness of the presentation in order to guide the reader in the implementation of finite difference methods.
3.1
Discretisation of the PDE
The function u(x, y, z, τ) will now be represented by its values at the discrete set of points

x_i = Lx + i △x ,   i = 0, 1, …, Nx
y_j = Ly + j △y ,   j = 0, 1, …, Ny
z_k = Lz + k △z ,   k = 0, 1, …, Nz
τ_m = (m − 1) △τ ,   m = 1, …, N
where Lx, Ly and Lz are the lower bounds of the grid, △x, △y and △z define the grid spacing, △τ is the time step and Nx, Ny, Nz and N are the numbers of space and time steps respectively. For a given time τm, the space grid points (xi, yj, zk, τm) are contained in a rectangular parallelepiped with edges parallel to the axes x, y and z. A particular grid point is then unambiguously defined by the triplet (i, j, k), and the value of the function u(x, y, z, τ) at grid point (xi, yj, zk, τm) is denoted by u^m_ijk. The set of all (Nx + 1)(Ny + 1)(Nz + 1) grid points, i.e. all triplets (i, j, k), is denoted Ω̄ and we consider the partition

Ω̄ = Ω ∪ ∂Ω   and   Ω ∩ ∂Ω = ∅

where Ω is the set of (Nx − 1)(Ny − 1)(Nz − 1) interior points and ∂Ω the set of grid points on the boundary.
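The grid above takes only a few lines to set up. The bounds and spacings below are illustrative placeholders, not values from the paper:

```python
import numpy as np

# Illustrative grid parameters (placeholders, user's choice in practice)
Nx = Ny = Nz = 4
N = 10
Lx = Ly = Lz = -2.0
dx = dy = dz = 1.0
dtau = 0.01

x = Lx + dx * np.arange(Nx + 1)   # x_i, i = 0, ..., Nx
y = Ly + dy * np.arange(Ny + 1)   # y_j, j = 0, ..., Ny
z = Lz + dz * np.arange(Nz + 1)   # z_k, k = 0, ..., Nz
tau = dtau * np.arange(N)         # tau_m = (m - 1) * dtau, m = 1, ..., N

n_all = (Nx + 1) * (Ny + 1) * (Nz + 1)   # number of grid points
n_int = (Nx - 1) * (Ny - 1) * (Nz - 1)   # number of interior points
```

For this 5-point-per-direction example there are 125 grid points, of which 27 are interior.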
Replacing all partial derivatives in equation (3) by finite difference approximations, we can write the following approximation to equation (3), which holds for all grid points u^m_ijk ∈ Ω:

δ̄^m_τ = vx δ̄^m_x + vy δ̄^m_y + vz δ̄^m_z + δ̄^m_xx dxx + δ̄^m_yy dyy + δ̄^m_zz dzz + δ̄^m_xy dxy + δ̄^m_xz dxz + δ̄^m_yz dyz − r u^m_ijk .   (5)
The 9 approximations δ̄^m_q, q ∈ Q = {x, y, z, xx, yy, zz, xy, xz, yz}, for the partial derivatives with respect to the space variables x, y and z are defined as convex linear combinations of central finite difference approximations at time points m + 1 and m

δ̄^m_q = θq δ^{m+1}_q + (1 − θq) δ^m_q .   (6)

The temporal weighting factors θq, q ∈ Q, take their values in [0, 1] and determine the time at which the partial derivatives with respect to the space variables are evaluated. The central finite difference approximations δ^m_q, q ∈ Q, are:

δ^m_x = (u^m_{i+1,j,k} − u^m_{i−1,j,k}) / (2△x)     δ^m_xx = (u^m_{i+1,j,k} − 2u^m_ijk + u^m_{i−1,j,k}) / △x²
δ^m_y = (u^m_{i,j+1,k} − u^m_{i,j−1,k}) / (2△y)     δ^m_yy = (u^m_{i,j+1,k} − 2u^m_ijk + u^m_{i,j−1,k}) / △y²
δ^m_z = (u^m_{i,j,k+1} − u^m_{i,j,k−1}) / (2△z)     δ^m_zz = (u^m_{i,j,k+1} − 2u^m_ijk + u^m_{i,j,k−1}) / △z²
δ^m_xy = (u^m_{i+1,j+1,k} − u^m_{i−1,j+1,k} − u^m_{i+1,j−1,k} + u^m_{i−1,j−1,k}) / (4△x△y)
δ^m_xz = (u^m_{i+1,j,k+1} − u^m_{i−1,j,k+1} − u^m_{i+1,j,k−1} + u^m_{i−1,j,k−1}) / (4△x△z)
δ^m_yz = (u^m_{i,j+1,k+1} − u^m_{i,j−1,k+1} − u^m_{i,j+1,k−1} + u^m_{i,j−1,k−1}) / (4△y△z)
The approximation for the partial derivative with respect to time, δ̄^m_τ, is defined as a linear combination of a backward approximation at time m + 1 and a forward approximation at time m

δ̄^m_τ = θτ (u^{m+1}_ijk − u^m_ijk)/△τ + (1 − θτ) (u^{m+1}_ijk − u^m_ijk)/△τ = (u^{m+1}_ijk − u^m_ijk)/△τ   (7)

and is therefore independent of θτ. Assigning the same value θ to all θq, q ∈ Q, substituting the definitions (6) and (7) of the approximations for the space and time derivatives into equation (5) and collecting terms, we obtain

u^{m+1}_ijk − u^m_ijk = θ c u^{m+1}_Ωijk + (1 − θ) c u^m_Ωijk ,   u^m_ijk ∈ Ω   (8)
where u^m_Ωijk denotes a one-dimensional array. The set of indices Ωijk corresponds to the points in the grid which are neighbours of u^m_ijk and which occur in the finite difference approximations for the partial derivatives. The product c u^m_Ωijk, where c is a row vector, can be developed as

c u^m_Ωijk = c1 u^m_{i,j−1,k−1} + c2 u^m_{i−1,j,k−1} + d6 u^m_{i,j,k−1} − c2 u^m_{i+1,j,k−1}
           − c1 u^m_{i,j+1,k−1} + c3 u^m_{i−1,j−1,k} + d4 u^m_{i,j−1,k} − c3 u^m_{i+1,j−1,k}
           + d2 u^m_{i−1,j,k} + d1 u^m_{i,j,k} + d3 u^m_{i+1,j,k}                             (9)
           − c3 u^m_{i−1,j+1,k} + d5 u^m_{i,j+1,k} + c3 u^m_{i+1,j+1,k} − c1 u^m_{i,j−1,k+1}
           − c2 u^m_{i−1,j,k+1} + d7 u^m_{i,j,k+1} + c2 u^m_{i+1,j,k+1} + c1 u^m_{i,j+1,k+1}

where the coefficients are

c1 = △τ Dxy / (4△x△y)    c2 = △τ Dxz / (4△x△z)    c3 = △τ Dyz / (4△y△z)
c4 = △τ Vx / (2△x)       c5 = △τ Vy / (2△y)       c6 = △τ Vz / (2△z)
c7 = △τ Dxx / △x²        c8 = △τ Dyy / △y²        c9 = △τ Dzz / △z²

d2 = c7 − c4    d4 = c8 − c5    d6 = c9 − c6
d3 = c7 + c4    d5 = c8 + c5    d7 = c9 + c6

d1 = −r △τ − 2c7 − 2c8 − 2c9

and where Vi are the elements of the velocity tensor and Dij those of the diffusion tensor in (4). Figure 1 reproduces the fragment of the grid containing point u_ijk (bullet) and the set of neighbours Ω_ijk (circles).
Figure 1: Grid fragment with node u_ijk (bullet) and neighbours u_Ωijk (circles).

In order to write the system of finite difference equations in matrix form, we number the three dimensions of the space grid points in a single one-dimensional sequence by defining a new index

s = i + j(Nx + 1) + k(Nx + 1)(Ny + 1) ,   i = 0, 1, …, Nx ,  j = 0, 1, …, Ny ,  k = 0, 1, …, Nz .   (10)
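The mapping (10) is a one-liner; a zero-based sketch in Python (the function name is ours):

```python
def lin_index(i, j, k, Nx, Ny):
    """One-dimensional index s of grid point (i, j, k), equation (10)."""
    return i + j * (Nx + 1) + k * (Nx + 1) * (Ny + 1)
```

For Nx = Ny = Nz = 4 the corner (0, 0, 0) maps to s = 0 and (4, 4, 4) to s = 124, the last of the 125 grid points.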
We now need to define the matrix M which corresponds to the array c in equation (8). The position of the elements of a generic row of matrix M is given in Figure 2: the positions are symmetric with respect to the coefficient d1 of element u^m_ijk which, according to definition (10), is at column s.

Figure 2: Elements of vector c and corresponding grid points.

The offset of the other elements with respect to s is defined by the array

o = [ 1  Nx  Nx+1  Nx+2  (Nx+1)Ny  (Nx+1)(Ny+1)−1  (Nx+1)(Ny+1)  (Nx+1)(Ny+1)+1  (Nx+1)(Ny+2) ] .

The row indices of matrix M corresponding to the points u_ijk ∈ Ω are defined by

r = i + (j − 1)(Nx − 1) + (k − 1)(Nx − 1)(Ny − 1) ,   i = 1, …, Nx − 1 ,  j = 1, …, Ny − 1 ,  k = 1, …, Nz − 1   (11)

and a generic row of matrix M has the form

M( r, [ s − o(9 : −1 : 1)  s  s + o ] ) = c   (12)

and we can now write equation (8) in matrix form

u^{m+1}_Ω − u^m_Ω = θ M u^{m+1}_Ω̄ + (1 − θ) M u^m_Ω̄ .   (13)
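A sparse assembly of M along the lines of (11)–(12) can be sketched as follows. The routine names are ours, and we assume a constant coefficient row c (which holds here, since the PDE (3) has constant coefficients); note that the row index r of (11) starts at 1 in the text, so the zero-based version subtracts 1:

```python
import numpy as np
import scipy.sparse as sp

def offsets(Nx, Ny):
    """Column offsets o of the nine 'upper' neighbours relative to the
    diagonal position s; the nine 'lower' neighbours sit at s - o."""
    return np.array([1, Nx, Nx + 1, Nx + 2,
                     (Nx + 1) * Ny, (Nx + 1) * (Ny + 1) - 1,
                     (Nx + 1) * (Ny + 1), (Nx + 1) * (Ny + 1) + 1,
                     (Nx + 1) * (Ny + 2)])

def assemble_M(c, Nx, Ny, Nz):
    """Sparse matrix M of (13): one row per interior point, 19 entries
    per row at columns [s - o(9:-1:1), s, s + o], filled with the
    coefficient row c of (12)."""
    o = offsets(Nx, Ny)
    rel = np.concatenate([-o[::-1], [0], o])   # 19 relative columns
    rows, cols, vals = [], [], []
    for k in range(1, Nz):
        for j in range(1, Ny):
            for i in range(1, Nx):
                s = i + j * (Nx + 1) + k * (Nx + 1) * (Ny + 1)
                r = (i - 1) + (j - 1) * (Nx - 1) + (k - 1) * (Nx - 1) * (Ny - 1)
                rows.extend([r] * 19)
                cols.extend((s + rel).tolist())
                vals.extend(np.asarray(c, dtype=float).tolist())
    shape = ((Nx - 1) * (Ny - 1) * (Nz - 1),
             (Nx + 1) * (Ny + 1) * (Nz + 1))
    return sp.csr_matrix((vals, (rows, cols)), shape=shape)
```

For Nx = Ny = Nz = 4 this yields the 27 × 125 pattern of Figure 3, with 19 entries per row.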
Figure 3 illustrates the structure of matrix M corresponding to a grid with five points in each direction. The columns of matrix M which correspond to grid points u_ijk ∈ ∂Ω are indicated with a circle and those which correspond to grid points u_ijk ∈ Ω are indicated with a bullet.
Figure 3: Pattern of matrix M for Nx = Ny = Nz = 4. Columns marked with bullets form set Ω and columns marked with circles form set ∂Ω.

We then partition the columns of matrix M into two matrices A and B, such that the columns of A correspond to the indices of set Ω and the columns of B to the indices of set ∂Ω, i.e.

M u^m_Ω̄ = A u^m_Ω + B u^m_∂Ω .

Finally we can rewrite the set of equations (8) as

(I − θA) u^{m+1}_Ω = (I + (1 − θ)A) u^m_Ω + θB u^{m+1}_∂Ω + (1 − θ)B u^m_∂Ω .   (14)
The matrix on the left in Figure 4 corresponds to matrix A extracted from the matrix M given in Figure 3. The matrix on the right in the same figure shows the pattern of a larger matrix A with Nx = Ny = Nz = 6.
Figure 4: Pattern of matrix A for Nx = Ny = Nz = 4 and Nx = Ny = Nz = 6.
3.2
Solution methods for the discretized problem
Different settings of the parameters θq introduced in (6) and (7) transform the set of equations (14). The solutions for these different situations are discussed in this section. Denoting N1, N2, …, Nd the number of steps in each direction of a d-dimensional problem, the number of grid points |Ω̄| and the number of interior points |Ω| are

|Ω̄| = Π_{i=1}^{d} (Ni + 1) ,   |Ω| = Π_{i=1}^{d} (Ni − 1)

and the number of neighbours of a given grid point s is |Ωs| = 2d².

3.2.1

Explicit method (θ = 0)
If all parameters θq are set to zero, equation (14) becomes

u^{m+1}_Ω = (I + A) u^m_Ω + B u^m_∂Ω

where all elements on the right-hand side are known at time step τm. This equation can be solved explicitly for each time step τm. Its numerical stability and convergence require the spectral radius ρ of the matrix I + A to be less than one. When solving the system it is however more efficient to consider the form

u^{m+1}_Ω = u^m_Ω + M u^m_Ω̄

where matrix M does not need to be formed. The indices and elements of the nonzero columns corresponding to a given row have been defined in (10) and (12). The explicit method for an arbitrary number d of space variables is summarized in Algorithm 1, where Ωs is the set of all neighbours of grid point s in the d-dimensional parallelepiped which occur in the finite difference approximations for the partial derivatives.

Algorithm 1 Pseudo-code for the explicit method.
1: Initialize u0, c
2: for m = 0 : N − 1 do
3:   for s ∈ Ω do
4:     Compute Ωs
5:     u^{m+1}_s = u^m_s + c u^m_Ωs
6:   end for
7: end for
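One explicit time step in the spirit of Algorithm 1 can be sketched as follows, with the neighbour sets Ω_s precomputed once (the storage-for-speed variant discussed under Complexity). The names are our own:

```python
import numpy as np

def explicit_step(u, c, neighbours):
    """One time step of the explicit method: for every interior point s,
    u_new[s] = u[s] + c . u[Omega_s]; boundary values are left unchanged.
    `neighbours` maps each interior index s to the array of grid indices
    Omega_s used by the finite difference approximations."""
    u_new = u.copy()
    for s, nb in neighbours.items():
        u_new[s] = u[s] + c @ u[nb]
    return u_new
```

On the full three-dimensional grid each Ω_s would hold the 19 indices s − o(9:−1:1), s, s + o; the toy test below uses a three-point one-dimensional stencil only to illustrate the update.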
Complexity  Statement 4 needs 2(d − 1) operations to compute the index s and 2d² operations to compute the set of indices Ωs. The addition and the scalar product in Statement 5 need 1 + 2(1 + 2d²) operations. The number of elementary operations for Algorithm 1 is therefore

N (6d² + 2d + 1) |Ω|   (15)

and, as we need the two arrays u^{m+1} and u^m, the space complexity of the algorithm is 2|Ω̄|. The operation count given in (15) can be reduced by first computing all sets Ωs and storing them in an array of dimension (1 + 2d²)|Ω|. This reduces the factor in expression (15) from 6d² + 2d + 1 to 4d² + 3.

3.2.2
Alternating Directions Implicit (ADI) method
The alternating directions implicit (ADI) method, introduced by Peaceman and Rachford (1955), reduces a multidimensional problem to a sequence of one-dimensional problems by solving alternately for a single space variable. To describe the method we consider the space variable x and the two finite difference approximations δ̄^m_x and δ̄^m_xx defined in (6). We set θx = θxx = 1 and θq = 0 for q ∈ Q, q ≠ x, xx. These finite difference approximations can then be written

δ̄^m_x = (u^{m+1}_{i+1,j,k} − u^{m+1}_{i−1,j,k}) / (2△x)     δ̄^m_xx = (u^{m+1}_{i+1,j,k} − 2u^{m+1}_ijk + u^{m+1}_{i−1,j,k}) / △x²

while all other approximations δ̄^m_q, q ∈ Q, q ≠ x, xx remain unchanged. As a consequence equation (8) becomes

u^{m+1}_ijk − u^m_ijk = c1 u^m_{i,j−1,k−1} + c2 u^m_{i−1,j,k−1} + d6 u^m_{i,j,k−1} − c2 u^m_{i+1,j,k−1}
                      − c1 u^m_{i,j+1,k−1} + c3 u^m_{i−1,j−1,k} + d4 u^m_{i,j−1,k} − c3 u^m_{i+1,j−1,k}
                      + d2 u^{m+1}_{i−1,j,k} + dx u^{m+1}_ijk + (d1 − dx) u^m_ijk + d3 u^{m+1}_{i+1,j,k}
                      − c3 u^m_{i−1,j+1,k} + d5 u^m_{i,j+1,k} + c3 u^m_{i+1,j+1,k} − c1 u^m_{i,j−1,k+1}
                      − c2 u^m_{i−1,j,k+1} + d7 u^m_{i,j,k+1} + c2 u^m_{i+1,j,k+1} + c1 u^m_{i,j+1,k+1}

where dx = −r△τ − 2c7. Compared to (9), only the third row on the right-hand side has changed. For given j and k we then have the system of equations

u^{m+1}_{.jk} − u^m_{.jk} = Ax u^{m+1}_{.jk} + Bx u^m_Ω̄

where Ax = tridiag(d2, dx, d3) is a tridiagonal matrix and Bx is the matrix containing the coefficients of the u^m_ijk on the right-hand side. We then proceed in a similar way for the other space variables and form the two tridiagonal matrices Ay = tridiag(d4, dy, d5) and Az = tridiag(d6, dz, d7), where dy = −r△τ − 2c8 and dz = −r△τ − 2c9, as well as the corresponding matrices By and Bz. The ADI method is summarized in Algorithm 2.

Algorithm 2 Pseudo-code for the ADI method.
1: Initialize u0
2: for m = 0 : N − 1 do
3:   for j = 1, …, Ny − 1 and k = 1, …, Nz − 1 do
4:     Solve u^{m+1/3}_{.jk} − u^m_{.jk} = Ax u^{m+1/3}_{.jk} + Bx u^m_Ω̄
5:   end for
6:   for i = 1, …, Nx − 1 and k = 1, …, Nz − 1 do
7:     Solve u^{m+2/3}_{i.k} − u^{m+1/3}_{i.k} = Ay u^{m+2/3}_{i.k} + By u^{m+1/3}_Ω̄
8:   end for
9:   for i = 1, …, Nx − 1 and j = 1, …, Ny − 1 do
10:    Solve u^{m+1}_{ij.} − u^{m+2/3}_{ij.} = Az u^{m+1}_{ij.} + Bz u^{m+2/3}_Ω̄
11:  end for
12: end for
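Each "Solve" statement in Algorithm 2 is a tridiagonal system along one grid line. A sketch of a single line solve using SciPy's banded solver (the helper name and the scalar coefficients are ours; in practice the factorisation would be reused across lines):

```python
import numpy as np
from scipy.linalg import solve_banded

def adi_line_solve(rhs, d_lo, d_diag, d_up):
    """Solve (I - A_x) u = rhs for one grid line, where
    A_x = tridiag(d_lo, d_diag, d_up) as in Algorithm 2."""
    n = rhs.size
    ab = np.zeros((3, n))        # banded storage: upper, main, lower
    ab[0, 1:] = -d_up            # superdiagonal of I - A_x
    ab[1, :] = 1.0 - d_diag      # main diagonal
    ab[2, :-1] = -d_lo           # subdiagonal
    return solve_banded((1, 1), ab, rhs)
```

The banded solver plays the role of the precomputed LU factorisation mentioned in the complexity discussion: since I − Ax, I − Ay and I − Az are constant over the time loop, their factorisations need only be computed once.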
Complexity  In statements 4, 7 and 10 of Algorithm 2 we have to solve a set of independent tridiagonal systems of equations. The matrices I − Ax, I − Ay and I − Az are constant and therefore their factorisation has to be computed only once. The back- and forward substitution for a tridiagonal system needs 7(Ni − 1) operations. A row of matrix Bi contains 2d² − 2 elements, which gives 4(d² − 1)(Ni − 1) operations to compute the product Bi u^m_Ω̄. Considering the Ni − 1 operations for the addition, we get a total of (4d² + 4)(Ni − 1) operations for statements 4, 7 and 10. The total operation count is then

N (4d² + 14) |Ω| .   (16)

In order for the systems to be tridiagonal, the index set Ω of the grid points has to be traversed by varying the indices i, j and k in the order j, i, k for Ay and in the order k, i, j for Az.¹

3.2.3
Implicit method (θ = 1)
If all parameters θq are set to one, equation (14) becomes

(I − A) u^{m+1}_Ω = u^m_Ω + B u^{m+1}_∂Ω

and u^{m+1}_Ω is the solution of a linear system. Algorithm 3 implements the implicit method.

¹ This can lead to a cache problem as the same array must be accessed in a non-contiguous way.
Algorithm 3 Pseudo-code for the implicit method.
1: Initialize u0
2: Compute A and B
3: for m = 0 : N − 1 do
4:   Compute boundary values u^{m+1}_∂Ω
5:   Solve (I − A) u^{m+1}_Ω = u^m_Ω + B u^{m+1}_∂Ω
6: end for
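For moderate grid sizes, statement 5 of Algorithm 3 can be carried out with a sparse direct solver. The matrix below is a small illustrative stand-in for I − A, not the actual option-pricing operator (the real A of Figure 4 is block tridiagonal with 19 diagonals):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 50
# Illustrative sparse operator standing in for A (tridiagonal here)
A = sp.diags([0.25, -0.6, 0.25], [-1, 0, 1], shape=(n, n), format='csc')
I = sp.identity(n, format='csc')

u_m = np.ones(n)                    # u^m on the interior points
rhs = u_m                           # plus B u^{m+1}_boundary in general
u_next = spla.spsolve(I - A, rhs)   # one implicit time step
```

For the system sizes considered later in the paper a direct solve of this kind becomes infeasible, which is what motivates the Krylov methods of Section 4.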
The implicit method is first order accurate and in this respect not superior to the explicit method. Its advantage, however, is that it is stable and convergent for all choices of space and time steps.

3.2.4
θ-method (0 < θ < 1)
If all the parameters θq are set to some θ, 0 < θ < 1, we obtain a θ-weighted average of the explicit and the implicit method. In the particular case where all θq are set to 1/2 we have the Crank-Nicolson method. Equation (14) then becomes

(I − ½A) u^{m+1}_Ω = (I + ½A) u^m_Ω + ½ B (u^{m+1}_∂Ω + u^m_∂Ω) .

The Crank-Nicolson method is second order accurate and its complexity is almost identical to that of the implicit method; a few more operations are needed to evaluate the right-hand side vector of the linear system.
4 Nonstationary iterative methods for the solution of the linear system
The implicit method requires the solution of a linear system of size |Ω|, which can become relatively large. The structure of the matrix I − A of our linear system is shown in Figure 4. This matrix can either be considered as a banded matrix with bandwidth (Nx − 1)Ny + 1 or as a block tridiagonal matrix with blocks of size (Nx − 1) × (Nx − 1). For the solution of the linear system, one could then apply an LU decomposition which exploits this particular structure of the matrix. It appears, however, that direct methods are infeasible for systems of the size considered in our application.² Iterative methods offer a way of overcoming the difficulties of direct methods. We experimented with nonstationary iterative methods, also called Krylov subspace methods.³

² Gilli and Pauletto (1997) discuss this problem in the case of the solution of large economic models.
³ A presentation of these techniques can be found in Barrett et al. (1994), Freund et al. (1992), Axelsson (1994), Kelley (1995) and Saad (1996). For applications in economics see Gilli and Pauletto (1998), Pauletto and Gilli (2000) and Mrkaic (2001).
Contrary to stationary methods, such as Jacobi or Gauss-Seidel, these techniques use information that changes from iteration to iteration. For a linear system Ax = b, Krylov methods generally compute the ith iterate as

x^(i) = x^(i−1) + d^(i),   i = 1, 2, . . . .
The attractive characteristic of these methods is that the ith update d^(i) can be computed cheaply for sparse matrices and requires little storage. The convergence speed is generally also better than for stationary iterative methods. For the application presented in the paper, we used the biconjugate gradient stabilized method (BiCGSTAB) introduced by van der Vorst (1992). Algorithm 4 gives the pseudo-code for solving a linear system Ax = b with this method.

Algorithm 4 Pseudo-code for BiCGSTAB.
1: Initialize x^(0)
2: Compute r^(0) = b − Ax^(0), ρ0 = 1, ρ1 = r^(0)′ r^(0), α = 1, ω = 1, p = 0, v = 0
3: for k = 1, 2, . . . until convergence do
4:   β = (ρk/ρk−1)(α/ω)
5:   p = r + β(p − ωv)
6:   v = Ap
7:   α = ρk/(r^(0)′ v)
8:   s = r − αv
9:   t = As
10:  ω = (t′s)/(t′t)
11:  x^(k) = x^(k−1) + αp + ωs
12:  r = s − ωt
13:  ρk+1 = −ω r^(0)′ t
14: end for
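A line-by-line transcription of Algorithm 4 into NumPy may be helpful (a sketch without preconditioning; the stopping test on the residual norm is our addition, and r is initialised to r^(0) as the pseudo-code implies):

```python
import numpy as np

def bicgstab(A, b, x0=None, tol=1e-10, maxiter=500):
    """Unpreconditioned BiCGSTAB following Algorithm 4."""
    x = np.zeros_like(b) if x0 is None else np.asarray(x0, dtype=float)
    r = b - A @ x                  # statement 2: initial residual
    r0 = r.copy()                  # shadow residual r^(0)
    rho_prev, alpha, omega = 1.0, 1.0, 1.0
    rho = r0 @ r                   # rho_1 = r^(0)' r^(0)
    p = np.zeros_like(b)
    v = np.zeros_like(b)
    for k in range(1, maxiter + 1):
        beta = (rho / rho_prev) * (alpha / omega)     # statement 4
        p = r + beta * (p - omega * v)                # statement 5
        v = A @ p                                     # statement 6
        alpha = rho / (r0 @ v)                        # statement 7
        s = r - alpha * v                             # statement 8
        t = A @ s                                     # statement 9
        omega = (t @ s) / (t @ t)                     # statement 10
        x = x + alpha * p + omega * s                 # statement 11
        r = s - omega * t                             # statement 12
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k
        rho_prev, rho = rho, -omega * (r0 @ t)        # statement 13
    return x, maxiter

# Usage on a small diagonally dominant nonsymmetric system:
A = np.array([[4.0, 1.0, 0.0],
              [2.0, 5.0, 1.0],
              [0.0, 1.0, 3.0]])
b = np.array([1.0, 2.0, 3.0])
x, k = bicgstab(A, b)
```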
Complexity If BiCGSTAB is used without preconditioning, as in our application, the algorithm computes in statement 2 a matrix-vector product and a norm, and for each iteration 4 inner products, 6 vector updates (also called saxpy operations) and 2 matrix-vector products. The corresponding number of elementary operations is 2(2d² + 1)|Ω| for a matrix-vector product, 2|Ω| for one saxpy as well as for one inner product, and 4|Ω| for the computation of the norm. The overall complexity of the implicit method for d = 3 and N time steps is then⁴

N |Ω| (42 + 96k),

where k is the number of iterations BiCGSTAB needs to converge.

⁴ This operation count neglects the computation of the right-hand side in statement 5 of Algorithm 3.
Algorithm 4 needs memory to store⁵ the matrix A, which has nnz = (2d² + 1)|Ω| nonzero entries, and 7 vectors whose size corresponds to the order of A. With a storage scheme for the sparse matrix A that needs 12 nnz + 4|Ω| bytes, we need for d = 3 a total of 12 · 38|Ω| + 4|Ω| + 7 · 8|Ω| = 516|Ω| bytes of memory, where integers occupy 4 bytes and reals 8 bytes.
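The estimate translates into a simple sizing rule (a sketch; the helper name is ours, and the 516 bytes per unknown is the figure derived above for d = 3):

```python
def implicit_memory_bytes(n_points):
    """Memory estimate 516 * |Omega| for a grid with n_points points
    per direction, where |Omega| counts interior points only."""
    omega = (n_points - 2) ** 3
    return 516 * omega

print(implicit_memory_bytes(71) / 1e6)   # about 170 Mbytes for 71 points per direction
```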
5 Computational results
In order to benchmark our results, we price an instrument for which an analytical solution exists. We report the values of a call on the maximum of three assets S1, S2 and S3 with strike price E = 10, maturity T = 1, risk-free rate of return r = 0.1, volatilities σ1 = σ2 = σ3 = 0.2, correlation coefficients among assets ρ12 = ρ23 = ρ13 = 0.5 and initial values S1(0) = S2(0) = S3(0) = 10. The analytical solution of the call corresponding to this set of parameters is 2.267.⁶ In the following we present computational results obtained both with a serial and a parallel implementation.

Serial implementation We implemented the explicit method described in Algorithm 1, the implicit method described in Algorithm 3 and the BiCGSTAB method described in Algorithm 4 in a MATLAB 5.x programming environment.⁷ Computations were executed on a 500 MHz Pentium III PC with 512 Mbytes of memory. To illustrate the advantage of the implicit method over the explicit one, we report the relative error of the computed call price with respect to the analytical solution for different time and space grids. Figure 5 clearly shows that the implicit method achieves a higher precision. The figure does not convey the number of elementary operations needed to compute the different solutions: the best result for the explicit method needs about 8 Gflops, while the best result for the implicit method was obtained with 3 Gflops.

⁵ It is possible to implement a matrix-free version of the algorithm.
⁶ The analytical formula for a call on the maximum of three assets can be found in (Johnson, 1987).
⁷ The MATLAB files containing the code can be downloaded from the URL http://www.unige.ch/ses/metri/pauletto/fdm.html.
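The benchmark value 2.267 can be cross-checked independently by plain Monte Carlo simulation of the three correlated geometric Brownian motions (a sketch, not part of the finite difference code; the path count and seed are arbitrary choices):

```python
import numpy as np

# Benchmark parameters from the text
S0, E, T, r, sigma, rho = 10.0, 10.0, 1.0, 0.1, 0.2, 0.5

rng = np.random.default_rng(0)
n_paths = 400_000
corr = np.full((3, 3), rho) + (1.0 - rho) * np.eye(3)
L = np.linalg.cholesky(corr)
Z = rng.standard_normal((n_paths, 3)) @ L.T          # correlated normals
ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)
payoff = np.maximum(ST.max(axis=1) - E, 0.0)
price = np.exp(-r * T) * payoff.mean()               # close to 2.267
```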
Figure 5: Relative error for the explicit method (left panel) and the implicit method (right panel) as a function of space grid points and time steps.
With 71 space grid points in each of the three directions, the implicit method needs 516 × (71 − 2)³ ≈ 170 Mbytes of memory for data storage. This is the largest problem instance we can solve with MATLAB on our machine with 512 Mbytes of total memory.

Parallel implementation As noted in the previous paragraph, the space complexity of the implicit method limits the size of the problem that can be solved in a serial environment. This is a situation where Krylov methods become even more attractive, as they are also well suited for parallel implementation, allowing larger grid sizes to be tackled. For our application we employed the software PETSc 2.0 (Portable and Extensible Toolkit for Scientific Computing), developed and maintained at the Mathematics and Computer Science Division of the Argonne National Laboratory, see Balay et al. (1998). PETSc is a set of routines designed to solve large-scale computational problems in a parallel environment. The software is written in C and can be linked to C, C++ or Fortran programs. We used a computing platform consisting of a cluster of 32 LINUX PCs. The individual machines are Pentium IIs running at 220 MHz, each equipped with 128 Mbytes of memory. On this cluster we were able to solve with the implicit method a problem with 141 grid points in each of the three directions, corresponding to a linear system with 2 685 619 equations. With 60 time steps the relative error of the computed price is 0.00075. The implicit method using BiCGSTAB needs approximately 1.4 Gbytes of memory for data storage and uses approximately 145 Gflops. Using 29 processors, the solution for this problem was computed in 300 seconds, which corresponds to a computing speed of about 0.5 Gflop/s.
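The figures quoted for the parallel run are mutually consistent, as a quick check shows (a sketch; the per-step BiCGSTAB iteration count k is inferred from the flop formula N|Ω|(42 + 96k) of Section 4, it is not reported directly):

```python
n = 141                        # grid points per direction
omega = (n - 2) ** 3           # number of unknowns
assert omega == 2_685_619      # system size quoted in the text

memory_gb = 516 * omega / 1e9            # storage rule of Section 4
N, flops = 60, 145e9                     # time steps and total work quoted
k = (flops / (N * omega) - 42) / 96      # implied BiCGSTAB iterations per step

print(round(memory_gb, 2), round(k, 1))  # → 1.39 8.9
```

So the reported 1.4 Gbytes and 145 Gflops correspond to roughly nine BiCGSTAB iterations per time step.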
In order to illustrate the fact that sparse direct methods do not constitute an alternative to the nonstationary iterative methods, we compare in Figure 6 the space and time complexity of a sparse LU decomposition with those of BiCGSTAB as a function of the number of grid points in each of the three directions.
[Two panels plot Gflops per time step and Gbytes of memory for sparse LU (◦) and BiCGSTAB (•) against the number of grid points per direction.]
Figure 6: Comparison of computational and space complexity of a sparse LU against BiCGSTAB as a function of the number of grid points in each of the three directions.
It clearly appears that the sparse direct method reaches its limits for a number of grid points below 30, whereas for BiCGSTAB we can easily envisage grid sizes beyond 200 with a more powerful cluster of PCs.
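The divergence of the LU curves in Figure 6 is driven by fill-in in the factors. The effect can be reproduced at small scale with SciPy (a sketch using a seven-point 3D Laplacian as a stand-in for the paper's operator):

```python
import numpy as np
from scipy.sparse import diags, identity, kron
from scipy.sparse.linalg import splu, bicgstab

def laplacian_3d(n):
    """Seven-point Laplacian on an n x n x n interior grid."""
    T = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
    I = identity(n)
    return (kron(kron(T, I), I) + kron(kron(I, T), I)
            + kron(kron(I, I), T)).tocsc()

fill = {}
for n in (8, 16):
    A = laplacian_3d(n)
    lu = splu(A)                                  # sparse direct factorisation
    fill[n] = (lu.L.nnz + lu.U.nnz) / A.nnz       # fill-in ratio of L and U
    x, info = bicgstab(A, np.ones(A.shape[0]), maxiter=2000)
    assert info == 0                              # BiCGSTAB converged

# The LU factors need ever more memory per nonzero of A as the grid grows,
# while BiCGSTAB only ever stores A and a handful of vectors.
```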
6 Conclusion and further research
Based on the computational results obtained in two different computing environments, we conclude that implicit finite difference methods can be efficiently used for the valuation of options on three underlying assets. The availability of nonstationary iterative methods, and the fact that they are well suited for parallelization, makes the solution of very large linear systems possible. In our experiments, the size of the system that we solve in parallel is approximately three million. We expect that, with faster processors and more memory, we can go even further, allowing for a finer space grid (of the order of 200 points in each of the three directions). In the serial case, the maximum grid size that we can handle on a standard PC in a MATLAB environment is about 70 points in each direction. We have used the Krylov methods without preconditioning. It would be worthwhile to study whether preconditioning improves the performance of the algorithm. The influence of the diffusion and velocity tensors on the convergence speed of the Krylov methods could also be the object of further research.
References

Axelsson, O. (1994). Iterative Solution Methods. Oxford University Press, Oxford, UK.

Balay, S., W. D. Gropp, L. Curfman McInnes and B. F. Smith (1998). PETSc 2.0 Users Manual. Technical Report ANL 95/11, Argonne National Laboratory. http://www.mcs.anl.gov/petsc/petsc.html.

Barrett, R., M. Berry, T. F. Chan, J. Demmel, J. Donato, J. Dongarra, V. Eijkhout, R. Pozo, C. Romine and H. van der Vorst (1994). Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods. SIAM, Philadelphia, PA.

Dempster, M. and J. Hutton (1997). Numerical valuation of cross-currency swaps and swaptions. In: Mathematics of Derivative Securities (M. Dempster and S. R. Pliska, Eds.). Cambridge University Press, Cambridge, pp. 473–503.

Ekvall, N. (1994). Experiences in the pricing of trivariate contingent claims with finite difference methods on a massively parallel computer. Computational Economics 7(2), 63–72.

Engelmann, B. and P. Schwendner (1998). The pricing of multi-asset options using a Fourier grid method. Journal of Computational Finance 1(4), 63–72.

Freund, R. W., G. H. Golub and N. M. Nachtigal (1992). Iterative solution of linear systems. Acta Numerica, pp. 57–100.

Gilli, M. and G. Pauletto (1997). Sparse direct methods for model simulation. Journal of Economic Dynamics and Control 21(6), 1093–1111.

Gilli, M. and G. Pauletto (1998). Krylov methods for solving models with forward-looking variables. Journal of Economic Dynamics and Control 22(6), 1275–1289.

Izvorski, I. (1998). A nonuniform grid method for solving PDEs. Journal of Economic Dynamics and Control 22(6), 1445–1452.

Johnson, H. (1987). Options on the maximum or the minimum of several assets. Journal of Financial and Quantitative Analysis 22, 277–283.

Kelley, C. T. (1995). Iterative Methods for Linear and Nonlinear Equations. SIAM, Philadelphia, PA.

Mrkaic, M. (2001). Policy iteration accelerated with Krylov methods. Journal of Economic Dynamics and Control, forthcoming.

Pauletto, G. and M. Gilli (2000). Parallel Krylov methods for econometric model simulation. Computational Economics 16(1–2), 173–186.

Peaceman, D. and H. Rachford (1955). The numerical solution of elliptic and parabolic differential equations. Journal of SIAM 3, 28–41.

Saad, Y. (1996). Iterative Methods for Sparse Linear Systems. PWS Publishing Company, Boston, MA.

van der Vorst, H. A. (1992). Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems. SIAM Journal on Scientific and Statistical Computing 13(2), 631–644.

Zvan, R., P. A. Forsyth and K. R. Vetzal (1998). A general finite element approach for PDE option pricing models. Mimeo.