Methodology and Computing in Applied Probability
Methodol Comput Appl Probab DOI 10.1007/s11009-015-9463-6

Explicit Density Approximations for Local Volatility Models Using Heat Kernel Expansions Stephen Taylor1 · Scott Glasgow2 · James Taylor2 · Jan Vecer3,4

Received: 27 February 2014 / Revised: 27 August 2015 / Accepted: 1 September 2015 © Springer Science+Business Media New York 2015

Abstract Heat kernel perturbation theory is a tool for constructing explicit approximation formulas for the solutions of linear parabolic equations. We review the crux of this perturbative formalism and then apply it to the differential equations that govern the transition densities of several local volatility processes. In particular, we compute all the heat kernel coefficients for the CEV and quadratic local volatility models; in the latter case, we are able to use these to construct an exact explicit formula for the process's transition density. We then derive low order approximation formulas for the cubic local volatility model, an affine-affine short rate model, and a generalized mean reverting CEV model. We finally demonstrate that the approximation formulas are accurate in certain model parameter regimes via comparison to Monte Carlo simulations.

Keywords Heat kernel expansion · Local volatility · CEV model · Short rate models

Stephen Taylor · Scott Glasgow · James Taylor · Jan Vecer

1 Hutchin Hill Capital, New York, NY, USA
2 BYU Department of Mathematics, Provo, UT 84602, USA
3 Vysoka skola aplikovanecho prava, Chomutovicka 1443, 14900 Prague 4, Czech Republic
4 Faculty of Mathematics and Physics, Charles University, Sokolovska 83, 186 75, Prague 8, Czech Republic


Mathematics Subject Classification (2010) 35K08 · 91G20

1 Introduction

A significant portion of the mathematical finance literature addresses the construction of exact pricing formulas for a variety of contingent claims whose underlying assets are assumed to evolve according to a specified stochastic process. In particular, the Black and Scholes (1973) and Merton (1973) pricing formula for a European call option on an asset that undergoes lognormal dynamics has served as a cornerstone for much of the development of modern pricing theory. Although the lognormal model has long been a guide for gaining insight into how derivatives' prices depend on market parameters, it is often too simplistic for practical use and has many drawbacks. Specifically, asset prices generally do not follow lognormal dynamics, and one is typically unable to calibrate this model in a manner that is consistent with market option prices. These issues, amongst others, led Dupire (1994) to consider the more general class of local volatility models. A local volatility model allows an asset process to evolve by a one-dimensional Itô process, where the diffusion coefficient function is chosen so that the model's European call prices agree with current market quotes.

Unfortunately, one cannot explicitly determine the transition density of a general Itô process (which is equivalent to constructing an explicit formula, up to quadrature, for the price of a call option). There are only a handful of local volatility models where it is possible to derive an expression for the transition density given by a composition of elementary and special functions. Constructing such a solution is equivalent to solving a single linear parabolic equation of one spatial and one temporal variable. Generally, as the functional form of the diffusion coefficient becomes more complicated, the prospect of finding an exact solution to the associated transition density equation diminishes.
In lieu of exact density formulas, there are several widely used approximate pricing techniques. For example, tree, Monte Carlo, and finite difference methods are currently the most popular approximation methods used in practice. Alternatively, one can attempt to construct explicit approximation formulas for the transition density of a local volatility process that generically will only be valid in certain model parameter regimes. The advantages of such formulas are that they are considerably more computationally efficient than their previously mentioned counterparts and that, at times, one can determine qualitative model information from their functional forms. The main drawbacks of these approximations are that they can be inaccurate for certain choices of model parameters, and a thorough error analysis against an established approximation method is usually required prior to practical use.

The literature regarding explicit approximation formulas for stochastic processes, which are constructed using different types of perturbation theory, has been growing over the past several years. In particular, heat kernel perturbation theory has been effective in creating some of the most accurate approximation formulas to date. Roughly speaking, heat kernel perturbation theory utilizes a Taylor series expansion ansatz in time together with a geometrically motivated prefactor chosen to simplify subsequent calculations. The heat kernel ansatz is most naturally stated using the language of differential geometry. Hagan et al. (2005) and Lesniewski (2002) were the first to introduce differential geometric methods in finance. Labordère (2008, 2005) used heat kernel methods to construct implied volatility formulas for local volatility models, the SABR model, and the SABR Libor Market Model. More recently, Gatheral et al. (2012) have also considered heat kernel expansions in the context of local volatility models. Extending the work of Henry-Labordère, Paulot (2007) constructed (to our knowledge) the most accurate explicit implied volatility formula for the SABR model. Forde (2011, 2013) and Pagliarani and Pascucci (2012) have made this work rigorous and extended it to a larger class of stochastic volatility models. We refer the reader to Medvedev (2004) for a list of several references to the finance perturbation theory literature.

There are two main advantages of heat kernel methods over other forms of perturbation theory. First, the only error in a formula constructed solely using a heat kernel ansatz is due to a Taylor series assumption in time. In particular, an expression for a transition density constructed using heat kernel perturbation theory will increase in accuracy as time decreases. Consequently, implied volatility smile approximation formulas constructed from a heat kernel expansion tend to be more accurate at out of the money strikes than others constructed with alternative formalisms that are perturbative in both time and strike. Secondly, the heat kernel ansatz solves its associated parabolic equation exactly to zeroth order in time. Thus, in some sense, one does not carry along zeroth order complexities when trying to establish a first-order correction for the transition density of a process.

Heat kernel perturbation theory is not without its limitations. In the case of models with two spatial dimensions, the heat kernel ansatz cannot be expressed explicitly. This is due to the fact that one cannot derive an explicit distance function associated with a generic two dimensional geometry. However, one does not have this issue in one dimension and is able to make progress towards constructing formulas which approximate the transition density to varying degrees of accuracy. Our first aim is to review the formulation of the heat kernel ansatz, construct its explicit one dimensional form, and then use this ansatz to find approximations for the transition density of local volatility processes.
These may be converted into implied volatility approximations using, say, a saddle point approximation method.

The main contributions of this article include a thorough derivation of the heat kernel perturbation theory formalism required to construct transition density approximations for a generic system of stochastic processes, as well as a restriction of these methods to a one-dimensional setting. Secondly, we provide example derivations and benchmark these methods in the case of one-dimensional stochastic processes with known transition densities, including the CEV and quadratic local volatility models. We then consider examples where the methods can be applied to stochastic processes that do not have explicit densities, and finally use Monte Carlo simulations to gauge the accuracy of these methods. The main aim of this article is to provide researchers and practitioners with a new set of analytic tools to derive closed form approximations for transition densities of stochastic processes that can be used in a variety of financial mathematics applications, as well as a numerical framework for testing the accuracy of such formulas. In particular, one may use a transition density approximation to price derivatives whose underlying assets follow the associated SDE by integrating the approximate density against the derivative's payoff function. In addition, all calculations have been implemented in Mathematica, and the authors are willing to share this software on request.

In Section 2, we provide basic conventions, review the necessary background, and review the construction of the heat kernel ansatz. We then state the explicit form of this ansatz in one dimension and illustrate the relative robustness of the method in this setting. In Section 3, we apply this method to several examples, including the CEV model, the quadratic local volatility model, the cubic local volatility model, an affine-affine short rate model, and a generalized CEV model. In the case of the quadratic volatility model, we show how heat kernel methods can sometimes be used to construct exact solutions of transition density equations. In Section 4, we test the approximation formulas against Monte Carlo simulation to gauge their accuracy, and finally, in Section 5 we summarize the results.


2 Background and the Heat Kernel Ansatz

We first review the construction of the pricing equation for a general path independent derivative whose payoff depends on $n$ generic time homogeneous Itô processes. We then review how this equation can be related to a geometric heat equation, which will motivate the heat kernel expansion ansatz; alternative expositions of these topics can be found in Labordère (2008) and Paulot (2007). Finally, we write the ansatz out explicitly in the one dimensional setting.

For $i = 1, \ldots, n$, let $x_t^i \equiv x^i(t)$ be $n$ Itô processes which evolve according to a system of coupled stochastic differential equations (SDEs)

$$dx_t^i = \mu^i(x_t)\,dt + \sum_{j=1}^{n} \sigma_j^i(x_t)\,dW^j, \qquad x^i(0) = x_0^i, \tag{1}$$

where $j = 1, \ldots, n$, $x_t = \{x_t^1, \ldots, x_t^n\}$ denotes a function's dependence on potentially all the $x_t^i$, the $W^i$ are $n$ Brownian motions with covariance matrix $\rho^{ij}$, i.e. $E[dW^i dW^j] = \rho^{ij}\,dt$, and $\mu^i$, $\sigma_j^i$ are suitably regular functions. We assume these dynamics are risk-neutral and that $E$ is the expectation operator with respect to the risk neutral measure. We will also sometimes write $x^i = x_t^i$ for short when no contextual conflicts are present. The coefficient functions of this SDE system do not have explicit time dependence; we will always assume that this holds.

Let $F(x_t, t)$ be a pricing function for a contingent claim with a path independent payoff function $F(x_T, T)$ for some fixed $T > t > 0$. Then Itô's lemma requires that the martingale discounted price process $e^{r\tau}F$, where $\tau = T - t$, evolves according to

$$F_t - rF + \sum_{i=1}^{n} \mu^i \frac{\partial F}{\partial x^i} + \frac{1}{2}\sum_{i,j=1}^{n} \Sigma^{ij} \frac{\partial^2 F}{\partial x^i \partial x^j} = 0, \tag{2}$$

where we let $\Sigma^{ij} \equiv \sum_{k,l=1}^{n} \sigma_k^i \sigma_l^j \rho^{kl}$ be a positive definite volatility matrix. We will only consider the class of path independent payoff functions. Given the values of the underlying processes $x_t$ at time $t$ and a filtration $\mathcal{F}_t$, we can represent $F$ according to

$$F(x_t) = E\left[e^{-r\tau} F(x_T) \,\middle|\, \mathcal{F}_t\right] = e^{-r\tau}\int_{\mathbb{R}^n} F(x_T)\,\phi(T, x_T \,|\, t, x_t)\,dx_T, \tag{3}$$

where $\phi(T, x_T \,|\, t, x_t)$ is the joint transition density of the underlying processes $x^i$, i.e. it represents the probability density that the processes evolve to $x = x_T$ at time $T$ given that the $x^i$ had values $x_t$ at time $t$. Substituting this form for $F(x_t)$ into Eq. 2, we find that $\phi$ must satisfy

$$\frac{\partial \phi}{\partial \tau} = \sum_{i=1}^{n} \mu^i \frac{\partial \phi}{\partial x_t^i} + \frac{1}{2}\sum_{i,j=1}^{n} \Sigma^{ij} \frac{\partial^2 \phi}{\partial x_t^i \partial x_t^j}, \qquad \phi(0, x, x_t) = \delta(x - x_t). \tag{4}$$

We seek to approximate solutions to equations of this form using heat kernel perturbation theory. In particular, we make an ansatz for $\phi$ which solves this equation exactly to zeroth order, in the hope that this simplifies the subsequent perturbative computations of higher order correction terms. There are many references for this construction (see e.g. Avramidi (2007), Labordère (2008), and Paulot (2007)).

In order to motivate the heat kernel ansatz, we first review a correspondence between elliptic operators on $\mathbb{R}^n$ and connections on line bundles over a Riemannian manifold $(M, g)$; here $M$ is a $C^\infty$ manifold (which can roughly be interpreted as a smooth subset of a Euclidean space) and $g$ is a smooth set of symmetric positive definite matrices, indexed by points in $M$, which contains all necessary information related to computing distances on $M$. Consider an elliptic operator

$$L = \frac{1}{2}\sum_{i,j=1}^{n} \Sigma^{ij}\,\partial_i \partial_j + \sum_{i=1}^{n} \mu^i\,\partial_i \tag{5}$$

on $\mathbb{R}^n$. We can represent $L$ by an equivalent operator of the form

$$L = \Delta^A - Q = \sum_{i,j=1}^{n} g^{ij}\,\nabla_i^A \nabla_j^A - Q \tag{6}$$

on a line bundle $\mathcal{L}$ over $M$, where $\nabla_i^A$ is a connection which can be decomposed as $\nabla_i^A = \nabla_i + A_i$, where $\nabla_i$ is the Levi-Civita connection associated to $g$ and $A_i$ are the components of a real-valued section of the cotangent bundle of $M$, i.e. $A \in \Gamma(T^*M)$, and $Q$ is a section of $\mathrm{End}\,\mathcal{L} \approx \mathcal{L} \otimes \mathcal{L}^*$ (cf. Avramidi (2007)). All of our analysis will be local, and we will only require the fact that $\nabla_i^A$ acts on a function $p : M \to \mathbb{R}$ according to

$$\nabla_i^A p = (\nabla_i + A_i)\,p = \nabla_i p + A_i p, \tag{7}$$

and on the components $v_j$ of a covector $v \in T^*M$ like

$$\nabla_i^A v_j = \partial_i v_j - \sum_{k=1}^{n} \Gamma_{ij}^k v_k + A_i v_j, \qquad \Gamma_{ij}^k = \frac{1}{2}\sum_{m=1}^{n} g^{km}\left(\partial_i g_{mj} + \partial_j g_{im} - \partial_m g_{ij}\right), \tag{8}$$

where the $\Gamma_{ij}^k$ are the Christoffel symbols associated with $g$. In particular, in local coordinates, we can compute

\begin{align}
Lp &= \sum_{i,j=1}^{n} g^{ij}(\nabla_i + A_i)(\nabla_j + A_j)\,p - Qp = \sum_{i,j=1}^{n} g^{ij}(\nabla_i + A_i)(\partial_j p + A_j p) - Qp \tag{9} \\
&= \sum_{i,j=1}^{n} g^{ij}\left[\partial_i \partial_j p + 2A_j\,\partial_i p - \sum_{k=1}^{n} \Gamma_{ij}^k\,\partial_k p + \left(\partial_i A_j - \sum_{k=1}^{n} \Gamma_{ij}^k A_k + A_i A_j\right) p\right] - Qp. \tag{10}
\end{align}

We now can identify the diffusion and advection terms of this expression with those of Eq. 5 to find that

$$\frac{1}{2}\Sigma^{ij} = g^{ij}, \qquad \mu^i = 2\sum_{j=1}^{n} g^{ij} A_j - \sum_{j,k=1}^{n} g^{jk}\,\Gamma_{jk}^i, \tag{11}$$

$$0 = \sum_{i,j=1}^{n} g^{ij}\left(\partial_i A_j - \sum_{k=1}^{n} \Gamma_{ij}^k A_k + A_i A_j\right) - Q. \tag{12}$$

Through these equations, we can express an elliptic operator either by a choice of $(\Sigma^{ij}, \mu^i)$ or, equivalently, by specifying a triple $(g^{ij}, A_i, Q)$. Specifically, they can be inverted in order to write the geometric quantities in terms of the financial ones,

$$g^{ij} = \frac{1}{2}\Sigma^{ij}, \qquad A_k = \frac{1}{2}\left[\sum_{i=1}^{n} g_{ik}\,\mu^i + \sum_{i,j,m=1}^{n} g_{ik}\,g^{jm}\,\Gamma_{jm}^i\right], \tag{13}$$

$$Q = \sum_{i,j=1}^{n} g^{ij}\left(\partial_i A_j - \sum_{k=1}^{n} \Gamma_{ij}^k A_k + A_i A_j\right). \tag{14}$$


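We now give two operator identities that will prove to be useful later. The first concerns the $A$-Laplacian $\Delta^A$ defined by

\begin{align}
\Delta^A p &\equiv \sum_{i,j=1}^{n} g^{ij}(\nabla_i + A_i)(\nabla_j + A_j)\,p \tag{15} \\
&= \sum_{i,j=1}^{n} g^{ij}\left(\nabla_i \nabla_j p + \nabla_i(A_j p) + A_i \nabla_j p + A_i A_j p\right) \tag{16} \\
&= \Delta_g p + \sum_{i,j=1}^{n} g^{ij}\left[A_j\,\partial_i p + p\left(\partial_i A_j - \sum_{k=1}^{n} \Gamma_{ij}^k A_k\right) + A_i\,\partial_j p + A_i A_j p\right] \tag{17} \\
&= \Delta_g p + 2\sum_{i,j=1}^{n} g^{ij} A_i\,\partial_j p + \sum_{i,j=1}^{n} g^{ij}\left(\partial_i A_j - \sum_{k=1}^{n} \Gamma_{ij}^k A_k + A_i A_j\right) p \tag{18} \\
&= \Delta_g p + 2\sum_{i,j=1}^{n} g^{ij} A_i\,\partial_j p + Qp, \tag{19}
\end{align}

which can be expressed more concisely as

$$(\Delta^A - Q)\,p = \Delta_g p + 2\sum_{i,j=1}^{n} g^{ij} A_i\,\partial_j p. \tag{20}$$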

The second identity involves the Levi-Civita Laplacian (Laplace-Beltrami operator) $\Delta_g$ and is expressed in local coordinates by

$$\Delta_g p = \frac{1}{\sqrt{g}}\sum_{i,j=1}^{n} \partial_i\left(\sqrt{g}\,g^{ij}\,\partial_j p\right) = \sum_{i,j=1}^{n} g^{ij}\left(\partial_i \partial_j p - \sum_{k=1}^{n} \Gamma_{ij}^k\,\partial_k p\right), \tag{21}$$

where $g$ (under the square root) is the determinant of the metric.

With the above tools at hand, we now can construct the heat kernel ansatz. First, consider a general second-order elliptic differential operator on $\mathbb{R}^n$ expressed in terms of $\nabla_i^A$ given by

$$L\phi = \sum_{i,j=1}^{n} g^{ij}\,\nabla_i^A \nabla_j^A \phi - Q\phi = \frac{1}{\sqrt{g}}\sum_{i,j=1}^{n} (\partial_i + A_i)\left(\sqrt{g}\,g^{ij}(\partial_j + A_j)\,\phi\right) - Q\phi. \tag{22}$$

The heat equation $\partial_\tau \phi = L\phi$ can be solved exactly to zeroth order in $\tau$. We assume that $\phi$ is given by this zeroth order solution multiplied by an arbitrary function $\Omega$, where we must have $\Omega(0, x, x') = 1$ for consistency. The resulting expression for $\phi$ is called the heat kernel expansion and, for $x, x' \in \mathbb{R}^n$, is given by

$$\phi(\tau, x, x') = \frac{\sqrt{g(x')}}{(4\pi\tau)^{n/2}}\,\Delta^{1/2}(x, x')\,P(x', x)\,\exp\left(-\frac{d^2(x, x')}{4\tau}\right)\Omega(\tau, x, x'), \tag{23}$$

where $d(x, x')$ is the distance function from $x$ to $x'$ associated with the metric $g$, and

$$P(x', x) = \exp\left(-\int_C A\right) = \exp\left(-\int_{x'}^{x} \sum_{i=1}^{n} A_i\,dx^i\right). \tag{24}$$


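Here $C$ is a minimizing oriented geodesic from $x' = x(0)$ to $x = x(t)$ parametrized by arclength. In the one dimensional setting, such a geodesic always exists, although this issue is more subtle in higher dimensions (cf. Forde (2011)). Finally,

$$\Delta(x', x) = \frac{1}{\sqrt{g(x)\,g(x')}}\,\det\left(-\frac{\partial^2}{\partial x^i\,\partial x'^j}\,\frac{d^2(x, x')}{2}\right) \tag{25}$$

is known as the van Vleck-Morette determinant. If we substitute this ansatz into the heat equation $\partial_\tau \phi = L\phi$, then after simplification, we find that $\Omega$ must satisfy

$$\left(\frac{\partial}{\partial \tau} + \frac{1}{\tau}\sum_{i=1}^{n} (\nabla^i \sigma)\nabla_i - P^{-1}\Delta^{-1/2} L\,\Delta^{1/2} P\right)\Omega(\tau, x, x') = 0, \tag{26}$$

with initial condition $\Omega(0, x, x') = 1$, where $\sigma(x, x') = d(x, x')^2/2$. Now assume that $\Omega$ is given by a formal power series in $\tau$,

$$\Omega(\tau, x, x') = \sum_{k=0}^{\infty} a_k(x', x)\,\tau^k. \tag{27}$$

Substituting this series into Eq. 26, we find

\begin{align}
0 &= \sum_{k=0}^{\infty} k\,a_k \tau^{k-1} + \sum_{i=1}^{n}\sum_{k=0}^{\infty} \left[(\nabla^i \sigma)\nabla_i a_k\right]\tau^{k-1} - \sum_{k=0}^{\infty} P^{-1}\Delta^{-1/2} L\,\Delta^{1/2} P\,a_k\,\tau^{k} \notag \\
&= \sum_{k=1}^{\infty} k\,a_k \tau^{k-1} + \sum_{i=1}^{n}\sum_{k=0}^{\infty} \left[(\nabla^i \sigma)\nabla_i a_k\right]\tau^{k-1} - \sum_{k=1}^{\infty} P^{-1}\Delta^{-1/2} L\,\Delta^{1/2} P\,a_{k-1}\,\tau^{k-1} \notag \\
&= \sum_{i=1}^{n} (\nabla^i \sigma)(\nabla_i a_0)\,\tau^{-1} + \sum_{k=1}^{\infty}\left[k\,a_k + \sum_{i=1}^{n} (\nabla^i \sigma)\nabla_i a_k - P^{-1}\Delta^{-1/2} L\,\Delta^{1/2} P\,a_{k-1}\right]\tau^{k-1}. \tag{28}
\end{align}

Now the coefficients of the different powers of $\tau$ must vanish identically. The initial condition $\Omega(0, x, x') = 1$ together with $\sum_i (\nabla^i \sigma)\nabla_i a_0 = 0$ requires that $a_0 = 1$. We find that the rest of the $a_k$ are given by a recursive hierarchy of differential equations

$$k\,a_k + d\,\partial_d a_k - P^{-1}\Delta^{-1/2} L\,\Delta^{1/2} P\,a_{k-1} = 0, \tag{29}$$

where we use $\nabla^i \sigma = d\,\nabla^i d$ and define a directional derivative by $\sum_{i=1}^{n} (\nabla^i \sigma)\nabla_i = d\,\nabla_{\nabla d} \equiv d\,\partial_d$. One can integrate this system to find an iterative formula for the $a_k$,

$$a_k(x', x) = \frac{1}{d^k}\int_C d^{k-1}\,P^{-1}(x', x)\,\Delta^{-1/2} L\,\Delta^{1/2} P(x', x)\,a_{k-1}\,ds, \qquad k \geq 1. \tag{30}$$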
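The integration step between Eqs. 29 and 30 can be made explicit. Treating $d$ as the arclength coordinate along the geodesic $C$, and writing $f_{k-1} \equiv P^{-1}\Delta^{-1/2} L\,\Delta^{1/2} P\,a_{k-1}$ (a shorthand introduced only for this sketch), Eq. 29 is a first-order linear equation in $d$ with integrating factor $d^{k-1}$:

```latex
\begin{align*}
k\,a_k + d\,\partial_d a_k = f_{k-1}
\;&\Longrightarrow\;
\partial_d\!\left(d^{k} a_k\right) = d^{k-1} f_{k-1} \\
&\Longrightarrow\;
a_k = \frac{1}{d^{k}} \int_0^{d} s^{k-1} f_{k-1}(s)\, ds .
\end{align*}
```

The homogeneous solution $c/d^k$ is discarded because it is singular as $d \to 0$, which fixes the lower limit of integration at zero.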

The goal of heat kernel perturbation theory is to evaluate or approximate the $a_k$ integrals (usually by the tractable diagonal coefficients $a_k(x, x)$) in such a manner that the resulting explicit form for $\phi$ approximates the true solution of Eq. 4 to a high degree of accuracy for a desired domain of model parameters. Computing the $a_k$ exactly is generally only possible in the simplest geometries $(M, g)$ for dimensions $n \geq 2$. However, when $n = 1$, these reduce to integrals over $\mathbb{R}$ and are calculable in a wide variety of models.

We now restrict our attention to the case of one dimensional models. Specifically, we consider a driftless local volatility model of the form

$$dS_t = C(S_t)\,dW_t, \qquad S(0) = S_0, \tag{31}$$

where we take $C : \mathbb{R}^+ \to \mathbb{R}$ to be at least a $C^2$ function. The transition density equation for this model is given by

$$\phi_\tau = \frac{1}{2}C(\alpha)^2\,\phi_{\alpha\alpha}. \tag{32}$$

Applying the PDE/geometry correspondence, note that the single component of the inverse metric is given by $g^{\alpha\alpha} = C(\alpha)^2/2$. Thus the metric and the square root of its determinant are just $g_{\alpha\alpha} = 2/C(\alpha)^2$ and $\sqrt{g(\alpha)} = \sqrt{2}/C(\alpha)$. Using this, we can compute the single Christoffel symbol

$$\Gamma_{\alpha\alpha}^{\alpha} = \frac{1}{2}g^{\alpha\alpha}\,\partial_\alpha g_{\alpha\alpha} = -\frac{1}{C(\alpha)}\frac{\partial C(\alpha)}{\partial \alpha}, \tag{33}$$

which is just the component of the one form $-d\ln C(\alpha)$. Next, we find $A_\alpha = \frac{1}{2}g_{\alpha\alpha}\,g^{\alpha\alpha}\,\Gamma_{\alpha\alpha}^{\alpha} = \frac{1}{2}\Gamma_{\alpha\alpha}^{\alpha}$ and

$$P(\alpha, S) = \exp\left(-\int_\alpha^S A_\alpha\,d\alpha\right) = \exp\left(\frac{1}{2}\int_\alpha^S d\ln C(\alpha)\right). \tag{34}$$

Evaluating the integral, we find that $P(\alpha, S) = \sqrt{C(S)/C(\alpha)}$, which in turn implies $P(S, \alpha) = \sqrt{C(\alpha)/C(S)}$. We can further compute

$$P^{-1}\Delta_g P + 2P^{-1}g^{\alpha\alpha}A_\alpha\,\partial_\alpha P = \frac{1}{8}\left(2CC'' - (C')^2\right). \tag{35}$$

Next, define a coordinate $s = \sqrt{2}\int_\alpha^S \frac{du}{C(u)}$ which parameterizes geodesics on $(\mathbb{R}, g)$ by arclength. Changing coordinates allows us to see that

$$g_{\alpha\alpha} = \left(\frac{\partial s}{\partial \alpha}\right)^2 g_{ss} = \frac{2}{C(\alpha)^2}\,g_{ss}, \tag{36}$$

from which we note that the line element is given by

$$ds^2 = g_{\alpha\alpha}\,d\alpha^2, \tag{37}$$

i.e. in the $s$ coordinate $g$ is just the standard Euclidean metric on $\mathbb{R}$. In particular, this implies that $\Delta(S, \alpha) = 1$, which considerably simplifies the computation of the $a_k$. We now summarize the heat kernel ansatz in one dimension in a form that is expressed solely in terms of operators and functions on $\mathbb{R}$, namely,

$$\phi(\tau, \alpha, S) = \sqrt{\frac{C(\alpha)}{2\pi\tau\,C(S)^3}}\,\exp\left(-\frac{d^2(S, \alpha)}{4\tau}\right)\sum_{k=0}^{\infty} a_k(S, \alpha)\,\tau^k, \tag{38}$$

where the distance function $d$ is

$$d(S, \alpha) = \sqrt{2}\int_\alpha^S \frac{du}{C(u)}. \tag{39}$$

Actually, the true distance function associated to $g$ is given by taking the absolute value of the above; however, we omit this since $C(u)$ will be a positive function because it corresponds to the volatility of the stochastic process in Eq. 31. The $a_k$ are given by the integrals

$$a_k(S, \alpha) = -\frac{1}{d^k}\int_S^\alpha d^{k-1}\Big[P(S, \alpha)^{-1}\,\partial_\alpha\Big(\sqrt{g(\alpha)}\,g^{\alpha\alpha}\,\partial_\alpha\big(P(S, \alpha)\,a_{k-1}(S, \alpha)\big)\Big) + 2P(S, \alpha)^{-1}\sqrt{g(\alpha)}\,g^{\alpha\alpha}A_\alpha\,\partial_\alpha\big(P(S, \alpha)\,a_{k-1}(S, \alpha)\big)\Big]\,d\alpha. \tag{40}$$
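As a rough illustration of the one-dimensional ansatz, the sketch below evaluates the distance of Eq. 39 by quadrature and the zeroth-order term of Eq. 38 (keeping only $a_0 = 1$) for a user-supplied volatility function. The function names and the Simpson-rule helper are assumptions made for this example, not part of the original work. As a sanity check it uses the lognormal case $C(u) = \sigma u$, where Eq. 69 with $a = c = 0$, $b = \sigma$ suggests the remaining series sums to $\exp(-\sigma^2\tau/8)$.

```python
import math

def simpson(f, lo, hi, n=400):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def distance(S, alpha, C):
    """Signed distance of Eq. 39: sqrt(2) * integral from alpha to S of du / C(u)."""
    return math.sqrt(2.0) * simpson(lambda u: 1.0 / C(u), alpha, S)

def phi0(tau, alpha, S, C):
    """Zeroth-order term of Eq. 38, i.e. the prefactor with a_0 = 1 only."""
    d = distance(S, alpha, C)
    prefactor = math.sqrt(C(alpha) / (2.0 * math.pi * tau * C(S) ** 3))
    return prefactor * math.exp(-d * d / (4.0 * tau))

def lognormal_density(tau, alpha, S, sigma):
    """Exact driftless lognormal transition density, used only as a reference."""
    m = math.log(S / alpha) + 0.5 * sigma ** 2 * tau
    norm = S * sigma * math.sqrt(2.0 * math.pi * tau)
    return math.exp(-m * m / (2.0 * sigma ** 2 * tau)) / norm

# Hypothetical parameter values chosen only for the demonstration.
sigma, tau, alpha, S = 0.3, 1.0, 1.0, 1.2
approx = phi0(tau, alpha, S, lambda u: sigma * u) * math.exp(-sigma ** 2 * tau / 8.0)
print(approx, lognormal_density(tau, alpha, S, sigma))
```

Passing the volatility as a callable keeps the helper model-agnostic, so the same two functions can be reused for the CEV, quadratic, or cubic volatility functions considered in Section 3.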


We can represent the first heat kernel coefficient in the following convenient way,

$$a_1(S, \alpha) = -\frac{\sqrt{2}}{4d}\int_S^\alpha \left(C''(u) - \frac{(C'(u))^2}{2C(u)}\right)du. \tag{41}$$

Assuming one can evaluate this integral (which can be done for a wide range of functions $C$), one can insert the result into the $a_k$ formula and attempt to compute $a_2$. Although one can typically compute the $a_1$ integral exactly, computing the higher $a_k$ is a potentially difficult task. If we are able to compute $a_k$ for a given local volatility model, we will say that we have constructed a $k$-th order approximation formula. We now turn to several examples, starting with the CEV model.
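When the integral in Eq. 41 is not available in closed form, it can be approximated numerically. The following is a minimal sketch, assuming the caller supplies $C$, $C'$, and $C''$ as Python callables (the function names are hypothetical). It is checked against the closed form $a_1 = \beta(\beta - 2)\sigma^2 (S\alpha)^{\beta - 1}/8$ that Section 3.1 derives for the CEV volatility $C(u) = \sigma u^\beta$.

```python
import math

def simpson(f, lo, hi, n=400):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def a1_numeric(S, alpha, C, dC, d2C):
    """First heat kernel coefficient from Eq. 41 by quadrature (requires S != alpha)."""
    d = math.sqrt(2.0) * simpson(lambda u: 1.0 / C(u), alpha, S)  # Eq. 39
    integrand = lambda u: d2C(u) - dC(u) ** 2 / (2.0 * C(u))
    return -math.sqrt(2.0) / (4.0 * d) * simpson(integrand, S, alpha)

# CEV volatility and its first two derivatives (illustrative parameter values).
sigma, beta = 0.3, 0.6
C = lambda u: sigma * u ** beta
dC = lambda u: sigma * beta * u ** (beta - 1.0)
d2C = lambda u: sigma * beta * (beta - 1.0) * u ** (beta - 2.0)

S, alpha = 1.5, 1.0
numeric = a1_numeric(S, alpha, C, dC, d2C)
closed = beta * (beta - 2.0) * sigma ** 2 / 8.0 * (S * alpha) ** (beta - 1.0)
print(numeric, closed)
```

The quadrature version is mainly useful for volatility functions, such as the cubic model, where the closed-form integral becomes unwieldy.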

3 Examples

3.1 The CEV Model

We first consider the Constant Elasticity of Variance (CEV) model. Here, the asset dynamics are given by

$$dS_t = \sigma S_t^\beta\,dW_t, \qquad S(0) = S_0, \tag{42}$$

where we fix $\beta \in (0, 1)$ and $\sigma > 0$; here we restrict to the case where $\beta < 1$ to ensure that the CEV process is a martingale. This model can be thought of as the natural interpolation between the normal Bachelier ($\beta \to 0$) and lognormal Black-Scholes-Merton ($\beta \to 1$) models. We note that when $\beta = 1$, one can compute the $a_k$ exactly and is in fact able to invert the heat kernel expansion series and recover the exact well known transition density for a lognormal process.

The solutions of the CEV SDE fall into two classes depending on whether $\beta \in (0, 1/2)$ or $\beta \in [1/2, 1)$. In the former case, it was shown by Feller (1951) that the level $S = 0$ is attainable and one needs to specify whether this boundary is absorbing (meaning that the process remains trivial after hitting the zero level) or reflecting. If $\beta \in [1/2, 1)$, then the boundary is always absorbing. Also, if $\beta \geq 1$, the zero level is not attainable, so there is no need to consider spatial boundary conditions (see Brecher and Lindsay (2010) for a survey of the CEV process).

We now compute the geometric quantities that are relevant to this model. First, we note the volatility function is given by $\Sigma^{\alpha\alpha} = \sigma^2\alpha^{2\beta}$, where $\alpha = S_0$, and $\mu^\alpha = 0$, so the transition density PDE takes the form

$$\phi_\tau = \frac{1}{2}\sigma^2\alpha^{2\beta}\,\phi_{\alpha\alpha}, \qquad \phi(0, \alpha, S) = \delta(S - \alpha). \tag{43}$$

This PDE has an exact solution given by

$$\phi(\tau, S, \alpha) = \frac{\sqrt{\alpha}\,S^{\frac{1}{2} - 2\beta}}{(1 - \beta)\sigma^2\tau}\,\exp\left(-\frac{S^{2(1-\beta)} + \alpha^{2(1-\beta)}}{2(1 - \beta)^2\sigma^2\tau}\right) I_{\frac{1}{2(1-\beta)}}\left(\frac{(S\alpha)^{1-\beta}}{(1 - \beta)^2\sigma^2\tau}\right), \tag{44}$$

where $I_\nu(x)$ is the modified Bessel function of the first kind, which is defined by

$$I_\nu(x) = \left(\frac{x}{2}\right)^\nu \sum_{k=0}^{\infty} \frac{(x^2/4)^k}{k!\,\Gamma(\nu + k + 1)}, \tag{45}$$

where $\Gamma(x)$ is the standard Gamma function

$$\Gamma(x) = \int_{\mathbb{R}^+} u^{x-1}e^{-u}\,du. \tag{46}$$
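The exact density can be evaluated directly from Eqs. 44 and 45. The sketch below is one possible implementation using only the Python standard library; the function names, the series truncation rule, and the change of variables $S = y^{1/(1-\beta)}$ used for the quadrature are choices made for this example. Integrating the density over $S > 0$ shows how much probability mass remains away from the absorbing boundary.

```python
import math

def bessel_i(nu, x, tol=1e-16, max_terms=1000):
    """Modified Bessel function I_nu(x), x > 0, summed from the series in Eq. 45."""
    term = (0.5 * x) ** nu / math.gamma(nu + 1.0)  # k = 0 term
    total = term
    for k in range(max_terms):
        term *= 0.25 * x * x / ((k + 1.0) * (nu + k + 1.0))
        total += term
        if term < tol * total:
            break
    return total

def cev_density(tau, alpha, S, sigma, beta):
    """Exact CEV transition density of Eq. 44 for 0 < beta < 1 and S > 0."""
    q = (1.0 - beta) ** 2 * sigma ** 2 * tau
    nu = 1.0 / (2.0 * (1.0 - beta))
    pref = math.sqrt(alpha) * S ** (0.5 - 2.0 * beta) / ((1.0 - beta) * sigma ** 2 * tau)
    expo = math.exp(-(S ** (2.0 - 2.0 * beta) + alpha ** (2.0 - 2.0 * beta)) / (2.0 * q))
    return pref * expo * bessel_i(nu, (S * alpha) ** (1.0 - beta) / q)

def mass_on_positive_axis(tau, alpha, sigma, beta, n=4000):
    """Simpson estimate of the integral of the density over S > 0.

    Uses S = y**(1/(1-beta)), which removes the integrable singularity at S = 0.
    """
    p = 1.0 / (1.0 - beta)
    q = (1.0 - beta) ** 2 * sigma ** 2 * tau
    y_max = alpha ** (1.0 - beta) + 12.0 * math.sqrt(q)  # far into the Gaussian tail

    def integrand(y):
        if y == 0.0:
            return 0.0  # the transformed integrand vanishes linearly at y = 0
        return cev_density(tau, alpha, y ** p, sigma, beta) * p * y ** (p - 1.0)

    h = y_max / n
    total = integrand(0.0) + integrand(y_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

print(mass_on_positive_axis(10.0, 1.0, 0.3, 0.6))
print(mass_on_positive_axis(10.0, 1.0, 0.2, 0.6))
```

For $\alpha = 1$, $\tau = 10$, $\beta = 0.6$, the text reports total masses of about 0.950384 for $\sigma = 0.3$ and 0.999232 for $\sigma = 0.2$, so the two printed values should land close to those figures. The power series is adequate for the moderate Bessel arguments arising here; for very small $\tau$ a scaled Bessel routine from a numerical library would be a safer choice.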


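One can verify that $\phi$ is the true transition density of a CEV process directly by substituting $\phi$ into Eq. 43 and using the modified Bessel function identity

$$\frac{\partial I_\nu(x)}{\partial x} = \frac{1}{2}\left(I_{\nu-1}(x) + I_{\nu+1}(x)\right), \tag{47}$$

which can be iterated to compute the required second derivative formula.

Note that, depending on $\beta$, we may have $\int \phi\,dS < 1$ due to the potential absorbing boundary condition, i.e. probability mass is lost. In such a case, the real transition probability density can be constructed by adding an appropriately normalized delta function centered about the origin to $\phi$. However, when the volatility $\sigma$ is small and the initial asset price $\alpha > 0$, the probability that any price path hits the zero level is also small and, as a result, $\int_{\mathbb{R}^+}\phi \approx 1$. For instance, when $\alpha = 1$, $t = 10$, $\sigma = 0.3$, and $\beta = 0.6$, we can numerically integrate the exact density function to find $\int_{\mathbb{R}^+}\phi \approx 0.950384$; when we keep all the same parameters but lower the volatility to $\sigma = 0.2$, we find $\int_{\mathbb{R}^+}\phi \approx 0.999232$.

Now the inverse metric is given by $g^{\alpha\alpha} = \sigma^2\alpha^{2\beta}/2$; thus $g_{\alpha\alpha} = 2/(\sigma^2\alpha^{2\beta})$ and $\sqrt{g} = \sqrt{2}/(\sigma\alpha^\beta)$. The lone Christoffel symbol is $\Gamma_{\alpha\alpha}^{\alpha} = -\beta/\alpha$, with associated connection $A_\alpha = -\beta/(2\alpha)$. Next, we note that

$$Q = g^{\alpha\alpha}\left(\partial_\alpha A_\alpha - \Gamma_{\alpha\alpha}^{\alpha}A_\alpha + A_\alpha^2\right) = -\frac{\sigma^2}{8}\beta(\beta - 2)\,\alpha^{2\beta - 2}. \tag{48}$$

Using this, we find that

$$P(\alpha, S) = \exp\left(-\int_\alpha^S A_\alpha\,d\alpha\right) = \exp\left(\frac{\beta}{2}\int_\alpha^S \frac{1}{\alpha}\,d\alpha\right) = \left(\frac{S}{\alpha}\right)^{\beta/2}, \tag{49}$$

whose reciprocal is $P(S, \alpha) = (\alpha/S)^{\beta/2}$. From this, we can calculate

$$P^{-1}(S, \alpha)\,\Delta_g P(S, \alpha) = \frac{\sigma^2\beta(3\beta - 2)}{8}\,\alpha^{2(\beta - 1)}, \tag{50}$$

$$2P^{-1}g^{\alpha\alpha}A_\alpha\,\partial_\alpha P = -\frac{\sigma^2\beta^2}{4}\,\alpha^{2(\beta - 1)}, \tag{51}$$

so that the sum is given by

$$P^{-1}\Delta_g P + 2P^{-1}g^{\alpha\alpha}A_\alpha\,\partial_\alpha P = \frac{\sigma^2\beta(\beta - 2)}{8}\,\alpha^{2(\beta - 1)}. \tag{52}$$

In order to use this to construct $a_1(S, \alpha)$, we need the arclength coordinate

$$s(\alpha) = \sqrt{2}\int_\alpha^S \frac{du}{\sigma u^\beta} = \frac{\sqrt{2}}{\sigma(1 - \beta)}\left(S^{1-\beta} - \alpha^{1-\beta}\right). \tag{53}$$

Note that $\partial s/\partial \alpha = -\sqrt{2}/(\sigma\alpha^\beta)$, so that $g_{ss} = (\partial\alpha/\partial s)^2 g_{\alpha\alpha} = 1$. Also note that

$$ds = \frac{\partial s}{\partial \alpha}\,d\alpha = -\frac{\sqrt{2}}{\sigma\alpha^\beta}\,d\alpha. \tag{54}$$

With these formulas at hand, we can calculate

\begin{align}
a_1(S, \alpha) &= \frac{\beta(\beta - 2)\sigma^2}{8d}\int_0^d \alpha^{2(\beta - 1)}\,ds \tag{55} \\
&= -\frac{\sqrt{2}\,\beta\sigma(\beta - 2)}{8d}\int_S^\alpha \alpha^{\beta - 2}\,d\alpha = \frac{\beta(\beta - 2)\sigma^2}{8}(S\alpha)^{\beta - 1}. \tag{56}
\end{align}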


Next, using Eq. 40 we can compute

$$a_2 = \frac{\sigma^4\beta(\beta - 2)(3\beta - 4)(3\beta - 2)}{128}(S\alpha)^{2\beta - 2}. \tag{57}$$

Moreover, we can keep calculating to find that the $a_k$ are determined recursively by

$$\frac{a_{k+1}}{a_k} = \frac{\sigma^2}{8(k + 1)}\left[(2k + 1)\beta - (2k + 2)\right]\left[(2k + 1)\beta - 2k\right](S\alpha)^{\beta - 1}. \tag{58}$$

Thus the resulting CEV transition density is given by

\begin{align}
\phi(\tau, \alpha, S) &= \frac{\sqrt{g(S)}}{\sqrt{4\pi\tau}}\,P(S, \alpha)\,\exp\left(-\frac{d(\alpha, S)^2}{4\tau}\right)\sum_k a_k\tau^k \tag{59} \\
&= \frac{1}{\sigma S^\beta\sqrt{2\pi\tau}}\left(\frac{\alpha}{S}\right)^{\beta/2}\exp\left(-\frac{(S^{1-\beta} - \alpha^{1-\beta})^2}{2\sigma^2(1 - \beta)^2\tau}\right)\sum_{k=0}^{\infty} a_k\tau^k. \tag{60}
\end{align}

We now comment on the non-convergence of this series. Note that if we apply the ratio test, we find

$$\lim_{k \to \infty}\left|\frac{a_{k+1}}{a_k}\right| = \infty, \tag{61}$$

except in the BSM case where $\beta = 1$. Hence the series will always diverge for $\beta \neq 1$. One might expect this fact to be grounds for rejecting these approximation formulas. However, for $\beta \approx 1$, they approximate the true density quite well when the series is truncated after the first few $a_k$, as we will demonstrate in a few examples. If one takes $k$ large, on the other hand, this approximation becomes increasingly inaccurate.

We now let $\phi_n$ denote the $n$-th partial sum of Eq. 60 and will look at two examples which compare approximations of the transition density against the exact solution. We will also compare the approximations to one given in Labordère (2005) in the CEV case which approximates $a_1$ and $a_2$ by their diagonal elements; we denote this second-order approximation by $\phi_{HL}$. First, we set the model parameters to $\alpha = 1$, $T = 10$, $\sigma = 0.3$, and $\beta = 0.6$. In Fig. 1, we plot the exact transition density in red, the approximation $\phi_{HL}$ in orange, our zeroth order correction in green, and first-order correction in light blue. We note that all the graphs are virtually indistinguishable for $S > 0.5$. For small values, e.g. $S \in [0.1, 0.5]$, the first and second-order approximations are closer to the exact transition density function than $\phi_{HL}$. For very low values of $S$, all the approximations break down.

We demonstrate the degeneration of $\phi_k$ more clearly in our second example in Fig. 2. Here we plot, in decreasing order, the exact solution $\phi$, alongside our approximations $\phi_1$, $\phi_2$, $\phi_3$, $\phi_5$, $\phi_{10}$, $\phi_{20}$, $\phi_{30}$, and $\phi_{50}$ in red, orange, green, teal, blue, purple, light purple, red, and orange. Thus as $n$ grows large, the associated $\phi_n$ approximations degenerate. Also note that $\phi_{n+1} < \phi_n$ pointwise, which can straightforwardly be deduced from the form of Eq. 60.
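The recursion in Eq. 58 makes the partial sums $\phi_n$ of Eq. 60 cheap to evaluate, and it also makes the divergence easy to observe numerically. The following is a minimal sketch (function names are assumptions for this example) that builds the coefficients from $a_0 = 1$ and prints several partial sums at a low value of $S$, where the degeneration described above is most visible.

```python
import math

def cev_coefficients(S, alpha, sigma, beta, n):
    """Return [a_0, ..., a_n], built from a_0 = 1 and the ratio in Eq. 58."""
    x = (S * alpha) ** (beta - 1.0)
    a = [1.0]
    for k in range(n):
        ratio = (sigma ** 2 / (8.0 * (k + 1))
                 * ((2 * k + 1) * beta - (2 * k + 2))
                 * ((2 * k + 1) * beta - 2 * k) * x)
        a.append(a[-1] * ratio)
    return a

def phi_n(tau, alpha, S, sigma, beta, n):
    """n-th partial sum of the CEV heat kernel series in Eq. 60."""
    a = cev_coefficients(S, alpha, sigma, beta, n)
    series = sum(a_k * tau ** k for k, a_k in enumerate(a))
    gauss = math.exp(-(S ** (1.0 - beta) - alpha ** (1.0 - beta)) ** 2
                     / (2.0 * sigma ** 2 * (1.0 - beta) ** 2 * tau))
    pref = (alpha / S) ** (beta / 2.0) / (sigma * S ** beta * math.sqrt(2.0 * math.pi * tau))
    return pref * gauss * series

# Parameter values from the text: alpha = 1, T = 10, sigma = 0.3, beta = 0.6.
for n in (0, 1, 2, 3, 5, 10, 20, 50):
    print(n, phi_n(10.0, 1.0, 0.2, 0.3, 0.6, n))
```

Because every $a_k$ with $k \geq 1$ has the same sign for these parameters, each additional term lowers the partial sum, which is consistent with the pointwise ordering $\phi_{n+1} < \phi_n$ noted above.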
In addition, whenever one finds a value where the transition density is accurately approximated by some $\phi_n$, say at $S = S_0$, then the associated approximating function also gives an accurate approximation of the exact transition density for all $S > S_0$. One could exploit this fact in practice by using a Monte Carlo method to check the accuracy of the approximation at some $S$ value, and if it is established that the approximation is indeed good, it can then be safely used for larger $S$ values.

Fig. 1 Here we plot the exact CEV transition density $\phi$ from Eq. 44 in red together with $\phi_{HL}$ in orange alongside the approximations $\phi_0$ and $\phi_1$ from Eq. 60 in green and light blue, respectively, for model parameter values $\alpha = 1$, $T = 10$, $\sigma = 0.3$, and $\beta = 0.6$. The domain is the final value of the asset process, given an initial value of $\alpha$, and the range is the probability density that the process evolves to this value

We finally comment on the potential implications that these results may have for the SABR model. In Paulot (2007), Paulot has computed an explicit expression for $a_0$ and an analogous expression for $a_1$ which involves a numerical integration (although with a substantial computational effort, one may be able to construct an explicit formula for $a_1$). He then constructs three approximations for the implied volatility smile; to our knowledge, these are currently the best explicit approximation formulas for the implied volatilities of highly out of the money options in the SABR model. In an example, Paulot demonstrates that his second-order formula degenerates for options with very low strike to a much greater degree than his first-order approximation. Since the SABR model reduces to the CEV model as the volvol constant $\nu \to 0$, we suspect that this degeneration is an extension of the previously mentioned effect; namely, higher order correction terms of the CEV transition density cause the approximation to increasingly degenerate for low strikes. Thus it would not seem pertinent to compute a third order heat kernel correction for SABR (although such a computation probably is not realistic due to its complexity).

Fig. 2 Here we plot in decreasing order the exact transition density $\phi$, and the approximations $\phi_1$, $\phi_2$, $\phi_3$, $\phi_5$, $\phi_{10}$, $\phi_{20}$, and $\phi_{50}$ from Eq. 60 in red, orange, green, teal, blue, purple, light purple, and orange, for the parameter values $\alpha = 1$, $T = 10$, $\sigma = 0.3$, and $\beta = 0.6$. The domain is the final value of the asset process, given an initial value of $\alpha$, and the range is the probability density that the process evolves to this value

3.2 Quadratic Local Volatility Model

We now turn to the quadratic local volatility model. Here we assume that the local volatility function is quadratic in the asset's price process. In particular, we assume that the dynamics are given by

$$dS_t = (a + bS_t + cS_t^2)\,dW, \qquad S(0) = S_0, \tag{62}$$

where $a$, $b$, and $c$ are taken to be constants. The properties of this SDE have been studied in Andersen (2011) and Zühlsdorff (2002); the author also gives explicit formulas for European call prices when the underlying evolves with quadratic dynamics. The transition density PDE for this model is given by

$$\partial_\tau \phi = \frac{1}{2}\left(a + b\alpha + c\alpha^2\right)^2\partial_\alpha^2\phi, \qquad \phi(0, \alpha, S) = \delta(S - \alpha). \tag{63}$$

It is possible to construct an exact solution of this equation using the heat kernel expansion method. To demonstrate this, we first note that the volatility function is given by $\Sigma^{\alpha\alpha} = \left(a + b\alpha + c\alpha^2\right)^2$. The single component of the metric is $g_{\alpha\alpha} = 2/\left(a + b\alpha + c\alpha^2\right)^2$, and the Christoffel symbol is a rational function

$$\Gamma_{\alpha\alpha}^{\alpha} = -\frac{b + 2c\alpha}{a + b\alpha + c\alpha^2}. \tag{64}$$

Since the drift is trivial, the connection is just $A_\alpha = \Gamma_{\alpha\alpha}^{\alpha}/2$ and $Q = \left(b^2 - 4ac\right)/8$. Using these expressions, we can calculate

$$P(\alpha, S) = \sqrt{\frac{a + bS + cS^2}{a + b\alpha + c\alpha^2}}, \qquad P(S, \alpha) = \sqrt{\frac{a + b\alpha + c\alpha^2}{a + bS + cS^2}}. \tag{65}$$

The distance function is given by

$$d(\alpha, S) = \frac{2\sqrt{2}}{\sqrt{4ac - b^2}}\left[\arctan\left(\frac{b + 2cS}{\sqrt{4ac - b^2}}\right) - \arctan\left(\frac{b + 2c\alpha}{\sqrt{4ac - b^2}}\right)\right], \tag{66}$$

where the discriminant $4ac - b^2 \neq 0$. When the discriminant vanishes, one needs to take the limiting case of the above expression. Even though the distance function takes a somewhat complicated form, $a_1$ and $a_2$ can be computed exactly and can be expressed simply as

$$a_1(S, \alpha) = -\frac{1}{8}\left(b^2 - 4ac\right), \qquad a_2(S, \alpha) = \frac{1}{128}\left(b^2 - 4ac\right)^2, \tag{67}$$

or more generally,

$$a_k = \frac{(-1)^k}{2^{3k}\, k!}\left(b^2 - 4ac\right)^k. \tag{68}$$

Now, we can recognize the sum in the heat kernel expansion as an exponential function and invert the heat kernel coefficient series to find that

$$\sum_{k=0}^{\infty} a_k \tau^k = \sum_{k=0}^{\infty} \frac{1}{k!}\left(\frac{ac}{2} - \frac{b^2}{8}\right)^k \tau^k = \exp\left[\left(\frac{ac}{2} - \frac{b^2}{8}\right)\tau\right]. \tag{69}$$
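As a quick numerical illustration of Eq. 69, the following minimal sketch compares partial sums of the coefficient series of Eq. 68 with the closed-form exponential. The function names and the parameter values are illustrative assumptions, not taken from the paper.

```python
import math


def heat_kernel_coefficient(k, a, b, c):
    """Coefficient a_k = (-1)^k (b^2 - 4ac)^k / (2^(3k) k!) from Eq. 68."""
    return (-1) ** k * (b * b - 4 * a * c) ** k / (2 ** (3 * k) * math.factorial(k))


def partial_sum(n_terms, tau, a, b, c):
    """Sum of a_k * tau^k for k = 0, ..., n_terms - 1."""
    return sum(heat_kernel_coefficient(k, a, b, c) * tau ** k for k in range(n_terms))


def closed_form(tau, a, b, c):
    """Right-hand side of Eq. 69."""
    return math.exp((a * c / 2 - b * b / 8) * tau)


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    a, b, c, tau = 1.0, 0.5, 1.0, 2.0
    for n in (1, 2, 3, 5, 10, 20):
        print(n, partial_sum(n, tau, a, b, c), closed_form(tau, a, b, c))
```

The partial sums approach the exponential quickly because the series is just the Taylor series of the exponential function in the variable (ac/2 − b²/8)τ.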

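Therefore the exact transition density is given by

$$\phi(\tau, \alpha, S) = \frac{\sqrt{g(S)}}{\sqrt{4\pi\tau}}\, P(S, \alpha) \exp\left(-\frac{d(\alpha, S)^2}{4\tau}\right) \sum_{k} a_k \tau^k \tag{70}$$

$$= \sqrt{\frac{a + b\alpha + c\alpha^2}{2\pi\tau\left(a + bS + cS^2\right)^3}}\, \exp\left[-\frac{d^2(\alpha, S)}{4\tau} + \left(\frac{ac}{2} - \frac{b^2}{8}\right)\tau\right], \tag{71}$$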

which one can readily check solves Eq. 63. Moreover, we note that this solution satisfies the initial condition of Eq. 63. As τ → 0, φ → 0 if α ≠ S, since the exponential dominates the square root in the limit. If α = S, then d(α, S) = 0 and therefore φ ≈ 1/√τ as τ → 0, so φ → ∞, which when formalized shows that φ is a delta function initially.

The above is an example of the heat kernel method at its best. The heat kernel ansatz turned out to be a very good choice in the sense that we were able to determine a significant portion of the content of the functional form of the transition density from the zeroth order prefactor. As a result, the ak took particularly simple forms, and the perturbative formalism produced an exact factor.
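The claim that the closed-form density of Eq. 71 solves Eq. 63 can be spot-checked numerically. The sketch below evaluates the density and a central finite-difference residual of the PDE at a single point. It assumes a positive discriminant 4ac − b² > 0 (the limiting and negative cases are not handled), and the function names, step size, and test point are illustrative choices rather than anything specified in the paper.

```python
import math


def sigma(x, a, b, c):
    """Quadratic local volatility a + b x + c x^2."""
    return a + b * x + c * x * x


def distance(alpha, S, a, b, c):
    """Distance function of Eq. 66 (this sketch requires 4ac - b^2 > 0)."""
    disc = 4 * a * c - b * b
    if disc <= 0:
        raise ValueError("this sketch only covers 4ac - b^2 > 0")
    root = math.sqrt(disc)
    return 2 * math.sqrt(2) / root * (
        math.atan((b + 2 * c * S) / root) - math.atan((b + 2 * c * alpha) / root)
    )


def density(tau, alpha, S, a, b, c):
    """Closed-form transition density of Eq. 71."""
    d = distance(alpha, S, a, b, c)
    prefactor = math.sqrt(
        sigma(alpha, a, b, c) / (2 * math.pi * tau * sigma(S, a, b, c) ** 3)
    )
    return prefactor * math.exp(-d * d / (4 * tau) + (a * c / 2 - b * b / 8) * tau)


def pde_residual(tau, alpha, S, a, b, c, h=1e-3):
    """Finite-difference estimate of d(phi)/d(tau) - 0.5 sigma(alpha)^2 d2(phi)/d(alpha)^2."""
    d_tau = (density(tau + h, alpha, S, a, b, c)
             - density(tau - h, alpha, S, a, b, c)) / (2 * h)
    d2_alpha = (density(tau, alpha + h, S, a, b, c)
                - 2 * density(tau, alpha, S, a, b, c)
                + density(tau, alpha - h, S, a, b, c)) / (h * h)
    return d_tau - 0.5 * sigma(alpha, a, b, c) ** 2 * d2_alpha


if __name__ == "__main__":
    # Hypothetical test point and parameters.
    print(density(0.5, 0.3, 0.7, 1.0, 0.5, 1.0))
    print(pde_residual(0.5, 0.3, 0.7, 1.0, 0.5, 1.0))
```

A residual close to zero (up to finite-difference error) at several test points is consistent with Eq. 71 solving Eq. 63, although it is of course not a proof.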

3.3 Cubic Local Volatility Model

We now consider a local volatility model whose instantaneous volatility function is given by a cubic polynomial. We consider this model for two purposes. First, we wish to illustrate how the computation of the ak becomes significantly more difficult in this polynomial model. Second, this model has applicability in computing a third order correction of a general local volatility model that depends on a small parameter, ν. Specifically, in certain cases, it could be useful to expand an instantaneous volatility function as a third-order Taylor series in ν and study the associated stochastic process. The asset dynamics for this model are given by

$$dS_t = (S_t - a)(S_t - b)(S_t - c)\, dW_t, \qquad S(0) = \alpha. \tag{72}$$

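The metric is $g_{\alpha\alpha} = 2/\left[(\alpha - a)(\alpha - b)(\alpha - c)\right]^2$, and the corresponding Christoffel symbol is

$$\Gamma^{\alpha}_{\alpha\alpha} = \frac{1}{a - \alpha} + \frac{1}{b - \alpha} + \frac{1}{c - \alpha}. \tag{73}$$

From this, we can deduce the form of P,

$$P(\alpha, S) = \left(\frac{(a - S)(b - S)(c - S)}{(a - \alpha)(b - \alpha)(c - \alpha)}\right)^{1/2}. \tag{74}$$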

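The distance function is given by

$$d(S, \alpha) = \frac{\sqrt{2}}{(a - b)(a - c)(b - c)} \ln\left[\left(\frac{S - a}{\alpha - a}\right)^{b - c} \left(\frac{S - b}{\alpha - b}\right)^{c - a} \left(\frac{S - c}{\alpha - c}\right)^{a - b}\right]. \tag{75}$$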
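The closed-form distance of Eq. 75 can be cross-checked against a direct numerical integration of √2 divided by the cubic volatility function between α and S. The following sketch does this with a composite Simpson rule. It assumes that S and α lie on the same side of each root, so that every logarithm is real; the function names and the quadrature resolution are illustrative assumptions.

```python
import math


def cubic_vol(x, a, b, c):
    """Cubic local volatility (x - a)(x - b)(x - c)."""
    return (x - a) * (x - b) * (x - c)


def distance_closed_form(S, alpha, a, b, c):
    """Eq. 75; assumes S and alpha lie on the same side of every root."""
    log_term = ((b - c) * math.log((S - a) / (alpha - a))
                + (c - a) * math.log((S - b) / (alpha - b))
                + (a - b) * math.log((S - c) / (alpha - c)))
    return math.sqrt(2) * log_term / ((a - b) * (a - c) * (b - c))


def distance_quadrature(S, alpha, a, b, c, n=2000):
    """Composite Simpson estimate of the integral of sqrt(2)/vol from alpha to S (n even)."""
    h = (S - alpha) / n
    total = math.sqrt(2) / cubic_vol(alpha, a, b, c) + math.sqrt(2) / cubic_vol(S, a, b, c)
    for i in range(1, n):
        weight = 4 if i % 2 else 2
        total += weight * math.sqrt(2) / cubic_vol(alpha + i * h, a, b, c)
    return total * h / 3


if __name__ == "__main__":
    # Roots taken from the cubic example of Fig. 4; the end point S = 1.05 is illustrative.
    a, b, c, alpha, S = 0.5, 0.75, 0.8, 1.0, 1.05
    print(distance_closed_form(S, alpha, a, b, c))
    print(distance_quadrature(S, alpha, a, b, c))
```

Note that the result is a signed quantity (negative when S < α above the largest root); only its square enters the heat kernel ansatz.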

The a1 integral can be computed exactly, but its expression is quite long and we do not write it here. Instead, we will compare the first-order approximation formula which involves a1 with Monte Carlo results in the next section.

3.4 Affine-Affine Short Rate Model

We now turn our attention to developing a first order approximation formula for the transition density of a model we refer to as the affine-affine short rate model. Let rt represent a short rate whose evolution is governed by an SDE of the form

$$dr_t = (a + b r_t)\, dt + (c + d r_t)\, dW_t, \qquad r(0) = r_0, \tag{76}$$

where a, b, c, and d are constants. We note that we could eliminate one of a or c by shifting r by an appropriate constant; however, we keep the above form for the sake of symmetry. In the case that d = 0, this reduces to the Vasicek model (cf. Andersen and Piterbarg (2010) and Brigo and Mercurio (2006) for a discussion of short rate models). Although we will refer to this as a short rate model in light of this reduction, it can also be viewed as an asset model. The drift is $\mu^{\alpha} = a + b\alpha$, and the instantaneous volatility function is $\sigma^{\alpha\alpha} = c + d\alpha$. Hence $\Sigma^{\alpha\alpha} = (\sigma^{\alpha\alpha})^2 = (c + d\alpha)^2$, and thus the metric is given by $g_{\alpha\alpha} = 2\left(\Sigma^{\alpha\alpha}\right)^{-1} = 2/(c + d\alpha)^2$, whose Christoffel symbol is $\Gamma^{\alpha}_{\alpha\alpha} = \frac{1}{2} g^{\alpha\alpha} \partial_\alpha g_{\alpha\alpha} = -d/(c + d\alpha)$. We next need to determine the connection $A_\alpha$ as well as Q. These are given by

$$A_\alpha = \frac{2a - cd + (2b - d^2)\alpha}{2(c + d\alpha)^2}, \tag{77}$$

$$Q(\alpha) = \frac{4a^2 + c^2(4b + d^2) + 2cd^3\alpha + (d^2 - 2b)^2\alpha^2 - 8a\left(d(c + d\alpha) - b\alpha\right)}{8(c + d\alpha)^2}. \tag{78}$$

Using this we find that

$$P(\alpha, r) = \exp\left(-\int_\alpha^r A_r\, \mathrm{d}r\right) = \left(\frac{c + dr}{c + d\alpha}\right)^{\frac{1}{2} - \frac{b}{d^2}} \exp\left(\frac{(bc - ad)(r - \alpha)}{d(c + dr)(c + d\alpha)}\right). \tag{79}$$
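As a consistency check between Eqs. 77 and 79, the sketch below integrates the connection numerically and compares the exponential of the negated integral with the closed form for P. The value c = 0.5 is an assumption made only for this illustration (the paper's numerical example does not list c), and the function names are hypothetical.

```python
import math


def connection(x, a, b, c, d):
    """Connection A from Eq. 77 evaluated at x."""
    return (2 * a - c * d + (2 * b - d * d) * x) / (2 * (c + d * x) ** 2)


def P_closed_form(alpha, r, a, b, c, d):
    """Closed-form P(alpha, r) from Eq. 79."""
    ratio = (c + d * r) / (c + d * alpha)
    exponent = (b * c - a * d) * (r - alpha) / (d * (c + d * r) * (c + d * alpha))
    return ratio ** (0.5 - b / (d * d)) * math.exp(exponent)


def P_quadrature(alpha, r, a, b, c, d, n=2000):
    """exp(-integral of A from alpha to r), using a composite Simpson rule (n even)."""
    h = (r - alpha) / n
    total = connection(alpha, a, b, c, d) + connection(r, a, b, c, d)
    for i in range(1, n):
        weight = 4 if i % 2 else 2
        total += weight * connection(alpha + i * h, a, b, c, d)
    return math.exp(-total * h / 3)


if __name__ == "__main__":
    # a, b, d follow the affine-affine example; c = 0.5 and r = 1.5 are assumed here.
    a, b, c, d = 0.5, -0.1, 0.5, -0.2
    print(P_closed_form(1.0, 1.5, a, b, c, d))
    print(P_quadrature(1.0, 1.5, a, b, c, d))
```

The check requires c + dx to stay away from zero on the integration path, which is why the end points above are kept below x = −c/d = 2.5.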

Again we can compute

$$2P^{-1}(r, \alpha)\, \Delta_g P(r, \alpha) + 2P^{-1} g^{\alpha\alpha} A_\alpha \partial_\alpha P = -Q. \tag{80}$$

Now, from the metric, we see that $\mathrm{d}s = -\sqrt{2}/(c + \alpha d)\, \mathrm{d}\alpha$. We finally need to compute the distance function for this metric in order to compute the $a_i$. To do this, we define an arc-length coordinate

$$s(\alpha) = \sqrt{2} \int_\alpha^r \frac{\mathrm{d}\alpha}{c + \alpha d} = \frac{\sqrt{2}}{d} \ln\left(\frac{c + rd}{c + \alpha d}\right). \tag{81}$$

Again, in this coordinate the metric is just given by $g_{ss} = (\partial r/\partial s)^2 g_{rr} = 1$. Thus our distance function is

$$d(\alpha, r) = \frac{\sqrt{2}}{d} \ln\left(\frac{c + rd}{c + \alpha d}\right). \tag{82}$$

Using this, we can compute a somewhat involved explicit formula for $a_1$ given by

$$a_1(r, \alpha) = -\frac{1}{A}\left[2d(bc - ad)(\alpha - r)\Big(b\left(2c^2 + 4d^2 r\alpha + 3cd(r + \alpha)\right) + d\big(a(2c + d(r + \alpha)) - 4d(c + dr)(c + d\alpha)\big)\Big) + (d^2 - 2b)^2 (c + dr)^2 (c + d\alpha)^2 \ln\left(\frac{c + dr}{c + d\alpha}\right)\right], \tag{83}$$

where

$$A = 8d^2 (c + dr)^2 (c + d\alpha)^2 \ln\left(\frac{c + dr}{c + d\alpha}\right). \tag{84}$$

We are able to compute an analytic expression for $a_2$ as well, but it is quite long and we refrain from writing it for this reason. We will use the second-order formula in the subsequent section.

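3.5 A Generalized CEV Model

We now consider another short rate model which can roughly be characterized as a generalized CEV model with mean reversion. Specifically, the process reverts to the mean at a rate proportional to $r_t^a$. The dynamics for this model are given by

$$dr_t = k\left(\theta - r_t^a\right) dt + \sigma r_t^\beta\, dW_t, \qquad r_0 = \alpha. \tag{85}$$

Just as in the CEV case, the volatility function, metric, and Christoffel symbol for this model are given by $\Sigma^{\alpha\alpha} = \sigma^2 \alpha^{2\beta}$, $g_{\alpha\alpha} = 2/(\sigma^2 \alpha^{2\beta})$, and $\Gamma^{\alpha}_{\alpha\alpha} = -\beta/\alpha$, respectively. The connection takes the form

$$A_\alpha = \frac{1}{2}\left(\frac{2k(\theta - \alpha^a)}{\sigma^2 \alpha^{2\beta}} - \frac{\beta}{\alpha}\right), \tag{86}$$

and

$$Q = \frac{1}{8\sigma^2 \alpha^{2(1+\beta)}}\left[4k^2\alpha^2(\alpha^a - \theta)^2 - \sigma^4\alpha^{4\beta}(\beta - 2)\beta - 4k\sigma^2\alpha^{1+2\beta}\left(a\alpha^a + 2\beta(\theta - \alpha^a)\right)\right]. \tag{87}$$

Next, we compute

$$P(r, \alpha) = \left(\frac{\alpha}{r}\right)^{\beta/2} \exp\left[-\frac{k\left(r^{1+a-2\beta} - \alpha^{1+a-2\beta}\right)}{\sigma^2(1 + a - 2\beta)} + \frac{k\theta\left(r^{1-2\beta} - \alpha^{1-2\beta}\right)}{\sigma^2(1 - 2\beta)}\right]. \tag{88}$$

The distance function for this metric is the same as in the CEV case. Thus we can use this to compute

$$a_1(r, \alpha) = -\frac{(\beta - 1)(r\alpha)^\beta}{8\sigma^2\left(r^\beta\alpha - r\alpha^\beta\right)}\Bigg[\frac{\sigma^4\beta(\beta - 2)\left(r\alpha^\beta - r^\beta\alpha\right)}{r\alpha(\beta - 1)} + \frac{4k\sigma^2\Big(a\big(r^a\alpha^\beta - r^\beta(\alpha^a - 2\theta) - 2\alpha^\beta\theta\big) + 2\beta\big(r^\beta(\alpha^a - \theta) - r^a\alpha^\beta + \alpha^\beta\theta\big)\Big)}{(\beta - a)(r\alpha)^\beta} + 4k^2\left(\frac{r^{1+2a-3\beta} - \alpha^{1+2a-3\beta}}{1 + 2a - 3\beta} - \frac{2\theta\left(r^{1+a-3\beta} - \alpha^{1+a-3\beta}\right)}{1 + a - 3\beta} + \frac{\theta^2\left(r^{1-3\beta} - \alpha^{1-3\beta}\right)}{1 - 3\beta}\right)\Bigg]. \tag{89}$$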

We are unable to compute a2 exactly. We note that there are several values of the model parameters where the formula for a1 is not defined; one should take necessary limits of the above expression, if possible, to construct corresponding formulas.


In Fig. 3, we plot several first-order approximations for the transition density where we let a be in the set {−0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0}, and fix the other model parameters to be α = 1, θ = 1.1, k = 0.5, T = 1, σ = 0.3, and β = 0.6. The maximum of each graph increases with a. When a = −0.5, we note that the graph hits the zero level near r = 0.35. In fact, the plot is negative for all prior r values, hence this particular approximation is not useful. This behavior is generic for all negative a.

Fig. 3 Plots for r ∈ [0, 2] of cross sections of the first-order density function of the generalized CEV model for a ∈ {−0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0}, where we take α = 1, θ = 1, k = 0.5, T = 1, σ = 0.3, and β = 0.6

4 Numerical Tests

We now consider approximation formulas for the cubic, affine-affine, and generalized CEV models and compare them to Monte Carlo results to demonstrate their accuracy for specific sets of model parameters. Our aim here is only to give a single example for each model in order to demonstrate that the formulas work quite well for certain parameter regimes. All heat kernel approximation formulas lose accuracy for large maturity values, and thus our approximation formulas will degenerate as τ becomes large. A more thorough error analysis is required to establish the limitations of these formulas.

Our numerical simulation will be the same in each of the three cases. In particular, we fix a set of model parameters and then discretize the local volatility SDE using the Euler scheme. We next fix an initial asset price α and a time step dt = 0.01, along with other model parameters. We then evolve α to a future time T using the stochastic difference equation and store the result. We perform 10^5 runs in each case and plot the results in a blue histogram. We then plot a suitably normalized version of the model's approximation formula on top of the histogram in red to compare how well it approximates the Monte Carlo transition density.

In Fig. 4, we consider the cubic local volatility model with parameters τ = 1, a = 0.5, b = 0.75, c = 0.8, σ = 0.1, and initial asset value α = 1.

Fig. 4 This is a histogram for the transition density of the cubic local volatility model with parameters given by τ = 1, a = 0.5, b = 0.75, c = 0.8, σ = 0.1, and α = 1. We plot the first-order density approximation in red
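The simulation procedure described in this section (Euler discretization with dt = 0.01, repeated runs, and a normalized histogram of terminal values) can be sketched as follows. This is a minimal standard-library illustration rather than the code used for the figures: the function names, the random seed, and the flooring of the rate at zero inside the generalized CEV coefficients are all assumptions introduced here.

```python
import math
import random


def euler_terminal_values(drift, vol, x0, T, dt=0.01, n_paths=100_000, seed=0):
    """Simulate dX = drift(X) dt + vol(X) dW with the Euler scheme; return X_T per path."""
    rng = random.Random(seed)
    n_steps = int(round(T / dt))
    sqrt_dt = math.sqrt(dt)
    terminal = []
    for _ in range(n_paths):
        x = x0
        for _ in range(n_steps):
            x += drift(x) * dt + vol(x) * sqrt_dt * rng.gauss(0.0, 1.0)
        terminal.append(x)
    return terminal


def histogram_density(samples, lo, hi, n_bins):
    """Histogram on [lo, hi) scaled so that bin heights times bin width sum to the
    fraction of samples falling inside the range."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in samples:
        if lo <= x < hi:
            counts[min(int((x - lo) / width), n_bins - 1)] += 1
    return [count / (len(samples) * width) for count in counts]


if __name__ == "__main__":
    # Generalized CEV parameters from the Fig. 6 example.
    k, theta, a, sig, beta = 0.5, 1.1, 0.5, 0.3, 0.6

    # Assumption: the rate is floored at zero inside the coefficients so that the
    # fractional powers stay real; the paper does not specify a boundary treatment.
    def drift(r):
        return k * (theta - max(r, 0.0) ** a)

    def vol(r):
        return sig * max(r, 0.0) ** beta

    # Fewer paths than the 10^5 used in the text, to keep a pure-Python run short.
    samples = euler_terminal_values(drift, vol, x0=1.0, T=1.0, n_paths=10_000)
    for height in histogram_density(samples, 0.0, 2.0, 20):
        print(round(height, 3))
```

The normalized bin heights can then be compared point by point with an approximate density evaluated at the bin centers.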

Fig. 5 This is a histogram for the transition density of the affine-affine short rate model with parameters α = 1, a = 0.5, b = −0.1, d = −0.2, and τ = 1. In red we plot our second-order approximation formula

Fig. 6 This is a histogram for the simulated transition density of the generalized CEV model with parameters α = 1, θ = 1.1, k = 0.5, a = 0.5, τ = 1, σ = 0.3, and β = 0.6. In red, we plot our first-order approximation formula

Note that the density is slightly skewed to the left of the initial asset price. This is due to the fact that the cubic local volatility model has a tendency to pull the initial asset price towards the greatest root c = 0.8.

We next consider the affine-affine model in Fig. 5. We fix model parameters a = 0.5, b = −0.1, d = −0.2, τ = 1, with an initial asset price of α = 1. Here the drift term pushes the asset's distribution function to the right of its original value with a fair amount of dispersion. In order to demonstrate a second order application of the perturbation theory methods that we consider, we plot our second-order density approximation in red and note it agrees well with the numerical result.

We finally turn to the generalized CEV model in Fig. 6. Here, we fix model parameters θ = 1.1, k = 0.5, a = 0.5, τ = 1, σ = 0.3, β = 0.6, and initial asset price α = 1. Here the distribution remains centered around the initial asset value, and is slightly skewed to the right.

5 Conclusion

We have reviewed the construction of the heat kernel ansatz for the fundamental solution of a general parabolic PDE which represents the joint transition density of n coupled Itô processes. We then restricted this ansatz to the one dimensional local volatility model setting and expressed it explicitly in terms of the instantaneous volatility function and model parameters. Next, this ansatz was used to construct all the heat kernel coefficients for the CEV and quadratic volatility models. In the former case, we found that higher order heat kernel coefficient corrections to the transition density caused the associated sequence of approximation formulas to degenerate. We were also able to construct the exact transition density of the quadratic volatility model.

We then turned to three new local volatility models, namely, the cubic, affine-affine, and generalized CEV models. We demonstrated how the form and computation of low order heat kernel coefficients are quite complicated in these models. We then constructed first and second-order approximation formulas and compared them against Monte Carlo results to demonstrate the accuracy of the new density approximation functions. These formulas seem to be quite robust across the model parameter space for maturity times near one year.

Finally, we note that one can use these approximation formulas to price derivatives whose underlying assets follow such a process. Specifically, if one calibrates a stochastic process to market data, then one can first approximate the transition density of this process. Then, using this density, one can price a derivative whose underlying evolves by this process by computing the integral of the payoff function of the derivative multiplied by the approximate transition density of the process. It would be interesting to examine how error propagates through different stages of this construction as well as how the error influences the final price of the derivative.

Acknowledgments The first author would like to thank Martin Forde, Fabio Mercurio, and Louis Paulot for conversations that enhanced this work, as well as Bloomberg L.P. and the Frankfurt School of Finance and Management for creating an environment conducive to research. The work of Jan Vecer was supported in part by grant GACR 13-34480S. All authors would like to thank the referees for their constructive comments related to improving the content and exposition of this article.

References

Andersen LBG, Piterbarg VV (2010) Interest rate modeling volumes 1, 2, and 3. Atlantic Financial Press
Andersen LBG (2011) Option pricing with quadratic volatility: a revisit. Finance Stoch 15:191–219. Available at SSRN: http://ssrn.com/abstract=1118399
Avramidi IG (2007) Analytic and geometric methods for heat kernel applications in finance. Available at http://infohost.nmt.edu/iavramid/notes/hkt/hktutorial13.pdf
Brecher DR, Lindsay AE (2010) Results on the CEV Process, Past and Present. Working Paper. Available at http://ssrn.com/abstract=1567864
Brigo D, Mercurio F (2006) Interest Rate Models - Theory and Practice: With Smile, Inflation and Credit, 2nd edn. Springer Finance
Black F, Scholes M (1973) The pricing of options and corporate liabilities. J Political Econ 81(3):637–654
Dupire B (1994) Pricing with a smile. Risk 7(1):18–20
Feller W (1951) Two singular diffusion problems. Ann Math, Second Ser 54(1):173–182
Forde M (2011) Exact pricing and large-time asymptotics for the modified SABR model and the Brownian exponential functional. Int J Theor Appl Finan 14:559
Forde M (2013) The large maturity smile for the SABR and CEV-Heston models. Int J Theor Appl Finan 16(8)
Gatheral J, Hsu EP, Laurence PM, Ouyang C, Wang TH (2012) Asymptotics of implied volatility in local volatility models. Math Finance 22(4):591–620
Hagan P, Lesniewski A, Woodward D (2005) Probability Distribution in the SABR Model of Stochastic Volatility. Working Paper available at http://lesniewski.us/papers/working/ProbDistrForSABR.pdf
Labordère PH (2005) A General Asymptotic Implied Volatility for Stochastic Volatility Models. Working Paper available at ArXiv:cond-mat/0504317
Labordère PH (2008) Analysis, Geometry, and Modeling in Finance: Advanced Methods in Option Pricing. Chapman and Hall/CRC Financial Mathematics Series
Lesniewski A (2002) WKB Method for Swaption Smile. http://lesniewski.us/papers/presentations/Courant020702.pdf
Medvedev AN (2004) Asymptotic Methods for Computing Implied Volatilities Under Stochastic Volatility. Working Paper available at http://ssrn.com/abstract=667281
Merton R (1973) Theory of rational option pricing. Bell J Econ Manag Sci (The RAND Corporation) 4(1):141–183
Pagliarani S, Pascucci A (2012) Analytical approximation of the transition density in a local volatility model. Cent Eur J Math 10(1):250–270
Paulot L (2007) Asymptotic Implied Volatility at the Second Order with Application to the SABR Model. Working paper available at http://ssrn.com/abstract=1413649
Zühlsdorff C (2002) Extended Libor Market Models with Affine and Quadratic Volatility. Bonn Econ Discussion Papers