WHY ARE HIGH-DIMENSIONAL FINANCE PROBLEMS OFTEN OF LOW EFFECTIVE DIMENSION?

Xiaoqun Wang 1,2 and Ian H. Sloan 2

1 Department of Mathematical Sciences, Tsinghua University, Beijing 100084, China
2 School of Mathematics, University of New South Wales, Sydney 2052, Australia

ABSTRACT

Many problems in mathematical finance can be formulated as high-dimensional integrals, where the large number of dimensions arises from small time steps in time discretization and/or a large number of state variables. Quasi-Monte Carlo (QMC) methods have been successfully used for approximating such integrals. To understand this success, this paper focuses on investigating the special features of some typical high-dimensional finance problems, namely option pricing and bond valuation. We provide new insight into the connection between the effective dimension and the efficiency of QMC, and present methods to analyze the dimension structure of a function. We confirm the observation of Caflisch, Morokoff and Owen that functions from finance are often of low effective dimension, in the sense that they can be well approximated by their low-order ANOVA (analysis of variance) terms, usually just the order-1 and order-2 terms. We explore why the effective dimension is small for many integrals from finance. By deriving explicit forms of the ANOVA terms in simple cases, we find that the importance of each dimension is naturally weighted by certain hidden weights. These weights characterize the relative importance of different variables or groups of variables, and limit the importance of the higher-order ANOVA terms. We study the variance ratios captured by low-order ANOVA terms and their asymptotic properties as the dimension tends to infinity, and show that as the dimension increases the lower-order terms continue to play a significant role while the higher-order terms tend to be negligible. This provides some insight into high-dimensional problems from finance and explains why QMC algorithms are efficient for problems of this kind.

Key Words: quasi-Monte Carlo, effective dimension, option pricing, bond valuation.
2000 Mathematics Subject Classification: 65C05, 65D30, 91B28.
Email addresses: [email protected], [email protected].

1 Introduction

Many problems in mathematical finance can be formulated as high-dimensional integrals, or after appropriate transformations as integrals over the d-dimensional unit cube,

$$I_d(f) = \int_{[0,1]^d} f(x)\,dx.$$

For large d it is well known that classical methods are not feasible, because of the curse of dimensionality, but it is by now also well known that Monte Carlo (MC) methods (see Boyle et al. 1997) and quasi-Monte Carlo (QMC) methods (see Paskov and Traub 1995) can provide powerful tools for approximating such integrals. The MC estimate in its simplest form is

$$Q_{n,d}(f) = \frac{1}{n}\sum_{i=1}^{n} f(x_i), \qquad x_i \in [0,1]^d,$$

where the points x_1, ..., x_n are independent and identically distributed (i.i.d.) random samples from the uniform distribution on [0,1]^d. QMC methods take the same form, but with x_1, ..., x_n chosen deterministically, so as to yield better uniformity than random samples. The apparent success in the 1990s of QMC methods for some high-dimensional problems in finance (Paskov and Traub 1995) was a cause for surprise, given that the error bound given by the Koksma-Hlawka inequality (Niederreiter 1992) is of order O(n^{-1}(log n)^d) as n increases, which for large d increases with increasing n until we reach astronomically large values of n. Other research also showed the high efficiency of QMC methods for high-dimensional integrals in finance (see Acworth et al. 1997, Joy et al. 1996, Ninomiya and Tezuka 1996). Sparse grids may also perform well for finance applications (see Gerstner and Griebel 2003). It is by now well accepted, following the work of Caflisch et al. (1997), Imai and Tan (2004), Owen (2003), Paskov (1997), Wang and Fang (2003), that the qualitative explanation lies in the fact that the effective dimension is small, even though the nominal dimension is very large. (Two notions of effective dimension, namely the truncation and superposition dimensions, are defined precisely in Section 2.) The main purpose of this paper is to explore why the effective dimension is small for many integrals from finance. We do this by studying some simple model problems in finance, and exploring what happens (as far as possible by analytic means) as the nominal dimension d approaches infinity. There are two main sources of high dimensionality: one is time discretization (for example, a path-dependent option pricing model can be thought of as arising from discretization of an underlying continuous process), the other is the existence of a large number of different assets or securities. In the case of time discretization, the number of time steps (and hence the dimension) increases as the time step shortens, but the problem as it relates to a single time step becomes in a certain sense simpler. For a simplified option pricing situation (a geometric Asian call option with zero strike price) we show that the superposition dimension actually has a limit as d → ∞ (see Theorem 5.1). In this situation the high value of d becomes in a sense irrelevant. For non-zero but small strike prices the situation is little changed. For multi-asset options the situation can be even more favorable, with the superposition dimension

approaching 1 as d → ∞ in the case of zero strike price. It is natural to ask whether the property of low superposition dimension is common for finance problems. We therefore study one more problem, namely bond valuation (see Theorem 5.2). Since the main focus of this paper is on the possible special properties of finance problems and the reasons for those properties, we restrict ourselves to simple models for which closed-form solutions may also exist. Explaining the surprisingly good performance of QMC methods for some high-dimensional integrals is a challenging problem. A possible theoretical explanation is given in Sloan and Woźniakowski (1998) by introducing weighted function classes, in which the variables are given weights depending on their importance. A number of works have been devoted to studying the (strong) tractability of multivariate integration, as well as to constructing algorithms that achieve the corresponding (strong) tractability error bounds (see Hickernell and Wang 2002, Wang 2003, and Sloan et al. 2003). This paper is organized as follows. After introducing the ANOVA decomposition and effective dimensions in Section 2, we discuss the relationship of the effective dimension to the integration error, and perform a numerical experiment that illustrates the relationship. In Section 3 we discuss methods for analyzing the effective dimensions and variance ratios. In Section 4 we consider the pricing of path-dependent and multi-asset options, and show numerically that these nominally high-dimensional problems are often of low effective dimension, in the sense that the functions are dominated by their lower-order ANOVA terms (usually just the order-1 and order-2 terms). In Section 5 we come to the main results of the paper, where we investigate why high-dimensional option pricing and bond valuation problems are often of low superposition dimension: we do this by identifying the inherent weights in the problems that control the relative importance of different (groups of) variables, and by studying the variance ratios and their asymptotic properties (as d → ∞). These weights often become smaller as the nominal dimension d increases, offsetting the increased inherent difficulty caused by the high dimensionality. Issues of non-smoothness of the integrands are also discussed. Concluding remarks are presented in the last section.
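Before turning to effective dimensions, a small illustration of the two estimators just described may be helpful. The sketch below is a minimal Python example, not taken from the paper; it assumes NumPy and SciPy's scipy.stats.qmc module, and it uses an illustrative product integrand whose exact integral is 1. It evaluates the same quadrature formula Q_{n,d} once with i.i.d. points and once with Sobol' points.

```python
import numpy as np
from scipy.stats import qmc

d, n = 50, 2**14

# Illustrative integrand with exact integral 1 over [0,1]^d
def f(x):
    return np.prod(1.0 + 0.1 * (x - 0.5), axis=1)

# MC: i.i.d. uniform random points
rng = np.random.default_rng(0)
q_mc = f(rng.random((n, d))).mean()

# QMC: the same estimator, but with a low-discrepancy Sobol' point set
x_qmc = qmc.Sobol(d, scramble=True, seed=0).random(n)
q_qmc = f(x_qmc).mean()

print(f"MC  estimate: {q_mc:.6f}")   # both approximate the exact value 1
print(f"QMC estimate: {q_qmc:.6f}")
```

For integrands of low effective dimension, the QMC estimate is typically much closer to the exact value for the same n; quantifying when and why this happens is the subject of the following sections.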

2 The effective dimensions of functions

2.1 The definitions of effective dimensions and variance ratios

Let A = A_d = {1, 2, ..., d}. For any subset u ⊆ A, let |u| denote its cardinality and let A − u denote its complement in A. Consider a square-integrable function f defined on [0,1]^d. We say that the expansion

$$f(x) = \sum_{u \subseteq A} f_u(x) \qquad (1)$$

is an ANOVA decomposition of f if for each f_u with ∅ ≠ u ⊆ A the following is satisfied:

$$\int_0^1 f_u(x)\,dx_j = 0 \quad \text{if } j \in u.$$

It is easy to show that f_∅ = I_d(f) and that f_u(x) can be determined recursively by

$$f_u(x) = \int_{[0,1]^{d-|u|}} f(x)\,dx_{A-u} - \sum_{v \subset u} f_v(x) \quad \text{for } \emptyset \neq u \subseteq A, \qquad (2)$$

where the last sum is over strict subsets v of u. Each term f_u expresses the cooperative contribution of a distinct group of variables to f. It follows from (2) that the ANOVA decomposition (1) is orthogonal: ∫_{[0,1]^d} f_u(x) f_v(x)\,dx = 0 whenever u ≠ v. Therefore,

$$\sigma^2(f) = \sum_{\emptyset \neq u \subseteq A} \sigma_u^2(f),$$

where σ²(f) = I_d(f²) − [I_d(f)]² is the variance of f and σ_u²(f) = ∫_{[0,1]^d} [f_u(x)]²\,dx is the variance of f_u (for |u| > 0). The order of the ANOVA term f_u is |u|, the cardinality of u. Let u be a non-empty subset of A. The variance corresponding to u and all its subsets is defined as

$$T_u(f) := \sum_{\emptyset \neq v \subseteq u} \sigma_v^2(f). \qquad (3)$$

The function f has effective dimension d_t in the truncation sense (truncation dimension, or 'TD' for short) if T_{\{1,\ldots,d_t\}}(f) ≥ p σ²(f); it has effective dimension d_s in the superposition sense (superposition dimension, or 'SD') if

$$\sum_{0 < |u| \le d_s} \sigma_u^2(f) \ge p\,\sigma^2(f),$$

where p ∈ (0, 1) is a given proportion (in this paper p = 0.99). The variance ratio captured by the order-ℓ ANOVA terms is denoted by R^{(ℓ)}(f) := ∑_{|u|=ℓ} σ_u²(f)/σ²(f).

2.2 The effective dimension and the integration error

To relate the effective dimension to the integration error, write f = h + (f − h), where h is an approximation to f (for instance, a sum of low-order ANOVA terms of f), and let ρ := σ²(h)/σ²(f) denote the proportion of the total variance captured by h. Suppose that the QMC rule Q_{n,d} integrates h with an error satisfying |I_d(h) − Q_{n,d}(h)| ≤ a σ(h)/√n for some a > 0. Then, by the triangle inequality,

$$|I_d(f) - Q_{n,d}(f)| \le a\,\sigma(h)/\sqrt{n} + |I_d(f-h) - Q_{n,d}(f-h)|. \qquad (5)$$

The parameter a measures how large the QMC error for h is in comparison with the MC error for h. In general, a is small: in our experience, for common low-discrepancy point sets and for n in the thousands, it is common that a < 1/10 or even a < 1/100. Now consider the second term on the right-hand side of (5). Since

$$\mathbb{E}\big[\,|I_d(f-h) - Q^{MC}_{n,d}(f-h)|^2\,\big] = \sigma^2(f-h)/n,$$

where Q^{MC}_{n,d} denotes a quadrature with random nodes x_1, ..., x_n, Chebyshev's inequality yields

$$\lambda\Big\{(x_1, \ldots, x_n) : |I_d(f-h) - Q^{MC}_{n,d}(f-h)| \le b\,\sigma(f-h)/\sqrt{n}\Big\} \ge 1 - b^{-2}.$$

Here λ is the Lebesgue measure and b is an arbitrary positive number. For instance, with b = 5 the probability of the quadrature achieving

$$|I_d(f-h) - Q^{MC}_{n,d}(f-h)| \le 5\,\sigma(f-h)/\sqrt{n}$$

is at least 0.96. Thus the error of an arbitrary realization of the random quadrature is unlikely to be much worse than σ(f − h)/√n. Experience suggests that a QMC rule rarely gives a much worse result than MC. Suppose that for some b > 0,

$$|I_d(f-h) - Q_{n,d}(f-h)| \le b\,\sigma(f-h)/\sqrt{n}.$$

From the analysis above, we then have

$$|I_d(f) - Q_{n,d}(f)| \le a\,\sigma(h)/\sqrt{n} + b\,\sigma(f-h)/\sqrt{n} = \big(a\sqrt{\rho} + b\sqrt{1-\rho}\,\big)\,\sigma(f)/\sqrt{n}.$$

As a benchmark, recall that the root mean square of the MC error is σ(f)/√n. Thus the size of the QMC error relative to the MC error depends on the following:

• How well does QMC work for h or f − h in comparison with MC? This is characterized by a or b, respectively.

• How well can f be approximated in the L_2-norm by h? (Equivalently, how small is the variance of f − h compared with that of f?) This is measured by ρ.

To get a feeling for the QMC error in comparison with the MC error, consider two cases:

• A normal case: a = 1/100, b = 1, ρ = 0.99; then |I_d(f) − Q_{n,d}(f)| ≤ 0.110 σ(f)/√n.

• An optimistic case: a = 1/1000, b = 1, ρ = 0.999; then |I_d(f) − Q_{n,d}(f)| ≤ 0.033 σ(f)/√n.

Effective dimensions can be viewed as measures of how well f can be approximated by a sum of lower-dimensional functions. For example, let h(x) := ∑_{|u| ≤ d_s} f_u(x), where d_s is the SD; then by definition ‖f − h‖_2² ≤ (1 − p)σ²(f) = 0.01 σ²(f). If d_s is small (say, d_s ≤ 2), then even if the TD d_t is large we have good reason to expect the superiority of QMC over MC.

It is useful to perform an experiment to see how the QMC error depends on the effective dimensions. Consider the functions

$$f(x) = \prod_{j=1}^{d}\big(1 + a\,\tau^j (x_j - 1/2)\big), \qquad (6)$$

where a and τ are parameters that control the effective dimension (a mainly controls the SD, while τ mainly controls the TD). Obviously, I_d(f) = 1 for all a, τ and d. The effective dimensions can easily be determined by the method in Wang and Fang (2003). Table 2.1 presents the effective dimensions and the root mean square errors (RMSE) of MC and QMC in dimension d = 50. For MC, the RMSE is computed as σ(f)/√n. For QMC, the sequence of Sobol' (1967) is used and the RMSE is computed from 30 random shifts. The ratios of the RMSE of QMC to that of MC are also given. The dependence of the QMC error on the effective dimensions is clear from the comparisons. In summary, the superiority of QMC is observed for two classes of functions:

• The class of functions with small TD (see the columns with τ = 0.1 and τ = 0.5 in Table 2.1). The efficiency of QMC for this class of functions is especially high.

• The class of functions with small SD (especially if the order-1 part plays the major role). In this case the TD can be large (see the row with a = 0.1 in Table 2.1). Note that when τ = 1 the function f depends equally on all the variables, but different orders of ANOVA terms may contribute quite differently to f.

We do not observe the superiority of QMC if both the TD and SD are large (say, d_t > 30 and d_s > 5). In fact, in such cases (see the case a = 10 and τ = 1), both MC and QMC may give unreasonable results; this can be explained by the large variance: σ²(f) ≈ 3.18 × 10^48 for a = 10, τ = 1 and d = 50. It makes no sense to compare their RMSEs in this case. Note that it is shown in Owen (2002) that low superposition dimension is necessary for QMC to be much better than MC with practical sample sizes n.

             τ = 0.1           τ = 0.5           τ = 0.8           τ = 0.9           τ = 1
a = 0.1
 (d_t,d_s)   (2, 1)            (4, 1)            (11, 1)           (22, 1)           (50, 2)
 MC          2.27e-5           1.30e-4           3.01e-4           4.66e-4           1.61e-3
 QMC         1.82e-7 (0.008)   1.38e-6 (0.011)   8.47e-6 (0.028)   4.29e-5 (0.092)   6.71e-4 (0.416)
a = 1
 (d_t,d_s)   (2, 1)            (4, 1)            (11, 2)           (23, 3)           (50, 9)
 MC          2.27e-4           1.31e-3           3.10e-3           5.05e-3           5.73e-2
 QMC         1.83e-6 (0.008)   1.64e-5 (0.013)   6.75e-4 (0.218)   4.24e-3 (0.840)   8.00e-2 (1.400)
a = 10
 (d_t,d_s)   (2, 1)            (5, 3)            (17, 8)           (39, 15)          (50, 49)
 MC          2.27e-3           1.66e-2           3.12e-1           3.51e+1           1.39e+22
 QMC         1.98e-5 (0.009)   1.21e-3 (0.073)   2.78e-1 (0.891)   2.69e+1 (0.769)   4.27e+20 (---)

Table 2.1. The TD and SD (d_t, d_s) and the RMSE of MC and QMC for function (6) in dimension d = 50 with n = 2^14. The ratio of the QMC RMSE to the MC RMSE is given in parentheses.
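The experiment summarized in Table 2.1 can be reproduced in outline with a short script. The following is a minimal sketch, not the authors' code; it assumes SciPy's Sobol' generator and uses random shifts modulo 1 as the randomization for the QMC error estimate.

```python
import numpy as np
from scipy.stats import qmc

def f6(x, a, tau):
    """Test function (6): prod_j (1 + a*tau^j*(x_j - 1/2)); exact integral is 1."""
    w = a * tau ** np.arange(1, x.shape[1] + 1)
    return np.prod(1.0 + w * (x - 0.5), axis=1)

def rmse_mc(a, tau, d=50, n=2**14, reps=30, seed=0):
    rng = np.random.default_rng(seed)
    errs = [f6(rng.random((n, d)), a, tau).mean() - 1.0 for _ in range(reps)]
    return float(np.sqrt(np.mean(np.square(errs))))

def rmse_qmc(a, tau, d=50, n=2**14, reps=30, seed=0):
    rng = np.random.default_rng(seed)
    base = qmc.Sobol(d, scramble=False).random(n)      # deterministic Sobol' points
    errs = []
    for _ in range(reps):                               # random shifts modulo 1
        shifted = (base + rng.random(d)) % 1.0
        errs.append(f6(shifted, a, tau).mean() - 1.0)
    return float(np.sqrt(np.mean(np.square(errs))))

for a in (0.1, 1.0, 10.0):
    for tau in (0.1, 0.5, 0.8, 0.9, 1.0):
        print(a, tau, rmse_mc(a, tau), rmse_qmc(a, tau))
```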

3 Computing the effective dimensions

3.1 Estimating the effective dimensions and variance ratios

In some specific cases, simple formulas are available for computing the effective dimensions and variance ratios, but in the general case numerical algorithms have to be used. A numerical algorithm to determine the TD of an arbitrary function is given in Wang and Fang (2003). For any fixed set u ⊆ A, write x = (x_u, x_{A−u}) and y = (y_u, y_{A−u}). The variance corresponding to u and all its subsets, defined in (3), can be expressed as (see Sobol' 2001)

$$T_u(f) = \int_{[0,1]^{2d-|u|}} f(x)\,f(x_u, y_{A-u})\,dx\,dy_{A-u} - [I_d(f)]^2. \qquad (7)$$

The TD can be determined by computing T_u(f) with u = {1, ..., ℓ} for ℓ = 1, 2, .... Determining the SD of an arbitrary function is much more difficult, since we have to compute all the variances σ_u²(f) for subsets u up to some order. In principle, this can be done as follows. For |u| = 1, σ_u²(f) = T_u(f), so it can be computed directly by (7). For |u| = 2, say u = {i, j}, σ²_{\{i,j\}}(f) can be estimated from

$$T_{\{i,j\}}(f) = \sigma^2_{\{i\}}(f) + \sigma^2_{\{j\}}(f) + \sigma^2_{\{i,j\}}(f),$$

where T_{\{i,j\}}(f) can be computed by (7). Continuing this process, all the variances σ_u²(f) for subsets u up to some order can be computed. For large d, such an approach is computationally expensive; moreover, for large |u| (say, |u| ≥ 3) there is often a loss of accuracy. If f is a symmetric function the computation can be simplified, since then σ_u²(f) depends on u only through |u|. For large d, one strategy for verifying whether a function has low SD is to compute the variance ratios of the low-order ANOVA terms, such as the order-1 and order-2 terms. An alternative is to compute the mean dimension in the superposition sense (see Owen 2003, Liu and Owen 2003).
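As an illustration of this procedure, the following sketch (a minimal Monte Carlo implementation of Sobol's formula (7), not the authors' code; the example function and sample sizes are arbitrary) estimates T_u(f) and, from it, the order-1 variance ratios σ²_{j}(f)/σ²(f).

```python
import numpy as np

def estimate_Tu(f, d, u, n=2**16, seed=0):
    """Monte Carlo estimate of T_u(f) via formula (7): freeze x_u, resample x_{A-u}."""
    rng = np.random.default_rng(seed)
    x = rng.random((n, d))
    y = rng.random((n, d))
    y[:, list(u)] = x[:, list(u)]          # second point shares the coordinates in u
    mean = f(x).mean()                     # estimate of I_d(f)
    return float(np.mean(f(x) * f(y)) - mean**2)

def order1_ratios(f, d, n=2**16, seed=0):
    """Ratios sigma_{j}^2(f) / sigma^2(f), using sigma_{j}^2 = T_{j} for singletons."""
    rng = np.random.default_rng(seed)
    var = f(rng.random((n, d))).var()
    return [estimate_Tu(f, d, [j], n, seed + 1 + j) / var for j in range(d)]

# Example: function (6) with a = 1, tau = 0.5 in dimension d = 8
g = lambda x: np.prod(1.0 + 0.5 ** np.arange(1, x.shape[1] + 1) * (x - 0.5), axis=1)
print(order1_ratios(g, d=8))
```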

3.2 A special class of functions

Computing T_u(f) is an important step in determining both the TD and the SD. The dimension of the resulting integrals for T_u(f) can be very large. We show that such integrals can be reduced to 3-dimensional integrals in some cases. In fact, some finance problems lead to integrals of the form (see the next section for examples)

$$J := \int_{\mathbb{R}^d} \max\big(0, e^{F(z)} - K\big)\,p_d(z)\,dz \quad \text{with} \quad F(z) = \sum_{j=1}^{d} a_j z_j, \qquad (8)$$

where a_j, K are constants and

$$p_d(z) = (2\pi)^{-d/2}\exp\Big(-\frac{1}{2}\sum_{j=1}^{d} z_j^2\Big) \qquad (9)$$

is the density of the standard normal distribution. The integral (8) can be written as a d-dimensional integral over the unit cube [0,1]^d:

$$J = \int_{[0,1]^d} f(x)\,dx \quad \text{with} \quad f(x) = \max\big(0, \exp\big(F(\Phi^{-1}(x_1), \ldots, \Phi^{-1}(x_d))\big) - K\big),$$

where $\Phi(x) := \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^2/2}\,dt$ is the normal probability integral and Φ^{-1} is its inverse. The integral (8) can be reduced to a one-dimensional integral, thereby allowing a useful test of the efficiency of d-dimensional QMC algorithms for such integrals. For the vector z = (z_1, ..., z_d)^T with density p_d(z), the variable ∑_{j=1}^{d} a_j z_j is a normal random variable with mean zero and variance b² := ∑_{j=1}^{d} a_j². Put w := b^{-1}∑_{j=1}^{d} a_j z_j, so that w is a standard normal random variable. Then we can write

$$J = \int_{-\infty}^{\infty} \max\big(0, e^{bw} - K\big)\,p_1(w)\,dw = \int_{0}^{1} \max\big(0, e^{b\Phi^{-1}(x)} - K\big)\,dx.$$

A similar method can be used to transform the integral involved in T_u(f) (see (7)). Indeed, for any given non-empty set u ⊆ A, put b_u² := ∑_{j∈u} a_j². According to Sobol's formula (7), we have

$$T_u(f) + [I_d(f)]^2 = \int_0^1\int_0^1\int_0^1 F_u(x, y)\,F_u(x, z)\,dx\,dy\,dz,$$

where F_u(x, y) = max(0, exp[b_u Φ^{-1}(x) + b_{A−u} Φ^{-1}(y)] − K). Thus the integrals involved in T_u(f) can be reduced to 3-dimensional integrals, which reduces the computational cost significantly. This method will be used in the next section.
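For concreteness, here is a sketch of the reduction (not the authors' implementation; it assumes SciPy's norm.ppf for Φ⁻¹ and a scrambled Sobol' point set in [0,1]³, and the weights a_j, the set u and the strike K are illustrative inputs).

```python
import numpy as np
from scipy.stats import norm, qmc

def Fu(x, y, b_u, b_cu, K):
    """F_u(x, y) = max(0, exp[b_u Phi^{-1}(x) + b_{A-u} Phi^{-1}(y)] - K)."""
    return np.maximum(0.0, np.exp(b_u * norm.ppf(x) + b_cu * norm.ppf(y)) - K)

def Tu_reduced(a, u, K, n=2**16, seed=0):
    """Estimate T_u(f) for the integrand in (8) via the 3-dimensional reduction."""
    a = np.asarray(a, dtype=float)
    in_u = np.zeros(a.size, dtype=bool)
    in_u[list(u)] = True
    b_u, b_cu = np.sqrt((a[in_u] ** 2).sum()), np.sqrt((a[~in_u] ** 2).sum())
    p = qmc.Sobol(3, scramble=True, seed=seed).random(n)       # points in [0,1]^3
    triple = np.mean(Fu(p[:, 0], p[:, 1], b_u, b_cu, K) *
                     Fu(p[:, 0], p[:, 2], b_u, b_cu, K))
    # I_d(f) equals the one-dimensional integral of max(0, e^{b Phi^{-1}(x)} - K)
    b = np.sqrt((a ** 2).sum())
    x1 = qmc.Sobol(1, scramble=True, seed=seed + 1).random(n).ravel()
    Id = np.mean(np.maximum(0.0, np.exp(b * norm.ppf(x1)) - K))
    return float(triple - Id ** 2)

# Example: equal illustrative weights in dimension d = 16, u = {1, 2}, K = 1
print(Tu_reduced(a=np.full(16, 0.05), u=[0, 1], K=1.0))
```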

4 High-dimensional finance problems are often of low effective dimension

In this section we investigate empirically the nature of option pricing problems by estimating the SD (superposition dimension). The TD (truncation dimension) for these problems has been computed in Wang and Fang (2003). It turned out that the TD is almost as large as the nominal dimension d (if the Brownian motion is generated by the standard method), and thus the success of QMC in this case cannot be explained by a small TD.

4.1 Path-dependent options

Consider the pricing of a path-dependent option with payoff g(S_{t_1}, ..., S_{t_d}), where S_{t_1}, ..., S_{t_d} are the prices of the underlying asset at times t_1, ..., t_d. Suppose the prices are sampled at equally spaced times t_0 = 0, t_j = t_{j−1} + Δt, j = 1, ..., d, Δt = T/d, where T is the expiration date. For simplicity of presentation, assume that under the risk-neutral measure (i.e., the equivalent martingale measure) the underlying asset follows a geometric Brownian motion

$$dS_t = r S_t\,dt + \sigma S_t\,dB_t, \qquad (10)$$

where r is the risk-free interest rate, σ is the volatility and B_t is a standard Brownian motion. The solution of (10) is

$$S_t = S_0 \exp\big((r - \sigma^2/2)\,t + \sigma B_t\big). \qquad (11)$$

Based on risk-neutral valuation (see Hull 2001), the value of the option at t = 0 is E[e^{−rT} g(S_{t_1}, ..., S_{t_d})], where E[·] is the expectation under the risk-neutral measure. Let (y_1, ..., y_d)^T := (B_{t_1}, ..., B_{t_d})^T. This vector is normally distributed with mean zero and covariance matrix V = (min(t_i, t_j))_{i,j=1}^{d}. Defining μ_j = log S_0 + (r − σ²/2) t_j, from (11) we have S_{t_j} = exp(μ_j + σ y_j). The payoff can be written as g(S_{t_1}, ..., S_{t_d}) = g(e^{μ_1+σy_1}, ..., e^{μ_d+σy_d}) =: H(y). The value of the option at time t = 0 can then be written as

$$\mathbb{E}\big[e^{-rT} g(S_{t_1}, \ldots, S_{t_d})\big] = \int_{\mathbb{R}^d} H(y)\,\frac{e^{-rT}\,e^{-\frac{1}{2} y^T V^{-1} y}}{(2\pi)^{d/2}\sqrt{\det V}}\,dy = e^{-rT}\int_{\mathbb{R}^d} H(Az)\,p_d(z)\,dz,$$

where a linear transformation (y_1, ..., y_d)^T = A(z_1, ..., z_d)^T is introduced using an arbitrary d × d matrix A satisfying AA^T = V. This change of variables can be interpreted as a covariance matrix decomposition V = AA^T, or as a way of generating the Brownian motion:

$$(B_{t_1}, \ldots, B_{t_d})^T = A\,(z_1, \ldots, z_d)^T, \qquad (12)$$

where z_j ∼ N(0,1), j = 1, ..., d, are i.i.d. standard normal variables. The standard construction (or sequential sampling) generates the Brownian motion sequentially in time:

$$B_0 = 0, \qquad B_{t_j} = B_{t_{j-1}} + \sqrt{\Delta t}\,z_j, \quad z_j \sim N(0,1), \quad j = 1, \ldots, d. \qquad (13)$$

Consider Asian call options based on the geometric or arithmetic average of the underlying asset. Their terminal payoffs are $\max\big(0, \prod_{j=1}^{d} S_{t_j}^{1/d} - K\big)$ and $\max\big(0, \frac{1}{d}\sum_{j=1}^{d} S_{t_j} - K\big)$, respectively, where K is the strike price at expiry T. For the geometric average, we have

$$\prod_{j=1}^{d} S_{t_j}^{1/d} = \exp\Big(m + \frac{\sigma}{d}\sum_{j=1}^{d} y_j\Big) = \exp\Big(m + \frac{\sigma}{d}\sum_{k=1}^{d} A_k z_k\Big),$$

where A_k = ∑_{j=1}^{d} a_{jk} is the sum of the k-th column of A (a_{jk} denoting the elements of A), and

$$m := m_d = \frac{1}{d}\sum_{j=1}^{d} \mu_j = \log S_0 + \frac{1}{2}\Big(r - \frac{\sigma^2}{2}\Big)(T + \Delta t). \qquad (14)$$

Thus for the geometric average the resulting integral E[e^{−rT} g(S_{t_1}, ..., S_{t_d})] has the special form (8), apart from a constant factor, and the computation of the variance ratios can be significantly simplified as shown in Section 3.2.
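To make the construction concrete, the following pricing sketch (a minimal example, not the authors' code; parameter values mirror Table 4.1, and SciPy's norm.ppf and Sobol' generator are assumed) values the geometric Asian call using the standard construction (13).

```python
import numpy as np
from scipy.stats import norm, qmc

def geo_asian_call(S0=100.0, K=100.0, r=0.1, sigma=0.2, T=1.0, d=16, n=2**14, seed=0):
    """Geometric Asian call via the standard (sequential) construction (13) and Sobol' points."""
    dt = T / d
    t = dt * np.arange(1, d + 1)
    x = qmc.Sobol(d, scramble=True, seed=seed).random(n)     # points in [0,1]^d
    z = norm.ppf(x)                                          # i.i.d. standard normals
    B = np.cumsum(np.sqrt(dt) * z, axis=1)                   # Brownian path, eq. (13)
    S = S0 * np.exp((r - 0.5 * sigma**2) * t + sigma * B)    # asset prices, eq. (11)
    geo_avg = np.exp(np.log(S).mean(axis=1))                 # geometric average of the path
    return float(np.exp(-r * T) * np.maximum(geo_avg - K, 0.0).mean())

print(geo_asian_call())
```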

We will also consider other path-dependent options: look-back and barrier options. For a European look-back call option the payoff is

$$g(S_{t_1}, \ldots, S_{t_d}) = S_T - \min\{S_{t_1}, \ldots, S_{t_d}\}.$$

For a European down-and-out call option with barrier B_a the payoff is

$$g(S_{t_1}, \ldots, S_{t_d}) = \begin{cases} \max(0, S_T - K), & \text{if } S_{t_j} > B_a \text{ for all } j = 1, \ldots, d, \\ 0, & \text{if } S_{t_j} \le B_a \text{ for some } j = 1, \ldots, d. \end{cases}$$

In our investigations the Brownian motion is generated by the standard construction (13). All integrals involved in the computation of the variance ratios are approximated by QMC based on Sobol' points with n = 2^18. In Tables 4.1 and 4.2 we present the variance ratios captured by the order-1 and order-2 ANOVA terms. (Due to integration and rounding errors, in some cases 100 R^{(1)}(f) + 100 R^{(2)}(f) > 100.) For geometric Asian options (Table 4.1), we observe that for K = 90 and K = 100 the order-1 and order-2 terms capture more than 99% of the total variance; for K = 110, the percentage is more than 94%. The strike price K has some effect on the variance ratios: when K increases, R^{(1)}(f) decreases while R^{(2)}(f) increases; however, their sum stays close to 1. The computation of the variance ratios for the other options is more difficult, since the method of Section 3.2 cannot be used. We have to use the method of Section 3.1 and must restrict the value of d. We observe a similar phenomenon for these options (Table 4.2).

             K = 90              K = 100             K = 110
Dimension    order-1   order-2   order-1   order-2   order-1   order-2
    8        95.58     4.00      85.00     13.74     65.79     29.93
   16        95.38     4.40      83.20     15.60     60.76     33.99
   32        95.26     4.78      82.22     17.81     57.99     37.71
   64        95.12     5.05      81.63     19.40     56.46     40.38

Table 4.1. The variance ratios (in percentage) captured by the order-1 and order-2 ANOVA terms for geometric Asian options: S_0 = 100, σ = 0.2, r = 0.1, T = 1 year.

                                      K = 90              K = 100             K = 110
Options                   Dimension   order-1   order-2   order-1   order-2   order-1   order-2
Arithmetic Asian option       4       96.16     3.56      88.52     10.67     74.89     22.90
                              8       95.12     4.99      84.53     16.12     65.02     35.17
                             16       94.71     5.10      82.22     17.30     60.10     38.42
Look-back option              4       92.68     6.12      92.68     6.12      92.68     6.12
                              8       89.49     6.59      89.49     6.59      89.49     6.59
                             16       88.56     7.21      88.56     7.21      88.56     7.21
Barrier option                4       92.70     6.17      81.09     16.49     60.94     32.97
                              8       91.90     4.87      76.61     19.38     47.94     41.49
                             16       90.75     4.04      74.31     21.56     46.37     43.72

Table 4.2. The same as Table 4.1, but for arithmetic Asian options, look-back options and barrier options (B_a = K − 10). Look-back options do not depend on the strike price.


4.2 Multi-asset options

Consider a European multi-asset derivative with terminal payoff φ(S_T^1, ..., S_T^d), where S_T^1, ..., S_T^d are the prices of d risky assets at time T. Assume that under the risk-neutral measure the prices (S_t^1, ..., S_t^d) of the assets satisfy

$$dS_t^j = r S_t^j\,dt + \sigma_j S_t^j\,dB_t^j, \qquad j = 1, \ldots, d,$$

where the σ_j are volatilities and B_t^1, ..., B_t^d are correlated Brownian motions with correlations ρ_{ij}. The solutions of these stochastic differential equations are

$$S_t^j = S_0^j \exp\big((r - \sigma_j^2/2)\,t + \sigma_j B_t^j\big), \qquad j = 1, \ldots, d. \qquad (15)$$

Note that the random vector (B_T^1, ..., B_T^d)^T =: (y_1, ..., y_d)^T is normally distributed with mean zero and covariance matrix

$$\Sigma = (\rho_{ij}\,T)_{i,j=1}^{d}. \qquad (16)$$

Put ν_j := log S_0^j + (r − σ_j²/2) T. Since S_T^j = e^{ν_j + σ_j y_j}, the payoff can be written as φ(S_T^1, ..., S_T^d) = φ(e^{ν_1+σ_1 y_1}, ..., e^{ν_d+σ_d y_d}) =: G(y). Therefore, the current price of the derivative security can be expressed as

$$\mathbb{E}\big[e^{-rT}\phi(S_T^1, \ldots, S_T^d)\big] = e^{-rT}\int_{\mathbb{R}^d} G(Az)\,p_d(z)\,dz,$$

where A is any real d × d matrix satisfying AA^T = Σ. The matrix A corresponds to a method of generating the correlated Brownian motions: (B_T^1, ..., B_T^d)^T = A(z_1, ..., z_d)^T. The standard method corresponds to the Cholesky decomposition Σ = AA^T; this is the choice we use for our analysis and numerical computations. For a European call option on the geometric average over the d assets, we have

$$\phi(S_T^1, \ldots, S_T^d) = \max\Big(0, \exp\Big(\frac{1}{d}\sum_{j=1}^{d}(\nu_j + \sigma_j y_j)\Big) - K\Big). \qquad (17)$$

Just as for the geometric Asian option, the corresponding integral has the special form (8), allowing a faster computation of the variance ratios. For a call option on the arithmetic average over the d assets, the payoff is φ(S_T^1, ..., S_T^d) = max(0, (1/d)∑_{j=1}^{d} S_T^j − K). In Tables 4.3 and 4.4 we present the computational results. The conclusion is similar to that for the path-dependent options: for K = 90 and K = 100, the order-1 and order-2 terms capture over 99% of the total variance; for K = 110, the percentage is larger than 97%. A new feature is that as the nominal dimension increases, the sum of the variance ratios of the order-1 and order-2 terms may increase (for example, for K = 90 and 100). We also observe that, regarding the variance ratios, there is not much difference between the geometric-average and arithmetic-average options (though their analytical tractability is different).

             K = 90              K = 100             K = 110
Dimension    order-1   order-2   order-1   order-2   order-1   order-2
    8        97.34     2.45      90.03     9.11      74.38     23.07
   16        97.63     2.51      90.13     9.23      72.84     24.68
   32        97.74     2.80      90.22     9.43      72.04     26.82
   64        97.72     2.22      90.22     9.71      71.56     28.60

Table 4.3. The variance ratios (in percentage) captured by the order-1 and order-2 ANOVA terms for geometric multi-asset options: S_0^j = 100, σ_j = 0.2, ρ_{ij} = 0.3 (i ≠ j), r = 0.1, T = 1.

             K = 90              K = 100             K = 110
Dimension    order-1   order-2   order-1   order-2   order-1   order-2
    4        98.19     1.36      92.96     5.95      81.94     16.11
    8        97.52     3.36      91.91     7.99      77.62     22.19
   16        97.23     3.64      91.12     8.63      75.57     23.38

Table 4.4. The same as Table 4.3, but for arithmetic multi-asset options.
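For concreteness, the following sketch (a minimal example, not the authors' code; parameter values mirror Table 4.3, and NumPy's Cholesky factorization plus SciPy's Sobol' generator are assumed) generates the correlated terminal Brownian increments via Σ = AA^T and prices the geometric-average call (17).

```python
import numpy as np
from scipy.stats import norm, qmc

def geo_basket_call(d=16, S0=100.0, K=100.0, r=0.1, sigma=0.2, rho=0.3, T=1.0, n=2**14, seed=0):
    """Geometric-average call on d assets, using the Cholesky factor of Sigma = (rho_ij T)."""
    Sigma = T * (rho * np.ones((d, d)) + (1.0 - rho) * np.eye(d))   # rho_ij = rho for i != j
    A = np.linalg.cholesky(Sigma)                                    # Sigma = A A^T
    z = norm.ppf(qmc.Sobol(d, scramble=True, seed=seed).random(n))   # i.i.d. N(0,1)
    y = z @ A.T                                                      # correlated (B_T^1, ..., B_T^d)
    nu = np.log(S0) + (r - 0.5 * sigma**2) * T                       # identical parameters for all assets
    geo_avg = np.exp(nu + sigma * y.mean(axis=1))                    # the exponential in payoff (17)
    return float(np.exp(-r * T) * np.maximum(geo_avg - K, 0.0).mean())

print(geo_basket_call())
```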

5 Why are high-dimensional finance problems often of low effective dimension?

Very high-dimensional problems often occur in finance because of small time increments and/or a large number of assets or other state variables. In the previous section we found numerically that these nominally high-dimensional problems are often of low SD (even when dimension reduction techniques are not used). Here we investigate theoretically the reason for this phenomenon, i.e., why the effective dimension is small. It is important to know what controls the complexity of the problems, and how the quantities relating to dimension (e.g., variance ratios and effective dimensions) change as the number of time steps and/or the number of risky assets increases (or even tends to infinity). We also explore how the model parameters affect the dimension structure of the integrands. We consider not only option pricing problems but also bond valuation problems. Note that our purpose is not to solve the pricing or valuation problems themselves under complicated models; rather, we are trying to understand the reason for the special property (e.g., low SD) under simplified models or in situations where closed-form solutions may also exist.

5.1 Option pricing

Consider first the case of a geometric Asian option. In the standard construction of Brownian motion (13), the generating matrix A in (12) is just the Cholesky factor of the covariance matrix V = (min(t_i, t_j))_{i,j=1}^{d}:

$$A = \sqrt{\Delta t}\begin{pmatrix} 1 & 0 & \cdots & 0 \\ 1 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 1 \end{pmatrix}.$$

Let A_j be the sum of the j-th column of A. From the discussion in Section 4.1, the price of the geometric Asian call option can be written as

$$e^{-rT}\int_{[0,1]^d} \max\Big(0, \exp\Big(m + \sum_{j=1}^{d} a_j \Phi^{-1}(x_j)\Big) - K\Big)\,dx, \qquad (18)$$

where

$$a_j = \frac{\sigma}{d} A_j = \sigma\,\frac{d-j+1}{d}\sqrt{\Delta t} = \sigma\,\frac{d-j+1}{d\sqrt{d}}\sqrt{T}, \qquad j = 1, \ldots, d. \qquad (19)$$

For a given T, the nominal dimension d is inversely proportional to the time step Δt (since d = T/Δt): the smaller the time step Δt, the larger the dimension d. The integrand in (18) depends on the parameters a_j (the "weights"), which in turn depend on the time step Δt: as Δt becomes smaller, the weights a_j decrease (the weights also depend on the model parameters σ and T). We shall show that these weights control the relative importance of different (groups of) variables and play an essential role in characterizing the complexity of the problem. These weights are not introduced artificially, but are determined by the nature of the problem and by the generation of the Brownian motion. The weights are small when the step length Δt is small, i.e., when the dimension d is large. To see the potential influence of the parameters a_j, it is convenient to consider first the extreme case K = 0. The general case K ≠ 0 will be studied in the next subsection. For K = 0, the integrand in (18) can be written as (excluding a constant factor)

$$f(x) := \prod_{j=1}^{d} g_j(x_j) \quad \text{with} \quad g_j(x_j) = \exp\big(a_j \Phi^{-1}(x_j)\big). \qquad (20)$$

(An arbitrary construction (12) of the Brownian motion leads to a function of the same form, but with different parameters a_j.) By direct calculation we then have

$$m_j := \int_0^1 g_j(x)\,dx = e^{\frac{1}{2}a_j^2} \quad \text{and} \quad \lambda_j^2 := \int_0^1 \big(g_j(x) - m_j\big)^2\,dx = e^{a_j^2}\big(e^{a_j^2} - 1\big).$$

It follows that $I_d(f) = \prod_{j=1}^{d} m_j$ and

$$\sigma^2(f) = \prod_{j=1}^{d}\big(m_j^2 + \lambda_j^2\big) - \prod_{j=1}^{d} m_j^2 = [I_d(f)]^2\Big(e^{\sum_{j=1}^{d} a_j^2} - 1\Big). \qquad (21)$$

The ANOVA terms of f and the corresponding variances (for u ≠ ∅) are

$$f_u = \prod_{j \in u}\big(g_j(x_j) - m_j\big)\cdot\prod_{j \notin u} m_j, \qquad \sigma_u^2(f) = \prod_{j \in u}\lambda_j^2\,\prod_{j \notin u} m_j^2 = [I_d(f)]^2\prod_{j \in u}\big(e^{a_j^2} - 1\big).$$

The relative importance of the term f_u to f can be measured by the sensitivity index

$$S_u(f) = \frac{\sigma_u^2(f)}{\sigma^2(f)} = \frac{\prod_{j \in u}\big(e^{a_j^2} - 1\big)}{e^{\sum_{j=1}^{d} a_j^2} - 1}, \qquad (22)$$

indicating how the parameters a_j control the relative importance of different variables. The relative importance of all the order-ℓ ANOVA terms together can be measured by the variance ratio (see Section 2.1)

$$R^{(\ell)}(f) = \frac{\sum_{|u|=\ell}\sigma_u^2(f)}{\sigma^2(f)} = \frac{\sum_{|u|=\ell}\prod_{j \in u}\big(e^{a_j^2} - 1\big)}{e^{\sum_{j=1}^{d} a_j^2} - 1}. \qquad (23)$$

Before studying the relative sizes of the R^{(ℓ)}(f) and their asymptotic properties as d → ∞, we mention several important features of the parameters a_j given in (19). First, from (19) we have, for fixed T,

$$a_j = O(\sqrt{\Delta t}) = O(d^{-1/2}) \quad \text{as } \Delta t \to 0, \quad j = 1, 2, \ldots. \qquad (24)$$

It follows that

$$e^{a_j^2} - 1 = a_j^2 + a_j^4/2 + \cdots = a_j^2 + O(d^{-2}). \qquad (25)$$

Second, according to (19),

$$\sum_{j=1}^{d} a_j^2 = \frac{\sigma^2 \Delta t}{d^2}\sum_{j=1}^{d} j^2 = \frac{\sigma^2 T}{3} + O(\Delta t). \qquad (26)$$

Therefore, $\Gamma := \lim_{d \to \infty}\sum_{j=1}^{d} a_j^2 = \frac{\sigma^2 T}{3}$. Based on these properties, we see from (21) that the variance of f is stable with respect to d: $\lim_{d \to \infty}\sigma^2(f) = e^{\Gamma}(e^{\Gamma} - 1)$, implying that the nominal dimension becomes in a sense irrelevant, at least for MC. Moreover, from (25) we have $e^{a_j^2} - 1 = O(\Delta t)$. Therefore, from (22) we have, for ℓ = 1, 2, ...,

$$\sigma_u^2(f) = O\big((\Delta t)^{\ell}\big) = O(d^{-\ell}) \quad \text{for } |u| = \ell.$$

This means that a small time step Δt leads to effects that are increasingly dominated by the lower-order terms. We are particularly interested in the variance ratios captured by the low-order ANOVA terms and their asymptotic properties. The variance ratio captured by the order-1 terms is

$$R^{(1)}(f) = \frac{\sum_{j=1}^{d}\sigma_{\{j\}}^2(f)}{\sigma^2(f)} = \frac{\sum_{j=1}^{d}\big(e^{a_j^2} - 1\big)}{e^{\sum_{j=1}^{d} a_j^2} - 1} = \frac{\sum_{j=1}^{d} a_j^2 + O(\Delta t)}{e^{\sum_{j=1}^{d} a_j^2} - 1} = \frac{\Gamma}{e^{\Gamma} - 1} + O(\Delta t), \qquad (27)$$

where we have used the relations (25) and (26). The leading term is close to 1 for usual values of the volatility σ and T (see Table 5.1). Similarly, the variance ratio captured by the order-2 ANOVA terms is

$$R^{(2)}(f) = \sum_{i < j} \frac{\sigma_{\{i,j\}}^2(f)}{\sigma^2(f)}.$$
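The closed-form expressions (19), (22) and (23) are easy to evaluate numerically. The sketch below (a minimal check, not part of the paper; the parameter values σ = 0.2 and T = 1 are the illustrative ones used in the tables above) computes the weights a_j and the variance ratios R^(1)(f) and R^(2)(f) for the zero-strike case and illustrates their behaviour as d grows.

```python
import numpy as np
from itertools import combinations

def weights(d, sigma=0.2, T=1.0):
    """Weights a_j from (19) for the standard construction."""
    j = np.arange(1, d + 1)
    return sigma * (d - j + 1) / d * np.sqrt(T / d)

def variance_ratios(d, sigma=0.2, T=1.0):
    """R^(1)(f) and R^(2)(f) from (23) for the K = 0 integrand (20)."""
    c = np.expm1(weights(d, sigma, T) ** 2)                 # e^{a_j^2} - 1
    total = np.expm1((weights(d, sigma, T) ** 2).sum())     # e^{sum a_j^2} - 1
    r1 = c.sum() / total
    r2 = sum(c[i] * c[j] for i, j in combinations(range(d), 2)) / total
    return r1, r2

for d in (8, 16, 32, 64, 256):
    print(d, variance_ratios(d))
```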