To appear in Monte Carlo and Quasi-Monte Carlo Methods 2002, H. Niederreiter editor, Springer-Verlag, 2004
Minimizing Effective Dimension using Linear Transformation

Junichi Imai
Iwate Prefectural University, Faculty of Policy Studies, 152-52, Takizawa-aza-sugo, Takizawa, Iwate, JAPAN.

Ken Seng Tan
University of Waterloo, Department of Statistics and Actuarial Science, University Avenue West, Waterloo, Ontario, CANADA.

September 6, 2005
Abstract: In recent years, constructions based on the Brownian bridge [11], principal component analysis [1], and linear transformation [7] have been proposed in the context of derivative pricing to further enhance QMC through dimension reduction. Motivated by [16, 18] and the ANOVA decomposition, this paper (i) formally justifies the dimension minimizing algorithm of Imai and Tan [7], and (ii) proposes a new formulation of the linear transformation which explicitly reduces the effective dimension (in the truncation sense) of a function. A new application of the LT method to an interest rate model is also considered, and we establish the conditions under which the linear transformation method outperforms PCA. The method is not only effective at dimension reduction; it is also robust and can easily be extended to general diffusion processes.
1 Introduction and Motivation
In recent years, quasi-Monte Carlo (QMC) methods have been gaining popularity in the area of computational finance. This is attributed to their overwhelming success when applied to pricing complex derivative securities, even in cases of very high dimension. These results have presented something of a puzzle in the fields of computational finance and numerical analysis due to an apparent conflict: QMC attains a convergence rate of $O(N^{-1} (\log N)^s)$ in dimension $s$, which is better than the Monte Carlo (MC) rate only when $N$ grows exponentially with the dimension $s$. Hence the theoretically higher asymptotic convergence rate of QMC should not be achievable in practical applications, particularly for large $s$. This is supported by empirical evidence from non-finance applications; for example, [2] concludes that QMC offers no practical advantage over MC even for problems with dimensions as low as 12. Nevertheless, when the same method is used to price derivative securities, a much better empirical rate of convergence relative to MC is observed, even for dimensions of several hundred! Several explanations have been offered to reconcile these conflicting results. Based on the notions of tractability and strong tractability, Sloan and Woźniakowski [14, 15] showed that there exist QMC algorithms for which the curse of dimensionality is not present in some weighted function classes. Papageorgiou [13] and Owen [12] demonstrated the superiority of QMC for some isotropic integrals. Moskowitz and Caflisch [11] and Caflisch, Morokoff and Owen [3] used the concept of effective dimension to argue that the efficiency of QMC is tied to the effective dimension, and that finance-related problems are typically of low effective dimension.
More recently, several techniques including those based on the Brownian bridge (Moskowitz and Caflisch [11]), principal component analysis (Acworth, Broadie and Glasserman [1]), and the linear transformation (Imai and Tan [7]) have been proposed to further enhance QMC through dimension reduction. Motivated by Sobol’ [16] and Wang and Fang [18], this paper formally justifies the linear transformation approach introduced in [7]. Based on the ANOVA decomposition, this paper also proposes another formulation of the linear transformation. A new application of this method to an interest rate model is considered. The paper is organized as follows. Section 2 reviews the concept of effective dimension and explores its connection with QMC integration errors. Section 3 discusses dimension reduction techniques associated with derivative pricing. Section 4 describes the new formulation of the linear transformation, whose efficiency relative to the existing methods is assessed numerically. Section 5 considers a new application of the linear transformation to pricing interest rate dependent securities. Section 6 concludes the paper.
2 Effective Dimension
This section describes the ANOVA (“analysis of variance”) approach of decomposing an integrand $f(x)$, $x \in [0,1)^s$, into a sum of simpler functions, as well as its connection with effective dimensions. For a detailed description of the ANOVA decomposition of functions, see [5]. We begin by introducing some notation. The set $A = \{1, 2, \ldots, s\}$ denotes the coordinate axes of $[0,1)^s$. For any subset $u \subseteq A$, we define $|u|$ as its cardinality and $A - u$ as its complementary set. Also, let $[0,1)^u$ denote the $|u|$-dimensional unit cube with coordinates in $u$, and let $x_u$ be the $|u|$-dimensional projection of $x$ onto the components in $u$. Under the mild condition that $f(x)$ is square integrable, the ANOVA decomposition expresses the integrand $f(x)$ as a sum of $2^s$ additive functions:

    f(x) = \sum_{u \subseteq \{1,2,\ldots,s\}} f_u(x),    (1)
where the function $f_u$, which depends only on the components of $x$ in the set $u$, is defined recursively by

    f_u(x) = \int_{[0,1)^{A-u}} f(x)\, dx_{A-u} - \sum_{v \subsetneq u} f_v(x),    (2)
with the usual convention that $f_\emptyset(x) = \int_{[0,1)^s} f(x)\, dx = I(f)$. The ANOVA decomposition is orthogonal in that $\int f_u(x) f_v(x)\, dx = 0$ for $u \neq v$. Let $\sigma^2(f)$ and $\sigma_u^2(f)$ denote the variance of $f$ and $f_u$, respectively. Formally, these two quantities are defined as $\sigma^2(f) = \int_{[0,1)^s} (f(x) - I(f))^2\, dx$ and $\sigma_u^2(f) = \int_{[0,1)^u} [f_u(x)]^2\, dx_u$ for $|u| > 0$, with $\sigma_\emptyset^2 = 0$. Based on the ANOVA decomposition (1), an alternative way of computing $\sigma^2(f)$ is via $\sigma^2(f) = \sum_{|u|>0} \sigma_u^2(f)$. Associated with $\sigma_u^2(f)$, we define $D_u$ as the total variance corresponding to the subset $u$; i.e.,

    D_u = \sum_{v \subseteq u} \sigma_v^2(f).    (3)
The above quantity can be computed explicitly (see [16]) as

    D_u = \int_{[0,1)^{2s-|u|}} f(x) f(x_u, y_{A-u})\, dx\, dy_{A-u} - [I(f)]^2,    (4)
where $x = (x_u, x_{A-u})$ and $y = (y_u, y_{A-u})$.

In the context of QMC, it is important to distinguish between the notions of nominal and effective dimensions of a function. When a function $f(x)$ depends on $s$ variables, it is typically said to have nominal dimension $s$, whereas its effective dimension can be quite small. Motivated by the ANOVA decomposition, [3] introduced two definitions of effective dimension: The effective dimension of $f$ in the superposition sense (or simply the superposition dimension) is the smallest integer $d_S$ such that $\sum_{|u| \le d_S} \sigma_u^2(f) \ge p\, \sigma^2(f)$. The effective dimension of $f$ in the truncation sense (or simply the truncation dimension) is the smallest integer $d_T$ that satisfies $D_{\{1,2,\ldots,d_T\}}(f) \ge p\, \sigma^2(f)$. The critical level $p$ can be chosen arbitrarily but is usually close to one. Essentially, the truncation dimension indicates the number of important variables that predominantly capture the given function $f$. The superposition dimension, on the other hand, measures to what extent the low-order ANOVA terms dominate the function.

Low effective dimension occurs naturally in finance problems. For example, mortgage-backed securities depend on the contingent cash flows each month over the next 30 years, leading to a problem of nominal dimension 360. Its truncation dimension, however, is much smaller due to two factors: (i) the time value of money; a dollar in 30 years is worth a lot less than the same dollar in 1 year; and (ii) empirical evidence indicates that the majority of the cash flows occur in the initial few years. Thus the cash flows in the first few years matter most for pricing mortgage-backed securities, suggesting a relatively small effective dimension.

The quadrature error of a QMC rule involving a point set $P = \{x_i\}_{i=1}^N$, $x_i \in [0,1)^s$, is intricately dependent on the effective dimension. This follows from the error bound

    \left| \frac{1}{N} \sum_{i=1}^N f(x_i) - I \right| \le \sum_{|u|>0} D_{N,u}(P_u) \|f_u\|,    (5)
where $P_u$ is the projection of the point set $P$ on $[0,1)^{|u|}$, $D_{N,u}(P_u)$ is the discrepancy corresponding to $P_u$, and $\|f_u\|$ is the variation of $f_u$. See [6] for various suitable choices of discrepancy and variation. The bound in (5) explicitly associates the QMC error with the uniformity of all the projections $P_u$ as well as with all the low-dimensional structures $f_u$. The significance of this error bound is that QMC relies on specially constructed sequences known as low discrepancy sequences. These sequences are deterministic and are designed to have greater uniformity than random sequences. However, for a finite number of points, such “greater” uniformity is not preserved in all dimensions and for all projections. It is well known (see [10]) that as the dimension increases, the uniformity of low discrepancy sequences decreases. Nevertheless, as argued in [18], QMC can still be more effective than MC, particularly on problems with low truncation dimension. This can be justified by decomposing the bound (5) as

    \left| \frac{1}{N} \sum_{i=1}^N f(x_i) - I \right| \le \sum_{u \subseteq \{1,\ldots,d_T\}} D_{N,u}(P_u) \|f_u\| + \sum_{u \cap (A - \{1,\ldots,d_T\}) \neq \emptyset} D_{N,u}(P_u) \|f_u\|,    (6)

assuming the truncation dimension of $f$ is $d_T$. Note the role of $d_T$ in the above representation. When $d_T$ is small, the discrepancies of all the low-dimensional
projections $P_u$ of low discrepancy point sets are much smaller than those of random point sets. This implies that the first summation in (6) is much smaller for QMC than for MC. As the dimension increases further, the uniformity of low discrepancy point sets deteriorates, which implies that for larger $|u|$, $D_{N,u}(P_u)$ for QMC can be larger than for MC. Yet the second summation in (6) can be insignificant, since the quantities $\|f_u\|$ are often small. The overall effect is that for QMC both terms on the right-hand side of (6) can be small, justifying that QMC can be more effective than MC when $f$ has low truncation dimension.
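To make these notions concrete, the following sketch (in Python with NumPy; the test function, its weights, and the use of a Halton point set are illustrative assumptions, not anything taken from this paper) works with $f(x) = \sum_j 2^{-j} x_j$, whose rapidly decaying weights give a low truncation dimension. For such a linear function of independent uniforms only singleton ANOVA terms survive, so $d_T$ is available in closed form; the sketch cross-checks $D_u$ via a Monte Carlo evaluation of formula (4), and then compares the integration error of a low discrepancy (Halton) point set against plain MC.

```python
import numpy as np

rng = np.random.default_rng(0)
s, N, p = 8, 500_000, 0.99
w = 2.0 ** -np.arange(1, s + 1)          # decaying weights -> low truncation dimension

def f(x):
    return x @ w

# For linear f of independent U(0,1)'s only singleton ANOVA terms survive:
# sigma_{j}^2 = w_j^2 / 12, hence D_{1..k} = sum_{j<=k} w_j^2 / 12 in closed form.
sigma2_j = w**2 / 12.0
sigma2 = sigma2_j.sum()
d_T = int(np.argmax(np.cumsum(sigma2_j) >= p * sigma2)) + 1
print("truncation dimension d_T =", d_T)      # -> 4 for these weights

# Independent check of D_u via formula (4), with u = {1,...,d_T}: resample
# the coordinates outside u with fresh uniforms y.
x, y = rng.random((2, N, s))
xy = np.hstack([x[:, :d_T], y[:, d_T:]])      # (x_u, y_{A-u})
D_hat = np.mean(f(x) * f(xy)) - f(x).mean() ** 2
print(D_hat / sigma2)                         # closed form gives about 0.996

# QMC vs MC on the same integrand; the exact integral is sum(w)/2.
def halton(n, dim):
    """First n points of a Halton sequence, a classical low discrepancy set."""
    bases = [2, 3, 5, 7, 11, 13, 17, 19][:dim]
    pts = np.empty((n, dim))
    for d, b in enumerate(bases):
        for i in range(1, n + 1):
            frac, v, k = 1.0, 0.0, i
            while k:
                frac /= b
                v += frac * (k % b)
                k //= b
            pts[i - 1, d] = v
    return pts

M = 1024
qmc_err = abs(f(halton(M, s)).mean() - w.sum() / 2)
mc_err = abs(f(rng.random((M, s))).mean() - w.sum() / 2)
print(qmc_err, mc_err)
```

The QMC error here is typically orders of magnitude below the MC error at the same sample size, consistent with the decomposition (6): almost all of the variance sits in the first few coordinates, where the Halton projections are highly uniform.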
3 Dimension Reduction Techniques
In this section, we consider the simulation approach for pricing path-dependent options on a portfolio of assets. Using the Black-Scholes framework, we assume that the risky assets follow a multivariate geometric Brownian motion and that their dynamics in the risk-neutral world (i.e., under the $Q$ measure) are given by the stochastic differential equations

    dS_i(t) = r S_i(t)\, dt + \sigma_i S_i(t)\, dW_i(t), \quad i = 1, 2, \ldots, m,    (7)
where $S_i(t)$ denotes the $i$-th asset price at time $t$, $r$ is the risk-free interest rate, $\sigma_i$ is the volatility of the $i$-th asset, and $\{W_i(t), t \ge 0\}$ is a standard Brownian motion corresponding to the price process of asset $i$. Furthermore, assets $i$ and $k$ are instantaneously correlated such that $\mathrm{cov}(dW_i, dW_k) = \rho_{ik}\, dt$. By partitioning the time horizon $T$ into $n$ equal fixed time intervals of length $\Delta t$ (i.e., $\Delta t = T/n$ and $t_j = j \Delta t$) and by defining $\Sigma$ as an $m \times m$ covariance matrix with $(\Sigma)_{ik} = \rho_{ik} \sigma_i \sigma_k \Delta t$, $i, k = 1, \ldots, m$, we generate an $mn \times mn$ matrix $\Sigma_{mn}$ from $\Sigma$ according to

    \Sigma_{mn} = \begin{pmatrix} \Sigma & \Sigma & \cdots & \Sigma \\ \Sigma & 2\Sigma & \cdots & 2\Sigma \\ \vdots & \vdots & \ddots & \vdots \\ \Sigma & 2\Sigma & \cdots & n\Sigma \end{pmatrix}.
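The block structure above, with block $(j,k)$ equal to $\min(j,k)\Sigma$, can be built directly with a Kronecker product. The following sketch (Python/NumPy; the parameter values for $m$, $n$, the volatilities, and the correlation are illustrative assumptions) constructs $\Sigma_{mn}$ and verifies that it is symmetric positive definite, so that the matrix decompositions used below exist.

```python
import numpy as np

m, n, T = 2, 4, 1.0                        # illustrative: 2 assets, 4 time steps
dt = T / n
sig = np.array([0.2, 0.3])                 # volatilities sigma_i
rho = np.array([[1.0, 0.5],
                [0.5, 1.0]])               # instantaneous correlations rho_ik
Sigma = rho * np.outer(sig, sig) * dt      # (Sigma)_ik = rho_ik sigma_i sigma_k dt

# Block (j,k) of Sigma_mn equals min(j,k) * Sigma, matching the display above.
J = np.minimum.outer(np.arange(1, n + 1), np.arange(1, n + 1))
Sigma_mn = np.kron(J, Sigma)               # mn x mn

assert Sigma_mn.shape == (m * n, m * n)
assert np.allclose(Sigma_mn, Sigma_mn.T)                 # symmetric
assert np.linalg.eigvalsh(Sigma_mn).min() > 0            # positive definite
assert np.allclose(Sigma_mn[m:2*m, m:2*m], 2 * Sigma)    # (2,2) block is 2*Sigma
```

The row/column ordering produced by `np.kron(J, Sigma)` stacks the $m$ assets within each time step, matching the $(j-1)m+i$ indexing used for $\tilde{C}$ later in this section.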
Then the solution to (7) is given by

    S_i(t_j) = S_i(0)\, e^{(r - \sigma_i^2/2) t_j + Z_i(t_j)},    (8)
where $(Z_1(t_1), Z_1(t_2), \ldots, Z_m(t_n)) \sim N(0, \Sigma_{mn})$. Let $\phi(S_1(t_1), \ldots, S_m(t_n))$ represent the payoff at maturity $T$ of an option which depends on the realized correlated asset prices at times $t_j$, $j = 1, \ldots, n$. It follows from the fundamental risk-neutral valuation principle that the time-0 value of the option can be expressed as the expectation (see [4])

    E_Q[e^{-rT} \phi(S_1(t_1), \ldots, S_m(t_n))],
where $E_Q[\cdot]$ represents an expectation under the risk-neutral measure $Q$. For example, for the discretely sampled arithmetic average basket (or Asian basket) call option, we have

    \phi(S_1(t_1), \ldots, S_m(t_n)) = \max\left( \sum_{i=1}^m \sum_{j=1}^n w_{ij} S_i(t_j) - K,\; 0 \right),    (9)

where $\sum_{i=1}^m \sum_{j=1}^n w_{ij} = 1$, $\{t_1, t_2, \ldots, t_n = T\}$ are the $n$ discretized time points at which the asset prices are sampled, and $K$ is a fixed strike price. Hence the time-0 value of an Asian basket option, $c_0$, becomes

    c_0 e^{rT} = \int_{[0,1)^{mn}} \max\left( \sum_{i=1}^m \sum_{j=1}^n \exp\left( \tilde{\mu}_{ij} + \sum_{k=1}^{mn} \tilde{C}_{(j-1)m+i,k} \Phi^{-1}(z_k) \right) - K,\; 0 \right) dz,    (10)
where $\tilde{\mu}_{ij} = \log(w_{ij} S_i(0)) + \left( r - \frac{\sigma_i^2}{2} \right) t_j$, $\Phi(\cdot)$ is the standard normal cumulative distribution function, and $\tilde{C}_{ij}$ is the $(i,j)$ entry of a decomposed matrix $\tilde{C}$ which satisfies

    \tilde{C} \tilde{C}' = \Sigma_{mn}.    (11)
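As an illustration of (10), the sketch below evaluates the integral by plain Monte Carlo for the single-asset case $m = 1$, where $\Sigma_{mn}$ reduces to $\{\min(t_j, t_k)\sigma^2\}_{j,k=1}^n$, taking $\tilde{C}$ to be the Cholesky factor. All parameter values are illustrative assumptions, and drawing standard normals directly plays the role of applying $\Phi^{-1}$ to uniform draws $z_k$.

```python
import numpy as np

rng = np.random.default_rng(0)
S0, K, r, sigma, T, n = 100.0, 100.0, 0.05, 0.2, 1.0, 4   # illustrative inputs
N = 400_000

t = np.arange(1, n + 1) * T / n
Sigma_n = sigma**2 * np.minimum.outer(t, t)   # m = 1: {min(t_j, t_k) sigma^2}
C = np.linalg.cholesky(Sigma_n)               # one admissible decomposition (11)
assert np.allclose(C @ C.T, Sigma_n)

w = np.full(n, 1.0 / n)                       # equal averaging weights, summing to 1
mu = np.log(w * S0) + (r - 0.5 * sigma**2) * t   # the mu~_j of formula (10)

eps = rng.standard_normal((N, n))             # eps_k plays the role of Phi^{-1}(z_k)
payoff = np.maximum(np.exp(mu + eps @ C.T).sum(axis=1) - K, 0.0)
c0 = np.exp(-r * T) * payoff.mean()
print(c0)                                     # time-0 value of the Asian call
```

Replacing `C` by any other matrix satisfying (11) leaves the estimated price unchanged up to Monte Carlo noise, which is exactly the degree of freedom the constructions below exploit.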
We now make the following remarks:

• The payoff structure (9) is quite general and encompasses many types of exotic options. When $m = 1$ and $n > 1$, the option reduces to an Asian option which depends on only one underlying asset. In this case, the covariance matrix simplifies to $\Sigma_{mn} = \{\min(t_j, t_k) \sigma^2\}_{j,k=1}^n$. With $n = 1$ and $m > 1$, the resulting option is known as a basket option and depends on the terminal prices of a portfolio of assets.

• The nominal dimension associated with pricing multi-factor path-dependent options can be very large. For instance, an Asian option on a basket of 10 assets with daily observations over a 1-year period gives rise to 2500 nominal dimensions (assuming 250 trading days per year).

• Integral (10) yields the same solution for any matrix $\tilde{C}$ satisfying (11). This is a consequence of the property that a normal distribution is completely characterized by its mean and covariance. This has important implications for both MC and QMC. While the nominal dimension of the function is not affected by the choice of the covariance matrix decomposition, the effective dimension is not invariant to the decomposed matrix. In fact a judicious choice of the matrix $\tilde{C}$ can lead to a substantial reduction in the effective dimension of the integrand in (10). This in turn potentially translates into a significant increase in the efficiency of QMC over MC, as argued via (6).
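The last remark can be checked numerically: infinitely many matrices satisfy (11), since right-multiplying any solution by an orthogonal matrix preserves the product $\tilde{C}\tilde{C}'$. The sketch below (NumPy; the single-asset covariance matrix is an illustrative example) verifies (11) for a Cholesky factor, an eigenvector-based factor, and an arbitrary orthogonal rotation of the Cholesky factor.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(1, 5) * 0.25
Sigma = 0.04 * np.minimum.outer(t, t)        # {min(t_j, t_k) sigma^2}, sigma = 0.2

C_ch = np.linalg.cholesky(Sigma)             # lower triangular factor

lam, V = np.linalg.eigh(Sigma)               # eigh returns ascending eigenvalues
order = np.argsort(lam)[::-1]                # reorder so lambda_1 >= lambda_2 >= ...
C_eig = V[:, order] * np.sqrt(lam[order])    # columns sqrt(lambda_k) v_k

Q = np.linalg.qr(rng.standard_normal((4, 4)))[0]   # an arbitrary orthogonal matrix
C_rot = C_ch @ Q                             # rotation: still a valid decomposition

for C in (C_ch, C_eig, C_rot):
    assert np.allclose(C @ C.T, Sigma)       # every candidate satisfies (11)
print("all decompositions reproduce Sigma")
```

The first two candidates correspond to the standard and PCA constructions described next; the third shows the orthogonal degree of freedom that the LT construction exploits.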
Three approaches have been proposed to enhance QMC through dimension reduction: the Brownian bridge construction [11], the principal component analysis (PCA) construction [1], and the linear transformation (LT) [7]. Here we consider only the latter two methods, since the Brownian bridge construction has only been proposed in the context of one-dimensional Brownian motion; moreover, [1] indicates that this technique is less efficient than that based on PCA.

We now describe the various constructions in terms of the decomposed matrix $\tilde{C}$. First, the conventional way of simulating the correlated Brownian motions is equivalent to resorting to the lower triangular matrix $\tilde{C} \equiv C^{Ch}$ obtained from the Cholesky decomposition of $\Sigma_{mn}$. We refer to this method as the standard or Cholesky construction. For PCA, the corresponding decomposed matrix is

    \tilde{C} \equiv C^{PCA} = \left( \sqrt{\lambda_1}\, v_1, \ldots, \sqrt{\lambda_{mn}}\, v_{mn} \right),

where $\lambda_1 \ge \cdots \ge \lambda_{mn}$ are the eigenvalues of $\Sigma_{mn}$ and $v_k$ is the unit-length eigenvector corresponding to the eigenvalue $\lambda_k$. More recently, [7] proposes another method of reducing the effective dimension by explicitly exploiting the flexibility in choosing the decomposed matrix $\tilde{C}$. It considers the class of transformations

    \tilde{C} \equiv C^{LT} = C^{Ch} A,    (12)

where $A$ is an orthogonal matrix (i.e., $AA' = I$ for the identity matrix $I$), chosen optimally so that the effective dimension of the problem of interest is minimized. This method is referred to as the linear transformation, or simply the LT, construction. Note that $C^{LT}(C^{LT})' = \Sigma_{mn}$. The LT-based method has the following properties: (i) under a particular formulation of the optimization algorithm for obtaining $A$, the LT construction is equivalent to the PCA-based approach; (ii) under the LT construction, the truncation dimension of a linear combination of normal random variables is minimized. See [7] for further discussion of property (i). Here we examine property (ii) in greater detail. Let $f(z)$ be a linear combination of normal random variables:

    f(z) = \sum_{i=1}^s w_i z_i,    (13)
where $z \sim N_s(\mu, \Sigma)$ and $w_i$, $i = 1, \ldots, s$, are constants. If $C$ denotes the decomposed matrix of $\Sigma$ and we use the notation $C_{\cdot j} \in$