
Simplified vine copula models: Approximations based on the simplifying assumption

arXiv:1510.06971v1 [stat.ME] 23 Oct 2015

FABIAN SPANHEL∗ and MALTE S. KURZ

Department of Statistics, Ludwig-Maximilians-Universität München, Akademiestr. 1, 80799 Munich, Germany.

E-mail: [email protected]; [email protected]

In the last decade, simplified vine copula models (SVCMs), or pair-copula constructions, have become an important tool in high-dimensional dependence modeling. So far, specification and estimation of SVCMs has been conducted under the simplifying assumption, i.e., all bivariate conditional copulas of the data generating vine are assumed to be bivariate unconditional copulas. For the first time, we consider the case when the simplifying assumption does not hold and an SVCM acts as an approximation of a multivariate copula. Several results concerning optimal simplified vine copula approximations and their properties are established. Under regularity conditions, step-by-step estimators of pair-copula constructions converge to the partial vine copula approximation (PVCA) if the simplifying assumption does not hold. The PVCA can be regarded as a generalization of the partial correlation matrix where partial correlations are replaced by j-th order partial copulas. We show that the PVCA does not minimize the KL divergence from the true copula and that the best approximation satisfying the simplifying assumption is given by a vine pseudo-copula. Moreover, we demonstrate how spurious conditional (in)dependencies may arise in SVCMs.

Keywords: Vine copula, Pair-copula construction, Simplifying assumption, Conditional copula, Approximation.

1. Introduction

Copulas constitute an important tool to model dependence [1, 2, 3]. While it is easy to construct bivariate copulas, the construction of flexible high-dimensional copulas is a sophisticated problem. The introduction of simplified vine copulas (Joe [4]), or pair-copula constructions (Aas et al. [5]), has been an enormous advance for high-dimensional dependence modeling. Simplified vine copulas are hierarchical structures, constructed upon a sequence of bivariate unconditional copulas, which capture the conditional dependence between pairs of random variables if the data generating process satisfies the simplifying assumption. In this case, all conditional copulas of the data generating vine collapse to unconditional copulas and the true copula can be represented in terms of a simplified vine copula. Vine copula methodology and application have been extensively developed under the simplifying assumption [6, 7, 8, 9, 10], with studies showing the superiority of simplified vine copula models over elliptical copulas and nested Archimedean copulas (Aas and Berg [11], Fischer et al. [12]).

∗ Corresponding author.

Although some copulas can be expressed as a simplified vine copula, this is not true in general. Hobæk Haff et al. [13] point out that the simplifying assumption is in general not valid and provide examples of multivariate distributions which do not satisfy it. Stöber et al. [14] show that the Clayton copula is the only Archimedean copula for which the simplifying assumption holds, while the Student-t copula is the only simplified vine copula arising from a scale mixture of normal distributions. In fact, it is very unlikely that the unknown data generating process satisfies the simplifying assumption in a strict mathematical sense. Nevertheless, to the best of our knowledge, only Hobæk Haff et al. [13] and Stöber et al. [14] briefly investigate the approximation of a three-dimensional copula by a simplified vine copula. Several important questions regarding simplified vine copula approximations (SVCAs) remain open up to this point:

• What is the best SVCA for a given vine copula? That is, what unconditional copulas should be chosen to approximate the conditional copulas? Is it optimal to specify the true bivariate copulas in the first tree, as has been suggested by Hobæk Haff et al. [13] and Stöber et al. [14]?

• What is the probability limit of simplified vine copula estimators if the simplifying assumption does not hold for the data generating process?

• What are the properties of optimal SVCAs? For instance, can inference under the simplifying assumption be misleading and create the impression that two random variables are conditionally (in)dependent when actually they are not?

In this study, we address these questions and establish several results concerning optimal SVCAs. We show under which conditions stepwise parametric estimators of pair-copula constructions converge to the PVCA. The PVCA is an SVCM in which a j-th order partial copula is assigned to every edge.
We illustrate that the PVCA can be regarded as a generalization of the partial correlation matrix and investigate several of its properties. In particular, we prove that the PVCA does not minimize the KL divergence from the true copula. This result is rather surprising, because it implies that it is not optimal to specify the true copulas in the first tree of an SVCA. That is, in order to improve the approximation of conditional copulas in higher trees, one should adequately specify "wrong" bivariate unconditional copulas in the first tree. Consequently, the approximation can be improved if one assigns pseudo-copulas to the edges of higher trees. Moreover, we show that spurious (in)dependencies can be induced in higher trees of SVCAs, with the effect that a pair of random variables may be considered conditionally (in)dependent although this is not the case. The rest of this paper is organized as follows. (Simplified) vine copulas, SVCAs, and partial copulas are discussed in Section 2. Optimal SVCAs of three-dimensional copulas, in particular the PVCA, are investigated in Section 3. In Section 4, we discuss the role of j-th order partial copulas for higher-dimensional SVCAs and define the PVCA in the higher-dimensional case. We also reveal the consequences of using unconditional copulas as building blocks when the simplifying assumption is violated and investigate the concept of j-th order partial independence. Section 5 contains some concluding remarks. The following notation and assumptions are used throughout the paper. We write X1:K := (X1, . . . , XK),


Table 1. Notation for simplified D-vine copula approximations. U1:K has standard uniform margins and K ≥ 3, j = 1, K.

FU1:K, F1:K, or C1:K : cdf and copula of U1:K
C^K : space of K-dimensional copulas with positive density
C_K : space of K-dimensional simplified D-vine copulas with positive density
Uj|2:K−1 : Fj|2:K−1(Uj|U2:K−1) = FUj|U2:K−1(Uj|U2:K−1), the conditional probability integral transform (CPIT) of Uj wrt U2:K−1
C1K|2:K−1 : bivariate conditional copula of FU1,UK|U2:K−1, i.e., C1K|2:K−1 = FU1|2:K−1,UK|2:K−1|U2:K−1
C1K;2:K−1 : arbitrary bivariate (unconditional) copula that is used to model FU1|2:K−1,UK|2:K−1|U2:K−1
C^P_1K;2:K−1 : partial copula of FU1,UK|U2:K−1, i.e., C^P_1K;2:K−1 = FU1|2:K−1,UK|2:K−1
C^PVC_1K;2:K−1 : partial copula of order K − 2, a particular bivariate copula which approximates FU1|2:K−1,UK|2:K−1|U2:K−1
U^PVC_j|2:K−1 : F^PVC_j|2:K−1(Uj|U2:K−1), the (K − 3)-th order partial probability integral transform (PPIT) of Uj wrt U2:K−1
C^PVC_1:K : partial vine copula (PVC) of the copula C1:K; if K = 3, then c^PVC_1:3(u1:3) = c12(u1, u2) c23(u2, u3) c^P_13;2(F1|2(u1|u2), F3|2(u3|u2))
IK : IK := {(i, j) : j = 1, . . . , K − 1, i = 1, . . . , K − j}, a short notation for the index set of a D-vine copula density
Sij : Sij := i + 1 : i + j − 1, a short notation for the conditioning set of an edge in a D-vine
so that FX1:K(x1:K) := P(∀i = 1, . . . , K : Xi ≤ xi), and dx1:K := dx1 . . . dxK to denote the variables of integration in ∫ fX1:K(x1:K) dx1:K. C⊥ refers to the independence copula. X ⊥ Y means that X and Y are stochastically independent. For 1 ≤ k ≤ K, the partial derivative of g wrt the k-th argument is denoted by ∂k g(x1:K). We write 1{A} = 1 if A is true, and 1{A} = 0 otherwise. For simplicity, we assume that all random variables are real-valued and continuous. In the following, let K ≥ 3, if not otherwise specified, and let C^K be the space of absolutely continuous K-dimensional copulas with positive density (a.s.). We set IK := {(i, j) : j = 1, . . . , K − 1, i = 1, . . . , K − j} and Sij := i + 1 : i + j − 1. We focus on D-vine copulas, but all results carry over to regular vine copulas (Bedford and Cooke [15], Kurowicka and Joe [16]). An overview of the notation can be found in Table 1.

2. Simplified vine copula models and partial copulas

In this section, we discuss (simplified) vine copulas, the simplifying assumption, and simplified vine copula approximations. Thereafter, we introduce the bivariate partial copula, which can be considered as a generalization of the partial correlation coefficient and as an approximation of a bivariate conditional copula.


Definition 2.1 (Simplified D-vine copula or pair-copula construction – Joe [4], Aas et al. [5]) For all (i, j) ∈ IK, let C̃i,i+j;Sij ∈ C^2 with density c̃i,i+j;Sij. If j = 1, we set C̃i,i+j;Sij = C̃i,i+1. Define

F̃i|Sij(ui|uSij) = ∂2 C̃i,i+j−1;Si,j−1( F̃i|Si,j−1(ui|uSi,j−1), F̃i+j−1|Si,j−1(ui+j−1|uSi,j−1) ),
F̃i+j|Sij(ui+j|uSij) = ∂1 C̃i+1,i+j;Si+1,j−1( F̃i+1|Si+1,j−1(ui+1|uSi+1,j−1), F̃i+j|Si+1,j−1(ui+j|uSi+1,j−1) ).

Then

c̃1:K(u1:K) = ∏_{(i,j)∈IK} c̃i,i+j;Sij( F̃i|Sij(ui|uSij), F̃i+j|Sij(ui+j|uSij) )

is the density of a K-dimensional simplified D-vine copula C̃1:K. We denote the space of K-dimensional simplified D-vine copulas by C_K.

From a graph-theoretic point of view, simplified (regular) vine copulas can be considered as an ordered sequence of trees, where j refers to the number of the tree and a bivariate unconditional copula C̃i,i+j;Sij is assigned to each of the K − j edges of tree j (Bedford and Cooke [15]). The bivariate unconditional copulas are also called pair-copulas, so that the resulting model is often termed a pair-copula construction (PCC). By means of simplified vine copula models one can construct a wide variety of flexible multivariate copulas, because each of the K(K − 1)/2 bivariate unconditional copulas C̃i,i+j;Sij can be chosen arbitrarily and the resulting model is always a valid K-dimensional copula. Moreover, a pair-copula construction does not suffer from the curse of dimensionality because it is built upon a sequence of bivariate unconditional copulas, which renders it very attractive for high-dimensional applications. Obviously, not every multivariate copula can be represented by a simplified vine copula model. However, every copula can be represented by the following (non-simplified) D-vine copula.

Definition 2.2 (D-vine copula – Kurowicka and Cooke [17]) Let U1:K be a random vector with cdf F1:K = C1:K ∈ C^K. Ci,i+j|Sij denotes the conditional copula of Fi,i+j|Sij (Definition 2.6), and we set Ci,i+j|Sij = Ci,i+1 for j = 1. The density of a D-vine copula decomposes the copula density of U1:K into K(K − 1)/2 bivariate conditional copula densities ci,i+j|Sij according to the following factorization:

c1:K(u1:K) = ∏_{(i,j)∈IK} ci,i+j|Sij( Fi|Sij(ui|uSij), Fi+j|Sij(ui+j|uSij) | uSij ).

Contrary to a simplified D-vine copula in Definition 2.1, a bivariate conditional copula Ci,i+j|Sij , which is in general a function of j + 1 variables, is assigned to each edge of a D-vine copula in Definition 2.2. In applications, the simplifying assumption is typically imposed, i.e., it is assumed that all bivariate conditional copulas of the data generating vine copula degenerate to bivariate unconditional copulas. Definition 2.3 (The simplifying assumption – Hobæk Haff et al. [13]) The D-vine copula in Definition 2.2 satisfies the simplifying assumption if ci,i+j|Sij (·, ·|uSij ) does not depend on uSij for all (i, j) ∈ IK .
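To make Definition 2.1 concrete, here is a minimal numerical sketch for K = 3 (illustrative only; FGM pair-copulas are chosen because their densities and h-functions have simple closed forms, and the parameter values are arbitrary):

```python
import numpy as np

# Building blocks: bivariate FGM copulas (an illustrative choice).
def fgm_density(u, v, theta):
    # c(u, v) = 1 + theta (1 - 2u)(1 - 2v),  |theta| <= 1
    return 1.0 + theta * (1.0 - 2.0 * u) * (1.0 - 2.0 * v)

def fgm_h(v, u, theta):
    # h(v|u) = dC(u, v)/du, the conditional cdf of V given U = u
    return v * (1.0 + theta * (1.0 - 2.0 * u) * (1.0 - v))

def svc_density_3d(u1, u2, u3, th12, th23, th13_2):
    # Definition 2.1 with K = 3:
    # c~(u1,u2,u3) = c~12(u1,u2) c~23(u2,u3) c~13;2(F~1|2(u1|u2), F~3|2(u3|u2))
    f1_2 = fgm_h(u1, u2, th12)  # F~1|2(u1|u2)
    f3_2 = fgm_h(u3, u2, th23)  # F~3|2(u3|u2)
    return (fgm_density(u1, u2, th12)
            * fgm_density(u2, u3, th23)
            * fgm_density(f1_2, f3_2, th13_2))
```

Any parameter triple yields a valid three-dimensional copula density: numerically, the density is positive and integrates to one over [0, 1]^3.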

Simplified vine copula models

5

If the data generating copula satisfies the simplifying assumption, it can be represented by a simplified vine copula, resulting in fast and simple statistical inference. Several methods for the consistent specification and estimation of pair-copula constructions have been developed under this assumption (Hobæk Haff [18], Dißmann et al. [6]). However, in view of Definition 2.2 and Definition 2.1, it is evident that it is extremely unlikely that the data generating vine copula strictly satisfies the simplifying assumption in practical applications. Therefore, in this study we consider and investigate simplified vine copula models as approximations of multivariate copulas.

Definition 2.4 (Simplified vine copula approximation (SVCA)) If a simplified vine copula model C̃1:K ∈ C_K is used to approximate a copula C1:K ∈ C^K \ C_K, we call C̃1:K a simplified vine copula approximation.

Several questions arise if the data generating process does not satisfy the simplifying assumption and a simplified D-vine copula model (Definition 2.1) is used to approximate a general D-vine copula (Definition 2.2). First of all, what bivariate unconditional copulas C̃i,i+j;Sij should be chosen in Definition 2.1 to model the bivariate conditional copulas Ci,i+j|Sij in Definition 2.2 so that the best approximation wrt a certain criterion is obtained? What simplified vine copula model do established stepwise procedures (asymptotically) specify and estimate if the simplifying assumption does not hold for the data generating vine copula? What are the properties of an optimal approximation? Before we investigate these questions in detail, we focus in the remainder of this section on conditional copulas and their approximation by means of the partial copula.

Definition 2.5 (Conditional probability integral transform (CPIT)) Let K ≥ 1, let Y be a scalar random variable, and let Z be a K-dimensional random vector so that (Y, Z) has a positive density.
We call FY|Z(Y|Z) the conditional probability integral transform of Y wrt Z.

Proposition 2.1 (Eliminating dependence with CPITs) Under the assumptions in Definition 2.5 it holds that FY|Z(Y|Z) ∼ U(0, 1) and FY|Z(Y|Z) ⊥ Z.

Proof. This follows since P(FY|Z(Y|Z) ≤ a, Z ≤ b) = ∫_{−∞}^{b} P(Y ≤ F^{−1}_{Y|Z}(a|t) | Z = t) dFZ(t) = aFZ(b), for all a ∈ [0, 1], b ∈ R ∪ {−∞, ∞}. □

Thus, applying the random transformation FY|Z(·|Z) to Y removes possible dependencies between Y and Z. This interpretation of the CPIT is crucial for understanding the conditional and partial copula, which are related to the (conditional) joint distribution of CPITs. The conditional copula has been introduced by Patton [19] and we restate its definition here.1

1 We use Patton's notation CY1,Y2|Z for the conditional copula, which is also used in the vine copula literature [5, 16, 20]. Some researchers [8, 14, 21] denote a(n) (un)conditional copula that is assigned to an edge of a vine by CY1,Y2;Z. However, we use the semicolon to distinguish an unconditional copula from a conditional copula.

Definition 2.6 (Bivariate conditional copula – Patton [19]) Let (Y1, Y2) be a bivariate random vector and Z be a K-dimensional random vector, with K ≥ 1, so that (Y1, Y2, Z) has a positive density. Define the CPITs U1(Z) = FY1|Z(Y1|Z) and U2(Z) = FY2|Z(Y2|Z). The


(a.s.) unique conditional copula CY1,Y2|Z of the conditional distribution FY1,Y2|Z is defined by

CY1,Y2|Z(a, b|z) := P(U1(Z) ≤ a, U2(Z) ≤ b|Z = z) = FY1,Y2|Z( F^{−1}_{Y1|Z}(a|z), F^{−1}_{Y2|Z}(b|z) | z ).

Equivalently, we have that FY1,Y2|Z(y1, y2|z) = CY1,Y2|Z(FY1|Z(y1|z), FY2|Z(y2|z)|z), so that the effect of a change in z on the conditional distribution FY1,Y2|Z(y1, y2|z) can be separated into two effects. First, the values of the CPITs, (FY1|Z(y1|z), FY2|Z(y2|z)), at which the conditional copula is evaluated, may change. Second, the functional form of the conditional copula CY1,Y2|Z(·, ·|z) may vary. In comparison to the conditional copula, which is the conditional distribution of two CPITs, the partial copula is the unconditional distribution and copula of two CPITs.

Definition 2.7 (Bivariate partial copula) Let (Y1, Y2, Z) be given as in Definition 2.6. The partial copula C^P_{Y1,Y2;Z} of the distribution FY1,Y2|Z is defined by

C^P_{Y1,Y2;Z}(a, b) := P(U1(Z) ≤ a, U2(Z) ≤ b).

Since U1(Z) ⊥ Z and U2(Z) ⊥ Z, the partial copula represents the distribution of random variables which are individually independent of the conditioning vector Z. This is similar to the partial correlation coefficient, which is the correlation of two random variables from which the linear influence of the conditioning vector has been removed.2 The partial copula can be interpreted as the expected conditional copula,

C^P_{Y1,Y2;Z}(a, b) = ∫_{R^K} CY1,Y2|Z(a, b|z) dFZ(z),

and can be considered as an approximation of the conditional copula. Indeed, it is easy to show that the partial copula C^P_{Y1,Y2;Z} minimizes the KL divergence from the conditional copula CY1,Y2|Z in the space of absolutely continuous bivariate distribution functions. Non-parametric estimation of the partial copula is investigated in Gijbels et al. [23].
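Proposition 2.1 and Definition 2.7 can be illustrated by simulation. The sketch below assumes an arbitrary Gaussian data generating process (all numeric choices are illustrative): the CPITs are uniform and uncorrelated with Z even though Y1 and Y2 are strongly dependent on Z, while the CPIT pair itself remains dependent, and its empirical distribution estimates the partial copula C^P:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)
n = 200_000

def norm_cdf(x):
    # standard normal cdf via the error function
    return 0.5 * (1.0 + np.vectorize(erf)(x / sqrt(2.0)))

# Illustrative DGP: Z ~ N(0,1); Y1 = 0.8 Z + e1, Y2 = -0.5 Z + e2,
# where (e1, e2) are correlated standard normal errors.
z = rng.standard_normal(n)
e = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=n)
y1 = 0.8 * z + e[:, 0]
y2 = -0.5 * z + e[:, 1]

# CPITs: U_i = F_{Yi|Z}(Yi|Z); here Y1|Z ~ N(0.8 z, 1), Y2|Z ~ N(-0.5 z, 1).
u1 = norm_cdf(y1 - 0.8 * z)
u2 = norm_cdf(y2 + 0.5 * z)

# u1, u2 are U(0,1) and uncorrelated with Z, yet dependent on each other;
# their joint empirical distribution estimates the partial copula.
print(u1.mean(), np.corrcoef(u1, z)[0, 1], np.corrcoef(u1, u2)[0, 1])
```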

3. Simplified vine copula approximations in three dimensions

In this and the next section, we provide a theoretical investigation of optimal SVCAs and their properties. For ease of exposition, we first consider the approximation of three-dimensional copulas. In this case, the space of simplified vine copulas is given by

C_3 := { C1:3 ∈ C^3 : C1:3(u1:3) = ∫_0^{u2} C13;2( ∂2C12(u1, z), ∂1C23(z, u3) ) dz, ∀u1:3 ∈ [0, 1]^3 },

2 Independently of us, Bergsma [22] also names this copula the partial copula, which he applies to test for conditional independence.


where (C12, C23, C13;2) ∈ C^2 × C^2 × C^2 =: (C^2)^3. Since any C1:3 ∈ C_3 is uniquely determined by the two copulas (C12, C23) in the first tree and the copula C13;2 in the second tree, we identify an element of C_3 by the triple (C12, C23, C13;2). The specification and estimation of SVCMs is commonly based on procedures that asymptotically minimize the Kullback-Leibler (KL) divergence in a stepwise fashion. For instance, if a parametric vine copula model is used, the step-by-step ML estimator (Hobæk Haff [24, 18]), where one estimates tree after tree and sequentially minimizes the estimated KL divergence conditional on the estimates from the previous trees, is often employed in order to select and estimate the parametric pair-copula families of the vine. But also the non-parametric methods of Kauermann and Schellhase [9] and Nagler and Czado [25] proceed in a stepwise manner and asymptotically minimize the KL divergence of each pair-copula separately under appropriate conditions. Therefore, we first investigate the SVCA that is based on a stepwise minimization of the KL divergence. Let C1:3 ∈ C^3 and C̃1:3 ∈ C_3. The KL divergence of C̃1:3 from the true copula C1:3 is given by

DKL(C1:3||C̃1:3) = DKL(C1:3||C̃12, C̃23, C̃13;2) = E[ log( c1:3(U1:3) / c̃1:3(U1:3) ) ],

where the expectation is taken wrt the true distribution C1:3. The KL divergence can be decomposed into the KL divergence related to the first tree, D^(1)_KL, and to the second tree, D^(2)_KL, as follows:

DKL(C1:3||C̃12, C̃23, C̃13;2) = E[ log( c12(U1:2) c23(U2:3) / ( c̃12(U1:2) c̃23(U2:3) ) ) ]
    + E[ log( c13|2( ∂2C12(U1:2), ∂1C23(U2:3) | U2 ) / c̃13;2( ∂2C̃12(U1:2), ∂1C̃23(U2:3) ) ) ]
    =: D^(1)_KL(C̃12, C̃23) + D^(2)_KL(C̃13;2(C̃12, C̃23)).    (3.1)

Note that the KL divergence related to the second tree depends on the specified copulas (C̃12, C̃23) in the first tree because they determine at which values C̃13;2 is evaluated.
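The second-tree term of this decomposition can be studied numerically. A hedged sketch under an illustrative data generating process (C12 = C23 = C⊥ and C13|2 an FGM copula with parameter g(u2) = u2, so the second-tree CPITs are simply U1 and U3): Monte Carlo estimates of E[log c̃13;2] are largest, i.e., D^(2)_KL is smallest, near the partial-copula parameter ∫_0^1 g(u2) du2 = 0.5:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400_000

# Illustrative DGP: C12 = C23 = independence, C13|2 = FGM with parameter
# g(u2) = u2; the second-tree CPITs are then just U1 and U3.
u2 = rng.uniform(size=n)
u1 = rng.uniform(size=n)
w = rng.uniform(size=n)
s = 1.0 - 2.0 * u1
a, b = -u2 * s, 1.0 + u2 * s          # F(u3|u1) = b*u3 + a*u3^2
u3 = 2.0 * w / (b + np.sqrt(b * b + 4.0 * a * w))  # conditional inverse

def e_log_c(theta):
    # Monte Carlo estimate of E[log c~13;2(U1, U3; theta)] for an FGM candidate
    return np.log(1.0 + theta * (1.0 - 2.0 * u1) * (1.0 - 2.0 * u3)).mean()

# D^(2)_KL is minimized where E[log c~13;2] is maximized: near theta = 0.5.
vals = {th: e_log_c(th) for th in (0.2, 0.5, 0.8)}
```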

Note that the KL divergence related to the second tree depends on the specified copulas (C˜12 , C˜23 ) in the first tree because they determine at which values C˜13; 2 is evaluated. Theorem 3.1 (Tree-by-tree KL divergence minimization) Let C1:3 ∈ C3 with margins C12 , C23 . Then, (1)

arg min DKL (C˜12 , C˜23 ) = (C12 , C23 ),

(3.2)

˜12 ,C ˜23 )∈C 2 (C 2 (2)

P arg min DKL (C˜13; 2 (C12 , C23 )) = C13; 2,

(3.3)

˜13; 2 ∈C2 C

and P arg min DKL (C1:3 ||C12 , C23 , C˜13; 2 ) = C13; 2.

(3.4)

˜13; 2 ∈C2 C

Proof. The proof is given in Appendix A.1.



Equation (3.2) states that the KL divergence related to the first tree is minimized if the true copulas are specified. It is then optimal to specify the partial copula in the second tree to minimize the KL divergence related to the second tree (3.3) and the overall KL divergence (3.4). Theorem 3.1 also remains true if we replace C^2 by the space of absolutely continuous bivariate cdfs. Equation (3.4) generalizes the result of Stöber et al. [14], who note in their appendix that if C1:3 is a FGM copula and the copulas in the first tree are correctly specified, then the KL divergence from the true distribution has an extremum at C̃13;2 = C⊥ = C^P_13;2. If C13|2 belongs to a parametric family of bivariate copulas whose parameter depends on u2, then C^P_13;2 is in general not a member of this copula family with a constant parameter. For example, consider

C13|2(a, b|u2) = ab( 1 + 3g(u2) ∏_{i=a,b} (1 − i) + 5g(u2)^2 ∏_{i=a,b} (1 − i)(1 − 2i) ),  with g(u2) = 0.5 − u2.

Together with Theorem 3.1 it follows that the proposed SVCAs of Hobæk Haff et al. [13] and Stöber et al. [14] can be improved if the partial copula is chosen in the second tree, and not a copula of the same parametric family as the conditional copula but with a constant dependence parameter such that the KL divergence is minimized.

Definition 3.1 (Partial vine copula (approximation) (PVC(A)) in three dimensions) We call C^PVC_1:3, the copula that is associated with (C12, C23, C^P_13;2), the partial vine copula (PVC) of C1:3 ∈ C^3. Its density is given by

c^PVC_1:3(u1:3) = c12(u1, u2) c23(u2, u3) c^P_13;2( F1|2(u1|u2), F3|2(u3|u2) ).

C^PVC_1:3 is the partial vine copula approximation (PVCA) of C1:3 if C1:3 ∈ C^3 \ C_3.

If one consistently estimates the bivariate unconditional copulas in the first tree, one observes (asymptotically) data from the partial copula in the second tree. Under appropriate conditions, a pair-copula estimator in the second tree is then a consistent estimator of the partial copula. Thus, under appropriate conditions, the PVCA is the probability limit of a stepwise simplified vine copula estimator (see Proposition 4.2 for the stepwise ML estimator). In light of the importance of the PVCA for applied work, we discuss some properties of the PVCA in the following. In the first tree of the PVCA, two bivariate margins (C12, C23) of the true copula C1:3 are specified. The third bivariate margin C^PVC_13 of the PVCA is determined by the specified bivariate margins (C12, C23) in the first tree and the partial copula C^P_13;2 in the second tree. The next proposition states some properties of this implicitly modeled margin of the PVCA.

Proposition 3.1 (Properties of the implicitly modeled margin of the PVCA) Let C1:3 ∈ C^3 \ C_3, and let τD and ρD denote Kendall's tau and Spearman's rho of the copula D ∈ C^2. Define

C^PVC_13(u1, u3) = ∫_0^1 C^P_13;2( F1|2(u1|t), F3|2(u3|t) ) dt.

(i) In general it holds that C^PVC_13 ≠ C13, ρ_{C^PVC_13} ≠ ρ_{C13}, and τ_{C^PVC_13} ≠ τ_{C13}.
(ii) A sufficient condition for C^PVC_13 = C13 is that C12 = C23 = C⊥.

Proof. Proposition 3.1 (i) follows from Example 3.1. Proposition 3.1 (ii) holds because C^P_13;2 = C^PVC_13 = C13 if U1 ⊥ U2 and U2 ⊥ U3. □
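That stepwise estimation drifts to the partial copula can also be illustrated by simulation. A sketch under illustrative assumptions (C12 = C23 = C⊥ and an FGM conditional copula with parameter g(u2) = u2^2, so the first tree is trivially correct and ∫_0^1 g(u2) du2 = 1/3; a moment-based fit via Spearman's rho stands in here for a stepwise ML fit):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Illustrative DGP: U2 ~ U(0,1); given U2 = u2, (U1, U3) follows an FGM
# copula with parameter g(u2) = u2^2; first-tree copulas are independence.
u2 = rng.uniform(size=n)
u1 = rng.uniform(size=n)
w = rng.uniform(size=n)
s = 1.0 - 2.0 * u1
theta = u2 ** 2
a, b = -theta * s, 1.0 + theta * s
u3 = 2.0 * w / (b + np.sqrt(b * b + 4.0 * a * w))  # inverse of F(u3|u1)

# Tree 2: since C12 = C23 = independence, the CPITs equal u1 and u3.
# Method-of-moments FGM fit via Spearman's rho (rho_S = theta / 3 for FGM).
def spearman(x, y):
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return np.corrcoef(rx, ry)[0, 1]

theta_hat = 3.0 * spearman(u1, u3)
# theta_hat settles near the partial-copula parameter E[g(U2)] = 1/3,
# not near any single conditional parameter g(u2).
```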

The next example demonstrates that the implicitly modeled bivariate margin of the three-dimensional PVCA is in general not equal to the corresponding margin of the true copula.


Example 3.1 Let C^FGM(θ) denote the bivariate FGM copula with parameter |θ| ≤ 1, i.e.,

C^FGM(u1, u2; θ) = u1 u2 [1 + θ(1 − u1)(1 − u2)],

and let C^A(γ) denote the following asymmetric version of the FGM copula ([1], Example 3.16) with parameter |γ| ≤ 1,

C^A(u1, u2; γ) = u1 u2 [1 + γ u1 (1 − u1)(1 − u2)].    (3.5)

Assume that C12 = C^A(γ), C23 = C⊥, and C13|2(·, ·|u2) = C^FGM(·, ·; 1 − 2u2) for all u2. Elementary computations show that

C13(u1, u3) = u1 u3 [γ(u1 − 3u1^2 + 2u1^3)(1 − u3) + 3]/3,

which is a copula with quartic sections in u1 and quadratic sections in u3 if γ ≠ 0. The implied margin of C^PVC_1:3 is

C^PVC_13(u1, u3) = ∫_0^1 C^P_13;2( F1|2(u1|u2), u3 ) dF2(u2) = u3 ∫_0^1 F1|2(u1|u2) dF2(u2) = u1 u3.

Moreover, ρ_{C13} = −γ/90 and τ_{C13} = −γ/135, but ρ_{C^PVC_13} = τ_{C^PVC_13} = 0.
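The computations of Example 3.1 are easy to verify numerically. The following sketch (γ = 0.8 is an arbitrary illustrative value) integrates the conditional copula over u2 to recover the closed form of C13 and evaluates Spearman's rho via ρ = 12 ∫∫ C13 du1 du3 − 3:

```python
import numpy as np

gamma = 0.8  # illustrative parameter value

def f1_2(u1, u2):
    # F_{1|2}(u1|u2) = d/du2 C^A(u1, u2; gamma)
    return u1 * (1.0 + gamma * u1 * (1.0 - u1) * (1.0 - 2.0 * u2))

def cond_cop(a, b, u2):
    # C_{13|2}(a, b | u2): FGM with parameter 1 - 2*u2
    return a * b * (1.0 + (1.0 - 2.0 * u2) * (1.0 - a) * (1.0 - b))

z = (np.arange(20_000) + 0.5) / 20_000  # midpoint rule on [0, 1]

def c13_numeric(u1, u3):
    # implied (1,3)-margin: integrate the conditional copula over u2
    return cond_cop(f1_2(u1, z), u3, z).mean()

def c13_closed(u1, u3):
    return u1 * u3 * (gamma * (u1 - 3*u1**2 + 2*u1**3) * (1 - u3) + 3.0) / 3.0

# Spearman's rho of C13: rho = 12 * int int C13 du1 du3 - 3, which for this
# construction equals -gamma/90.
g = (np.arange(400) + 0.5) / 400
U1, U3 = np.meshgrid(g, g, indexing="ij")
rho = 12.0 * float(c13_closed(U1, U3).mean()) - 3.0
```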

In order to derive the PVCA, the KL divergence of the first tree, D^(1)_KL(C̃12, C̃23), is minimized over the copulas in the first tree, but the effect of (C̃12, C̃23) on the KL divergence related to the second tree, D^(2)_KL(C̃13;2(C̃12, C̃23)), is not taken into account. Therefore, we now analyze whether the PVCA also globally minimizes the KL divergence. Note that specifying (C̃12, C̃23) ≠ (C12, C23) increases D^(1)_KL(C̃12, C̃23) in any case. Thus, without any further investigation, it is absolutely indeterminate whether the definite increase in D^(1)_KL(C̃12, C̃23) can be overcompensated by a possible decrease in D^(2)_KL(C̃13;2(C̃12, C̃23)) if another approximation is chosen. The next theorem shows that the PVCA is in general not the global minimizer of the KL divergence.

Theorem 3.2 (Global and tree-by-tree KL minimization if C1:3 ∈ C_3 or C1:3 ∈ C^3 \ C_3) Let C12, C23 be the margins of C1:3. If C1:3 ∈ C_3, i.e., the simplifying assumption holds for C1:3, then

arg min_{(C̃12, C̃23, C̃13;2) ∈ (C^2)^3} DKL(C1:3||C̃12, C̃23, C̃13;2) = (C12, C23, C^P_13;2).    (3.6)

If the simplifying assumption does not hold for C1:3, then (C12, C23, C^P_13;2) might not be a global minimum. That is, ∃C1:3 ∈ C^3 \ C_3 such that

arg min_{(C̃12, C̃23, C̃13;2) ∈ (C^2)^3} DKL(C1:3||C̃12, C̃23, C̃13;2) ≠ (C12, C23, C^P_13;2),    (3.7)

from which it follows that, ∀D13;2 ∈ C^2,

arg min_{(C̃12, C̃23, C̃13;2) ∈ (C^2)^3} DKL(C1:3||C̃12, C̃23, C̃13;2) ≠ (C12, C23, D13;2).    (3.8)


Proof. Equation (3.6) is obvious, since C1:3 = (C12, C23, C^P_13;2) is the data generating process. Example 3.2, in conjunction with Lemma 3.1 and C̃12 being the asymmetric FGM copula stated in (3.5), shows that DKL(C1:3||C12, C23, C^P_13;2) may not be a local minimum and proves (3.7). Equation (3.8) immediately follows from equations (3.7) and (3.4). □

Theorem 3.2 states that, if the simplifying assumption does not hold, the KL divergence may not be minimized by choosing the true copulas in the first tree and the partial copula in the second tree (see (3.7)). It follows that, if the objective is the minimization of the KL divergence, it may not be optimal to specify the true copulas in the first tree, no matter what bivariate copula is specified in the second tree (see (3.8)). This can be explained by the fact that, if the simplifying assumption does not hold, then the approximation error of the implicitly modeled bivariate margin C13 is not minimized (see Example 3.1). Although a departure from the true copulas (C12, C23) increases the KL divergence related to the first tree, it can decrease the KL divergence of the implicitly modeled margin C̃13 from C13. As a result, the increase in D^(1)_KL can be overcompensated by a larger decrease in D^(2)_KL, so that the overall KL divergence can be decreased. The next example can be used to demonstrate that the PVCA is in general not a local minimizer of the KL divergence and proves (3.7) of Theorem 3.2.

Example 3.2 Let g: [0, 1] → [−1, 1] be a measurable function. Consider the data generating process

C1:3(u1:3) = ∫_0^{u2} C^FGM( u1, u3; g(z) ) dz,

i.e., the two unconditional bivariate margins (C12, C23) are independence copulas and the conditional copula is a FGM copula with varying parameter g(u2). The partial copula is also a FGM copula, given by

C^P_13;2(u1, u3; θ^P_13;2) = u1 u3 [1 + θ^P_13;2 (1 − u1)(1 − u3)],   θ^P_13;2 := ∫_0^1 g(u2) du2.

We set C̃23 = C23, C̃13;2 = C^P_13;2, and specify a parametric copula C̃12(θ12), θ12 ∈ Θ12 ⊂ R, with conditional cdf F̃1|2(u1|u2; θ12) and such that C̃12(0) corresponds to the independence copula. Thus, (C̃12(0), C23, C^P_13;2) = (C12, C23, C^P_13;2). We also assume that c̃12(u1, u2; θ12) and ∂θ12 c̃12(u1, u2; θ12) are both continuous on (u1, u2, θ12) ∈ (0, 1)^2 × Θ12.

We now derive necessary and sufficient conditions such that DKL(C1:3||C̃12(θ12), C23, C^P_13;2) attains an extremum at θ12 = 0.

Lemma 3.1 (Extremum for the KL divergence in Example 3.2) Let C1:3 be given as in Example 3.2. For u1 ∈ (0, 1), we define

h(u1; g) := ∫_0^1 ∂θ12 F̃1|2(u1|u2; θ12)|_{θ12=0} g(u2) du2,

K(u1; θ^P_13;2) := (1 / h(u1; g)) ∫_0^1 ∫_0^1 ∂θ12 log c^P_13;2( F̃1|2(u1|u2; θ12), u3; θ^P_13;2 )|_{θ12=0} c1:3(u1:3) du2 du3.

Then, ∀u1 ∈ (0, 0.5): K(0.5 + u1; θ^P_13;2) > 0 ⇔ θ^P_13;2 > 0, and DKL(C1:3||C̃12(θ12), C23, C^P_13;2) has an extremum at θ12 = 0 if and only if

∂θ12 DKL(C1:3||C̃12(θ12), C23, C^P_13;2)|_{θ12=0} = ∫_0^{0.5} K(0.5 + u1; θ^P_13;2) [h(0.5 + u1; g) − h(0.5 − u1; g)] du1 = 0.    (3.9)

Proof. See Appendix A.2. □

It depends on the data generating process whether the condition in Lemma 3.1 is satisfied and DKL(C1:3||C̃12(0), C23, C^P_13;2) is an extremum or not, as we illustrate in the following. If θ^P_13;2 = 0, then K(u1; θ^P_13;2) = 0 for all u1 ∈ (0, 1); or if g does not depend on u2, then h(u1; g) = 0 for all u1 ∈ (0, 1). Thus, the integrand in (3.9) is zero and we have an extremum if one of these conditions is true. Assuming θ^P_13;2 ≠ 0 and that g depends on u2, we see from (3.9) that g and C̃12 determine whether we have an extremum at θ12 = 0. Depending on the copula family that is chosen for C̃12, it may be possible that the copula family alone determines whether DKL(C1:3||C̃12(0), C23, C^P_13;2) is an extremum. For instance, if C̃12 is a FGM copula we obtain

h(u1; g) = u1(1 − u1) ∫_0^1 (1 − 2u2) g(u2) du2,

so that

h(0.5 + u1; g) = h(0.5 − u1; g),   ∀u1 ∈ (0, 0.5).

This symmetry of h across 0.5 implies that (3.9) is satisfied for all functions g. If we do not impose any constraints on the bivariate copulas in the first tree of the SVCA, then DKL(C1:3||C̃12(0), C23, C^P_13;2) may not even be a local minimizer of the KL divergence. For instance, if C̃12 is the asymmetric FGM copula given in (3.5), we find that

h(u1; g) = u1^2 (1 − u1) ∫_0^1 (1 − 2u2) g(u2) du2.

If Λ := ∫_0^1 (1 − 2u2) g(u2) du2 ≠ 0, e.g., if g is a non-negative increasing function, say g(u2) = u2, then, depending on the sign of Λ, either

h(0.5 + u1; g) > h(0.5 − u1; g),   ∀u1 ∈ (0, 0.5),

or

h(0.5 + u1; g) < h(0.5 − u1; g),   ∀u1 ∈ (0, 0.5),

so that the integrand in (3.9) is either strictly positive or strictly negative, and thus DKL(C1:3||C12, C23, C^P_13;2) cannot be an extremum. Since θ12 ∈ [−1, 1], it follows that DKL(C1:3||C̃12(0), C23, C^P_13;2) is not a local minimum. As a result, relative to the PVCA, we can further decrease the KL divergence from the true copula if we adequately specify "wrong" copulas in the first tree and choose the partial copula in the second tree of the SVCA. Theorem 3.2 does not imply that the PVCA never minimizes the KL divergence from the true copula. In order to show that the PVCA in Example 3.2 is not an extremum, we require that the partial copula is not the independence copula. More generally, if C^P_13;2 = C⊥, then DKL(C1:3||C12, C23, C^P_13;2) is an extremum, which directly follows from equation (3.2) since

arg min_{(C̃12, C̃23) ∈ (C^2)^2} DKL(C1:3||C̃12, C̃23, C⊥) = arg min_{(C̃12, C̃23) ∈ (C^2)^2} D^(1)_KL(C̃12, C̃23).
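The mechanism driving this example, namely whether h(·; g) is symmetric about 0.5, is easy to probe numerically. A sketch under the same illustrative choices (g(u2) = u2; a finite difference stands in for the derivative at θ12 = 0):

```python
import numpy as np

u2 = (np.arange(10_000) + 0.5) / 10_000   # midpoint rule on [0, 1]
g = u2                                    # illustrative g(u2) = u2
eps = 1e-6                                # finite-difference step in theta

def h(cond_cdf):
    # h(u1; g) = int_0^1 d/dtheta F~_{1|2}(u1|u2; theta)|_{theta=0} g(u2) du2
    return lambda u1: float(((cond_cdf(u1, u2, eps) - cond_cdf(u1, u2, -eps))
                             / (2.0 * eps) * g).mean())

def fgm_cdf(u1, u2, th):
    # F~_{1|2} for the FGM copula
    return u1 * (1.0 + th * (1.0 - u1) * (1.0 - 2.0 * u2))

def afgm_cdf(u1, u2, th):
    # F~_{1|2} for the asymmetric FGM copula (3.5)
    return u1 * (1.0 + th * u1 * (1.0 - u1) * (1.0 - 2.0 * u2))

h_fgm, h_afgm = h(fgm_cdf), h(afgm_cdf)
t = 0.25
sym_gap = h_fgm(0.5 + t) - h_fgm(0.5 - t)     # ~ 0: symmetric about 0.5
asym_gap = h_afgm(0.5 + t) - h_afgm(0.5 - t)  # clearly nonzero: asymmetric
```

The symmetric gap vanishes for the FGM first-tree family (so (3.9) holds for every g), while the asymmetric FGM family produces a clearly nonzero gap, which is what rules out an extremum at θ12 = 0 when Λ ≠ 0.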

It is an open problem whether and when the PVCA can be the global minimizer of the KL divergence. Unfortunately, the SVCA that globally minimizes the KL divergence is not tractable. However, if the SVCA that minimizes the KL divergence does not specify the true copulas in the first tree, then the random variables in the second tree are not CPITs. Thus, these random variables are not guaranteed to be uniformly distributed, and it may not be optimal to model their joint distribution in the second tree with a copula. We could further decrease the KL divergence by choosing a pseudo-copula (Fermanian and Wegkamp [26]) as an edge in the second tree; it is easily shown that the resulting best approximation is then a pseudo-copula. Consequently, if one considers the space of regular vines where each edge corresponds to a bivariate cdf, the best approximation satisfying the simplifying assumption is in general not an SVCM but a simplified vine pseudo-copula.
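The sign condition on $\Lambda$ that drives the preceding non-extremum argument is easy to evaluate numerically. The following minimal sketch (the midpoint-rule quadrature and the concrete choices of $g$ are our own illustration, not part of the paper) confirms that the increasing choice $g(u_2) = u_2$ yields $\Lambda = 1/2 - 2/3 = -1/6 \neq 0$, whereas a constant $g$ yields $\Lambda = 0$:

```python
# Midpoint-rule check of Λ := ∫_0^1 (1 - 2·u2) g(u2) du2 for two choices of g.
# Illustrative sketch; the quadrature scheme is ours, not part of the paper.

def Lambda(g, n=100_000):
    """Approximate ∫_0^1 (1 - 2u) g(u) du with the midpoint rule."""
    h = 1.0 / n
    return h * sum((1.0 - 2.0 * (k + 0.5) * h) * g((k + 0.5) * h) for k in range(n))

lam_increasing = Lambda(lambda u: u)    # g(u) = u (increasing): Λ = -1/6 ≠ 0
lam_constant = Lambda(lambda u: 1.0)    # constant g:            Λ = 0
print(lam_increasing, lam_constant)
```

A non-constant monotone $g$ thus makes the integrand in (3.9) one-signed for the asymmetric FGM family, in line with the discussion above.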

4. Simplified vine copula approximations in higher dimensions

We now investigate the approximation of a K-dimensional copula by an SVCA. It turns out that, if the KL divergence is to be minimized in a stepwise fashion, partial copulas are only optimal in the second tree of the SVCA. Therefore, j-th order partial copulas are introduced in order to generalize the PVCA to the K-dimensional case. We illustrate that a j-th order partial copula is another generalization of the partial correlation coefficient that can be used to obtain new measures of conditional dependence. While all properties of the three-dimensional PVCA also apply to a higher-dimensional PVCA, some issues only arise in higher dimensions. For instance, we find that spurious conditional (in)dependencies in higher trees may be induced if one approximates conditional copulas with unconditional copulas.

As explained in Section 2, the partial copula is the unconditional distribution of two particular CPITs. In practice, data from these CPITs is not observable, but we can obtain pseudo-observations by estimating conditional cdfs. However, if the simplifying assumption does not hold, we cannot consistently estimate every conditional cdf with more than one conditioning variable using an SVCM. Consequently, from the third tree on, we may not get pseudo-observations from the CPITs and may not obtain data for the corresponding partial copulas.³ Instead, if partial copulas are specified in the second tree, we obtain pseudo-observations from the second-order partial copulas in the third tree. This observation gives rise to the general definition of the partial vine copula (PVC).

³ If the simplifying assumption holds, then $F_{1|2:4}(u_1|u_{2:4}) = F_{U_{1|2:3}|U_{4|2:3}}(F_{1|2:3}(u_1|u_{2:3})|F_{4|2:3}(u_4|u_{2:3}))$. However, if we use the simplifying assumption as a device to tackle the curse of dimensionality, then our model of $F_{1|2:4}$ is not given by $F_{U_{1|2:3}|U_{4|2:3}}$ because the CPITs $U_{1|2:3}$ and $U_{4|2:3}$ cannot be obtained from the approximation. Instead, these CPITs are approximated using $\tilde{U}_{1|2:3} = \partial_2 \tilde{C}_{13;2}(\tilde{F}_{1|2}(U_1|U_2), \tilde{F}_{3|2}(U_3|U_2))$ and $\tilde{U}_{4|2:3} = \partial_1 \tilde{C}_{24;3}(\tilde{F}_{2|3}(U_2|U_3), \tilde{F}_{4|3}(U_4|U_3))$, where $(\tilde{F}_{12}, \tilde{F}_{23}, \tilde{F}_{34}, \tilde{C}_{13;2}, \tilde{C}_{24;3}) \in \mathcal{C}_2^5$.

Definition 4.1 (PVC(A) and j-th order partial copulas)
Assume $K \geq 3$ and let $C_{1:K} \in \mathcal{C}_K$ be the copula of $U_{1:K}$. Consider the corresponding D-vine copula stated in Definition 2.2. In the first tree, we set for $i = 1, \ldots, K-1$: $C^{PVC}_{i,i+1} = C_{i,i+1}$, while in the second tree, we denote for $i = 1, \ldots, K-2$, $k = i, i+2$: $C^{PVC}_{i,i+2;i+1} = C^{P}_{i,i+2;i+1}$ and $U^{PVC}_{k|i+1} = U_{k|i+1} = F_{k|i+1}(U_k|U_{i+1})$. In the remaining trees $j = 3, \ldots, K-1$, for $i = 1, \ldots, K-j$, we define

$$U^{PVC}_{i|S_{ij}} := F^{PVC}_{i|S_{ij}}(U_i|U_{S_{ij}}) := \partial_2 C^{PVC}_{i,i+j-1;S_{i,j-1}}(U^{PVC}_{i|S_{i,j-1}}, U^{PVC}_{i+j-1|S_{i,j-1}}),$$
$$U^{PVC}_{i+j|S_{ij}} := F^{PVC}_{i+j|S_{ij}}(U_{i+j}|U_{S_{ij}}) := \partial_1 C^{PVC}_{i+1,i+j;S_{i+1,j-1}}(U^{PVC}_{i+1|S_{i+1,j-1}}, U^{PVC}_{i+j|S_{i+1,j-1}}),$$

and

$$C^{PVC}_{i,i+j;S_{ij}}(a, b) := P(U^{PVC}_{i|S_{ij}} \leq a,\ U^{PVC}_{i+j|S_{ij}} \leq b).$$

We call $C^{PVC}_{1:K}$ the partial vine copula (PVC) of $C_{1:K}$. Its density is given by

$$c^{PVC}_{1:K}(u_{1:K}) := \prod_{(i,j) \in I_K} c^{PVC}_{i,i+j;S_{ij}}\big(F^{PVC}_{i|S_{ij}}(u_i|u_{S_{ij}}),\ F^{PVC}_{i+j|S_{ij}}(u_{i+j}|u_{S_{ij}})\big).$$

$C^{PVC}_{1:K}$ is the partial vine copula approximation (PVCA) of $C_{1:K}$ if $C_{1:K}$ does not satisfy the simplifying assumption. For $k = i, i+j$, we call $U^{PVC}_{k|S_{ij}}$ the (j-2)-th order partial probability integral transform (PPIT) of $U_k$ wrt $U_{S_{ij}}$ and $C^{PVC}_{i,i+j;S_{ij}}$ the (j-1)-th order partial copula of $F_{i,i+j|S_{ij}}$ that is induced by $C_{1:K}$.

Proposition 4.1
Under the assumptions of Definition 4.1 it holds that, for all $(i,j) \in I_K$,

$$U^{PVC}_{i|S_{ij}} = F_{U^{PVC}_{i|S_{i,j-1}}|U^{PVC}_{i+j-1|S_{i,j-1}}}(U^{PVC}_{i|S_{i,j-1}}|U^{PVC}_{i+j-1|S_{i,j-1}}),$$
$$U^{PVC}_{i+j|S_{ij}} = F_{U^{PVC}_{i+j|S_{i+1,j-1}}|U^{PVC}_{i+1|S_{i+1,j-1}}}(U^{PVC}_{i+j|S_{i+1,j-1}}|U^{PVC}_{i+1|S_{i+1,j-1}}),$$

i.e., $U^{PVC}_{i|S_{ij}}$ is the CPIT of $U^{PVC}_{i|S_{i,j-1}}$ wrt $U^{PVC}_{i+j-1|S_{i,j-1}}$ and $U^{PVC}_{i+j|S_{ij}}$ is the CPIT of $U^{PVC}_{i+j|S_{i+1,j-1}}$ wrt $U^{PVC}_{i+1|S_{i+1,j-1}}$. Thus, PPITs are uniformly distributed and j-th order partial copulas are indeed copulas.

Proof. See Appendix A.3. □

Note that the first-order partial copula coincides with the partial copula of a conditional distribution with one conditioning variable. Since $U^{PVC}_{i|S_{ij}}$ is the CPIT of $U^{PVC}_{i|S_{i,j-1}}$ wrt $U^{PVC}_{i+j-1|S_{i,j-1}}$, it is independent of $U^{PVC}_{i+j-1|S_{i,j-1}}$. However, in general it is not true that $U^{PVC}_{i|S_{ij}} \perp U_{S_{ij}}$. Consequently, if the order is larger than one, a j-th order partial copula describes the distribution of a pair of uniformly distributed random variables which are in general neither jointly nor individually independent of the conditioning variables of the corresponding conditional copula. Obviously, if the simplifying assumption holds, then $C^{PVC}_{i,i+j;S_{ij}}(\cdot,\cdot) = C_{i,i+j|S_{ij}}(\cdot,\cdot|\cdot) = C_{i,i+j|S_{ij}}(\cdot,\cdot)$, i.e., the (j-1)-th order partial copula equals the conditional copula, which does not depend on the conditioning vector.

Let $k = i, i+j$, and let $G^{PVC}_{k|S_{ij}}(t_k|t_{S_{ij}}) = (F^{PVC}_{k|S_{ij}})^{-1}(t_k|t_{S_{ij}})$ denote the inverse of $F^{PVC}_{k|S_{ij}}(\cdot|t_{S_{ij}})$ wrt the first argument. A (j-1)-th order partial copula can be computed using

$$C^{PVC}_{i,i+j;S_{ij}}(a,b) = P(U^{PVC}_{i|S_{ij}} \leq a,\ U^{PVC}_{i+j|S_{ij}} \leq b) = E\big[P(U^{PVC}_{i|S_{ij}} \leq a,\ U^{PVC}_{i+j|S_{ij}} \leq b\,|\,U_{S_{ij}})\big]$$
$$= \int_{[0,1]^{j-1}} C_{i,i+j|S_{ij}}\Big(F_{i|S_{ij}}\big(G^{PVC}_{i|S_{ij}}(a|t_{S_{ij}})\big|t_{S_{ij}}\big),\ F_{i+j|S_{ij}}\big(G^{PVC}_{i+j|S_{ij}}(b|t_{S_{ij}})\big|t_{S_{ij}}\big)\,\Big|\,t_{S_{ij}}\Big)\,\mathrm{d}F_{S_{ij}}(t_{S_{ij}}). \quad (4.1)$$

If $j \geq 3$, $C^{PVC}_{i,i+j;S_{ij}}$ depends on $F_{i|S_{ij}}$, $F_{i+j|S_{ij}}$, $C_{i,i+j|S_{ij}}$, and $F_{S_{ij}}$, i.e., it depends on $C_{i:i+j}$. Moreover, $C^{PVC}_{i,i+j;S_{ij}}$ also depends on $G^{PVC}_{i|S_{ij}}$ and $G^{PVC}_{i+j|S_{ij}}$, which are determined by the regular vine structure. Thus, the corresponding PVCs of different regular vines may be different. In particular, if the simplifying assumption does not hold, pair-copulas of different PVCs which refer to the same conditional distribution may not be identical. This is different from a partial copula (Definition 2.7), which depends only on $C_{i,i+j|S_{ij}}$ and is not related to a regular vine. The following theorem generalizes Theorem 3.1 and shows that, if one sequentially minimizes the KL divergence related to each tree, then the optimal SVCM is the PVC.
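The recursion of Definition 4.1 is straightforward to implement once the h-functions $\partial_1 C$ and $\partial_2 C$ of the pair-copulas are available. The sketch below is our own illustration: all pair-copulas are taken to be FGM, whose h-function has a simple closed form, but any family with known h-functions works the same way. It computes the PPIT pairs of every tree of a D-vine from a sample of $U_{1:K}$:

```python
import numpy as np

def h_fgm(u, v, theta):
    # h-function of the bivariate FGM copula C(u,v;θ) = uv(1 + θ(1-u)(1-v)):
    # h(u|v;θ) = ∂C/∂v is the conditional cdf of U given V = v.
    return u * (1.0 + theta * (1.0 - u) * (1.0 - 2.0 * v))

def dvine_ppits(U, thetas):
    """Argument pairs (U_{i|S_ij}, U_{i+j|S_ij}) of every tree of a D-vine,
    following the recursion of Definition 4.1. thetas[t][i] is the FGM
    parameter of edge i in tree t+1 (0-indexed); the all-FGM choice is an
    illustrative assumption."""
    n, K = U.shape
    a = [U[:, i] for i in range(K - 1)]        # first arguments  U_{i|S}
    b = [U[:, i + 1] for i in range(K - 1)]    # second arguments U_{i+j|S}
    trees = [list(zip(a, b))]
    for t in range(1, K - 1):                  # build tree t+1 from tree t
        a_new, b_new = [], []
        for i in range(K - 1 - t):
            # U_{i|S_ij} = ∂2 C_{i,i+j-1;S_{i,j-1}} at edge i of the previous tree
            a_new.append(h_fgm(a[i], b[i], thetas[t - 1][i]))
            # U_{i+j|S_ij} = ∂1 C_{i+1,i+j;S_{i+1,j-1}} at edge i+1; the FGM
            # copula is exchangeable, so ∂1 C(u,v) = h(v|u)
            b_new.append(h_fgm(b[i + 1], a[i + 1], thetas[t - 1][i + 1]))
        a, b = a_new, b_new
        trees.append(list(zip(a, b)))
    return trees
```

Fitting a parametric family to the pairs of tree $j$ then yields an estimate of the $(j-1)$-th order partial copulas; with all parameters equal to zero the recursion collapses to $U_{k|S} = U_k$.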

Theorem 4.1 (Tree-by-tree KL divergence minimization using the PVC)
Let $K \geq 3$, let $C_{1:K} \in \mathcal{C}_K$ be the data generating copula, and let $j = 1, \ldots, K-1$. We define $T^{PVC}_j := (C^{PVC}_{i,i+j;S_{ij}})_{i=1,\ldots,K-j}$, so that $T^{PVC}_{1:j} := \times_{k=1}^{j} T^{PVC}_k$ collects all copulas of the PVC up to and including the j-th tree. Moreover, let

$$\mathcal{T}_j := \big\{(\tilde{C}_{i,i+j;S_{ij}})_{i=1,\ldots,K-j} : \tilde{C}_{i,i+j;S_{ij}} \in \mathcal{C}_2 \text{ for } 1 \leq i \leq K-j\big\},$$

so that $\mathcal{T}_{1:j} = \times_{k=1}^{j} \mathcal{T}_k$ represents all possible SVCMs up to and including the j-th tree. Let $T_j \in \mathcal{T}_j$, $T_{1:j-1} \in \mathcal{T}_{1:j-1}$. The KL divergence of the SVCM associated with $T_{1:K-1}$ is given by $D_{KL}(C_{1:K}||T_{1:K-1}) = \sum_{j=1}^{K-1} D^{(j)}_{KL}(T_j(T_{1:j-1}))$, where

$$D^{(1)}_{KL}(T_1(T_{1:0})) := \sum_{i=1}^{K-1} E\bigg[\log \frac{c_{i,i+1}(U_i, U_{i+1})}{\tilde{c}_{i,i+1}(U_i, U_{i+1})}\bigg]$$

denotes the KL divergence related to the first tree, and for the remaining trees $j = 2, \ldots, K-1$, the related KL divergence is

$$D^{(j)}_{KL}(T_j(T_{1:j-1})) := \sum_{i=1}^{K-j} E\bigg[\log \frac{c_{i,i+j|S_{ij}}\big(F_{i|S_{ij}}(U_i|U_{S_{ij}}), F_{i+j|S_{ij}}(U_{i+j}|U_{S_{ij}})\,\big|\,U_{S_{ij}}\big)}{\tilde{c}_{i,i+j;S_{ij}}\big(\tilde{F}_{i|S_{ij}}(U_i|U_{S_{ij}}), \tilde{F}_{i+j|S_{ij}}(U_{i+j}|U_{S_{ij}})\big)}\bigg],$$

with

$$\tilde{F}_{i|S_{ij}}(U_i|U_{S_{ij}}) := \partial_2 \tilde{C}_{i,i+j-1;S_{i,j-1}}\big(\tilde{F}_{i|S_{i,j-1}}(U_i|U_{S_{i,j-1}}),\ \tilde{F}_{i+j-1|S_{i,j-1}}(U_{i+j-1}|U_{S_{i,j-1}})\big),$$
$$\tilde{F}_{i+j|S_{ij}}(U_{i+j}|U_{S_{ij}}) := \partial_1 \tilde{C}_{i+1,i+j;S_{i+1,j-1}}\big(\tilde{F}_{i+1|S_{i+1,j-1}}(U_{i+1}|U_{S_{i+1,j-1}}),\ \tilde{F}_{i+j|S_{i+1,j-1}}(U_{i+j}|U_{S_{i+1,j-1}})\big).$$

It holds that

$$\forall j = 1, \ldots, K-1: \quad \arg\min_{T_j \in \mathcal{T}_j} D^{(j)}_{KL}(T_j(T^{PVC}_{1:j-1})) = T^{PVC}_j.$$

Proof. See Appendix A.4. □

According to Theorem 4.1, if the true copulas are specified in the first tree, one should choose the first-order partial copulas in the second tree, the second-order partial copulas in the third tree, and so on, to minimize the KL divergence tree-by-tree. Theorem 4.1 also remains true if we replace $\mathcal{C}_2$ in the definition of $\mathcal{T}_j$ by the space of absolutely continuous bivariate cdfs. The PVCA ensures that random variables in higher trees are uniformly distributed, since the resulting random variables in higher trees are higher-order PPITs. If one uses a different approximation, such as the one used by Hobæk Haff et al. [13] and Stöber et al. [14], then the random variables in higher trees are not necessarily uniformly distributed and pseudo-copulas can be used to further minimize the KL divergence. From Theorem 3.2 it follows that the PVCA is in general not the global minimizer of the KL divergence, implying that one should adequately specify "wrong" copulas in the first tree. As a result, the random variables in higher trees are not necessarily uniformly distributed and the approximation can be improved if one assigns pseudo-copulas to these edges. Thus, in general, the best approximation satisfying the simplifying assumption is given by a simplified vine pseudo-copula. However, neither this approximation nor the best SVCA is analytically tractable.

Under the assumption that the data is generated by a simplified vine copula, stepwise parametric ([5, 18]) and non-parametric estimators ([9, 27, 25]) have been developed to estimate the data generating vine copula. Regularity conditions that guarantee that the stepwise ML estimator is consistent for the PVC if the PVC is the data generating copula have been established by Hobæk Haff [18]. The next proposition demonstrates that the same conditions can be used to show that the stepwise ML estimator is consistent for the PVC if the data generating copula exhibits this PVC.
Proposition 4.2 (Convergence to the PVC)
Let $N$ denote the sample size and let $\hat{\theta}_N$ be the stepwise ML estimator defined in Aas et al. [5] or Hobæk Haff [18]. Assume that the family of parametric simplified vine copulas $\{\tilde{C}_{1:K}(\tilde{\theta}) : \tilde{\theta} \in \Theta\} \subset \mathcal{C}_K$ satisfies the regularity conditions in Theorem 1 of Hobæk Haff [18] and that $\tilde{C}_{1:K}(\theta)$, with $\theta \in \Theta$, is the PVC of the data generating copula $C_{1:K} \in \mathcal{C}_K$. It holds that $\hat{\theta}_N = \theta + o_p(1)$ for $N \to \infty$.

Proof. See Appendix A.5. □

Proposition 4.2 essentially holds because the regularity conditions for the stepwise ML estimator to be consistent for the PVC apply to the PVC but not to the data generating copula. That is, it does not matter


whether the data generating copula satisfies the simplifying assumption or not. A look at the construction of the non-parametric estimators given in [9, 27, 25] reveals that these estimators also aim at estimating the PVC. Considering Proposition 4.2, it seems reasonable to conjecture that regularity conditions for these estimators to be consistent for the PVC also do not depend on whether the PVC is the data generating copula or the data generating copula has this PVC. However, to the best of our knowledge, conditions for the consistency of these stepwise non-parametric estimators are a topic of current research ([27, 25]). In the following, we illustrate that the PVC is a generalization of the partial correlation matrix and investigate the dependence structure of the PVCA.

The partial copula can be interpreted as a generalization of the partial correlation coefficient, and the same is true for the j-th order partial copula, although the relation is different. To illustrate this relation for the second-order partial copula of $F_{Y_1,Y_4|Y_{2:3}}$, assume that all univariate margins have zero mean and unit variance. Define, for $i = 1, \ldots, 4$, $j = 2, 3$,

$$E_{i|j} = \frac{Y_i - \rho_{ij}Y_j}{\sqrt{1-\rho_{ij}^2}},$$

where $\rho_{ij} = \mathrm{Corr}[Y_i, Y_j]$, and let $\rho_{ij;\mathcal{K}}$ denote the partial correlation between $Y_i$ and $Y_j$ conditional on $Y_{\mathcal{K}}$ with $\mathcal{K} \subset \{1,2,3,4\}\backslash\{i,j\}$. Moreover, define

$$E_{1|23} = \frac{E_{1|2} - \rho_{13;2}E_{3|2}}{\sqrt{1-\rho_{13;2}^2}} \quad \text{and} \quad E_{4|23} = \frac{E_{4|3} - \rho_{24;3}E_{2|3}}{\sqrt{1-\rho_{24;3}^2}}.$$

Then $\rho_{14;23} = \mathrm{Corr}[E_{1|23}, E_{4|23}]$ is the correlation of $(E_{1|2}, E_{4|3})$ after $E_{1|2}$ has been corrected for the linear influence of $E_{3|2}$, and $E_{4|3}$ has been corrected for the linear influence of $E_{2|3}$. If we do not only decorrelate the involved random variables but render them independent and transform the marginal distributions into uniform distributions, then $E_{i|j}$ becomes $F_{i|j}(Y_i|Y_j)$, for $i = 1, \ldots, 4$, $j = 2, 3$, $E_{1|23}$ becomes $F_{U_{1|2}|U_{3|2}}(U_{1|2}|U_{3|2}) = U^{PVC}_{1|23}$, and $E_{4|23}$ becomes $F_{U_{4|3}|U_{2|3}}(U_{4|3}|U_{2|3}) = U^{PVC}_{4|23}$. The distribution of $(U^{PVC}_{1|23}, U^{PVC}_{4|23})$ is the second-order partial copula. In other words, the (j-1)-th order partial copula of $F_{i,i+j|S_{ij}}$ is the copula of $Y_i$ and $Y_{i+j}$ after each random variable has been adjusted for the influence of the intermediate vector $Y_{S_{ij}}$ that is captured by the (j-2)-th order partial distribution functions $F^{PVC}_{i|S_{ij}}$ and $F^{PVC}_{i+j|S_{ij}}$. In this way, a j-th order partial copula may be a helpful tool for analyzing bivariate conditional dependencies.

Definition 4.2 (j-th order partial independence)
Let $X_{1:K}$ be a K-dimensional random vector and denote its copula by $C_{1:K} \in \mathcal{C}_K$. If $C^{PVC}_{i,i+j;S_{ij}} = C^{\perp}$, we call $X_i$ and $X_{i+j}$ (j-th order) partially independent given $X_{S_{ij}}$ (wrt $C^{PVC}_{i,i+j;S_{ij}}$).

Partial independence and conditional independence are equivalent if the simplifying assumption holds. It is easy to show that, if all partial copulas are Gaussian copulas, then $X_i$ and $X_{i+j}$ are partially independent given $X_{S_{ij}}$ if and only if $\rho_{i,i+j;S_{ij}} = 0$, where $\rho_{i,i+j;S_{ij}}$ is the partial correlation of $(Z_i, Z_{i+j})$ given $Z_{S_{ij}}$, $Z_k = \Phi^{-1}(F_{X_k}(X_k))$, $k = i, \ldots, i+j$, and $\Phi^{-1}$ is the inverse of the standard normal cdf. The next example illustrates j-th order PPITs and j-th order partial copulas. Moreover, the example allows the theoretical investigation of the relation between partial independence and conditional independence and elucidates possible consequences that occur when a PVCA is used to model a non-simplified vine copula.
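On Gaussian data the residual construction above can be verified directly. In the following sketch (the equicorrelation matrix and sample size are our own choices), the correlation of $E_{1|23}$ and $E_{4|23}$ computed from residuals agrees with the partial correlation $\rho_{14;23}$ obtained from the precision matrix, which equals $0.25$ for an equicorrelation matrix with $\rho = 0.5$:

```python
import numpy as np

rng = np.random.default_rng(42)
S = np.full((4, 4), 0.5) + 0.5 * np.eye(4)       # equicorrelation, rho = 0.5
Y = rng.multivariate_normal(np.zeros(4), S, size=200_000)

def corr(x, y):
    return np.corrcoef(x, y)[0, 1]

def resid(x, y):
    # normalized residual of x after removing the linear influence of y
    r = corr(x, y)
    return (x - r * y) / np.sqrt(1.0 - r ** 2)

E12, E32 = resid(Y[:, 0], Y[:, 1]), resid(Y[:, 2], Y[:, 1])   # E_{1|2}, E_{3|2}
E43, E23 = resid(Y[:, 3], Y[:, 2]), resid(Y[:, 1], Y[:, 2])   # E_{4|3}, E_{2|3}
rho_resid = corr(resid(E12, E32), resid(E43, E23))            # Corr[E_{1|23}, E_{4|23}]

P = np.linalg.inv(S)                                          # precision-matrix route
rho_prec = -P[0, 3] / np.sqrt(P[0, 0] * P[3, 3])
print(rho_resid, rho_prec)    # both ≈ 0.25 up to sampling noise
```

By the Frisch–Waugh argument sketched in the text, correcting $E_{1|2}$ for $E_{3|2}$ (and $E_{4|3}$ for $E_{2|3}$) is equivalent to residualizing $Y_1$ and $Y_4$ on the span of $(Y_2, Y_3)$, which is why the two routes coincide.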

(a) Copulas in Example 4.1. (b) PVCA of Example 4.1.

Figure 1: The true copulas of the non-simplified D-vine copula given in Example 4.1 and its PVCA. In panel (a), the second tree consists of the conditional FGM copulas $C^{FGM_2}(g)$ on the edges 13|2, 24|3, 35|4, and all other edges are independence copulas $C^{\perp}$. In panel (b), the second-tree copulas collapse to $C^{\perp}$, the third tree contains the FGM copulas $C^{PVC}_{14;23}$ and $C^{PVC}_{25;34}$, and the fourth-tree copula $C^{PVC}_{15;2:4}$ is not the independence copula.

Example 4.1 (Spurious (in)dependencies in partial vine copula approximations)
Let $C_{1:5}$ have the following D-vine copula representation of the non-simplified form

$$C_{i,i+j|i+1:i+j-1} = C^{\perp}, \quad j = 1, 3, 4, \quad i = 1, \ldots, 5-j, \quad (4.2)$$
$$C_{i,i+2|i+1}(u_i, u_{i+2}|u_{i+1}) = C^{FGM_2}(u_i, u_{i+2}; g(u_{i+1})), \quad i = 1, 2, 3, \quad (4.3)$$

where $g : [0,1] \to [-1,1]$ is a non-constant measurable function such that

$$\forall u \in [0, 0.5]: \quad g(0.5+u) = -g(0.5-u). \quad (4.4)$$

All conditional copulas of the vine copula in Example 4.1 correspond to the independence copula except for the second tree. Note also that $\int_{[0,1]} g(u)\,\mathrm{d}u = 0$. Setting $j = 1$ in (4.2), we find that

$$\forall i = 1, 2, 3, \ \forall k = i, i+2: \quad U_{k|i+1} = F_{k|i+1}(U_k|U_{i+1}) = U_k. \quad (4.5)$$

If $g(u) = 1-2u$, then (4.3) is equivalent to $(U_{i|i+1}, U_{i+2|i+1}, U_{i+1}) \sim C^{FGM_3}(1)$ for all $i = 1, 2, 3$, so that in this case $(U_i, U_{i+1}, U_{i+2}) \sim C^{FGM_3}(1)$, where $C^{FGM_3}(u_{1:3}; \theta) = \prod_{i=1}^{3} u_i + \theta \prod_{i=1}^{3} u_i(1-u_i)$, $|\theta| \leq 1$, is the three-dimensional FGM copula. The left panel of Figure 1 illustrates the D-vine copula of the data generating process. We now investigate the PVCA of $C_{1:5}$, which is illustrated in the right panel of Figure 1.

Lemma 4.1 (The PVCA of Example 4.1)
Let $C_{1:5}$ be defined as in Example 4.1. Then

$$C^{PVC}_{i,i+2;i+1} = C^{\perp}, \quad \forall i = 1, 2, 3,$$
$$C^{PVC}_{i,i+3;i+1:i+2} = C_{i,i+3} = C^{FGM_2}(\theta), \quad \theta = 4\Big(\int_{[0,1]} u\,g(u)\,\mathrm{d}u\Big)^{2} > 0, \quad \forall i = 1, 2,$$


and, in general,

$$C^{PVC}_{15;2:4} \neq C^{\perp}.$$

Proof. For $i = 1, 2, 3$, the copula in the second tree of the PVC is given by

$$C^{PVC}_{i,i+2;i+1}(a,b) = P(U_{i|i+1} \leq a, U_{i+2|i+1} \leq b) = \int_{[0,1]} C_{i,i+2|i+1}(a,b|u_{i+1})\,\mathrm{d}u_{i+1}$$
$$\overset{(4.3)}{=} ab\Big(1 + (1-a)(1-b)\int_{[0,1]} g(u_{i+1})\,\mathrm{d}u_{i+1}\Big) \overset{(4.4)}{=} ab, \quad (4.6)$$

which is the independence copula. For $i = 1, 2$, $k = i, i+3$, the true CPIT of $U_k$ wrt $U_{i+1:i+2}$ is a function of $U_{i+1:i+2}$ because

$$U_{i|i+1:i+2} = U_i[1 + g(U_{i+1})(1-U_i)(1-2U_{i+2})], \quad (4.7)$$
$$U_{i+3|i+1:i+2} = U_{i+3}[1 + g(U_{i+2})(1-U_{i+3})(1-2U_{i+1})]. \quad (4.8)$$

However, for $i = 1, 2$, $k = i, i+3$, the PPIT of $U_k$ wrt $U_{i+1:i+2}$ is not a function of $U_{i+1:i+2}$ because

$$U^{PVC}_{i|i+1:i+2} = F^{PVC}_{i|i+1:i+2}(U_i|U_{i+1:i+2}) = F_{U_{i|i+1}|U_{i+2|i+1}}(U_{i|i+1}|U_{i+2|i+1}) \overset{(4.6)}{=} \partial_2 C^{PVC}_{i,i+2;i+1}(U_{i|i+1}, U_{i+2|i+1}) = U_{i|i+1} \overset{(4.5)}{=} U_i, \quad (4.9)$$

and, by symmetry,

$$U^{PVC}_{i+3|i+1:i+2} = U_{i+3}. \quad (4.10)$$

For $i = 1, 2$, the joint distribution of these first-order PPITs is a copula in the third tree of the PVC, which is given by

$$C^{PVC}_{i,i+3;i+1:i+2}(a,b) = P(U^{PVC}_{i|i+1:i+2} \leq a,\ U^{PVC}_{i+3|i+1:i+2} \leq b) \overset{(4.9),(4.10)}{=} P(U_i \leq a, U_{i+3} \leq b) = C_{i,i+3}(a,b) \quad (4.11)$$
$$\overset{(4.2)}{=} \int_{[0,1]^2} F_{i|i+1:i+2}(a|u_{i+1:i+2})\,F_{i+3|i+1:i+2}(b|u_{i+1:i+2})\,\mathrm{d}u_{i+1:i+2}$$
$$\overset{(4.7),(4.8)}{=} ab\Big[1 + (1-a)(1-b)\int_{[0,1]} g(u_{i+1})(1-2u_{i+1})\,\mathrm{d}u_{i+1}\int_{[0,1]} g(u_{i+2})(1-2u_{i+2})\,\mathrm{d}u_{i+2}\Big]$$
$$= ab[1 + \theta(1-a)(1-b)] = C^{FGM_2}(\theta),$$

where $\theta := 4\big(\int_{[0,1]} u\,g(u)\,\mathrm{d}u\big)^{2} > 0$ by the properties of $g$. Thus, a copula in the third tree of the PVC is a bivariate FGM copula, whereas the true conditional copula is the independence copula. The CPITs of $U_1$ or $U_5$ wrt $U_{2:4}$ are given by

$$U_{1|2:4} = F_{1|2:4}(U_1|U_{2:4}) \overset{(4.2)}{=} \partial_2 C_{14|2:3}(U_{1|2:3}, U_{4|2:3}|U_{2:3}) = U_{1|2:3} \overset{(4.7)}{=} U_1[1 + g(U_2)(1-U_1)(1-2U_3)], \quad (4.12)$$
$$U_{5|2:4} = U_5[1 + g(U_4)(1-U_5)(1-2U_3)], \quad (4.13)$$


whereas the corresponding second-order PPITs are given by

$$U^{PVC}_{1|2:4} = F^{PVC}_{1|2:4}(U_1|U_{2:4}) = F_{U^{PVC}_{1|2:3}|U^{PVC}_{4|2:3}}(U^{PVC}_{1|2:3}|U^{PVC}_{4|2:3}) \overset{(4.9),(4.10)}{=} F_{U_1|U_4}(U_1|U_4)$$
$$\overset{(4.11)}{=} \partial_2 C_{14}(U_1, U_4) = U_{1|4} = U_1[1 + \theta(1-U_1)(1-2U_4)], \quad (4.14)$$
$$U^{PVC}_{5|2:4} = U_5[1 + \theta(1-U_5)(1-2U_2)]. \quad (4.15)$$

The computation of the copula in the fourth tree of the PVC is given in Appendix A.6. For $g(u) = 1-2u$, we explicitly show that this copula is not the independence copula. □

In Example 4.1, $U_{1|2:4}$ is a function of $U_{2:3}$, i.e., the true conditional distribution function $F_{1|2:4}$ depends on $u_2$ and $u_3$. In contrast, the model for $F_{1|2:4}$ implied by the PVCA depends only on $u_4$. That is, the approximation of the conditional distribution function $F_{1|2:4}$ depends on the one conditioning variable which actually has no effect. Example 4.1 demonstrates that j-th order partial copulas may not be independence copulas although the corresponding conditional copulas are independence copulas. In particular, under the data generating process the edges of the third tree are independence copulas. Neglecting the conditional copulas in the second tree and approximating them with partial copulas induces spurious dependencies in the third tree, to which bivariate FGM copulas are the best approximation wrt the KL divergence related to this tree. The induced spurious dependence also carries over to the fourth tree, where we in fact have (conditional) independence. Note that the expected log-likelihoods in the third and fourth tree are larger if higher-order partial copulas are used instead of the true conditional copulas. Thus, the spurious dependence in the third and fourth tree decreases the KL divergence from $C_{1:5}$ and therefore acts as a countermeasure to the spurious (conditional) independence in the second tree. The following theorem generalizes the insights from Example 4.1 and states that there is in general no relation between conditional independence and higher-order partial independence.

Theorem 4.2 (Conditional independence and j-th order partial independence)
Let $K \geq 3$, $(i,j) \in I_K$, and let $C_{1:K} \in \mathcal{C}_K$ be a copula that does not satisfy the simplifying assumption. It holds that

$$C_{i,i+2|i+1} = C^{\perp} \ \Rightarrow\ C^{PVC}_{i,i+2;i+1} = C^{\perp},$$
$$\forall j \geq 3: \quad C_{i,i+j|S_{ij}} = C^{\perp} \ \not\Rightarrow\ C^{PVC}_{i,i+j;S_{ij}} = C^{\perp},$$

and

$$\forall j \geq 2: \quad C_{i,i+j|S_{ij}} = C^{\perp} \ \not\Leftarrow\ C^{PVC}_{i,i+j;S_{ij}} = C^{\perp}.$$

Proof. See Appendix A.7. □

To reduce the computational complexity, Brechmann et al. [28] introduce the truncation of regular vines, where all copulas after a specific tree are set to the independence copula if this is suggested by the data. If the simplifying assumption does not hold, it may happen that the data in higher trees appears to be generated by a dependent copula even if there is no dependence under the data generating process. Therefore, it may


be possible to truncate a vine copula earlier if conditional copulas are modeled. Bauer and Czado [29] base conditional independence tests on j-th order partial copulas to learn the structure of a Bayesian network. The violation of the simplifying assumption might lead to a suboptimal choice of the network structure if partial (in)dependence is mistaken for conditional (in)dependence.
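The spurious (in)dependencies of Example 4.1 can be made visible by simulation. The sketch below is our own illustration: it takes $g(u) = 1-2u$, samples from the non-simplified D-vine of Example 4.1 by inverting the FGM conditional cdf with the quadratic formula, and checks that $(U_1, U_3)$ are uncorrelated while $(U_1, U_4)$ exhibit the FGM($\theta$) dependence of Lemma 4.1 with $\theta = 1/9$, i.e., a correlation of about $\theta/3 \approx 0.037$:

```python
import numpy as np

def fgm_cond_inv(q, a):
    # Solve v·(1 + a·(1 - v)) = q for v ∈ (0,1): inverse of the FGM
    # conditional cdf F(v|·) = v(1 + a(1 - v)) with a = θ·(1 - 2·u_cond).
    a = np.where(np.abs(a) < 1e-9, 1e-9, a)     # guard against division by zero
    return ((1.0 + a) - np.sqrt((1.0 + a) ** 2 - 4.0 * a * q)) / (2.0 * a)

g = lambda u: 1.0 - 2.0 * u

rng = np.random.default_rng(1)
n = 400_000
q = rng.uniform(size=(n, 5))
u1, u2 = q[:, 0], q[:, 1]                                  # U1 and U2 are independent
u3 = fgm_cond_inv(q[:, 2], g(u2) * (1.0 - 2.0 * u1))       # (U1,U3)|U2 ~ FGM(g(U2))
u4 = fgm_cond_inv(q[:, 3], g(u3) * (1.0 - 2.0 * u2))       # (U2,U4)|U3 ~ FGM(g(U3))
u5 = fgm_cond_inv(q[:, 4], g(u4) * (1.0 - 2.0 * u3))       # (U3,U5)|U4 ~ FGM(g(U4))

c13 = np.corrcoef(u1, u3)[0, 1]   # tree 2: partial independence, ≈ 0
c14 = np.corrcoef(u1, u4)[0, 1]   # tree 3: spurious dependence, ≈ 1/27 ≈ 0.037
print(c13, c14)
```

Although $C_{14|23}$ is the independence copula, the third-tree pseudo-observations carry FGM dependence; this is exactly the spurious dependence that could mislead a data-driven truncation of the vine.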

5. Conclusion

The specification and estimation of SVCMs has so far been based on the assumption that the data generating process satisfies the simplifying assumption. In this study, we considered the case when the simplifying assumption does not hold for the data generating process and an SVCM acts as an approximation of a multivariate copula. We provided a thorough analysis of SVCAs and established several results concerning optimal SVCAs. Under appropriate regularity conditions, the commonly used stepwise ML estimator for pair-copula constructions is a consistent estimator of the PVC(A). In light of the importance of the PVCA for applied work, we analyzed several properties of the PVCA. We illustrated that the PVCA is a generalization of the partial correlation matrix, where partial correlations are generalized by j-th order partial copulas, and analyzed the relation between conditional independence and partial independence. We also established that the PVCA does not, in general, minimize the KL divergence from the true copula. Thus, the approximation that sequentially gives the best approximation for each tree is not the best global approximation. One intuitive explanation for this result is that the PVCA favors some bivariate dependencies of the data generating copula, because K-1 of the K(K-1)/2 bivariate margins are correctly specified while the other bivariate margins may not match the margins of the true copula. The approximation error can be distributed more evenly over all variables if the copulas in the first tree of the SVCA are adequately misspecified, resulting in an overall decrease in KL divergence. As a result, the best approximation of a multivariate copula that is based on the simplifying assumption is given by a simplified vine pseudo-copula that assigns pseudo-copulas to the edges of higher trees. The analysis in this paper focused on the relative optimality of SVCAs.
We did not investigate the goodness of SVCAs in absolute terms, i.e., under what conditions an SVCA provides an adequate approximation. Initial research on this question has been conducted by Hobæk Haff et al. [13] and Stöber et al. [14]. A more comprehensive study that utilizes the results established in this work is left for future research.

Acknowledgements

We would like to thank Harry Joe and Claudia Czado for comments which helped to improve this paper. We also would like to thank Roger Cooke and Irène Gijbels for interesting discussions on the simplifying assumption.


References

[1] R. B. Nelsen, An introduction to copulas, Springer series in statistics, Springer, New York, 2006.
[2] H. Joe, Multivariate models and dependence concepts, Chapman & Hall, London, 1997.
[3] A. J. McNeil, R. Frey, P. Embrechts, Quantitative risk management, Princeton series in finance, Princeton Univ. Press, Princeton, NJ, 2005.
[4] H. Joe, Families of m-variate distributions with given margins and m(m-1)/2 bivariate dependence parameters, Lecture Notes–Monograph Series 28 (1996) 120–141.
[5] K. Aas, C. Czado, A. Frigessi, H. Bakken, Pair-copula constructions of multiple dependence, Insurance: Mathematics and Economics 44 (2009) 182–198.
[6] J. Dißmann, E. C. Brechmann, C. Czado, D. Kurowicka, Selecting and estimating regular vine copulae and application to financial returns, Computational Statistics & Data Analysis 59 (2013) 52–69.
[7] O. Grothe, S. Nicklas, Vine constructions of Lévy copulas, Journal of Multivariate Analysis 119 (2013) 1–15.
[8] H. Joe, H. Li, A. K. Nikoloulopoulos, Tail dependence functions and vine copulas, Journal of Multivariate Analysis 101 (2010) 252–270.
[9] G. Kauermann, C. Schellhase, Flexible pair-copula estimation in D-vines using bivariate penalized splines, Statistics and Computing (2013) 1–20.
[10] A. K. Nikoloulopoulos, H. Joe, H. Li, Vine copulas with asymmetric tail dependence and applications to financial return data, Computational Statistics & Data Analysis 56 (2012) 3659–3673.
[11] K. Aas, D. Berg, Models for construction of multivariate dependence – a comparison study, The European Journal of Finance 15 (2009) 639–659.
[12] M. Fischer, C. Köck, S. Schlüter, F. Weigert, An empirical analysis of multivariate copula models, Quantitative Finance 9 (2009) 839–854.
[13] I. Hobæk Haff, K. Aas, A. Frigessi, On the simplified pair-copula construction – Simply useful or too simplistic?, Journal of Multivariate Analysis 101 (2010) 1296–1310.
[14] J. Stöber, H. Joe, C. Czado, Simplified pair copula constructions – Limitations and extensions, Journal of Multivariate Analysis 119 (2013) 101–118.
[15] T. Bedford, R. M. Cooke, Vines: A new graphical model for dependent random variables, The Annals of Statistics 30 (2002) 1031–1068.
[16] D. Kurowicka, H. Joe (Eds.), Dependence modeling, World Scientific, Singapore, 2011.
[17] D. Kurowicka, R. Cooke, Uncertainty analysis with high dimensional dependence modelling, Wiley, Chichester, 2006.
[18] I. Hobæk Haff, Parameter estimation for pair-copula constructions, Bernoulli 19 (2013) 462–491.
[19] A. J. Patton, Modelling asymmetric exchange rate dependence, International Economic Review 47 (2006) 527–556.
[20] E. F. Acar, C. Genest, J. Nešlehová, Beyond simplified pair-copula constructions, Journal of Multivariate Analysis 110 (2012) 74–90.
[21] P. Krupskii, H. Joe, Factor copula models for multivariate data, Journal of Multivariate Analysis 120 (2013) 85–101.
[22] W. Bergsma, Testing conditional independence for continuous random variables, 2004. URL: http://eprints.pascal-network.org/archive/00000824/.
[23] I. Gijbels, M. Omelka, N. Veraverbeke, Estimation of a copula when a covariate affects only marginal distributions, Scandinavian Journal of Statistics (2015) (forthcoming).
[24] I. Hobæk Haff, Comparison of estimators for pair-copula constructions, Journal of Multivariate Analysis 110 (2012) 91–105.
[25] T. Nagler, C. Czado, Evading the curse of dimensionality in multivariate kernel density estimation with simplified vines, 2015. URL: http://arxiv.org/pdf/1503.03305v2.pdf.
[26] J.-D. Fermanian, M. H. Wegkamp, Time-dependent copulas, Journal of Multivariate Analysis 110 (2012) 19–29.
[27] I. Hobæk Haff, J. Segers, Nonparametric estimation of pair-copula constructions with the empirical pair-copula, Computational Statistics & Data Analysis 84 (2015) 1–13.
[28] E. C. Brechmann, C. Czado, K. Aas, Truncated regular vines in high dimensions with application to financial data, Canadian Journal of Statistics 40 (2012) 68–85.
[29] A. Bauer, C. Czado, Pair-copula Bayesian networks, 2012. URL: http://arxiv.org/pdf/1211.5620.
[30] B. Rémillard, Statistical methods for financial engineering, CRC Press, Boca Raton, FL, 2013.

Appendix

A.1. Proof of Theorem 3.1.

Equation (3.2) follows directly from the definition of $D^{(1)}_{KL}(\tilde{C}_{12}, \tilde{C}_{23})$ and the fact that the true data generating process minimizes the KL divergence (Gibbs' inequality). Now let $\tilde{C}_{12} = C_{12}$, $\tilde{C}_{23} = C_{23}$. The KL divergence related to the second tree is then given by

$$D^{(2)}_{KL}(\tilde{C}_{13;2}(C_{12}, C_{23})) = E\big[\log c_{13|2}(\partial_2 C_{12}(U_{1:2}), \partial_1 C_{23}(U_{2:3})|U_2)\big] - E\big[\log \tilde{c}_{13;2}(\partial_2 C_{12}(U_{1:2}), \partial_1 C_{23}(U_{2:3}))\big]$$
$$=: E\big[\log c_{13|2}(\partial_2 C_{12}(U_{1:2}), \partial_1 C_{23}(U_{2:3})|U_2)\big] - H^{(2)}(\tilde{C}_{13;2}(C_{12}, C_{23})),$$

where $H^{(2)}$ is the negative cross entropy. The KL divergence related to the second tree is minimized if the corresponding negative cross entropy is maximized. We obtain that

$$H^{(2)}(\tilde{C}_{13;2}(C_{12}, C_{23})) = \int_{[0,1]^3} \log\big(\tilde{c}_{13;2}(F_{1|2}(u_1|u_2), F_{3|2}(u_3|u_2))\big)\, c_{13|2}(F_{1|2}(u_1|u_2), F_{3|2}(u_3|u_2)|u_2)\, c_{12}(u_1,u_2)\, c_{23}(u_2,u_3)\,\mathrm{d}u_{1:3}$$
$$\overset{u_1 = F^{-1}_{1|2}(u_{1|2}|u_2),\ u_3 = F^{-1}_{3|2}(u_{3|2}|u_2)}{=} \int_{[0,1]^3} \log\big(\tilde{c}_{13;2}(u_{1|2}, u_{3|2})\big)\, c_{13|2}(u_{1|2}, u_{3|2}|u_2)\,\mathrm{d}u_{1|2}\,\mathrm{d}u_{3|2}\,\mathrm{d}u_2$$
$$= \int_{[0,1]^2} \log\big(\tilde{c}_{13;2}(u_{1|2}, u_{3|2})\big)\, c^{P}_{13;2}(u_{1|2}, u_{3|2})\,\mathrm{d}u_{1|2}\,\mathrm{d}u_{3|2},$$

which is maximized for $\tilde{c}_{13;2} = c^{P}_{13;2}$ (Gibbs' inequality) and proves equation (3.3). Equation (3.4) follows from equations (3.1) and (3.3).

A.2. Proof of Lemma 3.1. The KL divergence attains an extremum if and only if the negative cross entropy attains an extremum. The negative cross entropy is given by P H1:3 (C1:3 ||C˜1:3 (θ12 )) = E[log c˜1:3 (U1:3 ; θ12 , θ13; 2 )]  P P = E[log[˜ c12 (U1:2 ; θ12 )c13; 2 F˜1|2 (U1 |U2 ; θ12 ), U3 ; θ13; 2 ]]

 P = E[log c˜12 (U1:2 ; θ12 )] + E[log cP13; 2 F˜1|2 (U1 |U2 ; θ12 ), U3 ; θ13; 2 ]. If the negative cross entropy attains an extremum then the derivative of E[log c˜1:3 (U1:3 ; θ12 )] wrt θ12 is zero. Since c˜12 (u1 , u2 ; θ12 ) and ∂θ12 c˜12 (u1 , u2 ; θ12 ) are both continuous on (u1 , u2 , θ12 ) ∈ (0, 1)2 ×Θ12 , we can apply Leibniz’s rule for differentiation under the integral sign to conclude that ∂θ12 E[log c˜12 (U1:2 ; θ12 )] θ =0 = 12 E[∂θ12 log c˜12 (U1:2 ; θ12 ) θ =0 ] = 0 because C˜12 (0) is the true copula of U1:2 . Thus, the derivative evaluated 12 at θ12 = 0 becomes P ∂θ12 E[log c˜1:3 (U1:3 ; θ12 , θ13; )] = ∂ E[log c ˜ (U ; θ )] θ 12 1:2 12 2 12 θ12 =0 θ12 =0  P P ˜ + ∂θ12 E[log c13; 2 F1|2 (U1 |U2 ; θ12 ), U3 ; θ13; 2 ] θ12 =0  P P ˜ = ∂θ12 E[log c13; 2 F1|2 (U1 |U2 ; θ12 ), U3 ; θ13; 2 ] θ12 =0 Z  P P = ∂θ12 log c13; 2 F˜1|2 (u1 |u2 ; θ12 ), u3 ; θ13; 2 c1:3 (u1:3 )du1:3 θ12 =0

[0,1]3

∂1 c13; 2 F˜1|2 (u1 |u2 ; θ12 ), u3 ; θ13; 2 ˜1|2 (u1 |u2 ; θ12 ) ∂ F c1:3 (u1:3 )du1:3 , =  θ 12 P P ˜ θ12 =0 [0,1]3 c13; 2 F1|2 (u1 |u2 ; θ12 ), u3 ; θ13; 2 Z

P

P



24

F. Spanhel and M. S. Kurz

where $\partial_1 c^{P}_{13;2}(u,v;\theta^{P}_{13;2})$ is the partial derivative wrt $u$ and we have used Leibniz's integral rule to perform the differentiation under the integral sign for the second last equality, which is valid since the integrand and its partial derivative wrt $\theta_{12}$ are both continuous in $u_{1:3}$ and $\theta_{12}$ on $(0,1)^3\times(-1,1)$. To compute the integral we observe that
\begin{align*}
\partial_1 c^{P}_{13;2}(u,v;\theta^{P}_{13;2}) &= -2\theta^{P}_{13;2}(1-2v),\\
\partial_1 c^{P}_{13;2}\big(\tilde{F}_{1|2}(u_1|u_2;0),u_3;\theta^{P}_{13;2}\big) &= -2\theta^{P}_{13;2}(1-2u_3),\\
\frac{\partial_1 c^{P}_{13;2}\big(\tilde{F}_{1|2}(u_1|u_2;\theta_{12}),u_3;\theta^{P}_{13;2}\big)}{c^{P}_{13;2}\big(\tilde{F}_{1|2}(u_1|u_2;\theta_{12}),u_3;\theta^{P}_{13;2}\big)}\bigg|_{\theta_{12}=0} &= \frac{-2\theta^{P}_{13;2}(1-2u_3)}{1+\theta^{P}_{13;2}(1-2u_1)(1-2u_3)} =: m(u_1,u_3;\theta^{P}_{13;2}).
\end{align*}
Note that $m(u_1,u_3;\theta^{P}_{13;2})$ does not depend on $u_2$. Moreover, with $c_{1:3}(u_{1:3}) = 1+g(u_2)(1-2u_1)(1-2u_3)$,
\begin{align*}
\int_0^1\partial_{\theta_{12}}\tilde{F}_{1|2}(u_1|u_2;\theta_{12})\big|_{\theta_{12}=0}\,c_{1:3}(u_{1:3})\,\mathrm{d}u_2
&= \int_0^1\partial_{\theta_{12}}\tilde{F}_{1|2}(u_1|u_2;\theta_{12})\big|_{\theta_{12}=0}\,\mathrm{d}u_2\\
&\quad+(1-2u_1)(1-2u_3)\int_0^1\partial_{\theta_{12}}\tilde{F}_{1|2}(u_1|u_2;\theta_{12})\big|_{\theta_{12}=0}\,g(u_2)\,\mathrm{d}u_2\\
&= (1-2u_1)(1-2u_3)\,h(u_1;g),
\end{align*}
where the second equality follows because
\[
\int_0^1\partial_{\theta_{12}}\tilde{F}_{1|2}(u_1|u_2;\theta_{12})\,\mathrm{d}u_2 = \partial_{\theta_{12}}\int_0^1\tilde{F}_{1|2}(u_1|u_2;\theta_{12})\,\mathrm{d}u_2 = \partial_{\theta_{12}}u_1 = 0.
\]
Thus, integrating out $u_2$, we obtain
\begin{align}
\partial_{\theta_{12}}\mathbb{E}[\log\tilde{c}_{1:3}(U_{1:3};\theta_{12})]\big|_{\theta_{12}=0}
&= \int_{[0,1]^2} m(u_1,u_3;\theta^{P}_{13;2})(1-2u_1)(1-2u_3)\,h(u_1;g)\,\mathrm{d}u_1\,\mathrm{d}u_3\nonumber\\
&= \int_{[0,1]^2} f(u_1,u_3;\theta^{P}_{13;2})\,h(u_1;g)\,\mathrm{d}u_1\,\mathrm{d}u_3,\tag{A.1}
\end{align}
where $f(u_1,u_3;\theta^{P}_{13;2}) := m(u_1,u_3;\theta^{P}_{13;2})(1-2u_1)(1-2u_3)$. We note that for all $u_1\in(0,0.5)$, $u_3\in(0,1)$:
\begin{align*}
f(0.5+u_1,u_3;\theta^{P}_{13;2}) > 0 &\iff \theta^{P}_{13;2} > 0,\\
f(0.5-u_1,u_3;\theta^{P}_{13;2}) &= -f(0.5+u_1,1-u_3;\theta^{P}_{13;2}).
\end{align*}

So, if $u_1\in(0,0.5)$, then
\[
\int_0^1 f(0.5-u_1,u_3;\theta^{P}_{13;2})\,\mathrm{d}u_3 = \int_0^1 f(0.5-u_1,1-u_3;\theta^{P}_{13;2})\,\mathrm{d}u_3 = -\int_0^1 f(0.5+u_1,u_3;\theta^{P}_{13;2})\,\mathrm{d}u_3.
\]
Thus, if we define $K(u_1;\theta^{P}_{13;2}) := \int_0^1 f(u_1,u_3;\theta^{P}_{13;2})\,\mathrm{d}u_3$, we have that for all $u_1\in(0,0.5)$:
\begin{align}
K(0.5+u_1;\theta^{P}_{13;2}) > 0 &\iff \theta^{P}_{13;2} > 0,\nonumber\\
K(0.5-u_1;\theta^{P}_{13;2}) &= -K(0.5+u_1;\theta^{P}_{13;2}).\tag{A.2}
\end{align}

Simplified vine copula models


Plugging this into our integral (A.1) yields
\begin{align*}
\partial_{\theta_{12}}\mathbb{E}[\log\tilde{c}_{1:3}(U_{1:3};\theta_{12},\theta^{P}_{13;2})]\big|_{\theta_{12}=0}
&= \int_0^1\!\!\int_0^1 f(u_1,u_3;\theta^{P}_{13;2})\,h(u_1;g)\,\mathrm{d}u_1\,\mathrm{d}u_3\\
&= \int_0^{0.5} h(0.5-u_1;g)\left(\int_0^1 f(0.5-u_1,u_3;\theta^{P}_{13;2})\,\mathrm{d}u_3\right)\mathrm{d}u_1\\
&\quad+\int_0^{0.5} h(0.5+u_1;g)\left(\int_0^1 f(0.5+u_1,u_3;\theta^{P}_{13;2})\,\mathrm{d}u_3\right)\mathrm{d}u_1\\
&= \int_0^{0.5} h(0.5-u_1;g)\,K(0.5-u_1;\theta^{P}_{13;2})\,\mathrm{d}u_1+\int_0^{0.5} h(0.5+u_1;g)\,K(0.5+u_1;\theta^{P}_{13;2})\,\mathrm{d}u_1\\
&\overset{\text{(A.2)}}{=} \int_0^{0.5} K(0.5+u_1;\theta^{P}_{13;2})\,\big[h(0.5+u_1;g)-h(0.5-u_1;g)\big]\,\mathrm{d}u_1.
\end{align*}
Note that if $\theta^{P}_{13;2}=0$, then $K(u_1;\theta^{P}_{13;2})=0$ for all $u_1\in(0,1)$, or, if $g$ does not depend on $u_2$, then $h(u_1;g)=0$ for all $u_1\in(0,1)$; in both cases the integrand is zero and we have an extremum.
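The two properties in (A.2) can also be checked numerically. The following sketch (the value $\theta^{P}_{13;2}=0.4$ is an arbitrary illustration, not taken from the proof) evaluates $K$ with a simple midpoint rule:

```python
# Numerical sanity check of (A.2) for the integrand used above:
# f(u1, u3; t) = m(u1, u3; t) * (1 - 2*u1) * (1 - 2*u3), with
# m(u1, u3; t) = -2*t*(1 - 2*u3) / (1 + t*(1 - 2*u1)*(1 - 2*u3)).
# The parameter value t = 0.4 is an arbitrary choice for illustration.

def f(u1, u3, t):
    m = -2.0 * t * (1 - 2 * u3) / (1 + t * (1 - 2 * u1) * (1 - 2 * u3))
    return m * (1 - 2 * u1) * (1 - 2 * u3)

def K(u1, t, n=20000):
    # midpoint rule for K(u1; t) = int_0^1 f(u1, u3; t) du3
    return sum(f(u1, (k + 0.5) / n, t) for k in range(n)) / n

t = 0.4
for eps in (0.1, 0.25, 0.4):
    # antisymmetry around u1 = 0.5 and positivity for t > 0, cf. (A.2)
    assert abs(K(0.5 - eps, t) + K(0.5 + eps, t)) < 1e-9
    assert K(0.5 + eps, t) > 0
```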

A.3. Proof of Proposition 4.1.

Let $j=3,\dots,K-1$ and $i=1,\dots,K-j$. For $j=3$, we have
\[
U^{\mathrm{PVC}}_{i|i+1:i+2} = \partial_2 C^{\mathrm{PVC}}_{i,i+2;i+1}\big(F_{i|i+1}(U_i|U_{i+1}),F_{i+2|i+1}(U_{i+2}|U_{i+1})\big) = F_{U_{i|i+1}|U_{i+2|i+1}}(U_{i|i+1}|U_{i+2|i+1}),
\]
where the first equality follows from Definition 4.1 and the second equality follows because $C^{\mathrm{PVC}}_{i,i+2;i+1} = C^{P}_{i,i+2;i+1}$ is a copula and $(U_{i|i+1},U_{i+2|i+1})\sim C^{P}_{i,i+2;i+1}$. Using similar arguments, we obtain
\[
U^{\mathrm{PVC}}_{i+3|i+1:i+2} = F_{U_{i+3|i+2}|U_{i+1|i+2}}(U_{i+3|i+2}|U_{i+1|i+2}).
\]
Thus, for $k=i,i+3$, $U^{\mathrm{PVC}}_{k|i+1:i+2}$ is a CPIT and by Proposition 2.1 we conclude that $U^{\mathrm{PVC}}_{k|i+1:i+2}\sim U(0,1)$, so that $C^{\mathrm{PVC}}_{i,i+3;i+1:i+2}$ is indeed a copula.

Now let $j=4,\dots,K-1$ and assume that $C^{\mathrm{PVC}}_{i,i+j-1;S_{i,j-1}}$ and $C^{\mathrm{PVC}}_{i+1,i+j;S_{i+1,j-1}}$ are copulas. Because $C^{\mathrm{PVC}}_{i,i+j-1;S_{i,j-1}}$ is a copula and $(U^{\mathrm{PVC}}_{i|S_{i,j-1}},U^{\mathrm{PVC}}_{i+j-1|S_{i,j-1}})\sim C^{\mathrm{PVC}}_{i,i+j-1;S_{i,j-1}}$, we obtain that
\[
U^{\mathrm{PVC}}_{i|S_{ij}} = \partial_2 C^{\mathrm{PVC}}_{i,i+j-1;S_{i,j-1}}\big(U^{\mathrm{PVC}}_{i|S_{i,j-1}},U^{\mathrm{PVC}}_{i+j-1|S_{i,j-1}}\big) = F_{U^{\mathrm{PVC}}_{i|S_{i,j-1}}|U^{\mathrm{PVC}}_{i+j-1|S_{i,j-1}}}\big(U^{\mathrm{PVC}}_{i|S_{i,j-1}}\big|U^{\mathrm{PVC}}_{i+j-1|S_{i,j-1}}\big).
\]
Using similar arguments, it follows that
\[
U^{\mathrm{PVC}}_{i+j|S_{ij}} = F_{U^{\mathrm{PVC}}_{i+j|S_{i+1,j-1}}|U^{\mathrm{PVC}}_{i+1|S_{i+1,j-1}}}\big(U^{\mathrm{PVC}}_{i+j|S_{i+1,j-1}}\big|U^{\mathrm{PVC}}_{i+1|S_{i+1,j-1}}\big).
\]


This shows that, for $k=i,i+j$, $U^{\mathrm{PVC}}_{k|S_{ij}}$ is a CPIT and so it is uniformly distributed. Because $(U^{\mathrm{PVC}}_{i|S_{ij}},U^{\mathrm{PVC}}_{i+j|S_{ij}})\sim C^{\mathrm{PVC}}_{i,i+j;S_{ij}}$, we conclude that $C^{\mathrm{PVC}}_{i,i+j;S_{ij}}$ is indeed a copula.
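The CPIT argument can be illustrated by simulation. The sketch below (a toy example; the bivariate FGM copula and the parameter $\theta=0.7$ are assumptions made only for this illustration) checks that the conditional probability integral transform of simulated pairs is approximately uniform:

```python
import random

# Toy illustration of the CPIT argument (not part of the proof): if (U,V)
# follows the bivariate FGM copula C(u,v) = u*v*(1 + theta*(1-u)*(1-v)),
# here with the arbitrary choice theta = 0.7, then the CPIT
# Z = F_{V|U}(V|U) = dC/du is (approximately) U(0,1).

random.seed(1)
theta, n = 0.7, 100_000

def fgm_quantile(w, h):
    # quantile of U | V = v for FGM: solves u*(1+h) - h*u**2 = w, h = theta*(1-2v)
    if abs(h) < 1e-12:
        return w
    return (1 + h - ((1 + h) ** 2 - 4 * h * w) ** 0.5) / (2 * h)

zs = []
for _ in range(n):
    v = random.random()
    u = fgm_quantile(random.random(), theta * (1 - 2 * v))  # U | V = v
    zs.append(v * (1 + theta * (1 - v) * (1 - 2 * u)))      # Z = F_{V|U}(v|u)

mean = sum(zs) / n
second_moment = sum(z * z for z in zs) / n
cdf_quarter = sum(z <= 0.25 for z in zs) / n
# uniform distribution: mean 1/2, second moment 1/3, CDF(0.25) = 0.25
assert abs(mean - 0.5) < 0.01
assert abs(second_moment - 1 / 3) < 0.01
assert abs(cdf_quarter - 0.25) < 0.01
```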

A.4. Proof of Theorem 4.1.

It is obvious that $D^{(1)}_{\mathrm{KL}}(T_1(T_{1:0}))$ is minimized if $T_1$ contains the true copulas of the first tree, i.e., $T_1 = T_1^{\mathrm{PVC}}$. Now let $j\ge 2$. The KL divergence related to tree $j$, $D^{(j)}_{\mathrm{KL}}(T_j(T_{1:j-1}))$, is minimized when the negative cross entropy related to tree $j$ is maximized. The negative cross entropy related to tree $j$ is given by
\begin{align*}
H^{(j)}(T_j(T_{1:j-1})) &:= \sum_{i=1}^{K-j}\mathbb{E}\Big[\log\tilde{c}_{i,i+j;S_{ij}}\big(\tilde{F}_{i|S_{ij}}(U_i|U_{S_{ij}}),\tilde{F}_{i+j|S_{ij}}(U_{i+j}|U_{S_{ij}})\big)\Big]\\
&=: \sum_{i=1}^{K-j} H^{(j)}_i\big(\tilde{c}_{i,i+j;S_{ij}},\tilde{F}_{i|S_{ij}},\tilde{F}_{i+j|S_{ij}}\big).
\end{align*}
Obviously, to maximize $H^{(j)}(T_j(T_{1:j-1}))$ wrt $T_j$ we can maximize each $H^{(j)}_i(\tilde{c}_{i,i+j;S_{ij}},\tilde{F}_{i|S_{ij}},\tilde{F}_{i+j|S_{ij}})$ individually for all $i=1,\dots,K-j$. If $j=2$, then

\[
H^{(2)}_i\big(\tilde{c}_{i,i+j;S_{ij}},F^{\mathrm{PVC}}_{i|S_{ij}},F^{\mathrm{PVC}}_{i+j|S_{ij}}\big) = H^{(2)}_i\big(\tilde{c}_{i,i+2;i+1},\partial_2 C_{i,i+1},\partial_1 C_{i+1,i+2}\big),
\]
which is maximized if $\tilde{c}_{i,i+2;i+1} = c^{P}_{i,i+2;i+1}$ according to Theorem 3.1. Thus, if $j=2$, then
\begin{equation}
\operatorname*{arg\,min}_{T_j\in\mathcal{T}_j} D^{(j)}_{\mathrm{KL}}\big(T_j(T^{\mathrm{PVC}}_{1:j-1})\big) = T_j^{\mathrm{PVC}}.\tag{A.3}
\end{equation}
To show that (A.3) holds for $j\ge 3$ we use induction. Assume that
\[
\operatorname*{arg\,min}_{T_j\in\mathcal{T}_j} D^{(j)}_{\mathrm{KL}}\big(T_j(T^{\mathrm{PVC}}_{1:j-1})\big) = T_j^{\mathrm{PVC}}
\]
holds for $2\le j\le K-2$. To minimize the KL divergence related to tree $j+1 =: n$ wrt $T_n$, conditional on $T_{1:n-1} = T^{\mathrm{PVC}}_{1:n-1}$, we have to maximize the negative cross entropy, which is maximized if
\[
H^{(n)}_i\big(\tilde{c}_{i,i+n;S_{i,n}},F^{\mathrm{PVC}}_{i|S_{i,n}},F^{\mathrm{PVC}}_{i+n|S_{i,n}}\big) = \mathbb{E}\Big[\log\tilde{c}_{i,i+n;S_{i,n}}\big(F^{\mathrm{PVC}}_{i|S_{i,n}}(U_i|U_{S_{i,n}}),F^{\mathrm{PVC}}_{i+n|S_{i,n}}(U_{i+n}|U_{S_{i,n}})\big)\Big]
\]


is maximized for all $i=1,\dots,K-n$. Using the substitutions $u_i = (F^{\mathrm{PVC}}_{i|S_{i,n}})^{-1}(t_i|u_{S_{i,n}}) = G_{i|S_{i,n}}(t_i|u_{S_{i,n}})$ and $u_{i+n} = (F^{\mathrm{PVC}}_{i+n|S_{i,n}})^{-1}(t_{i+n}|u_{S_{i,n}}) = G_{i+n|S_{i,n}}(t_{i+n}|u_{S_{i,n}})$, we obtain
\begin{align*}
&H^{(n)}_i\big(\tilde{c}_{i,i+n;S_{i,n}},\tilde{F}_{i|S_{i,n}},\tilde{F}_{i+n|S_{i,n}}\big)\\
&\quad= \int_{[0,1]^{n+1}}\log\tilde{c}_{i,i+n;S_{i,n}}(t_i,t_{i+n})\\
&\qquad\times c_{i,i+n|S_{i,n}}\Big(F_{i|S_{i,n}}\big(G_{i|S_{i,n}}(t_i|u_{S_{i,n}})\big|u_{S_{i,n}}\big),F_{i+n|S_{i,n}}\big(G_{i+n|S_{i,n}}(t_{i+n}|u_{S_{i,n}})\big|u_{S_{i,n}}\big)\,\Big|\,u_{S_{i,n}}\Big)\\
&\qquad\times\frac{\prod_{k=i,i+n} f_{k|S_{i,n}}\big(G_{k|S_{i,n}}(t_k|u_{S_{i,n}})\big|u_{S_{i,n}}\big)}{\prod_{k=i,i+n} f^{\mathrm{PVC}}_{k|S_{i,n}}\big(G_{k|S_{i,n}}(t_k|u_{S_{i,n}})\big|u_{S_{i,n}}\big)}\,c_{S_{i,n}}(u_{S_{i,n}})\,\mathrm{d}u_{S_{i,n}}\,\mathrm{d}t_i\,\mathrm{d}t_{i+n}\\
&\quad= \int_{[0,1]^2}\log\tilde{c}_{i,i+n;S_{i,n}}(t_i,t_{i+n})\,c^{\mathrm{PVC}}_{i,i+n;S_{i,n}}(t_i,t_{i+n})\,\mathrm{d}t_i\,\mathrm{d}t_{i+n},
\end{align*}
where the last equality follows from Fubini's theorem and the fact that integrating out $u_{S_{i,n}}$ yields $c^{\mathrm{PVC}}_{i,i+n;S_{i,n}}$ by definition. This expression is maximized for $\tilde{c}_{i,i+n;S_{i,n}} = c^{\mathrm{PVC}}_{i,i+n;S_{i,n}} = c^{\mathrm{PVC}}_{i,i+(j+1);S_{i,j+1}}$ by Gibbs' inequality.
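The Gibbs'-inequality step states that an expected log-density over a family of densities containing the true one is maximized at the true density. A minimal numerical illustration (the FGM family and the true parameter $t_0=0.5$ are assumptions for this sketch, not the paper's setup):

```python
import math

# Minimal illustration of Gibbs' inequality (assumed setup, not from the
# paper): over a family of densities q_t containing the true density
# p = q_{t0}, the expected log-density E_p[log q_t] is maximized at t = t0.
# Here q_t is the bivariate FGM density 1 + t*(1-2u)*(1-2v) and t0 = 0.5.

def neg_cross_entropy(t, t0=0.5, n=200):
    # midpoint-rule approximation of E_p[log q_t]
    acc = 0.0
    for i in range(n):
        u = (i + 0.5) / n
        for j in range(n):
            v = (j + 0.5) / n
            w = (1 - 2 * u) * (1 - 2 * v)
            acc += math.log(1 + t * w) * (1 + t0 * w)
    return acc / n ** 2

grid = [k / 10 for k in range(-9, 10)]  # candidate parameters in (-1, 1)
best = max(grid, key=neg_cross_entropy)
assert best == 0.5
```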

A.5. Proof of Proposition 4.2.

Without loss of generality, let $C_{1:4}\in\mathcal{C}_4$ be the data generating process.\footnote{If $C_{1:4}$ is not the data generating process, we have to estimate marginal distributions to obtain pseudo-data from $C_{1:4}$. However, this does not affect the following results.} Let $\{\tilde{C}_{1:6}(\tilde\theta) : \tilde\theta\in\Theta\}$, with
\[
\tilde{C}_{1:6}(\tilde\theta) = \big(C_1(\tilde\theta_1),C_2(\tilde\theta_2),C_3(\tilde\theta_3),C_4(\tilde\theta_4),C_5(\tilde\theta_5),C_6(\tilde\theta_6)\big) = \big(\tilde{C}_{12}(\tilde\theta_1),\tilde{C}_{23}(\tilde\theta_2),\tilde{C}_{34}(\tilde\theta_3),\tilde{C}_{13;2}(\tilde\theta_4),\tilde{C}_{24;3}(\tilde\theta_5),\tilde{C}_{14;23}(\tilde\theta_6)\big).
\]
By assumption, there exists a $\theta\in\Theta$ so that
\begin{align*}
(U_i,U_{i+1}) &\sim C_i(\theta_i), && i=1,2,3,\\
(U_{i|i+1},U_{i+2|i+1}) &\sim C_j(\theta_j), && j=4,5,\ i=j-3,\\
(U^{\mathrm{PVC}}_{1|2:3},U^{\mathrm{PVC}}_{4|2:3}) &\sim C_6(\theta_6).
\end{align*}
That is, $C_{1:6}(\theta) = (C^{\mathrm{PVC}}_{12},C^{\mathrm{PVC}}_{23},C^{\mathrm{PVC}}_{34},C^{\mathrm{PVC}}_{13;2},C^{\mathrm{PVC}}_{24;3},C^{\mathrm{PVC}}_{14;23})$. Moreover, without loss of generality, assume that each pair-copula of the PVC has one scalar parameter and that none of the copulas share the same parameter.\footnote{If the pair-copulas share parameters, Condition 1 in Hobæk Haff [18] applies with $\mathrm{d}C^{\mathrm{PVC}}_{1:d}$ replaced by $\mathrm{d}C_{1:d}$.} Let $N\in\mathbb{N}$ and let $(U^n_{1:4})_{n=1}^{N}$ be $N$ independent copies of $U_{1:4}$. For each $n=1,\dots,N$, define
\begin{align*}
\big(U^n_{i|i+1}(\tilde\theta_i),U^n_{i+2|i+1}(\tilde\theta_{i+1})\big) &= \big(\partial_2 C_i(U^n_i|U^n_{i+1};\tilde\theta_i),\,\partial_1 C_{i+1}(U^n_{i+2}|U^n_{i+1};\tilde\theta_{i+1})\big),\quad i=1,2,\\
\big(U^n_{1|2:3}(\tilde\theta_{1:2},\tilde\theta_4),U^n_{4|2:3}(\tilde\theta_{2:3},\tilde\theta_5)\big) &= \big(\tilde{F}^{\mathrm{PVC}}_{1|2:3}(U^n_1|U^n_{2:3};\tilde\theta_{1:2},\tilde\theta_4),\,\tilde{F}^{\mathrm{PVC}}_{4|2:3}(U^n_4|U^n_{2:3};\tilde\theta_{2:3},\tilde\theta_5)\big)\\
&= \big(\partial_2 C_4(U^n_{1|2}(\tilde\theta_1),U^n_{3|2}(\tilde\theta_2);\tilde\theta_4),\,\partial_1 C_5(U^n_{2|3}(\tilde\theta_2),U^n_{4|3}(\tilde\theta_3);\tilde\theta_5)\big).
\end{align*}

Consider the following estimating function
\[
\Psi^N(\tilde\theta_{1:6}) = \big(\Psi^N_1(U_{1:4};\tilde\theta_1),\Psi^N_2(U_{1:4};\tilde\theta_2),\Psi^N_3(U_{1:4};\tilde\theta_3),\Psi^N_4(U_{1:4};\tilde\theta_{1:2},\tilde\theta_4),\Psi^N_5(U_{1:4};\tilde\theta_{2:3},\tilde\theta_5),\Psi^N_6(U_{1:4};\tilde\theta_{1:6})\big),
\]
where
\begin{align*}
\Psi^N_i(U_{1:4};\tilde\theta_i) &= N^{-1}\sum_{n=1}^{N}\partial_{\tilde\theta_i}\log c_i(U^n_i,U^n_{i+1};\tilde\theta_i), && i=1,2,3,\\
\Psi^N_i(U_{1:4};\tilde\theta_{j:j+1},\tilde\theta_{j+3}) &= N^{-1}\sum_{n=1}^{N}\partial_{\tilde\theta_i}\log c_i\big(U^n_{j|j+1}(\tilde\theta_j),U^n_{j+2|j+1}(\tilde\theta_{j+1});\tilde\theta_{j+3}\big), && i=4,5,\ j=i-3,\\
\Psi^N_6(U_{1:4};\tilde\theta_{1:6}) &= N^{-1}\sum_{n=1}^{N}\partial_{\tilde\theta_6}\log c_6\big(U^n_{1|2:3}(\tilde\theta_{1:2},\tilde\theta_4),U^n_{4|2:3}(\tilde\theta_{2:3},\tilde\theta_5);\tilde\theta_6\big),
\end{align*}
so that a solution $\hat\theta_{1:6}$ of the estimating equation $\Psi^N(\tilde\theta_{1:6}) = 0$ is a Z-estimator for $\theta_{1:6}$. This Z-estimator is the stepwise ML estimator defined in Aas et al. [5] or Hobæk Haff [18].

The regularity conditions for asymptotic normality (and thus consistency) of $\sqrt{N}(\hat\theta_{1:6}-\theta_{1:6})$ (Theorem 1 in Hobæk Haff [18]) are conditions that are imposed on the components of the estimating function $\Psi^N$, which are functions of the pair-copulas $(C_1,\dots,C_6)$. That is, the regularity conditions are placed on the pair-copulas of the PVC but not on $C_{1:4}$. Theorem 4.1 and the regularity conditions imply that $\theta_{1:6}$ is a zero of the mapping
\[
\tilde\theta_{1:6}\mapsto m(\tilde\theta_{1:6}) = \Big(\mathbb{E}[\partial_{\tilde\theta_1}\log c_1(U_1,U_2;\tilde\theta_1)],\dots,\mathbb{E}[\partial_{\tilde\theta_6}\log c_6(U^n_{1|2:3}(\tilde\theta_{1:2},\tilde\theta_4),U^n_{4|2:3}(\tilde\theta_{2:3},\tilde\theta_5);\tilde\theta_6)]\Big)',
\]
irrespective of whether the simplifying assumption holds for $C_{1:4}$. The conditions stated in Theorem 1 of Hobæk Haff [18], which imply that $\hat\theta_{1:6}$ weakly converges to $\theta_{1:6}$, are also only related to the PVC. Thus, the stepwise ML estimator defined in Aas et al. [5] or Hobæk Haff [18] is consistent for $\theta_{1:6}$ if $C_{1:4}$ is the data generating copula and its PVC is given by $\tilde{C}_{1:4}(\theta)$. It also follows that the asymptotic distribution of $\sqrt{N}(\hat\theta_{1:6}-\theta_{1:6})$ does not depend on $C_{1:4}$ if the PVC is fixed. For $K>4$ the same arguments apply.
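The stepwise idea can be sketched in miniature. The toy example below (not the paper's estimator: a single tree-1 FGM pair-copula, an assumed true parameter, and a grid search in place of solving $\Psi^N = 0$) recovers the parameter from simulated data:

```python
import math
import random

# Toy sketch of the stepwise-ML idea behind the Z-estimator above, reduced to
# one tree-1 pair-copula. The FGM family, theta0 = 0.8, the sample size, and
# the grid search (instead of solving the score equation Psi^N = 0) are all
# illustrative assumptions, not the paper's setup.

random.seed(7)
theta0, N = 0.8, 20_000

def fgm_sample(t):
    # conditional sampling: V ~ U(0,1), then U | V via the FGM quantile function
    v = random.random()
    h, w = t * (1 - 2 * v), random.random()
    u = w if abs(h) < 1e-12 else (1 + h - math.sqrt((1 + h) ** 2 - 4 * h * w)) / (2 * h)
    return u, v

data = [fgm_sample(theta0) for _ in range(N)]

def loglik(t):
    # sample analogue of E[log c(U,V;t)] for the FGM density 1 + t*(1-2u)*(1-2v)
    return sum(math.log(1 + t * (1 - 2 * u) * (1 - 2 * v)) for u, v in data) / N

grid = [k / 50 for k in range(-49, 50)]  # candidate parameters in (-0.98, 0.98)
theta_hat = max(grid, key=loglik)
assert abs(theta_hat - theta0) < 0.15
```

With more variables, higher-tree parameters would be estimated analogously after transforming the data with the fitted lower-tree copulas; this sequential structure is what makes the estimator stepwise.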

A.6. Proof of Lemma 4.1.

For the copula in the fourth tree of the PVC it holds
\begin{align*}
C^{\mathrm{PVC}}_{15;2:4}(a,b) &= P\big(U^{\mathrm{PVC}}_{1|2:4}\le a,U^{\mathrm{PVC}}_{5|2:4}\le b\big) \overset{(4.14),(4.15)}{=} P\big(U_{1|4}\le a,U_{5|2}\le b\big) = P\big(U_1\le F^{-1}_{1|4}(a|U_4),U_5\le F^{-1}_{5|2}(b|U_2)\big)\\
&= \int_{[0,1]^3} F_{15|2:4}\big(F^{-1}_{1|4}(a|u_4),F^{-1}_{5|2}(b|u_2)\,\big|\,u_{2:4}\big)\,c_{2:4}(u_{2:4})\,\mathrm{d}u_{2:4}\\
&= \int_{[0,1]^3} C_{15|2:4}\big(F_{1|2:4}(F^{-1}_{1|4}(a|u_4)|u_{2:4}),F_{5|2:4}(F^{-1}_{5|2}(b|u_2)|u_{2:4})\,\big|\,u_{2:4}\big)\,c_{2:4}(u_{2:4})\,\mathrm{d}u_{2:4}\\
&\overset{(4.2)}{=} \int_{[0,1]^3} F_{1|2:4}\big(F^{-1}_{1|4}(a|u_4)\big|u_{2:4}\big)\,F_{5|2:4}\big(F^{-1}_{5|2}(b|u_2)\big|u_{2:4}\big)\,c_{2:4}(u_{2:4})\,\mathrm{d}u_{2:4}\\
&\overset{(4.12),(4.13)}{=} \int_{[0,1]^3} F_{1|2:3}\big(F^{-1}_{1|4}(a|u_4)\big|u_{2:3}\big)\,F_{5|3:4}\big(F^{-1}_{5|2}(b|u_2)\big|u_{3:4}\big)\,c_{2:4}(u_{2:4})\,\mathrm{d}u_{2:4}\\
&\overset{(4.7),(4.8)}{=} \int_{[0,1]^3} F^{-1}_{1|4}(a|u_4)\big[1+g(u_2)\big(1-F^{-1}_{1|4}(a|u_4)\big)(1-2u_3)\big]\,F^{-1}_{5|2}(b|u_2)\big[1+g(u_4)\big(1-F^{-1}_{5|2}(b|u_2)\big)(1-2u_3)\big]\\
&\qquad\times\big[1+g(u_3)(1-2u_2)(1-2u_4)\big]\,\mathrm{d}u_{2:4}\\
&= \int_{[0,1]^2} F^{-1}_{1|4}(a|u_4)\,F^{-1}_{5|2}(b|u_2)\,\bigg[1+\int_{[0,1]}(1-2u_3)^2\,\mathrm{d}u_3\,\big(1-F^{-1}_{1|4}(a|u_4)\big)\big(1-F^{-1}_{5|2}(b|u_2)\big)g(u_4)g(u_2)\\
&\qquad+\int_{[0,1]}(1-2u_3)g(u_3)\,\mathrm{d}u_3\,\big(1-F^{-1}_{5|2}(b|u_2)\big)(1-2u_2)(1-2u_4)g(u_4)\\
&\qquad+\int_{[0,1]}(1-2u_3)g(u_3)\,\mathrm{d}u_3\,\big(1-F^{-1}_{1|4}(a|u_4)\big)(1-2u_4)(1-2u_2)g(u_2)\bigg]\,\mathrm{d}u_2\,\mathrm{d}u_4,
\end{align*}

where we used that $\int_{[0,1]}(1-2u_3)\,\mathrm{d}u_3 = 0$ and, by (4.4), $\int_{[0,1]}g(u_3)\,\mathrm{d}u_3 = 0$ and $\int_{[0,1]}(1-2u_3)^2 g(u_3)\,\mathrm{d}u_3 = 0$. By setting $\gamma := -2\int_{[0,1]}u\,g(u)\,\mathrm{d}u$ we can write the copula function as
\begin{align*}
C^{\mathrm{PVC}}_{15;2:4}(a,b) &= \int_{[0,1]^2} F^{-1}_{1|4}(a|u_4)\,F^{-1}_{5|2}(b|u_2)\,\Big[1+\tfrac{1}{3}\big(1-F^{-1}_{1|4}(a|u_4)\big)\big(1-F^{-1}_{5|2}(b|u_2)\big)g(u_4)g(u_2)\\
&\qquad+\gamma\big(1-F^{-1}_{5|2}(b|u_2)\big)(1-2u_2)(1-2u_4)g(u_4)+\gamma\big(1-F^{-1}_{1|4}(a|u_4)\big)(1-2u_4)(1-2u_2)g(u_2)\Big]\,\mathrm{d}u_2\,\mathrm{d}u_4\\
&= \left(\int_0^1 F^{-1}_{1|4}(a|u_4)\,\mathrm{d}u_4\right)\left(\int_0^1 F^{-1}_{5|2}(b|u_2)\,\mathrm{d}u_2\right)\\
&\quad+\frac{1}{3}\left(\int_0^1 g(u_4)\big(F^{-1}_{1|4}(a|u_4)-(F^{-1}_{1|4}(a|u_4))^2\big)\,\mathrm{d}u_4\right)\left(\int_0^1 g(u_2)\big(F^{-1}_{5|2}(b|u_2)-(F^{-1}_{5|2}(b|u_2))^2\big)\,\mathrm{d}u_2\right)\\
&\quad+\gamma\left(\int_0^1(1-2u_2)\big(F^{-1}_{5|2}(b|u_2)-(F^{-1}_{5|2}(b|u_2))^2\big)\,\mathrm{d}u_2\right)\left(\int_0^1(1-2u_4)g(u_4)F^{-1}_{1|4}(a|u_4)\,\mathrm{d}u_4\right)\\
&\quad+\gamma\left(\int_0^1(1-2u_4)\big(F^{-1}_{1|4}(a|u_4)-(F^{-1}_{1|4}(a|u_4))^2\big)\,\mathrm{d}u_4\right)\left(\int_0^1(1-2u_2)g(u_2)F^{-1}_{5|2}(b|u_2)\,\mathrm{d}u_2\right).
\end{align*}
If $(U,V)\sim C^{\mathrm{FGM2}}(\theta)$, the quantile function is given by (cf. Remillard [30])
\[
F^{-1}_{U|V}(u|v) = \frac{1+h(v)-\sqrt{(1+h(v))^2-4h(v)u}}{2h(v)},\quad\text{with } h(v) := \theta(1-2v),
\]
which implies
\begin{equation}
\frac{\partial}{\partial u}F^{-1}_{U|V}(u|v) = \frac{1}{\sqrt{(1+h(v))^2-4h(v)u}} =: G(u,v),\tag{A.4}
\end{equation}


and
\begin{equation}
\frac{\partial}{\partial u}\big(F^{-1}_{U|V}(u|v)\big)^2 = \frac{1}{h(v)}\big[(1+h(v))\,G(u,v)-1\big].\tag{A.5}
\end{equation}
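The quantile function and the derivatives (A.4) and (A.5) can be verified symbolically. The sketch below assumes only the standard FGM conditional cdf $F_{U|V}(u|v) = u(1+h(v)) - h(v)u^2$:

```python
import sympy as sp

# Symbolic check of the FGM quantile function and of (A.4)/(A.5). For the FGM
# copula the conditional cdf is F_{U|V}(u|v) = u*(1+h) - h*u**2 with h = h(v);
# only this fact is assumed below.

u, h = sp.symbols('u h', positive=True)
S = sp.sqrt((1 + h) ** 2 - 4 * h * u)
Finv = (1 + h - S) / (2 * h)  # stated quantile function

# plugging the quantile into the conditional cdf returns the identity
assert sp.simplify(Finv * (1 + h) - h * Finv ** 2 - u) == 0
# (A.4): d/du F^{-1} = 1/S =: G(u,v)
assert sp.simplify(sp.diff(Finv, u) - 1 / S) == 0
# (A.5): d/du (F^{-1})**2 = ((1+h)/S - 1)/h
assert sp.simplify(sp.diff(Finv ** 2, u) - ((1 + h) / S - 1) / h) == 0
```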

For the density of the copula in the fourth tree of the PVC it follows
\begin{align*}
c^{\mathrm{PVC}}_{15;2:4}(a,b) &= \frac{\partial^2}{\partial a\,\partial b}C^{\mathrm{PVC}}_{15;2:4}(a,b)\\
&\overset{\text{(A.4),(A.5)}}{=}\left(\int_0^1 G(a,u_4)\,\mathrm{d}u_4\right)\left(\int_0^1 G(b,u_2)\,\mathrm{d}u_2\right)\\
&\quad+\frac{1}{\gamma}\left(1-\int_0^1 G(b,u_2)\,\mathrm{d}u_2\right)\left(\int_0^1(1-2u_4)g(u_4)G(a,u_4)\,\mathrm{d}u_4\right)\\
&\quad+\frac{1}{\gamma}\left(1-\int_0^1 G(a,u_4)\,\mathrm{d}u_4\right)\left(\int_0^1(1-2u_2)g(u_2)G(b,u_2)\,\mathrm{d}u_2\right)\\
&\quad+\frac{1}{3}\left(\int_0^1\frac{g(u_4)}{h(u_4)}\big[1-G(a,u_4)\big]\,\mathrm{d}u_4\right)\left(\int_0^1\frac{g(u_2)}{h(u_2)}\big[1-G(b,u_2)\big]\,\mathrm{d}u_2\right)\\
&= \frac{1}{4\theta^2}\log(\sigma(a))\log(\sigma(b))+\frac{1}{\gamma}\left(1-\frac{1}{2\theta}\log(\sigma(b))\right)\left(\int_0^1(1-2u_4)g(u_4)G(a,u_4)\,\mathrm{d}u_4\right)\\
&\quad+\frac{1}{\gamma}\left(1-\frac{1}{2\theta}\log(\sigma(a))\right)\left(\int_0^1(1-2u_2)g(u_2)G(b,u_2)\,\mathrm{d}u_2\right)\\
&\quad+\frac{1}{3}\left(\int_0^1\frac{g(u_4)}{h(u_4)}\big[1-G(a,u_4)\big]\,\mathrm{d}u_4\right)\left(\int_0^1\frac{g(u_2)}{h(u_2)}\big[1-G(b,u_2)\big]\,\mathrm{d}u_2\right),
\end{align*}
where
\[
\sigma(i) = \frac{\sqrt{(1+\theta)^2-4\theta i}+1-2i+\theta}{\sqrt{(1-\theta)^2+4\theta i}+1-2i-\theta}\quad\text{for } i\in\{a,b\}.
\]
If we set $g(u) := 1-2u$, then $\theta = 1/9$ and $\gamma = 1/3$, and we get
\begin{align*}
c^{\mathrm{PVC}}_{15;2:4}(a,b) &= \frac{81}{4}\prod_{i=a,b}\log(s(i))+27\prod_{i=a,b}\left(1-\frac{9}{2}\log(s(i))\right)\\
&\quad+\frac{2187}{2}\sum_{(i,j)\in I_{a,b}}\left(1-\frac{9}{2}\log(s(i))\right)\left[(6j^2-6j+1)\log(s(j))+\frac{1}{9}\Big(6j-\frac{26}{9}\Big)\sqrt{25-9j}-\frac{1}{9}\Big(6j-\frac{28}{9}\Big)\sqrt{16+9j}\right],
\end{align*}
where
\[
s(i) = \frac{\sqrt{25-9i}+5-9i}{\sqrt{16+9i}+4-9i}\quad\text{for } i\in\{a,b\},\qquad I_{a,b} := \{(a,b),(b,a)\}.
\]
Evaluating the density shows that $C^{\mathrm{PVC}}_{15;2:4}$ is not the independence copula.
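As a spot-check of the algebra, $\sigma(i)$ evaluated at $\theta=1/9$ should coincide with $s(i)$; the sketch below (an illustration outside the proof) verifies this numerically at a few points:

```python
import math

# Numerical spot-check (illustrative) that for theta = 1/9 the general
# expression sigma(i) from the density above reduces to the stated s(i).

def sigma(i, th=1 / 9):
    return (math.sqrt((1 + th) ** 2 - 4 * th * i) + 1 - 2 * i + th) / \
           (math.sqrt((1 - th) ** 2 + 4 * th * i) + 1 - 2 * i - th)

def s(i):
    return (math.sqrt(25 - 9 * i) + 5 - 9 * i) / (math.sqrt(16 + 9 * i) + 4 - 9 * i)

assert all(abs(sigma(x) - s(x)) < 1e-12 for x in (0.05, 0.3, 0.5, 0.7, 0.95))
```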


A.7. Proof of Theorem 4.2.

Let $C^{\mathrm{FGM3}}(u_{1:3};\theta) = \prod_{i=1}^{3}u_i+\theta\prod_{i=1}^{3}u_i(1-u_i)$, $|\theta|\le 1$, be the three-dimensional FGM copula, $K\ge 4$, and $(i,j)\in I_K$. It is obvious that $C_{i,i+2|i+1} = C^{\perp}\Rightarrow C^{\mathrm{PVC}}_{i,i+2;i+1} = C^{\perp}$ is true. Let $J\in\{2,\dots,K-2\}$ be fixed. Assume that $C_{1:K}$ has the following D-vine copula representation of the non-simplified form
\begin{align*}
C_{1,1+J|2:J} &= \partial_3 C^{\mathrm{FGM3}}(u_1,u_{1+J},u_2;1),\\
C_{2,2+J|3:J+1} &= \partial_3 C^{\mathrm{FGM3}}(u_2,u_{2+J},u_{1+J};1),
\end{align*}
and $C_{i,i+j|S_{i,j}} = C^{\perp}$ for all other $(i,j)\in I_K$. Using the same arguments as in the proof of Lemma 4.1, we obtain
\begin{align*}
C^{\mathrm{PVC}}_{i,i+J;\,i+1:i+J-1} &= C^{\perp},\quad i=1,2,\\
C^{\mathrm{PVC}}_{1,2+J;\,2:J+1} &= C^{\mathrm{FGM2}}(1/9).
\end{align*}
This proves that $C_{i,i+2|i+1} = C^{\perp}\Leftarrow C^{\mathrm{PVC}}_{i,i+2;i+1} = C^{\perp}$ is not true in general and that, for $j\ge 3$, neither the statement $C_{i,i+j|S_{ij}} = C^{\perp}\Rightarrow C^{\mathrm{PVC}}_{i,i+j;S_{ij}} = C^{\perp}$ nor the statement $C_{i,i+j|S_{ij}} = C^{\perp}\Leftarrow C^{\mathrm{PVC}}_{i,i+j;S_{ij}} = C^{\perp}$ is true in general.
