Predictive Quantile Regressions under Persistence and Conditional Heteroskedasticity

Rui Fan∗

Ji Hyung Lee†

July 2017

Abstract

This paper provides improved inference for predictive quantile regressions with persistent predictors and conditionally heteroskedastic errors. Confidence intervals based on conventional quantile regression techniques are not valid when predictors are highly persistent. Moreover, conditional heteroskedasticity introduces rather complicated nuisance parameters in the limit theory, whose estimation errors can be another source of distortion. We propose a size-corrected bootstrap inference, thereby avoiding nuisance parameter estimation. Bootstrap consistency is shown even with nonstationary predictors and conditionally heteroskedastic innovations. A Monte Carlo simulation confirms the significantly better test size performance of the new methods. The empirical exercises on stock return quantile predictability are revisited.

Keywords: α-mixing process, Conditional heteroskedasticity, Moving block bootstrap, Predictive regression, Quantile regression.

JEL classification: C22



∗ Department of Economics, University of Illinois, 1407 W. Gregory Dr., 214 David Kinley Hall, Urbana, IL 61801. Email: [email protected]
† Department of Economics, University of Illinois, 1407 W. Gregory Dr., 214 David Kinley Hall, Urbana, IL 61801. Email: [email protected]


1 Introduction

Predictive regression models have been popular in empirical economics. A common example is inferring the predictive relation between financial returns and economic state variables. Nonstationary predictors in predictive regressions typically lead to spurious inference, and conventional t-test based approaches are not able to correct this type of nonstationary distortion. Extensive studies have been devoted to correcting the inflated test size; see Campbell and Yogo (2006), Kostakis et al. (2014), Phillips and Lee (2013) and Choi et al. (2016) among many others.

Quantile regression (QR henceforth; Koenker and Bassett (1978)) is an appealing technique in light of the issues predictive regressions face. We can potentially detect greater predictability at conditional quantiles of financial returns than at the mean. The standard QR techniques are, however, subject to the same size distortion when predictors are highly persistent.

In addition to predictor persistence, another important issue in predictive regression models is conditional heteroskedasticity (CHE). Time-varying volatility is commonly observed in financial return data, so it is important to allow a reasonable form of CHE in the predictive regression errors. The CHE effect, however, has received much less attention in either predictive regression or QR models.

There have been a few studies on QR with persistent regressors. Xiao (2009) developed a novel cointegration framework for nonstationary QR models. In predictive QR models, Maynard et al. (2011) and Lee (2016) considered Bonferroni-based and IVX-based methods to correct the nonstationary distortion, respectively. The latter paper adopts the IVX-filtration idea (Magdalinos and Phillips, 2009) in the QR framework and proposes the so-called IVX-QR methods; see Section 2 below for a quick review. In the nonstationary QR literature, the regression innovation has typically been assumed to be conditionally homoskedastic.¹
The validity of standard error estimation in QR inference crucially depends on the assumed regression innovation structure and on the availability of observations near the quantile of interest (Koenker (2005), Ch. 3). Under the homoskedastic error assumption, the nuisance parameters in the nonstationary QR limit theory have simpler forms, so we may employ a density estimation using the QR residuals. Oftentimes we may even cancel out the nuisance parameters using a proper normalization. Density estimation is, however, difficult when observations are sparse in the neighborhood of the quantile of interest. Moreover, when the homoskedastic error assumption does not hold, the corresponding standard error estimation is no longer valid, either by estimation or by cancellation.

This paper provides a valid and easy-to-use inference procedure in the predictive QR framework with CHE innovations. We use the moving block bootstrap (MBB; Künsch (1989), Liu and Singh (1992)) to resample the pairs of the dependent variable and the IVX-filtered predictors. The reduced predictor persistence achieved by IVX filtering enables us to correct the nonstationary distortion in QR. The pairwise MBB then circumvents the issue of nuisance parameter estimation, while preserving the

¹ Lee (2016, Remark 2.1) addresses this issue by including a proxy of the error variance as a QR predictor; see Section 1.3.3 of the online supplement (Lee, 2014).


validity via its inherent robustness to heteroskedasticity. The asymptotic validity of this new QR inference is established in the presence of persistent predictors and CHE errors. This MBB IVX-QR method therefore provides a convenient size-corrected inference.

Our Monte Carlo study shows that when predictor variables follow various forms of nonstationary processes, the MBB IVX-QR approach not only maintains the good properties of IVX-QR, but also further reduces the inferential errors by avoiding variance estimation, especially at tail quantiles. The results also show that the MBB IVX-QR method provides better finite sample size control under conditional heteroskedasticity. The empirical implication of this improvement is that we can provide a more conservative test for financial return quantile predictability. This is confirmed through the empirical exercises.

The paper is organized as follows. Section 2 introduces the model and the econometric issues associated with QR models with persistent predictors; the section also provides a brief introduction to Lee (2016)'s IVX-QR method. Section 3 studies the QR and IVX-QR limit theory under CHE errors, motivating the bootstrap-based inference. Section 4 presents the theory of the MBB IVX-QR approach. Section 5 provides a Monte Carlo study to illustrate the improved finite-sample performance of the new methods. Section 6 reinvestigates the empirical exercises on stock return quantile predictability, and Section 7 concludes. All technical details are relegated to the Appendix.

Notation. We use standard notation. =⇒ and →p represent convergence in distribution and in probability, respectively. All limit theory assumes n → ∞, so we sometimes omit this condition. ∼ signifies "being distributed as," either exactly or asymptotically depending on the context. O(1) and o(1) (Op(1) and op(1)) are (stochastically) asymptotically bounded or negligible quantities.

2 Model and Review of Existing Results

We introduce the model and review the econometric issues and limit theory in nonstationary predictive QR models. Consider the linear predictive QR model:

    yt = β0τ + β1τ′ xt−1 + utτ = βτ′ Xt−1 + utτ,    (1)

where the conditional moment restriction Pr(yt < βτ′ Xt−1 | Ft−1) = Pr(utτ < 0 | Ft−1) = τ defines the QR parameters βτ = (β0τ, β1τ′)′, a (K + 1) × 1 vector. Motivated by the stylized fact of CHE of financial asset returns (yt), we impose CHE on the mean regression innovation ut = yt − βμ′ Xt−1 (equal to yt under the null hypothesis of no predictability), where βμ = (β0μ, β1μ′)′. The relation utτ = ut + (βμ − βτ)′ Xt−1 then clearly indicates that the QR innovation utτ is also subject to CHE effects. Without loss of generality, we assume βμ = 0 (no predictability at the mean)² and focus on the significance test H0 : β1τ = 0 for a given τ ∈ (0, 1). These tests can detect quantile-specific predictors for financial returns, so they are of primary importance in practice. The implied relation between the mean and quantile regression innovations is now

    utτ = ut − β0τ − β1τ′ xt−1.    (2)

² When βμ ≠ 0, simply redefining bτ = βμ − βτ justifies all the following theory with minor modifications.

Assumption 2.1 The specification of the predictor xt−1 follows the autoregressive form

    xt = Rn xt−1 + uxt,  Rn = IK + C / n^α,  for some α > 0,    (3)

where n is the sample size and C = diag(c1, c2, ..., cK). The predictor innovation uxt satisfies the conditions in Assumption 3.1 below.

The pair (α, C) represents persistence of unknown degree in the multiple predictors, and xt can belong to any of the following persistence categories³:

I(0) stationary: α = 0 and |1 + ci| < 1 for all i;
MI mildly integrated: α ∈ (0, 1) and ci ∈ (−∞, 0) for all i;
I(1) local to unity and unit root: α = 1 and ci ∈ (−∞, ∞) for all i.
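To make the persistence categories concrete, here is a minimal sketch simulating a scalar (K = 1) version of the recursion (3). The function name `simulate_predictor` and the standard normal innovations are illustrative assumptions, not part of the paper:

```python
import numpy as np

def simulate_predictor(n, c, alpha, seed=0):
    """Simulate x_t = rn * x_{t-1} + u_xt with rn = 1 + c / n**alpha (K = 1).

    alpha = 0 with |1 + c| < 1 gives an I(0) predictor; alpha in (0, 1) with
    c < 0 gives a mildly integrated (MI) one; alpha = 1 gives local to unity
    (c = 0 is the exact unit root). iid N(0, 1) innovations are used purely
    for illustration.
    """
    rng = np.random.default_rng(seed)
    rn = 1.0 + c / n**alpha
    u = rng.standard_normal(n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = rn * x[t - 1] + u[t]
    return x
```

For example, `simulate_predictor(500, -5.0, 1.0)` produces a local-to-unity predictor with c = −5, while `simulate_predictor(500, -0.5, 0.0)` produces a stationary AR(1) with coefficient 0.5.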

2.1 Review: nonstationary QR distortion

The ordinary QR estimator of the parameters is

    β̂τ = argminβ Σ_{t=1}^n ρτ(yt − β′Xt−1),    (4)

where ρτ(u) = u(τ − 1(u < 0)), τ ∈ (0, 1), is the conventional QR loss function. As shown in Lee (2016), persistence in xt−1 can lead to a nonstandard distortion in the t-ratio for testing β1τ = 0, depending on the nonlinear dependence between ut and uxt, that is:
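As a quick sanity check on the loss in (4), the sketch below (the names are ours, not the paper's) implements the check function ρτ and verifies numerically that, with an intercept only, minimizing the ρτ-loss recovers the τ-th sample quantile:

```python
import numpy as np

def rho_tau(u, tau):
    """Koenker-Bassett check loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0).astype(float))

# With an intercept only, minimizing sum_t rho_tau(y_t - b0) over b0 is
# solved by the tau-th sample quantile of y; verify on a coarse grid.
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
tau = 0.5
grid = np.linspace(0.0, 10.0, 2001)
losses = np.array([rho_tau(y - b, tau).sum() for b in grid])
b_hat = grid[int(np.argmin(losses))]  # equals the sample median, 3.0
```

Note that the loss is piecewise linear, which is why QR estimates are driven by observations near the quantile of interest; this is the "sparsity" issue the paper returns to repeatedly.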

    t_{β̂1τ} = (β̂1τ − β1τ) / s.e.(β̂1τ) ∼ [1 − λ(τ)²]^{1/2} Z + λ(τ) ηLUR(c),    (5)

where the first term represents standard inference and the second the nonstandard distortion; Z and ηLUR(c) stand for a standard normal distribution and a local-to-unit-root t-statistic, respectively, and λ(τ) = −cor(1(utτ < 0), uxt) ≠ cor(utτ, uxt). As the analytical expression (5) shows, the nonstandard distortion becomes greater with (i) smaller |c| and (ii) larger |λ(τ)|. Condition (i) is well known from the mean predictive regression literature, where the distortion from the highly left-skewed feature of ηLUR(c) with small |c| has been studied. Condition (ii) is a special feature of QR with persistent regressors.

³ Lee (2016) also considers the mildly explosive case, but bootstrapping mildly explosive data is technically demanding (if possible at all) and is beyond the scope of this paper.


Therefore, when λ̂(τ) is large in absolute value (plug-in estimation is suggested in Lee, 2014) and xt is highly persistent (smaller |c|; suspected from unit-root-like behavior), t_{β̂1τ} can be large even though the true value β1τ is zero, resulting in a spurious QR result. Including a spurious predictor can be detrimental in empirical analysis and forecasting.

2.2 Review: IVX filtration and existing limit theory

The IVX method suggested by Magdalinos and Phillips (2009) filters xt to generate z̃t with MI persistence, intermediate between first differencing and the use of levels data. In particular, we choose F = Rnz as follows:

    z̃t = Rnz z̃t−1 + Δxt,  Rnz = IK + Cz / n^δ,    (6)

where δ ∈ (0, 1), Cz = cz IK, cz < 0 and z̃0 = 0. The parameters δ ∈ (0, 1) and cz < 0 are specified by the researcher; some practical suggestions for their use in predictive QR are given in Section 4.1 of Lee (2016). We define the normalized random variables from z̃t−1 and xt−1 by

    Zt−1† := kn^{−1/2} z̃t−1  and  Xt−1† := kn^{−1/2} xt−1,    (7)

where

    kn = IK for I(0);  kn = n^{α∧δ} IK for MI and I(1).    (8)
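A minimal scalar sketch of the filtration (6) follows. The helper name `ivx_filter` and the default values cz = −1 and δ = 0.45 are our illustrative assumptions; the paper's practical choice of (Cz, δ) follows Section 4.1 of Lee (2016):

```python
import numpy as np

def ivx_filter(x, cz=-1.0, delta=0.45):
    """Scalar IVX filtration (6): z_t = rnz * z_{t-1} + (x_t - x_{t-1}),
    z_0 = 0, with rnz = 1 + cz / n**delta, cz < 0 and delta in (0, 1)."""
    n = len(x)
    rnz = 1.0 + cz / n**delta
    z = np.zeros(n)  # z[0] = 0 by construction
    for t in range(1, n):
        z[t] = rnz * z[t - 1] + (x[t] - x[t - 1])
    return z
```

Unwinding the recursion gives z̃t = Σ_{j=1}^t Rnz^{t−j} Δxj, a geometrically weighted sum of first differences, which is why the filtered instrument is at most mildly integrated even when x is a unit-root process.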

Following Lee (2016), we also unify the different asymptotic moment matrices for the MI and I(1) cases:

    Vcxz := Vzz^x = ∫0^∞ e^{rCz} Ωxx e^{rCz} dr, when δ ∈ (0, α ∧ 1);
    Vcxz := Vxx = ∫0^∞ e^{rC} Ωxx e^{rC} dr, when α ∈ (0, δ),    (9)

and

    Ψcxz := −Cz^{−1} {Ωxx + ∫ dJx^c Jx^{c′}},  if α = 1;
    Ψcxz := −Cz^{−1} {Ωxx + C Vxx},  if α ∈ (δ, 1);
    Ψcxz := Vcxz = Vxx,  if α ∈ (0, δ),    (10)

where Ωxx = E[uxt uxt′] and Jx^c(r) = ∫0^r e^{(r−s)C} dBx(s) is an Ornstein-Uhlenbeck (OU) process with Bx(·) being Brownian motion BM(Ωxx).

In order to test H0 : β1τ = 0, Lee (2016) proposed to use QR with z̃t−1 from (6)⁴:

    β̂1τ^{IVXQR} = argminβ Σ_{t=1}^n ρτ(ytτ − β1′ z̃t−1).    (11)

⁴ We use γ̂1τ^{IVXQR} of Theorem 3.2 in Lee (2016) but still write it as β̂1τ^{IVXQR} in this paper for expositional convenience.


We have the following asymptotics of the null test statistics under homoskedastic mds regression errors.

Theorem 2.1 (Theorem 3.2 of Lee (2016)) Under H0 : β1τ = 0,

    (nkn)^{1/2} (β̂1τ^{IVXQR} − β1τ) =⇒ N(0, τ(1−τ) / fuτ(0)² · Ωxx^{−1}) for I(0);
    (nkn)^{1/2} (β̂1τ^{IVXQR} − β1τ) =⇒ N(0, τ(1−τ) / fuτ(0)² · Vcxz^{−1}) for MI and I(1).

Remark 2.1 The homoskedasticity of the regression errors allows the convenient form of the variance-covariance matrix in the limit, with a separated sparsity term fuτ(0) that can either be cancelled by a proper normalization or be estimated. The stylized facts of financial time series, however, strongly suggest CHE innovations. As we see below, the asymptotic theory under heteroskedasticity is quite different, which calls for a new method of statistical inference. IVX methods with CHE have been studied in the mean regression framework (Phillips and Lee, 2016; Kostakis et al., 2014), but nonstationary QR with CHE has not been addressed in the literature.

3 Limit Theory under Conditional Heteroskedasticity

We study the QR and IVX-QR limit theory with persistent predictors and CHE innovations in this section. Under CHE errors, the limit theory becomes more involved, with complicated forms of nuisance parameters introducing another potential source of distortion in statistical inference.

3.1 Heuristics

To see the source of the distortion quickly, let us assume a univariate I(1) predictor xt = (1 + c/n) xt−1 + uxt, with uxt ∼ mds(0, Σxx). Also assume ut = σt εt, where the εt are iid mean-zero random variables, as given in Assumption 3.1 below. Under H0 : β1τ = 0, from (2) we have utτ = ut − β0τ. From (4), with the standard quadratic approximation argument (Pollard, 1991; Knight, 1998),

    n (β̂1τ^{QR} − β1τ) ≈ [n^{−1} Σ_{t=1}^n ψτ(utτ) xt−1] / [n^{−2} Σ_{t=1}^n f_{utτ,t−1}(0) xt−1²],

where ψτ(utτ) = τ − 1(ut < β0τ) and f_{utτ,t−1}(0) = (1/σt) fε(Fε^{−1}(τ)). Note that ξnt = ψτ(utτ)/√n is still an mds with quadratic variation τ(1−τ), in spite of the CHE in utτ. The standard invariance principle and convergence to stochastic integrals deliver the limit theory of the numerator,

    n^{−1} Σ_{t=1}^n ψτ(utτ) xt−1 =⇒ ∫ J̄x^c dBψτ,

where J̄x^c = Jx^c(r) − ∫0^1 Jx^c(r) dr is the demeaned OU process and Bψτ ≡ BM(0, τ(1−τ)). The denominator, however, shows somewhat different asymptotic behavior under a reasonable assumption on σt (Assumption 3.1):

    n^{−2} Σ_{t=1}^n f_{utτ,t−1}(0) xt−1² =⇒ fε(Fε^{−1}(τ)) E[1/σt] ∫ (J̄x^c)² dr,

leading to

    n (β̂1τ^{QR} − β1τ) =⇒ (1 / {fε(Fε^{−1}(τ)) E[1/σt]}) · ∫ J̄x^c dBψτ / ∫ (J̄x^c)² dr.

Following Lee (2016), use the decomposition dBψτ = dBψτ·x + Σψτx Σxx^{−1} dBx to obtain

    n (β̂1τ^{QR} − β1τ) =⇒ (1 / {fε(Fε^{−1}(τ)) E[1/σt]}) [ ∫ J̄x^c dBψτ·x / ∫ (J̄x^c)² + (Σψτx / Σxx) ∫ J̄x^c dBx / ∫ (J̄x^c)² ].

In addition to the nonstationary distortion arising from (∫ (J̄x^c)² dr)^{−1} ∫ J̄x^c dBx when Σψτx is not negligible, we also have an additional nuisance parameter fε(Fε^{−1}(τ)) E[1/σt], which is harder to estimate than in the case with conditionally homoskedastic errors. Estimation of this nuisance parameter is not easy unless we have both (i) a simple form of the DGP of σt and (ii) plenty of observations around the quantile of interest. As we see below, the complication of the nuisance parameters in the limit theory becomes even worse for the MI and I(0) cases, clearly motivating the use of a time series bootstrap procedure combined with the IVX filtration.

3.2 Limit Theory

To lay out the nonstationary QR limit theory under CHE, let us impose reasonable weak dependence and moment conditions on the error processes and their conditional variances. Following standard notation, let F[−∞,m] = σ(..., xm−1, xm) and F[m+j,∞] = σ(xm+j, xm+j+1, ...), j > 0, denote the σ-algebras generated by {xt}. The mixing coefficients α(j) are defined as

    α(j) = sup_m sup_{A ∈ F[−∞,m], B ∈ F[m+j,∞]} |P(A)P(B) − P(AB)|.

Recall that when α(j) → 0 as j → ∞, we call {xt}t=1,...,n an α-mixing process. Let σt² = E(ut²|Ft−1), σxt² = E(uxt²|Ft−1), ut = (ut, uxt′)′ and σt² = (σt², σxt²)′.

Assumption 3.1

1. σt² ∈ Ft−1 and E(ut|Ft−1) = 0. In particular, ut = σt εt, where the εt are iid mean-zero random variables with CDF Fε and pdf fε.

2. The processes {ut, σt} are α-mixing sequences of size 2, i.e., the mixing coefficients satisfy α(j) = O(j^λ) for λ < 0.

3. E|uxt|^{2r+γ} < C1 for all t = 1, ..., n and some small γ > 0.

4. E|σxt²|^r < ∞ for all t = 1, ..., n.

Remark 3.1 As in Carrasco and Chen (2002) or Lindner (2009), various GARCH and stochastic volatility models have β-mixing (hence α-mixing) properties with exponential decay rates. Indeed, α-mixing processes form a bigger class than just CHE processes, and thus are general enough to include many practical nonlinear time series innovations. To accommodate CHE in mean predictive regressions, Phillips and Lee (2016) allow a rather different class of nonlinear processes based on Wu (2005), and Kostakis et al. (2014) use vec-GARCH(p,q) models. The mixing condition imposes a reasonable weak dependence structure on the conditional volatility processes σt but allows their DGP to remain unspecified.

Theorem 3.1 (QR limit theory under CHE) Define the normalizing matrices

    Dn = n IK+1 for I(0);  Dn = diag(√n, n^{(1+α)/2} IK) for MI;  Dn = diag(√n, n IK) for I(1).

Under Assumptions 2.1 and 3.1,

    Dn (β̂τ^{QR} − βτ) =⇒ N(0, τ(1−τ) / fε(Fε^{−1}(τ))² · Ωβτ) for I(0);

    Dn (β̂τ^{QR} − βτ) =⇒ N(0, τ(1−τ) / {fε(Fε^{−1}(τ)) E[1/σt]}² · diag(1, Vxx^{−1})) for MI;

    Dn (β̂τ^{QR} − βτ) =⇒ (1 / {fε(Fε^{−1}(τ)) E[1/σt]}) [1, ∫ Jx^c(r)′; ∫ Jx^c(r), ∫ Jx^c(r) Jx^c(r)′]^{−1} [Bψτ(1); ∫ Jx^c(r) dBψτ] for I(1),

where Ωβτ = E[Xt−1 Xt−1′ / σt]^{−1} E[Xt−1 Xt−1′] E[Xt−1 Xt−1′ / σt]^{−1} and Vxx = ∫0^∞ e^{rC} Ωxx e^{rC} dr.

Remark 3.2 The signal strength of the CHE effect from σt and that of an I(0) regressor xt are comparable, so the nuisance parameter Ωβτ shows some interaction between them. As the regressor persistence increases to MI-I(1), the signal strength of σt is dominated by the regressor persistence, so the average CHE effect E[1/σt] is separated out in the limit. In all cases, direct estimation of the standard error is not feasible unless we assume an exact (and convenient) form of CHE (see, e.g., Koenker and Zhao (1996) and Xiao and Koenker (2009) for ARCH/GARCH specifications). Even with a convenient CHE specification, the tail inference may not be reliable due to the difficulty of density estimation. All these aspects, again, motivate the bootstrap inference in the next section.

Remark 3.3 From the above argument we confirm that the CHE effect can potentially lead to a severe size distortion, in addition to the nonstationary QR distortion. Roughly speaking, when there is a substantial amount of heteroskedasticity, E[1/σt] will shrink, thereby inflating the dispersion of the asymptotic distribution in Theorem 3.1. To remove the nonstationary distortion, we can employ IVX-QR from (11):


Theorem 3.2 (IVX-QR limit theory under CHE) Under H0 : β1τ = 0,

    (nkn)^{1/2} (β̂1τ^{IVXQR} − β1τ) =⇒ N(0, τ(1−τ) / fε(Fε^{−1}(τ))² · Ωβτ) for I(0);
    (nkn)^{1/2} (β̂1τ^{IVXQR} − β1τ) =⇒ N(0, τ(1−τ) / {fε(Fε^{−1}(τ)) E[1/σt]}² · Vcxz^{−1}) for MI and I(1),

where Ωβτ = E[Xt−1 Xt−1′ / σt]^{−1} E[Xt−1 Xt−1′] E[Xt−1 Xt−1′ / σt]^{−1} and

    Vcxz := Vzz^x = ∫0^∞ e^{rCz} Ωxx e^{rCz} dr, when δ ∈ (0, α ∧ 1);
    Vcxz := Vxx = ∫0^∞ e^{rC} Ωxx e^{rC} dr, when α ∈ (0, δ).

Remark 3.4 The IVX filtration is used to achieve asymptotic normality over the whole I(0)-I(1) parameter space. The asymptotic standard error of this IVX-QR estimator under CHE innovations is still hard to estimate: using an estimated f̂uτ(0) based on the iid limit theory severely distorts the inference⁵; see Table 1 below. Therefore, we study a moving block bootstrap (MBB) based approach.

4 Bootstrap-based Inference: MBB IVX-QR

As we explained above, estimating the nuisance parameters in Theorem 3.2 directly is not feasible under practical scenarios. In this section we study the moving block bootstrap (MBB) inference⁶ and prove the validity of the proposed test.

Let b be an integer block length and let B(t) = (wt, wt+1, ..., wt+b−1) denote a data block with starting point t ∈ {1, ..., n − b + 1}, where wt = (yt, z̃t′)′ and z̃t denotes the IVX-filtered regressors from (6). The total number of possible blocks and the number of blocks in one bootstrapped sample are denoted by q and m, respectively. The letters n and ℓ indicate the sample size and the bootstrapped sample size, respectively; therefore n = q + b − 1 and ℓ = mb. The MBB procedure samples m blocks randomly with replacement from {B(t) : t = 1, ..., n − b + 1}, yielding the MBB sample. This MBB sample w1*, ..., wℓ* is defined as (B(I1), ..., B(Im)), where the Ii are iid discrete uniform variables on {1, ..., n − b + 1}. Let P*, E* and Var* denote the probability, expectation and variance of the bootstrap distribution conditional on the original sample.

⁵ Under the iid error assumption, Koenker and Bassett (1982) estimate the sparsity parameters using the histospline methods of Boneva, Kendall, and Stefanov (1971). Welsh (1987) develops a kernel approach estimating the same quantity as a weighted average of Siddiqui estimates (Siddiqui (1960)). In the more general setting of non-iid errors, Hendricks and Koenker (1992) suggest estimating the asymptotic covariance matrix by an extension of the sparsity estimation methods for the iid setting, and Powell (1991) proposes an estimation method following the idea of kernel density estimation. Although these methods help in estimating standard errors for statistical inference, the performance of the resulting statistics is limited by the precision of the estimated nuisance parameters and the imposed error structure.

⁶ The moving block bootstrap (MBB) was independently formulated by Künsch (1989) and Liu and Singh (1992). The existing literature provides several variants, such as the non-overlapping block bootstrap (NBB) of Carlstein (1986), and the circular block bootstrap (CBB) and the stationary bootstrap (SB) of Politis and Romano (1992, 1994). We mainly focus on the overlapping block bootstrap method, which we continue to denote as MBB.


Based on the MBB sample (w1*, ..., wℓ*), we estimate

    β̂τ^{IVXQR*} = argminβ Σ_{t=1}^ℓ ρτ(yt* − β′ z̃t−1*),    (12)

which we denote as the MBB IVX-QR estimator. We prove the first-order asymptotic validity of the MBB IVX-QR under the following rate conditions.

Assumption 4.1 (a) ℓ = O(n) as n → ∞ and n = O(ℓ) as ℓ → ∞; (b) kn = o(n); (c) b = O(kn) and b → ∞.

Remark 4.1 The rate condition in (a) is standard in the MBB literature. In (b), kn = o(n) holds by the IVX construction (8). Condition (c), b = O(kn), is a technical condition used to derive Lemma A.4 below, which is essential for proving Theorem 4.2. It is a sufficient condition, used to bound the various cross-products of higher moments of the IVX variables (see the proof of Lemma A.4). Intuitively, the degree of dependence in the MBB sample (block length b) needs to be balanced with the persistence of the IVX-filtered data (kn).

Recall that Zt−1† := kn^{−1/2} z̃t−1 and Xt−1† := kn^{−1/2} xt−1. Define mt† = kn^{−1/2} z̃t−1 ψτ(utτ) = Zt−1† ψτ(utτ), and let mt*† be defined analogously from the MBB sample. We have the following result on the sample means of mt† and mt*†.

Theorem 4.1 (First-order asymptotic validity of the bootstrap score functions) Under Assumptions 2.1, 3.1 and 4.1, as n → ∞,

    sup_{x∈R} | P*(√ℓ (m̄ℓ*† − m̄n†) ≤ x) − P(√n (m̄n† − E mt†) ≤ x) | →p 0,    (13)

where m̄n† = (1/n) Σ_{t=1}^n mt† and m̄ℓ*† = (1/ℓ) Σ_{t=1}^ℓ mt*†.

Remark 4.2 To the best of our knowledge, there exist no bootstrap consistency results for this type of nonlinear statistic of mildly integrated processes. Kim and Nordman (2011) studied the mean of long memory processes, but the relation between mild integration and the long memory property is not yet clearly understood. See Lee (2014, Section 1.7) for a related discussion.

We now state the main theorem of this paper, enabling the MBB percentile methods as discussed, for example, in Efron and Tibshirani (1993, Ch. 13).

Theorem 4.2 (MBB IVX-QR consistency) Under H0 : β1τ = 0, (ℓkn)^{1/2} (β̂1τ^{IVXQR*} − β̂1τ^{IVXQR}) under P* and (nkn)^{1/2} (β̂1τ^{IVXQR} − β1τ) under P have the same limit distribution Hβτ, where

    Hβτ ≡ N(0, τ(1−τ) / fε(Fε^{−1}(τ))² · Ωβτ) for I(0);
    Hβτ ≡ N(0, τ(1−τ) / {fε(Fε^{−1}(τ)) E[1/σt]}² · Vcxz^{−1}) for MI and I(1),    (14)

whose nuisance parameters are defined in Theorem 3.2.

Remark 4.3 A common intuition in the bootstrap literature is that if the original test statistic is asymptotically normal, bootstrap consistency will be achieved, since the (blockwise) iid resampling easily attains a CLT. The reduced persistence of the IVX-filtered regressors (z̃t) is therefore essential for achieving MBB IVX-QR validity. Without this filtration the MBB is not valid, as illustrated in the simulation section below.

5 Monte Carlo Simulation

This section compares the finite sample performance of (a) the conventional t-test (using an estimated density), (b) the MBB (without IVX correction), (c) the IVX-QR of Lee (2016) and (d) the MBB IVX-QR. For convenience, our setup and notation follow Lee (2016). The samples are generated from the predictive QR model:

    yt = β0τ + β1τ xt−1 + utτ,    (15)

    xt = μx + rn xt−1 + uxt,    (16)

where β0τ, β1τ, μx, and rn are scalars. Let β0τ = F_{ut}^{−1}(τ), β1τ = 0, and μx = 0. Since the parameter rn = 1 + c/n controls the degree of persistence in the predictor xt, we allow the value of c ∈ C to vary from −70 to 0, with C = (0, −2, −5, −7, −70), where c = 0 indicates the exact unit-root process. The larger |c| is, the less persistent xt is. We consider two scenarios for the regression errors utτ ≡ ut − F_{ut}^{−1}(τ):

1. ut ∼ iid Gaussian:

    (ut, uxt)′ ∼ iid Fu(0_{2×1}, Σ_{2×2}),

where Fu is a bivariate normal distribution. The correlation matrix is standardized as

    Σ = [1 φ; φ 1],  φ = −0.95.

2. ut ∼ ARCH(1):

    ut = 0 for t = 0;  ut = σt εt for t ≥ 1,

where σt² = γ0 + γ1 ut−1². Let γ0 = 1 and γ1 = 0.9. The errors εt and uxt are jointly generated by

    (εt, uxt)′ ∼ iid Fu((0, 0)′, [1 φ0; φ0 1]),  φ0 = −0.9.
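The ARCH(1) error scenario above can be generated with a short sketch. The function name `simulate_arch1_errors` is our own; the parameter values γ0 = 1, γ1 = 0.9, φ0 = −0.9 and the Gaussian Fu follow the text:

```python
import numpy as np

def simulate_arch1_errors(n, gamma0=1.0, gamma1=0.9, phi0=-0.9, seed=0):
    """Scenario 2 errors: u_t = sigma_t * eps_t with sigma_t^2 = gamma0 +
    gamma1 * u_{t-1}^2 and u_0 = 0; (eps_t, u_xt) are jointly normal with
    correlation phi0."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, phi0], [phi0, 1.0]])
    shocks = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    eps, ux = shocks[:, 0], shocks[:, 1]
    u = np.zeros(n)  # u_0 = 0, matching the text
    for t in range(1, n):
        sigma2 = gamma0 + gamma1 * u[t - 1] ** 2
        u[t] = np.sqrt(sigma2) * eps[t]
    return u, ux
```

With γ1 = 0.9 the unconditional error variance γ0/(1 − γ1) is finite, but the volatility clustering is strong, which is exactly the setting where density-based standard errors struggle.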

To apply the IVX-QR technique, we obtain instruments from (6): z̃t = Rnz z̃t−1 + Δxt, Rnz = IK + n^{−δ} Cz, where δ ∈ (0, 1), Cz = cz IK, cz < 0 and z̃0 = 0. We choose the values of (Cz, δ) by the

practical rule in Section 4.1 of Lee (2016). The essential idea is to choose c(δ, n) = n^{1−δ} cz such that the nonstandard local unit root t-statistic ηLUR(c(δ, n)) (Phillips, 1987) behaves like Z ∼ N(0, 1) for a given n. Inference on β̂1τ can therefore be based on critical values from the standard normal distribution.

We test H0 : β1τ = 0 against the local alternatives H1 : β1τ = b/n. Each test is investigated at eleven quantiles from τ = 0.05 to 0.95 with a nominal size of 5%. In each experiment, 1000 samples are randomly generated for n = 200 and 700. The inferential methods can be classified into two categories.

(1) Tests without IVX correction: the t-test and the percentile method using the MBB technique (Efron and Tibshirani, 1993; Koenker, 2005). These tests are commonly used in the QR literature. If xt is not persistent, the regular QR asymptotics works for the t-test, and the percentile bootstrap method is often employed when the covariance matrix cannot be computed easily. For the block length of the MBB, we adopt the form constant × n^{1/4} suggested in Hall, Horowitz and Jing (1995) and Lahiri (1999, 2005).⁷

(2) Tests with IVX correction: the IVX-QR and MBB IVX-QR tests. Both aim to correct the size distortion caused by persistence in xt and the nonlinear dependence between utτ and uxt. We expect MBB IVX-QR to have better size control than IVX-QR, especially when (i) we are interested in inference at the tails, and (ii) the covariance matrix is hard to compute, for example with conditionally heteroskedastic errors. The MBB IVX-QR test is conducted as follows.

Procedure of MBB IVX-QR

1. Given data (yt, xt), t = 1, ..., n, choose values of (Cz, δ) by the practical rule of Lee (2016), then construct the instruments z̃t following (6).

2. Set the block length b = ⌈n^{1/4}⌉,⁸ where ⌈a⌉ denotes the least integer greater than or equal to a, and let m = ⌈n/b⌉. Randomly sample m data blocks from (B(1), ..., B(q)), where B(t) = (wt, wt+1, ..., wt+b−1) with wt = (yt, z̃t) and q = n − b + 1. Denote the sampled blocks by (B*(1), ..., B*(m)) and obtain the resampled data (w1*, ..., wℓ*), ℓ = mb and wt* = (yt*, z̃t*).

3. Estimate

    β̂τ* = argminβ Σ_{t=1}^ℓ ρτ(yt* − β′ z̃t−1*).    (17)

4. Repeat steps 2 and 3 N times to obtain (β̂τ*(1), ..., β̂τ*(N)).

5. The (1 − α)100% percentile interval is [Ĝ^{−1}(α/2), Ĝ^{−1}(1 − α/2)], where Ĝ is the empirical CDF of β̂τ* from step 4.

⁷ Hall et al. (1995) studied the optimal block size of the bootstrap for strictly stationary dependent data, suggesting n^{1/3}, n^{1/4}, and n^{1/5} under different contexts. Lahiri (1999, 2005) suggests a nonparametric plug-in method. To the best of our knowledge, there is no known result on the block length choice for bootstrapping mildly integrated processes; thus we follow the standard suggestion.

⁸ Under Assumption 4.1, the rate condition b = O(kn) is required, where kn := n^{α∧δ} IK for MI and I(1) processes. In our simulated samples and also in the empirical section, we observe δ to be around 0.45, so the choice b = ⌈n^{1/4}⌉ does not conflict with the rate condition.
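The block resampling in step 2 of the procedure can be sketched as follows. The helper name `mbb_resample` and the uniform start-point draw are our implementation assumptions; in the procedure each wt is the pair (yt, z̃t), while this sketch uses a univariate series for brevity (a 2-column array works identically, since slicing and concatenation operate along axis 0):

```python
import numpy as np

def mbb_resample(w, b, rng):
    """Moving block bootstrap: concatenate m = ceil(n / b) blocks of length b
    whose (0-based) start indices are drawn iid uniformly, with replacement,
    over the q = n - b + 1 admissible positions."""
    n = len(w)
    m = int(np.ceil(n / b))
    starts = rng.integers(0, n - b + 1, size=m)  # upper bound is exclusive
    return np.concatenate([w[s:s + b] for s in starts])
```

The resampled series has length ℓ = mb; repeating the draw N times and re-estimating the quantile regression on each resample yields the bootstrap distribution used for the percentile interval in step 5.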

5.1 Results of Simulation Study 1 - IID Errors

We first examine size performance under the iid setting. The test methods are the t-test, the percentile method using the MBB technique, the IVX-QR method and the MBB IVX-QR method. Each test is conducted at various quantiles with varying degrees of predictor persistence. In particular, we compare the finite sample size of the MBB IVX-QR method to that of the IVX-QR method at upper and lower quantiles. Although both of these tests are designed to have robust size performance over quantiles, our results suggest that MBB IVX-QR significantly outperforms IVX-QR at the tails in terms of size control. Figures 1 and 2 show the results, which we summarize as follows.

First, among these four methods, the t-test has the largest size distortion regardless of whether the predictor is I(1) (c = 0), MI (c = −2, −5, −7) or I(0) (c = −70). This confirms the theory in Lee (2016) that the asymptotic distribution of the standard QR t-statistic is nonstandard: it suffers from distortion depending on the persistence of the predictor and the nonlinear dependence. In our simulations, for both n = 200 and 700, we observe that as |c| approaches 0 the distortion of the t-test grows, reaching its highest level around 20%.

Second, our results illustrate that the IVX-QR technique can help to correct the nonstandard distortion under the I(1), MI and I(0) settings. Both the IVX-QR and MBB IVX-QR methods have sizes around the nominal 5% level across the 0.2 to 0.8 quantiles. However, the size of IVX-QR is mildly inflated when moving towards the tails (τ = 0.05, 0.1, 0.2, 0.8, 0.9, 0.95). This tail behavior has also been observed in Lee (2016). We suspect that under the iid setting it is mainly due to inaccurate estimation of the sparsity function, an issue that often arises at lower and upper quantiles when the sample size is small (Koenker (2005)). The results of the t-test exhibit the same pattern of over-rejection because it also involves sparsity function estimation. However, these over-rejections of the t-test and IVX-QR at the tails are dampened when the sample size increases from 200 to 700.

Third, in comparison to the t-test and IVX-QR, the tests using the MBB can be conservative at the tails. Based on our results, both MBB and MBB IVX-QR improve over their non-bootstrap counterparts by reducing more than 30% of the size distortion at the tails.

Lastly, the MBB IVX-QR approach performs well across all quantiles. This confirms that MBB IVX-QR improves over IVX-QR at tail quantiles, even under the iid-error scenario. The simulation results of Figures 1-2 can also be found in Tables 1-2.

5.2 Results of Simulation Study 2 - CHE Errors

We now consider the predictive QR scenario with ARCH(1) errors. This scenario is closely related to empirical practice, where conditionally heteroskedastic asset returns are commonly observed. According to the asymptotic theory under CHE, the covariance matrix cannot easily be estimated, so in this simulation study we illustrate that test results can be adversely affected when estimation of the covariance matrix is involved. Moreover, we show that MBB IVX-QR is robust across all quantiles, substantially improving inference accuracy over the other three methods. The results are shown in Figures 3 and 4 and summarized here. First, the t-test exhibits U-shaped empirical rejection frequencies and the worst size distortion among all tests. Although these size patterns under CHE are similar to those in the iid settings, their scale is different: in Figure 3, the size of the t-test reaches 0.3 in most cases. Second, unlike in the iid settings, the tail performance of the t-test and IVX-QR is not improved by increasing the sample size. This may imply that the effect of sparsity-function estimation errors at the tails is compounded by the CHE errors, so that a larger sample does not alleviate, and may even amplify, the estimation inaccuracy. Third, because the asymptotic distribution differs under CHE, IVX-QR does not perform well even at the median: for both n = 200 and n = 700, IVX-QR attains its best size at the median, around 0.1, and its distortion rises to a maximum of 0.215 at τ = 0.95. Finally, MBB IVX-QR has the best size control among all the tests. Its performance is stable across quantiles in all of the I(1), MI and I(0) scenarios, confirming the robustness of MBB IVX-QR to CHE effects. The results in Figures 3 and 4 are also reported in Tables 3 and 4.
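A data-generating process of the kind used in this simulation study can be sketched as follows: a local-to-unity predictor driven by ARCH(1) innovations. The parameter values (`omega`, `alpha1`) are illustrative, not the paper's calibration.

```python
import numpy as np

def simulate_dgp(n, c, omega=0.5, alpha1=0.4, seed=0):
    """Simulate x_t = (1 + c/n) x_{t-1} + u_t with ARCH(1) errors:
    u_t = sigma_t * e_t,  sigma_t^2 = omega + alpha1 * u_{t-1}^2."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n)
    rho = 1.0 + c / n                     # local-to-unity AR root
    u = np.zeros(n)
    x = np.zeros(n)
    sig2 = omega / (1.0 - alpha1)         # start at the unconditional variance
    for t in range(n):
        u[t] = np.sqrt(sig2) * e[t]
        x[t] = (rho * x[t - 1] if t > 0 else 0.0) + u[t]
        sig2 = omega + alpha1 * u[t] ** 2
    return x, u

x, u = simulate_dgp(n=700, c=-5)
```

Setting c = 0 gives the I(1) case, moderate negative c the MI cases, and a large negative c (e.g. c = -70) an I(0)-like predictor, matching the persistence grid used in the tables.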

6 Empirical Illustration: Stock Return Quantile Predictability

We illustrate an improved predictive QR inference for stock market index returns (S&P 500) using the MBB IVX-QR approach. The univariate predictive mean regression model

$$\text{Stock Return}(t) = \alpha + \beta \times \text{Economic Variable}(t-1) + \epsilon(t), \qquad (18)$$

has been discussed extensively in the economics and finance literature; see Goyal and Welch (2003), Campbell and Yogo (2006) and Welch and Goyal (2008), to name a few. However, there is no consensus on which economic variables have significant predictive ability for the mean of stock returns. To provide a different insight, Lee (2016) and Maynard et al. (2011) employed a QR version of this predictive model. In this section, we consider the same predictive QR framework and compare the test results using IVX-QR and MBB IVX-QR. The same data as in Lee (2016) are employed, ranging from 1927 to 2005. Excess stock returns, Stock Return(t), are calculated as

$$\text{Stock Return}(t) = \log\left(\frac{P(t) + D(t)}{P(t-1)}\right) - \log\big(Rfree(t) + 1\big),$$

where P(t) and D(t) denote the S&P 500 index and dividends at time t, respectively, and Rfree(t) is the 1-month Treasury bill rate at t. The eight persistent predictors are the dividend price ratio (dp), dividend payout ratio (de), earnings price ratio (ep), book-to-market ratio (bm), net equity expansion (ntis), Treasury bill rate (tbl), term spread (tms), and default yield spread (dfy). The definition of these variables follows Welch and Goyal (2008); the data are available from https://sites.google.com/site/jihyung412/research. Figure 5 plots these variables from January 1927 to December 2005 and shows highly persistent patterns over time. As shown in Table 5, the inference results using IVX-QR and MBB IVX-QR differ for some predictors at certain quantiles. Results marked with * indicate rejection of the null hypothesis of no predictability at the 5% level. At several lower and upper tail quantiles, IVX-QR tends to reject the null hypothesis more often than MBB IVX-QR. For example, when the dividend payout ratio (de) is the predictor, the IVX-QR p-value is smaller than 0.05 at τ = 0.05, 0.1, 0.2, 0.3, 0.4 and τ = 0.8, 0.9, 0.95, while the 95% confidence intervals of MBB IVX-QR reject none of the null hypotheses βτ = 0 at any quantile. Similar conflicting results can be found, for example, for dp at τ = 0.05, 0.2, 0.7, 0.95 and for bm at τ = 0.05, 0.1, 0.2, 0.3, 0.9, 0.95. The contradictory results between IVX-QR and MBB IVX-QR at the lower and upper tail quantiles may indicate some degree of over-rejection by IVX-QR, as conjectured from the simulation results in Lee (2016) and in Section 5 above. Under both iid and CHE errors, MBB IVX-QR has better size control, especially at the lower and upper tail quantiles. In searching for economic and financial variables that predict excess stock return quantiles, MBB IVX-QR can therefore be a safer inferential tool for applied researchers, since it provides a more conservative test.
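The excess log return formula above can be computed directly from the price, dividend, and risk-free series. The following is a minimal sketch; the helper name `excess_log_return` and the toy input values are ours, not from the data set.

```python
import numpy as np

def excess_log_return(price, dividend, rfree):
    """Excess log return: log((P(t) + D(t)) / P(t-1)) - log(1 + Rfree(t)).
    Arrays are aligned so index t holds P(t), D(t), Rfree(t);
    the result starts at t = 1."""
    price = np.asarray(price, dtype=float)
    dividend = np.asarray(dividend, dtype=float)
    rfree = np.asarray(rfree, dtype=float)
    gross = (price[1:] + dividend[1:]) / price[:-1]
    return np.log(gross) - np.log(1.0 + rfree[1:])

r = excess_log_return([100.0, 102.0], [0.0, 1.0], [0.0, 0.01])
```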

7 Summary and Concluding Remarks

Stock return predictability has been an important topic in economics and finance, but the empirical conclusions remain controversial. Meanwhile, valid econometric inference methods have been carefully developed in the predictive mean and quantile regression literature. In this paper, we study predictive quantile regression models with particular emphasis on two important stylized facts: predictor persistence and the conditional heteroskedasticity of stock return data. A valid and easy-to-use inference procedure is proposed and labelled MBB IVX-QR. As the name indicates, the main development combines two techniques: (i) IVX filtering and (ii) the moving block bootstrap. In essence, IVX filtering removes the nonstationary distortion, while the block bootstrap accommodates the conditional heteroskedasticity. Simulation and empirical results confirm the benefits of the new method, guarding against the type-I errors arising from persistence and conditional heteroskedasticity, two universal stylized facts in financial data. MBB IVX-QR is therefore well suited to empirical exercises in the predictive regression literature.


Figure 1: Size performance: N = 200 and I.I.D. errors

[Five panels, one per persistence parameter c ∈ {0, -2, -5, -7, -70}, plot empirical size against quantile level τ ∈ {0.05, 0.10, 0.20, ..., 0.90, 0.95} for the t test, MBB, IVX-QR and MBB IVX-QR tests.]

Figure 2: Size performance: N = 700 and I.I.D. errors

[Five panels, one per persistence parameter c ∈ {0, -2, -5, -7, -70}, plot empirical size against quantile level τ ∈ {0.05, 0.10, 0.20, ..., 0.90, 0.95} for the t test, MBB, IVX-QR and MBB IVX-QR tests.]

Figure 3: Size performance: N = 200 and ARCH(1) errors

[Five panels, one per persistence parameter c ∈ {0, -2, -5, -7, -70}, plot empirical size against quantile level τ ∈ {0.05, 0.10, 0.20, ..., 0.90, 0.95} for the t test, MBB, IVX-QR and MBB IVX-QR tests.]

Figure 4: Size performance: N = 700 and ARCH(1) errors

[Five panels, one per persistence parameter c ∈ {0, -2, -5, -7, -70}, plot empirical size against quantile level τ ∈ {0.05, 0.10, 0.20, ..., 0.90, 0.95} for the t test, MBB, IVX-QR and MBB IVX-QR tests.]

Figure 5: Predictors: 1927:01 - 2005:12

[Time-series plots of the eight predictors (dp, de, ep, bm, ntis, tbl, tms, dfy) from January 1927 to December 2005.]

Table 1: Size performance (%) without IVX correction (IID errors, φ = -0.95)

n = 200, t-test
c\τ    0.05   0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9    0.95
0      0.219  0.185  0.171  0.207  0.207  0.225  0.225  0.198  0.197  0.182  0.195
-2     0.176  0.146  0.148  0.141  0.147  0.160  0.152  0.147  0.142  0.147  0.166
-5     0.153  0.116  0.132  0.102  0.097  0.115  0.101  0.101  0.093  0.139  0.156
-7     0.145  0.125  0.095  0.098  0.084  0.091  0.100  0.094  0.102  0.121  0.165
-70    0.141  0.077  0.087  0.065  0.072  0.066  0.060  0.051  0.093  0.088  0.131

n = 200, MBB
0      0.074  0.109  0.139  0.192  0.205  0.229  0.214  0.187  0.150  0.122  0.087
-2     0.067  0.076  0.122  0.136  0.140  0.157  0.139  0.129  0.111  0.075  0.060
-5     0.044  0.049  0.100  0.096  0.104  0.101  0.102  0.099  0.075  0.078  0.048
-7     0.048  0.054  0.082  0.096  0.094  0.094  0.090  0.086  0.070  0.063  0.057
-70    0.039  0.035  0.055  0.043  0.054  0.052  0.059  0.036  0.054  0.036  0.031

n = 700, t-test
0      0.143  0.158  0.166  0.203  0.195  0.185  0.207  0.191  0.185  0.162  0.130
-2     0.109  0.114  0.137  0.151  0.136  0.149  0.129  0.138  0.134  0.119  0.114
-5     0.109  0.093  0.087  0.102  0.095  0.120  0.104  0.096  0.076  0.119  0.113
-7     0.110  0.091  0.085  0.107  0.092  0.090  0.092  0.103  0.106  0.100  0.104
-70    0.100  0.096  0.060  0.048  0.069  0.066  0.061  0.068  0.066  0.069  0.093

n = 700, MBB
0      0.077  0.126  0.145  0.177  0.189  0.175  0.199  0.167  0.156  0.141  0.085
-2     0.067  0.077  0.115  0.139  0.118  0.144  0.128  0.126  0.105  0.084  0.067
-5     0.049  0.070  0.076  0.092  0.086  0.109  0.096  0.099  0.071  0.077  0.059
-7     0.050  0.050  0.069  0.103  0.088  0.083  0.081  0.085  0.079  0.074  0.061
-70    0.042  0.057  0.048  0.050  0.070  0.080  0.065  0.055  0.048  0.035  0.033

Table 2: Size performance (%) with IVX correction (IID errors, φ = -0.95)

n = 200, IVX-QR
c\τ    0.05   0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9    0.95
0      0.124  0.078  0.056  0.059  0.049  0.043  0.067  0.058  0.063  0.073  0.097
-2     0.091  0.056  0.044  0.033  0.035  0.049  0.054  0.042  0.036  0.061  0.090
-5     0.090  0.061  0.052  0.048  0.033  0.034  0.044  0.041  0.040  0.043  0.085
-7     0.075  0.066  0.052  0.036  0.038  0.035  0.035  0.040  0.046  0.045  0.084
-70    0.087  0.055  0.035  0.032  0.033  0.044  0.028  0.040  0.049  0.051  0.087

n = 200, MBB IVX-QR
0      0.047  0.028  0.043  0.049  0.043  0.038  0.052  0.049  0.046  0.043  0.033
-2     0.039  0.030  0.049  0.040  0.054  0.061  0.043  0.039  0.032  0.045  0.046
-5     0.048  0.035  0.047  0.045  0.043  0.050  0.043  0.055  0.038  0.031  0.042
-7     0.048  0.042  0.054  0.051  0.049  0.048  0.048  0.040  0.032  0.028  0.032
-70    0.037  0.041  0.039  0.044  0.045  0.044  0.035  0.044  0.050  0.032  0.040

n = 700, IVX-QR
0      0.082  0.065  0.076  0.042  0.057  0.060  0.066  0.049  0.076  0.076  0.087
-2     0.057  0.051  0.050  0.046  0.047  0.046  0.055  0.049  0.073  0.047  0.085
-5     0.063  0.052  0.041  0.038  0.041  0.040  0.047  0.041  0.040  0.053  0.070
-7     0.067  0.056  0.041  0.041  0.050  0.033  0.038  0.040  0.056  0.066  0.070
-70    0.070  0.048  0.043  0.054  0.041  0.052  0.039  0.050  0.043  0.043  0.074

n = 700, MBB IVX-QR
0      0.053  0.036  0.060  0.043  0.057  0.046  0.049  0.051  0.055  0.044  0.037
-2     0.039  0.050  0.051  0.052  0.056  0.059  0.059  0.053  0.066  0.050  0.060
-5     0.027  0.037  0.041  0.034  0.048  0.050  0.054  0.058  0.046  0.040  0.040
-7     0.041  0.050  0.041  0.046  0.058  0.046  0.043  0.048  0.055  0.046  0.049
-70    0.047  0.029  0.041  0.048  0.048  0.055  0.049  0.058  0.051  0.034  0.047

Table 3: Size performance (%) without IVX correction (ARCH(1) errors, φ0 = -0.9)

n = 200, t-test
c\τ    0.05   0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9    0.95
0      0.347  0.272  0.238  0.224  0.200  0.204  0.216  0.218  0.253  0.289  0.319
-2     0.318  0.283  0.227  0.196  0.166  0.143  0.176  0.157  0.200  0.247  0.300
-5     0.312  0.234  0.172  0.172  0.119  0.120  0.111  0.147  0.206  0.253  0.340
-7     0.307  0.245  0.185  0.142  0.122  0.098  0.109  0.142  0.191  0.242  0.311
-70    0.214  0.212  0.167  0.159  0.146  0.160  0.148  0.161  0.169  0.202  0.203

n = 200, MBB
0      0.110  0.108  0.140  0.156  0.184  0.189  0.183  0.177  0.150  0.140  0.104
-2     0.101  0.119  0.125  0.141  0.114  0.131  0.153  0.113  0.119  0.106  0.087
-5     0.110  0.096  0.096  0.112  0.092  0.090  0.076  0.088  0.104  0.105  0.111
-7     0.095  0.085  0.105  0.088  0.090  0.069  0.080  0.089  0.090  0.064  0.097
-70    0.068  0.078  0.064  0.061  0.062  0.073  0.047  0.062  0.059  0.052  0.061

n = 700, t-test
0      0.241  0.240  0.195  0.193  0.189  0.198  0.175  0.210  0.207  0.229  0.270
-2     0.262  0.210  0.190  0.199  0.134  0.128  0.135  0.178  0.197  0.212  0.263
-5     0.251  0.221  0.154  0.129  0.092  0.092  0.108  0.119  0.170  0.218  0.251
-7     0.261  0.215  0.151  0.113  0.095  0.099  0.103  0.133  0.163  0.198  0.259
-70    0.203  0.204  0.149  0.111  0.103  0.084  0.088  0.116  0.156  0.198  0.230

n = 700, MBB
0      0.096  0.117  0.146  0.139  0.156  0.190  0.150  0.163  0.125  0.101  0.103
-2     0.109  0.096  0.108  0.138  0.123  0.117  0.124  0.135  0.114  0.103  0.095
-5     0.081  0.089  0.087  0.079  0.084  0.085  0.088  0.087  0.095  0.096  0.098
-7     0.085  0.092  0.078  0.087  0.079  0.080  0.089  0.080  0.083  0.090  0.100
-70    0.078  0.074  0.066  0.044  0.065  0.054  0.064  0.057  0.061  0.075  0.079

Table 4: Size performance (%) with IVX correction (ARCH(1) errors, φ0 = -0.9)

n = 200, IVX-QR
c\τ    0.05   0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9    0.95
0      0.213  0.167  0.174  0.146  0.123  0.137  0.160  0.153  0.153  0.196  0.199
-2     0.197  0.170  0.146  0.140  0.120  0.145  0.142  0.127  0.159  0.180  0.161
-5     0.193  0.157  0.150  0.113  0.101  0.095  0.133  0.124  0.159  0.168  0.199
-7     0.178  0.165  0.142  0.129  0.130  0.108  0.109  0.120  0.138  0.135  0.190
-70    0.148  0.132  0.118  0.114  0.104  0.127  0.087  0.117  0.110  0.107  0.118

n = 200, MBB IVX-QR
0      0.066  0.062  0.053  0.048  0.040  0.055  0.053  0.061  0.050  0.050  0.069
-2     0.063  0.060  0.064  0.059  0.049  0.052  0.065  0.052  0.054  0.063  0.061
-5     0.076  0.051  0.060  0.050  0.049  0.039  0.048  0.047  0.054  0.066  0.068
-7     0.060  0.064  0.065  0.052  0.061  0.052  0.045  0.060  0.054  0.053  0.064
-70    0.057  0.059  0.049  0.043  0.041  0.049  0.037  0.055  0.051  0.043  0.043

n = 700, IVX-QR
0      0.200  0.196  0.164  0.112  0.128  0.084  0.109  0.134  0.145  0.206  0.215
-2     0.209  0.159  0.157  0.109  0.088  0.095  0.094  0.113  0.157  0.174  0.230
-5     0.171  0.190  0.130  0.122  0.077  0.073  0.093  0.111  0.135  0.192  0.175
-7     0.181  0.198  0.141  0.087  0.077  0.082  0.077  0.115  0.115  0.199  0.217
-70    0.172  0.168  0.135  0.107  0.086  0.087  0.087  0.135  0.139  0.170  0.189

n = 700, MBB IVX-QR
0      0.073  0.078  0.063  0.050  0.059  0.050  0.054  0.067  0.056  0.080  0.080
-2     0.075  0.062  0.071  0.052  0.061  0.056  0.057  0.056  0.052  0.062  0.083
-5     0.073  0.072  0.047  0.066  0.059  0.047  0.057  0.066  0.061  0.072  0.079
-7     0.077  0.066  0.065  0.043  0.043  0.062  0.054  0.064  0.049  0.074  0.071
-70    0.070  0.053  0.060  0.050  0.052  0.048  0.038  0.054  0.056  0.056  0.069

Table 5: Tests for stock returns (S&P500) quantile predictability: 1927:01 - 2005:12
(IVX-QR entries are p-values; MBB IVX-QR entries are 95% confidence intervals for βτ; * denotes rejection of the null of no predictability at the 5% level.)

τ:          0.05  0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  0.95

dp
IVX-QR:     0.003* 0.000* 0.002* 0.123 0.749 0.605 0.157 0.004* 0.059 0.057 0.004*
MBB IVX-QR: (-0.125, 0.006) (-0.131, -0.000)* (-0.089, 0.001) (-0.072, -0.003)* (-0.052, -0.000)* (-0.039, 0.012) (-0.022, 0.022) (-0.018, 0.033) (-0.015, 0.033) (-0.021, 0.054) (-0.035, 0.074)

de
IVX-QR:     0.000* 0.000* 0.000* 0.000* 0.015* 0.375 0.534 0.096 0.027* 0.000* 0.000*
MBB IVX-QR: (-0.137, 0.007) (-0.116, 0.004) (-0.073, 0.016) (-0.045, 0.005) (-0.039, 0.011) (-0.035, 0.007) (-0.026, 0.005) (-0.034, 0.007) (-0.030, 0.012) (-0.034, 0.038) (-0.016, 0.086)

ep
IVX-QR:     0.917 0.903 0.579 0.759 0.801 0.821 0.975 0.534 0.500 0.288 0.260
MBB IVX-QR: (-0.084, 0.035) (-0.063, 0.045) (-0.038, 0.014) (-0.04, 0.012) (-0.035, 0.019) (-0.027, 0.018) (-0.021, 0.019) (-0.015, 0.021) (-0.018, 0.027) (-0.012, 0.044) (-0.029, 0.042)

bm
IVX-QR:     0.001* 0.000* 0.001* 0.005* 0.511 0.792 0.125 0.001* 0.023* 0.009* 0.008*
MBB IVX-QR: (-0.252, 0.030) (-0.278, 0.004) (-0.239, 0.007) (-0.146, 0.008) (-0.093, 0.025) (-0.065, 0.045) (-0.027, 0.088) (0.002, 0.097)* (0.008, 0.118)* (-0.014, 0.216) (-0.001, 0.214)

ntis
IVX-QR:     0.056 0.003* 0.003* 0.270 0.186 0.826 0.741 0.582 0.834 0.934 0.966
MBB IVX-QR: (-0.887, 0.470) (-0.701, 0.281) (-0.576, 0.143) (-0.433, 0.074) (-0.267, 0.117) (-0.255, 0.106) (-0.300, 0.060) (-0.218, 0.157) (-0.237, 0.124) (-0.271, 0.244) (-0.462, 0.321)

tbl
IVX-QR:     0.604 0.689 0.060 0.006* 0.003* 0.011* 0.007* 0.097 0.016* 0.001* 0.146
MBB IVX-QR: (-0.598, 0.301) (-0.467, 0.206) (-0.334, 0.005) (-0.372, 0.023) (-0.355, -0.055)* (-0.282, -0.022)* (-0.351, 0.036) (-0.328, 0.052) (-0.249, 0.054) (-0.371, 0.13) (-0.501, 0.381)

tms
IVX-QR:     0.175 0.063 0.219 0.612 0.229 0.352 0.242 0.703 0.170 0.000* 0.000*
MBB IVX-QR: (-1.746, 0.140) (-1.181, 0.483) (-0.626, 0.471) (-0.358, 0.346) (-0.230, 0.385) (-0.130, 0.334) (-0.210, 0.383) (-0.227, 0.507) (-0.029, 0.566) (0.219, 1.169)* (0.354, 1.707)*

dfy
IVX-QR:     0.000* 0.000* 0.000* 0.000* 0.030* 0.747 0.025* 0.000* 0.000* 0.000* 0.000*
MBB IVX-QR: (-6.179, -3.229)* (-4.489, -2.063)* (-3.434, -0.802)* (-2.325, -0.296)* (-1.691, 0.272) (-0.763, 0.82) (-0.319, 1.193) (0.118, 1.957)* (0.44, 2.879)* (1.514, 4.463)* (1.968, 5.047)*

A Technical Appendix

A.1 Supporting Lemmas and Their Proofs

The I(0) $x_t$ case is covered by Fitzenberger (1997, Theorem 3.3), so in this section we provide MBB lemmas for the MI-I(1) cases, where the IVX variables are less persistent than the predictors, i.e., $\delta < \min(\alpha, 1)$. Let $\tilde Y_i^{\dagger} = \frac{1}{\sqrt{b}}\sum_{j=i}^{i+b-1} m_j^{*\dagger}$ denote the normalized average of the $i$th resampled block under the MBB, and let $B_i^{\dagger} = \frac{1}{\sqrt{b}}\sum_{t=i}^{i+b-1} m_t^{\dagger}$ denote the normalized average of the block $(m_i^{\dagger},\ldots,m_{i+b-1}^{\dagger})$, $i \geq 1$. Each of the normalized block sums, after the IVX filtration, satisfies a CLT as $b \to \infty$. To establish MBB consistency, however, the cross-products of some higher moments must be suitably bounded (see the proof of Lemma A.4). One sufficient condition for this is $b = O(k_n)$, which is given in Assumption 4.1.

Lemma A.1 The second moment of the normalized instruments, $Z_t^{\dagger}$, is bounded for all $t$: $E(Z_t^{\dagger 2}) = O(1)$.

Proof of Lemma A.1. Recall that $Z_t^{\dagger} := k_n^{-1/2}\tilde z_t$. By construction, the process $\tilde z_t$ can be decomposed as

$$\tilde z_t = z_t + \frac{C}{n^{\alpha}}\Psi_{nt}, \qquad (19)$$

where $z_t = \sum_{i=1}^{t} R_{nz}^{t-i}u_{xi}$ is a mildly integrated instrument and

$$\Psi_{nt} = \sum_{i=1}^{t} R_{nz}^{t-i}x_{i-1} = \sum_{i=1}^{t}\sum_{k=0}^{i-1} R_{nz}^{t-i}R_n^{i-1-k}u_{xk}.$$

Under Assumption 3.1,

$$
\begin{aligned}
E(Z_t^{\dagger 2}) &= \frac{1}{k_n}E\Big(z_t^2 + \frac{2C}{n^{\alpha}}z_t\Psi_{nt} + \frac{C^2}{n^{2\alpha}}\Psi_{nt}^2\Big)\\
&= \frac{1}{k_n}\sum_{h,i=1}^{t} R_{nz}^{(t-h)+(t-i)}E(u_{xh}u_{xi}) + \frac{1}{k_n}\cdot\frac{2C}{n^{\alpha}}\sum_{h,i=1}^{t}\sum_{k=0}^{i-1} R_{nz}^{(t-h)+(t-i)}R_n^{i-1-k}E(u_{xh}u_{xk})\\
&\quad + \frac{1}{k_n}\cdot\frac{C^2}{n^{2\alpha}}\sum_{h,i=1}^{t}\sum_{f=0}^{h-1}\sum_{k=0}^{i-1} R_{nz}^{(t-h)+(t-i)}R_n^{(h-1-f)+(i-1-k)}E(u_{xf}u_{xk})\\
&= \frac{1}{k_n}\sum_{i=1}^{t} R_{nz}^{2(t-i)}E(u_{xi}^2) + \frac{1}{k_n}\cdot\frac{2C}{n^{\alpha}}\sum_{i=1}^{t} R_{nz}^{t-i}\sum_{k=0}^{i-1} R_{nz}^{t-k}R_n^{i-1-k}E(u_{xk}^2)\\
&\quad + \frac{1}{k_n}\cdot\frac{C^2}{n^{2\alpha}}\sum_{h,i=1}^{t} R_{nz}^{(t-h)+(t-i)}\sum_{k=0}^{i-1} R_n^{2(i-1-k)}E(u_{xk}^2).
\end{aligned}
$$

Using the bounds

$$\sup_{1\le t\le n}\sum_{i=1}^{t} R_{nz}^{m(t-i)} = O(n^{\delta}) \quad\text{and}\quad \sup_{1\le t\le n}\sum_{i=0}^{t-1} R_n^{m(t-i)} = O(n^{\alpha}), \qquad (20)$$

for $m = 1, 2$, and the fact that $R_{nz}^{t-k}\le 1$ for all $k \le t$, we obtain

$$
\begin{aligned}
\sup_{1\le t\le n} E(Z_t^{\dagger 2}) &\le \frac{1}{k_n}\sup_{1\le t\le n}\sum_{i=1}^{t} R_{nz}^{2(t-i)}E(u_{xi}^2) + \frac{1}{k_n}\cdot\frac{2C}{n^{\alpha}}\Big(\sup_{1\le t\le n}\sum_{i=1}^{t} R_{nz}^{t-i}\Big)\Big(\sup_{1\le i\le n}\sum_{k=0}^{i-1} R_n^{i-1-k}E(u_{xk}^2)\Big)\\
&\quad + \frac{1}{k_n}\cdot\frac{C^2}{n^{2\alpha}}\Big(\sup_{1\le t\le n}\sum_{i=1}^{t} R_{nz}^{t-i}\Big)^2\Big(\sup_{1\le i\le n}\sum_{k=0}^{i-1} R_n^{2(i-1-k)}E(u_{xk}^2)\Big)\\
&= O\Big(\frac{1}{k_n}\Big)O(n^{\delta}) + O\Big(\frac{1}{k_n n^{\alpha}}\Big)O(n^{\delta})O(n^{\alpha}) + O\Big(\frac{1}{k_n n^{2\alpha}}\Big)O(n^{2\delta})O(n^{\alpha}) = O(1).
\end{aligned}
$$

Lemma A.2 Let $q = n - b + 1$, so that $q = O(n)$. Then

$$\frac{1}{q}\sum_{i=1}^{q}\frac{1}{b}\sum_{j=i}^{i+b-1} m_j^{\dagger} = \frac{1}{n}\sum_{t=1}^{n} m_t^{\dagger} + O_p\Big(\frac{\sqrt{b}}{n}\Big). \qquad (21)$$

Proof of Lemma A.2.

$$
\begin{aligned}
\frac{1}{q}\sum_{i=1}^{q}\frac{1}{b}\sum_{j=i}^{i+b-1} m_j^{\dagger} &= \frac{1}{qb}\Big[b\sum_{t=1}^{n} m_t^{\dagger} - \big((b-1)m_1^{\dagger} + (b-2)m_2^{\dagger} + \cdots + m_{b-1}^{\dagger}\big)\\
&\qquad - \big((b-1)m_n^{\dagger} + (b-2)m_{n-1}^{\dagger} + \cdots + m_{n-b+2}^{\dagger}\big)\Big]\\
&= \frac{1}{q}\sum_{t=1}^{n} m_t^{\dagger} - \frac{1}{qb}\sum_{j=1}^{b-1}(b-j)\big(m_j^{\dagger} + m_{n-j+1}^{\dagger}\big)\\
&= \frac{1}{q}\sum_{t=1}^{n} m_t^{\dagger} - \frac{\sqrt{b}}{q}\cdot\frac{1}{\sqrt{b}}\sum_{j=1}^{b-1}\Big(1-\frac{j}{b}\Big)m_j^{\dagger} - \frac{\sqrt{b}}{q}\cdot\frac{1}{\sqrt{b}}\sum_{j=1}^{b-1}\Big(1-\frac{j}{b}\Big)m_{n-j+1}^{\dagger}\\
&= \frac{1}{n}\sum_{t=1}^{n} m_t^{\dagger} + O_p\Big(\frac{\sqrt{b}}{n}\Big).
\end{aligned}
$$

By Lemma A.1, the last equality holds because

$$\sum_{j=1}^{b} Var(m_j^{\dagger}) = \sum_{j=1}^{b} E\big[Z_{j-1}^{\dagger}Z_{j-1}^{\dagger\prime}E(\psi_{\tau}^2(u_{j\tau}^{0})|\mathcal{F}_{j-1})\big] = O(b), \qquad (22)$$

$$\sum_{i>j=1}^{b} Cov(m_i^{\dagger}, m_j^{\dagger}) = \sum_{i>j=1}^{b} E\big[Z_{i-1}^{\dagger}Z_{j-1}^{\dagger\prime}\psi_{\tau}(u_{j\tau}^{0})E(\psi_{\tau}(u_{i\tau}^{0})|\mathcal{F}_{i-1})\big] = 0, \qquad (23)$$

and then, by Chebyshev's inequality,

$$\Pr\Big(\Big|\frac{1}{\sqrt{b}}\sum_{j=1}^{b} m_j^{\dagger}\Big| \ge t\Big) \le \frac{Var\big(\frac{1}{\sqrt{b}}\sum_{j=1}^{b} m_j^{\dagger}\big)}{t^2} = \frac{\sum_{j=1}^{b} Var(m_j^{\dagger}) + 2\sum_{i>j=1}^{b} Cov(m_i^{\dagger}, m_j^{\dagger})}{bt^2} = O\Big(\frac{1}{t^2}\Big).$$

Corollary A.1 Let $\bar m_n^{\dagger} := n^{-1}\sum_{t=1}^{n} m_t^{\dagger}$. From Lemma A.2, if $P^*(\tilde Y_1^{\dagger} = B_j^{\dagger}) = 1/q$, $1 \le j \le q$, then

$$E^*\big(\tilde Y_1^{\dagger}\big) = \frac{1}{q}\sum_{i=1}^{q} B_i^{\dagger} = \frac{1}{q}\sum_{i=1}^{q}\frac{1}{\sqrt{b}}\sum_{t=i}^{i+b-1} m_t^{\dagger} = \sqrt{b}\,\bar m_n^{\dagger} + O_p\Big(\frac{b}{n}\Big). \qquad (24)$$

Lemma A.3 Let $z_t := \sum_{i=1}^{t} R_{nz}^{t-i}u_{xi}$ be a mildly integrated instrument satisfying

$$z_t = R_{nz}z_{t-1} + u_{xt}, \qquad t \in \{1,\ldots,n\}, \quad z_0 = 0.$$

Suppose Assumption 3.1 holds. Then, for $1 \le j \le b$ and $1 \le t \le n$,

1. $E(z_{t+j}^2|\mathcal{F}_t) \le \sum_{i,h=1}^{t} R_{nz}^{(t+j-i)+(t+j-h)}u_{xi}u_{xh} + O_p(b)$.

2. $E(z_{t+j}\Psi_{n,t+j}|\mathcal{F}_t) \le \sum_{i=1}^{t}\sum_{h=1}^{t+1}\sum_{k=0}^{h-1} R_{nz}^{(t+j-i)+(t+j-h)}R_n^{h-1-k}u_{xi}u_{xk} + O_p(b^2)$.

3. $E(\Psi_{n,t+j}^2|\mathcal{F}_t) \le \sum_{i,h=1}^{t+1}\sum_{f=0}^{i-1}\sum_{k=0}^{h-1} R_{nz}^{(t+j-i)+(t+j-h)}R_n^{(i-1-f)+(h-1-k)}u_{xf}u_{xk} + O_p(b^3)$.

Proof of Lemma A.3. Under Assumption 3.1, for all $1 \le s < t \le n$,

$$E(u_{xt}u_{xs}) = E[u_{xs}E(u_{xt}|\mathcal{F}_{t-1})] = 0, \qquad (25)$$

and

$$E(\sigma_{xt}^2) = E[E(u_{xt}^2|\mathcal{F}_{t-1})] = E(u_{xt}^2) = O(1). \qquad (26)$$

Let $\sigma_{xi}^{2*} \equiv \sigma_{xi}^2 - E\sigma_{xi}^2$ denote the centered version of $\sigma_{xi}^2$. Note that, under Assumption 3.1, the process $\{\sigma_{xi}^{2*}\}$ is an $\alpha$-mixing sequence of size $-\frac{r}{r-2}$ for $r > 2$, and $E|\sigma_{xt}^{2*}|^{r} < \infty$ for all $t = 1,\ldots,n$. Hence, by the mixing inequality in Fan and Yao (2003, Theorem 2.17), for any $t$ with $1 \le t \le n$, there exists a positive constant $C_2$, independent of $j$, such that

$$E\Big[\Big(\sum_{i=t+1}^{t+j}\sigma_{xi}^{2*}\Big)^2\Big] \le C_2 j \le \sup_{1\le j\le b} C_2 j = O(b). \qquad (27)$$

By (27) and Chebyshev's inequality, for any $\epsilon > 0$,

$$\Pr\Big(\Big|\sum_{i=t+1}^{t+j}\sigma_{xi}^{2*}\Big| > \epsilon\Big) \le \frac{E\big[\big(\sum_{i=t+1}^{t+j}\sigma_{xi}^{2*}\big)^2\big]}{\epsilon^2} \le O\Big(\frac{b}{\epsilon^2}\Big),$$

which implies that, for $1 \le j \le b$,

$$\sum_{i=t+1}^{t+j}\sigma_{xi}^2 = \sum_{i=t+1}^{t+j}\sigma_{xi}^{2*} + \sum_{i=t+1}^{t+j}E(\sigma_{xi}^2) \le O_p(b).$$

Therefore, by the (conditional) dominated convergence theorem,

$$E\Big[\sum_{i=t+1}^{t+j}\sigma_{xi}^2\,\Big|\,\mathcal{F}_t\Big] \le O_p(b). \qquad (28)$$

Using (25), (28) and the fact that $R_{nz}^{t-i}\le 1$ and $R_n^{t-i}\le 1$ for all $1 \le i \le t \le n$, we obtain:

1.

$$
\begin{aligned}
E(z_{t+j}^2|\mathcal{F}_t) &= E\Big[\Big(\sum_{i=1}^{t} R_{nz}^{t+j-i}u_{xi} + \sum_{i=t+1}^{t+j} R_{nz}^{t+j-i}u_{xi}\Big)^2\Big|\mathcal{F}_t\Big]\\
&= \Big(\sum_{i=1}^{t} R_{nz}^{t+j-i}u_{xi}\Big)^2 + E\Big[\Big(\sum_{i=t+1}^{t+j} R_{nz}^{t+j-i}u_{xi}\Big)^2\Big|\mathcal{F}_t\Big]\\
&= \sum_{i,h=1}^{t} R_{nz}^{(t+j-i)+(t+j-h)}u_{xi}u_{xh} + \sum_{i=t+1}^{t+j} R_{nz}^{2(t+j-i)}E(u_{xi}^2|\mathcal{F}_t)\\
&\le \sum_{i,h=1}^{t} R_{nz}^{(t+j-i)+(t+j-h)}u_{xi}u_{xh} + E\Big[\sum_{i=t+1}^{t+j}\sigma_{xi}^2\,\Big|\,\mathcal{F}_t\Big]\\
&= \sum_{i,h=1}^{t} R_{nz}^{(t+j-i)+(t+j-h)}u_{xi}u_{xh} + O_p(b). \qquad (29)
\end{aligned}
$$

2.

$$
\begin{aligned}
E(z_{t+j}\Psi_{n,t+j}|\mathcal{F}_t) &= E\Big[\Big(\sum_{i=1}^{t+j} R_{nz}^{t+j-i}u_{xi}\Big)\cdot\Big(\sum_{h=1}^{t+j}\sum_{k=0}^{h-1} R_{nz}^{t+j-h}R_n^{h-1-k}u_{xk}\Big)\Big|\mathcal{F}_t\Big]\\
&= \sum_{i=1}^{t}\sum_{h=1}^{t+1}\sum_{k=0}^{h-1} R_{nz}^{(t+j-i)+(t+j-h)}R_n^{h-1-k}u_{xi}u_{xk} + \sum_{h=t+2}^{t+j}\sum_{k=t+1}^{h-1} R_{nz}^{t+j-h}R_{nz}^{t+j-k}R_n^{h-1-k}E(u_{xk}^2|\mathcal{F}_t)\\
&\le \sum_{i=1}^{t}\sum_{h=1}^{t+1}\sum_{k=0}^{h-1} R_{nz}^{(t+j-i)+(t+j-h)}R_n^{h-1-k}u_{xi}u_{xk} + \sum_{h=t+2}^{t+j}\sum_{k=t+1}^{h-1} R_{nz}^{t+j-h}R_n^{h-1-k}E(u_{xk}^2|\mathcal{F}_t)\\
&\le \sum_{i=1}^{t}\sum_{h=1}^{t+1}\sum_{k=0}^{h-1} R_{nz}^{(t+j-i)+(t+j-h)}R_n^{h-1-k}u_{xi}u_{xk} + b\sup_{t+2\le h\le t+j}\sum_{k=t+1}^{h-1}E(u_{xk}^2|\mathcal{F}_t)\\
&\le \sum_{i=1}^{t}\sum_{h=1}^{t+1}\sum_{k=0}^{h-1} R_{nz}^{(t+j-i)+(t+j-h)}R_n^{h-1-k}u_{xi}u_{xk} + O_p(b^2).
\end{aligned}
$$

3.

$$
\begin{aligned}
E(\Psi_{n,t+j}^2|\mathcal{F}_t) &= E\Big[\Big(\sum_{h=1}^{t+j}\sum_{k=0}^{h-1} R_{nz}^{t+j-h}R_n^{h-1-k}u_{xk}\Big)^2\Big|\mathcal{F}_t\Big]\\
&= \sum_{i,h=1}^{t+1}\sum_{f=0}^{i-1}\sum_{k=0}^{h-1} R_{nz}^{(t+j-i)+(t+j-h)}R_n^{(i-1-f)+(h-1-k)}u_{xf}u_{xk} + \sum_{i,h=t+2}^{t+j}\sum_{k=t+1}^{h-1} R_{nz}^{(t+j-i)+(t+j-h)}R_n^{2(h-1-k)}E(u_{xk}^2|\mathcal{F}_t)\\
&\le \sum_{i,h=1}^{t+1}\sum_{f=0}^{i-1}\sum_{k=0}^{h-1} R_{nz}^{(t+j-i)+(t+j-h)}R_n^{(i-1-f)+(h-1-k)}u_{xf}u_{xk} + O_p(b^3).
\end{aligned}
$$

Lemma A.4 Under Assumptions 2.1, 3.1 and 4.1,

$$\frac{1}{q}\sum_{i=1}^{q} B_i^{\dagger 2} \to_p \tau(1-\tau)V_{cxz}. \qquad (30)$$

Proof of Lemma A.4. Writing out the squared block sums and collecting terms,

$$
\begin{aligned}
\frac{1}{q}\sum_{i=1}^{q} B_i^{\dagger 2} &= \frac{1}{q}\sum_{i=1}^{q}\Big(\frac{1}{\sqrt{b}}\sum_{t=i}^{i+b-1} m_t^{\dagger}\Big)^2 = \frac{1}{bq}\sum_{i=1}^{q}\Big(\sum_{j=i}^{i+b-1} m_j^{\dagger 2} + 2\sum_{j=i+1}^{i+b-1} m_j^{\dagger}m_{j-1}^{\dagger} + \cdots + 2m_i^{\dagger}m_{i+b-1}^{\dagger}\Big)\\
&= \frac{1}{q}\sum_{t=1}^{n} m_t^{\dagger 2} - \frac{1}{q}\sum_{t=1}^{b-1}\Big(1-\frac{t}{b}\Big)\big(m_t^{\dagger 2} + m_{n+1-t}^{\dagger 2}\big) + 2\sum_{j=1}^{b-1}\Big(1-\frac{j}{b}\Big)\frac{1}{q}\sum_{t=1}^{n-j} m_t^{\dagger}m_{t+j}^{\dagger}\\
&\quad - 2\sum_{j=2}^{b-1}\Big(1-\frac{j}{b}\Big)\frac{1}{q}\sum_{i=1}^{j-1}\big(m_j^{\dagger}m_{j-i}^{\dagger} + m_{n+1-j}^{\dagger}m_{n+1-j+i}^{\dagger}\big)\\
&\equiv (I) - (II) + 2\cdot(III) - 2\cdot(IV).
\end{aligned}
$$

(1) Using the facts that $q = n - b + 1$, $b = o(n)$ and hence $n/q \to 1$ as $n \to \infty$,

$$(I) = \frac{1}{q}\sum_{t=1}^{n} m_t^{\dagger 2} = \frac{n}{q}\cdot\frac{1}{n}\sum_{t=1}^{n} m_t^{\dagger 2} \to_p E\big(m_t^{\dagger 2}\big) = \tau(1-\tau)V_{cxz}, \quad\text{as } n \to \infty.$$

1 q

Pb−1

t=1 (1

†2 − bt )(m†2 t + mn+1−t ) = op (1), we first obtain

b−1

b−1

t=1

t=1

1X t 1 X †2 (1 − )m†2 mt , t ≤ q b q and P r(

P Pb−1 b−1 †2 E( b−1 E(m†2 1 X †2 b t ) t=1 mt ) mt > ) ≤ = t=1 = O( ) = o(1). q q q q t=1

†2 t 1 Pb−1 − bt )m†2 t = op (1). Similarly, we can obtain q t=1 (1 − b )mn+1−t = op (1). Pb−1 P n−j (3) To obtain (III) = j=1 (1− jb ) 1q t=1 m†t m†t+j = op (1), it suffices to show, for 1 ≤ j ≤ b−1,

Thus

1 Pb−1

t=1 (1

q

n−j

1 1X † † mt mt+j = O( ). q n

(31)

t=1

Using Chebyshev’s inequality and the fact that ψτ (u0tτ ) ≤ 1, 1 ≤ t ≤ n, n−j

1X † † P r( mt mt+j > ) ≤ q

 P † † 2 E ( n−j t=1 mt mt+j ) Pn−j

=

t=1

E(m†t m†t+j )2

(33)

q 2 2 Pn−j



(32)

q 2 2

t=1

t=1

† † Zt+j−1 )2 E(Zt−1

q 2 2

.

(34)

Equation (33) makes use of the fact that, for 1 ≤ t ≤ s ≤ n, 1 ≤ j ≤ b − 1, E(mt |Ft−1 ) = Zt−1 E[ψτ (u0tτ )|Ft−1 ] = 0,

(35)

and E(mt mt+j ms ms+j ) = E[mt mt+j ms E(ms+j |Fs+j−1 )] = 0. (36) P † † 2 Therefore, in order to obtain (31) it is suffices to show n−j t=1 E(Zt−1 Zt+j−1 ) = O(n). Recall that P P P † t t i−1 t−i u , and Ψ = t−i i−1−k u . Zt = √1k (zt + nCα Ψnt ), where zt = i=1 Rnz xi nt xk i=1 k=1 Rnz Rn n

n−j X t=1

† E(Zt† Zt+j )2

n−j  C C 1 X  E (zt + α Ψnt )2 · (zt+j + α Ψnt+j )2 = 2 kn n n

=

1 kn2

t=1 n−j X t=1

 2 2C 2 C2 2 2C E zt2 zt+j + α zt zt+j Ψnt + 2α zt+j Ψ2nt + α zt2 zt+j Ψnt+j n n n

4C 2 2C 3 C2 + 2α zt zt+j Ψnt Ψnt+j + 3α zt+j Ψ2nt Ψnt+j + 2α zt2 Ψ2nt+j n n n 4  2C 3 C + 3α zt Ψnt Ψ2nt+j + 4α Ψ2nt Ψ2nt+j n n 33

(37)

Next, by Assumption 3.1 and 4.1 and Lemma A.3, we can obtain the order of each term in (37) using the bounds sup

t X

1≤t≤n i=1

m(t−i) Rnz = O(nδ )

and

sup

t X

1≤t≤n i=1

Rnm(t−i) = O(nα ),

(38)

t−k ≤ 1 and Rt−k ≤ 1, for all 0 ≤ k ≤ t ≤ n. for m=1, 2, 3, 4, and the fact that Rnz n

Since b = o(kn ) from Assumption 4.1, the first term in (37) is n−j n−j   1 X  2 1 X 2 2 2 E z z = E zt E(zt+j |Ft ) t t+j 2 2 kn kn t=1

t=1

n−j t  1 X h X (t−i)+(t−k) ≤ 2 E Rnz uxi uxk · kn

=

t=1

i,k=1

n−j 1 X h E kn2

t X

t=1

t X

i (t+j−i)+(t+j−h) Rnz uxi uxh + Op (b)

i,h=1

(t−i)+(t−k)+(t+j−h)+(t+j−s) Rnz uxi uxk uxh uxs +

i,k,h,s=1

t X

(t−i)+(t−k) Rnz uxi uxk Op (b)

i,k=1





n−j n−j t 1 X  X (t−i)+(t−k) 1 X In1 + 2 E Rnz uxi uxk Op (b) , = 2 kn kn t=1

where In1 =

t=1

(39)

i,k=1

(t−i)+(t−k)+(t+j−h)+(t+j−s) E(uxi uxk uxh uxs ). i,k,h,s=1 Rnz fact that E(u2xt ) = O(1) and E(uxt uxs ) = 0 for 1 ≤

Pt

Using the

s 6= t ≤ n, by the dominated

convergence theorem, we can obtain   n−j n−j t t X X 1 1 X X 2(t−i) b (t−i)+(t−k)   E R u u O (b) = Rnz E(u2xi Op ( )) xi xk p nz kn2 kn kn t=1

t=1 i=1

i,k=1

n ≤ O( ) kn = O(

sup

1≤t≤n i=1

bn1+δ kn2

t X

 b 2(t−i) Rnz O( ) kn

).

Next, under Assumption 3.1 and 4.1, (In1 -(i)) if i 6= k 6= h 6= s, E(uxi uxk uxh uxs ) = 0. Thus In1 = 0. (In1 -(ii)) if i = k = h = s, In1 =

t X i=1

4(t−i) Rnz E(u4xi )

≤ sup

t X

1≤t≤n i=1

34

4(t−i) Rnz · O(1) = O(nδ ).

(40)

i

(In1 -(iii)) if i = k 6= h = s, In1 =

t X

2(t−i)+2(t−h) Rnz E(u2xi u2xh ) ≤

i,h=1

sup

t X

1≤t≤n i=1 2δ

≤ O(n ) ·

2(t−i) Rnz · sup

1≤t≤n

t X

2(t−h) Rnz · E(u2xi u2xh )

h=1

[E(u4xi )E(u4xh )]1/2

= O(n2δ ) · O(1) = O(n2δ ). (In1 -(iv)) if i = k = h 6= s, In1 =

t X

3(t−i)+(t−s) Rnz E(u3xi uxs ) ≤

i,s=1

sup

t X

1≤t≤n i=1

3(t−i) Rnz · sup

t X

1≤t≤n s=1

t−s Rnz · E[(u2xi )(uxi uxs )]

≤ O(n2δ ) · [E(u4xi )E(u2xi u2xs )]1/2 ≤ O(n2δ ) · O(1) = O(n2δ ). (In1 -(v)) if i = k 6= h 6= s, t X

In1 =

2(t−i)+(t−h)+(t−s) Rnz E(u2xi uxh uxs ).

i,h,s=1

Let p = (1 − 2/r0−1 , r < r0 < r + γ/2, for some small γ > 0. Assumption 3.1 implies that there −r

exists η such that α(d) ≤ η · d r−2 for all d = 0, 1, 2, .... By the covariance inequality in Proposition (2.5) of Fan and Yao (2003) and Cauchy-Schwarz inequality, we obtain E(u2xi uxh uxs ) = Cov(u2xi , uxh uxs ) + E(u2xi )E(uxh uxs ) ≤ |Cov(u2xi , uxh uxs )| 0

0

0

0

1/r ≤ 8α(min(|i − h|, |i − s|))1/p · [E(u2r · [E(|uxh uxs |r )]1/r xi )]  −r 1−2/r 0 1 0 1/r0 2r0 0 ≤ 8 η · [min(|i − h|, |i − s|)] r−2 · C1 · [E(u2r xh )E(uxs )] 2r 2

≤ 8η 1− r0 · [min(|i − h|, |i − s|)] = M · [min(|i − h|, |i − s|)] Let d1 = |i − h|, d2 = |i − s|, and λ = fact that

n X

−r(r0 −2) r0 (r−2) .

dλi < ∞,

−r(r 0 −2) r 0 (r−2)

−r(r 0 −2) r 0 (r−2)

2/r0

· C1

2

2/r0

, where M = 8η 1− r0 C1

.

(41)

By definition, λ < −1. Using (38), (41) and the

for i =1, 2, as n → ∞,

di =1

35

(42)

we can obtain t X

In1 =

2(t−i)+(t−h)+(t−s) Rnz E(u2xi uxh uxs )

i>h>s=1 t X

≤M

≤M

+

2(t−i)+(t−h)+(t−s) Rnz E(u2xi uxh uxs )

i>s>h=1 2(t−i)+(t−h)+(t−s) Rnz · |i − h|

≤M

≤M

t X

−r(r 0 −2) r 0 (r−2)

t X

+M

2(t−i)+(t−h)+(t−s) Rnz · |i − s|

i>s>h=1 i>h>s=1 t t i−1 t t X i−1 X X  X X 2(t−i) X t−s t−h 2(t−i) Rnz Rnz |i − h|λ + M Rnz Rnz |i − s=1 i=2 h=1 i=2 s=1 h=1 t t i−1 t t i−1 X X  X X  X X t−s 2(t−i) λ t−h 2(t−i) Rnz Rnz d1 + M Rnz Rnz dλ2 s=1 i=2 i=2 d1 =1 h=1 d2 =1

sup

t X

1≤t≤n s=1 2δ

t−s Rnz



t X

sup

1≤t≤n i=2



2(t−i) Rnz



i−1 X

sup 1≤i≤n

dλ1



+M

d1 =1

t X

t−h Rnz

s|λ

t  X

2(t−i) Rnz

i=2

h=1

−r(r 0 −2) r 0 (r−2)



= O(n ) + O(n ) = O(n ).



i−1 X

sup 1≤i≤n

dλ2

d2 =1

(43)

Using the results of (In1 -(i))-(In1 -(v)), (39) and (40) yields n−j  n1+2δ bn1+δ 1 X 2 2 E z z ≤ O( ) + O( ). t t+j kn2 kn2 kn2

(44)

t=1

By the dominated convergence theorem, the second term in (37) is
$$
\begin{aligned}
\frac{1}{k_n^2}\sum_{t=1}^{n-j}E\Big[\frac{2C}{n^{\alpha}}z_t\,z_{t+j}^2\,\Psi_{nt}\Big]
&=\frac{2C}{n^{\alpha}k_n^2}\sum_{t=1}^{n-j}E\Big[z_t\,\Psi_{nt}\,E\big(z_{t+j}^2\mid\mathcal{F}_t\big)\Big]\\
&=\frac{2C}{n^{\alpha}k_n^2}\sum_{t=1}^{n-j}
E\Big[\Big(\sum_{i=1}^{t}R_{nz}^{t-i}u_{xi}\Big)
\Big(\sum_{h=1}^{t}\sum_{k=0}^{h-1}R_{nz}^{t-h}R_{n}^{h-1-k}u_{xk}\Big)
\Big(\sum_{i,h=1}^{t}R_{nz}^{(t+j-i)+(t+j-h)}u_{xi}u_{xh}+O_p(b)\Big)\Big]\\
&=\frac{2C}{n^{\alpha}k_n^2}\sum_{t=1}^{n-j}I_{n2}
+\frac{2C}{n^{\alpha}k_n}\sum_{t=1}^{n-j}\sum_{i,h=1}^{t}\sum_{k=0}^{h-1}R_{nz}^{(t-i)+(t-h)}R_n^{h-1-k}E\Big(u_{xi}u_{xk}\,O_p\big(\tfrac{b}{k_n}\big)\Big)\\
&=\frac{2C}{n^{\alpha}k_n^2}\sum_{t=1}^{n-j}I_{n2}+O\Big(\frac{b\,n^{1+2\delta}}{n^{\alpha}k_n^2}\Big),
\end{aligned}
\tag{45}
$$
where $I_{n2}=\sum_{i,h,s,g=1}^{t}\sum_{k=0}^{h-1}R_{nz}^{(t-i)+(t-h)+(t+j-s)+(t+j-g)}R_n^{h-1-k}E\big(u_{xi}u_{xk}u_{xs}u_{xg}\big)$.

In order to find the order of $I_{n2}$, we note that:

($I_{n2}$-(i)) if $i\neq k\neq s\neq g$, $E(u_{xi}u_{xk}u_{xs}u_{xg})=0$. Thus $I_{n2}=0$.

($I_{n2}$-(ii)) if $i=k=s=g$,
$$
I_{n2}=\sum_{h=1}^{t}R_{nz}^{t-h}\sum_{k=0}^{h-1}R_{nz}^{3(t-k)}R_n^{h-1-k}E(u_{xk}^4)
\le\Big(\sup_{1\le t\le n}\sum_{h=1}^{t}R_{nz}^{t-h}\Big)\Big(\sup_{1\le h\le n}\sum_{k=0}^{h-1}R_n^{h-1-k}\Big)O(1)=O(n^{\alpha+\delta}).
$$

($I_{n2}$-(iii)) if $i=k\neq s=g$,
$$
I_{n2}=\sum_{h,s=1}^{t}R_{nz}^{(t-h)+2(t-s)}\sum_{k=0}^{h-1}R_{nz}^{t-k}R_n^{h-1-k}E(u_{xk}^2u_{xs}^2)
\le\Big(\sup_{1\le t\le n}\sum_{h=1}^{t}R_{nz}^{t-h}\Big)^2\Big(\sup_{1\le h\le n}\sum_{k=0}^{h-1}R_n^{h-1-k}\Big)O(1)
\le O(n^{2\delta})\cdot O(n^{\alpha})\cdot O(1)=O(n^{2\delta+\alpha}).
$$

($I_{n2}$-(iv)) if $i=k=s\neq g$,
$$
I_{n2}=\sum_{h,g=1}^{t}R_{nz}^{(t-h)+(t-g)}\sum_{k=0}^{h-1}R_{nz}^{2(t-k)}R_n^{h-1-k}E(u_{xk}^3u_{xg})
\le\Big(\sup_{1\le t\le n}\sum_{h=1}^{t}R_{nz}^{t-h}\Big)^2\Big(\sup_{1\le h\le n}\sum_{k=0}^{h-1}R_n^{h-1-k}\Big)O(1)
\le O(n^{2\delta})\cdot O(n^{\alpha})\cdot O(1)=O(n^{2\delta+\alpha}).
$$

($I_{n2}$-(v)) if $i=k\neq s\neq g$,
$$
I_{n2}=\sum_{h,s,g=1}^{t}R_{nz}^{(t-h)+(t-s)+(t-g)}\sum_{k=0}^{h-1}R_{nz}^{t-k}R_n^{h-1-k}E(u_{xk}^2u_{xs}u_{xg})
\le\sum_{h,s,g=1}^{t}R_{nz}^{(t-h)+(t-s)+(t-g)}\sum_{k=0}^{h-1}R_n^{h-1-k}E(u_{xk}^2u_{xs}u_{xg}).
\tag{46}
$$
Let $d_1=|k-s|$, $d_2=|k-g|$. Using the results of (41), (42) and (46), we obtain
$$
\begin{aligned}
I_{n2}&\le M\sum_{h=4}^{t}\sum_{k>s>g=1}^{h-1}R_{nz}^{(t-h)+(t-s)+(t-g)}R_n^{h-1-k}|k-s|^{\lambda}
+M\sum_{h=4}^{t}\sum_{k>g>s=1}^{h-1}R_{nz}^{(t-h)+(t-s)+(t-g)}R_n^{h-1-k}|k-g|^{\lambda}\\
&\le M\Big(\sup_{1\le t\le n}\sum_{g=1}^{t}R_{nz}^{t-g}\Big)\Big(\sup_{4\le t\le n}\sum_{h=4}^{t}R_{nz}^{t-h}\Big)\Big(\sup_{4\le h\le n}\sum_{k=3}^{h-1}R_n^{h-1-k}\Big)\Big(\sup_{3\le k\le n}\sum_{d_1=1}^{k-2}d_1^{\lambda}\Big)\\
&\quad+M\Big(\sup_{1\le t\le n}\sum_{s=1}^{t}R_{nz}^{t-s}\Big)\Big(\sup_{4\le t\le n}\sum_{h=4}^{t}R_{nz}^{t-h}\Big)\Big(\sup_{4\le h\le n}\sum_{k=3}^{h-1}R_n^{h-1-k}\Big)\Big(\sup_{3\le k\le n}\sum_{d_2=1}^{k-2}d_2^{\lambda}\Big)\\
&=O(n^{2\delta+\alpha})+O(n^{2\delta+\alpha})=O(n^{2\delta+\alpha}).
\end{aligned}
\tag{47}
$$

Using the results of $(I_{n2}\text{-(i)})$–$(I_{n2}\text{-(v)})$, (45) yields
$$
\frac{1}{k_n^2}\sum_{t=1}^{n-j}E\Big[\frac{2C}{n^{\alpha}}z_t\,z_{t+j}^2\,\Psi_{nt}\Big]
\le O\Big(\frac{n^{1+2\delta}}{k_n^2}\Big)+O\Big(\frac{b\,n^{1+2\delta}}{n^{\alpha}k_n^2}\Big).
\tag{48}
$$

The third term in (37) is
$$
\begin{aligned}
\frac{1}{k_n^2}\sum_{t=1}^{n-j}E\Big[\frac{C^2}{n^{2\alpha}}z_{t+j}^2\,\Psi_{nt}^2\Big]
&=\frac{C^2}{n^{2\alpha}k_n^2}\sum_{t=1}^{n-j}E\Big[\Psi_{nt}^2\,E\big(z_{t+j}^2\mid\mathcal{F}_t\big)\Big]\\
&=\frac{C^2}{n^{2\alpha}k_n^2}\sum_{t=1}^{n-j}
E\Big[\Big(\sum_{h=1}^{t}\sum_{k=0}^{h-1}R_{nz}^{t-h}R_n^{h-1-k}u_{xk}\Big)^2
\Big(\sum_{i,h=1}^{t}R_{nz}^{(t+j-i)+(t+j-h)}u_{xi}u_{xh}+O_p(b)\Big)\Big]\\
&=\frac{C^2}{n^{2\alpha}k_n^2}\sum_{t=1}^{n-j}I_{n3}
+\frac{C^2}{n^{2\alpha}k_n^2}\sum_{t=1}^{n-j}\sum_{s,g=1}^{t}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}
R_{nz}^{(t-s)+(t-g)}R_n^{(s-1-m)+(g-1-r)}E\Big(u_{xm}u_{xr}\,O_p\big(\tfrac{b}{k_n}\big)\Big)\\
&\le\frac{C^2}{n^{2\alpha}k_n^2}\sum_{t=1}^{n-j}I_{n3}
+O\Big(\frac{n}{n^{2\alpha}k_n^2}\Big)\Big(\sup_{1\le t\le n}\sum_{s=1}^{t}R_{nz}^{t-s}\Big)^2\Big(\sup_{1\le s\le n}\sum_{m=0}^{s-1}R_n^{s-1-m}\Big)O\Big(\frac{b}{k_n}\Big)\\
&=\frac{C^2}{n^{2\alpha}k_n^2}\sum_{t=1}^{n-j}I_{n3}+O\Big(\frac{b\,n^{1+2\delta+\alpha}}{n^{2\alpha}k_n^2}\Big),
\end{aligned}
\tag{49}
$$
where $I_{n3}=\sum_{i,h,s,g=1}^{t}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_{nz}^{(t+j-i)+(t+j-h)+(t-s)+(t-g)}R_n^{(s-1-m)+(g-1-r)}E(u_{xi}u_{xh}u_{xm}u_{xr})$.

To obtain the order of $I_{n3}$, we show that:

($I_{n3}$-(i)) if $i\neq h\neq m\neq r$, $E(u_{xi}u_{xh}u_{xm}u_{xr})=0$. Thus $I_{n3}=0$.

($I_{n3}$-(ii)) if $i=h=m=r$,
$$
I_{n3}=\sum_{s,g=1}^{t}R_{nz}^{(t-s)+(t-g)}\sum_{m=0}^{s-1}R_{nz}^{2(t+j-m)}R_n^{(s-1-m)+(g-1-m)}E(u_{xm}^4)
\le\Big(\sup_{1\le t\le n}\sum_{s=1}^{t}R_{nz}^{t-s}\Big)^2\Big(\sup_{1\le s\le n}\sum_{m=0}^{s-1}R_n^{s-1-m}\Big)O(1)=O(n^{2\delta+\alpha}).
$$

($I_{n3}$-(iii)) if $i=m\neq h=r$,
$$
I_{n3}=\sum_{s,g=1}^{t}R_{nz}^{(t-s)+(t-g)}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_{nz}^{(t+j-m)+(t+j-r)}R_n^{(s-1-m)+(g-1-r)}E(u_{xm}^2u_{xr}^2)
\le\Big(\sup_{1\le t\le n}\sum_{s=1}^{t}R_{nz}^{t-s}\Big)^2\Big(\sup_{1\le s\le n}\sum_{m=0}^{s-1}R_n^{s-1-m}\Big)^2O(1)=O(n^{2\delta+2\alpha}).
$$

($I_{n3}$-(iv)) if $i=h=m\neq r$,
$$
I_{n3}=\sum_{s,g=1}^{t}R_{nz}^{(t-s)+(t-g)}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_{nz}^{2(t+j-m)}R_n^{(s-1-m)+(g-1-r)}E(u_{xm}^3u_{xr})
\le\Big(\sup_{1\le t\le n}\sum_{s=1}^{t}R_{nz}^{t-s}\Big)^2\Big(\sup_{1\le s\le n}\sum_{m=0}^{s-1}R_n^{s-1-m}\Big)^2O(1)=O(n^{2\delta+2\alpha}).
$$

($I_{n3}$-(v)) if $i=m\neq h\neq r$,
$$
I_{n3}=\sum_{h,s,g=1}^{t}R_{nz}^{(t+j-h)+(t-s)+(t-g)}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_{nz}^{t+j-m}R_n^{(s-1-m)+(g-1-r)}E(u_{xm}^2u_{xh}u_{xr})
\le\sum_{h,s,g=1}^{t}R_{nz}^{(t+j-h)+(t-s)+(t-g)}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_n^{(s-1-m)+(g-1-r)}E(u_{xm}^2u_{xh}u_{xr}).
\tag{50}
$$
Let $d_1=|m-h|$, $d_2=|m-r|$. Using the results of (41), (42) and (50), we obtain
$$
\begin{aligned}
I_{n3}&\le M\Big(\sup_{1\le t\le n}\sum_{g=1}^{t}R_{nz}^{t-g}\Big)\Big(\sup_{1\le s\le n}\sum_{r=0}^{s-1}R_n^{s-1-r}\Big)
\sum_{s=3}^{t}\sum_{m=2}^{s-1}\sum_{h=1}^{m-1}R_{nz}^{(t+j-h)+(t-s)}R_n^{s-1-m}|m-h|^{\lambda}\\
&\quad+M\Big(\sup_{1\le t\le n}\sum_{g=1}^{t}R_{nz}^{t-g}\Big)^2
\sum_{s=4}^{t}\sum_{m=3}^{s-1}\sum_{r=2}^{m-1}R_{nz}^{t-s}R_n^{s-1-m}|m-r|^{\lambda}\\
&\le O(n^{\alpha+\delta})\sum_{s=3}^{t}\sum_{m=2}^{s-1}R_{nz}^{t-s}R_n^{s-1-m}\Big(\sum_{d_1=1}^{m-1}d_1^{\lambda}\Big)
+O(n^{2\delta})\sum_{s=4}^{t}\sum_{m=3}^{s-1}R_{nz}^{t-s}R_n^{s-1-m}\Big(\sum_{d_2=1}^{m-2}d_2^{\lambda}\Big)\\
&\le O(n^{\alpha+\delta})O(n^{\alpha+\delta})+O(n^{2\delta})O(n^{\alpha+\delta})=O(n^{2\alpha+2\delta}).
\end{aligned}
$$

Using the results of $(I_{n3}\text{-(i)})$–$(I_{n3}\text{-(v)})$, (49) yields
$$
\frac{1}{k_n^2}\sum_{t=1}^{n-j}E\Big[\frac{C^2}{n^{2\alpha}}z_{t+j}^2\,\Psi_{nt}^2\Big]
\le O\Big(\frac{n^{1+2\delta+2\alpha}}{n^{2\alpha}k_n^2}\Big)+O\Big(\frac{b\,n^{1+2\delta+\alpha}}{n^{2\alpha}k_n^2}\Big)
=O\Big(\frac{n^{1+2\delta}}{k_n^2}\Big)+O\Big(\frac{b\,n^{1+2\delta}}{n^{\alpha}k_n^2}\Big).
\tag{51}
$$

The fourth term in (37) is
$$
\begin{aligned}
\frac{1}{k_n^2}\sum_{t=1}^{n-j}E\Big[\frac{2C}{n^{\alpha}}z_t^2\,z_{t+j}\Psi_{nt+j}\Big]
&=\frac{2C}{n^{\alpha}k_n^2}\sum_{t=1}^{n-j}E\Big[z_t^2\,E\big(z_{t+j}\Psi_{nt+j}\mid\mathcal{F}_t\big)\Big]\\
&=\frac{2C}{n^{\alpha}k_n^2}\sum_{t=1}^{n-j}
E\Big[\Big(\sum_{i,h=1}^{t}R_{nz}^{(t-i)+(t-h)}u_{xi}u_{xh}\Big)
\Big(\sum_{i=1}^{t}\sum_{h=1}^{t+1}\sum_{k=0}^{h-1}R_{nz}^{(t+j-i)+(t+j-h)}R_n^{h-1-k}u_{xi}u_{xk}+O_p(b^2)\Big)\Big]\\
&=\frac{2C}{n^{\alpha}k_n^2}\sum_{t=1}^{n-j}I_{n4}
+\frac{2C}{n^{\alpha}}\sum_{t=1}^{n-j}\sum_{i,h=1}^{t}R_{nz}^{(t-i)+(t-h)}E\Big(u_{xi}u_{xh}\,O_p\big(\tfrac{b^2}{k_n^2}\big)\Big)\\
&=\frac{2C}{n^{\alpha}k_n^2}\sum_{t=1}^{n-j}I_{n4}+O\Big(\frac{b^2n^{1+\delta}}{n^{\alpha}k_n^2}\Big),
\end{aligned}
\tag{52}
$$
where $I_{n4}=\sum_{i,h,s=1}^{t}\sum_{g=1}^{t+1}\sum_{r=0}^{g-1}R_{nz}^{(t-i)+(t-h)+(t+j-s)+(t+j-g)}R_n^{g-1-r}E(u_{xi}u_{xh}u_{xs}u_{xr})$.

Recall that
$$
I_{n2}=\sum_{i,h,s,g=1}^{t}\sum_{k=0}^{h-1}R_{nz}^{(t-i)+(t-h)+(t+j-s)+(t+j-g)}R_n^{h-1-k}E\big(u_{xi}u_{xk}u_{xs}u_{xg}\big)\le O(n^{\alpha+2\delta}).
$$
We then obtain $I_{n4}\le O(n^{\alpha+2\delta})$. The result of (52) yields
$$
\frac{1}{k_n^2}\sum_{t=1}^{n-j}E\Big[\frac{2C}{n^{\alpha}}z_t^2\,z_{t+j}\Psi_{nt+j}\Big]
\le O\Big(\frac{n^{1+2\delta+\alpha}}{n^{\alpha}k_n^2}\Big)+O\Big(\frac{b^2n^{1+\delta}}{n^{\alpha}k_n^2}\Big)
=O\Big(\frac{n^{1+2\delta}}{k_n^2}\Big)+O\Big(\frac{b^2n^{1+\delta}}{n^{\alpha}k_n^2}\Big).
\tag{53}
$$

The fifth term in (37) is
$$
\begin{aligned}
\frac{1}{k_n^2}\sum_{t=1}^{n-j}E\Big[\frac{4C^2}{n^{2\alpha}}z_t\,z_{t+j}\Psi_{nt}\Psi_{nt+j}\Big]
&=\frac{4C^2}{n^{2\alpha}k_n^2}\sum_{t=1}^{n-j}E\Big[z_t\Psi_{nt}\,E\big(z_{t+j}\Psi_{nt+j}\mid\mathcal{F}_t\big)\Big]\\
&=\frac{4C^2}{n^{2\alpha}k_n^2}\sum_{t=1}^{n-j}
E\Big[\Big(\sum_{i=1}^{t}R_{nz}^{t-i}u_{xi}\Big)\Big(\sum_{s=1}^{t}\sum_{m=0}^{s-1}R_{nz}^{t-s}R_n^{s-1-m}u_{xm}\Big)
\Big(\sum_{h=1}^{t}\sum_{g=1}^{t+1}\sum_{r=0}^{g-1}R_{nz}^{(t+j-h)+(t+j-g)}R_n^{g-1-r}u_{xh}u_{xr}+O_p(b^2)\Big)\Big]\\
&=\frac{4C^2}{n^{2\alpha}k_n^2}\sum_{t=1}^{n-j}I_{n5}
+\frac{4C^2}{n^{2\alpha}k_n^2}\sum_{t=1}^{n-j}\sum_{i,s=1}^{t}\sum_{m=0}^{s-1}R_{nz}^{(t-i)+(t-s)}R_n^{s-1-m}E\big(u_{xi}u_{xm}\,O_p(b^2)\big)\\
&=\frac{4C^2}{n^{2\alpha}k_n^2}\sum_{t=1}^{n-j}I_{n5}+O\Big(\frac{b^2n^{1+\delta}}{n^{\alpha}k_n^2}\Big),
\end{aligned}
\tag{54}
$$
where $I_{n5}=\sum_{i,h,s=1}^{t}\sum_{g=1}^{t+1}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_{nz}^{(t-i)+(t+j-h)+(t-s)+(t+j-g)}R_n^{(s-1-m)+(g-1-r)}E(u_{xi}u_{xh}u_{xm}u_{xr})$.

Recall that
$$
I_{n3}=\sum_{i,h,s,g=1}^{t}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_{nz}^{(t+j-i)+(t+j-h)+(t-s)+(t-g)}R_n^{(s-1-m)+(g-1-r)}E(u_{xi}u_{xh}u_{xm}u_{xr})\le O(n^{2\alpha+2\delta}).
$$
We therefore obtain $I_{n5}\le O(n^{2\alpha+2\delta})$. The result of (54) yields
$$
\frac{1}{k_n^2}\sum_{t=1}^{n-j}E\Big[\frac{4C^2}{n^{2\alpha}}z_t\,z_{t+j}\Psi_{nt}\Psi_{nt+j}\Big]
\le O\Big(\frac{n^{1+2\alpha+2\delta}}{n^{2\alpha}k_n^2}\Big)+O\Big(\frac{b^2n^{1+\delta}}{n^{\alpha}k_n^2}\Big)
=O\Big(\frac{n^{1+2\delta}}{k_n^2}\Big)+O\Big(\frac{b^2n^{1+\delta}}{n^{\alpha}k_n^2}\Big).
\tag{55}
$$

The sixth term in (37) is
$$
\begin{aligned}
\frac{1}{k_n^2}\sum_{t=1}^{n-j}E\Big[\frac{2C^3}{n^{3\alpha}}z_{t+j}\Psi_{nt}^2\Psi_{nt+j}\Big]
&=\frac{2C^3}{n^{3\alpha}k_n^2}\sum_{t=1}^{n-j}E\Big[\Psi_{nt}^2\,E\big(z_{t+j}\Psi_{nt+j}\mid\mathcal{F}_t\big)\Big]\\
&=\frac{2C^3}{n^{3\alpha}k_n^2}\sum_{t=1}^{n-j}
E\Big[\Big(\sum_{h=1}^{t}\sum_{k=0}^{h-1}R_{nz}^{t-h}R_n^{h-1-k}u_{xk}\Big)^2
\Big(\sum_{i=1}^{t}\sum_{g=1}^{t+1}\sum_{r=0}^{g-1}R_{nz}^{(t+j-i)+(t+j-g)}R_n^{g-1-r}u_{xi}u_{xr}+O_p(b^2)\Big)\Big]\\
&=\frac{2C^3}{n^{3\alpha}k_n^2}\sum_{t=1}^{n-j}I_{n6}
+\frac{2C^3}{n^{3\alpha}}\sum_{t=1}^{n-j}\sum_{h,s=1}^{t}\sum_{k=0}^{h-1}\sum_{m=0}^{s-1}R_{nz}^{(t-h)+(t-s)}R_n^{(h-1-k)+(s-1-m)}E\Big(u_{xk}u_{xm}\,O_p\big(\tfrac{b^2}{k_n^2}\big)\Big)\\
&\le\frac{2C^3}{n^{3\alpha}k_n^2}\sum_{t=1}^{n-j}I_{n6}
+O\Big(\frac{n}{n^{3\alpha}}\Big)\Big(\sup_{1\le t\le n}\sum_{h=1}^{t}R_{nz}^{t-h}\Big)^2\Big(\sup_{1\le s\le n}\sum_{m=0}^{s-1}R_n^{2(s-1-m)}\Big)O\Big(\frac{b^2}{k_n^2}\Big)\\
&=\frac{2C^3}{n^{3\alpha}k_n^2}\sum_{t=1}^{n-j}I_{n6}+O\Big(\frac{b^2n^{1+\alpha+2\delta}}{n^{3\alpha}k_n^2}\Big),
\end{aligned}
\tag{56}
$$
where $I_{n6}=\sum_{i,h,s=1}^{t}\sum_{g=1}^{t+1}\sum_{k=0}^{h-1}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_{nz}^{(t+j-i)+(t-h)+(t-s)+(t+j-g)}R_n^{(h-1-k)+(s-1-m)+(g-1-r)}E(u_{xi}u_{xk}u_{xm}u_{xr})$.

To obtain the order of $I_{n6}$, we show that:

($I_{n6}$-(i)) if $i\neq k\neq m\neq r$, $E(u_{xi}u_{xk}u_{xm}u_{xr})=0$. Thus $I_{n6}=0$.

($I_{n6}$-(ii)) if $i=k=m=r$,
$$
I_{n6}=\sum_{h,s=1}^{t}\sum_{g=1}^{t+1}R_{nz}^{(t-h)+(t-s)+(t+j-g)}\sum_{r=0}^{g-1}R_{nz}^{t+j-r}R_n^{(h-1-r)+(s-1-r)+(g-1-r)}E(u_{xr}^4)
\le\Big(\sup_{1\le t\le n}\sum_{h=1}^{t}R_{nz}^{t-h}\Big)^3\Big(\sup_{1\le g\le n}\sum_{r=0}^{g-1}R_n^{g-1-r}\Big)O(1)=O(n^{3\delta+\alpha}).
$$

($I_{n6}$-(iii)) if $i=k\neq m=r$,
$$
I_{n6}=\sum_{h,s=1}^{t}\sum_{g=1}^{t+1}R_{nz}^{(t-h)+(t-s)+(t+j-g)}\sum_{k=0}^{h-1}\sum_{r=0}^{g-1}R_{nz}^{t+j-k}R_n^{h-1-k}R_n^{(s-1-r)+(g-1-r)}E(u_{xk}^2u_{xr}^2)
\le\Big(\sup_{1\le t\le n}\sum_{h=1}^{t}R_{nz}^{t-h}\Big)^3\Big(\sup_{1\le h\le n}\sum_{k=0}^{h-1}R_n^{h-1-k}\Big)^2O(1)=O(n^{3\delta+2\alpha}).
$$

($I_{n6}$-(iv)) if $i=k=m\neq r$,
$$
I_{n6}=\sum_{h,s=1}^{t}\sum_{g=1}^{t+1}R_{nz}^{(t-h)+(t-s)+(t+j-g)}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_{nz}^{t+j-m}R_n^{(h-1-m)+(s-1-m)}R_n^{g-1-r}E(u_{xm}^3u_{xr})
\le\Big(\sup_{1\le t\le n}\sum_{h=1}^{t}R_{nz}^{t-h}\Big)^3\Big(\sup_{1\le s\le n}\sum_{m=0}^{s-1}R_n^{s-1-m}\Big)^2O(1)=O(n^{3\delta+2\alpha}).
$$

($I_{n6}$-(v)) if $i=k\neq m\neq r$,
$$
I_{n6}=\sum_{h,s=1}^{t}\sum_{g=1}^{t+1}\sum_{k=0}^{h-1}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_{nz}^{(t+j-k)+(t-h)+(t-s)+(t+j-g)}R_n^{(h-1-k)+(s-1-m)+(g-1-r)}E(u_{xk}^2u_{xm}u_{xr})
\le\sum_{h,s=1}^{t}\sum_{g=1}^{t+1}\sum_{k=0}^{h-1}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_{nz}^{(t-h)+(t-s)+(t+j-g)}R_n^{(h-1-k)+(s-1-m)+(g-1-r)}E(u_{xk}^2u_{xm}u_{xr}).
\tag{57}
$$
Let $d_1=|k-m|$, $d_2=|k-r|$. Using the results of (41), (42) and (57), we obtain
$$
\begin{aligned}
I_{n6}&\le M\Big(\sup_{1\le t\le n}\sum_{g=1}^{t+1}R_{nz}^{t+j-g}\Big)\Big(\sup_{1\le g\le n}\sum_{r=0}^{g-1}R_n^{g-1-r}\Big)
\sum_{s=1}^{t}\sum_{h=3}^{t}\sum_{k>m=1}^{h-1}R_{nz}^{(t-s)+(t-h)}R_n^{(h-1-k)+(s-1-m)}|k-m|^{\lambda}\\
&\quad+M\Big(\sup_{1\le t\le n}\sum_{s=1}^{t}R_{nz}^{t-s}\Big)\Big(\sup_{1\le s\le n}\sum_{m=0}^{s-1}R_n^{s-1-m}\Big)
\sum_{g=1}^{t+1}\sum_{h=3}^{t}\sum_{k>r=1}^{h-1}R_{nz}^{(t+j-g)+(t-h)}R_n^{(h-1-k)+(g-1-r)}|k-r|^{\lambda}\\
&\le O(n^{\alpha+\delta})\sum_{s=1}^{t}\sum_{h=3}^{t}\sum_{k=2}^{h-1}R_{nz}^{(t-s)+(t-h)}R_n^{h-1-k}\Big(\sum_{d_1=1}^{k-1}d_1^{\lambda}\Big)
+O(n^{\alpha+\delta})\sum_{g=1}^{t+1}\sum_{h=3}^{t}\sum_{k=2}^{h-1}R_{nz}^{(t-g)+(t-h)}R_n^{h-1-k}\Big(\sum_{d_2=1}^{k-1}d_2^{\lambda}\Big)\\
&\le O(n^{\alpha+\delta})O(n^{\alpha+2\delta})+O(n^{\alpha+\delta})O(n^{\alpha+2\delta})=O(n^{2\alpha+3\delta}).
\end{aligned}
$$

Using the results of $(I_{n6}\text{-(i)})$–$(I_{n6}\text{-(v)})$, (56) yields
$$
\frac{1}{k_n^2}\sum_{t=1}^{n-j}E\Big[\frac{2C^3}{n^{3\alpha}}z_{t+j}\Psi_{nt}^2\Psi_{nt+j}\Big]
\le O\Big(\frac{n^{1+2\alpha+3\delta}}{n^{3\alpha}k_n^2}\Big)+O\Big(\frac{b^2n^{1+\alpha+2\delta}}{n^{3\alpha}k_n^2}\Big)
=O\Big(\frac{n^{1+3\delta}}{n^{\alpha}k_n^2}\Big)+O\Big(\frac{b^2n^{1+2\delta}}{n^{2\alpha}k_n^2}\Big).
\tag{58}
$$

The seventh term in (37) is
$$
\begin{aligned}
\frac{1}{k_n^2}\sum_{t=1}^{n-j}E\Big[\frac{C^2}{n^{2\alpha}}z_t^2\,\Psi_{nt+j}^2\Big]
&=\frac{C^2}{n^{2\alpha}k_n^2}\sum_{t=1}^{n-j}E\Big[z_t^2\,E\big(\Psi_{nt+j}^2\mid\mathcal{F}_t\big)\Big]\\
&=\frac{C^2}{n^{2\alpha}k_n^2}\sum_{t=1}^{n-j}
E\Big[\Big(\sum_{i,h=1}^{t}R_{nz}^{(t-i)+(t-h)}u_{xi}u_{xh}\Big)
\Big(\sum_{s,g=1}^{t+1}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_{nz}^{(t+j-s)+(t+j-g)}R_n^{(s-1-m)+(g-1-r)}u_{xm}u_{xr}+O_p(b^3)\Big)\Big]\\
&=\frac{C^2}{n^{2\alpha}k_n^2}\sum_{t=1}^{n-j}I_{n7}
+\frac{C^2}{n^{2\alpha}k_n^2}\sum_{t=1}^{n-j}\sum_{i,h=1}^{t}R_{nz}^{(t-i)+(t-h)}E\big(u_{xi}u_{xh}\,O_p(b^3)\big)\\
&=\frac{C^2}{n^{2\alpha}k_n^2}\sum_{t=1}^{n-j}I_{n7}+O\Big(\frac{b^3n^{1+\delta}}{n^{2\alpha}k_n^2}\Big),
\end{aligned}
\tag{59}
$$
where $I_{n7}=\sum_{i,h=1}^{t}\sum_{s,g=1}^{t+1}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_{nz}^{(t-i)+(t-h)+(t+j-s)+(t+j-g)}R_n^{(s-1-m)+(g-1-r)}E\big(u_{xi}u_{xh}u_{xm}u_{xr}\big)$.

Recall that
$$
I_{n3}=\sum_{i,h,s,g=1}^{t}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_{nz}^{(t+j-i)+(t+j-h)+(t-s)+(t-g)}R_n^{(s-1-m)+(g-1-r)}E(u_{xi}u_{xh}u_{xm}u_{xr})\le O(n^{2\alpha+2\delta}).
$$
We therefore obtain $I_{n7}\le O(n^{2\alpha+2\delta})$. The result of (59) yields
$$
\frac{1}{k_n^2}\sum_{t=1}^{n-j}E\Big[\frac{C^2}{n^{2\alpha}}z_t^2\,\Psi_{nt+j}^2\Big]
\le O\Big(\frac{n^{1+2\alpha+2\delta}}{n^{2\alpha}k_n^2}\Big)+O\Big(\frac{b^3n^{1+\delta}}{n^{2\alpha}k_n^2}\Big)
=O\Big(\frac{n^{1+2\delta}}{k_n^2}\Big)+O\Big(\frac{b^3n^{1+\delta}}{n^{2\alpha}k_n^2}\Big).
\tag{60}
$$

The eighth term in (37) is
$$
\begin{aligned}
\frac{1}{k_n^2}\sum_{t=1}^{n-j}E\Big[\frac{2C^3}{n^{3\alpha}}z_t\Psi_{nt}\Psi_{nt+j}^2\Big]
&=\frac{2C^3}{n^{3\alpha}k_n^2}\sum_{t=1}^{n-j}E\Big[z_t\Psi_{nt}\,E\big(\Psi_{nt+j}^2\mid\mathcal{F}_t\big)\Big]\\
&=\frac{2C^3}{n^{3\alpha}k_n^2}\sum_{t=1}^{n-j}
E\Big[\Big(\sum_{i=1}^{t}R_{nz}^{t-i}u_{xi}\Big)\Big(\sum_{h=1}^{t}\sum_{k=0}^{h-1}R_{nz}^{t-h}R_n^{h-1-k}u_{xk}\Big)
\Big(\sum_{s,g=1}^{t+1}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_{nz}^{(t+j-s)+(t+j-g)}R_n^{(s-1-m)+(g-1-r)}u_{xm}u_{xr}+O_p(b^3)\Big)\Big]\\
&=\frac{2C^3}{n^{3\alpha}k_n^2}\sum_{t=1}^{n-j}I_{n8}
+\frac{2C^3}{n^{2\alpha}}\sum_{t=1}^{n-j}\sum_{i,h=1}^{t}\sum_{k=0}^{h-1}R_{nz}^{(t-i)+(t-h)}R_n^{h-1-k}E\Big(u_{xi}u_{xk}\,O_p\big(\tfrac{b^3}{n^{\alpha}k_n^2}\big)\Big)\\
&=\frac{2C^3}{n^{3\alpha}k_n^2}\sum_{t=1}^{n-j}I_{n8}+O\Big(\frac{b^3n^{1+\delta}}{n^{2\alpha}k_n^2}\Big),
\end{aligned}
\tag{61}
$$
where $I_{n8}=\sum_{i,h=1}^{t}\sum_{s,g=1}^{t+1}\sum_{k=0}^{h-1}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_{nz}^{(t-i)+(t-h)+(t+j-s)+(t+j-g)}R_n^{(h-1-k)+(s-1-m)+(g-1-r)}E(u_{xi}u_{xk}u_{xm}u_{xr})$.

Recall that
$$
I_{n6}=\sum_{i,h,s=1}^{t}\sum_{g=1}^{t+1}\sum_{k=0}^{h-1}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_{nz}^{(t+j-i)+(t-h)+(t-s)+(t+j-g)}R_n^{(h-1-k)+(s-1-m)+(g-1-r)}E(u_{xi}u_{xk}u_{xm}u_{xr})\le O(n^{2\alpha+3\delta}).
$$
We therefore obtain $I_{n8}\le O(n^{2\alpha+3\delta})$. The result of (61) yields
$$
\frac{1}{k_n^2}\sum_{t=1}^{n-j}E\Big[\frac{2C^3}{n^{3\alpha}}z_t\Psi_{nt}\Psi_{nt+j}^2\Big]
\le O\Big(\frac{n^{1+2\alpha+3\delta}}{n^{3\alpha}k_n^2}\Big)+O\Big(\frac{b^3n^{1+\delta}}{n^{2\alpha}k_n^2}\Big)
=O\Big(\frac{n^{1+3\delta}}{n^{\alpha}k_n^2}\Big)+O\Big(\frac{b^3n^{1+\delta}}{n^{2\alpha}k_n^2}\Big).
\tag{62}
$$

The ninth term in (37) is
$$
\begin{aligned}
\frac{1}{k_n^2}\sum_{t=1}^{n-j}E\Big[\frac{C^4}{n^{4\alpha}}\Psi_{nt}^2\Psi_{nt+j}^2\Big]
&=\frac{C^4}{n^{4\alpha}k_n^2}\sum_{t=1}^{n-j}E\Big[\Psi_{nt}^2\,E\big(\Psi_{nt+j}^2\mid\mathcal{F}_t\big)\Big]\\
&=\frac{C^4}{n^{4\alpha}k_n^2}\sum_{t=1}^{n-j}
E\Big[\Big(\sum_{i=1}^{t}\sum_{f=0}^{i-1}R_{nz}^{t-i}R_n^{i-1-f}u_{xf}\Big)^2
\Big(\sum_{s,g=1}^{t+1}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_{nz}^{(t+j-s)+(t+j-g)}R_n^{(s-1-m)+(g-1-r)}u_{xm}u_{xr}+O_p(b^3)\Big)\Big]\\
&=\frac{C^4}{n^{4\alpha}k_n^2}\sum_{t=1}^{n-j}I_{n9}
+\frac{C^4}{n^{4\alpha}k_n^2}\sum_{t=1}^{n-j}\sum_{i,h=1}^{t}\sum_{f=0}^{i-1}\sum_{k=0}^{h-1}R_{nz}^{(t-i)+(t-h)}R_n^{(i-1-f)+(h-1-k)}E\Big(u_{xf}u_{xk}\,O_p\big(\tfrac{b^3}{n^{\alpha}k_n^2}\big)\Big)\\
&\le\frac{C^4}{n^{4\alpha}k_n^2}\sum_{t=1}^{n-j}I_{n9}
+O\Big(\frac{n}{n^{3\alpha}}\Big)\Big(\sup_{1\le t\le n}\sum_{i=1}^{t}R_{nz}^{t-i}\Big)^2\Big(\sup_{1\le h\le n}\sum_{k=0}^{h-1}R_n^{h-1-k}\Big)O\Big(\frac{b^3}{n^{\alpha}k_n^2}\Big)\\
&=\frac{C^4}{n^{4\alpha}k_n^2}\sum_{t=1}^{n-j}I_{n9}+O\Big(\frac{b^3n^{1+\alpha+2\delta}}{n^{4\alpha}k_n^2}\Big),
\end{aligned}
\tag{63}
$$
where $I_{n9}=\sum_{i,h=1}^{t}\sum_{s,g=1}^{t+1}\sum_{f=0}^{i-1}\sum_{k=0}^{h-1}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_{nz}^{(t-i)+(t-h)+(t+j-s)+(t+j-g)}R_n^{(i-1-f)+(h-1-k)+(s-1-m)+(g-1-r)}E(u_{xf}u_{xk}u_{xm}u_{xr})$.

To obtain the order of $I_{n9}$, we show that:

($I_{n9}$-(i)) if $f\neq k\neq m\neq r$, $E(u_{xf}u_{xk}u_{xm}u_{xr})=0$. Thus $I_{n9}=0$.

($I_{n9}$-(ii)) if $f=k=m=r$,
$$
I_{n9}\le\sum_{i,h=1}^{t}\sum_{s,g=1}^{t+1}R_{nz}^{(t-i)+(t-h)+(t+j-s)+(t+j-g)}\sum_{r=0}^{g-1}R_n^{g-1-r}E(u_{xr}^4)
\le\Big(\sup_{1\le t\le n}\sum_{i=1}^{t}R_{nz}^{t-i}\Big)^4\Big(\sup_{1\le g\le n}\sum_{r=0}^{g-1}R_n^{g-1-r}\Big)O(1)=O(n^{4\delta+\alpha}).
$$

($I_{n9}$-(iii)) if $f=k\neq m=r$,
$$
I_{n9}\le\sum_{i,h=1}^{t}\sum_{s,g=1}^{t+1}R_{nz}^{(t-i)+(t-h)+(t+j-s)+(t+j-g)}\sum_{k=0}^{h-1}R_n^{h-1-k}\sum_{r=0}^{g-1}R_n^{g-1-r}E(u_{xk}^2u_{xr}^2)
\le\Big(\sup_{1\le t\le n}\sum_{i=1}^{t}R_{nz}^{t-i}\Big)^4\Big(\sup_{1\le h\le n}\sum_{k=0}^{h-1}R_n^{h-1-k}\Big)^2O(1)=O(n^{4\delta+2\alpha}).
$$

($I_{n9}$-(iv)) if $f=k=m\neq r$,
$$
I_{n9}\le\sum_{i,h=1}^{t}\sum_{s,g=1}^{t+1}R_{nz}^{(t-i)+(t-h)+(t+j-s)+(t+j-g)}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_n^{(s-1-m)+(g-1-r)}E(u_{xm}^3u_{xr})
\le\Big(\sup_{1\le t\le n}\sum_{i=1}^{t}R_{nz}^{t-i}\Big)^4\Big(\sup_{1\le s\le n}\sum_{m=0}^{s-1}R_n^{s-1-m}\Big)^2O(1)=O(n^{4\delta+2\alpha}).
$$

($I_{n9}$-(v)) if $f=k\neq m\neq r$,
$$
I_{n9}\le\sum_{i,h=1}^{t}\sum_{s,g=1}^{t+1}\sum_{k=0}^{h-1}\sum_{m=0}^{s-1}\sum_{r=0}^{g-1}R_{nz}^{(t-i)+(t-h)+(t+j-s)+(t+j-g)}R_n^{(h-1-k)+(s-1-m)+(g-1-r)}E(u_{xk}^2u_{xm}u_{xr}).
\tag{64}
$$
Let $d_1=|k-m|$, $d_2=|k-r|$. Using the results of (41), (42) and (64), we obtain
$$
\begin{aligned}
I_{n9}&\le M\Big(\sup_{1\le t\le n}\sum_{i=1}^{t}R_{nz}^{t-i}\Big)\Big(\sup_{1\le t\le n}\sum_{g=1}^{t+1}R_{nz}^{t+j-g}\Big)\Big(\sup_{1\le g\le n}\sum_{r=0}^{g-1}R_n^{g-1-r}\Big)
\sum_{s=1}^{t+1}\sum_{h=3}^{t}\sum_{k>m=1}^{h-1}R_{nz}^{(t+j-s)+(t-h)}R_n^{(h-1-k)+(s-1-m)}|k-m|^{\lambda}\\
&\quad+M\Big(\sup_{1\le t\le n}\sum_{i=1}^{t}R_{nz}^{t-i}\Big)\Big(\sup_{1\le t\le n}\sum_{s=1}^{t+1}R_{nz}^{t+j-s}\Big)\Big(\sup_{1\le s\le n}\sum_{m=0}^{s-1}R_n^{s-1-m}\Big)
\sum_{g=1}^{t+1}\sum_{h=3}^{t}\sum_{k>r=1}^{h-1}R_{nz}^{(t+j-g)+(t-h)}R_n^{(h-1-k)+(g-1-r)}|k-r|^{\lambda}\\
&\le O(n^{\alpha+2\delta})\sum_{s=1}^{t+1}\sum_{h=3}^{t}\sum_{k=2}^{h-1}R_{nz}^{(t+j-s)+(t-h)}R_n^{h-1-k}\Big(\sum_{d_1=1}^{k-1}d_1^{\lambda}\Big)
+O(n^{\alpha+2\delta})\sum_{g=1}^{t+1}\sum_{h=3}^{t}\sum_{k=2}^{h-1}R_{nz}^{(t+j-g)+(t-h)}R_n^{h-1-k}\Big(\sum_{d_2=1}^{k-1}d_2^{\lambda}\Big)\\
&\le O(n^{\alpha+2\delta})O(n^{\alpha+2\delta})+O(n^{\alpha+2\delta})O(n^{\alpha+2\delta})=O(n^{2\alpha+4\delta}).
\end{aligned}
$$

Using the results of $(I_{n9}\text{-(i)})$–$(I_{n9}\text{-(v)})$, (63) yields
$$
\frac{1}{k_n^2}\sum_{t=1}^{n-j}E\Big[\frac{C^4}{n^{4\alpha}}\Psi_{nt}^2\Psi_{nt+j}^2\Big]
\le O\Big(\frac{n^{1+2\alpha+4\delta}}{n^{4\alpha}k_n^2}\Big)+O\Big(\frac{b^3n^{1+2\delta}}{n^{3\alpha}k_n^2}\Big)
=O\Big(\frac{n^{1+4\delta}}{n^{2\alpha}k_n^2}\Big)+O\Big(\frac{b^3n^{1+2\delta}}{n^{3\alpha}k_n^2}\Big).
\tag{65}
$$

Therefore, by (44), (48), (51), (53), (55), (58), (60), (62), (65), and the fact that $q=O(n)$, (37) yields
$$
\sum_{t=1}^{n-j}E\big(Z_t^{\dagger}Z_{t+j}^{\dagger}\big)^2=O(n),
$$
and thus
$$
Pr\Big(\Big|\frac1q\sum_{t=1}^{n-j}m_t^{\dagger}m_{t+j}^{\dagger}\Big|>\epsilon\Big)
\le\frac{\sum_{t=1}^{n-j}E\big(Z_{t-1}^{\dagger}Z_{t+j-1}^{\dagger}\big)^2}{q^2\epsilon^2}=O\Big(\frac1n\Big).
$$
Under Assumption 4.1,
$$
(III)=\sum_{j=1}^{b-1}\Big(1-\frac jb\Big)\frac1q\sum_{t=1}^{n-j}m_t^{\dagger}m_{t+j}^{\dagger}=O_p\Big(\frac bn\Big)=o_p(1).
\tag{66}
$$
(4) We find the order of $(IV)=\sum_{j=2}^{b-1}\big(1-\frac jb\big)\frac1q\sum_{i=1}^{j-1}\big(m_j^{\dagger}m_{j-i}^{\dagger}+m_{n+1-j}^{\dagger}m_{n+1-j+i}^{\dagger}\big)$. Let $t=j-i$, $1\le t\le b-1-i$. By (66) and the fact that $b=o(n)$, we can obtain that $(IV)=o_p(1)$.

Finally, by the results of (1)–(4), we complete the proof that $\frac1q\sum_{i=1}^{q}B_i^{\dagger2}\to_p\tau(1-\tau)V_{cxz}$.

Lemma A.5 The long-run variance of $m_t^{\dagger}$ satisfies:
$$
\frac1n\sum_{s,t=1}^{n}\mathrm{Cov}\big(m_t^{\dagger},m_s^{\dagger}\big)=O_p(1).
\tag{67}
$$

Proof of Lemma A.5. Using Lemma A.1 and the conditional moment restriction $E(\psi_\tau(u_{t\tau})\mid\mathcal{F}_{t-1})=0$,
$$
\begin{aligned}
\frac1n\sum_{s,t=1}^{n}\mathrm{Cov}\big(Z_{t-1}^{\dagger}\psi_\tau(u_{t\tau}),\,Z_{s-1}^{\dagger}\psi_\tau(u_{s\tau})\big)
&=\frac1n\sum_{t=1}^{n}E\big[Z_{t-1}^{\dagger}Z_{t-1}^{\dagger\prime}\,E\big(\psi_\tau^2(u_{t\tau})\mid\mathcal{F}_{t-1}\big)\big]\\
&\quad+\frac2n\sum_{s<t}^{n}E\big[Z_{t-1}^{\dagger}Z_{s-1}^{\dagger\prime}\,\psi_\tau(u_{s\tau})\,E\big(\psi_\tau(u_{t\tau})\mid\mathcal{F}_{t-1}\big)\big]=O_p(1).
\end{aligned}
$$

Lemma A.6 (Serfling (1980, p. 15, Lemma B)) Suppose that $\sup_n E|X_{nr}|^{1+\epsilon}<\infty$ for some $\epsilon>0$. Then the sequence $\{X_{nr}\}$ is uniformly integrable.

A.2

Proofs for Main Theorems

Proof of Theorem 3.1. For I(0) $x_t$, we have
$$
\sqrt n\big(\hat\beta_\tau^{QR}-\beta_\tau\big)=\Omega_1^{-1}\frac{1}{\sqrt n}\sum_{t=1}^{n}\psi_\tau(u_{t\tau})X_{t-1}+o_p(1),
$$
where $\Omega_1=f_\epsilon\big(F_\epsilon^{-1}(\tau)\big)\,E\Big[\dfrac{X_{t-1}X_{t-1}'}{\sigma_t}\Big]$ and, by the MG CLT,
$$
\frac{1}{\sqrt n}\sum_{t=1}^{n}\psi_\tau(u_{t\tau})x_{t-1}\Longrightarrow N\Big(0,\tau(1-\tau)E\big[X_{t-1}X_{t-1}'\big]\Big).
$$
Thus
$$
\sqrt n\big(\hat\beta_\tau^{QR}-\beta_\tau\big)\Longrightarrow
N\Bigg(0,\ \frac{\tau(1-\tau)}{f_\epsilon\big(F_\epsilon^{-1}(\tau)\big)^2}
\,E\Big[\frac{X_{t-1}X_{t-1}'}{\sigma_t}\Big]^{-1}E\big[X_{t-1}X_{t-1}'\big]\,E\Big[\frac{X_{t-1}X_{t-1}'}{\sigma_t}\Big]^{-1}\Bigg).
$$
In a similar way, for (MI) $x_t$,
$$
\sqrt{nk_n}\big(\hat\beta_\tau^{QR}-\beta_\tau\big)\Longrightarrow
N\Bigg(0,\ \frac{\tau(1-\tau)}{f_\epsilon\big(F_\epsilon^{-1}(\tau)\big)^2}\,\tilde V_{xx}^{-1}V_{xx}\tilde V_{xx}^{-1}\Bigg),
$$
with $V_{xx}=\int_0^{\infty}e^{rC}\Omega_{xx}e^{rC}dr$ and $\tilde V_{xx}=E\big[\tfrac{1}{\sigma_t}\big]V_{xx}$. See also Kostakis et al. (2014, Lemma B.4).

The I(1) case is already discussed in Section 3.1 and so is omitted.

Proof of Theorem 3.2. Following the proof of Lee (2016) but using the CHE innovation specification, we have
$$
(nk_n)^{1/2}\big(\hat\beta_{1\tau}^{IVXQR}-\beta_{1,\tau}\big)=\big(\tilde M_{\beta_\tau,n}\big)^{-1}G_{\tau,n}+o_p(1),
$$
where
$$
\tilde M_{\beta_\tau,n}:=\sum_{t=1}^{n}\tilde f_{u_{t\tau},t-1}(0)\,\tilde Z_{t-1,n}\tilde Z_{t-1,n}',
\qquad
\tilde f_{u_{t\tau},t-1}(0)=f_\epsilon\big(F_\epsilon^{-1}(\tau)\big)\frac{1}{\sigma_t},
$$
hence
$$
\tilde M_{\beta_\tau,n}\to_p\tilde M_{\beta_\tau}\equiv
\begin{cases}
f_\epsilon\big(F_\epsilon^{-1}(\tau)\big)\,E\Big[\dfrac{X_{t-1}X_{t-1}'}{\sigma_t}\Big], & \text{for (I0)},\\[4pt]
f_\epsilon\big(F_\epsilon^{-1}(\tau)\big)\,E\Big[\dfrac{1}{\sigma_t}\Big]V_{cxz}, & \text{for (MI) and (I1)}.
\end{cases}
$$
Moreover,
$$
G_{\tau,n}:=\sum_{t=1}^{n}\tilde Z_{t-1,n}\psi_\tau(u_{t\tau})\Longrightarrow G_\tau\equiv
\begin{cases}
N\big(0,\tau(1-\tau)E[X_{t-1}X_{t-1}']\big) & \text{for (I0)},\\
N\big(0,\tau(1-\tau)V_{cxz}\big) & \text{for (MI) and (I1)}.
\end{cases}
$$
Thus, the desired result follows.

Theorem A.2 (Stability Condition) Let $\bar m_\ell^{*\dagger}$ denote the average of resampled data under the MBB, $\bar m_\ell^{*\dagger}=(1/\ell)\sum_{t=1}^{\ell}m_t^{*\dagger}$. Under Assumptions 2.1, 3.1 and 4.1,
$$
\ell\,\mathrm{Var}^*\big(\bar m_\ell^{*\dagger}\big)\to_p\tau(1-\tau)V_{cxz},\qquad\text{as } n\to\infty.
\tag{68}
$$

Proof of Theorem A.2. Using the fact that the $\tilde Y_i^{\dagger}$ are iid under $P^*$, we can observe
$$
\ell\,\mathrm{Var}^*\big(\bar m_\ell^{*\dagger}\big)=\ell\,\mathrm{Var}^*\Big(\frac{1}{\sqrt b\,m}\sum_{i=1}^{m}\tilde Y_i^{\dagger}\Big)=\mathrm{Var}^*\big(\tilde Y_1^{\dagger}\big).
$$
By Corollary A.1 and the fact that $P^*(\tilde Y_1^{\dagger}=B_j^{\dagger})=1/q$, $1\le j\le q$,
$$
\begin{aligned}
\mathrm{Var}^*\big(\tilde Y_1^{\dagger}\big)
&=\frac1q\sum_{i=1}^{q}\big(B_i^{\dagger}-E^*\tilde Y_1^{\dagger}\big)^2 &(69)\\
&=\frac1q\sum_{i=1}^{q}\Big(B_i^{\dagger}-\sqrt b\,\bar m_n^{\dagger}-O_p\big(\tfrac bn\big)\Big)^2 &(70)\\
&=\frac1q\sum_{i=1}^{q}\big(B_i^{\dagger}-\sqrt b\,\bar m_n^{\dagger}\big)^2
-\frac2q\sum_{i=1}^{q}\big(B_i^{\dagger}-\sqrt b\,\bar m_n^{\dagger}\big)O_p\Big(\frac bn\Big)+O_p\Big(\Big(\frac bn\Big)^2\Big). &(71)
\end{aligned}
$$
Providing that $\sqrt b\,\bar m_n^{\dagger}=\sqrt{\tfrac bn}\cdot n^{-1/2}\sum_{t=1}^{n}m_t^{\dagger}=O_p\big(\sqrt{\tfrac bn}\big)$, and that $b=o(n)$,
$$
\mathrm{Var}^*\big(\tilde Y_1^{\dagger}\big)=\frac1q\sum_{i=1}^{q}\big(B_i^{\dagger}-\sqrt b\,\bar m_n^{\dagger}\big)^2+o_p(1).
\tag{72}
$$
In order to prove (68), it suffices to show
$$
\frac1q\sum_{i=1}^{q}\big(B_i^{\dagger}-b^{1/2}\bar m_n^{\dagger}\big)^2\to_p\tau(1-\tau)V_{cxz}.
\tag{73}
$$
This can be decomposed further as:
$$
\frac1q\sum_{i=1}^{q}\big(B_i^{\dagger}-\sqrt b\,\bar m_n^{\dagger}\big)^2
=\frac1q\sum_{i=1}^{q}B_i^{\dagger2}-\frac2q\sum_{i=1}^{q}B_i^{\dagger}\sqrt b\,\bar m_n^{\dagger}+\frac1q\sum_{i=1}^{q}\big(\sqrt b\,\bar m_n^{\dagger}\big)^2
\equiv A_n-2C_n+D_n.
$$
Next we will show: (a) $A_n\to_p\tau(1-\tau)V_{cxz}$; (b) $C_n\to_p0$; (c) $D_n\to_p0$.

(a) This result is obtained by Lemma A.4.

(b)
$$
\begin{aligned}
C_n&=\Big(\sqrt b\,\bar m_n^{\dagger}+O_p\big(\tfrac bn\big)\Big)\cdot\sqrt b\,\bar m_n^{\dagger} &(74)\\
&=\Big(\sqrt{\tfrac bn}\cdot\frac{1}{\sqrt n}\sum_{t=1}^{n}m_t^{\dagger}+O_p\big(\tfrac bn\big)\Big)\cdot\sqrt{\tfrac bn}\cdot\frac{1}{\sqrt n}\sum_{t=1}^{n}m_t^{\dagger} &(75)\\
&=\frac bn\Big(\frac{1}{\sqrt n}\sum_{t=1}^{n}m_t^{\dagger}\Big)^2+O_p\Big(\Big(\frac bn\Big)^{3/2}\Big)\cdot\frac{1}{\sqrt n}\sum_{t=1}^{n}m_t^{\dagger} &(76)\\
&=o_p(1)\cdot O_p(1) &(77)\\
&=o_p(1). &(78)
\end{aligned}
$$

(c)
$$
D_n=\frac1q\sum_{i=1}^{q}b\,\bar m_n^{\dagger2}=\frac bn\Big(\frac{1}{\sqrt n}\sum_{t=1}^{n}m_t^{\dagger}\Big)^2=o_p(1)O_p(1)=o_p(1).
$$
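The moving block bootstrap analyzed in Theorem A.2 and below can be sketched in a few lines. The following is our own minimal illustration of the resampling scheme, not the paper's implementation: `mbb_mean_dist` draws $m=\lfloor n/b\rfloor$ overlapping blocks of length $b$ uniformly from the $q=n-b+1$ admissible blocks and returns centered, $\sqrt\ell$-scaled resampled means; all function names are ours.

```python
import numpy as np

def mbb_resample_mean(x, b, rng):
    """One MBB draw: concatenate m = n // b randomly chosen length-b blocks and average."""
    n = len(x)
    m = n // b                                   # number of blocks in one resample
    starts = rng.integers(0, n - b + 1, size=m)  # q = n - b + 1 admissible block starts
    return np.concatenate([x[s:s + b] for s in starts]).mean()

def mbb_mean_dist(x, b, n_boot=500, seed=0):
    """Bootstrap draws of sqrt(l) * (resampled mean - sample mean), l = (n // b) * b."""
    rng = np.random.default_rng(seed)
    ell = (len(x) // b) * b
    return np.array([np.sqrt(ell) * (mbb_resample_mean(x, b, rng) - x.mean())
                     for _ in range(n_boot)])
```

For weakly dependent data the empirical variance of these draws mimics the long-run variance, which is exactly the stability property (68); the Lindeberg argument of Theorem 4.1 then upgrades this to convergence of the whole bootstrap distribution.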

Proof of Theorem 4.1. Since $\sqrt n\big(\bar m_n^{\dagger}-Em_t^{\dagger}\big)\to_D G_\tau$ (Lee (2016), Lemma A.2), where
$$
G_\tau=
\begin{cases}
N\big(0,\tau(1-\tau)E[X_{t-1}X_{t-1}']\big) & \text{for (I0)},\\
N\big(0,\tau(1-\tau)V_{cxz}\big) & \text{for (MI) and (I1)},
\end{cases}
$$
and $G_\tau$ is a continuous distribution, by Pólya's Theorem,
$$
\sup_{x\in\mathbb R}\Big|P\big(\sqrt n(\bar m_n^{\dagger}-Em_t^{\dagger})\le x\big)-G_\tau(x)\Big|\to_p0,\qquad\text{as } n\to\infty.
\tag{79}
$$
Hence, it suffices to show that
$$
\sup_{x\in\mathbb R}\Big|P^*\big(\sqrt\ell(\bar m_\ell^{*\dagger}-\bar m_n^{\dagger})\le x\big)-G_\tau(x)\Big|\to_p0,\qquad\text{as } n\to\infty.
\tag{80}
$$
Recall that $\tilde Y_i^{\dagger}=\frac{1}{\sqrt b}\sum_{j=i}^{i+b-1}m_j^{*\dagger}$. We can obtain
$$
\sqrt\ell\,\bar m_\ell^{*\dagger}=\frac{1}{\sqrt m}\sum_{i=1}^{m}\tilde Y_i^{\dagger}.
\tag{81}
$$
Let $\hat\Delta_n(\lambda):=\frac1m\sum_{i=1}^{m}E^*\big[(\tilde Y_i^{\dagger}-E^*\tilde Y_i^{\dagger})^2\,1\big(|\tilde Y_i^{\dagger}-E^*\tilde Y_i^{\dagger}|>\lambda\sqrt m\big)\big]$, for $\lambda>0$. Using the fact that $\tilde Y_i^{\dagger}$, $i=1,2,\dots,m$, are iid under $P^*$, and that $P^*(\tilde Y_1^{\dagger}=B_j^{\dagger})=1/q$, $1\le j\le q$, for any $\epsilon>0$,
$$
\begin{aligned}
Pr\big(\hat\Delta_n(\lambda)>\epsilon\big)
&\le\epsilon^{-1}E\big(\hat\Delta_n(\lambda)\big)\\
&=\epsilon^{-1}E\Big\{E^*\Big[\big(\tilde Y_1^{\dagger}-E^*\tilde Y_1^{\dagger}\big)^2\,1\big(\big|\tilde Y_1^{\dagger}-E^*\tilde Y_1^{\dagger}\big|>\lambda\sqrt m\big)\Big]\Big\}\\
&=\epsilon^{-1}E\Big\{\frac1q\sum_{i=1}^{q}\Big(B_i^{\dagger}-\frac1q\sum_{j=1}^{q}B_j^{\dagger}\Big)^2\,
1\Big(\Big|B_i^{\dagger}-\frac1q\sum_{j=1}^{q}B_j^{\dagger}\Big|>\lambda\sqrt m\Big)\Big\}\\
&\le4\epsilon^{-1}q^{-1}\sum_{i=1}^{q}\Big[E\Big(B_i^{\dagger2}\,1\big(|B_i^{\dagger}|>(\lambda/2)\sqrt m\big)\Big)
+E\Big(\Big(\frac1q\sum_{j=1}^{q}B_j^{\dagger}\Big)^2\,1\Big(\Big|\frac1q\sum_{j=1}^{q}B_j^{\dagger}\Big|>(\lambda/2)\sqrt m\Big)\Big)\Big]\\
&\le4\epsilon^{-1}q^{-1}\sum_{i=1}^{q}\Big[E\Big(B_i^{\dagger2}\,1\big(|B_i^{\dagger}|>(\lambda/2)\sqrt m\big)\Big)
+E\Big(\frac1q\sum_{j=1}^{q}B_j^{\dagger}\Big)^2\Big]\\
&=o(1),
\end{aligned}
$$
since
$$
\lim_{n\to\infty}E\Big[(B_i^{\dagger})^2\,1\big(|B_i^{\dagger}|>(\lambda/2)\sqrt m\big)\Big]=0,
\tag{82}
$$
and, by Lemma A.2, $b=o(n)$, $E(m_t^{\dagger})^2=O(1)$ for $1\le t\le n$, and $E(m_i^{\dagger}m_j^{\dagger})=0$ for $i\neq j$, $1\le i,j\le n$,
$$
\begin{aligned}
E\Big(\frac1q\sum_{j=1}^{q}B_j^{\dagger}\Big)^2
&=E\Big\{\frac{\sqrt b}{q}\sum_{t=1}^{n}m_t^{\dagger}-\frac{1}{q\sqrt b}\sum_{j=1}^{b-1}(b-j)\big(m_j^{\dagger}+m_{n-j+1}^{\dagger}\big)\Big\}^2\\
&\le\frac{b}{n^2}\sum_{t=1}^{n}E(m_t^{\dagger})^2+\frac{2b}{n^2}\sum_{j=1}^{b-1}\Big[E(m_j^{\dagger})^2+E(m_{n-j+1}^{\dagger})^2\Big]
+\frac{b}{n^2}\sum_{j=1}^{b-1}E\big(m_j^{\dagger}+m_{n-j+1}^{\dagger}\big)^2\\
&=O\Big(\frac bn\Big)+O\Big(\frac{b^2}{n^2}\Big)=o(1).
\end{aligned}
$$
In order to prove (82), we define the triangular array $\{V_{n,i}:i=1,\dots,n-b+1\}$ as
$$
V_{n,i}=B_i^{\dagger2}=\Big(\frac{1}{\sqrt b}\sum_{t=i}^{i+b-1}m_t^{\dagger}\Big)^2.
$$
For fixed $i=1,\dots,q$, since $E(m_t^{\dagger})^2<\infty$, we observe that
$$
\frac{1}{\sqrt b}\sum_{t=i}^{i+b-1}m_t^{\dagger}\to_d N\big(0,E(m_t^{\dagger})^2\big),
\qquad\text{and}\qquad
E\Big[\Big(\frac{1}{\sqrt b}\sum_{t=i}^{i+b-1}m_t^{\dagger}\Big)^2\Big]=E(m_t^{\dagger})^2<\infty.
$$
Hence, by Serfling (1980, page 15, Lemma B), the sequences $\{V_{n,i}\}_{n=n_1}^{\infty},\dots,\{V_{n,i}\}_{n=n_q}^{\infty}$ are uniformly integrable, which implies that, for $\lambda>0$,
$$
E\Big[V_{n,i}\,1\Big(V_{n,i}>\big(\tfrac\lambda2\big)^2m\Big)\Big]
=E\Big[B_i^{\dagger2}\,1\Big(B_i^{\dagger2}>\big(\tfrac\lambda2\big)^2m\Big)\Big]\to0,\qquad\text{as } m\to\infty,
$$
where $m=n/b\to\infty$ as $n\to\infty$. Using this result yields
$$
\lim_{n\to\infty}E\Big[(B_i^{\dagger})^2\,1\big(|B_i^{\dagger}|>(\lambda/2)\sqrt m\big)\Big]=0.
$$
Therefore, we conclude that $\hat\Delta_n(\lambda)=o_p(1)$, which implies the Lindeberg condition:
$$
\frac1\ell\sum_{t=1}^{\ell}E^*\Big[\big(m_t^{*\dagger}-E^*m_t^{*\dagger}\big)^2\,
1\big(\big|m_t^{*\dagger}-E^*m_t^{*\dagger}\big|>\lambda\sqrt\ell\big)\Big]=o_p(1),\qquad\text{as } n\to\infty.
\tag{83}
$$
Next, by the Lindeberg Central Limit Theorem, using Theorem A.2 along with the Lindeberg condition (83), the conditional distribution of $\sqrt\ell\big(\bar m_\ell^{*\dagger}-\bar m_n^{\dagger}\big)$ given $(m_1^{\dagger},\dots,m_n^{\dagger})$ converges to $G_\tau$ as $n\to\infty$. Hence, by Pólya's Theorem, (80) follows.

Proof of Theorem 4.2. Let $u_{t\tau}^{*}:=u_t^{*}-\tilde z_{t-1}^{*\prime}\big(\hat\beta_\tau-\beta_\tau\big)$ and $\hat\theta_\tau:=\hat\beta_\tau-\beta_\tau$,

Next, by Lindeberg’s Central Limit Theorem, using Theorem A.2, along with the Lindeberg’s √ condition (83), yields that the conditional distribution of `(m ¯ †n ) given (m†1 , ..., m†n ) converges ¯ ∗† ` −m to Gτ , as n → ∞. Hence, by P oly¯ a’s Theorem, (80) follows. ∗0 (β − β ), θ ˆτ := βˆτ − βτ , Proof of Theorem 4.2 Let u∗tτ := u∗t − z˜t−1 τ

λ∗` (βˆτ ) := `−1

` X

∗† ∗0 ˆ Et−1 Zt−1 ψτ (u∗tτ − z˜t−1 θτ ),

t=1

and λn (βˆτ ) := n−1

n X

† 0 Et−1 Zt−1 ψτ (utτ − z˜t−1 θˆτ ).

t=1

We prove the following claims 1-6 and derive the main results. 1. βˆτ∗ − βτ →P 0: Following the argument in Fitzenberger (1997, proof of Theorem 3.3), we

53

obtain: βˆτ∗ − βˆτ →P ∗ 0. Combined with the consistency of βˆτ for βτ shown in Lee (2016), the result follows. P P ∗† † 2. `1/2 [1/` `t=1 Zt−1 ψτ (u∗tτ ) − 1/n nt=1 Zt−1 ψτ (utτ )] =⇒ Gτ under P ∗ , where  N (0, τ (1 − τ )E X X 0 ), for (I0) t−1 t−1 Gτ = N (0, τ (1 − τ )V ), for (MI) and (I1). cxz : This result is obtained following Theorem 4.1 and Theorem 3.2. P † 3. n−1/2 nt=1 Zt−1 ψτ (utτ ) + n1/2 λn (βˆτ ) = op (1): This is shown in Lee (2016, proof of Theorem 3.1, (A.1)). 4. λ∗` (βτ ) = λn (βτ ) + O(b/n), 5. ∂λ∗` (βτ )/∂βτ = ∂λn (βτ )/∂βτ + O(b/n): These two results directly follow from Fitzenberger (1997, proof of Theorem 3.3), using argument in Lemma A.2 and Corollary A.1. P ∗ ψ (u∗ ) + `1/2 λ∗ (β ˆ∗ 6. `−1/2 `t=1 Zt−1 τ tτ ` τ ) = op (1): Recall that βˆτ∗ = arg min β

` X

 ∗ ρτ yt∗ − β 0 z˜t−1 .

t=1

We obtain the approximate FOC: `

−1/2

` X

∗† ∗0 ˆ∗ Zt−1 [τ − 1(yt∗ ≤ z˜t−1 βτ )] = op (1),

(84)

t=1

which implies that op (1) = `−1/2 = `−1/2

=

` X t=1 ` X

∗† ∗0 ˆ∗ ψτ (u∗tτ − z˜t−1 θτ ) Zt−1

(85)

∗† ∗0 ˆ∗ ∗0 ˆ∗ Zt−1 {ψτ (u∗tτ − z˜t−1 θτ ) − Et−1 ψτ (u∗tτ − z˜t−1 θτ )

(86)

t=1 −ψτ (u∗tτ ) + Et−1 ψτ (u∗tτ )} ` ` X X ∗† ∗† −1/2 ∗ ∗0 ˆ∗ −1/2 +` Zt−1 Et−1 ψτ (utτ − z˜t−1 θτ ) + ` Zt−1 ψτ (u∗tτ ) t=1 t=1 ` X ∗† `1/2 λ∗` (βˆτ∗ ) + `−1/2 Zt−1 ψτ (u∗tτ ) + op (1). t=1

(87) (88)

(89)

The main results of Theorem 4.2 now follow by claims 1-6. Note that, by the Taylor expansion

54

of λn (βˆτ∗ ) around βˆτ : λ∗` (βˆτ∗ ) = λn (βˆτ∗ ) + O(b/n) ∂λn (βτ ) = λn (βˆτ ) + | ˆ (βˆ∗ − βˆτ ) + op (βˆτ∗ − βˆτ ) + O(b/n) ∂βτ βτ τ n X ∂λn (βτ ) † | ˆ (βˆ∗ − βˆτ ) + op (βˆτ∗ − βˆτ ) + O(b/n) = [−n−1 Zt−1 ψτ (utτ ) + n−1/2 op (1)] + ∂βτ βτ τ

(by 3).

t=1

Then plug this result of λ∗` (βˆτ∗ ) in (89) op (1) = `1/2 λ∗` (βˆτ∗ ) + `−1/2

` X

∗† Zt−1 ψτ (u∗tτ )

(90)

t=1

" = `1/2

` 1X

`

t=1

n

1X † ∗† Zt−1 ψτ (u∗tτ ) − Zt−1 ψτ (utτ ) n

# (91)

t=1

∂λn (βτ ) | ˆ · `1/2 (βˆτ∗ − βˆτ ) + `1/2 op (βˆτ∗ − βˆτ ) + `1/2 O(b/n) + op (1) ∂βτ βτ # " n 1X † †0 0 ∗ zt−1 (βˆτ − βτ )) (`kn )1/2 (βˆτ∗ − βˆτ ) = Gτ,n − Zt−1 Zt−1 futτ ,t−1 (˜ n +

(92) (93)

t=1

1/2

+(`kn ) =

G∗τ,n

(βˆτ∗ − βˆτ )kn−1/2 op (1) + op (1)

(94)

(βˆτ∗

(95)

˜ ˆ · (`kn ) −M βτ ,n

1/2

− βˆτ ) + op (1)

√ † ∗† where Zt−1 = z˜t−1 / kn and similarly for Zt−1 . Now " G∗τ,n = (`/kn )1/2

# ` n X 1 X ∗† 1 † Zt−1 ψτ (u∗tτ ) − Zt−1 ψτ (utτ ) →D Gτ ` n t=1

under P ∗ by 2,

t=1

and ˜ ˆ = (1/n) M βτ ,n

n X

† †0 0 ˜β Zt−1 Zt−1 futτ ,t−1 (˜ zt−1 (βˆτ −βτ )) →p M τ

t=1

Therefore, the results in Theorem 4.2 follows.

55

 i 0  h fε Fε−1 (τ ) E Xt−1 Xt−1 , for (I0) h i σt ≡ f F −1 (τ ) E 1 V , for (MI) and (I1). ε cxz ε σt

References [1] L. I. Boneva, D. Kendall, and I. Stefanov. Spline transformations: Three new diagnostic aids for the statistical data-analyst. Journal of the Royal Statistical Society. Series B (Methodological), 33(1):1–71, 1971. [2] J. Y. Campbell and M. Yogo. Efficient tests of stock return predictability. Journal of Financial Economics, 81(1):27–60, 2006. [3] E. Carlstein. The use of subseries values for estimating the variance of a general statistic from a stationary sequence. The Annals of Statistics, 14(3):1171–1179, 1986. [4] M. Carrasco and X. Chen. Mixing and moment properties of various GARCH and stochastic volatility models. Econometric Theory, 18(1):17–39, 2002. [5] Y. Choi, S. Jacewitz, and J. Y. Park. A reexamination of stock return predictability. Journal of Econometrics, 192(1):168–189, 2016. [6] B. Efron and R. J. Tibshirani. An introduction to the bootstrap. Chapman & Hall, 1993. [7] J. Fan and Q. Yao. Nonlinear Time Series: Nonparametric and Parametric Methods. Springer, New York, 2003. [8] B. Fitzenberger. The moving blocks bootstrap and robust inference for linear least squares and quantile regressions. Journal of Econometrics, 82(2):235–287, 1997. [9] A. Goyal and I. Welch. Predicting the equity premium with dividend ratios. Management Science, 49(5):639–654, 2003. [10] P. Hall, J. L. Horowitz, and B.-Y. Jing. On blocking rules for the bootstrap with dependent data. Biometrika, 82(3):561–574, 1995. [11] W. Hendricks and R. Koenker. Hierarchical spline models for conditional quantiles and the demand for electricity. Journal of the American Statistical Association, 87(417):58–68, 1992. [12] Y. M. Kim and D. J. Nordman. Properties of a block bootstrap under long-range dependence. Sankhy¯ a Series A, 73(1):79–109, 2011. [13] K. Knight. Limiting distributions for L1 regression estimators under general conditions. The Annals of Statistics, 26(2):755–770, 1998. [14] R. Koenker. Quantile regression. Cambridge University Press, 2005. [15] R. 
Koenker and G. Bassett Jr. Regression quantiles. Econometrica, 46(1):33–50, 1978. [16] R. Koenker and G. Bassett Jr. Robust tests for heteroscedasticity based on regression quantiles. Econometrica, 50(1):43–61, 1982.

[17] R. Koenker and Q. Zhao. Conditional quantile estimation and inference for ARCH models. Econometric Theory, 12(5):793–813, 1996. [18] A. Kostakis, T. Magdalinos, and M. P. Stamatogiannis. Robust econometric inference for stock return predictability. The Review of Financial Studies, 28(5):1506–1553, 2014. [19] H. R. K¨ unsch. The jackknife and the bootstrap for general stationary observations. The Annals of Statistics, 17(3):1217–1241, 1989. [20] S. Lahiri. On second-order properties of the stationary bootstrap method for studentized statistics. In S. Ghosh, editor, Asymptotics, Nonparametrics, and Time Series, pages 683–711. Marcel Dekker, New York, 1999. [21] S. Lahiri. Consistency of the jackknife-after-bootstrap variance estimator for the bootstrap quantiles of a studentized statistic. The Annals of Statistics, 33(5):2475–2506, 2005. [22] J. H. Lee. Online supplement to “predictive quantile regression with persistent covariates: IVX-QR approach”. Available at https://sites.google.com/site/jihyung412/research. 2014. [23] J. H. Lee. Predictive quantile regression with persistent covariates: IVX-QR approach. Journal of Econometrics, 192(1):105–118, 2016. [24] A. M. Lindner. Continuous time approximations to GARCH and stochastic volatility models. In T. Andersen, R. Davis, J.-P. Kreiss, and T. Mikosch, editors, Handbook of Financial Time Series, pages 481–496. Springer, New York, 2009. [25] R. Y. Liu and K. Singh. Moving blocks jackknife and bootstrap capture weak dependence. In R. Lepage and L. Billard, editors, Exploring The Limits of Bootstrap, pages 225–248. Wiley, New York, 1992. [26] A. Maynard, K. Shimotsu, and Y. Wang. Inference in predictive quantile regressions. Unpublished Manuscript, 2011. [27] P. C. Phillips. Towards a unified asymptotic theory for autoregression. Biometrika, 74(3):535– 547, 1987. [28] P. C. Phillips and J. H. Lee. Predictive regression under various degrees of persistence and robust long-horizon regression. 
Journal of Econometrics, 177(2):250–264, 2013. [29] P. C. Phillips and J. H. Lee. Robust econometric inference with mixed integrated and mildly explosive regressors. Journal of Econometrics, 192(2):433–450, 2016. [30] P. C. Phillips, T. Magdalinos, et al. Econometric inference in the vicinity of unity. Singapore Management University, CoFie Working Paper, 7, 2009.


[31] D. N. Politis and J. P. Romano. A circular block-resampling procedure for stationary data. In R. Lepage and L. Billard, editors, Exploring The Limits of Bootstrap, pages 263–270. Wiley, New York, 1992. [32] D. N. Politis and J. P. Romano. The stationary bootstrap. Journal of the American Statistical Association, 89(428):1303–1313, 1994. [33] D. Pollard. Asymptotics for least absolute deviation regression estimators. Econometric Theory, 7(2):186–199, 1991. [34] R. J. Serfling. Approximation theorems of mathematical statistics. Wiley, New York, 1980. [35] M. M. Siddiqui. Distribution of quantiles in samples from a bivariate population. J. Res. Nat. Bur. Standards, 64B:145–150, 1960. [36] I. Welch and A. Goyal. A comprehensive look at the empirical performance of equity premium prediction. The Review of Financial Studies, 21(4):1455–1508, 2008. [37] A. Welsh. Kernel estimates of the sparsity function. In Y. Dodge, editor, Statistical Data Analysis Based on the L1-Norm and Related Methods, pages 369–377. Elsevier, New York, 1987. [38] W. B. Wu. Nonlinear system theory: Another look at dependence. Proceedings of the National Academy of Sciences of the United States of America, 102(40):14150–14154, 2005. [39] Z. Xiao. Quantile cointegrating regression. Journal of Econometrics, 150(2):248–260, 2009. [40] Z. Xiao and R. Koenker. Conditional quantile estimation for generalized autoregressive conditional heteroscedasticity models. Journal of the American Statistical Association, 104(488):1696–1712, 2009.
