Estimation in Semiparametric Time Series

The University of Adelaide School of Economics

Research Paper No. 2010-27 October 2010

Estimation in Semiparametric Time Series Regression

Jia Chen, Jiti Gao¹ and Degui Li
The University of Adelaide, SA 5005, Australia

Abstract

In this paper, we consider a semiparametric time series regression model and establish a set of identification conditions such that the model under discussion is both identifiable and estimable. We then discuss how to estimate a sequence of local alternative functions nonparametrically when the null hypothesis does not hold. An asymptotic theory is established in each case. An empirical application is also included.

¹ Jiti Gao is from the School of Economics, The University of Adelaide, Adelaide SA 5005, Australia. Email: [email protected].

1. Introduction

Various estimation and specification testing problems have been proposed and discussed in recent years. Interest focuses on general nonparametric and semiparametric time series models under stationarity. Recent studies include Tong (1990), Fan and Gijbels (1996), Härdle, Liang and Gao (2000), Fan and Yao (2003), Gao (2007), and Li and Racine (2007), as well as the references therein. In the semiparametric case, interest is in estimation and specification testing for a semiparametric time series model when at least two different time series are involved. In both theory and practice, there is a need to establish the mathematical relationship between one time series and another, and then to discuss both estimation and specification testing in such a model. When the same time series variable is fully involved in both the parametric and nonparametric components of a semiparametric time series regression model, to the best of our knowledge, the issue of how to identify and then estimate the model has not been addressed.

This paper starts with a semiparametric time series model of the form

$$Y_t = V_t^{\tau}\beta + \Delta(V_t) + e_t, \quad t = 1, 2, \cdots, n, \tag{1.1}$$

where $\{V_t\}$ is a vector of stationary time series, $\beta$ is a vector of unknown parameters, $\Delta(\cdot)$ is an unknown function defined on $\mathbb{R}^d$, $\{e_t\}$ is a sequence of independent and identically distributed errors, and $n$ is the number of observations. This paper focuses on the case of $1 \le d \le 3$. In the case of $d \ge 4$, one may need to approximate $\Delta(\cdot)$ by a partial sum of univariate functions in a similar fashion to Section 2.3 of Gao (2007).

Model (1.1) differs in both motivation and application from the conventional semiparametric time series model of the form $Y_t = U_t^{\tau}\beta + \Delta(V_t) + e_t$, in which $U_t$ and $V_t$ are two different stationary time series such that $\Sigma = E\left[(U_t - E[U_t|V_t])(U_t - E[U_t|V_t])^{\tau}\right]$ is a positive definite matrix. In model (1.1), the linear component in many cases plays the leading role while the nonparametric component behaves like an unknown departure from the classical linear model. Since such a departure is usually unknown, it is not unreasonable to treat $\Delta(\cdot)$ as a nonparametrically unknown function. For estimating both $\beta$ and $\Delta(\cdot)$ consistently and efficiently, existing methods, such as those discussed for the partially linear case in Härdle, Liang and Gao (2000) and Gao (2007), are not valid because $\Sigma = E\left[(V_t - E[V_t|V_t])(V_t - E[V_t|V_t])^{\tau}\right] = 0$.

In Section 2 below, we therefore consider a more general semiparametric time series model than model (1.1) and propose using a nonlinear least squares (LS) estimation method to deal with the estimation of the unknown parameters and function involved. In Section 3 below, we then consider an extension of model (1.1) in which the semiparametric form serves as the alternative in the hypotheses

$$H_0: E[Y_t|V_t = v] = v^{\tau}\beta_0 \quad \text{versus} \quad H_1: E[Y_t|V_t = v] = v^{\tau}\beta_1 + \Delta_n(v), \tag{1.2}$$

where each $\beta_i$ is the true value of the parameter $\beta$ under either $H_0$ or $H_1$ and $\{\Delta_n(v)\}$ is a sequence of nonparametrically unknown functions. Interest in the literature mainly focuses on constructing a test for $H_0$. To the best of our knowledge, the literature does not provide any guidance about how to specify an alternative form and then consistently estimate $\Delta_n(\cdot)$ when $H_0$ is not accepted. The paper by Gao and Gijbels (2008) suggests using a semiparametric estimation method in the practical implementation of an optimal bandwidth selection method when a kernel-based test is used. Section 3 below systematically discusses how to identify and then estimate both $\beta_1$ and $\Delta_n(\cdot)$ consistently.

In both models (1.1) and (1.2), we allow $\{V_t\}$ to have a deterministic trend component. As a consequence, both models are better suited to the case where the nonstationarity of $\{V_t\}$ is caused by its mean function. Such cases include consumer price indices.

The rest of the paper is organized as follows. Section 2 discusses both the identification and estimation of $\beta$ and $\Delta(\cdot)$ in model (1.1). The identification and estimation of $\beta_1$ and $\Delta_n(\cdot)$ involved in model (1.2) are then investigated in detail in Section 3. Section 4 mentions some extensions. A simulation study is given in Section 5. Some concluding remarks are given in Section 6. All the mathematical proofs are given in Appendix A.

2. Identification and estimation

2.1 Nonlinear LS estimation method

Consider a semiparametric nonlinear time series model of the form

$$Y_t = g(V_t, \theta_1) + \Delta(V_t) + e_t, \qquad V_t = H\left(\frac{t}{n}\right) + U_t, \quad t = 1, 2, \cdots, n, \tag{2.1}$$

where $g(\cdot, \theta_1)$ is a parametrically known function indexed by an unknown parameter vector $\theta_1 \in \Theta \subset \mathbb{R}^p$ ($p \ge 1$), both $\Delta(\cdot)$ and $H(\cdot)$ are unknown functions defined on $\mathbb{R}^d$ and $\mathbb{R}^1$, respectively, and both $\{U_t\}$ and $\{e_t\}$ are assumed to be stationary. To present the main ideas and keep this paper concise, we consider the case of $d = 1$ in Sections 2 and 3 as well as in Appendix A. Section 4 discusses how to deal with the case of $d \ge 2$.

As discussed in Section 1, there are various motivations for considering a semiparametric time series model of the form (2.1). In the analysis of economic and financial data, one may motivate model (2.1) by considering a general parametric nonlinear model of the form

$$Y_t = g(V_t, \theta_1) + \varepsilon_t, \tag{2.2}$$

where the error process $\{\varepsilon_t\}$ is endogenously correlated with $\{V_t\}$ through an additive model of the form $\varepsilon_t = \Delta(V_t) + e_t$. In such cases, $\{V_t\}$ and $\{e_t\}$ are likely to be dependent on each other. There is also some need to consider a multiplicative model of the form $\varepsilon_t = \sigma(V_t)\,\eta_t$, but this is a different kind of model from the one of interest in this paper.

To estimate $\theta_1$ and then $\Delta(\cdot)$ involved in (2.1), we start with a nonlinear least squares estimation method by choosing the true version of $\theta$ such that

$$\frac{1}{n}\sum_{t=1}^{n} E\left[Y_t - g(V_t, \theta)\right]^2 = \min! \quad \text{over } \theta. \tag{2.3}$$

This implies $\frac{1}{n}\sum_{t=1}^{n} E\left[\Delta(V_t)\,\dot{g}(V_t, \theta_1)\right] = 0$, which is asymptotically equivalent to

$$\int_0^1 \left[\int \Delta(v)\,\dot{g}(v, \theta_1)\,p(v - H(r))\,dv\right] dr = 0, \tag{2.4}$$

where $p(\cdot)$ denotes the marginal density function of $\{U_t\}$ and $\dot{g}(\cdot, \theta_1) = \left.\frac{\partial g(\cdot, \theta)}{\partial \theta}\right|_{\theta = \theta_1}$ denotes the partial derivative of $g(\cdot, \theta)$ with respect to $\theta$. Equation (2.4) is needed to ensure that $\theta_1$ is identifiable and estimable. The sample version of (2.3) suggests using the Method of Moments to estimate $\theta_1$ by minimizing

$$\frac{1}{n}\sum_{t=1}^{n} \left[Y_t - g(V_t, \theta)\right]^2 \quad \text{over } \theta. \tag{2.5}$$
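The minimization in (2.5) can be illustrated with a minimal sketch. It assumes a scalar parameter and the linear specification $g(v, \theta) = \theta v$ with the data-generating design of Example 5.1; the golden-section search and the function names `nls_objective` and `golden_section_minimize` are illustrative choices made here, not the authors' implementation.

```python
import math
import random


def nls_objective(theta, y, v, g):
    """Sample criterion (2.5): average squared residual at a candidate theta."""
    return sum((yt - g(vt, theta)) ** 2 for yt, vt in zip(y, v)) / len(y)


def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Minimize a unimodal scalar function on [lo, hi] by golden-section search."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if f(c) < f(d):
            b = d  # the minimizer lies in [a, d]
        else:
            a = c  # the minimizer lies in [c, b]
    return (a + b) / 2.0


# Assumed design (taken from Example 5.1): V_t = t/n - 0.5 + U_t, Delta(v) = 2 v^2.
random.seed(0)
n = 500
theta_true = 0.8
v = [t / n - 0.5 + random.uniform(-0.1, 0.1) for t in range(1, n + 1)]
y = [theta_true * vt + 2.0 * vt ** 2 + random.gauss(0.0, 0.5) for vt in v]


def g(vt, theta):
    """Parametric component, assumed linear for this illustration."""
    return theta * vt


theta_hat = golden_section_minimize(lambda th: nls_objective(th, y, v, g), -5.0, 5.0)

# For linear g the minimizer of (2.5) has a closed form, used here as a check.
theta_closed = sum(vt * yt for vt, yt in zip(v, y)) / sum(vt * vt for vt in v)
print(theta_hat, theta_closed)
```

For a vector-valued $\theta$ or a genuinely nonlinear $g$, a general-purpose optimizer would replace the one-dimensional search; the criterion itself is unchanged.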

The resulting nonlinear least squares estimator is denoted by $\hat{\theta}_1$. We then estimate $\Delta(\cdot)$ by a local linear estimator of the form

$$\hat{\Delta}(v) = \sum_{t=1}^{n} W_{nt}(v)\left(Y_t - g(V_t, \hat{\theta}_1)\right), \tag{2.6}$$

where $\{W_{nt}(v)\}$ is a sequence of weight functions given by $W_{nt}(v) = \frac{K_{v,b}(V_t)}{\sum_{k=1}^{n} K_{v,b}(V_k)}$ with $K_{v,b}(V_t) = \frac{1}{b} K_n\left(\frac{V_t - v}{b}\right)$, in which

$$K_n\left(\frac{V_t - v}{b}\right) = K\left(\frac{V_t - v}{b}\right)\left[S_{n,2}(v) - \left(\frac{V_t - v}{b}\right) S_{n,1}(v)\right] \quad \text{with} \quad S_{n,j}(v) = \frac{1}{nb}\sum_{s=1}^{n} K\left(\frac{V_s - v}{b}\right)\left(\frac{V_s - v}{b}\right)^{j}$$

for $j = 0, 1, 2$, and $K(\cdot)$ and $b$ are the kernel function and bandwidth parameter, respectively.

To establish an asymptotic theory for $\hat{\theta}_1$ and $\hat{\Delta}(\cdot)$, we need to introduce the following conditions.

A1 (i) Suppose that $\{(e_t, U_t)\}$ is a sequence of independent and identically distributed (i.i.d.) random variables. Let $P(E(e_t|V_t) = 0) = 1$ for all $t \ge 1$ and $0 < \sigma_e^2 := E[e_1^2] < \infty$.
(ii) Let $p(u)$ and $p(e, u)$ be the marginal density of $U_t$ and the joint density of $(e_t, U_t)$, respectively. Suppose that $p(u)$ is continuous in $u$ and that $p(e, u)$ is continuous in $(e, u)$.

A2 The nonlinear regression function $g(v, \theta)$ is twice differentiable with respect to $\theta$, and its derivatives are continuous in $\theta$. The matrix

$$\Sigma(\theta_1) := \int_0^1 \left[\int \dot{g}(v, \theta_1)\,\dot{g}^{\tau}(v, \theta_1)\,p(v - H(r))\,dv\right] dr$$

is positive definite. In addition, $g(v, \theta)$ is continuous in $v$, and the matrix

$$\Sigma_e := \int_0^1 \left[\int \sigma^2(v, r)\,\dot{g}(v, \theta_1)\,\dot{g}^{\tau}(v, \theta_1)\,p(v - H(r))\,dv\right] dr$$

is positive definite, where $\sigma^2(v, r) = \int_{-\infty}^{\infty} x^2\,p(x, v - H(r))\,dx$.

A3 (i) Suppose that $H(r)$ is continuous in $r$.
(ii) $\Delta(v)$ is twice continuously differentiable, and the matrix

$$\Sigma_{\Delta} := \int_0^1 \left[\int \Delta^2(v)\,\dot{g}(v, \theta_1)\,\dot{g}^{\tau}(v, \theta_1)\,p(v - H(r))\,dv\right] dr$$

is positive definite.

A4 (i) $K(\cdot)$ is a symmetric and continuous probability density function with $\int [\ldots] < \infty$ and $\int K^2(u)\,du$ [...]
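The local linear weights $W_{nt}(v)$ entering (2.6) can be sketched in a few lines. This is an illustration only: it assumes $d = 1$ and uses the quadratic kernel of Section 5, and the names `epanechnikov`, `local_linear_weights` and `local_linear_fit` are hypothetical helpers introduced here.

```python
def epanechnikov(u):
    """Quadratic kernel K(u) = 0.75 (1 - u^2) on |u| < 1 (the Section 5 kernel)."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def local_linear_weights(v0, v, b, kernel=epanechnikov):
    """Weights W_nt(v0) of (2.6), built from S_{n,1}(v0) and S_{n,2}(v0)."""
    n = len(v)
    u = [(vt - v0) / b for vt in v]
    k = [kernel(ut) for ut in u]
    s1 = sum(kt * ut for kt, ut in zip(k, u)) / (n * b)
    s2 = sum(kt * ut * ut for kt, ut in zip(k, u)) / (n * b)
    # K_{v,b}(V_t) = (1/b) K(u_t) [S_{n,2} - u_t S_{n,1}]
    kvb = [kt * (s2 - ut * s1) / b for kt, ut in zip(k, u)]
    total = sum(kvb)
    if total == 0.0:
        raise ValueError("too few distinct observations within one bandwidth of v0")
    return [x / total for x in kvb]


def local_linear_fit(v0, v, resid, b):
    """Weighted sum of residuals Y_t - g(V_t, theta_hat) at the point v0."""
    w = local_linear_weights(v0, v, b)
    return sum(wt * rt for wt, rt in zip(w, resid))


# Check on noise-free data: local linear weights reproduce a straight line.
grid = [i / 50.0 - 0.5 for i in range(51)]
line = [1.0 + 3.0 * vt for vt in grid]
weights = local_linear_weights(0.1, grid, 0.2)
fit = local_linear_fit(0.1, grid, line, 0.2)
print(sum(weights), fit)
```

The check works because the weights satisfy $\sum_t W_{nt}(v) = 1$ and $\sum_t W_{nt}(v)(V_t - v) = 0$ by construction, so any linear function of $V_t$ is recovered without bias.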


Theorem 2.1(i) shows that the parametric estimator $\hat{\theta}_1$ has the root-$n$ convergence rate, which is the same as that in the parametric linear model case. The influence of $\Delta(\cdot)$ and the error process $\{e_t\}$ on the asymptotic distribution is reflected by $\Sigma_{\Delta}$ and $\Sigma_e$ in the variance matrix. Theorem 2.1(ii) shows that a standard result can be obtained for the local linear estimator. The detailed proof of Theorem 2.1 is given in Appendix A below.

2.2 Semiparametric weighted LS estimation method

If we follow the literature (Gao and Liang 1997; Härdle, Liang and Gao 2000, for example) by treating model (2.1) as a usual partially linear model of the form

$$Y_t - g(V_t, \theta_1) = \Delta(V_t) + e_t \tag{2.9}$$

and estimate $\Delta(\cdot)$ first by

$$\bar{\Delta}(v) = \bar{\Delta}(v, \theta_1) = \sum_{t=1}^{n} W_{nt}(v)\left(Y_t - g(V_t, \theta_1)\right), \tag{2.10}$$

we will then estimate $\theta_1$ by a semiparametric weighted least squares estimator $\tilde{\theta}_1$ such that

$$\tilde{\theta}_1 = \arg\min_{\theta} \sum_{t=1}^{n} \left(Y_t - g(V_t, \theta) - \bar{\Delta}(V_t, \theta)\right)^2 = \arg\min_{\theta} \sum_{t=1}^{n} \left(\tilde{Y}_t - \tilde{g}(V_t, \theta)\right)^2, \tag{2.11}$$

where $\tilde{Y}_t = Y_t - \sum_{s=1}^{n} W_{ns}(V_t) Y_s$ and $\tilde{g}(V_t, \theta) = g(V_t, \theta) - \sum_{s=1}^{n} W_{ns}(V_t)\,g(V_s, \theta)$, in which $W_{ns}(v)$ is as defined in (2.6). Due to the local linear method, similarly to the proof of Theorem 2.1(ii), one may show that as $n \to \infty$

$$\tilde{g}(V_t, \theta_1) = (1 + o_P(1))\,c_1^{\tau}\,g_{20}(V_t, \theta_1)\,b^2 \quad \text{and} \quad \tilde{\Delta}(V_t) = (1 + o_P(1))\,\Delta''(V_t)\,b^2, \tag{2.12}$$

where $g_{20}(v, \theta_1) = \frac{\partial^2 g(v, \theta_1)}{\partial v^2}$, $\tilde{\Delta}(V_t) = \Delta(V_t) - \sum_{s=1}^{n} W_{ns}(V_t)\Delta(V_s)$, and $c_1$ is a constant vector. Analogously to the proof of Theorem 2.1(i), we then have as $n \to \infty$

$$b^2\left(\tilde{\theta}_1 - \theta_1\right) = c_2\,(1 + o_P(1)) \left(\sum_{t=1}^{n} \dot{g}_{20}(V_t, \theta_1)\,\dot{g}_{20}^{\tau}(V_t, \theta_1)\right)^{-1} \sum_{t=1}^{n} \dot{g}_{20}(V_t, \theta_1)\,e_t + c_2\,b^2\,(1 + o_P(1)) \left(\sum_{t=1}^{n} \dot{g}_{20}(V_t, \theta_1)\,\dot{g}_{20}^{\tau}(V_t, \theta_1)\right)^{-1} \sum_{t=1}^{n} \dot{g}_{20}(V_t, \theta_1)\,\Delta''(V_t), \tag{2.13}$$

where $c_2$ is some constant, $\dot{g}_{20}(v, \theta_1) = \left.\frac{\partial g_{20}(v, \theta)}{\partial \theta}\right|_{\theta = \theta_1}$, and we have used $\sum_{t=1}^{n} \dot{g}_{20}(V_t, \theta_1)\,\tilde{e}_t = (1 + o_P(1)) \sum_{t=1}^{n} \dot{g}_{20}(V_t, \theta_1)\,e_t$, in which $\tilde{e}_t = e_t - \sum_{s=1}^{n} W_{ns}(V_t)\,e_s$.

This implies that the rate of convergence of $\tilde{\theta}_1$ to $\theta_1$ is only proportional to $n^{-1/2} b^{-2}$, which is much slower than the rate of $n^{-1/2}$ for $\hat{\theta}_1$, because $b \to 0$. In the case of $b = c \cdot n^{-1/5}$, the rate of convergence of $\tilde{\theta}_1$ to $\theta_1$ is only proportional to $n^{-1/2} \cdot n^{2/5} = n^{-1/10}$. This is the main reason we propose using $\hat{\theta}_1$ rather than $\tilde{\theta}_1$ in this paper. More generally, this is why the semiparametric estimation method proposed in Gao and Liang (1997) for the conventional partially nonlinear model $Y_t = g(U_t, \theta_1) + \Delta(V_t) + e_t$, in which $U_t$ and $V_t$ are different sets of regressors, is not directly applicable to a partially nonlinear model of the form $Y_t = g(V_t, \theta_1) + \Delta(V_t) + e_t$.

3. Estimation of alternative functions

This section is concerned with a nonlinear time series model of the form

$$Y_t = m(V_t) + e_t, \qquad V_t = H\left(\frac{t}{n}\right) + U_t, \quad t = 1, \cdots, n, \tag{3.1}$$

where $m(\cdot)$ is some smooth function, $\{U_t\}$ is a stationary time series and $\{e_t\}$ is a sequence of stationary time series errors. We are then interested in estimating a class of local nonparametric departure functions involved in the following alternative hypothesis:

$$H_0: m(v) = g(v, \theta_0) \quad \text{versus} \quad H_1: m(v) = g(v, \theta_1) + \Delta_n(v), \tag{3.2}$$

where $\theta_0 \in \Theta$ is the true value of the parameter $\theta$ under $H_0$, $\theta_1 \in \Theta$ and $\{\Delta_n(\cdot)\}$ is a sequence of nonparametrically unknown functions. As discussed in the literature (see, for example, Gao 2007; Li and Racine 2007; Gao and Gijbels 2008; Kreiss, Neumann and Yao 2008), this type of semiparametric alternative is chosen mainly because, in some cases, interest lies in detecting whether there is a slight departure from a commonly used parametric form when there is insufficient evidence to accept the null hypothesis. In such cases, the level of the departure may also be unknown and will need to be estimated. To the best of our knowledge, the issue of how to consistently estimate $\Delta_n(\cdot)$ has not been discussed in the literature.

Similarly to the discussion in Section 2 above, the unknown parameter vector $\theta_1$ is identifiable and estimable under the identifiability condition:

$$\int_0^1 \left[\int \Delta_n(v)\,\dot{g}(v, \theta_1)\,p(v - H(r))\,dv\right] dr = O(\delta_n) \quad \text{as } n \to \infty \tag{3.3}$$

with $\delta_n \to 0$. The resulting estimator is denoted by $\hat{\theta}_1$. We then estimate $\Delta_n(v)$ by

$$\hat{\Delta}_n(v) = \sum_{t=1}^{n} W_{nt}(v)\left(Y_t - g(V_t, \hat{\theta}_1)\right), \tag{3.4}$$

where $\{W_{nt}(v)\}$ is as defined in (2.6).

To be able to establish an asymptotic theory for $\hat{\theta}_1$ and $\hat{\Delta}_n(v)$, we need to introduce the following condition.

A5 Suppose that $\Delta_n(v)$ is twice continuously differentiable and that $H(r)$ is continuous in $r$. In addition, as $n \to \infty$,

$$\delta_{1n}^2 := \int_0^1 \left[\int \left\|\dot{g}(v, \theta_1)\,\Delta_n(v)\right\|^2 p(v - H(r))\,dv\right] dr \to 0.$$

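We now establish the following theorem.

Theorem 3.1 (i). Let (3.3), A1, A2 and A5 hold. Then as $n \to \infty$

$$\sqrt{n}\left(\hat{\theta}_1 - \theta_1 - c_{2n}\right) \xrightarrow{d} N\left(0, \Sigma^{-1}(\theta_1)\,\Sigma_e\,\Sigma^{-1}(\theta_1)\right), \tag{3.5}$$

where

$$c_{2n} = \int_0^1 \left[\int \Delta_n(v)\,\dot{g}(v, \theta_1)\,p(v - H(r))\,dv\right] dr + O_P\left(n^{-1/2}\delta_{1n}\right) = O_P\left(\delta_n + n^{-1/2}\delta_{1n}\right).$$

(ii). Let (3.3), A1, A2, A4 and A5 hold. Then as $n \to \infty$

$$\sqrt{nb}\left(\hat{\Delta}_n(v_0) - \Delta_n(v_0) - b^2 c_{3n}(v_0) + \tau_n\right) \xrightarrow{d} N\left(0, \Sigma_1(v_0)\right), \tag{3.6}$$

where $c_{3n}(v_0) = \frac{1}{2}\Delta_n''(v_0)\int u^2 K(u)\,du$, and $\tau_n = O_P\left(n^{-1/2} + \delta_n\right)$.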
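The kernel constant $\int u^2 K(u)\,du$ in the bias term $c_{3n}(v_0)$ can be made concrete with a short numerical sketch. It assumes the quadratic kernel used in Section 5 (the theorem itself only requires a kernel satisfying A4); the midpoint-rule helper `midpoint_integral` is an illustrative name introduced here.

```python
def epanechnikov(u):
    """Quadratic kernel K(u) = 0.75 (1 - u^2) on |u| < 1 (the Section 5 kernel)."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def midpoint_integral(f, lo, hi, m=20000):
    """Composite midpoint rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / m
    return h * sum(f(lo + (i + 0.5) * h) for i in range(m))


mass = midpoint_integral(epanechnikov, -1.0, 1.0)                        # int K(u) du
mu2 = midpoint_integral(lambda u: u * u * epanechnikov(u), -1.0, 1.0)    # int u^2 K(u) du
r_k = midpoint_integral(lambda u: epanechnikov(u) ** 2, -1.0, 1.0)       # int K^2(u) du

# Multiplier of the second derivative of Delta_n at v0 in c_{3n}(v0).
bias_constant = 0.5 * mu2
print(mass, mu2, r_k, bias_constant)
```

Analytically, $\int u^2 K(u)\,du = 1/5$ and $\int K^2(u)\,du = 3/5$ for this kernel, so under this assumption $c_{3n}(v_0) = \Delta_n''(v_0)/10$.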

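Theorem 3.1(i) shows that the parametric estimator of $\theta_1$ still has the $\sqrt{n}$ rate of convergence. If $\sqrt{n}\,\delta_n \to 0$, then the bias term $c_{2n}$ in (3.5) will be eliminated. We have the following corollary when the dependence of $\Delta_n(\cdot)$ on $n$ is explicitly specified as $\Delta_n(v) = \delta_n \Delta(v)$, in which $\Delta(v)$ satisfies A3 and $\delta_n \to 0$.

Corollary 3.1 (i). Let (2.4) and A1–A3 hold. Then as $n \to \infty$

$$\sqrt{n}\left(\hat{\theta}_1 - \theta_1\right) \xrightarrow{d} N\left(0, \Sigma^{-1}(\theta_1)\,\Sigma_e\,\Sigma^{-1}(\theta_1)\right). \tag{3.7}$$

(ii). Let (2.4) and A1–A4 hold. If, in addition, $\sqrt{nb^5}\,\delta_n \to 0$, then as $n \to \infty$,

$$\sqrt{nb\,\Delta_n^2(v_0)}\left(\frac{\hat{\Delta}_n(v_0)}{\Delta_n(v_0)} - 1\right) \xrightarrow{d} N\left(0, \Sigma_1(v_0)\right), \tag{3.8}$$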

where $\Sigma_1(v_0)$ is the same as in (2.8).

4. Discussion on possible extensions

Sections 2 and 3 discuss two classes of semiparametric time series models and then establish asymptotic properties for the proposed estimators for the case of $d = 1$. As discussed in the literature (Fan and Gijbels 1996; Gao 2007; Li and Racine 2007, for example), one will need to employ a dimension-reduction model to approximate model (2.1) when $d$ is large. One possible model is an additive model of the form

$$Y_t = g(V_t, \theta_1) + \sum_{j=1}^{d} \Delta_j(V_{tj}) + e_t, \qquad V_t = H\left(\frac{t}{n}\right) + U_t, \tag{4.1}$$

where each $\Delta_j(\cdot)$ is an unknown univariate function defined on $\mathbb{R}^1$, and the others are as defined in (2.1). Under the identifiability condition:

$$\int_0^1 \left\{\int \dot{g}(v, \theta_1)\left[\sum_{j=1}^{d} \Delta_j(v_j)\right] p(v - H(r))\,dv_1 \cdots dv_d\right\} dr = 0,$$

the unknown parameter vector $\theta_1$ can still be consistently estimated by $\hat{\theta}_1$. The function $\Delta(v) = \sum_{j=1}^{d} \Delta_j(v_j)$ can then be estimated as in (2.6), and each of the functions $\Delta_j(\cdot)$ will be estimated by the marginal integration method as discussed in Section 2.3 of Gao (2007).

Another possible model is a semiparametric single-index model of the form

$$Y_t = g(V_t, \theta_1) + \Delta(V_t^{\tau}\gamma) + e_t, \qquad V_t = H\left(\frac{t}{n}\right) + U_t, \tag{4.2}$$

where $\gamma$ is a vector of unknown parameters. Model (4.2) is an extension of the partially single-index model discussed in Xia, Tong and Li (1999). Estimation of $\theta_1$, $\gamma$ and $\Delta(\cdot)$ is then mainly based on the identifiability and estimability of model (4.2). Establishing conditions and results corresponding to those given in Xia, Tong and Li (1999) requires further study and is therefore left for future research.

5. Simulation study

In this section, we report some Monte Carlo studies to show the finite sample performance of the proposed estimation method. We employ the "leave-one-out" cross-validation method to select the bandwidth involved in the estimation of the nonparametric function $\Delta(\cdot)$. We use a quadratic kernel function of the form $K(u) = \frac{3}{4}(1 - u^2)\,I(|u| < 1)$ throughout this section. The first example illustrates the performance of the nonlinear LS estimation method through a model of the form (2.1), and the second example examines the performance of the estimation method for model (3.1) under the alternative hypothesis (3.2).

Example 5.1. Consider a pair of regression models of the form

$$Y_t = \theta_1 V_t + \Delta(V_t) + e_t \quad \text{and} \quad V_t = H\left(\frac{t}{n}\right) + U_t, \quad t = 1, \cdots, n, \tag{5.1}$$

where $\theta_1 = 0.8$, $\Delta(v) = 2v^2$, $H(u) = u - 0.5$, $e_t \overset{\text{i.i.d.}}{\sim} N(0, 0.5^2)$, $U_t \overset{\text{i.i.d.}}{\sim} U(-0.1, 0.1)$, and $\{e_t\}$ and $\{U_t\}$ are independently generated. It is easy to verify that the identifiability condition

$$\int_0^1 \left[\int \Delta(v)\,\dot{g}(v, \theta_1)\,p(v - H(r))\,dv\right] dr = 0$$

holds. We generated 2000 realizations for each of the sample sizes $n = 100$, $200$ and $500$. The simulation results for model (5.1) are presented in Table 5.1, where we report the mean estimates (Mean) of $\theta_1$ and their corresponding standard deviations (STD) and mean squared errors (MSE). The nonparametric local linear estimator $\hat{\Delta}(v)$ was evaluated at the grid points $-0.5, -0.4, \cdots, 0.4, 0.5$ using a bandwidth selected by the "leave-one-out" cross-validation method. The true function $\Delta(v)$ and the mean of the estimated function $\hat{\Delta}(v)$ from 2000 realizations are plotted in Figure 5.1.
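The parametric part of the Monte Carlo design in Example 5.1 can be sketched as follows. This is an illustrative re-implementation under stated assumptions, not the authors' code: it uses the closed-form least squares solution available for $g(v, \theta) = \theta v$, only 200 replications (instead of 2000) to keep the run short, and an arbitrary seed, so its output is not expected to match Table 5.1 digit for digit. The names `theta_hat_once` and `monte_carlo` are hypothetical.

```python
import random
import statistics


def theta_hat_once(n, rng, theta1=0.8, sigma_e=0.5):
    """One replication of model (5.1); returns the LS estimate of theta_1."""
    num = 0.0
    den = 0.0
    for t in range(1, n + 1):
        v = (t / n - 0.5) + rng.uniform(-0.1, 0.1)               # V_t = H(t/n) + U_t
        y = theta1 * v + 2.0 * v * v + rng.gauss(0.0, sigma_e)   # Delta(v) = 2 v^2
        num += v * y
        den += v * v
    return num / den  # minimizer of (2.5) when g(v, theta) = theta * v


def monte_carlo(n, replications, theta1=0.8, seed=2010):
    """Mean, standard deviation and mean squared error of the estimates."""
    rng = random.Random(seed)
    estimates = [theta_hat_once(n, rng, theta1) for _ in range(replications)]
    mean = statistics.fmean(estimates)
    std = statistics.stdev(estimates)
    mse = statistics.fmean([(est - theta1) ** 2 for est in estimates])
    return mean, std, mse


results = {n: monte_carlo(n, replications=200) for n in (100, 200, 500)}
for n, (mean, std, mse) in results.items():
    print(f"n={n:4d}  Mean={mean:.4f}  STD={std:.4f}  MSE={mse:.4f}")
```

Estimating $\Delta(\cdot)$ on the grid would additionally require the local linear weights of (2.6) and a cross-validated bandwidth, which are omitted from this sketch.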

Example 5.2. Consider another pair of regression models of the form

$$Y_t = m(V_t) + e_t \quad \text{and} \quad V_t = H\left(\frac{t}{n}\right) + U_t, \quad t = 1, \cdots, n, \tag{5.2}$$

under the alternative hypothesis $H_1: m(V_t) = \theta_1 V_t + \Delta_n(V_t)$, where $V_t$, $H(\cdot)$, $U_t$, $e_t$, and $\theta_1$ are the same as in Example 5.1, and $\Delta_n(v) = \log\left(1 + \frac{v^2}{n^{1/5}}\right)$. As $\log\left(1 + \frac{v^2}{n^{1/5}}\right) = \frac{v^2}{n^{1/5}}(1 + o(1))$ as $n \to \infty$, it is easy to show that

$$\int_0^1 \left[\int \Delta_n(v)\,\dot{g}(v, \theta_1)\,p(v - H(r))\,dv\right] dr = o\left(\frac{1}{n^{1/5}}\right),$$

which validates the identifiability condition (3.3). The estimation results for the parameter $\theta_1$ are summarized in Table 5.2. In Figure 5.2, we plot the true function $\Delta_n(v)$ and the mean of the estimates $\hat{\Delta}_n(v)$ from 2000 replications.
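The two claims in Example 5.2 can be checked numerically with a small sketch. It assumes $\dot{g}(v, \theta_1) = v$ (since $g(v, \theta) = \theta v$) and the uniform density $p(u) = 5$ on $(-0.1, 0.1)$ from Example 5.1; the midpoint-rule double integral and the names `delta_n`, `leading_term` and `identifiability_integral` are illustrative choices introduced here.

```python
import math


def delta_n(v, n):
    """Local departure function of Example 5.2."""
    return math.log(1.0 + v * v / n ** 0.2)


def leading_term(v, n):
    """First-order approximation v^2 / n^(1/5)."""
    return v * v / n ** 0.2


def identifiability_integral(n, m=200):
    """Midpoint approximation of the double integral in (3.3).

    Substituting v = H(r) + u with H(r) = r - 0.5 gives the integral of
    Delta_n(v) * v * p(u) over r in (0, 1) and u in (-0.1, 0.1), with p(u) = 5.
    """
    hr = 1.0 / m
    hu = 0.2 / m
    total = 0.0
    for i in range(m):
        r = (i + 0.5) * hr
        for j in range(m):
            u = -0.1 + (j + 0.5) * hu
            v = r - 0.5 + u
            total += delta_n(v, n) * v * 5.0
    return total * hr * hu


ratios = {n: delta_n(0.5, n) / leading_term(0.5, n) for n in (100, 200, 500, 10 ** 6)}
integral_100 = identifiability_integral(100)
print(ratios, integral_100)
```

The ratio approaches one only slowly because $n^{1/5}$ grows slowly, while the integral vanishes up to rounding error: $\Delta_n(v)$ is even, $\dot{g}(v, \theta_1) = v$ is odd, and the design is symmetric about zero.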

Table 5.1. The results for θ1 in Example 5.1

sample size    Mean      STD       MSE
100            0.8270    0.1764    0.0318
200            0.8110    0.1218    0.0150
500            0.8066    0.0756    0.0058

Table 5.2. The results for θ1 in Example 5.2

sample size    Mean      STD       MSE
100            0.8104    0.1645    0.0272
200            0.8036    0.1179    0.0139
500            0.8015    0.0747    0.0056

The Monte Carlo studies indicate that the proposed nonlinear LS estimation method works well in estimating both the parameter and the nonparametric function. Tables 5.1–5.2 and Figures 5.1–5.2 also indicate that the performance of the estimators improves as the sample size increases.

6. Conclusions

Sections 1 and 2 have proposed two classes of semiparametric time series models and then discussed how to identify and estimate the proposed models. Because of the particular features of the proposed models, existing estimation methods available for the conventional partially linear models are not applicable. Section 3 has discussed the issue of how to consistently estimate a sequence of nonparametric departure functions in a class of local alternatives involved in a model specification testing problem. Such an estimation procedure may be useful in several respects, such as studying the power function of a nonparametric test and choosing the smoothing parameter involved in the test. Section 4 has briefly discussed possible extensions. Some small-sample studies have been given in Section 5. We have not been able to include a finite-sample comparison showing that $\hat{\theta}_1$ performs much better than $\tilde{\theta}_1$, as the theory discussed in Section 2.2 suggests; we will report such simulation results when they become available.

7. Acknowledgments

This research was supported by the Australian Research Council Discovery Grants Program under Grant Number DP1096374.

Appendix A: Proofs of the Theorems

Proof of Theorem 2.1. Observe that

$$\hat{\theta}_1 - \theta_1 = \left(\sum_{t=1}^{n} \dot{g}(V_t, \theta_1)\,\dot{g}^{\tau}(V_t, \theta_1)\right)^{-1} \left(\sum_{t=1}^{n} \dot{g}(V_t, \theta_1)\left(\Delta(V_t) + e_t\right)\right)(1 + o_P(1)). \tag{A.1}$$

By the law of large numbers for i.i.d. sequences, we have

$$\frac{1}{n}\sum_{t=1}^{n} \dot{g}(V_t, \theta_1)\,\dot{g}^{\tau}(V_t, \theta_1) \xrightarrow{P} \Sigma(\theta_1). \tag{A.2}$$

Figure 5.1. The true curve $\Delta(v)$ (solid lines) and the mean of $\hat{\Delta}(v)$ over 2000 realizations (dashed lines) in Example 5.1 with sample sizes 100, 200 and 500 (from top to bottom).

Figure 5.2. The true curve $\Delta_n(v)$ (solid lines) and the mean of $\hat{\Delta}_n(v)$ over 2000 realizations (dashed lines) in Example 5.2 with sample sizes 100, 200 and 500 (from top to bottom).

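Hence, to prove Theorem 2.1(i), we need only to show that

$$\frac{1}{\sqrt{n}}\sum_{t=1}^{n} \dot{g}(V_t, \theta_1)\left(\Delta(V_t) + e_t\right) \xrightarrow{D} N\left(0, \Sigma_e + \Sigma_{\Delta}\right), \tag{A.3}$$

using

$$\mathrm{Var}\left(\frac{1}{\sqrt{n}}\sum_{t=1}^{n} \dot{g}(V_t, \theta_1)\left(\Delta(V_t) + e_t\right)\right) = \frac{1}{n}\,E\left(\sum_{t=1}^{n} \dot{g}(V_t, \theta_1)\left(\Delta(V_t) + e_t\right)\right)^2 = \Sigma_e + \Sigma_{\Delta}.$$

By the central limit theorem for i.i.d. sequences, we can show that (A.3) holds and thus the proof of Theorem 2.1(i) is completed. Theorem 2.1(ii) follows by (2.7) and the standard arguments for local linear estimators.

Proof of Theorem 3.1. By a standard argument, we have

$$\hat{\theta}_1 - \theta_1 = \left(\sum_{t=1}^{n} \dot{g}(V_t, \theta_1)\,\dot{g}^{\tau}(V_t, \theta_1)\right)^{-1} \sum_{t=1}^{n} \dot{g}(V_t, \theta_1)\,\Delta_n(V_t)\,(1 + o_P(1)) + \left(\sum_{t=1}^{n} \dot{g}(V_t, \theta_1)\,\dot{g}^{\tau}(V_t, \theta_1)\right)^{-1} \sum_{t=1}^{n} \dot{g}(V_t, \theta_1)\,e_t. \tag{A.4}$$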

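By the central limit theorem for i.i.d. sequences, we have

$$\sqrt{n}\left(\sum_{t=1}^{n} \dot{g}(V_t, \theta_1)\,\dot{g}^{\tau}(V_t, \theta_1)\right)^{-1} \sum_{t=1}^{n} \dot{g}(V_t, \theta_1)\,e_t \xrightarrow{D} N\left(0, \Sigma^{-1}(\theta_1)\,\Sigma_e\,\Sigma^{-1}(\theta_1)\right). \tag{A.5}$$

Meanwhile, as $\{V_t\}$ is a sequence of independent variables, we have

$$E\left\|\sum_{t=1}^{n} \left\{\dot{g}(V_t, \theta_1)\,\Delta_n(V_t) - E\left[\dot{g}(V_t, \theta_1)\,\Delta_n(V_t)\right]\right\}\right\|^2 = O\left(\sum_{t=1}^{n} E\left\|\dot{g}(V_t, \theta_1)\,\Delta_n(V_t)\right\|^2\right). \tag{A.6}$$

By A5, we have as $n \to \infty$

$$\sum_{t=1}^{n} E\left\|\dot{g}(V_t, \theta_1)\,\Delta_n(V_t)\right\|^2 = n \int_0^1 \left[\int \left\|\dot{g}(v, \theta_1)\,\Delta_n(v)\right\|^2 p(v - H(r))\,dv\right] dr = O\left(n\,\delta_{1n}^2\right). \tag{A.7}$$

By (A.6) and (A.7), we also have as $n \to \infty$

$$\frac{1}{n}\sum_{t=1}^{n} \dot{g}(V_t, \theta_1)\,\Delta_n(V_t) = \frac{1}{n}\sum_{t=1}^{n} E\left[\dot{g}(V_t, \theta_1)\,\Delta_n(V_t)\right] + O_P\left(n^{-1/2}\delta_{1n}\right) = \int_0^1 \left[\int \dot{g}(v, \theta_1)\,\Delta_n(v)\,p(v - H(r))\,dv\right] dr + O_P\left(n^{-1/2}\delta_{1n}\right) = O_P\left(\delta_n + n^{-1/2}\delta_{1n}\right). \tag{A.8}$$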

By (A.4), (A.5) and (A.8), we prove that Theorem 3.1(i) holds.

We now prove Theorem 3.1(ii). Observe that

$$\hat{\Delta}_n(v_0) - \Delta_n(v_0) = \sum_{t=1}^{n} W_{nt}(v_0)\,e_t + \sum_{t=1}^{n} W_{nt}(v_0)\left(\Delta_n(V_t) - \Delta_n(v_0)\right) + \sum_{t=1}^{n} W_{nt}(v_0)\left(g(V_t, \theta_1) - g(V_t, \hat{\theta}_1)\right) =: J_{n1}(v_0) + J_{n2}(v_0) + J_{n3}(v_0). \tag{A.9}$$

By the central limit theorem for i.i.d. sequences again, we have, as $n \to \infty$,

$$\sqrt{nb}\,J_{n1}(v_0) \xrightarrow{D} N\left(0, \frac{\sigma^2(v_0)\int K^2(u)\,du}{f(v_0)}\right), \tag{A.10}$$

where $\sigma^2(v_0) = \int_0^1 \sigma^2(v_0, r)\,dr$ and $f(v_0) = \int_0^1 p(v_0 - H(r))\,dr$.

By Taylor's expansion, we have

$$J_{n2}(v_0) = \frac{1 + o_P(1)}{2}\,b^2\,\Delta_n''(v_0)\int u^2 K(u)\,du. \tag{A.11}$$

Meanwhile, a straightforward derivation implies that as $n \to \infty$

$$\sum_{t=1}^{n} W_{nt}(v_0)\,\dot{g}(V_t, \theta_1) \xrightarrow{P} \dot{g}(v_0, \theta_1), \tag{A.12}$$

which, along with the conclusion of Theorem 3.1(i), yields

$$J_{n3}(v_0) = O_P\left(n^{-1/2} + \delta_n\right). \tag{A.13}$$

The proof of Theorem 3.1(ii) therefore follows from equations (A.9)–(A.13).

References

Fan, J. and Gijbels, I. (1996). Local Polynomial Modeling and Its Applications. Chapman & Hall, London.

Fan, J. and Yao, Q. (2003). Nonlinear Time Series: Nonparametric and Parametric Methods. Springer, New York.

Gao, J. (2007). Nonlinear Time Series: Semiparametric and Nonparametric Methods. Chapman & Hall, London.

Gao, J. and Gijbels, I. (2008). Bandwidth selection in nonparametric kernel testing. Journal of the American Statistical Association 484 1584–1594.

Gao, J. and Liang, H. (1997). Statistical inference in single-index and partially nonlinear models. Annals of the Institute of Statistical Mathematics 49 493–517.

Härdle, W., Liang, H. and Gao, J. (2000). Partially Linear Models. Physica-Verlag, New York.

Kreiss, J. P., Neumann, M. H. and Yao, Q. (2008). Bootstrap tests for simple structures in nonparametric time series regression. Statistics and Its Interface 1 367–380.

Li, Q. and Racine, J. (2007). Nonparametric Econometrics: Theory and Practice. Princeton University Press, Princeton.

Tong, H. (1990). Nonlinear Time Series: A Dynamical System Approach. Oxford University Press, Oxford.

Xia, Y., Tong, H. and Li, W. K. (1999). On extended partially linear single-index models. Biometrika 86 831–842.