Electronic Journal of Statistics Vol. 9 (2015) 1267–1314 ISSN: 1935-7524 DOI: 10.1214/15-EJS1038
Inference and testing for structural change in general Poisson autoregressive models

Paul Doukhan∗
AGM, Université de Cergy-Pontoise, 2 avenue Adolphe Chauvin, 95302 Cergy-Pontoise, France
IUF, Institut Universitaire de France
e-mail:
[email protected]
and

William Kengne∗
THEMA, Université de Cergy-Pontoise, 33 Boulevard du Port, 95011 Cergy-Pontoise Cedex, France
e-mail:
[email protected] Abstract: We consider here together the inference questions and the change-point problem in a large class of Poisson autoregressive models (see Tjøstheim, 2012 [34]). The conditional mean (or intensity) of the process is involved as a non-linear function of it past values and the past observations. Under Lipschitz-type conditions, it can be written as a function of lagged observations. For the latter model, assume that the link function depends on an unknown parameter θ0 . The consistency and the asymptotic normality of the maximum likelihood estimator of the parameter are proved. These results are used to study change-point problem in the parameter θ0 . From the likelihood of the observations, two tests are proposed. Under the null hypothesis (i.e. no change), each of these tests statistics converges to an explicit distribution. Consistencies under alternatives are proved for both tests. Simulation results show how those procedures work in practice, and applications to real data are also processed. MSC 2010 subject classifications: Primary 60G10, 62M10, 62F12; secondary 62F03, 62F05, 62F10. Keywords and phrases: Time series of counts, Poisson autoregression, likelihood estimation, change-point, semi-parametric test. Received February 2015.
Contents

1 Introduction
2 Assumptions and examples
  2.1 Assumptions
  2.2 Examples
∗ This work has been developed within the MME-DII center of excellence (ANR-11-LABEX-0023-01) http://labex-mme-dii.u-cergy.fr/.
    2.2.1 Linear Poisson autoregression
    2.2.2 Threshold Poisson autoregression
3 Likelihood inference
4 Testing for parameter changes
  4.1 Asymptotic behavior under the null hypothesis
  4.2 Asymptotic under the alternative
5 Some numerical results for inference in INTGARCH model
  5.1 Estimation and identification
  5.2 Some simulation results
  5.3 Application to real data
6 Some numerical results for parameter change in INGARCH model
  6.1 Testing for parameter change in INGARCH model
  6.2 Real data application
7 Proofs of the main results
Acknowledgements
References
1. Introduction

Time series of counts arise naturally for modeling count events. Examples can be found in epidemiology (number of new infections), in finance (number of transactions per minute) and in industrial quality control (number of defects), to name just a few. We refer the reader to Held et al. (2005) [23], Brännäs and Quoreshi (2010) [5] and Lambert (1992) [30], among others, for more details. Substantial advances have been made in count time series modeling during the last two decades.

Let Y = (Yt)t∈Z be an integer-valued time series; denote by Ft = σ(Ys, s ≤ t) the σ-field generated by the whole past at time t and by L(Yt /Ft−1) the conditional distribution of Yt given the past. A model is characterized by the type of marginal distribution L(Yt /Ft−1) and by the dependence structure between L(Yt /Ft−1) and the past. Models with various marginal distributions and dependence structures have been studied; see for instance Kedem and Fokianos (2002) [26], Davis et al. (2005) [8], Ferland et al. (2006) [15], Davis and Wu (2009) [10] and Weiß (2009) [35].

Fokianos et al. (2009) [19] considered the Poisson autoregression where L(Yt /Ft−1) is Poisson distributed with an intensity λt which is a function of λt−1 and Yt−1. Under linear autoregression, they proved both the consistency and the asymptotic normality of the maximum likelihood estimator of the regression parameter by using a perturbation approach, which allows the use of the standard Markov ergodic setting. Fokianos and Tjøstheim (2012) [20] extended the method to nonlinear Poisson autoregression with λt = f(λt−1) + b(Yt−1) for nonlinear measurable functions f and b. In the same vein, Neumann (2011) [31] studied a much more general model where λt = f(λt−1, Yt−1). He focused on the absolute regularity and the specification test for the intensity function,
while the recent work of Fokianos and Neumann (2013) [18] studied goodness-of-fit tests which are able to detect local alternatives. Doukhan et al. (2012) [13] considered a more general model with infinitely many lags; stationarity and the existence of moments are proved by using a weak dependence approach and contraction properties. Later, Davis and Liu (2012) [9] studied the model where the distribution L(Yt /Ft−1) belongs to a class of one-parameter exponential families with finite order dependence. This class contains the Poisson and negative binomial (with fixed number of failures) distributions. From the theory of iterated random functions, they established the stationarity and the absolute regularity properties of the process. They also proved the consistency and asymptotic normality of the maximum likelihood estimator of the parameter of the model. Douc et al. (2013) [11] considered a class of observation-driven time series which covers linear, log-linear and threshold Poisson autoregressions. Their approach is based on the recent theory for Markov chains of Hairer and Mattingly (2006) [22], which yields existence and uniqueness of the invariant distribution for Markov chains without irreducibility. Further, they proved the consistency of the conditional likelihood estimator of the model (even for mis-specified models); the asymptotic normality has not yet been considered in this setting.

Asymptotic theory for inference on time series models usually needs stationarity properties of the process. But in practice, real data often suffer from non-stationarity, which may be due to structural changes occurring during the data generating period. Several ways to consider such structural changes are possible, as was demonstrated during the thematic cycle Nonstationarity and Risk Management held in Cergy-Pontoise during the year 2012¹.
In the context of count models, Kang and Lee (2009) [25] proposed a CUSUM procedure for testing for parameter changes in a first-order random coefficient integer-valued autoregressive model defined through a thinning operator. Fokianos and Fried (2010, 2012) [16, 17] studied mean shifts in linear and log-linear Poisson autoregression; dependence between the level shift and time allows their model to detect several types of intervention effects, such as outliers. Franke et al. (2012) [21] considered parameter change in Poisson autoregression of order one; their tests are based on the cumulative sum of residuals using the conditional least-squares estimator.

Here, we shall first consider a time series of counts Y = (Yt)t∈Z satisfying

Yt /Ft−1 ∼ Poisson(λt) with λt = F((λt−1, Yt−1), (λt−2, Yt−2), . . .)   (1)
where Ft = σ(Ys, s ≤ t) and F is a measurable non-negative function. The properties of the general class of Poisson autoregressive models (1) have been investigated in Doukhan et al. [13]. Such infinite order processes provide a flexible way to take dependence on the past observations into account. Proceeding as

¹ http://www.u-cergy.fr/en/advanced-studies-institute/thematic-cycles/thematic-cycle2012/finance-cycle.html
in Doukhan and Wintenberger (2008) [12], we show that under some Lipschitz-type conditions on F, the conditional mean λt can be written as a function of the past observations. This leads us to consider the model

Yt /Ft−1 ∼ Poisson(λt) with λt = f(Yt−1, Yt−2, . . .)   (2)

where f is a measurable non-negative function. We assume that f is known up to a parameter θ0 belonging to some compact set Θ ⊂ R^d with d ∈ N − {0}. That is,

Yt /Ft−1 ∼ Poisson(λt) with λt = f_{θ0}(Yt−1, Yt−2, . . .) and θ0 ∈ Θ.   (3)
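To make the recursion in (3) concrete, here is a minimal simulation sketch for the linear special case λt = ω + a λt−1 + b Yt−1 (an INGARCH(1,1)-type model). The parameter values ω = 1, a = 0.3, b = 0.4 are our own illustrative choice inside the region a + b < 1, not values from the paper.

```python
import numpy as np

def simulate_ingarch(omega, a, b, n, burn=500, seed=0):
    """Simulate Y_t | F_{t-1} ~ Poisson(lambda_t) with the linear intensity
    lambda_t = omega + a*lambda_{t-1} + b*Y_{t-1} (a special case of (3))."""
    rng = np.random.default_rng(seed)
    lam = omega / (1.0 - a - b)       # start at the stationary mean
    ys = np.empty(n + burn, dtype=np.int64)
    for t in range(n + burn):
        ys[t] = rng.poisson(lam)      # draw Y_t given the past
        lam = omega + a * lam + b * ys[t]
    return ys[burn:]                  # drop the burn-in period

y = simulate_ingarch(omega=1.0, a=0.3, b=0.4, n=20000)
print(y.mean(), y.var())
```

With these values the stationary mean is ω/(1 − a − b) ≈ 3.33, and the empirical variance comes out strictly larger than the empirical mean, in line with the overdispersion property of Poisson autoregressions.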
Many classical integer-valued time series satisfy (3) (see the examples below). Note that model (3) (as well as models (1) and (2)) can be represented in terms of Poisson processes. Let {Nt(·); t = 1, 2, . . .} be a sequence of independent Poisson processes of unit intensity. Yt can be seen as the number (say Nt(λt)) of events of Nt(·) in the time interval [0, λt]. So we have the representation

Yt = Nt(λt) with λt = f_{θ0}(Yt−1, Yt−2, . . .).   (4)

Poisson autoregressive models are known to capture the overdispersion phenomenon in count data: if the process (Yt)t∈Z is stationary, then Var(Yt) ≥ E(Yt).

The paper first works out the asymptotic properties of the maximum likelihood estimator of model (3). Under a Lipschitz-type assumption on the function fθ, we give sufficient conditions for the consistency and the asymptotic normality of the maximum likelihood estimator of θ0. Note that Doukhan et al. [13] used a weak dependence approach to prove the existence of a stationary and ergodic solution of (1). By using their results, some assumptions (such as a monotonicity condition on fθ(·) or four-times differentiability of the function θ → fθ) that are often needed for the perturbation technique (see for instance Fokianos et al. [19] or Fokianos and Tjøstheim [20]) are relaxed. Although the models studied by Davis and Liu [9] and Douc et al. [11] allow large classes of marginal distributions, the infinitely many lags of model (1) (or model (3)) enable a higher-order dependence structure.

The second contribution of this work is a pair of tests for change detection in model (3). We propose a new way to take the change-point alternative into account, which makes the proposed procedures numerically easier to apply than those of Kengne (2012) [27]. Consistency under the alternative is proved. Contrary to Franke et al. [21], a multiple-change alternative is considered, and independence between the observations before and after the change-point is not assumed. Note that the intervention problem studied by Fokianos and Fried [16, 17] concerns sudden shifts in the conditional mean of the process. Such outliers could in some cases be seen as a particular case of the structural change problem that we develop here for a large class of models. However, if the intervention affects only a few data points, in
the classical change-point setting (where the length of each regime tends to infinity at the same rate as the sample size), such effects will be asymptotically negligible.

The forthcoming Section 2 provides assumptions on the model, together with examples. Section 3 is devoted to the definition of the maximum likelihood estimator and its asymptotic properties. In Section 4, we propose the tests for detecting a change in the parameter of model (3). Some simulation results and real data applications for inference and change-point detection are presented in Sections 5 and 6; lastly, the proofs of the main results are provided in Section 7.

2. Assumptions and examples

2.1. Assumptions

We will use the following classical notation:
1. ‖y‖ := ∑_{j=1}^p |y_j| for any y ∈ R^p;
2. for any compact set K ⊆ R^d and any function g : K → R^d, ‖g‖_K = sup_{θ∈K} ‖g(θ)‖;
3. if Y is a random variable with finite moment of order r > 0, then ‖Y‖_r = (E|Y|^r)^{1/r};
4. for any set K ⊆ R^d, K̊ denotes the interior of K.

A classical Lipschitz-type condition is assumed on model (1).

Assumption AF. There exists a sequence of non-negative real numbers (α_j)_{j≥1} satisfying ∑_{j=1}^∞ α_j < 1/(1 + ε) for some ε > 0, and such that for any x, x′ ∈ ((0, ∞) × N)^N,

|F(x) − F(x′)| ≤ ∑_{j=1}^∞ α_j ‖x_j − x′_j‖.
Under assumption (AF), Doukhan et al. [13, 14] prove that model (1) has a strictly stationary solution (Yt, λt)t∈Z; moreover, this solution is τ-weakly dependent with finite moments of any order. The following proposition shows that the conditional mean λt of model (1) can be expressed as a function of the past observations only.

Proposition 2.1. Under (AF), the conditional mean of the strictly stationary and ergodic solution of (1) can be written as λt = f(Yt−1, Yt−2, . . .), where f : N^N → R+ is a measurable function.

From Proposition 2.1, it appears that the information on the unobservable process (λt−j) can be captured by the observable process (Yt−j). Hence, we will
focus on the model (3), with the advantage that the derivative ∂λt/∂θ (very useful to derive the asymptotic covariance matrix in the inference study) can easily be computed. Note that if one carries out inference on model (1) by assuming that λt = F_{θ0}((λt−1, Yt−1), (λt−2, Yt−2), . . .), then it is not easy (or not possible) to compute ∂λt/∂θ or to express it as a function of ∂Fθ/∂θ in the general case.

We focus on the model (3) under the following assumptions. For i = 0, 1, 2 and for any compact set K ⊂ Θ, we introduce

Assumption Ai(K). ‖∂^i fθ(0)/∂θ^i‖_Θ < ∞ and there exists a sequence of non-negative real numbers (α_k^{(i)}(K))_{k≥1} satisfying ∑_{k≥1} α_k^{(0)}(K) < 1 (for i = 0) or ∑_{k≥1} α_k^{(i)}(K) < ∞ (for i = 1, 2) such that

‖∂^i fθ(y)/∂θ^i − ∂^i fθ(y′)/∂θ^i‖_K ≤ ∑_{k≥1} α_k^{(i)}(K) |y_k − y′_k|   for all y, y′ ∈ (R+)^N.

The Lipschitz-type condition A0(Θ) is the parametric version of the assumption AF. It is classical when studying the existence of solutions of such models (see for instance [12, 1] or [13]). In particular, A0(Θ) implies that for all θ ∈ Θ and y ∈ (R+)^N,

fθ(y) ≤ fθ(0) + ∑_{k≥1} α_k^{(0)}(Θ) y_k.
The latter relation is a useful tool for proving that the stationary solution of (3) admits finite moments (see the proof of Theorem 2.1 of [13]). The assumptions A1(K) and A2(K), as well as the following assumptions D(Θ), Id(Θ) and Var(Θ), are needed to define and to study the asymptotic properties of the maximum likelihood estimator of model (3).

Assumption D(Θ). There exists c > 0 such that inf_{θ∈Θ} fθ(y) ≥ c for all y ∈ (R+)^N.

Assumption Id(Θ). For all (θ, θ′) ∈ Θ², fθ(Yt−1, . . .) = fθ′(Yt−1, . . .) a.s. for some t ∈ Z ⇒ θ = θ′.

Assumption Var(Θ). For all θ ∈ Θ and t ∈ Z, the components of the vector ∂fθ/∂θ (Yt−1, . . .) are a.s. linearly independent.

2.2. Examples

2.2.1. Linear Poisson autoregression

We consider an integer-valued time series (Yt)t∈Z satisfying, for any t ∈ Z,

Yt /Ft−1 ∼ Poisson(λt) with λt = φ0(θ0) + ∑_{k≥1} φk(θ0) Yt−k   (5)

where θ0 ∈ Θ ⊂ R^d, and the functions θ → φk(θ) are positive and satisfy ∑_{k≥1} ‖φk(θ)‖_Θ < 1. This model is also called an INARCH(∞), due to its
similarity with the classical ARCH(∞) model. Assumption A0(Θ) holds automatically. If the functions φk are twice continuously differentiable with ∑_{k≥1} ‖φ′_k(θ)‖_Θ < ∞ and ∑_{k≥1} ‖φ″_k(θ)‖_Θ < ∞, then A1(Θ) and A2(Θ) hold. If inf_{θ∈Θ} φ0(θ) > 0, then D(Θ) holds. Moreover, if there exists a finite subset I ⊂ N − {0} such that the function θ → (φk(θ))_{k∈I} is injective, then assumption Id(Θ) holds, i.e. model (5) is identifiable. Finally, assumption Var(Θ) holds if for any θ ∈ Θ there exist d functions φ_{k1}, . . . , φ_{kd} such that the matrix (∂φ_{kj}/∂θ_i)_{1≤i,j≤d} (computed at θ) has full rank.
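These conditions are straightforward to check numerically for a concrete coefficient family. The sketch below uses the hypothetical choice φk(θ) = θ · 2^{−k} for k ≥ 1 with Θ = [0, 0.9] (our own toy example, not from the paper), for which the sup over Θ is attained at θ = 0.9; the series are truncated at K terms, with a tail bounded by 2^{−K}.

```python
# Toy coefficient family (hypothetical): phi_k(theta) = theta * 2**(-k),
# theta in Theta = [0, 0.9]. Then phi_k'(theta) = 2**(-k), phi_k'' = 0.
theta_max = 0.9
K = 60                    # truncation level; the neglected tail is < 2**(-K)
sum_phi = sum(theta_max * 2.0 ** (-k) for k in range(1, K + 1))
sum_dphi = sum(2.0 ** (-k) for k in range(1, K + 1))
# sum_phi < 1 gives A0(Theta); a finite sum_dphi (and the vanishing second
# derivatives) give the A1(Theta) and A2(Theta) summability conditions.
print(sum_phi, sum_dphi)
```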
Particular case of INGARCH(p, q) processes. Assume that

Yt /Ft−1 ∼ Poisson(λt) with λt = α_0^* + ∑_{k=1}^p α_k^* λt−k + ∑_{k=1}^q β_k^* Yt−k   (6)

where θ0 = (α_0^*, α_1^*, . . . , α_p^*, β_1^*, . . . , β_q^*) ∈ Θ with

Θ = { θ = (α_0, α_1, . . . , α_p, β_1, . . . , β_q) ∈ ]0, ∞[ × [0, ∞[^{p+q} : ∑_{k=1}^p α_k + ∑_{k=1}^q β_k < 1 }.

Hence, the Lipschitz-type condition (AF) is satisfied. In this case we can find, for any θ ∈ Θ, a sequence of non-negative real numbers (ψ_k(θ))_{k≥0} satisfying ∑_{k≥1} ‖ψ_k(θ)‖_Θ < 1 such that

λt = ψ_0(θ0) + ∑_{k≥1} ψ_k(θ0) Yt−k.
Therefore, assumption A0(Θ) holds. Moreover, the functions ψ_k(θ) are twice continuously differentiable with respect to θ and their derivatives decay exponentially; hence A1(Θ) and A2(Θ) hold. If inf_{θ∈Θ} α_0 > 0, then D(Θ) holds. For this particular case, Id(Θ) holds automatically. See [15] and [35] for more details on this model. The adequacy of this linear model to the number of transactions per minute for the stock Ericsson B during July 2, 2002 has been established by Fokianos and Neumann [18].

2.2.2. Threshold Poisson autoregression

We consider a threshold Poisson autoregression model defined by

Yt /Ft−1 ∼ Poisson(λt) with λt = φ0(θ0) + ∑_{k≥1} [ φ_k^+(θ0) max(Yt−k − ℓ, 0) + φ_k^−(θ0) min(Yt−k, ℓ) ]   (7)

where φ0(θ0) > 0, φ_k^+(θ0), φ_k^−(θ0) ≥ 0 and ℓ ∈ N. We can also write

λt = φ0(θ0) + ∑_{k≥1} [ φ_k^−(θ0) Yt−k + (φ_k^+(θ0) − φ_k^−(θ0)) max(Yt−k − ℓ, 0) ].
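As a concrete illustration of (7), the sketch below evaluates the threshold intensity for a finite number of lags, with hypothetical coefficients φ0 = 1, φ_k^+(θ0) = 0.3 · 2^{−k}, φ_k^−(θ0) = 0.5 · 2^{−k} and threshold ℓ = 4 (all values are our own choice, picked only so that the summability condition below holds).

```python
def threshold_intensity(past, ell=4, phi0=1.0):
    """lambda_t of equation (7); past[k-1] holds Y_{t-k}.
    Hypothetical coefficients phi_k^+ = 0.3*2**(-k), phi_k^- = 0.5*2**(-k)."""
    lam = phi0
    for k, y in enumerate(past, start=1):
        lam += 0.3 * 2.0 ** (-k) * max(y - ell, 0)  # regime above the threshold
        lam += 0.5 * 2.0 ** (-k) * min(y, ell)      # regime at or below it
    return lam

print(threshold_intensity([2, 7]))   # Y_{t-1} = 2 (below ell), Y_{t-2} = 7 (above)
```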
This example of a nonlinear model is also called an integer-valued threshold ARCH (or INTARCH), due to its definition analogous to the threshold ARCH model proposed by Zakoïan (1994) [36]; see also [21] for the INTARCH(1) model. In the INTARCH(∞) model, the regression coefficient at lag t − k is φ_k^−(θ0) if Yt−k ≤ ℓ and φ_k^+(θ0) if Yt−k > ℓ; such a model can then be used to capture a piecewise phenomenon. ℓ is the threshold parameter of the model. If ∑_{k≥1} max(‖φ_k^+(θ)‖_Θ, ‖φ_k^−(θ)‖_Θ) < 1, then A0(Θ) holds. Furthermore, if the functions θ → φ_k^+(θ) and θ → φ_k^−(θ) are twice continuously differentiable and such that ∑_{k≥1} max(‖∂φ_k^+(θ)/∂θ‖_Θ, ‖∂φ_k^−(θ)/∂θ‖_Θ) < ∞ and ∑_{k≥1} max(‖∂²φ_k^+(θ)/∂θ²‖_Θ, ‖∂²φ_k^−(θ)/∂θ²‖_Θ) < ∞, then A1(Θ) and A2(Θ) hold. Conditions for D(Θ), Id(Θ) and Var(Θ) are obtained as above. A special case is the INTGARCH(p, q) model; see Section 5.

Remark 2.1. An association argument can be used to show that the INGARCH and INTGARCH models capture only positive dependence in count time series data. This is a frequent phenomenon in transactions data (see for instance Subsection 5.3 for the number of transactions for the stock Ericsson B).

3. Likelihood inference

Assume that the trajectory (Y1, . . . , Yn) is observed. The conditional (log-)likelihood of model (3) computed on T ⊂ {1, . . . , n} is given (up to a constant) by

Ln(T, θ) = ∑_{t∈T} (Yt log λt(θ) − λt(θ)) = ∑_{t∈T} ℓ_t(θ) with ℓ_t(θ) = Yt log λt(θ) − λt(θ),

where λt(θ) = fθ(Yt−1, . . .). In the sequel, we use the notation f_θ^t := fθ(Yt−1, . . .). Since only Y1, . . . , Yn are observed, the (log-)likelihood is approximated by

L̂n(T, θ) = ∑_{t∈T} (Yt log λ̂t(θ) − λ̂t(θ)) = ∑_{t∈T} ℓ̂_t(θ) with ℓ̂_t(θ) = Yt log λ̂t(θ) − λ̂t(θ),   (8)

where λ̂t(θ) := f̂_θ^t := fθ(Yt−1, . . . , Y1, 0, . . .) and λ̂1(θ) = fθ(0, . . .). The maximum likelihood estimator (MLE) of θ0 computed on T is defined by

θ̂n(T) = argmax_{θ∈Θ} L̂n(T, θ).   (9)
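A minimal numerical sketch of (8)–(9) for the INGARCH(1,1) special case: the recursive computation of λ̂t with the unobserved past set to 0 mirrors (8), while the maximization in (9) is replaced, for illustration only, by a crude one-dimensional grid search (ω and a held at their true values; all parameter values are our own choice).

```python
import numpy as np

def simulate(omega, a, b, n, seed=1):
    # simulate the INGARCH(1,1) special case of model (3)
    rng = np.random.default_rng(seed)
    lam, ys = omega / (1.0 - a - b), []
    for _ in range(n):
        y = int(rng.poisson(lam))
        ys.append(y)
        lam = omega + a * lam + b * y
    return ys

def approx_loglik(y, omega, a, b):
    """Approximate log-likelihood (8): lambda_hat_t is computed recursively
    from Y_{t-1}, ..., Y_1 with the unobserved past set to 0, so that
    lambda_hat_1 = f_theta(0, 0, ...) = omega / (1 - a) for this model."""
    lam = omega / (1.0 - a)
    ll = 0.0
    for yt in y:
        ll += yt * np.log(lam) - lam          # l_hat_t(theta), constant dropped
        lam = omega + a * lam + b * yt
    return ll

y = simulate(omega=1.0, a=0.3, b=0.4, n=5000)
# Crude one-dimensional version of the MLE (9): grid search over b only;
# a real implementation would optimize numerically over all of theta.
grid = np.linspace(0.05, 0.8, 16)
b_hat = max(grid, key=lambda b: approx_loglik(y, 1.0, 0.3, b))
print(b_hat)
```

On a simulated trajectory with true b = 0.4, the grid maximizer lands at or next to the true value, in line with the consistency result below.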
For any k, k′ ∈ Z such that k ≤ k′, denote T_{k,k′} = {k, k + 1, . . . , k′}.

Theorem 3.1. Let (jn)_{n≥1} and (kn)_{n≥1} be two integer-valued sequences such that jn ≤ kn, kn → +∞ and kn − jn → +∞ as n → +∞. Assume that θ0 ∈ Θ̊ and that D(Θ), Id(Θ) and A0(Θ) hold with

∑_{j≥1} √j α_j^{(0)}(Θ) < ∞.   (10)
It holds that

θ̂n(T_{jn,kn}) → θ0 a.s. as n → +∞.
The following theorem gives the asymptotic normality of the MLE of model (3).

Theorem 3.2. Let (jn)_{n≥1} and (kn)_{n≥1} be two integer-valued sequences such that jn ≤ kn, kn → +∞ and kn − jn → +∞ as n → +∞. Under the assumptions of Theorem 3.1 and Var(Θ), if for i = 1, 2 Ai(Θ) holds with

∑_{j≥1} √j α_j^{(i)}(Θ) < ∞,   (11)

then

√(kn − jn) (θ̂n(T_{jn,kn}) − θ0) →_D N(0, Σ^{−1}) as n → +∞,

where Σ = E[(1/f_{θ0}^0)(∂f_{θ0}^0/∂θ)(∂f_{θ0}^0/∂θ)′].
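The matrix Σ (and hence the asymptotic covariance Σ^{−1}) can be estimated by plugging an estimate of θ0 into the empirical version of the expectation above. The sketch below specializes this, as our own illustration, to the linear INGARCH(1,1) intensity λt = ω + a λt−1 + b Yt−1, for which the gradient ∂λt/∂θ obeys a simple recursion; the data and parameter values are arbitrary.

```python
import numpy as np

def sigma_hat(y, omega, a, b):
    """Plug-in estimate of Sigma = E[(1/f)(df/dtheta)(df/dtheta)'] for the
    linear INGARCH(1,1) intensity, theta = (omega, a, b). The gradient obeys
    d lambda_t = (1, lambda_{t-1}, Y_{t-1})' + a * d lambda_{t-1}."""
    lam = omega / (1.0 - a)                       # lambda_hat_1 (zero past)
    g = np.array([1.0 / (1.0 - a), omega / (1.0 - a) ** 2, 0.0])
    S = np.zeros((3, 3))
    for yt in y:
        S += np.outer(g, g) / lam                 # (1/lambda_t) grad grad'
        g = np.array([1.0, lam, float(yt)]) + a * g
        lam = omega + a * lam + b * yt
    return S / len(y)

S = sigma_hat([2, 1, 3, 0, 2, 4, 1, 2], omega=1.0, a=0.3, b=0.4)
print(S.shape)
```

By construction the estimate is a symmetric, positive semi-definite 3 × 3 matrix, as an estimator of Σ must be.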
According to Lemma 7.2 and the proof of Theorem 3.2, the matrices

(1/(kn − jn)) ∑_{t=jn}^{kn} (1/f̂_θ^t)(∂f̂_θ^t/∂θ)(∂f̂_θ^t/∂θ)′ |_{θ=θ̂n(T_{jn,kn})}   and   −(1/(kn − jn)) (∂²L̂n(T_{jn,kn}, θ)/∂θ∂θ′) |_{θ=θ̂n(T_{jn,kn})}

are both consistent estimators of Σ.

Remark 3.1.
1. Theorem 3.1 still holds if we replace (10) by the much weaker condition ∑_{j≥1} log j × α_j^{(0)}(Θ) < ∞
(see the proof of the theorem).
2. In Theorems 3.1 and 3.2, the typical sequences jn = 1 and kn = n for all n ≥ 1 can be chosen. This is the case where the estimator is computed with all the observations. But in the change-point study, and depending on the procedure used, one may need to compute the estimator on each regime; the results are written this way to cover this situation.
3. If the Lipschitz coefficients (α_j^{(i)}(Θ))_{j≥1} satisfy α_j^{(i)}(Θ) = O(j^{−γ}) with γ > 3/2, then conditions (10) and (11) hold.

4. Testing for parameter changes

We consider observations Y1, . . . , Yn generated as in model (3) and assume that the parameter θ0 may change over time. More precisely, we assume that there exist K ≥ 1, θ_1^*, . . . , θ_K^* ∈ Θ and 0 = t_0^* < t_1^* < · · · < t_{K−1}^* < t_K^* = n such that Yt = Yt^{(j)} for t_{j−1}^* < t ≤ t_j^*, where for each j the process (Yt^{(j)})t∈Z is a stationary solution of (3) depending on θ_j^*. The case where the parameter does not change corresponds to K = 1. This problem leads to the following test hypotheses:
H0: The observations (Y1, . . . , Yn) are a trajectory of a process (Yt)t∈Z solution of (3) depending on θ0 ∈ Θ.

H1: There exist K ≥ 2 and θ_1^*, θ_2^*, . . . , θ_K^* ∈ Θ with θ_j^* ≠ θ_{j+1}^* (j = 1, . . . , K − 1), and 0 = t_0^* < t_1^* < · · · < t_{K−1}^* < t_K^* = n, such that the observations (Yt)_{t_{j−1}^* < t ≤ t_j^*} are a trajectory of a stationary process depending on θ_j^*.

Let q > p. It holds that

v_q ≤ ∑_{j=1}^p α_j v_{q−j}.

By applying Lemma 5.4 of [12], we obtain v_q ≤ α^{q/p} v_0, where α = ∑_{j=1}^∞ α_j. Hence v_q → 0 as q → ∞. Thus, for any p > 0, the sequence (λ_0^{p,q})_{q≥1} is a Cauchy sequence in L¹; therefore it converges to some limit denoted λ_0^p. Moreover, since the sequence (λ_0^{p,q})_{q≥1} is measurable w.r.t. σ(Yt, t < 0), it is
the case for the limit λ_0^p. So, there exists a measurable function f^{(p)} such that λ_0^p = f^{(p)}(Y−1, . . .). Going along similar lines, it holds that for any t ∈ Z, the sequence (λ_t^{p,q})_{q≥1} converges in L¹ to some λ_t^p = f^{(p)}(Yt−1, . . .); and since (Yt)t∈Z is stationary and ergodic, the process (λ_t^p)t∈Z is also stationary and ergodic.

Let p and t be fixed. For q large enough, we have

λ_t^{p,q} = F(λ_{t−1}^{p,q}, . . . , λ_{t−p}^{p,q}, 0, . . . ; Yt−1, . . .)

(see (16)). By using the continuity (which comes from the Lipschitz-type conditions) of (λ_1, . . . , λ_p) → F(λ_1, . . . , λ_p, 0, . . . ; y) for any fixed y = (y_i)_{i≥1}, and by letting q tend to infinity, it holds that

λ_t^p = F(λ_{t−1}^p, . . . , λ_{t−p}^p, 0, . . . ; Yt−1, . . .).   (17)

Denote μ_p = Eλ_t^p, μ = sup_{p≥1} μ_p, Δ_{p,t} = E|λ_t^{p+1} − λ_t^p| and Δ_p = sup_{t∈Z} Δ_{p,t}. Going along the same lines as in [12], we obtain Δ_p ≤ Cα^{p+1}. Therefore Δ_p → 0 as p → ∞. This shows that, for any fixed t ∈ Z, (λ_t^p)_{p≥1} is a Cauchy sequence in L¹. Thus it converges to some random variable λ̃_t ∈ L¹. Moreover, λ̃_t is measurable w.r.t. σ(Yj, j < t) (because this is the case for (λ_t^p)_{p≥1}). Thus, there exists a measurable function f such that λ̃_t = f(Yt−1, . . .) for any t ∈ Z. This implies that (λ̃_t)t∈Z is strictly stationary and ergodic. Finally, by using equation (17) and the continuity of F, it comes that

λ̃_t = F(λ̃_{t−1}, . . . ; Yt−1, . . .) for any t ∈ Z.   (18)
Hence, the process (Yt, λ̃_t)t∈Z is strictly stationary, ergodic and satisfies (1). By the uniqueness of the solution, it holds that λ̃_t = λt a.s. Thus λt = f(Yt−1, . . .) for any t ∈ Z.

Proof of Theorem 3.1. Without loss of generality and to simplify notation, we carry out the proof with T_{jn,kn} = T_{1,n}. The proof is divided into two parts: we will first show that ‖(1/n) ∑_{t∈T_{1,n}} ℓ̂_t(θ) − L(θ)‖_Θ → 0 a.s. as n → +∞, where
L(θ) := E(ℓ_0(θ)); secondly, we will show that the function θ → L(θ) has a unique maximum at θ0.

(i) Let θ ∈ Θ; recall that ℓ_t(θ) = Yt log λt(θ) − λt(θ) = Yt log f_θ^t − f_θ^t. Since (Yt)t∈Z is stationary and ergodic, for any θ ∈ Θ, (ℓ_t(θ))t∈Z is also a stationary and ergodic sequence. Moreover, we have for any θ ∈ Θ,

|ℓ_t(θ)| ≤ |Yt| |log f_θ^t| + f_θ^t
  = |Yt| |log((f_θ^t/c) × c)| + f_θ^t
  ≤ |Yt| (|f_θ^t/c − 1| + |log c|) + f_θ^t   (for x ≥ 1, |log x| ≤ |x − 1|)
  ≤ |Yt| ((1/c) f_θ^t + 1 + |log c|) + f_θ^t.
Hence,

sup_{θ∈Θ} |ℓ_t(θ)| ≤ |Yt| ((1/c) ‖f_θ^t‖_Θ + 1 + |log c|) + ‖f_θ^t‖_Θ.

We will show that, for any r > 0, E(‖f_θ^t‖_Θ^r) < ∞. Since A0(Θ) holds, we have

‖f_θ^t‖_Θ ≤ ‖f_θ^t − fθ(0)‖_Θ + ‖fθ(0)‖_Θ ≤ ∑_{j≥1} α_j^{(0)}(Θ) |Yt−j| + ‖fθ(0)‖_Θ.

Thus, by using the stationarity of the process (Yt)t∈Z, it follows that

(E(‖f_θ^t‖_Θ^r))^{1/r} ≤ ‖Y0‖_r ∑_{j≥1} α_j^{(0)}(Θ) + ‖fθ(0)‖_Θ < ∞.

Therefore, we have

E(sup_{θ∈Θ} |ℓ_t(θ)|) ≤ (1/c)(E|Yt|²)^{1/2} · (E ‖f_θ^t‖_Θ²)^{1/2} + (1 + |log c|) E|Yt| + E ‖f_θ^t‖_Θ < ∞.

By the uniform strong law of large numbers applied to (ℓ_t(θ))_{t≥1} (see Straumann and Mikosch (2006) [33]), it holds that

‖(1/n) ∑_{t∈T_{1,n}} ℓ_t(θ) − E ℓ_0(θ)‖_Θ → 0 a.s. as n → +∞.   (19)
Now let us show that

‖(1/n) ∑_{t∈T_{1,n}} (ℓ̂_t(θ) − ℓ_t(θ))‖_Θ → 0 a.s. as n → +∞.

We have

‖(1/n) ∑_{t∈T_{1,n}} (ℓ̂_t(θ) − ℓ_t(θ))‖_Θ ≤ (1/n) ∑_{t∈T_{1,n}} ‖ℓ̂_t(θ) − ℓ_t(θ)‖_Θ.

We will apply Corollary 1 of Kounias and Weng (1969) [29]; so it suffices to show that

∑_{t≥1} (1/t) E ‖ℓ̂_t(θ) − ℓ_t(θ)‖_Θ < ∞.

For t ∈ T_{1,n} and θ ∈ Θ, we have

ℓ̂_t(θ) − ℓ_t(θ) = Yt log f̂_θ^t − f̂_θ^t − Yt log f_θ^t + f_θ^t = Yt (log f̂_θ^t − log f_θ^t) − (f̂_θ^t − f_θ^t).

By using the relation |log f̂_θ^t − log f_θ^t| ≤ (1/c)|f̂_θ^t − f_θ^t|, it comes that

‖ℓ̂_t(θ) − ℓ_t(θ)‖_Θ ≤ (1/c)|Yt| ‖f̂_θ^t − f_θ^t‖_Θ + ‖f̂_θ^t − f_θ^t‖_Θ ≤ ((1/c)|Yt| + 1) ‖f̂_θ^t − f_θ^t‖_Θ.

By the Cauchy-Schwarz inequality,

E ‖ℓ̂_t(θ) − ℓ_t(θ)‖_Θ ≤ E[((1/c)|Yt| + 1) ‖f̂_θ^t − f_θ^t‖_Θ] ≤ (E((1/c)|Yt| + 1)²)^{1/2} × (E ‖f̂_θ^t − f_θ^t‖_Θ²)^{1/2}.

We have (by the Minkowski inequality)

(E((1/c)|Yt| + 1)²)^{1/2} ≤ (1/c)(E|Yt|²)^{1/2} + 1 < ∞.

Thus, it comes that

E ‖ℓ̂_t(θ) − ℓ_t(θ)‖_Θ ≤ C (E ‖f̂_θ^t − f_θ^t‖_Θ²)^{1/2}.

But we have ‖f̂_θ^t − f_θ^t‖_Θ ≤ ∑_{j≥t} α_j^{(0)}(Θ)|Yt−j|. By using the Minkowski inequality, it comes that

(E ‖f̂_θ^t − f_θ^t‖_Θ²)^{1/2} ≤ (E|Y0|²)^{1/2} ∑_{j≥t} α_j^{(0)}(Θ) ≤ C ∑_{j≥t} α_j^{(0)}(Θ).

Thus

E ‖ℓ̂_t(θ) − ℓ_t(θ)‖_Θ ≤ C ∑_{j≥t} α_j^{(0)}(Θ).

Therefore

∑_{t≥1} (1/t) E ‖ℓ̂_t(θ) − ℓ_t(θ)‖_Θ ≤ C ∑_{t≥1} (1/t) ∑_{j≥t} α_j^{(0)}(Θ) = C ∑_{j≥1} α_j^{(0)}(Θ) ∑_{t=1}^j (1/t)
  ≤ C ∑_{j≥1} α_j^{(0)}(Θ) (1 + log j) ≤ C ∑_{j≥1} α_j^{(0)}(Θ) + C ∑_{j≥1} √j α_j^{(0)}(Θ)
  < ∞ (according to A0(Θ) and (10)).
Hence, it follows that

‖(1/n) ∑_{t∈T_{1,n}} (ℓ̂_t(θ) − ℓ_t(θ))‖_Θ → 0 a.s. as n → +∞.   (20)

From (19) and (20), we deduce that

‖(1/n) ∑_{t∈T_{1,n}} ℓ̂_t(θ) − E ℓ_0(θ)‖_Θ → 0 a.s. as n → ∞.
(ii) We will now show that the function θ → L(θ) = E ℓ_0(θ) has a unique maximum at θ0. We proceed as in [9]. Let θ ∈ Θ with θ ≠ θ0. We have

L(θ0) − L(θ) = E ℓ_0(θ0) − E ℓ_0(θ) = E(Y0 log f_{θ0}^0 − f_{θ0}^0) − E(Y0 log f_θ^0 − f_θ^0)
  = E[f_{θ0}^0 (log f_{θ0}^0 − log f_θ^0)] − E(f_{θ0}^0 − f_θ^0),

since E(Y0 | F−1) = f_{θ0}^0. By applying the mean value theorem to the function x → log x defined on [c, +∞[, there exists ξ between f_{θ0}^0 and f_θ^0 such that

log f_{θ0}^0 − log f_θ^0 = (1/ξ)(f_{θ0}^0 − f_θ^0).

Hence, it comes that

L(θ0) − L(θ) = E[(f_{θ0}^0/ξ)(f_{θ0}^0 − f_θ^0)] − E(f_{θ0}^0 − f_θ^0) = E[((f_{θ0}^0/ξ) − 1)(f_{θ0}^0 − f_θ^0)] = E[(1/ξ)(f_{θ0}^0 − ξ)(f_{θ0}^0 − f_θ^0)].

Since θ ≠ θ0, it follows from assumption Id(Θ) that f_{θ0}^0 ≠ f_θ^0 with positive probability. Moreover:
• if f_{θ0}^0 < f_θ^0, then f_{θ0}^0 < ξ < f_θ^0 and hence (1/ξ)(f_{θ0}^0 − ξ)(f_{θ0}^0 − f_θ^0) > 0;
• if f_{θ0}^0 > f_θ^0, then f_θ^0 < ξ < f_{θ0}^0 and hence (1/ξ)(f_{θ0}^0 − ξ)(f_{θ0}^0 − f_θ^0) > 0.

We deduce that (1/ξ)(f_{θ0}^0 − ξ)(f_{θ0}^0 − f_θ^0) ≥ 0 a.s., with strict inequality on the event {f_{θ0}^0 ≠ f_θ^0} of positive probability. Hence L(θ0) − L(θ) = E[(1/ξ)(f_{θ0}^0 − ξ)(f_{θ0}^0 − f_θ^0)] > 0. Thus, the function θ → L(θ) has a unique maximum at θ0.

(i), (ii) and standard arguments lead to the consistency of θ̂n(T_{1,n}).

The following lemmas are needed to prove Theorem 3.2.

Lemma 7.1. Let (jn)_{n≥1} and (kn)_{n≥1} be two integer-valued sequences such that (jn)_{n≥1} is increasing, jn → ∞ and kn − jn → ∞ as n → ∞. For n ≥ 1 and any segment T = T_{jn,kn} ⊂ {1, . . . , n}, it holds under the assumptions of Theorem 3.2 that

E[(1/√(kn − jn)) ‖∂L̂n(T, θ)/∂θ − ∂Ln(T, θ)/∂θ‖_Θ] → 0 as n → ∞.
Proof. Let i ∈ {1, . . . , d}. We have

∂Ln(T, θ)/∂θ_i = ∑_{t∈T} ∂(Yt log f_θ^t − f_θ^t)/∂θ_i = ∑_{t∈T} (Yt (1/f_θ^t) ∂f_θ^t/∂θ_i − ∂f_θ^t/∂θ_i) = ∑_{t∈T} ∂ℓ_t(θ)/∂θ_i

and

∂L̂n(T, θ)/∂θ_i = ∑_{t∈T} (Yt (1/f̂_θ^t) ∂f̂_θ^t/∂θ_i − ∂f̂_θ^t/∂θ_i) = ∑_{t∈T} ∂ℓ̂_t(θ)/∂θ_i.

Hence

|∂ℓ̂_t(θ)/∂θ_i − ∂ℓ_t(θ)/∂θ_i| = |Yt (1/f̂_θ^t) ∂f̂_θ^t/∂θ_i − ∂f̂_θ^t/∂θ_i − Yt (1/f_θ^t) ∂f_θ^t/∂θ_i + ∂f_θ^t/∂θ_i|
  ≤ |Yt| |(1/f̂_θ^t) ∂f̂_θ^t/∂θ_i − (1/f_θ^t) ∂f_θ^t/∂θ_i| + |∂f̂_θ^t/∂θ_i − ∂f_θ^t/∂θ_i|.   (21)

Using the relation |a₁b₁ − a₂b₂| ≤ |a₁ − a₂||b₂| + |b₁ − b₂||a₁|, we have

|(1/f̂_θ^t) ∂f̂_θ^t/∂θ_i − (1/f_θ^t) ∂f_θ^t/∂θ_i| ≤ |1/f̂_θ^t − 1/f_θ^t| |∂f_θ^t/∂θ_i| + (1/f̂_θ^t) |∂f̂_θ^t/∂θ_i − ∂f_θ^t/∂θ_i|
  ≤ (1/c²)|f̂_θ^t − f_θ^t| |∂f_θ^t/∂θ_i| + (1/c)|∂f̂_θ^t/∂θ_i − ∂f_θ^t/∂θ_i|.

Hence, (21) implies

‖∂ℓ̂_t(θ)/∂θ_i − ∂ℓ_t(θ)/∂θ_i‖_Θ ≤ |Yt| ((1/c²) ‖f̂_θ^t − f_θ^t‖_Θ ‖∂f_θ^t/∂θ_i‖_Θ + (1/c) ‖∂f̂_θ^t/∂θ_i − ∂f_θ^t/∂θ_i‖_Θ) + ‖∂f̂_θ^t/∂θ_i − ∂f_θ^t/∂θ_i‖_Θ
  ≤ C |Yt| ‖f̂_θ^t − f_θ^t‖_Θ ‖∂f_θ^t/∂θ_i‖_Θ + C(1 + |Yt|) ‖∂f̂_θ^t/∂θ_i − ∂f_θ^t/∂θ_i‖_Θ.

Let r > 0. Using the Minkowski and Hölder inequalities, it holds that

(E ‖∂ℓ̂_t(θ)/∂θ_i − ∂ℓ_t(θ)/∂θ_i‖_Θ^r)^{1/r}
  ≤ C (E[|Yt|^r ‖f̂_θ^t − f_θ^t‖_Θ^r ‖∂f_θ^t/∂θ_i‖_Θ^r])^{1/r} + C (E[(1 + |Yt|)^r ‖∂f̂_θ^t/∂θ_i − ∂f_θ^t/∂θ_i‖_Θ^r])^{1/r}
  ≤ C (E|Yt|^{3r})^{1/3r} (E ‖∂f_θ^t/∂θ_i‖_Θ^{3r})^{1/3r} (E ‖f̂_θ^t − f_θ^t‖_Θ^{3r})^{1/3r} + C (E(1 + |Yt|)^{2r})^{1/2r} (E ‖∂f̂_θ^t/∂θ_i − ∂f_θ^t/∂θ_i‖_Θ^{2r})^{1/2r}
  = C ‖Yt‖_{3r} ‖ ‖∂f_θ^t/∂θ_i‖_Θ ‖_{3r} ‖ ‖f̂_θ^t − f_θ^t‖_Θ ‖_{3r} + C ‖1 + |Yt|‖_{2r} ‖ ‖∂f̂_θ^t/∂θ_i − ∂f_θ^t/∂θ_i‖_Θ ‖_{2r}.

But we have ‖Yt‖_{3r} = C < ∞ and ‖1 + |Yt|‖_{2r} < ∞. Moreover,

‖∂f_θ^t/∂θ_i‖_Θ ≤ ‖∂fθ(0)/∂θ_i‖_Θ + ‖∂f_θ^t/∂θ_i − ∂fθ(0)/∂θ_i‖_Θ ≤ C + ∑_{j≥1} α_j^{(1)}(Θ) |Yt−j|.

Thus,

‖ ‖∂f_θ^t/∂θ_i‖_Θ ‖_{3r} ≤ C + ‖Y0‖_{3r} ∑_{j≥1} α_j^{(1)}(Θ) ≤ C (1 + ∑_{j≥1} α_j^{(1)}(Θ)) < ∞.

We also have ‖f̂_θ^t − f_θ^t‖_Θ ≤ ∑_{j≥t} α_j^{(0)}(Θ) |Yt−j|. Hence

‖ ‖f̂_θ^t − f_θ^t‖_Θ ‖_{3r} = (E ‖f̂_θ^t − f_θ^t‖_Θ^{3r})^{1/3r} ≤ C ∑_{j≥t} α_j^{(0)}(Θ).

The same argument gives

‖ ‖∂f̂_θ^t/∂θ_i − ∂f_θ^t/∂θ_i‖_Θ ‖_{2r} = (E ‖∂f̂_θ^t/∂θ_i − ∂f_θ^t/∂θ_i‖_Θ^{2r})^{1/2r} ≤ C ∑_{j≥t} α_j^{(1)}(Θ).

Hence,

(E ‖∂ℓ̂_t(θ)/∂θ_i − ∂ℓ_t(θ)/∂θ_i‖_Θ^r)^{1/r} ≤ C ∑_{j≥t} (α_j^{(0)}(Θ) + α_j^{(1)}(Θ)).

Therefore, we have (with r = 1)

E[(1/√(kn − jn)) ‖∂L̂n(T, θ)/∂θ_i − ∂Ln(T, θ)/∂θ_i‖_Θ]
  ≤ (1/√(kn − jn)) ∑_{t∈T} E ‖∂ℓ̂_t(θ)/∂θ_i − ∂ℓ_t(θ)/∂θ_i‖_Θ
  ≤ C (1/√(kn − jn)) ∑_{t=jn}^{kn} ∑_{j≥t} (α_j^{(0)}(Θ) + α_j^{(1)}(Θ))
  ≤ C (1/√(kn − jn)) ∑_{t=jn}^{kn} [ ∑_{j=t}^{kn} (α_j^{(0)}(Θ) + α_j^{(1)}(Θ)) + ∑_{j≥kn} (α_j^{(0)}(Θ) + α_j^{(1)}(Θ)) ]
  ≤ C (1/√(kn − jn)) ∑_{j=jn}^{kn} (j − jn + 1)(α_j^{(0)}(Θ) + α_j^{(1)}(Θ)) + C √(kn − jn) ∑_{j≥kn} (α_j^{(0)}(Θ) + α_j^{(1)}(Θ)).

Splitting the first sum at jn + log(kn − jn), and using √(kn − jn) ≤ √j for j ≥ kn, it comes that

E[(1/√(kn − jn)) ‖∂L̂n(T, θ)/∂θ_i − ∂Ln(T, θ)/∂θ_i‖_Θ]
  ≤ C ((1 + log(kn − jn))/√(kn − jn)) ∑_{j≥1} (α_j^{(0)}(Θ) + α_j^{(1)}(Θ)) + C ∑_{j≥jn+log(kn−jn)} √j (α_j^{(0)}(Θ) + α_j^{(1)}(Θ)) + C ∑_{j≥kn} √j (α_j^{(0)}(Θ) + α_j^{(1)}(Θ))
  → 0 as n → +∞,

since log(kn − jn)/√(kn − jn) → 0, while the two tails are remainders of the series ∑_{j≥1} √j (α_j^{(0)}(Θ) + α_j^{(1)}(Θ)), which converges by (10) and (11), taken from indices tending to infinity. This holds for any coordinate i = 1, . . . , d, and completes the proof of the lemma.
Lemma 7.2. Let (jn)_{n≥1} and (kn)_{n≥1} be two integer-valued sequences such that (jn)_{n≥1} is increasing, jn → ∞ and kn − jn → ∞ as n → ∞. For n ≥ 1 and any segment T = T_{jn,kn} ⊂ {1, . . . , n}, it holds under the assumptions of Theorem 3.2 that:
(i) ‖(1/(kn − jn)) ∂²L̂n(T, θ)/∂θ∂θ′ − E(∂²ℓ_0(θ)/∂θ∂θ′)‖_Θ → 0 a.s. as n → ∞;
(ii) ‖(1/(kn − jn)) ∑_{t∈T} (1/f̂_θ^t)(∂f̂_θ^t/∂θ)(∂f̂_θ^t/∂θ)′ − E[(1/f_θ^0)(∂f_θ^0/∂θ)(∂f_θ^0/∂θ)′]‖_Θ → 0 a.s. as n → ∞.

Proof. (i) For i, j ∈ {1, . . . , d}, we have

∂²ℓ_t(θ)/∂θ_i∂θ_j = ∂/∂θ_j [ (Yt (1/f_θ^t) − 1) ∂f_θ^t/∂θ_i ]
  = Yt [ −(1/(f_θ^t)²) (∂f_θ^t/∂θ_j)(∂f_θ^t/∂θ_i) + (1/f_θ^t) ∂²f_θ^t/∂θ_i∂θ_j ] − ∂²f_θ^t/∂θ_i∂θ_j
  = −(Yt/(f_θ^t)²) (∂f_θ^t/∂θ_i)(∂f_θ^t/∂θ_j) + (Yt/f_θ^t − 1) ∂²f_θ^t/∂θ_i∂θ_j.   (22)

We will show that E‖∂²ℓ_t(θ)/∂θ_i∂θ_j‖_Θ < +∞. From Hölder's inequality, we have

E ‖∂²ℓ_t(θ)/∂θ_i∂θ_j‖_Θ ≤ (1/c²) ‖Yt‖_3 ‖ ‖∂f_θ^t/∂θ_i‖_Θ ‖_3 ‖ ‖∂f_θ^t/∂θ_j‖_Θ ‖_3 + C (‖Yt‖_2 + 1) ‖ ‖∂²f_θ^t/∂θ_i∂θ_j‖_Θ ‖_2.

But we have ‖Yt‖_3 = ‖Y0‖_3 < ∞ and ‖Yt‖_2 < ∞. Moreover,

‖ ‖∂f_θ^t/∂θ_i‖_Θ ‖_3 ≤ ‖∂fθ(0)/∂θ_i‖_Θ + ‖ ‖∂f_θ^t/∂θ_i − ∂fθ(0)/∂θ_i‖_Θ ‖_3 ≤ ‖∂fθ(0)/∂θ_i‖_Θ + ‖Y0‖_3 ∑_{k≥1} α_k^{(1)}(Θ) < +∞.

Similarly, we have ‖ ‖∂f_θ^t/∂θ_j‖_Θ ‖_3 < +∞. Using the same argument, we obtain

‖ ‖∂²f_θ^t/∂θ_i∂θ_j‖_Θ ‖_2 ≤ C + ‖Y0‖_2 ∑_{k≥1} α_k^{(2)}(Θ) < +∞.
Hence, $\mathbb{E}\big[\big\|\frac{\partial^2}{\partial\theta_i\partial\theta_j}\ell_t(\theta)\big\|_\Theta\big] < +\infty$. Thus, by the stationarity and ergodicity of the sequence $\big(\frac{\partial^2}{\partial\theta_i\partial\theta_j}\ell_t(\theta)\big)_{t\in\mathbb Z}$ and the uniform strong law of large numbers, it holds that
$$\Big\|\frac{1}{k_n-j_n}\sum_{t\in T}\frac{\partial^2}{\partial\theta_i\partial\theta_j}\ell_t(\theta) - \mathbb{E}\Big[\frac{\partial^2}{\partial\theta_i\partial\theta_j}\ell_0(\theta)\Big]\Big\|_\Theta \xrightarrow[n\to+\infty]{a.s.} 0.$$
This completes the proof of (i).
(ii) goes along the same lines as (i) and as in Lemma 7.1.

Proof of Theorem 3.2. Here again, without loss of generality, we carry out the proof with $T_{j_n,k_n} = T_{1,n}$. Recall that $\Theta\subset\mathbb R^d$. Let $T\subset\{1,\ldots,n\}$; for any $\theta\in\Theta$ and $i=1,\ldots,d$, applying a Taylor expansion to the function $\theta\mapsto\frac{\partial}{\partial\theta_i}L_n(T,\theta)$, there exists $\theta_{n,i}$ between $\theta$ and $\theta_0$ such that
$$\frac{\partial}{\partial\theta_i}L_n(T,\theta) = \frac{\partial}{\partial\theta_i}L_n(T,\theta_0) + \frac{\partial^2}{\partial\theta\partial\theta_i}L_n(T,\theta_{n,i})\cdot(\theta-\theta_0).$$
Denote
$$G_n(T,\theta) = -\frac{1}{\mathrm{Card}(T)}\Big(\frac{\partial^2}{\partial\theta\partial\theta_i}L_n(T,\theta_{n,i})\Big)_{1\le i\le d}.$$
It comes that
$$\mathrm{Card}(T)\,G_n(T,\theta)\cdot(\theta-\theta_0) = \frac{\partial}{\partial\theta}L_n(T,\theta_0) - \frac{\partial}{\partial\theta}L_n(T,\theta). \qquad(23)$$
By applying (23) with $\theta = \widehat\theta_n(T)$, we obtain
$$\mathrm{Card}(T)\,G_n(T,\widehat\theta_n(T))\cdot(\widehat\theta_n(T)-\theta_0) = \frac{\partial}{\partial\theta}L_n(T,\theta_0) - \frac{\partial}{\partial\theta}L_n(T,\widehat\theta_n(T)). \qquad(24)$$
(24) holds for any $T\subset\{1,\ldots,n\}$, thus
$$\sqrt n\,G_n(T_{1,n},\widehat\theta_n(T_{1,n}))\cdot(\widehat\theta_n(T_{1,n})-\theta_0) = \frac{1}{\sqrt n}\Big[\frac{\partial}{\partial\theta}L_n(T_{1,n},\theta_0) - \frac{\partial}{\partial\theta}L_n(T_{1,n},\widehat\theta_n(T_{1,n}))\Big]. \qquad(25)$$
We can rewrite (25) as
$$\begin{aligned}
\sqrt n\,G_n(T_{1,n},\widehat\theta_n(T_{1,n}))(\widehat\theta_n(T_{1,n})-\theta_0) &= \frac{1}{\sqrt n}\frac{\partial L_n(T_{1,n},\theta_0)}{\partial\theta} + \frac{1}{\sqrt n}\Big[\frac{\partial\widehat L_n(T_{1,n},\widehat\theta_n(T_{1,n}))}{\partial\theta} - \frac{\partial L_n(T_{1,n},\widehat\theta_n(T_{1,n}))}{\partial\theta}\Big]\\
&\quad - \frac{1}{\sqrt n}\frac{\partial\widehat L_n(T_{1,n},\widehat\theta_n(T_{1,n}))}{\partial\theta}.
\end{aligned}$$
For $n$ large enough, $\frac{\partial}{\partial\theta}\widehat L_n(T_{1,n},\widehat\theta_n(T_{1,n})) = 0$, because $\widehat\theta_n(T_{1,n})$ is a local maximum of $\theta\mapsto\widehat L_n(T_{1,n},\theta)$.
Moreover, according to Lemma 7.1, it holds that
$$\mathbb{E}\Big\|\frac{1}{\sqrt n}\Big(\frac{\partial}{\partial\theta}L_n(T_{1,n},\widehat\theta_n(T_{1,n})) - \frac{\partial}{\partial\theta}\widehat L_n(T_{1,n},\widehat\theta_n(T_{1,n}))\Big)\Big\| \le \mathbb{E}\Big\|\frac{1}{\sqrt n}\Big(\frac{\partial}{\partial\theta}L_n(T_{1,n},\theta) - \frac{\partial}{\partial\theta}\widehat L_n(T_{1,n},\theta)\Big)\Big\|_\Theta \xrightarrow[n\to+\infty]{} 0.$$
So, for $n$ large enough, we have
$$\sqrt n\,G_n(T_{1,n},\widehat\theta_n(T_{1,n}))(\widehat\theta_n(T_{1,n})-\theta_0) = \frac{1}{\sqrt n}\frac{\partial}{\partial\theta}L_n(T_{1,n},\theta_0) + o_P(1). \qquad(26)$$
To complete the proof of Theorem 3.2, we have to show that:
(a) $\big(\frac{\partial}{\partial\theta}\ell_t(\theta_0),\mathcal F_t\big)_{t\in\mathbb Z}$ is a stationary ergodic martingale difference sequence and $\mathbb E\big(\|\frac{\partial}{\partial\theta}\ell_t(\theta_0)\|^2\big)<\infty$;
(b) $\Sigma = -\mathbb E\big(\frac{\partial^2}{\partial\theta\partial\theta'}\ell_0(\theta_0)\big)$ and $G_n(T_{1,n},\widehat\theta_n(T_{1,n})) \xrightarrow[n\to+\infty]{a.s.} \Sigma$;
(c) $\Sigma = \mathbb E\big[\frac{1}{f_{\theta_0}^0}\big(\frac{\partial}{\partial\theta}f_{\theta_0}^0\big)\big(\frac{\partial}{\partial\theta}f_{\theta_0}^0\big)'\big]$ is invertible.

(a) Recall that $\frac{\partial}{\partial\theta}\ell_t(\theta_0) = \big(\frac{Y_t}{f_{\theta_0}^t}-1\big)\frac{\partial}{\partial\theta}f_{\theta_0}^t$ and $\mathcal F_t = \sigma(Y_s,\,s\le t)$. Since the functions $f_{\theta_0}^t$ and $\frac{\partial}{\partial\theta}f_{\theta_0}^t$ are $\mathcal F_{t-1}$-measurable, we have
$$\mathbb E\Big[\frac{\partial}{\partial\theta}\ell_t(\theta_0)\,\Big|\,\mathcal F_{t-1}\Big] = \Big(\frac{1}{f_{\theta_0}^t}\mathbb E(Y_t|\mathcal F_{t-1}) - 1\Big)\frac{\partial}{\partial\theta}f_{\theta_0}^t = 0.$$
Moreover, since $|Y_t|$ and $\|\frac{\partial}{\partial\theta}f_\theta^t\|$ have moments of any order, we have
$$\mathbb E\Big\|\frac{\partial}{\partial\theta}\ell_t(\theta_0)\Big\|^2 \le \mathbb E\Big[\Big(\frac{|Y_t|}{c}+1\Big)^2\Big\|\frac{\partial}{\partial\theta}f_\theta^t\Big\|_\Theta^2\Big] < \infty.$$
(b) According to (22), we have
$$\frac{\partial^2}{\partial\theta\partial\theta'}\ell_t(\theta) = -\frac{Y_t}{(f_\theta^t)^2}\frac{\partial f_\theta^t}{\partial\theta}\frac{\partial f_\theta^t}{\partial\theta}' + \Big(\frac{Y_t}{f_\theta^t}-1\Big)\frac{\partial^2}{\partial\theta\partial\theta'}f_\theta^t. \qquad(27)$$
But by using the same argument as in (a), we obtain
$$\mathbb E\Big[\Big(\frac{Y_0}{f_{\theta_0}^0}-1\Big)\frac{\partial^2}{\partial\theta\partial\theta'}f_{\theta_0}^0\,\Big|\,\mathcal F_{-1}\Big] = 0.$$
Hence, (27) implies
$$\mathbb E\Big[\frac{\partial^2}{\partial\theta\partial\theta'}\ell_0(\theta_0)\Big] = -\mathbb E\Big[\frac{Y_0}{(f_{\theta_0}^0)^2}\frac{\partial f_{\theta_0}^0}{\partial\theta}\frac{\partial f_{\theta_0}^0}{\partial\theta}'\Big] = -\mathbb E\Big[\frac{1}{f_{\theta_0}^0}\frac{\partial f_{\theta_0}^0}{\partial\theta}\frac{\partial f_{\theta_0}^0}{\partial\theta}'\Big] = -\Sigma,$$
where the last equality uses $\mathbb E(Y_0|\mathcal F_{-1}) = f_{\theta_0}^0$.
Moreover, recall that
$$G_n(T_{1,n},\widehat\theta_n(T_{1,n})) = -\frac{1}{n}\Big(\frac{\partial^2}{\partial\theta\partial\theta_i}L_n(T_{1,n},\theta_{n,i})\Big)_{1\le i\le d} = -\Big(\frac{1}{n}\sum_{t=1}^n\frac{\partial^2}{\partial\theta\partial\theta_i}\ell_t(\theta_{n,i})\Big)_{1\le i\le d}.$$
For any $j=1,\ldots,d$, we have
$$\begin{aligned}
\Big|\frac{1}{n}\sum_{t=1}^n\frac{\partial^2}{\partial\theta_j\partial\theta_i}\ell_t(\theta_{n,i}) - \mathbb E\Big[\frac{\partial^2}{\partial\theta_j\partial\theta_i}\ell_0(\theta_0)\Big]\Big|
&\le \Big|\frac{1}{n}\sum_{t=1}^n\frac{\partial^2}{\partial\theta_j\partial\theta_i}\ell_t(\theta_{n,i}) - \mathbb E\Big[\frac{\partial^2}{\partial\theta_j\partial\theta_i}\ell_0(\theta_{n,i})\Big]\Big|\\
&\quad + \Big|\mathbb E\Big[\frac{\partial^2}{\partial\theta_j\partial\theta_i}\ell_0(\theta_{n,i})\Big] - \mathbb E\Big[\frac{\partial^2}{\partial\theta_j\partial\theta_i}\ell_0(\theta_0)\Big]\Big|\\
&\le \Big|\mathbb E\Big[\frac{\partial^2}{\partial\theta_j\partial\theta_i}\ell_0(\theta_{n,i})\Big] - \mathbb E\Big[\frac{\partial^2}{\partial\theta_j\partial\theta_i}\ell_0(\theta_0)\Big]\Big|\\
&\quad + \Big\|\frac{1}{n}\sum_{t=1}^n\frac{\partial^2}{\partial\theta_j\partial\theta_i}\ell_t(\theta) - \mathbb E\Big[\frac{\partial^2}{\partial\theta_j\partial\theta_i}\ell_0(\theta)\Big]\Big\|_\Theta \xrightarrow[n\to+\infty]{a.s.} 0,
\end{aligned}$$
where the first term vanishes since $\theta_{n,i}$ lies between $\widehat\theta_n(T_{1,n})$ and $\theta_0$ and thus converges a.s. to $\theta_0$, and the second term vanishes by Lemma 7.2. This holds for any $1\le i,j\le d$. Thus,
$$G_n(T_{1,n},\widehat\theta_n(T_{1,n})) = -\Big(\frac{1}{n}\sum_{t=1}^n\frac{\partial^2}{\partial\theta\partial\theta_i}\ell_t(\theta_{n,i})\Big)_{1\le i\le d} \xrightarrow[n\to+\infty]{a.s.} -\mathbb E\Big[\frac{\partial^2}{\partial\theta\partial\theta'}\ell_0(\theta_0)\Big] = \mathbb E\Big[\frac{1}{f_{\theta_0}^0}\frac{\partial f_{\theta_0}^0}{\partial\theta}\frac{\partial f_{\theta_0}^0}{\partial\theta}'\Big] = \Sigma.$$
(c) If $U$ is a non-zero vector of $\mathbb R^d$, then according to assumption (Var), $U'\frac{\partial}{\partial\theta}f_{\theta_0}^0 \neq 0$ a.s. Hence
$$U'\Sigma U = \mathbb E\Big[\frac{1}{f_{\theta_0}^0}\Big(U'\frac{\partial}{\partial\theta}f_{\theta_0}^0\Big)^2\Big] > 0.$$
Thus $\Sigma$ is positive definite.
From (a), applying the central limit theorem for stationary ergodic martingale difference sequences, it follows that
$$\frac{1}{\sqrt n}\frac{\partial}{\partial\theta}L_n(T_{1,n},\theta_0) = \frac{1}{\sqrt n}\sum_{t=1}^n\frac{\partial}{\partial\theta}\ell_t(\theta_0) \xrightarrow[n\to+\infty]{\mathcal D} \mathcal N\Big(0,\,\mathbb E\Big[\frac{\partial}{\partial\theta}\ell_0(\theta_0)\frac{\partial}{\partial\theta}\ell_0(\theta_0)'\Big]\Big). \qquad(28)$$
Recall that for $i=1,\ldots,d$, $\frac{\partial}{\partial\theta_i}\ell_t(\theta) = \big(\frac{Y_t}{f_\theta^t}-1\big)\frac{\partial}{\partial\theta_i}f_\theta^t$. For $1\le i,j\le d$, we have
$$\mathbb E\Big[\frac{\partial}{\partial\theta_i}\ell_t(\theta_0)\times\frac{\partial}{\partial\theta_j}\ell_t(\theta_0)\Big] = \mathbb E\Big[\mathbb E\Big(\Big(\frac{Y_t}{f_{\theta_0}^t}-1\Big)^2\,\Big|\,\mathcal F_{t-1}\Big)\times\frac{\partial}{\partial\theta_i}f_{\theta_0}^t\times\frac{\partial}{\partial\theta_j}f_{\theta_0}^t\Big].$$
We have
$$\begin{aligned}
\mathbb E\Big[\Big(\frac{Y_t}{f_{\theta_0}^t}-1\Big)^2\,\Big|\,\mathcal F_{t-1}\Big] &= \frac{1}{(f_{\theta_0}^t)^2}\mathbb E(Y_t^2|\mathcal F_{t-1}) - \frac{2}{(f_{\theta_0}^t)^2}\times f_{\theta_0}^t\,\mathbb E(Y_t|\mathcal F_{t-1}) + 1 = \frac{1}{(f_{\theta_0}^t)^2}\mathbb E(Y_t^2|\mathcal F_{t-1}) - 1\\
&= \frac{1}{(f_{\theta_0}^t)^2}\big(\mathrm{Var}(Y_t|\mathcal F_{t-1}) + (\mathbb E(Y_t|\mathcal F_{t-1}))^2\big) - 1 = \frac{1}{(f_{\theta_0}^t)^2}\big(f_{\theta_0}^t + (f_{\theta_0}^t)^2\big) - 1 = \frac{1}{f_{\theta_0}^t}.
\end{aligned}$$
Thus,
$$\mathbb E\Big[\frac{\partial}{\partial\theta_i}\ell_t(\theta_0)\times\frac{\partial}{\partial\theta_j}\ell_t(\theta_0)\Big] = \mathbb E\Big[\frac{1}{f_{\theta_0}^t}\frac{\partial f_{\theta_0}^t}{\partial\theta_i}\times\frac{\partial f_{\theta_0}^t}{\partial\theta_j}\Big].$$
Hence,
$$\mathbb E\Big[\frac{\partial}{\partial\theta}\ell_t(\theta_0)\frac{\partial}{\partial\theta}\ell_t(\theta_0)'\Big] = \mathbb E\Big[\frac{1}{f_{\theta_0}^t}\frac{\partial f_{\theta_0}^t}{\partial\theta}\frac{\partial f_{\theta_0}^t}{\partial\theta}'\Big] = \Sigma.$$
Thus, (28) becomes
$$\frac{1}{\sqrt n}\frac{\partial}{\partial\theta}L_n(T_{1,n},\theta_0) = \frac{1}{\sqrt n}\sum_{t=1}^n\frac{\partial}{\partial\theta}\ell_t(\theta_0) \xrightarrow[n\to+\infty]{\mathcal D} \mathcal N(0,\Sigma). \qquad(29)$$
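The key identity above, $\mathbb E\big[(Y_t/f_{\theta_0}^t-1)^2\,|\,\mathcal F_{t-1}\big] = 1/f_{\theta_0}^t$, only uses the fact that a Poisson variable has conditional variance equal to its conditional mean. A quick Monte Carlo sanity check (purely illustrative, with an arbitrary intensity $\lambda=4$ standing in for $f_{\theta_0}^t$):

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 4.0                      # plays the role of the conditional mean f_{theta_0}^t
y = rng.poisson(lam, size=1_000_000)

# E[(Y/lam - 1)^2] should equal Var(Y)/lam^2 = lam/lam^2 = 1/lam
emp = np.mean((y / lam - 1.0) ** 2)
assert abs(emp - 1.0 / lam) < 5e-3
```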
(b) and (c) imply that the matrix $G_n(T_{1,n},\widehat\theta_n(T_{1,n}))$ converges a.s. to $\Sigma$ and is invertible for $n$ large enough. Hence, from (26) and (29), we have
$$\begin{aligned}
\sqrt n\,(\widehat\theta_n(T_{1,n})-\theta_0) &= \big(G_n(T_{1,n},\widehat\theta_n(T_{1,n}))\big)^{-1}\frac{1}{\sqrt n}\frac{\partial}{\partial\theta}L_n(T_{1,n},\theta_0) + o_P(1)\\
&= \Sigma^{-1}\frac{1}{\sqrt n}\frac{\partial}{\partial\theta}L_n(T_{1,n},\theta_0) + o_P(1) \xrightarrow[n\to+\infty]{\mathcal D} \mathcal N(0,\Sigma^{-1}).
\end{aligned}$$
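As an illustration of this asymptotic normality result (a sketch, not the authors' code), consider the linear Poisson autoregression $Y_t\,|\,\mathcal F_{t-1}\sim\mathrm{Poisson}(\omega_0 + a_0 Y_{t-1})$ with hypothetical values $\omega_0=1$, $a_0=0.5$. Since $\ell_t(\theta)=Y_t\log f_\theta^t - f_\theta^t$ is concave in $\theta=(\omega,a)$ for a linear intensity, a damped Newton ascent recovers the likelihood estimator, which lands close to $\theta_0$ for moderate $n$:

```python
import numpy as np

rng = np.random.default_rng(1)
omega0, a0, n = 1.0, 0.5, 5000

# simulate Y_t | F_{t-1} ~ Poisson(omega0 + a0 * Y_{t-1})
y = np.zeros(n, dtype=int)
for t in range(1, n):
    y[t] = rng.poisson(omega0 + a0 * y[t - 1])

X = np.column_stack([np.ones(n - 1), y[:-1]])   # regressors of lambda_t = omega + a*Y_{t-1}
Y = y[1:].astype(float)

theta = np.array([0.5, 0.2])                    # starting point
for _ in range(50):
    lam = X @ theta
    score = X.T @ (Y / lam - 1.0)               # gradient of sum(Y log lam - lam)
    hess = -(X * (Y / lam**2)[:, None]).T @ X   # Hessian (negative definite here)
    step = np.linalg.solve(hess, score)
    while np.any(X @ (theta - step) <= 0):      # damping: keep intensities positive
        step *= 0.5
    theta = theta - step

# the estimates should be close to (omega0, a0) for n = 5000
assert abs(theta[0] - omega0) < 0.3 and abs(theta[1] - a0) < 0.1
```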
Before proving Theorem 4.1, let us prove a preliminary lemma. Under $H_0$, recall
$$\Sigma = \mathbb E\Big[\frac{1}{f_{\theta_0}^0}\frac{\partial}{\partial\theta}f_{\theta_0}^0\frac{\partial}{\partial\theta}f_{\theta_0}^0{}'\Big] = \mathbb E\Big[\frac{\partial}{\partial\theta}\ell_0(\theta_0)\frac{\partial}{\partial\theta}\ell_0(\theta_0)'\Big].$$
Define the statistics
• $C_n = \max_{v_n\le k\le n-v_n} C_{n,k}$ where
$$C_{n,k} = \frac{1}{n^3}\,\frac{k^2(n-k)^2}{q\big(\frac kn\big)^2}\,\big(\widehat\theta_n(T_{1,k})-\widehat\theta_n(T_{k+1,n})\big)'\,\Sigma\,\big(\widehat\theta_n(T_{1,k})-\widehat\theta_n(T_{k+1,n})\big);$$
• $Q_n^{(1)} = \max_{v_n\le k\le n-v_n} Q_{n,k}^{(1)}$ where
$$Q_{n,k}^{(1)} = \frac{k^2}{n}\,\big(\widehat\theta_n(T_{1,k})-\widehat\theta_n(T_{1,n})\big)'\,\Sigma\,\big(\widehat\theta_n(T_{1,k})-\widehat\theta_n(T_{1,n})\big);$$
• $Q_n^{(2)} = \max_{v_n\le k\le n-v_n} Q_{n,k}^{(2)}$ where
$$Q_{n,k}^{(2)} = \frac{(n-k)^2}{n}\,\big(\widehat\theta_n(T_{k+1,n})-\widehat\theta_n(T_{1,n})\big)'\,\Sigma\,\big(\widehat\theta_n(T_{k+1,n})-\widehat\theta_n(T_{1,n})\big).$$
Lemma 7.3. Under the assumptions of Theorem 4.1, as $n\to+\infty$,
(i) $\max_{v_n\le k\le n-v_n}\big|\widehat C_{n,k} - C_{n,k}\big| = o_P(1)$;
(ii) for $j=1,2$, $\max_{v_n\le k\le n-v_n}\big|\widehat Q_{n,k}^{(j)} - Q_{n,k}^{(j)}\big| = o_P(1)$.
Proof. (i) For any $v_n\le k\le n-v_n$, we have as $n\to\infty$
$$\widehat C_{n,k} - C_{n,k} = \frac{1}{n^3}\frac{k^2(n-k)^2}{q\big(\frac kn\big)^2}\,\big(\widehat\theta_n(T_{1,k})-\widehat\theta_n(T_{k+1,n})\big)'\big(\widehat\Sigma_n(u_n)-\Sigma\big)\big(\widehat\theta_n(T_{1,k})-\widehat\theta_n(T_{k+1,n})\big).$$
Hence,
$$\begin{aligned}
\big|\widehat C_{n,k} - C_{n,k}\big| &\le \frac{1}{n^3}\frac{k^2(n-k)^2}{q\big(\frac kn\big)^2}\,\big\|\widehat\Sigma_n(u_n)-\Sigma\big\|\,\big\|\widehat\theta_n(T_{1,k})-\widehat\theta_n(T_{k+1,n})\big\|^2\\
&\le C\,\frac{1}{q\big(\frac kn\big)^2}\frac{k(n-k)}{n^2}\,\big\|\widehat\Sigma_n(u_n)-\Sigma\big\|\,\Big(\big\|\sqrt k\,\big(\widehat\theta_n(T_{1,k})-\theta_0\big)\big\| + \big\|\sqrt{n-k}\,\big(\widehat\theta_n(T_{k+1,n})-\theta_0\big)\big\|\Big)^2\\
&\le C\,\frac{1}{q\big(\frac kn\big)^2}\frac{k(n-k)}{n^2}\,o(1)\,O_P(1).
\end{aligned}$$
Thus, as $n\to\infty$, it holds that
$$\max_{v_n\le k\le n-v_n}\big|\widehat C_{n,k} - C_{n,k}\big| \le o_P(1)\max_{v_n\le k\le n-v_n}\frac{1}{q\big(\frac kn\big)^2}\,\frac kn\Big(1-\frac kn\Big) \le o_P(1)\sup_{0<\tau<1}\Big(\frac{\sqrt{\tau(1-\tau)}}{q(\tau)}\Big)^2 = o_P(1).$$