PARAMETER ESTIMATION FOR CONTINUOUS TIME PROCESSES OBSERVED WITH NOISE
PETER LAKNER and HALINA FRYDMAN New York University Stern School of Business
Abstract: We consider the estimation of a k-dimensional parameter θ that determines the dynamics of an unobserved process {X_t, t ≤ T}. Our observation consists of the integral of X_t plus an additive noise modeled by a Brownian motion, on a continuous time horizon [0, T]. A modified version of the Maximum Likelihood Estimator (MLE) will be defined through a discretization of the parameter space, and the weak consistency of this estimator will be shown under certain conditions. An implication of this result is that the (traditional) MLE is weakly consistent under the same conditions provided that the parameter space is finite. It will be shown that in a special case of a Hidden Markov Model (HMM) all conditions are satisfied.
Keywords: Likelihood function, Maximum Likelihood Estimator, Hidden Markov Model, Harris recurrence, stationary distribution, exponential ergodicity.
Corresponding author: Peter Lakner, New York University Stern School of Business, 44 W. 4th St., Suite 8-61, New York, NY 10012. Phone: 1-212-998-0476. E-mail: [email protected]
1. Introduction

We suppose that an unobservable (hidden) process {X_t(θ), t < ∞} depends on a k-dimensional parameter θ ∈ Θ, where Θ is a compact subset of ℝ^k.

[...]

which converges to zero as T → ∞ by (3.4). The second expression is bounded by

P[ v_k(T) ≥ √T q_k(T), d_k < q_k(T) + ε/2 ] ≤ P[ (1/√T) v_k(T) ≥ d_k − ε/2 ]
which converges to zero as T → ∞ by d_k − ε/2 > 0 and (3.5). Part (b) is a straightforward consequence of (a): if δ^(i) is the single element of Λ(θ, D), then by part (a) lim_{T→∞} P[θ̂_T(D) = δ^(j)] = 0 for all j ≠ i, and the statement follows.

Based on the previous theorem we shall establish the consistency of the estimator θ̂_T(D) when D is sufficiently dense in Θ and T is large. We shall assume here that Θ is a compact subset of ℝ^k; hence for every ξ > 0 there exists a finite set D(ξ) ⊂ Θ such that for any η ∈ Θ there exists a δ ∈ D(ξ) satisfying |δ − η| < ξ (|·| is the Euclidean norm). For every ξ > 0 we fix such a finite set D(ξ). Instead of θ̂_T(D(ξ)) we shall write θ̂_T(ξ). For future reference we formulate the following two additional conditions:

Condition B. For any δ ∈ Θ the relation f(θ, δ) = 0 implies θ = δ.

Condition C. The function f(θ, ·) is continuous on Θ.

Condition B is the "identifiability" condition for the parameter θ. Here follows our consistency result for θ̂_T(ξ).

3.2 Theorem. Suppose that Θ is compact, and Conditions A, B, and C hold. Then for any ε > 0 there exists a ξ_0(ε) = ξ_0 > 0 such that for every ξ < ξ_0 we have

lim_{T→∞} P[ |θ̂_T(ξ) − θ| > ε ] = 0.
(3.6)
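To make the discretization concrete, here is a minimal Python sketch of the construction above, under the assumption that Θ is a box in ℝ^k. The function xi_net builds a finite set D(ξ) that comes within ξ of every η ∈ Θ, and theta_hat maximizes the log-likelihood over that net; the callable log_lik is a hypothetical placeholder for the paper's l_T(δ), which depends on the observed path Y and is not reproduced here.

    import itertools
    import numpy as np

    def xi_net(lower, upper, xi):
        # Finite covering net D(xi) for the box prod_i [lower_i, upper_i]:
        # with axis spacing just under 2*xi/sqrt(k), every point of the box
        # lies within Euclidean distance < xi of some net point.
        lower = np.asarray(lower, dtype=float)
        upper = np.asarray(upper, dtype=float)
        k = lower.size
        h = 1.99 * xi / np.sqrt(k)
        axes = [np.arange(lo, hi + h, h) for lo, hi in zip(lower, upper)]
        # Clip overshooting grid points back onto the box.
        return [np.minimum(np.array(p), upper) for p in itertools.product(*axes)]

    def theta_hat(log_lik, D):
        # Discretized MLE theta_hat_T(xi): maximize l_T over the finite net.
        return max(D, key=log_lik)

For instance, with a hypothetical two-dimensional box one would call theta_hat(log_lik, xi_net([0.1, 0.1], [5.0, 5.0], 0.05)).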
Proof. For every ε > 0 we define

m(θ, ε) = min{ f(θ, η); η ∈ Θ, |θ − η| ≥ ε }.
(3.7)
By Conditions B and C and the compactness of Θ we have m(θ, ε) > 0. We also define for every ξ > 0

M(θ, ξ) = max{ f(θ, η); η ∈ Θ, |η − θ| ≤ ξ }.
(3.8)
Since f(θ, θ) = 0, Condition C implies lim_{ξ→0} M(θ, ξ) = 0. Hence there exists a ξ_0 > 0 such that for all ξ < ξ_0 we have M(θ, ξ) < m(θ, ε). One can easily see that for all ξ < ξ_0 the inequality |θ̂_T(ξ) − θ| > ε implies θ̂_T(ξ) ∈ Λ̄(θ, D(ξ)). Indeed, if |θ̂_T(ξ) − θ| > ε then f(θ, θ̂_T(ξ)) ≥ m(θ, ε) by (3.7). On the other hand, there exists a δ ∈ D(ξ) such that |δ − θ| ≤ ξ, which by (3.8) implies that f(θ, δ) ≤ M(θ, ξ) < m(θ, ε). Hence f(θ, θ̂_T(ξ)) > min{ f(θ, η); η ∈ D(ξ) }, which implies θ̂_T(ξ) ∈ Λ̄(θ, D(ξ)). It follows that

P[ |θ̂_T(ξ) − θ| > ε ] ≤ P[ θ̂_T(ξ) ∈ Λ̄(θ, D(ξ)) ],
and this expression converges to zero as T → ∞ by Theorem 3.1.

3.3 Remark. The quantity ε in (3.6) may be considered the required level of precision for the estimation of θ. In practice the question arises: for a given precision level ε, how do we find a value of ξ such that (3.6) holds? In the above proof ξ_0(ε) depends on θ, which is unknown. However, we can modify the selection of ξ_0 in the following way. Suppose that f(·, ·) is continuous on Θ × Θ. Then we define

m(ε) = min{ f(γ, η); γ, η ∈ Θ, |γ − η| ≥ ε }

and note that by the compactness of Θ, the continuity of f(·, ·), and Condition B we have m(ε) > 0. We also define

M(ξ) = max{ f(γ, η); γ, η ∈ Θ, |η − γ| ≤ ξ },

and notice that under our conditions lim_{ξ→0} M(ξ) = 0. Hence there exists a ξ_0 (depending on ε) such that M(ξ) < m(ε) whenever ξ < ξ_0. Relation (3.6) follows for every ξ < ξ_0 just as in the proof of Theorem 3.2. The difference is that now ξ_0 does not depend on the unknown parameter.
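Remark 3.3 lends itself to a numerical recipe when f(·, ·) can actually be evaluated. The sketch below is a rough illustration under that assumption, not a procedure from the paper: it approximates m(ε) and M(ξ) on a user-supplied finite grid over Θ and halves ξ until M(ξ) < m(ε); the callable f is a placeholder for the function appearing in Conditions A, B, and C.

    import itertools
    import numpy as np

    def choose_xi0(f, grid, eps):
        # Grid version of Remark 3.3:
        #   m(eps) = min f(g, h) over pairs with |g - h| >= eps,
        #   M(xi)  = max f(g, h) over pairs with |g - h| <= xi.
        # Since f(g, g) = 0 and M(xi) -> 0 as xi -> 0, halving xi
        # eventually gives M(xi) < m(eps), and the loop terminates.
        pts = [np.asarray(g, dtype=float) for g in grid]
        pairs = list(itertools.product(pts, repeat=2))
        m_eps = min(f(g, h) for g, h in pairs
                    if np.linalg.norm(g - h) >= eps)   # > 0 by Condition B
        xi = eps
        while max(f(g, h) for g, h in pairs
                  if np.linalg.norm(g - h) <= xi) >= m_eps:
            xi /= 2.0
        return xi

The returned ξ_0 is only as trustworthy as the grid is fine, since the true m(ε) and M(ξ) are extrema over all of Θ × Θ rather than over the grid.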
Let θ̂_T be the Maximum Likelihood Estimator (MLE), that is, θ̂_T = arg max{ l_T(δ); δ ∈ Θ }.

3.4 Theorem. If Θ is a finite set and Conditions A and B are satisfied, then the MLE is weakly consistent, i.e., we have

lim_{T→∞} P[ θ̂_T = θ ] = 1.

Proof. If Θ is finite then Condition C is obviously satisfied. For every ξ > 0 we can select D(ξ) = Θ, and the statement follows from (3.6).

4. A Hidden Markov Model.

We are going to apply the results of the previous section in the following situation. Suppose that u_t(θ) is a Markov process with state space {0, 1}, independent of the Brownian motion w. We denote the transition rates from 0 to 1 and from 1 to 0 by θ_1 and θ_2, respectively. The hidden process will be X_t(θ) = θ_3 u_t(θ), where θ_3 is another parameter, and the observation Y is given by (2.3). The unknown parameter is the three-dimensional θ = (θ_1, θ_2, θ_3); we shall estimate all three parameters
simultaneously. We assume that the initial distribution (P(u_0 = 0), P(u_0 = 1)) does not depend on the parameters, and θ ∈ Θ where Θ is a compact subset of (0, ∞) × (0, ∞) ×
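To fix ideas before the analysis, here is a minimal Python sketch that simulates a path of (u, Y) with an Euler scheme: u jumps from 0 to 1 at rate θ_1 and from 1 to 0 at rate θ_2, the hidden signal is X_t = θ_3 u_t, and Y accumulates the integral of X plus Brownian noise as in (2.3). The step size and the example parameter values at the end are hypothetical choices made only so the sketch runs.

    import numpy as np

    def simulate_hmm(theta, T, dt=1e-3, rng=None):
        # Simulate (u, Y) on [0, T]: u is the {0, 1}-valued chain with
        # jump rates theta[0] (0 -> 1) and theta[1] (1 -> 0), and
        # Y_t = integral_0^t theta[2] * u_s ds + w_t, w a Brownian motion.
        rng = np.random.default_rng(rng)
        th1, th2, th3 = theta
        n = int(T / dt)
        u = np.empty(n + 1, dtype=int)
        u[0] = rng.integers(0, 2)        # initial law free of the parameters
        Y = np.zeros(n + 1)
        for i in range(n):
            rate = th1 if u[i] == 0 else th2
            if rng.random() < rate * dt:     # P(jump in dt) ~ rate * dt
                u[i + 1] = 1 - u[i]
            else:
                u[i + 1] = u[i]
            Y[i + 1] = Y[i] + th3 * u[i] * dt + np.sqrt(dt) * rng.standard_normal()
        return u, Y

    # Hypothetical parameter values, for illustration only:
    # u, Y = simulate_hmm(theta=(1.0, 2.0, 0.5), T=100.0)

The first-order jump probability rate·dt is adequate for small dt; an exact alternative would draw exponential holding times with the current rate.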