Nonlinear rational expectation models, asymptotic variance vs. asymptotic bias, and choice of instruments Stanislav Anatolyev∗ New Economic School March 19, 2003
Abstract When estimating and testing nonlinear rational expectations models, applied researchers complain of lack of guides in instrument selection. For the one- and two-period one-return Hansen–Singleton models, we derive exact expressions for asymptotic variances and biases of various moment condition based estimators for various choices of instrument set. We then numerically evaluate these expressions using a dynamic lognormal specification calibrated to US data. Overall, the results reveal that applied researchers do not always form instrument sets judiciously and potentially can seriously improve the quality of their findings. Our results differ sharply for the one- and two-period models proving that the presence of serial correlation should be taken into account when making decisions about the instrument vector. At the same time, alternative estimators developed recently to improve finite sample behavior of GMM, are preferable to the most popular two step GMM estimator. By considering the trade-off between the first order asymptotic variance and second order asymptotic bias, we give concrete recommendations of preferable instrument sets for plausible parameter combinations and sample sizes.
∗ Address: Stanislav Anatolyev, New Economic School, Nakhimovsky Prospekt, 47, Moscow, 117418 Russia. E-mail:
[email protected]. Research for this paper was supported in part by a fellowship from the Economics Education and Research Consortium, with funds provided by the Government of Sweden through the Eurasia Foundation. The opinions expressed in this paper are those of the author and do not necessarily reflect the views of the Government of Sweden, the Eurasia Foundation, or any other member of the Economics Education and Research Consortium.
1 Introduction
The instrumental variables method (also known as GMM) for nonlinear consumption-based capital asset pricing models (C-CAPM), introduced by Hansen and Singleton (1982), has become a convenient way of estimating and testing nonlinear rational expectations models, along with the method of maximum likelihood (Hansen and Singleton 1983) and the log-linearization popular in the Real Business Cycles literature. All three approaches raise serious econometric issues. Maximum likelihood estimation requires auxiliary distributional specifications, which has unpleasant consequences such as possible inconsistency due to density misspecification. The literature on log-linearization has not yet come to a single conclusion about its relative benefits and shortcomings (Attanasio and Low 2000, Carroll 2001). When a researcher decides to take the classical approach and apply GMM to the original system of Euler equations, he or she is faced with the need to form an instrument set. This is an important step: carelessness in its implementation can lead to unstable results or even reverse them. However, applied researchers often complain of a lack of guidance in selecting instruments in such models. In this paper, we intend to partially fill this gap.
There are several extensive Monte Carlo studies of finite sample performance of GMM estimators in nonlinear rational expectations models: Tauchen (1986), Kocherlakota (1990), Mao (1990), Hansen, Heaton and Yaron (1996), Smith (1999). These studies have resulted in a number of stylized facts that applied researchers use as guides in instrument selection. For example, in a recent paper Weber (2002, p. 301), after referring to Tauchen (1986) and Mao (1990), says that “several studies of the properties of GMM estimators have found that they perform better when the instrument set includes relatively few lags of the variables which appear to the equation to be estimated. 
⟨· · ·⟩ There is also evidence that the instrument set should contain lagged values of the variables which appear in the equation to be estimated.”
The next section provides examples of numerous papers from the last two decades in which various versions of a CAPM are estimated or simulated. Most of the papers are applied; some are theoretical, with a section devoted to Monte Carlo simulations or an empirical application. An analysis of how instrument sets are formed in those papers leads to the following general observations. Applied researchers typically (1) include a constant even when they do not say so explicitly; (2) include a relatively small number of lags of potential instruments; (3) employ as potential instruments all variables that figure in the moment function; (4) tend to include an equal number of lags of each potential instrument; (5) include these variables in levels, occasionally in logs, and very rarely use other nonlinear functions; (6) run GMM for several alternative sets of instruments to see how robust the results are, sometimes finding them fragile and sometimes insensitive to the instrument choice; (7) do not tend to tailor the number of instruments to whether the errors are serially correlated.
Some of these tendencies can be justified by econometric theory, experience and stylized evidence coming from earlier simulation studies. For example, (1) can be justified by the fact that a constant strongly identifies parameters in which the moment condition is linear (see Stock and Wright 2000); (6) is justified by the fact that parameter consistency is invariant to the choice of instruments if all of them are valid. 
Literature on linear optimal instruments (e.g., Hansen and Singleton 1996; West, Wong and Anatolyev 2002) gives sufficient numerical evidence that the returns in terms of asymptotic variance to using more lags are quickly diminishing when the error is a martingale difference and are diminishing more slowly when there is strong serial correlation. Thus, this literature justifies (2) for stock price models
but does not for models with habit formation, and does not support (7). The literature on nonlinear optimal instruments (Hansen 1985; Hansen, Heaton and Ogaki 1988) also does not support (5), as possible conditional heteroskedasticity may allow efficiency gains from using nonlinear functions of the original instruments. Finally, the literature on weak identification (Stock and Wright 2000) does not entirely support (3) and (4).
The present study, using numerical computations as a tool, aims to shed light on how instrument sets should be formed in CAPM-like models. We specify a plausible law of motion and distribution for the variables involved in such a way that it is possible to find closed form expressions for various moments and calculate exact numerical values for asymptotic variances and asymptotic biases of the estimators of interest. This strategy of investigation differs greatly from Monte Carlo simulations. The two methods are similar in that results can be reported only for a limited number of parameter combinations, or at most as response surfaces based on a still limited number of results. One advantage of our method is fast implementation for any additional parameter combination once all moments of interest are derived and programmed, whereas running a new set of simulations can be very time consuming, especially with nonlinear models and complicated estimators. On the other hand, numerical evaluations are less flexible since we have to stick to a convenient distribution under which the moments have closed forms. When some new feature is of interest, we first have to derive its expression analytically and then program it, whereas it is easy to incorporate a new statistic in a Monte Carlo setup. At the same time, we can tell in advance whether a moment of interest is finite or infinite, whereas in simulations it is easy to end up evaluating non-existent moments. 
One more significant difference between the two strategies is that the simulation results, given a sufficient number of Monte Carlo repetitions, show the whole picture incorporating all features of distributions of statistics of interest no matter what their reason is, while our numerical results give information only on the moments we concentrate on. This difference between the two approaches cannot give unambiguous advantage to either of them, but we regard obtaining more concentrated information more valuable. More concretely, we consider the one- and two-period one-return Hansen–Singleton models estimated by a number of moment condition based estimators after an instrument set is formed. The estimators we compare are first order asymptotically equivalent but differ in their finite sample properties. They are: the two-step GMM estimator (Hansen 1982), the iterated GMM estimator (Hansen, Heaton and Yaron 1996), the Empirical Likelihood (EL) estimator (Qin and Lawless 1994, Imbens 1997) and a smoothed version of the EL estimator (Kitamura 1997, Smith 1997). The two-step GMM is the “canonical” and still most popular estimator for this class of models, while the other three estimators have been recruited over years to improve poor finite sample properties of the GMM estimator. The second order asymptotic theory sheds light on the advantages of these estimators over the two step GMM showing that they lack certain components of the asymptotic bias which the GMM estimator possesses (Newey and Smith 2001, Anatolyev 2002). For the four estimators, we evaluate first order asymptotic variances and second order asymptotic biases when various combinations of lagged variables entering the Euler equation are included into the instrument set thus allowing various comparisons. The underlying data generating mechanism is such that the vector comprising log rate of return and log consumption growth follows a VAR of first order with normal innovations. 
Such lognormal specification is often adopted when maximum likelihood is applied or is presumed when log-linearization is performed. The parameters of the VAR specification subject to the Euler equation are calibrated to
actual US data. However, as noted above, results for other parameter combinations can be obtained quickly.
The results differ sharply for the one- and two-period models, showing that the presence of strong serial correlation should be taken into account when making decisions about the instrument vector. For the one-period model, the reduction in asymptotic variance from using more lags drops sharply once the most recent values of consumption growth and rate of return belonging to the information set are included, while for the two-period model these returns are more persistent, and adding more and more lags is desirable from the viewpoint of reducing first order asymptotic variance. As far as the second order asymptotic bias is concerned, the pattern is very complicated, with many similarities as well as discrepancies across the two problems. The biases differ substantially across the estimators studied, with some tendency for the bias to be similar and large for the two GMM estimators, and smaller for the basic/corrected EL and especially the smoothed EL estimators. The tendency is not uniform and is often fragile, but when there are discrepancies, they may be appreciable. The experiments also show that the number of components in a formula for the asymptotic bias may not be a good measure of biasedness because in practice some components cancel each other.
This evidence leads to several important conclusions. Fewer instruments, sometimes significantly fewer, should be used in one-period problems than in multiple-period ones. The recently developed empirical likelihood based estimators are preferable to GMM: they allow more lagged values to be included in the instrument vector and thus more information to be used, which is especially true of the smoothed EL estimator. This is remarkable in the case of the one-period problem, as smoothing is not needed when there is no serial correlation and thus is not stipulated in the literature. 
Going back to the seven observations listed above, our results fully justify (1), criticize (4) and (7), and justify or criticize (2), (3) and (5) depending on circumstances. In addition, by considering the trade-off between the first order asymptotic variance and second order asymptotic bias, we give concrete recommendations of preferable instrument sets for plausible parameter combinations and sample sizes. The paper is organized as follows. Section 2 provides examples from applied econometric practice that support the observations given above. Section 3 describes the data generation mechanism, models and estimators, as well as characterizes the optimal instrument. Section 4 supplies formulas and numerical values for first order asymptotic variances, section 5 – for second order asymptotic biases. There is extensive discussion of the results in both sections. Section 6 analyzes the bias–variance trade-off, and section 7 concludes. The Appendix contains mathematical details of computing the moments used in calculations.
2 Selection of instruments: econometric practice
In order to support the general observations given in the introduction, we look at the literature to see what researchers do in concrete studies when selecting their instrument sets. This review covers heavily cited earlier papers and recent literature featuring more sophisticated models. We ignore papers where a linear quadratic setup or log-linearization is used. These offshoots are popular and important but are not of interest to us, as both give rise to linear decision rules. We review the papers in chronological order of appearance. When “most recently dated” variables are mentioned, they are presumed to belong to the agents’ information set.
Hansen and Singleton (1982). Hansen and Singleton (1982) formulate a rational expectations intertemporal consumption-based asset pricing model with flexible preferences that gives rise to a nonlinear decision rule, and suggest an instrumental variables method to estimate its parameters without imposing distributional assumptions. In the empirical section of the paper, the authors estimate single-return and three-return stock price models, including in the instrument set both consumption growth and rate of return lagged once, twice, four times, and six times.
Brown and Gibbons (1985). Brown and Gibbons (1985) illustrate the Hansen–Singleton methodology for the same model with consumption proxied by the return on the market portfolio. The authors apply the law of iterated expectations to the Euler equation, thus using only a constant as an instrument, and end up with just-identifying and over-identifying systems of moment conditions in the cases of two assets and of more than two assets, respectively.
Mark (1985). Mark (1985) estimates a similar model where the role of assets is played by foreign currencies. Accordingly, the role of rates of return is played by the product of the price of consumption and exchange rate speculative profits. The instrument set the author uses includes consumption growth and one, two or three lags of speculative profits on all currencies considered. Another set uses forward exchange rate premia instead of realized speculative profits.
Mankiw, Rotemberg, Summers (1985). These authors consider an intertemporal representative agent model with utility over consumption and leisure in the CES form, and use three sets of instruments. One set includes a constant, short term and longer term rates of inflation, rate of return, and holding period yield. Another set includes a constant and two lags of consumption, interest rate, leisure, price, and wage. Yet another contains the same variables as the second set (set B), dated one period later. 
Tauchen (1986). Tauchen (1986) runs Monte Carlo simulations on the Hansen–Singleton model. As instruments, he uses a constant and current and several past values of consumption growth and asset return. For the GMM estimate of the curvature parameter, Tauchen finds a strong variance–bias trade-off: the more lags are used, the greater the bias but the smaller the variance. The author comes to the overall conclusion that short lag lengths are preferred to longer ones in applied work with the GMM procedure.
Dunn and Singleton (1986). Dunn and Singleton (1986) consider a complicated multiple equation time nonseparable model with high order serial correlation. As instruments the authors use all most recently dated variables that figure in the Euler equations.
Eichenbaum, Hansen and Singleton (1988). These authors consider a model with nonseparable preferences across consumption and leisure. The system of two Euler equations is estimated with the following instruments. For one of the equations, a constant, consumption growth, leisure growth, real wage growth, and rate of return are used; for the other equation, the same variables dated one period earlier are used.
Hotz, Kydland and Sedlacek (1988). These authors consider an intertemporal model of choice of labor supply and consumption with transcendental logarithmic utility. Two sets of instruments are exploited. The first set includes various exogenous socioeconomic characteristics at the three most recent periods. The second set in addition has a couple of endogenous variables at the same time periods shifted to the past.
Singleton (1990). In an empirical part of this survey article, Singleton (1990) considers the Hansen–Singleton model with securities of various maturities: one, six and twelve. The instruments selected are: a constant, the two most recent values of consumption growth, and the two most recent rates of return that belong to the agents’ information set. Thus, the author uses the same list of instruments for both one-period and multiperiod setups.
Kocherlakota (1990). This important paper studies the Hansen–Singleton model with up to 3 types of returns in both simulation and empirical application modes. With one or two returns, the instruments used are: a constant, consumption growth, and all involved rates of return. The GMM estimators with multiple instruments perform poorly, underestimating the preference parameters. With three returns, only a single instrument is used: a constant, consumption growth, or one of the returns. The GMM estimator that employs a single instrument performs well.
Mao (1990). Mao (1990) runs simulations on the RBC model with time separable Cobb–Douglas utility over consumption and leisure. In the list of instruments the author includes a constant, lags of consumption growth, lags of return on investment, and lags of real wage growth. The number of lags is kept at 1 and 2.
Finn, Hoffman and Schlagenhauf (1990). These authors estimate and test Euler equations implied by dynamic barter, cash-in-advance, and money-in-the-utility function models. Most of the instrument sets include a constant and the most recently observed values of variables from the information set that enter the Euler equation. The authors find that the estimation results are qualitatively robust to the choice among alternative instrument sets.
Epstein and Zin (1991). Epstein and Zin (1991) present empirical results, using US data, from estimating the model with more flexible preferences (referred to as Epstein–Zin preferences in the subsequent literature). The Euler equation is of the same type as in the Hansen–Singleton model, but with an additional parameter. The three instrument sets all include a constant and current consumption growth; the first set also includes a lag of consumption growth, the second the current rate of return, and the third the lagged rate of return. The latter instrument set is employed to make inference robust to time aggregation.
Ferson and Constantinides (1991). Ferson and Constantinides (1991) use iterated GMM in studying the model with time nonseparable preferences driven by habits and durability. A set of several Euler equations associated with 5, sometimes 2, assets is estimated. The instruments the authors use are: in one instrument set, a constant, consumption growth, and returns on a safe asset and on a small-stock portfolio, all variables being lagged once and twice; in another instrument set, a constant and 8 financial variables lagged by one or two periods. The authors find the nonseparability parameter sensitive to the choice of instruments.
Marshall (1992). Marshall (1992) studies a representative agent model with money holdings, logarithmic utility over consumption, and the Cobb–Douglas transaction cost function. The moment function turns out to be a complicated function of velocity, nominal interest rate, and money growth rate. The instrument set comprises a constant, nominal interest rate, and money growth rate.
Hansen, Heaton and Yaron (1996). This important simulation study of various alternative GMM estimators uses the Ferson–Constantinides time nonseparable model with habit formation and two returns, stocks and bonds. Two alternative instrument sets are used: one comprises a constant and consumption growth; the other in addition contains both returns.
Holman (1998). Holman (1998) considers an intertemporal problem with money in the utility function as an asset providing liquidity services, with three specifications for the utility function. Motivated by warnings provided in previous studies, this paper carefully approaches the issue of instrument selection to avoid bias associated with GMM. The set of instruments for a separate equation contains a constant and variables entering that particular equation together with their single lags. The author concludes that the results are somewhat fragile.
Smith (1999). This paper investigates small sample properties of GMM tests of asset pricing restrictions in the Epstein–Zin model. In addition to the two-step GMM estimator, the author explores two alternative GMM estimators, an iterative GMM estimator and a continuous-updating GMM estimator (Hansen, Heaton and Yaron 1996), and considers three different instrument sets containing two, three and four lagged variables. One instrument set contains a constant and the return lagged one period; another set adds lagged consumption growth to these variables. Yet another set contains the same variables plus the current riskfree rate. In addition, the author checks the properties of two other instrument sets: one contains only a constant and exactly identifies the model, the other is optimal in the sense of Hansen (1985).
Stock and Wright (2000). This influential study introduces the idea of strongly and weakly identified parameters in a nonlinear model. Stock and Wright (2000) run Monte Carlo simulations on the single- and multi-return Hansen–Singleton model. In the case of a single asset, a constant, log of consumption growth, and rate of return are included in the instrument set. In the case of two assets, a couple of alternative instrument sets are tried: one containing only a constant and log of consumption growth, another containing both returns in addition to the same two. Stock and Wright (2000) also obtain empirical results using the three nonlinear models cited before. In the Hansen–Singleton model, the instrument set contains most often the first, or sometimes the second because of temporal aggregation, lags of stock returns and consumption growth. In the Epstein–Zin model, the instrument vector contains most often the first or second lags of return on the optimal portfolio and consumption growth. In the Ferson–Constantinides model, the instruments are the first or second lags of stock returns and consumption growth. In all three models, alternative instrument sets are formed by adding bond returns, dividend yield, or the spread between the long bond rate and the short interest rate.
Weber (2000). Weber (2000) studies a model with two types of households (rule-of-thumb and permanent-income) and the Epstein–Zin preferences. The widest instrument set contains a constant and two lags of the following variables: consumption/income ratio, income growth, and rates of return on a market portfolio and on a safe asset, 9 variables in total. Other instrument sets either exclude second lags, or exclude all returns, or exclude all variables except returns.
Weber (2002). The same author considers an intertemporal consumption model with time nonseparability and two types of consumers, with the utility function in three non-quadratic functional forms. He uses three instrument sets: a constant and either one, two or three lags of each variable figuring in the Euler equation. Also, he adds several additional variables with the same number of lags “for the sake of comparability with previous studies and to assess the robustness of the results”.
Wright (2002). In the Monte Carlo section, Wright (2002) runs simulations with the Hansen–Singleton model with one asset. The vector of instruments consists of a constant and the most recent consumption growth and return.
3 Models and estimators
3.1 Models and their features
3.1.1 Conditional moment restrictions
Let $x_{1,t}$ be the one-period rate of return, and $x_{2,t}$ be one-period consumption growth. The basic $k$-period one-return C-CAPM of Hansen and Singleton (1982) with CRRA utility implies the Euler equation
$$E\left[\beta^{k}\, x_{1,t+1} x_{1,t+2} \cdots x_{1,t+k}\; x_{2,t+1}^{\alpha} x_{2,t+2}^{\alpha} \cdots x_{2,t+k}^{\alpha} - 1 \,\middle|\, I_t\right] = 0,$$
where $I_t$ is time $t$ information available to the decision maker and econometrician, $\beta$ is the discount factor, and $\alpha$ is the risk aversion parameter. The one-period model is
$$E\left[\beta x_{1,t+1} x_{2,t+1}^{\alpha} - 1 \,\middle|\, I_t\right] = 0, \tag{1}$$
with the conditional moment function $\mu_{t+1} = \beta x_{1,t+1} x_{2,t+1}^{\alpha} - 1$. The two-period model is
$$E\left[\beta^{2} x_{1,t+1} x_{1,t+2}\, x_{2,t+1}^{\alpha} x_{2,t+2}^{\alpha} - 1 \,\middle|\, I_t\right] = 0, \tag{2}$$
with the conditional moment function $\mu_{t+2} = \beta^{2} x_{1,t+1} x_{1,t+2}\, x_{2,t+1}^{\alpha} x_{2,t+2}^{\alpha} - 1$.
3.1.2 Parameters and instruments
The parameter vector is
$$\theta = \begin{pmatrix} \beta \\ \alpha \end{pmatrix}.$$
The vector of instruments is
$$z_t = \left(x_{1,t}\;\; x_{1,t-1}\; \cdots\; x_{1,t-nl_1+1}\;\; x_{2,t}\;\; x_{2,t-1}\; \cdots\; x_{2,t-nl_2+1}\;\; 1\right)'.$$
Thus, along with a constant, the vector of instruments contains $nl_1$ consecutive current and lagged values of $x_{1,t}$ and $nl_2$ consecutive current and lagged values of $x_{2,t}$, for a total of $\ell = nl_1 + nl_2 + 1$ instruments.
3.1.3 DGP and calibration of parameters
We derive the moments of interest under the condition that the vector $x_t \equiv (x_{1,t}\;\; x_{2,t})'$ is lognormally distributed, and the law of motion for $X_t \equiv (\log x_{1,t}\;\; \log x_{2,t})'$ is a stationary VAR(1) with normal innovations $U_t \equiv (u_{1,t}\;\; u_{2,t})'$:
$$X_t = \lambda + \Phi X_{t-1} + U_t, \qquad U_t \sim \text{IID } N(0, V_U),$$
where
$$\lambda = \begin{pmatrix} \lambda_1 \\ \lambda_2 \end{pmatrix}, \qquad \Phi = \begin{pmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{pmatrix}, \qquad V_U = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix}.$$
Denote
$$E_X \equiv E[X_t] = (I_2 - \Phi)^{-1}\lambda, \qquad V_X \equiv V[X_t] = V_U + \Phi V_U \Phi' + \Phi^{2} V_U \Phi'^{2} + \cdots.$$
Estimation of the law of motion was performed subject to constraints imposed by the conditional moment restriction (1). The restriction (2) will then also be satisfied. The constraints are (see Hansen and Singleton 1982, formulae (4.5))
$$0 = (1\;\; \alpha)\,\Phi, \qquad 0 = \log\beta + (1\;\; \alpha)\,\lambda + \frac{(1\;\; \alpha)\, V_U\, (1\;\; \alpha)'}{2}.$$
Essentially, only one constraint on the parameters of the law of motion is imposed; the other two restrictions determine the deep parameters $\beta$ and $\alpha$. Using the data from the Hansen/Heaton/Ogaki GMM package, we obtained
$\lambda_1 = 0.01571$, $\lambda_2 = 0.003291$,
$\phi_{11} = 0.04636$, $\phi_{12} = 0.01435$, $\phi_{21} = 0.3935$, $\phi_{22} = 0.1218$,
$\sigma_1^2 = 0.006349$, $\sigma_2^2 = 3.221 \times 10^{-5}$, $\sigma_{12} = 0.0001086$,
$\beta = 0.9817$, $\alpha = -0.1178$.
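As a quick consistency check, the calibrated values can be plugged back into the two constraints and into the expressions for $E_X$ and $V_X$. The following is a minimal sketch in Python with NumPy; the variable names are ours, and the residuals are only approximately zero because the reported values are rounded. The infinite series for $V_X$ is summed through the equivalent linear system $\mathrm{vec}(V_X) = (I_4 - \Phi \otimes \Phi)^{-1} \mathrm{vec}(V_U)$, which is a standard identity rather than something stated in the text.

```python
import numpy as np

# Calibrated values as reported in the text (rounded figures).
lam = np.array([0.01571, 0.003291])
Phi = np.array([[0.04636, 0.01435],
                [0.3935,  0.1218]])
V_U = np.array([[0.006349,  0.0001086],
                [0.0001086, 3.221e-5]])
beta, alpha = 0.9817, -0.1178

a = np.array([1.0, alpha])  # the row vector (1 alpha)

# Constraint 1: (1 alpha) Phi = 0 (two equations).
c1 = a @ Phi
# Constraint 2: log(beta) + (1 alpha) lam + (1 alpha) V_U (1 alpha)' / 2 = 0.
c2 = np.log(beta) + a @ lam + 0.5 * (a @ V_U @ a)

# Stationarity: all eigenvalues of Phi lie inside the unit circle.
spectral_radius = np.max(np.abs(np.linalg.eigvals(Phi)))

# Unconditional mean E_X = (I - Phi)^{-1} lam.
E_X = np.linalg.solve(np.eye(2) - Phi, lam)

# Unconditional variance: vec(V_X) = (I - Phi kron Phi)^{-1} vec(V_U),
# which sums V_U + Phi V_U Phi' + Phi^2 V_U Phi'^2 + ...
V_X = np.linalg.solve(np.eye(4) - np.kron(Phi, Phi),
                      V_U.reshape(-1)).reshape(2, 2)

print("constraint residuals:", c1, c2)
print("E_X =", E_X)
print("V_X =", V_X)
```

Note how much larger the unconditional variance of log consumption growth is than its innovation variance: this comes from the sizeable coefficient $\phi_{21}$ on the lagged log return.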
3.1.4 Conditional moments
For the one-period model, the DGP is such that the moment function is conditionally homoskedastic with the variance
$$\sigma_m^2 = \exp\left((1\;\; \alpha)\, V_U\, (1\;\; \alpha)'\right) - 1.$$
In the calibrated model $\sigma_m^2 = 0.006344$.
For the two-period model, the DGP is such that the moment function is conditionally homoskedastic and homoautocorrelated, with the variance and covariance, respectively,
$$\sigma_m^2 = \exp\left(2\,(1\;\; \alpha)\, V_U\, (1\;\; \alpha)'\right) - 1, \qquad \gamma_m = \exp\left((1\;\; \alpha)\, V_U\, (1\;\; \alpha)'\right) - 1.$$
This implies the autocorrelation coefficient and implied MA(1) coefficient, respectively,
$$\rho_m = \left(\exp\left((1\;\; \alpha)\, V_U\, (1\;\; \alpha)'\right) + 1\right)^{-1}, \qquad \varrho_m = \frac{1}{2}\left(\rho_m^{-1} - \sqrt{\rho_m^{-2} - 4}\right).$$
In the calibrated model $\sigma_m^2 = 0.012728$, $\rho_m = 0.4984$, and $\varrho_m = 0.9235$. As one can see, the first order serial correlation is pretty severe.
It is worth noting that the conditional moment being time invariant is a result of the lognormal specification. It is very fortunate as it simplifies further computations and makes possible analytical derivation of optimal instruments and efficiency bounds.
3.1.5 Unconditional moment functions
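For the one-period model, the unconditional moment function is
$$m_t = z_t\left(\beta x_{1,t+1} x_{2,t+1}^{\alpha} - 1\right),$$
with the first derivative
$$m_{\theta t} = z_t\, x_{1,t+1} x_{2,t+1}^{\alpha} \left(1 \;\;\; \beta \log(x_{2,t+1})\right).$$
For the two-period model, the unconditional moment function is
$$m_t = z_t\left(\beta^{2} x_{1,t+1} x_{1,t+2}\, x_{2,t+1}^{\alpha} x_{2,t+2}^{\alpha} - 1\right),$$
with the first derivative
$$m_{\theta t} = \beta z_t\, x_{1,t+1} x_{1,t+2}\, x_{2,t+1}^{\alpha} x_{2,t+2}^{\alpha} \left(2 \;\;\; \beta \log(x_{2,t+1} x_{2,t+2})\right).$$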
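The conditional second moments reported in Section 3.1.4 depend on the calibration only through the scalar $(1\;\alpha) V_U (1\;\alpha)'$, so they are easy to reproduce. The sketch below (Python with NumPy; variable names are ours) evaluates the closed-form expressions at the rounded calibrated values, so agreement with the reported figures is expected only up to rounding.

```python
import numpy as np

# Rounded calibrated values from Section 3.1.3.
V_U = np.array([[0.006349,  0.0001086],
                [0.0001086, 3.221e-5]])
alpha = -0.1178

a = np.array([1.0, alpha])
v = a @ V_U @ a                        # (1 alpha) V_U (1 alpha)'

# One-period model: conditional variance of the moment function.
sigma2_one = np.exp(v) - 1.0

# Two-period model: conditional variance and first-order autocovariance.
sigma2_two = np.exp(2.0 * v) - 1.0
gamma_two = np.exp(v) - 1.0

# Autocorrelation; algebraically equal to 1 / (exp(v) + 1).
rho = gamma_two / sigma2_two

# Implied MA(1) coefficient: the root of rho = varrho / (1 + varrho^2)
# that lies inside the unit circle.
varrho = 0.5 * (1.0 / rho - np.sqrt(rho ** -2 - 4.0))

print(sigma2_one, sigma2_two, rho, varrho)
```

Because $\rho_m$ is just below 1/2, the largest first-order autocorrelation an MA(1) process can have, the implied MA(1) coefficient is close to one.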
3.2 Estimators
3.2.1 One-period model
For the one-period model, we consider the following four estimators.
The two-step GMM estimator uses the identity matrix as the weight matrix at the first step to get the preliminary (inefficient) first step estimator $\bar\theta$. It is used to form the efficient weight matrix $\hat W(\bar\theta)$, where
$$\hat W(\theta) = \left(\frac{1}{T}\sum_{t=1}^{T} m(w_t,\theta)\, m(w_t,\theta)'\right)^{-1}.$$
The two-step GMM estimator $\hat\theta_{GMM}$ solves the optimization problem
$$\min_{\theta\in\Theta}\; \left(\frac{1}{T}\sum_{t=1}^{T} m(w_t,\theta)\right)' \hat W(\bar\theta) \left(\frac{1}{T}\sum_{t=1}^{T} m(w_t,\theta)\right).$$
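To make the two steps concrete, here is a minimal sketch of the two-step GMM estimator for the one-period model, applied to data simulated from the calibrated VAR(1). Several choices are ours and not the paper's: the sample size, the seed, the instrument vector $z_t = (x_{1,t}\; x_{2,t}\; 1)'$, the starting value, and the use of plain Gauss-Newton iterations to minimize the quadratic form (any numerical optimizer could be substituted).

```python
import numpy as np

def simulate_var(T, lam, Phi, V_U, rng):
    """Simulate X_t = lam + Phi X_{t-1} + U_t and return levels x_t = exp(X_t)."""
    L = np.linalg.cholesky(V_U)
    X = np.empty((T, 2))
    x = np.linalg.solve(np.eye(2) - Phi, lam)   # start at the unconditional mean
    for t in range(T):
        x = lam + Phi @ x + L @ rng.standard_normal(2)
        X[t] = x
    return np.exp(X)

def moments(theta, x, nl1=1, nl2=1):
    """Rows m_t = z_t (beta x1_{t+1} x2_{t+1}^alpha - 1) and the average derivative."""
    beta, alpha = theta
    p = max(nl1, nl2)
    t = np.arange(p - 1, len(x) - 1)            # dates with all lags and one lead
    z = np.column_stack([x[t - j, 0] for j in range(nl1)] +
                        [x[t - j, 1] for j in range(nl2)] +
                        [np.ones(len(t))])
    R = x[t + 1, 0] * x[t + 1, 1] ** alpha      # x1_{t+1} x2_{t+1}^alpha
    m = z * (beta * R - 1.0)[:, None]
    # Sample average of dm_t/dtheta', an l x 2 matrix.
    G = np.column_stack([
        (z * R[:, None]).mean(axis=0),
        (z * (beta * R * np.log(x[t + 1, 1]))[:, None]).mean(axis=0)])
    return m, G

def efficient_weight(m):
    """W(theta) = ((1/T) sum_t m_t m_t')^{-1}."""
    return np.linalg.inv(m.T @ m / len(m))

def minimise(theta0, W, x, steps=50):
    """Gauss-Newton iterations on gbar' W gbar with W held fixed (our choice)."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(steps):
        m, G = moments(theta, x)
        step = np.linalg.solve(G.T @ W @ G, G.T @ W @ m.mean(axis=0))
        theta = theta - step
        if np.max(np.abs(step)) < 1e-10:
            break
    return theta

def two_step_gmm(x, theta0=(1.0, 0.0)):
    n_instr = moments(theta0, x)[0].shape[1]
    theta_bar = minimise(theta0, np.eye(n_instr), x)     # step 1: identity weights
    W_hat = efficient_weight(moments(theta_bar, x)[0])   # weights at theta_bar
    return minimise(theta_bar, W_hat, x)                 # step 2: efficient weights

lam = np.array([0.01571, 0.003291])
Phi = np.array([[0.04636, 0.01435], [0.3935, 0.1218]])
V_U = np.array([[0.006349, 0.0001086], [0.0001086, 3.221e-5]])

rng = np.random.default_rng(0)
x = simulate_var(5000, lam, Phi, V_U, rng)
theta_hat = two_step_gmm(x)
print("two-step GMM estimate of (beta, alpha):", theta_hat)
```

Gauss-Newton is adequate here because the moment function is linear in $\beta$ and nearly linear in $\alpha$ over the relevant range; for less well-behaved problems a line search or a general-purpose optimizer would be the safer choice.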
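The iterative GMM estimator $\hat\theta_{IGMM}$ can be obtained when the GMM procedure is iterated to convergence and is characterized by the FOC
$$\left(\frac{1}{T}\sum_{t=1}^{T} \frac{\partial m(w_t,\hat\theta_{IGMM})}{\partial\theta'}\right)' \hat W(\hat\theta_{IGMM})\; \frac{1}{T}\sum_{t=1}^{T} m(w_t,\hat\theta_{IGMM}) = 0.$$
Alternatively, one can use the three-step GMM estimator, which uses the efficient weight matrix at the second step and the efficient GMM estimator at the third step, i.e.
$$\arg\min_{\theta\in\Theta}\; \left(\frac{1}{T}\sum_{t=1}^{T} m(w_t,\theta)\right)' \hat W(\hat\theta_{GMM}) \left(\frac{1}{T}\sum_{t=1}^{T} m(w_t,\theta)\right).$$
The empirical likelihood estimator $\hat\theta_{EL}$ together with the $\ell \times 1$ vector of additional parameters $\hat\lambda_{EL}$ solves the optimization problem
$$\min_{\theta\in\Theta}\; \sup_{\lambda:\; 1+\lambda' m_t > 0}\; \sum_{t=1}^{T} \log\left(1 + \lambda' m(w_t,\theta)\right)$$
and is characterized by the FOC
$$0 = \frac{1}{T}\sum_{t=1}^{T} \frac{m(w_t,\hat\theta_{EL})}{1 + \hat\lambda_{EL}'\, m(w_t,\hat\theta_{EL})},$$
$$0 = \frac{1}{T}\sum_{t=1}^{T} \frac{\left(\partial m(w_t,\hat\theta_{EL})/\partial\theta'\right)' \hat\lambda_{EL}}{1 + \hat\lambda_{EL}'\, m(w_t,\hat\theta_{EL})}.$$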
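The inner part of the EL saddle-point problem, the supremum over $\lambda$ for a fixed $\theta$, is a concave maximization whose solution satisfies the first of the two FOCs. The sketch below solves it by damped Newton iterations; this is an illustrative choice of ours, not the procedure used in the paper, and the function name `el_inner` and the artificial data are hypothetical. Here $\lambda$ denotes the EL multipliers, not the VAR intercept. The outer minimization over $\theta$ is not shown.

```python
import numpy as np

def el_inner(m, tol=1e-10, max_iter=100):
    """Maximise sum_t log(1 + lam' m_t) over lam for a fixed T x l array m."""
    T, l = m.shape
    lam = np.zeros(l)

    def objective(v):
        return np.sum(np.log1p(m @ v))

    for _ in range(max_iter):
        w = 1.0 + m @ lam
        mw = m / w[:, None]
        grad = mw.sum(axis=0)          # sum_t m_t / (1 + lam' m_t)
        neg_hess = mw.T @ mw           # minus the Hessian, positive definite
        step = np.linalg.solve(neg_hess, grad)
        s = 1.0
        # Halve the step until it stays inside the domain 1 + lam' m_t > 0
        # and does not lower the objective.
        while s > 1e-12 and (np.any(1.0 + m @ (lam + s * step) <= 0.0)
                             or objective(lam + s * step) < objective(lam)):
            s *= 0.5
        lam = lam + s * step
        if np.max(np.abs(s * step)) < tol:
            break
    return lam

# Artificial moment values with a small non-zero mean, for illustration only.
rng = np.random.default_rng(1)
m_demo = rng.normal(loc=0.1, scale=1.0, size=(200, 2))
lam_hat = el_inner(m_demo)

# First EL first-order condition: (1/T) sum_t m_t / (1 + lam' m_t) = 0.
foc = (m_demo / (1.0 + m_demo @ lam_hat)[:, None]).mean(axis=0)
print("multipliers:", lam_hat, "FOC residual:", foc)
```

A solution exists only when zero lies in the interior of the convex hull of the $m_t$; the artificial data above are generated so that this holds.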
The smoothed empirical likelihood estimator is designed for problems with a serially correlated moment function, since then the empirical likelihood estimator is asymptotically inefficient. However, even though there is no serial correlation in our one-period model, there is benefit to using smoothing nonetheless, as we will see shortly. Define
$$m^{\sigma}(w_t,\theta) = \sum_{s=-r}^{r} m(w_{t-s},\theta) \qquad \text{and} \qquad m^{\sigma}_{\theta}(w_t,\theta) = \sum_{s=-r}^{r} \frac{\partial m(w_{t-s},\theta)}{\partial\theta'},$$
where $r$ is some integer going sufficiently slowly to infinity with the sample size. The FOCs are modified in the following way:
$$0 = \frac{1}{T}\sum_{t=1}^{T} \frac{m^{\sigma}(w_t,\hat\theta_{SEL})}{1 + \hat\lambda_{SEL}'\, m^{\sigma}(w_t,\hat\theta_{SEL})},$$
$$0 = \frac{1}{T}\sum_{t=1}^{T} \frac{m^{\sigma}_{\theta}(w_t,\hat\theta_{SEL})'\, \hat\lambda_{SEL}}{1 + \hat\lambda_{SEL}'\, m^{\sigma}(w_t,\hat\theta_{SEL})}.$$
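The smoothing step itself is a two-sided moving sum of the moment values. A minimal sketch follows; the function name `smooth_moments` is ours, and so is the treatment of the sample edges (terms with $t-s$ outside the sample are simply dropped), which the text does not specify.

```python
import numpy as np

def smooth_moments(m, r):
    """Return m_sigma with rows sum_{s=-r}^{r} m[t-s]; out-of-sample terms are dropped."""
    T = len(m)
    out = np.zeros_like(m, dtype=float)
    for s in range(-r, r + 1):
        lo, hi = max(0, s), min(T, T + s)   # dates t with 0 <= t - s < T
        out[lo:hi] += m[lo - s:hi - s]
    return out

# Small illustration with a single moment and r = 1.
m_demo = np.arange(1.0, 6.0).reshape(5, 1)    # values 1, 2, 3, 4, 5
print(smooth_moments(m_demo, 1).ravel())
```

The same function applied to the stacked derivatives gives $m^{\sigma}_{\theta}$, and the smoothed values can then be passed to whatever routine solves the EL first-order conditions.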
3.2.2 Two-period model
We will consider the same estimators with the following modifications caused by the presence of serial correlation. For GMM-type estimators, the efficient weighting is now composed via
$$\hat W(\theta) = \left(\frac{1}{T}\sum_{t=1}^{T} m(w_t,\theta)\, m(w_t,\theta)' + \frac{1}{T}\sum_{t=2}^{T} \left(m(w_t,\theta)\, m(w_{t-1},\theta)' + m(w_{t-1},\theta)\, m(w_t,\theta)'\right)\right)^{-1}.$$
The basic EL estimator is asymptotically inefficient, therefore we do not consider it. Its counterpart is the corrected EL estimator $\hat\theta_{CEL}$ that together with the $\ell \times 1$ vector of additional parameters $\hat\lambda_{CEL}$ is characterized by the FOC
$$0 = \frac{1}{T}\sum_{t=2}^{T-1} \frac{m(w_t,\hat\theta_{CEL})}{1 + \hat\lambda_{CEL}'\left(m(w_t,\hat\theta_{CEL}) + m(w_{t-1},\hat\theta_{CEL}) + m(w_{t+1},\hat\theta_{CEL})\right)},$$
$$0 = \frac{1}{T}\sum_{t=2}^{T-1} \frac{\left(\partial m(w_t,\hat\theta_{CEL})/\partial\theta'\right)' \hat\lambda_{CEL}}{1 + \hat\lambda_{CEL}'\left(m(w_t,\hat\theta_{CEL}) + m(w_{t-1},\hat\theta_{CEL}) + m(w_{t+1},\hat\theta_{CEL})\right)}.$$
The smoothed EL estimator takes the same form as in the one-period problem.
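For the GMM-type estimators, the only change relative to the one-period problem is the extra first-order autocovariance term in the weight matrix. A minimal sketch of that formula follows (the function name `weight_ma1` is ours). One caveat we add: unlike the one-period weight matrix, this unweighted sum of sample autocovariances is not guaranteed to be positive definite in finite samples.

```python
import numpy as np

def weight_ma1(m):
    """Inverse of (1/T) sum m_t m_t' + (1/T) sum_{t>=2} (m_t m_{t-1}' + m_{t-1} m_t')."""
    T = len(m)
    S0 = m.T @ m / T                 # (1/T) sum_{t=1}^{T} m_t m_t'
    S1 = m[1:].T @ m[:-1] / T        # (1/T) sum_{t=2}^{T} m_t m_{t-1}'
    return np.linalg.inv(S0 + S1 + S1.T)

# Scalar illustration: m = (1, 2, 3) gives S0 = 14/3 and S1 = 8/3.
m_demo = np.array([[1.0], [2.0], [3.0]])
W_demo = weight_ma1(m_demo)
print(W_demo)
```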
4 Asymptotic variance
4.1 Optimal instruments
Because of conditional homoskedasticity and homoautocorrelatedness, it is possible to derive the optimal instrument, i.e. the one that allows attaining the GMM/EL efficiency bound, the greatest lower bound for the asymptotic variance of GMM/EL estimators (Hansen 1985), see the Appendix for these derivations. For the one-period model, the optimal instrument yields the following minimal asymptotic variances for estimates of β and α: 6.878 × 10−3 and 6.202, respectively. As follows from the Appendix, the efficiency bound can be attained by using the vector of instruments (log x1,t log x2,t 1)0 . We emphasize that the simple form of functions comprising the optimal instrument is due to the assumption of lognormality and treat the minimum asymptotic variance figures as benchmarks rather than the targets. For the two-period model, the optimal instrument yields the following minimal asymptotic variances for estimates of β and α: 7.121 × 10−3 and 6.487, respectively. As follows
from the Appendix, the optimal instrument is a linear function of all lags of $\log x_{1,t}$ and $\log x_{2,t}$. The efficiency bound can be attained only asymptotically, by raising the number of lags of the log variables as the sample size increases.
4.2 Formula for asymptotic variance
All four estimators are asymptotically normal with asymptotic variance equal to
\[
\Sigma = \left(Q' V^{-1} Q\right)^{-1},
\]
where $Q = E[m_{\theta t}]$, and in the one-period problem
\[
V = E[m_t m_t'],
\]
while in the two-period problem
\[
V = E[m_t m_t'] + E[m_{t-1} m_t'] + E[m_t m_{t-1}'].
\]
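The variance formula can be evaluated directly once $Q$ and the pieces of $V$ are available. The sketch below is illustrative only: the function names and the numerical inputs are hypothetical and are not taken from the paper's calibration.

```python
import numpy as np

def long_run_variance(gamma0, gamma1=None):
    """V = E[m m'] in the one-period case; adds both lag-one terms otherwise.

    gamma0 stands for E[m_t m_t'], gamma1 for E[m_t m_{t-1}'].
    """
    if gamma1 is None:
        return gamma0
    return gamma0 + gamma1 + gamma1.T

def asymptotic_variance(Q, V):
    """Sigma = (Q' V^{-1} Q)^{-1} for a q x p matrix Q and a q x q matrix V."""
    return np.linalg.inv(Q.T @ np.linalg.solve(V, Q))

# Hypothetical inputs: three moment conditions, two parameters.
Q = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
gamma0 = np.eye(3)
gamma1 = 0.2 * np.eye(3)

sigma_one = asymptotic_variance(Q, long_run_variance(gamma0))
sigma_two = asymptotic_variance(Q, long_run_variance(gamma0, gamma1))
```

With these made-up inputs the serial correlation term inflates $V$ by a factor of 1.4, and the asymptotic variance scales accordingly.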
4.3 Asymptotic variances quantified
4.3.1 One-period model
Table 1A shows how the asymptotic variance is affected by the presence of a constant in the vector of instruments. It appears that inclusion of a constant brings sharp efficiency gains, much higher than those attainable by inclusion of several additional regular instruments. Therefore, in what follows we will always presume that a constant is included in the list of instruments. Table 1B shows the dependence of the asymptotic variance on the composition of the instrument vector. One can see that the asymptotic variance is stable over instrument combinations and quickly reaches an asymptote when either of the lag lengths rises, except that not using $x_{1,t}$, the rate of return, as an instrument increases the asymptotic variance greatly for both parameters. Further, provided that $x_{1,t}$ is included, inclusion of $x_{2,t}$ allows one to nearly reach the variance asymptote. Note that the asymptote, with asymptotic variances $6.880 \times 10^{-3}$ for $\beta$ and $6.222$ for $\alpha$, is not far from the asymptotic variance bounds $6.878 \times 10^{-3}$ and $6.202$, respectively, differing by a meagre 0.3% at most. All this means that using nonlinear functions of the basic instruments, or many lags of them, is not worth it for the sake of attaining more asymptotic efficiency.

4.3.2 Two-period model
Table 2A shows how asymptotic variance is affected by the presence of a constant in the vector of instruments. As in the one-period problem, inclusion of a constant brings efficiency gains, although not as striking, but comparable to those attainable by inclusion of quite a few additional regular instruments. In what follows we will always presume that a constant is included in the instrument set. Table 2B shows dependence of the asymptotic variance on the composition of the instrument vector. In contrast to the one-period problem, the asymptotic variance for both
parameters can be decreased indefinitely by adding more and more lags of either variable, although the marginal benefit from each addition is relatively small provided that $x_{1,t}$, the rate of return, is included. For instance, adding two more lags of $x_{1,t}$ and $x_{2,t}$ (which often occurs in empirical practice) to the instrument set $(x_{1,t}\ \ x_{2,t}\ \ 1)'$ reduces the asymptotic variances for $\beta$ and $\alpha$ by 1.24% and 2.99%, respectively. Adding even twenty more lags (which is inconceivable in practice) reduces the asymptotic variances further by 0.05–0.20%. However, again in contrast to the one-period problem, all these gains fall short of what the efficiency bound provides: attaining it could deliver efficiency gains of about 12% for $\beta$ and about 50% for $\alpha$. This raises the question of how to attain the bound, but even switching from levels to logs of instruments, keeping their number the same, would certainly reduce the asymptotic variance more significantly.
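As a rough check of the last magnitudes, one can compare the efficiency bounds with the entries of Table 2B. The choice of the $(4, 4)$ cell ($8.120 \times 10^{-3}$ for $\beta$ and $13.134$ for $\alpha$) as the reference point is an assumption made here for illustration; the text does not state which cell the comparison uses.

```latex
\[
\frac{8.120 - 7.121}{8.120} \approx 0.123,
\qquad
\frac{13.134 - 6.487}{13.134} \approx 0.506 .
\]
```

Under that assumption the relative gains are roughly 12% and 51%, in line with the figures quoted above.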
5 Asymptotic bias
5.1 Formulae for asymptotic biases
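Let us denote
\[
\Xi = \Sigma Q' V^{-1}, \qquad \Omega = V^{-1} - V^{-1} Q\, \Xi .
\]

5.1.1 One-period model

Let
\[
\begin{aligned}
\mathrm{Bias}_0 &= \Xi\left(\sum_{s=0}^{+\infty} E\left[m_{\theta t}\,\Xi\, m_{t-s}\right] - E\left[\sum_{j=1}^{2}\frac{\partial m_{\theta t}}{\partial\theta_j}\,\frac{\Sigma}{2}\,e_j\right]\right),\\
\mathrm{Bias}_1(s) &= -\Sigma\, E\left[m_{\theta t}'\,\Omega\, m_{t-s}\right],\\
\mathrm{Bias}_2(s) &= \Xi\, E\left[m_t m_t'\,\Omega\, m_{t-s}\right],\\
\mathrm{Bias}_3 &= \Xi\, E\left[\sum_{j=1}^{2}\frac{\partial (m_t m_t')}{\partial\theta_j}\,\Omega V Q\,(Q'Q)^{-1} e_j\right].
\end{aligned}
\]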
From the second order asymptotic analysis given in Anatolyev (2002) it follows that the second order biases of $\hat\theta_{GMM}$, $\hat\theta_{IGMM}$, $\hat\theta_{EL}$ and $\hat\theta_{SEL}$ are (omitting the order factor $1/T$)
\[
\begin{aligned}
B_{GMM} &= \mathrm{Bias}_0 + \sum_{s=0}^{+\infty}\mathrm{Bias}_1(s) + \sum_{s=0}^{+\infty}\mathrm{Bias}_2(s) + \mathrm{Bias}_3,\\
B_{IGMM} &= \mathrm{Bias}_0 + \sum_{s=0}^{+\infty}\mathrm{Bias}_1(s) + \sum_{s=0}^{+\infty}\mathrm{Bias}_2(s),\\
B_{EL} &= \mathrm{Bias}_0 + \sum_{s=1}^{+\infty}\mathrm{Bias}_1(s) + \sum_{s=1}^{+\infty}\mathrm{Bias}_2(s),\\
B_{SEL} &= \mathrm{Bias}_0 .
\end{aligned}
\]
These formulae are special cases of more general ones given in Anatolyev (2002); some simplifications occurred due to the martingale difference structure of the moment function and conditional homoskedasticity. It is clear that the GMM type estimators possess more
bias components and the EL type estimators fewer; there are also differences between versions of estimators of the same type (that the SEL possesses the fewest components is the motivation for considering it in a situation where no smoothing is necessary). This suggests ranking the estimators by how many bias components they have; for a general discussion, see Anatolyev (2002). Here our purpose is to see the quantitative relations among the biases of the four estimators. First we will analyze the bias components separately, then we will compare the total biases of the four estimators.

5.1.2 Two-period model
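Let
\[
\begin{aligned}
\mathrm{Bias}_0 &= \Xi\left(\sum_{s=-1}^{+\infty} E\left[m_{\theta t}\,\Xi\, m_{t-s}\right] - E\left[\sum_{j=1}^{2}\frac{\partial m_{\theta t}}{\partial\theta_j}\,\frac{\Sigma}{2}\,e_j\right]\right),\\
\mathrm{Bias}_1(s) &= -\Sigma\, E\left[m_{\theta t}'\,\Omega\, m_{t-s}\right],\\
\mathrm{Bias}_2(s) &= \Xi\, E\left[m_t\,(m_t + m_{t-1} + m_{t+1})'\,\Omega\, m_{t-s}\right],\\
\mathrm{Bias}_3 &= \Xi\, E\left[\sum_{j=1}^{2}\left(\frac{\partial (m_t m_t')}{\partial\theta_j} + \frac{\partial (m_t m_{t-1}')}{\partial\theta_j} + \frac{\partial (m_{t-1} m_t')}{\partial\theta_j}\right)\Omega V Q\,(Q'Q)^{-1} e_j\right].
\end{aligned}
\]
From the second order asymptotic analysis given in Anatolyev (2002) it follows that the second order biases of $\hat\theta_{GMM}$, $\hat\theta_{IGMM}$, $\hat\theta_{CEL}$ and $\hat\theta_{SEL}$ are (omitting the order factor $1/T$)
\[
\begin{aligned}
B_{GMM} &= \mathrm{Bias}_0 + \sum_{s=-1}^{+\infty}\mathrm{Bias}_1(s) + \sum_{s=-2}^{+\infty}\mathrm{Bias}_2(s) + \mathrm{Bias}_3,\\
B_{IGMM} &= \mathrm{Bias}_0 + \sum_{s=-1}^{+\infty}\mathrm{Bias}_1(s) + \sum_{s=-2}^{+\infty}\mathrm{Bias}_2(s),\\
B_{CEL} &= \mathrm{Bias}_0 + \sum_{s=2}^{+\infty}\mathrm{Bias}_1(s) + \mathrm{Bias}_2(-2) + \sum_{s=2}^{+\infty}\mathrm{Bias}_2(s),\\
B_{SEL} &= \mathrm{Bias}_0 .
\end{aligned}
\]
Again, these formulae are special cases of more general ones given in Anatolyev (2002).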
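The way the total biases are assembled from their components can be sketched in a few lines. The function `total_biases` and its argument names are hypothetical; the numerical inputs are the one-period entries for $\beta$ (in units of $10^{-2}$) at one lag of each instrument, read from panels 3A to 3G of Table 3.

```python
def total_biases(bias0, b1_gmm, b1_el, b2_gmm, b2_el, bias3):
    """Combine already-summed bias components into total second order biases.

    b1_gmm, b2_gmm: Bias1 and Bias2 summed over the range used by GMM-type estimators.
    b1_el, b2_el:   the same sums over the shorter range used by the non-smoothed
                    EL estimator (basic EL, or corrected EL in the two-period problem).
    """
    return {
        "GMM": bias0 + b1_gmm + b2_gmm + bias3,   # two-step GMM keeps Bias3
        "IGMM": bias0 + b1_gmm + b2_gmm,          # iterated GMM drops Bias3
        "EL": bias0 + b1_el + b2_el,              # fewer leading terms in the sums
        "SEL": bias0,                             # smoothed EL keeps only Bias0
    }

# beta, one-period problem, (nl1, nl2) = (1, 1), units of 10^-2
bias0 = -0.619 + (-0.275)        # first and second components of Bias0
biases = total_biases(bias0, b1_gmm=0.196, b1_el=-0.001,
                      b2_gmm=1.858, b2_el=0.001, bias3=-0.004)
```

Up to rounding of the tabulated components, these totals reproduce the corresponding entries of Table 4 (about 1.157, 1.160, −0.894 and −0.894).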
5.2 Bias components quantified
For comparability, we will discuss the one- and two-period problems at the same time. The results are contained in tables 3 and 5. It turns out that sometimes the tendencies are similar across the problems, but sometimes they are dissimilar or even opposite. We associate the latter phenomenon with the different behavior of the sandwich components of the asymptotic variances in the two problems. From tables 3 and 5 it is immediately obvious that not using $x_{1,t}$, the rate of return, as an instrument increases almost all bias components overwhelmingly for both parameters. Therefore, we will ignore such instrument combinations (i.e. the first lines in all panels) for most of the discussion in this subsection. The component $\mathrm{Bias}_0$, common to all estimators, results from nonlinearity in parameters of the moment function and randomness of its derivative. Tables 3A and 5A show dependence
of the first component of $\mathrm{Bias}_0$ on the composition of the instrument vector. For the one-period problem, once both types of instruments are in use, this bias component is quite stable over instrument combinations and quickly reaches an asymptote when both lag lengths rise. For the two-period problem, the dependence is much more erratic and does not exhibit a clear pattern. It is interesting that not using $x_{2,t}$, the consumption growth, turns out to be beneficial for the one-period problem but harmful for the two-period problem, while not using $x_{1,t}$, the rate of return, as an instrument increases this bias component overwhelmingly for both parameters. Tables 3B and 5B show the dependence of the second component of $\mathrm{Bias}_0$ on the composition of the instrument vector. Here the tendencies are identical for both problems: the bias component is extremely stable (nearly constant) over instrument combinations. For estimation of $\alpha$, this bias component is negligible in contrast to the previously discussed one, but this is not true for estimation of $\beta$. The constituents $\mathrm{Bias}_1(s)$ are nonzero only under overidentification and result from randomness of the derivative of the moment function. This bias component appears as an infinite sum, but it has fewer leading terms for non-smoothed EL estimators than for GMM type estimators. Tables 3C and 5C show the dependence on the composition of the instrument vector of $\mathrm{Bias}_1(s)$ summed appropriately for non-smoothed EL estimators, and tables 3D and 5D do the same for GMM type estimators. It is clear that the former figures are much smaller than the latter ones, which underlines the benefit of using EL-type estimators rather than GMM-type ones. Also, this shows that contemporaneous or low lag correlations of the moment function and its derivative, rather than high lag correlations, play the biggest role in the $\mathrm{Bias}_1$ component.
Interestingly, the bias component $\mathrm{Bias}_1$ increases with the number of instruments, except for the one-period problem and the EL estimator. The tendency of such a bias component to grow with the number of instruments was first noted in Newey and Smith (2001) in the IID context. The constituents $\mathrm{Bias}_2(s)$ exhibit tendencies similar to those of $\mathrm{Bias}_1(s)$: they appear in infinite sums, but without some leading terms for non-smoothed EL estimators; they are also nonzero only under overidentification, but result from skewness of the moment function. Tables 3E and 5E show the dependence on the composition of the instrument vector of $\mathrm{Bias}_2(s)$ summed appropriately for non-smoothed EL estimators, and tables 3F and 5F do the same for GMM-type estimators. One can notice that the bias component $\mathrm{Bias}_2$ tends not to increase with the number of instruments for the non-smoothed EL estimator, but does increase for the GMM-type estimators. Finally, the component $\mathrm{Bias}_3$ is present only in the two-step GMM estimator and is due to asymptotic inefficiency of the preliminary estimator of parameters obtained at the first step. Tables 3G and 5G show the dependence of the component $\mathrm{Bias}_3$ on the composition of the instrument vector. One notices immediately that this component is negligible for the one-period problem; for the two-period problem it is much more sizable, but is still dwarfed by the other components. A relatively low impact of the first step may well be due to a lucky choice of the initial weight matrix (identity).
5.3 Asymptotic biases quantified
5.3.1 One-period model
Tables 4A, 4B, 4C, 4D show dependence of total second order biases for the two-step GMM, iterative GMM, basic EL and smoothed EL estimators, respectively, on the composition of the instrument vector.
It is easily seen that the biases of both GMM estimators are nearly identical and, as far as estimation of $\beta$ is concerned, far exceed those of both EL estimators. This occurs because they are dominated by the component $\mathrm{Bias}_2$ with constituents summed over $s$ from 0 to $+\infty$ or, to be more exact, by the leading term in this sum that corresponds to $s = 0$ (compare tables 3E and 3F). As a result, the biases of GMM estimators for estimation of $\beta$ are big and quickly rise when more instruments are exploited. In contrast, the biases of EL estimators for estimation of $\beta$ are small and stable over instrument combinations; in addition, they are of the opposite sign. As far as estimation of $\alpha$ is concerned, the story is quite different. The biases of EL-type estimators are mostly driven by the first component of $\mathrm{Bias}_0$, which tends to be a little greater than 1 once both types of instruments are included. In contrast, the biases of GMM-type estimators are driven by the first component of $\mathrm{Bias}_0$ and the leading term in $\mathrm{Bias}_1$, which enter with opposite signs. As a result, cancellations occur and the biases of GMM-type estimators turn out to be smaller in absolute value than those of EL-type estimators. It is likely, though, that increasing the number of instruments beyond that provided by the tables will eventually make the leading term in $\mathrm{Bias}_1$ dominant, and the biases of GMM estimators will then exceed those of EL estimators.

5.3.2 Two-period model
Tables 6A, 6B, 6C, 6D show the dependence of the total second order biases for the two-step GMM, iterative GMM, corrected EL and smoothed EL estimators, respectively, on the composition of the instrument vector. Most of the tendencies encountered in the one-period model can be observed here too, but there are also specific ones. The biases of the iterative GMM estimator tend to be larger than those of the two-step GMM estimator. The reason is that the component $\mathrm{Bias}_3$ comes into play for large instrument vectors and, being of the opposite sign to the prevailing GMM bias components, partly offsets them. For estimation of $\beta$, the EL estimators exhibit much less bias than the GMM estimators, because the leading terms in $\mathrm{Bias}_1$ and $\mathrm{Bias}_2$ that correspond to $s$ from $-1$ to $+1$ (compare tables 5C and 5D, and 5E and 5F) are particularly large. For estimation of $\alpha$, although the same is true for the leading terms in $\mathrm{Bias}_1$ and $\mathrm{Bias}_2$, the large and positive first component of $\mathrm{Bias}_0$ partly offsets the previously discussed components and thus reduces the biases of the corrected EL and, to some extent, of the iterative GMM estimator.
6 Asymptotic bias–variance trade-off
Applied researchers are guided by asymptotic variances when choosing and implementing estimators in such contexts as CAPM. Often they discover, or are warned by the simulation literature, that the actual stochastic properties of these estimators in practice substantially deviate from predictions of asymptotic theory. Most often researchers complain that the estimators are biased even though the asymptotic normality of estimators implies asymptotic unbiasedness. The discovered bias is a feature of higher order asymptotics and is of lower order of magnitude, but the impact of this bias in practice turns out to be comparable with usual uncertainty implied by the first order asymptotic variance. Thus, it makes sense to analyze the trade-off between the first order asymptotic variance and the second order asymptotic bias. This analysis will ignore other higher order asymptotic terms, the third
order asymptotic variance in particular. Even though such terms may be of the same order of magnitude with respect to the sample size as the asymptotic bias, applied researchers seldom complain of their large influence. The latter fact is a determinant in the logic behind the proposed analysis. For a given estimator, instrument vector, and sample size $T$, we define our efficiency measure as
\[
MSE = \Sigma + \frac{1}{T} B^2,
\]
where $\Sigma$ is the first order asymptotic variance common to all estimators, and $B$ is the second order asymptotic bias specific to each estimator. Next, we define the optimal bold strategy as an instrument combination that delivers the minimal MSE over all allowable instrument combinations, or several such combinations if their MSEs are practically indistinguishable. We have seen from the previous evidence that the bias may often be small as a result of cancellations even though it has a few components that are relatively big in absolute value. The bold strategy defined above ignores these effects and takes the final numbers at face value. A more delicate consideration would instead rely on the sum of absolute values of the bias components involved. For example, for the two-step GMM estimator in the case of the two-period problem the bias measure would instead be computed as
\[
B_{GMM} = \left|\mathrm{Bias}_{01}\right| + \left|\mathrm{Bias}_{02}\right| + \left|\sum_{s=-1}^{+\infty}\mathrm{Bias}_1(s)\right| + \left|\sum_{s=-2}^{+\infty}\mathrm{Bias}_2(s)\right| + \left|\mathrm{Bias}_3\right|,
\]
where the first and second components of $\mathrm{Bias}_0$ are also separated. When $B$ is computed in this manner, and the MSE is computed using the same formula, we will talk of the optimal prudent strategy. In the experiments to follow, we regard two MSEs as indistinguishable if they have the same digits up to three places after the decimal point. We allow instrument combinations where the number of lags of either variable is no larger than 6. Of course, the evidence below is only suggestive and shows tendencies rather than giving actual recommendations on which instrument combinations are optimal.
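The bold and prudent strategies can be sketched as a small search over candidate instrument sets. The sketch assumes that the efficiency measure takes the form $MSE = \Sigma + B^2/T$ with the bias stated without its $1/T$ factor; all function names are hypothetical. The inputs are the one-period entries for $\alpha$ and the two-step GMM estimator at two instrument combinations, read from Tables 1B and 3; the three-decimal indistinguishability rule is not implemented.

```python
def mse(sigma, bias, T):
    """Assumed efficiency measure: variance plus squared bias divided by T."""
    return sigma + bias ** 2 / T

def bold_bias(components):
    """Bold strategy: take the net bias at face value, cancellations included."""
    return sum(components)

def prudent_bias(components):
    """Prudent strategy: add up absolute values of the bias components."""
    return sum(abs(c) for c in components)

def best_instrument_set(candidates, T, bias_fn):
    """Return the key of the candidate with the smallest MSE.

    candidates maps (nl1, nl2) to a pair (sigma, list of bias components).
    """
    return min(candidates,
               key=lambda k: mse(candidates[k][0], bias_fn(candidates[k][1]), T))

# alpha, one-period problem, two-step GMM.
# Components: Bias0 (first), Bias0 (second), summed Bias1, summed Bias2, Bias3.
candidates = {
    (1, 1): (6.222, [1.119, -0.008, -0.177, 0.013, 0.003]),
    (2, 2): (6.222, [1.139, -0.008, -1.057, 0.013, 0.003]),
}
bold_choice = best_instrument_set(candidates, 30, bold_bias)
prudent_choice = best_instrument_set(candidates, 30, prudent_bias)
```

Between these two candidates, the bold strategy prefers the larger set because of the cancellation between the first component of $\mathrm{Bias}_0$ and $\mathrm{Bias}_1$, while the prudent strategy prefers the smaller set.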
6.1 One-period model
Tables 7A and 7B show optimal strategies for all four estimators and two sample sizes, 30 and 100. One notices immediately that the sample size has very little or no impact on optimal strategies, and that the set of optimal strategies is often large. The reason for these phenomena lies in a very weak dependence of the asymptotic variance on instrument vectors. Both GMM estimators yield practically identical results, although this may happen because the identity initial weight matrix turns out to be a lucky choice. EL-type estimators also give similar results as far as the bold strategy is concerned, but with the prudent strategy the discrepancy in MSEs of EL and SEL estimators may be sizable, and the optimal strategies may differ considerably. Switching from GMM-type estimators to EL-type ones allows one to exploit more information in the lags of basic instruments to achieve a lower MSE. This is particularly transparent for the prudent strategy, where the MSE often falls significantly with such a switch. For the bold strategy this is not always the case: for estimation of $\alpha$ the MSE may increase instead because of fortunate cancellations among many bias components for GMM estimators. Interestingly, for all estimators and both types of strategies (except
basic EL and prudent strategy), more lags of $x_{1,t}$, the rate of return, are used than lags of $x_{2,t}$, the consumption growth, in optimal strategies. The general conclusion is that a judicious choice of the instrument vector in the case of one-period problems is a rather restrictive one, sometimes even exactly identifying, if the estimator to be used is GMM or one of its variants, which is overwhelmingly the case in practice. However, switching attention to EL-type estimators, especially to the SEL, may well pay off in terms of actual properties of the estimators.
6.2 Two-period model
Tables 8A and 8B show optimal strategies for all estimators and the same two sample sizes. In contrast to the one-period model, the sample size has a significant impact on optimal strategies, and sets of optimal strategies typically include one element. The GMM estimators sometimes yield different, although still similar, results, the iterative GMM outperforming the two-step GMM if the prudent strategy is adopted, and vice versa if the bold strategy is in effect. But much more efficiency can be gained from the move from GMM-type estimators to EL-type ones, at least if one adopts the prudent strategy. In this case the MSE falls significantly, and optimal instrument combinations exploit a reasonably moderate number of both instruments. For all estimators, strategy types, and sample sizes, these optimal strategies exploit many more lags than in the case of the one-period model. Also, in contrast to the one-period model, smoothing the moment function rather than correcting the EL first order conditions may have a strongly positive effect on efficiency, although exclusively for the prudent strategy. Interestingly, for all estimators and both types of strategies, an equal number of, or slightly more, lags of $x_{1,t}$, the rate of return, are used than lags of $x_{2,t}$, the consumption growth, in optimal strategies, again at variance with the one-period model. The general conclusion is that a judicious choice of the instrument vector in the case of multi-period problems is a rather liberal one, especially if the estimator to be used is one of the EL modifications, in particular the SEL. These estimators are expected to behave more predictably than the commonly used GMM or its variants.
7 Concluding remarks
We have quantified asymptotic variances and biases of various moment condition based estimators of preference parameters in the one-period and two-period one-return C–CAPM of Hansen and Singleton (1982). We compute the necessary moments exactly under the assumption that the vector comprising the log rate of return and log consumption growth follows a first-order VAR with normal innovations. Our approach cannot answer all questions that an applied researcher might pose, and the answers given are only suggestive. Thus, the results of this study should be considered together with those provided by the simulation studies cited above. Econometric theoretical approaches such as the weak identification theory of Stock and Wright (2000) and Wright (2002) are also able to provide useful insight.
References

Anatolyev, S. (2002) Empirical Likelihood, GMM, Serial Correlation, and Asymptotic Bias. Manuscript, New Economic School, Moscow.
Attanasio, O.P. and H. Low (2000) Estimating Euler equations. NBER Technical Working Paper 253.
Brown, D.P. and M.R. Gibbons (1985) A simple econometric approach for utility-based asset pricing models. Journal of Finance 40, 359–381.
Carroll, C.D. (2001) Death to the log-linearized consumption Euler equation! (And very poor health to the second-order approximation). Working Paper, Johns Hopkins University.
Dunn, K. and K. Singleton (1986) Modeling the term structure of interest rates under nonseparable utility and durability of goods. Journal of Financial Economics 17, 27–55.
Eichenbaum, M.S., L.P. Hansen and K.J. Singleton (1988) A time series analysis of representative agent models of consumption and leisure choice under uncertainty. Quarterly Journal of Economics 103, 51–78.
Epstein, L.G. and S.E. Zin (1991) Substitution, risk aversion, and the temporal behavior of consumption and asset returns: An empirical analysis. Journal of Political Economy 99, 263–286.
Ferson, W.E. and G.M. Constantinides (1991) Habit persistence and durability in aggregate consumption. Journal of Financial Economics 29, 199–240.
Finn, M.G., D.L. Hoffman and D.E. Schlagenhauf (1990) Intertemporal asset-pricing relationships in barter and monetary economies: An empirical analysis. Journal of Monetary Economics 25, 431–451.
Grossman, S., A. Melino and R.J. Shiller (1987) Estimating the continuous time consumption-based asset pricing model. Journal of Business and Economic Statistics 5, 315–327.
Hansen, L.P. (1982) Large sample properties of generalized method of moments estimators. Econometrica 50, 1029–1054.
Hansen, L.P. (1985) A method for calculating bounds on the asymptotic covariance matrices of generalized method of moments estimators. Journal of Econometrics 30, 203–238.
Hansen, L.P., J. Heaton and A. Yaron (1996) Finite-sample properties of some alternative GMM estimators. Journal of Business and Economic Statistics 14, 262–280.
Hansen, L.P. and K.J. Singleton (1982) Generalized instrumental variables estimation of nonlinear rational expectations models. Econometrica 50, 1269–1286.
Hansen, L.P. and K.J. Singleton (1983) Stochastic consumption, risk aversion, and the temporal behavior of asset returns. Journal of Political Economy 91, 249–265.
Hansen, L.P. and K.J. Singleton (1996) Efficient estimation of linear asset pricing models with moving-average errors. Journal of Business and Economic Statistics 14, 53–68.
Holman, J. (1998) GMM estimation of a money-in-the-utility-function model: The implications of functional forms. Journal of Money, Credit, and Banking 30, 679–698.
Hotz, V.J., F.E. Kydland and G.L. Sedlacek (1988) Intertemporal preferences and labor supply. Econometrica 56, 335–360.
Imbens, G.W. (1997) One-step estimators for over-identified generalized method of moments models. Review of Economic Studies 64, 359–383.
Kitamura, Y. (1997) Empirical likelihood methods with weakly dependent processes. Annals of Statistics 25, 2084–2102.
Kocherlakota, N.R. (1990) On tests of representative consumer asset pricing models. Journal of Monetary Economics 26, 285–304.
Mankiw, N.G., J.J. Rotemberg and L.H. Summers (1985) Intertemporal substitution in macroeconomics. Quarterly Journal of Economics 100, 225–251.
Mao, C.-S. (1990) Hypothesis testing and finite sample properties of generalized method of moments estimators: A Monte Carlo study. Federal Reserve Bank of Richmond, Working Paper 90–12.
Marshall, D.A. (1992) Inflation and asset returns in a monetary economy. Journal of Finance 47, 1315–1342.
Newey, W.K. and R.J. Smith (2001) Higher order properties of GMM and generalized empirical likelihood estimators. Working Paper, MIT and University of Bristol.
Qin, J. and J. Lawless (1994) Empirical likelihood and general estimating equations. Annals of Statistics 22, 300–325.
Singleton, K.J. (1990) Specification and estimation of intertemporal asset pricing models. In: B.M. Friedman and F.H. Hahn (eds.) Handbook of Monetary Economics, Vol. 1, Ch. 12.
Smith, D.C. (1999) Finite sample properties of tests of the Epstein–Zin asset pricing model. Journal of Econometrics 93, 113–148.
Smith, R.J. (1997) Alternative semi-parametric likelihood approaches to generalized method of moments estimation. Economic Journal 107, 503–519.
Tauchen, G. (1986) Statistical properties of generalized method-of-moments estimators of structural parameters obtained from financial market data. Journal of Business and Economic Statistics 4, 397–416.
Weber, C.E. (2000) "Rule-of-thumb" consumption, intertemporal substitution, and risk aversion. Journal of Business and Economic Statistics 18, 497–502.
Weber, C.E. (2002) Intertemporal non-separability and "rule of thumb" consumption. Journal of Monetary Economics 49, 293–308.
Wright, J.H. (2002) Detecting lack of identification in GMM. Econometric Theory, forthcoming.
Table 1. Asymptotic variances of estimators, one-period problem

1A. Impact of presence of constant

parameter      β, ×10^-3                              α
(nl1, nl2)     (1, 1)   (1, 2)   (2, 1)   (2, 2)      (1, 1)   (1, 2)   (2, 1)   (2, 2)
constant        6.880    6.880    6.880    6.880       6.222    6.222    6.222    6.222
no constant     7.230    7.075    6.918    6.915       8.277    7.364    6.443    6.428

1B. Impact of instrument composition (rows: nl1, columns: nl2)

parameter    nl1         0         1         2         3         4
β, ×10^-3      0         ∞    23.381    23.353    23.353    23.353
               1     6.892     6.880     6.880     6.880     6.880
               2     6.881     6.880     6.880     6.880     6.880
               3     6.881     6.880     6.880     6.880     6.880
               4     6.881     6.880     6.880     6.880     6.880
α              0         ∞   140.139   139.916   139.916   139.916
               1     6.318     6.222     6.222     6.222     6.222
               2     6.226     6.222     6.222     6.222     6.222
               3     6.225     6.222     6.222     6.222     6.222
               4     6.225     6.222     6.222     6.222     6.222
Table 2. Asymptotic variances of estimators, two-period problem

2A. Impact of presence of constant

parameter      β, ×10^-3                              α
(nl1, nl2)     (1, 1)   (1, 2)   (2, 1)   (2, 2)      (1, 1)   (1, 2)   (2, 1)   (2, 2)
constant        8.185    8.160    8.167    8.136      13.559   13.399   13.445   13.241
no constant     8.349    8.342    8.348    8.342      14.280   14.269   14.280   14.269

2B. Impact of instrument composition (rows: nl1, columns: nl2)

parameter    nl1         0         1         2         3         4
β, ×10^-3      0         ∞    56.300    51.708    51.312    51.276
               1     8.273     8.185     8.160     8.158     8.158
               2     8.185     8.167     8.136     8.132     8.132
               3     8.168     8.137     8.130     8.123     8.122
               4     8.166     8.135     8.123     8.122     8.120
α              0         ∞   330.031   299.828   297.227   296.986
               1    14.137    13.559    13.399    13.383    13.382
               2    13.558    13.445    13.241    13.212    13.210
               3    13.447    13.245    13.197    13.154    13.147
               4    13.434    13.229    13.155    13.144    13.134
Table 3. Components of asymptotic bias, one-period problem (rows: nl1, columns: nl2)

3A. First component of Bias0

parameter    nl1         0         1         2         3         4
β, ×10^-2      0         ∞   −17.011   −14.790   −14.753   −14.754
               1    −0.296    −0.619    −0.619    −0.619    −0.619
               2    −0.295    −0.631    −0.642    −0.642    −0.642
               3    −0.268    −0.638    −0.641    −0.641    −0.641
               4    −0.264    −0.639    −0.641    −0.641    −0.641
α              0         ∞    15.885    13.885    13.851    13.853
               1     0.827     1.119     1.119     1.119     1.119
               2    0.8269     1.130     1.139     1.139     1.139
               3     0.802     1.136     1.139     1.139     1.139
               4     0.799     1.136     1.139     1.139     1.139

3B. Second component of Bias0

parameter    nl1         0         1         2         3         4
β, ×10^-2      0         ∞    −6.360    −6.350    −6.350    −6.350
               1    −0.279    −0.275    −0.275    −0.275    −0.275
               2    −0.275    −0.275    −0.275    −0.275    −0.275
               3    −0.275    −0.275    −0.275    −0.275    −0.275
               4    −0.275    −0.275    −0.275    −0.275    −0.275
α              0         ∞    −0.015    −0.015    −0.015    −0.015
               1    −0.008    −0.008    −0.008    −0.008    −0.008
               2    −0.008    −0.008    −0.008    −0.008    −0.008
               3    −0.008    −0.008    −0.008    −0.008    −0.008
               4    −0.008    −0.008    −0.008    −0.008    −0.008

3C. Component Bias1 with constituents summed over s from 1 to +∞

parameter    nl1         0         1         2         3         4
β, ×10^-2      0         ∞         0     2.495     2.482     2.482
               1         0    −0.001    −0.001    −0.001    −0.001
               2    −0.327     0.008     0.013     0.013     0.013
               3    −0.339     0.013     0.013     0.013     0.013
               4    −0.339     0.013     0.013     0.013     0.013
α              0         ∞         0    −2.248    −2.236    −2.236
               1         0     0.001     0.001     0.001     0.001
               2     0.295    −0.008    −0.012    −0.012    −0.012
               3     0.305    −0.012    −0.012    −0.012    −0.012
               4     0.305    −0.012    −0.012    −0.012    −0.012

3D. Component Bias1 with constituents summed over s from 0 to +∞

parameter    nl1         0         1         2         3         4
β, ×10^-2      0         ∞         0     8.048    12.411    16.754
               1         0     0.196     0.397     0.591     0.784
               2     0.324     0.700     1.173     1.367     1.561
               3     0.581     1.110     1.422     1.663     1.857
               4     0.786     1.319     1.653     1.866     2.067
α              0         ∞         0    −7.250   −11.181   −15.093
               1         0    −0.177    −0.358    −0.533    −0.706
               2    −0.292    −0.631    −1.057    −1.232    −1.406
               3    −0.523    −1.000    −1.281    −1.499    −1.672
               4    −0.708    −1.188    −1.489    −1.681    −1.862

3E. Component Bias2 with constituents summed over s from 1 to +∞

parameter    nl1         0         1         2         3         4
β, ×10^-2      0         ∞         0     0.036     0.068     0.068
               1         0     0.001     0.001     0.001     0.001
               2     0.321    −0.004     0.019     0.019     0.019
               3    0.3431     0.037     0.025     0.026     0.026
               4    0.3391    0.0391     0.025     0.026     0.026
α              0         ∞         0    −0.033    −0.061    −0.061
               1         0    −0.001    −0.001    −0.001    −0.001
               2    −0.289     0.003    −0.018   −0.0182    −0.018
               3    −0.309    −0.034    −0.023   −0.0245    −0.024
               4    −0.306    −0.036    −0.023    −0.024    −0.024

3F. Component Bias2 with constituents summed over s from 0 to +∞

parameter    nl1         0         1         2         3         4
β, ×10^-2      0         ∞         0     1.879     3.777     5.648
               1         0     1.858     3.729     5.601     7.473
               2     2.167     3.708     5.602     7.474     9.346
               3     4.058     5.618     7.477     9.350    11.222
               4     5.925     7.492     9.349    11.222    13.094
α              0         ∞         0    −0.006    −0.029    −0.028
               1         0     0.013     0.014     0.014     0.014
               2    −0.265     0.033     0.013     0.013     0.013
               3    −0.282    −0.002     0.010     0.009     0.009
               4    −0.278    −0.003     0.010     0.010     0.010

3G. Component Bias3

parameter    nl1         0         1         2         3         4
β, ×10^-2      0         ∞         0    −0.003    −0.002    −0.001
               1         0    −0.004    −0.003    −0.002    −0.002
               2    −0.004    −0.005    −0.004    −0.003    −0.002
               3    −0.003    −0.004    −0.003    −0.003    −0.002
               4    −0.003    −0.003    −0.003    −0.002    −0.002
α              0         ∞         0     0.002     0.002     0.001
               1         0     0.003     0.003     0.002     0.002
               2     0.004     0.005     0.003     0.003     0.002
               3     0.003     0.004     0.003     0.002     0.002
               4     0.002     0.003     0.002     0.002     0.002
Table 4. Asymptotic bias of estimators, one-period problem (rows: nl1, columns: nl2)

4A. Two-step GMM

parameter    nl1         0         1         2         3         4
β, ×10^-2      0         ∞   −23.371   −11.216    −4.916     1.297
               1    −0.574    1.1571     3.229     5.296     7.361
               2     1.916     3.497     5.855     7.922     9.988
               3     4.093     5.810     7.979    10.095    12.161
               4     6.171     7.894    10.083    12.169    14.243
α              0         ∞    15.870     6.616     2.627    −1.283
               1     0.819     0.950     0.770     0.595     0.421
               2     0.266     0.528     0.091    −0.085    −0.259
               3    −0.008     0.131    −0.136    −0.356    −0.530
               4    −0.193    −0.060    −0.345    −0.538    −0.719

4B. Iterated GMM

parameter    nl1         0         1         2         3         4
β, ×10^-2      0         ∞   −23.371   −11.213    −4.915     1.298
               1    −0.574     1.160     3.232     5.299     7.363
               2     1.920     3.502     5.859     7.925     9.990
               3     4.096     5.814     7.983    10.098    12.163
               4     6.173     7.897    10.086    12.172    14.245
α              0         ∞    15.870     6.613     2.626    −1.284
               1     0.819     0.947     0.768     0.593     0.419
               2     0.262     0.524     0.087    −0.087    −0.261
               3    −0.011     0.127    −0.139    −0.358    −0.532
               4    −0.196   −0.0627    −0.347    −0.540    −0.721

4C. Basic EL

parameter    nl1         0         1         2         3         4
β, ×10^-2      0         ∞   −23.371   −18.608   −18.553   −18.555
               1    −0.574    −0.894    −0.894    −0.894    −0.894
               2    −0.576    −0.902    −0.884   −0.8844    −0.884
               3    −0.538    −0.863    −0.878   −0.8764    −0.876
               4    −0.538    −0.861    −0.878    −0.877    −0.877
α              0         ∞    15.870    11.589    11.539    11.540
               1     0.819     1.111     1.111     1.111     1.111
               2     0.825     1.118     1.102     1.102     1.102
               3     0.790     1.083     1.096     1.095     1.095
               4     0.791     1.081     1.096     1.095     1.095

4D. Smoothed EL

parameter    nl1         0         1         2         3         4
β, ×10^-2      0         ∞   −23.371   −21.140   −21.103   −21.105
               1    −0.574    −0.894    −0.894    −0.894    −0.894
               2    −0.570    −0.906    −0.916    −0.916    −0.916
               3    −0.543    −0.913    −0.916    −0.916    −0.916
               4    −0.538    −0.913    −0.916    −0.916    −0.916
α              0         ∞    15.870    13.870    13.836    13.838
               1     0.819     1.111     1.111     1.111     1.111
               2     0.819     1.122     1.132     1.132     1.132
               3     0.795     1.128     1.131     1.131     1.131
               4     0.791     1.129     1.131     1.131     1.131
Table 5 Components of asymptotic bias, two-period problem

5A. First component of Bias0
parameter    nl1 ↓ nl2 →        0          1          2          3          4
β, ×10^−2        0              ∞    −85.551    −53.011    −47.523    −49.423
                 1         −2.429     −1.597     −0.071     −0.542     −0.416
                 2         −2.677     −2.114     −0.474     −1.140     −0.969
                 3         −2.197     −1.186     −0.528     −1.352     −1.090
                 4         −2.376     −1.381     −0.651     −1.121     −0.760
α                0              ∞     98.010     73.063     68.756     70.310
                 1          5.239      3.432      1.874      2.225      2.119
                 2          5.484      4.333      2.728      3.250      3.109
                 3          5.098      3.314      2.714      3.350      3.131
                 4          5.244      3.465      2.781      3.155      2.853

5B. Second component of Bias0
parameter    nl1 ↓ nl2 →        0          1          2          3          4
β, ×10^−2        0              ∞    −18.336    −16.684    −16.542    −16.529
                 1         −1.060     −1.029     −1.020     −1.019     −1.019
                 2         −1.029     −1.023     −1.011     −1.010     −1.010
                 3         −1.023     −1.012     −1.009     −1.007     −1.007
                 4         −1.022     −1.011     −1.007     −1.007     −1.006
α                0              ∞     −0.042     −0.040     −0.040     −0.040
                 1         −0.021     −0.021     −0.021     −0.021     −0.021
                 2         −0.021     −0.021     −0.021     −0.021     −0.021
                 3         −0.021     −0.021     −0.021     −0.021     −0.020
                 4         −0.021     −0.021     −0.021     −0.020     −0.020

5C. Component Bias1 with constituents summed over s from 2 to +∞
parameter    nl1 ↓ nl2 →        0          1          2          3          4
β, ×10^−2        0              ∞          0     −0.000     −0.978      5.390
                 1              0     −0.000      0.000      0.310      0.651
                 2         −0.000     −0.000      0.000      0.488      0.875
                 3          0.090      0.153      0.093      0.699      1.336
                 4          0.529      0.636      0.867      0.945      1.768
α                0              ∞          0      0.000      0.793     −4.371
                 1              0      0.000     −0.000     −0.251     −0.528
                 2          0.000      0.000     −0.000     −0.396     −0.709
                 3         −0.073     −0.124     −0.075     −0.567     −1.084
                 4         −0.429     −0.516     −0.703     −0.766     −1.434

5D. Component Bias1 with constituents summed over s from −1 to +∞
parameter    nl1 ↓ nl2 →        0          1          2          3          4
β, ×10^−2        0              ∞          0     66.888    101.334    137.157
                 1              0      2.886      3.636      5.266      6.900
                 2          2.742      2.963      3.688      5.182      6.843
                 3          4.299      3.849      4.097      5.439      7.083
                 4          5.769      5.318      4.912      5.586      7.077
α                0              ∞          0    −54.247    −82.183   −111.236
                 1              0     −2.341     −2.949     −4.271     −5.596
                 2         −2.224     −2.403     −2.991     −4.203     −5.550
                 3         −3.486     −3.121     −3.323     −4.411     −5.745
                 4         −4.679     −4.313     −3.984     −4.531     −5.739

5E. Component Bias2 with constituents summed over s equal −2 and from 2 to +∞
parameter    nl1 ↓ nl2 →        0          1          2          3          4
β, ×10^−2        0              ∞          0    −10.886     −5.786     −9.033
                 1              0      0.000      1.930      1.763      2.214
                 2         −0.909     −0.619      1.479      1.048      1.541
                 3         −0.326      0.185      1.289      0.630      1.206
                 4         −0.497      0.094      1.540      0.979      1.857
α                0              ∞          0      8.832      4.669      7.270
                 1              0     −0.000     −1.594     −1.492     −1.891
                 2          0.726      0.496     −1.242     −0.931     −1.365
                 3          0.220     −0.204     −1.102     −0.619     −1.129
                 4          0.319     −0.171     −1.369     −0.922     −1.690

5F. Component Bias2 with constituents summed over s from −2 to +∞
parameter    nl1 ↓ nl2 →        0          1          2          3          4
β, ×10^−2        0              ∞          0      2.082      5.160      6.950
                 1              0      0.287      1.026      2.273      3.358
                 2          0.888      0.940      1.585      2.911      3.894
                 3          1.956      1.998      1.881      3.636      4.472
                 4          3.153      3.270      3.364      3.902      4.872
α                0              ∞          0      0.291     −0.477     −0.235
                 1              0      1.766      2.888      3.568      4.376
                 2          1.095      1.633      2.906      3.517      4.407
                 3          1.895      2.378      3.072      3.417      4.418
                 4          2.571      2.994      3.440      3.609      4.583

5G. Component Bias3
parameter    nl1 ↓ nl2 →        0          1          2          3          4
β, ×10^−2        0              ∞          0      5.595    −20.462    −28.114
                 1              0     −0.089     −0.766     −0.952     −1.052
                 2          0.756      0.478     −0.579     −1.048     −1.317
                 3         −0.534     −0.789     −1.201     −1.614     −1.821
                 4         −0.880     −1.112     −1.524     −1.714     −1.904
α                0              ∞          0     −4.535    16.5908     22.792
                 1              0      0.065      0.591      0.728      0.801
                 2         −0.561     −0.352      0.455      0.806      1.006
                 3          0.443      0.634      0.943      1.255      1.408
                 4          0.692      0.867      1.182      1.322      1.464
Table 6 Asymptotic bias of estimators, two-period problem

6A. Two-step GMM
parameter    nl1 ↓ nl2 →        0          1          2          3          4
β, ×10^−2        0              ∞   −103.887      4.871     21.967     50.042
                 1         −3.489      0.458      2.805      5.026      7.771
                 2          0.679      1.244      3.208      4.895      7.442
                 3          2.501      2.860      3.240      5.102      7.638
                 4          4.644      5.084      5.095      5.647      8.280
α                0              ∞     97.968     14.532      2.646    −18.409
                 1          5.218      2.901      2.383      2.229      1.680
                 2          3.772      3.191      3.078      3.350      2.952
                 3          3.930      3.185      3.385      3.589      3.191
                 4          3.808      2.992      3.400      3.534      3.141

6B. Iterated GMM
parameter    nl1 ↓ nl2 →        0          1          2          3          4
β, ×10^−2        0              ∞   −103.887     −0.724     42.429     78.156
                 1         −3.489     0.5472      3.571      5.978      8.823
                 2         −0.076      0.766      3.787      5.943      8.758
                 3          3.035      3.649      4.441      6.716      9.459
                 4          5.524      6.196      6.618      7.361     10.184
α                0              ∞     97.968     19.067    −13.944    −41.201
                 1          5.218      2.836      1.792      1.500      0.879
                 2          4.334      3.543      2.623      2.544      1.946
                 3          3.487      2.550      2.442      2.334      1.784
                 4          3.116      2.125      2.217      2.212      1.677

6C. Corrected EL
parameter    nl1 ↓ nl2 →        0          1          2          3          4
β, ×10^−2        0              ∞   −103.887    −80.581    −70.829    −69.594
                 1         −3.489     −2.625      0.839      0.512      1.430
                 2         −4.615     −3.755      0.006     −0.614      0.437
                 3         −3.456     −1.859     −0.156     −1.031      0.446
                 4         −3.366     −1.661      0.748     −0.204      1.859
α                0              ∞     97.968     81.855     74.179     73.169
                 1          5.218      3.411      0.260      0.461     −0.321
                 2          6.189      4.808      1.466      1.903      1.014
                 3          5.224      2.965      1.516      2.143      0.898
                 4          5.113      2.757      0.689      1.446     −0.291

6D. Smoothed EL
parameter    nl1 ↓ nl2 →        0          1          2          3          4
β, ×10^−2        0              ∞   −103.887    −69.695    −64.065    −65.952
                 1         −3.489     −2.625     −1.091     −1.562     −1.435
                 2         −3.706     −3.136     −1.485     −2.150     −1.979
                 3         −3.220     −2.198     −1.538     −2.359     −2.097
                 4         −3.398     −2.392     −1.658     −2.128     −1.765
α                0              ∞     97.968     73.023    68.7161     70.270
                 1          5.218      3.411      1.854      2.204      2.098
                 2          5.463      4.313      2.708      3.229      3.089
                 3          5.078      3.294      2.693      3.329      3.111
                 4          5.224      3.444      2.761      3.134      2.833

Table 7
Asymptotic bias–variance tradeoff, one-period problem

7A. Optimal bold strategies for various estimators and sample sizes
                         T = 30                                        T = 100
parameter   estimator    strategy                             MSE      strategy                                            MSE
β, ×10^−3   GMM          (1, 1)                               6.885    (1, 1)                                              6.882
            IGMM         (1, 1)                               6.885    (1, 1)                                              6.882
            EL           (2 ÷ 6, 0)                           6.882    (2 ÷ 6, 0), (1 ÷ 6, 1 ÷ 6)                          6.881
            SEL          (2 ÷ 6, 0)                           6.882    (2 ÷ 6, 0), (1 ÷ 6, 1 ÷ 6)                          6.881
α           GMM          (3 ÷ 4, 1), (2 ÷ 3, 2), (2, 3)       6.222    (3 ÷ 5, 1), (2 ÷ 3, 2), (2, 3 ÷ 4), (1, 5 ÷ 6)      6.222
            IGMM         (3 ÷ 4, 1), (2 ÷ 3, 2), (2, 3)       6.222    (3 ÷ 5, 1), (2 ÷ 3, 2), (2, 3 ÷ 4), (1, 5 ÷ 6)      6.222
            EL           (3 ÷ 6, 0)                           6.246    (3 ÷ 6, 0)                                          6.231
            SEL          (3 ÷ 6, 0)                           6.246    (3 ÷ 6, 0)                                          6.231

7B. Optimal prudent strategies for various estimators and sample sizes
                         T = 30                                        T = 100
parameter   estimator    strategy                             MSE      strategy                                            MSE
β, ×10^−3   GMM          (1, 0)                               6.893    (1, 1)                                              6.889
            IGMM         (1, 0)                               6.893    (1, 1)                                              6.889
            EL           (1 ÷ 6, 1 ÷ 6)                       6.883    (1 ÷ 6, 1 ÷ 6)                                      6.881
            SEL          (2 ÷ 6, 0)                           6.882    (2 ÷ 6, 0), (1 ÷ 6, 1 ÷ 6)                          6.881
α           GMM          (1, 1)                               6.280    (1, 1)                                              6.239
            IGMM         (1, 1)                               6.280    (1, 1)                                              6.239
            EL           (1, 1 ÷ 6)                           6.264    (1, 1 ÷ 6)                                          6.234
            SEL          (4 ÷ 6, 0)                           6.246    (3 ÷ 6, 0)                                          6.231
Table 8 Asymptotic bias–variance tradeoff, two-period problem

8A. Optimal bold strategies for various estimators and sample sizes
                         T = 30                     T = 100
parameter   estimator    strategy          MSE      strategy          MSE
β, ×10^−3   GMM          (3, 1)            8.164    (3, 2)            8.140
            IGMM         (2, 1)            8.169    (3, 2)            8.149
            EL           (4, 3)            8.122    (4 ÷ 5, 3)        8.122
            SEL          (4, 4)            8.131    (4 ÷ 6, 4)        8.123
α           GMM          (6, 6)           13.300    (6, 6)           13.180
            IGMM         (6, 6)           13.139    (6, 6)           13.132
            EL           (6, 3)           13.135    (5, 4), (6, 3)   13.134
            SEL          (6, 2)           13.394    (4, 4)           13.215

8B. Optimal prudent strategies for various estimators and sample sizes
                         T = 30                     T = 100
parameter   estimator    strategy          MSE      strategy          MSE
β, ×10^−3   GMM          (1, 1)            8.300    (2, 2)            8.190
            IGMM         (1, 2)            8.271    (2, 2)            8.182
            EL           (3, 2)            8.158    (3, 3)            8.137
            SEL          (4, 4), (5, 2)    8.131    (4 ÷ 6, 4)        8.123
α           GMM          (1, 0)           15.059    (2, 2)           14.070
            IGMM         (1, 0)           15.059    (2, 2)           13.989
            EL           (3, 1)           13.693    (3, 2)           13.350
            SEL          (5, 2)           13.396    (4, 4)           13.217
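A Appendix

A.1 Basic expectations

If
$$\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \begin{pmatrix} \omega_{11} & \omega_{12} \\ \omega_{12} & \omega_{22} \end{pmatrix} \right),$$
then
$$E[\exp(u_1)] = \exp(\mu_1 + .5\omega_{11}),$$
$$E[\exp(u_1)\,u_2] = (\mu_2 + \omega_{12}) \exp(\mu_1 + .5\omega_{11}),$$
$$E[\exp(u_1)\,u_2^2] = \left((\mu_2 + \omega_{12})^2 + \omega_{22}\right) \exp(\mu_1 + .5\omega_{11}).$$
In particular, if $u_i = \gamma_i U$, $i = 1, 2$, and $U \sim N(E_U, V_U)$, then the above formulae may be applied with $\mu_i = \gamma_i E_U$, $\omega_{ii} = \gamma_i V_U \gamma_i'$, $i = 1, 2$, and $\omega_{12} = \gamma_1 V_U \gamma_2'$.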
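The three closed forms above are easy to check numerically. The following is a minimal sketch, not part of the original derivation: the function names `lognormal_moments` and `quadrature_moments` are hypothetical, and the check assumes $\omega_{11} > 0$ so that $u_2$ can be projected on $u_1$ and the expectations reduced to one-dimensional Gauss–Hermite integrals.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def lognormal_moments(mu1, mu2, w11, w12, w22):
    """Closed forms for E[exp(u1)], E[exp(u1) u2], E[exp(u1) u2^2]."""
    base = np.exp(mu1 + 0.5 * w11)
    return base, (mu2 + w12) * base, ((mu2 + w12) ** 2 + w22) * base


def quadrature_moments(mu1, mu2, w11, w12, w22, n=60):
    """Same three moments by Gauss-Hermite quadrature (illustrative check).

    Uses u2 | u1 ~ N(mu2 + b (u1 - mu1), w22 - w12^2 / w11) with b = w12 / w11,
    so only a one-dimensional integral over u1 is needed.
    """
    x, w = hermegauss(n)              # nodes/weights for weight exp(-x^2 / 2)
    w = w / np.sqrt(2.0 * np.pi)      # normalise to the standard normal density
    u1 = mu1 + np.sqrt(w11) * x
    cond_mean = mu2 + (w12 / w11) * (u1 - mu1)
    cond_var = w22 - w12 ** 2 / w11
    e = np.exp(u1)
    return (np.sum(w * e),
            np.sum(w * e * cond_mean),
            np.sum(w * e * (cond_mean ** 2 + cond_var)))
```

Calling both functions with the same arguments, for example `(0.1, -0.2, 0.3, 0.05, 0.4)`, should give values that agree to high precision.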
A.2 Derivative expectations

Denote
$$V_\Sigma(k) \equiv V_U + \Phi V_U \Phi' + \dots + \Phi^{k-1} V_U \Phi^{k-1\prime} + \Phi^{k} V_U \Phi^{k\prime}.$$
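As an illustration of this definition, the sum can be accumulated term by term. This is a sketch only: the name `v_sigma` is hypothetical, and returning a zero matrix for $k < 0$ (an empty sum) is an assumption made here for convenience, not something stated in the text.

```python
import numpy as np


def v_sigma(phi, v_u, k):
    """V_Sigma(k) = sum_{j=0}^{k} Phi^j V_U Phi^j'.

    Assumption (not from the text): for k < 0 the sum is empty and a zero
    matrix is returned.
    """
    phi = np.asarray(phi, dtype=float)
    v_u = np.asarray(v_u, dtype=float)
    total = np.zeros_like(v_u)
    term = v_u.copy()                  # j = 0 term
    for _ in range(k + 1):
        total += term
        term = phi @ term @ phi.T      # next term: Phi^{j+1} V_U Phi^{j+1}'
    return total
```

The loop uses the recursion $V_\Sigma(k) = V_U + \Phi V_\Sigma(k-1) \Phi'$ implicitly, so no matrix powers need to be formed.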
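In all subsequent derivations, we make use of the fact that $(1\ \alpha)\Phi = 0$ and $\log\beta + (1\ \alpha)E_X + .5\,(1\ \alpha)V_U(1\ \alpha)' = 0$. Let $\rho \geq 0$ be an integer, $\pi = 0, 1$, $\kappa = 1, 2$, and $i_1, i_2 = 1, 2$; $j_1 = 0, 1, \dots, nl_{i_1} - 1$; $j_2 = 0, 1, \dots, nl_{i_2} - 1$. Let "◦" designate an index when its value does not matter. It is easily seen that for $j_1 \geq j_2$
$$S^{\rho} \equiv E\left[x_{1,t+1}^{\rho} x_{2,t+1}^{\rho\alpha}\right] = \exp\left(\rho\,(1\ \alpha)E_X\right) \times E\left[\exp\left(\rho\,(1\ \alpha)(X_t - E_X)\right)\right],$$
$$T^{\rho}_{i_1,j_1} \equiv E\left[x_{i_1,t-j_1} x_{1,t+1}^{\rho} x_{2,t+1}^{\rho\alpha}\right] = \exp\left((e_{i_1} + \rho\,(1\ \alpha))E_X + .5\rho^2 (1\ \alpha)V_U(1\ \alpha)'\right) \times E\left[\exp\left(e_{i_1}(X_t - E_X)\right)\right],$$
$$P^{\rho}_{i_1,j_1,i_2,j_2} \equiv E\left[x_{i_1,t-j_1} x_{i_2,t-j_2} x_{1,t+1}^{\rho} x_{2,t+1}^{\rho\alpha}\right] = \exp\left((e_{i_1} + e_{i_2} + \rho\,(1\ \alpha))E_X + .5\rho^2 (1\ \alpha)V_U(1\ \alpha)' + .5\,e_{i_2}V_\Sigma(j_1 - j_2 - 1)e_{i_2}'\right) \times E\left[\exp\left(\left(e_{i_1} + e_{i_2}\Phi^{j_1 - j_2}\right)(X_t - E_X)\right)\right].$$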
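On the basis of "Basic expectations" we compute the following moments:
$$A^{\pi\rho}_{i_1,j_1}(\kappa) \equiv E\left[x^{\pi}_{i_1,t-j_1} x^{\rho}_{1,t+1} x^{\rho}_{1,t+2} x^{\rho\alpha}_{2,t+1} x^{\rho\alpha}_{2,t+2} \log(x_{2,t+1}x_{2,t+2})^{\kappa}\right],$$
$$W^{\pi\rho}_{i_1,j_1}(\kappa) \equiv E\left[x^{\pi}_{i_1,t-j_1} x^{\rho}_{1,t+1} x^{\rho\alpha}_{2,t+1} \log(x_{2,t+1})^{\kappa}\right].$$
Denote
$$c_1(j) \equiv e_2 V_U (1\ \alpha)' + e_2\left(I_2 - \Phi^{j+1}\right)E_X,$$
$$c_2(j) \equiv e_2 (2I_2 + \Phi) V_U (1\ \alpha)' + e_2\left(2I_2 - (I_2 + \Phi)\Phi^{j+1}\right)E_X.$$
Then
$$\beta^2 A^{11}_{i_1,j_1}(1) = c_2(j_1)\,E[\exp(e_{i_1}X_t)] + E[\exp(e_{i_1}X_t)(v_e(j_1+1)X_t)],$$
$$\beta^2 A^{11}_{i_1,j_1}(2) = \left(c_2(j_1)^2 + e_2\left(V_U + (I_2+\Phi)V_\Sigma(j_1)(I_2+\Phi')\right)e_2'\right)E[\exp(e_{i_1}X_t)] + 2c_2(j_1)\,E[\exp(e_{i_1}X_t)(v_e(j_1+1)X_t)] + E\left[\exp(e_{i_1}X_t)(v_e(j_1+1)X_t)^2\right],$$
$$\beta A^{01}_{\circ,\circ}(1) = c_1(0)\,E[\exp((1\ \alpha)X_t)] + E[\exp((1\ \alpha)X_t)(e_2(I_2+\Phi)X_t)],$$
$$\beta A^{01}_{\circ,\circ}(2) = \left(c_1(0)^2 + e_2V_Ue_2'\right)E[\exp((1\ \alpha)X_t)] + 2c_1(0)\,E[\exp((1\ \alpha)X_t)(v_e(0)X_t)] + E\left[\exp((1\ \alpha)X_t)(v_e(0)X_t)^2\right],$$
$$\beta W^{11}_{i_1,j_1}(1) = c_1(j_1)\,E[\exp(e_{i_1}X_t)] + E\left[\exp(e_{i_1}X_t)\,e_2\Phi^{j_1+1}X_t\right],$$
$$\beta W^{11}_{i_1,j_1}(2) = \left(c_1(j_1)^2 + e_2V_\Sigma(j_1)e_2'\right)E[\exp(e_{i_1}X_t)] + 2c_1(j_1)\,E\left[\exp(e_{i_1}X_t)\,e_2\Phi^{j_1+1}X_t\right] + E\left[\exp(e_{i_1}X_t)\left(e_2\Phi^{j_1+1}X_t\right)^2\right],$$
$$W^{0\rho}_{\circ,\circ}(\kappa) = E\left[\exp(\rho\,(1\ \alpha)X_t)(e_2X_t)^{\kappa}\right],$$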
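where $v_e(j) \equiv e_2(I_2 + \Phi)\Phi^{j}$. We also need
$$\Gamma_0 = E[z_tz_t'] = \begin{pmatrix} P_{0,1,1} & P_{0,1,2} & T_{0,1} \\ P_{0,2,1} & P_{0,2,2} & T_{0,2} \\ T_{0,1}' & T_{0,2}' & 1 \end{pmatrix}, \qquad \Gamma_1 = E[z_tz_{t-1}'] = \begin{pmatrix} P_{1,1,1} & P_{1,1,2} & T_{0,1} \\ P_{1,2,1} & P_{1,2,2} & T_{0,2} \\ T_{1,1}' & T_{1,2}' & 1 \end{pmatrix},$$
where
$$P_{0,i_1,i_2} = \left[P^{0}_{i_1,j_1,i_2,j_2}\right]_{j_1=0,\dots,nl_{i_1}-1,\ j_2=0,\dots,nl_{i_2}-1}, \qquad T_{0,i_1} = \left[T^{0}_{i_1,j_1}\right]_{j_1=0,\dots,nl_{i_1}-1},$$
$$P_{1,i_1,i_2} = \left[P^{0}_{i_1,j_1,i_2,j_2}\right]_{j_1=0,\dots,nl_{i_1}-1,\ j_2=1,\dots,nl_{i_2}}, \qquad T_{1,i_1} = \left[T^{0}_{i_1,j_1}\right]_{j_1=1,\dots,nl_{i_1}}.$$

A.3 Derivation of optimal instrument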
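For the one-period problem, because of conditional homoskedasticity, the optimal instrument is (Hansen 1985)
$$\zeta_t = \frac{1}{\sigma_m^2} E\left[\frac{\partial\mu_{t+1}}{\partial\theta}\Big|I_t\right].$$
But the first entry of
$$E\left[\frac{\partial\mu_{t+1}}{\partial\theta}\Big|I_t\right] = E\left[\begin{pmatrix} x_{1,t+1}x^{\alpha}_{2,t+1} \\ \beta x_{1,t+1}x^{\alpha}_{2,t+1}\log(x_{2,t+1}) \end{pmatrix}\Big|I_t\right]$$
is $\beta^{-1}$; the second entry is
$$\beta E\left[x_{1,t+1}x^{\alpha}_{2,t+1}\log(x_{2,t+1})|I_t\right] = \beta E\left[\exp((1\ \alpha)X_{t+1})(e_2X_{t+1})|I_t\right]$$
$$= \beta\exp((1\ \alpha)E_X)\left(E[\exp((1\ \alpha)U_t)]\,e_2(E_X + \Phi(X_t - E_X)) + E[\exp((1\ \alpha)U_t)(e_2U_t)]\right)$$
$$= \nu_1(X_t - E_X) + \nu_2,$$
where
$$\nu_1 = e_2\Phi, \qquad \nu_2 = e_2\left(V_U(1\ \alpha)' + E_X\right).$$
The efficiency bound equals $Q^{-1}_{\zeta\partial\mu}$, where
$$Q_{\zeta\partial\mu} \equiv \sigma_m^2 E[\zeta_t\zeta_t'] = \frac{1}{\sigma_m^2}\begin{pmatrix} \beta^{-2} & \beta^{-1}\nu_2 \\ \beta^{-1}\nu_2 & \nu_1V_X\nu_1' + \nu_2^2 \end{pmatrix}.$$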
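To make the one-period efficiency bound concrete, the matrix $Q_{\zeta\partial\mu}$ can be assembled numerically. The sketch below is illustrative only. It assumes that $V_X$ is the stationary variance of $X_t$ under the first-order law of motion $X_t - E_X = \Phi(X_{t-1} - E_X) + U_t$ implied by the conditional expectation used above, and it takes $\sigma_m^2$ as a given input. The function names `stationary_variance` and `one_period_bound` are hypothetical.

```python
import numpy as np


def stationary_variance(phi, v_u):
    """Solve V_X = Phi V_X Phi' + V_U (assumed definition of V_X)."""
    n = phi.shape[0]
    lhs = np.eye(n * n) - np.kron(phi, phi)   # row-major vec identity
    return np.linalg.solve(lhs, v_u.reshape(-1)).reshape(n, n)


def one_period_bound(beta, alpha, phi, v_u, e_x, sigma2_m):
    """Return nu1, nu2, Q and the efficiency bound Q^{-1}."""
    one_alpha = np.array([1.0, alpha])        # the row vector (1 alpha)
    e2 = np.array([0.0, 1.0])                 # selects the second variable
    nu1 = e2 @ phi
    nu2 = e2 @ (v_u @ one_alpha + e_x)
    v_x = stationary_variance(phi, v_u)
    q = np.array([[beta ** -2, nu2 / beta],
                  [nu2 / beta, nu1 @ v_x @ nu1 + nu2 ** 2]]) / sigma2_m
    return nu1, nu2, q, np.linalg.inv(q)
```

For inputs consistent with the restriction $(1\ \alpha)\Phi = 0$, the first row of `phi` must equal $-\alpha$ times its second row; the function itself does not enforce this.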
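For the two-period problem, because of conditional homoskedasticity, the optimal instrument is (Hansen 1985)
$$\zeta_t = -\varrho_m\zeta_{t-1} + \frac{\varrho_m}{\gamma_m}\delta_t,$$
where
$$\delta_t = \sum_{i=0}^{\infty}(-\varrho_m)^{i} E\left[\frac{\partial\mu_{t+2+i}}{\partial\theta}\Big|I_t\right].$$
But the first entry of
$$E\left[\frac{\partial\mu_{t+2+i}}{\partial\theta}\Big|I_t\right] = E\left[\begin{pmatrix} 2\beta x_{1,t+1+i}x_{1,t+2+i}x^{\alpha}_{2,t+1+i}x^{\alpha}_{2,t+2+i} \\ \beta^2 x_{1,t+1+i}x_{1,t+2+i}x^{\alpha}_{2,t+1+i}x^{\alpha}_{2,t+2+i}\log(x_{2,t+1+i}x_{2,t+2+i}) \end{pmatrix}\Big|I_t\right]$$
is $2\beta^{-1}$; the second entry is
$$\beta^2E\left[\exp((1\ \alpha)(X_{t+1+i} + X_{t+2+i}))\,e_2(X_{t+1+i} + X_{t+2+i})|I_t\right]$$
$$= \beta^2\exp(2(1\ \alpha)E_X) \times \left( e_2\left(2E_X + (I_2+\Phi)\Phi^{i+1}(X_t - E_X)\right) \times E[\exp(2(1\ \alpha)U_t)] + E[\exp((1\ \alpha)U_t)(e_2(2I_2+\Phi)U_t)] \times E[\exp((1\ \alpha)U_t)] \right)$$
$$= e_2\left(\left((I_2+\Phi)\Phi^{i+1}(X_t - E_X) + 2E_X\right)(1+\gamma_m) + (2I_2+\Phi)V_U(1\ \alpha)'\right).$$
Thus
$$\delta_t = E\left[\sum_{i=0}^{\infty}(-\varrho_m)^{i}\,\beta x_{1,t+1+i}x_{1,t+2+i}x^{\alpha}_{2,t+1+i}x^{\alpha}_{2,t+2+i}\begin{pmatrix} 2 \\ \beta\log(x_{2,t+1+i}x_{2,t+2+i}) \end{pmatrix}\Big|I_t\right] = \begin{pmatrix} 2\beta^{-1}(1+\varrho_m)^{-1} \\ \nu_1(X_t - E_X) + \nu_2 \end{pmatrix},$$
where
$$\nu_1 = (1+\gamma_m)\,e_2(I_2+\Phi)\Phi(I_2+\varrho_m\Phi)^{-1},$$
$$\nu_2 = (1+\varrho_m)^{-1}e_2\left((2I_2+\Phi)V_U(1\ \alpha)' + 2(1+\gamma_m)E_X\right).$$
The efficiency bound equals $Q^{-1}_{\zeta\partial\mu}$, where
$$Q_{\zeta\partial\mu} \equiv E\left[\zeta_t E\left[\frac{\partial\mu_{t+2}}{\partial\theta'}\Big|I_t\right]\right] = \frac{\varrho_m}{\gamma_m}\begin{pmatrix} 4\beta^{-2}(1+\varrho_m)^{-2} & 2\beta^{-1}(1+\varrho_m)^{-1}\nu_2 \\ 2\beta^{-1}(1+\varrho_m)^{-1}\nu_2 & \nu_1V_X\nu_1' + \nu_2^2 \end{pmatrix}.$$

A.4 Computation of asymptotic variance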
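For the one-period model, the matrix of expected outer square of the moment function is $Q_{mm} = \sigma_m^2\Gamma_0$, and the matrix of expected derivatives of the moment function is
$$Q_{\partial m} = E\left[z_tx_{1,t+1}x^{\alpha}_{2,t+1}\begin{pmatrix} 1 & \beta\log(x_{2,t+1}) \end{pmatrix}\right] = \begin{pmatrix} \beta^{-1}T_1 & \beta W_1 \\ \beta^{-1}T_2 & \beta W_2 \\ \beta^{-1} & \beta W^{01}_{\circ,\circ}(1) \end{pmatrix}.$$
For the two-period model, the matrix of expected outer square of the moment function is $Q_{mm} = \sigma_m^2\Gamma_0 + \gamma_m^2(\Gamma_1 + \Gamma_1')$, and the matrix of expected derivatives of the moment function is
$$Q_{\partial m} = \beta E\left[z_tx_{1,t+1}x_{1,t+2}x^{\alpha}_{2,t+1}x^{\alpha}_{2,t+2}\begin{pmatrix} 2 & \beta\log(x_{2,t+1}x_{2,t+2}) \end{pmatrix}\right] = \begin{pmatrix} 2\beta^{-1}T_1 & \beta^2A_1 \\ 2\beta^{-1}T_2 & \beta^2A_2 \\ 2\beta^{-1} & \beta^2A^{01}_{\circ,\circ}(1) \end{pmatrix}.$$
In these matrices,
$$T_{i_1} = \left[T^{0}_{i_1,j_1}\right]_{j_1=0,\dots,nl_{i_1}-1}, \qquad W_{i_1} = \left[W^{11}_{i_1,j_1}(1)\right]_{j_1=0,\dots,nl_{i_1}-1}, \qquad A_{i_1} = \left[A^{11}_{i_1,j_1}(1)\right]_{j_1=0,\dots,nl_{i_1}-1}.$$

A.5 Computation of asymptotic bias

A.5.1 Computation of first component of Bias0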
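The key elements of the first component of Bias0 are:
$$D^{\pi_1\pi_2\rho_1\rho_2\rho_3\rho_4\sigma_1\sigma_2}_{i_1,j_1,i_2,j_2}(s) \equiv E\left[x^{\rho_1}_{1,t+1-s}x^{\rho_2}_{1,t+2-s}x^{\rho_1\alpha}_{2,t+1-s}x^{\rho_2\alpha}_{2,t+2-s}\,x^{\rho_3}_{1,t+1}x^{\rho_4}_{1,t+2}x^{\rho_3\alpha}_{2,t+1}x^{\rho_4\alpha}_{2,t+2} \times x^{\pi_1}_{i_1,t-j_1}x^{\pi_2}_{i_2,t-j_2-s}\log\left(x^{\sigma_1}_{2,t+1}x^{\sigma_2}_{2,t+2}\right)\right]$$
for $s \geq 0$, as well as
$$D^{\pi_1\pi_200\rho_3\rho_4\sigma_1\sigma_2}_{i_1,j_1,i_2,-1}(0) \equiv E\left[x^{\pi_1}_{i_1,t-j_1}x^{\pi_2}_{i_2,t+1}\,x^{\rho_3}_{1,t+1}x^{\rho_4}_{1,t+2}x^{\rho_3\alpha}_{2,t+1}x^{\rho_4\alpha}_{2,t+2}\log\left(x^{\sigma_1}_{2,t+1}x^{\sigma_2}_{2,t+2}\right)\right].$$
For the one-period model, we need
$$\Xi\sum_{s=0}^{\infty}E[m_{\theta t}\Xi m_{t-s}].$$
Apart from the factor $\Xi$, the $s$th term $E[m_{\theta t}\Xi m_{t-s}]$ is
$$\begin{pmatrix} \left[\begin{aligned} &\sum_{i_2}\sum_{j_2}\left\{\xi^{(1)}_{i_2,j_2}\left(\beta D^{11101000}_{i_1,j_1,i_2,j_2}(s) - D^{11001000}_{i_1,j_1,i_2,j_2}(s)\right) + \beta\xi^{(2)}_{i_2,j_2}\left(\beta D^{11101010}_{i_1,j_1,i_2,j_2}(s) - D^{11001010}_{i_1,j_1,i_2,j_2}(s)\right)\right\} \\ &+ \xi^{(1)}\left(\beta D^{10101000}_{i_1,j_1,\circ,\circ}(s) - D^{10001000}_{i_1,j_1,\circ,\circ}(s)\right) + \beta\xi^{(2)}\left(\beta D^{10101010}_{i_1,j_1,\circ,\circ}(s) - D^{10001010}_{i_1,j_1,\circ,\circ}(s)\right) \end{aligned}\right]_{i_1,j_1} \\[2ex] \begin{aligned} &\sum_{i_2}\sum_{j_2}\left\{\xi^{(1)}_{i_2,j_2}\left(\beta D^{01101000}_{\circ,\circ,i_2,j_2}(s) - D^{01001000}_{\circ,\circ,i_2,j_2}(s)\right) + \beta\xi^{(2)}_{i_2,j_2}\left(\beta D^{01101010}_{\circ,\circ,i_2,j_2}(s) - D^{01001010}_{\circ,\circ,i_2,j_2}(s)\right)\right\} \\ &+ \xi^{(1)}\left(\beta D^{00101000}_{\circ,\circ,\circ,\circ}(s) - D^{00001000}_{\circ,\circ,\circ,\circ}(s)\right) + \beta\xi^{(2)}\left(\beta D^{00101010}_{\circ,\circ,\circ,\circ}(s) - D^{00001010}_{\circ,\circ,\circ,\circ}(s)\right) \end{aligned} \end{pmatrix}$$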
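For the two-period model, we need
$$\Xi\sum_{s=-1}^{\infty}E[m_{\theta t}\Xi m_{t-s}].$$
Apart from the factor $\Xi$, the $s$th ($s \geq 0$) term $E[m_{\theta t}\Xi m_{t-s}]$ is $\beta$ times
$$\begin{pmatrix} \left[\begin{aligned} &\sum_{i_2}\sum_{j_2}\left\{2\xi^{(1)}_{i_2,j_2}\left(\beta^2D^{11111100}_{i_1,j_1,i_2,j_2}(s) - D^{11001100}_{i_1,j_1,i_2,j_2}(s)\right) + \beta\xi^{(2)}_{i_2,j_2}\left(\beta^2D^{11111111}_{i_1,j_1,i_2,j_2}(s) - D^{11001111}_{i_1,j_1,i_2,j_2}(s)\right)\right\} \\ &+ 2\xi^{(1)}\left(\beta^2D^{10111100}_{i_1,j_1,\circ,\circ}(s) - D^{10001100}_{i_1,j_1,\circ,\circ}(s)\right) + \beta\xi^{(2)}\left(\beta^2D^{10111111}_{i_1,j_1,\circ,\circ}(s) - D^{10001111}_{i_1,j_1,\circ,\circ}(s)\right) \end{aligned}\right]_{i_1,j_1} \\[2ex] \begin{aligned} &\sum_{i_2}\sum_{j_2}\left\{2\xi^{(1)}_{i_2,j_2}\left(\beta^2D^{01111100}_{\circ,\circ,i_2,j_2}(s) - D^{01001100}_{\circ,\circ,i_2,j_2}(s)\right) + \beta\xi^{(2)}_{i_2,j_2}\left(\beta^2D^{01111111}_{\circ,\circ,i_2,j_2}(s) - D^{01001111}_{\circ,\circ,i_2,j_2}(s)\right)\right\} \\ &+ 2\xi^{(1)}\left(\beta^2D^{00111100}_{\circ,\circ,\circ,\circ}(s) - D^{00001100}_{\circ,\circ,\circ,\circ}(s)\right) + \beta\xi^{(2)}\left(\beta^2D^{00111111}_{\circ,\circ,\circ,\circ}(s) - D^{00001111}_{\circ,\circ,\circ,\circ}(s)\right) \end{aligned} \end{pmatrix}$$
Apart from the factor $\Xi$, the ($s = -1$)th term $E[m_{\theta t}\Xi m_{t+1}]$ is $\beta$ times
$$\begin{pmatrix} \left[\begin{aligned} &\sum_{i_2}\sum_{j_2}\left\{2\xi^{(1)}_{i_2,j_2}\left(\beta D^{11001200}_{i_1,j_1,i_2,j_2-1}(0) - D^{11001100}_{i_1,j_1,i_2,j_2-1}(0)\right) + \beta\xi^{(2)}_{i_2,j_2}\left(\beta D^{11001211}_{i_1,j_1,i_2,j_2-1}(0) - D^{11001111}_{i_1,j_1,i_2,j_2-1}(0)\right)\right\} \\ &+ 2\xi^{(1)}\left(\beta D^{10001200}_{i_1,j_1,\circ,\circ}(0) - D^{10001100}_{i_1,j_1,\circ,\circ}(0)\right) + \beta\xi^{(2)}\left(\beta D^{10001211}_{i_1,j_1,\circ,\circ}(0) - D^{10001111}_{i_1,j_1,\circ,\circ}(0)\right) \end{aligned}\right]_{i_1,j_1} \\[2ex] \begin{aligned} &\sum_{i_2}\sum_{j_2}\left\{2\xi^{(1)}_{i_2,j_2}\left(\beta D^{01001200}_{\circ,\circ,i_2,j_2-1}(0) - D^{01001100}_{\circ,\circ,i_2,j_2-1}(0)\right) + \beta\xi^{(2)}_{i_2,j_2}\left(\beta D^{01001211}_{\circ,\circ,i_2,j_2-1}(0) - D^{01001111}_{\circ,\circ,i_2,j_2-1}(0)\right)\right\} \\ &+ 2\xi^{(1)}\left(\beta D^{00001200}_{\circ,\circ,\circ,\circ}(0) - D^{00001100}_{\circ,\circ,\circ,\circ}(0)\right) + \beta\xi^{(2)}\left(\beta D^{00001211}_{\circ,\circ,\circ,\circ}(0) - D^{00001111}_{\circ,\circ,\circ,\circ}(0)\right) \end{aligned} \end{pmatrix}$$
Let us denote
$$e_\Phi(k_1, k_2) \equiv \pi_1e_{i_1}\Phi^{k_1} + \pi_2e_{i_2}\Phi^{k_2},$$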
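$$\Phi_\sigma \equiv \sigma_1I_2 + \sigma_2\Phi, \qquad c_\sigma \equiv (\sigma_1 + \sigma_2)e_2E_X.$$
The term $D^{\pi_1\pi_2\rho_1\rho_2\rho_3\rho_4\sigma_1\sigma_2}_{i_1,j_1,i_2,j_2}(s)$ is a product of two components. The first component is
$$\exp\left((\pi_1e_{i_1} + \pi_2e_{i_2} + (\rho_1+\rho_2+\rho_3+\rho_4)(1\ \alpha))E_X\right) \times \exp\left(.5\,(1\ \alpha)V_U(1\ \alpha)' \times \begin{cases} \rho_1^2 + \rho_2^2 + \rho_3^2 + \rho_4^2 & \text{if } s > 1 \\ \rho_1^2 + (\rho_2+\rho_3)^2 + \rho_4^2 & \text{if } s = 1 \\ (\rho_1+\rho_3)^2 + (\rho_2+\rho_4)^2 & \text{if } s = 0 \end{cases}\right).$$
Now we turn to the second component. In the special case $\rho_1 = 0$, $\rho_2 = 0$, $j_2 = -1$, $s = 0$, it is
$$\exp\left(\pi_2\rho_3e_{i_2}V_U(1\ \alpha)' + .5\pi_2^2e_{i_2}V_\Sigma(j_1)e_{i_2}'\right) \times \left\{\begin{aligned} &\left(e_2(\rho_3\Phi_\sigma + \rho_4\sigma_2)V_U(1\ \alpha)' + c_\sigma + \pi_2e_2\Phi_\sigma V_\Sigma(j_1)e_{i_2}'\right) \times E[\exp(e_\Phi(0, j_1+1)(X_t - E_X))] \\ &+ E\left[\exp(e_\Phi(0, j_1+1)(X_t - E_X))\,e_2\Phi_\sigma\Phi^{j_1+1}(X_t - E_X)\right] \end{aligned}\right\}.$$
In the case $\min(j_1, j_2+s) > s-1$, it is
$$\exp\left(.5\begin{cases} \pi_2^2e_{i_2}V_\Sigma(j_1-j_2-s-1)e_{i_2}' & \text{if } j_1 \geq j_2+s \\ \pi_1^2e_{i_1}V_\Sigma(j_2+s-j_1-1)e_{i_1}' & \text{if } j_2+s \geq j_1 \end{cases}\right) \times \left\{\begin{aligned} &\left(e_2\begin{cases} (\rho_1\Phi+\rho_2)\Phi_\sigma\Phi^{s-1} + \rho_3\Phi_\sigma + \rho_4\sigma_2 & \text{if } s > 1 \\ (\rho_1\Phi+\rho_2+\rho_3)\Phi_\sigma + \rho_4\sigma_2 & \text{if } s = 1 \\ (\rho_1+\rho_3)\Phi_\sigma + (\rho_2+\rho_4)\sigma_2 & \text{if } s = 0 \end{cases}V_U(1\ \alpha)' + c_\sigma + e_2\Phi_\sigma\begin{cases} \pi_2\Phi^{j_2+s+1}V_\Sigma(j_1-j_2-s-1)e_{i_2}' & \text{if } j_1 \geq j_2+s \\ \pi_1\Phi^{j_1+1}V_\Sigma(j_2+s-j_1-1)e_{i_1}' & \text{if } j_2+s \geq j_1 \end{cases}\right) \\ &\quad\times E\left[\exp\begin{cases} e_\Phi(0, j_1-j_2-s)(X_t - E_X) & \text{if } j_1 \geq j_2+s \\ e_\Phi(j_2+s-j_1, 0)(X_t - E_X) & \text{if } j_2+s \geq j_1 \end{cases}\right] \\ &+ E\left[\begin{cases} \exp(e_\Phi(0, j_1-j_2-s)(X_t - E_X)) \times e_2\Phi_\sigma\Phi^{j_1+1}(X_t - E_X) & \text{if } j_1 \geq j_2+s \\ \exp(e_\Phi(j_2+s-j_1, 0)(X_t - E_X)) \times e_2\Phi_\sigma\Phi^{j_2+s+1}(X_t - E_X) & \text{if } j_2+s \geq j_1 \end{cases}\right] \end{aligned}\right\}.$$
In the case $j_2+s > s-1 \geq j_1$, it is
$$\exp\left(\pi_1e_{i_1}\begin{cases} \rho_1 & \text{if } j_1 = s-1 \\ \rho_1 + \rho_2 & \text{if } j_1 = s-2 \\ (\rho_1\Phi+\rho_2)\Phi^{s-2-j_1} & \text{if } j_1 < s-2 \end{cases}V_U(1\ \alpha)' + .5\pi_1^2e_{i_1}V_\Sigma(j_2-j_1+s-1)e_{i_1}'\right) \times \left\{\begin{aligned} &\left(e_2\begin{cases} (\rho_1\Phi+\rho_2)\Phi_\sigma\Phi^{s-1} + \rho_3\Phi_\sigma + \rho_4\sigma_2 & \text{if } s > 1 \\ (\rho_1\Phi+\rho_2+\rho_3)\Phi_\sigma + \rho_4\sigma_2 & \text{if } s = 1 \end{cases}V_U(1\ \alpha)' + c_\sigma + \pi_1e_2\Phi_\sigma\Phi^{j_1+1}V_\Sigma(j_2+s-j_1-1)e_{i_1}'\right) \\ &\quad\times E[\exp(e_\Phi(j_2+s-j_1, 0)(X_t - E_X))] \\ &+ E\left[\exp(e_\Phi(j_2+s-j_1, 0)(X_t - E_X))\,e_2\Phi_\sigma\Phi^{j_2+s+1}(X_t - E_X)\right] \end{aligned}\right\}.$$

A.5.2 Computation of second component of Bias0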
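For the one-period problem, the second derivatives of the moment function are
$$\frac{\partial m_{\theta t}}{\partial\beta} = z_tx_{1,t+1}x^{\alpha}_{2,t+1}\begin{pmatrix} 0 & \log(x_{2,t+1}) \end{pmatrix},$$
$$\frac{\partial m_{\theta t}}{\partial\alpha} = z_tx_{1,t+1}x^{\alpha}_{2,t+1}\log(x_{2,t+1})\begin{pmatrix} 1 & \beta\log(x_{2,t+1}) \end{pmatrix},$$
so the second component of Bias0, apart from the factor $-\Xi$, is
$$\begin{pmatrix} \left[\Sigma_{12}W^{11}_{i_1,j_1}(1) + .5\beta\Sigma_{22}W^{11}_{i_1,j_1}(2)\right]_{i_1,j_1} \\ \Sigma_{12}W^{01}_{\circ,\circ}(1) + .5\beta\Sigma_{22}W^{01}_{\circ,\circ}(2) \end{pmatrix}.$$
For the two-period problem, the second derivatives of the moment function are
$$\frac{\partial m_{\theta t}}{\partial\beta} = 2z_tx_{1,t+1}x_{1,t+2}x^{\alpha}_{2,t+1}x^{\alpha}_{2,t+2}\begin{pmatrix} 1 & \beta\log(x_{2,t+1}x_{2,t+2}) \end{pmatrix},$$
$$\frac{\partial m_{\theta t}}{\partial\alpha} = \beta z_tx_{1,t+1}x_{1,t+2}x^{\alpha}_{2,t+1}x^{\alpha}_{2,t+2}\log(x_{2,t+1}x_{2,t+2})\begin{pmatrix} 2 & \beta\log(x_{2,t+1}x_{2,t+2}) \end{pmatrix}.$$
The second component of Bias0, apart from the factor $-\Xi$, is
$$\begin{pmatrix} \left[\beta^{-2}\Sigma_{11}T^{0}_{i_1,j_1} + 2\beta\Sigma_{12}A^{11}_{i_1,j_1}(1) + .5\beta^2\Sigma_{22}A^{11}_{i_1,j_1}(2)\right]_{i_1,j_1} \\ \beta^{-2}\Sigma_{11} + 2\beta\Sigma_{12}A^{01}_{\circ,\circ}(1) + .5\beta^2\Sigma_{22}A^{01}_{\circ,\circ}(2) \end{pmatrix}.$$

A.5.3 Computation of Bias1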
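For the one-period model, we need $Bias_1(s) = -\Sigma E[m_{\theta t}'\Omega m_{t-s}]$, where $s$ may vary from 0 to $\infty$. Apart from the factor $-\Sigma$, it is
$$\sum_{i_1,i_2}\sum_{j_1,j_2}\omega_{i_1,j_1,i_2,j_2}\begin{pmatrix} \beta D^{11101000}_{i_1,j_1,i_2,j_2}(s) - D^{11001000}_{i_1,j_1,i_2,j_2}(s) \\ \beta^2D^{11101010}_{i_1,j_1,i_2,j_2}(s) - \beta D^{11001010}_{i_1,j_1,i_2,j_2}(s) \end{pmatrix} + \omega\begin{pmatrix} \beta D^{00101000}_{\circ,\circ,\circ,\circ}(s) - D^{00001000}_{\circ,\circ,\circ,\circ}(s) \\ \beta^2D^{00101010}_{\circ,\circ,\circ,\circ}(s) - \beta D^{00001010}_{\circ,\circ,\circ,\circ}(s) \end{pmatrix}$$
$$+ \sum_{i_1}\sum_{j_1}\omega_{i_1,j_1}\begin{pmatrix} \beta D^{10101000}_{i_1,j_1,\circ,\circ}(s) - D^{10001000}_{i_1,j_1,\circ,\circ}(s) \\ \beta^2D^{10101010}_{i_1,j_1,\circ,\circ}(s) - \beta D^{10001010}_{i_1,j_1,\circ,\circ}(s) \end{pmatrix} + \sum_{i_2}\sum_{j_2}\omega_{i_2,j_2}\begin{pmatrix} \beta D^{01101000}_{\circ,\circ,i_2,j_2}(s) - D^{01001000}_{\circ,\circ,i_2,j_2}(s) \\ \beta^2D^{01101010}_{\circ,\circ,i_2,j_2}(s) - \beta D^{01001010}_{\circ,\circ,i_2,j_2}(s) \end{pmatrix}.$$
For the two-period model, we need $Bias_1(s) = -\Sigma E[m_{\theta t}'\Omega m_{t-s}]$, where $s$ may vary from $-1$ to $\infty$. Apart from the factor $-\Sigma\beta$, for $s \geq 0$ it is
$$\sum_{i_1,i_2}\sum_{j_1,j_2}\omega_{i_1,j_1,i_2,j_2}\begin{pmatrix} 2\beta^2D^{11111100}_{i_1,j_1,i_2,j_2}(s) - 2D^{11001100}_{i_1,j_1,i_2,j_2}(s) \\ \beta^3D^{11111111}_{i_1,j_1,i_2,j_2}(s) - \beta D^{11001111}_{i_1,j_1,i_2,j_2}(s) \end{pmatrix} + \omega\begin{pmatrix} 2\beta^2D^{00111100}_{\circ,\circ,\circ,\circ}(s) - 2D^{00001100}_{\circ,\circ,\circ,\circ}(s) \\ \beta^3D^{00111111}_{\circ,\circ,\circ,\circ}(s) - \beta D^{00001111}_{\circ,\circ,\circ,\circ}(s) \end{pmatrix}$$
$$+ \sum_{i_1}\sum_{j_1}\omega_{i_1,j_1}\begin{pmatrix} 2\beta^2D^{10111100}_{i_1,j_1,\circ,\circ}(s) - 2D^{10001100}_{i_1,j_1,\circ,\circ}(s) \\ \beta^3D^{10111111}_{i_1,j_1,\circ,\circ}(s) - \beta D^{10001111}_{i_1,j_1,\circ,\circ}(s) \end{pmatrix} + \sum_{i_2}\sum_{j_2}\omega_{i_2,j_2}\begin{pmatrix} 2\beta^2D^{01111100}_{\circ,\circ,i_2,j_2}(s) - 2D^{01001100}_{\circ,\circ,i_2,j_2}(s) \\ \beta^3D^{01111111}_{\circ,\circ,i_2,j_2}(s) - \beta D^{01001111}_{\circ,\circ,i_2,j_2}(s) \end{pmatrix}.$$
Apart from the factor $-\Sigma\beta$, for $s = -1$ it is $E[m_{\theta t}'\Omega m_{t+1}]$, or
$$\sum_{i_1,i_2}\sum_{j_1,j_2}\omega_{i_1,j_1,i_2,j_2}\begin{pmatrix} 2\beta D^{11001200}_{i_1,j_1,i_2,j_2-1}(0) - 2D^{11001100}_{i_1,j_1,i_2,j_2-1}(0) \\ \beta^2D^{11001211}_{i_1,j_1,i_2,j_2-1}(0) - \beta D^{11001111}_{i_1,j_1,i_2,j_2-1}(0) \end{pmatrix} + \omega\begin{pmatrix} 2\beta D^{00001200}_{\circ,\circ,\circ,\circ}(0) - 2D^{00001100}_{\circ,\circ,\circ,\circ}(0) \\ \beta^2D^{00001211}_{\circ,\circ,\circ,\circ}(0) - \beta D^{00001111}_{\circ,\circ,\circ,\circ}(0) \end{pmatrix}$$
$$+ \sum_{i_1}\sum_{j_1}\omega_{i_1,j_1}\begin{pmatrix} 2\beta D^{10001200}_{i_1,j_1,\circ,\circ}(0) - 2D^{10001100}_{i_1,j_1,\circ,\circ}(0) \\ \beta^2D^{10001211}_{i_1,j_1,\circ,\circ}(0) - \beta D^{10001111}_{i_1,j_1,\circ,\circ}(0) \end{pmatrix} + \sum_{i_2}\sum_{j_2}\omega_{i_2,j_2}\begin{pmatrix} 2\beta D^{01001200}_{\circ,\circ,i_2,j_2-1}(0) - 2D^{01001100}_{\circ,\circ,i_2,j_2-1}(0) \\ \beta^2D^{01001211}_{\circ,\circ,i_2,j_2-1}(0) - \beta D^{01001111}_{\circ,\circ,i_2,j_2-1}(0) \end{pmatrix}.$$

A.5.4 Computation of Bias2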
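The key elements of Bias2 are:
$$F^{\pi_1\pi_2\pi_3\rho_1\rho_2\rho_3\rho_4\rho_5\rho_6}_{i_1,j_1,i_2,j_2,i_3,j_3}(r, s) \equiv E\left[x^{\rho_1}_{1,t+1-r}x^{\rho_2}_{1,t+2-r}x^{\rho_1\alpha}_{2,t+1-r}x^{\rho_2\alpha}_{2,t+2-r}\,x^{\rho_3}_{1,t+1-s}x^{\rho_4}_{1,t+2-s}x^{\rho_3\alpha}_{2,t+1-s}x^{\rho_4\alpha}_{2,t+2-s} \times x^{\pi_1}_{i_1,t-j_1}x^{\pi_2}_{i_2,t-j_2-r}x^{\pi_3}_{i_3,t-j_3-s}\,x^{\rho_5}_{1,t+1}x^{\rho_6}_{1,t+2}x^{\rho_5\alpha}_{2,t+1}x^{\rho_6\alpha}_{2,t+2}\right]$$
with $s$ running from $-2$ to $\infty$, and $r = 0, \pm1$.

For the one-period model, we need $Bias_2(s) = \Xi E[m_tm_t'\Omega m_{t-s}]$, with $s$ running from 0 to $\infty$. Apart from the factor $\Xi$, the $s$th term is
$$E[m_tm_t'\Omega m_{t-s}] = E\left[z_t(z_t'\Omega z_{t-s})\left(\beta x_{1,t+1}x^{\alpha}_{2,t+1} - 1\right)^2\left(\beta x_{1,t+1-s}x^{\alpha}_{2,t+1-s} - 1\right)\right]$$
$$= \begin{pmatrix} \left[\begin{aligned} &\sum_{i_2,i_3}\sum_{j_2,j_3}\omega_{i_2,j_2,i_3,j_3}F^{111}_{i_1,j_1,i_2,j_2,i_3,j_3}(s) + \sum_{i_2}\sum_{j_2}\omega_{i_2,j_2}F^{110}_{i_1,j_1,i_2,j_2,\circ,\circ}(s) \\ &+ \sum_{i_3}\sum_{j_3}\omega_{i_3,j_3}F^{101}_{i_1,j_1,\circ,\circ,i_3,j_3}(s) + \omega F^{100}_{i_1,j_1,\circ,\circ,\circ,\circ}(s) \end{aligned}\right]_{i_1,j_1} \\[2ex] \begin{aligned} &\sum_{i_2,i_3}\sum_{j_2,j_3}\omega_{i_2,j_2,i_3,j_3}F^{011}_{\circ,\circ,i_2,j_2,i_3,j_3}(s) + \sum_{i_2}\sum_{j_2}\omega_{i_2,j_2}F^{010}_{\circ,\circ,i_2,j_2,\circ,\circ}(s) \\ &+ \sum_{i_3}\sum_{j_3}\omega_{i_3,j_3}F^{001}_{\circ,\circ,\circ,\circ,i_3,j_3}(s) + \omega F^{000}_{\circ,\circ,\circ,\circ,\circ,\circ}(s) \end{aligned} \end{pmatrix}$$
where
$$F^{\pi_1\pi_2\pi_3}_{i_1,j_1,i_2,j_2,i_3,j_3}(s) = \beta^3F^{\pi_1\pi_2\pi_3001020}_{i_1,j_1,i_2,j_2,i_3,j_3}(\circ, s) - F^{\pi_1\pi_2\pi_3000000}_{i_1,j_1,i_2,j_2,i_3,j_3}(\circ, s) - \beta^2\left(2F^{\pi_1\pi_2\pi_3001010}_{i_1,j_1,i_2,j_2,i_3,j_3}(\circ, s) + F^{\pi_1\pi_2\pi_3000020}_{i_1,j_1,i_2,j_2,i_3,j_3}(\circ, s)\right) + \beta\left(2F^{\pi_1\pi_2\pi_3000010}_{i_1,j_1,i_2,j_2,i_3,j_3}(\circ, s) + F^{\pi_1\pi_2\pi_3001000}_{i_1,j_1,i_2,j_2,i_3,j_3}(\circ, s)\right).$$
For the two-period model, we need
$$Bias_2(s) = \Xi E\left[m_t(m_t + m_{t-1} + m_{t+1})'\Omega m_{t-s}\right],$$
with $s$ running from $-2$ to $\infty$. Apart from the factor $\Xi$, the $s$th term consists of three components, for $r = 0, \pm1$, of the type
$$E\left[m_tm_{t-r}'\Omega m_{t-s}\right] = E\left[z_t(z_{t-r}'\Omega z_{t-s})\left(\beta^2x_{1,t+1}x_{1,t+2}x^{\alpha}_{2,t+1}x^{\alpha}_{2,t+2} - 1\right) \times \left(\beta^2x_{1,t+1-r}x_{1,t+2-r}x^{\alpha}_{2,t+1-r}x^{\alpha}_{2,t+2-r} - 1\right) \times \left(\beta^2x_{1,t+1-s}x_{1,t+2-s}x^{\alpha}_{2,t+1-s}x^{\alpha}_{2,t+2-s} - 1\right)\right]$$
$$= \begin{pmatrix} \left[\begin{aligned} &\sum_{i_2,i_3}\sum_{j_2,j_3}\omega_{i_2,j_2,i_3,j_3}F^{111}_{i_1,j_1,i_2,j_2,i_3,j_3}(r, s) + \sum_{i_2}\sum_{j_2}\omega_{i_2,j_2}F^{110}_{i_1,j_1,i_2,j_2,\circ,\circ}(r, s) \\ &+ \sum_{i_3}\sum_{j_3}\omega_{i_3,j_3}F^{101}_{i_1,j_1,\circ,\circ,i_3,j_3}(r, s) + \omega F^{100}_{i_1,j_1,\circ,\circ,\circ,\circ}(r, s) \end{aligned}\right]_{i_1,j_1} \\[2ex] \begin{aligned} &\sum_{i_2,i_3}\sum_{j_2,j_3}\omega_{i_2,j_2,i_3,j_3}F^{011}_{\circ,\circ,i_2,j_2,i_3,j_3}(r, s) + \sum_{i_2}\sum_{j_2}\omega_{i_2,j_2}F^{010}_{\circ,\circ,i_2,j_2,\circ,\circ}(r, s) \\ &+ \sum_{i_3}\sum_{j_3}\omega_{i_3,j_3}F^{001}_{\circ,\circ,\circ,\circ,i_3,j_3}(r, s) + \omega F^{000}_{\circ,\circ,\circ,\circ,\circ,\circ}(r, s) \end{aligned} \end{pmatrix}$$
where
$$F^{\pi_1\pi_2\pi_3}_{i_1,j_1,i_2,j_2,i_3,j_3}(r, s) = \beta^6F^{\pi_1\pi_2\pi_3111111}_{i_1,j_1,i_2,j_2,i_3,j_3}(r, s) - F^{\pi_1\pi_2\pi_3000000}_{i_1,j_1,i_2,j_2,i_3,j_3}(r, s) - \beta^4\left(F^{\pi_1\pi_2\pi_3001111}_{i_1,j_1,i_2,j_2,i_3,j_3}(r, s) + F^{\pi_1\pi_2\pi_3110011}_{i_1,j_1,i_2,j_2,i_3,j_3}(r, s) + F^{\pi_1\pi_2\pi_3111100}_{i_1,j_1,i_2,j_2,i_3,j_3}(r, s)\right) + \beta^2\left(F^{\pi_1\pi_2\pi_3110000}_{i_1,j_1,i_2,j_2,i_3,j_3}(r, s) + F^{\pi_1\pi_2\pi_3001100}_{i_1,j_1,i_2,j_2,i_3,j_3}(r, s) + F^{\pi_1\pi_2\pi_3000011}_{i_1,j_1,i_2,j_2,i_3,j_3}(r, s)\right).$$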
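Let us denote
$$e_\Phi(k_1, k_2, k_3) = \pi_1e_{i_1}\Phi^{k_1} + \pi_2e_{i_2}\Phi^{k_2} + \pi_3e_{i_3}\Phi^{k_3},$$
$$V_{\alpha\pi}(\rho, k_1, k_2, k_3) = \left(\rho\,(1\ \alpha) + \pi_1e_{i_1}\Phi^{k_1} + \pi_2e_{i_2}\Phi^{k_2} + \pi_3e_{i_3}\Phi^{k_3}\right)V_U\left(\rho\,(1\ \alpha) + \pi_1e_{i_1}\Phi^{k_1} + \pi_2e_{i_2}\Phi^{k_2} + \pi_3e_{i_3}\Phi^{k_3}\right)',$$
$$\sigma^2_{r,s} = \exp\left(.5\,(1\ \alpha)V_U(1\ \alpha)' \times \begin{cases} (\rho_1+\rho_6)^2 + (\rho_2+\rho_3)^2 + \rho_4^2 + \rho_5^2 & \text{if } r = -1,\ s = -2 \\ (\rho_2+\rho_4)^2 + (\rho_1+\rho_3+\rho_6)^2 + \rho_5^2 & \text{if } r = -1,\ s = -1 \\ (\rho_1+\rho_4+\rho_6)^2 + \rho_2^2 + (\rho_3+\rho_5)^2 & \text{if } r = -1,\ s = 0 \\ (\rho_1+\rho_6)^2 + \rho_2^2 + \rho_3^2 + (\rho_4+\rho_5)^2 & \text{if } r = -1,\ s = +1 \\ (\rho_1+\rho_6)^2 + \rho_2^2 + \rho_3^2 + \rho_4^2 + \rho_5^2 & \text{if } r = -1,\ \text{other } s \\ (\rho_1+\rho_5)^2 + (\rho_2+\rho_3+\rho_6)^2 + \rho_4^2 & \text{if } r = 0,\ s = -1 \\ (\rho_1+\rho_3+\rho_5)^2 + (\rho_2+\rho_4+\rho_6)^2 & \text{if } r = 0,\ s = 0 \\ (\rho_1+\rho_4+\rho_5)^2 + (\rho_2+\rho_6)^2 + \rho_3^2 & \text{if } r = 0,\ s = +1 \\ (\rho_1+\rho_5)^2 + (\rho_2+\rho_6)^2 + \rho_3^2 + \rho_4^2 & \text{if } r = 0,\ |s| > 1 \\ (\rho_2+\rho_5)^2 + (\rho_3+\rho_6)^2 + \rho_1^2 + \rho_4^2 & \text{if } r = +1,\ s = -1 \\ (\rho_2+\rho_3+\rho_5)^2 + (\rho_4+\rho_6)^2 + \rho_1^2 & \text{if } r = +1,\ s = 0 \\ (\rho_1+\rho_3)^2 + (\rho_2+\rho_4+\rho_5)^2 + \rho_6^2 & \text{if } r = +1,\ s = +1 \\ (\rho_1+\rho_4)^2 + (\rho_2+\rho_5)^2 + \rho_3^2 + \rho_6^2 & \text{if } r = +1,\ s = +2 \\ \rho_1^2 + (\rho_2+\rho_5)^2 + \rho_3^2 + \rho_4^2 + \rho_6^2 & \text{if } r = +1,\ \text{other } s \end{cases}\right),$$
$$\sigma^2_{s>r} = \exp\left(.5\,(1\ \alpha)V_U(1\ \alpha)' \times \begin{cases} (\rho_1+\rho_4+\rho_6)^2 + \rho_2^2 + \rho_5^2 & \text{if } r = -1,\ s = 0 \\ (\rho_1+\rho_6)^2 + (\rho_4+\rho_5)^2 + \rho_2^2 & \text{if } r = -1,\ s = 1 \\ (\rho_1+\rho_6)^2 + \rho_2^2 + \rho_4^2 + \rho_5^2 & \text{if } r = -1,\ s > 1 \\ (\rho_1+\rho_4+\rho_5)^2 + (\rho_2+\rho_6)^2 & \text{if } r = 0,\ s = 1 \\ (\rho_1+\rho_5)^2 + (\rho_2+\rho_6)^2 + \rho_4^2 & \text{if } r = 0,\ s > 1 \\ (\rho_1+\rho_4)^2 + (\rho_2+\rho_5)^2 + \rho_6^2 & \text{if } r = +1,\ s = 2 \\ \rho_1^2 + (\rho_2+\rho_5)^2 + \rho_4^2 + \rho_6^2 & \text{if } r = +1,\ s > 2 \end{cases}\right),$$
$$\sigma^2_{r>s} = \exp\left(.5\,(1\ \alpha)V_U(1\ \alpha)' \times \begin{cases} (\rho_2+\rho_3)^2 + \rho_4^2 + \rho_5^2 + \rho_6^2 & \text{if } r = -1,\ s = -2 \\ (\rho_2+\rho_6)^2 + \rho_3^2 + \rho_4^2 + \rho_5^2 & \text{if } r = 0,\ s = -2 \\ (\rho_2+\rho_3+\rho_6)^2 + \rho_4^2 + \rho_5^2 & \text{if } r = 0,\ s = -1 \\ (\rho_2+\rho_5)^2 + \rho_3^2 + \rho_4^2 + \rho_6^2 & \text{if } r = +1,\ s = -2 \\ (\rho_2+\rho_5)^2 + (\rho_3+\rho_6)^2 + \rho_4^2 & \text{if } r = +1,\ s = -1 \\ (\rho_2+\rho_3+\rho_5)^2 + (\rho_4+\rho_6)^2 & \text{if } r = +1,\ s = 0 \end{cases}\right),$$
$$\sigma^2_{r} = \exp\left(.5\,(1\ \alpha)V_U(1\ \alpha)' \times \begin{cases} (\rho_1+\rho_6)^2 + \rho_2^2 + \rho_5^2 & \text{if } r = -1 \\ (\rho_1+\rho_5)^2 + (\rho_2+\rho_6)^2 & \text{if } r = 0 \\ \rho_1^2 + (\rho_2+\rho_5)^2 + \rho_6^2 & \text{if } r = +1 \end{cases}\right),$$
$$\sigma^2_{s} = \exp\left(.5\,(1\ \alpha)V_U(1\ \alpha)' \times \begin{cases} \rho_4^2 + \rho_5^2 + (\rho_3+\rho_6)^2 & \text{if } s = -1 \\ (\rho_3+\rho_5)^2 + (\rho_4+\rho_6)^2 & \text{if } s = 0 \\ \rho_3^2 + (\rho_4+\rho_5)^2 + \rho_6^2 & \text{if } s = +1 \\ \rho_3^2 + \rho_4^2 + \rho_5^2 + \rho_6^2 & \text{if } |s| > 1 \end{cases}\right),$$
$$\sigma^2_{s=r=1} = \exp\left(.5\left((\rho_2+\rho_4+\rho_5)^2 + \rho_6^2\right)(1\ \alpha)V_U(1\ \alpha)'\right),$$
$$\sigma^2_{r=1} = \exp\left(.5\left((\rho_2+\rho_5)^2 + \rho_6^2\right)(1\ \alpha)V_U(1\ \alpha)'\right),$$
$$\sigma^2_{-} = \exp\left(.5\left(\rho_5^2 + \rho_6^2\right)(1\ \alpha)V_U(1\ \alpha)'\right).$$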
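The case list for $\sigma^2_{r,s}$ follows one bookkeeping rule: exponents that fall on the same calendar date are added before squaring. The sketch below illustrates that rule only and is not the author's code. The names `sum_of_squares` and `sigma2_rs` are hypothetical, and the date offsets ($\rho_1, \rho_2$ at $t+1-r$, $t+2-r$; $\rho_3, \rho_4$ at $t+1-s$, $t+2-s$; $\rho_5, \rho_6$ at $t+1$, $t+2$) are read off the definition of $F(r, s)$.

```python
from collections import defaultdict
import math


def sum_of_squares(rho, r, s):
    """Bracketed coefficient in sigma^2_{r,s}.

    rho = (rho1, ..., rho6). Exponents loading on the same date (measured
    as an offset from t) are summed first, then each total is squared.
    """
    rho1, rho2, rho3, rho4, rho5, rho6 = rho
    load = defaultdict(float)
    load[1 - r] += rho1
    load[2 - r] += rho2
    load[1 - s] += rho3
    load[2 - s] += rho4
    load[1] += rho5
    load[2] += rho6
    return sum(v * v for v in load.values())


def sigma2_rs(rho, r, s, alpha, v_u):
    """exp(.5 (1 alpha) V_U (1 alpha)' * coefficient); v_u is a 2x2 nested list."""
    a = (1.0, alpha)
    quad = sum(a[i] * v_u[i][j] * a[j] for i in range(2) for j in range(2))
    return math.exp(0.5 * quad * sum_of_squares(rho, r, s))
```

For instance, with $r = -1$, $s = -2$ the function reproduces $(\rho_1+\rho_6)^2 + (\rho_2+\rho_3)^2 + \rho_4^2 + \rho_5^2$, the first line of the case list.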
The term $F^{\pi_1\pi_2\pi_3\rho_1\rho_2\rho_3\rho_4\rho_5\rho_6}_{i_1,j_1,i_2,j_2,i_3,j_3}(r, s)$ is a product of two components. The first component is
$$\exp\left((\rho_1+\rho_2+\rho_3+\rho_4+\rho_5+\rho_6)(1\ \alpha)E_X + (\pi_1e_{i_1} + \pi_2e_{i_2} + \pi_3e_{i_3})E_X\right) \times E\left[\exp\begin{cases} e_\Phi(0,\ j_1-j_2-r,\ j_1-j_3-s)(X_t - E_X) & \text{if } j_1 \geq \max(j_2+r, j_3+s) \\ e_\Phi(j_2+r-j_1,\ 0,\ j_2+r-j_3-s)(X_t - E_X) & \text{if } j_2+r \geq \max(j_1, j_3+s) \\ e_\Phi(j_3+s-j_1,\ j_3+s-j_2-r,\ 0)(X_t - E_X) & \text{if } j_3+s \geq \max(j_1, j_2+r) \end{cases}\right].$$
Now we turn to the second component. In the case $\min(j_1, j_2+r, j_3+s) > \max(r-1, s-1)$, it is
$$\sigma^2_{r,s}\exp\left(.5\begin{cases} \pi_3^2e_{i_3}V_\Sigma(j_2+r-j_3-s-1)e_{i_3}' + \left(\pi_2e_{i_2} + \pi_3e_{i_3}\Phi^{j_2+r-j_3-s}\right)V_\Sigma(j_1-j_2-r-1)\left(\pi_2e_{i_2} + \pi_3e_{i_3}\Phi^{j_2+r-j_3-s}\right)' & \text{if } j_1 \geq j_2+r \geq j_3+s \\ \pi_2^2e_{i_2}V_\Sigma(j_3+s-j_2-r-1)e_{i_2}' + \left(\pi_2e_{i_2}\Phi^{j_3+s-j_2-r} + \pi_3e_{i_3}\right)V_\Sigma(j_1-j_3-s-1)\left(\pi_2e_{i_2}\Phi^{j_3+s-j_2-r} + \pi_3e_{i_3}\right)' & \text{if } j_1 \geq j_3+s \geq j_2+r \\ \pi_3^2e_{i_3}V_\Sigma(j_1-j_3-s-1)e_{i_3}' + \left(\pi_1e_{i_1} + \pi_3e_{i_3}\Phi^{j_1-j_3-s}\right)V_\Sigma(j_2+r-j_1-1)\left(\pi_1e_{i_1} + \pi_3e_{i_3}\Phi^{j_1-j_3-s}\right)' & \text{if } j_2+r \geq j_1 \geq j_3+s \\ \pi_1^2e_{i_1}V_\Sigma(j_3+s-j_1-1)e_{i_1}' + \left(\pi_1e_{i_1}\Phi^{j_3+s-j_1} + \pi_3e_{i_3}\right)V_\Sigma(j_2+r-j_3-s-1)\left(\pi_1e_{i_1}\Phi^{j_3+s-j_1} + \pi_3e_{i_3}\right)' & \text{if } j_2+r \geq j_3+s \geq j_1 \\ \pi_2^2e_{i_2}V_\Sigma(j_1-j_2-r-1)e_{i_2}' + \left(\pi_1e_{i_1} + \pi_2e_{i_2}\Phi^{j_1-j_2-r}\right)V_\Sigma(j_3+s-j_1-1)\left(\pi_1e_{i_1} + \pi_2e_{i_2}\Phi^{j_1-j_2-r}\right)' & \text{if } j_3+s \geq j_1 \geq j_2+r \\ \pi_1^2e_{i_1}V_\Sigma(j_2+r-j_1-1)e_{i_1}' + \left(\pi_1e_{i_1}\Phi^{j_2+r-j_1} + \pi_2e_{i_2}\right)V_\Sigma(j_3+s-j_2-r-1)\left(\pi_1e_{i_1}\Phi^{j_2+r-j_1} + \pi_2e_{i_2}\right)' & \text{if } j_3+s \geq j_2+r \geq j_1 \end{cases}\right).$$
In the case $j_1 \geq j_2+r > j_3+s = r-1 > s-1$, it is
$$\sigma^2_{r>s}\exp\left(.5V_{\alpha\pi}(\rho_1, -, -, 0) + .5\pi_3^2e_{i_3}\Phi V_\Sigma(j_2+r-j_3-s-2)\Phi'e_{i_3}' + .5\left(\pi_2e_{i_2} + \pi_3e_{i_3}\Phi^{j_2+r-j_3-s}\right)V_\Sigma(j_1-j_2-r-1)\left(\pi_2e_{i_2} + \pi_3e_{i_3}\Phi^{j_2+r-j_3-s}\right)'\right).$$
In the case $j_1 \geq j_3+s > j_2+r = s-1 > r-1$, it is
$$\sigma^2_{s>r}\exp\left(.5V_{\alpha\pi}(\rho_3, -, 0, -) + .5\pi_2^2e_{i_2}\Phi V_\Sigma(j_3+s-j_2-r-2)\Phi'e_{i_2}' + .5\left(\pi_2e_{i_2}\Phi^{j_3+s-j_2-r} + \pi_3e_{i_3}\right)V_\Sigma(j_1-j_3-s-1)\left(\pi_2e_{i_2}\Phi^{j_3+s-j_2-r} + \pi_3e_{i_3}\right)'\right).$$
In the case $j_1 \geq j_2+r > r-1 > j_3+s > s-1$, it is
$$\sigma^2_{s}\exp\left(\begin{aligned} &.5V_{\alpha\pi}(\rho_1, -, -, r-j_3-s-1) + .5V_{\alpha\pi}(\rho_2, -, -, r-j_3-s-2) \\ &+ .5\pi_3^2e_{i_3}V_\Sigma(r-j_3-s-3)e_{i_3}' + .5\pi_3^2e_{i_3}\Phi^{r-j_3-s}V_\Sigma(j_2-1)\Phi^{r-j_3-s\prime}e_{i_3}' \\ &+ .5\left(\pi_2e_{i_2} + \pi_3e_{i_3}\Phi^{j_2+r-j_3-s}\right)V_\Sigma(j_1-j_2-r-1)\left(\pi_2e_{i_2} + \pi_3e_{i_3}\Phi^{j_2+r-j_3-s}\right)' \end{aligned}\right).$$
In the case $j_1 \geq j_3+s > s-1 > j_2+r > r-1$, it is
$$\sigma^2_{r}\exp\left(\begin{aligned} &.5V_{\alpha\pi}(\rho_3, -, s-j_2-r-1, -) + .5V_{\alpha\pi}(\rho_4, -, s-j_2-r-2, -) \\ &+ .5\pi_2^2e_{i_2}V_\Sigma(s-j_2-r-3)e_{i_2}' + .5\pi_2^2e_{i_2}\Phi^{s-j_2-r}V_\Sigma(j_3-1)\Phi^{s-j_2-r\prime}e_{i_2}' \\ &+ .5\left(\pi_2e_{i_2}\Phi^{j_3+s-j_2-r} + \pi_3e_{i_3}\right)V_\Sigma(j_1-j_3-s-1)\left(\pi_2e_{i_2}\Phi^{j_3+s-j_2-r} + \pi_3e_{i_3}\right)' \end{aligned}\right).$$
In the case $j_2+r > j_1 \geq j_3+s = r-1 > s-1$, it is
$$\sigma^2_{r>s}\exp\left(\begin{aligned} &.5V_{\alpha\pi}(\rho_1, -, -, 0) + .5\pi_3^2e_{i_3}\Phi V_\Sigma(j_1-j_3-s-2)\Phi'e_{i_3}' \\ &+ .5\left(\pi_1e_{i_1} + (\rho_1(1\ \alpha) + \pi_3e_{i_3})\Phi^{j_1-j_3-s}\right)V_\Sigma(j_2+r-j_1-1)\left(\pi_1e_{i_1} + (\rho_1(1\ \alpha) + \pi_3e_{i_3})\Phi^{j_1-j_3-s}\right)' \end{aligned}\right).$$
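In the case $j_3+s > j_1 \geq j_2+r = s-1 > r-1$, it is
$$\sigma^2_{s>r}\exp\left(\begin{aligned} &.5V_{\alpha\pi}(\rho_3, -, 0, -) + .5\pi_2^2e_{i_2}\Phi V_\Sigma(j_1-j_2-r-2)\Phi'e_{i_2}' \\ &+ .5\left(\pi_1e_{i_1} + (\rho_3(1\ \alpha) + \pi_2e_{i_2})\Phi^{j_1-j_2-r}\right)V_\Sigma(j_3+s-j_1-1)\left(\pi_1e_{i_1} + (\rho_3(1\ \alpha) + \pi_2e_{i_2})\Phi^{j_1-j_2-r}\right)' \end{aligned}\right).$$
In the case $j_2+r > j_1 \geq r-1 > j_3+s > s-1$, it is
$$\sigma^2_{s}\exp\left(\begin{aligned} &.5V_{\alpha\pi}(\rho_1, -, -, r-j_3-s-1) + .5V_{\alpha\pi}(\rho_2, -, -, r-j_3-s-2) \\ &+ .5\pi_3^2e_{i_3}V_\Sigma(r-j_3-s-3)e_{i_3}' + .5\pi_3^2e_{i_3}\Phi^{r-j_3-s}V_\Sigma(j_1-r-1)\Phi^{r-j_3-s\prime}e_{i_3}' \end{aligned}\right).$$
In the case $j_3+s > j_1 \geq s-1 > j_2+r > r-1$, it is
$$\sigma^2_{r}\exp\left(\begin{aligned} &.5V_{\alpha\pi}(\rho_3, -, s-j_2-r-1, -) + .5V_{\alpha\pi}(\rho_4, -, s-j_2-r-2, -) \\ &+ .5\pi_2^2e_{i_2}V_\Sigma(s-j_2-r-3)e_{i_2}' + .5\pi_2^2e_{i_2}\Phi^{s-j_2-r}V_\Sigma(j_1-s-1)\Phi^{s-j_2-r\prime}e_{i_2}' \end{aligned}\right).$$
In the case $\min(j_2+r, j_3+s) > j_1 = \begin{cases} r-1 \geq s-1 \\ s-1 > r-1 \end{cases}$, it is
$$\begin{cases} \sigma^2_{r>s}\exp\left(.5V_{\alpha\pi}(\rho_1, 0, -, -)\right) & \text{if } j_1 = r-1 > s-1 \\ \sigma^2_{s=r=1}\exp\left(.5V_{\alpha\pi}(\rho_1+\rho_3, 0, -, -)\right) & \text{if } j_1 = r-1 = s-1 \\ \sigma^2_{s>r}\exp\left(.5V_{\alpha\pi}(\rho_3, 0, -, -)\right) & \text{if } j_1 = s-1 > r-1 \end{cases}$$
$$\times \exp\begin{cases} .5\pi_1^2e_{i_1}\Phi V_\Sigma(j_3+s-j_1-2)\Phi'e_{i_1}' + .5\left(\pi_1e_{i_1}\Phi^{j_3+s-j_1} + \pi_3e_{i_3}\right)V_\Sigma(j_2+r-j_3-s-1)\left(\pi_1e_{i_1}\Phi^{j_3+s-j_1} + \pi_3e_{i_3}\right)' & \text{if } j_2+r \geq j_3+s \\ .5\pi_1^2e_{i_1}\Phi V_\Sigma(j_2+r-j_1-2)\Phi'e_{i_1}' + .5\left(\pi_1e_{i_1}\Phi^{j_2+r-j_1} + \pi_2e_{i_2}\right)V_\Sigma(j_3+s-j_2-r-1)\left(\pi_1e_{i_1}\Phi^{j_2+r-j_1} + \pi_2e_{i_2}\right)' & \text{if } j_3+s \geq j_2+r \end{cases}.$$
In the case $\min(j_3+s, j_2+r) > s-1 > j_1 \geq r-1$, it is
$$\begin{cases} \sigma^2_{r}\exp\left(.5\pi_1^2e_{i_1}V_\Sigma(s-j_1-3)e_{i_1}'\right) & \text{if } j_1 > r-1 \\ \sigma^2_{r=1}\exp\left(.5V_{\alpha\pi}(\rho_1, 0, -, -) + .5\pi_1^2e_{i_1}\Phi V_\Sigma(s-j_1-4)\Phi'e_{i_1}'\right) & \text{if } j_1 = r-1 \end{cases}$$
$$\times \exp\left(.5V_{\alpha\pi}(\rho_3, s-j_1-1, -, -) + .5V_{\alpha\pi}(\rho_4, s-j_1-2, -, -)\right)$$
$$\times \exp\begin{cases} .5\pi_1^2e_{i_1}\Phi^{s-j_1}V_\Sigma(j_3-1)\Phi^{s-j_1\prime}e_{i_1}' + .5\left(\pi_1e_{i_1}\Phi^{j_3+s-j_1} + \pi_3e_{i_3}\right)V_\Sigma(j_2+r-j_3-s-1)\left(\pi_1e_{i_1}\Phi^{j_3+s-j_1} + \pi_3e_{i_3}\right)' & \text{if } j_2+r \geq j_3+s \\ .5\pi_1^2e_{i_1}\Phi^{s-j_1}V_\Sigma(j_2+r-s-1)\Phi^{s-j_1\prime}e_{i_1}' + .5\left(\pi_1e_{i_1}\Phi^{j_2+r-j_1} + \pi_2e_{i_2}\right)V_\Sigma(j_3+s-j_2-r-1)\left(\pi_1e_{i_1}\Phi^{j_2+r-j_1} + \pi_2e_{i_2}\right)' & \text{if } j_3+s \geq j_2+r \end{cases}.$$
In the case $j_3+s > s-1 > j_1 \geq j_2+r > r-1$, it is
$$\sigma^2_{r}\exp\left(\begin{aligned} &.5V_{\alpha\pi}(\rho_3, s-j_1-1, s-j_2-r-1, -) + .5V_{\alpha\pi}(\rho_4, s-j_1-2, s-j_2-r-2, -) \\ &+ .5\pi_2^2e_{i_2}V_\Sigma(j_1-j_2-r-1)e_{i_2}' \\ &+ .5\left(\pi_1e_{i_1} + \pi_2e_{i_2}\Phi^{j_1-j_2-r}\right)V_\Sigma(s-j_1-3)\left(\pi_1e_{i_1} + \pi_2e_{i_2}\Phi^{j_1-j_2-r}\right)' \\ &+ .5\left(\pi_1e_{i_1}\Phi^{s-j_1} + \pi_2e_{i_2}\Phi^{s-j_2-r}\right)V_\Sigma(j_3-1)\left(\pi_1e_{i_1}\Phi^{s-j_1} + \pi_2e_{i_2}\Phi^{s-j_2-r}\right)' \end{aligned}\right).$$
In the case $j_3+s > j_2+r = s-1 > j_1 \geq r-1$, it is
$$\begin{cases} \sigma^2_{r}\exp\left(.5\pi_1^2e_{i_1}V_\Sigma(s-j_1-3)e_{i_1}'\right) & \text{if } j_1 > r-1 \\ \sigma^2_{r=1}\exp\left(.5V_{\alpha\pi}(\rho_1, 0, -, -) + .5\pi_1^2e_{i_1}\Phi V_\Sigma(s-j_1-4)\Phi'e_{i_1}'\right) & \text{if } j_1 = r-1 \end{cases}$$
$$\times \exp\left(.5V_{\alpha\pi}(\rho_4, s-j_1-2, -, -) + .5\left(\rho_3(1\ \alpha) + \pi_1e_{i_1}\Phi^{j_2+r-j_1} + \pi_2e_{i_2}\right)V_\Sigma(j_3+s-j_2-r-1)\left(\rho_3(1\ \alpha) + \pi_1e_{i_1}\Phi^{j_2+r-j_1} + \pi_2e_{i_2}\right)'\right).$$
In the case $j_3+s > s-1 > j_2+r > j_1 \geq r-1$, it is
$$\begin{cases} \sigma^2_{r}\exp\left(.5\pi_1^2e_{i_1}V_\Sigma(j_2+r-j_1-1)e_{i_1}'\right) & \text{if } j_1 > r-1 \\ \sigma^2_{r=1}\exp\left(.5(\rho_1(1\ \alpha) + \pi_1e_{i_1})V_\Sigma(j_2)(\rho_1(1\ \alpha) + \pi_1e_{i_1})'\right) & \text{if } j_1 = r-1 \end{cases}$$
$$\times \exp\left(\begin{aligned} &.5V_{\alpha\pi}(\rho_3, s-j_1-1, s-j_2-r-1, -) + .5V_{\alpha\pi}(\rho_4, s-j_1-2, s-j_2-r-2, -) \\ &+ .5\left(\pi_1e_{i_1}\Phi^{j_2+r-j_1} + \pi_2e_{i_2}\right)V_\Sigma(s-j_2-r-3)\left(\pi_1e_{i_1}\Phi^{j_2+r-j_1} + \pi_2e_{i_2}\right)' \\ &+ .5\left(\pi_1e_{i_1}\Phi^{s-j_1} + \pi_2e_{i_2}\Phi^{s-j_2-r}\right)V_\Sigma(j_3-1)\left(\pi_1e_{i_1}\Phi^{s-j_1} + \pi_2e_{i_2}\Phi^{s-j_2-r}\right)' \end{aligned}\right).$$
In the case $j_3+s > s-1 > j_2+r > r-1 > j_1$, it is
$$\sigma^2_{-}\exp\left(\begin{aligned} &.5V_{\alpha\pi}(\rho_1, r-j_1-1, -, -) + .5V_{\alpha\pi}(\rho_2, r-j_1-2, -, -) \\ &+ .5V_{\alpha\pi}(\rho_3, s-j_1-1, s-j_2-r-1, -) + .5V_{\alpha\pi}(\rho_4, s-j_1-2, s-j_2-r-2, -) \\ &+ .5\pi_1^2e_{i_1}V_\Sigma(r-j_1-3)e_{i_1}' + .5\pi_1^2e_{i_1}\Phi^{r-j_1}V_\Sigma(j_2-1)\Phi^{r-j_1\prime}e_{i_1}' \\ &+ .5\left(\pi_1e_{i_1}\Phi^{j_2+r-j_1} + \pi_2e_{i_2}\right)V_\Sigma(s-j_2-r-3)\left(\pi_1e_{i_1}\Phi^{j_2+r-j_1} + \pi_2e_{i_2}\right)' \\ &+ .5\left(\pi_1e_{i_1}\Phi^{s-j_1} + \pi_2e_{i_2}\Phi^{s-j_2-r}\right)V_\Sigma(j_3-1)\left(\pi_1e_{i_1}\Phi^{s-j_1} + \pi_2e_{i_2}\Phi^{s-j_2-r}\right)' \end{aligned}\right).$$

A.5.5 Computation of Bias3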
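The key elements of Bias3 are:
$$C^{\pi_1\pi_2\rho_1\rho_2\tau_1\tau_2\sigma_0\sigma_1\sigma_2}_{i_1,j_1,i_2,j_2} \equiv E\left[x^{\pi_1}_{i_1,t-j_1}x^{\pi_2}_{i_2,t-j_2}\,x^{\rho_1}_{1,t+1}x^{\rho_2}_{1,t+2}x^{\rho_1\alpha}_{2,t+1}x^{\rho_2\alpha}_{2,t+2}\,x^{\tau_1}_{1,t}x^{\tau_2}_{2,t}\log\left(x^{\sigma_0}_{2,t}x^{\sigma_1}_{2,t+1}x^{\sigma_2}_{2,t+2}\right)\right].$$
Let us denote
$$e_\Phi(k_1, k_2, t) = \pi_1e_{i_1}\Phi^{k_1} + \pi_2e_{i_2}\Phi^{k_2} + \tau\Phi^{t}, \qquad v_\sigma = e_2\left(\sigma_0 + \sigma_1\Phi + \sigma_2\Phi^2\right).$$
Then for $j_1 \geq j_2$
$$C^{\pi_1\pi_2\rho_1\rho_2\tau_1\tau_2\sigma_0\sigma_1\sigma_2}_{i_1,j_1,i_2,j_2} = \exp\left(\begin{aligned} &((\rho_1+\rho_2)(1\ \alpha) + \pi_1e_{i_1} + \pi_2e_{i_2} + \tau)E_X \\ &+ .5\left(\rho_1^2 + \rho_2^2\right)(1\ \alpha)V_U(1\ \alpha)' + .5\tau V_\Sigma(j_2-1)\tau' \\ &+ .5e_\Phi(-, 0, j_2)V_\Sigma(j_1-j_2-1)e_\Phi(-, 0, j_2)' \end{aligned}\right) \times \left\{\begin{aligned} &\left(e_2(\sigma_0+\sigma_1+\sigma_2)E_X + e_2(\rho_1(\sigma_1+\sigma_2\Phi) + \rho_2\sigma_2)V_U(1\ \alpha)' + v_\sigma V_\Sigma(j_1-1)\tau' + \pi_2v_\sigma\Phi^{j_2}V_\Sigma(j_1-j_2-1)e_{i_2}'\right) \\ &\quad\times E[\exp(e_\Phi(0, j_1-j_2, j_1)(X_t - E_X))] \\ &+ E\left[\exp(e_\Phi(0, j_1-j_2, j_1)(X_t - E_X))\,v_\sigma\Phi^{j_1}(X_t - E_X)\right] \end{aligned}\right\}.$$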
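For the one-period model, the derivatives of outer square of the moment function are
$$\frac{\partial m_tm_t'}{\partial\beta} = 2z_tz_t'\left(\beta x_{1,t+1}x^{\alpha}_{2,t+1} - 1\right)x_{1,t+1}x^{\alpha}_{2,t+1},$$
$$\frac{\partial m_tm_t'}{\partial\alpha} = 2z_tz_t'\left(\beta x_{1,t+1}x^{\alpha}_{2,t+1} - 1\right)x_{1,t+1}x^{\alpha}_{2,t+1}\log(x_{2,t+1}).$$
The component Bias3 is
$$2\Xi\left(\frac{\sigma_m^2}{\beta}\Gamma_0\Psi e_1 + \begin{pmatrix} P_{1,1} & P_{1,2} & T_1 \\ P_{2,1} & P_{2,2} & T_2 \\ T_1' & T_2' & S \end{pmatrix}\Psi e_2\right),$$
where $\Psi = \Omega VQ(Q'Q)^{-1}$, and
$$P_{i_1,i_2} = \left\|\beta C^{112000010}_{i_1,j_1,i_2,j_2} - C^{111000010}_{i_1,j_1,i_2,j_2}\right\|_{j_1=0,\dots,nl_{i_1}-1,\ j_2=0,\dots,nl_{i_2}-1},$$
$$T_{i_1} = \left\|\beta C^{102000010}_{i_1,j_1,\circ,\circ} - C^{101000010}_{i_1,j_1,\circ,\circ}\right\|_{j_1=0,\dots,nl_{i_1}-1},$$
$$S = \beta C^{002000010}_{\circ,\circ,\circ,\circ} - C^{001000010}_{\circ,\circ,\circ,\circ}.$$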
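For the two-period model, the derivatives of outer square of the moment function are
$$\frac{\partial m_tm_t'}{\partial\beta} = 2z_tz_t'\left(\beta^2x_{1,t+1}x_{1,t+2}x^{\alpha}_{2,t+1}x^{\alpha}_{2,t+2} - 1\right)2\beta x_{1,t+1}x_{1,t+2}x^{\alpha}_{2,t+1}x^{\alpha}_{2,t+2},$$
$$\frac{\partial m_tm_{t-1}'}{\partial\beta} = z_tz_{t-1}'\left(\beta^2x_{1,t+1}x_{1,t+2}x^{\alpha}_{2,t+1}x^{\alpha}_{2,t+2} - 1\right)2\beta x_{1,t}x_{1,t+1}x^{\alpha}_{2,t}x^{\alpha}_{2,t+1} + z_tz_{t-1}'\left(\beta^2x_{1,t}x_{1,t+1}x^{\alpha}_{2,t}x^{\alpha}_{2,t+1} - 1\right)2\beta x_{1,t+1}x_{1,t+2}x^{\alpha}_{2,t+1}x^{\alpha}_{2,t+2},$$
$$\frac{\partial m_tm_t'}{\partial\alpha} = 2z_tz_t'\left(\beta^2x_{1,t+1}x_{1,t+2}x^{\alpha}_{2,t+1}x^{\alpha}_{2,t+2} - 1\right)\beta^2x_{1,t+1}x_{1,t+2}x^{\alpha}_{2,t+1}x^{\alpha}_{2,t+2}\log(x_{2,t+1}x_{2,t+2}),$$
$$\frac{\partial m_tm_{t-1}'}{\partial\alpha} = z_tz_{t-1}'\left(\beta^2x_{1,t+1}x_{1,t+2}x^{\alpha}_{2,t+1}x^{\alpha}_{2,t+2} - 1\right)\beta^2x_{1,t}x_{1,t+1}x^{\alpha}_{2,t}x^{\alpha}_{2,t+1}\log(x_{2,t}x_{2,t+1}) + z_tz_{t-1}'\left(\beta^2x_{1,t}x_{1,t+1}x^{\alpha}_{2,t}x^{\alpha}_{2,t+1} - 1\right)\beta^2x_{1,t+1}x_{1,t+2}x^{\alpha}_{2,t+1}x^{\alpha}_{2,t+2}\log(x_{2,t+1}x_{2,t+2}).$$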
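The component Bias3 is
$$\Xi\left(B_{4\beta}\Psi e_1 + B_{4\alpha}\Psi e_2\right),$$
where $\Psi = \Omega VQ(Q'Q)^{-1}$, and both $B_{4\beta}$ and $B_{4\alpha}$ have the structure
$$\begin{pmatrix} P_{1,1} & P_{1,2} & T_1 \\ P_{2,1} & P_{2,2} & T_2 \\ T_1' & T_2' & S \end{pmatrix}, \quad\text{where}\quad P_{i_1,i_2} = \left\|P_{i_1,j_1,i_2,j_2}\right\|_{j_1=0,\dots,nl_{i_1}-1,\ j_2=0,\dots,nl_{i_2}-1}, \quad T_{i_1} = \left\|T_{i_1,j_1}\right\|_{j_1=0,\dots,nl_{i_1}-1}.$$
For $B_{4\beta}$, the entries are
$$P_{i_1,j_1,i_2,j_2} = 4\beta\left(\beta^2C^{112200000}_{i_1,j_1,i_2,j_2} - C^{111100000}_{i_1,j_1,i_2,j_2}\right) + 4\beta^2\left(C^{11201\alpha000}_{i_1,j_1,i_2,j_2+1} + C^{11201\alpha000}_{i_1,j_1+1,i_2,j_2}\right) - 2\beta\left(C^{11101\alpha000}_{i_1,j_1,i_2,j_2+1} + C^{11101\alpha000}_{i_1,j_1+1,i_2,j_2}\right) - 2\left(C^{111000000}_{i_1,j_1,i_2,j_2+1} + C^{111000000}_{i_1,j_1+1,i_2,j_2}\right),$$
$$T_{i_1,j_1} = 4\beta\left(\beta^2C^{102200000}_{i_1,j_1,\circ,\circ} - C^{101100000}_{i_1,j_1,\circ,\circ}\right) + 4\beta^2\left(C^{10201\alpha000}_{i_1,j_1,\circ,\circ} + C^{10201\alpha000}_{i_1,j_1+1,\circ,\circ}\right) - 2\beta\left(C^{10101\alpha000}_{i_1,j_1,\circ,\circ} + C^{10101\alpha000}_{i_1,j_1+1,\circ,\circ}\right) - 2\left(C^{101000000}_{i_1,j_1,\circ,\circ} + C^{101000000}_{i_1,j_1+1,\circ,\circ}\right),$$
$$S = 4\beta\left(\beta^2C^{002200000}_{\circ,\circ,\circ,\circ} - C^{001100000}_{\circ,\circ,\circ,\circ}\right) + 8\beta^2C^{00201\alpha000}_{\circ,\circ,\circ,\circ} - 4\beta C^{00101\alpha000}_{\circ,\circ,\circ,\circ} - 4C^{001000000}_{\circ,\circ,\circ,\circ},$$
and for $B_{4\alpha}$, the entries are
$$P_{i_1,j_1,i_2,j_2} = 2\beta^2\left(\beta^2C^{112200011}_{i_1,j_1,i_2,j_2} - C^{111100011}_{i_1,j_1,i_2,j_2}\right) + \beta^2\left(\beta\left(C^{11201\alpha110}_{i_1,j_1,i_2,j_2+1} + C^{11201\alpha110}_{i_1,j_1+1,i_2,j_2}\right) - \left(C^{11101\alpha110}_{i_1,j_1,i_2,j_2+1} + C^{11101\alpha110}_{i_1,j_1+1,i_2,j_2}\right)\right) + \beta^2\left(\beta^2\left(C^{11211\alpha011}_{i_1,j_1,i_2,j_2+1} + C^{11211\alpha011}_{i_1,j_1+1,i_2,j_2}\right) - \left(C^{111100011}_{i_1,j_1,i_2,j_2+1} + C^{111100011}_{i_1,j_1+1,i_2,j_2}\right)\right),$$
$$T_{i_1,j_1} = 2\beta^2\left(\beta^2C^{102200011}_{i_1,j_1,\circ,\circ} - C^{101100011}_{i_1,j_1,\circ,\circ}\right) + \beta^2\left(\beta\left(C^{10201\alpha110}_{i_1,j_1,\circ,\circ} + C^{10201\alpha110}_{i_1,j_1+1,\circ,\circ}\right) - \left(C^{10101\alpha110}_{i_1,j_1,\circ,\circ} + C^{10101\alpha110}_{i_1,j_1+1,\circ,\circ}\right)\right) + \beta^2\left(\beta^2\left(C^{10211\alpha011}_{i_1,j_1,\circ,\circ} + C^{10211\alpha011}_{i_1,j_1+1,\circ,\circ}\right) - \left(C^{101100011}_{i_1,j_1,\circ,\circ} + C^{101100011}_{i_1,j_1+1,\circ,\circ}\right)\right),$$
$$S = 2\beta^2\left(\beta^2C^{002200011}_{\circ,\circ,\circ,\circ} - C^{001100011}_{\circ,\circ,\circ,\circ} + \beta C^{00201\alpha110}_{\circ,\circ,\circ,\circ} - C^{00101\alpha110}_{\circ,\circ,\circ,\circ} + \beta^2C^{00211\alpha011}_{\circ,\circ,\circ,\circ} - C^{001100011}_{\circ,\circ,\circ,\circ}\right).$$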