Bayesian Comparison of Dynamic Macroeconomic Models

John S. Landon-Lane∗
University of New South Wales
Sydney, NSW 2052
E-Mail: [email protected]

∗ I would like to thank Professors John Geweke, Michael Keane, Lee Ohanian, and Craig Swan for their useful comments. I also acknowledge the comments of Arnie Quinn, Claustre Bajona-Xandri and seminar participants at Louisiana State University, Rice University and the University of Minnesota. All remaining mistakes are my own.

Preliminary draft. Please do not quote. October 27, 1998

Abstract

This paper develops a method to formally compare and evaluate non-linear dynamic macroeconomic models. In particular, the method developed aims to compare models that are found in the real business cycle (RBC) literature. The method is based on the Bayesian model comparison literature and is flexible enough to incorporate model uncertainty and prior beliefs over the parameters of the model into the final decision. Calibration is therefore treated as a special case of a more general method. The method is also able to compare two models over subsets of the data as well as over the whole data set, thus allowing for the identification of where one model is "better" than another. In order to use the Bayesian model comparison literature, a likelihood function for each model needs to be calculated. Using Bayesian techniques, a likelihood function for a typical model in the RBC literature is calculated directly from the equations that make up the solution to the model. Because the method is likelihood based, models in the RBC literature can now be compared across the full dimension of the data. To illustrate the method, two models found in the RBC literature are compared both for the case where there is no prior uncertainty over the structural parameters of the model and for the case where there is some prior uncertainty over the parameters. It is found that the comparison is sensitive to the prior specification. The paper also shows how to draw from the posterior distributions of models that are typical in the RBC literature using standard Markov chain Monte Carlo techniques.

1 Introduction

Dynamic macroeconomic models have become an important tool in modern macroeconomics. In macroeconomics, dynamic general equilibrium models have been used to develop models of the economy; to ask and answer questions regarding observed economic phenomena in these economies; and to conduct
experiments using the models. Frequently, new models are distinguished from older models by how they emulate features in the data. There may be a puzzling feature of the data that is observed and a new model may be proposed that attempts to mimic this feature. If one is to seriously ask whether a model answers a particular question, they first must believe that the model used to answer the question is a “good” model. However, the question of what is a “good” model is a difficult concept. It may be too much to expect a model to perfectly predict the complex system that is being observed. In that case, one has to decide what aspects of the system are important for a model to explain well. Once a determination has been made on what the model should be attempting to explain, it is necessary to determine how to decide whether a model has replicated the important aspects of the observed system. In economics, the models that have been used to model the observed economic world have become more and more complex. In particular, after the seminal paper of Kydland and Prescott (1982), the Real Business Cycle (RBC) literature grew using more and more complicated dynamic non-linear models in the search for answers to questions posed by investigators. The approach of the RBC literature that is described by Kydland and Prescott (1996) is a familiar one: Models are advanced to approximate an economy that is observed. The models are tested to see if they are “good” representations of the economy that they are attempting to model. If the model is determined to be a “good” representation of the observed economy, it is then used to answer questions regarding the observed economy or used as an experimental tool to analyze the effects of competing policies that are proposed. The methods that have been used to validate the use of the models in the RBC literature have come under increasing criticism. The most common method of determining whether a model does a “good” job of mimicking an observed economy has been criticized for being too informal (Hansen and Heckman 1996). Various alternative methods have been proposed to evaluate the ability of models with respect to observations on the economy. Almost all of the alternative methods evaluate the performance of a model with respect to the observed data rather than evaluating a model with respect to other competing models in the literature. According to Kydland and Prescott (1996), the models used in the RBC literature should not be expected to predict the observed economy perfectly. Placing too high of a standard on these models could lead to the rejection of models that could nevertheless lead to increased understanding of the observations at hand. Their view is that while the models are not good at prediction, the models could be useful in helping to understand observations from an economy. If this is the case, then it would be important to be able to distinguish which model of the type used in the RBC literature is the “best”. Therefore, a method that is able to distinguish among the class of models that is used in the RBC literature is needed. One of the criticisms of the literature that was made by Stadler (1994) was that there were no methods that directly compared models in the RBC literature. The current methods of model evaluation involve the comparison of models to the observed data. Then there is an informal comparison across models. 
Once a model has been proposed and shown to “fit” the observations sufficiently, it is common to use the model to analyze the effects of changing structural parts of the model. Again, any answer to this kind of experiment can only be taken seriously if the model that is used to conduct the experiment can be shown to improve upon competing models. Note that a competing model does not necessarily have to be of a similar type. In fact, it is useful to know how a model performs against different models that are not of the same type. Inferences from a set of models can only be put into context if there is a comparison of those models with other models that are found in the wider literature. Therefore it is important for any method of comparison to be able to formally compare models inside the RBC literature directly with models outside the RBC literature. In this paper, a method will be developed that will allow for the direct comparison of models both inside and outside the RBC literature. In relation to the method of comparison, the concept of “better”


will be defined, and it will be shown that the method can be extended to allow for the comparison of models across sub-samples of the data as well as across the whole sample. The ability to compare models across sub-samples could allow for a better understanding of what the difference is between two models that are being compared.

Another problem inherent in this approach to model selection is the use of prior information. Prior information is used to determine what characteristics are used to evaluate a model. In the worst case, prior beliefs about a model could lead to biased model selection. It would therefore be desirable for a method to allow prior beliefs about a model's validity to be formally included in the process. With this in mind, the method of comparison will be Bayesian in nature. One benefit of using a Bayesian approach is that model uncertainty is incorporated into the comparison in the same way as any other uncertainty in the model. The Bayesian approach also easily allows for prior uncertainty over the structural parameters of the model. The common practice in the RBC literature is to fix the values of the structural parameters of a model to predetermined values. In the literature this practice is called calibration. The method developed in this paper incorporates the practice of calibration as a special case of a more general method. The more general method has the values of the structural parameters being defined with some uncertainty. One criticism (Hansen and Heckman 1996) of the calibration approach is that the studies that are used to help calibrate the values of the structural parameters are ill equipped for the job. This suggests that there is a need to allow for prior uncertainty in the calibration of the models.

Another criticism made by Hansen and Heckman (1996) of how models are compared was that the methods are not likelihood based. The Bayesian method used in this paper is a likelihood-based approach. One feature of this approach is that it satisfies the Likelihood Principle. The Likelihood Principle states that all relevant experimental information is contained in the likelihood function. Therefore, all inferences are based only on the sample that is observed. Berger and Wolpert (1988) contains an excellent discussion of the Likelihood Principle. However, in order to use a likelihood-based approach, a likelihood function for models that are typically found in the RBC literature needs to be calculated. The first part of this paper deals with this problem.

For models in the RBC literature especially, and more generally for dynamic non-linear macroeconomic models, the calculation of a likelihood function is a difficult problem. This is because the models that are used do not have tractable likelihood functions. The models generally do not have as many stochastic components as they have variables, which implies that there is a dimensionality problem when defining the likelihood function. A method of directly constructing and calculating a likelihood function for models found in the RBC literature is defined. This method involves conditioning on the initial conditions of a model. By doing so, a likelihood function can be computed directly. However, given the dimensionality problem, the method that is defined uses only a subset of the variables of a model to calculate the likelihood function for a model.
The layout of this paper is as follows: Section 2 will review the methods currently used to compare models found in the RBC literature, while Section 3 will outline a formal method to compare models in the RBC literature. Section 3 will also outline a method to calculate the likelihood function for a typical model found in the RBC literature. In Section 4 there will be an application of the method. The application will entail the formal comparison of the models found in Farmer and Guo (1994), both for the case of no prior uncertainty and for the case of prior uncertainty over the structural parameters of the models. Finally, Section 5 will conclude.

2 Real Business Cycle Models

Surveys of the RBC literature can be found in Stadler (1994), Danthine and Donaldson (1993), and in Kydland and Prescott (1996). In their paper, Kydland and Prescott outline the steps used to pose an interesting question, construct a model and conduct an experiment. They explain the procedure in general and for the case of models in the RBC literature. The first part of their procedure is to pose a question. One example of a question they give is to ask what is the quantitative nature of fluctuations induced by technology shocks. This was the question that was posed in Kydland and Prescott (1982). Other questions posed in the literature have attempted to understand various observed features of the business cycle. For example, Lucas (1977) notes that, amongst observed cyclical fluctuations, investment is more volatile than output, consumption is less volatile than output, and the capital stock is very much less volatile than output (Stadler 1994). Danthine and Donaldson (1993) note that the velocity of money is counter-cyclical and that the correlation between money aggregates and output varies substantially.

Once an interesting question has been posed, Kydland and Prescott (1996) suggest building a model economy based on well-tested theory. Once a model economy is constructed they suggest that the structural parameters of the model should be fixed to specific values. To do that, they suggest assigning values to the structural parameters of the model so that artificial data generated from the model is similar to that observed. They call this procedure calibration. In essence, prior information is being incorporated into the model with no uncertainty. Once the structural parameters of the model are fixed, it is possible to simulate the model to get artificial observations on the variables that make up the model. Once these artificial observations have been obtained, it is possible to answer the question that was first posed. In order to evaluate how well the question has been answered, however, it is necessary to ask and answer the question of how "good" the model that is being used is. The term "good" could refer to a model's performance relative to a class of models that could have been used to answer the question posed, or it could refer to how well the model predicts the observed data. In the RBC literature, models are usually judged by how they replicate all, or more likely, some of the characteristics of the observed data.

There have been a number of criticisms leveled at the RBC literature (Stadler (1994), Kim and Pagan (1995), Danthine and Donaldson (1993), Hansen and Heckman (1996), and Sims (1996)). This paper, however, will concentrate on the criticisms regarding how models in the RBC literature are evaluated. The most common method is to evaluate RBC models using informal moment matching. Certain stylised facts are taken from the observed data and these facts are used to evaluate the performance of the model. In practice, second moments of the observed data and correlations between variables have been used as the stylised facts. An example of this approach can be found in Kydland and Prescott (1982) and in Hansen (1985). The first criticism leveled against this approach, by Kim and Pagan (1995), is that the range of stylised facts used in the evaluation may be too restrictive. An example that Kim and Pagan (1995) give is the case of models that aim to study asset returns.
They argue that the stylised facts that are used to determine the validity of the model should include facts about whether the asset return is integrated, whether the asset returns exhibit high leptokurtosis and whether the asset returns exhibit ARCH behavior. Kim and Pagan (1995) claim that very few models in the RBC literature that aim to study asset returns ever use these facts as determinants of a model’s performance. Another main criticism of the “stylised fact matching” method of model evaluation is that the distance between the model and the stylised fact is measured informally. This criticism is noted in Kim and Pagan (1995) and in Hansen and Heckman (1996).


There have been a number of attempts to formally measure the distance between a model and the data. Reviews can be found in Canova and Ortega (1995) and Kim and Pagan (1995). The basic approach in the literature is to choose a set of stylised facts and then to choose a metric. Examples of formal measures can be found in Watson (1993), Farmer and Guo (1994), Christiano and Eichenbaum (1992), and Diebold, Ohanian, and Berkowitz (1994). These papers all define formal metrics to measure the distance between the model and the data. The stylised facts used in these papers have varied as well. Impulse response functions have been used, as has the spectral density matrix.

There have been other criticisms aimed at the RBC literature that are important to this paper. These were noted in Hansen and Heckman (1996) and Sims (1996). In their paper, Hansen and Heckman argue that the studies that are used to calibrate models in the RBC literature may not be as appropriate as thought. Their argument is that the studies used in the calibration of a model are typically microeconomic studies, whereas the typical model in the RBC literature is a heavily aggregated dynamic general equilibrium model. Hansen and Heckman (1996) also argue that the parameters of a typical model may not be able to be calibrated with the high accuracy that is common in the RBC literature. This suggests that any method that aims to evaluate model performance in the RBC literature should be able to allow for uncertainty over the structural parameters of the model. The method proposed in this paper allows for there to be prior uncertainty over the structural parameters of a model.

Finally, Sims (1996) argues that model uncertainty should be formally incorporated into model evaluation. Sims (1996) notes that this problem, especially when the models are to be compared and evaluated using only one realization of a time series, only makes sense in a Bayesian context. Using a Bayesian approach to model comparison, all uncertainty over model specification is treated in the same way as all other uncertainty that is inherent in the problem and models being compared.

3 A method to formally compare models in the RBC literature

With the comments of Sims (1996) in mind, the approach to the problem of comparing two models will utilize the Bayesian model comparison methods outlined in Geweke (1998). In order to implement these methods, a likelihood function for models in the RBC literature needs to be constructed.

3.1 Constructing a likelihood function for models in the RBC literature

In order to be able to apply the Bayesian model comparison literature to the problem of directly comparing two or more models from the RBC literature, likelihood functions for the relevant models are needed. The likelihood function for all but the simplest of models found in the RBC literature is intractable. The problem is that for most models in the RBC literature, there is no analytical solution. Solutions to these models are usually approximated. Even with the approximated solutions there is still a problem with constructing a likelihood function for these models. This is because there are usually fewer stochastic elements of the model than there are variables. For example, the indivisible labor model of Hansen (1985) contains only one stochastic element. In that model, the only shock is a productivity shock that impacts upon the production function. There have been a number of attempts in the literature to construct an approximate likelihood function for models in the RBC literature. One such attempt is the method of Smith (1993), where the data that is generated from a model is represented as a VAR. The likelihood function of the VAR is constructed and this likelihood function is used to approximate the likelihood function of the model.


Because there are fewer stochastic variables than there are variables, and because it is usual for variables to be expressed as functions of the state vector of the model, it is not possible to construct a VAR that includes all of the variables of the model. Another problem with this method is that it is common to use data that has been transformed to deviations-from-trend form. The common practice is to use the Hodrick and Prescott (1997) filter to calculate the trend for observations on a variable, where the mean of the deviations from trend is equal to zero. In the case of a model with only one shock, the main determinant of the level of the likelihood will be the variance of the simulated data from the model. In comparing two models, the model whose simulated data has the closest variance to the observed data will have the higher likelihood. In essence, the models are being compared with respect to their second moments.

Anderson, Hansen, McGrattan, and Sargent (1996) describe another way to construct a likelihood function for a specific class of models of which RBC models are a subset. The approach of Anderson, Hansen, McGrattan, and Sargent (1996) is to construct a likelihood function for the model by calculating the "innovations" representation of the model using a Kalman filter. Once a likelihood function is calculated, Anderson, Hansen, McGrattan, and Sargent (1996) use the likelihood function to calculate maximum likelihood estimates of the structural parameters that make up the model. The innovations representation approach to approximating the likelihood function of the model is also used by DeJong, Ingram, and Whiteman (1997). However, in their paper, DeJong, Ingram, and Whiteman (1997) do not assume any measurement error in observing the data on the variables in the model. In this case, there are fewer stochastic components than variables, so that DeJong, Ingram, and Whiteman (1997) only use a subset of the data to construct the likelihood function. The number of variables that can be used, in this case, is constrained to be equal to the number of stochastic components in the model. To implement this method, the initial conditions for the Kalman filter along with the initial conditions of the model need to be assumed.

In contrast, the method that will be described in this paper treats the initial conditions of the model as unknown parameters of the likelihood function, thus allowing for a direct evaluation of the likelihood function. A state-space representation is used to construct the likelihood function. However, similar to the approach of Smith (1993) and DeJong, Ingram, and Whiteman (1997), only a subset of the variables that make up a model will be used to construct the likelihood function. However, unlike Smith (1993) and DeJong, Ingram, and Whiteman (1997), the likelihood function will be calculated directly. The approach will be to construct the values of the stochastic components of the model that are implied by the observed data. Once the implied shocks have been calculated, the assumed distribution of the shocks, which is part of the model description, is used to construct the likelihood function. This method relies on the ability to approximate the model in such a way that it is possible to invert the mapping between the model and the data.

Let the model be denoted by $M$. Let $s_t$ be an $n_s \times 1$ vector of state variables of the model at time period $t$, where $s_t = (a_t, b_t)'$. The vector $b_t$ represents the subset of the state vector that is stochastic and $a_t$ represents the non-stochastic component of the state vector. Let $\hat{M}$ represent the approximation to the model and let $x_t$ represent the vector of all variables that make up the model. Let $n_x$ be the order of the vector $x_t$ and let $n_a$ and $n_b$ be the orders of the vectors $a_t$ and $b_t$ respectively. The approximation to the model can be obtained in a number of ways. The essential feature, though, is to be able to write all of the variables that are present in the model as functions of the state variables of the model. Then the model can be represented by the following representation:

$$s_{t+1} = \hat{M}_s(s_t) + u_{t+1}, \quad t = 0, \ldots, T-1, \ \text{given } s_0,$$
$$x_t = \hat{M}_x(s_t). \tag{3.1}$$

In the above representation, $u_t$ represents the $n_s \times 1$ vector of innovations to the state vector, $\hat{M}_s$ is the function implied by the approximation to the model that relates the current value of the state vector to the value of the state vector last period, and $\hat{M}_x$ is the function implied by the approximation to the model that relates all of the variables in the model to the current value of the state vector.

As an example, consider the model of Hansen (1985). The basic model that was used in Hansen (1985) is a variant of the one-sector stochastic growth model. This model is a representative agent model with households and firms. Let consumption by the household be represented by $c_t$ and let the amount of labor supplied by the household in period $t$ be denoted by $h_t$. Let output be denoted by $y_t$ and let the stock of capital in period $t$ be represented by $k_t$. Let investment in capital be denoted by $i_t$ and assume that the production function is hit by productivity shocks that are denoted by $z_t$. The household values both consumption and leisure. The household has a time endowment of one so that the amount of leisure in period $t$ is denoted by $1 - h_t$. Therefore, the representative household solves the intertemporal maximization problem given by

$$\max \ E \sum_{t=0}^{\infty} \beta^t u(c_t, 1 - h_t) \tag{3.2}$$

subject to the following constraints. The first constraint is that all of the output of the economy is either consumed or invested. That is,

$$c_t + i_t \le y_t. \tag{3.3}$$

The other constraints are the non-negativity constraints that impose that consumption is non-negative, $c_t \ge 0$, that the capital stock is non-negative, $k_t \ge 0$, and that the amount of the labor endowment that is supplied is non-negative and less than or equal to one, $0 \le h_t \le 1$. Here the representative household's labor endowment is normalized to be one. Capital is assumed to depreciate at a rate of $\delta$ so that the law of motion for the capital stock is

$$k_{t+1} = (1 - \delta)k_t + i_t, \quad 0 \le \delta \le 1. \tag{3.4}$$

The representative firm is assumed to have a constant returns to scale production technology that is hit by productivity shocks. The production function is

$$y_t = z_t k_t^{\alpha} h_t^{1-\alpha}, \quad 0 \le \alpha \le 1, \tag{3.5}$$

where $z_t$ is the productivity shock that is constrained to be positive. The productivity shock is assumed to follow a first order Markov process

$$\log z_t = \gamma \log z_{t-1} + \epsilon_t, \tag{3.6}$$

where $\epsilon_t$ is assumed to be identically and independently distributed with mean zero and variance $\sigma_\epsilon^2$.

There are a number of ways this model could be approximated. As there are no externalities or distortionary taxes in the model, the solution to the problem can be obtained from the following social planner's problem:

$$\max \ E \sum_{t=0}^{\infty} \beta^t \left[ u(c_t, 1 - h_t) \right], \tag{3.7}$$

subject to
$$c_t + i_t \le z_t k_t^{\alpha} h_t^{1-\alpha}, \quad t = 0, 1, 2, \ldots$$
$$c_t \ge 0, \ k_t \ge 0, \ 0 \le h_t \le 1, \quad t = 0, 1, 2, \ldots$$
$$k_{t+1} = (1 - \delta)k_t + i_t, \quad t = 0, 1, 2, \ldots$$
$$\log z_t = \gamma \log z_{t-1} + \epsilon_t, \ \text{where } \epsilon_t \sim N(0, \sigma_\epsilon^2), \quad t = 1, 2, 3, \ldots$$
$$k_0 > 0 \ \text{given}; \quad z_0 \ \text{given}.$$

The solutions to the above model cannot be calculated analytically. There are many ways to approximate the model in order to obtain solutions. In Hansen (1985), the method described in Hansen and Prescott (1995) is used. Once the solution to the approximated model has been calculated, it is possible to calculate decision rules for hours supplied, $h_t$, and investment, $i_t$, as functions of the state variables of the model. The state variables of Hansen's model are capital, $k_t$, the log of the productivity shock, $\log(z_t)$, and the constant 1. The decision rules for $h_t$ and $i_t$ are of the form:

$$h_t = h_1 + h_k k_t + h_z \log(z_t) \tag{3.8}$$

and

$$i_t = i_1 + i_k k_t + i_z \log(z_t), \tag{3.9}$$

where the coefficients in (3.8) and (3.9) are all non-linear functions of the structural parameters of the model. Given (3.8) and (3.9) it is possible to write all other variables of the model as functions of the state vector, $s_t = (1, k_t, \log(z_t))'$. For example, for this model, output can be written as

$$y_t = z_t k_t^{\alpha} \left( h_1 + h_k k_t + h_z \log(z_t) \right)^{1-\alpha}. \tag{3.10}$$

Once all the variables of the model can be represented as functions of the state variables, as in (3.8), (3.9) and (3.10), it is possible to partially define the function $\hat{M}_x$ that is defined in (3.1). For example, if $x_t = (h_t, i_t, y_t)'$ then the function $\hat{M}_x(s_t)$ would be given by

$$\hat{M}_x(s_t) = \begin{pmatrix} h_1 + h_k k_t + h_z \log(z_t) \\ i_1 + i_k k_t + i_z \log(z_t) \\ z_t k_t^{\alpha} \left( h_1 + h_k k_t + h_z \log(z_t) \right)^{1-\alpha} \end{pmatrix}. \tag{3.11}$$

For this model, the non-stochastic state variables are $a_t = (1, k_t)'$ while the stochastic state variable is $b_t = \log(z_t)$. Then it follows from (3.2) to (3.6) that the function $\hat{M}_s$ that is defined in (3.1) is given by

$$\hat{M}_s(s_t) = \begin{pmatrix} 1 & 0 & 0 \\ i_1 & (1 - \delta) + i_k & i_z \\ 0 & 0 & \gamma \end{pmatrix} s_t \tag{3.12}$$

and the innovation to the state vector, $u_t$, is

$$u_t = (0, 0, \epsilon_t)'. \tag{3.13}$$
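As an illustration of the mapping in (3.11)-(3.13), the following sketch codes the functions $\hat{M}_s$ and $\hat{M}_x$ for Hansen's model and takes one step of the representation (3.1). The decision-rule coefficients and parameter values are hypothetical placeholders; in practice they are obtained by solving the approximated model.

```python
import numpy as np

# Hypothetical decision-rule coefficients for (3.8)-(3.9) and placeholder parameters;
# in the paper these are non-linear functions of the calibrated structural parameters.
h1, hk, hz = 0.30, -0.005, 0.60
i1, ik, iz = 0.05, -0.002, 0.30
alpha, delta, gamma = 0.36, 0.025, 0.95

def M_x(s):
    """Map the state s = (1, k, log z) to the observables x = (h, i, y), as in (3.11)."""
    _, k, logz = s
    h = h1 + hk * k + hz * logz
    i = i1 + ik * k + iz * logz
    y = np.exp(logz) * k ** alpha * h ** (1.0 - alpha)
    return np.array([h, i, y])

def M_s(s):
    """Deterministic part of the state transition: the matrix in (3.12) applied to s."""
    A = np.array([[1.0, 0.0, 0.0],
                  [i1, (1.0 - delta) + ik, iz],
                  [0.0, 0.0, gamma]])
    return A @ s

# One step of (3.1): s_{t+1} = M_s(s_t) + u_{t+1}, with u_t = (0, 0, eps_t)' as in (3.13).
s_t = np.array([1.0, 12.0, 0.0])        # (1, k_0, log z_0), placeholder initial state
eps = 0.007 * np.random.default_rng(0).standard_normal()
s_next = M_s(s_t) + np.array([0.0, 0.0, eps])
x_t = M_x(s_t)
```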

Let $Y^T = \{y_t\}_{t=0}^T$ be the $n_x \times T$ vector of observations on the variables $x_t$ for the model. To construct a likelihood for the model represented by (3.1), the values of the stochastic variables, $b_t$, implied by $Y^T$ need to be calculated. To calculate the values of $b_t$, only $n_b$ variables of the model can be used, as there are only $n_b$ stochastic variables in the model. Let $Y_b^T = \{y_{bt}\}_{t=0}^T$ be the set of observations that will be used to calculate the values of the stochastic elements of the model. Here, $y_{bt}$ is an $n_b \times 1$ vector containing a subset of variables from $y_t$. The calculation of the stochastic elements is a recursive one. Consider first period zero. Suppose that the value of $a_0$ is known. To calculate the value of $b_0$ implied by $y_{b0}$, the appropriate relations that make up the function $\hat{M}_x(s_t)$ in (3.1) are used. Once $b_0$ is known, the first part of (3.1) is used to calculate the value of $a_1$. Then, given the value of $a_1$ and $y_{b1}$, $\hat{M}_x(s_t)$ is used to calculate the value of $b_1$. This process is continued until the end of the sample.

For example, suppose that one wished to calculate the values of the stochastic components of Hansen's model using observations on total hours supplied, $\{h_t\}_{t=0}^T$. In this example, $n_b$ is equal to one and $n_a$ is equal to two. Given a value for the initial capital stock, $k_0$, and the observed value of $h_0$, use (3.8) to calculate $b_0 = z_0$. That is,

$$\widetilde{\log(z)}_0 = \frac{h_0 - h_1 - h_k k_0}{h_z}. \tag{3.14}$$

Then, for observations $t = 1, \ldots, T$,

$$k_t = (1 - \delta)k_{t-1} + i_{t-1}, \qquad \widetilde{\log(z)}_t = \frac{h_t - h_1 - h_k k_t}{h_z}, \tag{3.15}$$

where $i_t$ is calculated using (3.9). Note that $\tilde{z}_t$ represents the calculated value of $z_t$ given the observations $\{h_t\}_{t=0}^T$.

Once the stochastic elements of the model, $B^T = \{b_t\}_{t=0}^T$, have been calculated it is possible to construct a likelihood function for the model. The object is to use the given distribution of the innovations to $b_t$ to construct the likelihood function. The innovation to the state vector, $u_t$, can be decomposed into two components: the innovation to $a_t$, $u_{at}$, and the innovation to $b_t$, $u_{bt}$. In this example, $u_{at} = 0$ for all values of $t$. Let $g_u(\cdot)$ be the density function of the innovation $u_{bt}$ and let $g_{z_0}(\cdot)$ be the density function of the unconditional distribution of $z_0$. Using the first equation of (3.1), it is possible to calculate the values of the innovations to the stochastic elements that are implied by the data $Y_b^T$. Let the matrix $P_b$ be a matrix that picks the last $n_b$ elements of a vector. Then, for $t = 1, 2, \ldots, T$,

$$\tilde{u}_{bt} = P_b \left[ \tilde{s}_t - \hat{M}_s(\tilde{s}_{t-1}) \right]. \tag{3.16}$$

Let $P_y$ be a matrix that picks the $n_b$ elements from the vector $y_t$ that will be used to calculate the likelihood function. In general there are many combinations of $y_t$ that can be chosen. It follows from the first equation of the representation given in (3.1) that the current value of the state vector, $s_t$, is a function of $a_0$, $b_0$ and $U^t = \{u_v\}_{v=1}^t$. Then,

$$y_{bt} = P_y \hat{M}_x\left( s_t(U^t, b_0, a_0) \right) \tag{3.17}$$

so that the likelihood function for the model is

$$f(Y_b^T \mid a_0) = f(y_{b0} \mid a_0) \prod_{t=1}^{T} f(y_{bt} \mid Y_b^{t-1}, a_0)
= g_{b_0}(\tilde{b}_0) \prod_{t=1}^{T} g_u(\tilde{u}_t) \left| \frac{\partial(y_{b0}, \ldots, y_{bT})}{\partial(b_0, u_{b1}, \ldots, u_{bT})'} \right|^{-1} \tag{3.18}$$

where

$$\frac{\partial(y_{b0}, \ldots, y_{bT})}{\partial(b_0, u_{b1}, \ldots, u_{bT})'} =
\begin{pmatrix}
\frac{\partial P_y \hat{M}(s_0)}{\partial b_0} & 0 & 0 & \cdots & 0 \\
\frac{\partial P_y \hat{M}(s_1)}{\partial b_0} & \frac{\partial P_y \hat{M}(s_1)}{\partial u_{b1}} & 0 & \cdots & 0 \\
\frac{\partial P_y \hat{M}(s_2)}{\partial b_0} & \frac{\partial P_y \hat{M}(s_2)}{\partial u_{b1}} & \frac{\partial P_y \hat{M}(s_2)}{\partial u_{b2}} & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
\frac{\partial P_y \hat{M}(s_T)}{\partial b_0} & \frac{\partial P_y \hat{M}(s_T)}{\partial u_{b1}} & \frac{\partial P_y \hat{M}(s_T)}{\partial u_{b2}} & \cdots & \frac{\partial P_y \hat{M}(s_T)}{\partial u_{bT}}
\end{pmatrix}. \tag{3.19}$$
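Since $y_{bt}$ depends only on $b_0, u_{b1}, \ldots, u_{bt}$, the matrix in (3.19) is block lower triangular with square $n_b \times n_b$ diagonal blocks, so its determinant reduces to a product of the determinants of the diagonal blocks. This is the step that delivers the simple Jacobian term in (3.22) for Hansen's model below:

$$\left| \frac{\partial(y_{b0}, \ldots, y_{bT})}{\partial(b_0, u_{b1}, \ldots, u_{bT})'} \right|
= \left| \frac{\partial P_y \hat{M}(s_0)}{\partial b_0} \right| \prod_{t=1}^{T} \left| \frac{\partial P_y \hat{M}(s_t)}{\partial u_{bt}} \right|.$$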

Note that the calculated values of $z_t$ are functions of the observations $\{h_t\}_{t=0}^T$ and the structural parameters of the model, $\theta$. As an example, consider the model defined above in (3.3) to (3.6). For this model, $a_t = (1, k_t)'$, $b_t = \log(z_t)$, and $n_b$ is equal to one. The distribution for $u_{bt} = \epsilon_t$ is reported to be Normal with zero mean and variance $\sigma_\epsilon^2$. Hence, for Hansen's model,

$$g_u(\epsilon_t) = (2\pi\sigma_\epsilon^2)^{-1/2} \exp\left\{ -\frac{1}{2\sigma_\epsilon^2} \epsilon_t^2 \right\} \tag{3.20}$$

and

$$g_{z_0}(z_0) = \left( \frac{2\pi\sigma_\epsilon^2}{1 - \gamma^2} \right)^{-1/2} \exp\left\{ -\frac{1 - \gamma^2}{2\sigma_\epsilon^2} z_0^2 \right\}. \tag{3.21}$$
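The variance appearing in (3.21) is the stationary variance of the AR(1) process (3.6); a one-line check, assuming $|\gamma| < 1$:

$$\operatorname{Var}(\log z_t) = \gamma^2 \operatorname{Var}(\log z_{t-1}) + \sigma_\epsilon^2
\;\Longrightarrow\; \operatorname{Var}(\log z) = \frac{\sigma_\epsilon^2}{1 - \gamma^2},$$

so the unconditional distribution used for the initial (log) shock is $N\!\left(0, \sigma_\epsilon^2 / (1 - \gamma^2)\right)$.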

In order to construct the likelihood function for Hansen's model, the determinant of the Jacobian of the transformation, which is given in (3.19), needs to be determined. It follows from (3.8) and (3.7) that

$$\frac{\partial h_t}{\partial \epsilon_t} = h_z$$

so that

$$\left| \frac{\partial(h_0, \ldots, h_T)}{\partial(z_0, \epsilon_1, \ldots, \epsilon_T)'} \right|^{-1} = \left| h_z^{-(T+1)} \right|. \tag{3.22}$$

Therefore, the likelihood function of the model described in (3.7) is

$$f(Y^T \mid k_0) = \left( \frac{2\pi\sigma_\epsilon^2}{1 - \gamma^2} \right)^{-1/2} (2\pi\sigma_\epsilon^2)^{-T/2}
\exp\left\{ -\frac{1 - \gamma^2}{2\sigma_\epsilon^2} \hat{z}_0^2 - \frac{1}{2\sigma_\epsilon^2} \sum_{t=1}^{T} \hat{\epsilon}_t^2 \right\} |h_z|^{-(T+1)} \tag{3.23}$$

where $\hat{z}_0$ and $\{\hat{\epsilon}_t\}_{t=1}^T$ are found using (3.14) and (3.15).

The method described above depends on being able to calculate the values of $\{b_t\}_{t=0}^T$ that are implied by the observed data and the solution to the model. If this cannot be done then the likelihood function cannot be constructed. Situations where this may arise would include models that have multiplicative shocks. In this case, the product of the shocks could be calculated but the individual values of the shocks could not be recovered. Another possible case would be one in which, at the beginning of each period, simulations from the model were needed in order to obtain values of variables in the model. One example of this would be a model where the stochastic components were governed by a Markov chain with transition probabilities that depend on a function of some of the variables of the model. In some cases the observed data could be used to calculate the probabilities, but there might be a case where a variable is unobservable. There are potential solutions to these problems, but they may make the method infeasible in that the solutions may increase the computing time needed to calculate the likelihood function. For example, it may be possible to simulate data to calculate transition probabilities. This, however, could increase the time it would take to calculate the likelihood function to such an extent that the Markov chain Monte Carlo methods become infeasible.

If it is possible to construct a likelihood function for a model in the RBC literature, then it is possible to use likelihood-based techniques to formally compare two or more models. The next section introduces the Bayesian model comparison techniques that will be used to compare models.
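Before turning to those techniques, a minimal sketch of the construction for Hansen's model is given below. It evaluates the log of (3.23) from a series of observed hours, using the recursion (3.14)-(3.15). It is only illustrative: the decision-rule coefficients, parameter values and data are hypothetical placeholders, whereas in the paper they would come from solving and calibrating the model.

```python
import numpy as np

def hansen_log_likelihood(hours, k0, coeffs, delta, gamma, sigma_eps):
    """Log of the likelihood (3.23), built from observed hours via (3.14)-(3.15).

    `coeffs` holds the decision-rule coefficients (h1, hk, hz, i1, ik, iz); in the
    paper these are non-linear functions of the structural parameters."""
    h1, hk, hz, i1, ik, iz = coeffs
    T = len(hours) - 1                        # observations are t = 0, ..., T

    # Recursively invert the model: back out log z_t, the capital path and the
    # implied innovations.
    k = k0
    logz = (hours[0] - h1 - hk * k) / hz      # equation (3.14)
    logz0 = logz
    eps = []
    for t in range(1, T + 1):
        invest = i1 + ik * k + iz * logz      # decision rule (3.9)
        k = (1.0 - delta) * k + invest        # capital update in (3.15)
        logz_new = (hours[t] - h1 - hk * k) / hz
        eps.append(logz_new - gamma * logz)   # implied innovation from (3.6)
        logz = logz_new
    eps = np.asarray(eps)

    # Log of (3.23): stationary density of the initial shock, innovation densities,
    # and the Jacobian term (3.22).
    ll = 0.5 * np.log((1.0 - gamma ** 2) / (2.0 * np.pi * sigma_eps ** 2))
    ll += -0.5 * (1.0 - gamma ** 2) * logz0 ** 2 / sigma_eps ** 2
    ll += -0.5 * T * np.log(2.0 * np.pi * sigma_eps ** 2)
    ll += -0.5 * np.sum(eps ** 2) / sigma_eps ** 2
    ll += -(T + 1) * np.log(abs(hz))
    return ll

# Purely illustrative call with placeholder numbers:
rng = np.random.default_rng(0)
hours = 0.30 + 0.01 * rng.standard_normal(120)
coeffs = (0.30, -0.005, 0.60, 0.05, -0.002, 0.30)
print(hansen_log_likelihood(hours, k0=12.0, coeffs=coeffs,
                            delta=0.025, gamma=0.95, sigma_eps=0.007))
```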

3.2 Bayesian Model Comparison

Once a likelihood function has been constructed for a set of models, likelihood-based methods of model comparison and evaluation are available. In particular, Bayesian model comparison is available. In using Bayesian methods to compare and evaluate the performance of a set of models, the uncertainty over which model is better is treated in the same way as all other uncertainty in the model. This is noted by Sims (1996). Another benefit of using Bayesian methods to compare models is that the results are conditional on the observed data only and no assumptions with regard to sample size are needed. One of the criticisms noted in Hansen and Heckman (1996) was that models in the RBC literature might not be able to be calibrated with as much accuracy as previously thought. The Bayesian approach also handles this criticism by allowing for uncertainty over the structural parameters of the models.

The problem is to compare a finite set of models indexed by $I$ given the observations $\{y_t\}_{t=0}^T$. Let $\theta_k$ be the structural parameters of model $k \in I$. For each model $k \in I$ let $p(y_t \mid Y^{t-1}, \theta_k)$ be the density of $y_t$ conditional on $Y^{t-1}$ under model $k$. Note that $Y^s = \{y_t\}_{t=0}^s$ for any $s \ge 0$. Then the likelihood function for model $k$ is

$$p(Y^T \mid \theta_k) = \prod_{t=0}^{T} p(y_t \mid Y^{t-1}, \theta_k). \tag{3.24}$$

Let the prior for $\theta_k$, defined on model $k \in I$, be $p(\theta_k)$. Then the posterior distribution for $\theta_k$ given $Y^T$ and model $k \in I$ is

$$p(\theta_k \mid Y^T) \propto p(\theta_k) \, p(Y^T \mid \theta_k) \tag{3.25}$$

and the marginal likelihood of model $k$ is defined to be

$$M_k = \int_{\Theta_k} p(\theta_k) \, p(Y^T \mid \theta_k) \, d\theta_k. \tag{3.26}$$

The Bayes factor between two models is then defined to be

$$B_{ij} = \frac{M_i}{M_j} \tag{3.27}$$

and it is this object that is used to compare models. Model uncertainty can be incorporated into the comparison through the posterior odds ratio. The posterior odds ratio is defined to be

$$POR_{ij} = \frac{p_i}{p_j} B_{ij} \tag{3.28}$$

where $p_i$ is the prior probability that is assigned to model $i \in I$. In order to rule out scaling effects it is important to note that the prior and data densities that are used to calculate the posterior distribution are normalized so that they integrate to one. Beliefs as to which model best fits the data can be incorporated into the comparison using (3.28). If one does not have any strong feelings towards any single model or group of models, then the prior weights given to all models will be the same. In this case the posterior odds ratio is just the Bayes factor. One way to evaluate a model is to ask how much prior probability must be given to a model in order for it to have a higher posterior odds ratio. The more unrealistic the prior probability that must be assigned to a model, the more unrealistic the model.

As mentioned in the previous section, the Bayesian approach to model comparison and evaluation treats model uncertainty in the same way as uncertainty in any other part of a model. The Bayes factor and posterior odds ratio are used to compare and evaluate the models at hand. To gain insight into how we can use these concepts to compare and evaluate models we first look at the closely related concept of a predictive density. Suppose that we have data $\{y_t\}_{t=0}^T$ and we wish to predict the values of $y_{T+1}, \ldots, y_{T+m}$. The predictive density of $y_{T+1}, \ldots, y_{T+m}$ conditional on model $k$ and data $Y^T$ is

$$p_k(y_{T+1}, \ldots, y_{T+m}) = \int_{\Theta_k} p_k(\theta_k \mid Y^T) \prod_{t=T+1}^{T+m} p(y_t \mid Y^{t-1}, \theta_k) \, d\theta_k. \tag{3.29}$$

The predictive density applies prior to observing the data and, as usual, we can define the analogous predictive likelihood function,

$$\hat{p}_{kT}^{T+m} = \int_{\Theta_k} p_k(\theta_k \mid Y^T) \prod_{t=T+1}^{T+m} p(y_t \mid Y^{t-1}, \theta_k) \, d\theta_k. \tag{3.30}$$

Note that $\hat{p}_{k0}^T = M_k^T$. It can be shown that

$$\hat{p}_{ku}^v = \int_{\Theta_k} p_k(\theta_k \mid Y^u) \prod_{t=u+1}^{v} p(y_t \mid Y^{t-1}, \theta_k) \, d\theta_k
= \int_{\Theta_k} \frac{p_k(\theta_k) \prod_{t=0}^{u} p(y_t \mid Y^{t-1}, \theta_k)}{\int_{\Theta_k} p_k(\theta_k) \prod_{t=0}^{u} p(y_t \mid Y^{t-1}, \theta_k) \, d\theta_k} \prod_{t=u+1}^{v} p(y_t \mid Y^{t-1}, \theta_k) \, d\theta_k$$
$$= \frac{\int_{\Theta_k} p_k(\theta_k) \prod_{t=0}^{v} p(y_t \mid Y^{t-1}, \theta_k) \, d\theta_k}{\int_{\Theta_k} p_k(\theta_k) \prod_{t=0}^{u} p(y_t \mid Y^{t-1}, \theta_k) \, d\theta_k}
= \frac{M_k^v}{M_k^u}. \tag{3.31}$$

Thus the predictive likelihood function for observations $u+1$ through $v$ is just the ratio of the marginal likelihoods for the samples of observations 0 through $v$ and 0 through $u$ respectively. The predictive likelihood can then be decomposed using (3.31). Consider any sequence of numbers such that $0 \le u = s_0 < s_1 < \ldots < s_q = v$. Then it follows from (3.31) that

$$\hat{p}_{ku}^v = \frac{M_k^{s_1}}{M_k^{s_0}} \cdots \frac{M_k^{s_q}}{M_k^{s_{q-1}}} = \prod_{\tau=1}^{q} \hat{p}_{k s_{\tau-1}}^{s_\tau}. \tag{3.32}$$

Note that for $u = 0$ and $v = T$, $\hat{p}_{ku}^v = \hat{p}_{k0}^T = M_k^T$. As the marginal likelihood can be decomposed into the product of ratios of predictive likelihoods, we can see that the marginal likelihood represents the out-of-sample prediction performance of the model. Thus, by using the Bayes factor, which is just the ratio of the respective marginal likelihoods for each model, we are comparing models on their ability to predict out of sample.

Another application for the decomposition given in (3.32) is for direct model diagnostics. Consider the complete decomposition in which $u = 0$ and $v = T$ and where $s_i - s_{i-1} = 1$. A relatively low value of $\hat{p}_{k s_{i-1}}^{s_i}$ would indicate that the observation indexed by $s_i$ was surprising given observation $s_{i-1}$ and model $k$. Thus, one could use this decomposition to evaluate the performance of models in regard to predicting large movements in the data. For example, a large movement may be surprising to all models but some may do better than others.

The decomposition is also useful in getting some insight into the Bayes factor. Using the decomposition of the marginal likelihood in (3.32), one can do the same for the Bayes factor. That is,

$$\hat{B}_{ij,u}^v = \frac{\hat{p}_{iu}^v}{\hat{p}_{ju}^v} = \prod_{\tau=1}^{q} \left( \frac{\hat{p}_{i s_{\tau-1}}^{s_\tau}}{\hat{p}_{j s_{\tau-1}}^{s_\tau}} \right) = \prod_{\tau=1}^{q} \hat{B}_{ij, s_{\tau-1}}^{s_\tau}. \tag{3.33}$$
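The following sketch illustrates how such a decomposition can be used in practice: given one-step-ahead predictive likelihoods for two models, it accumulates the per-observation log Bayes factors and flags the observations with the largest contributions. The inputs here are made-up placeholder numbers; in an application they would be the predictive likelihoods computed from each model's posterior.

```python
import numpy as np

def cumulative_log_bayes_factor(pred_like_i, pred_like_j):
    """Cumulative log Bayes factor in favor of model i over model j, built from
    one-step-ahead predictive likelihoods as in the decomposition (3.33).

    pred_like_i[t] is model i's predictive likelihood for observation t given
    observations 0, ..., t-1 (and similarly for model j)."""
    contrib = np.log(pred_like_i) - np.log(pred_like_j)  # per-observation log Bayes factors
    return np.cumsum(contrib), contrib

# Placeholder predictive likelihoods for two hypothetical models:
rng = np.random.default_rng(1)
p_i = np.exp(-0.5 * rng.standard_normal(100) ** 2) / np.sqrt(2 * np.pi)
p_j = np.exp(-0.5 * (1.2 * rng.standard_normal(100)) ** 2) / np.sqrt(2 * np.pi * 1.2 ** 2)

cum_lbf, contrib = cumulative_log_bayes_factor(p_i, p_j)
influential = np.argsort(np.abs(contrib))[-5:]   # observations with the biggest influence
print(cum_lbf[-1], sorted(influential.tolist()))
```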

Observations or periods of observations that make large contributions to the overall Bayes factor in favor of model $i$ over model $j$ can be identified using (3.33). Unusually low values of the predictive density for an observation or for a group of observations would indicate that the model did not do a good job of predicting that particular observation or group of observations. By breaking the Bayes factor up into a product of predictive Bayes factors, as in (3.33), we are able to see which observations or groups of observations have the biggest contribution to the overall Bayes factor. There may be observations that are surprising to both models, but the decomposition allows us to see which model handles the surprising event best. The question of how models handle a rare but potentially important event could be extremely useful in the evaluation of that model. An example of a surprising observation might be an observation that was more than three sample standard deviations different from the preceding observation.

A plot of the cumulative log Bayes factor in favor of one model versus another is a way of seeing which observations had the most influence. Jumps in the cumulative log Bayes factor plot would indicate that a large addition to the overall log Bayes factor occurred at that particular observation. This would mean that one of the models out-performed the other significantly at that observation. On the other hand, there may be no large jumps in the cumulative log Bayes factor. This would mean that either one of the models out-performs the other consistently or that both models perform the same in the given sample.

In order to use the Bayes factor to compare models, the marginal likelihood defined in (3.26) needs to be calculated for each model. Geweke (1998) discusses various ways to calculate the marginal likelihood. In most cases, the marginal likelihood cannot be calculated analytically. In order to calculate the marginal likelihood, it is essential to be able to make drawings from the posterior distribution of the model. There are many ways to do this, and the appropriate method depends on the problem. If the posterior distribution is known and is easily drawn from, then it is possible to make independent draws from the posterior. If independent draws are not possible then various Markov chain Monte Carlo (MCMC) methods are available. These are described in detail in Geweke (1998). One such MCMC method is the Metropolis-Hastings algorithm. The Metropolis-Hastings algorithm is described in detail in Chib and Greenberg (1995) and Tierney (1994).

The Metropolis-Hastings algorithm is defined by a transition probability density function $q(x, y)$. Given a value of $x$, $q(x, y)$ generates a value $y$ from the target set of possible values. At the $m$th step, given the value $\theta^{(m)}$, the algorithm generates a potential value for $\theta^{(m+1)}$ from $q(x, y)$, and accepts this value for $\theta^{(m+1)}$ with probability

$$\alpha(\theta^{(m)}, \theta) = \min\left\{ \frac{p(\theta \mid Y^T) \, q(\theta, \theta^{(m)})}{p(\theta^{(m)} \mid Y^T) \, q(\theta^{(m)}, \theta)}, \ 1 \right\}. \tag{3.34}$$

If the candidate value is not accepted the algorithm sets $\theta^{(m+1)} = \theta^{(m)}$. Chib and Greenberg (1995) show that the above Markov chain has the posterior distribution $p(\theta \mid Y^T)$ as its invariant distribution. In order to use this algorithm, a distribution for $q(x, \cdot)$ must be defined. One such choice for $q(x, \cdot)$ is to choose a density $f$ that is defined on the support of $p(\theta \mid Y^T)$ and set $q(x, y) = f(y - x)$. Practically, defining $q$ this way means that the candidate $y$ is determined by drawing $z$ from $f$ and adding it to $x$. That is, $y = x + z$. The choice of $f$ is dependent on the problem. Tierney (1994) suggests a multivariate normal, a multivariate $t$, or a uniform distribution defined on a disc as potential choices for $f$. If the choice of $f$ is symmetric around the origin then the probability of acceptance collapses to

$$\alpha(\theta^{(m)}, \theta) = \min\left\{ \frac{p(\theta \mid Y^T)}{p(\theta^{(m)} \mid Y^T)}, \ 1 \right\}. \tag{3.35}$$

Hence the Random Walk Metropolis-Hastings algorithm is, given an initial value $\theta^{(0)}$:

• for $m = 1, \ldots, M$, generate $z$ from $f$
• form $\theta = \theta^{(m)} + z$
• let $\theta^{(m+1)} = \theta$ with probability $\alpha(\theta^{(m)}, \theta)$, and $\theta^{(m+1)} = \theta^{(m)}$ otherwise
• return $\{\theta^{(0)}, \ldots, \theta^{(M)}\}$.
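A minimal sketch of these steps is given below, assuming a Gaussian increment density $f$ and a user-supplied function returning the log posterior kernel; both of these are implementation choices not specified further in the text.

```python
import numpy as np

def random_walk_mh(log_posterior, theta0, step_cov, n_draws, rng=None):
    """Random walk Metropolis-Hastings sampler following the steps listed above.

    log_posterior(theta) returns log p(theta | Y^T) up to an additive constant;
    step_cov is the covariance of the Gaussian increment density f."""
    rng = np.random.default_rng() if rng is None else rng
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    chol = np.linalg.cholesky(np.atleast_2d(step_cov))
    current_lp = log_posterior(theta)
    draws = [theta.copy()]
    for _ in range(n_draws):
        candidate = theta + chol @ rng.standard_normal(theta.size)  # theta = theta^(m) + z
        cand_lp = log_posterior(candidate)
        # Accept with probability min{p(candidate | Y^T) / p(theta^(m) | Y^T), 1}, as in (3.35)
        if np.log(rng.uniform()) < cand_lp - current_lp:
            theta, current_lp = candidate, cand_lp
        draws.append(theta.copy())
    return np.array(draws)

# Toy usage on a standard normal "posterior" (purely illustrative):
draws = random_walk_mh(lambda th: -0.5 * float(th @ th), theta0=[0.0],
                       step_cov=[[1.0]], n_draws=2000)
```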

Tierney (1994) shows that if $f$ is chosen correctly the Metropolis-Hastings algorithm converges to its invariant distribution. If suitably defined, the invariant distribution of the Metropolis-Hastings algorithm is $p(\theta_k \mid Y^T)$, and so after a number of burn-in iterations the Metropolis-Hastings algorithm draws from $p(\theta_k \mid Y^T)$.

Once draws from the posterior distribution of a model are obtained it is possible to calculate the marginal likelihood of the model using a variant of the method of Gelfand and Dey (1994) suggested by Geweke (1998). The method is as follows. Suppose we wish to approximate

$$M_T = \int_{\Theta} p(\theta) \, p(Y^T \mid \theta) \, d\theta,$$

where $p(\cdot)$ is the prior for the model in question. Let $f(\cdot)$ be any p.d.f. that has its support contained in $\Theta$. Define the function $g(\theta)$ as

$$g(\theta) = \frac{f(\theta)}{p(\theta) \, p(Y^T \mid \theta)}.$$

Then the conditional expectation of $g(\cdot)$ under the posterior distribution is

$$E[g(\theta) \mid Y^T] = \int_{\Theta} g(\theta) \, p(\theta \mid Y^T) \, d\theta
= \int_{\Theta} \frac{f(\theta)}{p(\theta) \, p(Y^T \mid \theta)} \, p(\theta \mid Y^T) \, d\theta
= \int_{\Theta} \frac{f(\theta)}{p(\theta) \, p(Y^T \mid \theta)} \, \frac{p(\theta) \, p(Y^T \mid \theta)}{\int_{\Theta} p(\theta) \, p(Y^T \mid \theta) \, d\theta} \, d\theta
= \frac{\int_{\Theta} f(\theta) \, d\theta}{\int_{\Theta} p(\theta) \, p(Y^T \mid \theta) \, d\theta} = M_T^{-1}. \tag{3.36}$$

This conditional moment can be approximated by

$$E[g(\theta) \mid Y^T] = M^{-1} \sum_{m=1}^{M} g(\theta^{(m)}), \tag{3.37}$$

where $\{\theta^{(m)}\}_{m=1}^{M}$ is the set of draws from the posterior distribution from the Markov chain. A practical method for implementing this approach can be found in Geweke (1998).

Consider now the case of comparing a set of models from the RBC literature. In constructing the likelihood function for a typical model from this literature, a distinction was made between structural parameters of the model and parameters that were necessary to initialize the model. Let $\theta_{s,k}$ be the vector of parameters that make up the structural parameters of the model and let $\theta_{i,k}$ be the vector of parameters that are needed to initialize the model. Let $p(Y^T \mid \theta_{s,k}, \theta_{i,k})$ be the properly normalized data density for model $k \in I$ and let $p_k(\theta_{i,k}, \theta_{s,k})$ be the properly normalized prior density placed over the structural parameters and the initial parameters jointly. Then the posterior density for $\theta_k = (\theta_{i,k}, \theta_{s,k})'$ is

$$p(\theta_k \mid Y^T) \propto p_k(\theta_{i,k}, \theta_{s,k}) \, p(Y^T \mid \theta_{s,k}, \theta_{i,k}), \tag{3.38}$$

and given the posterior in (3.38) the marginal likelihood for model $k \in I$ is

$$M_k = \int_{\Theta_k} p_k(\theta_{i,k}, \theta_{s,k}) \, p(Y^T \mid \theta_{s,k}, \theta_{i,k}) \, d\theta_k. \tag{3.39}$$

In the RBC literature, it is common to calibrate the structural parameters, $\theta_{s,k}$, to specific values. By calibrating the values of $\theta_{s,k}$, the prior $p_{s,k}(\theta_{s,k})$ is degenerate. Therefore the prior for $\theta_k$ is

$$p(\theta_k) = p_{i,k}(\theta_{i,k}, \theta_{s,k}^c) \tag{3.40}$$

where $\theta_{s,k}^c$ is the calibrated value of $\theta_{s,k}$. The difference between the calibrated case and the non-calibrated case is that the prior for the structural parameters of the model is degenerate. All of the methods described above still apply to the case of calibration.

It is easy to see that this method of model comparison is very flexible and allows for the use of prior knowledge over the structural parameters of the model in a consistent way. In particular, prior uncertainty as to what the correct calibrated values are is allowed for. Also, the method is able to compare and evaluate models over sub-samples as well as across the whole sample.

The next section contains an application of the technique described above for two separate cases. The first case is where the structural parameters of the model are calibrated to specific values, as is the common practice in the RBC literature. The second case is where prior uncertainty over the values of the structural parameters of the model is allowed for. From the first case it will be clear that there is a need to be able to allow for uncertainty over the structural parameters of the model. It will also be shown that allowing for prior uncertainty can lead to different conclusions as to which model is preferred.
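Returning to the marginal likelihood calculation in (3.36)-(3.37), the sketch below implements a Gelfand and Dey (1994) style estimate from posterior draws. The particular choice of $f$, a normal density fitted to the draws and truncated to a high-posterior region, follows the practical suggestion attributed above to Geweke (1998), but the details (truncation probability, the fitted normal) are implementation choices rather than taken from this paper.

```python
import numpy as np
from scipy.stats import multivariate_normal, chi2

def log_marginal_likelihood(draws, log_prior_times_like, trunc_prob=0.95):
    """Estimate log M_T from posterior draws using g(theta) = f(theta)/[p(theta)p(Y|theta)].

    draws: (M x d) array of posterior draws.
    log_prior_times_like(theta): returns log[p(theta) p(Y^T | theta)]."""
    draws = np.atleast_2d(np.asarray(draws, dtype=float))
    M, d = draws.shape
    mean = draws.mean(axis=0)
    cov = np.atleast_2d(np.cov(draws, rowvar=False))
    f = multivariate_normal(mean, cov)

    # f is truncated to the chi-square ellipsoid containing trunc_prob of its mass,
    # then rescaled so that it still integrates to one.
    dev = draws - mean
    quad = np.einsum('ij,jk,ik->i', dev, np.linalg.inv(cov), dev)
    inside = quad <= chi2.ppf(trunc_prob, df=d)

    log_pl = np.array([log_prior_times_like(th) for th in draws])
    log_g = np.where(inside, f.logpdf(draws) - np.log(trunc_prob) - log_pl, -np.inf)

    # (3.37): the posterior mean of g estimates 1/M_T, so log M_T = -log mean(g).
    log_mean_g = np.logaddexp.reduce(log_g[inside]) - np.log(M)
    return -log_mean_g
```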

4 An Application

Recently, there has been renewed attention placed on models that have stochastic components that are unrelated to the "fundamental" components of the model. In particular, Benhabib and Farmer (1994) show that by perturbing a standard RBC model it is possible to obtain a model that can generate cycles with shocks that are unrelated to any "fundamental" components of the model. Farmer and Guo (1994) call this model a "sunspot" model, and compare such a model with a standard model with real shocks. Farmer and Guo (1994) motivate their work by noting that the source of fluctuations to an economy has important policy implications. They argue that if a model with shocks to its "fundamental" components best describes the data then there is no role for policy, as these allocations would be Pareto optimal. However, if a model that has shocks that are not related to the fundamentals of the model is preferred, then there is a role for policy to reduce the fluctuations and increase welfare.

The two models that are compared are variants of the one-sector stochastic growth model. The first model contains real shocks that are related to the fundamental components of the model, while the second model was shown by Benhabib and Farmer (1994) to have the potential for shocks that are not related to any fundamental component of the model. Farmer and Guo (1994) first compare the simulated data of the two models respectively with the observed data. From this comparison, Farmer and Guo conclude that the "sunspot" model cannot be rejected. Farmer and Guo then compare the two models with respect to the dynamics of the data. They do this by comparing the impulse response function implied by each model with the impulse response function implied by the data. Their conclusion is that the "sunspot" model does a better job of replicating the dynamics of the observed data. Therefore, Farmer and Guo conclude that a "sunspot" model cannot be rejected as a potential tool to analyze policy or to answer questions that are posed in the literature.

The approach of this application is to compare these two models using the Bayesian model comparison methods described in Section 3.2. The likelihood functions of each model are constructed using the method of Section 3.1 and these are used to compare the two models. The models are compared at the values calibrated by Farmer and Guo (1994) and also for the case where there is prior uncertainty placed across the structural parameters of each model. Section 4.1 describes the models that are used while Sections 4.2 and 4.3 describe how a likelihood function is obtained for the models used in this paper. The results will be reported in Section 4.4.

4.1 The Models

The "fundamental" model is the one-sector stochastic growth model with a constant returns to scale aggregate production technology. The "sunspot" model is a stochastic one-sector growth model with an increasing returns to scale aggregate production technology. The second model was shown by Benhabib and Farmer (1994) to have the potential to have a "sunspot" equilibrium. The two models are described below.

Consider first the increasing returns to scale economy that is the basis of the "sunspot" model. A more detailed discussion of this model can be found in Benhabib and Farmer (1994) and in Farmer (1993). The economy is as follows: Let $C_t$ denote consumption and $L_t$ denote labor supply. There are a very large number of agents indexed by $i \in [0, 1]$. Each agent acts as both a household and a producer. When acting as a household, the agent maximizes the discounted sum of utility with a time discount factor of $\rho$ where $0 < \rho < 1$. The problem faced by the consumer is

$$U_i = E_0 \sum_{t=0}^{\infty} \rho^t \left[ \log C_t - A \frac{L_t^{1-\gamma}}{1-\gamma} \right],$$

… > 1, which implies the possibility of positive profits.

The economy described above is an economy that has an increasing returns to scale technology ($\alpha + \beta > 1$). The degree of monopoly power for the intermediate producers is given by the parameter $\lambda$. If $\lambda = 1$, then there is no monopoly power and the factor shares defined in (4.8) are exactly equal to their production elasticities. In this case the model collapses to the standard one-sector stochastic growth model. The two economies that are used in this comparison are variants of the economy defined above. The first economy, which will be known as the "fundamental" model, is the model described above with $\lambda = 1$ and $\alpha + \beta = 1$. This is essentially the indivisible labor model of Hansen (1985). The second model, which will be called the "sunspot" model, is the model described above with $\lambda$ restricted to the interval (0, 1) and $\alpha + \beta > 1$. In Section 4.2 there will be a discussion of how the second model can exhibit a sunspot equilibrium.

The stochastic component of the above economy is $Z_t$, the technology shock parameter, which evolves according to the equation

$$Z_t = Z_{t-1}^{\theta} \eta_t \tag{4.9}$$

where $\eta_t$ is assumed to be an identically and independently drawn random variable from the distribution $N(1, \sigma_\eta^2)$. Note that in the RBC literature it is common to assume that the error $\eta_t$ has a Normal distribution even though this assumption implies that there is a non-zero probability that $Z_t$ is negative. However, for the calibrated values that are used by Farmer and Guo (1994), 0 is 142.8 standard deviations away from 1, so the probability of a negative value for $Z_t$ is negligible.

It then follows from the first order conditions to the problem set out above and from the laws of motion for the state variables that an equilibrium to the economy described above is characterized by the following set of equations:

$$Y_t = Z_t K_t^{\alpha} L_t^{\beta}, \tag{4.10}$$
$$A \, C_t L_t^{-\gamma} = b \, \frac{Y_t}{L_t}, \tag{4.11}$$
$$\frac{1}{C_t} = \rho \, E_t \left[ \frac{1}{C_{t+1}} \left( a \, \frac{Y_{t+1}}{K_{t+1}} + 1 - \delta \right) \right], \tag{4.12}$$
$$K_{t+1} = (1 - \delta) K_t + Y_t - C_t, \tag{4.13}$$
$$Z_t = Z_{t-1}^{\theta} \eta_t, \quad Z_0 \ \text{given}, \tag{4.14}$$
$$\lim_{t \to \infty} \rho^t \frac{K_t}{C_t} = 0. \tag{4.15}$$

The aggregate technology for the economy is described in (4.10), while (4.13) and (4.14) describe the laws of motion for capital, $K_t$, and for the technology shock, $Z_t$, respectively. The next section will introduce the methods that allow solutions to (4.10) through (4.15) to be calculated and show how those solutions are used to construct a likelihood function for this model. It also discusses how the model with increasing returns to scale can exhibit an equilibrium in which shocks that do not affect the fundamental components of the model cause fluctuations in the model.
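A small numerical sketch of the deterministic steady state implied by (4.10)-(4.13) (with $Z_t = 1$) is given below. The parameter values are placeholders chosen only so the example runs; they are not the calibrated values used in the paper, and $\lambda = 1$ is assumed so that the factor shares $a$ and $b$ coincide with the production elasticities.

```python
import numpy as np
from scipy.optimize import fsolve

# Placeholder parameters (not the paper's calibration); lambda = 1 is assumed,
# so the factor shares a and b equal the production elasticities alpha and beta.
alpha, beta = 0.36, 0.64
a, b = alpha, beta
A, gamma = 2.0, 0.0            # gamma = 0 corresponds to the indivisible-labor case
rho, delta = 0.99, 0.025

def steady_state_residuals(logx):
    """Residuals of (4.10)-(4.13) at a deterministic steady state with Z = 1."""
    K, C, L = np.exp(logx)                          # solve in logs to keep K, C, L positive
    Y = K ** alpha * L ** beta                      # (4.10)
    r1 = A * C * L ** (-gamma) - b * Y / L          # labor supply condition (4.11)
    r2 = 1.0 - rho * (a * Y / K + 1.0 - delta)      # Euler equation (4.12) with constant C
    r3 = Y - C - delta * K                          # resource constraint implied by (4.13)
    return [r1, r2, r3]

K_ss, C_ss, L_ss = np.exp(fsolve(steady_state_residuals, x0=np.log([10.0, 1.0, 0.3])))
print(K_ss, C_ss, L_ss)
```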

4.2 Solving the models and constructing a likelihood function

In order to use the Bayesian model comparison literature to compare the two models introduced in Section 4.1, a likelihood function for each model is needed. As is the case for most models found in the Real Business Cycle literature, the likelihood function for the models described in the previous section is intractable. The approach is to approximate a solution to the model and use the solution to construct a likelihood function as described in Section 3.1. The solution to the model takes the form of equations that relate all of the variables to the vector of state variables of the model and equations that describe how the state variables evolve over time.

The structure of this section is as follows. The first part will deal with how a solution to the model will be approximated. Next, there will be a discussion of how the model with increasing returns can exhibit a "sunspot" equilibrium, and then the construction of a likelihood function for each model will be introduced.

The method of approximation used in this section is the method that is used in Farmer and Guo (1994). Farmer and Guo (1994) show that the general model can be represented by the following set of difference equations:

$$\begin{aligned}
K_{t+1} &= B Z_t^m K_t^g C_t^d + (1 - \delta) K_t - C_t, \\
\frac{1}{C_t} &= E_t \left[ D Z_{t+1}^m K_{t+1}^{g-1} C_{t+1}^{d-1} + \frac{\tau}{C_{t+1}} \right], \\
Z_t &= Z_{t-1}^{\theta} \eta_t,
\end{aligned} \tag{4.1}$$

where $B = (A/b)^d$, $m = 1 - d$, $d = \beta\phi$, $g = \alpha m$, $\tau = \rho(1 - \delta)$, and $D = B\alpha\rho$. The state variables of the models are $\{K_t, C_t, Z_t\}$. Let $\{K_t^*, C_t^*, 1\}$ be the unique deterministic steady state values for $\{K_t, C_t, Z_t\}$, where $K^*$ and $C^*$ solve the following equations

$$K^* = B (K^*)^g (C^*)^d + (1 - \delta) K^* - C^*,$$
$$1 = D (K^*)^{g-1} (C^*)^d + \tau. \tag{4.2}$$

For any variable $X_t$ define

$$\hat{X}_t \equiv \log\left( \frac{X_t}{X_t^*} \right) \sim \frac{X_t - X_t^*}{X_t^*}.$$

Also, define

$$e_{t+1} = \begin{pmatrix} E_t[\hat{K}_{t+1}] - \hat{K}_{t+1} \\ E_t[\hat{C}_{t+1}] - \hat{C}_{t+1} \\ E_t[\hat{Z}_{t+1}] - \hat{Z}_{t+1} \end{pmatrix}.$$

Using the above definitions, the first order Taylor series approximation to (4.1) can be represented as the following matrix system:

$$\begin{pmatrix} \hat{K}_t \\ \hat{C}_t \\ \hat{Z}_t \end{pmatrix} = J \begin{pmatrix} \hat{K}_{t+1} \\ \hat{C}_{t+1} \\ \hat{Z}_{t+1} \end{pmatrix} + R \begin{pmatrix} \hat{\eta}_{t+1} \\ e_{t+1} \end{pmatrix}, \tag{4.3}$$

where the $(3 \times 3)$ matrix $J$ contains partial derivatives of (4.1) and $R$ is a $(3 \times 4)$ matrix of coefficients. See Appendix A.1 for a derivation of (4.3). The system of equations in (4.3) contains the equations that determine how the state variables of the model evolve over time. The elements that make up the matrices $J$ and $R$ are all functions of the structural parameters of the model.

The non-state variables of the model are output, $Y_t$, investment, $I_t = Y_t - C_t$, productivity, $P_t = Y_t / L_t$, and the supply of labor, $L_t$. These variables can also be written as functions of the state variables $\{K_t, C_t, Z_t\}$. For example, labor supply can be written as a function of the state variables. That is,

$$\hat{L}_t = l_k \hat{K}_t + l_c \hat{C}_t + l_z \hat{Z}_t, \tag{4.4}$$

where $l_k = -\alpha\phi$, $l_c = \phi$, and $l_z = -\phi$. For a derivation of (4.4), see Appendix A.2. Similar equations can be obtained for output, $Y_t$, investment, $I_t = Y_t - C_t$, and productivity, $P_t = Y_t / L_t$. Therefore it is possible to write

$$\begin{pmatrix} \hat{L}_t \\ \hat{I}_t \\ \hat{P}_t \\ \hat{Y}_t \end{pmatrix} = M \begin{pmatrix} \hat{K}_t \\ \hat{C}_t \\ \hat{Z}_t \end{pmatrix}. \tag{4.5}$$

The set of equations that relate the variables of the model to the state variables is given in (4.5). In order to use the method of constructing the likelihood function that was described in Section 3.1, there needs to be a set of equations that describe the evolution of the state variables of the model over time. Farmer (1993) discusses a method of how to solve the system given in (4.3). The general setup is as follows: The model described above for both the “fundamental” model and the “sunspot” model has three state variables. However, Kt and Zt both have known initial conditions. Therefore, according to Farmer (1993), the model described above has only one free state variable. Farmer (1993) defines problems for which the matrix J of (4.3) has exactly the same number of eigenvalues of modulus less than 1 as it has free state variables as “regular” problems. For a “regular” model, the equilibrium is a saddle point equilibrium. This is the most common case for economic models. For either of the models described above, if J has exactly one eigenvalue that has modulus less than one, the model would be classed as a “regular” model. If the number of eigenvalues of J that have a modulus less one is smaller then the number of free state variables, then the model a called and “irregular” model. The “fundamental” model is the model described above with λ equal to one and aggregate technology has constant returns to scale. Farmer (1993) shows that this model is an example of a regular problem and so has a unique “regular” equilibrium. Therefore, the matrix J of (4.3) for the “fundamental” model has exactly one eigenvalue that has modulus less than one. Farmer and Guo (1994) show that in this case it is possible to express one of the state variables as a linear function of the others. In ˆ t and Zˆt . That is, particular Cˆt can be expressed as a linear combination of K Cˆt = ck Kˆt + cz Zˆt

(4.6)

where the coefficients ck and cz are, again, non-linear functions of the structural parameters of the model. By substituting (4.6) in to (4.3) the evolution of the state vectors for the “fundamental” model is given by the following system ˆt K

=

ˆ t−1 + a12 Zˆt−1 a11 K

Zˆt

=

θZˆt−1 + ηˆt .

(4.7)

Substituting (4.6) into (4.5), all of the variables of the “fundamental” model can be represented as ˆ t and Zˆt . This together with (4.7) are the equations that are used functions of the two state variables K to construct the likelihood function. The “sunspot” model that is used in Farmer and Guo (1994) is the variant of the basic model described in (4.1) to (4.7) with 0 < λ < 1 and α+β > 1. It is the model with an increasing returns to scale aggregate technology that arises through agents producing intermediate goods with some market power indexed by λ. This model also has a representation of the form given in (4.3). Benhabib and Farmer (1994) show that for certain values of the parameters , all of the eigenvalues of J lie outside of the unit circle. In this case the model, using the notation of Farmer (1993), is said to be an “irregular” model. Farmer and Guo (1994) further restrict the “sunspot” economy by letting Zt equal its unconditional mean of one for all periods. That is, Zt = 1 for all t. Therefore, the “sunspot” model has no real shocks, only “sunspot” shocks. As the value of Zˆt is zero for all periods, the system of equations that represent the evolution of the state variables through time becomes   ˆ  ˆ  Kt+1 Kt ¯ ( e¯t+1 ) ¯ +R (4.8) =J ˆ Ct+1 Cˆt ¯ are the appropriate partitions of J and R respectively. If the eigenvalues of J¯ all have where J¯ and R modulus greater than one then the system   ˆ     ˆ 0 Kt+1 ¯−1 Kt + (4.9) = J Vˆt+1 Cˆt Cˆt+1 19

characterizes a Markov process that is stable and satisfies the equilibrium conditions given in (4.8). Note that the error term in (4.9) is given by ¯ et+1 J¯−1 R¯ where e¯t+1 =



ˆ t+1 ] − K ˆ t+1  E t [K . Et [Cˆt+1 ] − Cˆt+1

ˆ t+1 is known in period t so that Et [K ˆ t+1 ] − K ˆ t+1 is always equal to In this model, the value of K zero. Hence,   0 ¯ t+1 = J¯−1 Re . c2 Vˆt+1 where Vˆt is the random variable that has mean zero and variance σV2 that represents Et [Cˆt+1 ] − Cˆt+1 . So, for the one sector stochastic growth model with increasing returns it is possible that the model can exhibit an equilibrium that has the form of (4.9) where the random variable Vˆt is not related to any fundamental parameters of the model. One can think of this as a model that is driven by “sunspots”. For the purposes of the comparison, only parameter values that lead to the matrix J with the modulus of its eigenvalues all greater than zero were used. In that sense, the “fundamental” model was compared to the “sunspot” model. The evolution of the state variables in the “sunspot” model is described by the following system: ˆt K

=

ˆ t−1 + b12 Cˆt−1 , b11 K

Cˆt

=

ˆ t−1 + b22 Cˆt−1 + c2 Vˆt . b21 K

(4.10)

Once solutions to the two models have been obtained, it is now possible to construct a likelihood function for each model. Consider, first, the “fundamental” model. The “fundamental” model contains only one stochastic state variable, Zˆt . Before the likelihood function can be constructed, assumptions on the innovation to the stochastic variable need to be made. For the “fundamental” model, it is assumed that ηt ∼ N (1, ση2 ) so that ηˆt ∼ N (0, ση2 ). As there is only one stochastic state variable, the likelihood function for the “fundamental” model can ˆ t be the variable that is to be only be constructed using observations on one variable at a time. Let X used, where X could represent any of the variables that make up the model. For example, X could be ˆ T = {X ˆ t }T . As described in Section 3.1 the likelihood function is output or hours supplied. Let X t=0

constructed iteratively. All variables in the model can be written as a function of the state variables of the model so that in order to calculate the implied values of the shocks to the model, it is necessary to calculate the values of the state variables for each period. This is a straightforward problem in periods two and higher once the initial values of the state variables are known. For the case of the “fundamental” ˆ 0 is needed. Once this value is known it is possible to construct the values of all model, the initial value K other state variables from the equations defined in (4.7) and the equation that relates the current value ˆ t and the state variables. Let the equation that relates X ˆ t to the state variables be of the observable, X ˆ t = xk K ˆ t + xc Cˆt + xz Zˆt X

(4.11)

Then combining (4.6) with (4.11) gives, ˆ t = (xk + xc ck )K ˆ t + (xc cz + xz )Zˆt . X

20

(4.12)

where xc , xk , ck and cz are all functions of the structural parameters of the model. ˆ t and the value of X ˆ t , (4.12) can be used to calculate the value of Zˆt . Once Given the value of K the value of Zˆt is known, then all of the state variables are known for period t. Therefore it is possible to calculate, from (4.4) or similar, all the variables in the model. In particular, it is possible to calculate ˆ t+1 . Once K ˆ t+1 is known, it is possible to calculate the value of ηˆt+1 using the observation the value of K ˆ t+1 . This process continues until period T, the end of the sample. Once {Zˆt }T is known it is X t=0 possible to calculate the values of ηˆt for periods one through T using (4.7). The above description can be summarized in the following algorithm: ˆ0 • for t=0,. . . ,T, given K • Zˆt =

ˆ t −(xk +xc ck )K ˆt X (xc cz +xz )

ˆ t+1 = kk K ˆ t + kz Zˆt . • K Once {Zˆt }Tt=0 is known it is possible to calculate {ˆ ηt }Tt=1 using the equation ηˆt = Zˆt − θZˆt−1 . Then {ˆ ηt }Tt=1 , together with Zˆ0 are used to calculate the likelihood function for the “fundamental” model. The likelihood function is ˆ T |θf , K ˆ 0) p(X

=

ˆ 0 |θf , K ˆ 0) p(X

T Y

ˆ t |θf , X ˆ t−1 , K ˆ 0) p(X

t=1

= where

and

!−1 ˆ ˆ ∂(X0 , . . . , XT ) g0 (Zˆ0 ) g(ˆ ηt ) ∂(Zˆ0 , ηˆ1 , . . . , ηˆT )0 t=1 T Y

(4.13)

  1 − θ2 1/2 1 − θ2 ˆ 2 ˆ g0 (Z0 ) = ( ) exp − (Z0 − 1) 2πση2 2ση2   1 g(ηˆt ) = (2πση2 )−1/2 exp − 2 ηˆt2 . 2ση

It follows from (3.19) that the Jacobian matrix for the transformation between the observed data and the stochastic component is lower triangular so that the determinant of this Jacobian will be the inverse of the product T ˆ0 Y ˆt ∂X ∂X ∂ Zˆ0

where

and

t=1

∂ ηˆt

ˆt ∂X = ((xc cz + xz )) for t = 1, . . . , T ∂ ηˆt ˆ0 ∂X = ((xc cz + xz )). ∂ Zˆ0

Therefore the determinant of the Jacobian of the transformation is |(xc cz + xz )|

−(T +1)

.

Hence the likelihood function for the “fundamental” model is ( ) T (1 − θ)2 1/2 (1 − θ)2 ˆ 1 X 2 T 2 −T /2 2 ˆ ˆ p(X |θf , K0 ) = ( ) (2πση ) exp − (Z0 − 1) − 2 ηˆ 2πση2 2ση2 2ση t=1 t −(T +1)

|(xc cz + xz )|

(4.14)

21

The process to construct the likelihood function for the “sunspot” model is the same as that described above for the “fundamental” model. For the “sunspot” model, the vector of state variables is ˆ t , Cˆt )0 . The innovation to the state vector for the “sunspot” model is st = (K ut = (0, c2 Vˆt )0 where Vˆt is identically and independently distributed with mean zero and variance σV2 . For this example, it is assumed that Vˆt ∼ N (0, σV2 ) so that

  1 gv (Vˆt ) = (2πσV2 )−1/2 exp − 2 Vˆt2 . 2σV

ˆ T = {X ˆ t }T . Then (4.11) and (4.10) can be Again, suppose that there are observations on X t=0 ˆ T . The process is summarized used to calculate the values of Vˆt that are implied by the observations X in the following algorithm: ˆ −1 and Cˆ−1 , for periods t=0,. . . ,T • given K ˆ t = b11 K ˆ t−1 + b12 Cˆt−1 • K • Cˆt =

ˆ t −xk K ˆt X xc

ˆ t−1 − b22 Cˆt−1 }. • Vˆt = (1/c2 ){Cˆt − b21 K Using the above algorithm, it is possible to construct the values {Vˆt }Tt=1 implied by the observed data, ˆ t }T . Thus, it follows that the likelihood function for the “sunspot” model is {X t=1 ˆ T |θs , K ˆ −1 , Cˆ−1 ) p(X

=

T Y

ˆ t |θs , X ˆ t−1 , K ˆ −1 , Cˆ−1 ) p(X

t=0

=

!−1 ˆ ˆ ∂(X0 , . . . , XT ) . gv (Vˆt ) ∂(Vˆ0 , Vˆ1 , . . . , VˆT )0 t=0 T Y

(4.15)

It follows from (3.19) that the Jacobian in (4.15) is a lower triangular matrix, which implies that the determinant of the Jacobian is !−1 T ˆ ˆ ˆt Y ∂(X0 , . . . , XT ) ∂X = . ∂(Zˆ , ηˆ , . . . , ηˆ )0 t=0 ∂ Vˆt 0 1 T It follows from (4.11) that

ˆt ∂X = xc c 2 ˆ ∂ Vt so that

!−1 ˆ0, . . . , X ˆT ) ∂( X = |xc c2 |−(T +1) . ∂(Zˆ , ηˆ , . . . , ηˆ )0 0 1 T

Therefore, the likelihood function for the “sunspot” model is ˆT

ˆ −1 , Cˆ−1 ) p(X |θs , K

=

(2πσV2 )−(T +1)/2 |xc c2 |−(T +1) .

22

(

T 1 X ˆ2 V exp − 2 2σV t=0 t

) (4.16)

Now that the likelihood function for each model has been constructed,likelihood methods are now available. In particular, the Bayesian model comparison method introduced in Section 3.2 is now available. The results from this comparison for the two models described above can be found in Section 4.4 below. As each model has only one stochastic component, the models are compared across a number of data sets. A likelihood function is calculated for each data set. Section 4.3 describes the raw data that was used and also discusses how the data was transformed before it was used.

4.3

Data

The two models that were described in Section 4.1 were compared using five separate data sets. The data sets that were used were data on total hours supplied (hours), total consumption (consumption), total investment (investment), productivity, and Output. All of the data sets that were used consisted of deseasonalized quarterly data. Consumption, investment and output were obtained from the National Income and Product accounts and hours supplied was constructed using data obtained from the Bureau of Labor Statistics LABSTAT1 database. Two series were used in the construction of the total-hours series used. They were average hours supplied2 and number employed3 according to the Household Labor Survey. Total hours supplied was thus defined as the number of people employed multiplied by the average hours supplied. As defined in the Household Labor Survey, persons are defined as employed if, during the reference week, they 1) did any work at all as paid employees, worked in their own business or 2) were temporarily absent from jobs because of vacation, illness, bad weather, maternity leave, labor dispute, job training or personal reasons. People who work more than one job are counted only once. This definition of the number of people employed is used because it closely mirrors the construction of the average hours series. Once obtained, the total hours series was deseasonalized using seasonal dummies in the obvious way. One aspect of the Current Population Survey is that for the years of 1959, 1964, 1970, 1981, 1987, and 1992, Labor Day fell in the survey week. For those years the average hours supplied for September was artificially low, as the reference week only contained four days. To control for this effect, extra dummies were included when the data was deseasonalized. Those extra dummies took the value of one for an observation that fell during one of the Labor Day weeks, and zero otherwise. The data that makes up the consumption series includes all consumption expenditure on nondurable goods and services while the investment series is made up of total gross private investment as defined by the National Income and Product Accounts. Output is defined to be Real Gross National Product and productivity is defined to be output divided by total labor hours. The sample used in this comparison ranged from the first quarter of 1955 (1955:1) until the last quarter of 1996 (1996:4) for Total hours supplied. For all other data sets the data used was from the third quarter of 1958 (1958:3) until the last quarter of 1996 (1996:4). Before the data could be used in the comparison of two models from the RBC literature, it needs to be transformed to a form that better matches the equations used to calculate the likelihood function. The equations used to calculate the likelihood function use data in the form of deviations from trend of the log of the variable. If Xt is a variable of the model, then   Xt ˆ = log(Xt ) − log(X ∗ ) Xt = log X∗ is used in the equations that make up the solutions to the model. 1 http://stats.bls.gov:80/datahome.htm 2 The 3 The

series ID for average hours supplied is lfu1231040000000 series ID for number employed is lfs11104010000

23

The data that are observed grow over time. The models described above, however, abstract from growth and assumes the variables fluctuate around a steady state. It is the practice in the RBC literature to handle this problem by transforming the data using the Hodrick-Prescott (1997) filter. This filter acts to remove a trend from the data by solving the following problem. Let {xt }Tt=1 be the log of the raw data and let {τt }Tt=1 be the trend for that logged series. The Hodrick-Prescott (1997) filter chooses {τt }Tt=1 so as to minimize T T −1 1X φ X (xt − τt )2 + [(τt+1 − τt ) − (τt − τt−1 )]2 T t=1 T t=2

where φ is a smoothing parameter that is set equal to 1600 for quarterly data. Then dt = xt − τt is used ˆ t . In essence, log(X ∗ ) is replaced by τt . Figures 1 to 5 contain plots of the resulting to as a proxy for X deviations from trend for the data that was used in the comparison.

4.4

Results

The two models were then compared under two separate cases. The first case is the case where the structural parameters of the models are calibrated. This was the comparison that was undertaken by Farmer and Guo (1994). The second case, which was not undertaken by Farmer and Guo (1994) is where there is prior uncertainty over the structural parameters of the model. Please note that all results were calculated using software from the Bayesian Analysis Computation and Communication (BACC) project (Geweke and Chib 1998).

4.5

Calibration

In this case, the two models described in Section 4.1 are compared using the Bayesian model comparison techniques described in Section 3.2. In this section, the structural parameters of the model will be fixed at the calibrated values given by Farmer and Guo (1994). This is the exact comparison that was undertaken in Farmer and Guo (1994). In their paper, Farmer and Guo (1994) use informal methods such as comparing second moments and comparing the implied impulse response functions of the data generated by the models with the impulse response function implied by the observed data. This section aims to perform a formal comparison of the two models by constructing a likelihood function for the two models and forming a Bayes factor in favor of one model over the other. The values for the structural parameters that were used in this study were the same values used by Farmer and Guo (1994). These can be found in Table 1. a

b

α

β

δ

ρ

γ

λ

θ

ση or σV

“fundamental”

0.36

0.64

0.36

0.64

0.025

0.99

0

1

0.95

0.007

“sunspot”

0.23

0.70

0.40

1.21

0.025

0.99

0

0.58

NA

0.00217

Table 1: Calibrated values of structural parameters Bayes factors were constructed using the methods described in Section 3.2. The “fundamental” model described in Section 4.1 contains only one stochastic element which is the technology shock, Zt . As a result the likelihood function for the “fundamental” model will be constructed using one data set at a time. Thus the results that follow will be reported for each data set separately. Let p(X T |θs,f , θi,f ) be the properly normalized data density for the “fundamental” model. Here X T = {Xt }Tt=1 is the data set that is being used where Xt is the tth observation used in the construction of the likelihood function.

24

Given this data density, the posterior distribution for θf = (θs,f , θi,f )0 is p(θf |X T ) ∝ p(θf )p(X T |θf )

(4.1)

where p(θf ) = p(θs,f |θi,f )p(θi,f ) is the properly normalized prior distribution for θf . In this case, the c structural parameters are fixed at their calibrated values, θs,f . This implies that the prior for the structural parameters is c (θs,f ) p(θs,f |θi,f ) = Iθs,f c where I() is an indicator function which takes the value of one if θs,f = θs,f and zero otherwise. In the case of the “fundamental” model, the set of initial parameters is made up of only the initial level of ˆ 0 . In the notation of Section 3.1, θs,f = (α, β, a, b, δ, ρ, γ, λ, θ, σ 2 )0 and θi,f = K ˆ 0. capital, K η The prior for the initial condition was constructed as follows: The initial condition for the “fundamental” model is the deviation from trend of the initial capital stock. The prior distribution was assumed to be Gaussian with mean zero and variance σ 2 . To calculate the prior variance consider the Taylor’s series approximation to the capital accumulation equation given by

ˆ t+1 = (1 − δ)K ˆ t + δ Iˆt . K

(4.2)

Using (4.2), the unconditional variance of capital is approximately δ var(Iˆt ). 2−δ

ˆ t) ≈ var(K

(4.3)

ˆ t was Using this equation, the approximate variance for capital is 0.000032. The prior variance for K therefore set to be 0.000064, twice the calculated variance of the capital series. This makes the prior ˆ t. distribution for the initial capital stock relatively diffuse relative to the approximate distribution of K T The posterior distribution for θf given X is therefore c p(θf |X T ) ∝ p(θi,f )p(X T |θs,f , θi,f ).

To make draws from the posterior described above, a random-walk Metropolis-Hastings algorithm was used. The algorithm is as follows: (0)

• given θi,f

• for m=1,M • draw x ∼ f () (m−1)

• let y=θi,f +x   y (m) • let θi,f =  (m−1) θi,f (0)

with prob. min



p(y,θsc |X T ) (m−1)

p(θi,f

,θsc |X T )



,1

else

(M )

• return {θi,f , . . . , θi,f } The source density, f(), for the random walk step was chosen to be the Normal density with mean 0, 2 and variance σrw . The variance of the source density was chosen to tune the algorithm. Once the draws (0) (M ) {θi,f , . . . , θi,f } are obtained, they were used to calculate the marginal likelihood for the “fundamental” model as described in Section 3.2 The structural parameter vector for the “sunspot” model is θs,s = (α, β, a, b, δ, ρ, γ, λ, σV2 )0 and the ˆ 0 , Cˆ0 )0 . The structural parameters were fixed at the values given initial condition parameter is θi,s = (K

25

in Table 1. The prior for θs = (θs,s , θi,s )0 is defined as for the “fundamental” model except that the prior for the initial parameters is defined to be the product of two independent Normal distributions. The prior for the initial value of capital is the same as for the “fundamental” model while the prior for the initial value of consumption is defined to have mean zero and variance equal to 0.00135. The variance for consumption was constructed using the same approach that was used to construct the variance for the initial value of capital. The approximation to Ct + It = Yt yields an approximate variance for Cˆt of 0.000675. The prior variance for Cˆ0 was therefore set equal to twice the calculated variance for Cˆt . Table 2 contains information on the priors for the initial conditions for the “sunspot” model. The marginal-likelihood was calculated the same as described for the “fundamental” model. Model

parameter

prior mean

prior variance

“fundamental”

k0

0

0.000064

“sunspot”

k0

0

0.000064

c0

0

0.00135

Table 2: Prior specification for initial conditions The following tables contain the results for the two models. In all cases, a total of 50,000 draws from the posterior distribution of each model were made using the Metropolis-Hastings algorithm de2 scribed above. The value of σrw was chosen to tune each algorithm separately. There were a number of diagnostics that were used to determine whether the algorithm was tuned appropriately. One of those is the acceptance probability of the algorithm. Tierney (1994), discusses what should be the appropriate 2 acceptance proportion from the Metropolis-Hastings algorithm. The value of σrw was chosen so that the acceptance probability of the algorithm was between 0.3 and 0.5. Another diagnostic is to compare the numerical standard error of each moment calculated with the corresponding posterior standard deviation. These are reported by MOMENT4 . In all cases, the numerical standard errors of the moments calculated were all less than one tenth of the calculated posterior standard deviations. For the case of the calibrated models, the draws from the posterior distribution, c p(θi,k |X T , θs,k )

were used to calculate the marginal likelihoods for each model. These marginal likelihoods are reported in Table 3 below. The subsequent log Bayes factors in favor of the “fundamental” model over the “sunspot” model are reported in Table 4. Consumption

Hours

Investment

Productivity

Output

“fundamental”

-264.7742 (0.0035)

504.8544 (0.0129)

244.0432 (0.0143)

-827.0816 (0.0103)

458.9279 (0.0114)

“sunspot”

-1917.2548 (0.0144)

505.1721 (0.0150)

242.0583 (0.0122)

-3095.4583 (0.0160)

464.8219 (0.0104)

Table 3: Log marginal likelihoods: Calibrated case 4 http://www.econ.umn.edu/∼bacc

26

Consumption

Hours

Investment

Productivity

Output

1652.4806 (0.0148)

-0.3177 (0.0198)

1.9849 (0.0188)

2268.3767 (0.0190)

-5.8940 (0.0154)

Table 4: Log Bayes factor in favor of “fundamental” model over “sunspot” model: Calibrated case The first thing that should be noted from the reported marginal likelihoods and Bayes factors is the extreme difference in the log Bayes factors across the data sets that are used to construct the likelihood functions. When consumption and productivity are used to calculate the likelihood function, the log Bayes factor is overwhelmingly in favor of the “fundamental” model. This contradicts Farmer and Guo’s finding that the “sunspot” model does no worse than the “fundamental” model. However, the variance in the log Bayes factors suggests that there is a problem with the calibration of the models for those data sets. The parameter that would have the biggest direct effect on the scale of the marginal likelihoods would be the variance parameters of the shock terms for each model. Suppose that these values were smaller than they should be. Then, the values of the shock terms that are used to construct the likelihood functions would be mostly in the tail of their assumed distribution. Thus tail effects could explain the drastic differences between the two models. In the literature, it is common to calibrate the models so that the variance of the artificial times series for output is approximately equal to the observed variance of output. It is not common, however, to recalibrate this value for all of the data sets that are used. For those variables that have significantly different variance form that of output, there could be a problem in using the variance of output to calibrate the variance of the shock term. To test this hypothesis, the variance of the shock term for each model was calibrated for each data set that was used. ˆ t be any variable that is used to construct the likelihood function. Consider first the case Let X of the “fundamental” model. Each variable in that model can be written as a function of the state variables. Therefore, ˆ t = xk K ˆ t + xz Zˆt X where Zˆt ∼ N (θZt−1 , ση2 ). ˆ T , the new value of ση2 was chosen so that the variance of the artificial data set For each data set, X was equal to the variance of the observed data set. Therefore, the value of ση2 was calibrated using the following relation, ˆ var(X)(1 − θ2 ) ση2 = . (4.4) 2 xz ˆ can be written as For the case of the “sunspot” model, the variable, X ˆ t = xk K ˆ t + xc Cˆt . X This along with (4.10) implies that the value of σV2 should be calibrated to, σV2 =

ˆ var(X) (c2 ∗ xc )2

(4.5)

The new calibrated values that were used are reported in Table 5. The log marginal likelihoods for the “fundamental” model and the “ sunspot” model were calculated exactly the same as the previous case. The models were solved at their calibrated values, and the posterior distribution of the remaining free parameters was formed using the constructed likelihood

27

Variable

ση2

σV2

consumption

1.0727 × 10−4

2.4334 × 10−4

hours

1.0902 × 10−5

1.0677 × 10−5

investment

1.4921 × 10−5

6.5254 × 10−6

productivity

1.3723 × 10−4

3.1128 × 10−4

−6

9.4757 × 10−6

real GDP

8.1364 × 10

Table 5: Calibrated values for variances to shock process function. The marginal likelihood was then calculated using the method of Gelfand and Dey (1994), as before. The results can be found in Tables 6 and 7. Consumption

Hours

Investment

Productivity

Output

“fundamental”

218.8067 (0.0035)

252.6044 (0.0141)

115.8952 (0.0103)

94.3669 (0.0119)

294.2506 (0.0121)

“sunspot”

453.0565 (0.0121)

497.9558 (0.0126)

222.6462 (0.0167)

428.4416 (0.0121)

444.0045 (0.0118)

Table 6: Log marginal likelihoods: New calibrated case

Consumption

Hours

Investment

Productivity

Output

-234.2498 (0.0126)

-245.3514 (0.0186)

-106.7510 (0.0196)

-334.0742 (0.0170)

-149.7539 (0.0169)

Table 7: Log Bayes factor in favor of “fundamental” model over “sunspot” model: New calibrated case The differences in the results presented in Tables 6 and 7 from those presented in Tables 3 and 4 are quite stark. Under the new calibration, the “sunspot” model is heavily favored using all data sets. The difference between the new calibration and the old calibration is greater for the variables consumption and productivity. While the calibration for these variables appears to be better than the old calibration, the new calibration is not favored for all variables. The log Bayes factors in favor of the new calibration can be found in Table 8. The above results do suggest however that the reason for the differences across the log Bayes factors reported in Table 3 is because of scaling. The above results also suggest that it is important to calibrate the variance of the shock process correctly when one is trying to compare the performance of two models. It is the practice to calibrate the variance of the shock process using information on output only. This practice, as it was the case above, could lead to incorrect inferences as to model validity for variables other than output. The results in Table 8 imply that calibrating the variance term of the shock process for the two models using the equations of the approximated solution does not do as good a job as the original calibration when observations on hours, investment, and output are used to construct the likelihood function. It is not clear, therefore, as to what is the “best” method to calibrate the value of ση2 and σv2 should be. Allowing for uncertainty over their values would seem to be the best solution to this problem. Allowing for prior uncertainty over structural parameters is discussed in the Section 4.6. Table 9 contains the log Bayes factors in favor of the “fundamental” model over the “sunspot” model for the most favored calibration. The results in Table 9 are not consistent across the data sets.

28

Model

Consumption

Hours

Investment

Productivity

Output

fundamental

483.5809 (0.0049)

-252.2500 (0.0191)

-128.1480 (0.0176)

921.4485 (0.0157)

-164.6773 (0.0166)

sunspot

2370.3113 (0.0188)

-7.2163 (0.0196)

-19.4121 (0.0207)

3523.8999 (0.0200)

-20.8174 (0.0157)

Table 8: Log Bayes factor in favor of the new calibration over the old calibration Consumption

Hours

Investment

Productivity

Output

-234.2498 (0.0126)

-0.3177 (0.0198)

1.9849 (0.0188)

-334.0742 (0.0170)

-5.8940 (0.0154)

Table 9: Log Bayes factor in favor of “fundamental” model over “sunspot” model: Most favored calibration The “sunspot” model is favored for four of the five data sets. There is also inconsistency as to the degree to which the “sunspot” model is favored. The evidence would suggest that the claim of Farmer and Guo (1994) that the “sunspot” model is “as good” as the “fundamental” model is supported. Using consumption or productivity, the evidence suggests that the “sunspot” model is superior to the “fundamental” model. In order to understand why there is a disparity across the different variables that were used to construct the likelihood functions for the two models, the log Bayes factor was decomposed across the entire sample. It was hoped that this would ad insight as to the differences between the two models. The approach was as follows: For each observation t = 2, . . . , T − 1, the Metropolis-Hastings algorithm was used to draw from pk (θi,k |X t , θs,k ) and the value of pk (xt+1 |X t , θk ) was returned as a function of interest. The value of pk (xt+1 |Xt , θk ) is the predictive likelihood of the observation xt+1 given the information Xt . From the output of the Metropolis-Hastings algorithm, posterior moments of the pk (xt+1 |Xt , θk ) were formed for each period t = 2, . . . , T − 1 using the routine MOMENT from the BACC software.5 Then the cumulative sums of the log of the predictive likelihoods was calculated. The results of this can be found in Figures 1 to 5. Figures 1 to 5 contain three graphs each. The first graph is the graph of the deviations from trend as reported by the Hodrick-Prescott filter. The second graph displays the proportional deviation of period t’s observation from period (t-1)’s observation in relation to the range of the observations. That is, the value of the proportional deviation for period t, x ˜t is equal to x ˜t =

x ˆt − x ˆt−1 . T maxt=1 (ˆ xt ) − minTt=1 (ˆ x)

The third graph is the cumulative log Bayes factor in favor of the “fundamental” model over the “sunspot” model. In fact, the third graph is the cumulative sum of the posterior means of the predictive likelihoods for next period. It is clear from Figures 1 to 5 that neither of the two models are in ascendancy over the whole sample. In fact, it appears that the “sunspot” model does better during periods where the data is more 5 http://www.econ.umn.edu/∼bacc

29

Figure 1: Cumulative log Bayes factor: Consumption

Figure 2: Cumulative log Bayes factor: Hours volatile. Thus, one reason for the disparity of results noted above could be due to the nature of the data rather than the superiority of any model. This example gives a good illustration of the method in its ability to compare models across sub-sections of the data as well as across the whole data. During the comparison of the “sunspot” model with the “fundamental”model it was necessary to re-calibrate the variance of the shock process of the two models for each data set that was being used to compare the models. The re-calibration was carried out so as to set the variance of the respective shock processes so that the models would generate artificial data that had the same variance as the observed data. While it was clear that there was a problem with the old calibration, it was not clear that the new calibration was the best method either. One way to solve this problem would be to calibrate the models a number of ways to get an idea of how the results are sensitive to the calibration used. A better and easier solution would be to calibrate the models allowing for uncertainty as to what the exact value of the structural parameters should be. The next section deals with the problem of comparing models when there is uncertainty over the correct values of the structural parameters of the model.

4.6

Prior uncertainty over the structural parameters

The problem of comparing two or more models with prior uncertainty over the parameters of a model is an easy problem. Once a properly normalized data density and a properly normalized prior are specified, all of the Bayesian model comparison techniques that were used in Section 4.5 carry through. In the RBC literature there has not been a lot of attention paid to the problem of allowing for the structural parameters to be calibrated with some degree of uncertainty. DeJong, Ingram, and Whiteman (1996, 1997) do allow for the structural parameters to be calibrated with uncertainty. In their papers, they show that allowing for uncertainty allows the model to better fit the data. Hansen and Heckman (1996) argue that the calibration of dynamic macroeconomic models using cross-sectional data may not lead to “good” calibrations. Calibration, as it is known in the RBC literature, specifies the values of the structural parameters of a model using historical studies and known theory. In calibrating the parameters to specific values, one is placing a degenerate prior over the parameters. Hansen and Heckman (1996) argue that models in the RBC literature are highly aggregated and dynamic while some of the studies are micro-studies that are cross-sectional in nature. They argue that it may not be possible to be able to calibrate with the accuracy that is suggested in the literature. Also, as was found in the previous section, the calibration of the variance of the shock terms for the two models being compared left some uncertainty to what was the correct method that should be used to calibrate the model. In is clear that it would be prudent to allow for some prior uncertainty in calibrating the variance of the shock process. Before making inferences from a model, it is usual to use the data to calculate the values of the structural parameters that are most likely. This has not been the case in the RBC literature. Instead, the structural parameters have been fixed using prior knowledge without out any room for error. If one allows for the calibration of the structural parameters of a model with some uncertainty then it is possible to use prior information as well as information from the data to ascertain

30

Figure 3: Cumulative log Bayes factor: Investment

Figure 4: Cumulative log Bayes factor: Productivity the values of the structural parameters that are most likely. In this section, there is prior uncertainty allowed across the structural parameters of the two models described in Section 4.1. For the “fundamental” model, the structural parameters, θs,f , are the following: b, the labor share of income, δ, the depreciation rate, ρ, the time discount factor, θ, the AR(1) parameter on the productivity shock parameter, A, the preference parameter, and ση2 , the variance of the innovation process to the productivity shock. All other parameters of the model are either fixed or are functions of the above parameters. For example, a, capital’s share of income is defined to be equal to 1-b, while α and β are equal to a/λ and b/λ respectively. The labor supply elasticity, γ, is set equal to zero for all cases. This is not the only way of defining the free parameters that make up the model and hence the likelihood function. There is some flexibility as to which of the parameters a, b, α, β and λ to make free. However, for this comparison, the structural parameters for the “fundamental” model is the vector θs,f = (b, δ, ρ, θ, σ2 , A)0 . The structural parameters for the “sunspot” model are θs,s = (b, δ, ρ, σV , A, λ)0 . There are a number of restrictions that are placed on the structural parameters in the description of the models. In the “fundamental” model, the parameters b, δ, ρ, and θ are all constrained to lie in the unit interval, (0,1). The parameters A and ση2 are both constrained to be greater than zero. For the “sunspot” model, the same restrictions apply to those parameters that overlap except for the parameter b. In the “sunspot” model, it was necessary to incorporate monopolistic competition into the production side of the model. This meant that firms could make above normal profits. Farmer and Guo (1994) calibrate the proportion of national income due to profits as 7%. This was also done for this paper. This implies that the value of b, labor’s share of income, lies in the interval (0,0.93). The parameter λ is also constrained to the interval (0,1). Therefore, the domain of (θs,f ,Θs,f ) is Θs,f = (0, 1)4 × (0, ∞)2 and the domain of the structural parameters for the “sunspot” model is Θs,s = (0, 0.93) × (0, 1)4 × (0, ∞)2 . It is convenient to transform the domains of the structural parameters in each model to Rk , where k is equal to the dimension of the domain. This transformation is convenient in implementing the random-walk component of the Metropolis-Hastings algorithm that was used to make draws from the posterior distribution for each model. By transforming the domain of the parameter vectors, the randomwalk Metropolis-Hastings algorithm was more efficient in that it was guaranteed to always remain in the domain of the posterior distribution. The priors for the structural parameters were also defined over the transformed space.

31

Figure 5: Cumulative log Bayes factor: GDP The parameters were transformed in the following way: For those parameters that were constrained to an interval of the form (c,d), the parameter, x, where x represents any parameter, was transformed using the transformation x−c ), (4.6) x0 = log( d−x while the parameters that were constrained to the positive real numbers were transformed via the log transformation. Tables 10 and 11 report 95% credible intervals for the priors that were used in the study. In practice, the priors were independently defined over the transformed parameter space with means at the transformed value of the calibrated value for each parameter. For example, in the “fundamental” model, b is calibrated to be equal to 0.64. The transformed value, using the formula in (4.6) is equal 0.64 to log( 1−0.64 ) = 0.5754. The prior for the transformed value of b was then defined to be Normal with mean equal to 0.5754 and with a prior variance so as to get the 95% prior coverage intervals reported in Tables 10 and 11. That is, a 95% credible set was calculated for the transformed space and then transformed into the credible sets reported in Tables 10 and 11. This method was also used for the other parameters as well. Prior mean

95% prior coverage interval

b

0.64

[0.4857, 0.7699]

δ

0.025

[0.0128,0.0482]

ρ

0.99

[0.9616,0.9975]

θ

0.95

[0.8603,0.9832]

A

2.86

ση2

4.9 × 10

[1.9171,4.2667] −5

[3.13 × 10−5 , 1.26 × 10−4 ]

Table 10: Prior for “fundamental” model The prior variances were chosen to reflect a reasonably large degree of uncertainty over the values of the parameters. However they were chosen so that the 95% prior coverage intervals lay in a region of the parameter space that were not too unreasonable with respect to the literature. For example, the 95% prior coverage interval for b, labor’s share of income, was set to be equal to [0.4857, 0.7699]. Authors have suggested a variety of values for b. DeJong, Ingram, and Whiteman (1996) calibrate b to be 0.54 and use a prior for b of [0.48, 0.68]. Hansen (1985) and Kydland and Prescott (1982) calibrate b to be 0.64. However, depending on how capital is defined and measured, other authors have suggested higher values for b. For example, Prescott (1986) suggests a value for b of 0.75 while Christiano (1988) suggests a value of 0.66. All these values lie in the 95% prior coverage interval for b. The prior for δ, the depreciation rate for capital, covers a range of about 5% per annum to about 19% per annum with a prior mean at 10% per annum. Most studies calibrate δ to be 0.025. The prior for ρ, the time discount factor, has a 95% prior coverage interval of [0.9616,0.9975]. This implies an economy with a real interest rate ranging from 1% to 21%. The 95% prior coverage interval for the AR(1) parameter θ is [0.8603,0.9832]. This implies a wide range of persistence in the productivity shock process. Hansen (1985) calibrates θ to be 0.95 while DeJong, Ingram, and Whiteman (1996)) use a value of 0.90. A recent paper by Hansen (1997)) finds evidence to suggest that the value of θ is lower than

32

0.95 and closer to 0.90. Finally, the 95% prior coverage interval for ση2 is [3.13 × 10−5 , 1.26 × 10−4 ]. This implies that the variation in real GDP in this model economy ranges from 2.2% per annum to 4.5% per annum. The priors for the “sunspot” model are given in Table 11. In the “sunspot” model case, there is one extra free parameter. That is λ, the monopoly power parameter. Its value was calibrated to be 0.58. Farmer and Guo (1994) used a study by Domowitz, Hubbard, and Peterson (1988)) to calibrate the value of λ. They report a range of values that λ can take and the 95% prior coverage interval shown in Table 11 reflects this. One thing to note in the “sunspot” model is that there are monopoly profits present. Farmer and Guo (1994) calibrate the level of monopoly profits to be equal to 7% of national income. This implies that labor’s and capital’s income shares, for the “sunspot” model, sum to 0.93. Finally, for the purposes of this study, the elasticity γ is set equal to zero for both models and monopoly profits are always set equal to 7% of national income in the “sunspot” model. The same priors were used for all data sets. Prior mean

95% prior coverage interval

b

0.70

[0.5535, 0.8145]

δ

0.025

[0.0128,0.0482]

ρ

0.99

[0.9616,0.9975]

A

2.86

σV2 λ

4.0 × 10

[1.9171,4.2667] −6

[1.55 × 10−6 , 1.03 × 10−5 ]

0.58

[0.4934,0.6620]

Table 11: Prior for “sunspot” model Figures 6 through 17 contain pictures of the prior distributions for each of the structural parameters of each model. These were obtained by calculating the properly normalized prior density in the transformed space and using the appropriate Jacobian transformation to calculate the properly normalized density for each parameter in the original space. The priors for the “sunspot” model were constructed the same way as described for the “fundamental” model. However, not all possible combinations of parameters that make up the prior will lead to a sunspot equilibrium. To have a sunspot equilibrium, all of the roots of the matrix defined in (4.3) have to lie outside the unit circle. Hence the prior for the sunspot model is re-normalized by dividing through by the probability, psun , that a random draw from Θs leads to a model that exhibits a sunspot equilibrium. That is p(θs ) = pb,s (b)pδ,s (δ) . . . pλ,s (λ)pθi,s (θi,s )p−1 (4.7) sun . The probability psun was calculated by making M draws from the prior for θs and counting the number, n, of times the resulting parameter value led to a model that had a “sunspot” equilibrium. This entailed checking to see if all of the eigenvalues of the matrix J of (4.3) all had modulus greater than one. Then psun was set equal to n/M . ˆ 0, The initial parameters for each model were the same as for the calibrated case. That is, θi,f = K 0 ˆ ˆ and θi,s = (K−1 , C−1 ) . The same prior distributions as those reported in Table 2 were used for this example as well. The procedure to compare the two models with prior uncertainty over the structural parameters was the same as for the previous case of calibration. For each data set respectively, the likelihood function for each model was constructed by calculating the values of the stochastic components of the models. These values were used to construct the likelihood as described in Section 4.2.

33

Then, given p(X T |θs,f , θi,f ) and p(θf ), the posterior distribution for the “fundamental” model is p(θf |X T ) ∝ p(θf )p(X T |θf ).

(4.8)

where p(X T |θf ) is defined in (4.14). Likewise, the posterior distribution for the “sunspot” model is p(θs |X T ) ∝ p(θs )p(X T |θs ).

(4.9)

where p(X T |θs ) is defined in (4.16). In order to calculate the marginal likelihood for the two models, and hence use the Bayesian model comparison techniques described in Section 3.2 to compare them, draws from their respective posterior distributions need to be made. As in the calibrated case, a random walk Metropolis-Hastings algorithm was used to do this. Let p(θk |X T ) be either of the posterior distributions that draws are to be made from. Let R be a matrix of the same order of θk . The matrix R is the variance matrix of the source density from which the random step in the Metropolis-Hastings algorithm is drawn from. The source density for the algorithms that were used to obtain the results presented below was a multivariate Normal distribution with mean 0 and variance-covariance matrix R. The Metropolis-Hastings algorithm was as follows: (0)

• given θk drawn from the prior • for m=1,M • draw x ∼ N (0, R) (m−1)

• let y=θk (m)

• let θk =

  

(0)

+x y

with prob. min

(m−1)

θk

(M )

• return {θk , . . . , θk





p(y|X T ) ,1 (m−1) p(θk |X T )

else }

The value of R was chosen so as to tune the algorithm. Various rules were used to tune the algorithm. The value of R was first chosen so as to have the proportion of acceptances over the whole algorithm to be between 0.3 and 0.5. The overall aim is to efficiently draw from the posterior. The rule that was used to determine that the algorithm was tuned was to check whether the computed numerical (0) (M ) standard errors of the posterior means of the drawings {θk , . . . , θk } was less than 10% of the reported posterior standard deviations. All these quantities can be calculated using the software MOMENT from the BACC6 software package. Other tests were made also. One test was to take two different draws from the posterior, starting at different random draws from the prior distribution, and check to see if the resulting posterior moments were statistically similar. The routine APM from the BACC software package was used for this test. Tables 19 and 20 at the end of this section report the posterior means and standard deviations for the two models. In all cases, the moments were calculated using MOMENT and M was set equal to 50,000. Using the program CONVERGE of the BACC package and by inspecting plots of the parameter values that were drawn from the distributions, it was apparent that all of the algorithms had converged to their invariant distribution, the posterior distribution of each model respectively, after at most 5,000 draws. Therefore, the results reported are for the last 45,000 draws for each Metropolis-Hastings algorithm. In all cases, the numerical standard error for each of the moments were less than 10% of the reported posterior standard deviations and so were not reported. 6 http://www.econ.umn.edu/∼bacc

34

The posterior and prior distributions for each of the structural parameters of the two models can be found in the Figures at the end of this section. Once the drawings from the posterior distributions were made, the log marginal likelihoods for each model, and each set of observations, were calculated using the program MLIKE from BACC. The log marginal likelihoods and the resulting log Bayes factors in favor of the “fundamental” model over the “sunspot” model can be found in Tables 12 and 13. Consumption

Hours

Investment

Productivity

Output

“fundamental”

442.6225 (0.0711)

493.5325 (0.0853)

226.5215 (0.0776)

410.6508 (0.0721)

441.7111 (0.0946)

“sunspot”

436.4448 (0.0080)

514.6317 (0.0580)

253.2549 (0.1131)

418.7906 (0.1124)

466.3343 (0.1124)

Table 12: Log marginal likelihoods: Non-dogmatic prior

Consumption

Hours

Investment

Productivity

Output

6.1777 (0.0715)

-21.0992 (0.1037)

-26.7334 (0.1335)

-8.1398 (0.1360)

-24.6232 (0.1469)

Table 13: Log Bayes factor in favor of “fundamental” model over “sunspot” model: Non-dogmatic prior Note that, in Table 13, the log Bayes factors favor the “sunspot” model in all cases except when consumption is used to construct the likelihood function. This result is in contrast to the results from the case when the structural parameters were calibrated. In the course of calculating the marginal likelihoods that were reported in Tables 12, draws from the posterior distribution were made. Therefore posterior moments can be obtained for each model. These are reported in Tables 19 and 20. Also, using the draws from the respective posterior distributions it is possible to obtain plots of the posterior distribution for each parameter. These are reported in Figures 6 through 17 at the end of this section. All of the plots were obtained using the routine GRAPH from the BACC software package. The priors are plotted with each posterior for comparison. In each plot there is a separate graph for each of the data sets used. The first point to note is that for some of the parameters, the data seems to be adding information to the posterior while for others the data seems to be adding little with respect to the prior. The best example of the second case can be found in the plots of the parameter A for both models (Figures 10 and 15). Here it is clear that the posterior and the prior are very similar for both models and for all variables. However for other parameters there is evidence that the data is influencing the posterior. Consider labor’s share of income, b. For the “fundamental” model, the posterior distribution is shifted to the right of the prior. In the case of consumption and productivity, the posterior means are considerably bigger than the prior mean. This would suggest that the data implies that labor’s share of income is higher than that calibrated. However, the posterior means of b are varied across the data sets. For the “sunspot” model there is again evidence that the data is influencing the posterior. Also, the posterior means are greater than the prior means , although not as significantly as those for the “fundamental” model. Again, this would suggest that labor’s share of income for the “sunspot” model is higher than the calibrated value. For the depreciation rate of capital, δ, there are some major differences across models. For the “fundamental” model the posterior and the prior are similar as evidenced in Figure 7. Here there is more posterior weight placed on higher values of δ than in the prior. For the “sunspot” model there

35

is a significant difference between the prior and the posterior. In all but the case when consumption is used to construct the likelihood function, the posterior distribution for δ has a mean that is significantly higher than the mean of the prior distribution. This suggests that the depreciation rate of capital is higher in the “sunspot” model. Also, the variance of the posterior distribution is higher than in the prior. More weight is put on both high and low values than in the prior. Figures 8 and 14 contain the posterior and prior distributions for the parameter ρ, the time discount factor. One aspect that should be noted is that for the “fundamental” model the posterior distribution puts more weight on higher values of ρ in relation to the prior and less weight on lower values of the prior. The posterior and prior means, however, are not too dissimilar. This is the case for all variables that are used to construct the likelihood function except for hours. When hours are used, the posterior distribution for ρ puts more weight on both low and high values of ρ. It should be noted that the domain for ρ is constrained away from 1. For the “sunspot” model, the posterior distributions for ρ tend to be more diffuse than the prior. This is most marked when productivity is used to construct the likelihood function. However, not all posteriors for ρ are more diffuse than the prior. When consumption is used, the posterior distribution favors higher values of ρ rather than lower values. The results from the two models indicate that the intertemporal discount factor for these models is hard to pin down. The information from the data does not do a good job of pinning down a value for ρ. In those cases where the data seemed to provide some information, the posterior distribution favored higher values of ρ but there were enough cases when this did not hold to be confident that the data suggest that the value of ρ used in the literature is too low. Another parameter for which there is a mixture of results is the AR(1) parameter on the shock process for the “fundamental” model. Figure 9 contains the posterior and prior distributions for θ. For three of the data sets, the posterior distributions are much more diffuse than the prior and suggest that the value for θ is lower than the prior mean of 0.95. However, for two of the data sets, there is evidence to suggest that the value for θ should be higher than 0.95. The two variables that lead to a posterior distribution for the “fundamental” model that has a mean higher than 0.95 are consumption and productivity. These were the series whose prior for ση2 was significantly different from the posterior. The domain of θ was restricted to (0,1) which means that a unit root process for the shock process was ruled out. It may be the case that for these variables that restriction is not valid. The results for θ are mixed, but certainly do suggest, at least for some variables that value of θ should be lower that the value 0.95 that is used by most calibrators. This result is similar to that found in Hansen (1997). In the previous section, it was found that the value of the variance of the shock process of the models made a difference to the outcome of the comparison. Figures 11 and 16 contain the posterior and prior distributions for the variance terms to each models stochastic variables. 
For both the “fundamental” and “sunspot” model, the posterior distribution for the variances ση2 and σV2 respectively are significantly different from the prior for the variables consumption and productivity. In the calibration case, the Bayes factor was strongly in favor of the variances to be larger for those sequences. For both of those variables, the location of the posterior is significantly higher than the location of the prior. When consumption is used to construct the likelihood function, the posterior mean is approximately four posterior standard deviations from the prior mean for the “fundamental” model and is approximately 9 posterior standard deviations for the “sunspot” model. For productivity the numbers are approximately 4.5 and 9.3 respectively. For the “fundamental” model, the two data sets that yield high values for ση2 also yield high values for θ. For consumption and productivity the stochastic variable has higher persistence and variability than what is used in the literature. For the other data sets, the stochastic component had lower persistence than that used in the literature but had a variance that is comparable to that used in

36

the literature. The results presented above again show the need to be careful when comparing models with fewer shocks than variables. Because there are less shocks than variables, only a subset of the variables that make up a model can be used in the comparison each time. There is therefore a need to be careful with calibrating the model for each subset of variables that are used. Given the log marginal likelihoods already computed it is possible to calculate log Bayes factors in favor of placing prior uncertainty over the structural parameters of the model. Table 14 contains these results. The results are mixed. There is evidence to suggest that prior uncertainty should be placed over the structural parameters but this result is not consistent across the data sets or even across models. Consumption

Hours

Investment

Productivity

Output

fundamental

223.8158 (0.0711)

-11.3219 (0.0863)

-17.5217 (0.0735)

316.2839 (0.0785)

-17.2168 (0.0952)

Sunspot

-16.6117 (0.0145)

9.4596 (0.0599)

11.1966 (0.1137)

-9.6510 (0.1130)

1.5124 (0.1129)

Table 14: Log Bayes factor in favor of Non-Dogmatic prior over Dogmatic prior (calibrated case) The fact that the marginal-likelihood for the calibrated case is larger than the non-calibrated case is not that surprising. The marginal-likelihood is a weighted average of the likelihood function where the weights are given by the prior. If the calibrated values are picked to be in an area where the likelihood density is high and the prior is reasonably diffuse then the marginal-likelihood can easily come out in favor of the calibrated case. In this example, the prior is centered at the calibrated values. While the marginal-likelihood is lower for the non-calibrated prior for some of the cases it is apparent that the posterior distribution of the structural parameters are different from the prior distribution. For all data sets, the posterior distributions for some of the parameters are significantly different than the prior. This is evidence that information from the data is suggesting that the calibrated values are not correct. Also, since the results of the comparison differ when prior uncertainty is allowed over the structural parameters, it is clear that whether to allow for uncertainty over the structural parameters is an important aspect of the decision process. In the RBC literature, it is not the value of the parameters that is the most important but rather the implications implied buy the model that is important.

4.7

Comparison of an RBC model with a time series AR(1) model

The preceding sections have compared two models that are found in the RBC literature. The purpose of this section is to compare the two RBC models described in Section 4.1 with a very simple model that is not a model that is found in the RBC literature. Real Business Cycle models aim to account for why there are fluctuations in the observed data. If the ultimate use of a model is to run policy experiments then it is necessary to be able to quantify how good the model is with respect to competing models in the same literature. It is also important to be able to quantify the relative performance of model with other models that are not in the literature. This section aims to do just that. The Bayesian comparison methods that were described in Section 3.2 are able to compare nonnested models. Therefore, it is a simple problem to compare two models that are very different in nature but nonetheless aim to model the same data. A simple model will be used to compare with the two RBC models described in Section 4.1. For what follows, the simple model will be called the “AR1” model. A description of this model follows: Let yt be a vector of observations for period t. Then yt = t

37

(4.10)

where t = φt−1 + ut

(4.11)

and ut |t−1 is assumed to be independently and identically distributed with a Normal distribution with mean 0 and variance h−1 . Also, t is assumed to be stationary. Note that the observations that will be used in the comparison have been passed through the Hodrick-Prescott filter described in Section 4.3 and so have a mean equal to zero. This model is the model UVR3 of the BACC project7 , and the software provided there was used in this analysis. Suppose that the observations Y T = {yt }Tt=0 are to be used in the comparison. Define i.i.d. h−1 yt∗ = yt − φyt−1 for t=1. . .,T. Then yt∗ ∼ N (0, h−1 ). The unconditional distribution of y0 is N (0, 1−φ 2) so that the data density for the model described in (4.10) is p(Y T |φ, h)

= .

(2π)−(T +1)/2 h(T +1)/2 (1 − φ2 )1/2 T X 1 exp{− h[(1 − φ2 )y02 + yt2 }. 2 t=1

(4.12)

The priors for the parameters, φ and h were defined as follows. s2 h ∼ χ2 (ν) and φ ∼ N (φ, h−1 φ ) subject to φ < 1. A reasonably diffuse prior was place over the parameter φ. The prior mean was set to be 0.8 while the prior variance was set equal to 0.01. This implies that the standard deviation for the prior is 0.1. A value of 0.8 for φ implies a moderately persistent error process for t . Note that various values for φ and h−1 φ were tried. The results were not sensitive to the value of the prior for φ. The prior for h was constructed using a notional sample. The data that was used for the study was in log form so it was convenient to think of the error t as a percentage deviation. A notional sample consisting of 10 P10 observations for t was constructed. The value of t=1 2t for this notional sample was 0.0023. Therefore, s2 was set to be 0.0023 and ν was set to be 10. The priors for this study are summarized in Table 15. φ

h−1 φ

s2

ν

0.8

0.01

0.0023

10

Table 15: Prior specification for AR1 model Using the software UVR3 and MLIKEUVR3 from the BACC project, posterior moments and marginal likelihoods for the “AR1” model were calculated. They were calculated each data set used in the comparison of the models described in Section 4.1. Table 16 contains the posterior moments for the “AR1” model while Table 17 contains the marginal likelihoods for the “AR1” model. Given the marginal likelihoods reported in Table 17 it is possible to calculate Bayes factors between both the “fundamental” model and the “sunspot” model with the “AR1” model. These are reported in Table 18. It is clear from Table 18 that the “AR1” model is strongly favored over both of the models. In fact, in order for the posterior odds ratio to be just in favor of any of the two models from the RBC literature, there would have to be at least 1.679×10116 ( =exp(267.6182)) more prior weight placed on the RBC models. In effect, zero prior weight would need to be placed on the “‘AR1” model. This is a 7 http://www.econ.umn.edu/∼bacc/models.html

38

Consumption

Hours

Investment

Productivity

Output

φ

0.7546 (0.0514)

0.7212 (0.0524)

0.7945 (0.0451)

0.7264 (0.0550)

0.7724 (0.0489)

h−1

1.18×10−4 (1.29 × 10−5 )

1.33 × 10−4 (1.45 × 10−5 )

2.16 × 10−3 (2.39 × 10−4 )

1.64 × 10−4 (1.84 × 10−5 )

1.38 × 10−4 (1.51 × 10−5 )

Table 16: Posterior moments for AR1 model Consumption

Hours

Investment

Productivity

Output

757.9801 (0.0250)

815.4367 (0.0372)

520.8731 (0.0344)

731.1842 (0.0384)

745.2873 (0.0301)

Table 17: Marginal likelihoods for AR1 model strong result. The two models from the RBC literature were developed with the aim of trying to model the data that was used in the comparison. The “AR1” has no “economic content” but clearly is favored. This suggests, at least for the two RBC models that were compared, that more work is needed in the development of the models. Careful consideration of the above results should be taken into account when analyzing inferences that are obtained from using the models described in Section 4.1.

Consumption

Hours

Investment

Productivity

Output

“fundamental” calibrated case

539.1734 (0.0252)

310.5823 (0.0393)

276.8299 (0.0367)

636.8173 (0.0402)

286.3594 (0.0322)

“sunspot” calibrated case

304.9236 (0.0278)

310.2646 (0.0401)

278.8148 (0.0365)

302.7426 (0.0403)

280.4654 (0.0319)

“fundamental” full prior

315.3576 (0.0754)

321.9042 (0.0930)

294.3516 (0.0799)

320.5334 (0.0866)

303.5762 (0.0993)

“sunspot” full prior

321.5353 (0.0262)

300.8050 (0.0689)

267.6182 (0.1182)

312.3936 (0.1188)

278.9530 (0.1164)

Table 18: Log Bayes factor in favor of the AR1 model

39

Figure 6: Posterior and Prior Distribution for b: “fundamental”

Figure 7: Posterior and Prior Distribution for δ: “fundamental”

Figure 8: Posterior and Prior Distribution for ρ: “fundamental”

Figure 9: Posterior and Prior Distribution for θ: “fundamental”

Figure 10: Posterior and Prior Distribution for A: “fundamental”

Figure 11: Posterior and Prior Distribution for ση2 : “fundamental”

Figure 12: Posterior and Prior Distribution for b: “sunspot”

Figure 13: Posterior and Prior Distribution for δ: “sunspot”

Figure 14: Posterior and Prior Distribution for ρ: “sunspot”

Figure 15: Posterior and Prior Distribution for A: “sunspot”

Figure 16: Posterior and Prior Distribution for σV2 : “sunspot”

Figure 17: Posterior and Prior Distribution for λ: “sunspot”

40

41 2.7947 (0.5709) 3.79 × 10−5 (1.42 × 10−5 )

0.9910 (0.0066)

0.9778 (0.0107)

2.7752 (0.5548)

2.31 × 10−4 (4.49 × 10−5 )

ρ

θ

A

ση2

3.58 × 10−5 (1.53 × 10−5 )

2.7867 (0.5513)

0.8949 (0.0447)

0.9898 (0.0077)

0.0278 (0.0085)

0.6851 (0.0559)

Investment

3.09 × 10−4 (5.90 × 10−5 )

2.8189 (0.5436)

0.9793 (0.0106)

0.9902 (0.0081)

0.0262 (0.0088)

0.8078 (0.0454)

Productivity

2.85 × 10−5 (7.70 × 10−6 )

2.7747 (0.5665)

0.8788 (0.0505)

0.9912 (0.0066)

0.0242 (0.0085)

0.6646 (0.0713)

Output

2.86 (0.5720)

0.95 (0.0268)

0.99 (0.0068)

0.025 (0.0083)

0.64 (0.0729)

Prior mean

4.9 × 10−5 (2.32 × 10−5 )

Table 19: Posterior moments for “fundamental” model

0.8506 (0.0527)

0.9829 (0.0103)

0.0275 (0.0101)

0.02998 (0.0083)

δ

0.7188 (0.0661)

0.7827 (0.0018)

Hours

b

Consumption

[3.13 × 10−5 , 1.26 × 10−4 ]

[1.9171, 4.2667]

[0.8603, 0.9832]

[0.9616, 0.9975]

[0.0128, 0.0482]

[0.4857, 0.7699]

95% prior credible set

42 2.40 × 10−6 (1.42 × 10−6 )

2.7877 (0.5814)

1.14 × 10−4 (1.16 × 10−5 )

0.5538 (0.0404)

A

σV2

λ

0.6041 (0.0262)

0.9838 (0.0095)

0.5949 (0.0288)

3.35 × 10−6 (9.63 × 10−7 )

2.8221 (0.5655)

0.9861 (0.0094)

0.0453 (0.0085)

0.7582 (0.0267)

Investment

0.6486 (0.0339)

1.30 × 10−4 (1.43 × 10−5 )

2.7749 (0.5916)

0.9709 (0.0207)

0.0389 (0.0152)

0.7275 (0.0198)

Productivity

0.6171 (0.0263)

2.96 × 10−6 (1.16 × 10−6 )

2.7849 (0.5641)

0.9853 (0.0088)

0.0310 (0.0089)

0.7254 (0.0268)

Output

Table 20: Posterior moments for “Sunspot” model

2.8207 (0.5734)

0.9931 (0.0057)

ρ

0.0311 (0.0091)

0.0213 (0.0074)

δ

Hours 0.7028 (0.0261)

0.7688 (0.0418)

Consumption

b

0.58 (0.0425)

4.0 × 10−6 (1.89 × 10−6 )

2.86 (0.5720)

0.99 (0.0068)

0.025 (0.0083)

0.70 (0.0547)

Prior mean

[0.4934, 0.6620]

[1.55 × 10−6 , 1.03 × 10−5 ]

[1.9171, 4.2667]

[0.9616, 0.9975]

[0.0128, 0.0482]

[0.5535, 0.8145]

95% credible set

5

Conclusion

It was asserted earlier that in order to use a model to answer questions posed by a researcher or to evaluate policy changes, first there needs to be a determination of how good the model is. However, in a new literature it may be unwise to reject a model because it does a poor job of predicting the observations one has. The models that are put forward may add insight into the unknown process governing the system that is observed. Gaining this insight may then lead to better models that do a better job at predicting the data. However, even if a model is assumed to do a bad job at predicting the observations, that is no reason to not evaluate a model’s performance as fully as possible. It would be hard to take any inferences obtained from a model seriously if there was no way to relate the model to competing models, both inside the literature and outside the literature. It has been argued that, while models in the RBC literature are poor at predicting the data that is observed, the models in the RBC literature do a good job of replicating some characteristics of those observations. This has been the reasoning behind using those models to try to understand the observations from the data that one has. However, one of the criticisms of the RBC literature is that the methods used to claim that the models do a good job of replicating certain facts are flawed. The important aspect of using models to answer questions is to know whether the model that is being used is superior to other models that have been used. Therefore a method that can formally and directly compare two models is preferred. In the RBC literature, there are few methods of evaluation that directly compare two or more models. The object of this paper was to put forward a method that is able to formally and directly compare two models in the RBC literature. The benefit of such a method is twofold: First, when using a model to make inferences, a formal statement of the model’s performance with respect to competing models can be made, and second, when extending old models it is possible to formally determine the relative performance with respect to the existing model both across the whole sample and across subsamples. The benefits of such a method is that is that if it possible to formally distinguish between competing models it is possible to formally distinguish between their competing inferences. Because the method is Bayesian it is likelihood based and satisfies the Likelihood Principle. This implies that all results are based solely on information from the observations from the system through the likelihood function and any prior information before the sample is obtained. Because the method is likelihood based, a likelihood function needed to be constructed for models of the type found in the RBC literature. Because of the complexity of the models found in the RBC literature, it is necessary to approximate the solution of the model. Once that is done it is possible to write the model in a state-space representation that has all of the variables of the model represented as functions of the states of the model. This representation is used to calculate the likelihood function. In particular, conditional on initial conditions, observations on some or all of the variables of the model can be used to determine values for the stochastic state variables of the model. The number of variables used is limited to the number of stochastic state variables. 
The process is an iterative process whereby once all of the values of the stochastic state variables are known for one period, it is possible to calculate the values of all of the variables of the model in the current period and also the values of the non-stochastic state variables for next period. Then, next period’s observations are used to calculate next period’s values of the stochastic state variables and so on. The calculated values of the stochastic components of the models are then used to calculate the likelihood function for the model. Once the likelihood function is calculated, Bayesian model comparison methods are available. It was shown that these methods compare models with respect to their relative out-of-sample prediction performance. It was also shown that is a simple step to allow for uncertainty over the structural

43

parameters of the models being compared. In fact, the current practice of calibration is just a special case of the more general case. It was also shown that it is possible to compare models across sub-samples of the data as well as across the whole sample. The method that was developed was applied to the case of comparing two models from the paper of Farmer and Guo (1994). The two models were variants of the same one-sector stochastic growth model. The first variant was a model that had shocks that affected the “fundamental” components of the model. The second variant, had an increasing returns to scale technology that allowed for a model that produced cycles using shocks that were unrelated to the “fundamental” components of the model. When the structural parameters were calibrated, the outcome of the comparison was mixed. It was apparent that the value of the variance of the respective shock terms mattered with respect to the outcome of the comparison. When the values of the variances of the shock processes were recalibrated, there was a dramatic change in the result for two of the five data sets that were used to calibrate the models. However, the new method of calibration was preferred for only those two data sets. It was clear that there was uncertainty as to the correct way to calibrate the variances of the shock processes. This is evidence of the need for the ability to allow for uncertainty over the parameters when comparing models. The Bayes factor was decomposed across the whole data set. The result of this was that is was clear that there were periods where each model was ascendant. No model was consistently better across the whole data set. This result could not have been found using comparisons across the whole data set. It seemed that the “sunspot” model was preferred when the data was relatively volatile while the “fundamental” model did better in periods of lower volatility. The two models were also compared when there was uncertainty allowed over the structural parameters of the model. The posterior distributions for some of the parameters were significantly different from their prior distribution. The overall results of the comparison were more in favor of the “sunspot” model than in the calibration case. These results add weight to the claims of Farmer and Guo (1994) that a “sunspot” model does as good a job at replicating business cycles as the commonly used “fundamental” model. This result has an implication for policy as in a “sunspot” model there is room for a welfare increasing policy that aims to smooth cycles. This is not the case in the “fundamental” model. The nature of the RBC literature is that the structural parameters of the models are calibrated. In a Bayesian context, by calibrating the parameters, prior knowledge with regard to the values of the structural parameters is used in the decision making process. In the RBC literature, the use of prior information is extreme in that the prior distributions for the parameters are degenerate at the calibrated values. What the results above show is that; first, it is easy to allow for uncertainty over the values of the structural parameters; and second, allowing for prior uncertainty can make a difference to the result of the comparison. If the aim is to increase understanding about a process then it would seem that prior uncertainty should be allowed for when comparing models. A third comparison was made between the two RBC models with a very simple non-RBC model. 
This highlights the fact that the method that was developed is not only able to compare two similar models. The comparison between the RBC models and the non-RBC model yielded interesting results. It was found that the non-RBC model was strongly favored in all cases. This result has implications with regard to inferences that can be drawn from the two models in particular and RBC models in general. It would seem that a very simple model with little economic content performs much better than the more complicated models with a high degree of economic content that are described in Section 4.1. For example, if a RBC model is to be used in making inferences about the implications of different government policies then a evaluation of the performance of that model with other competing models is important. The result that the simple “AR1” model strongly outperforms the two RBC models places

44

the inferences that can be made from them into context. There clearly is room for a non-RBC model with more economic content than the “AR1” model to be developed and used for inference. The result also shows there is room for improvement within the class of RBC models as well. While the application in this paper is model comparison, the Bayesian approach can be extended easily to any other application of the literature. Once a model has been chosen from amongst a set of models, the model is then used for inference. Prior uncertainty over the structural parameters of the model can be applied as easily as for the model comparison case. Using a Metropolis-Hastings algorithm, draws can be made from the posterior distribution of the model, and for each of these draws functions of interest can be calculated. For example, one function of interest may be the impulse response function implied by the model at the specific values of the parameters that were drawn. Then it is possible to construct posterior moments for the function of interest just as was done for the parameters of the two models studied in Section 4.1. The method described in this paper is, therefore, not restricted to the problem of model comparison. All problems now studied in the RBC literature can be handled using the methods described in this paper and there is now no reason not to use information from the data to infer values of the structural parameters to be used for model inference.

References Anderson, E., L. Hansen, E. McGratten, and T. Sargent (1996): “Mechanics of Forming and Estimating Dynamic Linear Economies,” in Handbook of Computational Economics, ed. by H. Aumman, D. Kendrick, and J. Rust, vol. I. Elsevier Science B.V. Benhabib, J., and R. Farmer (1994): “Indeterminacy and Increasing Returns,” Journal of Economic Theory, 25. Berger, J., and R. Wolpert (1988): The Likelihood Principle. Hayward, California: Institute of Mathematical Statistics, 2nd edn. Canova, F., and E. Ortega (1995): “Testing Calibrated General Equilibrium Models,” unpublished manuscript. Chib, S., and E. Greenberg (1995): “Understanding the Metropolis-Hastings Algorithm,” The American Statistician, 49. Christiano, L. (1988): “Why Does Inventory Investment Fluctuate So Much?,” Journal of Monetary Economics, 28. Christiano, L., and M. Eichenbaum (1992): “Current Business Cycle Theories and Aggregate Labor Market Fluctuations,” American Economic Review, 82. Danthine, J., and J. Donaldson (1993): “Methodological and empirical issues in real business cycle theory,” European Economic Review, 37. DeJong, D., B. Ingram, and C. Whiteman (1996): “A Bayesian Approach to Calibration,” Journal of Business and Economic Statistics. (1997): “A Bayesian Approach to Dynamic Macroeconomics,” Working paper, University of Iowa. Diebold, F., L. Ohanian, and J. Berkowitz (1994): “Dynamic Equilibrium Economies: A Framework for Comparing Models and Data,” unpublished manuscript, University of Pennsylvania.

45

Domowitz, I., G. Hubbard, and B. Peterson (1988): “Market structure and cyclical fluctuations in U.S. manufacturing,” Review of Economics and Statistics, 70. Farmer, R. (1993): The Macroeconomics of Self-Fulfilling Prophecies. M.I.T. Press, Cambridge, Massachsetts. Farmer, R., and J.-T. Guo (1994): “Real Business Cycles and the Animal Spirits Hypothesis,” Journal of Economic Theory, 63. Gelfand, A., and D. Dey (1994): “Bayesian Model Choice: Asymptotics and Exact Calculations,” Journal of the Royal Statistical Society Series B, 56. Geweke, J. (1998): “Using Simulation Methods for Bayesian Econometric Models: Inference, Development and Communication,” Econometric Reviews, forthcoming. Geweke, J., and S. Chib (1998): “Bayesian Analysis Computation and Communication (BACC): A resource for Investigators, Clients and Students,” University of Minnesota working paper. Hall, R. (1990): “Invariance properties of Solow’s productivity residual,” in Growth-ProductivityUnemployment. M.I.T. Press, Cambridge,Massachusetts. Hansen, G. (1985): “Indivisible Labor and the Business Cycle,” Journal of Monetary Economics, 16. (1997): “Technical progress and aggregate fluctuations,” Journal of Economic Dynamics and Control, 21(6). Hansen, G., and E. Prescott (1995): “Recursive Methods for Computing Equilibria of Business Cycle Models,” in Frontiers of Business Cycle Research, ed. by T. Cooley. Princeton University Press, Princeton. Hansen, L., and J. Heckman (1996): “The Empirical Foundations of Calibration,” The Journal of Economic Perspectives, 10(1). Hansen, L., and T. Sargent (1996): “Recursive linear models of dynamic economies,” unpublished manuscript. Hodrick, R., and E. Prescott (1997): “Post-War U.S. Business Cycles: An Empirical Investigation.,” Journal of Money, Credit and Banking, 29(1). Kim, K., and A. Pagan (1995): “The Econometric Analysis of Calibrated Macroeconomic Models,” in Handbook of Applied Econometrics, ed. by H. Pesaran, and M. Wickens. Blackwell Press. Kydland, F., and E. Prescott (1982): “Time to Build and Aggregate Fluctuations,” Econometrica, 50. (1996): “The Computational Experiment: An Econometric Tool,” Journal of Economic Perspectives, 10(1). Lucas, R. (1977): “Understanding Real Business Cycles,” in Stabilization of the Domestic and International Economy, ed. by K. Brunner, and A. Meltzer. North-Holland, Amsterdam. Prescott, E. (1986): “Theory Ahead of Business Cycle Measurement,” Federal Reserve Bank of Minneapolis Quarterly Review, 10. Sims, C. (1996): “Macroeconomics and Methodology,” The Journal of Economic Perspectives, 10(1).

46

Smith, A. (1993): “Estimating Non-Linear Time Series Models using Simulated VAR’s,” Journal of Applied Econometrics, 8. Stadler, G. (1994): “Real Business Cycles,” Journal of Economic Literature, XXXII. Tierney, L. (1994): “Markov Chains for Exploring Posterior Distributions,” Annals of Statistics, 22(4). Watson, M. (1993): “Measures of Fit for Calibrated Models,” Journal of Political Economy, 101.

A A.1

Derivation of formulae Derivation of (4.3)

A first order Taylor’s series approximation to (4.1) is used to derive (4.3). That is, the following equations are linearized, Kt+1 1 Ct

=

Zt

=

=

BZtm Ktg Ctd + (1 − δ)Kt − Ct   τ g−1 d−1 m Et DZt+1 Kt+1 Ct+1 + Ct+1

(A.13)

θ Zt−1 ηt .

Consider, first, the first equation of the system given above. Kt+1



¯ + [gB K ¯ g−1 C¯ d + (1 − δ)](Kt − K) ¯ K

+

¯ g C¯ d−1 − 1](Ct − C) ¯ [dB K

+

¯ g C¯ d ](Zt − Z). ¯ [mB K

¯ from both sides and dividing through by K, ¯ the above approximation becomes Subtracting K ¯ Kt+1 − K ¯ K

≈ + + =

¯ Kt − K ¯ K ¯ Ct − C¯ C ¯ g−1 C¯ d − ] [dB K ¯ K C¯ g−1 ¯ d ¯ ¯ [mB K C ](Zt − Z) ¯ ˆ t + [dBΛ − C ]Cˆt + [mBΛ]Zˆt [gBΛ + (1 − δ)]K ¯ K ¯ g−1 C¯ d + (1 − δ)] [gB K

(A.14)

¯ g−1 C¯ d . where Λ = K Next, consider the second equation of the system given in (A.14). This equation can be written g−1 d m Et [Ct+1 ] = Ct Et [DZt+1 Kt+1 Ct+1 + τ ]

(A.15)

so that the first order Taylor series approximation to this equation is Et [Ct+1 ]



¯ g−1 C¯ d + τ ](Ct − C) ¯ C¯ + [DK

+

¯ g−1 C¯ d (Ct+1 − C) ¯ dDK

¯ g−1 C¯ d+1 (Kt+1 − K) ¯ + (g − 1)DK +

¯ g−1 C¯ d+1 (Zt+1 − Z) ¯ mDK

Subtracting C¯ from both sides of the above approximation and then dividing through by C¯ the approximation becomes   Ct+1 − C¯ Ct − C¯ Et = C¯ C¯ 47

+ + +

 Ct+1 − C¯ C¯  ¯ Kt+1 − K (g − 1)DΛEt ¯ K ¯ mDΛEt [Zt+1 − Z]. dDΛEt



Therefore the approximation to (A.14) is ˆ t+1 ] + mDΛEt [Zˆt+1 ]. −Cˆt = [dDΛ − 1]Et [Cˆt+1 ] + (g − 1)DΛEt [K

(A.16)

Finally, the approximation to the third equation of (A.14) is Zˆt+1 = θZˆt + ηˆt+1

(A.17)

The three equations, (A.15) to (A.17), can be represented as the following matrix system:   ηˆt+1  ˆ   ˆ  Kt Kt+1  E [K ˆ t+1   t ˆ t+1 ] − K  A  Cˆt  = B  Cˆt+1  + C  (A.18)   Et [Cˆt+1 ] − Cˆt+1  ˆ ˆ Zt Zt+1 Et [Zˆt+1 ] − Zˆt+1 where

gBΛ + (1 − δ) 0 A= 0 

dBΛ − −1 0

¯ C ¯ K

1 0  B = (g − 1)DΛ dDΛ − 1 0 0 

and

0  C= 0 1 

0 0 (g − 1)DΛ dDΛ − 1 0 0

 mBΛ 0 , θ  0 mDΛ  , 1  0 mDΛ  . 0

Therefore, the approximation to the system of equations in (A.14) is   ηˆt+1  ˆ   ˆ  Kt Kt+1  E [K ˆ t+1 ] − K ˆ t+1  t   Cˆt  = J  Cˆt+1  + R     Et [Cˆt+1 ] − Cˆt+1  Zˆt Zˆt+1 Et [Zˆt+1 ] − Zˆt+1

(A.19)

where J = A−1 B and R = A−1 C.

A.2

Derivation of (4.4)

The object is to calculate the first order Taylor’s series approximation to (4.4) of Section 4.2. The approximation is φ−1  A C¯ A 1 ¯ ¯ Lt ≈ L + φ ¯α ¯ α (Ct − C) b K b K  φ−1  φ−1 A C¯ A −C¯ A C¯ A −C¯ ¯ + φ α (K − K) + φ t α α+1 α ¯ ¯ ¯ ¯ α (Zt − 1). b K b K b K b K 48

¯ from both sides and then dividing through by L, ¯ the above approximation becomes Subtracting L  ¯α  ¯α b K A 1 A C¯ b K ˆ ¯ ¯ Lt ≈ φ (Ct − C) − αφ α ¯ ¯ ¯ ¯ α+1 (Kt − K) A C b K A C b K  ¯α A −C¯ b K − φ ¯ ¯ α (Zt − 1) A C b K ˆ t − φZˆt . = φCˆt − αφK (A.20)

49