4 Structural uncertainty in the DICE model

Golub, Alexander; Markandya, Anil, Dec 05, 2008, Modeling Environment-Improving Technological Innovations under Uncertainty Routledge, Hoboken, ISBN: 9780203886465


AJ A. Bostian and Alexander Golub

1 Introduction

The need to incorporate the implications of uncertainty into climate policy analysis has been discussed extensively.1 One motivation for this discussion is the uncertain nature of the climate–economy interaction. For example, scientists hold varied opinions on the degree of climate sensitivity and the concentration threshold for dangerous anthropogenic interference. A question that has rarely been asked in this context is the extent to which uncertainty in the climate and economy will persist even if these scientific questions are ultimately answered, and what the policy response to this circumstance should be.

This chapter addresses structural uncertainty in the climate–economy system. We use this term to contrast our meaning of “uncertainty” with the scenario (or parameter) uncertainty that pervades the climate-change literature. Parameter uncertainty is a statement primarily about the researcher’s knowledge. In principle, it is possible to provide some resolution to it by applying analytical methods to empirical data (e.g. through econometrics). In contrast, structural uncertainty describes an aspect of the climate or economy that is inherently unknowable from the perspective of the agents being studied. The key difference is that the agents being modeled (as opposed to only the researchers) recognize the risks generated by structural uncertainties, and this recognition generates an endogenous response. Indeed, risk and risk attitudes have proved to be important elements in economic decision making, and so failing to consider their implications could be considered a source of specification error.2

One result of incorporating structural uncertainty into a forecasting model is an altered view of future outcomes. A good example of this difference can be found by comparing the neoclassical macroeconomic model of Brock and Mirman (1972) to the earlier growth models of Cass (1965) and Koopmans (1965).
The primary difference is that the Brock–Mirman framework contains a structural shock to factor productivity, and the model’s agents take this future uncertainty into account when formulating their current decisions. With this change, the curvature of the agents’ utility function has a
different interpretation: it becomes a coefficient of risk aversion, whereas it was merely the marginal benefit of consumption in the Cass–Koopmans framework. The salient difference between these models’ forecasts is that trajectories are stochastic processes in the Brock–Mirman framework, and so future outcomes can be characterized only up to a probability distribution and not by a precise number.

This example from macroeconomics was selected to motivate our model of the welfare impacts of climate change in the presence of structural uncertainty, which is an extension of the Dynamic Integrated Climate Economy (DICE) model of Nordhaus (1994) and Nordhaus and Boyer (2000). DICE is appealing from an economic perspective because it is a general-equilibrium choice model, and all outcomes are therefore the direct result of endogenous, incentivized decisions made by the model’s agents. The economic side of DICE uses the Cass–Koopmans framework, and so its agents face no future uncertainty about the climate or economy and can forecast perfectly. And, because no uncertainty exists within the model, there is no need for its agents to respond to risks. Omitting structural uncertainty greatly aids the tractability of the solution and can allow one to more easily understand the intuition underlying an obscure equilibrium definition. However, beginning with the Brock–Mirman model, macroeconomics has tended to move away from this restriction because the resulting predictions do not match the observed dynamics of the macroeconomy very well.

Both scenario and structural uncertainty have been previously incorporated into the DICE framework, the former more so than the latter. Nordhaus and Popp (1997) conduct a Bayesian exercise with their PRICE model to derive posterior probability distributions over outcomes given prior probability distributions over DICE parameters.
The posterior distributions are interpreted as a summary of the likely future outcomes based on the current state of scientific knowledge about the climate-economy interaction. The E-DICE model of Mastrandrea and Schneider (2001) involves a damage function with a free parameter that is calibrated to match various assumptions about welfare losses arising from catastrophic climate outcomes. It is important to note that both of these involve uncertainty in parameters only, because there is no optimization by agents within the model with respect to the probability distributions the researchers are using.

In a similar spirit to this chapter, Pizer (1999) develops a Brock–Mirman-style extension to Nordhaus and Popp (1997) by adding stochastic labor productivity, a source of structural uncertainty. Importantly, he finds that “. . . uncertainty does in fact matter. Ignoring it leads to policy recommendations that are too slack” (p. 257). The effects of this slackness tend to compound over time: the disparity between the estimated welfare impacts of climate change with and without structural uncertainty increases from 1 to 50 percent over 80 years. Thus, it appears that the optimal climate policy contains a substantial risk premium that is not captured unless the structural uncertainties are explicitly modeled. However, because Pizer (1999) incorporates both parameter and structural
uncertainty simultaneously, it is difficult to decompose the outcome uncertainty into effects arising from either source. In addition, Pizer (1999) reports only the means of the posterior distributions rather than the full distributions, so it is difficult to determine how the outcome uncertainty evolves over time. Our model extends the DICE model to the real business cycle (RBC) framework initiated by Kydland and Prescott (1982). The primary difference between the RBC model and the Brock–Mirman model is the addition of an endogenous labor-supply decision. We choose this approach because of the ability of RBC to better match empirical regularities in US and international economic data. In addition, RBC is frequently used as a framework for macroeconomic policy analysis, making it easier to compare our model’s output to output from other macroeconomic studies.3 However, we wish to emphasize that we are not implementing the RBC framework here so that we can investigate the effect of business cycles upon the climate, or vice versa. Indeed, Nordhaus and Boyer (2000) note that “. . . no sensible economist would ever use the production sector in these models to consider the roles of business cycles . . .” (p. 174). We agree that the model would need to be refined significantly in order for such an analysis to be accurate, and that such a refinement is probably unnecessary for addressing the question of how structural uncertainty affects climate-economy analysis. However, the RBC framework does provide a well-established foundation for macroeconomic decision-making under structural uncertainty, and these broader uncertainty implications are what we desire to incorporate into our model. We construct two calibrations of our “RBC-DICE” model based on different assumptions about the magnitude of a technology shock (structural uncertainty) and the concavity of the utility function (risk attitudes), both of which come from the RBC literature. 
We solve both the market and socially-optimal problems up to their first-order conditions, and show how the reduction undertaken by a fictitious social planner can be analytically mapped into an optimal CO2 tax policy. We find that uncertainty in outcomes is much greater in the calibration with the larger structural uncertainty, even though risk aversion is also higher in that case. In addition, we find that the optimal CO2 tax is relatively flat over time and does not fluctuate much.

We also investigate the ability of RBC-DICE to explain empirical data using a comparative spectral analysis of some climate and economic variables of interest. This analysis is indicative of the forecasting potential of RBC-DICE, an essential characteristic for a policy-analysis model of out-of-sample future outcomes. Although this technique does not provide a direct test of predicted versus actual outcomes in levels, it does highlight the aspects of the model that perform well, and the areas in which it can be improved. We find that the model replicates some features of climate and economy data quite well, while others still pose a modeling challenge. In particular, the CO2 concentration and temperature variables exhibit interesting long-cycle variabilities that are not matched by RBC-DICE.

Prior research has shown that welfare estimates from RBC models are not very robust to the choice of solution methodology, and so researchers using such models should take care to select an accurate method. Welfare rankings that are overturned purely by approximation error are sometimes referred to as “spurious welfare reversals.” Incorporating the DICE climate modules into the RBC framework makes the solution strategy even more important, because the underlying sources of error are magnified by this additional source of nonlinearity.4 Because part of our analysis relies upon a welfare comparison between market and socially-optimal outcomes, verifying the accuracy of our solution forms an important component of this chapter. The methodology we use is constructed with a specific emphasis upon minimizing the impacts of numerical error. It consists of a mixture of computationally-intensive methods that, in conjunction, require large-scale parallelization of the algorithm. This strategy does appear to yield a satisfactorily small numerical error in the solution of the market problem but not in the solution of the social planner’s problem, as a large number of spurious welfare reversals are observed. Thus, even when the solution method is designed to mitigate spurious welfare reversals, they can still occur. This result illustrates that numerical accuracy is a relevant concern when working with integrated climate–economy models.

RBC-DICE also allows us to address the question of whether structural or scenario uncertainty is potentially more important to climate policy formulation. We examine one particular parameter of some interest: the temperature sensitivity to a doubling of the CO2 level.
From comparing the effects of a change in the temperature sensitivity to a change in the technology shock and risk attitudes, it appears that changing the structural uncertainty generates a far greater differential in forecasts than does changing the parameter uncertainty. This result suggests that resolving parameter uncertainty may not inform the policy debate that much, if a stronger structural uncertainty remains. This chapter is organized as follows. Section 2 describes the modifications made to DICE to create RBC-DICE. Because our solution strategy makes direct use of the analytical optimality conditions, we also derive the solution to the market, social planner’s, and carbon-tax problems up to the first-order conditions. To our knowledge, these optimality conditions have not been previously presented, and they provide a more concrete intuition for the climate-economy equilibrium. The solution methodology and the accuracy concerns motivating it are described in Section 3. Section 4 compares the predictions of each calibration and their potential implications for climate policy. Section 5 provides some additional discussion of the results and possible uses of the RBC-DICE framework.




2 Model description

The foundation of the RBC-DICE model is the DICE-99 model described in Nordhaus and Boyer (2000). Before proceeding to the analytical solution of the model, we first briefly motivate the individual parts of the general equilibrium. A complete set of equations can be found in Table 4.1.

The representative consumer in RBC-DICE is a price-taker in both the goods and factor markets. At time t, the population $P_t$ of consumers grows at a time-varying rate $g^P_t$, and their common discount factor $R_t$ grows at a time-varying rate $g^R_t$. Each consumer holds a personal capital stock $k_t$ and a time endowment (normalized to 1 unit) that can be divided between labor $l_t$ and leisure $e_t$. The consumer’s capital stock is rented to the firms at rate $r_t$, and the consumer’s time is hired out to firms at rate $w_t$. The capital stock is returned at the end of time t having depreciated at rate $\delta_K$. The consumer uses the factor income to purchase a quantity $y_t$ of a numeraire good which can be used for either consumption $c_t$ or investment $i_t$. Each consumer is a dynamic expected-utility maximizer with utility function $U(c_t, e_t)$ and requisite resource constraints.

The representative firm in RBC-DICE is also a price-taker in all markets. This firm rents capital $K_t$ and hires labor $L_t$ from consumers, and uses these inputs to produce a quantity $Y_t$ of the numeraire good. The production process is aided by Hicks-neutral factor productivity $A_t$, which grows at rate $g^A_t$.

The link between the climate system and the economy explicitly enters on the supply side. First, production generates CO2 as a by-product, which rises into the atmosphere. The CO2 intensity of output $\sigma_t$ grows at rate $g^{\sigma}_t$. Second, the warming of the upper biosphere causes some damage $D_t$, which translates into a fractional decrease $1 - \Omega_t$ in factor productivity. The damage is approximated by a quadratic function of the temperature increase, and the temperature must rise approximately 1.2°C from the pre-industrial level in order to initiate damage. The firm is a dynamic expected-profit maximizer with the same discount rate as consumers.

The climate system is modeled with a three-sink system of CO2 concentrations: atmosphere $M^{AT}_t$, upper biosphere $M^{UP}_t$, and lower biosphere $M^{LO}_t$. The lower-biosphere sink has an essentially infinite capacity. Each period, both exogenous CO2 emissions $EE_t$ (primarily from land-use changes) and industrial emissions $IE_t$ move into the atmosphere, and the existing concentrations in the three sinks mix according to a linear transition rule. There is a one-period lag between the emission of CO2 and its arrival into the atmosphere. The atmospheric CO2 concentration causes forcing $F_t$ of the upper-biosphere temperature, a small portion of which occurs as a result of non-CO2 greenhouse gases $O_t$. Finally, the upper-biosphere temperature $T^{UP}_t$ and the lower-biosphere temperature $T^{LO}_t$ mix according to a linear transition rule.

As is typical of representative-agent models, the consumer and firm are assumed to be aggregate representations of several “small” agents. These small agents do not incorporate climate effects into their decisions, resulting
in a negative externality. It is important to note that this externality does not arise because the damage function or any other functional element of the system is unknown. Because RBC-DICE (like DICE) is a rational-expectations model, all agents realize that damage will occur if CO2 concentrations and temperatures rise. They are also fully capable of calculating the future trajectories of the climate and economy. But no single agent is individually willing to endogenize his or her own externality because market prices do not adjust to reflect the damage potential. As shown below, the factors of production are valued too highly by the market in RBC-DICE, which leads to higher-than-optimal income to consumers and higher-than-optimal output by firms.

Lastly, it is useful to highlight some differences between RBC-DICE and DICE-99:5

• The time scale of DICE-99 is a decade, while here it is a year. As a result, growth rates and flow rates are rescaled into yearly values. This permits a more natural comparison of the predictions to data (which are typically measured at a yearly or higher frequency) without materially affecting the trajectories over a longer horizon.

• DICE-99 uses a mixture of continuous- and discrete-time notation for growth rates. For convenience, we have normalized this to continuous-time notation (i.e. exponential growth) as much as possible. This change also eases the translation between the decadal and yearly time scales.

• In keeping with the RBC formulation of the macroeconomy, the utility function now contains leisure, and there is a labor-leisure time constraint. The parameter governing the preference for consumption over leisure is set so that the steady-state fraction of the time endowment spent working is about 0.3.

• Because only one-third as much labor supply is present after leisure is introduced, we rescale total factor productivity so that the time-1 output matches that of DICE-99.

• The growth rate of decarbonization follows a polynomial/exponential decline in DICE-99, which generates strange predictions in the time limit. We recalibrate this as a simple exponential decline, and determine a new growth rate by matching the new decarbonization path to the endpoints of the DICE-99 path. As a result, our decarbonization rate is slightly slower than in DICE-99, leading to higher predicted CO2 concentrations.

• We remove the abatement cost component of DICE-99.

• We restrict the damage function to be non-negative. This was not required in DICE-99, where there could be external benefits to the economy until the threshold temperature was reached.

• We define the externality $1 - \Omega_t$ as a fractional decrease in total factor productivity rather than simply a resource drain. The intuitive consequence is that the production process is itself damaged rather than output simply being wasted, which we believe to be a more natural definition of external damage. The technical consequence is that the industrial emissions equation now contains $\Omega_t$ (because industrial emissions is now a simple fraction of output), which generates more avenues for climate-economy feedback.

Table 4.1 Equations and parameters of the RBC-DICE model.

| Description | Equation | Parameters |
|---|---|---|
| Time preference | $\ln(\rho_t/\rho_{t-1}) = g^{\rho}$ | $g^{\rho} = -0.0025719$, $\rho_0 = 0.03$ |
| Discount rate | $\ln(R_t/R_{t-1}) = -\rho_t$ | $R_0 = 1$ |
| Utility | $U(c_t, e_t) = \frac{1}{1-\beta}\left(c_t^{\alpha} e_t^{1-\alpha} - 1\right)^{1-\beta}$ | $\alpha = 0.3$, $\beta = 1$ |
| Population growth rate | $\ln(g^P_t/g^P_{t-1}) = \delta_P$ | $\delta_P = -0.0222$, $g^P_0 = 0.0157$ |
| Population | $\ln(P_t/P_{t-1}) = g^P_t$ | $P_0 = 5.6327 \times 10^{9}$ |
| Productivity growth rate | $\ln(g^A_t/g^A_{t-1}) = \delta_A$ | $\delta_A = -1 \times 10^{-6}$, $g^A_0 = 0.0038$ |
| Productivity | $\ln(A_t/A_{t-1}) = g^A_t$ | $A_0 = 265$ |
| Output (real 1990 USD) | $Y_t = \Omega_t A_t K_t^{\gamma} L_t^{1-\gamma}$ | $\gamma = 0.3$, $K_1 = 47 \times 10^{12}$ |
| Investment (real 1990 USD) | $I_t = K_{t+1} - \exp(\delta_K) K_t$ | $\delta_K = -0.1$ |
| Emissions/output ratio growth rate | $\ln(g^{\sigma}_t/g^{\sigma}_{t-1}) = \delta_{\sigma}$ | $\delta_{\sigma} = -0.013638$, $g^{\sigma}_0 = -0.015884$ |
| Emissions/output ratio (tonnes/real 1990 USD) | $\ln(\sigma_t/\sigma_{t-1}) = g^{\sigma}_{t-1}$ | $\sigma_0 = 2.74 \times 10^{-3}$ |
| Industrial emissions (tonnes) | $IE_t = \sigma_t Y_t$ | $IE_0 = 6.187 \times 10^{9}$ |
| Exogenous emissions (tonnes) | $\ln(EE_t/EE_{t-1}) = \delta_E$ | $\delta_E = -0.01$, $EE_0 = 0.1128 \times 10^{9}$ |
| Total emissions (tonnes) | $TE_t = IE_t + EE_t$ | |
| Damage | $D_t = \max\{0,\ \theta_1 T^{UP}_t + \theta_2 (T^{UP}_t)^2\}$ | $\theta_1 = -0.0045$, $\theta_2 = 0.0035$ |
| Productivity externality | $\ln(\Omega_t) = -D_t$ | |
| Atmospheric CO2 concentration (tonnes) | $M^{AT}_t = \phi_{11} M^{AT}_{t-1} + \phi_{21} M^{UP}_{t-1} + \phi_{31} M^{LO}_{t-1} + TE_{t-1}$ | $\phi_{11} = 0.94788$, $\phi_{21} = 0.04445$, $\phi_{31} = 0$, $M^{AT}_0 = 735 \times 10^{9}$ |
| Upper biosphere CO2 concentration (tonnes) | $M^{UP}_t = \phi_{12} M^{AT}_{t-1} + \phi_{22} M^{UP}_{t-1} + \phi_{32} M^{LO}_{t-1}$ | $\phi_{12} = 0.05212$, $\phi_{22} = 0.94040$, $\phi_{32} = 0.00050$, $M^{UP}_0 = 781 \times 10^{9}$ |
| Lower biosphere CO2 concentration (tonnes) | $M^{LO}_t = \phi_{13} M^{AT}_{t-1} + \phi_{23} M^{UP}_{t-1} + \phi_{33} M^{LO}_{t-1}$ | $\phi_{13} = 0$, $\phi_{23} = 0.01515$, $\phi_{33} = 0.99950$, $M^{LO}_0 = 19230 \times 10^{9}$ |
| CO2 forcing increase (W/m²) | $F_t = \eta\left[\ln(M^{AT}_t/M^{AT}_{BASE})/\ln(2)\right] + O_t$ | $\eta = 4.1$, $M^{AT}_{BASE} = 596.4 \times 10^{9}$ |
| Exogenous temperature forcing (W/m²) | $O_t = \max\{a + bt,\ O^{MAX}\}$ | $a = 0$, $b = 0.013465$, $O^{MAX} = 1.3465$ |
| Upper biosphere temperature increase (°C) | $T^{UP}_t = T^{UP}_{t-1} + \frac{1}{\sigma_1}\left(F_t - \frac{\eta}{\lambda} T^{UP}_{t-1} - \frac{1/\sigma_2}{1/\sigma_3}\left(T^{UP}_{t-1} - T^{LO}_{t-1}\right)\right)$ | $\sigma_1 = 44.247$, $\sigma_2 = 113.63$, $\lambda = 2.9078$, $T^{UP}_0 = 0.43$ |
| Lower biosphere temperature increase (°C) | $T^{LO}_t = T^{LO}_{t-1} + \frac{1}{\sigma_3}\left(T^{UP}_{t-1} - T^{LO}_{t-1}\right)$ | $\sigma_3 = 500$, $T^{LO}_0 = 0.06$ |

Time t = 0 corresponds roughly to the 1980s.

2.1 Solution to the market problem

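• Consumer. The consumer’s objective is to maximize the discounted stream of expected utilities:6

$$
\max_{\{c_t,\, l_t,\, k_{t+1}\}_{t=0}^{\infty}} E_0\left[\sum_{t=0}^{\infty} R_t\, U(c_t, e_t)\right]
\quad \text{s.t.} \quad
\begin{cases}
y_t \le w_t l_t + r_t k_t \\
y_t = c_t + i_t \\
i_t = \left(\dfrac{P_{t+1}}{P_t}\right) k_{t+1} - \exp(\delta_K)\, k_t \\
l_t + e_t = 1.
\end{cases}
\tag{4.1}
$$

The first-order optimality conditions for problem (4.1) are as follows:

$$
\frac{U_2(c_t, 1 - l_t)}{U_1(c_t, 1 - l_t)} = w_t; \tag{4.2}
$$

$$
E_t\left[\frac{R_{t+1}}{R_t} \cdot \frac{P_t}{P_{t+1}} \cdot \frac{U_1(c_{t+1}, 1 - l_{t+1})}{U_1(c_t, 1 - l_t)} \cdot \left(r_{t+1} + \exp(\delta_K)\right)\right] = 1; \tag{4.3}
$$

$$
w_t l_t + r_t k_t = c_t + \left(\frac{P_{t+1}}{P_t}\right) k_{t+1} - \exp(\delta_K)\, k_t. \tag{4.4}
$$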

These results are fairly standard. Equation (4.2) shows that the marginal rate of substitution between consumption and leisure must be equal to the price ratio between the two. Equation (4.3) is the capital asset pricing equation using the stochastic discount factor. Equation (4.4) is the feasibility condition or budget constraint. Under the parametrization given in Table 4.1, the scale of the rental rate $r_t$ is approximately $10^0 = 1$ (because the capital stock $k_t$ is measured in numeraire USD units), while the scale of the wage $w_t$ is approximately $10^4$ (because the labor supply $l_t$ is the fraction of total hours in a year spent working).

• Firm. The climate externality formally enters on the production side of the economy, but the representative firm does not take it into account when making its production decision. To be clear on this point, we extend our notation by writing the externality fraction as $\bar{\Omega}_t$ to emphasize that the firm does not optimize with respect to it. The firm’s problem is to maximize its discounted stream of expected profits:

$$
\max_{\{K_t,\, L_t\}_{t=0}^{\infty}} E_0\left[\sum_{t=0}^{\infty} R_t \left(Y_t - w_t L_t - r_t K_t\right)\right]
\quad \text{s.t.} \quad
Y_t = \bar{\Omega}_t A_t K_t^{\gamma} L_t^{1-\gamma}. \tag{4.5}
$$


The first-order conditions for problem (4.5) are the factor demand equations implicitly defined by:

$$
r_t = \frac{\partial Y_t}{\partial K_t}; \tag{4.6}
$$

$$
w_t = \frac{\partial Y_t}{\partial L_t}. \tag{4.7}
$$

These results are again standard: a factor’s price is equal to its marginal revenue product. And, because the production technology exhibits constant returns to scale, these factor prices also imply a zero-profit equilibrium.

• Equilibrium. The equilibrium conditions involve the clearing of the goods and factor markets:7

$$
Y_t = P_t y_t \;\; \forall t; \qquad K_t = P_t k_t \;\; \forall t; \qquad L_t = P_t l_t \;\; \forall t.
$$
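The factor demands (4.6) and (4.7) and the zero-profit property can be checked numerically for the Cobb-Douglas technology. The following is a minimal sketch, not the authors' code: the function names are hypothetical, the parameter values are taken from Table 4.1, and two simplifications are assumed purely for illustration (no damage, so the externality fraction is 1, and a labor fraction of 0.3, the calibration target mentioned in the text). Values with different time indices ($K_1$ with $A_0$ and $P_0$) are mixed here only to obtain plausible magnitudes.

```python
def output(omega, tfp, capital, labor, gamma):
    """Cobb-Douglas output Y = Omega * A * K^gamma * L^(1 - gamma)."""
    return omega * tfp * capital**gamma * labor ** (1.0 - gamma)


def factor_prices(omega, tfp, capital, labor, gamma):
    """Return (Y, r, w) with r = dY/dK and w = dY/dL, as in (4.6)-(4.7)."""
    y = output(omega, tfp, capital, labor, gamma)
    rental_rate = gamma * y / capital      # analytic marginal product of capital
    wage = (1.0 - gamma) * y / labor       # analytic marginal product of labor
    return y, rental_rate, wage


# Parameter values from Table 4.1.
GAMMA = 0.3
TFP = 265.0            # A_0
CAPITAL = 47e12        # K_1
POPULATION = 5.6327e9  # P_0

# Illustrative assumptions, not taken from a model solution.
LABOR_SHARE = 0.3      # fraction of the time endowment spent working
OMEGA_BAR = 1.0        # externality fraction with zero damage

LABOR = POPULATION * LABOR_SHARE  # market clearing: L_t = P_t * l_t
Y, R, W = factor_prices(OMEGA_BAR, TFP, CAPITAL, LABOR, GAMMA)

# Constant returns to scale: paying each factor its marginal product exhausts output.
PROFIT = Y - W * LABOR - R * CAPITAL

print(f"Y = {Y:.4g}, r = {R:.4g}, w = {W:.4g}, profit/Y = {PROFIT / Y:.2e}")
```

Because the two factor shares sum to one, the profit term vanishes up to floating-point rounding, which is the zero-profit equilibrium referred to in the text.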

It should be noted that the market does anticipate some of the immediate effects from emissions, and so $\Omega$ is not fully external. Equation (4.3) implies that the time-t equilibrium requires knowledge of $r_{t+1}$, and hence of $\Omega_{t+1}$. Both of these values can be rationally foreseen based on information available at time t: $r_{t+1}$ can be determined up to a probability distribution, and $\Omega_{t+1}$ can be determined exactly. Equation (4.6) shows that a smaller value of $\Omega_{t+1}$ will lower the value of $r_{t+1}$, which lowers the incentive to accumulate capital at time t. This, in turn, tends to reduce production at time t, mitigating the externality.

The presence of this effect does not imply, however, that the market actually prices goods and factors according to their social valuations at time t. As shown below, the social valuation incorporates all future climate-feedback impacts from a marginal unit of emissions at time t, rather than just the immediate impact upon production in the next period. And, because the CO2 taxes predicted by this model are actually nonzero, the part of $\Omega$ that is internalized must be small relative to its overall size.

2.2 Solution to the social planner’s problem

The social planner’s problem involves the internalization of all externalities and thus gives the highest possible welfare. This is achieved by imputing full social valuations to all goods and factors in the market, instead of only accounting for their private valuations. In RBC-DICE, the externality from CO2 emissions in period t is felt in periods t + 1 and following as a decrease in
total factor productivity. However, the firm does not take into account the effects of its current emissions upon its future productivity or upon the consumer’s future utility, leading to a disparity between the marginal private costs and marginal social costs of current production.

• Consumer. The consumer’s part of the market problem (4.1) generates the correct marginal-valuation functions in the social planner’s setting as well. This is because the distortion does not explicitly enter on the consumption side of the economy. Of course, the consumer’s income will change when the social factor demands are used, and the consumption path imputed by the social planner will then be different from the corresponding market path. However, this is an indirect equilibrium effect rather than a direct effect upon the consumer’s valuation function.

• Firm. The social planner’s problem for the firm is:

$$
\max_{\{K_t,\, L_t\}_{t=0}^{\infty}} E_0\left[\sum_{t=0}^{\infty} R_t \left(Y_t - w_t L_t - r_t K_t\right)\right]
\quad \text{s.t.} \quad
Y_t = \Omega_t A_t K_t^{\gamma} L_t^{1-\gamma}. \tag{4.8}
$$

Note the removal of the bar on $\Omega_t$, indicating that the externality is taken into account in the production plan. The first-order conditions for problem (4.8) are more complicated than the corresponding market equations because they involve perpetual welfare impacts of current emissions:

$$
r_t = \frac{\partial Y_t}{\partial K_t} + E_t\left\{\sum_{i=1}^{\infty}\left[\frac{R_{t+i}}{R_t} \cdot \frac{\partial \Omega_{t+i}}{\partial K_t} \cdot A_{t+i} K_{t+i}^{\gamma} L_{t+i}^{1-\gamma}\right]\right\}; \tag{4.9}
$$

$$
w_t = \frac{\partial Y_t}{\partial L_t} + E_t\left\{\sum_{i=1}^{\infty}\left[\frac{R_{t+i}}{R_t} \cdot \frac{\partial \Omega_{t+i}}{\partial L_t} \cdot A_{t+i} K_{t+i}^{\gamma} L_{t+i}^{1-\gamma}\right]\right\}. \tag{4.10}
$$

Within the infinite time horizon, there are several avenues for climate-economy feedback that are captured by the seemingly innocuous terms $\partial \Omega_{t+i}/\partial K_t$ and $\partial \Omega_{t+i}/\partial L_t$. Most of the various feedback types can be illustrated by examining the first three periods ahead; Table 4.2 presents the components of each feedback term for the rental rate (4.9).8 The one-period-ahead term reflects the direct marginal impact of the CO2 emissions associated with the use of an additional unit of capital at time t upon the externality that occurs at time t + 1. The additional terms at time t + 2 reflect two feedback mechanisms. First, some of the time-t emissions that entered the atmosphere at time t + 1 remain there and continue to contribute to temperature forcing. Second, some of the time-(t + 1) upper-biosphere temperature increase remains in that layer. The three-period-ahead terms additionally reflect the fact that some of the temperature increase in the upper-biosphere layer propagates down to the lower-biosphere layer and then back up. There is similar movement
of the CO2 concentrations between the atmospheric layer and the upper-biosphere layer. There is an additional four-period-ahead effect (not presented) in which some of the initial CO2 emissions flow through all three layers and back to the atmosphere. In subsequent periods, each of these feedback effects interacts with the others. Thus, the integral here represented simply by “$E_t$” is actually quite complex.

The equations in Table 4.2 are provided only to illustrate the model’s numerous paths for climate-economy feedback; they are not directly used in the solution of the model. In fact, the tractability of the social planner’s problem is of serious concern due to this complicated feedback representation, and the issue will be addressed separately in Section 3.

The social planner’s solution cannot be supported as a market outcome because it involves the firm operating at a perpetual loss. (Because the production technology exhibits constant returns to scale, any factor valuation other than the marginal revenue product leads to a non-zero-profit equilibrium.) The social planner essentially compels the firm to operate in this way for the benefit of the consumer. This solution is, however, the first-best welfare outcome against which the efficiency of the market and any policy interventions can be judged.

2.3 Solution to a tax problem

Policy experiments in RBC-DICE involve introducing welfare-improving distortions into the market for the purpose of achieving a welfare outcome closer to the socially-optimal one. One of the policies considered by Nordhaus and Boyer (2000) is the CO2 tax, which requires the firm to pay a uniform price to the government for each unit of CO2 emitted. In this case, the firm still does not optimize with respect to the externality. But by making a judicious choice of tax rates, equilibrium prices that are closer to the socially-optimal valuations can be obtained. We consider the most convenient version of the CO2 tax, which involves resetting the tax each period. In this special case, the social planner’s valuations can be exactly replicated.

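• Firm. Denote the CO2 tax at time t as $\tau_t$. The problem for the firm becomes:

$$
\max_{\{K_t,\, L_t\}_{t=0}^{\infty}} E_0\left[\sum_{t=0}^{\infty} R_t \left(Y_t - w_t L_t - r_t K_t - \tau_t IE_t\right)\right]
\quad \text{s.t.} \quad
\begin{cases}
Y_t = \bar{\Omega}_t A_t K_t^{\gamma} L_t^{1-\gamma}, \\
IE_t = \sigma_t Y_t
\end{cases}
\tag{4.11}
$$

and the factor demands for problem (4.11) are given by:9

$$
r_t = \frac{\partial Y_t}{\partial K_t}\left(1 - \tau_t \sigma_t\right); \tag{4.12}
$$

$$
w_t = \frac{\partial Y_t}{\partial L_t}\left(1 - \tau_t \sigma_t\right). \tag{4.13}
$$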

冢

冣

∂Yt ∂Kt ∂Ωt + 1 ∂Dt + 1 ∂TtUP ∂Ft + 1 ∂MtAT +1 + 1 ∂TEt ∂IEt ∂Yt · · · · · · UP AT · ∂Dt + 1 ∂Tt + 1 ∂Ft + 1 ∂Mt + 1 ∂TEt ∂IEt ∂Yt ∂Kt ∂Ωt + 2 ∂Dt + 2 ∂TtUP ∂Ft + 2 ∂MtAT ∂TEt + 1 ∂IEt + 1 ∂Yt + 1 ∂Ωt + 1 +2 +2 · · · · · · · UP ∂Dt + 2 ∂Tt + 2 ∂Ft + 2 ∂MtAT ∂TE ∂IEt + 1 ∂Yt + 1 ∂Ωt + 1 ∂Kt +2 t+1 UP ∂Ωt + 2 ∂Dt + 2 ∂TtUP + 2 ∂Tt + 1 · · · UP ∂Dt + 2 ∂TtUP ∂Kt + 2 ∂Tt + 1 UP AT ∂Ωt + 2 ∂Dt + 2 ∂Tt + 2 ∂Ft + 2 ∂MtAT + 2 Mt + 1 · · UP · AT AT · ∂Dt + 2 ∂Tt + 2 ∂Ft + 2 ∂Mt + 2 ∂Mt + 1 ∂Kt ∂Ωt + 3 ∂Dt + 3 ∂TtUP ∂Ft + 3 ∂MtAT ∂TEt + 2 ∂IEt + 2 ∂Yt + 2 ∂Ωt + 2 +3 +3 · · · · · · · UP ∂Dt + 3 ∂Tt + 3 ∂Ft + 3 ∂MtAT ∂IEt + 2 ∂Yt + 2 ∂Ωt + 2 ∂Kt + 3 ∂TEt + 2 UP UP ∂Ωt + 3 ∂Dt + 3 ∂TtUP + 3 ∂Tt + 2 ∂Tt + 1 · · · · UP UP ∂Dt + 3 ∂TtUP ∂Kt + 3 ∂Tt + 2 ∂Tt + 1 UP AT ∂Ωt + 3 ∂Dt + 3 ∂Tt + 3 ∂TtUP ∂Ft + 2 ∂MtAT ∂MtAT ∂TEt + 1 +2 + 2 ∂Mt + 1 +2 · · + · UP · UP · AT · AT · ∂Dt + 3 ∂Tt + 3 ∂Tt + 2 ∂Ft + 2 ∂Mt + 2 ∂Mt + 1 ∂Kt ∂TEt + 1 ∂Kt UP LO UP ∂Ωt + 3 ∂Dt + 3 ∂Tt + 3 ∂Tt + 2 ∂Tt + 1 · · LO · UP · ∂Dt + 3 ∂TtUP ∂Kt + 3 ∂Tt + 2 ∂Tt + 1 AT AT AT ∂Ωt + 3 ∂Dt + 3 ∂TtUP ∂F ∂M +3 t + 3 ∂Mt + 2 Mt + 1 t+3 · · · · · AT AT AT ∂Dt + 3 ∂TtUP ∂Kt + 3 ∂Ft + 3 ∂Mt + 3 ∂Mt + 2 ∂Mt + 1 UP AT UP ∂Ωt + 3 ∂Dt + 3 ∂Tt + 3 ∂Ft + 3 ∂Mt + 3 ∂Mt + 2 ∂MtAT +1 · · · AT UP · AT · ∂Dt + 3 ∂TtUP ∂F ∂M ∂M ∂M ∂K +3 t+3 t+2 t+1 t+3 t

Yt (private marginal product)

Ωt + 3 via MAT regurgitation from MUP at t + 3

Ωt + 3 via MAT persistence, t + 1 to t + 3

Ωt + 3 via TUP regurgitation to TLO at t + 3

Ωt + 3 via TUP forcing-based increase at t + 2

Ωt + 3 via TUP persistence, t + 1 to t + 3

Ωt + 3 via Yt + 2

Ωt + 2 via MAT persistence, t + 1 to t + 2

Ωt + 2 via TUP persistence, t + 1 to t + 2

Ωt + 2 via Yt + 1

Ωt + 1 via Yt

Equation

Marginal effect of Kt upon . . .

Table 4.2 Components of the social planner’s valuation of the rental rate



91
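The persistence and regurgitation channels for atmospheric CO2 listed in Table 4.2 can be illustrated by pushing a one-unit pulse through the linear three-sink transition rule. This is a hedged sketch rather than part of the chapter's solution method: the coefficients are the $\phi_{ij}$ values reported in Table 4.1, while the function names and the dictionary layout are hypothetical. The pulse is taken to have already arrived in the atmosphere (time t + 1 in the chapter's timing), and new emissions are set to zero.

```python
# PHI[src][dst] is the yearly share of the CO2 in sink `src` that is found in
# sink `dst` one period later (phi_ij in Table 4.1, with 1=AT, 2=UP, 3=LO).
PHI = {
    "AT": {"AT": 0.94788, "UP": 0.05212, "LO": 0.0},
    "UP": {"AT": 0.04445, "UP": 0.94040, "LO": 0.01515},
    "LO": {"AT": 0.0, "UP": 0.00050, "LO": 0.99950},
}
SINKS = ("AT", "UP", "LO")


def step(state):
    """Apply one period of the linear mixing rule, with no new emissions."""
    return {
        dst: sum(PHI[src][dst] * state[src] for src in SINKS)
        for dst in SINKS
    }


def pulse_response(periods):
    """Track a unit pulse that starts entirely in the atmosphere."""
    state = {"AT": 1.0, "UP": 0.0, "LO": 0.0}
    path = [state]
    for _ in range(periods):
        state = step(state)
        path.append(state)
    return path


path = pulse_response(3)

# Two periods after arrival, the atmospheric share splits into the two
# channels named in Table 4.2.
persistence = PHI["AT"]["AT"] ** 2                 # stayed in AT both periods
regurgitation = PHI["AT"]["UP"] * PHI["UP"]["AT"]  # AT -> UP -> AT

for period, state in enumerate(path):
    print(period, {sink: round(share, 6) for sink, share in state.items()})
```

Each row of coefficients sums to one, so the rule only redistributes the pulse among the sinks. The lower-biosphere share stays at zero for one period because the atmosphere does not feed that sink directly, which is why the full three-layer loop first appears four periods ahead.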

These factor demands can be related back to the social planner’s problem. From examining Table 4.2, it is apparent that the time-t marginal products of capital and labor can be factored out of equations (4.9) and (4.10), allowing them to be rewritten as

$$
r_t = \frac{\partial Y_t}{\partial K_t} + \frac{\partial Y_t}{\partial K_t}\, E_t\left[\sum_{i=1}^{\infty} \frac{R_{t+i}}{R_t} \cdot \frac{\partial \Omega_{t+i}}{\partial Y_t} \cdot A_{t+i} K_{t+i}^{\gamma} L_{t+i}^{1-\gamma}\right] = \frac{\partial Y_t}{\partial K_t}\left[1 + E_t(H_t)\right]; \tag{4.14}
$$

$$
w_t = \frac{\partial Y_t}{\partial L_t} + \frac{\partial Y_t}{\partial L_t}\, E_t\left[\sum_{i=1}^{\infty} \frac{R_{t+i}}{R_t} \cdot \frac{\partial \Omega_{t+i}}{\partial Y_t} \cdot A_{t+i} K_{t+i}^{\gamma} L_{t+i}^{1-\gamma}\right] = \frac{\partial Y_t}{\partial L_t}\left[1 + E_t(H_t)\right], \tag{4.15}
$$

where we have defined

$$
E_t(H_t) = E_t\left[\sum_{i=1}^{\infty} \frac{R_{t+i}}{R_t} \cdot \frac{\partial \Omega_{t+i}}{\partial Y_t} \cdot A_{t+i} K_{t+i}^{\gamma} L_{t+i}^{1-\gamma}\right].
$$

Comparing the social planner’s equations (4.14) and (4.15) with the tax analogues (4.12) and (4.13), we find that the tax rate corresponding to the socially-optimal valuation is:

$$
\tau_t = -\frac{E_t(H_t)}{\sigma_t}. \tag{4.16}
$$

Equation (4.16) has an intuitive interpretation: the optimal tax is set equal to the expected present value of the damage that occurs in perpetuity from producing an additional unit of the numeraire good at time t. The equilibrium incentives governing the firm’s usage of labor and capital under this tax rate are identical to those that would be implemented by the social planner. Thus, this particular CO2 tax also describes a feasible market implementation of the social planner’s outcome.

3 Solution methodology

The computational effort required to solve RBC-DICE is substantial. In addition, slightly different methodologies are implemented to solve the market and social planner’s problems. In this section, the various strategies and algorithms involved in the computational solution are described.10 There are two primary issues that arise when solving each problem: approximating the policy function and approximating the integrals in the first-order conditions. Because the methodology involves iterative refinement of a policy-function approximation, these issues will interact with each other. First, the integrals must be re-evaluated at each iteration, and so the complexity of the integration sub-problem will greatly affect the tractability of the overall algorithm. Second, these two approximations introduce two unique sources of numerical error, but it is not easy to decompose the contributions of each to the overall error. Thus, the policy-function refinement could potentially optimize on a systematic but unidentifiable bias introduced by integration error.11




We focus strongly on minimizing numerical solution error because a recent literature finds a real possibility of inaccurate welfare calculations in macroeconomic models. Because policy analysis is primarily a welfare argument, accurate welfare measures are essential. A long-standing method for solving the RBC model involves first-order log-linearization about the model’s steady state, as described in King et al. (2002). Indeed, this is the methodology adopted in Pizer (1999). The technique is justified on the basis that the model has an exact log-linear solution under some parametrizations, and so it is still approximately linear under “close” parametrizations. In addition, because the model is linearized, the need to compute an integral is removed, because the expectation Et passes through all of the non-stochastic linearized terms. However, there is increasing evidence that log-linearization about the steady state can be quite inaccurate when addressing welfare questions in stochastic macroeconomic models. Kim and Kim (2003) provide a systematic investigation of “spurious welfare reversals” in these models, an inaccuracy arising solely from approximation error that generates perverse welfare predictions.12 Aruoba et al. (2006) compare the approximation error of several linear and nonlinear approximation methodologies in the RBC context. These authors find that first-order log-linearization performs worst of all, with its performance degrading rapidly under “extreme calibrations” of nonlinearity and variance of structural uncertainty.13 They also find that linear approximations of relatively high order are required to produce results comparable to nonlinear approximations. Fernández-Villaverde and Rubio-Ramírez (2005) note that the second-order error arising from the linearization of the model’s solution introduces an even more troublesome first-order error into the model’s likelihood function. 
They demonstrate the impact of this error by estimating the RBC model under linear and nonlinear methods: the estimates based on the linear approximation are substantially biased from the ones based on the nonlinear approximation.

3.1 Policy function calibration

There are three policy functions in RBC-DICE, one for each of its choice variables: ct = ct (Θt), lt = lt (Θt), and kt+1 = kt+1 (Θt), where Θt denotes the time-t state. Approximating these policy functions poses a unique challenge because of the 15-variable state space:14

$$\Theta_t = \{\rho_t, R_t, g_{Pt}, P_t, g_{At}, A_t, K_t, g_{\sigma t}, \sigma_t, EE_t, M_t^{AT}, M_t^{UP}, M_t^{LO}, T_t^{UP}, T_t^{LO}\}$$

Now, the RBC model contains only two state variables, At and Kt, and it is possible to obtain a good policy-function approximation with a state vector of this size with a moderate amount of computational effort. For example, the methodology in Aruoba et al. (2006) that exhibits the smallest error is a Chebyshev-polynomial approximation of dimension 9 in At-space and dimension 11 in Kt-space. Calibrating this function involves finding the roots of a system of 99 equations. The authors also present a fifth-order linear approximation, which entails a state vector with 25 elements (one for each variable interaction) and a decomposition of a square matrix of the same size. The problem with using these methods here is that their computational requirements grow approximately exponentially with the number of states. These methods are moderately difficult but still tractable with two state variables, but they will probably not remain so with fifteen.

To overcome this dimensionality issue, we use a feedforward neural network approximation to the policy function, a strategy that has been explored in the RBC context by Duffy and McNelis (2001) and Sirakaya et al. (2006). A feedforward neural network approximates a function f: ℝ^p → ℝ^q using a structure of “neurons” arranged in “layers.” Neuron j in layer i takes a vector of outputs x_{i−1} from the previous layer, weights the inputs with a vector of coefficients w_{ij}, and emits a scalar output x_{ij}.15 The output is determined by the “activation function” g_{ij}(x; w), which in this case is the logistic function:

$$x_{ij} = g_{ij}(x_{i-1}; w_{ij}) = \frac{1}{1 + \exp(-w_{ij}' x_{i-1})}.$$
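The logistic network just described can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the random weights, the bias handling, the logistic output neuron, and the names `forward`, `hidden_weights`, and `output_weights` are all assumptions made for the example. Only the layer sizes (15 state variables, eight hidden neurons) come from the chapter.

```python
import math
import random

def logistic(z):
    """Logistic activation g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(state, hidden_weights, output_weights):
    """Evaluate a single-hidden-layer network at one state vector.

    Every weight vector carries a leading bias coefficient, so it has
    one more element than the layer that feeds it.
    """
    inputs = [1.0] + list(state)      # input layer passes the state through
    hidden = [logistic(sum(w * x for w, x in zip(w_j, inputs)))
              for w_j in hidden_weights]
    hidden = [1.0] + hidden           # bias neuron feeding the output layer
    # Assumption: a logistic output neuron, which keeps the result in (0, 1).
    return logistic(sum(w * x for w, x in zip(output_weights, hidden)))

rng = random.Random(0)
n_states, n_hidden = 15, 8            # sizes quoted in the chapter
hidden_weights = [[rng.uniform(-1, 1) for _ in range(n_states + 1)]
                  for _ in range(n_hidden)]
output_weights = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
state = [rng.uniform(0, 1) for _ in range(n_states)]   # hypothetical state
labor_policy = forward(state, hidden_weights, output_weights)
```

In a calibration loop, `forward` would be called at every simulated state, and the weights would be adjusted until the unsatisfied equilibrium conditions hold as closely as possible.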

The neurons in the first layer exhibit trivial behavior: each takes one of the p arguments of f as its input, and emits that argument as its output. Each neuron in the last layer takes the outputs from the previous layer as input, and emits one of the q outputs of f.16 The “hidden” layers in between do not correspond to any particular component of f. The size of the hidden layer is a crucial element in obtaining a good approximation of the function, but it is difficult to know a priori the optimal size for a particular problem. Hornik et al. (1989) and Hornik et al. (1990) show that feedforward neural networks are universal function approximators for a large class of functions and their derivatives, including Lebesgue-measurable functions such as our stochastic policy functions. The primary result from this research is that functions from a finite-dimensional space to another finite-dimensional space can be approximated arbitrarily well by a single-hidden-layer feedforward neural network with a sufficiently large number of neurons in the hidden layer. And, apropos of this problem, Hornik et al. (1994) additionally show that the approximation error can decrease at rate n^{−1/2} (where n is the number of neurons in the hidden layer) irrespective of the dimension of the function’s domain. Of course, the dimension can still affect the initial scale of the error.

The strategy delineated by Aruoba et al. (2006) is used to optimize the neural network weights in the market problem. (Sirakaya et al. 2006, provide a similar strategy for the Brock–Mirman model, which is somewhat easier than the RBC case because there are only two choice variables instead of three.) Here, the neural network emits only the labor-supply policy l (Θt). For any particular value of labor supply, equilibrium condition (4.2) can be used to infer the consumption policy, and equilibrium condition (4.4) can then be used to infer the capital policy. The remaining equilibrium condition (4.3) is not satisfied by default, and it is used as an objective function for calibrating the network. This three-step strategy is feasible for the market problem because the time-t factor prices can always be deduced solely from information that is already present upon entering time t. It is this fact that allows the consumption and capital policies to be merely inferred once the labor-supply policy is known. This strategy will therefore be infeasible for the social planner’s problem, because the time-t wage and rental rate require knowledge of future outcomes in perpetuity. The method used in this case involves calibrating a network that emits all three policy variables. None of the three equilibrium conditions (4.2), (4.3), or (4.4) can be satisfied by default in this setting, and so all three are used as the basis of calibration. Of the two methods, the market strategy is likely to be more accurate because two of the three equilibrium conditions can be automatically satisfied, while the neural network approximation must itself satisfy all three at once in the second case.

Calibration of the neural network is accomplished by nonlinear least squares using simulated data, as described by Aruoba et al. (2006). A generic iteration of this algorithm proceeds as follows. The model is simulated repeatedly with a set of stochastic shocks.17 Each of the unsatisfied optimality conditions is rearranged in the form: Gt (w, Θt) = 0, so that the value of Gt reflects the calibration error corresponding to that condition under the neural network defined by the parameters w. The objective function is the sum of squared calibration errors over all unsatisfied equilibrium conditions and simulations. Three different minimization strategies are used. The first is the simulated annealing algorithm of Kirkpatrick et al. 
(1983), a stochastic search with two characteristics that are especially useful for calibrating a neural network. First, the randomness of the search provides a way to assign roles to individual neurons in the hidden layer.18 Second, neural networks are notorious for exhibiting local extrema, and a strength of simulated annealing is its ability to continue searching past local features that appear optimal (e.g. points that exhibit small first derivatives). After running the simulated annealing algorithm for a few thousand iterations, the best point found during the annealing process is used as a starting point for a two-pronged gradient-based approach. The trust region algorithm of Moré and Sorensen (1983) is used first, which is useful in this problem because the surface described by the neural network weights can be quite irregular. The line search algorithm of Moré and Thuente (1994) is used as a final, computationally-demanding refinement.

The simulated data for this calibration process are generated by running repeated simulations of length T = 200, the time horizon used in Nordhaus and Boyer (2000). One practical question that arises is how many simulated data points are necessary to provide good identification of the neural network weights. We use a neural network with eight neurons in the hidden layer (not including the bias neuron), which yields just over 100 weights to be optimized in the market problem and just under 150 weights in the social planner’s problem. Sirakaya et al. (2006) generate 2000 simulations for their network with four weights, or 250 simulated data points per weight. Directly importing this scale into our solution strategy would imply generating about 25,000 simulated data points, or 125 simulations of length T = 200, which is probably not tractable. Instead, we opt for roughly 10 simulated data points per weight, or 7 simulations of length T = 200. The error plots in the following section indicate that this number of simulations can provide a reasonably accurate calibration.

Finally, we wish to emphasize that this is a large-scale computational strategy not suited for serial processing. Parallelization of this algorithm was essential to achieving a reasonable run time. The market and social planner’s problems require about 400 and 6000 compute-hours (about 2.5 and 36 compute-weeks), respectively. By parallelizing the simulation and integration steps at each iteration, this can be reduced to about 7 and 24 hours of run time. To maximize the likelihood of finding a global minimum in this large space of weights, about two thirds of a run is spent in the simulated-annealing phase, and the remaining time is used by the two gradient-based algorithms. The latter typically converge after roughly ten iterations each, but these steps are required to be very high-quality and thus take quite a long time to compute. The results presented here are generated from runs conducted on the Lonestar high-performance computing cluster at the Texas Advanced Computing Center at the University of Texas at Austin. 
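The annealing phase of the minimization can be illustrated with a toy problem. The sketch below is only an assumption-laden stand-in: the objective is a hypothetical bumpy sum-of-squares surface, not the RBC-DICE calibration error, and the linear cooling schedule, step size, and function names are invented for the example. The trust-region and line-search refinements cited above are not reproduced.

```python
import math
import random

def objective(w):
    """Hypothetical sum-of-squares surface with many local minima."""
    return sum((x * x - 1.0) ** 2 + 0.3 * math.sin(8.0 * x) ** 2 for x in w)

def anneal(f, start, n_iter=3000, step=0.5, temp0=1.0, seed=0):
    """Simulated annealing that remembers the best point visited."""
    rng = random.Random(seed)
    current, f_cur = list(start), f(start)
    best, f_best = list(current), f_cur
    for k in range(n_iter):
        temp = temp0 * (1.0 - k / n_iter) + 1e-9   # assumed linear cooling
        cand = [x + rng.gauss(0.0, step) for x in current]
        f_cand = f(cand)
        # Improvements are always accepted; worse points are accepted
        # with a probability that shrinks as the temperature falls.
        if f_cand <= f_cur or rng.random() < math.exp((f_cur - f_cand) / temp):
            current, f_cur = cand, f_cand
            if f_cur < f_best:
                best, f_best = list(current), f_cur
    return best, f_best

start = [3.0, -2.5, 0.1]
best, f_best = anneal(objective, start)
```

The point returned in `best` plays the role of the starting value handed to the gradient-based stage.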
3.2 Numerical integration

The solutions to both the market and the social planner’s problems require a numerical integration of the first-order condition (4.3). We use an integration-by-simulation strategy to compute these integrals. Stern (1997) provides several examples of integration-by-simulation applications in econometrics; our strategy is in the same general spirit but uses quasi-random sequences instead of random Monte Carlo simulations. Quasi-random sequences are specially-constructed numbers that provide better coverage of the unit hypercube than random simulations. If random simulations from the unit hypercube are compared to quasi-random numbers, the latter are found to be too uniformly spaced to be considered random. The benefit of using quasi-random sequences is that integration-by-simulation can converge at rate n^{−1} instead of rate n^{−1/2} (the rate of convergence in probability). Even more importantly, Sloan and Woźniakowski (1998) show that this convergence rate can also be independent of the dimension of integration. The authors do not provide conditions for assessing whether a particular problem exhibits this latter characteristic, but they do note that many practical integration problems are tractable with quasi-random integration but not with Monte Carlo integration.19 Morokoff and Caflisch (1995) provide computational experiments that demonstrate the improvement in convergence over Monte Carlo integration.

The quasi-random sequence we use is the integration lattice described by Sloan et al. (2002), which satisfies the convergence criteria described in Sloan and Woźniakowski (1998). Joe and Kuo (2003) show that the lattice’s convergence behavior is at least as favorable in practice as that of other quasi-random sequences. The first step in constructing a lattice is to select a set of decreasing weights, where each weight corresponds to the importance of that dimension in the overall integral. The discount factor, which is about 0.97 in RBC-DICE, provides some information about the relative importance of future events, and so we use the weight 0.9^d for each integration dimension d. We use 1,009 simulations for each equation to be integrated.20 The Sloan et al. (2002) theoretical error projections indicate that the worst-case integration error over the unit hypercube using this lattice is on the order of 10^{−4} for a 1-dimensional integral and on the order of 10^{−2} for a 100-dimensional integral.21 The points in the lattice are translated into draws from a probability distribution by applying the inverse cumulative distribution function. Because the random elements in our calibrations are i.i.d., this reduces to simply applying the inverse function to each individual element in a point.

The market problem requires evaluating a 1-dimensional integral corresponding to the first-order condition (4.3). It is very easy to compute this integral at each time period t by running 1,009 repetitions of a one-period-ahead “sub-simulation” using lattice points of dimension 1. 
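For the one-dimensional case, the idea of mapping lattice points through an inverse distribution function can be sketched as follows. This is an illustration under stated assumptions, not the chapter's code: in one dimension a rank-1 lattice reduces to equally spaced points, the half-step shift (used here to keep the points strictly inside the unit interval) is an assumption, and the component-by-component construction of Sloan et al. (2002) needed for higher dimensions is not reproduced. The integrand, the expectation of a lognormal shock, is chosen because its exact value is known.

```python
import math
from statistics import NormalDist

def lattice_points(n, shift=0.5):
    """One-dimensional lattice: n equally spaced points inside (0, 1)."""
    return [(i + shift) / n for i in range(n)]

def expected_value(func, sigma, n=1009):
    """Approximate E[func(eps)] for eps ~ N(0, sigma^2) by averaging func
    over lattice points mapped through the inverse normal CDF."""
    shock = NormalDist(mu=0.0, sigma=sigma)
    draws = [shock.inv_cdf(u) for u in lattice_points(n)]
    return sum(func(e) for e in draws) / n

xi = 0.040                               # illustrative shock standard deviation
approx = expected_value(math.exp, xi)    # lattice estimate of E[exp(eps)]
exact = math.exp(0.5 * xi ** 2)          # closed-form lognormal mean
```

Comparing `approx` with `exact` gives a quick sense of the integration error at 1,009 points for a smooth integrand.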
However, in the social planner’s problem, the first-order conditions (4.14) and (4.15) contain Et (Ht), an infinite-dimensional integral. And, because condition (4.14) feeds directly into condition (4.3), this latter equation also involves an infinite-dimensional integral. To compute these integrals, we need to generate realizations of Ht. Our strategy for generating draws of Ht makes use of the fact that it contains components that are additively separable over time. We split the generation of a draw into two steps: evaluating a truncated sum and estimating the residual sum.

To evaluate the truncated sum, we run two 100-period-ahead sub-simulations using lattice points of dimension 100. One sub-simulation begins at the time-t state, while the other involves a +1 percent perturbation of the capital stock Kt. The values $\{\Omega_{t+i}\}_{i=1}^{100}$ are retained from both sub-simulations, and the values $\left\{\frac{R_{t+i}}{R_t} \cdot A_{t+i} K_{t+i}^{\gamma} L_{t+i}^{1-\gamma}\right\}_{i=1}^{100}$ are also retained from the unperturbed sub-simulation. The formula for a forward finite difference is applied to the former set of values to generate approximations of $\left\{\frac{\partial \Omega_{t+i}}{\partial K_t}\right\}_{i=1}^{100}$. Recalling equation (4.14), dividing each of these quantities by $\frac{\partial Y_t}{\partial K_t}$ yields elements of the form $\left\{\frac{\partial \Omega_{t+i}}{\partial Y_t}\right\}_{i=1}^{100}$ as desired. Elements from this set are multiplied by the corresponding elements in $\left\{\frac{R_{t+i}}{R_t} \cdot A_{t+i} K_{t+i}^{\gamma} L_{t+i}^{1-\gamma}\right\}_{i=1}^{100}$ to generate the realizations of the form $\left\{\frac{R_{t+i}}{R_t} \cdot \frac{\partial \Omega_{t+i}}{\partial Y_t} \cdot A_{t+i} K_{t+i}^{\gamma} L_{t+i}^{1-\gamma}\right\}_{i=1}^{100}$.22 This last set represents the first 100 time components of a realization of Ht.

To estimate the residual effect of the remaining periods, we fit these 100 points to a geometric series anchored at the first element. The operative assumption is that the behavior of the discounted marginal external effect in periods past t + 100 can be approximated by simple geometric decline, and we use the first 100 points to estimate the rate of decay. Because of discounting and diminishing feedback effects over time, any marginal effects more than 100 periods ahead are likely quite small, and so this approximation is probably sufficient. Now, recall that if the discount factor of a geometric series is 0 ≤ ν < 1 (which is, in fact, always observed in this procedure), then this infinite series has the closed-form solution

$$\sum_{i=1}^{\infty} \nu^{i-1} = \frac{1}{1-\nu}.$$

Our estimate of the residual effect is thus $\left(\frac{R_{t+100}}{R_t} \cdot \frac{\partial \Omega_{t+100}}{\partial Y_t} \cdot A_{t+100} K_{t+100}^{\gamma} L_{t+100}^{1-\gamma}\right) / (1-\nu)$.23 Finally, the estimate of Ht is the sum of the first 100 elements plus the residual estimate:

$$H_t = \sum_{i=1}^{100} \left(\frac{R_{t+i}}{R_t} \cdot \frac{\partial \Omega_{t+i}}{\partial Y_t} \cdot A_{t+i} K_{t+i}^{\gamma} L_{t+i}^{1-\gamma}\right) + \frac{1}{1-\nu} \left(\frac{R_{t+100}}{R_t} \cdot \frac{\partial \Omega_{t+100}}{\partial Y_t} \cdot A_{t+100} K_{t+100}^{\gamma} L_{t+100}^{1-\gamma}\right).$$

This method is repeated for all 1,009 points in the lattice. The average of all 1,009 realizations of Ht forms the approximation of Et (Ht). Using a geometric series to capture effects past a truncation period is motivated by the two-asset finance experiment of Bostian and Holt (forthcoming), in which a risky asset’s fundamental value is exactly a truncated geometric series. (The authors’ risky asset pays dividends in each period, and so its fundamental value is simply the value of the dividend stream, discounted by the return to their risk-free asset.) A lump-sum payment is made in the final period of the experiment which reflects all of the discounted dividends that would have been received had the experiment continued indefinitely. Choosing the lump-sum payment in this manner makes the fundamental value of the finitely-lived experimental asset equal to that of the corresponding infinitely-lived asset (i.e. a consol bond). The same intuition



applies here, except that the geometric series is used as an approximation to the “fundamental value” instead of being identical to it.
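The truncated-sum-plus-residual estimate of a single draw of Ht can be sketched as below. Several pieces are assumptions made for illustration: the chapter does not state how the decay rate is fitted, so the sketch uses the constant ratio that carries the first term to the last; the input series is a hypothetical geometric sequence rather than simulation output; and the function name `estimate_h` is invented. The residual term follows the formula as stated in the text.

```python
def estimate_h(terms):
    """Truncated sum of the supplied terms plus a geometric-tail residual.

    `terms` holds the discounted marginal external effects for periods
    t+1, ..., t+n (n = 100 in the chapter).
    """
    n = len(terms)
    ratio = terms[-1] / terms[0]
    if ratio < 0.0:
        raise ValueError("terms change sign; a geometric fit is not sensible")
    # Assumed decay estimate: constant per-period ratio over n - 1 steps.
    nu = ratio ** (1.0 / (n - 1))
    if not 0.0 <= nu < 1.0:
        raise ValueError("estimated decay rate must lie in [0, 1)")
    truncated = sum(terms)
    residual = terms[-1] / (1.0 - nu)   # residual formula as stated in the text
    return truncated + residual, nu

# Hypothetical draw: damages that shrink by 10 percent per period.
terms = [-0.002 * 0.9 ** i for i in range(100)]
h_draw, nu_hat = estimate_h(terms)
```

Averaging `h_draw` over all lattice points would then give the approximation of Et (Ht).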


4 Simulation results

We construct two different RBC-DICE calibrations by varying the size of a structural shock and the consumer’s risk aversion. Our goal is to examine differences in the model’s predictions in low-risk/low-risk-aversion settings and high-risk/high-risk-aversion settings. This will show whether increasing the magnitudes of both the risk and risk-aversion components of the model within reasonable levels generates an increase or decrease in outcome uncertainty. In keeping with the RBC tradition, a lognormal shock to factor productivity is added to RBC-DICE:

$$\ln(A_{t+1}/A_t) = g_{At} + \varepsilon_{t+1}, \qquad \varepsilon_{t+1} \sim \text{i.i.d. } N(0, \xi^2).$$
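The shock process above can be simulated directly to see how the shock size drives the spread of outcomes. The sketch below is an illustration only: it holds the growth rate constant at a hypothetical 1 percent per year (in RBC-DICE the growth rate gAt is itself a state variable), and the function names are invented. The two shock sizes, ξ = 0.014 and ξ = 0.040, are the values used for the calibrations in this section, as are the horizon of 200 periods and the 1,000 simulations.

```python
import math
import random

def simulate_log_productivity(xi, growth=0.01, horizon=200, n_sims=1000, seed=1):
    """Sorted terminal values of ln(A_T / A_0) under
    ln(A_{t+1} / A_t) = growth + eps_{t+1}, eps ~ i.i.d. N(0, xi^2)."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_sims):
        log_a = 0.0
        for _ in range(horizon):
            log_a += growth + rng.gauss(0.0, xi)   # permanent (unit-root) shock
        finals.append(log_a)
    return sorted(finals)

def percentile(sorted_values, q):
    """Nearest-rank percentile of an already sorted list."""
    rank = max(1, math.ceil(q / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]

low = simulate_log_productivity(xi=0.014)
high = simulate_log_productivity(xi=0.040)
bands = {name: [percentile(v, q) for q in (10, 50, 90)]
         for name, v in (("low", low), ("high", high))}
```

Because both runs reuse the same seed, the shocks differ only in scale, so the gap between the 10th and 90th percentiles widens in proportion to ξ.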

The assumption motivating this shock is that unforeseen factors such as technological change or different business methods will influence the growth rate gAt, but these are not known with certainty in advance. Note that productivity has a unit root in this model, and so shocks generate permanent changes in output. We utilize two values of ξ in these simulations. The first is based on the classic RBC calibration from quarterly US data by Cooley and Prescott (1995), which corresponds to a yearly value of ξ = 0.014. The second is based on an estimate of the RBC model from quarterly US data by Fernández-Villaverde and Rubio-Ramírez (2005), which corresponds to a yearly value of ξ = 0.040. This latter quantity is similar in size to the mean value used by Pizer (1999) in his simulations.24 We also utilize two different values of risk aversion, β = 1 and β = 2. The former generates a convenient logarithmic form for the utility function and is fairly standard in the RBC literature (and has been used in DICE), while the latter is closer to the value estimated by Fernández-Villaverde and Rubio-Ramírez (2005). Thus, the low-risk/low-risk-aversion parametrization (hereafter, the “L” calibration) corresponds to earlier RBC calibration literature, while the high-risk/high-risk-aversion parametrization (hereafter, the “H” calibration) corresponds to more recent RBC estimation literature.

After solving the market and social planner’s models for each calibration, 1,000 simulations of length 200 are run to approximate the stochastic processes of the outcome variables. Identical shocks are used across models, so that outcome realizations are directly comparable on a simulation-by-simulation basis. There are three main sets of results. First, we provide 10th, 50th, and 90th percentiles of some outcome variables of interest, which illustrate the inherent uncertainty in future outcomes. Second, we present 10th, 50th, and 90th percentiles of the actual numerical errors that arise in the first-order conditions for each problem. As noted earlier, the numerical solution error is of concern in this type of model, and an examination of the errors suggests the degree of confidence that should be placed in this methodology. To our knowledge, this error information has not previously been presented for models in the DICE family. Third, we also provide some evidence on how closely RBC-DICE actually fits climate and economic data. The most convincing argument for a model’s fit is to estimate its parameters and conduct likelihood-based inference, but this adds yet another order of magnitude to the computational problem in our case. Our alternative assessment involves comparing the power spectra of the empirical and predicted growth rates of some outcome variables of interest.25 The outcomes being compared are log-differences instead of levels, and so this method does not directly assess the ability of RBC-DICE to make an accurate prediction in levels. However, it does indicate whether the model can replicate the autocovariance structure of the data. The autocovariance structure is defined by the feedbacks that occur as the system evolves. These feedbacks are manifested in data as cyclical behavior at various frequencies. If the model and empirical autocovariance structures compare favorably, then the model has successfully replicated the feedback patterns seen in the data. This result would lead to increased confidence that the model structurally represents the true data-generating process. Also, if the two autocovariance structures are close, then the predictions in levels should ultimately be relatively close.26 To our knowledge, this type of comparison between model and data has not been previously performed using the DICE framework.27 We use five outcome variables in the spectral analysis: output Yt, per-capita consumption ct, industrial CO2 emissions IEt, upper-biosphere CO2 concentration $M_t^{UP}$, and the upper-biosphere temperature increase $T_t^{UP}$. 
Data for the first three variables come from the yearly world aggregates in the World Bank World Development Indicators. Data for the upper-biosphere concentration are taken from the Scripps Institution of Oceanography collection of CO2 measurements. The data for the upper-biosphere temperature are taken from the NASA Surface Temperature Analysis.28 These datasets on concentration and temperature contain monthly measurements, and the CO2-concentration data are additionally broken down by geographic location. Thus, a means of aggregating these series into a representative yearly measurement is required. To obtain this measurement, we first run a fixed-effects regression on the monthly time series to isolate systematic year-, month-, and location-specific effects. The yearly data we use are the estimates of the year-specific effects. A yearly datum thus represents a mean over 12 months, net of month- and location-specific effects.29 The time overlap among these five datasets is 1960 to 2003, for a total of 43 growth-rate observations.

Table 4.3 presents summary growth-rate statistics from the two RBC-DICE market calibrations and the data. The growth rates of the economic variables in the model are roughly 3–6 times smaller than the corresponding growth rates in the data. The growth rate of upper-biosphere CO2 concentration is typically 2–3 times larger in the model than in the data. The growth rate of the upper-biosphere temperature is about 1.5 to 4 times smaller in the model than in the data.

Table 4.3 Growth rate statistics from the two calibrations and data (scale = 10^{−3}, N = 43)

| Variable | L calibration, mean | L calibration, median | H calibration, mean | H calibration, median | Data, mean | Data, median |
|---|---|---|---|---|---|---|
| $\tilde{Y}$ | 9.09 | 8.88 | 9.78 | 9.06 | 36.5 | 36.9 |
| $\tilde{c}$ | 5.65 | 5.58 | 6.33 | 5.56 | 17.6 | 17.7 |
| $\widetilde{IE}$ | 3.67 | 3.60 | 4.35 | 3.56 | 24.8 | 24.6 |
| $\tilde{M}^{UP}$ | 9.75 | 7.28 | 9.82 | 7.90 | 3.88 | 3.91 |
| $\tilde{T}^{UP}$ | 0.87 | 0.63 | 0.87 | 0.69 | 1.17 | 2.99 |

Table 4.4 provides the variance-covariance data upon which the spectral analysis is based. Because the disparate units of measure make these difficult to interpret, we also present the associated correlation matrix in Table 4.5. All covariances are positive, implying that their unconditional outcomes are positively related. Both calibrations moderately overstate the correlation between industrial emissions and output, suggesting that there may be some other process besides Hicks-neutral productivity shocks driving industrial emissions. The low-risk calibration significantly understates the correlation between output and upper-biosphere concentration, while the high-risk calibration significantly overstates this link. Both calibrations overstate the correlation between output and the upper-biosphere temperature, with the high-risk calibration performing significantly worse in this regard. Both calibrations significantly understate the correlation between industrial emissions and the upper-biosphere concentration, suggesting that the linear CO2 circulation model does not very accurately capture CO2 flow from the time it is emitted to the time it reaches the upper-biosphere layer. But, interestingly, both models only slightly overstate the correlation between emissions and temperature. Finally, both models significantly overstate the correlation between concentration and temperature; this could be caused by a forcing relationship that is too strong, or again by the fact that CO2 does not flow to the upper atmosphere in the manner described by the model. Spectral analysis can reveal more sophisticated cyclical properties of these covariance data. 
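Denote the joint time series of growth-rate data as $\tilde{z}_t = [\tilde{Y}_t \;\; \tilde{c}_t \;\; \widetilde{IE}_t \;\; \tilde{M}_t^{UP} \;\; \tilde{T}_t^{UP}]'$, and denote the jth autocovariance matrix of $\tilde{z}_t$ as $\Gamma_j = E\{[\tilde{z}_t - E(\tilde{z})][\tilde{z}_{t-j} - E(\tilde{z})]'\}$. The population spectrum of $\tilde{z}_t$ at the (angular) frequency ω is defined as:

$$s_{\tilde{z}}(\omega) = \frac{1}{2\pi} \sum_{j=-\infty}^{+\infty} \exp(-i\omega j)\, \Gamma_j.$$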

Table 4.4 Variance–covariance matrix of growth rates (L calibration, H calibration, data) for the two calibrations and data (scale = 10^{−5}, N = 43)

| Variable | 1. | 2. | 3. | 4. | 5. |
|---|---|---|---|---|---|
| 1. $\tilde{Y}$ | 30.0, 176, 157 | | | | |
| 2. $\tilde{c}$ | 25.3, 170, 76.4 | 23.2, 167, 39.1 | | | |
| 3. $\widetilde{IE}$ | 23.4, 168, 119 | 22.0, 166, 57.6 | 21.2, 164, 12.2 | | |
| 4. $\tilde{M}^{UP}$ | 10.9, 12.2, 13.7 | 5.74, 7.00, 0.66 | 3.63, 4.84, 0.92 | 12.8, 14.2, 0.17 | |
| 5. $\tilde{T}^{UP}$ | 0.96, 1.08, 0.43 | 0.51, 0.62, 0.21 | 0.32, 0.43, 0.35 | 1.13, 0.12, 0.12 | 0.10, 0.10, 12.8 |





Table 4.5 Correlation matrix of growth rates (L calibration, H calibration, data) for the two calibrations and data (N = 43)

| Variable | 1. | 2. | 3. | 4. | 5. |
|---|---|---|---|---|---|
| 1. $\tilde{Y}$ | 1 | | | | |
| 2. $\tilde{c}$ | 0.95, 0.99, 0.97 | 1 | | | |
| 3. $\widetilde{IE}$ | 0.92, 0.98, 0.86 | 0.99, 0.99, 0.83 | 1 | | |
| 4. $\tilde{M}^{UP}$ | 0.55, 0.24, 0.82 | 0.33, 0.14, 0.79 | 0.21, 0.09, 0.63 | 1 | |
| 5. $\tilde{T}^{UP}$ | 0.55, 0.24, 0.09 | 0.33, 0.14, 0.09 | 0.22, 0.10, 0.08 | 0.99, 0.99, 0.25 | 1 |

The spectral matrix $s_{\tilde{z}}(\omega)$ integrated over all frequencies ω yields the population variance-covariance matrix. The diagonal elements in the spectral matrix are the spectra of individual variables, and the off-diagonal elements are the cross-spectra of pairs of variables. The spectra must be strictly positive (because the variance is always positive), while the cross-spectra can be either positive or negative (because the covariance can be either positive or negative). A diagonal element of $s_{\tilde{z}}(\omega)$ represents the fraction of the unconditional variance in a particular variable that is attributable to cyclical effects that occur with frequency ω. The real part of an off-diagonal element of $s_{\tilde{z}}(\omega)$ (called the co-spectrum) represents the fraction of the unconditional covariance between two variables that is attributable to cyclical effects that occur with frequency ω. Because the time resolution of our data is a year and our dataset has 43 years of observations, the highest frequency that corresponds to an empirically-observable cyclicality is ω = π (a two-year cycle), and the lowest frequency is between ω = π/16 and ω = π/32. Frequencies outside these bounds involve estimates that are off the support of the data in some sense. For convenience, a translation between frequencies and cycle durations is provided in Table 4.6.30

Table 4.6 Conversion between angular frequency and periodicity for the time resolution of our data. P = 2π/ω

| Angular frequency (ω) | Periodicity (P) |
|---|---|
| 0 | ∞ |
| π/16 | 32 years/cycle |
| π/8 | 16 years/cycle |
| π/4 | 8 years/cycle |
| π/2 | 4 years/cycle |
| π | 2 years/cycle |

We use the nonparametric kernel method described in Hamilton (1994), Section 10.4, to estimate $s_{\tilde{z}}(\omega)$. Given a truncation lag J(ω) ≥ 1 and an estimate $\hat{\Gamma}_k$ of the kth autocovariance matrix, the estimator is a weighted


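sum of autocovariances, where the weights are given by the Bartlett kernel:

$$\hat{s}_{\tilde{z}}(\omega) = \frac{1}{2\pi} \left[ \hat{\Gamma}_0 + \sum_{j=1}^{J(\omega)} \left(1 - \frac{j}{J(\omega) + 1}\right) \left( \exp(-i\omega j)\, \hat{\Gamma}_j + \exp(i\omega j)\, \hat{\Gamma}_j' \right) \right].$$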
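A scalar version of the Bartlett-kernel estimator can be sketched as follows. This is a simplified illustration, not the chapter's code: it treats a single series rather than the five-variable system, so the pair of complex exponentials collapses to a cosine, and the input is a hypothetical series with a four-year cycle rather than the growth-rate data. The function names and the 64-point frequency grid are assumptions.

```python
import math

def autocovariance(x, j):
    """Sample autocovariance of a scalar series at lag j."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t - j] - mean) for t in range(j, n)) / n

def bartlett_spectrum(x, omega, lag):
    """Bartlett-kernel estimate of the spectrum of a scalar series at omega."""
    total = autocovariance(x, 0)
    for j in range(1, lag + 1):
        weight = 1.0 - j / (lag + 1.0)
        # For a scalar series, exp(-i w j) + exp(i w j) equals 2 cos(w j).
        total += weight * 2.0 * math.cos(omega * j) * autocovariance(x, j)
    return total / (2.0 * math.pi)

# Hypothetical growth-rate series with a four-year cycle, 43 observations.
series = [0.02 + 0.01 * math.cos(math.pi / 2.0 * t) for t in range(43)]
grid = [-math.pi + 2.0 * math.pi * k / 64 for k in range(64)]
density = [bartlett_spectrum(series, w, lag=16) for w in grid]
variance_check = sum(density) * (2.0 * math.pi / 64)
```

Summing the estimate over a full period of frequencies recovers the sample variance, which mirrors the property that the spectrum integrates to the variance; the estimate also peaks near ω = π/2, the frequency of the built-in four-year cycle.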

Hashimzade and Vogelsang (2007) investigate the bias-variance tradeoff of this estimator when the truncation lag J(ω) is determined by a constant fraction of the series length. These authors find that fractions between 0.04 and 0.4 appear to work well for series with about as many observations as ours. As with all kernel methods, selecting a constant involves balancing the possibilities of oversmoothing (thereby missing important data features) and undersmoothing (thereby uncovering spurious small-sample data features). The choice is quite specific to the dataset under consideration, and general rules for making this decision are not usually available. We decide to err on the side of potential undersmoothing, choosing the fraction 0.4 so that J(ω) = 16 for all estimates of the spectral density. As a result, we do not focus very much on small “wiggles” in the spectrum during our discussion because they may be indicative of spurious data features.

4.1 Market predictions

We begin the analysis of the market problem by examining the numerical error attending the solution methodology. Figure 4.1 provides the error corresponding to the first-order condition (4.3) in the L and H calibrations. On average, it appears that an accuracy of 10^{−2} to 10^{−3} is achieved for this condition. The H calibration is more nonlinear, and it appears that approximating its policy function is indeed slightly more difficult than approximating the policy function of the L calibration (given a fixed network size). These results are quite encouraging and give us confidence that the remaining analysis in the section will not be overshadowed by numerical error.

Figure 4.1 Error in the first-order condition (4.3) for the L (left) and H (right) calibrations.




Figure 4.2 provides a side-by-side comparison of output Yt, per-capita consumption ct, and industrial emissions IEt under the two calibrations. Several salient features arise from these plots. First, the lognormal shape of the technology shock is imputed to the economic equilibrium outcomes, a characteristic that has been previously observed in RBC models. In addition, increasing the risk appears to generate a median-preserving spread in these variables. (This spread is certainly not mean-preserving, due to the lognormal shape of the distribution.) The 50th percentile of outcomes is roughly the same in both cases, but the 10th and 90th percentiles spread out substantially in the H calibration. Most of this increase is in the direction of higher economic activity, but some spread can occur in the direction of lower economic activity as well. Indeed, the 10th percentiles of consumption and emissions

Figure 4.2 Economic outcomes under the L (left) and H (right) calibrations. Top to bottom: output, per-capita consumption, industrial emissions.




are actually slightly below the starting point in the H calibration, while these grow monotonically in the L calibration. Interestingly, the H calibration is ultimately more “risky” even though consumers have higher risk aversion, implying that the effects of higher-magnitude shocks overwhelm any hedging effects that might arise from risk aversion. Figure 4.3 presents similar side-by-side plots of the climate variables of interest: atmospheric CO2 concentration M_t^AT, upper-biosphere warming T_t^UP, and percentage of output foregone 1 − Ωt. The lognormality characteristic appears to be propagated to the climate outcomes, but in a more varied manner. The atmospheric concentration and percentage of output foregone are visibly skewed about the median, but the temperature warming appears to be almost symmetric about it. As might be expected, the large increase in the

Figure 4.3 Climate outcomes under the L (left) and H (right) calibrations. Top to bottom: atmospheric CO2 concentration, upper-biosphere temperature warming, percentage GDP loss from climate externality.




variance of industrial emissions in the H calibration leads to a very large spread in atmospheric concentrations. However, the change in the spread of temperature outcomes does not appear to be nearly so drastic. Both calibrations suggest a median warming of just over 2.5°C by period 200, but the 10th and 90th percentiles of temperature are only 1°C away from the median in the H calibration and 0.5°C away in the L calibration.31 However, it is important to note that this small difference in temperature spreads does yield larger damage spreads, for the yearly GDP percentage loss is noticeably more variable in the H calibration.

Figure 4.4 presents the spectral analysis of empirical data versus the outcome simulations from the L calibration. The general cyclical patterns of output and consumption are matched quite well. The behavior of the model’s output at very low frequencies is slightly too high (implying that these cycles do not dampen fast enough), but there is better fit at the higher frequencies. Interestingly, the data spectrum exhibits a slight bulge at about the US business cycle frequency. The model’s consumption spectrum is at about the right level, but it possesses slightly more cyclicality at higher frequencies than the data. The emissions data spectrum is quite interesting: there is some strong cyclicality at low frequencies which initially diminishes with the frequency, but two additional bursts of activity occur at about 3- and 8-year cycles. The model emissions spectrum is simply unenergetic at all frequencies and does not replicate this pattern. The upper-biosphere concentration data is another area that does not match well. There is a good deal of low-frequency cyclical behavior in the model, but the data exhibit nearly no cyclical variation at all. This suggests that the carbon sinks in RBC-DICE are too volatile.
Finally, the temperature exhibits another interesting pattern: most of the variability is caused by several distinct short-term cycles of about 2, 3, and 8 years’ length. The model’s temperature spectrum is not energetic at all. In a similar vein, Figure 4.5 presents the co-spectra of pairs of outcomes in the L calibration. The model tends to understate the low-frequency cyclicalities present in the relationship between output and emissions. It tends to overstate these cyclicalities in the relationship between output and concentration, although the medium- and high-frequency cyclicalities are matched well. The relationship between output and temperature does not match well at any frequency, the model being much less energetic than the data. The model and the data match relatively well on the relationship between emissions and concentration. The relationship between actual emissions and temperature is quite complex, having a multi-peaked structure with bursts at about 4- and 8-year cycles. The model co-spectrum is again fairly unenergetic. The relationship between concentration and temperature is essentially the same across frequencies in both the model and the data, although the data tend to be slightly more energetic at medium frequencies. Because the climate and economic outcomes in the H calibration are more variable, one might expect that they will generally exhibit more energetic spectra. However, it is not immediately clear whether this extra energy will fill

Figure 4.4 Spectral analysis of economic and climate growth rates in the L calibration. From top left: output, per-capita consumption, emissions, upper-biosphere concentration, upper-biosphere temperature.


Figure 4.5 Cross-spectral analysis of economic and climate growth rates in the L calibration. From top left: output/emissions, output/concentration, output/temperature, emissions/concentration, emissions/temperature, concentration/temperature.


Figure 4.6 Spectral analysis of economic and climate growth rates in the H calibration. From top left: output, per-capita consumption, emissions, upper-biosphere concentration, upper-biosphere temperature.





the specific voids found in the L spectra. Figure 4.6 presents the spectral analysis of the five outcome variables. There is substantially more energy at all frequencies compared to the previous calibration, but for some variables, the cyclical behavior still diminishes relatively quickly. This new shock size is now probably too large relative to that implied by the data spectrum. Indeed, the output and consumption spectra are now uniformly outside the 10th and 90th outcome percentiles described by the model. The emissions spectrum starts inside the confidence interval at low frequencies but quickly moves below it. The concentration spectrum was already far removed from the confidence interval in the previous calibration, and this change made the situation even worse at low frequencies. Interestingly, the change did relatively little for the temperature spectrum, the lone case in which the data contain significantly more energy than the model. The same pattern of increased spread at low frequencies is found in the co-spectra presented in Figure 4.7. The relationship between output and emissions is now significantly off the scale of the graph, indicating far too strong a relationship. The relationship between output and concentration, however, now lies within the confidence interval of the model. The relationship between output and temperature did not change very much and is still relatively poor. The relationship between emissions and concentration still remains relatively good. The relationship between emissions and temperature has already been found to be quite distinctive, and the comparison of model to data did not improve with this simple change in the shock size. The relationship between concentration and temperature did not change very much from the original. The hypothesis being informally tested in the above analysis is whether the data spectrum is a realization of the model spectrum.
If the model performs well, the data spectrum should lie predominantly within the 10th and 90th percentiles described by the model. Instead, we find that the data spectrum often strays from this interval, and that the spectra sometimes contain bursts of energy that are not captured by the model. This result raises the question of the relative importance of the various segments of the spectrum. For example, we might not be too concerned with matching the high-frequency behavior of the model if the majority of the relevant climate and economic impacts involve low-frequency effects. Indeed, the data spectra in Figure 4.4 seem to support this conclusion, as most of the energy is contained in low-frequency cyclicalities. (The one exception is upper-biosphere temperature, which is dominated by medium- and high-frequency effects.) In the climate-economy context, if one is faced with a tradeoff between matching low- and high-frequency characteristics, it is probably best to focus upon the low-frequency ones. Indeed, climate policy is mainly concerned with mitigating a long-run effect rather than the short-term variabilities in that effect, so having a good low-frequency model is a preferable first step.
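The informal check described above (whether the data spectrum lies within the 10th and 90th percentiles of the simulated model spectra) can be sketched in Python as follows. The function name `band_coverage`, the array layout (one simulated spectrum per row), and the lognormal placeholder draws are hypothetical choices made for illustration only; they are not the chapter's simulation output.

```python
import numpy as np

def band_coverage(model_spectra, data_spectrum, lo=10, hi=90):
    """Share of frequencies at which the data spectrum lies inside the
    [lo, hi] percentile band of the simulated model spectra."""
    model_spectra = np.asarray(model_spectra, dtype=float)   # shape (n_sims, n_freqs)
    data_spectrum = np.asarray(data_spectrum, dtype=float)   # shape (n_freqs,)
    lower = np.percentile(model_spectra, lo, axis=0)
    upper = np.percentile(model_spectra, hi, axis=0)
    inside = (data_spectrum >= lower) & (data_spectrum <= upper)
    return inside.mean(), inside

# Placeholder "model spectra": 1,000 simulations at 5 frequencies (illustrative only).
rng = np.random.default_rng(0)
model_spectra = rng.lognormal(size=(1000, 5))

# A data spectrum equal to the model median lies inside the band everywhere,
# while one above every simulated value lies outside it everywhere.
data_inside = np.median(model_spectra, axis=0)
data_outside = model_spectra.max(axis=0) + 1.0

share_in, _ = band_coverage(model_spectra, data_inside)
share_out, _ = band_coverage(model_spectra, data_outside)
print(share_in, share_out)
```

Restricting the comparison to a subset of columns (for example, only the low-frequency ones) would give a coverage share that reflects the priority on low-frequency fit discussed in the text.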

Figure 4.7 Cross-spectral analysis of economic and climate growth rates in the H calibration. From top left: output/emissions, output/concentration, output/temperature, emissions/concentration, emissions/temperature, concentration/temperature.





4.2 Social planner and policy predictions

Recall that the social planner’s problem involves approximating three policy variables and satisfying three equilibrium conditions simultaneously. Figure 4.8 presents the numerical errors in two of these first-order conditions: the stochastic-discount-factor pricing equation (4.3) and the feasibility condition (4.4). It is important to note that the scales of these equations are quite different: the former’s scale is 10⁰ = 1 (i.e., the left- and right-hand sides of that equation are about 1), while the latter’s is about 10⁴. Thus, we would not necessarily require the same error size from each equation in order to achieve accuracy. The error corresponding to equation (4.3) is about an order of magnitude larger here than in the market case; although it generally remains below 10⁻² in the L calibration, it ranges between 10⁻¹ and 10⁻² in the H calibration. In addition, it tends to rise over time, indicating that a systematic trend in the model is not being captured in the function approximation. A more worrisome characteristic can be seen in the error corresponding to equation (4.4), which ranges between 10³ and 10⁴. Because this equation governs the consumer’s budget, an error of this size indicates that the model’s efficiency constraint is not being met. These characteristics signal that the quality of the neural network approximation has deteriorated from the market case.

Figure 4.8 Error in the first-order conditions for the social planner’s problem in the L (left) and H (right) calibrations. Top to bottom: error in condition (4.3), error in condition (4.4).




To see how this numerical error propagates to the outcome variables, Figure 4.9 compares the trajectories of output Yt in the market and social planner’s problems under each calibration, and Figure 4.10 does the same for the temperature warming T_t^UP. The median trajectories of output and temperature are higher in the social planner’s problem (although they are less volatile), which is certainly counterintuitive. Thus, the numerical error in this case must generate a bias towards more economic activity than the theory predicts. A more concrete way to ascertain the welfare implications of this error is to count the frequency of spurious welfare reversals between the market and social planner’s problems. Because the consumer must be better off in the latter case, we can compare the consumer’s 200-period utility stream in each case and check if this actually occurs in the simulations.32 The agent’s total utility is actually higher in 650/1,000 simulations of the L calibration, but only in 468/1,000 simulations of the H calibration. There are thus a significant number of welfare reversals (35 percent and 53 percent of simulations, respectively), and so the predictions generated by this solution are of questionable usability. Due to the disparity in the numerical accuracies of the market and social planner’s solutions, comparing trajectories from these two problems will be of limited value for understanding the external effects implied by the model. However, we can still recover the optimal (albeit inaccurate) CO2 taxes. Figure 4.11 presents the total foregone output (1 − Ωt) Yt per capita in the

Figure 4.9 Output under the market (top) and social planner’s (bottom) solutions, L (left) and H (right) calibrations.




Figure 4.10 Upper-biosphere temperature warming under the market (top) and social planner’s (bottom) solutions, L (left) and H (right) calibrations.

Figure 4.11 Welfare effects under the L (left) and H (right) calibrations. Top to bottom: foregone output, optimal CO2 tax.




market case for the L and H calibrations and the corresponding CO2 tax. Total foregone output provides a more substantive view of the welfare implications of the climate externality than does just the externality percentage 1 − Ω presented earlier. (Of course, the true size of the externality in this model is the utility foregone in perpetuity from current emissions, but this is relatively difficult to calculate.) Again, higher variability in economic activity implies a more variable tax rate, but the rate appears to lie predominantly below USD 10 per tonne (in real 1990 dollars). Importantly, it appears that the tax fluctuates within a fairly narrow range, so it does not seem that drastic policy changes are needed from year to year once a baseline plan has been set. Of course, the inaccuracy of the solution probably implies that this tax rate is too low, and these results should be interpreted with this important caveat.

4.3 The marginal effect of scenario uncertainty

Nordhaus (2008) provides an update to DICE-99 that refines many of its parameters. One such parameter is η, the temperature sensitivity of the climate to a doubling of CO2 concentrations, which is recalibrated from η = 4.1 to η = 3.8. This movement possibly reflects the resolution of some scientific uncertainty about the radiative-forcing process. One question that arises is whether this change generates an appreciable difference in predictions when structural uncertainty is present. Of course, this ceteris paribus exercise does not address the question of whether a complete recalibration of DICE significantly affects predictions. However, because the precise magnitude of this parameter has been under discussion, it seems natural to examine the implications of both values in RBC-DICE. The base case for this analysis is the L calibration, and we will informally examine differences in differences between this base and changes in structural and scenario uncertainty.
We have already seen that a structural movement from L to H yields substantially more variability, because the risk-aversion effect is swamped by the risk effect. The second comparison involves a calibration with lower climate sensitivity (hereafter, the “LS” calibration). The change from L to LS should yield less damage, but it is difficult to tell in advance whether the effects will be noticeable or swamped by the structural fluctuations. Figure 4.12 provides the differences between L/H outcomes and between L/LS outcomes for the economic variables of interest.33 For the L/H comparison, the differences are predominantly negative, illustrating that the H calibration does indeed involve more economic activity. The salient feature reflected in these plots is that the effect of changing the scenario uncertainty is negligible compared to the effect of changing the structural uncertainty. Indeed, using the common scale for both comparisons, it is difficult to identify the direction in which economic outcomes change in the L/LS comparison. Figure 4.13 presents similar difference plots for the climate variables of interest. It is possible to see a small change in the L and LS calibrations:




Figure 4.12 Differences in economic outcomes for L vs. H (left) and L vs. LS (right) calibrations. Top to bottom: output, per-capita consumption, industrial emissions.

the difference is positive, illustrating intuitively that the temperature is higher when the climate sensitivity is higher. However, the differences here are again insignificant compared to the L/H differences. Changing the sensitivity parameter generates a very small change in foregone output, while changing the structural uncertainty generates potentially large swings in foregone output. Thus, the effect of refining the forcing parameter is entirely lost in the structural uncertainty of RBC-DICE.
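The mechanics of this comparison can be sketched in Python: take the simulated trajectories under the base calibration, subtract the trajectories under an alternative calibration, and summarize the differences by their percentiles in each period. Everything in the block below is a toy stand-in, including the function names, the random-walk "trajectories", the shock sizes, and the assumption that all calibrations are driven by the same shock draws; none of it is RBC-DICE output.

```python
import numpy as np

def difference_bands(base, alt, percentiles=(10, 50, 90)):
    """Percentiles (per period) of the pathwise difference base - alt."""
    diff = np.asarray(base, dtype=float) - np.asarray(alt, dtype=float)
    return np.percentile(diff, percentiles, axis=0)   # shape (3, n_periods)

def max_spread(bands):
    """Largest gap between the top and bottom percentile over all periods."""
    return float((bands[-1] - bands[0]).max())

# Toy stand-ins for simulated outcome trajectories (NOT model output).
rng = np.random.default_rng(1)
n_sims, n_periods = 1000, 200
shocks = rng.standard_normal((n_sims, n_periods))

base = np.cumsum(0.01 * shocks, axis=1)             # stand-in for the L calibration
alt_structural = np.cumsum(0.03 * shocks, axis=1)   # stand-in for H: larger shocks
alt_scenario = base - 0.001                          # stand-in for LS: small parameter shift

bands_structural = difference_bands(base, alt_structural)
bands_scenario = difference_bands(base, alt_scenario)

print("max spread, structural change:", max_spread(bands_structural))
print("max spread, scenario change:  ", max_spread(bands_scenario))
```

Plotting both sets of bands on a common vertical scale, as the difference figures do, is what makes the scenario change look negligible next to the structural one.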

5 Discussion

In this chapter, we have presented a new methodology for solving dynamic, stochastic general equilibrium models with a climate externality. The




Figure 4.13 Differences in climate outcomes for L vs. H (left) and L vs. LS (right) calibrations. From top to bottom: atmospheric concentration, upper-biosphere temperature warming, per-capita foregone output loss from climate externality.

methodology is fairly robust for solving the market problem, but the social planner’s problem still poses a modeling challenge. We find that adding structural uncertainty to an integrated climate-economy model provides significantly more information about the range and probabilities of future outcomes. Some of the climate outcomes in the RBC-DICE calibrations are low-probability, very adverse events, but this fact is lost if structural uncertainty is not present. Also, as a corollary, summarizing stochastic trajectories using only a measure of central tendency (such as the mean) greatly obfuscates the true range of outcome possibilities. Our analysis here shows




that although the median outcome trajectory does not change very much by adding structural uncertainty, the likelihoods of the various outcomes (i.e. the spread of the stochastic trajectory) can change significantly under different calibrations. With our low-risk/low-risk-aversion and high-risk/high-risk-aversion calibrations, we find that the latter case generates outcomes that are much more variable, even though the risk aversion is higher. Because the risk aversion coefficient used in that case is around the upper bound of those typically used in macroeconomic calibrations, it is likely that risk aversion plays a lesser role in RBC-DICE than does the overall riskiness. The optimal carbon-tax trajectory we find is relatively flat and varies only within a few dollars’ range, but this is again affected by the size of the fluctuations. (And, of course, this result is accompanied by the aforementioned caveat about the numerical inaccuracy attending that outcome.) When we compare a change in structural uncertainty to a change in a particular type of scenario uncertainty (both changes being within reasonable limits found in the respective literatures), we find that structural uncertainty has a much greater influence upon predictions than does scenario uncertainty. This could suggest that the policy implications of structural uncertainty are larger in the climate-economy context than the uncertainties from unknown climate parameters. At the very least, structural fluctuations have not played a very large role in the policy discussion to date, and these results suggest taking a closer look at their implications.

From a methodological perspective, the main outstanding question from this chapter is why the solution to the social planner’s problem generated spurious welfare reversals. This fact certainly limits our ability to perform policy experiments with RBC-DICE.
One answer could be related to our limited computing resources: because the social planner’s solution is orders of magnitude more intensive than the market solution, we had to be relatively more parsimonious (in terms of computational time) when optimizing the former. In practice, this implied loosening several convergence criteria. It is possible that the annealing phase of the optimization was stopped too soon, so that the other two gradient-based methods eventually settled on a local optimum. Alternatively, it could be that the nice theoretical error properties of our approximation methods simply cannot be achieved with the neural network and lattice sizes we chose. Indeed, the social planner’s problem requires approximating three policy functions while the market problem only requires one approximation, and it is certainly possible that these sizes need to be increased for the former case. If this is the underlying cause, then it may have been very difficult to optimize the neural network due to a large amount of residual numerical error. However, we conjecture based on the existing literature that a linear approximation would have performed even worse. In any case, this result indicates that verifying the numerical accuracy of the solution is an important step when working with this type of model, as it is in the RBC context.




In future work with RBC-DICE, assuming that more than one type of shock is generating fluctuations in the data is more likely to yield a satisfactory correspondence between the model and empirical spectra. For example, the similarity of the model’s output and emissions spectra indicates that the productivity shock is mapping directly into an emissions shock, but the empirical data do not reflect such a strong link. There is probably a different process governing the evolution of carbon intensity that could be more appropriately modeled with a separate shock. Similarly, the productivity shock is mapping indirectly into a weak temperature shock, but the temperature data reflect a much different shock structure. There is probably some physical process driving the cycles in temperature that may be more appropriately modeled as an exogenous shock. Of course, simply adding structural uncertainty everywhere there is an observed mismatch between the model’s prediction and observed data is not appropriate, either. The purpose of the structural shock is to replace a structural component of the model (such as productivity) that appears random from the perspective of the model’s agents. Some aspects of the climate–economy structure are indeed unknown to agents, and well-specified structural shocks can help to generate better predictions given this limited information. Other aspects of the climate–economy interaction may ultimately require more accurate functional representations in order to generate results that match empirical data.

One notable result from our simulations is that trajectories often involve exceeding the 1,000 btc threshold of atmospheric CO2 concentration with nontrivial likelihood, even in the social planner’s case. It has been suggested that a dangerous anthropogenic interference may commence if the concentration rises to about this level.
This outcome occurs here because the Nordhaus and Boyer (2000) damage function is not particularly “catastrophic” around this point, and so there is no way for the model’s agents (including the social planner) to hedge against an unmodeled catastrophe. This underscores a more general point about specification error and welfare in structural models: if there are critical points, it is essential to explicitly model them (as in Mastrandrea and Schneider 2001) to ensure that the associated risks are endogenously addressed. Otherwise, the model’s agents may (quite rationally) choose to pass over these points with significant probability. For the same reasons, our CO2 tax trajectory is different from many in Nordhaus and Boyer (2000), e.g. their Figures 7.2 and 7.3. Their socially optimal trajectory is quite flat like ours, but others involve significantly larger tax rates and exhibit higher acceleration in the rate over time. These alternative scenarios are based on optimizing a climate criterion and not the model’s social welfare function. For example, it is relatively easy to exceed 2.5°C of warming in RBC-DICE. If staying below this threshold is important, then a much larger CO2 tax will be required than what is predicted by equation (16). However, this higher tax rate cannot maximize the total social surplus of the RBC-DICE model: it will be too large. From an economic perspective, reporting this higher tax rate is unsatisfying (although it may be intuitive)




because it does not arise from the maximization of the social welfare function. It is preferable to merge these two notions of optimality by incorporating into the model itself the fact that exceeding 2.5°C is dangerous (e.g. via the damage function), in which case the model’s social welfare function will reflect the intuition from climate science. One benefit of adding structural uncertainty is that the model possesses a nontrivial likelihood function. Thus, in principle, it is possible to estimate the RBC-DICE parameters from data rather than simply relying on calibrations. This is an intriguing possibility, because if damages are already occurring, RBC-DICE provides a means of identifying them statistically. However, the likelihood function of the RBC-DICE model does not have a closed form, and so it must be approximated. One way to quickly obtain an approximation is to linearize the likelihood function, but we are reluctant to pursue that course due to the numerical accuracy literature reviewed in Section 3. One promising methodology to approximate the likelihood function in macroeconomic models is the nonlinear sequential Monte Carlo algorithm described in Fernández-Villaverde et al. (2006). However, because this method is simulation-based, it yet further increases the computational effort attending this model. Although we have not undertaken very sophisticated policy experiments with RBC-DICE, our model can potentially be a useful analytical tool in the “prices versus quantities” debate attending climate-policy instruments. The classic paper by Weitzman (1974) lays out the intuition for preferring one type of instrument to another when the marginal damage and marginal cost functions are stochastic. The argument involves comparing these functions’ slopes to the magnitude of the uncertainty in each. 
One difficulty in extending that analysis to general equilibrium is that the uncertainty often does not enter neatly into either the marginal damage or the marginal cost function; rather, its effects are propagated nonlinearly via the equilibrium conditions. In addition, these marginal functions may not be available in closed form. The effect of a small increase in uncertainty upon the marginal damage and marginal cost functions is not clear from just examining the analytical solution in that case. This difficulty could, in principle, be addressed through computational experiments using RBC-DICE. Like Pizer (1999), we find that “uncertainty does matter” in the climate change context. We have shown that risk and its consequences for economic decision making play an important role in forecasting future states of the climate and economy. Models with no structural risk generate an incomplete picture of the future uncertainties, while our RBC-DICE model with structural uncertainty clearly illustrates the difficulty in assessing future outcomes. Since some uncertainty will probably always be inherent in our world, policy analysis of climate change should incorporate the nontrivial implications of these structural risks.




Acknowledgments

Several of our colleagues made invaluable contributions to this paper. Katherine Holcomb at Research Computing at the University of Virginia provided excellent advice regarding our computational strategy. David Zirkle and Ina Clark lent very helpful research assistance. Participants at the International Energy Workshop 2007 gave helpful and encouraging comments on an earlier draft of this paper, as did seminar participants at the Centre for Energy and Environmental Markets. This research was supported in part by the National Science Foundation through TeraGrid resources provided by the Texas Advanced Computing Center at the University of Texas at Austin (TeraGrid DAC award TG-SES070008T). TeraGrid systems are hosted by Indiana University, LONI, NCAR, NCSA, NICS, ORNL, PSC, Purdue University, SDSC, TACC and UC/ANL.

Notes

1 See, for example, Schneider (2002), Schneider and Kuntz-Duriseti (2002), Kinzig and Starrett (2003), and Congressional Budget Office (2005).

2 For example, Holt and Laury (2002) show that risk aversion affects choices between “safe” and “risky” lotteries in high-stakes laboratory experiments. Meyer and Meyer (2006) provide a compendium of the evidence on risk aversion in macroeconomics.

3 Kremer et al. (2006) provide a survey of RBC applications in policy contexts.

4 Because these inaccuracies are due to an interaction between model-specific characteristics and the numerical algorithms used in the solution, this issue pertains broadly to computed equilibrium models, not only to RBC-DICE.

5 After this chapter was accepted for publication, an update to DICE-99 was published in Nordhaus (2008). In that revision, the DICE model incorporates power utility (over consumption only) and a non-negative damage function, which are two of the modifications we made for RBC-DICE.

6 The notation is slightly complicated due to the population-growth effect. Because the time-t per-capita capital stock is defined as kt = Kt/Pt for all t, it is necessary to properly scale the consumer’s share of future capital stocks to account for population growth. For example, when the investment decision is made in period t, the consumer’s share of the time-(t + 1) capital stock is Kt+1/Pt ≠ kt+1. But this share becomes Kt+1/Pt+1 = kt+1 upon entering period t + 1 because of population growth. Thus, from the perspective of period t, the relevant decision quantity for the next period’s capital stock is Kt+1/Pt = (Pt+1/Pt)kt+1.

7 An additional technical condition is necessary to close the model. An optimal plan {c_t, l_t, k_{t+1}}_{t=0}^∞ exists if the transversality condition

\[
\lim_{t\to\infty} E\left[\frac{R_t}{R_0}\, U_1(c_t, h_t)\, k_{t+1}\right] = 0
\]

is met. (The proof in the Brock–Mirman context can be found in Mirman and Zilcha, 1975). This condition precludes a trajectory of pure investment without consumption in the time limit, implying that consumption must have some value in each period. With this condition satisfied, the first-order conditions are indeed sufficient to define a general equilibrium. This is a difficult property to establish in advance, but we do observe it in simulations of the solved model.



8 Analogous results for the wage (4.10) can be found by replacing ∂Kt with ∂Lt everywhere in Table 4.2.

9 Substituting these factor demands into the profit equation in (4.11) yields:

\[
E_0\left[\sum_{t=0}^{\infty} R_t\,(1 - \tau_t \sigma_t)\left(Y_t - \frac{\partial Y_t}{\partial K_t} K_t - \frac{\partial Y_t}{\partial L_t} L_t\right)\right].
\]

Note that the last term in parentheses is zero because the production technology exhibits constant returns to scale, and so a zero-profit equilibrium is present. This CO2 tax is therefore feasible.

10 The interested reader is directed to Miranda and Fackler (2002) for a discussion of many of the computational issues arising in economic models.

11 The DICE-99 model is solved by directly maximizing a simulated utility stream subject to the constraints in Table 4.1 using a constrained-optimization algorithm. The first-order optimality conditions governing equilibrium are therefore implicit in the DICE-99 solution but are not used explicitly. We prefer to work with these equations directly because it is easier to pinpoint the source of an inaccuracy when the output of individual equilibrium conditions is observable.

12 Kim and Kim (2003) specify a model with a welfare prediction that can be easily determined analytically: total welfare under autarky and trade in a no-friction, two-economy setting. By definition, welfare should be higher under trade. The authors find that the log-linear approximation generates a result that is directionally incorrect, but moving to a second-order approximation generates the directionally-correct result. This outcome leads the authors to conclude that welfare estimates from stochastic equilibrium models are suspicious unless they are based on approximations involving at least second-order effects. Methods for accurately estimating second-order effects are discussed in Jin and Judd (2002), Kim et al. (2003), and Schmitt-Grohé and Uribe (2004).

13 The performance of a linear approximation near the steady state could be degraded in two ways: by an increase in the nonlinearity of the underlying functions being approximated (e.g. by an increase in risk aversion), or by an increased probability of being sent far away from the steady state (e.g. by an increase in shock variance).
Pizer (1999) uses the reduced state vector $\{P_t, A_t, K_t, \sigma_t, M_t^{AT}, T_t^{UP}, T_t^{LO}\}$, thereby omitting the effects of time-varying parameters and future mixing of the carbon-dioxide system. Note that these other variables are relevant to the problem because $\tau_{t+1}$ must be computed at time $t$ (see condition 4.3), and this reduced vector does not contain all of the information needed to generate the requisite state $\Theta_{t+1}$ upon which $r_{t+1}$ depends. This is true a fortiori for the social planner’s problem, where variables dated $t+2$ and following are needed in order to compute the time-$t$ equilibrium.

Each layer also contains a “bias neuron” which takes no input and emits 1, so that the vector $x_{i-1}$ always includes a 1. The bias neuron (analogous to a constant) allows each neuron in a layer to have a different base sensitivity to the same inputs.

The neurons in the first layer can also be used to scale the arguments of f. We linearly rescale the state-variable inputs so that each falls between 0 and 1 under the majority of likely realizations. Placing all of the neuronal inputs into the same scale greatly aids computational stability of the algorithm. In addition, the output of the last layer often needs to be scaled back into the range of the likely outcomes. In this case, labor supply is exactly in the range of the neuronal output (i.e. between 0 and 1), but the values for consumption and capital are not.

For algorithmic stability, the same shocks are applied in each iteration.

A neural network is unique only up to a reordering of its neurons. Some way


of affixing each set of weights to a particular neuron is needed, and simulated annealing makes these assignments at random.

19 See, for example, the pricing of complicated financial assets by Ninomiya and Tezuka (1996) and Papageorgiou and Traub (1996).

20 The points thus match those in Table 5 of Sloan et al. (2002).

21 The computational time required to construct a lattice using the Sloan et al. (2002) algorithm is exponential in the dimension of integration. Thus, a 1-dimensional lattice takes only a few seconds to construct (in serial processing), while a 100-dimensional lattice requires a few hours. As a result, the lattice is stored and reused to minimize the computational start-up cost.

22 Because the sub-simulation step is at least two orders of magnitude larger in the social planner’s problem than in the market problem, the computational effort at this stage of the algorithm accounts for the majority of the differential in run times.

23 Recall that $\partial \Omega_{t+i} / \partial K_t = 0$ by definition if the upper-biosphere temperature does not reach the threshold level for damage in period t + i, a circumstance that occurs frequently in early periods. Because including these points in the estimate of the decay rate would induce a bias, the geometric series is actually anchored to the period t + i that exhibits the largest discounted marginal external effect, which may or may not be period t + 1. Also, depending on the numerical error in the finite-difference approximation of the derivative, the marginal effect at t + 100 may or may not be negative. (Positive effects occur very infrequently.) In this case, the residual approximation is anchored to the period nearest t + 100 that exhibits a negative marginal effect.

24 One might question the accuracy of imputing a US-sized structural shock to a model of the global economy. In a factor decomposition of international business cycles for 60 countries, Kose et al. (2003) find that about one third of the variance in US shocks can be explained by a global factor. Using US-specific values in RBC-DICE is thus not strictly accurate but is probably a serviceable approximation. The spectral analysis provides more insight into the accuracy implications of this choice.

25 The time series in levels are clearly not covariance-stationary and hence are not amenable to spectral analysis. The growth rates of these series generally behave in a much more covariance-stationary fashion.

26 Because the external climate effect in RBC-DICE is probably not present in the historical data we use, it is reasonable to ask whether such a comparative spectral analysis is significantly biased. For the climate externality to be confounding for the comparison, it needs to significantly alter the autocovariance structure of growth rates relative to the no-externality case. The spectra of simulated economic growth rates certainly appear to be consistent (in terms of shape) with other RBC results, and so we conjecture that the climate externality does not generate a bias that would be meaningful for our qualitative comparison.

27 Watson (1993) provides a formal goodness-of-fit statistic based on the difference between empirical and model spectra. In order to obtain a high enough sample size for this statistic to be meaningful here, we would need to recalibrate the model to the starting period of the empirical time series, a significant undertaking. Instead, we present only a visual assessment of the two spectra. In principle, however, we could carry out the Watson (1993) test with RBC-DICE.

28 The identifiers for the World Bank datasets are SP.POP.TOTL (population), NY.GDP.MKTP.KD (output), NE.CON.TOTL.KD (consumption), and EN.ATM.CO2E.KT (CO2 emissions). The Scripps Institution of Oceanography study data (CO2 concentration) can be found at http://cdiac.ornl.gov/trends/co2/sio-keel.htm. The NASA Surface Temperature Analysis study data (temperature) can be found at http://data.giss.nasa.gov/gistemp/tabledata/GLB.Ts.txt.
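The lattice-based quasi-Monte Carlo integration mentioned in the notes above can be sketched with a two-dimensional Fibonacci lattice, a standard rank-1 lattice rule. This is not the step-by-step construction of Sloan et al. (2002); the generating vector, the integrand, and the function names are assumptions made for illustration. The cache loosely mirrors the idea of storing a lattice once and reusing it.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fibonacci_lattice(k):
    """Return the 2-D rank-1 lattice with n = F_k points and generating vector (1, F_{k-1})."""
    if k < 3:
        raise ValueError("k must be at least 3")
    prev, curr = 1, 1  # F_1 and F_2
    for _ in range(k - 2):
        prev, curr = curr, prev + curr
    n, z = curr, prev
    # Integer arithmetic keeps the fractional parts exact before the final division.
    return tuple((i / n, (i * z % n) / n) for i in range(n))

def lattice_mean(f, points):
    """Equal-weight quadrature: the average of f over the lattice points."""
    return sum(f(x, y) for x, y in points) / len(points)

# Hypothetical test integrand: x * y, whose exact integral over the unit square is 0.25.
points = fibonacci_lattice(16)  # 987 points; built once, then served from the cache
estimate = lattice_mean(lambda x, y: x * y, points)
print(len(points), estimate)
```

A second call to fibonacci_lattice(16) returns the cached tuple rather than rebuilding it, which is the point of storing the lattice when construction is expensive.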




29 The month- and location-specific effects are typically at least two orders of magnitude smaller than the year effect, so there is really not much difference between removing these fixed effects and simply computing a grand mean of all observations.

30 Spectral analysis makes no statement about the causes of the cyclical behavior it reveals. However, some cyclicalities in economic and climate data can be related to a more or less obvious cause. For example, many US economic time series contain low-frequency (i.e. long period) components corresponding to the business cycle, which is approximately ω = π/2 in the time scale (yearly) of these data. Also, the climate system exhibits an El Niño/La Niña cycle, also corresponding to about ω = π/2.

31 The median temperature path is just slightly steeper than that found in the RICE model of Nordhaus and Boyer (2000), Figure 7.9.

32 Recall that the same shocks were used across simulations, so a direct simulation-by-simulation comparison is possible.

33 The numerical error of the LS solution is very similar to that of the L solution found in Figure 4.1, so any differences due to varying solution accuracies are minor.
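To make the frequency scale in note 30 concrete: with yearly data, an angular frequency ω corresponds to a period of 2π/ω years, so ω = π/2 is a four-year cycle. The sketch below computes a raw periodogram of growth rates for a synthetic series with a built-in four-year cycle. The series, the function names, and the unsmoothed periodogram are assumptions for illustration and are not the estimator used in the chapter.

```python
import cmath
import math

def growth_rates(levels):
    """Log growth rates of a series in levels."""
    return [math.log(b / a) for a, b in zip(levels, levels[1:])]

def periodogram(x):
    """Raw periodogram at the Fourier frequencies in (0, pi], as (omega, ordinate) pairs."""
    n = len(x)
    mean = sum(x) / n
    result = []
    for j in range(1, n // 2 + 1):
        omega = 2.0 * math.pi * j / n
        dft = sum((v - mean) * cmath.exp(-1j * omega * t) for t, v in enumerate(x))
        result.append((omega, abs(dft) ** 2 / (2.0 * math.pi * n)))
    return result

def period_from_frequency(omega):
    """Cycle length, in observations, implied by angular frequency omega."""
    return 2.0 * math.pi / omega

# Synthetic yearly series (hypothetical): 2% trend growth plus a four-year cycle.
levels = [math.exp(0.02 * t + 0.05 * math.sin(math.pi * t / 2.0)) for t in range(65)]
spectrum = periodogram(growth_rates(levels))
peak_omega = max(spectrum, key=lambda pair: pair[1])[0]
print(peak_omega, period_from_frequency(peak_omega))
```

Differencing the logged levels removes the trend, so the remaining variation in this synthetic example is concentrated at the cycle frequency.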

References
Aruoba, S. Borağan, Jesús Fernández-Villaverde, and Juan F. Rubio-Ramírez. 2006. Comparing solution methods for dynamic equilibrium economies. Journal of Economic Dynamics and Control, 30(12): 2,477–2,508.
Bostian, A.J. A. and Charles A. Holt. Price bubbles with discounting: A web-based classroom experiment. Journal of Economic Education, forthcoming.
Brock, William A. and Leonard J. Mirman. 1972. Optimal economic growth and uncertainty: The discounted case. Journal of Economic Theory, 4(3): 479–513.
Cass, David. 1965. Optimum growth in an aggregate model of capital accumulation. Review of Economic Studies, 32(3): 233–240.
Cooley, Thomas F. and Edward C. Prescott. 1995. Economic growth and business cycles. In Thomas F. Cooley, editor, Frontiers of Business Cycle Research, 1–38. Princeton University Press.
Duffy, John and Paul D. McNelis. 2001. Approximating and simulating the stochastic growth model: Parameterized expectations, neural networks, and the genetic algorithm. Journal of Economic Dynamics and Control, 25(9): 1,273–1,303.
Fernández-Villaverde, Jesús and Juan F. Rubio-Ramírez. 2005. Estimating dynamic equilibrium economies: Linear versus nonlinear likelihood. Journal of Applied Econometrics, 20(7): 891–910.
Fernández-Villaverde, Jesús, Juan F. Rubio-Ramírez, and Manuel S. Santos. 2006. Convergence properties of the likelihood of computed dynamic models. Econometrica, 74(1): 93–119.
Hamilton, James D. 1994. Time Series Analysis. Princeton University Press.
Hashimzade, Nigar and Timothy J. Vogelsang. 2007. Fixed-b asymptotic approximation of the sampling behavior of nonparametric spectral density estimators. Journal of Time Series Analysis, 29(1): 142–162.
Holt, Charles A. and Susan K. Laury. 2007. Risk aversion and incentive effects. American Economic Review, 92(5): 1,644–1,655.
Hornik, Kurt, Maxwell Stinchcombe, and Halbert White. 1989. Multilayer feedforward neural networks are universal approximators. Neural Networks, 2(5): 359–366.
Hornik, Kurt, Maxwell Stinchcombe, and Halbert White. 1990. Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks. Neural Networks, 3(5): 551–560.
Hornik, Kurt, Maxwell Stinchcombe, Halbert White, and Peter Auer. 1994. Degree of approximation results for feedforward networks approximating unknown mappings and their derivatives. Neural Computation, 6(6): 1,262–1,275.
Jin, He-Hui and Kenneth L. Judd. 2002. Perturbation methods for general dynamic stochastic models. Working paper, Stanford University.
Joe, Stephen and Frances Y. Kuo. 2003. Remark on Algorithm 659: Implementing Sobol’s quasirandom sequence generator. ACM Transactions on Mathematical Software, 29(1): 49–57.
Kim, Jinill and Sunghyun Henry Kim. 2003. Spurious welfare reversals in international business cycle models. Journal of International Economics, 60(2): 471–500.
Kim, Jinill, Sunghyun Henry Kim, Ernst Schaumburg, and Christopher A. Sims. 2003. Calculating and using second order accurate solutions of discrete time dynamic equilibrium models. Discussion paper 2003–61, Federal Reserve Board of Governors.
King, Robert G., Charles I. Plosser, and Sergio T. Rebelo. 2002. Production, growth, and business cycles: Technical appendix. Computational Economics, 20(2): 87–116.
Kinzig, Ann and David Starrett. 2003. Coping with uncertainty: A call for a new science-policy forum. Ambio, 32(5): 330–335.
Kirkpatrick, S., C. D. Gelatt, and M. P. Vecchi. 1983. Optimization by simulated annealing. Science, 220(4598): 671–680.
Koopmans, Tjalling C. 1965. On the concept of optimal economic growth. Pontificiae Academiae Scientiarum Scripta Varia, 28(1): 225–300.
Kose, M. Ayhan, Christopher Otrok, and Charles H. Whiteman. 2003. International business cycles: World, region, and country-specific factors. American Economic Review, 93(4): 1,216–1,239.
Kremer, Jana, Giovanni Lombardo, Leopold von Thadden, and Thomas Werner. 2006. Dynamic stochastic general equilibrium models as a tool for policy analysis. CESifo Economic Studies, 52(4): 640–665.
Kydland, Finn E. and Edward C. Prescott. 1982. Time to build and aggregate fluctuations. Econometrica, 50(6): 1,345–1,370.
Mastrandrea, Michael D. and Stephen H. Schneider. 2001. Integrated assessment of abrupt climate changes. Climate Policy, 1(4): 433–449.
Meyer, Donald J. and Jack Meyer. 2006. Measuring risk aversion. Foundations and Trends in Microeconomics, 2(2): 1–96.
Miranda, Mario J. and Paul L. Fackler. 2002. Applied Computational Economics and Finance. MIT Press.
Mirman, Leonard J. and Itzhak Zilcha. 1975. On optimal growth under uncertainty. Journal of Economic Theory, 11(3): 329–339.
Moré, Jorge J. and Danny C. Sorensen. 1983. Computing a trust region step. SIAM Journal on Scientific and Statistical Computing, 4(3): 553–572.
Moré, Jorge J. and David J. Thuente. 1994. Line search algorithms with guaranteed sufficient decrease. ACM Transactions on Mathematical Software, 20(3): 286–307.
Morokoff, William J. and Russel E. Caflisch. 1995. Quasi-Monte Carlo integration. Journal of Computational Physics, 122(2): 218–230.
Ninomiya, S. and Shu Tezuka. 1996. Toward real-time pricing of complex financial derivatives. Applied Mathematical Finance, 3(1): 1–20.
Nordhaus, William D. 1994. Managing the Global Commons: The Economics of Climate Change. MIT Press.
Nordhaus, William D. 2008. A Question of Balance: Weighing the Options on Global Warming Policies. Yale University Press.
Nordhaus, William D. and Joseph Boyer. 2000. Warming the World: Economic Models of Global Warming. MIT Press.
Nordhaus, William D. and David Popp. 1997. What is the value of scientific knowledge? An application to global warming using the PRICE model. Energy Journal, 18(1): 1–45.
Papageorgiou, Anargyros and Joseph Traub. 1996. Beating Monte Carlo. Risk, 9(6): 63–65.
Pizer, William A. 1999. The optimal choice of climate change policy in the presence of uncertainty. Resource and Energy Economics, 21(3): 255–287.
Schmitt-Grohé, Stephanie and Martín Uribe. 2004. Solving dynamic general equilibrium models using a second-order approximation to the policy function. Journal of Economic Dynamics and Control, 28(4): 755–775.
Schneider, Stephen H. 2002. Modeling climate change impacts and their related uncertainties. In Richard N. Cooper and Richard Layard, editors, What the Future Holds: Insights from Social Science. MIT Press.
Schneider, Stephen H. and Kristin Kuntz-Duriseti. 2002. Uncertainty and climate change policy. In Stephen H. Schneider, Armin Rosencranz, and John O. Niles, editors, Climate Change Policy: A Survey. Island Press.
Sirakaya, Sibel, Stephen Turnovsky, and M. Nedim Alemdar. 2006. Feedback approximation of the stochastic growth model by genetic neural networks. Computational Economics, 27(2–3): 185–206.
Sloan, Ian H. and Henryk Woźniakowski. 1998. When are quasi-Monte Carlo algorithms efficient for high dimensional integrals? Journal of Complexity, 14(1): 1–33.
Sloan, Ian H., Frances Y. Kuo, and Stephen Joe. 2002. On the step-by-step construction of quasi-Monte Carlo integration rules that achieve strong tractability error bounds in weighted Sobolev spaces. Mathematics of Computation, 71(240): 1,609–1,640.
Stern, Steven. 1997. Simulation-based estimation. Journal of Economic Literature, 35(4): 2,006–2,039.
Watson, Mark W. 1993. Measures of fit for calibrated models. Journal of Political Economy, 101(6): 1,011–1,041.
Weitzman, Martin L. 1974. Prices vs. quantities. Review of Economic Studies, 41(4): 477–491.
